Although users view the Coda file system as a hierarchy of directories and files, system administrators view the Coda file system as a hierarchy of volumes. Each volume contains a subtree of related directories and files which is a subtree of the entire file system. Volumes, then, parallel traditional Unix file systems. Like a Unix file system, a volume can be mounted. Thus, the root of a volume can be named within another volume at a mount point. The Coda file system hierarchy is built in this manner and is then mounted by each client using a conventional Unix mount point within the local file system. Since all internal mount points are invisible, the user sees only a single mount point for the entire Coda file system.
All system administration tasks are performed relative to a volume or a set of volumes. So, adding new users requires creating new volumes for their files and directories; quotas are enforced on volumes; and backups are performed on a per-volume basis. The volume abstraction greatly simplifies the administration of large systems. NOTE: Quotas have not been implemented yet.
The Coda file system provides four different types of volumes. The simplest of these is the non-replicated volume. Non-replicated volumes reside on a single server and are in the custody of the Coda file server they reside on. The Coda servers work with the venus processes on client workstations to provide a single, seamless view of the file system. However, if a custodian crashes or is otherwise inaccessible, its non-replicated volumes are inaccessible as well.
To partially solve this availability problem, Coda provides replicated, read-only volumes. This type of volume has exactly one read-write copy, but may have any number of read-only copies controlled by other servers. Changes to such a volume are made on the custodian's read-write copy and then distributed to all servers with read-only copies. Read-only replication provides higher availability for volumes containing frequently requested but infrequently updated objects, like system binaries. In addition, read-only replication is used in performing backups on volumes.
Unfortunately, read-only replicas cannot provide high availability for all types of volumes, e.g. user volumes. Thus, Coda also provides read-write, replicated volumes. Read-write, replicated volumes are logical volumes which group together multiple read-write, non-replicated volumes. Coda provides protocols which allow read-write, replicated volumes to reside on a number of servers and to be accessed even when some servers are inaccessible. Although read-write replication provides everything read-only replication provides, its protocols are more expensive. Thus, read-only replication, rather than read-write replication, should be used for volumes which change slowly but are accessed frequently. Table XXX illustrates the differences between the volume types.
Typically, volumes consist of a single user's data objects or other logically connected groups of data objects. Four factors should be used in dividing the file system tree into volumes.
A volume naming convention should also be used by those administrators who create volumes. Volume names are restricted to 32 characters and should be chosen so that, given a volume name, a system administrator who knows the naming conventions can determine its correct location in the file system hierarchy. The convention used by the Coda project is to name volumes by their function and location. Thus, a replicated volume named "u.hbovik" is mounted at /coda/usr/hbovik and contains hbovik's data. A project volume is prefixed by "p." and a system volume is prefixed by "s.". Similarly, volumes containing machine-specific object files include the machine type in their names. For instance, "p.c.alpha.pmax.bin" contains project coda binaries for our current alpha release and is mounted on /coda/project/coda/alpha/pmax_mach/bin.
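The naming convention can be checked mechanically before a volume is created. The following Python sketch is illustrative only: the function name and the category labels are assumptions based on the prefixes mentioned in this section, the 32-character limit is read as "at most 32", and none of this is part of any Coda tool.

```python
MAX_VOLUME_NAME_LEN = 32  # limit stated in this section, read as "at most 32"

# Prefixes taken from the examples in this section. The category labels
# are illustrative names, not Coda terminology.
PREFIX_CATEGORIES = {"u": "user", "p": "project", "s": "system"}


def check_volume_name(name):
    """Return the category implied by the prefix of a volume name.

    Raises ValueError if the name is empty, too long, or does not start
    with one of the known prefixes followed by a dot.
    """
    if not name or len(name) > MAX_VOLUME_NAME_LEN:
        raise ValueError(
            f"volume name must be 1-{MAX_VOLUME_NAME_LEN} characters: {name!r}"
        )
    prefix, dot, rest = name.partition(".")
    if not dot or not rest or prefix not in PREFIX_CATEGORIES:
        raise ValueError(f"unrecognized volume name prefix: {name!r}")
    return PREFIX_CATEGORIES[prefix]
```

A site with additional prefixes would extend the table rather than change the function.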
Use the commands createvol(8) and createvol_rep(8) to create non-replicated and read-write replicated volumes, respectively. (Read-only replication is discussed in Section XXX below.) These commands are actually scripts which ultimately invoke the volutil(8) command with the create option at the appropriate server. The volume will contain an access list initialized to System:AnyUser rlidwka. Creating the volume does not mount the volume within the file system hierarchy. Mounting the volume, as well as changing the access list or the quota, must be done using the cfs(1) command from a client.
A new volume may not be visible at client workstations for some time
(see Section
XXX below).
A few concrete examples should clarify the use of some of these commands. On the SCM, the command
* createvol u.hbovik mahler /vicepa
will create a non-replicated volume named "u.hbovik" on the /vicepa partition of server "mahler". Similarly, the command
* createvol_rep u.hbovik E0000107 /vicepa
will create a replicated volume named "u.hbovik" on each server in the Volume Storage Group (VSG) identified by "E0000107". The file /vice/db/VSGDB contains the mapping between VSGs and their identifications. The names of the replicas will be "u.hbovik.n", where n is a number between 0 and |VSG| - 1.
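The replica naming rule can be stated compactly in code. This is a minimal sketch of the rule as described above, not code taken from createvol_rep; the function name is hypothetical.

```python
def replica_names(volume_name, vsg_size):
    """List the replica names for a replicated volume.

    Follows the rule in the text: one replica per server in the VSG,
    named "<volume_name>.<n>" with n running from 0 to |VSG| - 1.
    """
    if vsg_size < 1:
        raise ValueError("a VSG must contain at least one server")
    return [f"{volume_name}.{n}" for n in range(vsg_size)]
```

For a VSG of three servers, the function returns three names ending in .0, .1, and .2.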
In order to use a volume which you have created and added to the appropriate databases, you must mount the volume. Although Unix file systems must be mounted at every reboot, Coda volumes are mounted only once. To mount a Coda volume, you must be using a Coda client and be authenticated (use the clog command) as a user who has write access to the directory in which the mount point will be created.
Mount the volumes using the command
* cfs mkmount <filename> volname
Note that cfs creates <filename> automatically. For example,
* cfs mkmount /coda/usr/hbovik u.hbovik
will create /coda/usr/hbovik and then mount the u.hbovik volume created in the example in Section XXX.
The volume is now visible to all users of the Coda file system. When
mounting a volume, avoid creating multiple mount points for it. Coda
cannot check for this. More information about the cfs command can
be found in Chapter
XXX as well as in Appendix
XXX.
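Because Coda cannot detect multiple mount points for one volume, an administrator may want to keep a separate record of what has been mounted where. The sketch below assumes such a record is kept in memory by a hypothetical helper class; it only builds the argument list for the cfs mkmount command shown above and does not run it.

```python
class MountRegistry:
    """Hypothetical administrator-side record of volume -> mount point."""

    def __init__(self):
        self._mounts = {}

    def mkmount_command(self, path, volname):
        """Return the argv for `cfs mkmount`, refusing a second mount point.

        Raises ValueError if the volume is already recorded at another path.
        """
        existing = self._mounts.get(volname)
        if existing is not None and existing != path:
            raise ValueError(
                f"{volname} is already mounted at {existing}; not adding {path}"
            )
        self._mounts[volname] = path
        return ["cfs", "mkmount", path, volname]
```

The returned list could be passed to a process launcher on an authenticated Coda client; persisting the record between sessions is left out of the sketch.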
When a volume is no longer needed, it may be purged by running the purgevol or purgevol_rep scripts on the SCM. Before removing a volume, you should probably create a backup for offline storage (see Section XXX up to the restore step). The volume's mount point should be removed with the cfs(1) command (see the rmmount option) before purging the volume (if possible). Note that purging the volume will not purge related backup volumes. Backup and ReadOnly volumes should be purged with the purgevol script.
For complete details on the backup/restore process, see Chapter XXX. In short, one first needs to get the correct dumpfile, possibly merging incremental dumps into an older full dump to obtain the state to be restored. Once this file is obtained, use the volutil(8) restore facility.
* volutil restore <filename> <partition> [<volname> [<volid>]]
The <partition> should be one of the /vicep? partitions on the server. You may optionally specify the volume name and the volume id for the restored volume. This is useful when creating read-only replicated volumes. Note that dump files are currently not independent of byte ordering, so volumes cannot be dumped across architectures that differ in this respect.
Read-only replication of a volume requires more effort on the part of the system administrator. However, it greatly increases the availability of volumes which cannot be read-write replicated. The most important example of such a volume is the root of the Coda file system. Conflicting updates cannot be allowed to occur at the root volume since this would make the entire Coda file system inaccessible. However, if the root volume is not replicated, the availability of the entire Coda file system depends upon the availability of the server acting as the custodian for this one volume. For these reasons, we highly recommend making the root volume of the Coda file system read-only replicated. We provide an extended example here to show you exactly how to go about the replication and distribution process. Note that the example shows how to make the root read-only replicated. Details pertaining to the Coda root volume can be ignored when making other read-only replicated volumes (such as subtrees containing standard binaries).
We assume that you have a non-replicated root volume, called coda.root. If you are installing a new system, you can access this volume as /coda by using a Coda client. This volume should look exactly how you want the read-only replica to look. If it doesn't, make any changes now. Also note that any volumes that you want mounted within the read-only replicas should be mounted before continuing. Our root volume has three read-write replicated volumes mounted. The usr volume contains the home directories of our users, the project volume contains project directories, and tmp contains temporary files. In addition, we have one subdirectory called nonrep which has non-replicated volumes (one per server) mounted within it. The purpose of these non-replicated volumes is to provide users with a location to perform repairs of conflicting objects. (Although the need for such a directory may not be clear at this point, we highly recommend providing such a directory.)
If you already have a read-only replicated root volume but want to update it, you should mount the read-write version of the root volume elsewhere and make your changes to this volume. Once you have made your changes, you will need to purge the old read-only replicas of your root volume using the volutil(8) command. Be sure that you purge the replica on each server. Then, you will need to edit the VolumeList file in /vice/vol and remove the entry for the read-only replicated root volume. (The name of the read-only replica will probably be coda.root.readonly.)
On the SCM, you need to clone the read-write copy of the root volume. You can use the command
* volutil clone <VolumeId>
This command will create a read-only volume with the name coda.root.readonly (assuming that your root volume is called coda.root). Next, you will need to dump this cloned volume to a file with the command
* volutil dump <VolumeId> <filename>
Now, copy this file to each of the servers which will have read-only replicas of the root volume and execute the command
* volutil restore <filename> <partition> [<volname> [<volid>]]
Note that the root volume currently must reside on /vicepa. Read-only replicated volumes must share the same volid and name, so take care to specify these correctly when restoring to more than one server. The final step is to build the VLDB by running the command
* bldvldb.sh
on the SCM and to make sure that the file /vice/ROOTVOLUME contains the name or volume id of the root volume (coda.root.readonly). (Also, it may be necessary to restart venus on the clients.)
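The whole procedure can be summarized as an ordered plan of commands. The Python sketch below only assembles the command strings given in this section and records where each should run; it executes nothing. The function name, the parameter names, and the "SCM" label are assumptions made for illustration. The optional volume name and volume id for the restore step are passed through unchanged in restore_extra, so the caller must supply them in the order that volutil expects.

```python
def readonly_replication_plan(rw_volid, clone_volid, dumpfile, servers,
                              partition="/vicepa", restore_extra=()):
    """Return a list of (host, step) pairs for read-only replication.

    rw_volid      -- volume id of the read-write volume to clone
    clone_volid   -- volume id of the read-only clone that is dumped
    dumpfile      -- path of the dump file
    servers       -- servers that will hold a read-only replica
    restore_extra -- optional trailing arguments for `volutil restore`
    """
    plan = [
        ("SCM", f"volutil clone {rw_volid}"),
        ("SCM", f"volutil dump {clone_volid} {dumpfile}"),
    ]
    for server in servers:
        # The text does not name a copy tool, so this step is a description.
        plan.append((server, f"copy {dumpfile} from the SCM to this server"))
        restore = ["volutil", "restore", dumpfile, partition, *restore_extra]
        plan.append((server, " ".join(restore)))
    # Rebuild the VLDB last, once every replica has been restored.
    plan.append(("SCM", "bldvldb.sh"))
    return plan
```

Printing the plan before running anything makes it easier to confirm that every server receives the same restore arguments.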
The volume location data base, VLDB, is used to provide volume
addressing information to workstations. Copies of the VLDB reside on the
servers and are updated periodically. The VLDB lists the latest known
location or locations of all on-line volumes in the system.
The VLDB is maintained on the SCM. When you wish to update it, run the /vice/bin/bldvldb.sh(8) script on the SCM. The script gathers a copy of the /vice/vol/VolumeList file from all the servers, merges them into a single list, and builds a new VLDB. The UpdateMon program then propagates the new VLDB to all the servers. Note that the createvol and purgevol scripts automatically invoke bldvldb.sh.
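The merge step performed by the script can be pictured as follows. This sketch does not reproduce bldvldb.sh or the real VolumeList file format. It assumes, purely for illustration, that each server's list has already been reduced to a collection of volume names.

```python
def merge_volume_lists(per_server):
    """Merge per-server volume lists into one volume -> servers mapping.

    per_server maps a server name to an iterable of the volume names it
    holds (a simplified stand-in for that server's VolumeList file).
    """
    merged = {}
    for server, volumes in per_server.items():
        for volume in volumes:
            merged.setdefault(volume, set()).add(server)
    # Sort volumes and servers so the result is stable across runs.
    return {volume: sorted(servers) for volume, servers in sorted(merged.items())}
```

A volume held by several servers, such as a read-only replica, appears once in the result with all of its locations.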
The volume replication data base, the VRDB, is used to provide information about replicated volumes to client workstations. Copies of the VRDB reside on all servers and are updated periodically. The VRDB maps each logical volume to its corresponding set of physical volumes.
A human readable version of the VRDB is maintained on the SCM in the file /vice/vol/VRList.
The makevrdb option to the volutil(8) command
will create a new VRDB which will automatically be distributed to the other
servers.
The Volume Storage Group Data Base, VSGDB, is currently maintained by hand. Each valid volume storage group has an entry in this data base containing an identification number and the names of the servers in the group.
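Since the VSGDB is edited by hand, a quick consistency check can catch malformed entries. The sketch below assumes a layout that this section does not confirm: one entry per line, with the identification number first and the server names after it, separated by whitespace. The function name is hypothetical.

```python
def parse_vsgdb(text):
    """Parse VSGDB-style text into a dict of VSG id -> list of server names.

    Assumed layout (not confirmed by this section): one group per line,
    identification number first, then the server names.
    """
    groups = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        vsg_id, servers = fields[0], fields[1:]
        if not servers:
            raise ValueError(f"line {lineno}: VSG {vsg_id} lists no servers")
        if vsg_id in groups:
            raise ValueError(f"line {lineno}: duplicate VSG id {vsg_id}")
        groups[vsg_id] = servers
    return groups
```

The length of each server list gives |VSG|, the number of replicas a volume created in that group will have.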
Coda servers ensure file system consistency after a crash by running fsck(8), recovering RVM, and running the Coda salvager. The fsck used here at CMU has been modified so that it does not require every inode to be referenced by a directory; Coda accesses inodes directly.
Warning: the vanilla fsck must not be used on a Coda file system partition, as the Coda files will be thrown away.
After the server machine is booted, the codasrv process starts and RVM recovers the server's committed state. The Coda salvager then reconciles the results from fsck and RVM recovery.
The cfs command provides information on volumes. cfs can only be used on a machine which has a running venus (such as a client workstation). cfs is described in Chapter XXX as well as in the manual page contained in Appendix XXX.