0001 .. SPDX-License-Identifier: GPL-2.0
0002
0003 ===========================
0004 Coda Kernel-Venus Interface
0005 ===========================
0006
0007 .. Note::
0008
0009 This is one of the technical documents describing a component of
0010 Coda -- this document describes the client kernel-Venus interface.
0011
0012 For more information:
0013
0014 http://www.coda.cs.cmu.edu
0015
0016 For user level software needed to run Coda:
0017
0018 ftp://ftp.coda.cs.cmu.edu
0019
0020 To run Coda you need to get a user level cache manager for the client,
0021 named Venus, as well as tools to manipulate ACLs, to log in, etc. The
0022 client needs to have the Coda filesystem selected in the kernel
0023 configuration.
0024
0025 The server needs a user level server and at present does not depend on
0026 kernel support.
0027
0028 The Venus kernel interface
0029
0030 Peter J. Braam
0031
0032 v1.0, Nov 9, 1997
0033
This document describes the communication between Venus and kernel
level filesystem code needed for the operation of the Coda file
system. This document version is meant to describe the current interface
(version 1.0) as well as improvements we envisage.
0038
0039 .. Table of Contents
0040
0041 1. Introduction
0042
0043 2. Servicing Coda filesystem calls
0044
0045 3. The message layer
0046
0047 3.1 Implementation details
0048
0049 4. The interface at the call level
0050
0051 4.1 Data structures shared by the kernel and Venus
0052 4.2 The pioctl interface
0053 4.3 root
0054 4.4 lookup
0055 4.5 getattr
0056 4.6 setattr
0057 4.7 access
0058 4.8 create
0059 4.9 mkdir
0060 4.10 link
0061 4.11 symlink
0062 4.12 remove
0063 4.13 rmdir
0064 4.14 readlink
0065 4.15 open
0066 4.16 close
0067 4.17 ioctl
0068 4.18 rename
0069 4.19 readdir
0070 4.20 vget
0071 4.21 fsync
0072 4.22 inactive
0073 4.23 rdwr
0074 4.24 odymount
0075 4.25 ody_lookup
0076 4.26 ody_expand
0077 4.27 prefetch
0078 4.28 signal
0079
0080 5. The minicache and downcalls
0081
0082 5.1 INVALIDATE
0083 5.2 FLUSH
0084 5.3 PURGEUSER
0085 5.4 ZAPFILE
0086 5.5 ZAPDIR
0087 5.6 ZAPVNODE
0088 5.7 PURGEFID
0089 5.8 REPLACE
0090
0091 6. Initialization and cleanup
0092
0093 6.1 Requirements
0094
0095 1. Introduction
0096 ===============
0097
0098 A key component in the Coda Distributed File System is the cache
0099 manager, Venus.
0100
0101 When processes on a Coda enabled system access files in the Coda
0102 filesystem, requests are directed at the filesystem layer in the
0103 operating system. The operating system will communicate with Venus to
service the request for the process. Venus manages a persistent
client cache and makes remote procedure calls to Coda file servers and
related servers (such as authentication servers) to service the
requests it receives from the operating system. When Venus has
0108 serviced a request it replies to the operating system with appropriate
0109 return codes, and other data related to the request. Optionally the
0110 kernel support for Coda may maintain a minicache of recently processed
0111 requests to limit the number of interactions with Venus. Venus
0112 possesses the facility to inform the kernel when elements from its
0113 minicache are no longer valid.
0114
This document describes precisely this communication between the
kernel and Venus. The definitions of the so-called upcalls and downcalls
will be given with the format of the data they handle. We shall also
describe the semantic invariants resulting from the calls.
0119
0120 Historically Coda was implemented in a BSD file system in Mach 2.6.
0121 The interface between the kernel and Venus is very similar to the BSD
0122 VFS interface. Similar functionality is provided, and the format of
0123 the parameters and returned data is very similar to the BSD VFS. This
0124 leads to an almost natural environment for implementing a kernel-level
filesystem driver for Coda in a BSD system. However, other operating
systems such as Linux and Windows 95 and NT have virtual filesystems
with different interfaces.
0128
0129 To implement Coda on these systems some reverse engineering of the
0130 Venus/Kernel protocol is necessary. Also it came to light that other
0131 systems could profit significantly from certain small optimizations
0132 and modifications to the protocol. To facilitate this work as well as
0133 to make future ports easier, communication between Venus and the
0134 kernel should be documented in great detail. This is the aim of this
0135 document.
0136
0137 2. Servicing Coda filesystem calls
0138 ===================================
0139
The service of a request for a Coda file system service originates in
a process P which is accessing a Coda file. It makes a system call which
traps to the OS kernel. Examples of such calls trapping to the kernel
are ``read``, ``write``, ``open``, ``close``, ``create``, ``mkdir``,
``rmdir``, ``chmod`` in a Unix context. Similar calls exist in the Win32
environment, for example ``CreateFile``.
0146
0147 Generally the operating system handles the request in a virtual
0148 filesystem (VFS) layer, which is named I/O Manager in NT and IFS
0149 manager in Windows 95. The VFS is responsible for partial processing
0150 of the request and for locating the specific filesystem(s) which will
0151 service parts of the request. Usually the information in the path
0152 assists in locating the correct FS drivers. Sometimes after extensive
0153 pre-processing, the VFS starts invoking exported routines in the FS
0154 driver. This is the point where the FS specific processing of the
0155 request starts, and here the Coda specific kernel code comes into
0156 play.
0157
0158 The FS layer for Coda must expose and implement several interfaces.
0159 First and foremost the VFS must be able to make all necessary calls to
0160 the Coda FS layer, so the Coda FS driver must expose the VFS interface
0161 as applicable in the operating system. These differ very significantly
0162 among operating systems, but share features such as facilities to
0163 read/write and create and remove objects. The Coda FS layer services
0164 such VFS requests by invoking one or more well defined services
0165 offered by the cache manager Venus. When the replies from Venus have
0166 come back to the FS driver, servicing of the VFS call continues and
0167 finishes with a reply to the kernel's VFS. Finally the VFS layer
0168 returns to the process.
0169
0170 As a result of this design a basic interface exposed by the FS driver
0171 must allow Venus to manage message traffic. In particular Venus must
0172 be able to retrieve and place messages and to be notified of the
0173 arrival of a new message. The notification must be through a mechanism
0174 which does not block Venus since Venus must attend to other tasks even
0175 when no messages are waiting or being processed.
0176
0177 **Interfaces of the Coda FS Driver**
0178
0179 Furthermore the FS layer provides for a special path of communication
0180 between a user process and Venus, called the pioctl interface. The
0181 pioctl interface is used for Coda specific services, such as
0182 requesting detailed information about the persistent cache managed by
0183 Venus. Here the involvement of the kernel is minimal. It identifies
0184 the calling process and passes the information on to Venus. When
0185 Venus replies the response is passed back to the caller in unmodified
0186 form.
0187
0188 Finally Venus allows the kernel FS driver to cache the results from
0189 certain services. This is done to avoid excessive context switches
and results in an efficient system. However, Venus may acquire
information, for example from the network, which implies that cached
information must be flushed or replaced. Venus then makes a downcall
0193 to the Coda FS layer to request flushes or updates in the cache. The
0194 kernel FS driver handles such requests synchronously.
0195
0196 Among these interfaces the VFS interface and the facility to place,
0197 receive and be notified of messages are platform specific. We will
0198 not go into the calls exported to the VFS layer but we will state the
0199 requirements of the message exchange mechanism.
0200
0201
0202 3. The message layer
0203 =====================
0204
0205 At the lowest level the communication between Venus and the FS driver
0206 proceeds through messages. The synchronization between processes
0207 requesting Coda file service and Venus relies on blocking and waking
0208 up processes. The Coda FS driver processes VFS- and pioctl-requests
0209 on behalf of a process P, creates messages for Venus, awaits replies
0210 and finally returns to the caller. The implementation of the exchange
0211 of messages is platform specific, but the semantics have (so far)
0212 appeared to be generally applicable. Data buffers are created by the
0213 FS Driver in kernel memory on behalf of P and copied to user memory in
0214 Venus.
0215
0216 The FS Driver while servicing P makes upcalls to Venus. Such an
0217 upcall is dispatched to Venus by creating a message structure. The
0218 structure contains the identification of P, the message sequence
0219 number, the size of the request and a pointer to the data in kernel
0220 memory for the request. Since the data buffer is re-used to hold the
0221 reply from Venus, there is a field for the size of the reply. A flags
0222 field is used in the message to precisely record the status of the
0223 message. Additional platform dependent structures involve pointers to
0224 determine the position of the message on queues and pointers to
0225 synchronization objects. In the upcall routine the message structure
0226 is filled in, flags are set to 0, and it is placed on the *pending*
0227 queue. The routine calling upcall is responsible for allocating the
0228 data buffer; its structure will be described in the next section.
0229
A facility must exist to notify Venus that the message has been
created; it is implemented using the synchronization objects available
in the OS. This notification is done in the upcall context of the process
P. When the message is on the pending queue, process P cannot proceed
0234 in upcall. The (kernel mode) processing of P in the filesystem
0235 request routine must be suspended until Venus has replied. Therefore
0236 the calling thread in P is blocked in upcall. A pointer in the
0237 message structure will locate the synchronization object on which P is
0238 sleeping.
0239
Venus detects the notification that a message has arrived, and the FS
driver allows Venus to retrieve the message with a getmsg_from_kernel
0242 call. This action finishes in the kernel by putting the message on the
0243 queue of processing messages and setting flags to READ. Venus is
0244 passed the contents of the data buffer. The getmsg_from_kernel call
0245 now returns and Venus processes the request.
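
The bookkeeping described so far can be illustrated with a small user-space
C sketch. It is only a model under stated assumptions: the names
``struct upc_msg``, ``upc_post`` and ``upc_read`` and the numeric flag values
are hypothetical, and the locking and sleeping a real driver needs are
omitted.

```c
#include <stddef.h>

/* Hypothetical flag values; the text only names the READ and WRITTEN states. */
enum { MSG_READ = 1, MSG_WRITTEN = 2 };

/* Hypothetical message structure holding the fields listed in the text. */
struct upc_msg {
    int pid;                /* identification of the calling process P */
    unsigned long unique;   /* message sequence number */
    size_t in_size;         /* size of the request */
    size_t out_size;        /* size of the reply; the buffer is re-used */
    void *data;             /* request/reply buffer allocated by the caller */
    unsigned flags;         /* 0 while pending, then MSG_READ, MSG_WRITTEN */
    struct upc_msg *next;   /* position on a queue */
};

struct upc_queue {
    struct upc_msg *head, *tail;
};

/* Append a message at the tail of a queue. */
static void upc_enqueue(struct upc_queue *q, struct upc_msg *m)
{
    m->next = NULL;
    if (q->tail)
        q->tail->next = m;
    else
        q->head = m;
    q->tail = m;
}

/* Remove and return the message at the head of a queue, or NULL. */
static struct upc_msg *upc_dequeue(struct upc_queue *q)
{
    struct upc_msg *m = q->head;
    if (m) {
        q->head = m->next;
        if (!q->head)
            q->tail = NULL;
        m->next = NULL;
    }
    return m;
}

/* Upcall side: fill in the message, clear flags, place it on pending. */
static void upc_post(struct upc_queue *pending, struct upc_msg *m,
                     int pid, unsigned long unique, void *data, size_t in_size)
{
    m->pid = pid;
    m->unique = unique;
    m->data = data;
    m->in_size = in_size;
    m->out_size = 0;
    m->flags = 0;
    upc_enqueue(pending, m);
}

/* Venus retrieves a message: it moves to the processing queue as READ. */
static struct upc_msg *upc_read(struct upc_queue *pending,
                                struct upc_queue *processing)
{
    struct upc_msg *m = upc_dequeue(pending);
    if (m) {
        m->flags = MSG_READ;
        upc_enqueue(processing, m);
    }
    return m;
}
```

In this model ``upc_post`` corresponds to the work done in upcall and
``upc_read`` to the kernel side of getmsg_from_kernel.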
0246
0247 At some later point the FS driver receives a message from Venus,
0248 namely when Venus calls sendmsg_to_kernel. At this moment the Coda FS
0249 driver looks at the contents of the message and decides if:
0250
0251
* The message is a reply for a suspended thread P. If so it removes
  the message from the processing queue and marks the message as
  WRITTEN. Finally, the FS driver unblocks P (still in the kernel
  mode context of Venus) and the sendmsg_to_kernel call returns to
  Venus. The process P will be scheduled at some point and continues
  processing its upcall with the data buffer replaced with the reply
  from Venus.

* The message is a ``downcall``. A downcall is a request from Venus to
  the FS Driver. The FS driver processes the request immediately
  (usually a cache eviction or replacement) and when it finishes
  sendmsg_to_kernel returns.
0264
0265 Now P awakes and continues processing upcall. There are some
0266 subtleties to take account of. First P will determine if it was woken
0267 up in upcall by a signal from some other source (for example an
0268 attempt to terminate P) or as is normally the case by Venus in its
0269 sendmsg_to_kernel call. In the normal case, the upcall routine will
0270 deallocate the message structure and return. The FS routine can proceed
0271 with its processing.
0272
0273
0274 **Sleeping and IPC arrangements**
0275
0276 In case P is woken up by a signal and not by Venus, it will first look
0277 at the flags field. If the message is not yet READ, the process P can
0278 handle its signal without notifying Venus. If Venus has READ, and
0279 the request should not be processed, P can send Venus a signal message
0280 to indicate that it should disregard the previous message. Such
0281 signals are put in the queue at the head, and read first by Venus. If
0282 the message is already marked as WRITTEN it is too late to stop the
processing. The VFS routine will now continue. (-- If a VFS request
involves more than one upcall, this can lead to complicated state; an
extra field "handle_signals" could be added in the message structure
to indicate that points of no return have been passed.--)
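
The three cases for a process woken by a signal can be summarized in a short
C sketch. The flag values and the names ``enum signal_action`` and
``on_interrupt`` are hypothetical; only the READ and WRITTEN states come from
the text.

```c
/* Hypothetical flag values for the message states named in the text. */
enum { MSG_READ = 1, MSG_WRITTEN = 2 };

/* What P should do when it is woken by a signal instead of by Venus. */
enum signal_action {
    HANDLE_QUIETLY,     /* Venus has not read the message: no notification */
    SEND_SIGNAL_MSG,    /* Venus has read it: ask Venus to disregard it */
    TOO_LATE            /* reply already written: processing continues */
};

static enum signal_action on_interrupt(unsigned flags)
{
    if (flags & MSG_WRITTEN)
        return TOO_LATE;
    if (flags & MSG_READ)
        return SEND_SIGNAL_MSG;
    return HANDLE_QUIETLY;
}
```

The WRITTEN test comes first so that the sketch also behaves sensibly if an
implementation keeps the READ bit set once a reply has been written.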
0287
0288
0289
0290 3.1. Implementation details
0291 ----------------------------
0292
The Unix implementation of this mechanism uses a character device
associated with Coda. Venus retrieves messages by doing a read on the
device, replies are sent with a write, and notification is through the
select system call on the file descriptor for the device. The process
P is kept waiting on an interruptible wait queue object.
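
A minimal sketch of the Venus side of the Unix arrangement, assuming only
POSIX ``select`` and ``read``, might look as follows. The helper names
``msg_ready`` and ``fetch_msg`` are hypothetical, and the descriptor is
passed in by the caller because the device path is not specified here.

```c
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Wait up to timeout_ms for a message to become readable on fd.
 * Returns 1 if a read would not block, 0 on timeout, -1 on error. */
static int msg_ready(int fd, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int n;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return n > 0 && FD_ISSET(fd, &rfds);
}

/* Retrieve one message if one is waiting.
 * Returns the number of bytes read, 0 if nothing is pending, -1 on error. */
static ssize_t fetch_msg(int fd, char *buf, size_t buflen, long timeout_ms)
{
    int r = msg_ready(fd, timeout_ms);
    if (r <= 0)
        return r;
    return read(fd, buf, buflen);
}
```

Because ``select`` returns on a timeout, Venus is free to attend to other
tasks when no message is waiting; the reply would be sent with ``write`` on
the same descriptor.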
0299
0300 In Windows NT and the DPMI Windows 95 implementation a DeviceIoControl
0301 call is used. The DeviceIoControl call is designed to copy buffers
0302 from user memory to kernel memory with OPCODES. The sendmsg_to_kernel
0303 is issued as a synchronous call, while the getmsg_from_kernel call is
0304 asynchronous. Windows EventObjects are used for notification of
0305 message arrival. The process P is kept waiting on a KernelEvent
0306 object in NT and a semaphore in Windows 95.
0307
0308
0309 4. The interface at the call level
0310 ===================================
0311
0312
This section describes the upcalls a Coda FS driver can make to Venus.
Each of these upcalls makes use of two structures: inputArgs and
outputArgs. In pseudo BNF form the structures take the following
form::
0317
0318
    struct inputArgs {
        u_long opcode;
        u_long unique;          /* Keep multiple outstanding msgs distinct */
        u_short pid;            /* Common to all */
        u_short pgid;           /* Common to all */
        struct CodaCred cred;   /* Common to all */

        <union "in" of call dependent parts of inputArgs>
    };

    struct outputArgs {
        u_long opcode;
        u_long unique;          /* Keep multiple outstanding msgs distinct */
        u_long result;

        <union "out" of call dependent parts of outputArgs>
    };
0336
0337
0338
0339 Before going on let us elucidate the role of the various fields. The
0340 inputArgs start with the opcode which defines the type of service
requested from Venus. There are approximately 30 upcalls at present
which we will discuss. The unique field labels the inputArgs with a
number which identifies the message uniquely. A process and
process group id are passed. Finally the credentials of the caller
0345 are included.
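
As an illustration of how the common fields might be used, the following C
sketch fills in a request header and pairs a reply with its request. The
structures are cut-down, hypothetical stand-ins (the credentials and the call
dependent unions are omitted), and ``fill_header`` and ``reply_matches`` are
names invented for this example.

```c
/* Cut-down, hypothetical headers: the credentials and the call dependent
 * unions of inputArgs and outputArgs are omitted. */
struct in_hdr {
    unsigned long opcode;
    unsigned long unique;
    unsigned short pid;
    unsigned short pgid;
};

struct out_hdr {
    unsigned long opcode;
    unsigned long unique;
    unsigned long result;
};

/* Fill in the common part of a request with a fresh sequence number. */
static void fill_header(struct in_hdr *h, unsigned long opcode,
                        unsigned short pid, unsigned short pgid)
{
    static unsigned long next_unique = 1;   /* not thread safe; sketch only */

    h->opcode = opcode;
    h->unique = next_unique++;
    h->pid = pid;
    h->pgid = pgid;
}

/* A reply belongs to a request when opcode and unique both agree. */
static int reply_matches(const struct in_hdr *req, const struct out_hdr *rep)
{
    return req->opcode == rep->opcode && req->unique == rep->unique;
}
```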
0346
0347 Before delving into the specific calls we need to discuss a variety of
0348 data structures shared by the kernel and Venus.
0349
0350
0351
0352
0353 4.1. Data structures shared by the kernel and Venus
0354 ----------------------------------------------------
0355
0356
The CodaCred structure defines a variety of user and group ids as
they are set for the calling process. The vuid_t and vgid_t are 32 bit
unsigned integers. It also defines group membership in an array. On
Unix the CodaCred has proven sufficient to implement good security
semantics for Coda but the structure may have to undergo modification
for the Windows environments as these mature::

    struct CodaCred {
        vuid_t cr_uid, cr_euid, cr_suid, cr_fsuid; /* Real, effective, set, fs uid */
        vgid_t cr_gid, cr_egid, cr_sgid, cr_fsgid; /* same for groups */
        vgid_t cr_groups[NGROUPS];                 /* Group membership for caller */
    };
0369
0370
0371 .. Note::
0372
0373 It is questionable if we need CodaCreds in Venus. Finally Venus
0374 doesn't know about groups, although it does create files with the
0375 default uid/gid. Perhaps the list of group membership is superfluous.
0376
0377
0378 The next item is the fundamental identifier used to identify Coda
0379 files, the ViceFid. A fid of a file uniquely defines a file or
0380 directory in the Coda filesystem within a cell [1]_::
0381
    typedef struct ViceFid {
        VolumeId Volume;
        VnodeId Vnode;
        Unique_t Unique;
    } ViceFid;

.. [1] A cell is a group of Coda servers acting under the aegis of a single
   system control machine or SCM. See the Coda Administration manual
   for a detailed description of the role of the SCM.

Each of the constituent fields, VolumeId, VnodeId and Unique_t, is an
unsigned 32 bit integer. We envisage that a further field will need
to be prefixed to identify the Coda cell; this will probably take the
form of an IPv6-sized IP address naming the Coda cell through DNS.
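
Since a fid identifies an object only through the combination of its three
fields, comparing two fids means comparing all of them. The sketch below
assumes the typedefs are plain ``uint32_t`` (the text only says they are
unsigned 32 bit integers); ``fid_equal`` is a hypothetical helper name.

```c
#include <stdint.h>

/* Assumed typedefs: the text states these are unsigned 32 bit integers. */
typedef uint32_t VolumeId;
typedef uint32_t VnodeId;
typedef uint32_t Unique_t;

typedef struct ViceFid {
    VolumeId Volume;
    VnodeId Vnode;
    Unique_t Unique;
} ViceFid;

/* Two fids name the same object only if all three fields agree. */
static int fid_equal(const ViceFid *a, const ViceFid *b)
{
    return a->Volume == b->Volume &&
           a->Vnode == b->Vnode &&
           a->Unique == b->Unique;
}
```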
0396
0397 The next important structure shared between Venus and the kernel is
0398 the attributes of the file. The following structure is used to
0399 exchange information. It has room for future extensions such as
0400 support for device files (currently not present in Coda)::
0401
0402
    struct coda_timespec {
        int64_t tv_sec;         /* seconds */
        long    tv_nsec;        /* nanoseconds */
    };

    struct coda_vattr {
        enum coda_vtype va_type;        /* vnode type (for create) */
        u_short         va_mode;        /* files access mode and type */
        short           va_nlink;       /* number of references to file */
        vuid_t          va_uid;         /* owner user id */
        vgid_t          va_gid;         /* owner group id */
        long            va_fsid;        /* file system id (dev for now) */
        long            va_fileid;      /* file id */
        u_quad_t        va_size;        /* file size in bytes */
        long            va_blocksize;   /* blocksize preferred for i/o */
        struct coda_timespec va_atime;  /* time of last access */
        struct coda_timespec va_mtime;  /* time of last modification */
        struct coda_timespec va_ctime;  /* time file changed */
        u_long          va_gen;         /* generation number of file */
        u_long          va_flags;       /* flags defined for file */
        dev_t           va_rdev;        /* device special file represents */
        u_quad_t        va_bytes;       /* bytes of disk space held by file */
        u_quad_t        va_filerev;     /* file modification number */
        u_int           va_vaflags;     /* operations flags, see below */
        long            va_spare;       /* remain quad aligned */
    };
0429
0430
0431 4.2. The pioctl interface
0432 --------------------------
0433
0434
Coda specific requests can be made by applications through the pioctl
interface. The pioctl is implemented as an ordinary ioctl on a
fictitious file /coda/.CONTROL. The pioctl call opens this file, gets
a file handle and makes the ioctl call. Finally it closes the file.
0439
0440 The kernel involvement in this is limited to providing the facility to
0441 open and close and pass the ioctl message and to verify that a path in
0442 the pioctl data buffers is a file in a Coda filesystem.
0443
0444 The kernel is handed a data packet of the form::
0445
    struct {
        const char *path;
        struct ViceIoctl vidata;
        int follow;
    } data;


where::


    struct ViceIoctl {
        caddr_t in, out;        /* Data to be transferred in, or out */
        short in_size;          /* Size of input buffer <= 2K */
        short out_size;         /* Maximum size of output buffer, <= 2K */
    };
0462
0463
0464
0465 The path must be a Coda file, otherwise the ioctl upcall will not be
0466 made.
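
The open, ioctl, close sequence performed by a pioctl call can be sketched in
user space as follows. This is an assumption-laden illustration, not the
actual library routine: ``pioctl_sketch`` is a hypothetical name, the open
mode is a guess, and the ioctl request code is left to the caller because it
is not given in this document.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Open the control file, issue one ioctl on it, and close it again.
 * control_path would be /coda/.CONTROL on a running system; request and
 * data are supplied by the caller. Returns the ioctl result, or -1 with
 * errno set if the control file cannot be opened. */
static int pioctl_sketch(const char *control_path, unsigned long request,
                         void *data)
{
    int fd, rc, saved;

    fd = open(control_path, O_RDONLY);      /* assumed open mode */
    if (fd < 0)
        return -1;

    rc = ioctl(fd, request, data);
    saved = errno;                          /* close() may overwrite errno */
    close(fd);
    errno = saved;
    return rc;
}
```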
0467
0468 .. Note:: The data structures and code are a mess. We need to clean this up.
0469
0470
0471 **We now proceed to document the individual calls**:
0472
0473
0474 4.3. root
0475 ----------
0476
0477
Arguments
    in

        empty

    out::

        struct cfs_root_out {
            ViceFid VFid;
        } cfs_root;
0488
0489
0490
0491 Description
0492 This call is made to Venus during the initialization of
0493 the Coda filesystem. If the result is zero, the cfs_root structure
0494 contains the ViceFid of the root of the Coda filesystem. If a non-zero
0495 result is generated, its value is a platform dependent error code
0496 indicating the difficulty Venus encountered in locating the root of
0497 the Coda filesystem.
0498
0499 4.4. lookup
0500 ------------
0501
0502
0503 Summary
0504 Find the ViceFid and type of an object in a directory if it exists.
0505
Arguments
    in::

        struct cfs_lookup_in {
            ViceFid VFid;
            char *name;         /* Place holder for data. */
        } cfs_lookup;

    out::

        struct cfs_lookup_out {
            ViceFid VFid;
            int vtype;
        } cfs_lookup;
0522
0523
0524
0525 Description
0526 This call is made to determine the ViceFid and filetype of
0527 a directory entry. The directory entry requested carries name 'name'
0528 and Venus will search the directory identified by cfs_lookup_in.VFid.
0529 The result may indicate that the name does not exist, or that
0530 difficulty was encountered in finding it (e.g. due to disconnection).
If the result is zero, the field cfs_lookup_out.VFid contains the
target's ViceFid and cfs_lookup_out.vtype the coda_vtype giving the
type of object the name designates.

The name of the object is an 8 bit character string of maximum length
CFS_MAXNAMLEN, currently set to 256 (including a 0 terminator).
0537
0538 It is extremely important to realize that Venus bitwise ors the field
0539 cfs_lookup.vtype with CFS_NOCACHE to indicate that the object should
0540 not be put in the kernel name cache.
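
A driver that honours this convention has to separate the flag from the type
before using either. The following C sketch shows one way to do so. The
numeric value of ``CFS_NOCACHE`` is an assumption made for illustration (it
is not given in this document), and the two helper names are hypothetical.

```c
/* Assumed value for illustration only; consult the real header. */
#define CFS_NOCACHE 0x80000000u

/* Non-zero if the kernel may enter the object in its name cache. */
static int lookup_may_cache(unsigned vtype_field)
{
    return (vtype_field & CFS_NOCACHE) == 0;
}

/* The object type with the CFS_NOCACHE bit stripped off. */
static unsigned lookup_vtype(unsigned vtype_field)
{
    return vtype_field & ~CFS_NOCACHE;
}
```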
0541
0542 .. Note::
0543
0544 The type of the vtype is currently wrong. It should be
0545 coda_vtype. Linux does not take note of CFS_NOCACHE. It should.
0546
0547
0548 4.5. getattr
0549 -------------
0550
0551
Summary
    Get the attributes of a file.

Arguments
    in::

        struct cfs_getattr_in {
            ViceFid VFid;
            struct coda_vattr attr; /* XXXXX */
        } cfs_getattr;

    out::

        struct cfs_getattr_out {
            struct coda_vattr attr;
        } cfs_getattr;

Description
    This call returns the attributes of the file identified by VFid.

Errors
    Errors can occur if the object identified by VFid does not exist, is
    inaccessible, or if the caller does not have permission to fetch
    attributes.
0579
0580 .. Note::
0581
0582 Many kernel FS drivers (Linux, NT and Windows 95) need to acquire
0583 the attributes as well as the Fid for the instantiation of an internal
0584 "inode" or "FileHandle". A significant improvement in performance on
0585 such systems could be made by combining the lookup and getattr calls
0586 both at the Venus/kernel interaction level and at the RPC level.
0587
0588 The vattr structure included in the input arguments is superfluous and
0589 should be removed.
0590
0591
0592 4.6. setattr
0593 -------------
0594
0595
0596 Summary
0597 Set the attributes of a file.
0598
Arguments
    in::

        struct cfs_setattr_in {
            ViceFid VFid;
            struct coda_vattr attr;
        } cfs_setattr;

    out

        empty

Description
    The structure attr is filled with attributes to be changed
    in BSD style. Attributes not to be changed are set to -1, apart from
    vtype which is set to VNON. Others are set to the value to be assigned.
    The only attributes which the FS driver may request to change are the
    mode, owner, groupid, atime, mtime and ctime. The return value
    indicates success or failure.
0621
0622 Errors
0623 A variety of errors can occur. The object may not exist, may
0624 be inaccessible, or permission may not be granted by Venus.
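
The "set to -1 unless changed" convention used by setattr can be sketched in
C as follows. The structure is a cut-down, hypothetical stand-in for
coda_vattr with only a few fields, the enum values are placeholders, and the
helper names are invented for this example.

```c
/* Placeholder vnode types; only VNON is named in the text. */
enum vtype_sketch { VNON, VREG, VDIR };

/* Cut-down, hypothetical stand-in for struct coda_vattr. */
struct vattr_sketch {
    enum vtype_sketch va_type;
    unsigned short va_mode;
    unsigned int va_uid;
    unsigned int va_gid;
};

/* Mark every attribute as "not to be changed". */
static void vattr_set_unchanged(struct vattr_sketch *va)
{
    va->va_type = VNON;
    va->va_mode = (unsigned short)-1;
    va->va_uid = (unsigned int)-1;
    va->va_gid = (unsigned int)-1;
}

/* Non-zero if the caller asked for the mode to be changed. */
static int mode_requested(const struct vattr_sketch *va)
{
    return va->va_mode != (unsigned short)-1;
}
```

A caller would first mark everything as unchanged and then assign only the
fields it wants Venus to update, for example ``va.va_mode = 0644``.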
0625
0626
0627 4.7. access
0628 ------------
0629
0630
Arguments
    in::

        struct cfs_access_in {
            ViceFid VFid;
            int flags;
        } cfs_access;

    out

        empty

Description
    Verify if access to the object identified by VFid for
    operations described by flags is permitted. The result indicates if
    access will be granted. It is important to remember that Coda uses
    ACLs to enforce protection and that ultimately the servers, not the
    clients, enforce the security of the system. The result of this call
    will depend on whether a token is held by the user.
0652
0653 Errors
0654 The object may not exist, or the ACL describing the protection
0655 may not be accessible.
0656
0657
0658 4.8. create
0659 ------------
0660
0661
0662 Summary
0663 Invoked to create a file
0664
Arguments
    in::

        struct cfs_create_in {
            ViceFid VFid;
            struct coda_vattr attr;
            int excl;
            int mode;
            char *name;         /* Place holder for data. */
        } cfs_create;

    out::

        struct cfs_create_out {
            ViceFid VFid;
            struct coda_vattr attr;
        } cfs_create;
0685
0686
0687
0688 Description
0689 This upcall is invoked to request creation of a file.
0690 The file will be created in the directory identified by VFid, its name
0691 will be name, and the mode will be mode. If excl is set an error will
0692 be returned if the file already exists. If the size field in attr is
0693 set to zero the file will be truncated. The uid and gid of the file
0694 are set by converting the CodaCred to a uid using a macro CRTOUID
0695 (this macro is platform dependent). Upon success the VFid and
0696 attributes of the file are returned. The Coda FS Driver will normally
0697 instantiate a vnode, inode or file handle at kernel level for the new
0698 object.
0699
0700
0701 Errors
0702 A variety of errors can occur. Permissions may be insufficient.
0703 If the object exists and is not a file the error EISDIR is returned
0704 under Unix.
0705
0706 .. Note::
0707
0708 The packing of parameters is very inefficient and appears to
0709 indicate confusion between the system call creat and the VFS operation
0710 create. The VFS operation create is only called to create new objects.
0711 This create call differs from the Unix one in that it is not invoked
0712 to return a file descriptor. The truncate and exclusive options,
0713 together with the mode, could simply be part of the mode as it is
under Unix. There should be no flags argument; this is used in open(2)
to return a file descriptor for READ or WRITE mode.
0716
0717 The attributes of the directory should be returned too, since the size
0718 and mtime changed.
0719
0720
0721 4.9. mkdir
0722 -----------
0723
0724
0725 Summary
0726 Create a new directory.
0727
Arguments
    in::

        struct cfs_mkdir_in {
            ViceFid VFid;
            struct coda_vattr attr;
            char *name;         /* Place holder for data. */
        } cfs_mkdir;

    out::

        struct cfs_mkdir_out {
            ViceFid VFid;
            struct coda_vattr attr;
        } cfs_mkdir;
0745
0746
0747
0748
0749 Description
0750 This call is similar to create but creates a directory.
0751 Only the mode field in the input parameters is used for creation.
0752 Upon successful creation, the attr returned contains the attributes of
0753 the new directory.
0754
0755 Errors
0756 As for create.
0757
0758 .. Note::
0759
0760 The input parameter should be changed to mode instead of
0761 attributes.
0762
The attributes of the parent should be returned since the size and
mtime change.
0765
0766
0767 4.10. link
0768 -----------
0769
0770
0771 Summary
0772 Create a link to an existing file.
0773
Arguments
    in::

        struct cfs_link_in {
            ViceFid sourceFid;  /* cnode to link *to* */
            ViceFid destFid;    /* Directory in which to place link */
            char *tname;        /* Place holder for data. */
        } cfs_link;

    out

        empty

Description
    This call creates a link to the sourceFid in the directory
    identified by destFid with name tname. The source must reside in the
    target's parent, i.e. the source must have parent destFid; Coda
    does not support cross directory hard links. Only the return value is
    relevant. It indicates success or the type of failure.
0795
0796 Errors
0797 The usual errors can occur.
0798
0799
0800 4.11. symlink
0801 --------------
0802
0803
Summary
    Create a symbolic link.

Arguments
    in::

        struct cfs_symlink_in {
            ViceFid VFid;       /* Directory to put symlink in */
            char *srcname;
            struct coda_vattr attr;
            char *tname;
        } cfs_symlink;

    out

        none
0822
0823 Description
0824 Create a symbolic link. The link is to be placed in the
0825 directory identified by VFid and named tname. It should point to the
0826 pathname srcname. The attributes of the newly created object are to
0827 be set to attr.
0828
0829 .. Note::
0830
0831 The attributes of the target directory should be returned since
0832 its size changed.
0833
0834
0835 4.12. remove
0836 -------------
0837
0838
0839 Summary
0840 Remove a file
0841
Arguments
    in::

        struct cfs_remove_in {
            ViceFid VFid;
            char *name;         /* Place holder for data. */
        } cfs_remove;

    out

        none
0855
0856 Description
0857 Remove file named cfs_remove_in.name in directory
0858 identified by VFid.
0859
0860
0861 .. Note::
0862
0863 The attributes of the directory should be returned since its
0864 mtime and size may change.
0865
0866
0867 4.13. rmdir
0868 ------------
0869
0870
0871 Summary
0872 Remove a directory
0873
Arguments
    in::

        struct cfs_rmdir_in {
            ViceFid VFid;
            char *name;         /* Place holder for data. */
        } cfs_rmdir;

    out

        none
0887
0888 Description
0889 Remove the directory with name 'name' from the directory
0890 identified by VFid.
0891
0892 .. Note:: The attributes of the parent directory should be returned since
0893 its mtime and size may change.
0894
0895
0896 4.14. readlink
0897 ---------------
0898
0899
0900 Summary
0901 Read the value of a symbolic link.
0902
Arguments
    in::

        struct cfs_readlink_in {
            ViceFid VFid;
        } cfs_readlink;

    out::

        struct cfs_readlink_out {
            int count;
            caddr_t data;       /* Place holder for data. */
        } cfs_readlink;

Description
    This routine reads the contents of the symbolic link
    identified by VFid into the buffer data. The buffer data must be able
    to hold any name up to CFS_MAXNAMLEN (PATH or NAM??).
0925
0926 Errors
0927 No unusual errors.
0928
0929
0930 4.15. open
0931 -----------
0932
0933
0934 Summary
0935 Open a file.
0936
Arguments
    in::

        struct cfs_open_in {
            ViceFid VFid;
            int flags;
        } cfs_open;

    out::

        struct cfs_open_out {
            dev_t dev;
            ino_t inode;
        } cfs_open;
0953
0954
0955
0956 Description
0957 This request asks Venus to place the file identified by
0958 VFid in its cache and to note that the calling process wishes to open
0959 it with flags as in open(2). The return value to the kernel differs
0960 for Unix and Windows systems. For Unix systems the Coda FS Driver is
0961 informed of the device and inode number of the container file in the
0962 fields dev and inode. For Windows the path of the container file is
0963 returned to the kernel.
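
For the Unix case, the pair (dev, inode) is what ties the reply to a concrete
container file. As a rough user-space illustration of that identification,
the following sketch checks whether a path refers to the object named by such
a pair using ``stat``. The helper name ``is_container`` is hypothetical, and
a kernel driver would locate the inode directly rather than go through a
path.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if path refers to the object identified by (dev, inode),
 * 0 if it refers to some other object, and -1 if stat() fails. */
static int is_container(const char *path, dev_t dev, ino_t inode)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return st.st_dev == dev && st.st_ino == inode;
}
```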
0964
0965
0966 .. Note::
0967
0968 Currently the cfs_open_out structure is not properly adapted to
0969 deal with the Windows case. It might be best to implement two
0970 upcalls, one to open aiming at a container file name, the other at a
0971 container file inode.
0972
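For the Unix case, the reply identifies the container file by device and
inode number. The sketch below shows one way a user-space cache manager
could obtain that pair with stat(2). It is an illustration only:
example_open_reply, example_fill_open_reply and the container path argument
are hypothetical names, and nothing here is taken from the Venus sources.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Mirrors the two fields of cfs_open_out shown above. */
struct example_open_reply {
    dev_t dev;
    ino_t inode;
};

/*
 * Fill the reply with the identity of the container file at
 * container_path (a hypothetical path inside the client cache).
 * Returns 0 on success or a negative errno value.
 */
int example_fill_open_reply(const char *container_path,
                            struct example_open_reply *out)
{
    struct stat st;

    if (stat(container_path, &st) != 0)
        return -errno;
    out->dev = st.st_dev;
    out->inode = st.st_ino;
    return 0;
}
```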
0973
0974 4.16. close
0975 ------------
0976
0977
0978 Summary
Close a file and update it on the servers.
0980
0981 Arguments
0982 in::
0983
0984 struct cfs_close_in {
0985 ViceFid VFid;
0986 int flags;
0987 } cfs_close;
0988
0989
0990
0991 out
0992
0993 none
0994
0995 Description
0996 Close the file identified by VFid.
0997
0998 .. Note::
0999
The flags argument is bogus and not used. However, Venus' code
has room to deal with an execp input field. This field should probably
be used to inform Venus that the file was closed but is still memory
mapped for execution. There are comments about fetching versus not
fetching the data in Venus vproc_vfscalls. This seems silly: if a
file is being closed, the data in the container file is the new
data. Here again the execp flag might be in play to create confusion:
currently Venus might think a file can be flushed from the cache when
it is still memory mapped. This needs to be understood.
1009
1010
1011 4.17. ioctl
1012 ------------
1013
1014
1015 Summary
1016 Do an ioctl on a file. This includes the pioctl interface.
1017
1018 Arguments
1019 in::
1020
1021 struct cfs_ioctl_in {
1022 ViceFid VFid;
1023 int cmd;
1024 int len;
1025 int rwflag;
1026 char *data; /* Place holder for data. */
1027 } cfs_ioctl;
1028
1029
1030
1031 out::
1032
1033
1034 struct cfs_ioctl_out {
1035 int len;
1036 caddr_t data; /* Place holder for data. */
1037 } cfs_ioctl;
1038
1039
1040
1041 Description
Do an ioctl operation on a file. The cmd, len and data arguments
are filled as usual. The flags argument (rwflag in the structure
above) is not used by Venus.
1044
1045 .. Note::
1046
Another bogus parameter: flags is not used. What is the
business about PREFETCHING in the Venus code?
1049
1050
1051
1052 4.18. rename
1053 -------------
1054
1055
1056 Summary
Rename an object in a directory.
1058
1059 Arguments
1060 in::
1061
1062 struct cfs_rename_in {
1063 ViceFid sourceFid;
1064 char *srcname;
1065 ViceFid destFid;
1066 char *destname;
1067 } cfs_rename;
1068
1069
1070
1071 out
1072
1073 none
1074
1075 Description
Rename the object with name srcname in directory
sourceFid to destname in destFid. It is important that the names
srcname and destname are null-terminated strings, because strings in
Unix kernels are not always null terminated.
1080
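Because kernel names are often stored as a pointer plus a length, a driver
has to terminate them explicitly before placing them in the message. A
minimal sketch of that step follows; example_copy_name is a hypothetical
helper written for this document, not a function from any Coda driver.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy a counted name (len bytes, possibly without a trailing 0) into
 * buf and terminate it. Returns 0 on success, -1 if buf is too small.
 */
int example_copy_name(char *buf, size_t bufsize,
                      const char *name, size_t len)
{
    /* Written this way so that len + 1 cannot overflow. */
    if (bufsize == 0 || len > bufsize - 1)
        return -1;
    memcpy(buf, name, len);
    buf[len] = '\0';
    return 0;
}
```

The same helper would be applied to both srcname and destname before the
upcall is queued.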
1081
1082 4.19. readdir
1083 --------------
1084
1085
1086 Summary
1087 Read directory entries.
1088
1089 Arguments
1090 in::
1091
1092 struct cfs_readdir_in {
1093 ViceFid VFid;
1094 int count;
1095 int offset;
1096 } cfs_readdir;
1097
1098
1099
1100
1101 out::
1102
1103 struct cfs_readdir_out {
1104 int size;
1105 caddr_t data; /* Place holder for data. */
1106 } cfs_readdir;
1107
1108
1109
1110 Description
Read at most count bytes of directory entries from VFid,
starting at offset. The entries are returned in data and their
total size is returned in size.
1114
1115
1116 .. Note::
1117
1118 This call is not used. Readdir operations exploit container
1119 files. We will re-evaluate this during the directory revamp which is
1120 about to take place.
1121
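The offset and count semantics described above amount to a bounded copy
out of the directory data. The sketch below models this on a plain byte
buffer; example_read_dir_bytes and its parameters are assumptions for
illustration and say nothing about the actual on-disk directory format.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy at most count bytes of directory data starting at offset into
 * out. Returns the number of bytes copied (the "size" of the reply),
 * which is 0 when offset is at or past the end of the data.
 */
size_t example_read_dir_bytes(const char *dir, size_t dirsize,
                              size_t offset, size_t count, char *out)
{
    size_t avail;

    if (offset >= dirsize)
        return 0;
    avail = dirsize - offset;
    if (count > avail)
        count = avail;
    memcpy(out, dir + offset, count);
    return count;
}
```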
1122
1123 4.20. vget
1124 -----------
1125
1126
1127 Summary
Instruct Venus to do an FSDB->Get.
1129
1130 Arguments
1131 in::
1132
1133 struct cfs_vget_in {
1134 ViceFid VFid;
1135 } cfs_vget;
1136
1137
1138
1139 out::
1140
1141 struct cfs_vget_out {
1142 ViceFid VFid;
1143 int vtype;
1144 } cfs_vget;
1145
1146
1147
1148 Description
1149 This upcall asks Venus to do a get operation on an fsobj
1150 labelled by VFid.
1151
1152 .. Note::
1153
1154 This operation is not used. However, it is extremely useful
1155 since it can be used to deal with read/write memory mapped files.
1156 These can be "pinned" in the Venus cache using vget and released with
1157 inactive.
1158
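The pinning idea in the note can be pictured as a simple counter: vget
takes a pin, inactive drops one, and an object is eligible for flushing only
when no pins remain. This is a sketch under that assumption; example_fsobj
and the three functions are hypothetical and much simpler than whatever
bookkeeping Venus really performs.

```c
/* Hypothetical per-object state; a real fsobj holds far more. */
struct example_fsobj {
    int pins;   /* outstanding vget pins */
};

/* vget: pin the object in the cache. */
void example_vget(struct example_fsobj *f)
{
    f->pins++;
}

/* inactive: drop one pin. Returns the remaining pin count. */
int example_inactive(struct example_fsobj *f)
{
    if (f->pins > 0)
        f->pins--;
    return f->pins;
}

/* An object may be flushed only when nothing pins it. */
int example_can_flush(const struct example_fsobj *f)
{
    return f->pins == 0;
}
```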
1159
1160 4.21. fsync
1161 ------------
1162
1163
1164 Summary
1165 Tell Venus to update the RVM attributes of a file.
1166
1167 Arguments
1168 in::
1169
1170 struct cfs_fsync_in {
1171 ViceFid VFid;
1172 } cfs_fsync;
1173
1174
1175
1176 out
1177
1178 none
1179
1180 Description
1181 Ask Venus to update RVM attributes of object VFid. This
1182 should be called as part of kernel level fsync type calls. The
1183 result indicates if the syncing was successful.
1184
1185 .. Note:: Linux does not implement this call. It should.
1186
1187
1188 4.22. inactive
1189 ---------------
1190
1191
1192 Summary
1193 Tell Venus a vnode is no longer in use.
1194
1195 Arguments
1196 in::
1197
1198 struct cfs_inactive_in {
1199 ViceFid VFid;
1200 } cfs_inactive;
1201
1202
1203
1204 out
1205
1206 none
1207
1208 Description
1209 This operation returns EOPNOTSUPP.
1210
1211 .. Note:: This should perhaps be removed.
1212
1213
1214 4.23. rdwr
1215 -----------
1216
1217
1218 Summary
Read from or write to a file.
1220
1221 Arguments
1222 in::
1223
1224 struct cfs_rdwr_in {
1225 ViceFid VFid;
1226 int rwflag;
1227 int count;
1228 int offset;
1229 int ioflag;
1230 caddr_t data; /* Place holder for data. */
1231 } cfs_rdwr;
1232
1233
1234
1235
1236 out::
1237
1238 struct cfs_rdwr_out {
1239 int rwflag;
1240 int count;
1241 caddr_t data; /* Place holder for data. */
1242 } cfs_rdwr;
1243
1244
1245
1246 Description
This upcall asks Venus to read from or write to a file.
1248
1249
1250 .. Note::
1251
This call should be removed, since it violates the Coda philosophy that
read/write operations never reach Venus. I have been told the
operation does not work. It is not currently used.
1255
1256
1257
1258 4.24. odymount
1259 ---------------
1260
1261
1262 Summary
1263 Allows mounting multiple Coda "filesystems" on one Unix mount point.
1264
1265 Arguments
1266 in::
1267
1268 struct ody_mount_in {
1269 char *name; /* Place holder for data. */
1270 } ody_mount;
1271
1272
1273
1274 out::
1275
1276 struct ody_mount_out {
1277 ViceFid VFid;
1278 } ody_mount;
1279
1280
1281
1282 Description
1283 Asks Venus to return the rootfid of a Coda system named
1284 name. The fid is returned in VFid.
1285
1286 .. Note::
1287
1288 This call was used by David for dynamic sets. It should be
1289 removed since it causes a jungle of pointers in the VFS mounting area.
1290 It is not used by Coda proper. Call is not implemented by Venus.
1291
1292
1293 4.25. ody_lookup
1294 -----------------
1295
1296
1297 Summary
1298 Looks up something.
1299
1300 Arguments
1301 in
1302
1303 irrelevant
1304
1305
1306 out
1307
1308 irrelevant
1309
1310
1311 .. Note:: Gut it. Call is not implemented by Venus.
1312
1313
1314 4.26. ody_expand
1315 -----------------
1316
1317
1318 Summary
Expands something in a dynamic set.
1320
1321 Arguments
1322 in
1323
1324 irrelevant
1325
1326 out
1327
1328 irrelevant
1329
1330 .. Note:: Gut it. Call is not implemented by Venus.
1331
1332
1333 4.27. prefetch
1334 ---------------
1335
1336
1337 Summary
1338 Prefetch a dynamic set.
1339
1340 Arguments
1341
1342 in
1343
1344 Not documented.
1345
1346 out
1347
1348 Not documented.
1349
1350 Description
Venus worker.cc has support for this call, although it is
noted that it doesn't work. This is not surprising, since the kernel
does not have support for it (ODY_PREFETCH is not a defined operation).
1354
1355
1356 .. Note:: Gut it. It isn't working and isn't used by Coda.
1357
1358
1359
1360 4.28. signal
1361 -------------
1362
1363
1364 Summary
1365 Send Venus a signal about an upcall.
1366
1367 Arguments
1368 in
1369
1370 none
1371
1372 out
1373
1374 not applicable.
1375
1376 Description
1377 This is an out-of-band upcall to Venus to inform Venus
1378 that the calling process received a signal after Venus read the
1379 message from the input queue. Venus is supposed to clean up the
1380 operation.
1381
1382 Errors
1383 No reply is given.
1384
1385 .. Note::
1386
We need to better understand what Venus needs to clean up and whether
it is doing this correctly. We also need to handle correctly the
situations in which one system call makes multiple upcalls. It would be
important to know which state changes in Venus after an upcall require
the kernel to notify Venus so that it can clean up (e.g. open
definitely is such a state change, but many others may not be).
1393
1394
1395 5. The minicache and downcalls
1396 ===============================
1397
1398
1399 The Coda FS Driver can cache results of lookup and access upcalls, to
1400 limit the frequency of upcalls. Upcalls carry a price since a process
context switch needs to take place. The flip side of caching this
information is that Venus must notify the FS Driver when cached
entries have to be flushed or renamed.
1404
1405 The kernel code generally has to maintain a structure which links the
1406 internal file handles (called vnodes in BSD, inodes in Linux and
1407 FileHandles in Windows) with the ViceFid's which Venus maintains. The
1408 reason is that frequent translations back and forth are needed in
1409 order to make upcalls and use the results of upcalls. Such linking
1410 objects are called cnodes.
1411
1412 The current minicache implementations have cache entries which record
1413 the following:
1414
1415 1. the name of the file
1416
1417 2. the cnode of the directory containing the object
1418
3. a list of CodaCred's for which the lookup is permitted
1420
1421 4. the cnode of the object
1422
1423 The lookup call in the Coda FS Driver may request the cnode of the
1424 desired object from the cache, by passing its name, directory and the
1425 CodaCred's of the caller. The cache will return the cnode or indicate
1426 that it cannot be found. The Coda FS Driver must be careful to
1427 invalidate cache entries when it modifies or removes objects.
1428
1429 When Venus obtains information that indicates that cache entries are
1430 no longer valid, it will make a downcall to the kernel. Downcalls are
1431 intercepted by the Coda FS Driver and lead to cache invalidations of
1432 the kind described below. The Coda FS Driver does not return an error
1433 unless the downcall data could not be read into kernel memory.
1434
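The four-field cache entry and the lookup request described earlier in this
section can be pictured with the following user-space sketch. All names
(example_entry, example_cnode, example_cred, example_cache_lookup) and the
flat array layout are assumptions made for illustration; real minicache
implementations differ per operating system.

```c
#include <stddef.h>
#include <string.h>

#define EXAMPLE_MAXCREDS 4

struct example_cnode { int id; };            /* stand-in for a real cnode */
struct example_cred  { unsigned int uid; };  /* stand-in for CodaCred */

struct example_entry {
    const char           *name;                     /* 1. name of the file */
    struct example_cnode *dir;                      /* 2. containing directory */
    struct example_cred   creds[EXAMPLE_MAXCREDS];  /* 3. permitted creds */
    int                   ncreds;
    struct example_cnode *obj;                      /* 4. the object itself */
};

/*
 * Return the cached cnode for (dir, name) if cred is permitted, or
 * NULL to indicate that the entry cannot be found and an upcall is
 * needed.
 */
struct example_cnode *
example_cache_lookup(const struct example_entry *cache, size_t n,
                     const struct example_cnode *dir, const char *name,
                     const struct example_cred *cred)
{
    size_t i;
    int j;

    for (i = 0; i < n; i++) {
        const struct example_entry *e = &cache[i];

        if (e->dir != dir || strcmp(e->name, name) != 0)
            continue;
        for (j = 0; j < e->ncreds; j++)
            if (e->creds[j].uid == cred->uid)
                return e->obj;
    }
    return NULL;
}
```

A miss for a known name but an unknown credential is deliberate here: the
entry records only the credentials for which a lookup has already been
permitted, so any other caller still needs an upcall.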
1435
1436 5.1. INVALIDATE
1437 ----------------
1438
1439
1440 No information is available on this call.
1441
1442
1443 5.2. FLUSH
1444 -----------
1445
1446
1447
1448 Arguments
1449 None
1450
1451 Summary
1452 Flush the name cache entirely.
1453
1454 Description
1455 Venus issues this call upon startup and when it dies. This
1456 is to prevent stale cache information being held. Some operating
1457 systems allow the kernel name cache to be switched off dynamically.
1458 When this is done, this downcall is made.
1459
1460
1461 5.3. PURGEUSER
1462 ---------------
1463
1464
1465 Arguments
1466 ::
1467
1468 struct cfs_purgeuser_out {/* CFS_PURGEUSER is a venus->kernel call */
1469 struct CodaCred cred;
1470 } cfs_purgeuser;
1471
1472
1473
1474 Description
1475 Remove all entries in the cache carrying the Cred. This
1476 call is issued when tokens for a user expire or are flushed.
1477
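A minimal sketch of this downcall, assuming a flat array of entries that
each carry a single credential (the entries described earlier hold a list;
one credential keeps the example short). The types and the function
example_purgeuser are hypothetical.

```c
#include <stddef.h>

struct example_cred { unsigned int uid; };   /* stand-in for CodaCred */

struct example_entry {
    struct example_cred cred;   /* credential the entry was cached for */
    int valid;                  /* nonzero while the entry is usable */
};

/* Invalidate every entry carrying cred; returns how many were removed. */
size_t example_purgeuser(struct example_entry *cache, size_t n,
                         const struct example_cred *cred)
{
    size_t i, removed = 0;

    for (i = 0; i < n; i++) {
        if (cache[i].valid && cache[i].cred.uid == cred->uid) {
            cache[i].valid = 0;
            removed++;
        }
    }
    return removed;
}
```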
1478
1479 5.4. ZAPFILE
1480 -------------
1481
1482
1483 Arguments
1484 ::
1485
1486 struct cfs_zapfile_out { /* CFS_ZAPFILE is a venus->kernel call */
1487 ViceFid CodaFid;
1488 } cfs_zapfile;
1489
1490
1491
1492 Description
1493 Remove all entries which have the (dir vnode, name) pair.
1494 This is issued as a result of an invalidation of cached attributes of
1495 a vnode.
1496
1497 .. Note::
1498
1499 Call is not named correctly in NetBSD and Mach. The minicache
1500 zapfile routine takes different arguments. Linux does not implement
1501 the invalidation of attributes correctly.
1502
1503
1504
1505 5.5. ZAPDIR
1506 ------------
1507
1508
1509 Arguments
1510 ::
1511
1512 struct cfs_zapdir_out { /* CFS_ZAPDIR is a venus->kernel call */
1513 ViceFid CodaFid;
1514 } cfs_zapdir;
1515
1516
1517
1518 Description
1519 Remove all entries in the cache lying in a directory
1520 CodaFid, and all children of this directory. This call is issued when
1521 Venus receives a callback on the directory.
1522
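The effect of this downcall can be sketched as a scan that drops every
entry whose containing directory matches CodaFid. The example treats
"entries lying in the directory" and "children of the directory" as the
same set, which is an assumption; example_fid is a stand-in for ViceFid
with assumed field names.

```c
#include <stddef.h>

/* Stand-in for ViceFid; the field names are assumptions. */
struct example_fid { unsigned long volume, vnode, unique; };

struct example_entry {
    struct example_fid dir;   /* fid of the containing directory */
    struct example_fid obj;   /* fid of the cached object */
    int valid;                /* nonzero while the entry is usable */
};

int example_fid_eq(const struct example_fid *a, const struct example_fid *b)
{
    return a->volume == b->volume && a->vnode == b->vnode &&
           a->unique == b->unique;
}

/* Invalidate every entry lying in directory dir; returns the count. */
size_t example_zapdir(struct example_entry *cache, size_t n,
                      const struct example_fid *dir)
{
    size_t i, removed = 0;

    for (i = 0; i < n; i++) {
        if (cache[i].valid && example_fid_eq(&cache[i].dir, dir)) {
            cache[i].valid = 0;
            removed++;
        }
    }
    return removed;
}
```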
1523
1524 5.6. ZAPVNODE
1525 --------------
1526
1527
1528
1529 Arguments
1530 ::
1531
1532 struct cfs_zapvnode_out { /* CFS_ZAPVNODE is a venus->kernel call */
1533 struct CodaCred cred;
1534 ViceFid VFid;
1535 } cfs_zapvnode;
1536
1537
1538
1539 Description
1540 Remove all entries in the cache carrying the cred and VFid
1541 as in the arguments. This downcall is probably never issued.
1542
1543
1544 5.7. PURGEFID
1545 --------------
1546
1547
1548 Arguments
1549 ::
1550
1551 struct cfs_purgefid_out { /* CFS_PURGEFID is a venus->kernel call */
1552 ViceFid CodaFid;
1553 } cfs_purgefid;
1554
1555
1556
1557 Description
1558 Flush the attribute for the file. If it is a dir (odd
1559 vnode), purge its children from the namecache and remove the file from the
1560 namecache.
1561
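The following sketch shows one reading of the description: the object's
attributes are flushed and its own entry is removed in every case, and the
children are removed as well when the vnode number is odd. That reading,
the example_fid stand-in for ViceFid and the flat cache layout are all
assumptions made for this example.

```c
#include <stddef.h>

/* Stand-in for ViceFid; the field names are assumptions. */
struct example_fid { unsigned long volume, vnode, unique; };

struct example_entry {
    struct example_fid dir;   /* fid of the containing directory */
    struct example_fid obj;   /* fid of the cached object */
    int attr_valid;           /* cached attributes usable */
    int valid;                /* name cache entry usable */
};

int example_fid_eq(const struct example_fid *a, const struct example_fid *b)
{
    return a->volume == b->volume && a->vnode == b->vnode &&
           a->unique == b->unique;
}

/* Directories are recognised by an odd vnode number. */
int example_is_dir(const struct example_fid *f)
{
    return (f->vnode & 1) != 0;
}

void example_purgefid(struct example_entry *cache, size_t n,
                      const struct example_fid *fid)
{
    size_t i;
    int is_dir = example_is_dir(fid);

    for (i = 0; i < n; i++) {
        if (example_fid_eq(&cache[i].obj, fid)) {
            cache[i].attr_valid = 0;   /* flush the attributes */
            cache[i].valid = 0;        /* remove the object itself */
        } else if (is_dir && example_fid_eq(&cache[i].dir, fid)) {
            cache[i].valid = 0;        /* remove a child of the directory */
        }
    }
}
```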
1562
1563
1564 5.8. REPLACE
1565 -------------
1566
1567
1568 Summary
1569 Replace the Fid's for a collection of names.
1570
1571 Arguments
1572 ::
1573
struct cfs_replace_out { /* CFS_REPLACE is a venus->kernel call */
1575 ViceFid NewFid;
1576 ViceFid OldFid;
1577 } cfs_replace;
1578
1579
1580
1581 Description
This routine replaces a ViceFid in the name cache with
another. It was added to allow Venus, during reintegration, to replace
temporary fids that were allocated locally while disconnected with
global fids, even when the reference counts on those fids are not zero.
1586
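A sketch of the replacement, assuming a flat cache in which a fid can
appear either as the object or as the containing directory of an entry.
The refcount field is there only to show that in-use entries are rewritten
too. example_fid stands in for ViceFid, and every name in the block is
hypothetical.

```c
#include <stddef.h>

/* Stand-in for ViceFid; the field names are assumptions. */
struct example_fid { unsigned long volume, vnode, unique; };

struct example_entry {
    struct example_fid dir;   /* fid of the containing directory */
    struct example_fid obj;   /* fid of the cached object */
    int refcount;             /* users of the entry; not consulted here */
};

int example_fid_eq(const struct example_fid *a, const struct example_fid *b)
{
    return a->volume == b->volume && a->vnode == b->vnode &&
           a->unique == b->unique;
}

/*
 * Rewrite every occurrence of oldfid to newfid, regardless of the
 * reference count. Returns the number of fids rewritten.
 */
size_t example_replace(struct example_entry *cache, size_t n,
                       const struct example_fid *oldfid,
                       const struct example_fid *newfid)
{
    size_t i, replaced = 0;

    for (i = 0; i < n; i++) {
        if (example_fid_eq(&cache[i].dir, oldfid)) {
            cache[i].dir = *newfid;
            replaced++;
        }
        if (example_fid_eq(&cache[i].obj, oldfid)) {
            cache[i].obj = *newfid;
            replaced++;
        }
    }
    return replaced;
}
```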
1587
1588 6. Initialization and cleanup
1589 ==============================
1590
1591
1592 This section gives brief hints as to desirable features for the Coda
1593 FS Driver at startup and upon shutdown or Venus failures. Before
1594 entering the discussion it is useful to repeat that the Coda FS Driver
1595 maintains the following data:
1596
1597
1598 1. message queues
1599
1600 2. cnodes
1601
1602 3. name cache entries
1603
1604 The name cache entries are entirely private to the driver, so they
1605 can easily be manipulated. The message queues will generally have
1606 clear points of initialization and destruction. The cnodes are
1607 much more delicate. User processes hold reference counts in Coda
1608 filesystems and it can be difficult to clean up the cnodes.
1609
The driver can expect requests through:
1611
1612 1. the message subsystem
1613
1614 2. the VFS layer
1615
3. the pioctl interface
1617
1618 Currently the pioctl passes through the VFS for Coda so we can
1619 treat these similarly.
1620
1621
1622 6.1. Requirements
1623 ------------------
1624
1625
1626 The following requirements should be accommodated:
1627
1. The message queues should have open and close routines. On Unix
the opening and closing of the character devices are such routines.
1630
1631 - Before opening, no messages can be placed.
1632
1633 - Opening will remove any old messages still pending.
1634
1635 - Close will notify any sleeping processes that their upcall cannot
1636 be completed.
1637
1638 - Close will free all memory allocated by the message queues.
1639
1640
1641 2. At open the namecache shall be initialized to empty state.
1642
3. Before the message queues are open, all VFS operations will fail.
Fortunately this can be achieved by making sure that mounting the
Coda filesystem cannot succeed before opening.
1646
1647 4. After closing of the queues, no VFS operations can succeed. Here
1648 one needs to be careful, since a few operations (lookup,
1649 read/write, readdir) can proceed without upcalls. These must be
1650 explicitly blocked.
1651
1652 5. Upon closing the namecache shall be flushed and disabled.
1653
1654 6. All memory held by cnodes can be freed without relying on upcalls.
1655
1656 7. Unmounting the file system can be done without relying on upcalls.
1657
1658 8. Mounting the Coda filesystem should fail gracefully if Venus cannot
1659 get the rootfid or the attributes of the rootfid. The latter is
1660 best implemented by Venus fetching these objects before attempting
1661 to mount.
1662
1663 .. Note::
1664
NetBSD in particular, but also Linux, has not implemented the
above requirements fully. For smooth operation this needs to be
corrected.
1668
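The message queue requirements in item 1 can be pictured as a small state
machine. The sketch below is a user-space model under simplifying
assumptions: messages are only counted, sleeping processes are not
modelled, and the error codes are chosen for illustration. None of the
names come from an actual Coda driver.

```c
#include <errno.h>

#define EXAMPLE_QUEUE_MAX 8

struct example_queue {
    int is_open;    /* set between open and close */
    int npending;   /* messages waiting for Venus */
};

/* Open: discard any stale messages and start accepting new ones. */
void example_queue_open(struct example_queue *q)
{
    q->npending = 0;
    q->is_open = 1;
}

/*
 * Place a message. Fails with -ENODEV while the queue is closed and
 * with -ENOSPC when it is full; otherwise returns the slot used.
 */
int example_queue_put(struct example_queue *q)
{
    if (!q->is_open)
        return -ENODEV;
    if (q->npending == EXAMPLE_QUEUE_MAX)
        return -ENOSPC;
    return q->npending++;
}

/*
 * Close: stop accepting messages and drop everything still pending.
 * Returns the number of upcalls that can no longer be completed, i.e.
 * the number of waiters a real driver would have to wake up.
 */
int example_queue_close(struct example_queue *q)
{
    int aborted = q->npending;

    q->npending = 0;
    q->is_open = 0;
    return aborted;
}
```

A real driver would also free the queue memory on close and mark the
filesystem so that operations which need no upcall are blocked as well,
as items 4 and 5 require.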
1669
1670