.. SPDX-License-Identifier: GPL-2.0

===================================
File management in the Linux kernel
===================================

This document describes how locking works for files (struct file)
and for the file descriptor table (struct files_struct).

Up until 2.6.12, the file descriptor table was protected
with a lock (files->file_lock) and a reference count (files->count).
->file_lock protected accesses to all the file-related fields
of the table. ->count was used for sharing the file descriptor
table between tasks cloned with the CLONE_FILES flag. Typically
this would be the case for POSIX threads. As with the common
refcounting model in the kernel, the last task doing
a put_files_struct() frees the file descriptor (fd) table.
The files (struct file) themselves are protected using
a reference count (->f_count).

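The "last put frees the table" rule above can be sketched in user space with C11 atomics. This is only an illustration under stated assumptions: struct demo_files and the demo_* helpers are hypothetical names, not kernel APIs, and the kernel's own get/put helpers do considerably more work. A caller would pair each demo_get_files() with one demo_put_files(), and only the call that returns 1 has freed the table.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shared descriptor table; this is not the
 * kernel's struct files_struct. */
struct demo_files {
        atomic_int count;       /* number of tasks sharing this table */
};

/* Allocate a table owned by a single task (count starts at 1). */
static struct demo_files *demo_files_alloc(void)
{
        struct demo_files *files = malloc(sizeof(*files));

        if (files)
                atomic_init(&files->count, 1);
        return files;
}

/* A new sharer (think: a task cloned with CLONE_FILES) takes a reference. */
static void demo_get_files(struct demo_files *files)
{
        atomic_fetch_add(&files->count, 1);
}

/*
 * Drop one reference. Returns 1 if this call released the last reference
 * and freed the table, 0 otherwise.
 */
static int demo_put_files(struct demo_files *files)
{
        assert(atomic_load(&files->count) > 0);
        if (atomic_fetch_sub(&files->count, 1) == 1) {
                free(files);
                return 1;
        }
        return 0;
}
```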
In the new lock-free model of file descriptor management,
the reference counting is similar, but the locking is
based on RCU. The file descriptor table contains multiple
elements: the fd sets (open_fds and close_on_exec), the
array of file pointers, the sizes of the sets and the array,
etc. In order for the updates to appear atomic to
a lock-free reader, all the elements of the file descriptor
table are kept in a separate structure, struct fdtable.
files_struct contains a pointer to struct fdtable through
which the actual fd table is accessed. Initially the
fdtable is embedded in files_struct itself. On a subsequent
expansion of the fdtable, a new fdtable structure is allocated
and files->fdt points to the new structure. The old fdtable
structure is freed with RCU, so lock-free readers see either
the old fdtable or the new fdtable, making the update
appear atomic. Here are the locking rules for
the fdtable structure:

1. All references to the fdtable must be done through
   the files_fdtable() macro::

        struct fdtable *fdt;

        rcu_read_lock();

        fdt = files_fdtable(files);
        ....
        if (n <= fdt->max_fds)
                ....
        ...
        rcu_read_unlock();

   files_fdtable() uses the rcu_dereference() macro, which takes care of
   the memory barrier requirements for lock-free dereference.
   The fdtable pointer must be read within the read-side
   critical section.

2. Reading of the fdtable as described above must be protected
   by rcu_read_lock()/rcu_read_unlock().

3. For any update to the fd table, files->file_lock must
   be held.

4. To look up the file structure given an fd, a reader
   must use either the lookup_fd_rcu() or the files_lookup_fd_rcu() API. These
   take care of the barrier requirements due to lock-free lookup.

   An example::

        struct file *file;

        rcu_read_lock();
        file = lookup_fd_rcu(fd);
        if (file) {
                ...
        }
        ....
        rcu_read_unlock();

5. Handling of the file structures is special. Since the look-up
   of the fd (fget()/fget_light()) is lock-free, it is possible
   that a look-up may race with the last put() operation on the
   file structure. This is avoided by using atomic_long_inc_not_zero()
   on ->f_count::

        rcu_read_lock();
        file = files_lookup_fd_rcu(files, fd);
        if (file) {
                if (atomic_long_inc_not_zero(&file->f_count))
                        *fput_needed = 1;
                else
                        /* Didn't get the reference, someone's freed */
                        file = NULL;
        }
        rcu_read_unlock();
        ....
        return file;

   atomic_long_inc_not_zero() detects if the refcount is already zero or
   goes to zero during the increment. If it does, we fail
   fget()/fget_light().

6. Since both fdtable and file structures can be looked up
   lock-free, they must be installed using the rcu_assign_pointer()
   API. If they are looked up lock-free, rcu_dereference()
   must be used. However, it is advisable to use files_fdtable()
   and lookup_fd_rcu()/files_lookup_fd_rcu(), which take care of these issues.

7. While updating, the fdtable pointer must be looked up while
   holding files->file_lock. If ->file_lock is dropped, then
   another thread may expand the files, thereby creating a new
   fdtable and making the earlier fdtable pointer stale.

   For example::

        spin_lock(&files->file_lock);
        fd = locate_fd(files, file, start);
        if (fd >= 0) {
                /* locate_fd() may have expanded fdtable, load the ptr */
                fdt = files_fdtable(files);
                __set_open_fd(fd, fdt);
                __clear_close_on_exec(fd, fdt);
                spin_unlock(&files->file_lock);
        .....

   Since locate_fd() can drop ->file_lock (and reacquire ->file_lock),
   the fdtable pointer (fdt) must be loaded after locate_fd().
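
The increment-unless-zero step from rule 5 can be illustrated outside the kernel with a C11 compare-and-swap loop. This is a rough sketch under stated assumptions: struct demo_file, demo_inc_not_zero() and demo_put() are hypothetical names, and the loop only mimics the observable behaviour described above, not the kernel's implementation of atomic_long_inc_not_zero().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object with a reference count; stands in for ->f_count. */
struct demo_file {
        atomic_long f_count;
};

/*
 * Take a reference only if the count is not already zero.
 * Returns true on success; false means the last reference is already
 * gone and the object must not be used.
 */
static bool demo_inc_not_zero(atomic_long *count)
{
        long old = atomic_load(count);

        while (old != 0) {
                /* On failure, old is refreshed with the current value. */
                if (atomic_compare_exchange_weak(count, &old, old + 1))
                        return true;
        }
        return false;
}

/* Drop a reference. Returns true if this was the last one. */
static bool demo_put(atomic_long *count)
{
        assert(atomic_load(count) > 0);
        return atomic_fetch_sub(count, 1) == 1;
}
```

Because the increment only succeeds while the observed count is non-zero, a lookup that loses the race with the final put sees a failure instead of resurrecting an object that is about to be freed.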
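
The publish and dereference pairing from rule 6, applied to fdtable expansion, might look roughly like the following user-space sketch. Everything here is an assumption made for illustration: the demo_* names are hypothetical, a C11 release store stands in for rcu_assign_pointer(), and an acquire load stands in for rcu_dereference(). The sketch does not implement RCU grace periods, so it hands the old table back to the caller instead of freeing it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified table: only a size and an array of slots. */
struct demo_fdtable {
        unsigned int max_fds;
        void **fd;              /* array of max_fds opaque "file" pointers */
};

struct demo_files {
        _Atomic(struct demo_fdtable *) fdt;     /* current table */
};

/* Reader side: one acquire load yields a table whose fields are consistent. */
static struct demo_fdtable *demo_files_fdtable(struct demo_files *files)
{
        return atomic_load_explicit(&files->fdt, memory_order_acquire);
}

/* Build a table with nr (> 0) empty slots; NULL on allocation failure. */
static struct demo_fdtable *demo_alloc_fdtable(unsigned int nr)
{
        struct demo_fdtable *fdt = malloc(sizeof(*fdt));

        if (!fdt)
                return NULL;
        fdt->fd = calloc(nr, sizeof(*fdt->fd));
        if (!fdt->fd) {
                free(fdt);
                return NULL;
        }
        fdt->max_fds = nr;
        return fdt;
}

/*
 * Updater side (the caller is assumed to hold the update lock): copy the
 * old contents into a larger table, then publish it with a release store.
 * Returns the old table, which must not be freed until no reader can
 * still be using it, or NULL if allocation failed and nothing changed.
 */
static struct demo_fdtable *demo_expand(struct demo_files *files,
                                        unsigned int nr)
{
        struct demo_fdtable *old = demo_files_fdtable(files);
        struct demo_fdtable *new;

        assert(nr > old->max_fds);
        new = demo_alloc_fdtable(nr);
        if (!new)
                return NULL;
        memcpy(new->fd, old->fd, old->max_fds * sizeof(*old->fd));
        atomic_store_explicit(&files->fdt, new, memory_order_release);
        return old;
}

/* Free a table once it is known to be unreachable. */
static void demo_free_fdtable(struct demo_fdtable *fdt)
{
        free(fdt->fd);
        free(fdt);
}
```

A reader that loaded the pointer before the store keeps working on the old table with its old max_fds, while a reader that loads it afterwards sees the fully initialised new table. Neither can observe a new size paired with an old array, which is the reason for grouping the fields behind a single pointer.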