.. SPDX-License-Identifier: GPL-2.0

Orphan file
-----------

In Unix there can be inodes that are unlinked from the directory hierarchy but
are still alive because they are open. In case of a crash the filesystem has to
clean up these inodes, as otherwise they (and the blocks referenced from them)
would leak. Similarly, if we truncate or extend a file, we may not be able
to perform the operation in a single journalling transaction. In such a case we
track the inode as an orphan so that in case of a crash the extra blocks
allocated to the file get truncated.

Traditionally ext4 tracks orphan inodes in the form of a singly linked list,
where the superblock contains the inode number of the last orphan inode
(s_last_orphan field) and each inode contains the inode number of the
previously orphaned inode (we overload the i_dtime inode field for this).
However, this filesystem-global singly linked list is a scalability bottleneck
for workloads that result in heavy creation of orphan inodes. When the orphan
file feature (COMPAT_ORPHAN_FILE) is enabled, the filesystem has a special
inode (referenced from the superblock through s_orphan_file_inum) with several
blocks. Each of these blocks has the following structure:

============= ================ =============== ===============================
Offset        Type             Name            Description
============= ================ =============== ===============================
0x0           Array of         Orphan inode    Each __le32 entry is either
              __le32 entries   entries         empty (0) or it contains
                                               inode number of an orphan
                                               inode.
blocksize-8   __le32           ob_magic        Magic value stored in orphan
                                               block tail (0x0b10ca04)
blocksize-4   __le32           ob_checksum     Checksum of the orphan block.
============= ================ =============== ===============================
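
The block layout in the table can be illustrated with a small userspace
sketch. It is not the kernel implementation: the helper names
(``get_le32``, ``orphan_slots_per_block``, ``orphan_block_scan``) are
hypothetical, and the checksum is deliberately left unverified because the
checksum algorithm is not described here. The sketch only assumes what the
table states: ``__le32`` entries from offset 0, the magic at ``blocksize-8``
and the checksum at ``blocksize-4``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ORPHAN_BLOCK_MAGIC 0x0b10ca04u  /* ob_magic value from the table */

/* Decode a little-endian 32-bit value from raw bytes. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Entry slots per block: everything except the 8-byte tail. */
static size_t orphan_slots_per_block(size_t blocksize)
{
    assert(blocksize >= 8 && blocksize % 4 == 0);
    return (blocksize - 8) / 4;
}

/*
 * Hypothetical helper: copy the non-empty entries of one orphan block
 * into out[] (at most max of them) and return how many were stored.
 * Returns -1 if the magic in the block tail does not match.
 * The checksum at blocksize-4 is NOT verified in this sketch.
 */
static long orphan_block_scan(const uint8_t *block, size_t blocksize,
                              uint32_t *out, size_t max)
{
    size_t slots = orphan_slots_per_block(blocksize);
    size_t stored = 0;

    if (get_le32(block + blocksize - 8) != ORPHAN_BLOCK_MAGIC)
        return -1;

    for (size_t i = 0; i < slots; i++) {
        uint32_t ino = get_le32(block + 4 * i);

        if (ino != 0 && stored < max)   /* 0 marks an empty slot */
            out[stored++] = ino;
    }
    return (long)stored;
}
```

With a 1024-byte block, for example, this arithmetic gives
(1024 - 8) / 4 = 254 entry slots per block.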

When a filesystem with the orphan file feature is mounted writable, we set the
RO_COMPAT_ORPHAN_PRESENT feature in the superblock to indicate that there may
be valid orphan entries. If we see this feature when mounting the
filesystem, we read the whole orphan file and process all orphan inodes found
there as usual. When cleanly unmounting the filesystem, we remove the
RO_COMPAT_ORPHAN_PRESENT feature to avoid unnecessary scanning of the orphan
file and to make the filesystem fully compatible with older kernels.
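
The life cycle of the RO_COMPAT_ORPHAN_PRESENT feature can be sketched as
a toy model. Everything here is an assumption made for illustration:
``struct sb_features`` is a hypothetical stand-in for the superblock feature
words, the function names are invented, and the bit values are placeholders
for this sketch, not authoritative definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit values, assumed for this sketch only. */
#define COMPAT_ORPHAN_FILE        0x1000u
#define RO_COMPAT_ORPHAN_PRESENT  0x10000u

/* Hypothetical stand-in for the superblock feature words. */
struct sb_features {
    uint32_t compat;
    uint32_t ro_compat;
};

/* At mount: scan the orphan file only if entries may be present. */
static bool mount_needs_orphan_scan(const struct sb_features *sb)
{
    assert(sb != NULL);
    return (sb->compat & COMPAT_ORPHAN_FILE) &&
           (sb->ro_compat & RO_COMPAT_ORPHAN_PRESENT);
}

/* Writable mount: flag that valid orphan entries may now appear. */
static void mount_writable(struct sb_features *sb)
{
    assert(sb != NULL);
    if (sb->compat & COMPAT_ORPHAN_FILE)
        sb->ro_compat |= RO_COMPAT_ORPHAN_PRESENT;
}

/* Clean unmount: no orphans remain, so drop the flag again. */
static void unmount_clean(struct sb_features *sb)
{
    assert(sb != NULL);
    sb->ro_compat &= ~RO_COMPAT_ORPHAN_PRESENT;
}
```

In this model a crash corresponds to ``unmount_clean()`` never being called,
so the next mount still sees the flag set and scans the orphan file.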