.. SPDX-License-Identifier: GPL-2.0

============
Fiemap Ioctl
============

The fiemap ioctl is an efficient method for userspace to get file
extent mappings. Instead of block-by-block mapping (such as bmap), fiemap
returns a list of extents.


Request Basics
--------------

A fiemap request is encoded within struct fiemap::

  struct fiemap {
      __u64 fm_start;           /* logical offset (inclusive) at
                                 * which to start mapping (in) */
      __u64 fm_length;          /* logical length of mapping which
                                 * userspace cares about (in) */
      __u32 fm_flags;           /* FIEMAP_FLAG_* flags for request (in/out) */
      __u32 fm_mapped_extents;  /* number of extents that were
                                 * mapped (out) */
      __u32 fm_extent_count;    /* size of fm_extents array (in) */
      __u32 fm_reserved;
      struct fiemap_extent fm_extents[0]; /* array of mapped extents (out) */
  };


fm_start and fm_length specify the logical range within the file
which the process would like mappings for. Extents returned mirror
those on disk - that is, the logical offset of the first returned extent
may start before fm_start, and the range covered by the last returned
extent may end after fm_start + fm_length. All offsets and lengths are
in bytes.

Certain flags to modify the way in which mappings are looked up can be
set in fm_flags. If the kernel doesn't understand some particular
flags, it will return EBADR and the contents of fm_flags will contain
the set of flags which caused the error. If the kernel is compatible
with all flags passed, the contents of fm_flags will be unmodified.
It is up to userspace to determine whether rejection of a particular
flag is fatal to its operation. This scheme is intended to allow the
fiemap interface to grow in the future without losing compatibility
with old software.
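
The EBADR handling described above can be sketched in userspace as follows.
This is only an illustration: the helper names fiemap_drop_rejected() and
fiemap_ioctl_tolerant(), and the idea of a caller-supplied "required" mask,
are assumptions made for this example and are not part of the fiemap
interface.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/fiemap.h>
#include <linux/fs.h>       /* FS_IOC_FIEMAP */

/*
 * Hypothetical helper: "rejected" is the value the kernel left in fm_flags
 * after an EBADR failure. If none of the rejected flags is required by the
 * caller, strip them from *flags and return 0; otherwise return -1.
 */
int fiemap_drop_rejected(__u32 *flags, __u32 rejected, __u32 required)
{
    if (rejected & required)
        return -1;          /* rejection is fatal for this caller */
    *flags &= ~rejected;
    return 0;
}

/*
 * Hypothetical wrapper: issue the request and, on EBADR, retry once
 * without the optional flags the kernel did not understand.
 */
int fiemap_ioctl_tolerant(int fd, struct fiemap *fm, __u32 required)
{
    __u32 requested = fm->fm_flags;

    if (ioctl(fd, FS_IOC_FIEMAP, fm) == 0)
        return 0;
    if (errno != EBADR)
        return -1;

    /* fm->fm_flags now holds the flags which caused the error. */
    if (fiemap_drop_rejected(&requested, fm->fm_flags, required) < 0) {
        errno = EBADR;
        return -1;
    }
    fm->fm_flags = requested;
    return ioctl(fd, FS_IOC_FIEMAP, fm);
}
```

A caller that needs FIEMAP_FLAG_SYNC but could live without some other flag
would pass FIEMAP_FLAG_SYNC as the required mask.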

fm_extent_count specifies the number of elements in the fm_extents[] array
that can be used to return extents. If fm_extent_count is zero, then the
fm_extents[] array is ignored (no extents will be returned), and the
fm_mapped_extents count will hold the number of extents needed in
fm_extents[] to hold the file's current mapping. Note that there is
nothing to prevent the file from changing between calls to FIEMAP.
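
One way to use the zero-count behaviour is a two-pass query: first ask only
for the number of extents, then allocate an array of that size and ask again.
The sketch below assumes the FS_IOC_FIEMAP request code from <linux/fs.h>;
fiemap_alloc() and fiemap_query() are hypothetical names chosen for this
example. Because the file may change between the two calls, the second call
can still come back with a full array.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fiemap.h>
#include <linux/fs.h>       /* FS_IOC_FIEMAP */

/* Allocate a zeroed request with room for "count" extents (caller frees). */
struct fiemap *fiemap_alloc(__u64 start, __u64 length, __u32 flags,
                            __u32 count)
{
    struct fiemap *fm;

    fm = calloc(1, sizeof(*fm) + (size_t)count * sizeof(struct fiemap_extent));
    if (!fm)
        return NULL;
    fm->fm_start = start;
    fm->fm_length = length;
    fm->fm_flags = flags;
    fm->fm_extent_count = count;
    return fm;
}

/* Two-pass query: count the extents first, then fetch them. */
struct fiemap *fiemap_query(int fd, __u64 start, __u64 length, __u32 flags)
{
    struct fiemap probe;
    struct fiemap *fm;

    /* Pass 1: fm_extent_count == 0, so only fm_mapped_extents is filled. */
    memset(&probe, 0, sizeof(probe));
    probe.fm_start = start;
    probe.fm_length = length;
    probe.fm_flags = flags;
    if (ioctl(fd, FS_IOC_FIEMAP, &probe) < 0)
        return NULL;

    /* Pass 2: allocate exactly that many slots and ask again. */
    fm = fiemap_alloc(start, length, flags, probe.fm_mapped_extents);
    if (!fm)
        return NULL;
    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
        free(fm);
        return NULL;
    }
    return fm;
}
```

Calling fiemap_query(fd, 0, FIEMAP_MAX_OFFSET, FIEMAP_FLAG_SYNC) would request
the mapping of the whole file after a sync.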

The following flags can be set in fm_flags:

FIEMAP_FLAG_SYNC
  If this flag is set, the kernel will sync the file before mapping extents.

FIEMAP_FLAG_XATTR
  If this flag is set, the extents returned will describe the inode's
  extended attribute lookup tree, instead of its data tree.


Extent Mapping
--------------

Extent information is returned within the embedded fm_extents array,
which userspace must allocate along with the fiemap structure. The
number of elements in the fm_extents[] array should be passed via
fm_extent_count. The number of extents mapped by the kernel will be
returned via fm_mapped_extents. If the number of fiemap_extent
structures allocated is less than would be required to map the requested
range, the maximum number of extents that can be mapped in the
fm_extents[] array will be returned and fm_mapped_extents will be equal
to fm_extent_count. In that case, the last extent in the array will not
complete the requested range and will not have the FIEMAP_EXTENT_LAST
flag set (see the next section on extent flags).
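
When the array fills up before the range is complete, a caller can continue
from the end of the last returned extent. The following is a minimal sketch
of such a loop; the names fiemap_next_start() and fiemap_walk() and the batch
size are assumptions made for illustration, and the loop relies on the
FIEMAP_EXTENT_LAST flag described in the extent flags list.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fiemap.h>
#include <linux/fs.h>       /* FS_IOC_FIEMAP */

#define FIEMAP_BATCH 32     /* arbitrary batch size for this sketch */

/*
 * Given a completed request, store the offset at which the next request
 * should start in *next and return 1; return 0 if nothing more is expected.
 */
int fiemap_next_start(const struct fiemap *fm, __u64 *next)
{
    const struct fiemap_extent *last;

    if (fm->fm_mapped_extents == 0)
        return 0;                       /* nothing mapped in this range */
    last = &fm->fm_extents[fm->fm_mapped_extents - 1];
    if (last->fe_flags & FIEMAP_EXTENT_LAST)
        return 0;                       /* no more extents available */
    *next = last->fe_logical + last->fe_length;
    return 1;
}

/* Visit every extent of fd in batches; returns the total count or -1. */
long fiemap_walk(int fd, void (*visit)(const struct fiemap_extent *))
{
    struct fiemap *fm;
    __u64 start = 0;
    __u32 i;
    long total = 0;
    int more = 1;

    fm = calloc(1, sizeof(*fm) + FIEMAP_BATCH * sizeof(struct fiemap_extent));
    if (!fm)
        return -1;

    while (more) {
        memset(fm, 0, sizeof(*fm));     /* reset the request header */
        fm->fm_start = start;
        fm->fm_length = FIEMAP_MAX_OFFSET - start;
        fm->fm_extent_count = FIEMAP_BATCH;
        if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
            free(fm);
            return -1;
        }
        for (i = 0; i < fm->fm_mapped_extents; i++)
            if (visit)
                visit(&fm->fm_extents[i]);
        total += fm->fm_mapped_extents;
        more = fiemap_next_start(fm, &start);
    }
    free(fm);
    return total;
}
```

Since extents mirror those on disk, the first extent of a follow-up batch
never overlaps the previous batch in this scheme, because each new request
starts exactly where the last returned extent ended.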

Each extent is described by a single fiemap_extent structure as
returned in fm_extents::

  struct fiemap_extent {
      __u64 fe_logical;  /* logical offset in bytes for the start of
                          * the extent */
      __u64 fe_physical; /* physical offset in bytes for the start
                          * of the extent */
      __u64 fe_length;   /* length in bytes for the extent */
      __u64 fe_reserved64[2];
      __u32 fe_flags;    /* FIEMAP_EXTENT_* flags for this extent */
      __u32 fe_reserved[3];
  };

All offsets and lengths are in bytes and mirror those on disk. It is valid
for an extent's logical offset to start before the request or its logical
length to extend past the request. Unless FIEMAP_EXTENT_NOT_ALIGNED is
returned, fe_logical, fe_physical, and fe_length will be aligned to the
block size of the file system. With the exception of extents flagged as
FIEMAP_EXTENT_MERGED, adjacent extents will not be merged.

The fe_flags field contains flags which describe the extent returned.
A special flag, FIEMAP_EXTENT_LAST, is always set on the last extent in
the file so that the process making fiemap calls can determine when no
more extents are available, without having to call the ioctl again.

Some flags are intentionally vague and will always be set in the
presence of other more specific flags. This way a program looking for
a general property does not have to know all existing and future flags
which imply that property.

For example, if FIEMAP_EXTENT_DATA_INLINE or FIEMAP_EXTENT_DATA_TAIL
are set, FIEMAP_EXTENT_NOT_ALIGNED will also be set. A program looking
for inline or tail-packed data can key on the specific flag. Software
which simply wants to avoid operating on non-aligned extents, however,
can just key on FIEMAP_EXTENT_NOT_ALIGNED, and not have to worry about
all present and future flags which might imply unaligned data. Note
that the opposite is not true - it would be valid for
FIEMAP_EXTENT_NOT_ALIGNED to appear alone.
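
Keying on the general flags might look like the sketch below. The function
name extent_is_plain() and the particular choice of which general flags to
test are assumptions made for this example; a real tool would pick the
properties it actually cares about.

```c
#include <linux/fiemap.h>

/*
 * Return 1 if the extent has a known, unencoded, block-aligned location.
 * Only the general flags are tested, so more specific flags (present or
 * future) that imply one of them are covered automatically.
 */
int extent_is_plain(const struct fiemap_extent *fe)
{
    const __u32 general = FIEMAP_EXTENT_UNKNOWN |
                          FIEMAP_EXTENT_ENCODED |
                          FIEMAP_EXTENT_NOT_ALIGNED;

    return (fe->fe_flags & general) == 0;
}
```

An inline extent, for instance, is rejected through FIEMAP_EXTENT_NOT_ALIGNED
without the function ever naming FIEMAP_EXTENT_DATA_INLINE.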

FIEMAP_EXTENT_LAST
  This is generally the last extent in the file. A mapping attempt past
  this extent may return nothing. Some implementations set this flag to
  indicate this extent is the last one in the range queried by the user
  (via fiemap->fm_length).

FIEMAP_EXTENT_UNKNOWN
  The location of this extent is currently unknown. This may indicate
  the data is stored on an inaccessible volume or that no storage has
  been allocated for the file yet.

FIEMAP_EXTENT_DELALLOC
  This will also set FIEMAP_EXTENT_UNKNOWN.

  Delayed allocation - while there is data for this extent, its
  physical location has not been allocated yet.

FIEMAP_EXTENT_ENCODED
  This extent does not consist of plain filesystem blocks but is
  encoded (e.g. encrypted or compressed). Reading the data in this
  extent via I/O to the block device will have undefined results.

  Note that it is *always* undefined to try to update the data
  in-place by writing to the indicated location without the
  assistance of the filesystem, or to access the data using the
  information returned by the FIEMAP interface while the filesystem
  is mounted. In other words, user applications may only read the
  extent data via I/O to the block device while the filesystem is
  unmounted, and then only if the FIEMAP_EXTENT_ENCODED flag is
  clear; user applications must not try reading or writing to the
  filesystem via the block device under any other circumstances.

FIEMAP_EXTENT_DATA_ENCRYPTED
  This will also set FIEMAP_EXTENT_ENCODED.
  The data in this extent has been encrypted by the file system.

FIEMAP_EXTENT_NOT_ALIGNED
  Extent offsets and length are not guaranteed to be block aligned.

FIEMAP_EXTENT_DATA_INLINE
  This will also set FIEMAP_EXTENT_NOT_ALIGNED.
  Data is located within a metadata block.

FIEMAP_EXTENT_DATA_TAIL
  This will also set FIEMAP_EXTENT_NOT_ALIGNED.
  Data is packed into a block with data from other files.

FIEMAP_EXTENT_UNWRITTEN
  Unwritten extent - the extent is allocated but its data has not been
  initialized. This indicates the extent's data will be all zero if read
  through the filesystem but the contents are undefined if read directly from
  the device.

FIEMAP_EXTENT_MERGED
  This will be set when a file does not support extents, i.e., it uses a
  block-based addressing scheme. Since returning an extent for each block
  back to userspace would be highly inefficient, the kernel will try to merge
  most adjacent blocks into 'extents'.


VFS -> File System Implementation
---------------------------------

File systems wishing to support fiemap must implement a ->fiemap callback on
their inode_operations structure. The fs ->fiemap call is responsible for
defining its set of supported fiemap flags, and calling a helper function on
each discovered extent::

  struct inode_operations {
         ...

         int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start,
                       u64 len);

->fiemap is passed struct fiemap_extent_info which describes the
fiemap request::

  struct fiemap_extent_info {
      unsigned int fi_flags;                  /* Flags as passed from user */
      unsigned int fi_extents_mapped;         /* Number of mapped extents */
      unsigned int fi_extents_max;            /* Size of fiemap_extent array */
      struct fiemap_extent *fi_extents_start; /* Start of fiemap_extent array */
  };

It is intended that the file system should not need to access any of this
structure directly. Filesystem handlers should be tolerant of signals and
return EINTR once a fatal signal is received.


Flag checking should be done at the beginning of the ->fiemap callback via the
fiemap_prep() helper::

  int fiemap_prep(struct inode *inode, struct fiemap_extent_info *fieinfo,
                  u64 start, u64 *len, u32 supported_flags);

The struct fieinfo should be passed in as received from ioctl_fiemap(). The
set of fiemap flags which the fs understands should be passed via
supported_flags. If fiemap_prep() finds invalid user flags, it will place the
bad values in fieinfo->fi_flags and return -EBADR. If the file system gets
-EBADR from fiemap_prep(), it should immediately exit, returning that error
back to ioctl_fiemap(). Additionally, the range is validated against the
supported maximum file size.


For each extent in the request range, the file system should call
the helper function, fiemap_fill_next_extent()::

  int fiemap_fill_next_extent(struct fiemap_extent_info *info, u64 logical,
                              u64 phys, u64 len, u32 flags);

fiemap_fill_next_extent() will use the passed values to populate the
next free extent in the fm_extents array. 'General' extent flags will
automatically be set from specific flags on behalf of the calling file
system so that the userspace API is not broken.

fiemap_fill_next_extent() returns 0 on success, and 1 when the
user-supplied fm_extents array is full. If an error is encountered
while copying the extent to user memory, -EFAULT will be returned.
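
The calling convention above can be illustrated with a small userspace model.
This is not kernel code: every type and function below (the model_ and toy
names, the MODEL_EXTENT_LAST value, the in-memory extent table) is a stand-in
invented for this sketch. The model assumes that 1 is returned as soon as the
last free slot has been consumed, and it does not model the counting mode
used when no array is supplied.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;
typedef uint32_t u32;

#define MODEL_EXTENT_LAST 0x1u          /* stand-in for FIEMAP_EXTENT_LAST */

struct model_extent { u64 logical, phys, len; u32 flags; };

struct model_extent_info {              /* stand-in for fiemap_extent_info */
    unsigned int fi_extents_mapped;
    unsigned int fi_extents_max;
    struct model_extent *fi_extents_start;
};

/* Model of the helper: 0 on success, 1 once the array is full. */
int model_fill_next_extent(struct model_extent_info *info, u64 logical,
                           u64 phys, u64 len, u32 flags)
{
    struct model_extent *dst;

    if (info->fi_extents_mapped >= info->fi_extents_max)
        return 1;
    dst = &info->fi_extents_start[info->fi_extents_mapped++];
    dst->logical = logical;
    dst->phys = phys;
    dst->len = len;
    dst->flags = flags;
    return info->fi_extents_mapped == info->fi_extents_max ? 1 : 0;
}

struct toy_extent { u64 logical, phys, len; };  /* hypothetical fs metadata */

/* Shape of a ->fiemap callback: report each extent overlapping the range. */
int toyfs_fiemap(struct model_extent_info *info, const struct toy_extent *map,
                 size_t n, u64 start, u64 len)
{
    size_t i;
    int ret;

    for (i = 0; i < n; i++) {
        if (map[i].logical + map[i].len <= start)
            continue;                   /* ends before the requested range */
        if (map[i].logical >= start + len)
            break;                      /* starts after the requested range */
        ret = model_fill_next_extent(info, map[i].logical, map[i].phys,
                                     map[i].len,
                                     i == n - 1 ? MODEL_EXTENT_LAST : 0);
        if (ret < 0)
            return ret;                 /* real helper may return -EFAULT */
        if (ret == 1)
            break;                      /* array full: stop, not an error */
    }
    return 0;
}
```

The point of the sketch is the loop shape: a return value of 1 ends the walk
without being treated as a failure, while a negative value is passed back to
the caller unchanged.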