.. _gfp_mask_from_fs_io:

=================================
GFP masks used from FS/IO context
=================================

:Date: May, 2018
:Author: Michal Hocko <mhocko@kernel.org>

Introduction
============

Code paths in the filesystem and IO stacks must be careful when
allocating memory, because direct memory reclaim can call back into
the FS or IO paths and block on resources that are already held
(e.g. locks, most commonly those used for the transaction context),
resulting in a recursion deadlock.

The traditional way to avoid this deadlock is to clear __GFP_FS or
__GFP_IO (note that clearing the latter implies clearing the former as
well) in the gfp mask when calling an allocator. GFP_NOFS and GFP_NOIO
can be used as shortcuts. This approach, however, has led to abuse:
the restricted gfp mask is often used "just in case", without deeper
consideration, and excessive use of GFP_NOFS/GFP_NOIO can lead to
memory over-reclaim or other memory reclaim issues.
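
The relationship between these masks can be sketched with a small
userspace model. This is not kernel code: the bit values below are
illustrative assumptions, ``__GFP_RECLAIM`` is simplified to a single
bit, and ``may_enter_fs`` and ``may_do_io`` are hypothetical helpers.
Only the way the shortcuts are composed is meant to match the text above.

```c
#include <assert.h>

/* Userspace model only; bit positions are illustrative assumptions. */
typedef unsigned int gfp_t;

#define __GFP_IO	0x1u	/* reclaim may start physical I/O */
#define __GFP_FS	0x2u	/* reclaim may call into the filesystem */
#define __GFP_RECLAIM	0x4u	/* reclaim allowed at all (simplified) */

#define GFP_KERNEL	(__GFP_RECLAIM | __GFP_IO | __GFP_FS)
#define GFP_NOFS	(__GFP_RECLAIM | __GFP_IO)	/* __GFP_FS cleared */
#define GFP_NOIO	(__GFP_RECLAIM)			/* both cleared */

/* Clearing __GFP_IO implies clearing __GFP_FS as well. */
static_assert((GFP_NOIO & (__GFP_IO | __GFP_FS)) == 0,
	      "GFP_NOIO must allow neither IO nor FS recursion");

/* Hypothetical helpers: may reclaim recurse into the FS or IO layer? */
static inline int may_enter_fs(gfp_t mask)
{
	return (mask & __GFP_FS) != 0;
}

static inline int may_do_io(gfp_t mask)
{
	return (mask & __GFP_IO) != 0;
}
```

In this model GFP_NOFS still permits IO from reclaim, whereas GFP_NOIO
permits neither IO nor filesystem recursion.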

New API
=======

Since 4.12 there is a generic scope API for both NOFS and NOIO contexts:
``memalloc_nofs_save`` and ``memalloc_nofs_restore``, and
``memalloc_noio_save`` and ``memalloc_noio_restore`` respectively. These
functions mark a scope as a critical section from a filesystem or I/O
point of view. Any allocation made within that scope implicitly drops
__GFP_FS or __GFP_IO respectively from the given mask, so no memory
allocation can recurse back into the FS/IO layers.

.. kernel-doc:: include/linux/sched/mm.h
   :functions: memalloc_nofs_save memalloc_nofs_restore
.. kernel-doc:: include/linux/sched/mm.h
   :functions: memalloc_noio_save memalloc_noio_restore

FS/IO code then simply calls the appropriate save function before
any section that is critical with respect to reclaim starts, e.g.
before taking a lock shared with the reclaim context, or where
transaction context nesting would be possible via reclaim. The restore
function should be called when the critical section ends. Ideally, all
of this is accompanied by a comment explaining what the reclaim context
is, for easier maintenance.

Please note that proper pairing of the save/restore functions
allows nesting, so it is safe to call ``memalloc_noio_save`` and
``memalloc_noio_restore`` (or their NOFS counterparts) from within an
existing NOIO or NOFS scope.
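
The save/restore pairing and its nesting behaviour can be illustrated
with a userspace model. This is a sketch under stated assumptions, not
the kernel implementation: ``task_flags`` stands in for the per-task
flags, ``effective_gfp`` is a hypothetical helper modelling the mask the
allocator would end up using, and all bit values are illustrative.

```c
#include <assert.h>

/* Userspace model only; names mirror the kernel API, values do not. */
typedef unsigned int gfp_t;

#define __GFP_IO		0x1u
#define __GFP_FS		0x2u
#define __GFP_RECLAIM		0x4u
#define GFP_KERNEL		(__GFP_RECLAIM | __GFP_IO | __GFP_FS)

#define PF_MEMALLOC_NOFS	0x1u
#define PF_MEMALLOC_NOIO	0x2u

static unsigned int task_flags;	/* stand-in for the task's flags */

static inline unsigned int memalloc_nofs_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOFS;

	task_flags |= PF_MEMALLOC_NOFS;
	return old;	/* records whether an outer NOFS scope was active */
}

static inline void memalloc_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOFS) | old;
}

static inline unsigned int memalloc_noio_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOIO;

	task_flags |= PF_MEMALLOC_NOIO;
	return old;
}

static inline void memalloc_noio_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOIO) | old;
}

/* Hypothetical helper: the mask an allocation effectively uses. */
static inline gfp_t effective_gfp(gfp_t mask)
{
	if (task_flags & PF_MEMALLOC_NOIO)
		mask &= ~(__GFP_IO | __GFP_FS);
	else if (task_flags & PF_MEMALLOC_NOFS)
		mask &= ~__GFP_FS;
	return mask;
}

/* A NOIO scope nested inside a NOFS scope. */
static inline void nested_scope_demo(void)
{
	unsigned int outer, inner;

	outer = memalloc_nofs_save();
	assert(!(effective_gfp(GFP_KERNEL) & __GFP_FS));

	inner = memalloc_noio_save();
	assert(!(effective_gfp(GFP_KERNEL) & (__GFP_IO | __GFP_FS)));
	memalloc_noio_restore(inner);

	/* Back in the outer scope: IO is allowed again, FS is not. */
	assert(effective_gfp(GFP_KERNEL) & __GFP_IO);
	assert(!(effective_gfp(GFP_KERNEL) & __GFP_FS));
	memalloc_nofs_restore(outer);

	assert(effective_gfp(GFP_KERNEL) == GFP_KERNEL);
}
```

Because each save call returns the previous state and each restore call
reinstates exactly that state, an inner scope cannot end an outer one
early as long as the calls are properly paired.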

What about __vmalloc(GFP_NOFS)
==============================

vmalloc doesn't support GFP_NOFS semantics because there are hardcoded
GFP_KERNEL allocations deep inside the allocator which are non-trivial
to fix up. That means that calling ``vmalloc`` with GFP_NOFS/GFP_NOIO is
almost always a bug. The good news is that NOFS/NOIO semantics can be
achieved with the scope API.

In an ideal world, upper layers would already mark dangerous contexts,
so no special care would be required and vmalloc could be called
without any problems. Where the context is not really clear, or there
are layering violations, the recommended workaround is to wrap
``vmalloc`` in the scope API with a comment explaining the problem.
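
A wrapper of this kind might look roughly like the sketch below. It is a
userspace model, not kernel code: ``vmalloc`` is modelled with
``malloc``, ``last_inner_gfp`` records the mask that the modelled
hardcoded GFP_KERNEL allocation ends up with, and ``fs_alloc_table`` as
well as the transaction lock mentioned in its comment are hypothetical.
Bit values are illustrative.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace model only; names mirror the kernel API, values do not. */
typedef unsigned int gfp_t;

#define __GFP_IO		0x1u
#define __GFP_FS		0x2u
#define __GFP_RECLAIM		0x4u
#define GFP_KERNEL		(__GFP_RECLAIM | __GFP_IO | __GFP_FS)
#define PF_MEMALLOC_NOFS	0x1u

static unsigned int task_flags;	/* stand-in for the task's flags */
static gfp_t last_inner_gfp;	/* mask used by vmalloc's inner allocation */

static inline unsigned int memalloc_nofs_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOFS;

	task_flags |= PF_MEMALLOC_NOFS;
	return old;
}

static inline void memalloc_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOFS) | old;
}

/*
 * Model of vmalloc(): the inner allocation uses a hardcoded GFP_KERNEL,
 * so only the task's scope flags can restrict it.
 */
static inline void *vmalloc(unsigned long size)
{
	gfp_t inner = GFP_KERNEL;

	if (task_flags & PF_MEMALLOC_NOFS)
		inner &= ~__GFP_FS;
	last_inner_gfp = inner;
	return malloc(size);
}

/* Hypothetical filesystem helper wrapping vmalloc in a NOFS scope. */
static inline void *fs_alloc_table(unsigned long size)
{
	unsigned int nofs_flags;
	void *table;

	/*
	 * Example of the explanatory comment: the (hypothetical)
	 * transaction lock is held here, and reclaim re-entering the
	 * filesystem could block on it, so FS recursion must be avoided.
	 */
	nofs_flags = memalloc_nofs_save();
	table = vmalloc(size);
	memalloc_nofs_restore(nofs_flags);

	/* Model-only check: the inner allocation did not allow FS. */
	assert(!(last_inner_gfp & __GFP_FS));
	return table;
}
```

The point of the model is that the restriction travels with the task
rather than with the mask argument, which is why it also reaches
allocations whose mask the caller cannot control.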