Back to home page

OSCL-LXR

 
 

    


0001 .. SPDX-License-Identifier: GPL-2.0
0002 
0003 ====================================================================
0004 Reference-count design for elements of lists/arrays protected by RCU
0005 ====================================================================
0006 
0007 
0008 Please note that the percpu-ref feature is likely your first
0009 stop if you need to combine reference counts and RCU.  Please see
0010 include/linux/percpu-refcount.h for more information.  However, in
0011 those unusual cases where percpu-ref would consume too much memory,
0012 please read on.
0013 
0014 ------------------------------------------------------------------------
0015 
0016 Reference counting on elements of lists which are protected by traditional
0017 reader/writer spinlocks or semaphores are straightforward:
0018 
0019 CODE LISTING A::
0020 
0021     1.                                      2.
0022     add()                                   search_and_reference()
0023     {                                       {
0024         alloc_object                            read_lock(&list_lock);
0025         ...                                     search_for_element
0026         atomic_set(&el->rc, 1);                 atomic_inc(&el->rc);
0027         write_lock(&list_lock);                  ...
0028         add_element                             read_unlock(&list_lock);
0029         ...                                     ...
0030         write_unlock(&list_lock);          }
0031     }
0032 
0033     3.                                      4.
0034     release_referenced()                    delete()
0035     {                                       {
0036         ...                                     write_lock(&list_lock);
0037         if(atomic_dec_and_test(&el->rc))        ...
0038             kfree(el);
0039         ...                                     remove_element
0040     }                                           write_unlock(&list_lock);
0041                                                 ...
0042                                                 if (atomic_dec_and_test(&el->rc))
0043                                                     kfree(el);
0044                                                 ...
0045                                             }
0046 
0047 If this list/array is made lock free using RCU as in changing the
0048 write_lock() in add() and delete() to spin_lock() and changing read_lock()
0049 in search_and_reference() to rcu_read_lock(), the atomic_inc() in
0050 search_and_reference() could potentially hold reference to an element which
0051 has already been deleted from the list/array.  Use atomic_inc_not_zero()
0052 in this scenario as follows:
0053 
0054 CODE LISTING B::
0055 
0056     1.                                      2.
0057     add()                                   search_and_reference()
0058     {                                       {
0059         alloc_object                            rcu_read_lock();
0060         ...                                     search_for_element
0061         atomic_set(&el->rc, 1);                 if (!atomic_inc_not_zero(&el->rc)) {
0062         spin_lock(&list_lock);                      rcu_read_unlock();
0063                                                     return FAIL;
0064         add_element                             }
0065         ...                                     ...
0066         spin_unlock(&list_lock);                rcu_read_unlock();
0067     }                                       }
0068     3.                                      4.
0069     release_referenced()                    delete()
0070     {                                       {
0071         ...                                     spin_lock(&list_lock);
0072         if (atomic_dec_and_test(&el->rc))       ...
0073             call_rcu(&el->head, el_free);       remove_element
0074         ...                                     spin_unlock(&list_lock);
0075     }                                           ...
0076                                                 if (atomic_dec_and_test(&el->rc))
0077                                                     call_rcu(&el->head, el_free);
0078                                                 ...
0079                                             }
0080 
0081 Sometimes, a reference to the element needs to be obtained in the
0082 update (write) stream.  In such cases, atomic_inc_not_zero() might be
0083 overkill, since we hold the update-side spinlock.  One might instead
0084 use atomic_inc() in such cases.
0085 
0086 It is not always convenient to deal with "FAIL" in the
0087 search_and_reference() code path.  In such cases, the
0088 atomic_dec_and_test() may be moved from delete() to el_free()
0089 as follows:
0090 
0091 CODE LISTING C::
0092 
0093     1.                                      2.
0094     add()                                   search_and_reference()
0095     {                                       {
0096         alloc_object                            rcu_read_lock();
0097         ...                                     search_for_element
0098         atomic_set(&el->rc, 1);                 atomic_inc(&el->rc);
0099         spin_lock(&list_lock);                  ...
0100 
0101         add_element                             rcu_read_unlock();
0102         ...                                 }
0103         spin_unlock(&list_lock);            4.
0104     }                                       delete()
0105     3.                                      {
0106     release_referenced()                        spin_lock(&list_lock);
0107     {                                           ...
0108         ...                                     remove_element
0109         if (atomic_dec_and_test(&el->rc))       spin_unlock(&list_lock);
0110             kfree(el);                          ...
0111         ...                                     call_rcu(&el->head, el_free);
0112     }                                           ...
0113     5.                                      }
0114     void el_free(struct rcu_head *rhp)
0115     {
0116         release_referenced();
0117     }
0118 
0119 The key point is that the initial reference added by add() is not removed
0120 until after a grace period has elapsed following removal.  This means that
0121 search_and_reference() cannot find this element, which means that the value
0122 of el->rc cannot increase.  Thus, once it reaches zero, there are no
0123 readers that can or ever will be able to reference the element.  The
0124 element can therefore safely be freed.  This in turn guarantees that if
0125 any reader finds the element, that reader may safely acquire a reference
0126 without checking the value of the reference counter.
0127 
0128 A clear advantage of the RCU-based pattern in listing C over the one
0129 in listing B is that any call to search_and_reference() that locates
0130 a given object will succeed in obtaining a reference to that object,
0131 even given a concurrent invocation of delete() for that same object.
0132 Similarly, a clear advantage of both listings B and C over listing A is
0133 that a call to delete() is not delayed even if there are an arbitrarily
0134 large number of calls to search_and_reference() searching for the same
0135 object that delete() was invoked on.  Instead, all that is delayed is
0136 the eventual invocation of kfree(), which is usually not a problem on
0137 modern computer systems, even the small ones.
0138 
0139 In cases where delete() can sleep, synchronize_rcu() can be called from
0140 delete(), so that el_free() can be subsumed into delete as follows::
0141 
0142     4.
0143     delete()
0144     {
0145         spin_lock(&list_lock);
0146         ...
0147         remove_element
0148         spin_unlock(&list_lock);
0149         ...
0150         synchronize_rcu();
0151         if (atomic_dec_and_test(&el->rc))
0152             kfree(el);
0153         ...
0154     }
0155 
0156 As additional examples in the kernel, the pattern in listing C is used by
0157 reference counting of struct pid, while the pattern in listing B is used by
0158 struct posix_acl.