0001 .. _array_rcu_doc:
0002
0003 Using RCU to Protect Read-Mostly Arrays
0004 =======================================
0005
0006 Although RCU is more commonly used to protect linked lists, it can
0007 also be used to protect arrays. Three situations are as follows:
0008
0009 1. :ref:`Hash Tables <hash_tables>`
0010
0011 2. :ref:`Static Arrays <static_arrays>`
0012
0013 3. :ref:`Resizable Arrays <resizable_arrays>`
0014
0015 Each of these three situations involves an RCU-protected pointer to an
0016 array that is separately indexed. It might be tempting to consider use
0017 of RCU to instead protect the index into an array, however, this use
0018 case is **not** supported. The problem with RCU-protected indexes into
0019 arrays is that compilers can play way too many optimization games with
0020 integers, which means that the rules governing handling of these indexes
0021 are far more trouble than they are worth. If RCU-protected indexes into
0022 arrays prove to be particularly valuable (which they have not thus far),
0023 explicit cooperation from the compiler will be required to permit them
0024 to be safely used.
0025
0026 That aside, each of the three RCU-protected pointer situations are
0027 described in the following sections.
0028
0029 .. _hash_tables:
0030
0031 Situation 1: Hash Tables
0032 ------------------------
0033
0034 Hash tables are often implemented as an array, where each array entry
0035 has a linked-list hash chain. Each hash chain can be protected by RCU
0036 as described in listRCU.rst. This approach also applies to other
0037 array-of-list situations, such as radix trees.
0038
0039 .. _static_arrays:
0040
0041 Situation 2: Static Arrays
0042 --------------------------
0043
0044 Static arrays, where the data (rather than a pointer to the data) is
0045 located in each array element, and where the array is never resized,
0046 have not been used with RCU. Rik van Riel recommends using seqlock in
0047 this situation, which would also have minimal read-side overhead as long
0048 as updates are rare.
0049
0050 Quick Quiz:
0051 Why is it so important that updates be rare when using seqlock?
0052
0053 :ref:`Answer to Quick Quiz <answer_quick_quiz_seqlock>`
0054
0055 .. _resizable_arrays:
0056
0057 Situation 3: Resizable Arrays
0058 ------------------------------
0059
0060 Use of RCU for resizable arrays is demonstrated by the grow_ary()
0061 function formerly used by the System V IPC code. The array is used
0062 to map from semaphore, message-queue, and shared-memory IDs to the data
0063 structure that represents the corresponding IPC construct. The grow_ary()
0064 function does not acquire any locks; instead its caller must hold the
0065 ids->sem semaphore.
0066
0067 The grow_ary() function, shown below, does some limit checks, allocates a
0068 new ipc_id_ary, copies the old to the new portion of the new, initializes
0069 the remainder of the new, updates the ids->entries pointer to point to
0070 the new array, and invokes ipc_rcu_putref() to free up the old array.
0071 Note that rcu_assign_pointer() is used to update the ids->entries pointer,
0072 which includes any memory barriers required on whatever architecture
0073 you are running on::
0074
0075 static int grow_ary(struct ipc_ids* ids, int newsize)
0076 {
0077 struct ipc_id_ary* new;
0078 struct ipc_id_ary* old;
0079 int i;
0080 int size = ids->entries->size;
0081
0082 if(newsize > IPCMNI)
0083 newsize = IPCMNI;
0084 if(newsize <= size)
0085 return newsize;
0086
0087 new = ipc_rcu_alloc(sizeof(struct kern_ipc_perm *)*newsize +
0088 sizeof(struct ipc_id_ary));
0089 if(new == NULL)
0090 return size;
0091 new->size = newsize;
0092 memcpy(new->p, ids->entries->p,
0093 sizeof(struct kern_ipc_perm *)*size +
0094 sizeof(struct ipc_id_ary));
0095 for(i=size;i<newsize;i++) {
0096 new->p[i] = NULL;
0097 }
0098 old = ids->entries;
0099
0100 /*
0101 * Use rcu_assign_pointer() to make sure the memcpyed
0102 * contents of the new array are visible before the new
0103 * array becomes visible.
0104 */
0105 rcu_assign_pointer(ids->entries, new);
0106
0107 ipc_rcu_putref(old);
0108 return newsize;
0109 }
0110
0111 The ipc_rcu_putref() function decrements the array's reference count
0112 and then, if the reference count has dropped to zero, uses call_rcu()
0113 to free the array after a grace period has elapsed.
0114
0115 The array is traversed by the ipc_lock() function. This function
0116 indexes into the array under the protection of rcu_read_lock(),
0117 using rcu_dereference() to pick up the pointer to the array so
0118 that it may later safely be dereferenced -- memory barriers are
0119 required on the Alpha CPU. Since the size of the array is stored
0120 with the array itself, there can be no array-size mismatches, so
0121 a simple check suffices. The pointer to the structure corresponding
0122 to the desired IPC object is placed in "out", with NULL indicating
0123 a non-existent entry. After acquiring "out->lock", the "out->deleted"
0124 flag indicates whether the IPC object is in the process of being
0125 deleted, and, if not, the pointer is returned::
0126
0127 struct kern_ipc_perm* ipc_lock(struct ipc_ids* ids, int id)
0128 {
0129 struct kern_ipc_perm* out;
0130 int lid = id % SEQ_MULTIPLIER;
0131 struct ipc_id_ary* entries;
0132
0133 rcu_read_lock();
0134 entries = rcu_dereference(ids->entries);
0135 if(lid >= entries->size) {
0136 rcu_read_unlock();
0137 return NULL;
0138 }
0139 out = entries->p[lid];
0140 if(out == NULL) {
0141 rcu_read_unlock();
0142 return NULL;
0143 }
0144 spin_lock(&out->lock);
0145
0146 /* ipc_rmid() may have already freed the ID while ipc_lock
0147 * was spinning: here verify that the structure is still valid
0148 */
0149 if (out->deleted) {
0150 spin_unlock(&out->lock);
0151 rcu_read_unlock();
0152 return NULL;
0153 }
0154 return out;
0155 }
0156
0157 .. _answer_quick_quiz_seqlock:
0158
0159 Answer to Quick Quiz:
0160 Why is it so important that updates be rare when using seqlock?
0161
0162 The reason that it is important that updates be rare when
0163 using seqlock is that frequent updates can livelock readers.
0164 One way to avoid this problem is to assign a seqlock for
0165 each array entry rather than to the entire array.