==================
Partial Parity Log
==================

Partial Parity Log (PPL) is a feature available for RAID5 arrays. The issue
addressed by PPL is that after a dirty shutdown, the parity of a particular
stripe may become inconsistent with the data on other member disks. If the
array is also in a degraded state, there is no way to recalculate parity,
because one of the disks is missing. This can lead to silent data corruption
when rebuilding the array or using it as degraded: data calculated from parity
for array blocks that were not touched by a write request during the unclean
shutdown can be incorrect. This condition is known as the RAID5 Write Hole.
Because of this, md by default does not allow starting a dirty degraded array.
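The write hole can be illustrated with a toy model. The following is only a
sketch, not md code: the stripe layout (two data chunks plus parity), the chunk
contents, and the helper name xor_chunks are assumptions made for the example.

```python
from functools import reduce


def xor_chunks(*chunks: bytes) -> bytes:
    """Byte-wise XOR of equally sized chunks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))


# Hypothetical stripe: two data chunks plus parity, initially consistent.
d0_old, d1 = b"AAAA", b"BBBB"
parity_old = xor_chunks(d0_old, d1)

# A write to d0 reaches the disk, but the system goes down before the
# matching parity update is written.
d0_new = b"CCCC"
parity_on_disk = parity_old  # stale parity

# The disk holding d1 is now lost. d1 was never touched by the write,
# yet rebuilding it from the new d0 and the stale parity goes wrong.
d1_rebuilt = xor_chunks(d0_new, parity_on_disk)
write_hole = d1_rebuilt != d1
```

In this model write_hole ends up true: a chunk that no write request touched
is silently reconstructed with wrong contents.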

Partial parity for a write operation is the XOR of the stripe data chunks not
modified by this write. It is just enough data to recover from the write
hole. XORing partial parity with the modified chunks produces parity for the
stripe, consistent with its state before the write operation, regardless of
which chunk writes have completed. If one of the unmodified data disks of this
stripe is missing, this updated parity can be used to recover its contents.
PPL recovery is also performed when starting an array after an unclean
shutdown with all disks available, eliminating the need to resync the array.
Because of this, using a write-intent bitmap and PPL together is not
supported.
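The recovery described above can be sketched in a few lines. This is an
illustrative model only, not the md implementation: the four-chunk stripe, the
chunk contents, and the helper name xor_chunks are assumptions made for the
example.

```python
from functools import reduce


def xor_chunks(*chunks: bytes) -> bytes:
    """Byte-wise XOR of equally sized chunks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))


# Hypothetical stripe with four data chunks; the write modifies d0 and d1.
old = {"d0": b"AAAA", "d1": b"BBBB", "d2": b"CCCC", "d3": b"DDDD"}
new = {"d0": b"EEEE", "d1": b"FFFF"}

# Partial parity: XOR of the chunks this write does not modify.
# It is logged before new data and parity are dispatched.
partial_parity = xor_chunks(old["d2"], old["d3"])

# Unclean shutdown: d0 was written, d1 was not, parity state is unknown.
on_disk = dict(old, d0=new["d0"])

# Recovery: XOR partial parity with whatever the modified chunks hold now.
recovered_parity = xor_chunks(partial_parity, on_disk["d0"], on_disk["d1"])

# If the disk holding the unmodified chunk d2 is missing, d2 can be
# rebuilt from the recovered parity and the remaining chunks.
d2_rebuilt = xor_chunks(
    recovered_parity, on_disk["d0"], on_disk["d1"], on_disk["d3"]
)
```

The result does not depend on which of the modified chunks reached the disk,
because the recovered parity is computed from their current on-disk contents.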

When handling a write request, PPL writes partial parity before new data and
parity are dispatched to disks. PPL is a distributed log: it is stored on
array member drives in the metadata area, on the parity drive of a particular
stripe. It does not require a dedicated journaling drive. Write performance is
reduced by up to 30%-40%, but it scales with the number of drives in the array,
and there is no journaling drive to become a bottleneck or a single point of
failure.

Unlike raid5-cache, the other solution in md for closing the write hole, PPL is
not a true journal. It does not protect from losing in-flight data, only from
silent data corruption. If a dirty disk of a stripe is lost, no PPL recovery is
performed for this stripe (parity is not updated). So it is possible to have
arbitrary data in the written part of a stripe if that disk is lost. In that
case the behavior is the same as in plain raid5.
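A toy model of this limitation, under the same kind of assumptions as any
simplified example (hypothetical three-chunk stripe, made-up chunk contents,
and an illustrative helper named xor_chunks, none of which come from md):

```python
from functools import reduce


def xor_chunks(*chunks: bytes) -> bytes:
    """Byte-wise XOR of equally sized chunks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))


# Hypothetical stripe: the write modifies d0 and d1, d2 is untouched.
d0_old, d1_old, d2 = b"AAAA", b"BBBB", b"CCCC"
d0_new, d1_new = b"XXXX", b"YYYY"
parity_old = xor_chunks(d0_old, d1_old, d2)

# Unclean shutdown: only d1 reached the disk; d0 and parity did not.
# The disk holding d0 (a dirty disk for this write) is then lost, so
# its current contents cannot be XORed with the partial parity and the
# parity is left as it was.
d0_rebuilt = xor_chunks(parity_old, d1_new, d2)

# The rebuilt chunk is neither the old nor the new version of d0.
d0_is_arbitrary = d0_rebuilt not in (d0_old, d0_new)
```

Here the damage is confined to the part of the stripe that the interrupted
write was modifying; the untouched chunk d2 is still present and intact.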

PPL is available for md version-1 metadata and external (specifically IMSM)
metadata arrays. It can be enabled using the mdadm option
--consistency-policy=ppl.

PPL supports at most 64 disks in an array. This limit keeps the data structures
and implementation simple. RAID5 arrays with that many disks are unlikely
because of the high risk of multiple disk failures, so the restriction should
not be a limitation in practice.