Documentation/security/siphash.rst

0001 ===========================
0002 SipHash - a short input PRF
0003 ===========================
0004
0005 :Author: Written by Jason A. Donenfeld <jason@zx2c4.com>
0006
0007 SipHash is a cryptographically secure PRF -- a keyed hash function -- that
0008 performs very well for short inputs, hence the name. It was designed by
0009 cryptographers Daniel J. Bernstein and Jean-Philippe Aumasson. It is intended
0010 as a replacement for some uses of: `jhash`, `md5_transform`, `sha1_transform`,
0011 and so forth.
0012
0013 SipHash takes a secret key filled with randomly generated numbers and either
0014 an input buffer or several input integers. It spits out an integer that is
0015 indistinguishable from random. You may then use that integer as part of secure
0016 sequence numbers, secure cookies, or mask it off for use in a hash table.
0017
0018 Generating a key
0019 ================
0020
0021 Keys should always be generated from a cryptographically secure source of
0022 random numbers, either using get_random_bytes or get_random_once::
0023
0024         siphash_key_t key;
0025         get_random_bytes(&key, sizeof(key));
0026
0027 If you're not deriving your key from here, you're doing it wrong.
0028
0029 Using the functions
0030 ===================
0031
0032 There are two variants of the function, one that takes a list of integers, and
0033 one that takes a buffer::
0034
0035         u64 siphash(const void *data, size_t len, const siphash_key_t *key);
0036
0037 And::
0038
0039         u64 siphash_1u64(u64, const siphash_key_t *key);
0040         u64 siphash_2u64(u64, u64, const siphash_key_t *key);
0041         u64 siphash_3u64(u64, u64, u64, const siphash_key_t *key);
0042         u64 siphash_4u64(u64, u64, u64, u64, const siphash_key_t *key);
0043         u64 siphash_1u32(u32, const siphash_key_t *key);
0044         u64 siphash_2u32(u32, u32, const siphash_key_t *key);
0045         u64 siphash_3u32(u32, u32, u32, const siphash_key_t *key);
0046         u64 siphash_4u32(u32, u32, u32, u32, const siphash_key_t *key);
0047
0048 If you pass the generic siphash function something of a constant length, it
0049 will constant fold at compile-time and automatically choose one of the
0050 optimized functions.
0051
0052 Hashtable key function usage::
0053
0054         struct some_hashtable {
0055                 DECLARE_HASHTABLE(hashtable, 8);
0056                 siphash_key_t key;
0057         };
0058
0059         void init_hashtable(struct some_hashtable *table)
0060         {
0061                 get_random_bytes(&table->key, sizeof(table->key));
0062         }
0063
0064         static inline hlist_head *some_hashtable_bucket(struct some_hashtable *table, struct interesting_input *input)
0065         {
0066                 return &table->hashtable[siphash(input, sizeof(*input), &table->key) & (HASH_SIZE(table->hashtable) - 1)];
0067         }
0068
0069 You may then iterate like usual over the returned hash bucket.
0070
0071 Security
0072 ========
0073
0074 SipHash has a very high security margin, with its 128-bit key. So long as the
0075 key is kept secret, it is impossible for an attacker to guess the outputs of
0076 the function, even if being able to observe many outputs, since 2^128 outputs
0077 is significant.
0078
0079 Linux implements the "2-4" variant of SipHash.
0080
0081 Struct-passing Pitfalls
0082 =======================
0083
0084 Often times the XuY functions will not be large enough, and instead you'll
0085 want to pass a pre-filled struct to siphash. When doing this, it's important
0086 to always ensure the struct has no padding holes. The easiest way to do this
0087 is to simply arrange the members of the struct in descending order of size,
0088 and to use offsetofend() instead of sizeof() for getting the size. For
0089 performance reasons, if possible, it's probably a good thing to align the
0090 struct to the right boundary. Here's an example::
0091
0092         const struct {
0093                 struct in6_addr saddr;
0094                 u32 counter;
0095                 u16 dport;
0096         } __aligned(SIPHASH_ALIGNMENT) combined = {
0097                 .saddr = *(struct in6_addr *)saddr,
0098                 .counter = counter,
0099                 .dport = dport
0100         };
0101         u64 h = siphash(&combined, offsetofend(typeof(combined), dport), &secret);
0102
0103 Resources
0104 =========
0105
0106 Read the SipHash paper if you're interested in learning more:
0107 https://131002.net/siphash/siphash.pdf
0108
0109 -------------------------------------------------------------------------------
0110
0111 ===============================================
0112 HalfSipHash - SipHash's insecure younger cousin
0113 ===============================================
0114
0115 :Author: Written by Jason A. Donenfeld <jason@zx2c4.com>
0116
0117 On the off-chance that SipHash is not fast enough for your needs, you might be
0118 able to justify using HalfSipHash, a terrifying but potentially useful
0119 possibility. HalfSipHash cuts SipHash's rounds down from "2-4" to "1-3" and,
0120 even scarier, uses an easily brute-forcable 64-bit key (with a 32-bit output)
0121 instead of SipHash's 128-bit key. However, this may appeal to some
0122 high-performance `jhash` users.
0123
0124 HalfSipHash support is provided through the "hsiphash" family of functions.
0125
0126 .. warning::
0127    Do not ever use the hsiphash functions except for as a hashtable key
0128    function, and only then when you can be absolutely certain that the outputs
0129    will never be transmitted out of the kernel. This is only remotely useful
0130    over `jhash` as a means of mitigating hashtable flooding denial of service
0131    attacks.
0132
0133 On 64-bit kernels, the hsiphash functions actually implement SipHash-1-3, a
0134 reduced-round variant of SipHash, instead of HalfSipHash-1-3. This is because in
0135 64-bit code, SipHash-1-3 is no slower than HalfSipHash-1-3, and can be faster.
0136 Note, this does *not* mean that in 64-bit kernels the hsiphash functions are the
0137 same as the siphash ones, or that they are secure; the hsiphash functions still
0138 use a less secure reduced-round algorithm and truncate their outputs to 32
0139 bits.
0140
0141 Generating a hsiphash key
0142 =========================
0143
0144 Keys should always be generated from a cryptographically secure source of
0145 random numbers, either using get_random_bytes or get_random_once::
0146
0147         hsiphash_key_t key;
0148         get_random_bytes(&key, sizeof(key));
0149
0150 If you're not deriving your key from here, you're doing it wrong.
0151
0152 Using the hsiphash functions
0153 ============================
0154
0155 There are two variants of the function, one that takes a list of integers, and
0156 one that takes a buffer::
0157
0158         u32 hsiphash(const void *data, size_t len, const hsiphash_key_t *key);
0159
0160 And::
0161
0162         u32 hsiphash_1u32(u32, const hsiphash_key_t *key);
0163         u32 hsiphash_2u32(u32, u32, const hsiphash_key_t *key);
0164         u32 hsiphash_3u32(u32, u32, u32, const hsiphash_key_t *key);
0165         u32 hsiphash_4u32(u32, u32, u32, u32, const hsiphash_key_t *key);
0166
0167 If you pass the generic hsiphash function something of a constant length, it
0168 will constant fold at compile-time and automatically choose one of the
0169 optimized functions.
0170
0171 Hashtable key function usage
0172 ============================
0173
0174 ::
0175
0176         struct some_hashtable {
0177                 DECLARE_HASHTABLE(hashtable, 8);
0178                 hsiphash_key_t key;
0179         };
0180
0181         void init_hashtable(struct some_hashtable *table)
0182         {
0183                 get_random_bytes(&table->key, sizeof(table->key));
0184         }
0185
0186         static inline hlist_head *some_hashtable_bucket(struct some_hashtable *table, struct interesting_input *input)
0187         {
0188                 return &table->hashtable[hsiphash(input, sizeof(*input), &table->key) & (HASH_SIZE(table->hashtable) - 1)];
0189         }
0190
0191 You may then iterate like usual over the returned hash bucket.
0192
0193 Performance
0194 ===========
0195
0196 hsiphash() is roughly 3 times slower than jhash(). For many replacements, this
0197 will not be a problem, as the hashtable lookup isn't the bottleneck. And in
0198 general, this is probably a good sacrifice to make for the security and DoS
0199 resistance of hsiphash().