0001 .. _codingstyle:
0002
0003 Linux kernel coding style
0004 =========================
0005
0006 This is a short document describing the preferred coding style for the
Linux kernel. Coding style is very personal, and I won't **force** my
0008 views on anybody, but this is what goes for anything that I have to be
0009 able to maintain, and I'd prefer it for most other things too. Please
0010 at least consider the points made here.
0011
0012 First off, I'd suggest printing out a copy of the GNU coding standards,
0013 and NOT read it. Burn them, it's a great symbolic gesture.
0014
0015 Anyway, here goes:
0016
0017
0018 1) Indentation
0019 --------------
0020
0021 Tabs are 8 characters, and thus indentations are also 8 characters.
0022 There are heretic movements that try to make indentations 4 (or even 2!)
0023 characters deep, and that is akin to trying to define the value of PI to
0024 be 3.
0025
0026 Rationale: The whole idea behind indentation is to clearly define where
0027 a block of control starts and ends. Especially when you've been looking
0028 at your screen for 20 straight hours, you'll find it a lot easier to see
0029 how the indentation works if you have large indentations.
0030
0031 Now, some people will claim that having 8-character indentations makes
the code move too far to the right, and makes it hard to read on an
80-character terminal screen. The answer to that is that if you need
0034 more than 3 levels of indentation, you're screwed anyway, and should fix
0035 your program.
0036
0037 In short, 8-char indents make things easier to read, and have the added
0038 benefit of warning you when you're nesting your functions too deep.
0039 Heed that warning.
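
As a rough illustration of heeding that warning, the sketch below keeps the
nesting shallow by moving the per-element test into a helper and skipping
uninteresting elements early. The names ``count_valid_even()`` and
``is_valid_even()`` are hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: keeps the per-element test out of the loop body. */
static bool is_valid_even(int v)
{
	return v >= 0 && (v % 2) == 0;
}

/* Counts the non-negative even values in @vals without deep nesting. */
size_t count_valid_even(const int *vals, size_t n)
{
	size_t count = 0;
	size_t i;

	if (!vals)
		return 0;

	for (i = 0; i < n; i++) {
		if (!is_valid_even(vals[i]))
			continue;
		count++;
	}

	return count;
}
```

The loop body never goes deeper than two levels, and the helper name
documents what the test is for.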
0040
0041 The preferred way to ease multiple indentation levels in a switch statement is
0042 to align the ``switch`` and its subordinate ``case`` labels in the same column
0043 instead of ``double-indenting`` the ``case`` labels. E.g.:
0044
0045 .. code-block:: c
0046
	switch (suffix) {
	case 'G':
	case 'g':
		mem <<= 30;
		break;
	case 'M':
	case 'm':
		mem <<= 20;
		break;
	case 'K':
	case 'k':
		mem <<= 10;
		fallthrough;
	default:
		break;
	}
0063
0064 Don't put multiple statements on a single line unless you have
0065 something to hide:
0066
0067 .. code-block:: c
0068
	if (condition) do_this;
	  do_something_everytime;
0071
0072 Don't use commas to avoid using braces:
0073
0074 .. code-block:: c
0075
	if (condition)
		do_this(), do_that();
0078
Always use braces for multiple statements:
0080
0081 .. code-block:: c
0082
	if (condition) {
		do_this();
		do_that();
	}
0087
0088 Don't put multiple assignments on a single line either. Kernel coding style
0089 is super simple. Avoid tricky expressions.
0090
0091
0092 Outside of comments, documentation and except in Kconfig, spaces are never
0093 used for indentation, and the above example is deliberately broken.
0094
0095 Get a decent editor and don't leave whitespace at the end of lines.
0096
0097
0098 2) Breaking long lines and strings
0099 ----------------------------------
0100
0101 Coding style is all about readability and maintainability using commonly
0102 available tools.
0103
0104 The preferred limit on the length of a single line is 80 columns.
0105
0106 Statements longer than 80 columns should be broken into sensible chunks,
0107 unless exceeding 80 columns significantly increases readability and does
0108 not hide information.
0109
0110 Descendants are always substantially shorter than the parent and
0111 are placed substantially to the right. A very commonly used style
0112 is to align descendants to a function open parenthesis.
0113
0114 These same rules are applied to function headers with a long argument list.
0115
0116 However, never break user-visible strings such as printk messages because
0117 that breaks the ability to grep for them.
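
A minimal sketch of both rules follows, written as plain userspace C. The
function ``format_link_status()`` and its message are hypothetical. The
continuation lines are aligned to the open parenthesis, while the
user-visible string is left on a single line even though it runs past 80
columns, so that it can still be found with grep.

```c
#include <stdio.h>

/*
 * Hypothetical helper: formats a status line into @buf.
 * The format string is deliberately not split across lines.
 */
int format_link_status(char *buf, size_t len, const char *ifname,
		       unsigned int speed_mbps, int full_duplex)
{
	return snprintf(buf, len,
			"%s: link is up at %u Mbps, %s duplex, flow control disabled",
			ifname, speed_mbps, full_duplex ? "full" : "half");
}
```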
0118
0119
0120 3) Placing Braces and Spaces
0121 ----------------------------
0122
0123 The other issue that always comes up in C styling is the placement of
0124 braces. Unlike the indent size, there are few technical reasons to
0125 choose one placement strategy over the other, but the preferred way, as
0126 shown to us by the prophets Kernighan and Ritchie, is to put the opening
0127 brace last on the line, and put the closing brace first, thusly:
0128
0129 .. code-block:: c
0130
	if (x is true) {
		we do y
	}
0134
0135 This applies to all non-function statement blocks (if, switch, for,
0136 while, do). E.g.:
0137
0138 .. code-block:: c
0139
	switch (action) {
	case KOBJ_ADD:
		return "add";
	case KOBJ_REMOVE:
		return "remove";
	case KOBJ_CHANGE:
		return "change";
	default:
		return NULL;
	}
0150
0151 However, there is one special case, namely functions: they have the
0152 opening brace at the beginning of the next line, thus:
0153
0154 .. code-block:: c
0155
	int function(int x)
	{
		body of function
	}
0160
0161 Heretic people all over the world have claimed that this inconsistency
0162 is ... well ... inconsistent, but all right-thinking people know that
0163 (a) K&R are **right** and (b) K&R are right. Besides, functions are
0164 special anyway (you can't nest them in C).
0165
0166 Note that the closing brace is empty on a line of its own, **except** in
0167 the cases where it is followed by a continuation of the same statement,
i.e. a ``while`` in a do-statement or an ``else`` in an if-statement, like
0169 this:
0170
0171 .. code-block:: c
0172
	do {
		body of do-loop
	} while (condition);
0176
0177 and
0178
0179 .. code-block:: c
0180
	if (x == y) {
		..
	} else if (x > y) {
		...
	} else {
		....
	}
0188
0189 Rationale: K&R.
0190
0191 Also, note that this brace-placement also minimizes the number of empty
0192 (or almost empty) lines, without any loss of readability. Thus, as the
0193 supply of new-lines on your screen is not a renewable resource (think
0194 25-line terminal screens here), you have more empty lines to put
0195 comments on.
0196
0197 Do not unnecessarily use braces where a single statement will do.
0198
0199 .. code-block:: c
0200
	if (condition)
		action();
0203
0204 and
0205
.. code-block:: c
0207
	if (condition)
		do_this();
	else
		do_that();
0212
0213 This does not apply if only one branch of a conditional statement is a single
0214 statement; in the latter case use braces in both branches:
0215
0216 .. code-block:: c
0217
	if (condition) {
		do_this();
		do_that();
	} else {
		otherwise();
	}
0224
0225 Also, use braces when a loop contains more than a single simple statement:
0226
0227 .. code-block:: c
0228
	while (condition) {
		if (test)
			do_something();
	}
0233
0234 3.1) Spaces
0235 ***********
0236
0237 Linux kernel style for use of spaces depends (mostly) on
0238 function-versus-keyword usage. Use a space after (most) keywords. The
0239 notable exceptions are sizeof, typeof, alignof, and __attribute__, which look
0240 somewhat like functions (and are usually used with parentheses in Linux,
0241 although they are not required in the language, as in: ``sizeof info`` after
0242 ``struct fileinfo info;`` is declared).
0243
0244 So use a space after these keywords::
0245
0246 if, switch, case, for, do, while
0247
0248 but not with sizeof, typeof, alignof, or __attribute__. E.g.,
0249
0250 .. code-block:: c
0251
0252
0253 s = sizeof(struct file);
0254
0255 Do not add spaces around (inside) parenthesized expressions. This example is
0256 **bad**:
0257
0258 .. code-block:: c
0259
0260
0261 s = sizeof( struct file );
0262
0263 When declaring pointer data or a function that returns a pointer type, the
0264 preferred use of ``*`` is adjacent to the data name or function name and not
0265 adjacent to the type name. Examples:
0266
0267 .. code-block:: c
0268
0269
0270 char *linux_banner;
0271 unsigned long long memparse(char *ptr, char **retptr);
0272 char *match_strdup(substring_t *s);
0273
0274 Use one space around (on each side of) most binary and ternary operators,
0275 such as any of these::
0276
0277 = + - < > * / % | & ^ <= >= == != ? :
0278
0279 but no space after unary operators::
0280
0281 & * + - ~ ! sizeof typeof alignof __attribute__ defined
0282
0283 no space before the postfix increment & decrement unary operators::
0284
0285 ++ --
0286
0287 no space after the prefix increment & decrement unary operators::
0288
0289 ++ --
0290
0291 and no space around the ``.`` and ``->`` structure member operators.
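
The spacing rules above can be seen together in a short sketch. The
structure and both functions are hypothetical. Note the space after ``if``
and ``for``, no space after ``sizeof`` or the unary ``!``, one space on each
side of the binary and ternary operators, and none around ``->``.

```c
#include <stddef.h>

struct sample {
	int value;
	int weight;
};

/* No space between sizeof and its parenthesized operand. */
size_t sample_bytes(size_t n)
{
	return n * sizeof(struct sample);
}

/* Sums value * weight, treating non-positive weights as 1. */
int weighted_sum(const struct sample *s, size_t n)
{
	const struct sample *p;
	int sum = 0;

	if (!s || n == 0)
		return -1;

	for (p = s; p < s + n; p++)
		sum += p->value * (p->weight > 0 ? p->weight : 1);

	return sum;
}
```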
0292
0293 Do not leave trailing whitespace at the ends of lines. Some editors with
0294 ``smart`` indentation will insert whitespace at the beginning of new lines as
0295 appropriate, so you can start typing the next line of code right away.
0296 However, some such editors do not remove the whitespace if you end up not
0297 putting a line of code there, such as if you leave a blank line. As a result,
0298 you end up with lines containing trailing whitespace.
0299
0300 Git will warn you about patches that introduce trailing whitespace, and can
0301 optionally strip the trailing whitespace for you; however, if applying a series
0302 of patches, this may make later patches in the series fail by changing their
0303 context lines.
0304
0305
0306 4) Naming
0307 ---------
0308
0309 C is a Spartan language, and your naming conventions should follow suit.
0310 Unlike Modula-2 and Pascal programmers, C programmers do not use cute
0311 names like ThisVariableIsATemporaryCounter. A C programmer would call that
variable ``tmp``, which is much easier to write, and not in the least more
0313 difficult to understand.
0314
0315 HOWEVER, while mixed-case names are frowned upon, descriptive names for
0316 global variables are a must. To call a global function ``foo`` is a
0317 shooting offense.
0318
0319 GLOBAL variables (to be used only if you **really** need them) need to
0320 have descriptive names, as do global functions. If you have a function
0321 that counts the number of active users, you should call that
0322 ``count_active_users()`` or similar, you should **not** call it ``cntusr()``.
0323
0324 Encoding the type of a function into the name (so-called Hungarian
0325 notation) is asinine - the compiler knows the types anyway and can check
0326 those, and it only confuses the programmer.
0327
0328 LOCAL variable names should be short, and to the point. If you have
0329 some random integer loop counter, it should probably be called ``i``.
0330 Calling it ``loop_counter`` is non-productive, if there is no chance of it
0331 being mis-understood. Similarly, ``tmp`` can be just about any type of
0332 variable that is used to hold a temporary value.
0333
0334 If you are afraid to mix up your local variable names, you have another
0335 problem, which is called the function-growth-hormone-imbalance syndrome.
0336 See chapter 6 (Functions).
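
A small sketch of these naming rules, assuming a hypothetical ``struct user``
with an ``active`` flag: the globally visible function gets a descriptive
name, while the locals inside it stay short.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user record, only for this illustration. */
struct user {
	bool active;
};

/*
 * Global function: the name says what it does.
 * Local variables: short names such as "i" and "n" are enough here.
 */
size_t count_active_users(const struct user *users, size_t n)
{
	size_t count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (users[i].active)
			count++;
	}

	return count;
}
```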
0337
0338 For symbol names and documentation, avoid introducing new usage of
0339 'master / slave' (or 'slave' independent of 'master') and 'blacklist /
0340 whitelist'.
0341
0342 Recommended replacements for 'master / slave' are:
0343 '{primary,main} / {secondary,replica,subordinate}'
0344 '{initiator,requester} / {target,responder}'
0345 '{controller,host} / {device,worker,proxy}'
0346 'leader / follower'
0347 'director / performer'
0348
0349 Recommended replacements for 'blacklist/whitelist' are:
0350 'denylist / allowlist'
0351 'blocklist / passlist'
0352
Exceptions for introducing new usage are to maintain a userspace ABI/API,
0354 or when updating code for an existing (as of 2020) hardware or protocol
0355 specification that mandates those terms. For new specifications
0356 translate specification usage of the terminology to the kernel coding
0357 standard where possible.
0358
0359 5) Typedefs
0360 -----------
0361
0362 Please don't use things like ``vps_t``.
0363 It's a **mistake** to use typedef for structures and pointers. When you see a
0364
0365 .. code-block:: c
0366
0367
0368 vps_t a;
0369
0370 in the source, what does it mean?
0371 In contrast, if it says
0372
0373 .. code-block:: c
0374
0375 struct virtual_container *a;
0376
0377 you can actually tell what ``a`` is.
0378
0379 Lots of people think that typedefs ``help readability``. Not so. They are
0380 useful only for:
0381
0382 (a) totally opaque objects (where the typedef is actively used to **hide**
0383 what the object is).
0384
0385 Example: ``pte_t`` etc. opaque objects that you can only access using
0386 the proper accessor functions.
0387
0388 .. note::
0389
0390 Opaqueness and ``accessor functions`` are not good in themselves.
0391 The reason we have them for things like pte_t etc. is that there
0392 really is absolutely **zero** portably accessible information there.
0393
0394 (b) Clear integer types, where the abstraction **helps** avoid confusion
0395 whether it is ``int`` or ``long``.
0396
0397 u8/u16/u32 are perfectly fine typedefs, although they fit into
0398 category (d) better than here.
0399
0400 .. note::
0401
0402 Again - there needs to be a **reason** for this. If something is
0403 ``unsigned long``, then there's no reason to do
0404
0405 typedef unsigned long myflags_t;
0406
0407 but if there is a clear reason for why it under certain circumstances
0408 might be an ``unsigned int`` and under other configurations might be
0409 ``unsigned long``, then by all means go ahead and use a typedef.
0410
0411 (c) when you use sparse to literally create a **new** type for
0412 type-checking.
0413
0414 (d) New types which are identical to standard C99 types, in certain
0415 exceptional circumstances.
0416
0417 Although it would only take a short amount of time for the eyes and
0418 brain to become accustomed to the standard types like ``uint32_t``,
0419 some people object to their use anyway.
0420
0421 Therefore, the Linux-specific ``u8/u16/u32/u64`` types and their
0422 signed equivalents which are identical to standard types are
0423 permitted -- although they are not mandatory in new code of your
0424 own.
0425
0426 When editing existing code which already uses one or the other set
0427 of types, you should conform to the existing choices in that code.
0428
0429 (e) Types safe for use in userspace.
0430
0431 In certain structures which are visible to userspace, we cannot
0432 require C99 types and cannot use the ``u32`` form above. Thus, we
0433 use __u32 and similar types in all structures which are shared
0434 with userspace.
0435
0436 Maybe there are other cases too, but the rule should basically be to NEVER
0437 EVER use a typedef unless you can clearly match one of those rules.
0438
0439 In general, a pointer, or a struct that has elements that can reasonably
0440 be directly accessed should **never** be a typedef.
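
To make the distinction concrete, here is a hedged sketch in plain C. The
type ``token_t`` and its accessors ``make_token()`` and ``token_val()`` are
invented for this example and are not kernel interfaces. They illustrate
case (a), where the typedef hides the representation on purpose, next to a
struct whose members are accessed directly and which therefore stays a
plain ``struct``.

```c
/*
 * Hypothetical opaque value in the spirit of case (a): callers never
 * look inside, they only go through the accessors below.
 */
typedef struct {
	unsigned long raw;
} token_t;

static inline token_t make_token(unsigned long val)
{
	token_t t = { .raw = val << 1 };

	return t;
}

static inline unsigned long token_val(token_t t)
{
	return t.raw >> 1;
}

/*
 * A struct whose members are meant to be accessed directly stays a
 * plain struct, with no typedef hiding what it is.
 */
struct virtual_container {
	int id;
	token_t token;
};
```

The shifted encoding inside ``token_t`` is arbitrary. It is only there to
show that callers cannot rely on the stored representation.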
0441
0442
0443 6) Functions
0444 ------------
0445
0446 Functions should be short and sweet, and do just one thing. They should
0447 fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24,
0448 as we all know), and do one thing and do that well.
0449
0450 The maximum length of a function is inversely proportional to the
0451 complexity and indentation level of that function. So, if you have a
0452 conceptually simple function that is just one long (but simple)
0453 case-statement, where you have to do lots of small things for a lot of
0454 different cases, it's OK to have a longer function.
0455
0456 However, if you have a complex function, and you suspect that a
0457 less-than-gifted first-year high-school student might not even
0458 understand what the function is all about, you should adhere to the
0459 maximum limits all the more closely. Use helper functions with
0460 descriptive names (you can ask the compiler to in-line them if you think
0461 it's performance-critical, and it will probably do a better job of it
0462 than you would have done).
0463
0464 Another measure of the function is the number of local variables. They
0465 shouldn't exceed 5-10, or you're doing something wrong. Re-think the
0466 function, and split it into smaller pieces. A human brain can
0467 generally easily keep track of about 7 different things, anything more
0468 and it gets confused. You know you're brilliant, but maybe you'd like
0469 to understand what you did 2 weeks from now.
0470
0471 In source files, separate functions with one blank line. If the function is
0472 exported, the **EXPORT** macro for it should follow immediately after the
0473 closing function brace line. E.g.:
0474
0475 .. code-block:: c
0476
	int system_is_up(void)
	{
		return system_state == SYSTEM_RUNNING;
	}
	EXPORT_SYMBOL(system_is_up);
0482
0483 6.1) Function prototypes
0484 ************************
0485
0486 In function prototypes, include parameter names with their data types.
0487 Although this is not required by the C language, it is preferred in Linux
0488 because it is a simple way to add valuable information for the reader.
0489
0490 Do not use the ``extern`` keyword with function declarations as this makes
0491 lines longer and isn't strictly necessary.
0492
0493 When writing function prototypes, please keep the `order of elements regular
0494 <https://lore.kernel.org/mm-commits/CAHk-=wiOCLRny5aifWNhr621kYrJwhfURsa0vFPeUEm8mF0ufg@mail.gmail.com/>`_.
0495 For example, using this function declaration example::
0496
 __init void * __must_check action(enum magic value, size_t size, u8 count,
				   char *fmt, ...) __printf(4, 5) __malloc;
0499
0500 The preferred order of elements for a function prototype is:
0501
0502 - storage class (below, ``static __always_inline``, noting that ``__always_inline``
0503 is technically an attribute but is treated like ``inline``)
0504 - storage class attributes (here, ``__init`` -- i.e. section declarations, but also
0505 things like ``__cold``)
0506 - return type (here, ``void *``)
0507 - return type attributes (here, ``__must_check``)
0508 - function name (here, ``action``)
0509 - function parameters (here, ``(enum magic value, size_t size, u8 count, char *fmt, ...)``,
0510 noting that parameter names should always be included)
0511 - function parameter attributes (here, ``__printf(4, 5)``)
0512 - function behavior attributes (here, ``__malloc``)
0513
0514 Note that for a function **definition** (i.e. the actual function body),
0515 the compiler does not allow function parameter attributes after the
0516 function parameters. In these cases, they should go after the storage
0517 class attributes (e.g. note the changed position of ``__printf(4, 5)``
0518 below, compared to the **declaration** example above)::
0519
 static __always_inline __init __printf(4, 5) void * __must_check action(enum magic value,
		size_t size, u8 count, char *fmt, ...) __malloc
 {
	...
 }
0525
0526 7) Centralized exiting of functions
0527 -----------------------------------
0528
0529 Albeit deprecated by some people, the equivalent of the goto statement is
0530 used frequently by compilers in form of the unconditional jump instruction.
0531
0532 The goto statement comes in handy when a function exits from multiple
0533 locations and some common work such as cleanup has to be done. If there is no
0534 cleanup needed then just return directly.
0535
0536 Choose label names which say what the goto does or why the goto exists. An
0537 example of a good name could be ``out_free_buffer:`` if the goto frees ``buffer``.
0538 Avoid using GW-BASIC names like ``err1:`` and ``err2:``, as you would have to
0539 renumber them if you ever add or remove exit paths, and they make correctness
0540 difficult to verify anyway.
0541
0542 The rationale for using gotos is:
0543
0544 - unconditional statements are easier to understand and follow
0545 - nesting is reduced
0546 - errors by not updating individual exit points when making
0547 modifications are prevented
0548 - saves the compiler work to optimize redundant code away ;)
0549
0550 .. code-block:: c
0551
	int fun(int a)
	{
		int result = 0;
		char *buffer;

		buffer = kmalloc(SIZE, GFP_KERNEL);
		if (!buffer)
			return -ENOMEM;

		if (condition1) {
			while (loop1) {
				...
			}
			result = 1;
			goto out_free_buffer;
		}
		...
	out_free_buffer:
		kfree(buffer);
		return result;
	}
0573
0574 A common type of bug to be aware of is ``one err bugs`` which look like this:
0575
0576 .. code-block:: c
0577
	err:
		kfree(foo->bar);
		kfree(foo);
		return ret;
0582
0583 The bug in this code is that on some exit paths ``foo`` is NULL. Normally the
0584 fix for this is to split it up into two error labels ``err_free_bar:`` and
0585 ``err_free_foo:``:
0586
0587 .. code-block:: c
0588
	err_free_bar:
		kfree(foo->bar);
	err_free_foo:
		kfree(foo);
		return ret;
0594
0595 Ideally you should simulate errors to test all exit paths.
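
One way to do that is sketched below as a userspace program fragment, with
``malloc()`` and ``free()`` standing in for ``kmalloc()`` and ``kfree()``.
The functions ``foo_create()`` and ``foo_destroy()`` and the ``fail_step``
parameter are assumptions made for this example: ``fail_step`` forces a
failure at a chosen point so that each exit path can be exercised.

```c
#include <errno.h>
#include <stdlib.h>

struct foo {
	char *bar;
};

/*
 * @fail_step simulates an error: 1 fails the first allocation,
 * 2 fails the second, 3 fails the later work, anything else succeeds.
 */
int foo_create(struct foo **out, int fail_step)
{
	struct foo *foo;
	int ret;

	foo = fail_step == 1 ? NULL : malloc(sizeof(*foo));
	if (!foo)
		return -ENOMEM;		/* nothing to clean up yet */

	foo->bar = fail_step == 2 ? NULL : malloc(16);
	if (!foo->bar) {
		ret = -ENOMEM;
		goto err_free_foo;
	}

	if (fail_step == 3) {
		ret = -EINVAL;
		goto err_free_bar;
	}

	*out = foo;
	return 0;

err_free_bar:
	free(foo->bar);
err_free_foo:
	free(foo);
	return ret;
}

void foo_destroy(struct foo *foo)
{
	free(foo->bar);
	free(foo);
}
```

Each label names what it frees, and the labels are ordered so that a later
failure falls through the earlier cleanup steps.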
0596
0597
0598 8) Commenting
0599 -------------
0600
0601 Comments are good, but there is also a danger of over-commenting. NEVER
0602 try to explain HOW your code works in a comment: it's much better to
0603 write the code so that the **working** is obvious, and it's a waste of
0604 time to explain badly written code.
0605
0606 Generally, you want your comments to tell WHAT your code does, not HOW.
0607 Also, try to avoid putting comments inside a function body: if the
0608 function is so complex that you need to separately comment parts of it,
0609 you should probably go back to chapter 6 for a while. You can make
0610 small comments to note or warn about something particularly clever (or
0611 ugly), but try to avoid excess. Instead, put the comments at the head
0612 of the function, telling people what it does, and possibly WHY it does
0613 it.
0614
0615 When commenting the kernel API functions, please use the kernel-doc format.
0616 See the files at :ref:`Documentation/doc-guide/ <doc_guide>` and
0617 ``scripts/kernel-doc`` for details.
0618
0619 The preferred style for long (multi-line) comments is:
0620
0621 .. code-block:: c
0622
	/*
	 * This is the preferred style for multi-line
	 * comments in the Linux kernel source code.
	 * Please use it consistently.
	 *
	 * Description:  A column of asterisks on the left side,
	 * with beginning and ending almost-blank lines.
	 */
0631
0632 For files in net/ and drivers/net/ the preferred style for long (multi-line)
0633 comments is a little different.
0634
0635 .. code-block:: c
0636
	/* The preferred comment style for files in net/ and drivers/net
	 * looks like this.
	 *
	 * It is nearly the same as the generally preferred comment style,
	 * but there is no initial almost-blank line.
	 */
0643
0644 It's also important to comment data, whether they are basic types or derived
0645 types. To this end, use just one data declaration per line (no commas for
0646 multiple data declarations). This leaves you room for a small comment on each
0647 item, explaining its use.
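
For instance, a structure declared this way has room for a short note next
to every member. The structure ``ring_stats`` and its fields are
hypothetical.

```c
/* Hypothetical structure, used only to illustrate per-member comments. */
struct ring_stats {
	unsigned long packets;	/* packets handed to the stack */
	unsigned long bytes;	/* payload bytes in those packets */
	unsigned int dropped;	/* packets dropped because the ring was full */
	unsigned int budget;	/* maximum packets processed per poll */
};
```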
0648
0649
0650 9) You've made a mess of it
0651 ---------------------------
0652
0653 That's OK, we all do. You've probably been told by your long-time Unix
0654 user helper that ``GNU emacs`` automatically formats the C sources for
0655 you, and you've noticed that yes, it does do that, but the defaults it
0656 uses are less than desirable (in fact, they are worse than random
0657 typing - an infinite number of monkeys typing into GNU emacs would never
0658 make a good program).
0659
0660 So, you can either get rid of GNU emacs, or change it to use saner
0661 values. To do the latter, you can stick the following in your .emacs file:
0662
0663 .. code-block:: none
0664
0665 (defun c-lineup-arglist-tabs-only (ignored)
0666 "Line up argument lists by tabs, not spaces"
0667 (let* ((anchor (c-langelem-pos c-syntactic-element))
0668 (column (c-langelem-2nd-pos c-syntactic-element))
0669 (offset (- (1+ column) anchor))
0670 (steps (floor offset c-basic-offset)))
0671 (* (max steps 1)
0672 c-basic-offset)))
0673
0674 (dir-locals-set-class-variables
0675 'linux-kernel
0676 '((c-mode . (
0677 (c-basic-offset . 8)
0678 (c-label-minimum-indentation . 0)
0679 (c-offsets-alist . (
0680 (arglist-close . c-lineup-arglist-tabs-only)
0681 (arglist-cont-nonempty .
0682 (c-lineup-gcc-asm-reg c-lineup-arglist-tabs-only))
0683 (arglist-intro . +)
0684 (brace-list-intro . +)
0685 (c . c-lineup-C-comments)
0686 (case-label . 0)
0687 (comment-intro . c-lineup-comment)
0688 (cpp-define-intro . +)
0689 (cpp-macro . -1000)
0690 (cpp-macro-cont . +)
0691 (defun-block-intro . +)
0692 (else-clause . 0)
0693 (func-decl-cont . +)
0694 (inclass . +)
0695 (inher-cont . c-lineup-multi-inher)
0696 (knr-argdecl-intro . 0)
0697 (label . -1000)
0698 (statement . 0)
0699 (statement-block-intro . +)
0700 (statement-case-intro . +)
0701 (statement-cont . +)
0702 (substatement . +)
0703 ))
0704 (indent-tabs-mode . t)
0705 (show-trailing-whitespace . t)
0706 ))))
0707
0708 (dir-locals-set-directory-class
0709 (expand-file-name "~/src/linux-trees")
0710 'linux-kernel)
0711
0712 This will make emacs go better with the kernel coding style for C
0713 files below ``~/src/linux-trees``.
0714
0715 But even if you fail in getting emacs to do sane formatting, not
0716 everything is lost: use ``indent``.
0717
0718 Now, again, GNU indent has the same brain-dead settings that GNU emacs
0719 has, which is why you need to give it a few command line options.
0720 However, that's not too bad, because even the makers of GNU indent
0721 recognize the authority of K&R (the GNU people aren't evil, they are
0722 just severely misguided in this matter), so you just give indent the
0723 options ``-kr -i8`` (stands for ``K&R, 8 character indents``), or use
0724 ``scripts/Lindent``, which indents in the latest style.
0725
0726 ``indent`` has a lot of options, and especially when it comes to comment
0727 re-formatting you may want to take a look at the man page. But
0728 remember: ``indent`` is not a fix for bad programming.
0729
0730 Note that you can also use the ``clang-format`` tool to help you with
0731 these rules, to quickly re-format parts of your code automatically,
0732 and to review full files in order to spot coding style mistakes,
0733 typos and possible improvements. It is also handy for sorting ``#includes``,
0734 for aligning variables/macros, for reflowing text and other similar tasks.
0735 See the file :ref:`Documentation/process/clang-format.rst <clangformat>`
0736 for more details.
0737
0738
0739 10) Kconfig configuration files
0740 -------------------------------
0741
0742 For all of the Kconfig* configuration files throughout the source tree,
0743 the indentation is somewhat different. Lines under a ``config`` definition
0744 are indented with one tab, while help text is indented an additional two
0745 spaces. Example::
0746
  config AUDIT
	bool "Auditing support"
	depends on NET
	help
	  Enable auditing infrastructure that can be used with another
	  kernel subsystem, such as SELinux (which requires this for
	  logging of avc messages output).  Does not do system-call
	  auditing without CONFIG_AUDITSYSCALL.
0755
0756 Seriously dangerous features (such as write support for certain
0757 filesystems) should advertise this prominently in their prompt string::
0758
  config ADFS_FS_RW
	bool "ADFS write support (DANGEROUS)"
	depends on ADFS_FS
	...
0763
0764 For full documentation on the configuration files, see the file
0765 Documentation/kbuild/kconfig-language.rst.
0766
0767
0768 11) Data structures
0769 -------------------
0770
0771 Data structures that have visibility outside the single-threaded
0772 environment they are created and destroyed in should always have
0773 reference counts. In the kernel, garbage collection doesn't exist (and
0774 outside the kernel garbage collection is slow and inefficient), which
0775 means that you absolutely **have** to reference count all your uses.
0776
0777 Reference counting means that you can avoid locking, and allows multiple
0778 users to have access to the data structure in parallel - and not having
0779 to worry about the structure suddenly going away from under them just
0780 because they slept or did something else for a while.
0781
0782 Note that locking is **not** a replacement for reference counting.
0783 Locking is used to keep data structures coherent, while reference
0784 counting is a memory management technique. Usually both are needed, and
0785 they are not to be confused with each other.
0786
0787 Many data structures can indeed have two levels of reference counting,
0788 when there are users of different ``classes``. The subclass count counts
0789 the number of subclass users, and decrements the global count just once
0790 when the subclass count goes to zero.
0791
0792 Examples of this kind of ``multi-level-reference-counting`` can be found in
0793 memory management (``struct mm_struct``: mm_users and mm_count), and in
0794 filesystem code (``struct super_block``: s_count and s_active).
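
The idea can be sketched in userspace with C11 atomics standing in for the
kernel's own primitives. Everything here (``struct obj``, ``obj_get_user()``,
``obj_put_user()``, ``obj_grab()``, ``obj_drop()``) is a hypothetical
illustration loosely modelled on the users/count split described above, not
the actual ``mm_struct`` code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct obj {
	atomic_int users;	/* subclass users */
	atomic_int count;	/* references to the structure itself */
	bool resources_live;	/* what the subclass users keep alive */
};

struct obj *obj_alloc(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;

	atomic_init(&o->users, 1);
	atomic_init(&o->count, 1);	/* held on behalf of all users */
	o->resources_live = true;
	return o;
}

void obj_grab(struct obj *o)
{
	atomic_fetch_add(&o->count, 1);
}

/* Returns true if this call freed the structure. */
bool obj_drop(struct obj *o)
{
	if (atomic_fetch_sub(&o->count, 1) != 1)
		return false;

	free(o);
	return true;
}

void obj_get_user(struct obj *o)
{
	atomic_fetch_add(&o->users, 1);
}

/* Returns true if this call freed the structure. */
bool obj_put_user(struct obj *o)
{
	if (atomic_fetch_sub(&o->users, 1) != 1)
		return false;

	/* Last user: release the resources, then drop the shared reference once. */
	o->resources_live = false;
	return obj_drop(o);
}
```

The structure outlives its last user for as long as somebody still holds a
plain reference taken with ``obj_grab()``.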
0795
0796 Remember: if another thread can find your data structure, and you don't
0797 have a reference count on it, you almost certainly have a bug.
0798
0799
0800 12) Macros, Enums and RTL
0801 -------------------------
0802
0803 Names of macros defining constants and labels in enums are capitalized.
0804
0805 .. code-block:: c
0806
0807 #define CONSTANT 0x12345
0808
0809 Enums are preferred when defining several related constants.
0810
0811 CAPITALIZED macro names are appreciated but macros resembling functions
0812 may be named in lower case.
0813
0814 Generally, inline functions are preferable to macros resembling functions.
0815
0816 Macros with multiple statements should be enclosed in a do - while block:
0817
0818 .. code-block:: c
0819
	#define macrofun(a, b, c)			\
		do {					\
			if (a == 5)			\
				do_this(b, c);		\
		} while (0)
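
The reason for the wrapper shows up once the macro is used as the body of an
``if`` that has an ``else``. In the sketch below, ``calls``, ``do_this()``
and ``run_macrofun()`` are hypothetical scaffolding, and the macro arguments
are additionally parenthesized. Because the macro expands to exactly one
statement, the trailing semicolon and the ``else`` still parse as intended.

```c
/* Hypothetical counter so the effect of the macro is observable. */
static int calls;

static void do_this(int b, int c)
{
	calls += b + c;
}

/* Wrapped in do - while (0): expands to exactly one statement. */
#define macrofun(a, b, c)			\
	do {					\
		if ((a) == 5)			\
			do_this((b), (c));	\
	} while (0)

/* Returns what do_this() accumulated, or -1 for non-positive input. */
int run_macrofun(int a)
{
	calls = 0;

	if (a > 0)
		macrofun(a, 1, 2);
	else
		calls = -1;

	return calls;
}
```

With a bare ``{ ... }`` block instead, the semicolon after the macro call
would end the ``if`` statement and the ``else`` would fail to compile.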
0825
0826 Things to avoid when using macros:
0827
0828 1) macros that affect control flow:
0829
0830 .. code-block:: c
0831
	#define FOO(x)					\
		do {					\
			if (blah(x) < 0)		\
				return -EBUGGERED;	\
		} while (0)
0837
0838 is a **very** bad idea. It looks like a function call but exits the ``calling``
0839 function; don't break the internal parsers of those who will read the code.
0840
0841 2) macros that depend on having a local variable with a magic name:
0842
0843 .. code-block:: c
0844
0845 #define FOO(val) bar(index, val)
0846
0847 might look like a good thing, but it's confusing as hell when one reads the
0848 code and it's prone to breakage from seemingly innocent changes.
0849
0850 3) macros with arguments that are used as l-values: FOO(x) = y; will
0851 bite you if somebody e.g. turns FOO into an inline function.
0852
0853 4) forgetting about precedence: macros defining constants using expressions
0854 must enclose the expression in parentheses. Beware of similar issues with
0855 macros using parameters.
0856
0857 .. code-block:: c
0858
0859 #define CONSTANT 0x4000
0860 #define CONSTEXP (CONSTANT | 3)
0861
0862 5) namespace collisions when defining local variables in macros resembling
0863 functions:
0864
0865 .. code-block:: c
0866
	#define FOO(x)				\
	({					\
		typeof(x) ret;			\
		ret = calc_ret(x);		\
		(ret);				\
	})
0873
0874 ret is a common name for a local variable - __foo_ret is less likely
0875 to collide with an existing variable.
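
The precedence pitfall from item 4 is easy to demonstrate. In this sketch
the macros ``DOUBLE_BAD()`` and ``DOUBLE_OK()`` and the two wrapper
functions are hypothetical. Only the second macro parenthesizes its
parameter and its whole expression.

```c
#define CONSTANT	0x4000
#define CONSTEXP	(CONSTANT | 3)

/* Hypothetical macros: the same parenthesization applied to parameters. */
#define DOUBLE_BAD(x)	x * 2
#define DOUBLE_OK(x)	((x) * 2)

int double_bad(int a, int b)
{
	return DOUBLE_BAD(a + b);	/* expands to a + b * 2 */
}

int double_ok(int a, int b)
{
	return DOUBLE_OK(a + b);	/* expands to ((a + b) * 2) */
}
```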

The cpp manual deals with macros exhaustively. The gcc internals manual also
covers RTL which is used frequently with assembly language in the kernel.


13) Printing kernel messages
----------------------------

Kernel developers like to be seen as literate. Do mind the spelling
of kernel messages to make a good impression. Do not use incorrect
contractions like ``dont``; use ``do not`` or ``don't`` instead. Make the
messages concise, clear, and unambiguous.

Kernel messages do not have to be terminated with a period.

Printing numbers in parentheses (%d) adds no value and should be avoided.

There are a number of driver model diagnostic macros in <linux/dev_printk.h>
which you should use to make sure messages are matched to the right device
and driver, and are tagged with the right level: dev_err(), dev_warn(),
dev_info(), and so forth. For messages that aren't associated with a
particular device, <linux/printk.h> defines pr_notice(), pr_info(),
pr_warn(), pr_err(), etc.

Coming up with good debugging messages can be quite a challenge; and once
you have them, they can be a huge help for remote troubleshooting. However,
debug message printing is handled differently than printing other non-debug
messages. While the other pr_XXX() functions print unconditionally,
pr_debug() does not; it is compiled out by default, unless either DEBUG is
defined or CONFIG_DYNAMIC_DEBUG is set. That is true for dev_dbg() also,
and a related convention uses VERBOSE_DEBUG to add dev_vdbg() messages to
the ones already enabled by DEBUG.
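
The "compiled out unless DEBUG is defined" behaviour can be imitated in user
space. The following is only a sketch: ``sketch_pr_debug()`` and
``sketch_probe()`` are hypothetical names, ``fprintf()`` stands in for the
kernel's printing machinery, and CONFIG_DYNAMIC_DEBUG is not modelled at all:

.. code-block:: c

	#include <stdio.h>

	#ifdef DEBUG
	/* With DEBUG defined the message is really printed. */
	#define sketch_pr_debug(fmt, ...) \
		fprintf(stderr, "debug: " fmt, ##__VA_ARGS__)
	#else
	/*
	 * Without DEBUG the call sits behind if (0): the compiler still checks
	 * the format string against the arguments, but never executes the call.
	 */
	#define sketch_pr_debug(fmt, ...) \
		do { \
			if (0) \
				fprintf(stderr, fmt, ##__VA_ARGS__); \
		} while (0)
	#endif

	/* Hypothetical caller: the debug line is dead code in a non-DEBUG build. */
	static inline int sketch_probe(int id)
	{
		sketch_pr_debug("probing device %d\n", id);
		return id >= 0 ? 0 : -1;
	}

Building the same file with ``-DDEBUG`` turns the message on without touching
the caller.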

Many subsystems have Kconfig debug options to turn on -DDEBUG in the
corresponding Makefile; in other cases specific files #define DEBUG. And
when a debug message should be unconditionally printed, such as if it is
already inside a debug-related #ifdef section, printk(KERN_DEBUG ...) can be
used.


14) Allocating memory
---------------------

The kernel provides the following general purpose memory allocators:
kmalloc(), kzalloc(), kmalloc_array(), kcalloc(), vmalloc(), and
vzalloc(). Please refer to the API documentation for further information
about them: :ref:`Documentation/core-api/memory-allocation.rst
<memory_allocation>`

The preferred form for passing a size of a struct is the following:

.. code-block:: c

	p = kmalloc(sizeof(*p), ...);

The alternative form where struct name is spelled out hurts readability and
introduces an opportunity for a bug when the pointer variable type is changed
but the corresponding sizeof that is passed to a memory allocator is not.

Casting the return value which is a void pointer is redundant. The conversion
from void pointer to any other pointer type is guaranteed by the C programming
language.

The preferred form for allocating an array is the following:

.. code-block:: c

	p = kmalloc_array(n, sizeof(...), ...);

The preferred form for allocating a zeroed array is the following:

.. code-block:: c

	p = kcalloc(n, sizeof(...), ...);

Both forms check for overflow on the allocation size n * sizeof(...),
and return NULL if that occurred.
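
To see why the overflow check matters, here is a user-space sketch of the
idea. ``sketch_alloc_array()`` is a hypothetical helper built on ``malloc()``;
it is not how the kernel implements kmalloc_array(), it only shows the kind of
check those functions spare you from writing by hand:

.. code-block:: c

	#include <stdint.h>
	#include <stdlib.h>

	/*
	 * Hypothetical stand-in for an overflow-checked array allocator.
	 * Returns NULL instead of a too-small buffer when n * size would wrap.
	 */
	static inline void *sketch_alloc_array(size_t n, size_t size)
	{
		if (size != 0 && n > SIZE_MAX / size)
			return NULL;
		return malloc(n * size);
	}

A call such as ``p = sketch_alloc_array(n, sizeof(*p));`` also follows the two
earlier rules: the size comes from the pointer itself, and the void pointer is
assigned without a cast.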

These generic allocation functions all emit a stack dump on failure when used
without __GFP_NOWARN so there is no use in emitting an additional failure
message when NULL is returned.


15) The inline disease
----------------------

There appears to be a common misperception that gcc has a magic "make me
faster" speedup option called ``inline``. While the use of inlines can be
appropriate (for example as a means of replacing macros, see Chapter 12), it
very often is not. Abundant use of the inline keyword leads to a much bigger
kernel, which in turn slows the system as a whole down, due to a bigger
icache footprint for the CPU and simply because there is less memory
available for the pagecache. Just think about it; a pagecache miss causes a
disk seek, which easily takes 5 milliseconds. There are a LOT of cpu cycles
that can go into these 5 milliseconds.

A reasonable rule of thumb is to not put inline on functions that have more
than 3 lines of code in them. Exceptions to this rule are the cases where
a parameter is known to be a compile-time constant, and as a result of this
constantness you *know* the compiler will be able to optimize most of your
function away at compile time. For a good example of this latter case, see
the kmalloc() inline function.
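
The compile-time-constant exception looks roughly like the sketch below. The
function names and size classes are invented for illustration and are not
taken from the real kmalloc(); the point is only that a constant argument lets
the optimizer collapse the whole chain of comparisons:

.. code-block:: c

	#include <stddef.h>

	/*
	 * Hypothetical size-class lookup. More than 3 lines, but when `size`
	 * is a compile-time constant the chain can fold to a single constant.
	 */
	static inline unsigned int sketch_size_class(size_t size)
	{
		if (size <= 32)
			return 0;
		if (size <= 64)
			return 1;
		if (size <= 128)
			return 2;
		return 3;
	}

	/* sizeof() is constant, so an optimizing build can reduce this to 2. */
	static inline unsigned int sketch_class_for_buffer(void)
	{
		char buffer[100];

		return sketch_size_class(sizeof(buffer));
	}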

Often people argue that adding inline to functions that are static and used
only once is always a win since there is no space tradeoff. While this is
technically correct, gcc is capable of inlining these automatically without
help, and the maintenance issue of removing the inline when a second user
appears outweighs the potential value of the hint that tells gcc to do
something it would have done anyway.


16) Function return values and names
------------------------------------

Functions can return values of many different kinds, and one of the
most common is a value indicating whether the function succeeded or
failed. Such a value can be represented as an error-code integer
(-Exxx = failure, 0 = success) or a ``succeeded`` boolean (0 = failure,
non-zero = success).

Mixing up these two sorts of representations is a fertile source of
difficult-to-find bugs. If the C language included a strong distinction
between integers and booleans then the compiler would find these mistakes
for us... but it doesn't. To help prevent such bugs, always follow this
convention::

	If the name of a function is an action or an imperative command,
	the function should return an error-code integer. If the name
	is a predicate, the function should return a "succeeded" boolean.

For example, ``add work`` is a command, and the add_work() function returns 0
for success or -EBUSY for failure. In the same way, ``PCI device present`` is
a predicate, and the pci_dev_present() function returns 1 if it succeeds in
finding a matching device or 0 if it doesn't.
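
A minimal user-space sketch of the convention, using a hypothetical
fixed-depth queue (none of these names exist in the kernel):

.. code-block:: c

	#include <errno.h>
	#include <stdbool.h>

	#define SKETCH_QUEUE_DEPTH 2

	struct sketch_queue {
		int items[SKETCH_QUEUE_DEPTH];
		int count;
	};

	/* Predicate name: returns a "succeeded" boolean. */
	static inline bool sketch_queue_is_full(const struct sketch_queue *q)
	{
		return q->count == SKETCH_QUEUE_DEPTH;
	}

	/* Imperative name: returns 0 on success or a negative error code. */
	static inline int sketch_queue_add(struct sketch_queue *q, int item)
	{
		if (sketch_queue_is_full(q))
			return -EBUSY;
		q->items[q->count++] = item;
		return 0;
	}

With the names chosen this way, ``if (sketch_queue_is_full(q))`` and
``if (sketch_queue_add(q, x))`` both read correctly at the call site: the
first branch is taken when the predicate holds, the second when the command
failed.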

All EXPORTed functions must respect this convention, and so should all
public functions. Private (static) functions need not, but it is
recommended that they do.

Functions whose return value is the actual result of a computation, rather
than an indication of whether the computation succeeded, are not subject to
this rule. Generally they indicate failure by returning some out-of-range
result. Typical examples would be functions that return pointers; they use
NULL or the ERR_PTR mechanism to report failure.
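
The idea behind the ERR_PTR mechanism is to carry a small negative error code
in a pointer value that can never be a valid address. The following is a
user-space imitation under that assumption, with hypothetical ``sketch_``
names; the kernel's real helpers live in <linux/err.h> and differ in detail:

.. code-block:: c

	#include <errno.h>
	#include <stdint.h>

	#define SKETCH_MAX_ERRNO 4095

	/* Encode a negative errno value as an out-of-range pointer. */
	static inline void *sketch_err_ptr(long error)
	{
		return (void *)(intptr_t)error;
	}

	/* Recover the error code from an encoded pointer. */
	static inline long sketch_ptr_err(const void *ptr)
	{
		return (long)(intptr_t)ptr;
	}

	/* True if the pointer falls in the top SKETCH_MAX_ERRNO addresses. */
	static inline int sketch_is_err(const void *ptr)
	{
		return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
	}

	/* Hypothetical lookup: a valid pointer, or an encoded error. */
	static inline const char *sketch_lookup_name(int id)
	{
		static const char name[] = "eth0";

		if (id != 0)
			return sketch_err_ptr(-ENOENT);
		return name;
	}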


17) Using bool
--------------

The Linux kernel bool type is an alias for the C99 _Bool type. bool values can
only evaluate to 0 or 1, and implicit or explicit conversion to bool
automatically converts the value to true or false. When using bool types the
!! construction is not needed, which eliminates a class of bugs.
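
The class of bugs in question shows up when a flag test is stored in a narrow
integer. A small sketch (``SKETCH_FLAG_READY`` and the function names are
hypothetical, and an 8-bit ``unsigned char`` is assumed):

.. code-block:: c

	#include <stdbool.h>

	#define SKETCH_FLAG_READY 0x100

	/* Conversion to bool normalizes any non-zero value to true. */
	static inline bool sketch_ready_bool(unsigned int flags)
	{
		return flags & SKETCH_FLAG_READY;
	}

	/* An 8-bit integer keeps only the low byte, so 0x100 becomes 0 ... */
	static inline unsigned char sketch_ready_u8_truncated(unsigned int flags)
	{
		return flags & SKETCH_FLAG_READY;
	}

	/* ... which is why non-bool storage needs the !! construction. */
	static inline unsigned char sketch_ready_u8(unsigned int flags)
	{
		return !!(flags & SKETCH_FLAG_READY);
	}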

When working with bool values the true and false definitions should be used
instead of 1 and 0.

bool function return types and stack variables are always fine to use whenever
appropriate. Use of bool is encouraged to improve readability and is often a
better option than 'int' for storing boolean values.

Do not use bool if cache line layout or size of the value matters, as its size
and alignment vary based on the compiled architecture. Structures that are
optimized for alignment and size should not use bool.

If a structure has many true/false values, consider consolidating them into a
bitfield with 1 bit members, or using an appropriate fixed width type, such as
u8.

Similarly for function arguments, many true/false values can be consolidated
into a single bitwise 'flags' argument and 'flags' can often be a more
readable alternative if the call-sites have naked true/false constants.
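
As an illustration of the 'flags' suggestion, compare a call such as
``open_thing(&h, true, false, true)`` with the sketch below. Every name here
is hypothetical:

.. code-block:: c

	#include <stdbool.h>

	/* Hypothetical flag bits replacing three separate bool parameters. */
	#define SKETCH_OPEN_READ	(1u << 0)
	#define SKETCH_OPEN_WRITE	(1u << 1)
	#define SKETCH_OPEN_NONBLOCK	(1u << 2)

	struct sketch_handle {
		unsigned int flags;
	};

	/* One self-describing argument instead of a row of naked booleans. */
	static inline void sketch_open(struct sketch_handle *h, unsigned int flags)
	{
		h->flags = flags;
	}

	static inline bool sketch_is_nonblocking(const struct sketch_handle *h)
	{
		return h->flags & SKETCH_OPEN_NONBLOCK;
	}

The call site then reads
``sketch_open(&h, SKETCH_OPEN_READ | SKETCH_OPEN_NONBLOCK)``, which needs no
trip to the prototype to understand.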

Otherwise limited use of bool in structures and arguments can improve
readability.


18) Don't re-invent the kernel macros
-------------------------------------

The header file include/linux/kernel.h contains a number of macros that
you should use, rather than explicitly coding some variant of them yourself.
For example, if you need to calculate the length of an array, take advantage
of the macro

.. code-block:: c

	#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

Similarly, if you need to calculate the size of some structure member, use

.. code-block:: c

	#define sizeof_field(t, f) (sizeof(((t*)0)->f))

There are also min() and max() macros that do strict type checking if you
need them. Feel free to peruse that header file to see what else is already
defined that you shouldn't reproduce in your code.
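
For readers who want to try the two macros outside the kernel tree, here is a
user-space sketch. It copies the two definitions quoted above verbatim;
``struct sketch_packet`` and ``sketch_table`` are invented for the example:

.. code-block:: c

	#include <stddef.h>

	/* Local copies of the two definitions quoted above. */
	#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
	#define sizeof_field(t, f) (sizeof(((t*)0)->f))

	struct sketch_packet {
		unsigned char header[6];
		unsigned short length;
	};

	static const int sketch_table[] = { 2, 3, 5, 7, 11 };

	/* The loop bound follows the initializer; no magic number to update. */
	static inline int sketch_table_sum(void)
	{
		int sum = 0;
		size_t i;

		for (i = 0; i < ARRAY_SIZE(sketch_table); i++)
			sum += sketch_table[i];
		return sum;
	}

``sizeof_field(struct sketch_packet, header)`` gives the size of the member
without needing an instance of the structure.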


19) Editor modelines and other cruft
------------------------------------

Some editors can interpret configuration information embedded in source files,
indicated with special markers. For example, emacs interprets lines marked
like this:

.. code-block:: c

	-*- mode: c -*-

Or like this:

.. code-block:: c

	/*
	Local Variables:
	compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
	End:
	*/

Vim interprets markers that look like this:

.. code-block:: c

	/* vim:set sw=8 noet */

Do not include any of these in source files. People have their own personal
editor configurations, and your source files should not override them. This
includes markers for indentation and mode configuration. People may use their
own custom mode, or may have some other magic method for making indentation
work correctly.


20) Inline assembly
-------------------

In architecture-specific code, you may need to use inline assembly to interface
with CPU or platform functionality. Don't hesitate to do so when necessary.
However, don't use inline assembly gratuitously when C can do the job. You can
and should poke hardware from C when possible.

Consider writing simple helper functions that wrap common bits of inline
assembly, rather than repeatedly writing them with slight variations. Remember
that inline assembly can use C parameters.

Large, non-trivial assembly functions should go in .S files, with corresponding
C prototypes defined in C header files. The C prototypes for assembly
functions should use ``asmlinkage``.

You may need to mark your asm statement as volatile, to prevent GCC from
removing it if GCC doesn't notice any side effects. You don't always need to
do so, though, and doing so unnecessarily can limit optimization.

When writing a single inline assembly statement containing multiple
instructions, put each instruction on a separate line in a separate quoted
string, and end each string except the last with ``\n\t`` to properly indent
the next instruction in the assembly output:

.. code-block:: c

	asm ("magic %reg1, #42\n\t"
	     "more_magic %reg2, %reg3"
	     : /* outputs */ : /* inputs */ : /* clobbers */);
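
For something that actually assembles, here is a sketch of a helper wrapping a
two-instruction asm statement. It assumes gcc or clang with AT&T syntax on
x86-64 and falls back to plain C elsewhere; ``sketch_add_u64()`` is a
hypothetical name, and adding two numbers is of course a job C does perfectly
well on its own, so treat it purely as a layout example:

.. code-block:: c

	#include <stdint.h>

	/* Helper that keeps the assembly in one place behind a C interface. */
	static inline uint64_t sketch_add_u64(uint64_t a, uint64_t b)
	{
		uint64_t sum;

	#if defined(__x86_64__) && defined(__GNUC__)
		/* One instruction per string; all but the last end in \n\t. */
		__asm__ ("mov %1, %0\n\t"
			 "add %2, %0"
			 : "=&r" (sum)		/* outputs (early clobber) */
			 : "r" (a), "r" (b)	/* inputs */
			 : "cc");		/* clobbers */
	#else
		/* Other targets: C can do the job, so it does. */
		sum = a + b;
	#endif
		return sum;
	}

The statement is not marked volatile: its only effect is the output operand,
so the compiler is free to drop it when the result is unused.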


21) Conditional Compilation
---------------------------

Wherever possible, don't use preprocessor conditionals (#if, #ifdef) in .c
files; doing so makes code harder to read and logic harder to follow. Instead,
use such conditionals in a header file defining functions for use in those .c
files, providing no-op stub versions in the #else case, and then call those
functions unconditionally from .c files. The compiler will avoid generating
any code for the stub calls, producing identical results, but the logic will
remain easy to follow.
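
A sketch of the stub pattern, squeezed into one listing. The symbol
``CONFIG_SKETCH_STATS`` and both function names are hypothetical:

.. code-block:: c

	/* --- would live in a header, e.g. a hypothetical sketch_stats.h --- */
	#ifdef CONFIG_SKETCH_STATS
	void sketch_stats_record(int value);
	#else
	/* No-op stub: calls compile away when the feature is configured out. */
	static inline void sketch_stats_record(int value)
	{
		(void)value;
	}
	#endif

	/* --- would live in a .c file: no preprocessor conditionals needed --- */
	static inline int sketch_handle_request(int size)
	{
		sketch_stats_record(size);
		return size > 0 ? 0 : -1;
	}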

Prefer to compile out entire functions, rather than portions of functions or
portions of expressions. Rather than putting an ifdef in an expression, factor
out part or all of the expression into a separate helper function and apply the
conditional to that function.

If you have a function or variable which may potentially go unused in a
particular configuration, and the compiler would warn about its definition
going unused, mark the definition as __maybe_unused rather than wrapping it in
a preprocessor conditional. (However, if a function or variable *always* goes
unused, delete it.)

Within code, where possible, use the IS_ENABLED macro to convert a Kconfig
symbol into a C boolean expression, and use it in a normal C conditional:

.. code-block:: c

	if (IS_ENABLED(CONFIG_SOMETHING)) {
		...
	}

The compiler will constant-fold the conditional away, and include or exclude
the block of code just as with an #ifdef, so this will not add any runtime
overhead. However, this approach still allows the C compiler to see the code
inside the block, and check it for correctness (syntax, types, symbol
references, etc). Thus, you still have to use an #ifdef if the code inside the
block references symbols that will not exist if the condition is not met.
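
If you wonder how a possibly undefined symbol can turn into a 0 or 1 at all,
the sketch below imitates the preprocessor trick in user space. It is
simplified: it only recognizes symbols defined to exactly ``1``, it ignores
modules, and every ``SKETCH_`` name is made up for this example:

.. code-block:: c

	/* A symbol defined to 1 pastes into this macro, which inserts "0,". */
	#define SKETCH_ARG_PLACEHOLDER_1 0,
	#define SKETCH_TAKE_SECOND_ARG(ignored, val, ...) val
	#define SKETCH_IS_DEFINED_2(arg1_or_junk) \
		SKETCH_TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
	#define SKETCH_IS_DEFINED_1(val) \
		SKETCH_IS_DEFINED_2(SKETCH_ARG_PLACEHOLDER_##val)
	#define SKETCH_IS_ENABLED(option) SKETCH_IS_DEFINED_1(option)

	/* Hypothetical configuration: this symbol is set, others are not. */
	#define CONFIG_SKETCH_FAST_PATH 1

	static inline int sketch_copy_cost(int len)
	{
		/* Both branches are parsed and type-checked; only one survives. */
		if (SKETCH_IS_ENABLED(CONFIG_SKETCH_FAST_PATH))
			return len / 8;
		return len;
	}

For a defined symbol the argument list becomes ``0, 1, 0, 0`` and the second
argument is 1; for an undefined one the junk token stays glued to the ``1``,
so the second argument is 0.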

At the end of any non-trivial #if or #ifdef block (more than a few lines),
place a comment after the #endif on the same line, noting the conditional
expression used. For instance:

.. code-block:: c

	#ifdef CONFIG_SOMETHING
	...
	#endif /* CONFIG_SOMETHING */


Appendix I) References
----------------------

The C Programming Language, Second Edition
by Brian W. Kernighan and Dennis M. Ritchie.
Prentice Hall, Inc., 1988.
ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).

The Practice of Programming
by Brian W. Kernighan and Rob Pike.
Addison-Wesley, Inc., 1999.
ISBN 0-201-61586-X.

GNU manuals - where in compliance with K&R and this text - for cpp, gcc,
gcc internals and indent, all available from https://www.gnu.org/manual/

WG14 is the international standardization working group for the programming
language C, URL: http://www.open-std.org/JTC1/SC22/WG14/

Kernel :ref:`process/coding-style.rst <codingstyle>`, by greg@kroah.com at OLS 2002:
http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/