.. _codingstyle:

Linux kernel coding style
=========================

This is a short document describing the preferred coding style for the
Linux kernel.  Coding style is very personal, and I won't **force** my
views on anybody, but this is what goes for anything that I have to be
able to maintain, and I'd prefer it for most other things too.  Please
at least consider the points made here.

First off, I'd suggest printing out a copy of the GNU coding standards,
and NOT read it.  Burn them, it's a great symbolic gesture.

Anyway, here goes:


1) Indentation
--------------

Tabs are 8 characters, and thus indentations are also 8 characters.
There are heretic movements that try to make indentations 4 (or even 2!)
characters deep, and that is akin to trying to define the value of PI to
be 3.

Rationale: The whole idea behind indentation is to clearly define where
a block of control starts and ends.  Especially when you've been looking
at your screen for 20 straight hours, you'll find it a lot easier to see
how the indentation works if you have large indentations.

Now, some people will claim that having 8-character indentations makes
the code move too far to the right, and makes it hard to read on an
80-character terminal screen.  The answer to that is that if you need
more than 3 levels of indentation, you're screwed anyway, and should fix
your program.

In short, 8-char indents make things easier to read, and have the added
benefit of warning you when you're nesting your functions too deep.
Heed that warning.
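
As a rough sketch of what heeding that warning can look like, the two
userspace functions below compute the same result.  The names
``count_even_positive_nested()`` and ``count_even_positive()`` are made up
for this illustration; the second version uses ``continue`` so that the
real work stays at one level inside the loop.

```c
#include <stddef.h>

/* Nested version: the real work sits three levels deep. */
static int count_even_positive_nested(const int *vals, size_t n)
{
	int count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (vals[i] > 0) {
			if (vals[i] % 2 == 0)
				count++;
		}
	}
	return count;
}

/* Flattened version: uninteresting values are skipped early. */
static int count_even_positive(const int *vals, size_t n)
{
	int count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (vals[i] <= 0 || vals[i] % 2 != 0)
			continue;
		count++;
	}
	return count;
}
```

Whether an early ``continue`` or a helper function is the better fix
depends on the code; the point is only that the indentation told you
something.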

The preferred way to ease multiple indentation levels in a switch statement is
to align the ``switch`` and its subordinate ``case`` labels in the same column
instead of ``double-indenting`` the ``case`` labels.  E.g.:

.. code-block:: c

        switch (suffix) {
        case 'G':
        case 'g':
                mem <<= 30;
                break;
        case 'M':
        case 'm':
                mem <<= 20;
                break;
        case 'K':
        case 'k':
                mem <<= 10;
                fallthrough;
        default:
                break;
        }

Don't put multiple statements on a single line unless you have
something to hide:

.. code-block:: c

        if (condition) do_this;
          do_something_everytime;

Don't use commas to avoid using braces:

.. code-block:: c

        if (condition)
                do_this(), do_that();

Always use braces for multiple statements:

.. code-block:: c

        if (condition) {
                do_this();
                do_that();
        }

Don't put multiple assignments on a single line either.  Kernel coding style
is super simple.  Avoid tricky expressions.


Outside of comments, documentation and except in Kconfig, spaces are never
used for indentation, and the above example is deliberately broken.

Get a decent editor and don't leave whitespace at the end of lines.


2) Breaking long lines and strings
----------------------------------

Coding style is all about readability and maintainability using commonly
available tools.

The preferred limit on the length of a single line is 80 columns.

Statements longer than 80 columns should be broken into sensible chunks,
unless exceeding 80 columns significantly increases readability and does
not hide information.

Descendants are always substantially shorter than the parent and
are placed substantially to the right.  A very commonly used style
is to align descendants to a function open parenthesis.

These same rules are applied to function headers with a long argument list.

However, never break user-visible strings such as printk messages because
that breaks the ability to grep for them.
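
A minimal userspace sketch of these rules follows.  ``format_region()`` is
a hypothetical helper, and ``snprintf()`` stands in for a kernel print
function.  The argument lists are continued under the open parenthesis,
while the message string is left on one line even though that line runs
past 80 columns.

```c
#include <stdio.h>

/* Hypothetical helper with a long argument list. */
static int format_region(char *buf, size_t len, const char *name,
			 unsigned long start, unsigned long end)
{
	/* The format string stays whole so that grep can still find it. */
	return snprintf(buf, len, "region %s: start=%#lx end=%#lx size=%lu bytes\n",
			name, start, end, end - start);
}
```

Searching the tree for ``size=%lu bytes`` would find this message; it
would not if the string had been split to fit the column limit.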


3) Placing Braces and Spaces
----------------------------

The other issue that always comes up in C styling is the placement of
braces.  Unlike the indent size, there are few technical reasons to
choose one placement strategy over the other, but the preferred way, as
shown to us by the prophets Kernighan and Ritchie, is to put the opening
brace last on the line, and put the closing brace first, thusly:

.. code-block:: c

        if (x is true) {
                we do y
        }

This applies to all non-function statement blocks (if, switch, for,
while, do).  E.g.:

.. code-block:: c

        switch (action) {
        case KOBJ_ADD:
                return "add";
        case KOBJ_REMOVE:
                return "remove";
        case KOBJ_CHANGE:
                return "change";
        default:
                return NULL;
        }

However, there is one special case, namely functions: they have the
opening brace at the beginning of the next line, thus:

.. code-block:: c

        int function(int x)
        {
                body of function
        }

Heretic people all over the world have claimed that this inconsistency
is ...  well ...  inconsistent, but all right-thinking people know that
(a) K&R are **right** and (b) K&R are right.  Besides, functions are
special anyway (you can't nest them in C).

Note that the closing brace is empty on a line of its own, **except** in
the cases where it is followed by a continuation of the same statement,
i.e. a ``while`` in a do-statement or an ``else`` in an if-statement, like
this:

.. code-block:: c

        do {
                body of do-loop
        } while (condition);

and

.. code-block:: c

        if (x == y) {
                ..
        } else if (x > y) {
                ...
        } else {
                ....
        }

Rationale: K&R.

Also, note that this brace-placement also minimizes the number of empty
(or almost empty) lines, without any loss of readability.  Thus, as the
supply of new-lines on your screen is not a renewable resource (think
25-line terminal screens here), you have more empty lines to put
comments on.

Do not unnecessarily use braces where a single statement will do.

.. code-block:: c

        if (condition)
                action();

and

.. code-block:: c

        if (condition)
                do_this();
        else
                do_that();

This does not apply if only one branch of a conditional statement is a single
statement; in the latter case use braces in both branches:

.. code-block:: c

        if (condition) {
                do_this();
                do_that();
        } else {
                otherwise();
        }

Also, use braces when a loop contains more than a single simple statement:

.. code-block:: c

        while (condition) {
                if (test)
                        do_something();
        }

3.1) Spaces
***********

Linux kernel style for use of spaces depends (mostly) on
function-versus-keyword usage.  Use a space after (most) keywords.  The
notable exceptions are sizeof, typeof, alignof, and __attribute__, which look
somewhat like functions (and are usually used with parentheses in Linux,
although they are not required in the language, as in: ``sizeof info`` after
``struct fileinfo info;`` is declared).

So use a space after these keywords::

        if, switch, case, for, do, while

but not with sizeof, typeof, alignof, or __attribute__.  E.g.,

.. code-block:: c


        s = sizeof(struct file);

Do not add spaces around (inside) parenthesized expressions.  This example is
**bad**:

.. code-block:: c


        s = sizeof( struct file );

When declaring pointer data or a function that returns a pointer type, the
preferred use of ``*`` is adjacent to the data name or function name and not
adjacent to the type name.  Examples:

.. code-block:: c


        char *linux_banner;
        unsigned long long memparse(char *ptr, char **retptr);
        char *match_strdup(substring_t *s);

Use one space around (on each side of) most binary and ternary operators,
such as any of these::

        =  +  -  <  >  *  /  %  |  &  ^  <=  >=  ==  !=  ?  :

but no space after unary operators::

        &  *  +  -  ~  !  sizeof  typeof  alignof  __attribute__  defined

no space before the postfix increment & decrement unary operators::

        ++  --

no space after the prefix increment & decrement unary operators::

        ++  --

and no space around the ``.`` and ``->`` structure member operators.
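
Put together, the spacing rules look roughly like this.  The structure
and both helpers are hypothetical and exist only to exercise the
operators listed above.

```c
#include <stddef.h>

struct sample {
	int count;
	int *values;
};

/* Sums the values, treating negative entries as zero. */
static int sample_sum(const struct sample *s)
{
	int total = 0;
	int i;

	/* Space after "for", around binary operators, none around "->". */
	for (i = 0; i < s->count; i++)
		total += s->values[i] > 0 ? s->values[i] : 0;

	return total;
}

/* No space after sizeof, and none inside the parentheses. */
static size_t sample_size(void)
{
	return sizeof(struct sample);
}
```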

Do not leave trailing whitespace at the ends of lines.  Some editors with
``smart`` indentation will insert whitespace at the beginning of new lines as
appropriate, so you can start typing the next line of code right away.
However, some such editors do not remove the whitespace if you end up not
putting a line of code there, such as if you leave a blank line.  As a result,
you end up with lines containing trailing whitespace.

Git will warn you about patches that introduce trailing whitespace, and can
optionally strip the trailing whitespace for you; however, if applying a series
of patches, this may make later patches in the series fail by changing their
context lines.


4) Naming
---------

C is a Spartan language, and your naming conventions should follow suit.
Unlike Modula-2 and Pascal programmers, C programmers do not use cute
names like ThisVariableIsATemporaryCounter. A C programmer would call that
variable ``tmp``, which is much easier to write, and not the least more
difficult to understand.

HOWEVER, while mixed-case names are frowned upon, descriptive names for
global variables are a must.  To call a global function ``foo`` is a
shooting offense.

GLOBAL variables (to be used only if you **really** need them) need to
have descriptive names, as do global functions.  If you have a function
that counts the number of active users, you should call that
``count_active_users()`` or similar; you should **not** call it ``cntusr()``.

Encoding the type of a function into the name (so-called Hungarian
notation) is asinine - the compiler knows the types anyway and can check
those, and it only confuses the programmer.

LOCAL variable names should be short, and to the point.  If you have
some random integer loop counter, it should probably be called ``i``.
Calling it ``loop_counter`` is non-productive, if there is no chance of it
being mis-understood.  Similarly, ``tmp`` can be just about any type of
variable that is used to hold a temporary value.
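
A small sketch of both rules side by side, with descriptive names at file
scope and short ones inside the function.  The table contents,
``NR_USER_SLOTS`` and the convention that a zero slot is unused are
assumptions made up for this example.

```c
#define NR_USER_SLOTS 5

/* Hypothetical global table; a slot holding 0 is treated as unused. */
static int active_user_ids[NR_USER_SLOTS] = { 1000, 0, 1001, 0, 1002 };

/* Global function: the name says what it does. */
int count_active_users(void)
{
	int count = 0;
	int i;

	/* A plain loop counter needs no longer name than "i". */
	for (i = 0; i < NR_USER_SLOTS; i++) {
		if (active_user_ids[i])
			count++;
	}
	return count;
}
```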

If you are afraid to mix up your local variable names, you have another
problem, which is called the function-growth-hormone-imbalance syndrome.
See chapter 6 (Functions).

For symbol names and documentation, avoid introducing new usage of
'master / slave' (or 'slave' independent of 'master') and 'blacklist /
whitelist'.

Recommended replacements for 'master / slave' are:
    '{primary,main} / {secondary,replica,subordinate}'
    '{initiator,requester} / {target,responder}'
    '{controller,host} / {device,worker,proxy}'
    'leader / follower'
    'director / performer'

Recommended replacements for 'blacklist/whitelist' are:
    'denylist / allowlist'
    'blocklist / passlist'

Exceptions for introducing new usage are to maintain a userspace ABI/API,
or when updating code for an existing (as of 2020) hardware or protocol
specification that mandates those terms. For new specifications
translate specification usage of the terminology to the kernel coding
standard where possible.

5) Typedefs
-----------

Please don't use things like ``vps_t``.
It's a **mistake** to use typedef for structures and pointers. When you see a

.. code-block:: c


        vps_t a;

in the source, what does it mean?
In contrast, if it says

.. code-block:: c

        struct virtual_container *a;

you can actually tell what ``a`` is.

Lots of people think that typedefs ``help readability``. Not so. They are
useful only for:

 (a) totally opaque objects (where the typedef is actively used to **hide**
     what the object is).

     Example: ``pte_t`` etc. opaque objects that you can only access using
     the proper accessor functions.

     .. note::

       Opaqueness and ``accessor functions`` are not good in themselves.
       The reason we have them for things like pte_t etc. is that there
       really is absolutely **zero** portably accessible information there.

 (b) Clear integer types, where the abstraction **helps** avoid confusion
     whether it is ``int`` or ``long``.

     u8/u16/u32 are perfectly fine typedefs, although they fit into
     category (d) better than here.

     .. note::

       Again - there needs to be a **reason** for this. If something is
       ``unsigned long``, then there's no reason to do

        typedef unsigned long myflags_t;

     but if there is a clear reason for why it under certain circumstances
     might be an ``unsigned int`` and under other configurations might be
     ``unsigned long``, then by all means go ahead and use a typedef.

 (c) when you use sparse to literally create a **new** type for
     type-checking.

 (d) New types which are identical to standard C99 types, in certain
     exceptional circumstances.

     Although it would only take a short amount of time for the eyes and
     brain to become accustomed to the standard types like ``uint32_t``,
     some people object to their use anyway.

     Therefore, the Linux-specific ``u8/u16/u32/u64`` types and their
     signed equivalents which are identical to standard types are
     permitted -- although they are not mandatory in new code of your
     own.

     When editing existing code which already uses one or the other set
     of types, you should conform to the existing choices in that code.

 (e) Types safe for use in userspace.

     In certain structures which are visible to userspace, we cannot
     require C99 types and cannot use the ``u32`` form above. Thus, we
     use __u32 and similar types in all structures which are shared
     with userspace.

Maybe there are other cases too, but the rule should basically be to NEVER
EVER use a typedef unless you can clearly match one of those rules.

In general, a pointer, or a struct that has elements that can reasonably
be directly accessed should **never** be a typedef.
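
To make the contrast concrete, here is a userspace sketch.  The plain
struct has members that callers read directly, so it gets no typedef;
``demo_pte_t`` imitates case (a), an opaque value reached only through
accessors.  Every ``demo_`` name and ``container_is_flagged()`` are
invented for this example and are not kernel interfaces.

```c
/* Members are meant to be accessed directly, so no typedef. */
struct virtual_container {
	int id;
	unsigned long flags;
};

/* Case (a): an opaque value that is only touched through accessors. */
typedef struct {
	unsigned long val;
} demo_pte_t;

static demo_pte_t demo_mk_pte(unsigned long val)
{
	demo_pte_t pte = { .val = val };

	return pte;
}

static unsigned long demo_pte_val(demo_pte_t pte)
{
	return pte.val;
}

/* The parameter is visibly a pointer to a struct; nothing is hidden. */
static int container_is_flagged(const struct virtual_container *vc)
{
	return vc->flags != 0;
}
```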


6) Functions
------------

Functions should be short and sweet, and do just one thing.  They should
fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24,
as we all know), and do one thing and do that well.

The maximum length of a function is inversely proportional to the
complexity and indentation level of that function.  So, if you have a
conceptually simple function that is just one long (but simple)
case-statement, where you have to do lots of small things for a lot of
different cases, it's OK to have a longer function.

However, if you have a complex function, and you suspect that a
less-than-gifted first-year high-school student might not even
understand what the function is all about, you should adhere to the
maximum limits all the more closely.  Use helper functions with
descriptive names (you can ask the compiler to in-line them if you think
it's performance-critical, and it will probably do a better job of it
than you would have done).

Another measure of the function is the number of local variables.  They
shouldn't exceed 5-10, or you're doing something wrong.  Re-think the
function, and split it into smaller pieces.  A human brain can
generally easily keep track of about 7 different things, anything more
and it gets confused.  You know you're brilliant, but maybe you'd like
to understand what you did 2 weeks from now.
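
As an illustration of the helper-function advice, the sketch below counts
words in a string.  Both function names are hypothetical; the character
test lives in a small helper with a descriptive name, which leaves the
main loop short and with only two local variables.

```c
#include <ctype.h>
#include <stddef.h>

/* Helper with a descriptive name; the compiler is free to inline it. */
static inline int is_word_char(char c)
{
	return isalnum((unsigned char)c) || c == '_';
}

/* Counts runs of word characters in a NUL-terminated string. */
static size_t count_words(const char *s)
{
	size_t words = 0;
	int in_word = 0;

	for (; *s; s++) {
		if (!is_word_char(*s)) {
			in_word = 0;
			continue;
		}
		if (!in_word)
			words++;
		in_word = 1;
	}
	return words;
}
```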

In source files, separate functions with one blank line.  If the function is
exported, the **EXPORT** macro for it should follow immediately after the
closing function brace line.  E.g.:

.. code-block:: c

        int system_is_up(void)
        {
                return system_state == SYSTEM_RUNNING;
        }
        EXPORT_SYMBOL(system_is_up);

6.1) Function prototypes
************************

In function prototypes, include parameter names with their data types.
Although this is not required by the C language, it is preferred in Linux
because it is a simple way to add valuable information for the reader.

Do not use the ``extern`` keyword with function declarations as this makes
lines longer and isn't strictly necessary.
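
For instance, with a hypothetical ``copy_bounded()`` helper, the named
prototype tells the reader which pointer is the destination and what the
size refers to, which the bare-types form shown in the comment does not.

```c
#include <stddef.h>

/* Preferred: parameter names in the prototype, no "extern". */
size_t copy_bounded(char *dst, const char *src, size_t dst_size);

/*
 * Legal C, but it tells the reader much less:
 *
 *	extern size_t copy_bounded(char *, const char *, size_t);
 */

/* Copies at most dst_size - 1 bytes and always terminates dst. */
size_t copy_bounded(char *dst, const char *src, size_t dst_size)
{
	size_t i;

	if (!dst_size)
		return 0;
	for (i = 0; i < dst_size - 1 && src[i]; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return i;
}
```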

When writing function prototypes, please keep the `order of elements regular
<https://lore.kernel.org/mm-commits/CAHk-=wiOCLRny5aifWNhr621kYrJwhfURsa0vFPeUEm8mF0ufg@mail.gmail.com/>`_.
For example, using this function declaration example::

 __init void * __must_check action(enum magic value, size_t size, u8 count,
                                   char *fmt, ...) __printf(4, 5) __malloc;

The preferred order of elements for a function prototype is:

- storage class (below, ``static __always_inline``, noting that ``__always_inline``
  is technically an attribute but is treated like ``inline``)
- storage class attributes (here, ``__init`` -- i.e. section declarations, but also
  things like ``__cold``)
- return type (here, ``void *``)
- return type attributes (here, ``__must_check``)
- function name (here, ``action``)
- function parameters (here, ``(enum magic value, size_t size, u8 count, char *fmt, ...)``,
  noting that parameter names should always be included)
- function parameter attributes (here, ``__printf(4, 5)``)
- function behavior attributes (here, ``__malloc``)

Note that for a function **definition** (i.e. the actual function body),
the compiler does not allow function parameter attributes after the
function parameters. In these cases, they should go after the storage
class attributes (e.g. note the changed position of ``__printf(4, 5)``
below, compared to the **declaration** example above)::

 static __always_inline __init __printf(4, 5) void * __must_check action(enum magic value,
                size_t size, u8 count, char *fmt, ...) __malloc
 {
        ...
 }

7) Centralized exiting of functions
-----------------------------------

Albeit deprecated by some people, the equivalent of the goto statement is
used frequently by compilers in the form of the unconditional jump instruction.

The goto statement comes in handy when a function exits from multiple
locations and some common work such as cleanup has to be done.  If there is no
cleanup needed then just return directly.

Choose label names which say what the goto does or why the goto exists.  An
example of a good name could be ``out_free_buffer:`` if the goto frees ``buffer``.
Avoid using GW-BASIC names like ``err1:`` and ``err2:``, as you would have to
renumber them if you ever add or remove exit paths, and they make correctness
difficult to verify anyway.

The rationale for using gotos is:

- unconditional statements are easier to understand and follow
- nesting is reduced
- errors by not updating individual exit points when making
  modifications are prevented
- saves the compiler work to optimize redundant code away ;)

.. code-block:: c

        int fun(int a)
        {
                int result = 0;
                char *buffer;

                buffer = kmalloc(SIZE, GFP_KERNEL);
                if (!buffer)
                        return -ENOMEM;

                if (condition1) {
                        while (loop1) {
                                ...
                        }
                        result = 1;
                        goto out_free_buffer;
                }
                ...
        out_free_buffer:
                kfree(buffer);
                return result;
        }

A common type of bug to be aware of is ``one err bugs`` which look like this:

.. code-block:: c

        err:
                kfree(foo->bar);
                kfree(foo);
                return ret;

The bug in this code is that on some exit paths ``foo`` is NULL.  Normally the
fix for this is to split it up into two error labels ``err_free_bar:`` and
``err_free_foo:``:

.. code-block:: c

         err_free_bar:
                kfree(foo->bar);
         err_free_foo:
                kfree(foo);
                return ret;

Ideally you should simulate errors to test all exit paths.
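
One way to simulate such errors, sketched here in userspace, is to route
allocations through a wrapper that can be told to fail on demand.
``test_alloc()``, ``fail_alloc_at``, ``foo_demo()`` and
``foo_demo_failing_at()`` are all hypothetical names, and ``malloc()`` and
``free()`` stand in for the kernel allocator.

```c
#include <errno.h>
#include <stdlib.h>

struct foo {
	char *bar;
};

/* Test knob: fail the Nth allocation (1-based); 0 means never fail. */
static int fail_alloc_at;
static int alloc_calls;

static void *test_alloc(size_t size)
{
	alloc_calls++;
	if (alloc_calls == fail_alloc_at)
		return NULL;
	return malloc(size);
}

static int foo_demo(void)
{
	struct foo *foo;
	char *scratch;
	int ret = -ENOMEM;

	foo = test_alloc(sizeof(*foo));
	if (!foo)
		return ret;	/* nothing to clean up yet */

	foo->bar = test_alloc(16);
	if (!foo->bar)
		goto err_free_foo;

	scratch = test_alloc(32);
	if (!scratch)
		goto err_free_bar;

	ret = 0;
	free(scratch);
err_free_bar:
	free(foo->bar);
err_free_foo:
	free(foo);
	return ret;
}

/* Runs foo_demo() with allocation number nth forced to fail. */
static int foo_demo_failing_at(int nth)
{
	alloc_calls = 0;
	fail_alloc_at = nth;
	return foo_demo();
}
```

Calling ``foo_demo_failing_at()`` with 1, 2 and 3 walks each error path in
turn; running the same calls under a memory checker would also show
whether any path leaks or double-frees.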


8) Commenting
-------------

Comments are good, but there is also a danger of over-commenting.  NEVER
try to explain HOW your code works in a comment: it's much better to
write the code so that the **working** is obvious, and it's a waste of
time to explain badly written code.

Generally, you want your comments to tell WHAT your code does, not HOW.
Also, try to avoid putting comments inside a function body: if the
function is so complex that you need to separately comment parts of it,
you should probably go back to chapter 6 for a while.  You can make
small comments to note or warn about something particularly clever (or
ugly), but try to avoid excess.  Instead, put the comments at the head
of the function, telling people what it does, and possibly WHY it does
it.

When commenting the kernel API functions, please use the kernel-doc format.
See the files at :ref:`Documentation/doc-guide/ <doc_guide>` and
``scripts/kernel-doc`` for details.
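
As a brief sketch of the kernel-doc layout (the doc-guide remains the
authority on the details), a function comment looks roughly like this.
``clamp_percent()`` is a hypothetical function used only to carry the
comment.

```c
/**
 * clamp_percent() - Limit a value to the range 0..100.
 * @value: The value to clamp.
 *
 * Hypothetical helper used only to illustrate the comment layout.
 *
 * Return: @value if it lies within 0..100, otherwise the nearest bound.
 */
int clamp_percent(int value)
{
	if (value < 0)
		return 0;
	if (value > 100)
		return 100;
	return value;
}
```

Note that the comment sits at the head of the function and says what it
does; the body needs no comments of its own.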

The preferred style for long (multi-line) comments is:

.. code-block:: c

        /*
         * This is the preferred style for multi-line
         * comments in the Linux kernel source code.
         * Please use it consistently.
         *
         * Description:  A column of asterisks on the left side,
         * with beginning and ending almost-blank lines.
         */

For files in net/ and drivers/net/ the preferred style for long (multi-line)
comments is a little different.

.. code-block:: c

        /* The preferred comment style for files in net/ and drivers/net
         * looks like this.
         *
         * It is nearly the same as the generally preferred comment style,
         * but there is no initial almost-blank line.
         */

It's also important to comment data, whether they are basic types or derived
types.  To this end, use just one data declaration per line (no commas for
multiple data declarations).  This leaves you room for a small comment on each
item, explaining its use.
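
A short sketch of commented data, using a hypothetical ring-buffer
bookkeeping structure.  The example assumes ``size`` is non-zero and that
``head`` and ``tail`` are always smaller than ``size``.

```c
/* Hypothetical structure: one member per line leaves room for a comment. */
struct ring_stats {
	unsigned int head;	/* index of the next slot to write */
	unsigned int tail;	/* index of the next slot to read */
	unsigned int size;	/* number of slots; assumed non-zero */
};

/* Returns the number of occupied slots. */
static unsigned int ring_used(const struct ring_stats *rs)
{
	return (rs->head + rs->size - rs->tail) % rs->size;
}
```

Had the members been declared as ``unsigned int head, tail, size;`` there
would have been nowhere to say what each one means.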
0648 
0649 
0650 9) You've made a mess of it
0651 ---------------------------
0652 
0653 That's OK, we all do.  You've probably been told by your long-time Unix
0654 user helper that ``GNU emacs`` automatically formats the C sources for
0655 you, and you've noticed that yes, it does do that, but the defaults it
0656 uses are less than desirable (in fact, they are worse than random
0657 typing - an infinite number of monkeys typing into GNU emacs would never
0658 make a good program).
0659 
0660 So, you can either get rid of GNU emacs, or change it to use saner
0661 values.  To do the latter, you can stick the following in your .emacs file:
0662 
0663 .. code-block:: none
0664 
0665   (defun c-lineup-arglist-tabs-only (ignored)
0666     "Line up argument lists by tabs, not spaces"
0667     (let* ((anchor (c-langelem-pos c-syntactic-element))
0668            (column (c-langelem-2nd-pos c-syntactic-element))
0669            (offset (- (1+ column) anchor))
0670            (steps (floor offset c-basic-offset)))
0671       (* (max steps 1)
0672          c-basic-offset)))
0673 
0674   (dir-locals-set-class-variables
0675    'linux-kernel
0676    '((c-mode . (
0677           (c-basic-offset . 8)
0678           (c-label-minimum-indentation . 0)
0679           (c-offsets-alist . (
0680                   (arglist-close         . c-lineup-arglist-tabs-only)
0681                   (arglist-cont-nonempty .
0682                       (c-lineup-gcc-asm-reg c-lineup-arglist-tabs-only))
0683                   (arglist-intro         . +)
0684                   (brace-list-intro      . +)
0685                   (c                     . c-lineup-C-comments)
0686                   (case-label            . 0)
0687                   (comment-intro         . c-lineup-comment)
0688                   (cpp-define-intro      . +)
0689                   (cpp-macro             . -1000)
0690                   (cpp-macro-cont        . +)
0691                   (defun-block-intro     . +)
0692                   (else-clause           . 0)
0693                   (func-decl-cont        . +)
0694                   (inclass               . +)
0695                   (inher-cont            . c-lineup-multi-inher)
0696                   (knr-argdecl-intro     . 0)
0697                   (label                 . -1000)
0698                   (statement             . 0)
0699                   (statement-block-intro . +)
0700                   (statement-case-intro  . +)
0701                   (statement-cont        . +)
0702                   (substatement          . +)
0703                   ))
0704           (indent-tabs-mode . t)
0705           (show-trailing-whitespace . t)
0706           ))))
0707 
0708   (dir-locals-set-directory-class
0709    (expand-file-name "~/src/linux-trees")
0710    'linux-kernel)
0711 
0712 This will make emacs go better with the kernel coding style for C
0713 files below ``~/src/linux-trees``.
0714 
0715 But even if you fail in getting emacs to do sane formatting, not
0716 everything is lost: use ``indent``.
0717 
0718 Now, again, GNU indent has the same brain-dead settings that GNU emacs
0719 has, which is why you need to give it a few command line options.
0720 However, that's not too bad, because even the makers of GNU indent
0721 recognize the authority of K&R (the GNU people aren't evil, they are
0722 just severely misguided in this matter), so you just give indent the
0723 options ``-kr -i8`` (stands for ``K&R, 8 character indents``), or use
0724 ``scripts/Lindent``, which indents in the latest style.
0725 
0726 ``indent`` has a lot of options, and especially when it comes to comment
0727 re-formatting you may want to take a look at the man page.  But
0728 remember: ``indent`` is not a fix for bad programming.
0729 
0730 Note that you can also use the ``clang-format`` tool to help you with
0731 these rules, to quickly re-format parts of your code automatically,
0732 and to review full files in order to spot coding style mistakes,
0733 typos and possible improvements. It is also handy for sorting ``#includes``,
0734 for aligning variables/macros, for reflowing text and other similar tasks.
0735 See the file :ref:`Documentation/process/clang-format.rst <clangformat>`
0736 for more details.
0737 
0738 
0739 10) Kconfig configuration files
0740 -------------------------------
0741 
0742 For all of the Kconfig* configuration files throughout the source tree,
0743 the indentation is somewhat different.  Lines under a ``config`` definition
0744 are indented with one tab, while help text is indented an additional two
0745 spaces.  Example::
0746 
0747   config AUDIT
0748         bool "Auditing support"
0749         depends on NET
0750         help
0751           Enable auditing infrastructure that can be used with another
0752           kernel subsystem, such as SELinux (which requires this for
0753           logging of avc messages output).  Does not do system-call
0754           auditing without CONFIG_AUDITSYSCALL.
0755 
0756 Seriously dangerous features (such as write support for certain
0757 filesystems) should advertise this prominently in their prompt string::
0758 
0759   config ADFS_FS_RW
0760         bool "ADFS write support (DANGEROUS)"
0761         depends on ADFS_FS
0762         ...
0763 
0764 For full documentation on the configuration files, see the file
0765 Documentation/kbuild/kconfig-language.rst.


11) Data structures
-------------------

Data structures that have visibility outside the single-threaded
environment they are created and destroyed in should always have
reference counts.  In the kernel, garbage collection doesn't exist (and
outside the kernel garbage collection is slow and inefficient), which
means that you absolutely **have** to reference count all your uses.

Reference counting means that you can avoid locking, and allows multiple
users to have access to the data structure in parallel - and not having
to worry about the structure suddenly going away from under them just
because they slept or did something else for a while.

Note that locking is **not** a replacement for reference counting.
Locking is used to keep data structures coherent, while reference
counting is a memory management technique.  Usually both are needed, and
they are not to be confused with each other.

Many data structures can indeed have two levels of reference counting,
when there are users of different ``classes``.  The subclass count counts
the number of subclass users, and decrements the global count just once
when the subclass count goes to zero.

Examples of this kind of ``multi-level-reference-counting`` can be found in
memory management (``struct mm_struct``: mm_users and mm_count), and in
filesystem code (``struct super_block``: s_count and s_active).

Remember: if another thread can find your data structure, and you don't
have a reference count on it, you almost certainly have a bug.
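
As a rough illustration of the get/put pattern described in this chapter, here
is a userspace sketch built on C11 atomics.  ``struct foo`` and the ``foo_*()``
helpers are hypothetical names made up for the example; kernel code would
normally rely on the kernel's own reference-counting helpers rather than
open-coding this.

.. code-block:: c

        #include <stdatomic.h>
        #include <stdbool.h>
        #include <stdlib.h>

        /* Hypothetical object that several users may hold at once. */
        struct foo {
                atomic_int refcount;
                int payload;
        };

        /* The creator starts out holding the first reference. */
        struct foo *foo_alloc(int payload)
        {
                struct foo *f = malloc(sizeof(*f));

                if (!f)
                        return NULL;
                atomic_init(&f->refcount, 1);
                f->payload = payload;
                return f;
        }

        /* Take another reference; the caller must already hold one. */
        struct foo *foo_get(struct foo *f)
        {
                atomic_fetch_add(&f->refcount, 1);
                return f;
        }

        /* Drop a reference; returns true if this call freed the object. */
        bool foo_put(struct foo *f)
        {
                if (atomic_fetch_sub(&f->refcount, 1) != 1)
                        return false;
                free(f);
                return true;
        }

Every user that stores or hands on the pointer calls ``foo_get()`` first and
``foo_put()`` when it is done; whoever drops the last reference frees the
object.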


12) Macros, Enums and RTL
-------------------------

Names of macros defining constants and labels in enums are capitalized.

.. code-block:: c

        #define CONSTANT 0x12345

Enums are preferred when defining several related constants.

CAPITALIZED macro names are appreciated but macros resembling functions
may be named in lower case.

Generally, inline functions are preferable to macros resembling functions.

Macros with multiple statements should be enclosed in a do - while block:

.. code-block:: c

        #define macrofun(a, b, c)                       \
                do {                                    \
                        if ((a) == 5)                   \
                                do_this(b, c);          \
                } while (0)
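
To see why the wrapper matters, consider the following userspace sketch;
``do_this()`` and ``demo()`` are stand-ins invented for the example.  The
do - while body is one statement that still wants its trailing semicolon, so
the macro can sit between ``if`` and ``else`` like an ordinary function call.
A bare ``if`` in the macro would instead capture the caller's ``else``.

.. code-block:: c

        static int calls;

        /* Hypothetical action; it just records that it ran. */
        static void do_this(int b, int c)
        {
                calls += b + c;
        }

        #define macrofun(a, b, c)                       \
                do {                                    \
                        if ((a) == 5)                   \
                                do_this(b, c);          \
                } while (0)

        int demo(int a)
        {
                calls = 0;
                if (a > 0)
                        macrofun(a, 1, 2);
                else
                        calls = -1;     /* belongs to "if (a > 0)" */
                return calls;
        }

With this definition ``demo(5)`` runs ``do_this()``, ``demo(4)`` does nothing,
and only a non-positive argument reaches the ``else`` branch.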

Things to avoid when using macros:

1) macros that affect control flow:

.. code-block:: c

        #define FOO(x)                                  \
                do {                                    \
                        if (blah(x) < 0)                \
                                return -EBUGGERED;      \
                } while (0)

is a **very** bad idea.  It looks like a function call but exits the ``calling``
function; don't break the internal parsers of those who will read the code.

2) macros that depend on having a local variable with a magic name:

.. code-block:: c

        #define FOO(val) bar(index, val)

might look like a good thing, but it's confusing as hell when one reads the
code and it's prone to breakage from seemingly innocent changes.

3) macros with arguments that are used as l-values: FOO(x) = y; will
bite you if somebody e.g. turns FOO into an inline function.

4) forgetting about precedence: macros defining constants using expressions
must enclose the expression in parentheses. Beware of similar issues with
macros using parameters.

.. code-block:: c

        #define CONSTANT 0x4000
        #define CONSTEXP (CONSTANT | 3)

5) namespace collisions when defining local variables in macros resembling
functions:

.. code-block:: c

        #define FOO(x)                          \
        ({                                      \
                typeof(x) ret;                  \
                ret = calc_ret(x);              \
                (ret);                          \
        })

ret is a common name for a local variable - __foo_ret is less likely
to collide with an existing variable.
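
The precedence problem from item 4 is easy to reproduce.  The sketch below
uses made-up names (``BASE``, ``LIMIT``, ``DOUBLE`` and their ``BAD_``
variants) and puts each unparenthesized macro next to a parenthesized one so
the different results can be compared.

.. code-block:: c

        #define BASE            0x4000
        #define BAD_LIMIT       BASE + 3        /* expression not enclosed */
        #define LIMIT           (BASE + 3)

        #define BAD_DOUBLE(x)   x * 2           /* parameter not enclosed */
        #define DOUBLE(x)       ((x) * 2)

        int bad_scaled_limit(void)
        {
                return BAD_LIMIT * 2;   /* expands to 0x4000 + 3 * 2 */
        }

        int scaled_limit(void)
        {
                return LIMIT * 2;       /* expands to (0x4000 + 3) * 2 */
        }

        int bad_double_sum(void)
        {
                return BAD_DOUBLE(1 + 2);       /* expands to 1 + 2 * 2 */
        }

        int double_sum(void)
        {
                return DOUBLE(1 + 2);   /* expands to ((1 + 2) * 2) */
        }

The ``BAD_`` versions compile without complaint and quietly compute something
other than what the name promises.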

The cpp manual deals with macros exhaustively. The gcc internals manual also
covers RTL which is used frequently with assembly language in the kernel.


13) Printing kernel messages
----------------------------

Kernel developers like to be seen as literate. Do mind the spelling
of kernel messages to make a good impression. Do not use incorrect
contractions like ``dont``; use ``do not`` or ``don't`` instead. Make the
messages concise, clear, and unambiguous.

Kernel messages do not have to be terminated with a period.

Printing numbers in parentheses (%d) adds no value and should be avoided.

There are a number of driver model diagnostic macros in <linux/dev_printk.h>
which you should use to make sure messages are matched to the right device
and driver, and are tagged with the right level:  dev_err(), dev_warn(),
dev_info(), and so forth.  For messages that aren't associated with a
particular device, <linux/printk.h> defines pr_notice(), pr_info(),
pr_warn(), pr_err(), etc.

Coming up with good debugging messages can be quite a challenge; and once
you have them, they can be a huge help for remote troubleshooting.  However,
debug message printing is handled differently from printing other, non-debug
messages.  While the other pr_XXX() functions print unconditionally,
pr_debug() does not; it is compiled out by default, unless either DEBUG is
defined or CONFIG_DYNAMIC_DEBUG is set.  That is true for dev_dbg() also,
and a related convention uses VERBOSE_DEBUG to add dev_vdbg() messages to
the ones already enabled by DEBUG.

Many subsystems have Kconfig debug options to turn on -DDEBUG in the
corresponding Makefile; in other cases specific files #define DEBUG.  And
when a debug message should be unconditionally printed, such as if it is
already inside a debug-related #ifdef section, printk(KERN_DEBUG ...) can be
used.
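
One practical consequence of debug messages being compiled out is that their
arguments may never be evaluated.  The userspace sketch below shows one
possible way to build such a macro; ``my_debug()`` and ``count_probes()`` are
hypothetical names, and the macro is an assumption made for illustration, not
the kernel's actual definition of pr_debug().

.. code-block:: c

        #include <stdio.h>

        #ifdef DEBUG
        #define my_debug(...)   fprintf(stderr, __VA_ARGS__)
        #else
        /* "if (0)" keeps the arguments type-checked but never runs them. */
        #define my_debug(...)                                   \
                do {                                            \
                        if (0)                                  \
                                fprintf(stderr, __VA_ARGS__);   \
                } while (0)
        #endif

        /* Returns how many times the debug argument was really evaluated. */
        int count_probes(int n)
        {
                int evaluated = 0;
                int i;

                for (i = 0; i < n; i++)
                        my_debug("probe %d\n", ++evaluated);
                return evaluated;
        }

Built without -DDEBUG, ``count_probes()`` returns 0 because the increment
sits in code that never runs, so keep side effects out of debug message
arguments.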


14) Allocating memory
---------------------

The kernel provides the following general purpose memory allocators:
kmalloc(), kzalloc(), kmalloc_array(), kcalloc(), vmalloc(), and
vzalloc().  Please refer to the API documentation for further information
about them: :ref:`Documentation/core-api/memory-allocation.rst
<memory_allocation>`.

The preferred form for passing the size of a struct is the following:

.. code-block:: c

        p = kmalloc(sizeof(*p), ...);

The alternative form, where the struct name is spelled out, hurts readability
and introduces an opportunity for a bug when the pointer variable type is
changed but the corresponding sizeof that is passed to a memory allocator is
not.

Casting the return value, which is a void pointer, is redundant. The
conversion from void pointer to any other pointer type is guaranteed by the C
programming language.

The preferred form for allocating an array is the following:

.. code-block:: c

        p = kmalloc_array(n, sizeof(...), ...);

The preferred form for allocating a zeroed array is the following:

.. code-block:: c

        p = kcalloc(n, sizeof(...), ...);

Both forms check for overflow on the allocation size n * sizeof(...),
and return NULL if that occurred.
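
The overflow check and the ``sizeof(*p)`` idiom can be tried out in userspace.
In the sketch below, ``alloc_array_zeroed()``, ``alloc_items()`` and
``struct item`` are hypothetical names; the helper only imitates the behaviour
described above on top of calloc() and is not how the kernel allocators are
implemented.

.. code-block:: c

        #include <stdint.h>
        #include <stdlib.h>

        struct item {
                int id;
                char name[16];
        };

        /* Hypothetical helper: zeroed array allocation that cannot wrap. */
        void *alloc_array_zeroed(size_t n, size_t size)
        {
                if (size != 0 && n > SIZE_MAX / size)
                        return NULL;    /* n * size would overflow */
                return calloc(n, size);
        }

        struct item *alloc_items(size_t n)
        {
                struct item *items;

                /* sizeof(*items) follows the pointer type; no cast needed. */
                items = alloc_array_zeroed(n, sizeof(*items));
                return items;
        }

If ``items`` later becomes a pointer to some other struct, the allocation size
changes with it and nothing else in ``alloc_items()`` has to be touched.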

These generic allocation functions all emit a stack dump on failure when used
without __GFP_NOWARN, so there is no use in emitting an additional failure
message when NULL is returned.


15) The inline disease
----------------------

There appears to be a common misperception that gcc has a magic "make me
faster" speedup option called ``inline``. While the use of inlines can be
appropriate (for example as a means of replacing macros, see Chapter 12), it
very often is not. Abundant use of the inline keyword leads to a much bigger
kernel, which in turn slows the system as a whole down, due to a bigger
icache footprint for the CPU and simply because there is less memory
available for the pagecache. Just think about it; a pagecache miss causes a
disk seek, which easily takes 5 milliseconds. There are a LOT of CPU cycles
that can go into these 5 milliseconds.

A reasonable rule of thumb is to not put inline at functions that have more
than 3 lines of code in them. Exceptions to this rule are the cases where
a parameter is known to be a compile-time constant, and as a result of this
constantness you *know* the compiler will be able to optimize most of your
function away at compile time. For a good example of this latter case, see
the kmalloc() inline function.
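
The constant-parameter exception can be sketched as follows.  The names
``size_class()`` and ``small_buffer_class()`` are invented for the example,
and how far the call really collapses depends on the compiler and the
optimization level.

.. code-block:: c

        #include <stddef.h>

        /* Hypothetical lookup that is longer than the 3-line rule of thumb. */
        static inline unsigned int size_class(size_t size)
        {
                if (size <= 32)
                        return 0;
                if (size <= 256)
                        return 1;
                return 2;
        }

        unsigned int small_buffer_class(void)
        {
                /* Constant argument: this can fold down to "return 0". */
                return size_class(24);
        }

        unsigned int buffer_class(size_t size)
        {
                /* Argument unknown at compile time: both tests remain. */
                return size_class(size);
        }

Only the first caller benefits from the constantness; for the second one the
whole body is duplicated at the call site.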

Often people argue that adding inline to functions that are static and used
only once is always a win since there is no space tradeoff. While this is
technically correct, gcc is capable of inlining these automatically without
help, and the maintenance issue of removing the inline when a second user
appears outweighs the potential value of the hint that tells gcc to do
something it would have done anyway.


16) Function return values and names
------------------------------------

Functions can return values of many different kinds, and one of the
most common is a value indicating whether the function succeeded or
failed.  Such a value can be represented as an error-code integer
(-Exxx = failure, 0 = success) or a ``succeeded`` boolean (0 = failure,
non-zero = success).

Mixing up these two sorts of representations is a fertile source of
difficult-to-find bugs.  If the C language included a strong distinction
between integers and booleans then the compiler would find these mistakes
for us... but it doesn't.  To help prevent such bugs, always follow this
convention::

        If the name of a function is an action or an imperative command,
        the function should return an error-code integer.  If the name
        is a predicate, the function should return a "succeeded" boolean.

For example, ``add work`` is a command, and the add_work() function returns 0
for success or -EBUSY for failure.  In the same way, ``PCI device present`` is
a predicate, and the pci_dev_present() function returns 1 if it succeeds in
finding a matching device or 0 if it doesn't.
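
A small userspace sketch of the convention follows.  The function name
add_work() is borrowed from the text, but its body, the fixed-size queue and
the ``queue_is_full()`` predicate are all made up for the example.

.. code-block:: c

        #include <errno.h>
        #include <stdbool.h>

        #define QUEUE_DEPTH 2

        static int queue[QUEUE_DEPTH];
        static int queued;

        /* Predicate name: returns a boolean. */
        bool queue_is_full(void)
        {
                return queued == QUEUE_DEPTH;
        }

        /* Imperative name: returns 0 on success or a negative error code. */
        int add_work(int work)
        {
                if (queue_is_full())
                        return -EBUSY;
                queue[queued++] = work;
                return 0;
        }

At the call site, ``if (queue_is_full())`` and ``if (add_work(w))`` both read
correctly: the first tests a condition, the second tests for failure.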

All EXPORTed functions must respect this convention, and so should all
public functions.  Private (static) functions need not, but it is
recommended that they do.

Functions whose return value is the actual result of a computation, rather
than an indication of whether the computation succeeded, are not subject to
this rule.  Generally they indicate failure by returning some out-of-range
result.  Typical examples would be functions that return pointers; they use
NULL or the ERR_PTR mechanism to report failure.
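
The idea behind encoding an error in a pointer can be shown in userspace.  The
lower-case helpers below are stand-ins written for this sketch and only
modelled on the kernel mechanism; ``lookup_name()`` is a hypothetical
function.  The sketch assumes that the top 4095 addresses are never valid
object pointers.

.. code-block:: c

        #include <errno.h>
        #include <stdint.h>

        #define MAX_ERRNO 4095

        /* Encode a negative error code as an out-of-range pointer value. */
        static inline void *err_ptr(long error)
        {
                return (void *)(intptr_t)error;
        }

        /* Recover the error code from a pointer for which is_err() is true. */
        static inline long ptr_err(const void *ptr)
        {
                return (long)(intptr_t)ptr;
        }

        /* True if the pointer lies in the range reserved for error codes. */
        static inline int is_err(const void *ptr)
        {
                return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
        }

        /* Returns a name, or an encoded error telling the caller what failed. */
        const char *lookup_name(int id)
        {
                static const char *const names[] = { "zero", "one" };

                if (id < 0)
                        return err_ptr(-EINVAL);
                if (id >= 2)
                        return err_ptr(-ENOENT);
                return names[id];
        }

Compared with a plain NULL, the caller learns why the lookup failed without
needing a separate output parameter.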


17) Using bool
--------------

The Linux kernel bool type is an alias for the C99 _Bool type. bool values can
only evaluate to 0 or 1, and implicit or explicit conversion to bool
automatically converts the value to true or false. When using bool types the
!! construction is not needed, which eliminates a class of bugs.
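
The difference is easiest to see with a flag test.  In this userspace sketch
``FLAG_DIRTY`` and both functions are hypothetical; they compute the same
expression and differ only in their return type.

.. code-block:: c

        #include <stdbool.h>

        #define FLAG_DIRTY 0x10

        /* Conversion to bool turns any non-zero value into true (1). */
        bool is_dirty(unsigned int flags)
        {
                return flags & FLAG_DIRTY;
        }

        /* With int, the raw masked value leaks out unless !! is applied. */
        int is_dirty_int(unsigned int flags)
        {
                return flags & FLAG_DIRTY;
        }

A caller that compares the int version against 1, or stores it in a narrower
type, gets a wrong answer for a set flag; the bool version has no such trap.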

When working with bool values the true and false definitions should be used
instead of 1 and 0.

bool function return types and stack variables are always fine to use whenever
appropriate. Use of bool is encouraged to improve readability and is often a
better option than 'int' for storing boolean values.

Do not use bool if cache line layout or size of the value matters, as its size
and alignment vary based on the compiled architecture. Structures that are
optimized for alignment and size should not use bool.

If a structure has many true/false values, consider consolidating them into a
bitfield with 1-bit members, or using an appropriate fixed-width type, such as
u8.

Similarly for function arguments, many true/false values can be consolidated
into a single bitwise 'flags' argument and 'flags' can often be a more
readable alternative if the call-sites have naked true/false constants.
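
A sketch of both suggestions, using made-up names (``struct buf``,
``BUF_ZEROED``, ``BUF_PINNED``, ``buf_init()``): the structure keeps its
true/false state in 1-bit members, and the initializer takes a single flags
argument instead of a row of bool parameters.

.. code-block:: c

        #define BUF_ZEROED      (1u << 0)
        #define BUF_PINNED      (1u << 1)

        struct buf {
                unsigned int zeroed:1;
                unsigned int pinned:1;
        };

        void buf_init(struct buf *b, unsigned int flags)
        {
                /*
                 * A 1-bit member is not a bool: it keeps only the lowest
                 * bit, so reduce each test to 0 or 1 before assigning.
                 */
                b->zeroed = (flags & BUF_ZEROED) != 0;
                b->pinned = (flags & BUF_PINNED) != 0;
        }

A call such as ``buf_init(&b, BUF_PINNED)`` says what it asks for, whereas a
hypothetical ``buf_init(&b, false, true)`` would send the reader off to look
up the prototype.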

Otherwise limited use of bool in structures and arguments can improve
readability.


18) Don't re-invent the kernel macros
-------------------------------------

The header file include/linux/kernel.h contains a number of macros that
you should use, rather than explicitly coding some variant of them yourself.
For example, if you need to calculate the length of an array, take advantage
of the macro

.. code-block:: c

        #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

Similarly, if you need to calculate the size of some structure member, use

.. code-block:: c

        #define sizeof_field(t, f) (sizeof(((t*)0)->f))

There are also min() and max() macros that do strict type checking if you
need them.  Feel free to peruse that header file to see what else is already
defined that you shouldn't reproduce in your code.
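
For illustration, here is how the two macros quoted above read at a call
site.  The sketch repeats their definitions so that it builds in userspace;
``struct packet``, ``thresholds`` and the two functions are hypothetical.

.. code-block:: c

        #include <stddef.h>

        /* Local copies of the definitions quoted in the text. */
        #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
        #define sizeof_field(t, f) (sizeof(((t*)0)->f))

        struct packet {
                unsigned char header[4];
                unsigned short length;
        };

        static const int thresholds[] = { 10, 20, 40, 80 };

        /* Count the thresholds that lie below value. */
        int count_below(int value)
        {
                size_t i;
                int n = 0;

                for (i = 0; i < ARRAY_SIZE(thresholds); i++)
                        if (thresholds[i] < value)
                                n++;
                return n;
        }

        /* Size of one member, without needing an instance of the struct. */
        size_t packet_header_size(void)
        {
                return sizeof_field(struct packet, header);
        }

Adding a fifth entry to ``thresholds`` needs no change to the loop, which is
the whole point of not hard-coding the length.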


19) Editor modelines and other cruft
------------------------------------

Some editors can interpret configuration information embedded in source files,
indicated with special markers.  For example, emacs interprets lines marked
like this:

.. code-block:: c

        -*- mode: c -*-

Or like this:

.. code-block:: c

        /*
        Local Variables:
        compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
        End:
        */

Vim interprets markers that look like this:

.. code-block:: c

        /* vim:set sw=8 noet */

Do not include any of these in source files.  People have their own personal
editor configurations, and your source files should not override them.  This
includes markers for indentation and mode configuration.  People may use their
own custom mode, or may have some other magic method for making indentation
work correctly.


20) Inline assembly
-------------------

In architecture-specific code, you may need to use inline assembly to interface
with CPU or platform functionality.  Don't hesitate to do so when necessary.
However, don't use inline assembly gratuitously when C can do the job.  You can
and should poke hardware from C when possible.

Consider writing simple helper functions that wrap common bits of inline
assembly, rather than repeatedly writing them with slight variations.  Remember
that inline assembly can use C parameters.

Large, non-trivial assembly functions should go in .S files, with corresponding
C prototypes defined in C header files.  The C prototypes for assembly
functions should use ``asmlinkage``.

You may need to mark your asm statement as volatile, to prevent GCC from
removing it if GCC doesn't notice any side effects.  You don't always need to
do so, though, and doing so unnecessarily can limit optimization.

When writing a single inline assembly statement containing multiple
instructions, put each instruction on a separate line in a separate quoted
string, and end each string except the last with ``\n\t`` to properly indent
the next instruction in the assembly output:

.. code-block:: c

        asm ("magic %reg1, #42\n\t"
             "more_magic %reg2, %reg3"
             : /* outputs */ : /* inputs */ : /* clobbers */);
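
The magic instructions above are placeholders and will not assemble.  As a
buildable sketch of the "small helper" advice, the following wraps an empty
asm statement that works on any architecture; it assumes GCC or a compatible
compiler, and ``compiler_barrier()`` and ``publish()`` are names made up for
the example.

.. code-block:: c

        /*
         * Hypothetical helper: the asm emits no instructions, but the
         * "memory" clobber tells the compiler not to move memory accesses
         * across it.  It does not order anything at the CPU level.
         */
        static inline void compiler_barrier(void)
        {
                __asm__ __volatile__("" : : : "memory");
        }

        int publish(int *slot, int *ready, int value)
        {
                *slot = value;
                compiler_barrier();     /* keep the two stores in this order */
                *ready = 1;
                return *slot;
        }

Callers use ``compiler_barrier()`` like any other function and never repeat
the asm syntax, so the constraint list has exactly one place to get wrong.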


21) Conditional Compilation
---------------------------

Wherever possible, don't use preprocessor conditionals (#if, #ifdef) in .c
files; doing so makes code harder to read and logic harder to follow.  Instead,
use such conditionals in a header file defining functions for use in those .c
files, providing no-op stub versions in the #else case, and then call those
functions unconditionally from .c files.  The compiler will avoid generating
any code for the stub calls, producing identical results, but the logic will
remain easy to follow.
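
A minimal sketch of the stub pattern follows.  ``CONFIG_FOO_STATS``,
``foo_stats_inc()``, ``foo_stats_read()`` and ``foo_process()`` are
hypothetical, and the header part and the .c part are shown together only to
keep the example in one piece.

.. code-block:: c

        /* Header part: the only place that knows about the option. */
        #ifdef CONFIG_FOO_STATS
        void foo_stats_inc(void);
        unsigned int foo_stats_read(void);
        #else
        static inline void foo_stats_inc(void) { }
        static inline unsigned int foo_stats_read(void) { return 0; }
        #endif

        /* .c part: calls the helper unconditionally, no #ifdef in sight. */
        int foo_process(int x)
        {
                foo_stats_inc();
                return x * 2;
        }

With the option off, the empty stub is inlined away and ``foo_process()``
compiles as if the call were not there.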

Prefer to compile out entire functions, rather than portions of functions or
portions of expressions.  Rather than putting an ifdef in an expression, factor
out part or all of the expression into a separate helper function and apply the
conditional to that function.

If you have a function or variable which may potentially go unused in a
particular configuration, and the compiler would warn about its definition
going unused, mark the definition as __maybe_unused rather than wrapping it in
a preprocessor conditional.  (However, if a function or variable *always* goes
unused, delete it.)

Within code, where possible, use the IS_ENABLED macro to convert a Kconfig
symbol into a C boolean expression, and use it in a normal C conditional:

.. code-block:: c

        if (IS_ENABLED(CONFIG_SOMETHING)) {
                ...
        }

The compiler will constant-fold the conditional away, and include or exclude
the block of code just as with an #ifdef, so this will not add any runtime
overhead.  However, this approach still allows the C compiler to see the code
inside the block, and check it for correctness (syntax, types, symbol
references, etc).  Thus, you still have to use an #ifdef if the code inside the
block references symbols that will not exist if the condition is not met.
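
How a possibly undefined preprocessor symbol can become a C expression at all
is worth a short sketch.  The version below is a simplified userspace
re-creation of the token-pasting trick, with local lower-case names; it only
understands symbols defined to 1, while the real IS_ENABLED also covers
options built as modules.  ``CONFIG_FAST_PATH`` and ``CONFIG_SLOW_CHECKS``
are hypothetical options.

.. code-block:: c

        /*
         * If the symbol expands to 1, pasting yields ARG_PLACEHOLDER_1,
         * which expands to "0," and shifts a 1 into second position.
         * For an undefined symbol the pasted name is junk that stays in
         * first position, so the second argument is 0.
         */
        #define ARG_PLACEHOLDER_1               0,
        #define take_second_arg(ignored, val, ...)      val
        #define is_defined(x)                   is_defined_1(x)
        #define is_defined_1(val)               is_defined_2(ARG_PLACEHOLDER_##val)
        #define is_defined_2(arg1_or_junk)      \
                take_second_arg(arg1_or_junk 1, 0, 0)

        #define CONFIG_FAST_PATH 1      /* hypothetical option, switched on */
        /* CONFIG_SLOW_CHECKS is left undefined, i.e. switched off. */

        int process(int x)
        {
                if (is_defined(CONFIG_FAST_PATH))
                        x *= 2;
                if (is_defined(CONFIG_SLOW_CHECKS))
                        x = -1;         /* dead code, but still compiled */
                return x;
        }

Both conditions are integer constants, so the compiler can drop the second
block entirely, yet a typo inside it would still be reported.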

At the end of any non-trivial #if or #ifdef block (more than a few lines),
place a comment after the #endif on the same line, noting the conditional
expression used.  For instance:

.. code-block:: c

        #ifdef CONFIG_SOMETHING
        ...
        #endif /* CONFIG_SOMETHING */


Appendix I) References
----------------------

The C Programming Language, Second Edition
by Brian W. Kernighan and Dennis M. Ritchie.
Prentice Hall, Inc., 1988.
ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).

The Practice of Programming
by Brian W. Kernighan and Rob Pike.
Addison-Wesley, Inc., 1999.
ISBN 0-201-61586-X.

GNU manuals - where in compliance with K&R and this text - for cpp, gcc,
gcc internals and indent, all available from https://www.gnu.org/manual/

WG14 is the international standardization working group for the programming
language C, URL: http://www.open-std.org/JTC1/SC22/WG14/

Kernel :ref:`process/coding-style.rst <codingstyle>`, by greg@kroah.com at OLS 2002:
http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/