Assembler Annotations
=====================

Copyright (c) 2017-2019 Jiri Slaby

This document describes the new macros for annotation of data and code in
assembly. In particular, it contains information about ``SYM_FUNC_START``,
``SYM_FUNC_END``, ``SYM_CODE_START``, and similar.

Rationale
---------
Some code like entries, trampolines, or boot code needs to be written in
assembly. As in C, such code is grouped into functions and accompanied by
data. Standard assemblers do not force users to mark these pieces precisely
as code or data, or even to specify their length. Nevertheless, assemblers
provide developers with such annotations to aid debuggers throughout
assembly. On top of that, developers also want to mark some functions as
*global* in order to be visible outside of their translation units.

Over time, the Linux kernel has adopted macros from various projects (like
``binutils``) to facilitate such annotations. So for historic reasons,
developers have been using ``ENTRY``, ``END``, ``ENDPROC``, and other
annotations in assembly. Because these macros were never documented, they
are used in the wrong context in some places. ``ENTRY`` was intended to
denote the beginning of global symbols (be it data or code). ``END`` was
used to mark the end of data or the end of special functions with a
*non-standard* calling convention. In contrast, ``ENDPROC`` should annotate
only the ends of *standard* functions.

When these macros are used correctly, they help assemblers generate a nice
object with both sizes and types set correctly. For example, the result of
``arch/x86/lib/putuser.S``::

   Num:    Value          Size Type    Bind   Vis      Ndx Name
    25: 0000000000000000    33 FUNC    GLOBAL DEFAULT    1 __put_user_1
    29: 0000000000000030    37 FUNC    GLOBAL DEFAULT    1 __put_user_2
    32: 0000000000000060    36 FUNC    GLOBAL DEFAULT    1 __put_user_4
    35: 0000000000000090    37 FUNC    GLOBAL DEFAULT    1 __put_user_8

This is not only important for debugging purposes. When there are properly
annotated objects like this, tools can be run on them to generate more useful
information. In particular, on properly annotated objects, ``objtool`` can be
run to check and fix the object if needed. Currently, ``objtool`` can report
missing frame pointer setup/destruction in functions. It can also
automatically generate annotations for the :doc:`ORC unwinder <x86/orc-unwinder>`
for most code. Both of these are especially important to support reliable
stack traces which are in turn necessary for :doc:`Kernel live patching
<livepatch/livepatch>`.

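The ``Type`` and ``Bind`` columns in the symbol listing above are both decoded
from a single byte of each ELF symbol table entry, ``st_info``. As a minimal
sketch of that encoding, assuming the layout given in the ELF gABI (binding in
the upper four bits, type in the lower four), the following C helpers pack and
unpack the byte. All ``sketch_*`` and ``SKETCH_*`` names are invented for
illustration and are not part of the kernel or ``binutils``.

```c
#include <assert.h>

/* Binding and type values as defined by the ELF gABI symbol table. */
enum { SKETCH_STB_LOCAL = 0, SKETCH_STB_GLOBAL = 1, SKETCH_STB_WEAK = 2 };
enum { SKETCH_STT_NOTYPE = 0, SKETCH_STT_OBJECT = 1, SKETCH_STT_FUNC = 2 };

/* Pack binding (upper nibble) and type (lower nibble) into st_info. */
static unsigned char sketch_st_info(unsigned bind, unsigned type)
{
    assert(bind <= 0xf && type <= 0xf);
    return (unsigned char)((bind << 4) | type);
}

/* Extract the binding from st_info. */
static unsigned sketch_st_bind(unsigned char info)
{
    return info >> 4;
}

/* Extract the type from st_info. */
static unsigned sketch_st_type(unsigned char info)
{
    return info & 0xf;
}

/* Name the type roughly the way a symbol listing would print it. */
static const char *sketch_type_name(unsigned char info)
{
    switch (sketch_st_type(info)) {
    case SKETCH_STT_NOTYPE: return "NOTYPE";
    case SKETCH_STT_OBJECT: return "OBJECT";
    case SKETCH_STT_FUNC:   return "FUNC";
    default:                return "OTHER";
    }
}

/* Name the binding roughly the way a symbol listing would print it. */
static const char *sketch_bind_name(unsigned char info)
{
    switch (sketch_st_bind(info)) {
    case SKETCH_STB_LOCAL:  return "LOCAL";
    case SKETCH_STB_GLOBAL: return "GLOBAL";
    case SKETCH_STB_WEAK:   return "WEAK";
    default:                return "OTHER";
    }
}
```

Under this encoding, a global function such as ``__put_user_1`` would carry
``st_info == 0x12``: binding 1 (``GLOBAL``) and type 2 (``FUNC``).
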
Caveat and Discussion
---------------------
As one might realize, there were only three macros previously. That is indeed
insufficient to cover all the combinations of cases:

* standard/non-standard function
* code/data
* global/local symbol

There was a discussion_ and, rather than extending the current ``ENTRY/END*``
macros, it was decided to introduce brand new macros::

    So how about using macro names that actually show the purpose, instead
    of importing all the crappy, historic, essentially randomly chosen
    debug symbol macro names from the binutils and older kernels?

.. _discussion: https://lore.kernel.org/r/20170217104757.28588-1-jslaby@suse.cz

Macros Description
------------------

The new macros are prefixed with the ``SYM_`` prefix and can be divided into
three main groups:

1. ``SYM_FUNC_*`` -- to annotate C-like functions. This means functions with
   standard C calling conventions. For example, on x86, this means that the
   stack contains a return address at the predefined place and a return from
   the function can happen in a standard way. When frame pointers are enabled,
   save/restore of frame pointer shall happen at the start/end of a function,
   respectively, too.

   Checking tools like ``objtool`` should ensure such marked functions conform
   to these rules. The tools can also easily annotate these functions with
   debugging information (like *ORC data*) automatically.

2. ``SYM_CODE_*`` -- special functions called with a special stack, be it
   interrupt handlers with special stack content, trampolines, or startup
   functions.

   Checking tools mostly skip these functions, but some debug information can
   still be generated automatically. For correct debug data, this code needs
   hints like ``UNWIND_HINT_REGS`` provided by developers.

3. ``SYM_DATA*`` -- obviously data belonging to ``.data`` sections and not to
   ``.text``. Data do not contain instructions, so they have to be treated
   specially by the tools: they should not treat the bytes as instructions,
   nor assign any debug information to them.

Instruction Macros
~~~~~~~~~~~~~~~~~~
This section covers ``SYM_FUNC_*`` and ``SYM_CODE_*`` enumerated above.

``objtool`` requires that all code be contained in an ELF symbol. Symbol
names that have a ``.L`` prefix do not emit symbol table entries. ``.L``
prefixed symbols can be used within a code region, but should be avoided for
denoting a range of code via ``SYM_*_START/END`` annotations.

* ``SYM_FUNC_START`` and ``SYM_FUNC_START_LOCAL`` are supposed to be **the
  most frequent markings**. They are used for functions with standard calling
  conventions -- global and local. As in C, they both align the functions to
  architecture specific ``__ALIGN`` bytes. There are also ``_NOALIGN`` variants
  for special cases where developers do not want this implicit alignment.

  ``SYM_FUNC_START_WEAK`` and ``SYM_FUNC_START_WEAK_NOALIGN`` markings are
  also offered as an assembler counterpart to the *weak* attribute known from
  C.

  All of these **shall** be coupled with ``SYM_FUNC_END``. First, it marks
  the sequence of instructions as a function and records its size in the
  generated object file. Second, it eases checking and processing of such
  object files, as the tools can trivially find exact function boundaries.

  So in most cases, developers should write something like in the following
  example, having some asm instructions in between the macros, of course::

    SYM_FUNC_START(memset)
        ... asm insns ...
    SYM_FUNC_END(memset)

  In fact, this kind of annotation corresponds to the now deprecated ``ENTRY``
  and ``ENDPROC`` macros.

* ``SYM_FUNC_ALIAS``, ``SYM_FUNC_ALIAS_LOCAL``, and ``SYM_FUNC_ALIAS_WEAK`` can
  be used to define multiple names for a function. The typical use is::

    SYM_FUNC_START(__memset)
        ... asm insns ...
    SYM_FUNC_END(__memset)
    SYM_FUNC_ALIAS(memset, __memset)

  In this example, one can call ``__memset`` or ``memset`` with the same
  result, except the debug information for the instructions is emitted into
  the object file only once -- for the non-``ALIAS`` case.

* ``SYM_CODE_START`` and ``SYM_CODE_START_LOCAL`` should be used only in
  special cases -- if you know what you are doing. They are used exclusively
  for interrupt handlers and similar where the calling convention is not the C
  one. ``_NOALIGN`` variants exist too. The use is the same as for the ``FUNC``
  category above::

    SYM_CODE_START_LOCAL(bad_put_user)
        ... asm insns ...
    SYM_CODE_END(bad_put_user)

  Again, every ``SYM_CODE_START*`` **shall** be coupled with ``SYM_CODE_END``.

  To some extent, this category corresponds to the deprecated ``ENTRY`` and
  ``END``, except that ``END`` had several other meanings too.

* ``SYM_INNER_LABEL*`` is used to denote a label inside a region delimited by
  ``SYM_{CODE,FUNC}_START`` and ``SYM_{CODE,FUNC}_END``. Such labels are very
  similar to C labels, except they can be made global. An example of use::

    SYM_CODE_START(ftrace_caller)
        /* save_mcount_regs fills in first two parameters */
        ...

    SYM_INNER_LABEL(ftrace_caller_op_ptr, SYM_L_GLOBAL)
        /* Load the ftrace_ops into the 3rd parameter */
        ...

    SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
        call ftrace_stub
        ...
        retq
    SYM_CODE_END(ftrace_caller)

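To make the effect of the start, end, and alias annotations more concrete,
here is a hedged sketch written as top-level ``asm`` in a C file. The
``SKETCH_*`` macros are simplified, hypothetical stand-ins and not the kernel's
real definitions; the 16-byte alignment and the function names are assumptions
chosen for illustration. The assembly path is only taken on x86-64 ELF targets
with a GNU-compatible toolchain; elsewhere a plain C fallback keeps the example
buildable.

```c
#include <assert.h>

#if defined(__x86_64__) && defined(__ELF__)

/* Start: global linkage, alignment (assumed 16 bytes), then the label. */
#define SKETCH_FUNC_START(name) \
    ".globl " #name "\n"        \
    ".p2align 4\n"              \
    #name ":\n"

/* End: mark the symbol as a function and record its size. */
#define SKETCH_FUNC_END(name)           \
    ".type " #name ", @function\n"      \
    ".size " #name ", . - " #name "\n"

/* Alias: a second global name bound to the same address. */
#define SKETCH_FUNC_ALIAS(alias, name)  \
    ".globl " #alias "\n"               \
    ".set " #alias ", " #name "\n"

__asm__(
    ".pushsection .text\n"
    SKETCH_FUNC_START(sketch_double)
    "    leaq (%rdi,%rdi), %rax\n"   /* return x + x (SysV: arg in rdi) */
    "    ret\n"
    SKETCH_FUNC_END(sketch_double)
    SKETCH_FUNC_ALIAS(sketch_twice, sketch_double)
    ".popsection\n"
);

/* C prototypes for the two assembly-defined names. */
long sketch_double(long x);
long sketch_twice(long x);

#else

/* Portable fallback so the sketch still builds on other targets. */
static long sketch_double(long x) { return 2 * x; }
static long sketch_twice(long x)  { return sketch_double(x); }

#endif

/* Both names should reach the same code and give the same result. */
static void sketch_check(void)
{
    assert(sketch_double(21) == 42);
    assert(sketch_twice(21) == 42);
}
```

On an x86-64 ELF build, ``readelf -s`` on the resulting object file can be used
to inspect how the two names were recorded.
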
Data Macros
~~~~~~~~~~~
Similar to instructions, there are a couple of macros to describe data in the
assembly.

* ``SYM_DATA_START`` and ``SYM_DATA_START_LOCAL`` mark the start of some data
  and shall be used in conjunction with either ``SYM_DATA_END``, or
  ``SYM_DATA_END_LABEL``. The latter also adds a label to the end, so that
  people can use ``lstack`` and (local) ``lstack_end`` in the following
  example::

    SYM_DATA_START_LOCAL(lstack)
        .skip 4096
    SYM_DATA_END_LABEL(lstack, SYM_L_LOCAL, lstack_end)

* ``SYM_DATA`` and ``SYM_DATA_LOCAL`` are variants for simple, mostly one-line
  data::

    SYM_DATA(HEAP,     .long rm_heap)
    SYM_DATA(heap_end, .long rm_stack)

  In the end, they expand to ``SYM_DATA_START`` with ``SYM_DATA_END``
  internally.

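The value of an end label is that code can compute the extent of the data from
the two symbols. The following is a rough sketch of the ``lstack`` example
using raw assembler directives in top-level ``asm``, under the assumption of an
x86-64 ELF target and a GNU-compatible toolchain. The ``sketch_stack`` names
and the 16-byte alignment are hypothetical, and the directives only
approximate what the kernel macros emit; other targets use a plain C fallback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) && defined(__ELF__)

__asm__(
    ".pushsection .data\n"
    ".balign 16\n"
    "sketch_stack:\n"                       /* local start label */
    "    .skip 4096\n"
    ".type sketch_stack, @object\n"         /* data, not instructions */
    ".size sketch_stack, . - sketch_stack\n"
    "sketch_stack_end:\n"                   /* local end label */
    ".popsection\n"
);

/* Both labels are defined by the assembly above in this same file. */
extern char sketch_stack[], sketch_stack_end[];

#else

/* Portable fallback: an ordinary array and a computed end address. */
static char sketch_stack[4096];
#define sketch_stack_end (sketch_stack + sizeof(sketch_stack))

#endif

/* Size of the region, computed purely from the start and end labels. */
static size_t sketch_stack_size(void)
{
    uintptr_t start = (uintptr_t)sketch_stack;
    uintptr_t end = (uintptr_t)sketch_stack_end;

    assert(end >= start);
    return (size_t)(end - start);
}
```

The addresses are compared as integers rather than subtracted as pointers,
because the two labels are separate objects as far as C is concerned.
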
Support Macros
~~~~~~~~~~~~~~
All of the above eventually reduce to some invocation of ``SYM_START``,
``SYM_END``, or ``SYM_ENTRY``. Normally, developers should avoid using
these.

Further, in the above examples, one could see ``SYM_L_LOCAL``. There are also
``SYM_L_GLOBAL`` and ``SYM_L_WEAK``. All are intended to denote linkage of a
symbol marked by them. They are used either in ``_LABEL`` variants of the
earlier macros, or in ``SYM_START``.


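The linkage arguments can be pictured as selecting which assembler directive,
if any, precedes a label. The following C sketch builds the text such an entry
might boil down to, assuming the usual GNU assembler directives (``.globl``
for global, ``.weak`` for weak, and nothing for local). The ``sketch_*`` and
``SKETCH_L_*`` names are hypothetical, and alignment and symbol type handling
are deliberately left out.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the three linkage kinds. */
enum sketch_linkage { SKETCH_L_GLOBAL, SKETCH_L_WEAK, SKETCH_L_LOCAL };

/* Directive assumed for each linkage; local symbols need none. */
static const char *sketch_linkage_directive(enum sketch_linkage linkage)
{
    switch (linkage) {
    case SKETCH_L_GLOBAL: return ".globl";
    case SKETCH_L_WEAK:   return ".weak";
    default:              return NULL;
    }
}

/*
 * Write the linkage directive (if any) followed by the label into buf.
 * Returns what snprintf() returns: the length the full text needs.
 */
static int sketch_entry(char *buf, size_t len, const char *name,
                        enum sketch_linkage linkage)
{
    const char *directive = sketch_linkage_directive(linkage);

    assert(buf != NULL && name != NULL);
    if (directive)
        return snprintf(buf, len, "%s %s\n%s:\n", directive, name, name);
    return snprintf(buf, len, "%s:\n", name);
}
```

For instance, ``sketch_entry(buf, sizeof(buf), "memset", SKETCH_L_GLOBAL)``
fills ``buf`` with a ``.globl memset`` line followed by the ``memset:`` label,
while ``SKETCH_L_LOCAL`` yields the label alone.
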
Overriding Macros
~~~~~~~~~~~~~~~~~
Architectures can also override any of the macros in their own
``asm/linkage.h``, including macros specifying the type of a symbol
(``SYM_T_FUNC``, ``SYM_T_OBJECT``, and ``SYM_T_NONE``).  As every macro
described in this file is surrounded by ``#ifndef`` + ``#endif``, it is enough
to define the macros differently in the aforementioned architecture-dependent
header.
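
The override mechanism relies on nothing more than preprocessor guards: a
definition seen first wins, and the guarded default is skipped. A minimal
sketch of the pattern follows, with both "headers" collapsed into one file.
The ``SKETCH_T_*`` names and their string values are purely illustrative and
are not the kernel's definitions.

```c
#include <assert.h>
#include <string.h>

/* Stands in for an architecture header (hypothetical override). */
#define SKETCH_T_FUNC "%function"

/* Stands in for the generic header: every default is guarded. */
#ifndef SKETCH_T_FUNC
#define SKETCH_T_FUNC "@function"
#endif

#ifndef SKETCH_T_OBJECT
#define SKETCH_T_OBJECT "@object"
#endif

/* The overridden value is used for functions... */
static const char *sketch_func_type(void)
{
    return SKETCH_T_FUNC;
}

/* ...while the untouched default still applies to objects. */
static const char *sketch_object_type(void)
{
    return SKETCH_T_OBJECT;
}

/* Confirm which definition each macro ended up with. */
static void sketch_check_override(void)
{
    assert(strcmp(sketch_func_type(), "%function") == 0);
    assert(strcmp(sketch_object_type(), "@object") == 0);
}
```

Because the generic defaults are guarded, the architecture header only has to
define the macros it wants to change and must be included first.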