trace/rv/runtime-verification.rst

0001 ====================
0002 Runtime Verification
0003 ====================
0004
0005 Runtime Verification (RV) is a lightweight (yet rigorous) method that
0006 complements classical exhaustive verification techniques (such as *model
0007 checking* and *theorem proving*) with a more practical approach for complex
0008 systems.
0009
0010 Instead of relying on a fine-grained model of a system (e.g., a
0011 re-implementation a instruction level), RV works by analyzing the trace of the
0012 system's actual execution, comparing it against a formal specification of
0013 the system behavior.
0014
0015 The main advantage is that RV can give precise information on the runtime
0016 behavior of the monitored system, without the pitfalls of developing models
0017 that require a re-implementation of the entire system in a modeling language.
0018 Moreover, given an efficient monitoring method, it is possible execute an
0019 *online* verification of a system, enabling the *reaction* for unexpected
0020 events, avoiding, for example, the propagation of a failure on safety-critical
0021 systems.
0022
0023 Runtime Monitors and Reactors
0024 =============================
0025
0026 A monitor is the central part of the runtime verification of a system. The
0027 monitor stands in between the formal specification of the desired (or
0028 undesired) behavior, and the trace of the actual system.
0029
0030 In Linux terms, the runtime verification monitors are encapsulated inside the
0031 *RV monitor* abstraction. A *RV monitor* includes a reference model of the
0032 system, a set of instances of the monitor (per-cpu monitor, per-task monitor,
0033 and so on), and the helper functions that glue the monitor to the system via
0034 trace, as depicted bellow::
0035
0036  Linux   +---- RV Monitor ----------------------------------+ Formal
0037   Realm  |                                                  |  Realm
0038   +-------------------+     +----------------+     +-----------------+
0039   |   Linux kernel    |     |     Monitor    |     |     Reference   |
0040   |     Tracing       |  -> |   Instance(s)  | <-  |       Model     |
0041   | (instrumentation) |     | (verification) |     | (specification) |
0042   +-------------------+     +----------------+     +-----------------+
0043          |                          |                       |
0044          |                          V                       |
0045          |                     +----------+                 |
0046          |                     | Reaction |                 |
0047          |                     +--+--+--+-+                 |
0048          |                        |  |  |                   |
0049          |                        |  |  +-> trace output ?  |
0050          +------------------------|--|----------------------+
0051                                   |  +----> panic ?
0052                                   +-------> <user-specified>
0053
0054 In addition to the verification and monitoring of the system, a monitor can
0055 react to an unexpected event. The forms of reaction can vary from logging the
0056 event occurrence to the enforcement of the correct behavior to the extreme
0057 action of taking a system down to avoid the propagation of a failure.
0058
0059 In Linux terms, a *reactor* is an reaction method available for *RV monitors*.
0060 By default, all monitors should provide a trace output of their actions,
0061 which is already a reaction. In addition, other reactions will be available
0062 so the user can enable them as needed.
0063
0064 For further information about the principles of runtime verification and
0065 RV applied to Linux:
0066
0067   Bartocci, Ezio, et al. *Introduction to runtime verification.* In: Lectures on
0068   Runtime Verification. Springer, Cham, 2018. p. 1-33.
0069
0070   Falcone, Ylies, et al. *A taxonomy for classifying runtime verification tools.*
0071   In: International Conference on Runtime Verification. Springer, Cham, 2018. p.
0072   241-262.
0073
0074   De Oliveira, Daniel Bristot. *Automata-based formal analysis and
0075   verification of the real-time Linux kernel.* Ph.D. Thesis, 2020.
0076
0077 Online RV monitors
0078 ==================
0079
0080 Monitors can be classified as *offline* and *online* monitors. *Offline*
0081 monitor process the traces generated by a system after the events, generally by
0082 reading the trace execution from a permanent storage system. *Online* monitors
0083 process the trace during the execution of the system. Online monitors are said
0084 to be *synchronous* if the processing of an event is attached to the system
0085 execution, blocking the system during the event monitoring. On the other hand,
0086 an *asynchronous* monitor has its execution detached from the system. Each type
0087 of monitor has a set of advantages. For example, *offline* monitors can be
0088 executed on different machines but require operations to save the log to a
0089 file. In contrast, *synchronous online* method can react at the exact moment
0090 a violation occurs.
0091
0092 Another important aspect regarding monitors is the overhead associated with the
0093 event analysis. If the system generates events at a frequency higher than the
0094 monitor's ability to process them in the same system, only the *offline*
0095 methods are viable. On the other hand, if the tracing of the events incurs
0096 on higher overhead than the simple handling of an event by a monitor, then a
0097 *synchronous online* monitors will incur on lower overhead.
0098
0099 Indeed, the research presented in:
0100
0101   De Oliveira, Daniel Bristot; Cucinotta, Tommaso; De Oliveira, Romulo Silva.
0102   *Efficient formal verification for the Linux kernel.* In: International
0103   Conference on Software Engineering and Formal Methods. Springer, Cham, 2019.
0104   p. 315-332.
0105
0106 Shows that for Deterministic Automata models, the synchronous processing of
0107 events in-kernel causes lower overhead than saving the same events to the trace
0108 buffer, not even considering collecting the trace for user-space analysis.
0109 This motivated the development of an in-kernel interface for online monitors.
0110
0111 For further information about modeling of Linux kernel behavior using automata,
0112 see:
0113
0114   De Oliveira, Daniel B.; De Oliveira, Romulo S.; Cucinotta, Tommaso. *A thread
0115   synchronization model for the PREEMPT_RT Linux kernel.* Journal of Systems
0116   Architecture, 2020, 107: 101729.
0117
0118 The user interface
0119 ==================
0120
0121 The user interface resembles the tracing interface (on purpose). It is
0122 currently at "/sys/kernel/tracing/rv/".
0123
0124 The following files/folders are currently available:
0125
0126 **available_monitors**
0127
0128 - Reading list the available monitors, one per line
0129
0130 For example::
0131
0132    # cat available_monitors
0133    wip
0134    wwnr
0135
0136 **available_reactors**
0137
0138 - Reading shows the available reactors, one per line.
0139
0140 For example::
0141
0142    # cat available_reactors
0143    nop
0144    panic
0145    printk
0146
0147 **enabled_monitors**:
0148
0149 - Reading lists the enabled monitors, one per line
0150 - Writing to it enables a given monitor
0151 - Writing a monitor name with a '!' prefix disables it
0152 - Truncating the file disables all enabled monitors
0153
0154 For example::
0155
0156    # cat enabled_monitors
0157    # echo wip > enabled_monitors
0158    # echo wwnr >> enabled_monitors
0159    # cat enabled_monitors
0160    wip
0161    wwnr
0162    # echo '!wip' >> enabled_monitors
0163    # cat enabled_monitors
0164    wwnr
0165    # echo > enabled_monitors
0166    # cat enabled_monitors
0167    #
0168
0169 Note that it is possible to enable more than one monitor concurrently.
0170
0171 **monitoring_on**
0172
0173 This is an on/off general switcher for monitoring. It resembles the
0174 "tracing_on" switcher in the trace interface.
0175
0176 - Writing "0" stops the monitoring
0177 - Writing "1" continues the monitoring
0178 - Reading returns the current status of the monitoring
0179
0180 Note that it does not disable enabled monitors but stop the per-entity
0181 monitors monitoring the events received from the system.
0182
0183 **reacting_on**
0184
0185 - Writing "0" prevents reactions for happening
0186 - Writing "1" enable reactions
0187 - Reading returns the current status of the reaction
0188
0189 **monitors/**
0190
0191 Each monitor will have its own directory inside "monitors/". There the
0192 monitor-specific files will be presented. The "monitors/" directory resembles
0193 the "events" directory on tracefs.
0194
0195 For example::
0196
0197    # cd monitors/wip/
0198    # ls
0199    desc  enable
0200    # cat desc
0201    wakeup in preemptive per-cpu testing monitor.
0202    # cat enable
0203    0
0204
0205 **monitors/MONITOR/desc**
0206
0207 - Reading shows a description of the monitor *MONITOR*
0208
0209 **monitors/MONITOR/enable**
0210
0211 - Writing "0" disables the *MONITOR*
0212 - Writing "1" enables the *MONITOR*
0213 - Reading return the current status of the *MONITOR*
0214
0215 **monitors/MONITOR/reactors**
0216
0217 - List available reactors, with the select reaction for the given *MONITOR*
0218   inside "[]". The default one is the nop (no operation) reactor.
0219 - Writing the name of a reactor enables it to the given MONITOR.
0220
0221 For example::
0222
0223    # cat monitors/wip/reactors
0224    [nop]
0225    panic
0226    printk
0227    # echo panic > monitors/wip/reactors
0228    # cat monitors/wip/reactors
0229    nop
0230    [panic]
0231    printk