.. _NMI_rcu_doc:

Using RCU to Protect Dynamic NMI Handlers
=========================================


Although RCU is usually used to protect read-mostly data structures,
it is possible to use RCU to provide dynamic non-maskable interrupt
handlers, as well as dynamic irq handlers. This document describes
how to do this, drawing loosely from Zwane Mwaikambo's NMI-timer
work in "arch/x86/kernel/traps.c".

The relevant pieces of code are listed below, each followed by a
brief explanation::

	static int dummy_nmi_callback(struct pt_regs *regs, int cpu)
	{
		return 0;
	}

The dummy_nmi_callback() function is a "dummy" NMI handler that does
nothing but return zero, thereby reporting that it did not handle the
NMI and allowing its caller to take the default machine-specific action::

	static nmi_callback_t nmi_callback = dummy_nmi_callback;

This nmi_callback variable is a global function pointer to the current
NMI handler::

	void do_nmi(struct pt_regs *regs, long error_code)
	{
		int cpu;

		nmi_enter();

		cpu = smp_processor_id();
		++nmi_count(cpu);

		if (!rcu_dereference_sched(nmi_callback)(regs, cpu))
			default_do_nmi(regs);

		nmi_exit();
	}

The do_nmi() function processes each NMI. It first calls nmi_enter(),
which disables preemption in the same way that a hardware irq would,
then increments the per-CPU count of NMIs. It then invokes the NMI
handler stored in the nmi_callback function pointer. If this handler
returns zero, do_nmi() invokes the default_do_nmi() function to handle
a machine-specific NMI. Finally, nmi_exit() restores preemption.
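
The dispatch logic described above can be tried outside the kernel.
The following is a minimal userspace sketch, not kernel code: the
handler type drops the struct pt_regs argument, MODEL_NR_CPUS,
claiming_nmi_callback() and every model_*() function are hypothetical
names, and C11 acquire/release atomics are assumed as stand-ins for
rcu_dereference_sched() and rcu_assign_pointer().

```c
#include <assert.h>
#include <stdatomic.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count */

/* Simplified handler type: the struct pt_regs argument is omitted. */
typedef int (*nmi_callback_t)(int cpu);

static unsigned int nmi_count[MODEL_NR_CPUS];	/* per-CPU NMI counts */
static unsigned int default_count;	/* times the default path ran */

static int dummy_nmi_callback(int cpu)
{
	(void)cpu;
	return 0;	/* zero means "not handled, take the default action" */
}

/* Hypothetical handler that claims every NMI. */
static int claiming_nmi_callback(int cpu)
{
	(void)cpu;
	return 1;
}

static _Atomic(nmi_callback_t) nmi_callback = dummy_nmi_callback;

/* Stand-in for default_do_nmi(): only records that it ran. */
static void model_default_do_nmi(void)
{
	default_count++;
}

/* Models do_nmi(): count the NMI, fetch the handler once, fall back on zero. */
static void model_do_nmi(int cpu)
{
	nmi_callback_t cb;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	nmi_count[cpu]++;

	/* The acquire load stands in for rcu_dereference_sched(). */
	cb = atomic_load_explicit(&nmi_callback, memory_order_acquire);
	if (!cb(cpu))
		model_default_do_nmi();
}

/* Models set_nmi_callback(); the release store stands in for rcu_assign_pointer(). */
static void model_set_nmi_callback(nmi_callback_t cb)
{
	atomic_store_explicit(&nmi_callback, cb, memory_order_release);
}
```

With the dummy handler installed, each model_do_nmi() call takes the
default path. After model_set_nmi_callback(claiming_nmi_callback),
the NMI is still counted but the default path is skipped.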

In theory, rcu_dereference_sched() is not needed, since this code runs
only on i386, which does not need rcu_dereference_sched() anyway.
However, in practice it is a good documentation aid, particularly
for anyone attempting to do something similar on Alpha or on systems
with aggressively optimizing compilers.

Quick Quiz:
		Why might the rcu_dereference_sched() be necessary on Alpha, given that the code referenced by the pointer is read-only?

		:ref:`Answer to Quick Quiz <answer_quick_quiz_NMI>`

Back to the discussion of NMI and RCU::

	void set_nmi_callback(nmi_callback_t callback)
	{
		rcu_assign_pointer(nmi_callback, callback);
	}

The set_nmi_callback() function registers an NMI handler. Note that any
data that is to be used by the callback must be initialized *before*
the call to set_nmi_callback(). On architectures that do not order
writes, the rcu_assign_pointer() ensures that the NMI handler sees the
initialized values::

	void unset_nmi_callback(void)
	{
		rcu_assign_pointer(nmi_callback, dummy_nmi_callback);
	}

This function unregisters an NMI handler, restoring the original
dummy_nmi_callback(). However, there may well be an NMI handler
currently executing on some other CPU. We therefore cannot free
up any data structures used by the old NMI handler until it has
finished executing on all other CPUs.

One way to accomplish this is via synchronize_rcu(), perhaps as
follows::

	unset_nmi_callback();
	synchronize_rcu();
	kfree(my_nmi_data);

This works because (as of v4.20) synchronize_rcu() blocks until all
CPUs complete any preemption-disabled segments of code that they were
executing. Since NMI handlers disable preemption, synchronize_rcu() is
guaranteed not to return until all ongoing NMI handlers exit. It is
therefore safe to free up the handler's data as soon as
synchronize_rcu() returns.

Important note: for this to work, the architecture in question must
invoke nmi_enter() and nmi_exit() on NMI entry and exit, respectively.
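
The unregister, wait, and free sequence can also be modelled in
userspace. The sketch below is a toy under stated assumptions and does
not reflect how synchronize_rcu() is implemented: a per-CPU in_nmi
flag stands in for nmi_enter() and nmi_exit(), model_synchronize()
simply spins until each CPU has been seen outside its handler, nested
NMIs are ignored, and sequentially consistent C11 atomics replace the
kernel primitives. MODEL_NR_CPUS, struct my_nmi_data and all model_*()
names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count */

typedef int (*nmi_callback_t)(int cpu);	/* simplified: no pt_regs */

struct my_nmi_data { int hits; };	/* hypothetical handler state */

static struct my_nmi_data *my_nmi_data;
static atomic_int in_nmi[MODEL_NR_CPUS];	/* set between "nmi_enter" and "nmi_exit" */

static int dummy_nmi_callback(int cpu)
{
	(void)cpu;
	return 0;
}

static int my_nmi_callback(int cpu)
{
	(void)cpu;
	my_nmi_data->hits++;	/* must never run after the data is freed */
	return 1;
}

static _Atomic(nmi_callback_t) nmi_callback = dummy_nmi_callback;

/* Returns nonzero if the installed handler claimed the NMI. */
static int model_do_nmi(int cpu)
{
	nmi_callback_t cb;
	int handled;

	atomic_store(&in_nmi[cpu], 1);	/* models nmi_enter() */
	cb = atomic_load(&nmi_callback);
	handled = cb(cpu);
	atomic_store(&in_nmi[cpu], 0);	/* models nmi_exit() */
	return handled;
}

/* Toy grace period: wait until each CPU is seen outside its handler. */
static void model_synchronize(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		while (atomic_load(&in_nmi[cpu]))
			;	/* spin; the real primitive blocks instead */
}

static void model_register(void)
{
	my_nmi_data = calloc(1, sizeof(*my_nmi_data));	/* initialize first */
	assert(my_nmi_data);
	atomic_store(&nmi_callback, my_nmi_callback);	/* then publish */
}

/* Returns the hit count observed just before the data is freed. */
static int model_unregister(void)
{
	int hits;

	atomic_store(&nmi_callback, dummy_nmi_callback);	/* unset_nmi_callback() */
	model_synchronize();	/* synchronize_rcu() */
	hits = my_nmi_data->hits;
	free(my_nmi_data);	/* kfree(my_nmi_data) */
	my_nmi_data = NULL;
	return hits;
}
```

The order inside model_unregister() is the point of the sketch: the
pointer is switched back to the dummy first, the wait comes second,
and the free comes last, so no handler that could still see the old
pointer is running when the data goes away.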

.. _answer_quick_quiz_NMI:

Answer to Quick Quiz:
		Why might the rcu_dereference_sched() be necessary on Alpha, given that the code referenced by the pointer is read-only?

		The caller of set_nmi_callback() might well have
		initialized some data that is to be used by the new NMI
		handler. In this case, the rcu_dereference_sched() would
		be needed, because otherwise a CPU that received an NMI
		just after the new handler was set might see the pointer
		to the new NMI handler, but the old, not-yet-initialized
		version of the handler's data.

		This same sad story can happen on other CPUs when using
		a compiler with aggressive pointer-value speculation
		optimizations.

		More importantly, the rcu_dereference_sched() makes it
		clear to someone reading the code that the pointer is
		being protected by RCU-sched.
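
The ordering requirement in this answer can be illustrated with a
small userspace sketch. It assumes that a C11 release store is an
adequate stand-in for rcu_assign_pointer() and an acquire load for
rcu_dereference_sched(). The acquire load is stronger than the
dependency ordering the kernel primitive provides, but it is the
closest portable equivalent. The names struct handler_cfg,
publish_handler() and take_nmi() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

struct handler_cfg { int threshold; };	/* hypothetical handler data */

static struct handler_cfg cfg;	/* zeroed: the "old" data */
static int seen_threshold = -1;	/* value the last handler run observed */

typedef int (*nmi_callback_t)(void);

static int dummy_cb(void)
{
	return 0;
}

static int cfg_cb(void)
{
	seen_threshold = cfg.threshold;	/* relies on cfg being initialized */
	return 1;
}

static _Atomic(nmi_callback_t) cb_ptr = dummy_cb;

static void publish_handler(int threshold)
{
	cfg.threshold = threshold;	/* 1. initialize the data */
	/* 2. publish: release orders the write above before the pointer store */
	atomic_store_explicit(&cb_ptr, cfg_cb, memory_order_release);
}

static int take_nmi(void)
{
	/* Acquire pairs with the release store in publish_handler(). */
	nmi_callback_t cb = atomic_load_explicit(&cb_ptr, memory_order_acquire);

	return cb();
}
```

A reader that observes cfg_cb through the acquire load is guaranteed
to observe the threshold written before the release store as well.
With relaxed accesses on both sides, that guarantee would be lost on
weakly ordered hardware.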