======================================================
Net DIM - Generic Network Dynamic Interrupt Moderation
======================================================

:Author: Tal Gilboa <talgi@mellanox.com>

.. contents:: :depth: 2

Assumptions
===========

This document assumes the reader has basic knowledge of network drivers
and of interrupt moderation in general.


Introduction
============

Dynamic Interrupt Moderation (DIM), in networking, refers to changing the
interrupt moderation configuration of a channel in order to optimize packet
processing. The mechanism includes an algorithm which decides if and how to
change moderation parameters for a channel, usually by analysing runtime data
sampled from the system. Net DIM is such a mechanism. In each iteration of the
algorithm, it analyses a given data sample, compares it to the previous sample
and, if required, changes some of the interrupt moderation configuration
fields. A data sample is composed of the data bandwidth, the number of packets
and the number of events. The time between samples is also measured. Net DIM
compares the current and the previous data and returns an adjusted interrupt
moderation configuration object. In some cases, the algorithm might decide not
to change anything. The configuration fields are the minimum duration
(in microseconds) allowed between events and the maximum number of wanted
packets per event. The Net DIM algorithm prioritizes increasing bandwidth over
reducing the interrupt rate.
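
As a rough illustration of the data sample described above, the following
user-space sketch derives per-second rates from two raw samples. The
``struct sample`` and ``struct rates`` types and the ``calc_rates()`` helper are
hypothetical names chosen for this example; they are not part of the Net DIM
API and only model the idea of turning raw counters and elapsed time into
comparable rates.

.. code-block:: c

  #include <stdint.h>

  /* Hypothetical raw counters, as a driver might accumulate them */
  struct sample {
      uint64_t time_us;   /* timestamp of the sample, in microseconds */
      uint64_t bytes;     /* running byte counter */
      uint64_t packets;   /* running packet counter */
      uint64_t events;    /* running interrupt (event) counter */
  };

  /* Hypothetical per-second rates derived from two samples */
  struct rates {
      uint64_t bytes_per_sec;
      uint64_t packets_per_sec;
      uint64_t events_per_sec;
  };

  /* Returns 0 on success, -1 if no time elapsed between the samples */
  int calc_rates(const struct sample *start, const struct sample *end,
                 struct rates *out)
  {
      uint64_t delta_us = end->time_us - start->time_us;

      if (delta_us == 0)
          return -1;

      out->bytes_per_sec = (end->bytes - start->bytes) * 1000000ULL / delta_us;
      out->packets_per_sec = (end->packets - start->packets) * 1000000ULL / delta_us;
      out->events_per_sec = (end->events - start->events) * 1000000ULL / delta_us;
      return 0;
  }

Working with rates rather than raw counters lets two samples be compared even
when the time between them differs.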


Net DIM Algorithm
=================

Each iteration of the Net DIM algorithm follows these steps:

#. Calculates a new data sample.
#. Compares it to the previous sample.
#. Makes a decision - suggests interrupt moderation configuration fields.
#. Schedules a work function, which applies the suggested configuration.

The first two steps are straightforward: both the new and the previous data are
supplied by the driver registered to Net DIM. The previous data is the new data
supplied to the previous iteration. The comparison step checks the difference
between the new and previous data and classifies the outcome of the previous
step. The step is classified as "better" if bandwidth increased and as "worse"
if bandwidth decreased. If there is no change in bandwidth, the packet rate is
compared in a similar fashion - increase == "better" and decrease == "worse".
If there is no change in the packet rate either, the interrupt rate is
compared. Here the algorithm tries to optimize for a lower interrupt rate, so an
increase in the interrupt rate is considered "worse" and a decrease is
considered "better". Step #2 has an optimization for avoiding false results: it
only considers a difference between samples as valid if it is greater than a
certain percentage. Also, since Net DIM does not measure anything by itself, it
assumes the data provided by the driver is valid.
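
The comparison order described above can be sketched in user-space C as
follows. This is a simplified model under stated assumptions, not the kernel
implementation: the type and function names are hypothetical, and the 10%
value of ``SIGNIFICANT_PCT`` is only an illustrative stand-in for the "certain
percentage" mentioned in the text.

.. code-block:: c

  #include <stdbool.h>
  #include <stdint.h>

  enum step_result { STEP_SAME, STEP_BETTER, STEP_WORSE };

  /* Hypothetical per-second rates computed from two samples */
  struct rates {
      uint64_t bytes_per_sec;
      uint64_t packets_per_sec;
      uint64_t events_per_sec;
  };

  #define SIGNIFICANT_PCT 10 /* assumed threshold, for illustration only */

  /* A change counts only if it exceeds SIGNIFICANT_PCT percent of prev */
  static bool significant_diff(uint64_t cur, uint64_t prev)
  {
      uint64_t diff = cur > prev ? cur - prev : prev - cur;

      if (prev == 0)
          return cur != 0; /* assumption: any traffic after none counts */
      return (100 * diff) / prev > SIGNIFICANT_PCT;
  }

  enum step_result compare_rates(const struct rates *cur,
                                 const struct rates *prev)
  {
      /* Bandwidth first: higher is better */
      if (significant_diff(cur->bytes_per_sec, prev->bytes_per_sec))
          return cur->bytes_per_sec > prev->bytes_per_sec ?
                 STEP_BETTER : STEP_WORSE;

      /* Then packet rate: higher is better */
      if (significant_diff(cur->packets_per_sec, prev->packets_per_sec))
          return cur->packets_per_sec > prev->packets_per_sec ?
                 STEP_BETTER : STEP_WORSE;

      /* Finally interrupt rate: lower is better */
      if (significant_diff(cur->events_per_sec, prev->events_per_sec))
          return cur->events_per_sec < prev->events_per_sec ?
                 STEP_BETTER : STEP_WORSE;

      return STEP_SAME;
  }

Checking the metrics in this fixed order is what gives bandwidth priority over
the interrupt rate: the interrupt rate is consulted only when neither bandwidth
nor packet rate changed significantly.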

Step #3 decides on the suggested configuration based on the result from step #2
and the internal state of the algorithm. The states reflect the "direction" of
the algorithm: is it going left (reducing moderation), right (increasing
moderation) or standing still. Another optimization is that if a decision
to stay still is made multiple times, the interval between iterations of the
algorithm increases in order to reduce calculation overhead. Also, after
"parking" on the leftmost or rightmost decision, the algorithm may
decide to verify this decision by taking a step in the other direction. This is
done in order to avoid getting stuck in a "deep sleep" scenario. Once a
decision is made, an interrupt moderation configuration is selected from
the predefined profiles.
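
The direction logic of step #3 can be modelled with a small state machine. The
sketch below is a deliberately reduced user-space model: all names are
hypothetical, the profile values are made up for illustration, the rule for
leaving the parked state is an assumption, and the two optimizations described
above (longer intervals after repeated parking, and verifying a parked decision
at the edges) are left out.

.. code-block:: c

  #include <stddef.h>

  enum step_result { STEP_SAME, STEP_BETTER, STEP_WORSE };
  enum direction { DIR_PARKED, DIR_LEFT, DIR_RIGHT };

  struct moderation {
      unsigned int usec; /* minimum time between events */
      unsigned int pkts; /* maximum wanted packets per event */
  };

  /* Illustrative profiles, ordered from least (left) to most (right) moderation */
  static const struct moderation profiles[] = {
      {1, 64}, {8, 64}, {64, 64}, {128, 64}, {256, 64},
  };
  #define NUM_PROFILES ((int)(sizeof(profiles) / sizeof(profiles[0])))

  struct tuner {
      enum direction dir;
      int profile_ix;
  };

  const struct moderation *current_profile(const struct tuner *t)
  {
      return &profiles[t->profile_ix];
  }

  /* Returns 1 if a new profile was selected, 0 otherwise */
  int decide(struct tuner *t, enum step_result res)
  {
      int old_ix = t->profile_ix;

      if (res == STEP_SAME) {
          t->dir = DIR_PARKED; /* nothing changed: stand still */
          return 0;
      }

      if (t->dir == DIR_PARKED)
          /* Assumption: leave parking to the left unless already leftmost */
          t->dir = t->profile_ix > 0 ? DIR_LEFT : DIR_RIGHT;
      else if (res == STEP_WORSE)
          /* The last step made things worse: turn around */
          t->dir = t->dir == DIR_LEFT ? DIR_RIGHT : DIR_LEFT;

      if (t->dir == DIR_LEFT && t->profile_ix > 0)
          t->profile_ix--;
      else if (t->dir == DIR_RIGHT && t->profile_ix < NUM_PROFILES - 1)
          t->profile_ix++;
      else
          t->dir = DIR_PARKED; /* reached an edge of the profile table */

      return t->profile_ix != old_ix;
  }

A "better" result keeps the current direction, a "worse" result reverses it,
and an unchanged result parks the tuner on the current profile.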

The last step is to notify the registered driver that it should apply the
suggested configuration. This is done by scheduling a work function, defined by
the Net DIM API and provided by the registered driver.

As you can see, Net DIM itself does not actively interact with the system. It
would have trouble making the correct decisions if the wrong data were supplied
to it, and it would be useless if the work function did not apply the suggested
configuration. This does, however, allow the registered driver some room for
manoeuvre, as it may provide partial data or ignore the algorithm's suggestion
under some conditions.


Registering a Network Device to DIM
===================================

The Net DIM API exposes the main function net_dim().
This function is the entry point to the Net
DIM algorithm and has to be called every time the driver would like to check if
it should change interrupt moderation parameters. The driver should provide two
data structures: :c:type:`struct dim <dim>` and
:c:type:`struct dim_sample <dim_sample>`. :c:type:`struct dim <dim>`
describes the state of DIM for a specific object (RX queue, TX queue,
other queues, etc.). This includes the currently selected profile, previous data
samples, the callback function provided by the driver and more.
:c:type:`struct dim_sample <dim_sample>` describes a data sample,
which will be compared to the data sample stored in :c:type:`struct dim <dim>`
in order to decide on the algorithm's next
step. The sample should include bytes, packets and interrupts, measured by
the driver.

In order to use Net DIM from a networking driver, the driver needs to call the
main net_dim() function. The recommended method is to call net_dim() on each
interrupt. Since Net DIM has built-in moderation and might decide to skip
iterations under certain conditions, there is no need to moderate the net_dim()
calls as well. As mentioned above, the driver needs to provide an object of type
:c:type:`struct dim <dim>` to the net_dim() function call. It is advised for
each entity using Net DIM to hold a :c:type:`struct dim <dim>` as part of its
data structure and use it as the main Net DIM API object.
The :c:type:`struct dim_sample <dim_sample>` should hold the latest
byte, packet and interrupt counts. There is no need to perform any
calculations; just include the raw data.

The net_dim() call itself does not return anything. Instead, Net DIM relies on
the driver to provide a callback function, which is called when the algorithm
decides to make a change in the interrupt moderation parameters. This callback
will be scheduled and run in a separate thread in order not to add overhead to
the data flow. After the work is done, the Net DIM algorithm needs to be set to
the proper state in order to move to the next iteration.
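
The hand-off between the interrupt path and the callback can be pictured with
the toy user-space model below. It is only a sketch of the idea under stated
assumptions: the ``model_*`` names, the ``EVENTS_PER_ITERATION`` constant and
the ``want_change`` flag (which stands in for the algorithm's decision) are all
hypothetical and do not reproduce the kernel implementation.

.. code-block:: c

  #include <stdbool.h>

  enum model_state {
      MODEL_START_MEASURE,
      MODEL_MEASURE_IN_PROGRESS,
      MODEL_APPLY_NEW_PROFILE,
  };

  struct model {
      enum model_state state;
      unsigned int events_seen;
      bool work_pending;
  };

  #define EVENTS_PER_ITERATION 4 /* hypothetical measurement window */

  /* Called on every interrupt; want_change stands in for the decision */
  void model_on_interrupt(struct model *m, bool want_change)
  {
      switch (m->state) {
      case MODEL_START_MEASURE:
          /* Begin a new measurement window */
          m->events_seen = 0;
          m->state = MODEL_MEASURE_IN_PROGRESS;
          break;
      case MODEL_MEASURE_IN_PROGRESS:
          if (++m->events_seen < EVENTS_PER_ITERATION)
              break; /* built-in moderation: skip this call */
          if (want_change) {
              m->work_pending = true; /* "schedule" the callback */
              m->state = MODEL_APPLY_NEW_PROFILE;
          } else {
              m->state = MODEL_START_MEASURE;
          }
          break;
      case MODEL_APPLY_NEW_PROFILE:
          /* Nothing to do until the callback resets the state */
          break;
      }
  }

  /* Stands in for the driver's work function */
  void model_do_work(struct model *m)
  {
      /* Apply the new moderation settings here, then restart measuring */
      m->work_pending = false;
      m->state = MODEL_START_MEASURE;
  }

In this model, calls made while a change is pending are ignored, which shows
why the callback has to reset the state: without that, no further iterations
would run.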


Example
=======

The following code demonstrates how to register a driver to Net DIM. The
example is not complete, but it should make the outline of the usage clear.

.. code-block:: c

  #include <linux/dim.h>

  /* Callback for net DIM to schedule on a decision to change moderation */
  void my_driver_do_dim_work(struct work_struct *work)
  {
      /* Get struct dim from struct work_struct */
      struct dim *dim = container_of(work, struct dim,
                                     work);
      /* Do interrupt moderation related stuff */
      ...

      /* Signal net DIM work is done and it should move to next iteration */
      dim->state = DIM_START_MEASURE;
  }

  /* My driver's interrupt handler */
  int my_driver_handle_interrupt(struct my_driver_entity *my_entity, ...)
  {
      ...
      /* A struct to hold current measured data */
      struct dim_sample dim_sample;
      ...
      /* Initiate data sample struct with current data */
      dim_update_sample(my_entity->events,
                        my_entity->packets,
                        my_entity->bytes,
                        &dim_sample);
      /* Call net DIM */
      net_dim(&my_entity->dim, dim_sample);
      ...
  }

  /* My entity's initialization function (my_entity was already allocated) */
  int my_driver_init_my_entity(struct my_driver_entity *my_entity, ...)
  {
      ...
      /* Initiate struct work_struct with my driver's callback function */
      INIT_WORK(&my_entity->dim.work, my_driver_do_dim_work);
      ...
  }

Dynamic Interrupt Moderation (DIM) library API
==============================================

.. kernel-doc:: include/linux/dim.h
    :internal: