0001 .. SPDX-License-Identifier: (GPL-2.0+ OR CC-BY-4.0)
0002 .. See the bottom of this file for additional redistribution information.
0003
0004 Reporting issues
0005 ++++++++++++++++
0006
0007
0008 The short guide (aka TL;DR)
0009 ===========================
0010
0011 Are you facing a regression with vanilla kernels from the same stable or
0012 longterm series? One still supported? Then search the `LKML
0013 <https://lore.kernel.org/lkml/>`_ and the `Linux stable mailing list
0014 <https://lore.kernel.org/stable/>`_ archives for matching reports to join. If
0015 you don't find any, install `the latest release from that series
0016 <https://kernel.org/>`_. If it still shows the issue, report it to the stable
0017 mailing list (stable@vger.kernel.org) and CC the regressions list
0018 (regressions@lists.linux.dev); ideally also CC the maintainer and the mailing
0019 list for the subsystem in question.
0020
0021 In all other cases try your best to guess which kernel part might be causing the
0022 issue. Check the :ref:`MAINTAINERS <maintainers>` file for how its developers
0023 expect to be told about problems, which most of the time will be by email with a
0024 mailing list in CC. Check the destination's archives for matching reports;
0025 search the `LKML <https://lore.kernel.org/lkml/>`_ and the web, too. If you
0026 don't find any to join, install `the latest mainline kernel
0027 <https://kernel.org/>`_. If the issue is present there, send a report.
0028
0029 The issue was fixed there, but you would like to see it resolved in a still
0030 supported stable or longterm series as well? Then install its latest release.
0031 If it shows the problem, search for the change that fixed it in mainline and
0032 check if backporting is in the works or was discarded; if it's neither, ask
0033 those who handled the change for it.
0034
0035 **General remarks**: When installing and testing a kernel as outlined above,
0036 ensure it's vanilla (IOW: not patched and not using add-on modules). Also make
0037 sure it's built and running in a healthy environment and not already tainted
0038 before the issue occurs.
0039
0040 If you are facing multiple issues with the Linux kernel at once, report each
0041 separately. While writing your report, include all information relevant to the
0042 issue, like the kernel and the distro used. In case of a regression, CC the
0043 regressions mailing list (regressions@lists.linux.dev) to your report. Also try
0044 to pin-point the culprit with a bisection; if you succeed, include its
0045 commit-id and CC everyone in the sign-off-by chain.
0046
0047 Once the report is out, answer any questions that come up and help where you
0048 can. That includes keeping the ball rolling by occasionally retesting with newer
0049 releases and sending a status update afterwards.
0050
0051 Step-by-step guide how to report issues to the kernel maintainers
0052 =================================================================
0053
0054 The above TL;DR outlines roughly how to report issues to the Linux kernel
0055 developers. It might be all that's needed for people already familiar with
0056 reporting issues to Free/Libre & Open Source Software (FLOSS) projects. For
0057 everyone else there is this section. It is more detailed and uses a
0058 step-by-step approach. It still tries to be brief for readability and leaves
0059 out a lot of details; those are described below the step-by-step guide in a
0060 reference section, which explains each of the steps in more detail.
0061
0062 Note: this section covers a few more aspects than the TL;DR and does things in
0063 a slightly different order. That's in your interest, to make sure you notice
0064 early if an issue that looks like a Linux kernel problem is actually caused by
0065 something else. These steps thus help to ensure the time you invest in this
0066 process won't feel wasted in the end:
0067
0068 * Are you facing an issue with a Linux kernel a hardware or software vendor
0069 provided? Then in almost all cases you are better off if you stop reading this
0070 document and report the issue to your vendor instead, unless you are
0071 willing to install the latest Linux version yourself. Be aware the latter
0072 will often be needed anyway to hunt down and fix issues.
0073
0074 * Perform a rough search for existing reports with your favorite internet
0075 search engine; additionally, check the archives of the `Linux Kernel Mailing
0076 List (LKML) <https://lore.kernel.org/lkml/>`_. If you find matching reports,
0077 join the discussion instead of sending a new one.
0078
0079 * See if the issue you are dealing with qualifies as regression, security
0080 issue, or a really severe problem: those are 'issues of high priority' that
0081 need special handling in some steps that are about to follow.
0082
0083 * Make sure it's not the kernel's surroundings that are causing the issue
0084 you face.
0085
0086 * Create a fresh backup and put system repair and restore tools at hand.
0087
0088 * Ensure your system does not enhance its kernels by building additional
0089 kernel modules on-the-fly, which solutions like DKMS might be doing locally
0090 without your knowledge.
0091
0092 * Check if your kernel was 'tainted' when the issue occurred, as the event
0093 that made the kernel set this flag might be causing the issue you face.
0094
0095 * Write down coarsely how to reproduce the issue. If you deal with multiple
0096 issues at once, create separate notes for each of them and make sure they
0097 work independently on a freshly booted system. That's needed, as each issue
0098 needs to get reported to the kernel developers separately, unless they are
0099 strongly entangled.
0100
0101 * If you are facing a regression within a stable or longterm version line
0102 (say something broke when updating from 5.10.4 to 5.10.5), scroll down to
0103 'Dealing with regressions within a stable and longterm kernel line'.
0104
0105 * Locate the driver or kernel subsystem that seems to be causing the issue.
0106 Find out how and where its developers expect reports. Note: most of the
0107 time this won't be bugzilla.kernel.org, as issues typically need to be sent
0108 by mail to a maintainer and a public mailing list.
0109
0110 * Search the archives of the bug tracker or mailing list in question
0111 thoroughly for reports that might match your issue. If you find anything,
0112 join the discussion instead of sending a new report.
0113
0114 After these preparations you'll now enter the main part:
0115
0116 * Unless you are already running the latest 'mainline' Linux kernel, better
0117 go and install it for the reporting process. Testing and reporting with
0118 the latest 'stable' Linux can be an acceptable alternative in some
0119 situations; during the merge window that actually might be even the best
0120 approach, but in that development phase it can be an even better idea to
0121 suspend your efforts for a few days anyway. Whatever version you choose,
0122 ideally use a 'vanilla' build. Ignoring this advice will dramatically
0123 increase the risk your report will be rejected or ignored.
0124
0125 * Ensure the kernel you just installed does not 'taint' itself when
0126 running.
0127
0128 * Reproduce the issue with the kernel you just installed. If it doesn't show
0129 up there, scroll down to the instructions for issues only happening with
0130 stable and longterm kernels.
0131
0132 * Optimize your notes: try to find and write the most straightforward way to
0133 reproduce your issue. Make sure the end result has all the important
0134 details, and at the same time is easy to read and understand for others
0135 that hear about it for the first time. And if you learned something in this
0136 process, consider searching again for existing reports about the issue.
0137
0138 * If your failure involves a 'panic', 'Oops', 'warning', or 'BUG', consider
0139 decoding the kernel log to find the line of code that triggered the error.
0140
0141 * If your problem is a regression, try to narrow down when the issue was
0142 introduced as much as possible.
0143
0144 * Start to compile the report by writing a detailed description about the
0145 issue. Always mention a few things: the latest kernel version you installed
0146 for reproducing, the Linux Distribution used, and your notes on how to
0147 reproduce the issue. Ideally, make the kernel's build configuration
0148 (.config) and the output from ``dmesg`` available somewhere on the net and
0149 link to it. Include or upload all other information that might be relevant,
0150 like the output/screenshot of an Oops or the output from ``lspci``. Once
0151 you wrote this main part, insert a normal length paragraph on top of it
0152 outlining the issue and the impact quickly. On top of this add one sentence
0153 that briefly describes the problem and gets people to read on. Now give the
0154 thing a descriptive title or subject that yet again is shorter. Then you're
0155 ready to send or file the report like the MAINTAINERS file told you, unless
0156 you are dealing with one of those 'issues of high priority': they need
0157 special care which is explained in 'Special handling for high priority
0158 issues' below.
0159
0160 * Wait for reactions and keep the thing rolling until you can accept the
0161 outcome in one way or the other. Thus react publicly and in a timely manner
0162 to any inquiries. Test proposed fixes. Do proactive testing: retest with at
0163 least every first release candidate (RC) of a new mainline version and
0164 report your results. Send friendly reminders if things stall. And try to
0165 help yourself, if you don't get any help or if it's unsatisfying.
0166
0167
0168 Reporting regressions within a stable and longterm kernel line
0169 --------------------------------------------------------------
0170
0171 This subsection is for you, if you followed the above process and got sent here at
0172 the point about regressions within a stable or longterm kernel version line. You
0173 face one of those if something breaks when updating from 5.10.4 to 5.10.5 (a
0174 switch from 5.9.15 to 5.10.5 does not qualify). The developers want to fix such
0175 regressions as quickly as possible, hence there is a streamlined process to
0176 report them:
0177
0178 * Check if the kernel developers still maintain the Linux kernel version
0179 line you care about: go to the `front page of kernel.org
0180 <https://kernel.org/>`_ and make sure it mentions
0181 the latest release of the particular version line without an '[EOL]' tag.
0182
0183 * Check the archives of the `Linux stable mailing list
0184 <https://lore.kernel.org/stable/>`_ for existing reports.
0185
0186 * Install the latest release from the particular version line as a vanilla
0187 kernel. Ensure this kernel is not tainted and still shows the problem, as
0188 the issue might have already been fixed there. If you first noticed the
0189 problem with a vendor kernel, check that a vanilla build of the last version
0190 known to work performs fine as well.
0191
0192 * Send a short problem report to the Linux stable mailing list
0193 (stable@vger.kernel.org) and CC the Linux regressions mailing list
0194 (regressions@lists.linux.dev); if you suspect the cause in a particular
0195 subsystem, CC its maintainer and its mailing list. Roughly describe the
0196 issue and ideally explain how to reproduce it. Mention the first version
0197 that shows the problem and the last version that's working fine. Then
0198 wait for further instructions.
0199
0200 The reference section below explains each of these steps in more detail.
0201
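The rule for what counts as a regression within a stable or longterm version
line can be sketched in a few lines of Python. This is only an illustration and
not part of any kernel tooling: the function names are made up for this
example, and the sketch assumes plain version strings like '5.10.4' without
suffixes such as '-rc1'.

.. code-block:: python

   def parse_version(version):
       """Split a release such as '5.10.5' into a tuple of integers."""
       return tuple(int(part) for part in version.split("."))


   def is_stable_line_regression(last_good, first_bad):
       """Return True if both releases share a version line and first_bad is newer."""
       good = parse_version(last_good)
       bad = parse_version(first_bad)
       # The first two components (say '5.10') identify the version line.
       same_line = good[:2] == bad[:2]
       return same_line and bad > good


   print(is_stable_line_regression("5.10.4", "5.10.5"))  # same line: qualifies
   print(is_stable_line_regression("5.9.15", "5.10.5"))  # different lines: does not

The first call describes an update from 5.10.4 to 5.10.5 and thus qualifies;
the second describes a switch from 5.9.15 to 5.10.5, which needs to be handled
like any other regression.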
0202
0203 Reporting issues only occurring in older kernel version lines
0204 -------------------------------------------------------------
0205
0206 This subsection is for you, if you tried the latest mainline kernel as outlined
0207 above, but failed to reproduce your issue there; at the same time you want to
0208 see the issue fixed in a still supported stable or longterm series or vendor
0209 kernels regularly rebased on those. If that is the case, follow these steps:
0210
0211 * Prepare yourself for the possibility that going through the next few steps
0212 might not get the issue solved in older releases: the fix might be too big
0213 or risky to get backported there.
0214
0215 * Perform the first three steps in the section "Reporting regressions
0216 within a stable and longterm kernel line" above.
0217
0218 * Search the Linux kernel version control system for the change that fixed
0219 the issue in mainline, as its commit message might tell you if the fix is
0220 scheduled for backporting already. If you don't find anything that way,
0221 search the appropriate mailing lists for posts that discuss such an issue
0222 or peer-review possible fixes; then check the discussions if the fix was
0223 deemed unsuitable for backporting. If backporting was not considered at
0224 all, join the newest discussion, asking if it's in the cards.
0225
0226 * One of the former steps should lead to a solution. If that doesn't work
0227 out, ask the maintainers for the subsystem that seems to be causing the
0228 issue for advice; CC the mailing list for the particular subsystem as well
0229 as the stable mailing list.
0230
0231 The reference section below explains each of these steps in more detail.
0232
0233
0234 Reference section: Reporting issues to the kernel maintainers
0235 =============================================================
0236
0237 The detailed guides above outline all the major steps in brief fashion, which
0238 should be enough for most people. But sometimes there are situations where even
0239 experienced users might wonder how to actually do one of those steps. That's
0240 what this section is for, as it will provide a lot more details on each of the
0241 above steps. Consider this as reference documentation: it's possible to read it
0242 from top to bottom. But it's mainly meant to be skimmed and to serve as a place
0243 to look up details on how to actually perform those steps.
0244
0245 A few words of general advice before digging into the details:
0246
0247 * The Linux kernel developers are well aware this process is complicated and
0248 demands more than other FLOSS projects. We'd love to make it simpler. But
0249 that would require work in various places as well as some infrastructure,
0250 which would need constant maintenance; nobody has stepped up to do that
0251 work, so that's just how things are for now.
0252
0253 * A warranty or support contract with some vendor doesn't entitle you to
0254 request fixes from developers in the upstream Linux kernel community: such
0255 contracts are completely outside the scope of the Linux kernel, its
0256 development community, and this document. That's why you can't demand
0257 anything such a contract guarantees in this context, not even if the
0258 developer handling the issue works for the vendor in question. If you want
0259 to claim your rights, use the vendor's support channel instead. When doing
0260 so, you might want to mention you'd like to see the issue fixed in the
0261 upstream Linux kernel; motivate them by saying it's the only way to ensure
0262 the fix in the end will get incorporated in all Linux distributions.
0263
0264 * If you never reported an issue to a FLOSS project before, you should consider
0265 reading `How to Report Bugs Effectively
0266 <https://www.chiark.greenend.org.uk/~sgtatham/bugs.html>`_, `How To Ask
0267 Questions The Smart Way
0268 <http://www.catb.org/esr/faqs/smart-questions.html>`_, and `How to ask good
0269 questions <https://jvns.ca/blog/good-questions/>`_.
0270
0271 With that off the table, find below the details on how to properly report
0272 issues to the Linux kernel developers.
0273
0274
0275 Make sure you're using the upstream Linux kernel
0276 ------------------------------------------------
0277
0278 *Are you facing an issue with a Linux kernel a hardware or software vendor
0279 provided? Then in almost all cases you are better off if you stop reading this
0280 document and report the issue to your vendor instead, unless you are
0281 willing to install the latest Linux version yourself. Be aware the latter
0282 will often be needed anyway to hunt down and fix issues.*
0283
0284 Like most programmers, Linux kernel developers don't like to spend time dealing
0285 with reports for issues that don't even happen with their current code. It's
0286 just a waste of everybody's time, especially yours. Unfortunately such situations
0287 easily happen when it comes to the kernel and often lead to frustration on both
0288 sides. That's because almost all Linux-based kernels pre-installed on devices
0289 (Computers, Laptops, Smartphones, Routers, …) and most shipped by Linux
0290 distributors are quite distant from the official Linux kernel as distributed by
0291 kernel.org: these vendor kernels are often ancient from the point of view of
0292 Linux development or heavily modified, often both.
0293
0294 Most of these vendor kernels are quite unsuitable for reporting issues to the
0295 Linux kernel developers: an issue you face with one of them might have been
0296 fixed by the Linux kernel developers months or years ago already; additionally,
0297 the modifications and enhancements by the vendor might be causing the issue you
0298 face, even if they look small or totally unrelated. That's why you should report
0299 issues with these kernels to the vendor. Its developers should look into the
0300 report and, in case it turns out to be an upstream issue, fix it directly
0301 upstream or forward the report there. In practice that often does not work out
0302 or might not be what you want. You thus might want to consider circumventing the
0303 vendor by installing the very latest Linux kernel core yourself. If that's an
0304 option for you, move ahead in this process, as a later step in this guide will
0305 explain how to do that once it rules out other potential causes for your issue.
0306
0307 Note, the previous paragraph starts with the word 'most', as sometimes
0308 developers in fact are willing to handle reports about issues occurring with
0309 vendor kernels. Whether they do in the end highly depends on the developers and the
0310 issue in question. Your chances are quite good if the distributor applied only
0311 small modifications to a kernel based on a recent Linux version; that for
0312 example often holds true for the mainline kernels shipped by Debian GNU/Linux
0313 Sid or Fedora Rawhide. Some developers will also accept reports about issues
0314 with kernels from distributions shipping the latest stable kernel, as long as
0315 it's only slightly modified; that for example is often the case for Arch Linux,
0316 regular Fedora releases, and openSUSE Tumbleweed. But keep in mind, you are better
0317 off using a mainline Linux and avoiding a stable kernel for this
0318 process, as outlined in the section 'Install a fresh kernel for testing' in more
0319 detail.
0320
0321 Obviously you are free to ignore all this advice and report problems with an old
0322 or heavily modified vendor kernel to the upstream Linux developers. But note,
0323 those often get rejected or ignored, so consider yourself warned. But it's still
0324 better than not reporting the issue at all: sometimes such reports directly or
0325 indirectly will help to get the issue fixed over time.
0326
0327
0328 Search for existing reports, first run
0329 --------------------------------------
0330
0331 *Perform a rough search for existing reports with your favorite internet
0332 search engine; additionally, check the archives of the Linux Kernel Mailing
0333 List (LKML). If you find matching reports, join the discussion instead of
0334 sending a new one.*
0335
0336 Reporting an issue that someone else already brought forward is often a waste of
0337 time for everyone involved, especially you as the reporter. So it's in your own
0338 interest to thoroughly check if somebody reported the issue already. At this
0339 step of the process it's okay to just perform a rough search: a later step will
0340 tell you to perform a more detailed search once you know where your issue needs
0341 to be reported to. Nevertheless, do not hurry with this step of the reporting
0342 process, it can save you time and trouble.
0343
0344 Simply search the internet with your favorite search engine first. Afterwards,
0345 search the `Linux Kernel Mailing List (LKML) archives
0346 <https://lore.kernel.org/lkml/>`_.
0347
0348 If you get flooded with results, consider telling your search engine to limit the
0349 search timeframe to the past month or year. And wherever you search, make sure
0350 to use good search terms; vary them a few times, too. While doing so try to
0351 look at the issue from the perspective of someone else: that will help you to
0352 come up with other words to use as search terms. Also make sure not to use too
0353 many search terms at once. Remember to search with and without information like
0354 the name of the kernel driver or the name of the affected hardware component.
0355 But its exact brand name (say 'ASUS Red Devil Radeon RX 5700 XT Gaming OC')
0356 often is not very helpful, as it is too specific. Instead try search terms like
0357 the model line ('Radeon 5700' or 'Radeon 5000') and the code name of the main chip
0358 ('Navi' or 'Navi10') with and without its manufacturer ('AMD').
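
If you want to try several variations of your search terms, a small Python
sketch like the following can prepare the archive links for you. It is just an
illustration: the function name is made up, and the sketch assumes the LKML
archive accepts a search query in a ``q`` parameter appended to its address.

.. code-block:: python

   from urllib.parse import urlencode

   # Address of the LKML archives as linked in this document.
   LORE_LKML = "https://lore.kernel.org/lkml/"


   def lore_search_urls(terms):
       """Build one search link per search term (assumes a 'q' query parameter)."""
       return [LORE_LKML + "?" + urlencode({"q": term}) for term in terms]


   # Vary the terms: model line, code name, with and without the manufacturer.
   for url in lore_search_urls(["Radeon 5700", "Navi10", "AMD Navi10"]):
       print(url)

Open each printed link in a browser and skim the results before trying the
next variation.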
0359
0360 In case you find an existing report about your issue, join the discussion, as
0361 you might be able to provide valuable additional information. That can be
0362 important even when a fix is prepared or in its final stages already, as
0363 developers might look for people that can provide additional information or
0364 test a proposed fix. Jump to the section 'Duties after the report went out' for
0365 details on how to get properly involved.
0366
0367 Note, searching `bugzilla.kernel.org <https://bugzilla.kernel.org/>`_ might also
0368 be a good idea, as that might provide valuable insights or turn up matching
0369 reports. If you find the latter, just keep in mind: most subsystems expect
0370 reports in different places, as described below in the section "Check where you
0371 need to report your issue". The developers that should take care of the issue
0372 thus might not even be aware of the bugzilla ticket. Hence, check the ticket if
0373 the issue already got reported as outlined in this document and if not consider
0374 doing so.
0375
0376
0377 Issue of high priority?
0378 -----------------------
0379
0380 *See if the issue you are dealing with qualifies as regression, security
0381 issue, or a really severe problem: those are 'issues of high priority' that
0382 need special handling in some steps that are about to follow.*
0383
0384 Linus Torvalds and the leading Linux kernel developers want to see some issues
0385 fixed as soon as possible, hence there are 'issues of high priority' that get
0386 handled slightly differently in the reporting process. Three types of cases
0387 qualify: regressions, security issues, and really severe problems.
0388
0389 You deal with a regression if some application or practical use case running
0390 fine with one Linux kernel works worse or not at all with a newer version
0391 compiled using a similar configuration. The document
0392 Documentation/admin-guide/reporting-regressions.rst explains this in more
0393 detail. It also provides a good deal of other information about regressions you
0394 might want to be aware of; it for example explains how to add your issue to the
0395 list of tracked regressions, to ensure it won't fall through the cracks.
0396
0397 What qualifies as security issue is left to your judgment. Consider reading
0398 Documentation/admin-guide/security-bugs.rst before proceeding, as it
0399 provides additional details how to best handle security issues.
0400
0401 An issue is a 'really severe problem' when something totally unacceptably bad
0402 happens. That's for example the case when a Linux kernel corrupts the data it's
0403 handling or damages hardware it's running on. You're also dealing with a severe
0404 issue when the kernel suddenly stops working with an error message ('kernel
0405 panic') or without any farewell note at all. Note: do not confuse a 'panic' (a
0406 fatal error where the kernel stops itself) with an 'Oops' (a recoverable error),
0407 as the kernel remains running after the latter.
0408
0409
0410 Ensure a healthy environment
0411 ----------------------------
0412
0413 *Make sure it's not the kernel's surroundings that are causing the issue
0414 you face.*
0415
0416 Problems that look a lot like a kernel issue are sometimes caused by the build or
0417 runtime environment. It's hard to rule out that possibility completely, but you
0418 should minimize the risk:
0419
0420 * Use proven tools when building your kernel, as bugs in the compiler or the
0421 binutils can cause the resulting kernel to misbehave.
0422
0423 * Ensure your computer components run within their design specifications;
0424 that's especially important for the main processor, the main memory, and the
0425 motherboard. Therefore, stop undervolting or overclocking when facing a
0426 potential kernel issue.
0427
0428 * Try to make sure it's not faulty hardware that is causing your issue. Bad
0429 main memory for example can result in a multitude of issues that will
0430 manifest themselves in problems looking like kernel issues.
0431
0432 * If you're dealing with a filesystem issue, you might want to check the file
0433 system in question with ``fsck``, as it might be damaged in a way that leads
0434 to unexpected kernel behavior.
0435
0436 * When dealing with a regression, make sure it's not something else that
0437 changed in parallel to updating the kernel. The problem for example might be
0438 caused by other software that was updated at the same time. It can also
0439 happen that a hardware component coincidentally just broke when you rebooted
0440 into a new kernel for the first time. Updating the system's BIOS or changing
0441 something in the BIOS Setup can also lead to problems that look a lot
0442 like a kernel regression.
0443
0444
0445 Prepare for emergencies
0446 -----------------------
0447
0448 *Create a fresh backup and put system repair and restore tools at hand.*
0449
0450 Reminder, you are dealing with computers, which sometimes do unexpected things,
0451 especially if you fiddle with crucial parts like the kernel of its operating
0452 system. That's what you are about to do in this process. Thus, make sure to
0453 create a fresh backup; also ensure you have all tools at hand to repair or
0454 reinstall the operating system as well as everything you need to restore the
0455 backup.
0456
0457
0458 Make sure your kernel doesn't get enhanced
0459 ------------------------------------------
0460
0461 *Ensure your system does not enhance its kernels by building additional
0462 kernel modules on-the-fly, which solutions like DKMS might be doing locally
0463 without your knowledge.*
0464
0465 The risk your issue report gets ignored or rejected dramatically increases if
0466 your kernel gets enhanced in any way. That's why you should remove or disable
0467 mechanisms like akmods and DKMS: those build add-on kernel modules
0468 automatically, for example when you install a new Linux kernel or boot it for
0469 the first time. Also remove any modules they might have installed. Then reboot
0470 before proceeding.
0471
0472 Note, you might not be aware that your system is using one of these solutions:
0473 they often get set up silently when you install Nvidia's proprietary graphics
0474 driver, VirtualBox, or other software that requires some support from a
0475 module not part of the Linux kernel. That's why you might need to uninstall the
0476 packages with such software to get rid of any 3rd party kernel module.
0477
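One way to spot add-on modules that are currently loaded is to look at the
per-module taint letters in ``/proc/modules``. The following Python sketch
illustrates the idea; it assumes those letters show up in parentheses at the
end of a line and that 'O' marks a module built outside the kernel tree. The
function name and the sample text (including module sizes and addresses) are
made up for this example.

.. code-block:: python

   import re

   # Made-up sample in the style of /proc/modules; the values are placeholders.
   SAMPLE = """\
   nvidia 35000000 100 nvidia_modeset, Live 0x0000000000000000 (POE)
   ext4 800000 2 - Live 0x0000000000000000
   vboxdrv 500000 3 vboxnetflt, Live 0x0000000000000000 (OE)
   """


   def out_of_tree_modules(proc_modules_text):
       """Return names of modules whose taint letters include 'O'."""
       names = []
       for line in proc_modules_text.splitlines():
           # Taint letters, if any, are the parenthesized group at the line end.
           match = re.search(r"\(([A-Z+-]+)\)\s*$", line)
           if match and "O" in match.group(1):
               names.append(line.split()[0])
       return names


   print(out_of_tree_modules(SAMPLE))

To check the running system, pass the content of the real file instead of the
sample, for example ``open("/proc/modules").read()``. An empty list does not
prove DKMS or akmods are gone; it only means no such module is loaded right now.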
0478
0479 Check 'taint' flag
0480 ------------------
0481
0482 *Check if your kernel was 'tainted' when the issue occurred, as the event
0483 that made the kernel set this flag might be causing the issue you face.*
0484
0485 The kernel marks itself with a 'taint' flag when something happens that might
0486 lead to follow-up errors that look totally unrelated. The issue you face might
0487 be such an error if your kernel is tainted. That's why it's in your interest to
0488 rule this out early before investing more time into this process. This is the
0489 only reason why this step is here, as this process later will tell you to
0490 install the latest mainline kernel; you will need to check the taint flag again
0491 then, as that's when it matters because it's the kernel the report will focus
0492 on.
0493
0494 On a running system it is easy to check if the kernel tainted itself: if ``cat
0495 /proc/sys/kernel/tainted`` returns '0' then the kernel is not tainted and
0496 everything is fine. Checking that file is impossible in some situations; that's
0497 why the kernel also mentions the taint status when it reports an internal
0498 problem (a 'kernel bug'), a recoverable error (a 'kernel Oops') or a
0499 non-recoverable error before halting operation (a 'kernel panic'). Look near
0500 the top of the error messages printed when one of these occurs and search for a
0501 line starting with 'CPU:'. It should end with 'Not tainted' if the kernel was
0502 not tainted when it noticed the problem; it was tainted if you see 'Tainted:'
0503 followed by a few spaces and some letters.
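
Both checks described above can be sketched in Python. This is only an
illustration: the function names are made up, and the sample 'CPU:' lines
merely follow the layout described in the previous paragraph, so real log
lines might look different on your kernel version.

.. code-block:: python

   def is_tainted_value(text):
       """Interpret the content of /proc/sys/kernel/tainted: '0' means not tainted."""
       return int(text.strip()) != 0


   def taint_letters_from_cpu_line(line):
       """Return the taint letters from a 'CPU:' line, or '' if it says 'Not tainted'."""
       if "Not tainted" in line:
           return ""
       _, found, rest = line.partition("Tainted:")
       if not found:
           raise ValueError("no taint information in this line")
       letters = []
       for token in rest.split():
           # The letters end where the kernel version string begins.
           if not (token.isalpha() and token.isupper()):
               break
           letters.append(token)
       return "".join(letters)


   # Illustrative sample lines, not taken from a real log.
   clean = "CPU: 1 PID: 100 Comm: foo Not tainted 5.10.5 #1"
   dirty = "CPU: 0 PID: 4424 Comm: insmod Tainted: P        W  O      4.20.0-0.rc6.fc30 #1"

   print(is_tainted_value("0\n"))
   print(repr(taint_letters_from_cpu_line(clean)))
   print(repr(taint_letters_from_cpu_line(dirty)))

On a running system you would pass the content of
``/proc/sys/kernel/tainted`` to the first function; the second one is meant for
lines copied from an Oops, panic, or similar error message.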
0504
0505 If your kernel is tainted, study Documentation/admin-guide/tainted-kernels.rst
0506 to find out why. Try to eliminate the reason. Often it's caused by one of these
0507 three things:
0508
0509 1. A recoverable error (a 'kernel Oops') occurred and the kernel tainted
0510 itself, as the kernel knows it might misbehave in strange ways after that
0511 point. In that case check your kernel or system log and look for a section
0512 that starts with this::
0513
0514 Oops: 0000 [#1] SMP
0515
0516 That's the first Oops since boot-up, as the '#1' between the brackets shows.
0517 Every Oops and any other problem that happens after that point might be a
0518 follow-up problem to that first Oops, even if both look totally unrelated.
0519 Rule this out by getting rid of the cause for the first Oops and reproducing
0520 the issue afterwards. Sometimes simply restarting will be enough, sometimes
0521 a change to the configuration followed by a reboot can eliminate the Oops.
0522 But don't invest too much time into this at this point of the process, as
0523 the cause for the Oops might already be fixed in the newer Linux kernel
0524 version you are going to install later in this process.
0525
0526 2. Your system uses software that installs its own kernel modules, for
0527 example Nvidia's proprietary graphics driver or VirtualBox. The kernel
0528 taints itself when it loads such modules from external sources (even if
0529 they are Open Source): they sometimes cause errors in unrelated kernel
0530 areas and thus might be causing the issue you face. You therefore have to
0531 prevent those modules from loading when you want to report an issue to the
0532 Linux kernel developers. Most of the time the easiest way to do that is:
0533 temporarily uninstall such software including any modules they might have
0534 installed. Afterwards reboot.
0535
0536 3. The kernel also taints itself when it's loading a module that resides in
0537 the staging tree of the Linux kernel source. That's a special area for
0538 code (mostly drivers) that does not yet fulfill the normal Linux kernel
0539 quality standards. When you report an issue with such a module it's
0540 obviously okay if the kernel is tainted; just make sure the module in
0541 question is the only reason for the taint. If the issue happens in an
0542 unrelated area reboot and temporarily block the module from being loaded
0543 by specifying ``foo.blacklist=1`` as kernel parameter (replace 'foo' with
0544 the name of the module in question).
0545
0546
0547 Document how to reproduce issue
0548 -------------------------------
0549
0550 *Write down coarsely how to reproduce the issue. If you deal with multiple
0551 issues at once, create separate notes for each of them and make sure they
0552 work independently on a freshly booted system. That's needed, as each issue
0553 needs to get reported to the kernel developers separately, unless they are
0554 strongly entangled.*
0555
0556 If you deal with multiple issues at once, you'll have to report each of them
0557 separately, as they might be handled by different developers. Describing
0558 various issues in one report also makes it quite difficult for others to tear
0559 it apart. Hence, only combine issues in one report if they are very strongly
0560 entangled.
0561
0562 Additionally, during the reporting process you will have to test if the issue
0563 happens with other kernel versions. Therefore, it will make your work easier if
0564 you know exactly how to reproduce an issue quickly on a freshly booted system.
0565
0566 Note: it's often fruitless to report issues that only happened once, as they
0567 might be caused by a bit flip due to cosmic radiation. That's why you should
0568 try to rule that out by reproducing the issue before going further. Feel free
0569 to ignore this advice if you are experienced enough to tell a one-time error
0570 due to faulty hardware apart from a kernel issue that rarely happens and thus
0571 is hard to reproduce.
0572
0573
0574 Regression in stable or longterm kernel?
0575 ----------------------------------------
0576
0577 *If you are facing a regression within a stable or longterm version line
0578 (say something broke when updating from 5.10.4 to 5.10.5), scroll down to
0579 'Dealing with regressions within a stable and longterm kernel line'.*
0580
Regressions within a stable or longterm kernel version line are something the
Linux developers badly want to fix, as such issues are even more unwanted than
regressions in the main development branch: they can quickly affect a lot of
people. The developers thus want to learn about such issues as quickly as
possible, hence there is a streamlined process to report them. Note,
regressions that show up when switching to a newer kernel version line (say
something broke when switching from 5.9.15 to 5.10.5) do not qualify.
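
Whether two releases belong to the same version line can be checked by
comparing the first two parts of their version numbers. The following is a
minimal sketch of that comparison; the function names ``series_of`` and
``same_series`` are made up for this illustration, and it assumes the
numbering scheme used in the examples above (like 5.10.4).

```shell
# Print the series ("version line") a release belongs to, e.g. 5.10.4 -> 5.10.
series_of() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Succeed only if both releases belong to the same stable or longterm series.
same_series() {
    [ "$(series_of "$1")" = "$(series_of "$2")" ]
}

# The two examples from the text above:
if same_series 5.10.4 5.10.5; then
    echo "5.10.4 -> 5.10.5: regression within one series"
fi
if ! same_series 5.9.15 5.10.5; then
    echo "5.9.15 -> 5.10.5: different series, does not qualify"
fi
```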
0588
0589
0590 Check where you need to report your issue
0591 -----------------------------------------
0592
0593 *Locate the driver or kernel subsystem that seems to be causing the issue.
0594 Find out how and where its developers expect reports. Note: most of the
0595 time this won't be bugzilla.kernel.org, as issues typically need to be sent
0596 by mail to a maintainer and a public mailing list.*
0597
It's crucial to send your report to the right people, as the Linux kernel is a
big project and most of its developers are only familiar with a small subset of
it. Quite a few programmers only care for just one driver, for example one for
a WiFi chip; such a developer likely will have little or no knowledge about the
internals of remote or unrelated "subsystems", like the TCP stack, the PCIe/PCI
subsystem, memory management or file systems.
0604
0605 Problem is: the Linux kernel lacks a central bug tracker where you can simply
0606 file your issue and make it reach the developers that need to know about it.
0607 That's why you have to find the right place and way to report issues yourself.
0608 You can do that with the help of a script (see below), but it mainly targets
0609 kernel developers and experts. For everybody else the MAINTAINERS file is the
0610 better place.
0611
0612 How to read the MAINTAINERS file
0613 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To illustrate how to use the :ref:`MAINTAINERS <maintainers>` file, let's assume
the WiFi in your laptop suddenly misbehaves after updating the kernel. In that
case it's likely an issue in the WiFi driver. Obviously it could also be some
code it builds upon, but unless you suspect something like that, stick to the
driver. If it's really something else, the driver's developers will get the
right people involved.
0620
0621 Sadly, there is no way to check which code is driving a particular hardware
0622 component that is both universal and easy.
0623
In case of a problem with the WiFi driver you might for example want to look at
the output of ``lspci -k``, as it lists the devices on the PCI/PCIe bus and the
kernel module driving each of them::
0627
0628 [user@something ~]$ lspci -k
0629 [...]
0630 3a:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32)
0631 Subsystem: Bigfoot Networks, Inc. Device 1535
0632 Kernel driver in use: ath10k_pci
0633 Kernel modules: ath10k_pci
0634 [...]
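
If the list is long, a small filter can reduce it to the lines you care about.
This is just a sketch: the function name ``pci_drivers`` is made up, and it
assumes the usual ``lspci -k`` layout in which the device line starts at the
beginning of the line and the details below it are indented.

```shell
# Read `lspci -k` output on stdin and print "<PCI address> <driver>" for every
# device that currently has a driver bound to it.
pci_drivers() {
    awk '
        /^[0-9a-f]/             { addr = $1 }         # device line: remember its address
        /Kernel driver in use:/ { print addr, $NF }   # last field is the driver name
    '
}
```

Call it like ``lspci -k | pci_drivers``; for the device shown above it would
print a line consisting of the address '3a:00.0' and the driver 'ath10k_pci'.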
0635
0636 But this approach won't work if your WiFi chip is connected over USB or some
0637 other internal bus. In those cases you might want to check your WiFi manager or
0638 the output of ``ip link``. Look for the name of the problematic network
0639 interface, which might be something like 'wlp58s0'. This name can be used like
0640 this to find the module driving it::
0641
0642 [user@something ~]$ realpath --relative-to=/sys/module/ /sys/class/net/wlp58s0/device/driver/module
0643 ath10k_pci
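
The same lookup can be done for all network interfaces at once. The following
sketch assumes the sysfs layout used in the command above; the function name
``net_drivers`` and its optional argument (the sysfs mount point) are made up
for this example.

```shell
# List every network interface together with the kernel module driving it.
# The optional argument is the sysfs mount point; /sys is the usual location.
net_drivers() {
    sysfs=${1:-/sys}
    for dev in "$sysfs"/class/net/*; do
        link=$dev/device/driver/module
        # Virtual interfaces such as 'lo' have no driver module link; skip them.
        [ -e "$link" ] || continue
        printf '%s %s\n' "$(basename "$dev")" "$(basename "$(readlink -f "$link")")"
    done
}
```

Calling ``net_drivers`` without arguments inspects the running system.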
0644
0645 In case tricks like these don't bring you any further, try to search the
0646 internet on how to narrow down the driver or subsystem in question. And if you
0647 are unsure which it is: just try your best guess, somebody will help you if you
0648 guessed poorly.
0649
Once you know the driver or subsystem, you want to search for it in the
MAINTAINERS file. In the case of 'ath10k_pci' you won't find anything, as the
name is too specific. Sometimes you will need to search on the net for help;
but before doing so, try a somewhat shortened or modified name when searching
the MAINTAINERS file, as then you might find something like this::
0655
0656 QUALCOMM ATHEROS ATH10K WIRELESS DRIVER
0657 Mail: A. Some Human <shuman@example.com>
0658 Mailing list: ath10k@lists.infradead.org
0659 Status: Supported
0660 Web-page: https://wireless.wiki.kernel.org/en/users/Drivers/ath10k
0661 SCM: git git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git
0662 Files: drivers/net/wireless/ath/ath10k/
0663
Note: the line descriptions will be abbreviated if you read the plain
MAINTAINERS file found in the root of the Linux source tree. 'Mail:' for
example will be 'M:', 'Mailing list:' will be 'L:', and 'Status:' will be 'S:'.
A section near the top of the file explains these and other abbreviations.
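
If you have the plain MAINTAINERS file at hand, you can print whole sections
instead of single matching lines. This sketch relies on the sections in that
file being separated by blank lines; the function name
``maintainers_section`` is made up for this illustration.

```shell
# Print every MAINTAINERS section that contains the given text, ignoring case.
# An empty record separator puts awk into "paragraph mode", in which blank
# lines separate the records.
maintainers_section() {
    awk -v RS= -v ORS='\n\n' -v needle="$1" \
        'index(toupper($0), toupper(needle))' "$2"
}
```

From the root of the Linux source tree you would call it like
``maintainers_section ath10k MAINTAINERS``.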
0668
First look at the line 'Status'. Ideally it should be 'Supported' or
'Maintained'. If it states 'Obsolete' then you are using some outdated approach
that was replaced by a newer solution you need to switch to. Sometimes the code
only has someone who provides 'Odd Fixes' when feeling motivated. And with
'Orphan' you are totally out of luck, as nobody takes care of the code anymore.
That only leaves these options: learn to live with the issue, fix it yourself,
or find a programmer somewhere willing to fix it.
0676
0677 After checking the status, look for a line starting with 'bugs:': it will tell
0678 you where to find a subsystem specific bug tracker to file your issue. The
0679 example above does not have such a line. That is the case for most sections, as
0680 Linux kernel development is completely driven by mail. Very few subsystems use
0681 a bug tracker, and only some of those rely on bugzilla.kernel.org.
0682
0683 In this and many other cases you thus have to look for lines starting with
0684 'Mail:' instead. Those mention the name and the email addresses for the
0685 maintainers of the particular code. Also look for a line starting with 'Mailing
0686 list:', which tells you the public mailing list where the code is developed.
0687 Your report later needs to go by mail to those addresses. Additionally, for all
0688 issue reports sent by email, make sure to add the Linux Kernel Mailing List
0689 (LKML) <linux-kernel@vger.kernel.org> to CC. Don't omit either of the mailing
0690 lists when sending your issue report by mail later! Maintainers are busy people
0691 and might leave some work for other developers on the subsystem specific list;
0692 and LKML is important to have one place where all issue reports can be found.
0693
0694
0695 Finding the maintainers with the help of a script
0696 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0697
For people that have the Linux sources at hand there is a second option to find
the proper place to report: the script 'scripts/get_maintainer.pl' which tries
to find all people to contact. It queries the MAINTAINERS file and needs to be
called with a path to the source code in question. For drivers compiled as a
module it often can be found with a command like this::
0703
    $ modinfo ath10k_pci | grep filename | sed 's!/lib/modules/.*/kernel/!!; s!filename:!!; s!\.ko\(\|\.xz\)!!'
    drivers/net/wireless/ath/ath10k/ath10k_pci
0706
0707 Pass parts of this to the script::
0708
0709 $ ./scripts/get_maintainer.pl -f drivers/net/wireless/ath/ath10k*
0710 Some Human <shuman@example.com> (supporter:QUALCOMM ATHEROS ATH10K WIRELESS DRIVER)
0711 Another S. Human <asomehuman@example.com> (maintainer:NETWORKING DRIVERS)
0712 ath10k@lists.infradead.org (open list:QUALCOMM ATHEROS ATH10K WIRELESS DRIVER)
0713 linux-wireless@vger.kernel.org (open list:NETWORKING DRIVERS (WIRELESS))
0714 netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
0715 linux-kernel@vger.kernel.org (open list)
0716
Don't send your report to all of them. Send it to the maintainers, which the
script calls "supporter:"; additionally CC the most specific mailing list for
the code as well as the Linux Kernel Mailing List (LKML). In this case you thus
would need to send the report to 'Some Human <shuman@example.com>' with
'ath10k@lists.infradead.org' and 'linux-kernel@vger.kernel.org' in CC.
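
The selection just described can be sketched as a filter for the script's
output. Treat it as an illustration only: the function name
``suggest_recipients`` is made up, it only looks for the "supporter:" label
shown in the example output above, and it assumes the first mailing list
printed is the most specific one. Always check the result by hand.

```shell
# Read get_maintainer.pl output on stdin and suggest recipients: supporters
# go into To, the first listed mailing list and LKML go into Cc.
suggest_recipients() {
    awk '
        { who = $0; sub(/ \(.*$/, "", who) }             # strip the trailing "(role:...)"
        / \(supporter:/                 { print "To: " who }
        / \(open list:/ && !seen_list++ { print "Cc: " who }
        / \(open list\)$/               { print "Cc: " who }
    '
}
```

A call could look like ``./scripts/get_maintainer.pl -f
drivers/net/wireless/ath/ath10k* | suggest_recipients``.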
0722
Note: in case you cloned the Linux sources with git you might want to call
``get_maintainer.pl`` a second time with ``--git``. The script then will look
at the commit history to find which people recently worked on the code in
question, as they might be able to help. But use these results with care, as
they can easily send you in a wrong direction. That for example happens quickly
in areas rarely changed (like old or unmaintained drivers): sometimes such code
is modified during tree-wide cleanups by developers that do not care about the
particular driver at all.
0731
0732
0733 Search for existing reports, second run
0734 ---------------------------------------
0735
0736 *Search the archives of the bug tracker or mailing list in question
0737 thoroughly for reports that might match your issue. If you find anything,
0738 join the discussion instead of sending a new report.*
0739
As mentioned earlier already: reporting an issue that someone else already
brought forward is often a waste of time for everyone involved, especially you
as the reporter. That's why you should search for existing reports again, now
that you know where they need to be reported to. If it's a mailing list, you
will often find its archives on `lore.kernel.org <https://lore.kernel.org/>`_.
0745
But some lists are hosted in different places. That for example is the case for
the ath10k WiFi driver used as example in the previous step. But you'll often
find the archives for these lists easily on the net. Searching for 'archive
ath10k@lists.infradead.org' for example will lead you to the `Info page for the
ath10k mailing list <https://lists.infradead.org/mailman/listinfo/ath10k>`_,
which at the top links to its
`list archives <https://lists.infradead.org/pipermail/ath10k/>`_. Sadly this and
quite a few other lists lack a way to search the archives. In those cases use a
regular internet search engine and add something like
'site:lists.infradead.org/pipermail/ath10k/' to your search terms, which limits
the results to the archives at that URL.
0757
0758 It's also wise to check the internet, LKML and maybe bugzilla.kernel.org again
0759 at this point. If your report needs to be filed in a bug tracker, you may want
0760 to check the mailing list archives for the subsystem as well, as someone might
0761 have reported it only there.
0762
For details on how to search and what to do if you find matching reports, see
"Search for existing reports, first run" above.
0765
0766 Do not hurry with this step of the reporting process: spending 30 to 60 minutes
0767 or even more time can save you and others quite a lot of time and trouble.
0768
0769
0770 Install a fresh kernel for testing
0771 ----------------------------------
0772
    *Unless you are already running the latest 'mainline' Linux kernel, better
    go and install it for the reporting process. Testing and reporting with
    the latest 'stable' Linux can be an acceptable alternative in some
    situations; during the merge window that actually might be even the best
    approach, but in that development phase it can be an even better idea to
    suspend your efforts for a few days anyway. Whatever version you choose,
    ideally use a 'vanilla' build. Ignoring this advice will dramatically
    increase the risk your report will be rejected or ignored.*
0781
As mentioned in the detailed explanation for the first step already: Like most
programmers, Linux kernel developers don't like to spend time dealing with
reports for issues that don't even happen with the current code. It's just a
waste of everybody's time, especially yours. That's why it's in everybody's
interest that you confirm the issue still exists with the latest upstream code
before reporting it. You are free to ignore this advice, but as outlined
earlier: doing so dramatically increases the risk that your issue report might
get rejected or simply ignored.
0790
0791 In the scope of the kernel "latest upstream" normally means:
0792
0793 * Install a mainline kernel; the latest stable kernel can be an option, but
0794 most of the time is better avoided. Longterm kernels (sometimes called 'LTS
0795 kernels') are unsuitable at this point of the process. The next subsection
0796 explains all of this in more detail.
0797
* The subsection after that describes ways to obtain and install such a
  kernel. It also outlines that using a pre-compiled kernel is fine, but that
  it had better be vanilla, which means: it was built using Linux sources taken
  straight `from kernel.org <https://kernel.org/>`_ and not modified or
  enhanced in any way.
0802
0803 Choosing the right version for testing
0804 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0805
Head over to `kernel.org <https://kernel.org/>`_ to find out which version you
want to use for testing. Ignore the big yellow button that says 'Latest release'
and look a little lower at the table. At its top you'll see a line starting with
mainline, which most of the time will point to a pre-release with a version
number like '5.8-rc2'. If that's the case, you'll want to use this mainline
kernel for testing, as that's where all fixes have to be applied first. Do not
let that 'rc' scare you, these 'development kernels' are pretty reliable. And
you made a backup, as you were instructed above, didn't you?
0814
0815 In about two out of every nine to ten weeks, mainline might point you to a
0816 proper release with a version number like '5.7'. If that happens, consider
0817 suspending the reporting process until the first pre-release of the next
0818 version (5.8-rc1) shows up on kernel.org. That's because the Linux development
0819 cycle then is in its two-week long 'merge window'. The bulk of the changes and
0820 all intrusive ones get merged for the next release during this time. It's a bit
0821 more risky to use mainline during this period. Kernel developers are also often
0822 quite busy then and might have no spare time to deal with issue reports. It's
0823 also quite possible that one of the many changes applied during the merge
0824 window fixes the issue you face; that's why you soon would have to retest with
0825 a newer kernel version anyway, as outlined below in the section 'Duties after
0826 the report went out'.
0827
That's why it might make sense to wait till the merge window is over. But don't
do that if you're dealing with something that shouldn't wait. In that case
consider obtaining the latest mainline kernel via git (see below) or use the
latest stable version offered on kernel.org. Using that is also acceptable in
case mainline for some reason does currently not work for you. And in general:
using it for reproducing the issue is also better than not reporting the issue
at all.
0835
0836 Better avoid using the latest stable kernel outside merge windows, as all fixes
0837 must be applied to mainline first. That's why checking the latest mainline
0838 kernel is so important: any issue you want to see fixed in older version lines
0839 needs to be fixed in mainline first before it can get backported, which can
0840 take a few days or weeks. Another reason: the fix you hope for might be too
0841 hard or risky for backporting; reporting the issue again hence is unlikely to
0842 change anything.
0843
These aspects are also why longterm kernels (sometimes called "LTS kernels")
are unsuitable for this part of the reporting process: they are too distant from
the current code. Hence go and test mainline first and follow the process
further: if the issue doesn't occur with mainline it will guide you how to get
it fixed in older version lines, if that's in the cards for the fix in question.
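
The version numbers you will see on kernel.org follow the three patterns used
in this subsection ('5.8-rc2', '5.7', and versions with three sections). The
following sketch tells them apart; the function name ``classify_version`` and
the wording of its output are made up for this illustration.

```shell
# Classify a version string as it is shown in the table on kernel.org.
classify_version() {
    case $1 in
        *-rc*)  echo "mainline pre-release" ;;
        *.*.*)  echo "stable or longterm release" ;;
        *.*)    echo "mainline release" ;;
        *)      echo "unrecognized version" ;;
    esac
}

classify_version 5.8-rc2
classify_version 5.7
```

When the line starting with mainline shows a version of the second kind, the
merge window is likely open, as explained above.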
0849
0850 How to obtain a fresh Linux kernel
0851 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0852
**Using a pre-compiled kernel**: This is often the quickest, easiest, and safest
way for testing, especially if you are unfamiliar with the Linux kernel. The
problem: most of those shipped by distributors or add-on repositories are built
from modified Linux sources. They are thus not vanilla and therefore often
unsuitable for testing and issue reporting: the changes might cause the issue
you face or influence it somehow.
0859
0860 But you are in luck if you are using a popular Linux distribution: for quite a
0861 few of them you'll find repositories on the net that contain packages with the
0862 latest mainline or stable Linux built as vanilla kernel. It's totally okay to
0863 use these, just make sure from the repository's description they are vanilla or
0864 at least close to it. Additionally ensure the packages contain the latest
0865 versions as offered on kernel.org. The packages are likely unsuitable if they
0866 are older than a week, as new mainline and stable kernels typically get released
0867 at least once a week.
0868
0869 Please note that you might need to build your own kernel manually later: that's
0870 sometimes needed for debugging or testing fixes, as described later in this
0871 document. Also be aware that pre-compiled kernels might lack debug symbols that
0872 are needed to decode messages the kernel prints when a panic, Oops, warning, or
0873 BUG occurs; if you plan to decode those, you might be better off compiling a
0874 kernel yourself (see the end of this subsection and the section titled 'Decode
0875 failure messages' for details).
0876
0877 **Using git**: Developers and experienced Linux users familiar with git are
0878 often best served by obtaining the latest Linux kernel sources straight from the
0879 `official development repository on kernel.org
0880 <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/>`_.
0881 Those are likely a bit ahead of the latest mainline pre-release. Don't worry
0882 about it: they are as reliable as a proper pre-release, unless the kernel's
0883 development cycle is currently in the middle of a merge window. But even then
0884 they are quite reliable.
0885
0886 **Conventional**: People unfamiliar with git are often best served by
0887 downloading the sources as tarball from `kernel.org <https://kernel.org/>`_.
0888
How to actually build a kernel is not described here, as many websites explain
the necessary steps already. If you are new to it, consider following one of
those how-to's that suggest using ``make localmodconfig``, as that tries to
pick up the configuration of your current kernel and then tries to adjust it
somewhat for your system. That does not make the resulting kernel any better,
but quicker to compile.
0895
0896 Note: If you are dealing with a panic, Oops, warning, or BUG from the kernel,
0897 please try to enable CONFIG_KALLSYMS when configuring your kernel.
0898 Additionally, enable CONFIG_DEBUG_KERNEL and CONFIG_DEBUG_INFO, too; the
0899 latter is the relevant one of those two, but can only be reached if you enable
0900 the former. Be aware CONFIG_DEBUG_INFO increases the storage space required to
0901 build a kernel by quite a bit. But that's worth it, as these options will allow
0902 you later to pinpoint the exact line of code that triggers your issue. The
0903 section 'Decode failure messages' below explains this in more detail.
0904
0905 But keep in mind: Always keep a record of the issue encountered in case it is
0906 hard to reproduce. Sending an undecoded report is better than not reporting
0907 the issue at all.
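
To verify that the three options named in the note above made it into your
build configuration, you can check the resulting '.config' file. This is a
minimal sketch; the function name ``check_debug_config`` is made up for this
illustration.

```shell
# Report which of the options needed for decoding are enabled in a kernel
# build configuration. Returns non-zero if at least one of them is missing.
check_debug_config() {
    missing=0
    for opt in CONFIG_KALLSYMS CONFIG_DEBUG_KERNEL CONFIG_DEBUG_INFO; do
        if grep -q "^$opt=y\$" "$1"; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
            missing=1
        fi
    done
    return "$missing"
}
```

Run it in the directory where you build your kernel with
``check_debug_config .config``.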
0908
0909
0910 Check 'taint' flag
0911 ------------------
0912
0913 *Ensure the kernel you just installed does not 'taint' itself when
0914 running.*
0915
As outlined above in more detail already: the kernel sets a 'taint' flag when
something happens that can lead to follow-up errors that look totally
unrelated. That's why you need to check that the kernel you just installed does
not set this flag. And if it does, you in almost all cases need to eliminate
the reason for it before reporting issues that occur with it. See the section
above for details on how to do that.
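
The kernel exposes the flag as a number in ``/proc/sys/kernel/tainted``, where
'0' means it is not tainted. The sketch below names only the four reasons this
document discusses, using the bit numbers from
Documentation/admin-guide/tainted-kernels.rst; other bits are ignored, and the
function name ``taint_reasons`` is made up for this illustration.

```shell
# Print the taint reasons discussed in this document for a given taint value.
# Returns zero only if the kernel is not tainted at all.
taint_reasons() {
    value=$1
    if [ "$value" -eq 0 ]; then
        echo "not tainted"
        return 0
    fi
    # Each line below holds: bit number, letter shown in the log, description.
    while read -r bit letter reason; do
        if [ $(( ($value >> $bit) & 1 )) -eq 1 ]; then
            echo "$letter: $reason"
        fi
    done <<'EOF'
0 P proprietary module was loaded
7 D kernel died recently (Oops or BUG)
10 C staging driver was loaded
12 O externally-built module was loaded
EOF
    return 1
}
```

On the system you are testing you could call it like this:
``taint_reasons "$(cat /proc/sys/kernel/tainted)"``.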
0922
0923
0924 Reproduce issue with the fresh kernel
0925 -------------------------------------
0926
0927 *Reproduce the issue with the kernel you just installed. If it doesn't show
0928 up there, scroll down to the instructions for issues only happening with
0929 stable and longterm kernels.*
0930
Check if the issue occurs with the fresh Linux kernel version you just
installed. If it was fixed there already, consider sticking with this version
line and abandoning your plan to report the issue. But keep in mind that other
users might still be plagued by it, as long as it's not fixed in the stable
and longterm versions from kernel.org (and thus in vendor kernels derived from
those). If you prefer to use one of those or just want to help their users,
head over to the section "Details about reporting issues only occurring in
older kernel version lines" below.
0939
0940
0941 Optimize description to reproduce issue
0942 ---------------------------------------
0943
0944 *Optimize your notes: try to find and write the most straightforward way to
0945 reproduce your issue. Make sure the end result has all the important
0946 details, and at the same time is easy to read and understand for others
0947 that hear about it for the first time. And if you learned something in this
0948 process, consider searching again for existing reports about the issue.*
0949
An unnecessarily complex report will make it hard for others to understand
it. Thus try to find a reproducer that's straightforward to describe and
thus easy to understand in written form. Include all important details, but at
the same time try to keep it as short as possible.

In this and the previous steps you likely have learned a thing or two about the
issue you face. Use this knowledge and search again for existing reports
you can join instead.
0958
0959
0960 Decode failure messages
0961 -----------------------
0962
0963 *If your failure involves a 'panic', 'Oops', 'warning', or 'BUG', consider
0964 decoding the kernel log to find the line of code that triggered the error.*
0965
When the kernel detects an internal problem, it will log some information about
the executed code. This makes it possible to pinpoint the exact line in the
source code that triggered the issue and shows how it was called. But that only
works if you enabled CONFIG_DEBUG_INFO and CONFIG_KALLSYMS when configuring
your kernel. If you did so, consider decoding the information from the
kernel's log. That will make it a lot easier to understand what led to the
'panic', 'Oops', 'warning', or 'BUG', which increases the chances that someone
can provide a fix.
0974
0975 Decoding can be done with a script you find in the Linux source tree. If you
0976 are running a kernel you compiled yourself earlier, call it like this::
0977
0978 [user@something ~]$ sudo dmesg | ./linux-5.10.5/scripts/decode_stacktrace.sh ./linux-5.10.5/vmlinux
0979
0980 If you are running a packaged vanilla kernel, you will likely have to install
0981 the corresponding packages with debug symbols. Then call the script (which you
0982 might need to get from the Linux sources if your distro does not package it)
0983 like this::
0984
0985 [user@something ~]$ sudo dmesg | ./linux-5.10.5/scripts/decode_stacktrace.sh \
0986 /usr/lib/debug/lib/modules/5.10.10-4.1.x86_64/vmlinux /usr/src/kernels/5.10.10-4.1.x86_64/
0987
0988 The script will work on log lines like the following, which show the address of
0989 the code the kernel was executing when the error occurred::
0990
0991 [ 68.387301] RIP: 0010:test_module_init+0x5/0xffa [test_module]
0992
0993 Once decoded, these lines will look like this::
0994
0995 [ 68.387301] RIP: 0010:test_module_init (/home/username/linux-5.10.5/test-module/test-module.c:16) test_module
0996
In this case the executed code was built from the file
'~/linux-5.10.5/test-module/test-module.c' and the error was triggered by the
instructions found in line '16'.
1000
1001 The script will similarly decode the addresses mentioned in the section
1002 starting with 'Call trace', which show the path to the function where the
1003 problem occurred. Additionally, the script will show the assembler output for
1004 the code section the kernel was executing.
1005
1006 Note, if you can't get this to work, simply skip this step and mention the
1007 reason for it in the report. If you're lucky, it might not be needed. And if it
1008 is, someone might help you to get things going. Also be aware this is just one
1009 of several ways to decode kernel stack traces. Sometimes different steps will
1010 be required to retrieve the relevant details. Don't worry about that, if that's
1011 needed in your case, developers will tell you what to do.
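
To check quickly whether a saved log contains undecoded lines of the kind
shown above, you can extract the function and module names from them. This is
only a sketch that assumes the 'RIP:' line format from the example; the
function name ``rip_symbols`` is made up for this illustration.

```shell
# Read a kernel log on stdin and print "<function> <module>" for every
# undecoded RIP line; the module part stays empty for built-in code.
rip_symbols() {
    sed -nE 's/^.*RIP: [0-9a-f]+:([^+ ]+)\+0x[0-9a-f]+\/0x[0-9a-f]+( \[([^]]+)\])?.*$/\1 \3/p'
}
```

A call could look like ``sudo dmesg | rip_symbols``.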
1012
1013
1014 Special care for regressions
1015 ----------------------------
1016
1017 *If your problem is a regression, try to narrow down when the issue was
1018 introduced as much as possible.*
1019
Linux lead developer Linus Torvalds insists that the Linux kernel never
worsens; that's why he deems regressions unacceptable and wants to see them
fixed quickly. That's why changes that introduced a regression are often
promptly reverted if the issue they cause can't get solved quickly any other
way. Reporting a regression is thus a bit like playing a kind of trump card to
get something quickly fixed. But for that to happen the change that's causing
the regression needs to be known. Normally it's up to the reporter to track
down the culprit, as maintainers often won't have the time or setup at hand to
reproduce it themselves.
1029
1030 To find the change there is a process called 'bisection' which the document
1031 Documentation/admin-guide/bug-bisect.rst describes in detail. That process
1032 will often require you to build about ten to twenty kernel images, trying to
1033 reproduce the issue with each of them before building the next. Yes, that takes
1034 some time, but don't worry, it works a lot quicker than most people assume.
1035 Thanks to a 'binary search' this will lead you to the one commit in the source
1036 code management system that's causing the regression. Once you find it, search
1037 the net for the subject of the change, its commit id and the shortened commit id
1038 (the first 12 characters of the commit id). This will lead you to existing
1039 reports about it, if there are any.
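
The number of kernels to build stays small because a binary search halves the
range of suspicious commits with every test. The following sketch estimates
the number of steps for a given number of commits; the function name
``bisect_steps`` is made up, and the real number of steps can differ slightly.

```shell
# Estimate how many kernels a bisection over the given number of commits
# needs: every test halves the range, so the result is log2(N), rounded up.
bisect_steps() {
    commits=$1
    steps=0
    span=1
    while [ "$span" -lt "$commits" ]; do
        span=$(( $span * 2 ))
        steps=$(( $steps + 1 ))
    done
    echo "$steps"
}

bisect_steps 16000   # prints 14
```

In a git clone of the Linux sources, ``git rev-list --count v5.7..v5.8`` tells
you how many commits lie between two releases.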
1040
Note, a bisection needs a bit of know-how, which not everyone has, and quite a
bit of effort, which not everyone is willing to invest. Nevertheless, performing
a bisection yourself is highly recommended. If you really can't or don't want
to go down that route, at least find out which mainline kernel introduced the
regression. If something for example breaks when switching from 5.5.15 to
5.8.4, then try at least all the mainline releases in that area (5.6, 5.7 and
5.8) to check when it first showed up. Unless you're trying to find a
regression in a stable or longterm kernel, avoid testing versions whose number
has three sections (5.6.12, 5.7.8), as that makes the outcome hard to
interpret, which might render your testing useless. Once you found the major
version which introduced the regression, feel free to move on in the reporting
process. But keep in mind: it depends on the issue at hand if the developers
will be able to help without knowing the culprit. Sometimes they might
recognize from the report what went wrong and can fix it; other times they will
be unable to help unless you perform a bisection.
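
The list of mainline releases to try can be derived from the two versions
between which the problem appeared. This is a minimal sketch that assumes both
versions share the same first number; the function name ``mainline_between``
is made up for this illustration.

```shell
# Print the mainline releases to test when something broke between an older
# (first argument) and a newer (second argument) kernel version.
mainline_between() {
    major=${1%%.*}
    first=$(printf '%s\n' "$1" | cut -d. -f2)
    last=$(printf '%s\n' "$2" | cut -d. -f2)
    minor=$(( $first + 1 ))
    while [ "$minor" -le "$last" ]; do
        echo "$major.$minor"
        minor=$(( $minor + 1 ))
    done
}

# The example from the text above; prints 5.6, 5.7 and 5.8 on separate lines.
mainline_between 5.5.15 5.8.4
```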
1056
1057 When dealing with regressions make sure the issue you face is really caused by
1058 the kernel and not by something else, as outlined above already.
1059
1060 In the whole process keep in mind: an issue only qualifies as regression if the
1061 older and the newer kernel got built with a similar configuration. This can be
1062 achieved by using ``make olddefconfig``, as explained in more detail by
1063 Documentation/admin-guide/reporting-regressions.rst; that document also
1064 provides a good deal of other information about regressions you might want to be
1065 aware of.
1066
1067
1068 Write and send the report
1069 -------------------------
1070
1071 *Start to compile the report by writing a detailed description about the
1072 issue. Always mention a few things: the latest kernel version you installed
1073 for reproducing, the Linux Distribution used, and your notes on how to
1074 reproduce the issue. Ideally, make the kernel's build configuration
1075 (.config) and the output from ``dmesg`` available somewhere on the net and
1076 link to it. Include or upload all other information that might be relevant,
1077 like the output/screenshot of an Oops or the output from ``lspci``. Once
1078 you wrote this main part, insert a normal length paragraph on top of it
1079 outlining the issue and the impact quickly. On top of this add one sentence
1080 that briefly describes the problem and gets people to read on. Now give the
1081 thing a descriptive title or subject that yet again is shorter. Then you're
1082 ready to send or file the report like the MAINTAINERS file told you, unless
1083 you are dealing with one of those 'issues of high priority': they need
1084 special care which is explained in 'Special handling for high priority
1085 issues' below.*
1086
1087 Now that you have prepared everything it's time to write your report. How to do
1088 that is partly explained by the three documents linked to in the preface above.
1089 That's why this text will only mention a few of the essentials as well as
1090 things specific to the Linux kernel.
1091
1092 There is one thing that fits both categories: the most crucial parts of your
1093 report are the title/subject, the first sentence, and the first paragraph.
1094 Developers often get quite a lot of mail. They thus often just take a few
1095 seconds to skim a mail before deciding to move on or look closer. Thus: the
1096 better the top section of your report, the higher are the chances that someone
will look into it and help you. And that is why you should set these parts
aside for now and write the detailed report first. ;-)
1099
1100 Things each report should mention
1101 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1102
1103 Describe in detail how your issue happens with the fresh vanilla kernel you
1104 installed. Try to include the step-by-step instructions you wrote and optimized
1105 earlier that outline how you and ideally others can reproduce the issue; in
1106 those rare cases where that's impossible try to describe what you did to
1107 trigger it.
1108
1109 Also include all the relevant information others might need to understand the
1110 issue and its environment. What's actually needed depends a lot on the issue,
1111 but there are some things you should include always:
1112
1113 * the output from ``cat /proc/version``, which contains the Linux kernel
1114 version number and the compiler it was built with.
1115
1116 * the Linux distribution the machine is running (``hostnamectl | grep
1117 "Operating System"``)
1118
1119 * the architecture of the CPU and the operating system (``uname -mi``)
1120
1121 * if you are dealing with a regression and performed a bisection, mention the
1122 subject and the commit-id of the change that is causing it.
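
The commands from the list above can be bundled into a small helper, sketched
here as a shell function. The function name and the labels it prints are
arbitrary choices; ``hostnamectl`` is assumed to be available, which is only
the case on systemd-based distributions.

```shell
# Sketch: print the basics every report should mention. A command that fails
# merely leaves its line empty; its error message goes to stderr.
report_basics() {
    printf 'Kernel: %s\n' "$(cat /proc/version)"
    printf 'Distribution: %s\n' "$(hostnamectl | grep 'Operating System')"
    printf 'Architecture: %s\n' "$(uname -mi)"
}
```

Calling ``report_basics > basics.txt`` (the file name is just an example)
leaves you with a file whose contents can be pasted into the report.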
1123
1124 In a lot of cases it's also wise to make two more things available to those
1125 that read your report:
1126
1127 * the configuration used for building your Linux kernel (the '.config' file)
1128
1129 * the kernel's messages that you get from ``dmesg`` written to a file. Make
1130 sure that it starts with a line like 'Linux version 5.8-1
1131 (foobar@example.com) (gcc (GCC) 10.2.1, GNU ld version 2.34) #1 SMP Mon Aug
3 14:54:37 UTC 2020'. If it's missing, then important messages from the first
1133 boot phase already got discarded. In this case instead consider using
1134 ``journalctl -b 0 -k``; alternatively you can also reboot, reproduce the
1135 issue and call ``dmesg`` right afterwards.
1136
These two files are big; that's why it's a bad idea to put them directly into
1138 your report. If you are filing the issue in a bug tracker then attach them to
1139 the ticket. If you report the issue by mail do not attach them, as that makes
1140 the mail too large; instead do one of these things:
1141
1142 * Upload the files somewhere public (your website, a public file paste
1143 service, a ticket created just for this purpose on `bugzilla.kernel.org
1144 <https://bugzilla.kernel.org/>`_, ...) and include a link to them in your
1145 report. Ideally use something where the files stay available for years, as
1146 they could be useful to someone many years from now; this for example can
1147 happen if five or ten years from now a developer works on some code that was
1148 changed just to fix your issue.
1149
1150 * Put the files aside and mention you will send them later in individual
1151 replies to your own mail. Just remember to actually do that once the report
1152 went out. ;-)
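
Whether a saved log still begins with the kernel's first message, as asked
for above, can be checked mechanically. A minimal sketch, assuming the log was
written to a file such as ``dmesg.txt`` (a name chosen for this example):

```shell
# Succeeds if the first line of the saved log is the 'Linux version' line.
log_is_complete() {
    # The line usually carries a timestamp prefix like "[    0.000000]", so
    # search within the line instead of anchoring at its start.
    head -n 1 "$1" | grep -q 'Linux version '
}

# Typical use (not executed here):
#   dmesg > dmesg.txt
#   log_is_complete dmesg.txt || journalctl -b 0 -k > dmesg.txt
```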
1153
1154 Things that might be wise to provide
1155 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1156
Depending on the issue you might need to add more background data. Here are a
few suggestions on what is often good to provide:
1159
1160 * If you are dealing with a 'warning', an 'OOPS' or a 'panic' from the kernel,
1161 include it. If you can't copy'n'paste it, try to capture a netconsole trace
1162 or at least take a picture of the screen.
1163
1164 * If the issue might be related to your computer hardware, mention what kind
1165 of system you use. If you for example have problems with your graphics card,
mention its manufacturer, the card's model, and what chip it uses. If it's a
1167 laptop mention its name, but try to make sure it's meaningful. 'Dell XPS 13'
1168 for example is not, because it might be the one from 2012; that one looks
1169 not that different from the one sold today, but apart from that the two have
nothing in common. Hence, in such cases add the exact model number, which
for example is '9380' or '7390' for XPS 13 models introduced during 2019.
1172 Names like 'Lenovo Thinkpad T590' are also somewhat ambiguous: there are
1173 variants of this laptop with and without a dedicated graphics chip, so try
1174 to find the exact model name or specify the main components.
1175
1176 * Mention the relevant software in use. If you have problems with loading
1177 modules, you want to mention the versions of kmod, systemd, and udev in use.
1178 If one of the DRM drivers misbehaves, you want to state the versions of
1179 libdrm and Mesa; also specify your Wayland compositor or the X-Server and
1180 its driver. If you have a filesystem issue, mention the version of
1181 corresponding filesystem utilities (e2fsprogs, btrfs-progs, xfsprogs, ...).
1182
1183 * Gather additional information from the kernel that might be of interest. The
1184 output from ``lspci -nn`` will for example help others to identify what
1185 hardware you use. If you have a problem with hardware you even might want to
1186 make the output from ``sudo lspci -vvv`` available, as that provides
insights into how the components were configured. For some issues it might be
good to include the contents of files like ``/proc/cpuinfo``,
``/proc/ioports``, ``/proc/iomem``, ``/proc/modules``, or
``/proc/scsi/scsi``. Some subsystems also offer tools to collect relevant
1191 information. One such tool is ``alsa-info.sh`` `which the audio/sound
1192 subsystem developers provide <https://www.alsa-project.org/wiki/AlsaInfo>`_.
1193
Those examples should give you some ideas of what data might be wise to
attach, but you have to think for yourself what will be helpful for others to
know.
1196 Don't worry too much about forgetting something, as developers will ask for
1197 additional details they need. But making everything important available from
1198 the start increases the chance someone will take a closer look.
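
Some of the data mentioned above can be gathered in one go. The following is
only a sketch: the function name, the output directory, and the file names are
invented for this example, and the selection of ``/proc`` files should be
adjusted to the issue at hand.

```shell
# Sketch: store the output of 'lspci -nn' and a few /proc files in a directory.
collect_hw_info() {
    outdir=$1
    mkdir -p "$outdir" || return 1
    # If lspci is not installed, the file will hold the error message instead.
    lspci -nn > "$outdir/lspci-nn.txt" 2>&1 || true
    for name in cpuinfo ioports iomem modules; do
        # Copy only those files this kernel actually provides.
        if [ -r "/proc/$name" ]; then
            cat "/proc/$name" > "$outdir/proc-$name.txt"
        fi
    done
    return 0
}
```

Look through the resulting files before uploading them and leave out anything
that is irrelevant for the issue.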
1199
1200
1201 The important part: the head of your report
1202 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1203
1204 Now that you have the detailed part of the report prepared let's get to the
1205 most important section: the first few sentences. Thus go to the top, add
1206 something like 'The detailed description:' before the part you just wrote and
1207 insert two newlines at the top. Now write one normal length paragraph that
1208 describes the issue roughly. Leave out all boring details and focus on the
1209 crucial parts readers need to know to understand what this is all about; if you
1210 think this bug affects a lot of users, mention this to get people interested.
1211
Once you have done that, insert two more lines at the top and write a
one-sentence summary that explains quickly what the report is about. After that
you have to get even more abstract and write an even shorter subject/title for
the report.
1215
Now that you have written this part take some time to optimize it, as it is the
most important part of your report: a lot of people will only read this before
1218 they decide if reading the rest is time well spent.
1219
1220 Now send or file the report like the :ref:`MAINTAINERS <maintainers>` file told
1221 you, unless it's one of those 'issues of high priority' outlined earlier: in
1222 that case please read the next subsection first before sending the report on
1223 its way.
1224
1225 Special handling for high priority issues
1226 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1227
1228 Reports for high priority issues need special handling.
1229
**Severe issues**: make sure the subject or ticket title as well as the first
paragraph make the severity obvious.
1232
1233 **Regressions**: make the report's subject start with '[REGRESSION]'.
1234
1235 In case you performed a successful bisection, use the title of the change that
1236 introduced the regression as the second part of your subject. Make the report
1237 also mention the commit id of the culprit. In case of an unsuccessful bisection,
1238 make your report mention the latest tested version that's working fine (say 5.7)
1239 and the oldest where the issue occurs (say 5.8-rc1).
1240
1241 When sending the report by mail, CC the Linux regressions mailing list
1242 (regressions@lists.linux.dev). In case the report needs to be filed to some web
1243 tracker, proceed to do so. Once filed, forward the report by mail to the
1244 regressions list; CC the maintainer and the mailing list for the subsystem in
1245 question. Make sure to inline the forwarded report, hence do not attach it.
1246 Also add a short note at the top where you mention the URL to the ticket.
1247
1248 When mailing or forwarding the report, in case of a successful bisection add the
1249 author of the culprit to the recipients; also CC everyone in the signed-off-by
1250 chain, which you find at the end of its commit message.
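
The subject line and the additional recipients for a bisected regression can
be derived from the culprit's commit message. The two helpers below are a
sketch with made-up names; the second one expects the commit message on
standard input, which ``git log -1 --format=%B`` can provide when run inside a
clone of the kernel sources.

```shell
# Build the subject: '[REGRESSION]' followed by the title of the culprit.
regression_subject() {
    printf '[REGRESSION] %s\n' "$1"
}

# Print the mail addresses found in the Signed-off-by lines of a commit
# message that is read from standard input.
signoff_addresses() {
    sed -n 's/^[[:space:]]*Signed-off-by:.*<\([^>]*\)>.*$/\1/p'
}

# Typical use (not executed here; replace <commit-id> with the culprit):
#   git log -1 --format=%B <commit-id> | signoff_addresses
```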
1251
**Security issues**: for these issues you will have to evaluate if a
1253 short-term risk to other users would arise if details were publicly disclosed.
1254 If that's not the case simply proceed with reporting the issue as described.
1255 For issues that bear such a risk you will need to adjust the reporting process
1256 slightly:
1257
1258 * If the MAINTAINERS file instructed you to report the issue by mail, do not
1259 CC any public mailing lists.
1260
1261 * If you were supposed to file the issue in a bug tracker make sure to mark
1262 the ticket as 'private' or 'security issue'. If the bug tracker does not
1263 offer a way to keep reports private, forget about it and send your report as
1264 a private mail to the maintainers instead.
1265
1266 In both cases make sure to also mail your report to the addresses the
1267 MAINTAINERS file lists in the section 'security contact'. Ideally directly CC
1268 them when sending the report by mail. If you filed it in a bug tracker, forward
1269 the report's text to these addresses; but on top of it put a small note where
1270 you mention that you filed it with a link to the ticket.
1271
1272 See Documentation/admin-guide/security-bugs.rst for more information.
1273
1274
1275 Duties after the report went out
1276 --------------------------------
1277
1278 *Wait for reactions and keep the thing rolling until you can accept the
1279 outcome in one way or the other. Thus react publicly and in a timely manner
1280 to any inquiries. Test proposed fixes. Do proactive testing: retest with at
1281 least every first release candidate (RC) of a new mainline version and
1282 report your results. Send friendly reminders if things stall. And try to
1283 help yourself, if you don't get any help or if it's unsatisfying.*
1284
1285 If your report was good and you are really lucky then one of the developers
1286 might immediately spot what's causing the issue; they then might write a patch
1287 to fix it, test it, and send it straight for integration in mainline while
1288 tagging it for later backport to stable and longterm kernels that need it. Then
1289 all you need to do is reply with a 'Thank you very much' and switch to a version
1290 with the fix once it gets released.
1291
1292 But this ideal scenario rarely happens. That's why the job is only starting
once you got the report out. What you'll have to do depends on the situation,
1294 but often it will be the things listed below. But before digging into the
1295 details, here are a few important things you need to keep in mind for this part
1296 of the process.
1297
1298
1299 General advice for further interactions
1300 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1301
1302 **Always reply in public**: When you filed the issue in a bug tracker, always
1303 reply there and do not contact any of the developers privately about it. For
1304 mailed reports always use the 'Reply-all' function when replying to any mails
1305 you receive. That includes mails with any additional data you might want to add
to your report: go to your mail application's 'Sent' folder and use 'reply-all'
on your mail with the report. This approach will make sure the public mailing
list(s) and everyone else that gets involved over time stays in the loop; it
also keeps the mail thread intact, which among other things is really important
for mailing lists to group all related mails together.
1311
1312 There are just two situations where a comment in a bug tracker or a 'Reply-all'
1313 is unsuitable:
1314
1315 * Someone tells you to send something privately.
1316
1317 * You were told to send something, but noticed it contains sensitive
1318 information that needs to be kept private. In that case it's okay to send it
1319 in private to the developer that asked for it. But note in the ticket or a
1320 mail that you did that, so everyone else knows you honored the request.
1321
1322 **Do research before asking for clarifications or help**: In this part of the
1323 process someone might tell you to do something that requires a skill you might
1324 not have mastered yet. For example, you might be asked to use some test tools
1325 you never have heard of yet; or you might be asked to apply a patch to the
1326 Linux kernel sources to test if it helps. In some cases it will be fine sending
a reply asking for instructions on how to do that. But before going that route
try to find the answer on your own by searching the internet; alternatively
consider asking in other places for advice. For example ask a friend or post
about it to a chatroom or forum you normally hang out in.
1331
1332 **Be patient**: If you are really lucky you might get a reply to your report
1333 within a few hours. But most of the time it will take longer, as maintainers
1334 are scattered around the globe and thus might be in a different time zone – one
1335 where they already enjoy their night away from keyboard.
1336
1337 In general, kernel developers will take one to five business days to respond to
1338 reports. Sometimes it will take longer, as they might be busy with the merge
1339 windows, other work, visiting developer conferences, or simply enjoying a long
1340 summer holiday.
1341
1342 The 'issues of high priority' (see above for an explanation) are an exception
1343 here: maintainers should address them as soon as possible; that's why you
1344 should wait a week at maximum (or just two days if it's something urgent)
1345 before sending a friendly reminder.
1346
Sometimes the maintainer might not be responding in a timely manner; other
times there might be disagreements, for example whether an issue qualifies as
a regression or not. In such cases raise your concerns on the mailing list and
ask others for public or private replies on how to move on. If that fails, it
1351 might be appropriate to get a higher authority involved. In case of a WiFi
1352 driver that would be the wireless maintainers; if there are no higher level
1353 maintainers or all else fails, it might be one of those rare situations where
1354 it's okay to get Linus Torvalds involved.
1355
1356 **Proactive testing**: Every time the first pre-release (the 'rc1') of a new
1357 mainline kernel version gets released, go and check if the issue is fixed there
or if anything of importance changed. Mention the outcome in the ticket or in a
mail you send as a reply to your report (make sure it has all those in the CC
that up to that point participated in the discussion). This will show your
1361 commitment and that you are willing to help. It also tells developers if the
1362 issue persists and makes sure they do not forget about it. A few other
1363 occasional retests (for example with rc3, rc5 and the final) are also a good
1364 idea, but only report your results if something relevant changed or if you are
1365 writing something anyway.
1366
1367 With all these general things off the table let's get into the details of how
1368 to help to get issues resolved once they were reported.
1369
1370 Inquires and testing request
1371 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1372
1373 Here are your duties in case you got replies to your report:
1374
1375 **Check who you deal with**: Most of the time it will be the maintainer or a
1376 developer of the particular code area that will respond to your report. But as
1377 issues are normally reported in public it could be anyone that's replying —
1378 including people that want to help, but in the end might guide you totally off
1379 track with their questions or requests. That rarely happens, but it's one of
1380 many reasons why it's wise to quickly run an internet search to see who you're
interacting with. By doing this you also become aware if your report was heard
by the right people, as a reminder to the maintainer (see below) might be in
order
1383 later if discussion fades out without leading to a satisfying solution for the
1384 issue.
1385
1386 **Inquiries for data**: Often you will be asked to test something or provide
1387 additional details. Try to provide the requested information soon, as you have
1388 the attention of someone that might help and risk losing it the longer you
1389 wait; that outcome is even likely if you do not provide the information within
1390 a few business days.
1391
1392 **Requests for testing**: When you are asked to test a diagnostic patch or a
possible fix, try to test it in a timely manner, too. But do it properly and
make sure to not rush it: mixing things up can happen easily and can lead to a
lot of confusion for everyone involved. A common mistake for example is thinking
a proposed patch with a fix was applied when in fact it wasn't. Things like that
1397 happen even to experienced testers occasionally, but they most of the time will
1398 notice when the kernel with the fix behaves just as one without it.
1399
1400 What to do when nothing of substance happens
1401 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1402
1403 Some reports will not get any reaction from the responsible Linux kernel
1404 developers; or a discussion around the issue evolved, but faded out with
1405 nothing of substance coming out of it.
1406
1407 In these cases wait two (better: three) weeks before sending a friendly
1408 reminder: maybe the maintainer was just away from keyboard for a while when
1409 your report arrived or had something more important to take care of. When
1410 writing the reminder, kindly ask if anything else from your side is needed to
get the ball rolling somehow. If the report got out by mail, do that in the
first lines of a mail that is a reply to your initial mail (see above) which
includes a full quote of the original report below: that's one of those few
1414 situations where such a 'TOFU' (Text Over, Fullquote Under) is the right
1415 approach, as then all the recipients will have the details at hand immediately
1416 in the proper order.
1417
1418 After the reminder wait three more weeks for replies. If you still don't get a
1419 proper reaction, you first should reconsider your approach. Did you maybe try
1420 to reach out to the wrong people? Was the report maybe offensive or so
1421 confusing that people decided to completely stay away from it? The best way to
1422 rule out such factors: show the report to one or two people familiar with FLOSS
issue reporting and ask for their opinion. Also ask them for advice on how to
move forward. That might mean: prepare a better report and make those people
1425 review it before you send it out. Such an approach is totally fine; just
1426 mention that this is the second and improved report on the issue and include a
1427 link to the first report.
1428
1429 If the report was proper you can send a second reminder; in it ask for advice
1430 why the report did not get any replies. A good moment for this second reminder
1431 mail is shortly after the first pre-release (the 'rc1') of a new Linux kernel
1432 version got published, as you should retest and provide a status update at that
1433 point anyway (see above).
1434
1435 If the second reminder again results in no reaction within a week, try to
1436 contact a higher-level maintainer asking for advice: even busy maintainers by
1437 then should at least have sent some kind of acknowledgment.
1438
1439 Remember to prepare yourself for a disappointment: maintainers ideally should
1440 react somehow to every issue report, but they are only obliged to fix those
'issues of high priority' outlined earlier. So don't be too devastated if you
1442 get a reply along the lines of 'thanks for the report, I have more important
1443 issues to deal with currently and won't have time to look into this for the
1444 foreseeable future'.
1445
1446 It's also possible that after some discussion in the bug tracker or on a list
1447 nothing happens anymore and reminders don't help to motivate anyone to work out
a fix. Such situations can be devastating, but are in the cards when it
1449 comes to Linux kernel development. This and several other reasons for not
1450 getting help are explained in 'Why some issues won't get any reaction or remain
1451 unfixed after being reported' near the end of this document.
1452
1453 Don't get devastated if you don't find any help or if the issue in the end does
1454 not get solved: the Linux kernel is FLOSS and thus you can still help yourself.
1455 You for example could try to find others that are affected and team up with
1456 them to get the issue resolved. Such a team could prepare a fresh report
together that mentions how many you are and why this is something that in your
opinion should get fixed. Maybe together you can also narrow down the root cause
1459 or the change that introduced a regression, which often makes developing a fix
1460 easier. And with a bit of luck there might be someone in the team that knows a
1461 bit about programming and might be able to write a fix.
1462
1463
1464 Reference for "Reporting regressions within a stable and longterm kernel line"
1465 ------------------------------------------------------------------------------
1466
1467 This subsection provides details for the steps you need to perform if you face
a regression within a stable or longterm kernel line.
1469
1470 Make sure the particular version line still gets support
1471 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1472
1473 *Check if the kernel developers still maintain the Linux kernel version
1474 line you care about: go to the front page of kernel.org and make sure it
1475 mentions the latest release of the particular version line without an
1476 '[EOL]' tag.*
1477
1478 Most kernel version lines only get supported for about three months, as
1479 maintaining them longer is quite a lot of work. Hence, only one per year is
1480 chosen and gets supported for at least two years (often six). That's why you
1481 need to check if the kernel developers still support the version line you care
1482 for.
1483
1484 Note, if kernel.org lists two stable version lines on the front page, you
1485 should consider switching to the newer one and forget about the older one:
support for it is likely to be abandoned soon. Then it will get an "end-of-life"
1487 (EOL) stamp. Version lines that reached that point still get mentioned on the
1488 kernel.org front page for a week or two, but are unsuitable for testing and
1489 reporting.
1490
1491 Search stable mailing list
1492 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1493
1494 *Check the archives of the Linux stable mailing list for existing reports.*
1495
Maybe the issue you face is already known and was fixed or is about to be.
Hence, `search the archives of the Linux stable mailing list
1498 <https://lore.kernel.org/stable/>`_ for reports about an issue like yours. If
1499 you find any matches, consider joining the discussion, unless the fix is
1500 already finished and scheduled to get applied soon.
1501
1502 Reproduce issue with the newest release
1503 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1504
1505 *Install the latest release from the particular version line as a vanilla
1506 kernel. Ensure this kernel is not tainted and still shows the problem, as
1507 the issue might have already been fixed there. If you first noticed the
problem with a vendor kernel, check that a vanilla build of the last version
known to work performs fine as well.*
1510
1511 Before investing any more time in this process you want to check if the issue
was already fixed in the latest release of the version line you're interested
in. This kernel needs to be vanilla and shouldn't be tainted before the issue
happens, as outlined in detail above in the section "Install a fresh kernel for
testing".
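
A quick way to check for taint, assuming the kernel exposes its taint state
in ``/proc/sys/kernel/tainted`` with ``0`` meaning 'not tainted'. The function
name is made up, and the optional file argument only exists to make this
sketch easy to try out:

```shell
# Succeeds (exit status 0) if the kernel is tainted. An unreadable file is
# treated as 'tainted' as well, to err on the side of caution.
kernel_is_tainted() {
    taint_file=${1:-/proc/sys/kernel/tainted}
    [ "$(cat "$taint_file")" != "0" ]
}

# Typical use (not executed here):
#   kernel_is_tainted && echo "tainted: results might be unreliable"
```

Run the check right after booting and again after the issue occurred, as some
problems taint the kernel themselves.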
1516
1517 Did you first notice the regression with a vendor kernel? Then changes the
1518 vendor applied might be interfering. You need to rule that out by performing
1519 a recheck. Say something broke when you updated from 5.10.4-vendor.42 to
1520 5.10.5-vendor.43. Then after testing the latest 5.10 release as outlined in
1521 the previous paragraph check if a vanilla build of Linux 5.10.4 works fine as
well. If things are broken there, the issue does not qualify as an upstream
regression and you need to switch back to the main step-by-step guide to report
1524 the issue.
1525
1526 Report the regression
1527 ~~~~~~~~~~~~~~~~~~~~~
1528
1529 *Send a short problem report to the Linux stable mailing list
1530 (stable@vger.kernel.org) and CC the Linux regressions mailing list
1531 (regressions@lists.linux.dev); if you suspect the cause in a particular
1532 subsystem, CC its maintainer and its mailing list. Roughly describe the
1533 issue and ideally explain how to reproduce it. Mention the first version
1534 that shows the problem and the last version that's working fine. Then
1535 wait for further instructions.*
1536
1537 When reporting a regression that happens within a stable or longterm kernel
line (say when updating from 5.10.4 to 5.10.5) a brief report is enough at
first to get the issue reported quickly. Hence a rough description to the
1540 stable and regressions mailing list is all it takes; but in case you suspect
1541 the cause in a particular subsystem, CC its maintainers and its mailing list
1542 as well, because that will speed things up.
1543
1544 And note, it helps developers a great deal if you can specify the exact version
1545 that introduced the problem. Hence if possible within a reasonable time frame,
try to find that version using vanilla kernels. Let's assume something broke
when your distributor released an update from Linux kernel 5.10.5 to 5.10.8.
Then as
1548 instructed above go and check the latest kernel from that version line, say
1549 5.10.9. If it shows the problem, try a vanilla 5.10.5 to ensure that no patches
1550 the distributor applied interfere. If the issue doesn't manifest itself there,
1551 try 5.10.7 and then (depending on the outcome) 5.10.8 or 5.10.6 to find the
1552 first version where things broke. Mention it in the report and state that 5.10.9
1553 is still broken.
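
The order in which to test the releases in between follows a simple halving
rule. The helper below is a sketch with a made-up name; it works on the last
number of the version (the '5' in 5.10.5) and prints which release to try next,
or nothing once the first broken release is known.

```shell
# Arguments: last number of the newest release known to work and of the
# oldest release known to be broken, both from the same version line.
next_candidate() {
    good=$1
    bad=$2
    # Adjacent releases: 'bad' is the first broken one, nothing left to test.
    if [ $((bad - good)) -le 1 ]; then
        return 0
    fi
    # Pick the release in the middle, rounding up.
    echo $((good + (bad - good + 1) / 2))
}
```

For the example above ``next_candidate 5 9`` prints 7. If 5.10.7 works,
``next_candidate 7 9`` prints 8; if it is broken, ``next_candidate 5 7``
prints 6.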
1554
1555 What the previous paragraph outlines is basically a rough manual 'bisection'.
Once your report is out you might get asked to do a proper one, as it allows
pinpointing the exact change that causes the issue (which then can easily get
reverted to fix the issue quickly). Hence consider doing a proper bisection
right away if time permits. See the section 'Special care for regressions' and
the document Documentation/admin-guide/bug-bisect.rst for details on how to
1561 perform one. In case of a successful bisection add the author of the culprit to
1562 the recipients; also CC everyone in the signed-off-by chain, which you find at
1563 the end of its commit message.
1564
1565
1566 Reference for "Reporting issues only occurring in older kernel version lines"
1567 -----------------------------------------------------------------------------
1568
1569 This section provides details for the steps you need to take if you could not
1570 reproduce your issue with a mainline kernel, but want to see it fixed in older
1571 version lines (aka stable and longterm kernels).
1572
1573 Some fixes are too complex
1574 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1575
1576 *Prepare yourself for the possibility that going through the next few steps
1577 might not get the issue solved in older releases: the fix might be too big
1578 or risky to get backported there.*
1579
1580 Even small and seemingly obvious code-changes sometimes introduce new and
1581 totally unexpected problems. The maintainers of the stable and longterm kernels
1582 are very aware of that and thus only apply changes to these kernels that are
1583 within rules outlined in Documentation/process/stable-kernel-rules.rst.
1584
1585 Complex or risky changes for example do not qualify and thus only get applied
1586 to mainline. Other fixes are easy to get backported to the newest stable and
1587 longterm kernels, but too risky to integrate into older ones. So be aware the
1588 fix you are hoping for might be one of those that won't be backported to the
version line you care about. In that case you'll have no other choice than to
1590 live with the issue or switch to a newer Linux version, unless you want to
1591 patch the fix into your kernels yourself.
1592
1593 Common preparations
1594 ~~~~~~~~~~~~~~~~~~~
1595
*Perform the first three steps in the section "Reporting regressions within
a stable and longterm kernel line" above.*
1598
1599 You need to carry out a few steps already described in another section of this
1600 guide. Those steps will let you:
1601
1602 * Check if the kernel developers still maintain the Linux kernel version line
1603 you care about.
1604
* Search the Linux stable mailing list for existing reports.
1606
1607 * Check with the latest release.
1608
1609
1610 Check code history and search for existing discussions
1611 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1612
1613 *Search the Linux kernel version control system for the change that fixed
1614 the issue in mainline, as its commit message might tell you if the fix is
1615 scheduled for backporting already. If you don't find anything that way,
1616 search the appropriate mailing lists for posts that discuss such an issue
1617 or peer-review possible fixes; then check the discussions if the fix was
1618 deemed unsuitable for backporting. If backporting was not considered at
1619 all, join the newest discussion, asking if it's in the cards.*
1620
1621 In a lot of cases the issue you deal with will have happened with mainline, but
1622 got fixed there. The commit that fixed it would need to get backported as well
1623 to get the issue solved. That's why you want to search for it or any
discussions about it.
1625
1626 * First try to find the fix in the Git repository that holds the Linux kernel
1627 sources. You can do this with the web interfaces `on kernel.org
1628 <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/>`_
1629 or its mirror `on GitHub <https://github.com/torvalds/linux>`_; if you have
1630 a local clone you alternatively can search on the command line with ``git
1631 log --grep=<pattern>``.
1632
1633 If you find the fix, look if the commit message near the end contains a
1634 'stable tag' that looks like this:
1635
1636 Cc: <stable@vger.kernel.org> # 5.4+
1637
1638 If that's the case, the developer marked the fix safe for backporting to
1639 version line 5.4 and later. Most of the time it's getting applied there within
1640 two weeks, but sometimes it takes a bit longer.
1641
1642 * If the commit doesn't tell you anything or if you can't find the fix, look
1643 again for discussions about the issue. Search the net with your favorite
1644 internet search engine as well as the archives for the `Linux kernel
1645 developers mailing list <https://lore.kernel.org/lkml/>`_. Also read the
1646 section `Locate kernel area that causes the issue` above and follow the
1647 instructions to find the subsystem in question: its bug tracker or mailing
1648 list archive might have the answer you are looking for.
1649
1650 * If you see a proposed fix, search for it in the version control system as
1651 outlined above, as the commit might tell you if a backport can be expected.
1652
1653 * Check the discussions for any indicators the fix might be too risky to get
1654 backported to the version line you care about. If that's the case, you have
1655 to live with the issue or switch to the kernel version line where the fix
1656 got applied.
1657
1658 * If the fix doesn't contain a stable tag and backporting was not discussed,
1659 join the discussion: mention the version where you face the issue and that
1660 you would like to see it fixed, if suitable.
1661
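The stable tag check described in this step can be partly scripted. The
following is a minimal sketch, not an official tool: the function name
``stable_tags`` is made up for this example, and the pattern only matches the
common ``Cc: <stable@vger.kernel.org>`` form shown above, so tags written
differently will be missed. It reads a commit message from standard input and
prints any stable tag it finds.

.. code-block:: sh

   # Hypothetical helper: print the stable tags found in a commit message
   # that is read from standard input; print nothing if there are none.
   stable_tags() {
       # Match lines like "Cc: <stable@vger.kernel.org> # 5.4+",
       # ignoring case and leading whitespace. The "|| true" keeps the
       # function from reporting failure when no tag is present.
       grep -i '^[[:space:]]*Cc:.*stable@vger\.kernel\.org' || true
   }

With a local clone you could, for example, locate candidate commits with
``git log --grep=<pattern>`` and then feed one of them to the helper with
``git log -1 --format=%B <commit> | stable_tags``. Empty output only means the
commit carries no such tag; it does not tell you whether backporting was
discussed elsewhere.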
1662
1663 Ask for advice
1664 ~~~~~~~~~~~~~~
1665
1666 *One of the former steps should lead to a solution. If that doesn't work
1667 out, ask the maintainers for the subsystem that seems to be causing the
1668 issue for advice; CC the mailing list for the particular subsystem as well
1669 as the stable mailing list.*
1670
1671 If the previous three steps didn't get you closer to a solution, there is only
1672 one option left: ask for advice. Do that in a mail you send to the maintainers
1673 for the subsystem where the issue seems to have its roots; CC the mailing list
1674 for the subsystem as well as the stable mailing list (stable@vger.kernel.org).
1675
1676
1677 Why some issues won't get any reaction or remain unfixed after being reported
1678 =============================================================================
1679
1680 When reporting a problem to the Linux developers, be aware only 'issues of high
1681 priority' (regressions, security issues, severe problems) are definitely going
1682 to get resolved. The maintainers or, if all else fails, Linus Torvalds himself
1683 will make sure of that. They and the other kernel developers will fix a lot of
1684 other issues as well. But be aware that sometimes they can't or won't help; and
1685 sometimes there isn't even anyone to send a report to.
1686
1687 This is best explained with kernel developers that contribute to the Linux
1688 kernel in their spare time. Quite a few of the drivers in the kernel were
1689 written by such programmers, often because they simply wanted to make their
1690 hardware usable on their favorite operating system.
1691
1692 These programmers most of the time will happily fix problems other people
1693 report. But nobody can force them to do so, as they are contributing voluntarily.
1694
1695 Then there are situations where such developers really want to fix an issue,
1696 but can't: sometimes they lack hardware programming documentation to do so.
1697 This often happens when the publicly available docs are superficial or the
1698 driver was written with the help of reverse engineering.
1699
1700 Sooner or later spare time developers will also stop caring for the driver.
1701 Maybe their test hardware broke, got replaced by something more fancy, or is so
1702 old that it's something you don't find much outside of computer museums
1703 anymore. Sometimes developers stop caring for their code and Linux altogether,
1704 as something different in their life became way more important. In some cases
1705 nobody is willing to take over the job as maintainer – and nobody can be forced
1706 to, as contributing to the Linux kernel is done on a voluntary basis. Abandoned
1707 drivers nevertheless remain in the kernel: they are still useful for people and
1708 removing them would be a regression.
1709
1710 The situation is not that different with developers that are paid for their
1711 work on the Linux kernel. Those contribute most changes these days. But their
1712 employers sooner or later also stop caring for their code or make its
1713 programmers focus on other things. Hardware vendors for example earn their money
1714 mainly by selling new hardware; quite a few of them hence are not investing
1715 much time and energy in maintaining a Linux kernel driver for something they
1716 stopped selling years ago. Enterprise Linux distributors often care for a
1717 longer time period, but in new versions often leave support for old and rare
1718 hardware aside to limit the scope. Often spare time contributors take over once
1719 a company orphans some code, but as mentioned above: sooner or later they will
1720 leave the code behind, too.
1721
1722 Priorities are another reason why some issues are not fixed: maintainers quite
1723 often are forced to set those, as time to work on Linux is limited. That's true
1724 both for spare time and for the time employers grant their developers to
1725 spend on maintenance work on the upstream kernel. Sometimes maintainers also
1726 get overwhelmed with reports, even if a driver is working nearly perfectly. To
1727 not get completely stuck, the programmer thus might have no other choice than
1728 to prioritize issue reports and reject some of them.
1729
1730 But don't worry too much about all of this: a lot of drivers have active
1731 maintainers who are quite interested in fixing as many issues as possible.
1732
1733
1734 Closing words
1735 =============
1736
1737 Compared with other Free/Libre & Open Source Software it's hard to report
1738 issues to the Linux kernel developers: the length and complexity of this
1739 document and the implications between the lines illustrate that. But that's how
1740 it is for now. The main author of this text hopes documenting the state of the
1741 art will lay some groundwork to improve the situation over time.
1742
1743
1744 ..
1745 end-of-content
1746 ..
1747 This document is maintained by Thorsten Leemhuis <linux@leemhuis.info>. If
1748 you spot a typo or small mistake, feel free to let him know directly and
1749 he'll fix it. You are free to do the same in a mostly informal way if you
1750 want to contribute changes to the text, but for copyright reasons please CC
1751 linux-doc@vger.kernel.org and "sign-off" your contribution as
1752 Documentation/process/submitting-patches.rst outlines in the section "Sign
1753 your work - the Developer's Certificate of Origin".
1754 ..
1755 This text is available under GPL-2.0+ or CC-BY-4.0, as stated at the top
1756 of the file. If you want to distribute this text under CC-BY-4.0 only,
1757 please use "The Linux kernel developers" for author attribution and link
1758 this as source:
1759 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/Documentation/admin-guide/reporting-issues.rst
1760 ..
1761 Note: Only the content of this RST file as found in the Linux kernel sources
1762 is available under CC-BY-4.0, as versions of this text that were processed
1763 (for example by the kernel's build system) might contain content taken from
1764 files which use a more restrictive license.