public inbox for netdev@vger.kernel.org 
From: Luis Chamberlain <mcgrof@kernel•org>
To: jeyu@kernel•org, davem@davemloft•net, kuba@kernel•org
Cc: michael.chan@broadcom•com, dchickles@marvell•com,
	sburla@marvell•com, fmanlunas@marvell•com, aelior@marvell•com,
	GR-everest-linux-l2@marvell•com, kvalo@codeaurora•org,
	johannes@sipsolutions•net, akpm@linux-foundation•org,
	arnd@arndb•de, rostedt@goodmis•org, mingo@redhat•com,
	aquini@redhat•com, cai@lca•pw, dyoung@redhat•com, bhe@redhat•com,
	peterz@infradead•org, tglx@linutronix•de, gpiccoli@canonical•com,
	pmladek@suse•com, tiwai@suse•de, schlad@suse•de,
	andriy.shevchenko@linux•intel.com, derosier@gmail•com,
	keescook@chromium•org, daniel.vetter@ffwll•ch, will@kernel•org,
	mchehab+samsung@kernel•org, vkoul@kernel•org,
	mchehab+huawei@kernel•org, robh@kernel•org, mhiramat@kernel•org,
	sfr@canb•auug.org.au, linux@dominikbrodowski•net,
	glider@google•com, paulmck@kernel•org, elver@google•com,
	bauerman@linux•ibm.com, yamada.masahiro@socionext•com,
	samitolvanen@google•com, yzaikin@google•com, dvyukov@google•com,
	rdunlap@infradead•org, corbet@lwn•net, dianders@chromium•org,
	netdev@vger•kernel.org, linux-kernel@vger•kernel.org,
	linux-doc@vger•kernel.org, Luis Chamberlain <mcgrof@kernel•org>,
	linux-wireless@vger•kernel.org, ath10k@lists•infradead.org
Subject: [PATCH v3 5/8] ath10k: use new taint_firmware_crashed()
Date: Tue, 26 May 2020 14:58:12 +0000
Message-ID: <20200526145815.6415-6-mcgrof@kernel.org>
In-Reply-To: <20200526145815.6415-1-mcgrof@kernel.org>

Use the new taint_firmware_crashed() to annotate when the firmware
for a device driver crashes. When firmware crashes, the device can
become unresponsive, and recovery sometimes requires a driver
unload / reload and, in the worst case, a reboot.

Using a taint flag lets us annotate clearly when this happens.
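
As a rough illustration of the idea only, here is a userspace sketch of
how a firmware-crash taint bit could behave. It is not the helper added
earlier in this series: every demo_* name is hypothetical, the bit
number 18 is an assumption made for the sketch, and the real
taint_firmware_crashed() may be implemented differently.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Userspace model only. DEMO_TAINT_FIRMWARE_CRASH is a hypothetical bit
 * number chosen for this sketch; the real flag is defined by the series.
 */
#define DEMO_TAINT_FIRMWARE_CRASH 18

/* Stand-in for a global taint bitmask. */
static unsigned long demo_tainted_mask;

/* Set one taint bit. Bits are sticky: nothing in this sketch clears them. */
static void demo_add_taint(unsigned int flag)
{
	assert(flag < sizeof(demo_tainted_mask) * CHAR_BIT);
	demo_tainted_mask |= 1UL << flag;
}

/* Report whether a given taint bit is set. */
static bool demo_test_taint(unsigned int flag)
{
	assert(flag < sizeof(demo_tainted_mask) * CHAR_BIT);
	return (demo_tainted_mask >> flag) & 1UL;
}

/* Assumed shape of the helper: a thin wrapper that sets the crash bit. */
static void demo_taint_firmware_crashed(void)
{
	demo_add_taint(DEMO_TAINT_FIRMWARE_CRASH);
}

/* Mimics a driver crash handler: log the crash, then record the taint. */
static void demo_fw_crash_handler(const char *guid)
{
	fprintf(stderr, "firmware crashed! (guid %s)\n", guid);
	demo_taint_firmware_crashed();
}
```

In this model, calling demo_fw_crash_handler() more than once leaves the
mask unchanged after the first call, so the flag records that a crash
happened at least once rather than how many times.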

I have run into this situation with this driver with the latest
firmware as of today, May 21, 2020, using v5.6.0, leaving me in
a state in which my only option is to reboot. Removing and re-adding
the driver does not fix the situation. This is reported as kernel.org
bugzilla korg#207851 [0]. This isn't the first firmware crash reported;
others have been filed before, and none of these bugs have yet been
addressed [1] [2] [3]. Including my own, I see these firmware crash
reports:

  * korg#207851 [0]
  * korg#197013 [1]
  * korg#201237 [2]
  * korg#195987 [3]

[0] https://bugzilla.kernel.org/show_bug.cgi?id=207851
[1] https://bugzilla.kernel.org/show_bug.cgi?id=197013
[2] https://bugzilla.kernel.org/show_bug.cgi?id=201237
[3] https://bugzilla.kernel.org/show_bug.cgi?id=195987

Cc: linux-wireless@vger•kernel.org
Cc: ath10k@lists•infradead.org
Cc: Kalle Valo <kvalo@codeaurora•org>
Acked-by: Rafael Aquini <aquini@redhat•com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel•org>
---
 drivers/net/wireless/ath/ath10k/pci.c  | 2 ++
 drivers/net/wireless/ath/ath10k/sdio.c | 2 ++
 drivers/net/wireless/ath/ath10k/snoc.c | 1 +
 3 files changed, 5 insertions(+)

diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c
index 1d941d53fdc9..818c3acc2468 100644
--- a/drivers/net/wireless/ath/ath10k/pci.c
+++ b/drivers/net/wireless/ath/ath10k/pci.c
@@ -1767,6 +1767,7 @@ static void ath10k_pci_fw_dump_work(struct work_struct *work)
 		scnprintf(guid, sizeof(guid), "n/a");
 
 	ath10k_err(ar, "firmware crashed! (guid %s)\n", guid);
+	taint_firmware_crashed();
 	ath10k_print_driver_info(ar);
 	ath10k_pci_dump_registers(ar, crash_data);
 	ath10k_ce_dump_registers(ar, crash_data);
@@ -2837,6 +2838,7 @@ static int ath10k_pci_hif_power_up(struct ath10k *ar,
 	if (ret) {
 		if (ath10k_pci_has_fw_crashed(ar)) {
 			ath10k_warn(ar, "firmware crashed during chip reset\n");
+			taint_firmware_crashed();
 			ath10k_pci_fw_crashed_clear(ar);
 			ath10k_pci_fw_crashed_dump(ar);
 		}
diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c
index e2aff2254a40..8b2fc0b89be4 100644
--- a/drivers/net/wireless/ath/ath10k/sdio.c
+++ b/drivers/net/wireless/ath/ath10k/sdio.c
@@ -794,6 +794,7 @@ static int ath10k_sdio_mbox_proc_dbg_intr(struct ath10k *ar)
 
 	/* TODO: Add firmware crash handling */
 	ath10k_warn(ar, "firmware crashed\n");
+	taint_firmware_crashed();
 
 	/* read counter to clear the interrupt, the debug error interrupt is
 	 * counter 0.
@@ -915,6 +916,7 @@ static int ath10k_sdio_mbox_proc_cpu_intr(struct ath10k *ar)
 	if (cpu_int_status & MBOX_CPU_STATUS_ENABLE_ASSERT_MASK) {
 		ath10k_err(ar, "firmware crashed!\n");
 		queue_work(ar->workqueue, &ar->restart_work);
+		taint_firmware_crashed();
 	}
 	return ret;
 }
diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
index 354d49b1cd45..071ee7607a4c 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.c
+++ b/drivers/net/wireless/ath/ath10k/snoc.c
@@ -1451,6 +1451,7 @@ void ath10k_snoc_fw_crashed_dump(struct ath10k *ar)
 		scnprintf(guid, sizeof(guid), "n/a");
 
 	ath10k_err(ar, "firmware crashed! (guid %s)\n", guid);
+	taint_firmware_crashed();
 	ath10k_print_driver_info(ar);
 	ath10k_msa_dump_memory(ar, crash_data);
 	mutex_unlock(&ar->dump_mutex);
-- 
2.26.2

