public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel•org>
To: linux-kernel@vger•kernel.org, stable@vger•kernel.org
Cc: Baochen Qiang <quic_bqiang@quicinc•com>,
	Kalle Valo <quic_kvalo@quicinc•com>,
	Sasha Levin <sashal@kernel•org>,
	kvalo@kernel•org, davem@davemloft•net, kuba@kernel•org,
	pabeni@redhat•com, ath11k@lists•infradead.org,
	linux-wireless@vger•kernel.org, netdev@vger•kernel.org
Subject: [PATCH AUTOSEL 5.17 086/149] ath11k: Fix frames flush failure caused by deadlock
Date: Fri,  1 Apr 2022 10:24:33 -0400	[thread overview]
Message-ID: <20220401142536.1948161-86-sashal@kernel.org> (raw)
In-Reply-To: <20220401142536.1948161-1-sashal@kernel.org>

From: Baochen Qiang <quic_bqiang@quicinc•com>

[ Upstream commit 261b07519518bd14cb168b287b17e1d195f8d0c8 ]

We are seeing below warnings:

kernel: [25393.301506] ath11k_pci 0000:01:00.0: failed to flush mgmt transmit queue 0
kernel: [25398.421509] ath11k_pci 0000:01:00.0: failed to flush mgmt transmit queue 0
kernel: [25398.421831] ath11k_pci 0000:01:00.0: dropping mgmt frame for vdev 0, is_started 0

this means ath11k fails to flush mgmt. frames because wmi_mgmt_tx_work
has no chance to run in 5 seconds.

By setting /proc/sys/kernel/hung_task_timeout_secs to 20 and increasing
ATH11K_FLUSH_TIMEOUT to 50 we get below warnings:

kernel: [  120.763160] INFO: task wpa_supplicant:924 blocked for more than 20 seconds.
kernel: [  120.763169]       Not tainted 5.10.90 #12
kernel: [  120.763177] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: [  120.763186] task:wpa_supplicant  state:D stack:    0 pid:  924 ppid:     1 flags:0x000043a0
kernel: [  120.763201] Call Trace:
kernel: [  120.763214]  __schedule+0x785/0x12fa
kernel: [  120.763224]  ? lockdep_hardirqs_on_prepare+0xe2/0x1bb
kernel: [  120.763242]  schedule+0x7e/0xa1
kernel: [  120.763253]  schedule_timeout+0x98/0xfe
kernel: [  120.763266]  ? run_local_timers+0x4a/0x4a
kernel: [  120.763291]  ath11k_mac_flush_tx_complete+0x197/0x2b1 [ath11k 13c3a9bf37790f4ac8103b3decf7ab4008ac314a]
kernel: [  120.763306]  ? init_wait_entry+0x2e/0x2e
kernel: [  120.763343]  __ieee80211_flush_queues+0x167/0x21f [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [  120.763378]  __ieee80211_recalc_idle+0x105/0x125 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [  120.763411]  ieee80211_recalc_idle+0x14/0x27 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [  120.763441]  ieee80211_free_chanctx+0x77/0xa2 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [  120.763473]  __ieee80211_vif_release_channel+0x100/0x131 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [  120.763540]  ieee80211_vif_release_channel+0x66/0x81 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [  120.763572]  ieee80211_destroy_auth_data+0xa3/0xe6 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [  120.763612]  ieee80211_mgd_deauth+0x178/0x29b [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [  120.763654]  cfg80211_mlme_deauth+0x1a8/0x22c [cfg80211 8945aa5bc2af5f6972336665d8ad6f9c191ad5be]
kernel: [  120.763697]  nl80211_deauthenticate+0xfa/0x123 [cfg80211 8945aa5bc2af5f6972336665d8ad6f9c191ad5be]
kernel: [  120.763715]  genl_rcv_msg+0x392/0x3c2
kernel: [  120.763750]  ? nl80211_associate+0x432/0x432 [cfg80211 8945aa5bc2af5f6972336665d8ad6f9c191ad5be]
kernel: [  120.763782]  ? nl80211_associate+0x432/0x432 [cfg80211 8945aa5bc2af5f6972336665d8ad6f9c191ad5be]
kernel: [  120.763802]  ? genl_rcv+0x36/0x36
kernel: [  120.763814]  netlink_rcv_skb+0x89/0xf7
kernel: [  120.763829]  genl_rcv+0x28/0x36
kernel: [  120.763840]  netlink_unicast+0x179/0x24b
kernel: [  120.763854]  netlink_sendmsg+0x393/0x401
kernel: [  120.763872]  sock_sendmsg+0x72/0x76
kernel: [  120.763886]  ____sys_sendmsg+0x170/0x1e6
kernel: [  120.763897]  ? copy_msghdr_from_user+0x7a/0xa2
kernel: [  120.763914]  ___sys_sendmsg+0x95/0xd1
kernel: [  120.763940]  __sys_sendmsg+0x85/0xbf
kernel: [  120.763956]  do_syscall_64+0x43/0x55
kernel: [  120.763966]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
kernel: [  120.763977] RIP: 0033:0x79089f3fcc83
kernel: [  120.763986] RSP: 002b:00007ffe604f0508 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
kernel: [  120.763997] RAX: ffffffffffffffda RBX: 000059b40e987690 RCX: 000079089f3fcc83
kernel: [  120.764006] RDX: 0000000000000000 RSI: 00007ffe604f0558 RDI: 0000000000000009
kernel: [  120.764014] RBP: 00007ffe604f0540 R08: 0000000000000004 R09: 0000000000400000
kernel: [  120.764023] R10: 00007ffe604f0638 R11: 0000000000000246 R12: 000059b40ea04980
kernel: [  120.764032] R13: 00007ffe604f0638 R14: 000059b40e98c360 R15: 00007ffe604f0558
...
kernel: [  120.765230] INFO: task kworker/u32:26:4239 blocked for more than 20 seconds.
kernel: [  120.765238]       Not tainted 5.10.90 #12
kernel: [  120.765245] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: [  120.765253] task:kworker/u32:26  state:D stack:    0 pid: 4239 ppid:     2 flags:0x00004080
kernel: [  120.765284] Workqueue: phy0 ieee80211_iface_work [mac80211]
kernel: [  120.765295] Call Trace:
kernel: [  120.765306]  __schedule+0x785/0x12fa
kernel: [  120.765316]  ? find_held_lock+0x3d/0xb2
kernel: [  120.765331]  schedule+0x7e/0xa1
kernel: [  120.765340]  schedule_preempt_disabled+0x15/0x1e
kernel: [  120.765349]  __mutex_lock_common+0x561/0xc0d
kernel: [  120.765375]  ? ieee80211_sta_work+0x3e/0x1232 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [  120.765390]  mutex_lock_nested+0x20/0x26
kernel: [  120.765416]  ieee80211_sta_work+0x3e/0x1232 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [  120.765430]  ? skb_dequeue+0x54/0x5e
kernel: [  120.765456]  ? ieee80211_iface_work+0x7b/0x339 [mac80211 335da900954f1c5ea7f1613d92088ce83342042c]
kernel: [  120.765485]  process_one_work+0x270/0x504
kernel: [  120.765501]  worker_thread+0x215/0x376
kernel: [  120.765514]  kthread+0x159/0x168
kernel: [  120.765526]  ? pr_cont_work+0x5b/0x5b
kernel: [  120.765536]  ? kthread_blkcg+0x31/0x31
kernel: [  120.765550]  ret_from_fork+0x22/0x30
...
kernel: [  120.765867] Showing all locks held in the system:
...
kernel: [  120.766164] 5 locks held by wpa_supplicant/924:
kernel: [  120.766172]  #0: ffffffffb1e63eb0 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x36
kernel: [  120.766197]  #1: ffffffffb1e5b1c8 (rtnl_mutex){+.+.}-{3:3}, at: nl80211_pre_doit+0x2a/0x15c [cfg80211]
kernel: [  120.766238]  #2: ffff99f08347cd08 (&wdev->mtx){+.+.}-{3:3}, at: nl80211_deauthenticate+0xde/0x123 [cfg80211]
kernel: [  120.766279]  #3: ffff99f09df12a48 (&local->mtx){+.+.}-{3:3}, at: ieee80211_destroy_auth_data+0x9b/0xe6 [mac80211]
kernel: [  120.766321]  #4: ffff99f09df12ce0 (&local->chanctx_mtx){+.+.}-{3:3}, at: ieee80211_vif_release_channel+0x5e/0x81 [mac80211]
...
kernel: [  120.766585] 3 locks held by kworker/u32:26/4239:
kernel: [  120.766593]  #0: ffff99f04458f948 ((wq_completion)phy0){+.+.}-{0:0}, at: process_one_work+0x19a/0x504
kernel: [  120.766621]  #1: ffffbad54b3cfe50 ((work_completion)(&sdata->work)){+.+.}-{0:0}, at: process_one_work+0x1c0/0x504
kernel: [  120.766649]  #2: ffff99f08347cd08 (&wdev->mtx){+.+.}-{3:3}, at: ieee80211_sta_work+0x3e/0x1232 [mac80211]

With above info the issue is clear: First wmi_mgmt_tx_work is inserted
to local->workqueue after sdata->work inserted, then wpa_supplicant
acquires wdev->mtx in nl80211_deauthenticate and finally calls
ath11k_mac_op_flush where it waits all mgmt. frames to be sent out by
wmi_mgmt_tx_work. Meanwhile, sdata->work is blocked by wdev->mtx in
ieee80211_sta_work, as a result wmi_mgmt_tx_work has no chance to run.

Change to use ab->workqueue instead of local->workqueue to fix this issue.

Signed-off-by: Baochen Qiang <quic_bqiang@quicinc•com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc•com>
Link: https://lore.kernel.org/r/20220217084545.18844-1-quic_bqiang@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel•org>
---
 drivers/net/wireless/ath/ath11k/mac.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c
index 07f499d5ec92..36be4ec969ad 100644
--- a/drivers/net/wireless/ath/ath11k/mac.c
+++ b/drivers/net/wireless/ath/ath11k/mac.c
@@ -5566,7 +5566,7 @@ static int ath11k_mac_mgmt_tx(struct ath11k *ar, struct sk_buff *skb,
 
 	skb_queue_tail(q, skb);
 	atomic_inc(&ar->num_pending_mgmt_tx);
-	ieee80211_queue_work(ar->hw, &ar->wmi_mgmt_tx_work);
+	queue_work(ar->ab->workqueue, &ar->wmi_mgmt_tx_work);
 
 	return 0;
 }
-- 
2.34.1


  parent reply	other threads:[~2022-04-01 14:35 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20220401142536.1948161-1-sashal@kernel.org>
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 002/149] Bluetooth: hci_sync: Fix compilation warning Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 003/149] ath5k: fix OOB in ath5k_eeprom_read_pcal_info_5111 Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 004/149] Bluetooth: fix null ptr deref on hci_sync_conn_complete_evt Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 009/149] Bluetooth: hci_event: Ignore multiple conn complete events Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 017/149] ptp: replace snprintf with sysfs_emit Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 018/149] Bluetooth: hci_sync: Fix queuing commands when HCI_UNREGISTER is set Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 019/149] selftests, xsk: Fix bpf_res cleanup test Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 020/149] net/mlx5e: TC, Hold sample_attr on stack instead of pointer Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 025/149] mlxsw: spectrum: Guard against invalid local ports Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 029/149] ath11k: fix kernel panic during unload/load ath11k modules Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 030/149] ath11k: pci: fix crash on suspend if board file is not found Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 031/149] ath11k: mhi: use mhi_sync_power_up() Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 032/149] net/smc: Send directly when TCP_CORK is cleared Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 034/149] bpf: Make dst_port field in struct bpf_sock 16-bit wide Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 039/149] mt76: mt7921: fix crash when startup fails Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 040/149] mt76: dma: initialize skip_unmap in mt76_dma_rx_fill Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 041/149] i40e: Add sending commands in atomic context Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 042/149] cfg80211: don't add non transmitted BSS to 6GHz scanned channels Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 043/149] libbpf: Fix build issue with llvm-readelf Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 044/149] ipv6: make mc_forwarding atomic Sasha Levin
2022-04-01 14:23 ` [PATCH AUTOSEL 5.17 046/149] net: initialize init_net earlier Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 060/149] libbpf: Fix accessing syscall arguments on powerpc Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 061/149] libbpf: Fix accessing the first syscall argument on arm64 Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 062/149] libbpf: Fix accessing the first syscall argument on s390 Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 064/149] tcp: Don't acquire inet_listen_hashbucket::lock with disabled BH Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 074/149] net/mlx5e: Disable TX queues before registering the netdev Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 078/149] iwlwifi: mvm: Correctly set fragmented EBS Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 079/149] iwlwifi: mvm: Passively scan non PSC channels only when requested so Sasha Levin
2022-04-01 14:52   ` Ben Greear
2022-04-09 14:04     ` Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 080/149] iwlwifi: fix small doc mistake for iwl_fw_ini_addr_val Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 081/149] iwlwifi: mvm: move only to an enabled channel Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 082/149] ipv6: annotate some data-races around sk->sk_prot Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 085/149] rtw89: fix RCU usage in rtw89_core_txq_push() Sasha Levin
2022-04-01 14:24 ` Sasha Levin [this message]
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 087/149] ipv4: Invalidate neighbour for broadcast address upon address addition Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 088/149] rtw88: change rtw_info() to proper message level Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 099/149] mt76: mt7915: fix injected MPDU transmission to not use HW A-MSDU Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 101/149] mctp: make __mctp_dev_get() take a refcount hold Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 103/149] mt76: mt7615: Fix assigning negative values to unsigned variable Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 109/149] net/smc: correct settings of RMB window update limit Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 111/149] iavf: stop leaking iavf_status as "errno" values Sasha Levin
2022-04-01 14:24 ` [PATCH AUTOSEL 5.17 112/149] macvtap: advertise link netns via netlink Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 114/149] tuntap: add sanity checks about msg_controllen in sendmsg Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 117/149] Bluetooth: Fix not checking for valid hdev on bt_dev_{info,warn,err,dbg} Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 118/149] Bluetooth: use memset avoid memory leaks Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 119/149] bnxt_en: Eliminate unintended link toggle during FW reset Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 123/149] powerpc/64e: Tie PPC_BOOK3E_64 to PPC_FSL_BOOK3E Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 127/149] can: isotp: set default value for N_As to 50 micro seconds Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 128/149] can: etas_es58x: es58x_fd_rx_event_msg(): initialize rx_event_msg before calling es58x_check_msg_len() Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 130/149] net: account alternate interface name memory Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 131/149] net: limit altnames to 64k total Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 132/149] net/mlx5e: Remove overzealous validations in netlink EEPROM query Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 135/149] net: sfp: add 2500base-X quirk for Lantech SFP module Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 138/149] mt76: fix monitor mode crash with sdio driver Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 140/149] iwlwifi: mei: fix building iwlmei Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 142/149] Bluetooth: Fix use after free in hci_send_acl Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 143/149] netfilter: conntrack: revisit gc autotuning Sasha Levin
2022-04-01 14:25 ` [PATCH AUTOSEL 5.17 144/149] netlabel: fix out-of-bounds memory accesses Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220401142536.1948161-86-sashal@kernel.org \
    --to=sashal@kernel$(echo .)org \
    --cc=ath11k@lists$(echo .)infradead.org \
    --cc=davem@davemloft$(echo .)net \
    --cc=kuba@kernel$(echo .)org \
    --cc=kvalo@kernel$(echo .)org \
    --cc=linux-kernel@vger$(echo .)kernel.org \
    --cc=linux-wireless@vger$(echo .)kernel.org \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=pabeni@redhat$(echo .)com \
    --cc=quic_bqiang@quicinc$(echo .)com \
    --cc=quic_kvalo@quicinc$(echo .)com \
    --cc=stable@vger$(echo .)kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox