From: Jakub Kicinski <kuba@kernel•org>
To: dipayanroy@linux•microsoft.com
Cc: Jakub Kicinski <kuba@kernel•org>,
kys@microsoft•com, haiyangz@microsoft•com, wei.liu@kernel•org,
decui@microsoft•com, andrew+netdev@lunn•ch, davem@davemloft•net,
edumazet@google•com, pabeni@redhat•com, leon@kernel•org,
longli@microsoft•com, kotaranov@microsoft•com, horms@kernel•org,
shradhagupta@linux•microsoft.com, ssengar@linux•microsoft.com,
ernis@linux•microsoft.com, shirazsaleem@microsoft•com,
linux-hyperv@vger•kernel.org, netdev@vger•kernel.org,
linux-kernel@vger•kernel.org, linux-rdma@vger•kernel.org,
stephen@networkplumber•org, jacob.e.keller@intel•com,
dipayanroy@microsoft•com, leitao@debian•org, kees@kernel•org,
john.fastabend@gmail•com, hawk@kernel•org, bpf@vger•kernel.org,
daniel@iogearbox•net, ast@kernel•org, sdf@fomichev•me,
yury.norov@gmail•com
Subject: Re: [PATCH net 1/2] net: mana: Add NULL guards in teardown path to prevent panic on attach failure
Date: Tue, 19 May 2026 15:55:56 -0700
Message-ID: <20260519225556.3939204-1-kuba@kernel.org>
In-Reply-To: <20260518194654.735580-2-dipayanroy@linux.microsoft.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: mana: Add NULL guards in teardown path to prevent panic on attach failure
When queue allocation fails partway through, the error cleanup frees
and NULLs apc->tx_qp and apc->rxqs. The patch adds NULL guards for these
pointers in teardown paths such as mana_destroy_vport(), mana_fence_rqs(),
and mana_dealloc_queues() to prevent NULL pointer dereferences.
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index 9afc786b297a8..9e7e4bf526bf6 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
[ ... ]
> @@ -2858,13 +2861,16 @@ static void mana_destroy_vport(struct mana_port_context *apc)
> struct mana_rxq *rxq;
> u32 rxq_idx;
>
> - for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) {
> - rxq = apc->rxqs[rxq_idx];
> - if (!rxq)
> - continue;
> + if (apc->rxqs) {
>
> - mana_destroy_rxq(apc, rxq, true);
> - apc->rxqs[rxq_idx] = NULL;
> + for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) {
> + rxq = apc->rxqs[rxq_idx];
> + if (!rxq)
> + continue;
> +
> + mana_destroy_rxq(apc, rxq, true);
> + apc->rxqs[rxq_idx] = NULL;
> + }
> }
If the teardown path frees apc->rxqs and sets it to NULL on attach failure,
how does the interface recover if the user subsequently brings it back up?
When mana_open() is invoked, it directly calls mana_alloc_queues(). Inside
mana_alloc_queues() -> mana_add_rx_queues(), the code does:
	apc->rxqs[i] = rxq;
If apc->rxqs was left as NULL during the teardown, will this cause a NULL
pointer dereference on the next interface bring-up?
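To make the concern concrete, here is a minimal user-space sketch, not driver code. The struct and the model_* helpers are hypothetical stand-ins; whether the real open path re-creates apc->rxqs is exactly the open question above. The sketch only shows that an open path which indexes the array must first re-create it if cleanup left it NULL:

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for the port context; not the driver's struct. */
struct model_port {
	void **rxqs;             /* array of RX queue pointers; NULL after cleanup */
	unsigned int num_queues;
};

/* Models the error cleanup from the commit message: free and NULL the array. */
static void model_cleanup(struct model_port *p)
{
	free(p->rxqs);
	p->rxqs = NULL;
}

/* Models an open path that re-creates the array before any rxqs[i] store. */
static int model_open(struct model_port *p)
{
	static int dummy_rxq;    /* placeholder object standing in for a real queue */
	unsigned int i;

	if (!p->rxqs) {
		p->rxqs = calloc(p->num_queues, sizeof(*p->rxqs));
		if (!p->rxqs)
			return -ENOMEM;
	}

	/* Without the re-allocation above, this store would hit a NULL array. */
	for (i = 0; i < p->num_queues; i++)
		p->rxqs[i] = &dummy_rxq;

	return 0;
}
```

If the driver instead relies on a separate init step to allocate the arrays, that step would need to run again before the queues are added.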
[ ... ]
> @@ -3287,33 +3294,38 @@ static int mana_dealloc_queues(struct net_device *ndev)
> * number of queues.
> */
>
> - for (i = 0; i < apc->num_queues; i++) {
> - txq = &apc->tx_qp[i].txq;
> - tsleep = 1000;
> - while (atomic_read(&txq->pending_sends) > 0 &&
> - time_before(jiffies, timeout)) {
> - usleep_range(tsleep, tsleep + 1000);
> - tsleep <<= 1;
> - }
> - if (atomic_read(&txq->pending_sends)) {
> - err = pcie_flr(to_pci_dev(gd->gdma_context->dev));
> - if (err) {
> - netdev_err(ndev, "flr failed %d with %d pkts pending in txq %u\n",
> - err, atomic_read(&txq->pending_sends),
> - txq->gdma_txq_id);
> + if (apc->tx_qp) {
In the error path where queue creation fails partway through, will
mana_destroy_txq() be called?
If mana_destroy_txq() unconditionally iterates over all apc->num_queues:
	for (i = 0; i < apc->num_queues; i++) {
		...
		mana_destroy_wq_obj(apc, GDMA_SQ, apc->tx_qp[i].tx_object);
		...
Will it pass 0 or INVALID_MANA_HANDLE to mana_destroy_wq_obj() for queues
that were never fully initialized?
Does mana_destroy_wq_obj() send a MANA_DESTROY_WQ_OBJ command to the hardware
with these uninitialized handles, and could handle 0 corrupt the firmware
state?
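One possible shape for avoiding this, sketched in user-space C under stated assumptions: every handle starts out as an explicit invalid sentinel, and the destroy loop skips entries that still hold it. All model_* names and MODEL_INVALID_HANDLE are hypothetical; this is not the driver's code, and it does not claim the driver initializes handles this way.

```c
#include <stdint.h>

/* Hypothetical handle type and sentinel for this model only. */
typedef uint64_t model_handle_t;
#define MODEL_INVALID_HANDLE ((model_handle_t)-1)

struct model_txq {
	model_handle_t tx_object;
};

/* Counts destroy commands that would have been sent to the device. */
static unsigned int model_destroy_cmds;

static void model_destroy_wq_obj(model_handle_t handle)
{
	(void)handle;
	model_destroy_cmds++;
}

/* Mark every queue as "never created" before any creation is attempted. */
static void model_init_txqs(struct model_txq *q, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		q[i].tx_object = MODEL_INVALID_HANDLE;
}

/* Destroy only queues that hold a real handle, then reset them. */
static void model_destroy_txqs(struct model_txq *q, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++) {
		if (q[i].tx_object == MODEL_INVALID_HANDLE)
			continue;

		model_destroy_wq_obj(q[i].tx_object);
		q[i].tx_object = MODEL_INVALID_HANDLE;
	}
}
```

Note that this only works if the array is set to the sentinel up front. With a zero-filled allocation, handle 0 would look like a valid object to the loop, which is the case the question above is asking about.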
> + for (i = 0; i < apc->num_queues; i++) {
> + txq = &apc->tx_qp[i].txq;
> + tsleep = 1000;
> + while (atomic_read(&txq->pending_sends) > 0 &&
> + time_before(jiffies, timeout)) {
> + usleep_range(tsleep, tsleep + 1000);
> + tsleep <<= 1;
> + }
> + if (atomic_read(&txq->pending_sends)) {
> + err =
> + pcie_flr(to_pci_dev(gd->gdma_context->dev));
If pcie_flr() is triggered here, the PCI Function Level Reset wipes the
internal GDMA state.
Further down in this function, mana_disable_vport_rx(), mana_fence_rqs(),
and mana_destroy_vport() are called. Will these functions attempt to send
commands to the now-destroyed GDMA admin queue via mana_send_request() and
wait for completions?
Could this lead to cascading timeouts since the hardware was just reset and
cannot process admin commands?
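As an illustration of the ordering problem, here is a small user-space model. It assumes a hypothetical "reset done" flag that makes later admin requests fail fast rather than wait for a completion; the model_* names are invented for the sketch and nothing here is taken from the driver.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state for this model only. */
struct model_dev {
	bool reset_done;         /* set once a function level reset was issued */
	unsigned int cmds_sent;  /* admin commands actually handed to the device */
};

/* Fail fast after a reset instead of queueing a command that cannot complete. */
static int model_send_request(struct model_dev *d)
{
	if (d->reset_done)
		return -ENODEV;

	d->cmds_sent++;
	return 0;
}

static void model_flr(struct model_dev *d)
{
	d->reset_done = true;
}

/*
 * Models the teardown order discussed above: an optional reset when TX is
 * stuck, followed by three admin commands (standing in for disable RX,
 * fence, and destroy). Returns how many of those commands were rejected.
 */
static int model_teardown(struct model_dev *d, bool tx_stuck)
{
	int rejected = 0;
	int i;

	if (tx_stuck)
		model_flr(d);

	for (i = 0; i < 3; i++) {
		if (model_send_request(d))
			rejected++;
	}

	return rejected;
}
```

In the model, a teardown that goes through the reset rejects all three later commands immediately. Without such a check, each of them would presumably have to run into its own timeout.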
> + if (err) {
> + netdev_err(ndev, "flr failed %d with %d pkts pending in txq %u\n",
> + err,
> + atomic_read(&txq->pending_sends),
> + txq->gdma_txq_id);
> + }
> + break;
> }
> - break;
> }
--
pw-bot: cr
Thread overview: 10+ messages
2026-05-18 19:43 [PATCH net 0/2] net: mana: Fix NULL dereferences during teardown after attach failure Dipayaan Roy
2026-05-18 19:43 ` [PATCH net 1/2] net: mana: Add NULL guards in teardown path to prevent panic on " Dipayaan Roy
2026-05-18 21:49 ` Harshitha Ramamurthy
2026-05-19 22:55 ` Jakub Kicinski [this message]
2026-05-20 18:11 ` Dipayaan Roy
2026-05-18 19:43 ` [PATCH net 2/2] net: mana: Skip redundant detach in queue reset handler if already detached Dipayaan Roy
2026-05-19 22:55 ` Jakub Kicinski
2026-05-20 18:23 ` Dipayaan Roy
2026-05-21 0:17 ` Jakub Kicinski
2026-05-22 23:16 ` Dipayaan Roy