public inbox for linux-next@vger.kernel.org 
From: "Aithal, Srikanth" <sraithal@amd•com>
To: Hannes Reinecke <hare@suse•de>, hare@kernel•org
Cc: sagi@grimberg•me, hch@lst•de, kbusch@kernel•org,
	Ankit.Soni@amd•com, Vasant Hegde <vasant.hegde@amd•com>,
	open list <linux-kernel@vger•kernel.org>,
	Linux-Next Mailing List <linux-next@vger•kernel.org>
Subject: Re: Patch "nvme: re-read ANA log page after ns scan completes" causing regression
Date: Mon, 14 Apr 2025 16:55:07 +0530
Message-ID: <4c258a85-7b2d-4946-a64f-d0341c444119@amd.com>
In-Reply-To: <e1f2ac49-25f4-4b2c-b67c-10782b4e3455@suse.de>


On 4/14/2025 4:39 PM, Hannes Reinecke wrote:
> On 4/14/25 12:53, Aithal, Srikanth wrote:
>> Hello,
>>
>> With the patch below in today's linux-next (next-20250414) and in
>> v6.15-rc2, we are seeing host boot issues. A host with an NVMe disk
>> just hangs on boot.
>>
>> If we revert this patch or disable CONFIG_NVME_MULTIPATH, the host
>> boots fine.
>>
>> commit 62baf70c327444338c34703c71aa8cc8e4189bd6
>> Author: Hannes Reinecke <hare@kernel•org>
>> Date:   Thu Apr 3 09:19:30 2025 +0200
>>
>>      nvme: re-read ANA log page after ns scan completes
>>
>>      When scanning for new namespaces we might have missed an ANA AEN.
>>
>>      The NVMe base spec (NVMe Base Specification v2.1, Figure 151
>>      'Asynchronous Event Information - Notice': Asymmetric Namespace
>>      Access Change) states:
>>
>>        A controller shall not send this event if an Attached Namespace
>>        Attribute Changed asynchronous event [...] is sent for the same
>>        event.
>>
>>      so we need to re-read the ANA log page after we rescanned the
>>      namespace list to update the ANA states of the new namespaces.
>>
>>      Signed-off-by: Hannes Reinecke <hare@kernel•org>
>>      Reviewed-by: Keith Busch <kbusch@kernel•org>
>>      Signed-off-by: Christoph Hellwig <hch@lst•de>
>>
>>
>> The host console starts dumping a lot of errors and the log grows
>> beyond 100 MB, so I am pasting only part of it here:
>> ...
>> ...
>> [   49.361223] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x1010
>> [   49.434564] nvme0n1: I/O Cmd(0x2) @ LBA 0, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
>> [   49.443123] I/O error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
>> [   49.457080] nvme nvme0: Failed to get ANA log: -4
>> [   49.506511] nvme nvme0: D3 entry latency set to 8 seconds
>> [   49.536300] nvme nvme0: 32/0/0 default/read/poll queues
>> [   49.605281] nvme 0000:41:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0018 address=0x0 flags=0x0000]
>> [   80.081190] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x1010
>> [   80.154109] nvme0n1: I/O Cmd(0x2) @ LBA 128, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
>> [   80.162864] I/O error, dev nvme0n1, sector 128 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
>> [   80.177032] nvme nvme0: Failed to get ANA log: -4
>> [   80.225460] nvme nvme0: D3 entry latency set to 8 seconds
>> [   80.255395] nvme nvme0: 32/0/0 default/read/poll queues
>> [   80.301278] nvme 0000:41:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0018 address=0x0 flags=0x0000]
>> [  110.789207] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x1010
>> [  110.861990] nvme0n1: I/O Cmd(0x2) @ LBA 2048, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
>> [  110.870842] I/O error, dev nvme0n1, sector 2048 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
>> [  110.885040] nvme nvme0: Failed to get ANA log: -4
>> [  110.933460] nvme nvme0: D3 entry latency set to 8 seconds
>> [  110.963447] nvme nvme0: 32/0/0 default/read/poll queues
>> [  111.009276] nvme 0000:41:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0018 address=0x0 flags=0x0000]
>> ...
>> ...
>>
>>
> Can you try this?
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 78963cab1f74..425c00b02f3e 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4455,7 +4455,7 @@ static void nvme_scan_work(struct work_struct *work)
>         if (test_bit(NVME_AER_NOTICE_NS_CHANGED, &ctrl->events))
>                 nvme_queue_scan(ctrl);
>  #if CONFIG_NVME_MULTIPATH
> -       else
> +       else if (ctrl->ana_log_buf)
>                 /* Re-read the ANA log page to not miss updates */
>                 queue_work(nvme_wq, &ctrl->ana_work);
>  #endif


I applied it on top of next-20250414 and tested it; it fixes the issue.
Tested-by: Srikanth Aithal <sraithal@amd•com>


>
> Cheers,
>
> Hannes
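
For readers following the thread, the effect of the one-line guard in the
diff above can be sketched as a small user-space model. This is not kernel
code: struct model_ctrl, model_scan_tail() and the two counters are
hypothetical stand-ins invented for illustration. The sketch assumes that
ana_log_buf stays NULL on a controller where the ANA log was never set up,
which is what the guard in the diff suggests but which this thread does not
spell out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few struct nvme_ctrl fields involved. */
struct model_ctrl {
	bool ns_changed_pending; /* models test_bit(NVME_AER_NOTICE_NS_CHANGED, &ctrl->events) */
	void *ana_log_buf;       /* assumed NULL when no ANA log buffer was set up */
	int scans_queued;        /* counts modelled nvme_queue_scan() calls */
	int ana_reads_queued;    /* counts modelled queue_work(nvme_wq, &ctrl->ana_work) calls */
};

/* Models the tail of nvme_scan_work() with the proposed guard applied. */
void model_scan_tail(struct model_ctrl *ctrl)
{
	if (ctrl->ns_changed_pending)
		ctrl->scans_queued++;      /* another namespace change arrived: rescan */
	else if (ctrl->ana_log_buf)
		ctrl->ana_reads_queued++;  /* re-read the ANA log only if a buffer exists */
	/* otherwise: no ANA buffer, so nothing is queued */
}
```

In this model, a controller without an ANA log buffer queues neither a
rescan nor an ANA re-read, whereas the unguarded "else" would have queued
the ANA work unconditionally.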

Thread overview: 4+ messages
2025-04-14 10:53 Patch "nvme: re-read ANA log page after ns scan completes" causing regression Aithal, Srikanth
2025-04-14 11:09 ` Hannes Reinecke
2025-04-14 11:25   ` Aithal, Srikanth [this message]
2025-04-16 14:44   ` nvme nvme0: Failed to get ANA log after suspend/resume David Wang
