public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Michal Kubiak <michal.kubiak@intel•com>
To: Stefan Assmann <sassmann@kpanic•de>
Cc: <intel-wired-lan@lists•osuosl.org>, <netdev@vger•kernel.org>,
	<anthony.l.nguyen@intel•com>, <patryk.piotrowski@intel•com>,
	<slawomirx.laba@intel•com>
Subject: Re: [PATCH net] iavf: fix hang on reboot with ice
Date: Fri, 10 Mar 2023 18:24:05 +0100	[thread overview]
Message-ID: <ZAtnqlHZ02EJn5xt@localhost.localdomain> (raw)
In-Reply-To: <20230310122653.1116051-1-sassmann@kpanic.de>

On Fri, Mar 10, 2023 at 01:26:53PM +0100, Stefan Assmann wrote:
> When a system with E810 with existing VFs gets rebooted the following
> hang may be observed.
> 
>  Pid 1 is hung in iavf_remove(), part of a network driver:
>  PID: 1        TASK: ffff965400e5a340  CPU: 24   COMMAND: "systemd-shutdow"
>   #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
>   #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
>   #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
>   #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
>   #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
>   #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
>   #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
>   #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
>   #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
>   #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
>  #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
>  #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
>  #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
>  #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
>  #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
>  #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
>  #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
>  #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
>  #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
>  #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
>  #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
>  #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
>  #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
>      RIP: 00007f1baa5c13d7  RSP: 00007fffbcc55a98  RFLAGS: 00000202
>      RAX: ffffffffffffffda  RBX: 0000000000000000  RCX: 00007f1baa5c13d7
>      RDX: 0000000001234567  RSI: 0000000028121969  RDI: 00000000fee1dead
>      RBP: 00007fffbcc55ca0   R8: 0000000000000000   R9: 00007fffbcc54e90
>      R10: 00007fffbcc55050  R11: 0000000000000202  R12: 0000000000000005
>      R13: 0000000000000000  R14: 00007fffbcc55af0  R15: 0000000000000000
>      ORIG_RAX: 00000000000000a9  CS: 0033  SS: 002b
> 
> During reboot all drivers PM shutdown callbacks are invoked.
> In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE.
> In ice_shutdown() the call chain above is executed, which at some point
> calls iavf_remove(). However iavf_remove() expects the VF to be in one
> of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If
> that's not the case it sleeps forever.
> So if iavf_shutdown() gets invoked before ice_shutdown() the system will
> hang indefinitely because the adapter is already in state __IAVF_REMOVE.
> 
> Fix this by adding __IAVF_REMOVE to the list of allowed states in
> iavf_remove().
> 
> Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in remove")
> Reported-by: Marius Cornea <mcornea@redhat•com>
> Signed-off-by: Stefan Assmann <sassmann@kpanic•de>
> ---
>  drivers/net/ethernet/intel/iavf/iavf_main.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index 3273aeb8fa67..83ef3a343ef0 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -5062,7 +5062,8 @@ static void iavf_remove(struct pci_dev *pdev)
>  		mutex_lock(&adapter->crit_lock);
>  		if (adapter->state == __IAVF_RUNNING ||
>  		    adapter->state == __IAVF_DOWN ||
> -		    adapter->state == __IAVF_INIT_FAILED) {
> +		    adapter->state == __IAVF_INIT_FAILED ||
> +		    adapter->state == __IAVF_REMOVE) {
>  			mutex_unlock(&adapter->crit_lock);
>  			break;
>  		}

Adding the __IAVF_REMOVE state to the loop break condition seems OK to
me.
I would only consider adding a timeout to this loop to prevent endless hangs
for other potential corner cases.

Thanks,
Michal

> -- 
> 2.39.1
> 

  reply	other threads:[~2023-03-10 17:25 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-10 12:26 [PATCH net] iavf: fix hang on reboot with ice Stefan Assmann
2023-03-10 17:24 ` Michal Kubiak [this message]
2023-03-13  6:48   ` Stefan Assmann
2023-03-13  9:15     ` Michal Kubiak
2023-03-13 14:52       ` Stefan Assmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZAtnqlHZ02EJn5xt@localhost.localdomain \
    --to=michal.kubiak@intel$(echo .)com \
    --cc=anthony.l.nguyen@intel$(echo .)com \
    --cc=intel-wired-lan@lists$(echo .)osuosl.org \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=patryk.piotrowski@intel$(echo .)com \
    --cc=sassmann@kpanic$(echo .)de \
    --cc=slawomirx.laba@intel$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox