* [PATCH net-next] igb: assume MSI-X interrupts during initialization
From: Stefan Assmann @ 2015-09-17 12:46 UTC
To: intel-wired-lan; +Cc: netdev, davem, jeffrey.t.kirsher, sassmann

In igb_sw_init() the sequence of calls was changed from
  igb_init_queue_configuration()
  igb_init_interrupt_scheme()
  igb_probe_vfs()
to
  igb_probe_vfs()
  igb_init_queue_configuration()
  igb_init_interrupt_scheme()

This results in adapter->flags not having the IGB_FLAG_HAS_MSIX bit set
during igb_probe_vfs()->igb_enable_sriov(). Therefore SR-IOV does not
get enabled properly and we run into a NULL pointer if the max_vfs
module parameter is specified (adapter->vf_data does not get allocated,
crash on accessing the structure).

[ 7.419348] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[ 7.419367] IP: [<ffffffffa02161c6>] igb_reset+0xe6/0x5d0 [igb]
[ 7.419370] PGD 0
[ 7.419373] Oops: 0002 [#1] SMP
[ 7.419381] Modules linked in: ahci(+) libahci igb(+) i40e(+) vxlan ip6_udp_tunnel udp_tunnel megaraid_sas(+) ixgbe(+) mdio
[ 7.419385] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 4.2.0+ #153
[ 7.419387] Hardware name: Dell Inc. PowerEdge R720/0C4Y3R, BIOS 1.6.0 03/07/2013
[...]
[ 7.419431] Call Trace:
[ 7.419442] [<ffffffffa0217236>] igb_probe+0x8b6/0x1340 [igb]
[ 7.419447] [<ffffffff814c7f15>] local_pci_probe+0x45/0xa0

Prevent this by setting the IGB_FLAG_HAS_MSIX bit before calling
igb_probe_vfs(). The real interrupt capabilities will be checked during
igb_init_interrupt_scheme() so this is safe to do.

Signed-off-by: Stefan Assmann <sassmann@kpanic•de>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index e174fbb..ba019fc 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2986,6 +2986,9 @@ static int igb_sw_init(struct igb_adapter *adapter)
 	}
 #endif /* CONFIG_PCI_IOV */
 
+	/* Assume MSI-X interrupts, will be checked during IRQ allocation */
+	adapter->flags |= IGB_FLAG_HAS_MSIX;
+
 	igb_probe_vfs(adapter);
 
 	igb_init_queue_configuration(adapter);
-- 
2.4.3
* Re: [net-next] igb: assume MSI-X interrupts during initialization
From: Laine Stump @ 2016-02-05 21:24 UTC
To: netdev; +Cc: Stefan Assmann, intel-wired-lan, davem, jeffrey.t.kirsher

Stefan,

I have an AMD 990FX system with an Intel 82576 card that could not
successfully boot with any kernel starting somewhere prior to 4.2, but
does boot properly in 4.4+. After a lot of time bisecting, I found that
this patch, when applied to kernel 4.3.0, solves the problem (applying
to 4.2.0 has no effect, so there's some other patch/patches in the
interim that were also part of the fix).

Since I don't know the details of proposing this patch for 4.3 stable,
would it be possible for you to do that?

Thanks!

The full saga of my problem and investigation is here:

https://www.mail-archive.com/iommu@lists.linux-foundation.org/msg10687.html

On 09/17/2015 08:46 AM, Stefan Assmann wrote:
> In igb_sw_init() the sequence of calls was changed from
> igb_init_queue_configuration()
> igb_init_interrupt_scheme()
> igb_probe_vfs()
> to
> igb_probe_vfs()
> igb_init_queue_configuration()
> igb_init_interrupt_scheme()
>
> This results in adapter->flags not having the IGB_FLAG_HAS_MSIX bit set
> during igb_probe_vfs()->igb_enable_sriov(). Therefore SR-IOV does not
> get enabled properly and we run into a NULL pointer if the max_vfs
> module parameter is specified (adapter->vf_data does not get allocated,
> crash on accessing the structure).
>
> [ 7.419348] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
> [ 7.419367] IP: [<ffffffffa02161c6>] igb_reset+0xe6/0x5d0 [igb]
> [ 7.419370] PGD 0
> [ 7.419373] Oops: 0002 [#1] SMP
> [ 7.419381] Modules linked in: ahci(+) libahci igb(+) i40e(+) vxlan ip6_udp_tunnel udp_tunnel megaraid_sas(+) ixgbe(+) mdio
> [ 7.419385] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 4.2.0+ #153
> [ 7.419387] Hardware name: Dell Inc. PowerEdge R720/0C4Y3R, BIOS 1.6.0 03/07/2013
> [...]
> [ 7.419431] Call Trace:
> [ 7.419442] [<ffffffffa0217236>] igb_probe+0x8b6/0x1340 [igb]
> [ 7.419447] [<ffffffff814c7f15>] local_pci_probe+0x45/0xa0
>
> Prevent this by setting the IGB_FLAG_HAS_MSIX bit before calling
> igb_probe_vfs(). The real interrupt capabilities will be checked during
> igb_init_interrupt_scheme() so this is safe to do.
>
> Signed-off-by: Stefan Assmann <sassmann@kpanic•de>
> ---
> drivers/net/ethernet/intel/igb/igb_main.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index e174fbb..ba019fc 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -2986,6 +2986,9 @@ static int igb_sw_init(struct igb_adapter *adapter)
> 	}
> #endif /* CONFIG_PCI_IOV */
>
> +	/* Assume MSI-X interrupts, will be checked during IRQ allocation */
> +	adapter->flags |= IGB_FLAG_HAS_MSIX;
> +
> 	igb_probe_vfs(adapter);
>
> 	igb_init_queue_configuration(adapter);
>
* Re: [net-next] igb: assume MSI-X interrupts during initialization
From: Stefan Assmann @ 2016-02-05 23:13 UTC
To: Laine Stump, netdev; +Cc: intel-wired-lan, davem, jeffrey.t.kirsher

On 02/05/2016 10:24 PM, Laine Stump wrote:
> Stefan,
>
> I have an AMD 990FX system with an Intel 82576 card that could not
> successfully boot with any kernel starting somewhere prior to 4.2, but
> does boot properly in 4.4+. After a lot of time bisecting, I found that
> this patch, when applied to kernel 4.3.0, solves the problem (applying
> to 4.2.0 has no effect, so there's some other patch/patches in the
> interim that were also part of the fix).
>
> Since I don't know the details of proposing this patch for 4.3 stable,
> would it be possible for you to do that?
>
> Thanks!

Hi Laine,

I took a quick look at 4.3 and the patch you mention should be
sufficient. For 4.2 I'll have to take a closer look. I'm currently
traveling but will get back to you early next week.

I'd like to double-check things before taking any action.

Thanks!

  Stefan

> The full saga of my problem and investigation is here:
>
> https://www.mail-archive.com/iommu@lists.linux-foundation.org/msg10687.html
>
>
> On 09/17/2015 08:46 AM, Stefan Assmann wrote:
>> In igb_sw_init() the sequence of calls was changed from
>> igb_init_queue_configuration()
>> igb_init_interrupt_scheme()
>> igb_probe_vfs()
>> to
>> igb_probe_vfs()
>> igb_init_queue_configuration()
>> igb_init_interrupt_scheme()
>>
>> This results in adapter->flags not having the IGB_FLAG_HAS_MSIX bit set
>> during igb_probe_vfs()->igb_enable_sriov(). Therefore SR-IOV does not
>> get enabled properly and we run into a NULL pointer if the max_vfs
>> module parameter is specified (adapter->vf_data does not get allocated,
>> crash on accessing the structure).
>>
>> [ 7.419348] BUG: unable to handle kernel NULL pointer dereference
>> at 0000000000000048
>> [ 7.419367] IP: [<ffffffffa02161c6>] igb_reset+0xe6/0x5d0 [igb]
>> [ 7.419370] PGD 0
>> [ 7.419373] Oops: 0002 [#1] SMP
>> [ 7.419381] Modules linked in: ahci(+) libahci igb(+) i40e(+) vxlan
>> ip6_udp_tunnel udp_tunnel megaraid_sas(+) ixgbe(+) mdio
>> [ 7.419385] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 4.2.0+ #153
>> [ 7.419387] Hardware name: Dell Inc. PowerEdge R720/0C4Y3R, BIOS
>> 1.6.0 03/07/2013
>> [...]
>> [ 7.419431] Call Trace:
>> [ 7.419442] [<ffffffffa0217236>] igb_probe+0x8b6/0x1340 [igb]
>> [ 7.419447] [<ffffffff814c7f15>] local_pci_probe+0x45/0xa0
>>
>> Prevent this by setting the IGB_FLAG_HAS_MSIX bit before calling
>> igb_probe_vfs(). The real interrupt capabilities will be checked during
>> igb_init_interrupt_scheme() so this is safe to do.
>>
>> Signed-off-by: Stefan Assmann <sassmann@kpanic•de>
>> ---
>> drivers/net/ethernet/intel/igb/igb_main.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c
>> b/drivers/net/ethernet/intel/igb/igb_main.c
>> index e174fbb..ba019fc 100644
>> --- a/drivers/net/ethernet/intel/igb/igb_main.c
>> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
>> @@ -2986,6 +2986,9 @@ static int igb_sw_init(struct igb_adapter *adapter)
>> 	}
>> #endif /* CONFIG_PCI_IOV */
>>
>> +	/* Assume MSI-X interrupts, will be checked during IRQ allocation */
>> +	adapter->flags |= IGB_FLAG_HAS_MSIX;
>> +
>> 	igb_probe_vfs(adapter);
>>
>> 	igb_init_queue_configuration(adapter);
>>
* Re: [net-next] igb: assume MSI-X interrupts during initialization
From: Stefan Assmann @ 2016-02-10 10:26 UTC
To: Laine Stump, netdev; +Cc: intel-wired-lan, davem, jeffrey.t.kirsher

On 06.02.2016 00:13, Stefan Assmann wrote:
> On 02/05/2016 10:24 PM, Laine Stump wrote:
>> Stefan,
>>
>> I have an AMD 990FX system with an Intel 82576 card that could not
>> successfully boot with any kernel starting somewhere prior to 4.2, but
>> does boot properly in 4.4+. After a lot of time bisecting, I found that
>> this patch, when applied to kernel 4.3.0, solves the problem (applying
>> to 4.2.0 has no effect, so there's some other patch/patches in the
>> interim that were also part of the fix).
>>
>> Since I don't know the details of proposing this patch for 4.3 stable,
>> would it be possible for you to do that?
>>
>> Thanks!
>
> Hi Laine,
>
> I took a quick look at 4.3 and the patch you mention should be
> sufficient. For 4.2 I'll have to take a closer look. I'm currently
> traveling but will get back to you early next week.
>
> I'd like to double-check things before taking any action.

I've tried to reproduce your issue on several systems, mostly Intel
though, without success running 4.2 or 4.3. Since I know you mostly care
about 4.3 (current Fedora), I'd suggest queueing
commit cbfe360a1541a32e9e28f8f8ac925d2b7979d767
    igb: assume MSI-X interrupts during initialization
for 4.3-stable, as it's a follow-up fix to
ceee3450b3a85db05a107d54fbea031c77d30401
    igb: make sure SR-IOV init uses the right number of queues

Dave, if you agree with that, please queue the patch for 4.3-stable.

Thanks!

  Stefan

>
> Thanks!
>
> Stefan
>
>> The full saga of my problem and investigation is here:
>>
>> https://www.mail-archive.com/iommu@lists.linux-foundation.org/msg10687.html
>>
>>
>> On 09/17/2015 08:46 AM, Stefan Assmann wrote:
>>> In igb_sw_init() the sequence of calls was changed from
>>> igb_init_queue_configuration()
>>> igb_init_interrupt_scheme()
>>> igb_probe_vfs()
>>> to
>>> igb_probe_vfs()
>>> igb_init_queue_configuration()
>>> igb_init_interrupt_scheme()
>>>
>>> This results in adapter->flags not having the IGB_FLAG_HAS_MSIX bit set
>>> during igb_probe_vfs()->igb_enable_sriov(). Therefore SR-IOV does not
>>> get enabled properly and we run into a NULL pointer if the max_vfs
>>> module parameter is specified (adapter->vf_data does not get allocated,
>>> crash on accessing the structure).
>>>
>>> [ 7.419348] BUG: unable to handle kernel NULL pointer dereference
>>> at 0000000000000048
>>> [ 7.419367] IP: [<ffffffffa02161c6>] igb_reset+0xe6/0x5d0 [igb]
>>> [ 7.419370] PGD 0
>>> [ 7.419373] Oops: 0002 [#1] SMP
>>> [ 7.419381] Modules linked in: ahci(+) libahci igb(+) i40e(+) vxlan
>>> ip6_udp_tunnel udp_tunnel megaraid_sas(+) ixgbe(+) mdio
>>> [ 7.419385] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 4.2.0+ #153
>>> [ 7.419387] Hardware name: Dell Inc. PowerEdge R720/0C4Y3R, BIOS
>>> 1.6.0 03/07/2013
>>> [...]
>>> [ 7.419431] Call Trace:
>>> [ 7.419442] [<ffffffffa0217236>] igb_probe+0x8b6/0x1340 [igb]
>>> [ 7.419447] [<ffffffff814c7f15>] local_pci_probe+0x45/0xa0
>>>
>>> Prevent this by setting the IGB_FLAG_HAS_MSIX bit before calling
>>> igb_probe_vfs(). The real interrupt capabilities will be checked during
>>> igb_init_interrupt_scheme() so this is safe to do.
>>>
>>> Signed-off-by: Stefan Assmann <sassmann@kpanic•de>
>>> ---
>>> drivers/net/ethernet/intel/igb/igb_main.c | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c
>>> b/drivers/net/ethernet/intel/igb/igb_main.c
>>> index e174fbb..ba019fc 100644
>>> --- a/drivers/net/ethernet/intel/igb/igb_main.c
>>> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
>>> @@ -2986,6 +2986,9 @@ static int igb_sw_init(struct igb_adapter *adapter)
>>> 	}
>>> #endif /* CONFIG_PCI_IOV */
>>>
>>> +	/* Assume MSI-X interrupts, will be checked during IRQ allocation */
>>> +	adapter->flags |= IGB_FLAG_HAS_MSIX;
>>> +
>>> 	igb_probe_vfs(adapter);
>>>
>>> 	igb_init_queue_configuration(adapter);
>>>
>>