From: Luis Henriques <luis.henriques@canonical•com>
To: Ben Hutchings <bhutchings@solarflare•com>
Cc: Neil Horman <nhorman@tuxdriver•com>, <netdev@vger•kernel.org>,
Jay Cliburn <jcliburn@gmail•com>,
"David S. Miller" <davem@davemloft•net>, <stable@vger•kernel.org>
Subject: Re: [net PATCH] atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring
Date: Sat, 27 Jul 2013 20:30:13 +0100
Message-ID: <87k3kbdcmy.fsf@canonical.com>
In-Reply-To: <1374884670.1666.72.camel@bwh-desktop.uk.level5networks.com> (Ben Hutchings's message of "Sat, 27 Jul 2013 01:24:30 +0100")

Ben Hutchings <bhutchings@solarflare•com> writes:

> On Sat, 2013-07-27 at 01:02 +0100, Ben Hutchings wrote:
>> On Fri, 2013-07-26 at 12:47 -0400, Neil Horman wrote:
>> > atl1c uses netdev_alloc_skb to refill its rx dma ring, but that call makes no
>> > guarantees about the suitability of the memory for use in DMA. As a result
>> > we've gotten reports of atl1c drivers occasionally hanging and needing to be
>> > reset:
>> > https://bugzilla.kernel.org/show_bug.cgi?id=54021
>> >
>> > Fix this by modifying the call to use the internal version __netdev_alloc_skb,
>> > where you can set the gfp_mask explicitly to include GFP_DMA.
>>
>> This is a really bad idea. GFP_DMA means allocation from the ISA DMA
>> region (< 16 MB). pci_map_single() takes care of allocating a bounce
>> buffer if necessary.
>>
>> Ben.
>>
>> > Tested by two reporters in the above bug, who have the hardware to validate it.
>> > Both report immediate cessation of the problem with this patch
> [...]
>
> So perhaps the chip somehow fails to support a full 32-bit address
> (which is the current DMA mask), though given that there are 64 address
> bits in RX descriptors this seems unlikely. And the most likely result
> of that would be memory corruption, not a stall.
>
> Alternately, perhaps more likely, there's something wrong with the
> driver's error handling. If atl1_alloc_rx_buffer() fails then the RX
> queue could run dry. Depending on how the hardware is designed, that
> could result in a complete RX stall (no RX buffers available => no RX
> completions => no attempt to allocate more RX buffers).
>
> Maybe your change makes it less likely for atl1_alloc_rx_buffer() to
> fail. On a modern PC the (ISA) DMA zone is basically unused whereas
> bounce buffers might be more contended. Did you try adding some logging
> for failure of pci_map_single()?
>
> Ben.
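
The stall scenario Ben describes above (no RX buffers available => no RX
completions => no attempt to allocate more RX buffers) can be sketched in
userspace C. This is a toy model under stated assumptions, not atl1c code:
struct rx_ring, refill(), rx_packet() and simulate() are hypothetical names,
the ring has 4 slots, and allocation failure is modelled by a simple budget
counter.

```c
#include <stdbool.h>

#define RING_SIZE 4

/* Hypothetical model of an RX ring; not the atl1c data structures. */
struct rx_ring {
	int posted;        /* buffers currently owned by the "hardware" */
	int alloc_budget;  /* how many more allocations will succeed */
};

/* Try to top the ring up; stops early once allocations start failing. */
static void refill(struct rx_ring *r)
{
	while (r->posted < RING_SIZE && r->alloc_budget > 0) {
		r->alloc_budget--;
		r->posted++;
	}
}

/*
 * One packet arrives. Returns true if it was received.
 * Assumption: refill is attempted only from the completion path.
 */
static bool rx_packet(struct rx_ring *r)
{
	if (r->posted == 0)
		return false;  /* no buffer => no completion => no refill */
	r->posted--;
	refill(r);
	return true;
}

/*
 * Feed `packets` packets into a ring that starts full. Allocations fail
 * from the start and begin succeeding again just before packet number
 * `recover_at`. Returns how many packets were received.
 */
static int simulate(int packets, int recover_at)
{
	struct rx_ring r = { .posted = RING_SIZE, .alloc_budget = 0 };
	int received = 0;

	for (int i = 0; i < packets; i++) {
		if (i == recover_at)
			r.alloc_budget = 1000;  /* memory is available again */
		if (rx_packet(&r))
			received++;
	}
	return received;
}
```

Under these assumptions, simulate(10, 2) receives all 10 packets because
allocation recovers while buffers are still posted, whereas simulate(10, 6)
receives only 4: once the ring is empty nothing ever calls refill() again,
even though memory is available.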

Just to add a bit more context (and hopefully not noise): I started
seeing this issue on 3.7. Bisection pointed to the following first bad
commit:

  69b08f6 net: use bigger pages in __netdev_alloc_frag

Reverting this commit (and e5e6730 "skbuff: Move definition of
NETDEV_FRAG_PAGE_MAX_SIZE") solved the problem.

Note also that I'm seeing this issue on a 32-bit system (64-bit isn't
supported). This initially made me think the problem could be related
to that, as the 69b08f6 log explicitly refers to 32/64-bit archs. But I
failed to find any obvious issue with the patch.

Cheers,
--
Luis