From: Jarek Poplawski <jarkao2@gmail•com>
To: Eric Dumazet <eric.dumazet@gmail•com>
Cc: David Miller <davem@davemloft•net>, emin ak <eminak71@gmail•com>,
Andrew Morton <akpm@linux-foundation•org>,
netdev@vger•kernel.org, bugzilla-daemon@bugzilla•kernel.org,
bugme-daemon@bugzilla•kernel.org,
Anton Vorontsov <avorontsov@mvista•com>,
Andy Fleming <afleming@freescale•com>
Subject: Re: [PATCH] gianfar: Fix crashes on RX path (Was Re: [Bugme-new] [Bug 19692] New: linux-2.6.36-rc5 crash with gianfar ethernet at full line rate traffic)
Date: Fri, 22 Oct 2010 06:52:31 +0000
Message-ID: <20101022065231.GA7036@ff.dom.local>
In-Reply-To: <1287727917.9059.117.camel@edumazet-laptop>
On Fri, Oct 22, 2010 at 08:11:57AM +0200, Eric Dumazet wrote:
> On Tuesday, 19 October 2010 at 10:06 +0000, Jarek Poplawski wrote:
> > On Tue, Oct 19, 2010 at 09:44:33AM +0300, emin ak wrote:
> > > Hi Jarek;
> > > After 5 days and more than 20 billion packets passed without a crash,
> > > it seems that this patch is working for me, at least for crash type 2.
> > > (For type 1, it only occurred once and I have never been able to
> > > reproduce it again, but I'm still trying. I think this patch also
> > > lowers the risk for type 1.)
> >
> > It would be interesting to have a look if it's exactly type 1, because
> > skb_over_panic can happen for different reasons, e.g. like here:
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=63b88b9041ceef8217f34de71a2e96f0c3f0fd3b
> >
> > > As for adding a new bug entry for skb_over_panic, I think I must
> > > first find a reliable way to make this type of crash reproducible;
> > > otherwise I don't know how to test whether it is solved or not.
> >
> > Maybe for now let's try to get and see this type 1 again? Since the
> > recycle path is a bit suspicious to me, probably limiting memory or
> > slowing tx (maybe different mtus on eth0 and 1) under heavy multi cpu
> > load might help.
> >
> > > Lastly, thanks a lot for your valuable help in overcoming this problem.
> > > Also, is there anything that I can do for testing / committing this
> > > patch to mainline?
> >
> > Here it is for David to handle the rest.
> >
> > Thanks a lot for such an intense testing,
> > Jarek P.
> > --------------------------->
> >
> > The rx_recycle queue is global per device but can be accessed by many
> > napi handlers at the same time, so it needs full skb_queue primitives
> > (with locking). Otherwise, various crashes caused by broken skbs are
> > possible.
> >
> > This patch resolves, at least partly, bugzilla bug 19692. (Because of
> > some doubts that there could still be something around which is hard
> > to reproduce, my proposal is to leave this bug open for a month.)
> >
> > Fixes commit: 0fd56bb5be6455d0d42241e65aed057244665e5e
> >
> > Reported-by: emin ak <eminak71@gmail•com>
> > Tested-by: emin ak <eminak71@gmail•com>
> > Signed-off-by: Jarek Poplawski <jarkao2@gmail•com>
> > CC: Andy Fleming <afleming@freescale•com>
> > ---
> > diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
> > index 4f7c3f3..db47b55 100644
> > --- a/drivers/net/gianfar.c
> > +++ b/drivers/net/gianfar.c
> > @@ -2515,7 +2515,7 @@ static int gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
> > skb_recycle_check(skb, priv->rx_buffer_size +
> > RXBUF_ALIGNMENT)) {
> > gfar_align_skb(skb);
> > - __skb_queue_head(&priv->rx_recycle, skb);
> > + skb_queue_head(&priv->rx_recycle, skb);
> > } else
> > dev_kfree_skb_any(skb);
> >
> > @@ -2598,7 +2598,7 @@ struct sk_buff * gfar_new_skb(struct net_device *dev)
> > struct gfar_private *priv = netdev_priv(dev);
> > struct sk_buff *skb = NULL;
> >
> > - skb = __skb_dequeue(&priv->rx_recycle);
> > + skb = skb_dequeue(&priv->rx_recycle);
> > if (!skb)
> > skb = gfar_alloc_skb(dev);
> >
> > @@ -2754,7 +2754,7 @@ int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue, int rx_work_limit)
> > if (unlikely(!newskb))
> > newskb = skb;
> > else if (skb)
> > - __skb_queue_head(&priv->rx_recycle, skb);
> > + skb_queue_head(&priv->rx_recycle, skb);
> > } else {
> > /* Increment the number of packets */
> > rx_queue->stats.rx_packets++;
>
> Are you sure it's needed at all?

Yes, after Emin's testing I'm quite sure this fix is needed.

>
> Gianfar claims to be multiqueue, but only one cpu can run gfar_poll()
> and call gfar_clean_tx_ring() / gfar_clean_rx_ring()
>
> If not, there would be more bugs than only the rx_recycle thing

I didn't find what prevents running gfar_poll on many cpus, and I don't
claim there are no more bugs around.

Jarek P.

>
> vi +2822 drivers/net/gianfar.c
>
> for_each_set_bit(i, &gfargrp->rx_bit_map, priv->num_rx_queues) {
> 	if (test_bit(i, &serviced_queues))
> 		continue;
> 	rx_queue = priv->rx_queue[i];
> 	tx_queue = priv->tx_queue[rx_queue->qindex];
>
> 	tx_cleaned += gfar_clean_tx_ring(tx_queue);
> 	rx_cleaned_per_queue = gfar_clean_rx_ring(rx_queue,
> 						budget_per_queue);
> 	rx_cleaned += rx_cleaned_per_queue;
> 	if(rx_cleaned_per_queue < budget_per_queue) {
> 		left_over_budget = left_over_budget +
> 			(budget_per_queue - rx_cleaned_per_queue);
> 		set_bit(i, &serviced_queues);
> 		num_queues--;
> 	}
> }
>
>
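
To illustrate the locking argument in the patch description, here is a
minimal userspace sketch, not kernel code. Everything in it is a
hypothetical stand-in: struct buf plays the role of an skb, struct
buf_queue the role of the rx_recycle sk_buff_head, a pthread mutex
replaces the queue spinlock, and the worker threads mimic poll handlers
running on several cpus. The *_unlocked helpers correspond to the
__skb_queue_head()/__skb_dequeue() style (caller must serialize), and the
plain helpers to the skb_queue_head()/skb_dequeue() style (lock taken
inside).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for sk_buff and sk_buff_head. */
struct buf {
	struct buf *next;
};

struct buf_queue {
	struct buf *head;
	unsigned int qlen;
	pthread_mutex_t lock;	/* stands in for the queue spinlock */
};

/* Unlocked variant: only safe if the caller already serializes access. */
static void queue_head_unlocked(struct buf_queue *q, struct buf *b)
{
	b->next = q->head;
	q->head = b;
	q->qlen++;
}

/* Unlocked variant: only safe if the caller already serializes access. */
static struct buf *dequeue_unlocked(struct buf_queue *q)
{
	struct buf *b = q->head;

	if (b) {
		q->head = b->next;
		b->next = NULL;
		q->qlen--;
	}
	return b;
}

/* Locked variant: safe to call from several threads at once. */
static void queue_head(struct buf_queue *q, struct buf *b)
{
	pthread_mutex_lock(&q->lock);
	queue_head_unlocked(q, b);
	pthread_mutex_unlock(&q->lock);
}

/* Locked variant: safe to call from several threads at once. */
static struct buf *dequeue(struct buf_queue *q)
{
	struct buf *b;

	pthread_mutex_lock(&q->lock);
	b = dequeue_unlocked(q);
	pthread_mutex_unlock(&q->lock);
	return b;
}

#define NWORKERS 4
#define NITERS 100000

static struct buf_queue recycle;	/* shared, like a per-device queue */

/* Each worker recycles one buffer and then takes one back, repeatedly. */
static void *worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < NITERS; i++) {
		struct buf *b = malloc(sizeof(*b));

		if (!b)
			break;
		queue_head(&recycle, b);
		b = dequeue(&recycle);
		/* Every pop follows a push by the same worker, so with
		 * locking the queue cannot be empty here. */
		assert(b != NULL);
		free(b);
	}
	return NULL;
}

/* Runs the workers and returns the number of buffers left on the queue. */
static unsigned int run_demo(void)
{
	pthread_t t[NWORKERS];

	recycle.head = NULL;
	recycle.qlen = 0;
	pthread_mutex_init(&recycle.lock, NULL);
	for (int i = 0; i < NWORKERS; i++)
		pthread_create(&t[i], NULL, worker, NULL);
	for (int i = 0; i < NWORKERS; i++)
		pthread_join(t[i], NULL);
	pthread_mutex_destroy(&recycle.lock);
	return recycle.qlen;
}
```

With the locked helpers, run_demo() should leave the queue empty. If the
worker called the *_unlocked helpers directly instead, two threads could
update head and qlen at the same time, which in this sketch would show up
as lost or doubly freed buffers; that is the same class of corruption the
patch description attributes to concurrent use of the unlocked skb queue
primitives.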