public inbox for netdev@vger.kernel.org 
From: Nikolay Aleksandrov <nikolay@cumulusnetworks•com>
To: Florian Westphal <fw@strlen•de>
Cc: Frank Schreuder <fschreuder@transip•nl>,
	Johan Schuijt <johan@transip•nl>,
	Eric Dumazet <eric.dumazet@gmail•com>,
	"nikolay@redhat•com" <nikolay@redhat•com>,
	"davem@davemloft•net" <davem@davemloft•net>,
	"chutzpah@gentoo•org" <chutzpah@gentoo•org>,
	Robin Geuze <robing@transip•nl>, netdev <netdev@vger•kernel.org>
Subject: Re: reproducable panic eviction work queue
Date: Wed, 22 Jul 2015 16:14:53 +0200
Message-ID: <55AFA55D.4000606@cumulusnetworks.com>
In-Reply-To: <55AFA295.1070600@cumulusnetworks.com>

On 07/22/2015 04:03 PM, Nikolay Aleksandrov wrote:
> On 07/22/2015 03:58 PM, Florian Westphal wrote:
>> Nikolay Aleksandrov <nikolay@cumulusnetworks•com> wrote:
>>> On 07/22/2015 10:17 AM, Frank Schreuder wrote:
>>>> I got some additional information from syslog:
>>>>
>>>> Jul 22 09:49:33 dommy0 kernel: [  675.987890] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:42]
>>>> Jul 22 09:49:42 dommy0 kernel: [  685.114033] INFO: rcu_sched self-detected stall on CPU { 3}  (t=39918 jiffies g=988 c=987 q=23168)
>>>>
>>>> Thanks,
>>>> Frank
>>>>
>>>>
>>>
>>> Hi,
>>> It looks like this is happening because of the evict_again logic. I think we
>>> should also add Florian's first suggestion to the patch, simplifying it so
>>> that we just skip the entry if we can't delete its timer. Otherwise we can
>>> restart the eviction, see entries that already had their timer stopped by
>>> us, and keep restarting for a long time.
>>> Here's an updated patch that removes the evict_again logic.
>>
>> Thanks Nik.  I'm afraid this adds a bug when a netns is exiting.
>>
>> Currently, we wait until the timer has finished, but after the change
>> we might destroy the percpu counter while a timer is still executing on
>> another cpu.
>>
>> I pushed a patch series to
>> https://git.breakpoint.cc/cgit/fw/net.git/log/?h=inetfrag_fixes_02
>>
>> It includes this patch with a small change -- deferral of the percpu
>> counter subtraction until after the queue has been freed.
>>
>> Frank -- it would be great if you could test with the four patches in
>> that series applied.
>>
>> I'll then add your Tested-by tag to all of them before submitting this.
>>
>> Thanks again for all your help in getting this fixed!
>>
> 
> Sure, I didn't think it through, just supplied it for the test. :-)
> Thanks for fixing it up!
> 

The patches look great; this way even the INET_FRAG_EVICTED flag will not be
accidentally cleared. I'll give them a try.
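
For anyone following along, the "skip the entry if we can't delete its timer"
idea discussed above can be modelled roughly as below. This is only a userspace
sketch and not the kernel code or the actual patch: struct model_queue,
model_del_timer() and model_evict_bucket() are hypothetical names made up for
illustration, and the timer is reduced to a single boolean. It assumes that a
failed timer deletion means either that the timer is running elsewhere or that
an earlier pass already stopped it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a fragment queue; NOT the kernel's structure. */
struct model_queue {
	bool timer_pending;	/* true while the timer can still be cancelled */
	bool evicted;		/* set once an eviction pass has claimed the entry */
};

/*
 * Models a "delete timer" call: it succeeds only if the timer was still
 * pending. It fails if the timer is already firing or was already stopped.
 */
static bool model_del_timer(struct model_queue *q)
{
	if (!q->timer_pending)
		return false;
	q->timer_pending = false;
	return true;
}

/*
 * Single pass over the bucket with no restart: entries whose timer cannot
 * be deleted are skipped instead of triggering another scan.
 * Returns the number of entries claimed for eviction.
 */
static size_t model_evict_bucket(struct model_queue *qs, size_t n)
{
	size_t evicted = 0;

	for (size_t i = 0; i < n; i++) {
		if (!model_del_timer(&qs[i]))
			continue;	/* skip, do not restart the scan */
		qs[i].evicted = true;
		evicted++;
	}
	return evicted;
}
```

In this model, a second call over the same bucket finds every timer already
stopped and evicts nothing. A restart-on-failure variant would treat those same
entries as a reason to scan again, which is the kind of repeated restarting
described above.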
