public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Ulf Samuelsson <netdev@emagii•com>
To: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux•com>,
	ulf@emagii•com, netdev@vger•kernel.org
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6•org>
Subject: Re: [PATCH] neighbour.c: Avoid GC directly after state change
Date: Wed, 18 Mar 2015 00:27:17 +0100	[thread overview]
Message-ID: <5508B855.9090303@emagii.com> (raw)
In-Reply-To: <55081E87.7020603@miraclelinux.com>

Den 2015-03-17 13:31, YOSHIFUJI Hideaki skrev:
> Hello,
>
> Ulf Samuelsson wrote:
>> Den 2015-03-16 05:57, YOSHIFUJI Hideaki/吉藤英明 skrev:
>>> Hello.
>>>
>>> Ulf Samuelsson wrote:
>>>> Den 2015-03-15 09:27, YOSHIFUJI Hideaki skrev:
>>>>> Hello.
>>>>>
>>>>> Ulf Samuelsson wrote:
>>>>>> From: Ulf Samuelsson <ulf@emagii•com>
>>>>>>
>>>>>> The neighbour state is changed in the ARP timer handler.
>>>>>> If the state is changed to NUD_STALE, then the neighbour
>>>>>> entry becomes a candidate for garbage collection.
>>>>>>
>>>>>> The garbage collection is handled by a "periodic work" routine.
>>>>>>
>>>>>> When :
>>>>>>
>>>>>>     * noone refers to the entry
>>>>>>     * the state is no longer valid (I.E: NUD_STALE).
>>>>> NUD_STALE is still valid.
>>>> Yes, my fault.
>>>> The condition which causes garbage collection to be skipped is.
>>>>
>>>>
>>>>      if (state & (NUD_PERMANENT | NUD_IN_TIMER)) {
>>>>
>>>>      NUD_STALE is not part of that, so GC will not be skipped,
>>>>      and therefore the patch is needed if you want to be able
>>>>      to use the API to modify the neigh statemachine..
>>>>>
>>>>>>     * the timeout value has  been reached or state is FAILED
>>>>>>
>>>>>> the "periodic work" routine will notify
>>>>>> the stack that the entry should be deleted.
>>>>>>
>>>>>> A user application monitoring and controlling the neighbour table
>>>>>> using NETLINK may fail, if the "period work" routine is run
>>>>>> directly after the state has been changed to NUD_STALE,
>>>>>> but before the user application has had a chance to change
>>>>>> the state to something valid.
>>>>>>
>>>>>> The "period work" routine will detect the NUD_STALE state
>>>>>> and if the timeout value has been reached, it will notify the stack
>>>>>> that the entry should be deleted.
>>>>>>
>>>>>> The patch adds a check in the periodic work routine
>>>>>> which will skip test for garbage collection
>>>>>> unless a number of ticks has passed since the last time
>>>>>> the neighbour entry state was changed.
>>>>>>
>>>>>> The feature is controlled through Kconfig
>>>>>>
>>>>>> The featuree is enabled by setting ARP_GC_APPLY_GUARDBAND
>>>>>> The guardband time (in ticks) is set in ARP_GC_GUARDBAND
>>>>>> Default time is 100 ms if HZ_### is set.
>>>>> We have "lower limit" not to start releasing neighbour entries.
>>>>> Try increasing gc_thresh1.
>>>> Why would  that work?
>>>>
>>>> The only place where this is used is
>>>>
>>>>      "if (atomic_read(&tbl->entries) < tbl->gc_thresh1)"
>>>>
>>>> tbl->entries is related to how many entries there are in the 
>>>> neighbour table.
>>>>
>>>> The only way I think this would work, is if this is raised so high 
>>>> that
>>>> garbage collection does not occur.
>>>>
>>>> That is not the intention.
>>>>
>>>> It does not solve the race condition between the timer_handler and 
>>>> the periodic_work.
>>>
>>> I don't think it is a race.
>> And I think you are wrong and my logging shows that it is.
>
> We do not gurantee holding "stale" entries more than
> gc_thresh1, so it shall not be called as a race at all.


And that is a problem, which results in traffic loss.

>
>>
>>>
>>> You can try increasing gc_staletime to hold each entry based
>>> on last usage.  Plus, you can "confirm" neighbors by
>>> MSG_CONFIRM.
>>>
>>> Note that if the number of entries becomes high, "forced GC" will
>>> drop valid, "not connected" entries as well.
>>
>> I can try increasing gc_staletime, but its a waste of time, because 
>> it is not the last usage which is interesting.
>> What is interesting is the time when the entry state was updated by 
>> the timer handler.
>>
>> Pls explain further MSG_CONFIRM.
>
> Typically, base reachable time is set to 30sec and gc_staletime is
> set to 60sec.  So, entries are expected to be held for a while
> after it has become "stale", no?
>
> MSG_CONFIRM is a sendmsg() flag.  It allows user-space application
> to confirm reachability of neighbor.  It refreshes "confirmed"
> time. In neigh_periodic_work(), "used" time is updated to
> "confirmed" time if "used" time is before "confirmed" time.
>
May work, if the application was totally redesigned,

> ping(8), ping6(8), tftpd(8) use that flag, for example.
>
>
>> The problem occurs when the periodic_work routine runs immediately
>> after the timer handler has changes the state to NUD_STALE and
>> the entry has reached the expiry time.
>
> It is what we expect.
>
> In neigh_periodic_work(), you may try to release non-STALE
> entries first, and then STALE entries if the number of entries is
> still high.
That sounds much more complex than the proposed fix.

>
> --yoshfuji

  reply	other threads:[~2015-03-17 23:27 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-11 20:28 [PATCH] neighbour.c: Avoid GC directly after state change Ulf Samuelsson
2015-03-15  8:27 ` YOSHIFUJI Hideaki
2015-03-15 19:34   ` Ulf Samuelsson
2015-03-16  4:57     ` YOSHIFUJI Hideaki/吉藤英明
2015-03-16 19:55       ` Ulf Samuelsson
2015-03-17 12:31         ` YOSHIFUJI Hideaki
2015-03-17 23:27           ` Ulf Samuelsson [this message]
  -- strict thread matches above, loose matches on Subject: below --
2015-03-11 21:01 Ulf Samuelsson
2015-03-12 18:26 ` David Miller
2015-03-17 23:33   ` Ulf Samuelsson
2015-03-18  1:56     ` YOSHIFUJI Hideaki/吉藤英明
2015-04-10  8:26   ` Ulf Samuelsson
2015-04-15 13:40     ` Ulf Samuelsson
2015-04-16  5:16     ` YOSHIFUJI Hideaki
2015-04-17  8:03       ` Ulf Samuelsson
2015-04-20  2:33         ` YOSHIFUJI Hideaki
2015-04-20 12:48           ` Ulf Samuelsson
2015-04-21  3:58             ` YOSHIFUJI Hideaki
2015-04-22  7:42               ` Ulf Samuelsson
2015-04-22 10:46                 ` YOSHIFUJI Hideaki
2015-04-22 11:49                   ` Ulf Samuelsson
2015-05-08  9:39                     ` Ulf Samuelsson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5508B855.9090303@emagii.com \
    --to=netdev@emagii$(echo .)com \
    --cc=hideaki.yoshifuji@miraclelinux$(echo .)com \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=ulf@emagii$(echo .)com \
    --cc=yoshfuji@linux-ipv6$(echo .)org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox