public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Vlad Buslov <vladbu@nvidia•com>
To: Pedro Tammela <pctammela@mojatatu•com>
Cc: <netdev@vger•kernel.org>, <davem@davemloft•net>,
	<edumazet@google•com>, <kuba@kernel•org>, <pabeni@redhat•com>,
	<jhs@mojatatu•com>, <xiyou.wangcong@gmail•com>,
	<jiri@resnulli•us>, <marcelo.leitner@gmail•com>
Subject: Re: [PATCH net-next 1/2] net/sched: act_api: rely on rcu in tcf_idr_check_alloc
Date: Tue, 5 Dec 2023 20:34:03 +0200	[thread overview]
Message-ID: <87jzpso53a.fsf@nvidia.com> (raw)
In-Reply-To: <20231205153012.484687-2-pctammela@mojatatu.com>

On Tue 05 Dec 2023 at 12:30, Pedro Tammela <pctammela@mojatatu•com> wrote:
> Instead of relying only on the idrinfo->lock mutex for
> bind/alloc logic, rely on a combination of rcu + mutex + atomics
> to better scale the case where multiple rtnl-less filters are
> binding to the same action object.
>
> Action binding happens when an action index is specified explicitly and
> an action exists which such index exists. Example:

Nit: the first sentence looks mangled, extra 'exists' word and probably
'which' should be 'with'.

>   tc actions add action drop index 1
>   tc filter add ... matchall action drop index 1
>   tc filter add ... matchall action drop index 1
>   tc filter add ... matchall action drop index 1
>   tc filter ls ...
>      filter protocol all pref 49150 matchall chain 0 filter protocol all pref 49150 matchall chain 0 handle 0x1
>      not_in_hw
>            action order 1: gact action drop
>             random type none pass val 0
>             index 1 ref 4 bind 3
>
>    filter protocol all pref 49151 matchall chain 0 filter protocol all pref 49151 matchall chain 0 handle 0x1
>      not_in_hw
>            action order 1: gact action drop
>             random type none pass val 0
>             index 1 ref 4 bind 3
>
>    filter protocol all pref 49152 matchall chain 0 filter protocol all pref 49152 matchall chain 0 handle 0x1
>      not_in_hw
>            action order 1: gact action drop
>             random type none pass val 0
>             index 1 ref 4 bind 3
>
> When no index is specified, as before, grab the mutex and allocate
> in the idr the next available id. In this version, as opposed to before,
> it's simplified to store the -EBUSY pointer instead of the previous
> alloc + replace combination.
>
> When an index is specified, rely on rcu to find if there's an object in
> such index. If there's none, fallback to the above, serializing on the
> mutex and reserving the specified id. If there's one, it can be an -EBUSY
> pointer, in which case we just try again until it's an action, or an action.
> Given the rcu guarantees, the action found could be dead and therefore
> we need to bump the refcount if it's not 0, handling the case it's
> in fact 0.
>
> As bind and the action refcount are already atomics, these increments can
> happen without the mutex protection while many tcf_idr_check_alloc race
> to bind to the same action instance.
>
> Signed-off-by: Pedro Tammela <pctammela@mojatatu•com>
> ---
>  net/sched/act_api.c | 56 +++++++++++++++++++++++++++------------------
>  1 file changed, 34 insertions(+), 22 deletions(-)
>
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index abec5c45b5a4..79a044d2ae02 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -824,43 +824,55 @@ int tcf_idr_check_alloc(struct tc_action_net *tn, u32 *index,
>  	struct tcf_idrinfo *idrinfo = tn->idrinfo;
>  	struct tc_action *p;
>  	int ret;
> +	u32 max;
>  
> -again:
> -	mutex_lock(&idrinfo->lock);
>  	if (*index) {
> +again:
> +		rcu_read_lock();
>  		p = idr_find(&idrinfo->action_idr, *index);
> +
>  		if (IS_ERR(p)) {
>  			/* This means that another process allocated
>  			 * index but did not assign the pointer yet.
>  			 */
> -			mutex_unlock(&idrinfo->lock);
> +			rcu_read_unlock();
>  			goto again;
>  		}
>  
> -		if (p) {
> -			refcount_inc(&p->tcfa_refcnt);
> -			if (bind)
> -				atomic_inc(&p->tcfa_bindcnt);
> -			*a = p;
> -			ret = 1;
> -		} else {
> -			*a = NULL;
> -			ret = idr_alloc_u32(&idrinfo->action_idr, NULL, index,
> -					    *index, GFP_KERNEL);
> -			if (!ret)
> -				idr_replace(&idrinfo->action_idr,
> -					    ERR_PTR(-EBUSY), *index);
> +		if (!p) {
> +			/* Empty slot, try to allocate it */
> +			max = *index;
> +			rcu_read_unlock();
> +			goto new;
> +		}
> +
> +		if (!refcount_inc_not_zero(&p->tcfa_refcnt)) {
> +			/* Action was deleted in parallel */
> +			rcu_read_unlock();
> +			return -ENOENT;

Current version doesn't return ENOENT since it is synchronous. You are
now introducing basically a change to UAPI since users of this function
(individual actions) are not prepared to retry on ENOENT and will
propagate the error up the call chain. I guess you need to try to create
a new action with specified index instead.

>  		}
> +
> +		if (bind)
> +			atomic_inc(&p->tcfa_bindcnt);
> +		*a = p;
> +
> +		rcu_read_unlock();
> +
> +		return 1;
>  	} else {
> +		/* Find a slot */
>  		*index = 1;
> -		*a = NULL;
> -		ret = idr_alloc_u32(&idrinfo->action_idr, NULL, index,
> -				    UINT_MAX, GFP_KERNEL);
> -		if (!ret)
> -			idr_replace(&idrinfo->action_idr, ERR_PTR(-EBUSY),
> -				    *index);
> +		max = UINT_MAX;
>  	}
> +
> +new:
> +	*a = NULL;
> +
> +	mutex_lock(&idrinfo->lock);
> +	ret = idr_alloc_u32(&idrinfo->action_idr, ERR_PTR(-EBUSY), index, max,
> +			    GFP_KERNEL);

What if multiple concurrent tasks didn't find the action by index with
rcu and get here, synchronizing on the idrinfo->lock? It looks like
after the one who got the lock first successfully allocates the index
everyone else will fail (also propagating ENOSPACE to the user). I guess
you need some mechanism to account for such case and retry.

>  	mutex_unlock(&idrinfo->lock);
> +
>  	return ret;
>  }
>  EXPORT_SYMBOL(tcf_idr_check_alloc);


  reply	other threads:[~2023-12-05 18:52 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-05 15:30 [PATCH net-next 0/2] net/sched: optimizations around action binding and init Pedro Tammela
2023-12-05 15:30 ` [PATCH net-next 1/2] net/sched: act_api: rely on rcu in tcf_idr_check_alloc Pedro Tammela
2023-12-05 18:34   ` Vlad Buslov [this message]
2023-12-05 20:19     ` Pedro Tammela
2023-12-06  9:52       ` Vlad Buslov
2023-12-08 21:07         ` Pedro Tammela
2023-12-11 16:10           ` Vlad Buslov
2023-12-05 15:30 ` [PATCH net-next 2/2] net/sched: act_api: skip idr replace on bound actions Pedro Tammela

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87jzpso53a.fsf@nvidia.com \
    --to=vladbu@nvidia$(echo .)com \
    --cc=davem@davemloft$(echo .)net \
    --cc=edumazet@google$(echo .)com \
    --cc=jhs@mojatatu$(echo .)com \
    --cc=jiri@resnulli$(echo .)us \
    --cc=kuba@kernel$(echo .)org \
    --cc=marcelo.leitner@gmail$(echo .)com \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=pabeni@redhat$(echo .)com \
    --cc=pctammela@mojatatu$(echo .)com \
    --cc=xiyou.wangcong@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox