public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Vadim Fedorenko <vadim.fedorenko@linux•dev>
To: Willem de Bruijn <willemdebruijn.kernel@gmail•com>,
	Vadim Fedorenko <vadfed@meta•com>,
	Jakub Kicinski <kuba@kernel•org>,
	Willem de Bruijn <willemb@google•com>,
	Paolo Abeni <pabeni@redhat•com>, David Ahern <dsahern@kernel•org>
Cc: netdev@vger•kernel.org
Subject: Re: [RFC PATCH] net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message
Date: Thu, 29 Aug 2024 15:13:43 +0100	[thread overview]
Message-ID: <dfe033f1-cc61-4be3-a59d-e6b623591cc6@linux.dev> (raw)
In-Reply-To: <66d0783ca3dc4_3895fa2946a@willemb.c.googlers.com.notmuch>

On 29/08/2024 14:31, Willem de Bruijn wrote:
> Vadim Fedorenko wrote:
>> SOF_TIMESTAMPING_OPT_ID socket option flag gives a way to correlate TX
>> timestamps
> 
> +1 on the feature. Few minor points only.
> 
> Not a hard requirement, but would be nice if there was a test,
> e.g., as a tools/testing/../txtimestamp.c extension.

Sure, I'll add some tests in the next version.


>> and packets sent via socket. Unfortunately, there is no way
>> to reliably predict socket timestamp ID value in case of error returned
>> by sendmsg [1].
> 
> Might be good to copy more context from the discussion to explain why
> reliable OPT_ID is infeasible. For UDP, it is as simple as lockless
> transmit. For RAW, things like MSG_MORE come into play.

Ok, I'll add it, thanks!

>> This patch adds new control message type to give user-space
>> software an opportunity to control the mapping between packets and
>> values by providing ID with each sendmsg. This works fine for UDP
>> sockets only, and explicit check is added to control message parser.
>> Also, there is no easy way to use 0 as provided ID, so this is value
>> treated as invalid.
> 
> This is because the code branches on non-zero value in the cookie,
> else uses ts_key. Please make this explicit. Or perhaps better, add a
> bit in the cookie so that the full 32-bit space can be used.

Adding a bit in the cookie is not enough, I have to add another flag to
inet_cork. And we are running out of space for tx flags, 
inet_cork::tx_flags is u8 and we have only 1 bit left for SKBTX* enum.
Do you think it's OK to use this last bit for OPT_ID feature?

>> [1] https://lore.kernel.org/netdev/CALCETrU0jB+kg0mhV6A8mrHfTE1D1pr1SD_B9Eaa9aDPfgHdtA@mail.gmail.com/
>>
>> Signed-off-by: Vadim Fedorenko <vadfed@meta•com>
>> ---
>>   include/net/inet_sock.h           |  1 +
>>   include/net/sock.h                |  1 +
>>   include/uapi/asm-generic/socket.h |  2 ++
>>   net/core/sock.c                   | 14 ++++++++++++++
>>   net/ipv4/ip_output.c              | 11 +++++++++--
>>   net/ipv6/ip6_output.c             | 11 +++++++++--
>>   6 files changed, 36 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
>> index 394c3b66065e..7e8545311557 100644
>> --- a/include/net/inet_sock.h
>> +++ b/include/net/inet_sock.h
>> @@ -174,6 +174,7 @@ struct inet_cork {
>>   	__s16			tos;
>>   	char			priority;
>>   	__u16			gso_size;
>> +	u32			ts_opt_id;
>>   	u64			transmit_time;
>>   	u32			mark;
>>   };
> 
> Ah there's a hole here. Nice!
> 
>> diff --git a/include/net/sock.h b/include/net/sock.h
>> index f51d61fab059..73e21dad5660 100644
>> --- a/include/net/sock.h
>> +++ b/include/net/sock.h
>> @@ -1794,6 +1794,7 @@ struct sockcm_cookie {
>>   	u64 transmit_time;
>>   	u32 mark;
>>   	u32 tsflags;
>> +	u32 ts_opt_id;
>>   };
>>   
>>   static inline void sockcm_init(struct sockcm_cookie *sockc,
>> diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
>> index 8ce8a39a1e5f..db3df3e74b01 100644
>> --- a/include/uapi/asm-generic/socket.h
>> +++ b/include/uapi/asm-generic/socket.h
>> @@ -135,6 +135,8 @@
>>   #define SO_PASSPIDFD		76
>>   #define SO_PEERPIDFD		77
>>   
>> +#define SCM_TS_OPT_ID		78
>> +
>>   #if !defined(__KERNEL__)
>>   
>>   #if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
>> diff --git a/net/core/sock.c b/net/core/sock.c
>> index 468b1239606c..918cb6a0dcba 100644
>> --- a/net/core/sock.c
>> +++ b/net/core/sock.c
>> @@ -2859,6 +2859,20 @@ int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
>>   			return -EINVAL;
>>   		sockc->transmit_time = get_unaligned((u64 *)CMSG_DATA(cmsg));
>>   		break;
>> +	case SCM_TS_OPT_ID:
>> +		/* allow this option for UDP sockets only */
>> +		if (!sk_is_udp(sk))
>> +			return -EINVAL;
>> +		tsflags = READ_ONCE(sk->sk_tsflags);
>> +		if (!(tsflags & SOF_TIMESTAMPING_OPT_ID))
>> +			return -EINVAL;
>> +		if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32)))
>> +			return -EINVAL;
>> +		sockc->ts_opt_id = get_unaligned((u32 *)CMSG_DATA(cmsg));
> 
> Is the get_unaligned here needed? I don't usually see that on
> CMSG_DATA accesses. Even though they are indeed likely to be
> unaligned.

Well, maybe you are right and we don't need get_unaligned for u32
here, at least SO_MARK uses direct access. I have no strong opinion.

>> +		/* do not allow 0 as packet id for timestamp */
>> +		if (!sockc->ts_opt_id)
>> +			return -EINVAL;
>> +		break;
>>   	/* SCM_RIGHTS and SCM_CREDENTIALS are semantically in SOL_UNIX. */
>>   	case SCM_RIGHTS:
>>   	case SCM_CREDENTIALS:
>> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
>> index b90d0f78ac80..f1e6695cafd2 100644
>> --- a/net/ipv4/ip_output.c
>> +++ b/net/ipv4/ip_output.c
>> @@ -1050,8 +1050,14 @@ static int __ip_append_data(struct sock *sk,
>>   
>>   	hold_tskey = cork->tx_flags & SKBTX_ANY_TSTAMP &&
>>   		     READ_ONCE(sk->sk_tsflags) & SOF_TIMESTAMPING_OPT_ID;
>> -	if (hold_tskey)
>> -		tskey = atomic_inc_return(&sk->sk_tskey) - 1;
>> +	if (hold_tskey) {
>> +                if (cork->ts_opt_id) {
>> +                        hold_tskey = false;
>> +                        tskey = cork->ts_opt_id;
>> +                } else {
>> +                        tskey = atomic_inc_return(&sk->sk_tskey) - 1;
>> +                }
>> +	}
>>   
>>   	/* So, what's going on in the loop below?
>>   	 *
>> @@ -1324,6 +1330,7 @@ static int ip_setup_cork(struct sock *sk, struct inet_cork *cork,
>>   	cork->mark = ipc->sockc.mark;
>>   	cork->priority = ipc->priority;
>>   	cork->transmit_time = ipc->sockc.transmit_time;
>> +	cork->ts_opt_id = ipc->sockc.ts_opt_id;
>>   	cork->tx_flags = 0;
>>   	sock_tx_timestamp(sk, ipc->sockc.tsflags, &cork->tx_flags);
>>   
>> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>> index f26841f1490f..602064250546 100644
>> --- a/net/ipv6/ip6_output.c
>> +++ b/net/ipv6/ip6_output.c
>> @@ -1401,6 +1401,7 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
>>   	cork->base.gso_size = ipc6->gso_size;
>>   	cork->base.tx_flags = 0;
>>   	cork->base.mark = ipc6->sockc.mark;
>> +	cork->base.ts_opt_id = ipc6->sockc.ts_opt_id;
>>   	sock_tx_timestamp(sk, ipc6->sockc.tsflags, &cork->base.tx_flags);
>>   
>>   	cork->base.length = 0;
>> @@ -1545,8 +1546,14 @@ static int __ip6_append_data(struct sock *sk,
>>   
>>   	hold_tskey = cork->tx_flags & SKBTX_ANY_TSTAMP &&
>>   		     READ_ONCE(sk->sk_tsflags) & SOF_TIMESTAMPING_OPT_ID;
>> -	if (hold_tskey)
>> -		tskey = atomic_inc_return(&sk->sk_tskey) - 1;
>> +	if (hold_tskey) {
>> +		if (cork->ts_opt_id) {
>> +			hold_tskey = false;
>> +			tskey = cork->ts_opt_id;
>> +		} else {
>> +			tskey = atomic_inc_return(&sk->sk_tskey) - 1;
>> +		}
>> +	}
>>   
>>   	/*
>>   	 * Let's try using as much space as possible.
>> -- 
>> 2.43.5
>>
> 
> 


  reply	other threads:[~2024-08-29 14:13 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-29  0:03 [RFC PATCH] net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message Vadim Fedorenko
2024-08-29 13:31 ` Willem de Bruijn
2024-08-29 14:13   ` Vadim Fedorenko [this message]
2024-08-29 14:27     ` Willem de Bruijn
2024-08-29 15:00       ` Vadim Fedorenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dfe033f1-cc61-4be3-a59d-e6b623591cc6@linux.dev \
    --to=vadim.fedorenko@linux$(echo .)dev \
    --cc=dsahern@kernel$(echo .)org \
    --cc=kuba@kernel$(echo .)org \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=pabeni@redhat$(echo .)com \
    --cc=vadfed@meta$(echo .)com \
    --cc=willemb@google$(echo .)com \
    --cc=willemdebruijn.kernel@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox