* [PATCH] xfrm: Fix oops in __xfrm_state_delete()
@ 2022-10-31 15:26 Thomas Jarosch
2022-11-01 4:39 ` Herbert Xu
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Jarosch @ 2022-10-31 15:26 UTC (permalink / raw)
To: Steffen Klassert, Herbert Xu, Antony Antony, Sabrina Dubroca,
Leon Romanovsky, Roth Mark, Zhihao Chen, Tobias Brunner, netdev
Kernel 5.14 added a new "byseq" index to speed
up xfrm_state lookups by sequence number in commit
fe9f1d8779cb ("xfrm: add state hashtable keyed by seq")
While the patch was thorough, the function pfkey_send_new_mapping()
in net/af_key.c also modifies x->km.seq and never added
the current xfrm_state to the "byseq" index.
This leads to the following kernel Ooops:
BUG: kernel NULL pointer dereference, address: 0000000000000000
..
RIP: 0010:__xfrm_state_delete+0xc9/0x1c0
..
Call Trace:
<TASK>
xfrm_state_delete+0x1e/0x40
xfrm_del_sa+0xb0/0x110 [xfrm_user]
xfrm_user_rcv_msg+0x12d/0x270 [xfrm_user]
? remove_entity_load_avg+0x8a/0xa0
? copy_to_user_state_extra+0x580/0x580 [xfrm_user]
netlink_rcv_skb+0x51/0x100
xfrm_netlink_rcv+0x30/0x50 [xfrm_user]
netlink_unicast+0x1a6/0x270
netlink_sendmsg+0x22a/0x480
__sys_sendto+0x1a6/0x1c0
? __audit_syscall_entry+0xd8/0x130
? __audit_syscall_exit+0x249/0x2b0
__x64_sys_sendto+0x23/0x30
do_syscall_64+0x3a/0x90
entry_SYSCALL_64_after_hwframe+0x61/0xcb
Exact location of the crash in __xfrm_state_delete():
if (x->km.seq)
hlist_del_rcu(&x->byseq);
The hlist_node "byseq" was never populated.
The bug only triggers if a new NAT traversal mapping (changed IP or port)
is detected in esp_input_done2() / esp6_input_done2(), which in turn
indirectly calls pfkey_send_new_mapping() *if* the kernel is compiled
with CONFIG_NET_KEY and "af_key" is active or loaded.
The PF_KEYv2 message SADB_X_NAT_T_NEW_MAPPING is not part of RFC 2367.
Various implementations have been examined how they handle
the "sadb_msg_seq" header field:
- racoon (Android): does not process SADB_X_NAT_T_NEW_MAPPING
- strongswan: does not care about sadb_msg_seq
- openswan: does not care about sadb_msg_seq
Since there is no standard how PF_KEYv2 sadb_msg_seq should be populated
for SADB_X_NAT_T_NEW_MAPPING and it's not used in popular
implementations and is actually the sequence number of the
PF_KEYv2 message itself, we can fix the root cause of the oops
and not modify "x->km.seq" in pfkey_send_new_mapping().
The update of "km.seq" looks like a copy'n'paste error
from pfkey_send_acquire(). SADB_ACQUIRE must indeed assign a unique km.seq
number according to RFC 2367. It has been verified that code paths
involving pfkey_send_acquire() don't cause the same Oops.
PF_KEYv2 SADB_X_NAT_T_NEW_MAPPING support was originally added here:
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
commit cbc3488685b20e7b2a98ad387a1a816aada569d8
Author: Derek Atkins <derek@ihtfp•com>
AuthorDate: Wed Apr 2 13:21:02 2003 -0800
[IPSEC]: Implement UDP Encapsulation framework.
In particular, implement ESPinUDP encapsulation for IPsec
Nat Traversal.
A note on triggering the bug: I was not able to trigger it using VMs.
There is one VPN using a high latency link on our production VPN server
that triggered it like once a day though.
Link: https://github.com/strongswan/strongswan/issues/992
Link: https://lore.kernel.org/netdev/00959f33ee52c4b3b0084d42c430418e502db554.1652340703.git.antony.antony@secunet.com/T/
Link: https://lore.kernel.org/netdev/20221027142455.3975224-1-chenzhihao@meizu.com/T/
Fixes: fe9f1d8779cb ("xfrm: add state hashtable keyed by seq")
Reported-by: Roth Mark <rothm@mail•com>
Reported-by: Zhihao Chen <chenzhihao@meizu•com>
Tested-by: Roth Mark <rothm@mail•com>
Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net•com>
---
net/key/af_key.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/key/af_key.c b/net/key/af_key.c
index c85df5b958d2..65a9ede62d65 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3382,7 +3382,7 @@ static int pfkey_send_new_mapping(struct xfrm_state *x, xfrm_address_t *ipaddr,
hdr->sadb_msg_len = size / sizeof(uint64_t);
hdr->sadb_msg_errno = 0;
hdr->sadb_msg_reserved = 0;
- hdr->sadb_msg_seq = x->km.seq = get_acqseq();
+ hdr->sadb_msg_seq = get_acqseq();
hdr->sadb_msg_pid = 0;
/* SA */
--
2.37.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] xfrm: Fix oops in __xfrm_state_delete()
2022-10-31 15:26 [PATCH] xfrm: Fix oops in __xfrm_state_delete() Thomas Jarosch
@ 2022-11-01 4:39 ` Herbert Xu
2022-11-01 19:10 ` Antony Antony
0 siblings, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2022-11-01 4:39 UTC (permalink / raw)
To: Thomas Jarosch
Cc: Steffen Klassert, Antony Antony, Sabrina Dubroca, Leon Romanovsky,
Roth Mark, Zhihao Chen, Tobias Brunner, netdev
On Mon, Oct 31, 2022 at 04:26:12PM +0100, Thomas Jarosch wrote:
>
> diff --git a/net/key/af_key.c b/net/key/af_key.c
> index c85df5b958d2..65a9ede62d65 100644
> --- a/net/key/af_key.c
> +++ b/net/key/af_key.c
> @@ -3382,7 +3382,7 @@ static int pfkey_send_new_mapping(struct xfrm_state *x, xfrm_address_t *ipaddr,
> hdr->sadb_msg_len = size / sizeof(uint64_t);
> hdr->sadb_msg_errno = 0;
> hdr->sadb_msg_reserved = 0;
> - hdr->sadb_msg_seq = x->km.seq = get_acqseq();
> + hdr->sadb_msg_seq = get_acqseq();
This looks broken. x->km.seq is part of the state which you are
changing. Shouldn't you do whatever xfrm_user does in the same
situation?
Thanks,
--
Email: Herbert Xu <herbert@gondor•apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] xfrm: Fix oops in __xfrm_state_delete()
2022-11-01 4:39 ` Herbert Xu
@ 2022-11-01 19:10 ` Antony Antony
2022-11-02 7:07 ` Herbert Xu
0 siblings, 1 reply; 9+ messages in thread
From: Antony Antony @ 2022-11-01 19:10 UTC (permalink / raw)
To: Herbert Xu
Cc: Thomas Jarosch, Steffen Klassert, Antony Antony, Sabrina Dubroca,
Leon Romanovsky, Roth Mark, Zhihao Chen, Tobias Brunner, netdev
On Tue, Nov 01, 2022 at 12:39:48 +0800, Herbert Xu wrote:
> On Mon, Oct 31, 2022 at 04:26:12PM +0100, Thomas Jarosch wrote:
> >
> > diff --git a/net/key/af_key.c b/net/key/af_key.c
> > index c85df5b958d2..65a9ede62d65 100644
> > --- a/net/key/af_key.c
> > +++ b/net/key/af_key.c
> > @@ -3382,7 +3382,7 @@ static int pfkey_send_new_mapping(struct xfrm_state *x, xfrm_address_t *ipaddr,
> > hdr->sadb_msg_len = size / sizeof(uint64_t);
> > hdr->sadb_msg_errno = 0;
> > hdr->sadb_msg_reserved = 0;
> > - hdr->sadb_msg_seq = x->km.seq = get_acqseq();
This line looks very odd.
> > + hdr->sadb_msg_seq = get_acqseq();
>
> This looks broken. x->km.seq is part of the state which you are
> changing. Shouldn't you do whatever xfrm_user does in the same
> situation?
xfrm_user sets msg_seq to zero in mapping change message. seq is only useful for
acquire message. I think setting to zero would be a better fix.
- hdr->sadb_msg_seq = x->km.seq = get_acqseq();
+ hdr->sadb_msg_seq = 0;
While increasing x->km.seq in every call to pfkey_send_new_mapping()
could be an issue, would it alone explan the crash?
Tobias would pfkey_send_new_mapping() called in a default setting?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] xfrm: Fix oops in __xfrm_state_delete()
2022-11-01 19:10 ` Antony Antony
@ 2022-11-02 7:07 ` Herbert Xu
2022-11-02 8:31 ` Thomas Jarosch
0 siblings, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2022-11-02 7:07 UTC (permalink / raw)
To: Antony Antony
Cc: Thomas Jarosch, Steffen Klassert, Sabrina Dubroca,
Leon Romanovsky, Roth Mark, Zhihao Chen, Tobias Brunner, netdev
On Tue, Nov 01, 2022 at 08:10:21PM +0100, Antony Antony wrote:
>
> xfrm_user sets msg_seq to zero in mapping change message. seq is only useful for
Oh I had misread the patch and thought this was send_acquire.
> acquire message. I think setting to zero would be a better fix.
>
> - hdr->sadb_msg_seq = x->km.seq = get_acqseq();
> + hdr->sadb_msg_seq = 0;
>
> While increasing x->km.seq in every call to pfkey_send_new_mapping()
> could be an issue, would it alone explan the crash?
Probably, if you change the state without moving it to the right
hash slot then the xfrm state hash table will be inconsistent.
We should copy the xfrm_user behaviour which is to leave x->km.seq
alone. So the patch should change the above line to
hdr->sadb_msg_seq = x->km.seq;
Thanks,
--
Email: Herbert Xu <herbert@gondor•apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] xfrm: Fix oops in __xfrm_state_delete()
2022-11-02 7:07 ` Herbert Xu
@ 2022-11-02 8:31 ` Thomas Jarosch
2022-11-02 10:18 ` [PATCH v2] " Thomas Jarosch
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Jarosch @ 2022-11-02 8:31 UTC (permalink / raw)
To: Herbert Xu
Cc: Antony Antony, Steffen Klassert, Sabrina Dubroca, Leon Romanovsky,
Roth Mark, Zhihao Chen, Tobias Brunner, netdev
Hi Herbert,
You wrote on Wed, Nov 02, 2022 at 03:07:57PM +0800:
> > xfrm_user sets msg_seq to zero in mapping change message. seq is only useful for
>
> Oh I had misread the patch and thought this was send_acquire.
it's a complex bug after all ^^ it took many printks() to trace
the flow of the state corruption on the production system.
> > acquire message. I think setting to zero would be a better fix.
> >
> > - hdr->sadb_msg_seq = x->km.seq = get_acqseq();
> > + hdr->sadb_msg_seq = 0;
> >
> > While increasing x->km.seq in every call to pfkey_send_new_mapping()
> > could be an issue, would it alone explan the crash?
>
> Probably, if you change the state without moving it to the right
> hash slot then the xfrm state hash table will be inconsistent.
in the observed cases, km.seq is always zero before. So it was never added to
the byseq hash table in the first place, resulting in the NULL pointer Oops.
If km.seq would be non-zero before entering pfkey_send_new_mapping(),
then of course the xfrm_state would stay in the wrong hash table bucket.
The only other xfrm_states I've seen in my extensive tracing with a non-zero
sequence number were ACQUIRE states and I'm not sure those will ever end up on
the pfkey_send_new_mapping() code path. Either way, let's fix the root cause.
> We should copy the xfrm_user behaviour which is to leave x->km.seq
> alone. So the patch should change the above line to
>
> hdr->sadb_msg_seq = x->km.seq;
thanks for your feedback, I'll update the patch and send a v2.
I'll also put it in production tonight.
Cheers,
Thomas
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v2] xfrm: Fix oops in __xfrm_state_delete()
2022-11-02 8:31 ` Thomas Jarosch
@ 2022-11-02 10:18 ` Thomas Jarosch
2022-11-03 10:32 ` Antony Antony
2022-11-04 4:43 ` Herbert Xu
0 siblings, 2 replies; 9+ messages in thread
From: Thomas Jarosch @ 2022-11-02 10:18 UTC (permalink / raw)
To: Herbert Xu
Cc: Antony Antony, Steffen Klassert, Sabrina Dubroca, Leon Romanovsky,
Roth Mark, Zhihao Chen, Tobias Brunner, netdev
Kernel 5.14 added a new "byseq" index to speed
up xfrm_state lookups by sequence number in commit
fe9f1d8779cb ("xfrm: add state hashtable keyed by seq")
While the patch was thorough, the function pfkey_send_new_mapping()
in net/af_key.c also modifies x->km.seq and never added
the current xfrm_state to the "byseq" index.
This leads to the following kernel Ooops:
BUG: kernel NULL pointer dereference, address: 0000000000000000
..
RIP: 0010:__xfrm_state_delete+0xc9/0x1c0
..
Call Trace:
<TASK>
xfrm_state_delete+0x1e/0x40
xfrm_del_sa+0xb0/0x110 [xfrm_user]
xfrm_user_rcv_msg+0x12d/0x270 [xfrm_user]
? remove_entity_load_avg+0x8a/0xa0
? copy_to_user_state_extra+0x580/0x580 [xfrm_user]
netlink_rcv_skb+0x51/0x100
xfrm_netlink_rcv+0x30/0x50 [xfrm_user]
netlink_unicast+0x1a6/0x270
netlink_sendmsg+0x22a/0x480
__sys_sendto+0x1a6/0x1c0
? __audit_syscall_entry+0xd8/0x130
? __audit_syscall_exit+0x249/0x2b0
__x64_sys_sendto+0x23/0x30
do_syscall_64+0x3a/0x90
entry_SYSCALL_64_after_hwframe+0x61/0xcb
Exact location of the crash in __xfrm_state_delete():
if (x->km.seq)
hlist_del_rcu(&x->byseq);
The hlist_node "byseq" was never populated.
The bug only triggers if a new NAT traversal mapping (changed IP or port)
is detected in esp_input_done2() / esp6_input_done2(), which in turn
indirectly calls pfkey_send_new_mapping() *if* the kernel is compiled
with CONFIG_NET_KEY and "af_key" is active.
The PF_KEYv2 message SADB_X_NAT_T_NEW_MAPPING is not part of RFC 2367.
Various implementations have been examined how they handle
the "sadb_msg_seq" header field:
- racoon (Android): does not process SADB_X_NAT_T_NEW_MAPPING
- strongswan: does not care about sadb_msg_seq
- openswan: does not care about sadb_msg_seq
There is no standard how PF_KEYv2 sadb_msg_seq should be populated
for SADB_X_NAT_T_NEW_MAPPING and it's not used in popular
implementations either. Herbert Xu suggested we should just
use the current km.seq value as is. This fixes the root cause
of the oops since we no longer modify km.seq itself.
The update of "km.seq" looks like a copy'n'paste error
from pfkey_send_acquire(). SADB_ACQUIRE must indeed assign a unique km.seq
number according to RFC 2367. It has been verified that code paths
involving pfkey_send_acquire() don't cause the same Oops.
PF_KEYv2 SADB_X_NAT_T_NEW_MAPPING support was originally added here:
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
commit cbc3488685b20e7b2a98ad387a1a816aada569d8
Author: Derek Atkins <derek@ihtfp•com>
AuthorDate: Wed Apr 2 13:21:02 2003 -0800
[IPSEC]: Implement UDP Encapsulation framework.
In particular, implement ESPinUDP encapsulation for IPsec
Nat Traversal.
A note on triggering the bug: I was not able to trigger it using VMs.
There is one VPN using a high latency link on our production VPN server
that triggered it like once a day though.
Link: https://github.com/strongswan/strongswan/issues/992
Link: https://lore.kernel.org/netdev/00959f33ee52c4b3b0084d42c430418e502db554.1652340703.git.antony.antony@secunet.com/T/
Link: https://lore.kernel.org/netdev/20221027142455.3975224-1-chenzhihao@meizu.com/T/
Fixes: fe9f1d8779cb ("xfrm: add state hashtable keyed by seq")
Reported-by: Roth Mark <rothm@mail•com>
Reported-by: Zhihao Chen <chenzhihao@meizu•com>
Tested-by: Roth Mark <rothm@mail•com>
Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net•com>
---
net/key/af_key.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/key/af_key.c b/net/key/af_key.c
index c85df5b958d2..a4c128bab377 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3382,7 +3382,7 @@ static int pfkey_send_new_mapping(struct xfrm_state *x, xfrm_address_t *ipaddr,
hdr->sadb_msg_len = size / sizeof(uint64_t);
hdr->sadb_msg_errno = 0;
hdr->sadb_msg_reserved = 0;
- hdr->sadb_msg_seq = x->km.seq = get_acqseq();
+ hdr->sadb_msg_seq = x->km.seq;
hdr->sadb_msg_pid = 0;
/* SA */
--
2.37.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v2] xfrm: Fix oops in __xfrm_state_delete()
2022-11-02 10:18 ` [PATCH v2] " Thomas Jarosch
@ 2022-11-03 10:32 ` Antony Antony
2022-11-04 4:43 ` Herbert Xu
1 sibling, 0 replies; 9+ messages in thread
From: Antony Antony @ 2022-11-03 10:32 UTC (permalink / raw)
To: Thomas Jarosch
Cc: Herbert Xu, Antony Antony, Steffen Klassert, Sabrina Dubroca,
Leon Romanovsky, Roth Mark, Zhihao Chen, Tobias Brunner, netdev
On Wed, Nov 02, 2022 at 11:18:48 +0100, Thomas Jarosch wrote:
> Kernel 5.14 added a new "byseq" index to speed
> up xfrm_state lookups by sequence number in commit
> fe9f1d8779cb ("xfrm: add state hashtable keyed by seq")
>
> While the patch was thorough, the function pfkey_send_new_mapping()
> in net/af_key.c also modifies x->km.seq and never added
> the current xfrm_state to the "byseq" index.
>
> This leads to the following kernel Ooops:
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> ..
> RIP: 0010:__xfrm_state_delete+0xc9/0x1c0
> ..
> Call Trace:
> <TASK>
> xfrm_state_delete+0x1e/0x40
> xfrm_del_sa+0xb0/0x110 [xfrm_user]
> xfrm_user_rcv_msg+0x12d/0x270 [xfrm_user]
> ? remove_entity_load_avg+0x8a/0xa0
> ? copy_to_user_state_extra+0x580/0x580 [xfrm_user]
> netlink_rcv_skb+0x51/0x100
> xfrm_netlink_rcv+0x30/0x50 [xfrm_user]
> netlink_unicast+0x1a6/0x270
> netlink_sendmsg+0x22a/0x480
> __sys_sendto+0x1a6/0x1c0
> ? __audit_syscall_entry+0xd8/0x130
> ? __audit_syscall_exit+0x249/0x2b0
> __x64_sys_sendto+0x23/0x30
> do_syscall_64+0x3a/0x90
> entry_SYSCALL_64_after_hwframe+0x61/0xcb
>
> Exact location of the crash in __xfrm_state_delete():
> if (x->km.seq)
> hlist_del_rcu(&x->byseq);
>
> The hlist_node "byseq" was never populated.
>
> The bug only triggers if a new NAT traversal mapping (changed IP or port)
> is detected in esp_input_done2() / esp6_input_done2(), which in turn
> indirectly calls pfkey_send_new_mapping() *if* the kernel is compiled
> with CONFIG_NET_KEY and "af_key" is active.
>
> The PF_KEYv2 message SADB_X_NAT_T_NEW_MAPPING is not part of RFC 2367.
> Various implementations have been examined how they handle
> the "sadb_msg_seq" header field:
>
> - racoon (Android): does not process SADB_X_NAT_T_NEW_MAPPING
> - strongswan: does not care about sadb_msg_seq
> - openswan: does not care about sadb_msg_seq
>
> There is no standard how PF_KEYv2 sadb_msg_seq should be populated
> for SADB_X_NAT_T_NEW_MAPPING and it's not used in popular
> implementations either. Herbert Xu suggested we should just
> use the current km.seq value as is. This fixes the root cause
> of the oops since we no longer modify km.seq itself.
>
> The update of "km.seq" looks like a copy'n'paste error
> from pfkey_send_acquire(). SADB_ACQUIRE must indeed assign a unique km.seq
> number according to RFC 2367. It has been verified that code paths
> involving pfkey_send_acquire() don't cause the same Oops.
>
> PF_KEYv2 SADB_X_NAT_T_NEW_MAPPING support was originally added here:
> https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
>
> commit cbc3488685b20e7b2a98ad387a1a816aada569d8
> Author: Derek Atkins <derek@ihtfp•com>
> AuthorDate: Wed Apr 2 13:21:02 2003 -0800
>
> [IPSEC]: Implement UDP Encapsulation framework.
>
> In particular, implement ESPinUDP encapsulation for IPsec
> Nat Traversal.
>
> A note on triggering the bug: I was not able to trigger it using VMs.
> There is one VPN using a high latency link on our production VPN server
> that triggered it like once a day though.
>
> Link: https://github.com/strongswan/strongswan/issues/992
> Link: https://lore.kernel.org/netdev/00959f33ee52c4b3b0084d42c430418e502db554.1652340703.git.antony.antony@secunet.com/T/
> Link: https://lore.kernel.org/netdev/20221027142455.3975224-1-chenzhihao@meizu.com/T/
>
> Fixes: fe9f1d8779cb ("xfrm: add state hashtable keyed by seq")
> Reported-by: Roth Mark <rothm@mail•com>
> Reported-by: Zhihao Chen <chenzhihao@meizu•com>
> Tested-by: Roth Mark <rothm@mail•com>
> Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net•com>
Acked-by: Antony Antony <antony.antony@secunet•com>
> ---
> net/key/af_key.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/key/af_key.c b/net/key/af_key.c
> index c85df5b958d2..a4c128bab377 100644
> --- a/net/key/af_key.c
> +++ b/net/key/af_key.c
> @@ -3382,7 +3382,7 @@ static int pfkey_send_new_mapping(struct xfrm_state *x, xfrm_address_t *ipaddr,
> hdr->sadb_msg_len = size / sizeof(uint64_t);
> hdr->sadb_msg_errno = 0;
> hdr->sadb_msg_reserved = 0;
> - hdr->sadb_msg_seq = x->km.seq = get_acqseq();
> + hdr->sadb_msg_seq = x->km.seq;
> hdr->sadb_msg_pid = 0;
>
> /* SA */
> --
> 2.37.3
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] xfrm: Fix oops in __xfrm_state_delete()
2022-11-02 10:18 ` [PATCH v2] " Thomas Jarosch
2022-11-03 10:32 ` Antony Antony
@ 2022-11-04 4:43 ` Herbert Xu
2022-11-17 6:32 ` Steffen Klassert
1 sibling, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2022-11-04 4:43 UTC (permalink / raw)
To: Thomas Jarosch
Cc: Antony Antony, Steffen Klassert, Sabrina Dubroca, Leon Romanovsky,
Roth Mark, Zhihao Chen, Tobias Brunner, netdev
On Wed, Nov 02, 2022 at 11:18:48AM +0100, Thomas Jarosch wrote:
> Kernel 5.14 added a new "byseq" index to speed
> up xfrm_state lookups by sequence number in commit
> fe9f1d8779cb ("xfrm: add state hashtable keyed by seq")
>
> While the patch was thorough, the function pfkey_send_new_mapping()
> in net/af_key.c also modifies x->km.seq and never added
> the current xfrm_state to the "byseq" index.
>
> This leads to the following kernel Ooops:
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> ..
> RIP: 0010:__xfrm_state_delete+0xc9/0x1c0
> ..
> Call Trace:
> <TASK>
> xfrm_state_delete+0x1e/0x40
> xfrm_del_sa+0xb0/0x110 [xfrm_user]
> xfrm_user_rcv_msg+0x12d/0x270 [xfrm_user]
> ? remove_entity_load_avg+0x8a/0xa0
> ? copy_to_user_state_extra+0x580/0x580 [xfrm_user]
> netlink_rcv_skb+0x51/0x100
> xfrm_netlink_rcv+0x30/0x50 [xfrm_user]
> netlink_unicast+0x1a6/0x270
> netlink_sendmsg+0x22a/0x480
> __sys_sendto+0x1a6/0x1c0
> ? __audit_syscall_entry+0xd8/0x130
> ? __audit_syscall_exit+0x249/0x2b0
> __x64_sys_sendto+0x23/0x30
> do_syscall_64+0x3a/0x90
> entry_SYSCALL_64_after_hwframe+0x61/0xcb
>
> Exact location of the crash in __xfrm_state_delete():
> if (x->km.seq)
> hlist_del_rcu(&x->byseq);
>
> The hlist_node "byseq" was never populated.
>
> The bug only triggers if a new NAT traversal mapping (changed IP or port)
> is detected in esp_input_done2() / esp6_input_done2(), which in turn
> indirectly calls pfkey_send_new_mapping() *if* the kernel is compiled
> with CONFIG_NET_KEY and "af_key" is active.
>
> The PF_KEYv2 message SADB_X_NAT_T_NEW_MAPPING is not part of RFC 2367.
> Various implementations have been examined how they handle
> the "sadb_msg_seq" header field:
>
> - racoon (Android): does not process SADB_X_NAT_T_NEW_MAPPING
> - strongswan: does not care about sadb_msg_seq
> - openswan: does not care about sadb_msg_seq
>
> There is no standard how PF_KEYv2 sadb_msg_seq should be populated
> for SADB_X_NAT_T_NEW_MAPPING and it's not used in popular
> implementations either. Herbert Xu suggested we should just
> use the current km.seq value as is. This fixes the root cause
> of the oops since we no longer modify km.seq itself.
>
> The update of "km.seq" looks like a copy'n'paste error
> from pfkey_send_acquire(). SADB_ACQUIRE must indeed assign a unique km.seq
> number according to RFC 2367. It has been verified that code paths
> involving pfkey_send_acquire() don't cause the same Oops.
>
> PF_KEYv2 SADB_X_NAT_T_NEW_MAPPING support was originally added here:
> https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
>
> commit cbc3488685b20e7b2a98ad387a1a816aada569d8
> Author: Derek Atkins <derek@ihtfp•com>
> AuthorDate: Wed Apr 2 13:21:02 2003 -0800
>
> [IPSEC]: Implement UDP Encapsulation framework.
>
> In particular, implement ESPinUDP encapsulation for IPsec
> Nat Traversal.
>
> A note on triggering the bug: I was not able to trigger it using VMs.
> There is one VPN using a high latency link on our production VPN server
> that triggered it like once a day though.
>
> Link: https://github.com/strongswan/strongswan/issues/992
> Link: https://lore.kernel.org/netdev/00959f33ee52c4b3b0084d42c430418e502db554.1652340703.git.antony.antony@secunet.com/T/
> Link: https://lore.kernel.org/netdev/20221027142455.3975224-1-chenzhihao@meizu.com/T/
>
> Fixes: fe9f1d8779cb ("xfrm: add state hashtable keyed by seq")
> Reported-by: Roth Mark <rothm@mail•com>
> Reported-by: Zhihao Chen <chenzhihao@meizu•com>
> Tested-by: Roth Mark <rothm@mail•com>
> Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net•com>
> ---
> net/key/af_key.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
Acked-by: Herbert Xu <herbert@gondor•apana.org.au>
Thanks,
--
Email: Herbert Xu <herbert@gondor•apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] xfrm: Fix oops in __xfrm_state_delete()
2022-11-04 4:43 ` Herbert Xu
@ 2022-11-17 6:32 ` Steffen Klassert
0 siblings, 0 replies; 9+ messages in thread
From: Steffen Klassert @ 2022-11-17 6:32 UTC (permalink / raw)
To: Herbert Xu
Cc: Thomas Jarosch, Antony Antony, Sabrina Dubroca, Leon Romanovsky,
Roth Mark, Zhihao Chen, Tobias Brunner, netdev
On Fri, Nov 04, 2022 at 12:43:53PM +0800, Herbert Xu wrote:
> On Wed, Nov 02, 2022 at 11:18:48AM +0100, Thomas Jarosch wrote:
...
> >
> > Fixes: fe9f1d8779cb ("xfrm: add state hashtable keyed by seq")
> > Reported-by: Roth Mark <rothm@mail•com>
> > Reported-by: Zhihao Chen <chenzhihao@meizu•com>
> > Tested-by: Roth Mark <rothm@mail•com>
> > Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net•com>
> > ---
> > net/key/af_key.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Acked-by: Herbert Xu <herbert@gondor•apana.org.au>
Applied, thanks everyone!
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2022-11-17 6:32 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-10-31 15:26 [PATCH] xfrm: Fix oops in __xfrm_state_delete() Thomas Jarosch
2022-11-01 4:39 ` Herbert Xu
2022-11-01 19:10 ` Antony Antony
2022-11-02 7:07 ` Herbert Xu
2022-11-02 8:31 ` Thomas Jarosch
2022-11-02 10:18 ` [PATCH v2] " Thomas Jarosch
2022-11-03 10:32 ` Antony Antony
2022-11-04 4:43 ` Herbert Xu
2022-11-17 6:32 ` Steffen Klassert
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox