public inbox for linux-next@vger.kernel.org 
* [linux-next20251112]Kernel OOPs while running btrfs/023 test case
@ 2025-11-13 12:51 Venkat Rao Bagalkote
  2025-11-13 13:17 ` Venkat Rao Bagalkote
  0 siblings, 1 reply; 5+ messages in thread
From: Venkat Rao Bagalkote @ 2025-11-13 12:51 UTC
  To: riteshh, linux-btrfs, Qu Wenruo, David Sterba, LKML,
	Madhavan Srinivasan, Linux Next Mailing List, Stephen Rothwell

Greetings!!!


IBM CI has reported a kernel crash while running the btrfs/023 test from
the xfstests suite on an IBM Power11 system.


Traces:


[  184.714500] BTRFS: device fsid b8c762d5-3f1a-4020-bca9-2e7e107e5363 devid 1 transid 8 /dev/loop1 (7:1) scanned by mkfs.btrfs (2697)
[  184.714612] BTRFS: device fsid b8c762d5-3f1a-4020-bca9-2e7e107e5363 devid 2 transid 8 /dev/loop2 (7:2) scanned by mkfs.btrfs (2697)
[  184.714731] BTRFS: device fsid b8c762d5-3f1a-4020-bca9-2e7e107e5363 devid 3 transid 8 /dev/loop3 (7:3) scanned by mkfs.btrfs (2697)
[  184.714825] BTRFS: device fsid b8c762d5-3f1a-4020-bca9-2e7e107e5363 devid 4 transid 8 /dev/loop4 (7:4) scanned by mkfs.btrfs (2697)
[  184.714918] BTRFS: device fsid b8c762d5-3f1a-4020-bca9-2e7e107e5363 devid 5 transid 8 /dev/loop5 (7:5) scanned by mkfs.btrfs (2697)
[  184.720659] BTRFS info (device loop1): first mount of filesystem b8c762d5-3f1a-4020-bca9-2e7e107e5363
[  184.720694] BTRFS info (device loop1): using crc32c (crc32c-lib) checksum algorithm
[  184.720708] BTRFS info (device loop1): forcing free space tree for sector size 4096 with page size 65536
[  184.725011] BTRFS info (device loop1): checking UUID tree
[  184.725060] BTRFS info (device loop1): enabling ssd optimizations
[  184.725068] BTRFS info (device loop1): turning on async discard
[  184.725075] BTRFS info (device loop1): enabling free space tree
[  184.735050] BUG: Unable to handle kernel data access at 0x6696fffdda1ea4c2
[  184.735072] Faulting instruction address: 0xc0000000007bd030
[  184.735087] Oops: Kernel access of bad area, sig: 11 [#1]
[  184.735101] LE PAGE_SIZE=64K MMU=Radix  SMP NR_CPUS=8192 NUMA pSeries
[  184.735118] Modules linked in: loop nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bonding tls ip_set rfkill nf_tables sunrpc nfnetlink pseries_rng vmx_crypto fuse ext4 crc16 mbcache jbd2 sd_mod sg ibmvscsi ibmveth scsi_transport_srp pseries_wdt
[  184.735316] CPU: 22 UID: 0 PID: 1948 Comm: systemd-udevd Kdump: loaded Tainted: G    B               6.18.0-rc5-next-20251112 #1 VOLUNTARY
[  184.735342] Tainted: [B]=BAD_PAGE
[  184.735352] Hardware name: IBM,9080-HEX Power11 (architected) 0x820200 0xf000007 of:IBM,FW1110.01 (NH1110_069) hv:phyp pSeries
[  184.735369] NIP:  c0000000007bd030 LR: c0000000007bcef4 CTR: c000000000902824
[  184.735386] REGS: c00000006fdb7910 TRAP: 0380   Tainted: G B  (6.18.0-rc5-next-20251112)
[  184.735404] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28004402  XER: 20040000
[  184.735460] CFAR: c0000000007bcf98 IRQMASK: 0
[  184.735460] GPR00: c0000000007bcef4 c00000006fdb7bb0 c0000000026aa100 0000000000000000
[  184.735460] GPR04: 0000000000000cc0 000000013470ff60 00000000000006f0 c0000009906ff4f0
[  184.735460] GPR08: 669164fddb1e9c02 0000000000000800 000000098d420000 0000000000000000
[  184.735460] GPR12: c000000000902824 c000000991e0e700 0000000000000000 0000000000000000
[  184.735460] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  184.735460] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  184.735460] GPR24: 00000000000006ef 0000000000001000 ffffffffffffffff c00c000000402680
[  184.735460] GPR28: c0000000008f312c 0000000000000cc0 6696fffdda1e9cc2 c00000000701e880
[  184.735688] NIP [c0000000007bd030] kmem_cache_alloc_noprof+0x4ac/0x708
[  184.735711] LR [c0000000007bcef4] kmem_cache_alloc_noprof+0x370/0x708
[  184.735729] Call Trace:
[  184.735738] [c00000006fdb7bb0] [c0000000007bcef4] kmem_cache_alloc_noprof+0x370/0x708 (unreliable)
[  184.735766] [c00000006fdb7c30] [c0000000008f312c] getname_flags.part.0+0x54/0x30c
[  184.735793] [c00000006fdb7c80] [c0000000009028a0] sys_unlinkat+0x7c/0xe4
[  184.735814] [c00000006fdb7cc0] [c000000000039d50] system_call_exception+0x1e0/0x450
[  184.735839] [c00000006fdb7e50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
[  184.735866] ---- interrupt: 3000 at 0x7fff9df366bc
[  184.735881] NIP:  00007fff9df366bc LR: 00007fff9df366bc CTR: 0000000000000000
[  184.735897] REGS: c00000006fdb7e80 TRAP: 3000   Tainted: G B  (6.18.0-rc5-next-20251112)
[  184.735913] MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 48004402  XER: 00000000
[  184.735989] IRQMASK: 0
[  184.735989] GPR00: 0000000000000124 00007fffe0b3a3a0 00007fff9e037d00 0000000000000006
[  184.735989] GPR04: 000000013470ff60 0000000000000000 0000000000001000 00007fff9e0314b8
[  184.735989] GPR08: 0000000000000271 0000000000000000 0000000000000000 0000000000000000
[  184.735989] GPR12: 0000000000000000 00007fff9e8c4ca0 00000001161e5a78 00007fffe0b3ab10
[  184.735989] GPR16: 0000000000000003 0000000000000000 00000001161aaed0 00000001161e9750
[  184.735989] GPR20: 00007fffe0b3a780 00000001161eb260 00000001161eb320 0000000000000008
[  184.735989] GPR24: 00000001347061c0 0000000000000000 0000000000000009 00000001347061c0
[  184.735989] GPR28: 0000000000000006 00007fffe0b3a53c 0000000134715740 0000000000100000
[  184.736216] NIP [00007fff9df366bc] 0x7fff9df366bc
[  184.736231] LR [00007fff9df366bc] 0x7fff9df366bc
[  184.736251] ---- interrupt: 3000
[  184.736262] Code: f8610030 4082fccc 4bfffc28 2c3e0000 4182ff98 2c3b0000 4182ff90 60000000 3b40ffff 813f0030 e91f00c0 38d80001 <7f7e482a> 7d3e4a14 79270022 552ac03e
[  184.736362] ---[ end trace 0000000000000000 ]---


If you happen to fix this, please add the tag below.


Reported-by: Venkat Rao Bagalkote <venkat88@linux•ibm.com>


Regards,

Venkat.




* Re: [linux-next20251112]Kernel OOPs while running btrfs/023 test case
  2025-11-13 12:51 [linux-next20251112]Kernel OOPs while running btrfs/023 test case Venkat Rao Bagalkote
@ 2025-11-13 13:17 ` Venkat Rao Bagalkote
  2025-11-13 15:51   ` David Sterba
  0 siblings, 1 reply; 5+ messages in thread
From: Venkat Rao Bagalkote @ 2025-11-13 13:17 UTC
  To: riteshh, linux-btrfs, Qu Wenruo, David Sterba, LKML,
	Madhavan Srinivasan, Linux Next Mailing List, Stephen Rothwell


On 13/11/25 6:21 pm, Venkat Rao Bagalkote wrote:
> Greetings!!!
>
>
> IBM CI has reported a kernel crash while running the btrfs/023 test from
> the xfstests suite on an IBM Power11 system.
>
>
> Traces:
>
>
>

The issue was most likely introduced by one of the three commits below:
with all three reverted, the issue is no longer seen.


9299051573d9 e8ea54f86241 cd93c0aad7e3

>
> If you happen to fix this, please add the tag below.
>
>
> Reported-by: Venkat Rao Bagalkote <venkat88@linux•ibm.com>
>
>
> Regards,
>
> Venkat.
>
>


* Re: [linux-next20251112]Kernel OOPs while running btrfs/023 test case
  2025-11-13 13:17 ` Venkat Rao Bagalkote
@ 2025-11-13 15:51   ` David Sterba
  2025-11-13 20:14     ` Qu Wenruo
  2025-11-13 21:33     ` Qu Wenruo
  0 siblings, 2 replies; 5+ messages in thread
From: David Sterba @ 2025-11-13 15:51 UTC
  To: Venkat Rao Bagalkote
  Cc: riteshh, linux-btrfs, Qu Wenruo, David Sterba, LKML,
	Madhavan Srinivasan, Linux Next Mailing List, Stephen Rothwell

On Thu, Nov 13, 2025 at 06:47:43PM +0530, Venkat Rao Bagalkote wrote:
> On 13/11/25 6:21 pm, Venkat Rao Bagalkote wrote:
> > Greetings!!!
> >
> > IBM CI has reported a kernel crash while running the btrfs/023 test from
> > the xfstests suite on an IBM Power11 system.
> >
> >
> > Traces:
> >

Thanks for the report.

> The issue was most likely introduced by one of the three commits below:
> with all three reverted, the issue is no longer seen.
> 
> 
> 9299051573d9 e8ea54f86241 cd93c0aad7e3

9299051573d9 btrfs: enable encoded read/write/send for bs > ps cases
e8ea54f86241 btrfs: make read verification handle bs > ps cases without large folios
cd93c0aad7e3 btrfs: make btrfs_repair_io_failure() handle bs > ps cases without large folios


* Re: [linux-next20251112]Kernel OOPs while running btrfs/023 test case
  2025-11-13 15:51   ` David Sterba
@ 2025-11-13 20:14     ` Qu Wenruo
  2025-11-13 21:33     ` Qu Wenruo
  1 sibling, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2025-11-13 20:14 UTC
  To: dsterba, Venkat Rao Bagalkote
  Cc: riteshh, linux-btrfs, Qu Wenruo, David Sterba, LKML,
	Madhavan Srinivasan, Linux Next Mailing List, Stephen Rothwell



On 2025/11/14 02:21, David Sterba wrote:
> On Thu, Nov 13, 2025 at 06:47:43PM +0530, Venkat Rao Bagalkote wrote:
>> On 13/11/25 6:21 pm, Venkat Rao Bagalkote wrote:
>>> Greetings!!!
>>>
>>> IBM CI has reported a kernel crash while running the btrfs/023 test from
>>> the xfstests suite on an IBM Power11 system.
>>>
>>>
>>> Traces:
>>>
> 
> Thanks for the report.
> 
>> The issue was most likely introduced by one of the three commits below:
>> with all three reverted, the issue is no longer seen.

Could you share the block size of the fs? 4K or 64K?
>>
>>
>> 9299051573d9 e8ea54f86241 cd93c0aad7e3
> 
> 9299051573d9 btrfs: enable encoded read/write/send for bs > ps cases
> e8ea54f86241 btrfs: make read verification handle bs > ps cases without large folios
> cd93c0aad7e3 btrfs: make btrfs_repair_io_failure() handle bs > ps cases without large folios
> 

The problem looks weird: on Power11 with a 64K page size, no code path
for the bs > ps cases should be involved.

Thanks,
Qu


* Re: [linux-next20251112]Kernel OOPs while running btrfs/023 test case
  2025-11-13 15:51   ` David Sterba
  2025-11-13 20:14     ` Qu Wenruo
@ 2025-11-13 21:33     ` Qu Wenruo
  1 sibling, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2025-11-13 21:33 UTC
  To: dsterba, Venkat Rao Bagalkote
  Cc: riteshh, linux-btrfs, Qu Wenruo, David Sterba, LKML,
	Madhavan Srinivasan, Linux Next Mailing List, Stephen Rothwell



On 2025/11/14 02:21, David Sterba wrote:
> On Thu, Nov 13, 2025 at 06:47:43PM +0530, Venkat Rao Bagalkote wrote:
>> On 13/11/25 6:21 pm, Venkat Rao Bagalkote wrote:
>>> Greetings!!!
>>>
>>> IBM CI has reported a kernel crash while running the btrfs/023 test from
>>> the xfstests suite on an IBM Power11 system.
>>>
>>>
>>> Traces:
>>>
> 
> Thanks for the report.
> 
>> The issue was most likely introduced by one of the three commits below:
>> with all three reverted, the issue is no longer seen.
>>
>>
>> 9299051573d9 e8ea54f86241 cd93c0aad7e3
> 
> 9299051573d9 btrfs: enable encoded read/write/send for bs > ps cases
> e8ea54f86241 btrfs: make read verification handle bs > ps cases without large folios
> cd93c0aad7e3 btrfs: make btrfs_repair_io_failure() handle bs > ps cases without large folios
> 

I tracked the problem down to the patch "btrfs: raid56: remove sector_ptr
structure", for which I have a local fix that was never submitted to the
mailing list.

During the recent push to the for-next branch, I again used the mailing
list version rather than the locally fixed one, which results in
btrfs_raid_bio::stripe_paddrs[*] being assigned addresses way beyond the
intended boundary.

This randomly corrupts memory, leading to weird results.
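
As an illustration of the kind of "weird results" such corruption
produces, the faulting instruction in the original trace can be decoded
by hand. The sketch below is a userspace helper, not kernel code:
decode_xform() and indexed_ea() are hypothetical names, and the field
layout assumed is the PowerPC X-form encoding. The marked word
<7f7e482a> decodes to ldx r27,r30,r9 (primary opcode 31, extended
opcode 21), and GPR30 + GPR09 from the register dump gives exactly the
reported fault address. Reading this as a corrupted slab freelist
pointer being followed in kmem_cache_alloc_noprof() is an
interpretation, not something stated in the thread.

```c
#include <stdint.h>

/* Fields of a PowerPC X-form instruction (primary opcode 31 family). */
struct xform {
	unsigned int opcode;	/* primary opcode, top 6 bits */
	unsigned int rt;	/* target register */
	unsigned int ra;	/* base register */
	unsigned int rb;	/* index register */
	unsigned int xo;	/* 10-bit extended opcode */
};

/* Split a 32-bit instruction word into its X-form fields. */
static inline struct xform decode_xform(uint32_t insn)
{
	struct xform f = {
		.opcode = insn >> 26,
		.rt = (insn >> 21) & 0x1f,
		.ra = (insn >> 16) & 0x1f,
		.rb = (insn >> 11) & 0x1f,
		.xo = (insn >> 1) & 0x3ff,
	};

	return f;
}

/*
 * Effective address of an indexed load such as ldx: (RA) + (RB).
 * The caller passes the register values; the RA == 0 special case
 * (base treated as zero) is not modelled here.
 */
static inline uint64_t indexed_ea(uint64_t ra_val, uint64_t rb_val)
{
	return ra_val + rb_val;
}
```

Feeding in the values from the trace, decode_xform(0x7f7e482a) yields
rt = 27, ra = 30, rb = 9, and indexed_ea(0x6696fffdda1e9cc2, 0x800)
equals 0x6696fffdda1ea4c2, the address in the "Unable to handle kernel
data access" line. The base register holds a value that is not a valid
kernel pointer, which fits random memory corruption.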

And the fix is pretty straightforward:

Bad:

+		rbio->stripe_paddrs[i] = page_to_phys(rbio->stripe_pages[page_index] +
+						      offset_in_page(offset));

Good:

+		rbio->stripe_paddrs[i] = page_to_phys(rbio->stripe_pages[page_index]) +
+						      offset_in_page(offset);

Since offset_in_page() is involved, it only affects subpage systems.
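
To illustrate why the misplaced parenthesis matters only on subpage
systems, here is a minimal userspace sketch. The struct page, the
mem_map array, and the page_to_phys()/offset_in_page() helpers are toy
stand-ins written for this example, not the kernel's implementations,
and paddr_bad()/paddr_good() are hypothetical names for the two
expressions in the diff above. A 64K page size is assumed, as on the
reporting system.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	16			/* 64K pages (assumption) */
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define NR_PAGES	131072UL		/* toy memmap size */

/* Toy stand-in for the kernel's struct page: one entry per page frame. */
struct page {
	unsigned char flags;
};

static struct page mem_map[NR_PAGES];		/* hypothetical flat memmap */

/* Toy page_to_phys(): page frame number times the page size. */
static inline uint64_t page_to_phys(const struct page *page)
{
	assert(page >= mem_map && page < mem_map + NR_PAGES);
	return (uint64_t)(page - mem_map) << PAGE_SHIFT;
}

static inline unsigned long offset_in_page(unsigned long offset)
{
	return offset & (PAGE_SIZE - 1);
}

/*
 * Misplaced parenthesis: the byte offset is added to the struct page
 * pointer, so it advances by that many page frames, not bytes.
 */
static inline uint64_t paddr_bad(struct page *page, unsigned long offset)
{
	return page_to_phys(page + offset_in_page(offset));
}

/* Intended form: the byte offset is added to the physical address. */
static inline uint64_t paddr_good(struct page *page, unsigned long offset)
{
	return page_to_phys(page) + offset_in_page(offset);
}
```

For a 4K sector at byte offset 4096 inside page frame 2,
paddr_good(&mem_map[2], 4096) is 0x21000, while
paddr_bad(&mem_map[2], 4096) is 0x10020000: the offset has been scaled
by the page size. When the offset is page aligned, which is always the
case when the sector size equals the page size, offset_in_page()
returns 0 and both forms give the same address.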

I'll fold the fix into the offending patch.

Thanks for the report, and sorry for the bug.

Thanks,
Qu
