public inbox for linuxppc-dev@ozlabs.org 
 help / color / mirror / Atom feed
From: "Nicholas Piggin" <npiggin@gmail•com>
To: "Timothy Pearson" <tpearson@raptorengineering•com>,
	"Michael Ellerman" <mpe@ellerman•id.au>
Cc: Jens Axboe <axboe@kernel•dk>,
	linuxppc-dev <linuxppc-dev@lists•ozlabs.org>
Subject: Re: [PATCH] powerpc: Fix data corruption on IPI
Date: Wed, 15 Nov 2023 13:11:35 +1000	[thread overview]
Message-ID: <CWZ21RR3QHP3.294BZGFLKYC74@wheely> (raw)
In-Reply-To: <993112821.47345042.1699997526942.JavaMail.zimbra@raptorengineeringinc.com>

(Sorry I didn't see that Michael already made the same comment,
ignore my previous.)

On Wed Nov 15, 2023 at 7:32 AM AEST, Timothy Pearson wrote:
>
>
> ----- Original Message -----
> > From: "Michael Ellerman" <mpe@ellerman•id.au>
> > To: "Timothy Pearson" <tpearson@raptorengineering•com>, "linuxppc-dev" <linuxppc-dev@lists•ozlabs.org>
> > Cc: "Jens Axboe" <axboe@kernel•dk>
> > Sent: Tuesday, November 14, 2023 6:14:37 AM
> > Subject: Re: [PATCH] powerpc: Fix data corruption on IPI
>
> > Hi Timothy,
> > 
> > Thanks for debugging this, but I'm unclear why this is helping because
> > we should already have a full barrier (hwsync) on both the sending and
> > receiving side.
> > 
> > More below.
>
> So it looks like we might be dealing with a couple of potentially separate issues here, not sure.  This is probably the most infuriating bug I've run across in my career, so please bear with me -- with the amount of active test kernels I have installed at any one time I'm occassionally infecting one with the tests from another. ;)
>
> First, I'm not 100% convinced the code in try_to_wake_up() is race-free on ppc64, since adding a memory barrier between it and the call to kick_process() significantly reduces the frequency of the on-disk corruption.  Can you take a look and see if anything stands out as to why that would be important?

We have had memory ordering bugs in the scheduler before, but normally
they would result in a lost wakeup (hang) or crash in the scheduler.
AFAIK things that sleep are not supposed to assume they won't get an
interrupt before the event they are waiting on.

Where are you adding the barrier?

If it only reduces corruption then it seems like the race or ordering
bug is elsewhere (but maybe nearby).

>
> The second part of this though is that the barrier only reduces the frequency of the corruption, it does not eliminate the corruption.  After some consolidation, what completely eliminates the corruption is a combination of:
>  * Switching to TWA_SIGNAL_NO_IPI in task_work_add() * 
>  * Adding udelay(1000) in io_uring/rw.c:io_write(), right before the call to io_rw_init_file()

>
> Adding a memory barrier instead of the udelay() doesn't work, nor does adding the delay without switching to NO_IPI.

And just NO_IPI alone doesn't eliminate it?

>
> [1] Keeping TWA_SIGNAL and adding the barrier instruction also works, but this is conceptually simpler to understand as to why it would have an effect at all

Which barrier, the ttwu one? And by work you mean fixes completely?

Still smells like a bug in io uring code, possibly MySQL. Is it only
MySQL it's been seen with do you know? I don't suppose there are any
clues about the corruption pattern or when or where it appears?

Thanks,
Nick

  reply	other threads:[~2023-11-15  3:12 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-14  8:00 [PATCH] powerpc: Fix data corruption on IPI Timothy Pearson
2023-11-14 12:14 ` Michael Ellerman
2023-11-14 21:32   ` Timothy Pearson
2023-11-15  3:11     ` Nicholas Piggin [this message]
2023-11-17  7:39   ` Timothy Pearson
2023-11-17  7:52     ` Timothy Pearson
2023-11-17  8:01     ` Nicholas Piggin
2023-11-17  8:20       ` Timothy Pearson
2023-11-17  8:26         ` Timothy Pearson
2023-11-17  8:54           ` Timothy Pearson
     [not found] <19221908.47168775.1699937769845.JavaMail.zimbra@raptorengineeringinc.com>
     [not found] ` <ZVMo0vOZAxuxT8la@eldamar.lan>
2023-11-14  8:03   ` Timothy Pearson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CWZ21RR3QHP3.294BZGFLKYC74@wheely \
    --to=npiggin@gmail$(echo .)com \
    --cc=axboe@kernel$(echo .)dk \
    --cc=linuxppc-dev@lists$(echo .)ozlabs.org \
    --cc=mpe@ellerman$(echo .)id.au \
    --cc=tpearson@raptorengineering$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox