Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"

From: Andrew Murray <andrew.murray@arm•com>
To: Russell King - ARM Linux admin <linux@armlinux•org.uk>
Cc: Arnd Bergmann <arnd@arndb•de>,
	Catalin Marinas <catalin.marinas@arm•com>,
	Linus Torvalds <torvalds@linux-foundation•org>,
	Linux Kernel Mailing List <linux-kernel@vger•kernel.org>,
	Masahiro Yamada <yamada.masahiro@socionext•com>,
	Nicolas Saenz Julienne <nsaenzjulienne@suse•de>,
	Will Deacon <will@kernel•org>,
	linux-arm-kernel <linux-arm-kernel@lists•infradead.org>
Subject: Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"
Date: Tue, 1 Oct 2019 16:28:27 +0100	[thread overview]
Message-ID: <20191001152826.GM42880@e119886-lin.cambridge.arm.com> (raw)
In-Reply-To: <20191001143626.GI25745@shell.armlinux.org.uk>

On Tue, Oct 01, 2019 at 03:36:26PM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Oct 01, 2019 at 12:41:30PM +0100, Andrew Murray wrote:
> > On Tue, Oct 01, 2019 at 11:42:54AM +0100, Will Deacon wrote:
> > > On Tue, Oct 01, 2019 at 06:40:26PM +0900, Masahiro Yamada wrote:
> > > > On Mon, Sep 30, 2019 at 8:45 PM Will Deacon <will@kernel•org> wrote:
> > > > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > > > > index 93d97f9b0157..c37c72adaeff 100644
> > > > > --- a/lib/Kconfig.debug
> > > > > +++ b/lib/Kconfig.debug
> > > > > @@ -312,6 +312,7 @@ config HEADERS_CHECK
> > > > >
> > > > >  config OPTIMIZE_INLINING
> > > > >         def_bool y
> > > > > +       depends on !(ARM || ARM64) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91111
> > > > 
> > > > 
> > > > This is a too big hammer.
> > > 
> > > It matches the previous default behaviour!
> > > 
> > > > For ARM, it is not a compiler bug, so I am trying to fix the kernel code.
> > > > 
> > > > For ARM64, even if it is a compiler bug, you can add __always_inline
> > > > to the functions in question.
> > > > (arch_atomic64_dec_if_positive in this case).
> > > > 
> > > > You do not need to force __always_inline globally.
> > > 
> > > So you'd prefer I do something like the diff below? I mean, it's a start,
> > > but I do worry that we're hanging arch/arm/ out to dry.
> > 
> > If I've understood one part of this issue correctly - and using the
> > c2p_unsupported build failure as an example [1], there are instances in
> > the kernel where it is assumed that the compiler will optimise out a call
> > to an undefined function, and also assumed that the compiler will know
> > at compile time that the function will never get called. It's common to
> > satisfy this assumption when the calling function is inlined.
> > 
> > But I suspect there may be other cases similar to c2p_unsupported which
> > are still lurking.
> > 
> > For example the following functions are called but non-existent, and thus
> > may be an area worth investigating:
> > 
> > __buggy_use_of_MTHCA_PUT, __put_dbe_unknown, __cmpxchg_wrong_size,
> > __bad_percpu_size, __put_user_bad, __get_user_unknown,
> > __bad_unaligned_access_size, __bad_xchg
> > 
> > But more generally, as this is a common pattern - isn't there a benefit
> > here for changing all of these to BUILD_BUG? (So they can be found easily).
> 
> Precisely, what is your suggestion?
> 
> If you think that replacing the call to __get_user_bad with BUILD_BUG(),
> BUILD_BUG() becomes a no-op when __OPTIMIZE__ is not defined (see the
> definition of __compiletime_assert() in linux/compiler.h); this means
> such places will be reachable, which leads to uninitialised variables.

I hadn't noticed the use of __OPTIMIZE__ - indeed if __compiletime_assert
is no-op'd and you reach it then you won't have a build error - but you
may get uninitialised values instead.

Presumably the purpose of __OPTIMIZE__ in this case is to prevent getting
an undefined function error for the __compiletime_assert line, even though
it doesn't get called (when using a compiler that doesn't optimize out the
call to the unused function).

Why is the call to __get_user_bad not guarded in this way for when
__OPTIMIZE__ isn't set, i.e. why doesn't it suffer from the issue
that the following fixes?

c03567a8e8d5 ("include/linux/compiler.h: don't perform compiletime_assert with -O0")

> 
> > Or to avoid this class of issues, change them to BUG or unreachable - but
> > lose the benefit of compile time detection?
> 
> I think you ought to read the GCC manual wrt __builtin_unreachable().
> "If control flow reaches the point of the `__builtin_unreachable',
>  the program is undefined.  It is useful in situations where the
>  compiler cannot deduce the unreachability of the code."

Thanks, I've now read it. My suggestion arose from looking at existing
uses in the kernel - e.g. drivers/spi/spi-rockchip.c,
drivers/pinctrl/pinctrl-ingenic.c, etc. I guess these types of uses
should use BUG or similar instead right?

> 
> I have seen cases where the instructions following an unreachable
> code section have been the literal pool for the function - which,
> if reached, would be quite confusing to debug.  If you're lucky, you
> might get an undefined instruction exception.  If not, you could
> continue and start executing another part of the function, leading
> to possibly no crash at all - but unexpected results (which may end
> up leaking sensitive data.)
> 
> For example, in our BUG() implementation on 32-bit ARM, we use
> unreachable() after the asm() statement creating the bug table
> entry and inserting the undefined instruction into the text.
> Here's the resulting disassembly:
> 
>      278:       ebfffffe        bl      0 <page_mapped>
>                         278: R_ARM_CALL page_mapped
>      27c:       e3500000        cmp     r0, #0
>      280:       1a00006c        bne     438 <invalidate_inode_pages2_range+0x3ac>
> ...
>      2d4:       ebfffffe        bl      0 <_raw_spin_lock_irqsave>
>                         2d4: R_ARM_CALL _raw_spin_lock_irqsave
>      2d8:       e5943008        ldr     r3, [r4, #8]
>      2dc:       e3130001        tst     r3, #1
>      2e0:       e1a02000        mov     r2, r0
>      2e4:       1a000054        bne     43c <invalidate_inode_pages2_range+0x3b0>
> ...
>      438:       e7f001f2        .word   0xe7f001f2
>      43c:       e2433001        sub     r3, r3, #1
>      440:       eaffffa9        b       2ec <invalidate_inode_pages2_range+0x260>
> 
> Now, consider what unreachable() actually gets you here - it tells
> the compiler that we do not expect to reach this point (that being
> the point between 438 and 43c.)  If we were to reach that point, we
> would continue executing the code at 43c.
> 
> In this case, it would be like...
> 
> 	if (BUG_ON(page_mapped(page)))
> 	    goto random-location-in-xa_lock_irqsave()-inside-invalidate_complete_page2();
> 
> So no.  unreachable() is not an option.

Thanks for the example.

> 
> We really do want these places to be compile-time detected - relying
> on triggering them at runtime is just not good enough IMHO (think
> about how much testing the kernel would require to discover one of
> these suckers buried in the depths of the code.)
> 
> Here's the question to ask: do we want to reliably detect issues
> that we know are bad, which can lead to:
> - unreliable kernel operation,
> - user exploitable crashes,
> or do we want to hide them for the sake of allowing -O0 compilation?
> 
> Given that the kernel as a general rule has very poor run-time test
> coverage at the moment, I don't think this is the time to consider
> giving up the protection that we have against this badness.
> 
> We've had several instances where these checks have triggered in the
> user access code, and people have noticed when doing build tests.
> They probably don't have the ability to do run-time testing on every
> arch.
> 
> So, the existing facility of detecting these at build time is, IMHO,
> an absolute must.
> 
> It would be different if the kernel community as a whole had the
> ability to run-test every code path through the kernel source on
> every arch, but I don't see that's a remotely realistic prospect.

I completely agree.

> 
> If we want -O0 to work, but still want to preserve the ability to
> detect these adequately, I think the easiest solution to that would
> be to provide these dummy functions only when building with -O0,
> making them all BUG().

Though please note that this issue isn't limited to -O0, it relates to
building without CONFIG_OPTIMIZE_INLINING - resulting in functions
marked as inline not always being inlined.

I'm interested to determine if there is a benefit for all the functions
I mentioned (there are more) to have the same name which may be something
other than BUILD_BUG.

Thanks,

Andrew Murray

> 
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
> According to speedtest.net: 11.9Mbps down 500kbps up

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists•infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel