public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Daniel Borkmann <daniel@iogearbox•net>
To: Linus Torvalds <torvalds@linux-foundation•org>
Cc: Thomas Gleixner <tglx@linutronix•de>,
	Kees Cook <keescook@chromium•org>,
	Laura Abbott <labbott@redhat•com>, Ingo Molnar <mingo@kernel•org>,
	Peter Anvin <hpa@zytor•com>,
	Fengguang Wu <fengguang.wu@intel•com>,
	Network Development <netdev@vger•kernel.org>,
	LKML <linux-kernel@vger•kernel.org>, LKP <lkp@01•org>,
	ast@fb•com, the arch/x86 maintainers <x86@kernel•org>,
	"David S. Miller" <davem@davemloft•net>,
	bp@suse•de
Subject: Re: [net/bpf] 3051bf36c2 BUG: unable to handle kernel paging request at 0000a7cf
Date: Thu, 09 Mar 2017 22:32:12 +0100	[thread overview]
Message-ID: <58C1C9DC.7070509@iogearbox.net> (raw)
In-Reply-To: <58C19F67.3040509@iogearbox.net>

[ + Borislav ]

On 03/09/2017 07:31 PM, Daniel Borkmann wrote:
> On 03/09/2017 07:15 PM, Linus Torvalds wrote:
>> On Thu, Mar 9, 2017 at 10:10 AM, Linus Torvalds
>> <torvalds@linux-foundation•org> wrote:
>>>
>>> Very odd. We should always have PGE (0x0080) set in cr4 (if the CPU
>>> supports it).
>>
>> Daniel, do you see the code in probe_page_size_mask() triggering?
>>
>>          /* Enable PGE if available */
>>          if (boot_cpu_has(X86_FEATURE_PGE)) {
>>                  cr4_set_bits_and_update_boot(X86_CR4_PGE);
>>                  __supported_pte_mask |= _PAGE_GLOBAL;
>
> We do have boot_cpu_has(X86_FEATURE_PGE) and go indeed into this
> branch here. So it seems something must be clearing it later, hmm.
>
>>          } else
>>                  __supported_pte_mask &= ~_PAGE_GLOBAL;
>>
>> but maybe there's something wrong with the percpu cr4 caching?

So, I think I got a little bit further. To the printk's I added
previously in the test_setmem code that run on the problematic
kernel, I now extended them into:

   printk("static_cpu X86_FEATURE_PGE:%u\n",     static_cpu_has(X86_FEATURE_PGE));
   printk("boot_cpu   X86_FEATURE_PGE:%u\n",       boot_cpu_has(X86_FEATURE_PGE));
   printk("static_cpu X86_FEATURE_INVPCID:%u\n", static_cpu_has(X86_FEATURE_INVPCID));
   printk("boot_cpu   X86_FEATURE_INVPCID:%u\n",   boot_cpu_has(X86_FEATURE_INVPCID));

And here's what I get in the log:

"-cpu kvm64" gives:

[    8.426865] static_cpu X86_FEATURE_PGE:1
[    8.427148] boot_cpu   X86_FEATURE_PGE:0
[    8.427428] static_cpu X86_FEATURE_INVPCID:0
[    8.427732] boot_cpu   X86_FEATURE_INVPCID:0

"-cpu host" gives:

[    8.426408] static_cpu X86_FEATURE_PGE:1
[    8.426726] boot_cpu   X86_FEATURE_PGE:0
[    8.427037] static_cpu X86_FEATURE_INVPCID:1
[    8.427375] boot_cpu   X86_FEATURE_INVPCID:1

This means at that point in time static_cpu_has(X86_FEATURE_PGE) is
not the same as boot_cpu_has(X86_FEATURE_PGE).

The code that switches this off is in lguest_arch_host_init(). Right
before that, both are X86_FEATURE_PGE:1, X86_FEATURE_INVPCID:0 for
the "-cpu kvm64" case.

Then, the lguest code does:

	get_online_cpus();
	if (boot_cpu_has(X86_FEATURE_PGE)) { /* We have a broader idea of "global". */
		/* Remember that this was originally set (for cleanup). */
		cpu_had_pge = 1;
		/*
		 * adjust_pge is a helper function which sets or unsets the PGE
		 * bit on its CPU, depending on the argument (0 == unset).
		 */
		on_each_cpu(adjust_pge, (void *)0, 1);
		/* Turn off the feature in the global feature set. */
		clear_cpu_cap(&boot_cpu_data, X86_FEATURE_PGE);
	}
	put_online_cpus();

So, adjust_pge() clears X86_CR4_PGE, and boot cpu has X86_FEATURE_PGE
unset. This means, with static_cpu_has(X86_FEATURE_PGE) still 1, we
run into using cr4 for TLB flushing with no X86_CR4_PGE bit set (which
doesn't trigger the flush), whereas we should be using cr3 for flushing
from that point onwards instead.

I tried with this one, and things seem to work again:

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 6fa8594..fc5abff 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -188,7 +188,7 @@ static inline void __native_flush_tlb_single(unsigned long addr)

  static inline void __flush_tlb_all(void)
  {
-       if (static_cpu_has(X86_FEATURE_PGE))
+       if (boot_cpu_has(X86_FEATURE_PGE))
                 __flush_tlb_global();
         else
                 __flush_tlb();

Presumably coming from c109bf95992b ("x86/cpufeature: Remove cpu_has_pge")?

Thanks,
Daniel

  reply	other threads:[~2017-03-09 21:32 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-01 12:54 [net/bpf] 3051bf36c2 BUG: unable to handle kernel paging request at 0000a7cf Fengguang Wu
2017-03-02 20:23 ` Fengguang Wu
2017-03-02 20:40   ` Daniel Borkmann
2017-03-08 19:25     ` Linus Torvalds
2017-03-08 22:27       ` Daniel Borkmann
2017-03-08 22:36         ` Kees Cook
2017-03-08 22:51           ` Daniel Borkmann
2017-03-08 23:55           ` Laura Abbott
2017-03-09  5:36             ` Kees Cook
2017-03-09 13:04               ` Daniel Borkmann
2017-03-09 13:10                 ` Thomas Gleixner
2017-03-09 13:25                   ` Daniel Borkmann
2017-03-09 14:49                     ` Thomas Gleixner
2017-03-09 17:51                       ` Daniel Borkmann
2017-03-09 18:08                         ` David Miller
2017-03-09 18:10                         ` Linus Torvalds
2017-03-09 18:15                           ` Linus Torvalds
2017-03-09 18:31                             ` Daniel Borkmann
2017-03-09 21:32                               ` Daniel Borkmann [this message]
2017-03-09 21:55                                 ` Borislav Petkov
2017-03-09 22:07                                   ` Borislav Petkov
2017-03-09 22:11                                     ` Daniel Borkmann
2017-03-09 22:48                                       ` Borislav Petkov
2017-03-09 23:26                                         ` Linus Torvalds
2017-03-09 23:44                                           ` Borislav Petkov
2017-03-10  0:13                                             ` Daniel Borkmann
2017-03-12 21:40                                           ` Borislav Petkov
2017-03-09 14:53                     ` Daniel Borkmann
2017-03-09 17:48                       ` Linus Torvalds
2017-03-08 22:43         ` Linus Torvalds
2017-03-09  1:34           ` Fengguang Wu
2017-03-09 13:09       ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=58C1C9DC.7070509@iogearbox.net \
    --to=daniel@iogearbox$(echo .)net \
    --cc=ast@fb$(echo .)com \
    --cc=bp@suse$(echo .)de \
    --cc=davem@davemloft$(echo .)net \
    --cc=fengguang.wu@intel$(echo .)com \
    --cc=hpa@zytor$(echo .)com \
    --cc=keescook@chromium$(echo .)org \
    --cc=labbott@redhat$(echo .)com \
    --cc=linux-kernel@vger$(echo .)kernel.org \
    --cc=lkp@01$(echo .)org \
    --cc=mingo@kernel$(echo .)org \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=tglx@linutronix$(echo .)de \
    --cc=torvalds@linux-foundation$(echo .)org \
    --cc=x86@kernel$(echo .)org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox