public inbox for linux-arm-kernel@lists.infradead.org 
From: catalin.marinas@arm•com (Catalin Marinas)
To: linux-arm-kernel@lists•infradead.org
Subject: [PATCH 2/7] Add various hugetlb page table fix
Date: Tue, 7 Feb 2012 14:11:00 +0000
Message-ID: <20120207141100.GI3351@arm.com>
In-Reply-To: <CAOMgcGLhGyz5xU1P2C=Av9LWazJF3X=U14riNi=xESZX68mdwQ@mail.gmail.com>

On Tue, Feb 07, 2012 at 01:24:09PM +0000, carson bill wrote:
> 2012/2/7, Catalin Marinas <catalin.marinas@arm•com>:
> > On Tue, Feb 07, 2012 at 01:42:01AM +0000, bill4carson wrote:
> >> On 2012-02-07 00:26, Catalin Marinas wrote:
> >> > On Wed, Feb 01, 2012 at 03:10:21AM +0000, bill4carson wrote:
> >> >> Why is L_PTE_HUGEPAGE needed?
> >> >>
> >> >> The hugetlb subsystem calls pte_page() to derive the corresponding page
> >> >> struct from a given pte, and pte_pfn() is used first to convert the pte
> >> >> into a page frame number.
> >> >
> >> > Are you sure the pte_pfn() conversion is right? Does it need to be
> >> > different from the 4K pfn?
> > ...
> >> pte_page is defined as follows to derive the page struct from a given pte.
> >> This macro is used both in generic mm and in the hugetlb subsystem, so
> >> pte_pfn needs a switch to tell a huge page based Linux pte apart from a
> >> normal page based Linux pte; that's what L_PTE_HUGEPAGE is for.
> >>
> >> #define pte_page(pte)		pfn_to_page(pte_pfn(pte))
> >>
> >> So when L_PTE_HUGEPAGE is *NOT* set, this is a normal page based Linux pte,
> >> and Linux pte bits[31:12] are the page frame number;
> >
> > I agree.
> >
> >> otherwise, we have a huge page based Linux pte, and Linux pte
> >> bits[31:20] are the page frame number for a SECTION mapping, and
> >> bits[31:24] are the page frame number for a SUPER-SECTION mapping.
> >
> > Actually it is still 31:12 but with bits 19:12 or 23:12 masked out. So
> > you do the correct shift by PAGE_SHIFT with the additional masking for
> > huge pages (harmless).
> >
> > But do we actually need this masking? Do the huge_pte_offset() or
> > huge_pte_alloc() functions return the Linux pte (pmd) for the huge page?
> > If yes, can we not ensure that bits 19:12 are already zero? This
> > shouldn't be any different from the 4K Linux pte but with an address
> > aligned to 1MB.
> 
> I'm afraid there is some misunderstanding.
> huge_pte_offset() returns the address of the huge Linux pte if it exists;
> huge_pte_alloc() allocates a location to store the huge Linux pte and
> returns this address;
> none of the above functions returns the huge Linux pte *value*.

I agree, huge_pte_offset() returns a pointer to the Linux pte/pmd if it
exists. My point is that the values stored in Linux pte/pmd have bits
20:12 cleared already as the address is at least 2MB aligned (well,
apart from the additional L_PTE_HPAGE_* bits that you declared). Is this
correct? If yes, then you don't need any additional masking for
pte_pfn() even if it is passed a Linux pmd.
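
As a rough illustration of the alignment argument above, here is a minimal
user-space sketch. It assumes a 32-bit entry, PAGE_SHIFT of 12 and a 2MB huge
page; the sketch_* helpers are hypothetical stand-ins and are not the kernel's
pte_pfn() or anything taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT   12
#define HPAGE_SHIFT  21   /* assumed 2MB huge page */

/* Hypothetical helper: combine a 2MB-aligned physical address with
 * attribute bits that live entirely in bits 11:0. */
static uint32_t sketch_mk_huge_pte(uint32_t phys, uint32_t prot)
{
    assert((phys & ((UINT32_C(1) << HPAGE_SHIFT) - 1)) == 0);
    assert((prot >> PAGE_SHIFT) == 0);
    return phys | prot;
}

/* Stand-in for pte_pfn(): a plain shift, with no huge-page masking. */
static uint32_t sketch_pte_pfn(uint32_t pte)
{
    return pte >> PAGE_SHIFT;
}

/* Extract bits 20:12, which a 2MB-aligned address leaves at zero. */
static uint32_t sketch_bits_20_12(uint32_t v)
{
    return (v >> PAGE_SHIFT) &
           ((UINT32_C(1) << (HPAGE_SHIFT - PAGE_SHIFT)) - 1);
}
```

With phys = 0x80200000 and prot = 0x45f, sketch_bits_20_12() gives 0 and
sketch_pte_pfn() gives 0x80200, the same value as phys >> PAGE_SHIFT.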

> make_huge_pte() returns the huge Linux pte for a given page and vma
> protection bits.
> Please notice that pte_mkhuge() is used to mark this pte as a huge Linux pte
> by setting L_PTE_HUGEPAGE; set_huge_pte_at() is then used to set the huge
> Linux pte as well as the huge hardware pte.
> 
> 
> static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page,
>                                 int writable)
> {
>         pte_t entry;
>
>         if (writable) {
>                 entry =
>                     pte_mkwrite(pte_mkdirty(mk_pte(page, vma->vm_page_prot)));
>         } else {
>                 entry = huge_pte_wrprotect(mk_pte(page, vma->vm_page_prot));
>         }
>         entry = pte_mkyoung(entry);
>         entry = pte_mkhuge(entry);
>
>         return entry;
> }
> 
> Hence, a normal Linux pte must have L_PTE_HUGEPAGE cleared;
> a huge Linux pte must have L_PTE_HUGEPAGE (BIT11) set.
> This could lead to L_PTE_HPAGE_2M (BIT12) or L_PTE_HPAGE_16M (BIT13) being set
> respectively; that's why the masking is needed for pte_pfn.

But if you avoid setting L_PTE_HPAGE_*, then we don't need the masking
for pte_pfn. In which case, we don't need to differentiate between a
normal and a huge pte in pte_pfn(), so no need for L_PTE_HUGEPAGE. The
set_huge_pte_at() function is only called with a huge pte, so it doesn't
need to check the L_PTE_HUGEPAGE bit either.
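
To make the trade-off concrete, here is a small sketch of both cases. The bit
positions (BIT11, BIT12, BIT13) are the ones quoted in this thread; the
sketch_* functions and the masking logic are assumptions for illustration and
are not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT       12
/* Bit positions as quoted in the thread; everything else is illustrative. */
#define L_PTE_HUGEPAGE   (UINT32_C(1) << 11)
#define L_PTE_HPAGE_2M   (UINT32_C(1) << 12)
#define L_PTE_HPAGE_16M  (UINT32_C(1) << 13)

/* Hypothetical helper: tag a pte as huge and record its size in bit 12/13. */
static uint32_t sketch_mark_huge(uint32_t pte, uint32_t size_flag)
{
    assert(size_flag == L_PTE_HPAGE_2M || size_flag == L_PTE_HPAGE_16M);
    return pte | L_PTE_HUGEPAGE | size_flag;
}

/* Plain conversion: correct only if bits 31:12 hold nothing but the address. */
static uint32_t sketch_pfn_plain(uint32_t pte)
{
    return pte >> PAGE_SHIFT;
}

/* Conversion with the extra masking needed once size flags sit in bits 13:12. */
static uint32_t sketch_pfn_masked(uint32_t pte)
{
    if (pte & L_PTE_HUGEPAGE)
        pte &= ~(L_PTE_HPAGE_2M | L_PTE_HPAGE_16M);
    return pte >> PAGE_SHIFT;
}
```

For a 2MB-aligned address 0x80200000 tagged with L_PTE_HPAGE_2M, the plain
shift yields 0x80201 while the masked version yields 0x80200. If the size
flags are never stored, the plain shift already yields 0x80200 and neither the
masking nor the L_PTE_HUGEPAGE test is required.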

-- 
Catalin
