From: Dev Jain <dev.jain@arm•com>
To: Wen Jiang <jiangwenxiaomi@gmail•com>,
linux-mm@kvack•org, linux-arm-kernel@lists•infradead.org,
catalin.marinas@arm•com, will@kernel•org,
akpm@linux-foundation•org, urezki@gmail•com
Cc: baohua@kernel•org, Xueyuan.chen21@gmail•com, rppt@kernel•org,
david@kernel•org, ryan.roberts@arm•com,
anshuman.khandual@arm•com, ajd@linux•ibm.com,
linux-kernel@vger•kernel.org, jiangwen6@xiaomi•com
Subject: Re: [PATCH v3 5/6] mm/vmalloc: map contiguous pages in batches for vmap() if possible
Date: Wed, 27 May 2026 13:57:54 +0530
Message-ID: <340c811e-2501-46c3-8a55-19e955c5ae8a@arm.com>
In-Reply-To: <20260522053146.83209-6-jiangwenxiaomi@gmail.com>

On 22/05/26 11:01 am, Wen Jiang wrote:
> From: "Barry Song (Xiaomi)" <baohua@kernel•org>
>
> In many cases, the pages passed to vmap() may include high-order
> pages. For example, the systemheap often allocates pages in descending
> order: order 8, then 4, then 0. Currently, vmap() iterates over every
> page individually—even pages inside a high-order block are handled
> one by one.
>
> This patch detects physically contiguous pages (regardless of whether
> they are compound or non-compound) by scanning with
> num_pages_contiguous(), and maps them as a single contiguous block
> whenever possible. The first page's pfn must be aligned to the
> mapping order for the batched mapping to be used.
>
> Pages with the same page_shift are coalesced and mapped via
> vmap_pages_range_noflush_walk() to avoid page table rewalk.
>
> As users typically allocate memory in descending orders (e.g.
> 8 → 4 → 0), once an order-0 page is encountered, we stop scanning
> for contiguous pages since subsequent pages are likely order-0 as well.
>
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel•org>
> Co-developed-by: Dev Jain <dev.jain@arm•com>
> Signed-off-by: Dev Jain <dev.jain@arm•com>
> Signed-off-by: Wen Jiang <jiangwen6@xiaomi•com>
> Tested-by: Xueyuan Chen <xueyuan.chen21@gmail•com>
> ---
> mm/vmalloc.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 80 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index deb764abc0571..50642246f4d40 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3542,6 +3542,84 @@ void vunmap(const void *addr)
> }
> EXPORT_SYMBOL(vunmap);
>
> +static inline int get_vmap_batch_order(struct page **pages,
> + unsigned int max_steps, unsigned int idx)
> +{
> + unsigned int nr_contig;
> + int order;
> +
> + if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP) ||
> + ioremap_max_page_shift == PAGE_SHIFT)

Why bail out when ioremap_max_page_shift == PAGE_SHIFT? The ioremap
code path is different from the vmap one, right?

> + return 0;
> +
> + nr_contig = num_pages_contiguous(&pages[idx], max_steps);
> + if (nr_contig < 2)
> + return 0;
> +
> + order = fls(nr_contig) - 1;
> +
> + if (arch_vmap_pte_supported_shift(PAGE_SIZE << order) == PAGE_SHIFT)
> + return 0;
> +
> + /* Ensure the first page's pfn is aligned to the order */
> + if (!IS_ALIGNED(page_to_pfn(pages[idx]), 1 << order))
> + return 0;
> +
> + return order;
> +}
> +
> +static int vmap_batched(unsigned long addr, unsigned long end,
> + pgprot_t prot, struct page **pages)
> +{
> + unsigned int count = (end - addr) >> PAGE_SHIFT;
> + unsigned int prev_shift = 0, idx = 0;
> + unsigned long start = addr, map_addr = addr;
> + int err;
> +
> + err = kmsan_vmap_pages_range_noflush(addr, end, prot, pages,
> + PAGE_SHIFT, GFP_KERNEL);
> + if (err)
> + goto out;
> +
> + for (unsigned int i = 0; i < count; ) {
> + unsigned int shift = PAGE_SHIFT +
> + get_vmap_batch_order(pages, count - i, i);
> +
> + if (!i)
> + prev_shift = shift;
> +
> + if (shift != prev_shift) {
> + err = vmap_pages_range_noflush_walk(map_addr, addr,

It would be worth documenting that vmap_pages_range_noflush_walk()
can take an array of pages that are not all contiguous but may
contain contiguous chunks, as hinted by page_shift.

Otherwise this looks good.

> + prot, pages + idx,
> + min(prev_shift, PMD_SHIFT));
> + if (err)
> + goto out;
> + prev_shift = shift;
> + map_addr = addr;
> + idx = i;
> + }
> +
> + /*
> + * Once small pages are encountered, the remaining pages
> + * are likely small as well.
> + */
> + if (shift == PAGE_SHIFT)
> + break;
> +
> + addr += 1UL << shift;
> + i += 1U << (shift - PAGE_SHIFT);
> + }
> +
> + /* Remaining */
> + if (map_addr < end)
> + err = vmap_pages_range_noflush_walk(map_addr, end,
> + prot, pages + idx, min(prev_shift, PMD_SHIFT));
> +
> +out:
> + flush_cache_vmap(start, end);
> + return err;
> +}
> +
> /**
> * vmap - map an array of pages into virtually contiguous space
> * @pages: array of page pointers
> @@ -3585,8 +3663,8 @@ void *vmap(struct page **pages, unsigned int count,
> return NULL;
>
> addr = (unsigned long)area->addr;
> - if (vmap_pages_range(addr, addr + size, pgprot_nx(prot),
> - pages, PAGE_SHIFT) < 0) {
> + if (vmap_batched(addr, addr + size, pgprot_nx(prot),
> + pages) < 0) {
> vunmap(area->addr);
> return NULL;
> }