From: Alex Williamson <alex.williamson@redhat•com>
To: Alexey Kardashevskiy <aik@ozlabs•ru>
Cc: linuxppc-dev@lists•ozlabs.org, linux-kernel@vger•kernel.org,
Paul Mackerras <paulus@samba•org>
Subject: Re: [PATCH kernel v8 00/31] powerpc/iommu/vfio: Enable Dynamic DMA windows
Date: Fri, 10 Apr 2015 16:13:33 -0600
Message-ID: <1428704013.5567.632.camel@redhat.com>
In-Reply-To: <1428647473-11738-1-git-send-email-aik@ozlabs.ru>
On Fri, 2015-04-10 at 16:30 +1000, Alexey Kardashevskiy wrote:
> This enables the sPAPR-defined feature called Dynamic DMA windows (DDW).
>
> Each Partitionable Endpoint (IOMMU group) has an address range on a PCI bus
> where devices are allowed to do DMA. These ranges are called DMA windows.
> By default, there is a single DMA window, 1 or 2GB big, mapped at zero
> on a PCI bus.
>
> High-speed devices may suffer from the limited size of the window.
> The recent host kernels use a TCE bypass window on POWER8 CPU which implements
> direct PCI bus address range mapping (with offset of 1<<59) to the host memory.
>
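As a rough illustration of the bypass mapping described above, here is a minimal sketch in C. It assumes the bypass window is a plain linear mapping at the 1<<59 offset mentioned in the cover letter; the macro and helper names are hypothetical and are not taken from the patchset.

```c
#include <stdint.h>

/* Offset of the TCE bypass window on the PCI bus, per the cover letter. */
#define TCE_BYPASS_OFFSET (1ULL << 59)

/*
 * Hypothetical helper: the bus address a device would use to reach a given
 * host physical address through the bypass window (linear mapping assumed).
 */
static inline uint64_t bypass_bus_addr(uint64_t host_phys)
{
    return host_phys + TCE_BYPASS_OFFSET;
}

/* Hypothetical inverse: recover the host physical address from a bus address. */
static inline uint64_t bypass_host_phys(uint64_t bus_addr)
{
    return bus_addr - TCE_BYPASS_OFFSET;
}
```

Under these assumptions, host physical address 0x1000 would appear on the bus at 0x0800000000001000, with no per-page TCE update needed.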
> For guests, PAPR defines a DDW RTAS API which allows pseries guests
> to query the hypervisor about DDW support and capabilities (page size mask
> for now). A pseries guest may request an additional DMA window (on top of
> the default one) using this RTAS API.
> The existing pseries Linux guests request an additional window as big as
> the guest RAM and map the entire guest window which effectively creates
> direct mapping of the guest memory to a PCI bus.
>
> The multiple DMA windows feature is supported by POWER7/POWER8 CPUs; however
> this patchset only adds support for POWER8 as TCE tables are implemented
> in POWER7 in a quite different way and POWER7 is not the highest priority.
>
> This patchset reworks PPC64 IOMMU code and adds necessary structures
> to support big windows.
>
> Once a Linux guest discovers the presence of DDW, it does:
> 1. query hypervisor about number of available windows and page size masks;
> 2. create a window with the biggest possible page size (today 4K/64K/16M);
> 3. map the entire guest RAM via H_PUT_TCE* hypercalls;
> 4. switch dma_ops to direct_dma_ops on the selected PE.
>
> Once this is done, H_PUT_TCE is not called anymore for 64bit devices and
> the guest does not waste time on DMA map/unmap operations.
>
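To make the cost of step 3 concrete, the sketch below picks the largest page size and counts the TCEs needed to map all of guest RAM. It is only an illustration: the helper names are hypothetical, the supported sizes are passed as an array of page shifts (12, 16, 24 for 4K/64K/16M) rather than in the real RTAS page size mask encoding, and one TCE per IOMMU page is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the largest page shift in the list, or 0 if empty. */
static unsigned int ddw_pick_page_shift(const unsigned int *shifts, size_t n)
{
    unsigned int best = 0;

    for (size_t i = 0; i < n; i++)
        if (shifts[i] > best)
            best = shifts[i];
    return best;
}

/*
 * Hypothetical helper: number of TCEs needed to cover ram_bytes with pages of
 * size 1 << page_shift, rounding up (one TCE per IOMMU page assumed).
 */
static uint64_t ddw_tce_count(uint64_t ram_bytes, unsigned int page_shift)
{
    uint64_t page_size = 1ULL << page_shift;

    return (ram_bytes + page_size - 1) >> page_shift;
}
```

For a 16 GiB guest this gives 1024 TCEs with 16M pages versus 4194304 with 4K pages, which suggests why the guest asks for the biggest page size before mapping everything up front.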
> Note that 32bit devices won't use DDW and will keep using the default
> DMA window so KVM optimizations will be required (to be posted later).
>
> This is pushed to git@github•com:aik/linux.git
> + 09bb8ea...d9b711d vfio-for-github -> vfio-for-github (forced update)
>
>
> Please comment. Thank you!
>
>
> Changes:
> v8:
> * fixed a bug in error fallback in "powerpc/mmu: Add userspace-to-physical
> addresses translation cache"
> * fixed subject in "vfio: powerpc/spapr: Check that IOMMU page is fully
> contained by system page"
> * moved v2 documentation to the correct patch
> * added checks for failed vzalloc() in "powerpc/iommu: Add userspace view
> of TCE table"
>
> v7:
> * moved memory preregistration to the current process's MMU context
> * added code preventing unregistration if some pages are still mapped;
> for this, a userspace view of the table is stored in iommu_table
> * added locked_vm counting for DDW tables (including userspace view of those)
>
> v6:
> * fixed a bunch of errors in "vfio: powerpc/spapr: Support Dynamic DMA windows"
> * moved static IOMMU properties from iommu_table_group to iommu_table_group_ops
>
> v5:
> * added SPAPR_TCE_IOMMU_v2 to tell the userspace that there is a memory
> pre-registration feature
> * added backward compatibility
> * renamed a few things (mostly powerpc_iommu -> iommu_table_group)
>
> v4:
> * moved patches around to have VFIO and PPC patches separated as much as
> possible
> * now works with the existing upstream QEMU
>
> v3:
> * redesigned the whole thing
> * multiple IOMMU groups per PHB -> one PHB is needed for VFIO in the guest ->
> no problems with locked_vm counting; also we save memory on actual tables
> * guest RAM preregistration is required for DDW
> * PEs (IOMMU groups) are passed to VFIO with no DMA windows at all so
> we do not bother with iommu_table::it_map anymore
> * added multilevel TCE tables support to support really huge guests
>
> v2:
> * added missing __pa() in "powerpc/powernv: Release replaced TCE"
> * reposted to make some noise
>
>
>
>
> Alexey Kardashevskiy (31):
> vfio: powerpc/spapr: Move page pinning from arch code to VFIO IOMMU
> driver
> vfio: powerpc/spapr: Do cleanup when releasing the group
> vfio: powerpc/spapr: Check that IOMMU page is fully contained by
> system page
> vfio: powerpc/spapr: Use it_page_size
> vfio: powerpc/spapr: Move locked_vm accounting to helpers
> vfio: powerpc/spapr: Disable DMA mappings on disabled container
> vfio: powerpc/spapr: Moving pinning/unpinning to helpers
> vfio: powerpc/spapr: Rework groups attaching
> powerpc/powernv: Do not set "read" flag if direction==DMA_NONE
> powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table
> powerpc/iommu: Introduce iommu_table_alloc() helper
> powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group
> vfio: powerpc/spapr: powerpc/iommu: Rework IOMMU ownership control
> vfio: powerpc/spapr: powerpc/powernv/ioda2: Rework IOMMU ownership
> control
> powerpc/iommu: Fix IOMMU ownership control functions
> powerpc/powernv/ioda/ioda2: Rework tce_build()/tce_free()
> powerpc/iommu/powernv: Release replaced TCE
> powerpc/powernv/ioda2: Rework iommu_table creation
> powerpc/powernv/ioda2: Introduce
> pnv_pci_ioda2_create_table/pnc_pci_free_table
> powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_set_window
> powerpc/iommu: Split iommu_free_table into 2 helpers
> powerpc/powernv: Implement multilevel TCE tables
> powerpc/powernv: Change prototypes to receive iommu
> powerpc/powernv/ioda: Define and implement DMA table/window management
> callbacks
> vfio: powerpc/spapr: powerpc/powernv/ioda2: Rework ownership
> powerpc/iommu: Add userspace view of TCE table
> powerpc/iommu/ioda2: Add get_table_size() to calculate the size of
> fiture table
> powerpc/mmu: Add userspace-to-physical addresses translation cache
> vfio: powerpc/spapr: Register memory and define IOMMU v2
> vfio: powerpc/spapr: Support multiple groups in one container if
> possible
> vfio: powerpc/spapr: Support Dynamic DMA windows
>
> Documentation/vfio.txt | 50 +-
> arch/powerpc/include/asm/iommu.h | 111 ++-
> arch/powerpc/include/asm/machdep.h | 25 -
> arch/powerpc/include/asm/mmu-hash64.h | 3 +
> arch/powerpc/include/asm/mmu_context.h | 17 +
> arch/powerpc/kernel/iommu.c | 336 +++++----
> arch/powerpc/kernel/vio.c | 5 +
> arch/powerpc/mm/Makefile | 1 +
> arch/powerpc/mm/mmu_context_hash64.c | 6 +
> arch/powerpc/mm/mmu_context_hash64_iommu.c | 215 ++++++
> arch/powerpc/platforms/cell/iommu.c | 8 +-
> arch/powerpc/platforms/pasemi/iommu.c | 7 +-
> arch/powerpc/platforms/powernv/pci-ioda.c | 589 ++++++++++++---
> arch/powerpc/platforms/powernv/pci-p5ioc2.c | 33 +-
> arch/powerpc/platforms/powernv/pci.c | 116 ++-
> arch/powerpc/platforms/powernv/pci.h | 12 +-
> arch/powerpc/platforms/pseries/iommu.c | 55 +-
> arch/powerpc/sysdev/dart_iommu.c | 12 +-
> drivers/vfio/vfio_iommu_spapr_tce.c | 1021 ++++++++++++++++++++++++---
> include/uapi/linux/vfio.h | 88 ++-
> 20 files changed, 2218 insertions(+), 492 deletions(-)
> create mode 100644 arch/powerpc/mm/mmu_context_hash64_iommu.c

There are still some issues that need to be addressed in arch code; I've
noted them in comments for patches 15 & 26. I think I've run out of
issues for the vfio changes, so for the vfio-related changes in patches
1-8,12-14,17,25,29-31:

Acked-by: Alex Williamson <alex.williamson@redhat•com>

Thread overview: 74+ messages
2015-04-10 6:30 [PATCH kernel v8 00/31] powerpc/iommu/vfio: Enable Dynamic DMA windows Alexey Kardashevskiy
2015-04-10 6:30 ` [PATCH kernel v8 01/31] vfio: powerpc/spapr: Move page pinning from arch code to VFIO IOMMU driver Alexey Kardashevskiy
2015-04-15 3:56 ` David Gibson
2015-04-10 6:30 ` [PATCH kernel v8 02/31] vfio: powerpc/spapr: Do cleanup when releasing the group Alexey Kardashevskiy
2015-04-15 4:00 ` David Gibson
2015-04-10 6:30 ` [PATCH kernel v8 03/31] vfio: powerpc/spapr: Check that IOMMU page is fully contained by system page Alexey Kardashevskiy
2015-04-15 4:03 ` David Gibson
2015-04-10 6:30 ` [PATCH kernel v8 04/31] vfio: powerpc/spapr: Use it_page_size Alexey Kardashevskiy
2015-04-10 6:30 ` [PATCH kernel v8 05/31] vfio: powerpc/spapr: Move locked_vm accounting to helpers Alexey Kardashevskiy
2015-04-15 4:09 ` David Gibson
2015-04-10 6:30 ` [PATCH kernel v8 06/31] vfio: powerpc/spapr: Disable DMA mappings on disabled container Alexey Kardashevskiy
2015-04-15 7:05 ` David Gibson
2015-04-10 6:30 ` [PATCH kernel v8 07/31] vfio: powerpc/spapr: Moving pinning/unpinning to helpers Alexey Kardashevskiy
2015-04-15 7:10 ` David Gibson
2015-04-15 12:09 ` Alexey Kardashevskiy
2015-04-10 6:30 ` [PATCH kernel v8 08/31] vfio: powerpc/spapr: Rework groups attaching Alexey Kardashevskiy
2015-04-15 7:14 ` David Gibson
2015-04-10 6:30 ` [PATCH kernel v8 09/31] powerpc/powernv: Do not set "read" flag if direction==DMA_NONE Alexey Kardashevskiy
2015-04-15 7:17 ` David Gibson
2015-04-10 6:30 ` [PATCH kernel v8 10/31] powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table Alexey Kardashevskiy
2015-04-15 7:23 ` David Gibson
2015-04-10 6:30 ` [PATCH kernel v8 11/31] powerpc/iommu: Introduce iommu_table_alloc() helper Alexey Kardashevskiy
2015-04-16 5:31 ` David Gibson
2015-04-10 6:30 ` [PATCH kernel v8 12/31] powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group Alexey Kardashevskiy
2015-04-16 5:55 ` David Gibson
2015-04-16 15:48 ` Alexey Kardashevskiy
2015-04-20 2:36 ` David Gibson
2015-04-17 9:46 ` Alexey Kardashevskiy
2015-04-20 2:37 ` David Gibson
2015-04-10 6:30 ` [PATCH kernel v8 13/31] vfio: powerpc/spapr: powerpc/iommu: Rework IOMMU ownership control Alexey Kardashevskiy
2015-04-16 6:00 ` David Gibson
2015-04-10 6:30 ` [PATCH kernel v8 14/31] vfio: powerpc/spapr: powerpc/powernv/ioda2: " Alexey Kardashevskiy
2015-04-16 6:07 ` David Gibson
2015-04-17 10:09 ` Alexey Kardashevskiy
2015-04-20 2:44 ` David Gibson
2015-04-20 6:55 ` Alexey Kardashevskiy
2015-04-21 9:43 ` David Gibson
2015-04-21 11:47 ` Alexey Kardashevskiy
2015-04-22 5:22 ` David Gibson
2015-04-10 6:30 ` [PATCH kernel v8 15/31] powerpc/iommu: Fix IOMMU ownership control functions Alexey Kardashevskiy
2015-04-10 21:28 ` Alex Williamson
2015-04-16 6:10 ` David Gibson
2015-04-17 10:16 ` Alexey Kardashevskiy
2015-04-20 2:46 ` David Gibson
2015-04-20 6:34 ` Alexey Kardashevskiy
2015-04-21 7:12 ` David Gibson
2015-04-10 6:30 ` [PATCH kernel v8 16/31] powerpc/powernv/ioda/ioda2: Rework tce_build()/tce_free() Alexey Kardashevskiy
2015-04-16 6:17 ` David Gibson
2015-04-10 6:30 ` [PATCH kernel v8 17/31] powerpc/iommu/powernv: Release replaced TCE Alexey Kardashevskiy
2015-04-16 6:26 ` David Gibson
2015-04-17 10:37 ` Alexey Kardashevskiy
2015-04-20 2:50 ` David Gibson
2015-04-10 6:31 ` [PATCH kernel v8 18/31] powerpc/powernv/ioda2: Rework iommu_table creation Alexey Kardashevskiy
2015-04-16 6:29 ` David Gibson
2015-04-10 6:31 ` [PATCH kernel v8 19/31] powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_create_table/pnc_pci_free_table Alexey Kardashevskiy
2015-04-16 6:42 ` David Gibson
2015-04-10 6:31 ` [PATCH kernel v8 20/31] powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_set_window Alexey Kardashevskiy
2015-04-16 6:43 ` David Gibson
2015-04-10 6:31 ` [PATCH kernel v8 21/31] powerpc/iommu: Split iommu_free_table into 2 helpers Alexey Kardashevskiy
2015-04-16 6:46 ` David Gibson
2015-04-16 16:29 ` Alexey Kardashevskiy
2015-04-20 2:51 ` David Gibson
2015-04-10 6:31 ` [PATCH kernel v8 22/31] powerpc/powernv: Implement multilevel TCE tables Alexey Kardashevskiy
2015-04-10 6:31 ` [PATCH kernel v8 23/31] powerpc/powernv: Change prototypes to receive iommu Alexey Kardashevskiy
2015-04-10 6:31 ` [PATCH kernel v8 24/31] powerpc/powernv/ioda: Define and implement DMA table/window management callbacks Alexey Kardashevskiy
2015-04-10 6:31 ` [PATCH kernel v8 25/31] vfio: powerpc/spapr: powerpc/powernv/ioda2: Rework ownership Alexey Kardashevskiy
2015-04-10 6:31 ` [PATCH kernel v8 26/31] powerpc/iommu: Add userspace view of TCE table Alexey Kardashevskiy
2015-04-10 21:31 ` Alex Williamson
2015-04-10 6:31 ` [PATCH kernel v8 27/31] powerpc/iommu/ioda2: Add get_table_size() to calculate the size of fiture table Alexey Kardashevskiy
2015-04-10 6:31 ` [PATCH kernel v8 28/31] powerpc/mmu: Add userspace-to-physical addresses translation cache Alexey Kardashevskiy
2015-04-10 6:31 ` [PATCH kernel v8 29/31] vfio: powerpc/spapr: Register memory and define IOMMU v2 Alexey Kardashevskiy
2015-04-10 6:31 ` [PATCH kernel v8 30/31] vfio: powerpc/spapr: Support multiple groups in one container if possible Alexey Kardashevskiy
2015-04-10 6:31 ` [PATCH kernel v8 31/31] vfio: powerpc/spapr: Support Dynamic DMA windows Alexey Kardashevskiy
2015-04-10 22:13 ` Alex Williamson [this message]