public inbox for linuxppc-dev@ozlabs.org 
From: Alex Williamson <alex.williamson@redhat•com>
To: Alexey Kardashevskiy <aik@ozlabs•ru>
Cc: linuxppc-dev@lists•ozlabs.org, linux-kernel@vger•kernel.org,
	Paul Mackerras <paulus@samba•org>
Subject: Re: [PATCH kernel v8 00/31] powerpc/iommu/vfio: Enable Dynamic DMA windows
Date: Fri, 10 Apr 2015 16:13:33 -0600
Message-ID: <1428704013.5567.632.camel@redhat.com>
In-Reply-To: <1428647473-11738-1-git-send-email-aik@ozlabs.ru>

On Fri, 2015-04-10 at 16:30 +1000, Alexey Kardashevskiy wrote:
> This enables the sPAPR-defined feature called Dynamic DMA windows (DDW).
> 
> Each Partitionable Endpoint (IOMMU group) has an address range on a PCI bus
> where devices are allowed to do DMA. These ranges are called DMA windows.
> By default, there is a single DMA window, 1 or 2GB big, mapped at zero
> on a PCI bus.
> 
> High-speed devices may suffer from the limited size of the window.
> Recent host kernels use a TCE bypass window on POWER8 CPUs, which implements
> a direct mapping of a PCI bus address range (at an offset of 1<<59) to host
> memory.
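
To make the address arithmetic above concrete, here is a minimal sketch of
how a bus address in the bypass window relates to a host physical address,
next to a bounds check for a default window mapped at zero. Only the 1<<59
offset and the "mapped at zero" layout come from the cover letter; the
helper names are hypothetical and this is not the kernel's code.

```c
#include <stdint.h>

/* Offset of the TCE bypass window on the PCI bus, as stated in the cover letter. */
#define BYPASS_WINDOW_OFFSET (1ULL << 59)

/*
 * Hypothetical helper: bus address a device would use to reach the host
 * physical address host_pa through the bypass window.
 */
static inline uint64_t bypass_bus_addr(uint64_t host_pa)
{
	return host_pa + BYPASS_WINDOW_OFFSET;
}

/*
 * Hypothetical helper: non-zero if bus_addr falls inside a default DMA
 * window of window_size bytes that is mapped at bus address zero.
 */
static inline int in_default_window(uint64_t bus_addr, uint64_t window_size)
{
	return bus_addr < window_size;
}
```

With a 2GB default window, any address produced by bypass_bus_addr() lies
far outside it, which is why the two windows can coexist on the same bus.
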
> 
> For guests, PAPR defines a DDW RTAS API which allows pseries guests
> to query the hypervisor about DDW support and capabilities (page size mask
> for now). A pseries guest may request DMA windows in addition to the
> default one using this RTAS API.
> The existing pseries Linux guests request an additional window as big as
> the guest RAM and map the entire guest window, which effectively creates
> a direct mapping of the guest memory to a PCI bus.
> 
> The multiple DMA windows feature is supported by POWER7/POWER8 CPUs; however,
> this patchset only adds support for POWER8, as TCE tables are implemented
> quite differently in POWER7 and POWER7 is not the highest priority.
> 
> This patchset reworks PPC64 IOMMU code and adds necessary structures
> to support big windows.
> 
> Once a Linux guest discovers the presence of DDW, it does the following:
> 1. queries the hypervisor about the number of available windows and page
>    size masks;
> 2. creates a window with the biggest possible page size (today 4K/64K/16M);
> 3. maps the entire guest RAM via H_PUT_TCE* hypercalls;
> 4. switches dma_ops to direct_dma_ops on the selected PE.
> 
> Once this is done, H_PUT_TCE is not called anymore for 64bit devices and
> the guest does not waste time on DMA map/unmap operations.
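
Steps 2 and 3 above can be sketched as plain arithmetic: pick the biggest
page size the hypervisor offers, then count how many TCEs are needed to
cover guest RAM. This is only an illustration under stated assumptions:
the mask encoding (bit n set means a 2^n byte page is supported) is
hypothetical and is not the PAPR encoding, and the function names are made
up for this example.

```c
#include <stdint.h>

/*
 * Hypothetical page size mask encoding for this sketch only:
 * bit n set means pages of 2^n bytes are supported.
 */
#define PGSZ_4K  (1ULL << 12)
#define PGSZ_64K (1ULL << 16)
#define PGSZ_16M (1ULL << 24)

/* Step 2: shift of the biggest page size in mask, or -1 if mask is empty. */
static int ddw_biggest_page_shift(uint64_t mask)
{
	int shift = -1;

	for (int i = 0; i < 64; i++)
		if (mask & (1ULL << i))
			shift = i;
	return shift;
}

/* Step 3: number of TCEs needed to map ram_size bytes, rounded up. */
static uint64_t ddw_tce_count(uint64_t ram_size, int page_shift)
{
	uint64_t page_size = 1ULL << page_shift;

	return (ram_size + page_size - 1) >> page_shift;
}
```

For a 4GB guest this gives 256 TCEs with 16M pages versus 1048576 with 4K
pages, which illustrates why the biggest available page size is preferred
for the one-time mapping of all guest RAM.
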
> 
> Note that 32bit devices won't use DDW and will keep using the default
> DMA window so KVM optimizations will be required (to be posted later).
> 
> This is pushed to git@github•com:aik/linux.git
>  + 09bb8ea...d9b711d vfio-for-github -> vfio-for-github (forced update)
> 
> 
> Please comment. Thank you!
> 
> 
> Changes:
> v8:
> * fixed a bug in error fallback in "powerpc/mmu: Add userspace-to-physical
> addresses translation cache"
> * fixed subject in "vfio: powerpc/spapr: Check that IOMMU page is fully
> contained by system page"
> * moved v2 documentation to the correct patch
> * added checks for failed vzalloc() in "powerpc/iommu: Add userspace view
> of TCE table"
> 
> v7:
> * moved memory preregistration to the current process's MMU context
> * added code preventing unregistration if some pages are still mapped;
> for this, a userspace view of the table is stored in iommu_table
> * added locked_vm counting for DDW tables (including userspace view of those)
> 
> v6:
> * fixed a bunch of errors in "vfio: powerpc/spapr: Support Dynamic DMA windows"
> * moved static IOMMU properties from iommu_table_group to iommu_table_group_ops
> 
> v5:
> * added SPAPR_TCE_IOMMU_v2 to tell the userspace that there is a memory
> pre-registration feature
> * added backward compatibility
> * renamed a few things (mostly powerpc_iommu -> iommu_table_group)
> 
> v4:
> * moved patches around to have VFIO and PPC patches separated as much as
> possible
> * now works with the existing upstream QEMU
> 
> v3:
> * redesigned the whole thing
> * multiple IOMMU groups per PHB -> one PHB is needed for VFIO in the guest ->
> no problems with locked_vm counting; also we save memory on actual tables
> * guest RAM preregistration is required for DDW
> * PEs (IOMMU groups) are passed to VFIO with no DMA windows at all so
> we do not bother with iommu_table::it_map anymore
> * added multilevel TCE tables support to support really huge guests
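
As a rough illustration of the multilevel idea in the last v3 item, the
sketch below splits a flat TCE index into per-level indices, the way a
radix-style table would. The geometry (every level holding
2^bits_per_level entries, level 0 being the top) is an assumption made for
this example, not something stated in the cover letter, and the function
name is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical geometry: each level has 2^bits_per_level entries
 * (bits_per_level must be below 64); level 0 is the top level and
 * level (levels - 1) holds the actual TCEs.
 */
static unsigned tce_level_index(uint64_t tce_index, unsigned level,
				unsigned levels, unsigned bits_per_level)
{
	unsigned shift = (levels - 1 - level) * bits_per_level;
	uint64_t mask = (1ULL << bits_per_level) - 1;

	return (unsigned)((tce_index >> shift) & mask);
}
```

The appeal of such a layout is that intermediate levels only need to be
allocated for the parts of the window that are actually used, rather than
requiring one large contiguous table.
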
> 
> v2:
> * added missing __pa() in "powerpc/powernv: Release replaced TCE"
> * reposted to make some noise
> 
> 
> 
> 
> Alexey Kardashevskiy (31):
>   vfio: powerpc/spapr: Move page pinning from arch code to VFIO IOMMU
>     driver
>   vfio: powerpc/spapr: Do cleanup when releasing the group
>   vfio: powerpc/spapr: Check that IOMMU page is fully contained by
>     system page
>   vfio: powerpc/spapr: Use it_page_size
>   vfio: powerpc/spapr: Move locked_vm accounting to helpers
>   vfio: powerpc/spapr: Disable DMA mappings on disabled container
>   vfio: powerpc/spapr: Moving pinning/unpinning to helpers
>   vfio: powerpc/spapr: Rework groups attaching
>   powerpc/powernv: Do not set "read" flag if direction==DMA_NONE
>   powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table
>   powerpc/iommu: Introduce iommu_table_alloc() helper
>   powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group
>   vfio: powerpc/spapr: powerpc/iommu: Rework IOMMU ownership control
>   vfio: powerpc/spapr: powerpc/powernv/ioda2: Rework IOMMU ownership
>     control
>   powerpc/iommu: Fix IOMMU ownership control functions
>   powerpc/powernv/ioda/ioda2: Rework tce_build()/tce_free()
>   powerpc/iommu/powernv: Release replaced TCE
>   powerpc/powernv/ioda2: Rework iommu_table creation
>   powerpc/powernv/ioda2: Introduce
>     pnv_pci_ioda2_create_table/pnc_pci_free_table
>   powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_set_window
>   powerpc/iommu: Split iommu_free_table into 2 helpers
>   powerpc/powernv: Implement multilevel TCE tables
>   powerpc/powernv: Change prototypes to receive iommu
>   powerpc/powernv/ioda: Define and implement DMA table/window management
>     callbacks
>   vfio: powerpc/spapr: powerpc/powernv/ioda2: Rework ownership
>   powerpc/iommu: Add userspace view of TCE table
>   powerpc/iommu/ioda2: Add get_table_size() to calculate the size of
>     fiture table
>   powerpc/mmu: Add userspace-to-physical addresses translation cache
>   vfio: powerpc/spapr: Register memory and define IOMMU v2
>   vfio: powerpc/spapr: Support multiple groups in one container if
>     possible
>   vfio: powerpc/spapr: Support Dynamic DMA windows
> 
>  Documentation/vfio.txt                      |   50 +-
>  arch/powerpc/include/asm/iommu.h            |  111 ++-
>  arch/powerpc/include/asm/machdep.h          |   25 -
>  arch/powerpc/include/asm/mmu-hash64.h       |    3 +
>  arch/powerpc/include/asm/mmu_context.h      |   17 +
>  arch/powerpc/kernel/iommu.c                 |  336 +++++----
>  arch/powerpc/kernel/vio.c                   |    5 +
>  arch/powerpc/mm/Makefile                    |    1 +
>  arch/powerpc/mm/mmu_context_hash64.c        |    6 +
>  arch/powerpc/mm/mmu_context_hash64_iommu.c  |  215 ++++++
>  arch/powerpc/platforms/cell/iommu.c         |    8 +-
>  arch/powerpc/platforms/pasemi/iommu.c       |    7 +-
>  arch/powerpc/platforms/powernv/pci-ioda.c   |  589 ++++++++++++---
>  arch/powerpc/platforms/powernv/pci-p5ioc2.c |   33 +-
>  arch/powerpc/platforms/powernv/pci.c        |  116 ++-
>  arch/powerpc/platforms/powernv/pci.h        |   12 +-
>  arch/powerpc/platforms/pseries/iommu.c      |   55 +-
>  arch/powerpc/sysdev/dart_iommu.c            |   12 +-
>  drivers/vfio/vfio_iommu_spapr_tce.c         | 1021 ++++++++++++++++++++++++---
>  include/uapi/linux/vfio.h                   |   88 ++-
>  20 files changed, 2218 insertions(+), 492 deletions(-)
>  create mode 100644 arch/powerpc/mm/mmu_context_hash64_iommu.c


There are still some issues that need to be addressed in arch code; I've
noted them in comments for patches 15 & 26.  I think I've run out of
issues for the vfio changes, so for the vfio-related changes in patches
1-8,12-14,17,25,29-31:

Acked-by: Alex Williamson <alex.williamson@redhat•com>
