* Re: [PATCH v2 1/2] vfio: add dma-buf get_tph callback and DMA_BUF_TPH feature
       [not found] ` <20260430200704.352228-2-zhipingz@meta.com>
@ 2026-05-04 21:44   ` Alex Williamson
  2026-05-05  6:54     ` Zhiping Zhang
  2026-05-06  6:58   ` fengchengwen
  1 sibling, 1 reply; 8+ messages in thread
From: Alex Williamson @ 2026-05-04 21:44 UTC (permalink / raw)
  To: Zhiping Zhang
  Cc: Jason Gunthorpe, Leon Romanovsky, Bjorn Helgaas, linux-rdma,
	linux-pci, netdev, dri-devel, Keith Busch, Yochai Cohen,
	Yishai Hadas, alex

On Thu, 30 Apr 2026 13:06:56 -0700
Zhiping Zhang <zhipingz@meta•com> wrote:

> Add a dma-buf callback that returns raw TPH metadata from the exporter
> so peer devices can reuse the steering tag and processing hint
> associated with a VFIO-exported buffer.
>
> Add a new VFIO_DEVICE_FEATURE_DMA_BUF_TPH ioctl that takes the fd from
> VFIO_DEVICE_FEATURE_DMA_BUF along with a steering tag and processing
> hint, validates the fd is a vfio-exported dma-buf belonging to this
> device, and stores the TPH values under memory_lock. This keeps the
> existing VFIO_DEVICE_FEATURE_DMA_BUF uAPI completely unchanged.
>
> The user sequences setting TPH on the dma-buf before the importer
> consumes it.
>
> Add an st_width parameter to get_tph() so the exporter can reject
> steering tags that exceed the consumer's supported width (8 vs 16 bit).
> When no TPH metadata was supplied, get_tph() returns -EOPNOTSUPP.
>
> Signed-off-by: Zhiping Zhang <zhipingz@meta•com>

The uAPI is better, but sashiko has some review comments[1] for you.

Please also copy the kvm list for vfio related development.  Thanks,

Alex

[1]https://sashiko.dev/#/patchset/20260430200704.352228-1-zhipingz@meta.com

> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1534,6 +1534,9 @@ int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
>  		return vfio_pci_core_feature_token(vdev, flags, arg, argsz);
>  	case VFIO_DEVICE_FEATURE_DMA_BUF:
>  		return vfio_pci_core_feature_dma_buf(vdev, flags, arg, argsz);
> +	case VFIO_DEVICE_FEATURE_DMA_BUF_TPH:
> +		return vfio_pci_core_feature_dma_buf_tph(vdev, flags, arg,
> +							  argsz);
>  	default:
>  		return -ENOTTY;
>  	}
> diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
> --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> @@ -19,6 +19,9 @@ struct vfio_pci_dma_buf {
>  	u32 nr_ranges;
>  	struct kref kref;
>  	struct completion comp;
> +	u16 steering_tag;
> +	u8 ph;
> +	u8 tph_present : 1;
>  	u8 revoked : 1;
>  };
>
> @@ -69,6 +72,22 @@ vfio_pci_dma_buf_map(struct dma_buf_attachment *attachment,
>  	return ret;
>  }
>
> +static int vfio_pci_dma_buf_get_tph(struct dma_buf *dmabuf, u16 *steering_tag,
> +				    u8 *ph, u8 st_width)
> +{
> +	struct vfio_pci_dma_buf *priv = dmabuf->priv;
> +
> +	if (!priv->tph_present)
> +		return -EOPNOTSUPP;
> +
> +	if (st_width < 16 && priv->steering_tag > ((1U << st_width) - 1))
> +		return -EINVAL;
> +
> +	*steering_tag = priv->steering_tag;
> +	*ph = priv->ph;
> +	return 0;
> +}
> +
>  static void vfio_pci_dma_buf_unmap(struct dma_buf_attachment *attachment,
>  				   struct sg_table *sgt,
>  				   enum dma_data_direction dir)
> @@ -101,6 +120,7 @@ static void vfio_pci_dma_buf_release(struct dma_buf *dmabuf)
>
>  static const struct dma_buf_ops vfio_pci_dmabuf_ops = {
>  	.attach = vfio_pci_dma_buf_attach,
> +	.get_tph = vfio_pci_dma_buf_get_tph,
>  	.map_dma_buf = vfio_pci_dma_buf_map,
>  	.unmap_dma_buf = vfio_pci_dma_buf_unmap,
>  	.release = vfio_pci_dma_buf_release,
> @@ -331,6 +351,55 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
>  	return ret;
>  }
>
> +int vfio_pci_core_feature_dma_buf_tph(struct vfio_pci_core_device *vdev,
> +				      u32 flags,
> +				      struct vfio_device_feature_dma_buf_tph __user *arg,
> +				      size_t argsz)
> +{
> +	struct vfio_device_feature_dma_buf_tph set_tph;
> +	struct vfio_pci_dma_buf *priv;
> +	struct dma_buf *dmabuf;
> +	int ret;
> +
> +	ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_SET,
> +				 sizeof(set_tph));
> +	if (ret != 1)
> +		return ret;
> +
> +	if (copy_from_user(&set_tph, arg, sizeof(set_tph)))
> +		return -EFAULT;
> +
> +	if (set_tph.reserved)
> +		return -EINVAL;
> +
> +	dmabuf = dma_buf_get(set_tph.dmabuf_fd);
> +	if (IS_ERR(dmabuf))
> +		return PTR_ERR(dmabuf);
> +
> +	if (dmabuf->ops != &vfio_pci_dmabuf_ops) {
> +		ret = -EINVAL;
> +		goto out_put;
> +	}
> +
> +	priv = dmabuf->priv;
> +	down_write(&vdev->memory_lock);
> +	if (priv->vdev != vdev) {
> +		ret = -EINVAL;
> +		goto out_unlock;
> +	}
> +
> +	priv->steering_tag = set_tph.steering_tag;
> +	priv->ph = set_tph.ph;
> +	priv->tph_present = 1;
> +	ret = 0;
> +
> +out_unlock:
> +	up_write(&vdev->memory_lock);
> +out_put:
> +	dma_buf_put(dmabuf);
> +	return ret;
> +}
> +
>  void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked)
>  {
>  	struct vfio_pci_dma_buf *priv;
> diff --git a/drivers/vfio/pci/vfio_pci_priv.h b/drivers/vfio/pci/vfio_pci_priv.h
> --- a/drivers/vfio/pci/vfio_pci_priv.h
> +++ b/drivers/vfio/pci/vfio_pci_priv.h
> @@ -118,6 +118,10 @@ static inline bool vfio_pci_is_vga(struct pci_dev *pdev)
>  int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
>  				  struct vfio_device_feature_dma_buf __user *arg,
>  				  size_t argsz);
> +int vfio_pci_core_feature_dma_buf_tph(struct vfio_pci_core_device *vdev,
> +				      u32 flags,
> +				      struct vfio_device_feature_dma_buf_tph __user *arg,
> +				      size_t argsz);
>  void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev);
>  void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked);
>  #else
> @@ -128,6 +132,13 @@ vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
>  {
>  	return -ENOTTY;
>  }
> +static inline int
> +vfio_pci_core_feature_dma_buf_tph(struct vfio_pci_core_device *vdev, u32 flags,
> +				  struct vfio_device_feature_dma_buf_tph __user *arg,
> +				  size_t argsz)
> +{
> +	return -ENOTTY;
> +}
>  static inline void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
>  {
>  }
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -113,6 +113,23 @@ struct dma_buf_ops {
>  	 */
>  	void (*unpin)(struct dma_buf_attachment *attach);
>
> +	/**
> +	 * @get_tph:
> +	 * @dmabuf: DMA buffer for which to retrieve TPH metadata
> +	 * @steering_tag: Returns the raw TPH steering tag
> +	 * @ph: Returns the TPH processing hint
> +	 * @st_width: Consumer's supported steering tag width in bits (8 or 16)
> +	 *
> +	 * Return the TPH (TLP Processing Hints) metadata associated with this
> +	 * DMA buffer. Exporters that do not provide TPH metadata should return
> +	 * -EOPNOTSUPP. If the steering tag exceeds @st_width bits, return
> +	 * -EINVAL.
> +	 *
> +	 * This callback is optional.
> +	 */
> +	int (*get_tph)(struct dma_buf *dmabuf, u16 *steering_tag, u8 *ph,
> +		       u8 st_width);
> +
>  	/**
>  	 * @map_dma_buf:
>  	 *
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -1534,6 +1534,28 @@ struct vfio_device_feature_dma_buf {
>   */
>  #define VFIO_DEVICE_FEATURE_MIG_PRECOPY_INFOv2 12
>
> +/**
> + * Upon VFIO_DEVICE_FEATURE_SET associate TPH (TLP Processing Hints) metadata
> + * with a vfio-exported dma-buf. The dma-buf must have been created by
> + * VFIO_DEVICE_FEATURE_DMA_BUF on this device.
> + *
> + * dmabuf_fd is the file descriptor returned by VFIO_DEVICE_FEATURE_DMA_BUF.
> + * steering_tag and ph are the raw TPH values that importing drivers should use
> + * when accessing the buffer.
> + *
> + * The user must set TPH on the dma-buf before the importer consumes it.
> + *
> + * Return: 0 on success, -errno on failure.
> + */
> +#define VFIO_DEVICE_FEATURE_DMA_BUF_TPH 13
> +
> +struct vfio_device_feature_dma_buf_tph {
> +	__s32 dmabuf_fd;
> +	__u16 steering_tag;
> +	__u8 ph;
> +	__u8 reserved;
> +};
> +
>  /* -------- API for Type1 VFIO IOMMU -------- */
>
>  /**

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] vfio: add dma-buf get_tph callback and DMA_BUF_TPH feature
  2026-05-04 21:44   ` [PATCH v2 1/2] vfio: add dma-buf get_tph callback and DMA_BUF_TPH feature Alex Williamson
@ 2026-05-05  6:54     ` Zhiping Zhang
  0 siblings, 0 replies; 8+ messages in thread
From: Zhiping Zhang @ 2026-05-05  6:54 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Jason Gunthorpe, Leon Romanovsky, Bjorn Helgaas, linux-rdma,
	linux-pci, netdev, dri-devel, Keith Busch, Yochai Cohen,
	Yishai Hadas, kvm

On Mon, May 4, 2026 at 2:45 PM Alex Williamson <alex@shazbot•org> wrote:
>
>
>
> On Thu, 30 Apr 2026 13:06:56 -0700
> Zhiping Zhang <zhipingz@meta•com> wrote:
>
> > Add a dma-buf callback that returns raw TPH metadata from the exporter
> > so peer devices can reuse the steering tag and processing hint
> > associated with a VFIO-exported buffer.
> >
> > Add a new VFIO_DEVICE_FEATURE_DMA_BUF_TPH ioctl that takes the fd from
> > VFIO_DEVICE_FEATURE_DMA_BUF along with a steering tag and processing
> > hint, validates the fd is a vfio-exported dma-buf belonging to this
> > device, and stores the TPH values under memory_lock. This keeps the
> > existing VFIO_DEVICE_FEATURE_DMA_BUF uAPI completely unchanged.
> >
> > The user sequences setting TPH on the dma-buf before the importer
> > consumes it.
> >
> > Add an st_width parameter to get_tph() so the exporter can reject
> > steering tags that exceed the consumer's supported width (8 vs 16 bit).
> > When no TPH metadata was supplied, get_tph() returns -EOPNOTSUPP.
> >
> > Signed-off-by: Zhiping Zhang <zhipingz@meta•com>
>
> The uAPI is better, but sashiko has some review comments[1] for you.
>
> Please also copy the kvm list for vfio related development.  Thanks,
>
> Alex

Got it, thanks Alex. Let me check sashiko's comments and post a new
revision. I have also copied kvm@vger•kernel.org and will include it in
future revisions.

Zhiping > > [1]https://urldefense.com/v3/__https://sashiko.dev/*/patchset/20260430200704.352228-1-zhipingz@meta.com__;Iw!!Bt8RZUm9aw!7glmqoMRhcdDwOgCAQuuEVqlhFJrh9bAYHXvicXPAO2M-k-NPwE_wFeUjVhe7EXbkXMd6g7eOe13$ > > > diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c > > --- a/drivers/vfio/pci/vfio_pci_core.c > > +++ b/drivers/vfio/pci/vfio_pci_core.c > > @@ -1534,6 +1534,9 @@ int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags, > > return vfio_pci_core_feature_token(vdev, flags, arg, argsz); > > case VFIO_DEVICE_FEATURE_DMA_BUF: > > return vfio_pci_core_feature_dma_buf(vdev, flags, arg, argsz); > > + case VFIO_DEVICE_FEATURE_DMA_BUF_TPH: > > + return vfio_pci_core_feature_dma_buf_tph(vdev, flags, arg, > > + argsz); > > default: > > return -ENOTTY; > > } > > diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c > > --- a/drivers/vfio/pci/vfio_pci_dmabuf.c > > +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c > > @@ -19,6 +19,9 @@ struct vfio_pci_dma_buf { > > u32 nr_ranges; > > struct kref kref; > > struct completion comp; > > + u16 steering_tag; > > + u8 ph; > > + u8 tph_present : 1; > > u8 revoked : 1; > > }; > > > > @@ -69,6 +72,22 @@ vfio_pci_dma_buf_map(struct dma_buf_attachment *attachment, > > return ret; > > } > > > > +static int vfio_pci_dma_buf_get_tph(struct dma_buf *dmabuf, u16 *steering_tag, > > + u8 *ph, u8 st_width) > > +{ > > + struct vfio_pci_dma_buf *priv = dmabuf->priv; > > + > > + if (!priv->tph_present) > > + return -EOPNOTSUPP; > > + > > + if (st_width < 16 && priv->steering_tag > ((1U << st_width) - 1)) > > + return -EINVAL; > > + > > + *steering_tag = priv->steering_tag; > > + *ph = priv->ph; > > + return 0; > > +} > > + > > static void vfio_pci_dma_buf_unmap(struct dma_buf_attachment *attachment, > > struct sg_table *sgt, > > enum dma_data_direction dir) > > @@ -101,6 +120,7 @@ static void vfio_pci_dma_buf_release(struct dma_buf *dmabuf) > > > > static const struct 
dma_buf_ops vfio_pci_dmabuf_ops = { > > .attach = vfio_pci_dma_buf_attach, > > + .get_tph = vfio_pci_dma_buf_get_tph, > > .map_dma_buf = vfio_pci_dma_buf_map, > > .unmap_dma_buf = vfio_pci_dma_buf_unmap, > > .release = vfio_pci_dma_buf_release, > > @@ -331,6 +351,55 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags, > > return ret; > > } > > > > +int vfio_pci_core_feature_dma_buf_tph(struct vfio_pci_core_device *vdev, > > + u32 flags, > > + struct vfio_device_feature_dma_buf_tph __user *arg, > > + size_t argsz) > > +{ > > + struct vfio_device_feature_dma_buf_tph set_tph; > > + struct vfio_pci_dma_buf *priv; > > + struct dma_buf *dmabuf; > > + int ret; > > + > > + ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_SET, > > + sizeof(set_tph)); > > + if (ret != 1) > > + return ret; > > + > > + if (copy_from_user(&set_tph, arg, sizeof(set_tph))) > > + return -EFAULT; > > + > > + if (set_tph.reserved) > > + return -EINVAL; > > + > > + dmabuf = dma_buf_get(set_tph.dmabuf_fd); > > + if (IS_ERR(dmabuf)) > > + return PTR_ERR(dmabuf); > > + > > + if (dmabuf->ops != &vfio_pci_dmabuf_ops) { > > + ret = -EINVAL; > > + goto out_put; > > + } > > + > > + priv = dmabuf->priv; > > + down_write(&vdev->memory_lock); > > + if (priv->vdev != vdev) { > > + ret = -EINVAL; > > + goto out_unlock; > > + } > > + > > + priv->steering_tag = set_tph.steering_tag; > > + priv->ph = set_tph.ph; > > + priv->tph_present = 1; > > + ret = 0; > > + > > +out_unlock: > > + up_write(&vdev->memory_lock); > > +out_put: > > + dma_buf_put(dmabuf); > > + return ret; > > +} > > + > > void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked) > > { > > struct vfio_pci_dma_buf *priv; > > diff --git a/drivers/vfio/pci/vfio_pci_priv.h b/drivers/vfio/pci/vfio_pci_priv.h > > --- a/drivers/vfio/pci/vfio_pci_priv.h > > +++ b/drivers/vfio/pci/vfio_pci_priv.h > > @@ -118,6 +118,10 @@ static inline bool vfio_pci_is_vga(struct pci_dev *pdev) > > int 
vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags, > > struct vfio_device_feature_dma_buf __user *arg, > > size_t argsz); > > +int vfio_pci_core_feature_dma_buf_tph(struct vfio_pci_core_device *vdev, > > + u32 flags, > > + struct vfio_device_feature_dma_buf_tph __user *arg, > > + size_t argsz); > > void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev); > > void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked); > > #else > > @@ -128,6 +132,13 @@ vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags, > > { > > return -ENOTTY; > > } > > +static inline int > > +vfio_pci_core_feature_dma_buf_tph(struct vfio_pci_core_device *vdev, u32 flags, > > + struct vfio_device_feature_dma_buf_tph __user *arg, > > + size_t argsz) > > +{ > > + return -ENOTTY; > > +} > > static inline void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev) > > { > > } > > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h > > --- a/include/linux/dma-buf.h > > +++ b/include/linux/dma-buf.h > > @@ -113,6 +113,23 @@ struct dma_buf_ops { > > */ > > void (*unpin)(struct dma_buf_attachment *attach); > > > > + /** > > + * @get_tph: > > + * @dmabuf: DMA buffer for which to retrieve TPH metadata > > + * @steering_tag: Returns the raw TPH steering tag > > + * @ph: Returns the TPH processing hint > > + * @st_width: Consumer's supported steering tag width in bits (8 or 16) > > + * > > + * Return the TPH (TLP Processing Hints) metadata associated with this > > + * DMA buffer. Exporters that do not provide TPH metadata should return > > + * -EOPNOTSUPP. If the steering tag exceeds @st_width bits, return > > + * -EINVAL. > > + * > > + * This callback is optional. 
> > + */ > > + int (*get_tph)(struct dma_buf *dmabuf, u16 *steering_tag, u8 *ph, > > + u8 st_width); > > + > > /** > > * @map_dma_buf: > > * > > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h > > --- a/include/uapi/linux/vfio.h > > +++ b/include/uapi/linux/vfio.h > > @@ -1534,6 +1534,28 @@ struct vfio_device_feature_dma_buf { > > */ > > #define VFIO_DEVICE_FEATURE_MIG_PRECOPY_INFOv2 12 > > > > +/** > > + * Upon VFIO_DEVICE_FEATURE_SET associate TPH (TLP Processing Hints) metadata > > + * with a vfio-exported dma-buf. The dma-buf must have been created by > > + * VFIO_DEVICE_FEATURE_DMA_BUF on this device. > > + * > > + * dmabuf_fd is the file descriptor returned by VFIO_DEVICE_FEATURE_DMA_BUF. > > + * steering_tag and ph are the raw TPH values that importing drivers should use > > + * when accessing the buffer. > > + * > > + * The user must set TPH on the dma-buf before the importer consumes it. > > + * > > + * Return: 0 on success, -errno on failure. > > + */ > > +#define VFIO_DEVICE_FEATURE_DMA_BUF_TPH 13 > > + > > +struct vfio_device_feature_dma_buf_tph { > > + __s32 dmabuf_fd; > > + __u16 steering_tag; > > + __u8 ph; > > + __u8 reserved; > > +}; > > + > > /* -------- API for Type1 VFIO IOMMU -------- */ > > > > /** > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] vfio: add dma-buf get_tph callback and DMA_BUF_TPH feature [not found] ` <20260430200704.352228-2-zhipingz@meta.com> 2026-05-04 21:44 ` [PATCH v2 1/2] vfio: add dma-buf get_tph callback and DMA_BUF_TPH feature Alex Williamson @ 2026-05-06 6:58 ` fengchengwen 2026-05-06 18:23 ` Zhiping Zhang 1 sibling, 1 reply; 8+ messages in thread From: fengchengwen @ 2026-05-06 6:58 UTC (permalink / raw) To: Zhiping Zhang, Alex Williamson, Jason Gunthorpe, Leon Romanovsky Cc: Bjorn Helgaas, linux-rdma, linux-pci, netdev, dri-devel, Keith Busch, Yochai Cohen, Yishai Hadas On 5/1/2026 4:06 AM, Zhiping Zhang wrote: > Add a dma-buf callback that returns raw TPH metadata from the exporter > so peer devices can reuse the steering tag and processing hint > associated with a VFIO-exported buffer. > > Add a new VFIO_DEVICE_FEATURE_DMA_BUF_TPH ioctl that takes the fd from > VFIO_DEVICE_FEATURE_DMA_BUF along with a steering tag and processing > hint, validates the fd is a vfio-exported dma-buf belonging to this > device, and stores the TPH values under memory_lock. This keeps the > existing VFIO_DEVICE_FEATURE_DMA_BUF uAPI completely unchanged. > > The user sequences setting TPH on the dma-buf before the importer > consumes it. > > Add an st_width parameter to get_tph() so the exporter can reject > steering tags that exceed the consumer's supported width (8 vs 16 bit). > When no TPH metadata was supplied, get_tph() returns -EOPNOTSUPP. 
>
> Signed-off-by: Zhiping Zhang <zhipingz@meta•com>
>
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1534,6 +1534,9 @@ int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
>  		return vfio_pci_core_feature_token(vdev, flags, arg, argsz);
>  	case VFIO_DEVICE_FEATURE_DMA_BUF:
>  		return vfio_pci_core_feature_dma_buf(vdev, flags, arg, argsz);
> +	case VFIO_DEVICE_FEATURE_DMA_BUF_TPH:
> +		return vfio_pci_core_feature_dma_buf_tph(vdev, flags, arg,
> +							  argsz);
>  	default:
>  		return -ENOTTY;
>  	}
> diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
> --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> @@ -19,6 +19,9 @@ struct vfio_pci_dma_buf {
>  	u32 nr_ranges;
>  	struct kref kref;
>  	struct completion comp;
> +	u16 steering_tag;
> +	u8 ph;
> +	u8 tph_present : 1;
>  	u8 revoked : 1;
>  };
>
> @@ -69,6 +72,22 @@ vfio_pci_dma_buf_map(struct dma_buf_attachment *attachment,
>  	return ret;
>  }
>
> +static int vfio_pci_dma_buf_get_tph(struct dma_buf *dmabuf, u16 *steering_tag,
> +				    u8 *ph, u8 st_width)
> +{
> +	struct vfio_pci_dma_buf *priv = dmabuf->priv;
> +
> +	if (!priv->tph_present)
> +		return -EOPNOTSUPP;
> +
> +	if (st_width < 16 && priv->steering_tag > ((1U << st_width) - 1))
> +		return -EINVAL;

This check fails to catch the following cases:
1. The exporter passes an 8-bit ST and the importer supports 16-bit ST:
   the check passes.
2. The exporter has 16-bit ST enabled and its ST is < 256 (note: the PCIe
   protocol doesn't require a 16-bit ST to be >= 256), and the importer
   only supports 8-bit ST: the check also passes.

Suggest having userspace pass both the ST (8-bit) and the Extended ST
(16-bit), and letting the importer choose the right one.

> + > + *steering_tag = priv->steering_tag; > + *ph = priv->ph; > + return 0; > +} > + > static void vfio_pci_dma_buf_unmap(struct dma_buf_attachment *attachment, > struct sg_table *sgt, > enum dma_data_direction dir) > @@ -101,6 +120,7 @@ static void vfio_pci_dma_buf_release(struct dma_buf *dmabuf) > > static const struct dma_buf_ops vfio_pci_dmabuf_ops = { > .attach = vfio_pci_dma_buf_attach, > + .get_tph = vfio_pci_dma_buf_get_tph, > .map_dma_buf = vfio_pci_dma_buf_map, > .unmap_dma_buf = vfio_pci_dma_buf_unmap, > .release = vfio_pci_dma_buf_release, > @@ -331,6 +351,55 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags, > return ret; > } > > +int vfio_pci_core_feature_dma_buf_tph(struct vfio_pci_core_device *vdev, > + u32 flags, > + struct vfio_device_feature_dma_buf_tph __user *arg, > + size_t argsz) > +{ > + struct vfio_device_feature_dma_buf_tph set_tph; > + struct vfio_pci_dma_buf *priv; > + struct dma_buf *dmabuf; > + int ret; > + > + ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_SET, > + sizeof(set_tph)); > + if (ret != 1) > + return ret; > + > + if (copy_from_user(&set_tph, arg, sizeof(set_tph))) > + return -EFAULT; > + > + if (set_tph.reserved) > + return -EINVAL; > + > + dmabuf = dma_buf_get(set_tph.dmabuf_fd); > + if (IS_ERR(dmabuf)) > + return PTR_ERR(dmabuf); > + > + if (dmabuf->ops != &vfio_pci_dmabuf_ops) { > + ret = -EINVAL; > + goto out_put; > + } > + > + priv = dmabuf->priv; > + down_write(&vdev->memory_lock); > + if (priv->vdev != vdev) { > + ret = -EINVAL; > + goto out_unlock; > + } > + > + priv->steering_tag = set_tph.steering_tag; > + priv->ph = set_tph.ph; > + priv->tph_present = 1; > + ret = 0; > + > +out_unlock: > + up_write(&vdev->memory_lock); > +out_put: > + dma_buf_put(dmabuf); > + return ret; > +} > + > void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked) > { > struct vfio_pci_dma_buf *priv; > diff --git a/drivers/vfio/pci/vfio_pci_priv.h 
b/drivers/vfio/pci/vfio_pci_priv.h > --- a/drivers/vfio/pci/vfio_pci_priv.h > +++ b/drivers/vfio/pci/vfio_pci_priv.h > @@ -118,6 +118,10 @@ static inline bool vfio_pci_is_vga(struct pci_dev *pdev) > int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags, > struct vfio_device_feature_dma_buf __user *arg, > size_t argsz); > +int vfio_pci_core_feature_dma_buf_tph(struct vfio_pci_core_device *vdev, > + u32 flags, > + struct vfio_device_feature_dma_buf_tph __user *arg, > + size_t argsz); > void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev); > void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked); > #else > @@ -128,6 +132,13 @@ vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags, > { > return -ENOTTY; > } > +static inline int > +vfio_pci_core_feature_dma_buf_tph(struct vfio_pci_core_device *vdev, u32 flags, > + struct vfio_device_feature_dma_buf_tph __user *arg, > + size_t argsz) > +{ > + return -ENOTTY; > +} > static inline void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev) > { > } > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h > --- a/include/linux/dma-buf.h > +++ b/include/linux/dma-buf.h > @@ -113,6 +113,23 @@ struct dma_buf_ops { > */ > void (*unpin)(struct dma_buf_attachment *attach); > > + /** > + * @get_tph: > + * @dmabuf: DMA buffer for which to retrieve TPH metadata > + * @steering_tag: Returns the raw TPH steering tag > + * @ph: Returns the TPH processing hint > + * @st_width: Consumer's supported steering tag width in bits (8 or 16) > + * > + * Return the TPH (TLP Processing Hints) metadata associated with this > + * DMA buffer. Exporters that do not provide TPH metadata should return > + * -EOPNOTSUPP. If the steering tag exceeds @st_width bits, return > + * -EINVAL. > + * > + * This callback is optional. 
> + */ > + int (*get_tph)(struct dma_buf *dmabuf, u16 *steering_tag, u8 *ph, > + u8 st_width); > + > /** > * @map_dma_buf: > * > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h > --- a/include/uapi/linux/vfio.h > +++ b/include/uapi/linux/vfio.h > @@ -1534,6 +1534,28 @@ struct vfio_device_feature_dma_buf { > */ > #define VFIO_DEVICE_FEATURE_MIG_PRECOPY_INFOv2 12 > > +/** > + * Upon VFIO_DEVICE_FEATURE_SET associate TPH (TLP Processing Hints) metadata > + * with a vfio-exported dma-buf. The dma-buf must have been created by > + * VFIO_DEVICE_FEATURE_DMA_BUF on this device. > + * > + * dmabuf_fd is the file descriptor returned by VFIO_DEVICE_FEATURE_DMA_BUF. > + * steering_tag and ph are the raw TPH values that importing drivers should use > + * when accessing the buffer. > + * > + * The user must set TPH on the dma-buf before the importer consumes it. > + * > + * Return: 0 on success, -errno on failure. > + */ > +#define VFIO_DEVICE_FEATURE_DMA_BUF_TPH 13 > + > +struct vfio_device_feature_dma_buf_tph { > + __s32 dmabuf_fd; > + __u16 steering_tag; > + __u8 ph; > + __u8 reserved; > +}; > + > /* -------- API for Type1 VFIO IOMMU -------- */ > > /** > > ^ permalink raw reply [flat|nested] 8+ messages in thread
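
The two cases raised in the review comment above can be reproduced with a small userspace model of the v2 check. This is only a sketch: v2_range_check() is a hypothetical stand-alone copy of the comparison in vfio_pci_dma_buf_get_tph(), and the tag values are made up for illustration. It shows that the comparison only looks at the numeric value, so it cannot tell which kind of tag (8-bit ST or 16-bit Extended ST) userspace supplied.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical userspace copy of the v2 range check; the expression is
 * taken from vfio_pci_dma_buf_get_tph() in the patch under review.
 */
static int v2_range_check(uint16_t steering_tag, uint8_t st_width)
{
	if (st_width < 16 && steering_tag > ((1U << st_width) - 1))
		return -EINVAL;
	return 0;
}

/* Illustrative tag values only; they are not from real firmware. */
static void v2_range_check_cases(void)
{
	/* Case 1: tag was an 8-bit ST, importer wants 16-bit: accepted. */
	assert(v2_range_check(0x0012, 16) == 0);

	/* Case 2: tag was a 16-bit Extended ST below 256, importer is 8-bit: accepted. */
	assert(v2_range_check(0x0042, 8) == 0);

	/* The only mismatch the check catches: value does not fit in 8 bits. */
	assert(v2_range_check(0x1234, 8) == -EINVAL);
}
```

Both accepted cases hand the importer a tag of the wrong kind, which is why a numeric bound alone is not enough to validate the tag.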
* Re: [PATCH v2 1/2] vfio: add dma-buf get_tph callback and DMA_BUF_TPH feature 2026-05-06 6:58 ` fengchengwen @ 2026-05-06 18:23 ` Zhiping Zhang 2026-05-13 6:31 ` Leon Romanovsky 0 siblings, 1 reply; 8+ messages in thread From: Zhiping Zhang @ 2026-05-06 18:23 UTC (permalink / raw) To: fengchengwen Cc: Alex Williamson, Jason Gunthorpe, Leon Romanovsky, Bjorn Helgaas, linux-rdma, linux-pci, netdev, dri-devel, Keith Busch, Yochai Cohen, Yishai Hadas, kvm On Tue, May 5, 2026 at 11:58 PM fengchengwen <fengchengwen@huawei•com> wrote: > > > > On 5/1/2026 4:06 AM, Zhiping Zhang wrote: > > Add a dma-buf callback that returns raw TPH metadata from the exporter > > so peer devices can reuse the steering tag and processing hint > > associated with a VFIO-exported buffer. > > > > Add a new VFIO_DEVICE_FEATURE_DMA_BUF_TPH ioctl that takes the fd from > > VFIO_DEVICE_FEATURE_DMA_BUF along with a steering tag and processing > > hint, validates the fd is a vfio-exported dma-buf belonging to this > > device, and stores the TPH values under memory_lock. This keeps the > > existing VFIO_DEVICE_FEATURE_DMA_BUF uAPI completely unchanged. > > > > The user sequences setting TPH on the dma-buf before the importer > > consumes it. > > > > Add an st_width parameter to get_tph() so the exporter can reject > > steering tags that exceed the consumer's supported width (8 vs 16 bit). > > When no TPH metadata was supplied, get_tph() returns -EOPNOTSUPP. 
> > > > Signed-off-by: Zhiping Zhang <zhipingz@meta•com> > > > > diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c > > --- a/drivers/vfio/pci/vfio_pci_core.c > > +++ b/drivers/vfio/pci/vfio_pci_core.c > > @@ -1534,6 +1534,9 @@ int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags, > > return vfio_pci_core_feature_token(vdev, flags, arg, argsz); > > case VFIO_DEVICE_FEATURE_DMA_BUF: > > return vfio_pci_core_feature_dma_buf(vdev, flags, arg, argsz); > > + case VFIO_DEVICE_FEATURE_DMA_BUF_TPH: > > + return vfio_pci_core_feature_dma_buf_tph(vdev, flags, arg, > > + argsz); > > default: > > return -ENOTTY; > > } > > diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c > > --- a/drivers/vfio/pci/vfio_pci_dmabuf.c > > +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c > > @@ -19,6 +19,9 @@ struct vfio_pci_dma_buf { > > u32 nr_ranges; > > struct kref kref; > > struct completion comp; > > + u16 steering_tag; > > + u8 ph; > > + u8 tph_present : 1; > > u8 revoked : 1; > > }; > > > > @@ -69,6 +72,22 @@ vfio_pci_dma_buf_map(struct dma_buf_attachment *attachment, > > return ret; > > } > > > > +static int vfio_pci_dma_buf_get_tph(struct dma_buf *dmabuf, u16 *steering_tag, > > + u8 *ph, u8 st_width) > > +{ > > + struct vfio_pci_dma_buf *priv = dmabuf->priv; > > + > > + if (!priv->tph_present) > > + return -EOPNOTSUPP; > > + > > + if (st_width < 16 && priv->steering_tag > ((1U << st_width) - 1)) > > + return -EINVAL; > > The checker will failed in following cases: > 1. If the exporter passed 8bit st, and importer support 16bit st, then it will pass > the checker. > 2. The exporter enabled 16bit st and its st is < 256 (note: the pcie protocol doesn't > restrict 16bit-st must >=256), and importer only support 8bit st, then it will also > pass the checker > > Suggest userspace passing both st(8bit) and extend-st(16bit), and importer chose the > right one. 
>

Agreed: 8-bit ST and 16-bit Extended ST are distinct namespaces (firmware
returns them as separate fields with separate validity bits), so a numeric
range check is insufficient. For v3 I'll change the uAPI to carry both,
gated by a flags field:

	#define VFIO_DMA_BUF_TPH_ST	(1 << 0) /* steering_tag valid */
	#define VFIO_DMA_BUF_TPH_ST_EXT	(1 << 1) /* steering_tag_ext valid */

	struct vfio_device_feature_dma_buf_tph {
		__s32 dmabuf_fd;
		__u32 flags;
		__u16 steering_tag;	/* 8-bit ST */
		__u16 steering_tag_ext;	/* 16-bit Extended ST */
		__u8 ph;
		__u8 reserved[3];
	};

get_tph() then picks the field matching the importer's st_width and
returns -EOPNOTSUPP if that one isn't valid.

Thanks,
Zhiping

^ permalink raw reply	[flat|nested] 8+ messages in thread
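
The selection rule described for v3 (pick the field matching the importer's st_width, return -EOPNOTSUPP if that field is not valid) could look roughly like the userspace model below. This is a sketch under assumptions, not the v3 patch: struct tph_state and tph_select() are hypothetical names, the flag names are taken from the proposal in this thread, and returning -EINVAL for a width other than 8 or 16 is an assumption the thread does not specify.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Flag names as proposed in the thread for v3. */
#define VFIO_DMA_BUF_TPH_ST	(1 << 0) /* steering_tag valid */
#define VFIO_DMA_BUF_TPH_ST_EXT	(1 << 1) /* steering_tag_ext valid */

/* Hypothetical exporter-side state; mirrors the proposed uAPI fields. */
struct tph_state {
	uint32_t flags;
	uint16_t steering_tag;		/* 8-bit ST */
	uint16_t steering_tag_ext;	/* 16-bit Extended ST */
	uint8_t ph;
};

/* Hypothetical model of a flag-gated get_tph(). */
static int tph_select(const struct tph_state *tph, uint8_t st_width,
		      uint16_t *steering_tag, uint8_t *ph)
{
	assert(tph && steering_tag && ph);

	switch (st_width) {
	case 8:
		if (!(tph->flags & VFIO_DMA_BUF_TPH_ST))
			return -EOPNOTSUPP;
		*steering_tag = tph->steering_tag;
		break;
	case 16:
		if (!(tph->flags & VFIO_DMA_BUF_TPH_ST_EXT))
			return -EOPNOTSUPP;
		*steering_tag = tph->steering_tag_ext;
		break;
	default:
		/* Assumption: other widths are rejected outright. */
		return -EINVAL;
	}

	*ph = tph->ph;
	return 0;
}
```

With this shape the importer never receives a tag from the other namespace: an 8-bit importer only ever sees steering_tag, and only when userspace marked it valid.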
* Re: [PATCH v2 1/2] vfio: add dma-buf get_tph callback and DMA_BUF_TPH feature
  2026-05-06 18:23 ` Zhiping Zhang
@ 2026-05-13 6:31 ` Leon Romanovsky
  2026-05-14 5:52 ` Zhiping Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: Leon Romanovsky @ 2026-05-13 6:31 UTC
To: Zhiping Zhang
Cc: fengchengwen, Alex Williamson, Jason Gunthorpe, Bjorn Helgaas,
	linux-rdma, linux-pci, netdev, dri-devel, Keith Busch,
	Yochai Cohen, Yishai Hadas, kvm

On Wed, May 06, 2026 at 11:23:19AM -0700, Zhiping Zhang wrote:
> On Tue, May 5, 2026 at 11:58 PM fengchengwen <fengchengwen@huawei•com> wrote:
> >
> >
> > On 5/1/2026 4:06 AM, Zhiping Zhang wrote:
> > > Add a dma-buf callback that returns raw TPH metadata from the exporter
> > > so peer devices can reuse the steering tag and processing hint
> > > associated with a VFIO-exported buffer.
> > >
> > > Add a new VFIO_DEVICE_FEATURE_DMA_BUF_TPH ioctl that takes the fd from
> > > VFIO_DEVICE_FEATURE_DMA_BUF along with a steering tag and processing
> > > hint, validates the fd is a vfio-exported dma-buf belonging to this
> > > device, and stores the TPH values under memory_lock. This keeps the
> > > existing VFIO_DEVICE_FEATURE_DMA_BUF uAPI completely unchanged.
> > >
> > > The user sequences setting TPH on the dma-buf before the importer
> > > consumes it.
> > >
> > > Add an st_width parameter to get_tph() so the exporter can reject
> > > steering tags that exceed the consumer's supported width (8 vs 16 bit).
> > > When no TPH metadata was supplied, get_tph() returns -EOPNOTSUPP.
> > >
> > > Signed-off-by: Zhiping Zhang <zhipingz@meta•com>
> > >
> > > diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> > > --- a/drivers/vfio/pci/vfio_pci_core.c
> > > +++ b/drivers/vfio/pci/vfio_pci_core.c
> > > @@ -1534,6 +1534,9 @@ int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
> > >  		return vfio_pci_core_feature_token(vdev, flags, arg, argsz);
> > >  	case VFIO_DEVICE_FEATURE_DMA_BUF:
> > >  		return vfio_pci_core_feature_dma_buf(vdev, flags, arg, argsz);
> > > +	case VFIO_DEVICE_FEATURE_DMA_BUF_TPH:
> > > +		return vfio_pci_core_feature_dma_buf_tph(vdev, flags, arg,
> > > +							 argsz);
> > >  	default:
> > >  		return -ENOTTY;
> > >  	}
> > > diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
> > > --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> > > +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> > > @@ -19,6 +19,9 @@ struct vfio_pci_dma_buf {
> > >  	u32 nr_ranges;
> > >  	struct kref kref;
> > >  	struct completion comp;
> > > +	u16 steering_tag;
> > > +	u8 ph;
> > > +	u8 tph_present : 1;
> > >  	u8 revoked : 1;
> > >  };
> > >
> > > @@ -69,6 +72,22 @@ vfio_pci_dma_buf_map(struct dma_buf_attachment *attachment,
> > >  	return ret;
> > >  }
> > >
> > > +static int vfio_pci_dma_buf_get_tph(struct dma_buf *dmabuf, u16 *steering_tag,
> > > +				    u8 *ph, u8 st_width)
> > > +{
> > > +	struct vfio_pci_dma_buf *priv = dmabuf->priv;
> > > +
> > > +	if (!priv->tph_present)
> > > +		return -EOPNOTSUPP;
> > > +
> > > +	if (st_width < 16 && priv->steering_tag > ((1U << st_width) - 1))
> > > +		return -EINVAL;
> >
> > The checker will failed in following cases:
> > 1. If the exporter passed 8bit st, and importer support 16bit st, then it will pass
> >    the checker.
> > 2. The exporter enabled 16bit st and its st is < 256 (note: the pcie protocol doesn't
> >    restrict 16bit-st must >=256), and importer only support 8bit st, then it will also
> >    pass the checker
> >
> > Suggest userspace passing both st(8bit) and extend-st(16bit), and importer chose the
> > right one.
> >
>
> Agreed — 8-bit ST and 16-bit Extended ST are distinct namespaces
> (firmware returns them as separate fields with separate validity bits),
> so a numeric range check is insufficient.
> For v3 I'll change the uAPI to carry both, gated by a flags field:
>
>   #define VFIO_DMA_BUF_TPH_ST      (1 << 0)  /* steering_tag valid */
>   #define VFIO_DMA_BUF_TPH_ST_EXT  (1 << 1)  /* steering_tag_ext valid */
>   struct vfio_device_feature_dma_buf_tph {
>       __s32 dmabuf_fd;
>       __u32 flags;
>       __u16 steering_tag;      /* 8-bit ST */
>       __u16 steering_tag_ext;  /* 16-bit Extended ST */

I wonder whether `steering_tag` and `steering_tag_ext` can coexist
and hold different values at the same time.

BTW, please send your patches with diffstat.

Thanks

>       __u8 ph;
>       __u8 reserved[3];
>   };
>
> get_tph() then picks the field matching the importer's st_width and
> returns -EOPNOTSUPP if that one isn't valid.
>
> Thanks,
> Zhiping
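
The two cases fengchengwen raises in the message above can be reproduced outside the kernel. The following is a minimal userspace sketch, assuming only the range expression from the quoted v2 hunk; the function name `v2_width_check` is hypothetical and nothing here is kernel code.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Userspace copy of the numeric range test from the quoted v2
 * get_tph() hunk. The name v2_width_check is hypothetical.
 * The check only knows the tag's value, not whether the tag was
 * produced as an 8-bit ST or as a 16-bit Extended ST.
 */
static int v2_width_check(uint16_t steering_tag, uint8_t st_width)
{
	if (st_width < 16 && steering_tag > ((1U << st_width) - 1))
		return -EINVAL;
	return 0;
}
```

With this check, an 8-bit ST of 0x2a offered to a 16-bit importer is accepted (case 1), and a 16-bit Extended ST of 0x2a offered to an 8-bit importer is accepted as well (case 2). Only a value above 255, such as 0x1234, is rejected for an 8-bit importer.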
* Re: [PATCH v2 1/2] vfio: add dma-buf get_tph callback and DMA_BUF_TPH feature
  2026-05-13 6:31 ` Leon Romanovsky
@ 2026-05-14 5:52 ` Zhiping Zhang
  0 siblings, 0 replies; 8+ messages in thread
From: Zhiping Zhang @ 2026-05-14 5:52 UTC
To: Leon Romanovsky
Cc: fengchengwen, Alex Williamson, Jason Gunthorpe, Bjorn Helgaas,
	linux-rdma, linux-pci, netdev, dri-devel, Keith Busch,
	Yochai Cohen, Yishai Hadas, kvm

On Tue, May 12, 2026 at 11:31 PM Leon Romanovsky <leon@kernel•org> wrote:
...
> >   #define VFIO_DMA_BUF_TPH_ST      (1 << 0)  /* steering_tag valid */
> >   #define VFIO_DMA_BUF_TPH_ST_EXT  (1 << 1)  /* steering_tag_ext valid */
> >   struct vfio_device_feature_dma_buf_tph {
> >       __s32 dmabuf_fd;
> >       __u32 flags;
> >       __u16 steering_tag;      /* 8-bit ST */
> >       __u16 steering_tag_ext;  /* 16-bit Extended ST */
>
> I wonder whether `steering_tag` and `steering_tag_ext` can coexist
> and hold different values at the same time.
>
> BTW, please send your patches with diffstat.
>
> Thanks

Yes, firmware can report both `steering_tag` (8-bit ST) and
`steering_tag_ext` (16-bit Extended ST), and the two values may differ.
An importer can consume only the one matching its width. I’ll clarify
that in the next revision.

I’ll also make sure the next posting is sent with diffstat.

Thanks,
Zhiping
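
The flag-gated selection discussed above could look roughly like the sketch below. It is a userspace illustration only: the two flag macros reuse the names from the v3 proposal in this thread, while `struct tph_meta` and `tph_pick()` are hypothetical names, and the choice to return -EINVAL for a width other than 8 or 16 is an assumption that the thread does not state.

```c
#include <errno.h>
#include <stdint.h>

/* Flag names follow the v3 proposal quoted in this thread. */
#define VFIO_DMA_BUF_TPH_ST      (1u << 0) /* steering_tag valid */
#define VFIO_DMA_BUF_TPH_ST_EXT  (1u << 1) /* steering_tag_ext valid */

/* Hypothetical exporter-side storage for the TPH metadata. */
struct tph_meta {
	uint32_t flags;
	uint16_t steering_tag;     /* 8-bit ST */
	uint16_t steering_tag_ext; /* 16-bit Extended ST */
	uint8_t  ph;
};

/*
 * Hypothetical helper: pick the tag that matches the importer's width.
 * Returns -EOPNOTSUPP when the matching field was not supplied, so an
 * 8-bit tag is never handed to a 16-bit importer or vice versa.
 */
static int tph_pick(const struct tph_meta *m, uint8_t st_width,
		    uint16_t *steering_tag, uint8_t *ph)
{
	switch (st_width) {
	case 8:
		if (!(m->flags & VFIO_DMA_BUF_TPH_ST))
			return -EOPNOTSUPP;
		*steering_tag = m->steering_tag;
		break;
	case 16:
		if (!(m->flags & VFIO_DMA_BUF_TPH_ST_EXT))
			return -EOPNOTSUPP;
		*steering_tag = m->steering_tag_ext;
		break;
	default:
		return -EINVAL; /* assumed handling of unknown widths */
	}
	*ph = m->ph;
	return 0;
}
```

Because validity is carried by the flags rather than inferred from the numeric value, both tags can be present with different values and each importer still receives only the one it can use.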
[parent not found: <20260430200704.352228-3-zhipingz@meta.com>]
* Re: [PATCH v2 2/2] RDMA/mlx5: get tph for p2p access when registering dma-buf mr
  [not found] ` <20260430200704.352228-3-zhipingz@meta.com>
@ 2026-05-06 7:04 ` fengchengwen
  2026-05-06 18:13 ` Zhiping Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: fengchengwen @ 2026-05-06 7:04 UTC
To: Zhiping Zhang, Alex Williamson, Jason Gunthorpe, Leon Romanovsky
Cc: Bjorn Helgaas, linux-rdma, linux-pci, netdev, dri-devel,
	Keith Busch, Yochai Cohen, Yishai Hadas

On 5/1/2026 4:06 AM, Zhiping Zhang wrote:
> Query dma-buf TPH metadata when registering a dma-buf MR for peer to
> peer access and translate the raw steering tag into an mlx5 steering tag
> index. Factor mlx5_st_alloc_index() so callers that already have a raw
> steering tag can allocate the corresponding mlx5 index directly. Keep the
> DMAH path as the first priority and only fall back to dma-buf metadata when
> no DMAH is supplied.
>
> Pass the device's supported ST width (8 or 16 bit, derived from
> pdev->tph_req_type) to get_tph() so the exporter can reject tags that
> exceed the consumer's capability. Initialize ret in mlx5_st_create() so the
> cached steering-tag path returns success cleanly under clang builds.
>
> Signed-off-by: Zhiping Zhang <zhipingz@meta•com>
>
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -46,6 +46,8 @@
>  #include "data_direct.h"
>  #include "dmah.h"
>
> +MODULE_IMPORT_NS("DMA_BUF");
> +
>  static int mkey_max_umr_order(struct mlx5_ib_dev *dev)
>  {
>  	if (MLX5_CAP_GEN(dev->mdev, umr_extended_translation_offset))
> @@ -899,6 +901,40 @@ static struct dma_buf_attach_ops mlx5_ib_dmabuf_attach_ops = {
>  	.invalidate_mappings = mlx5_ib_dmabuf_invalidate_cb,
>  };
>
> +static void get_tph_mr_dmabuf(struct mlx5_ib_dev *dev, int fd, u16 *st_index,
> +			      u8 *ph)
> +{
> +	struct pci_dev *pdev = dev->mdev->pdev;
> +	struct dma_buf *dmabuf;
> +	u16 steering_tag;
> +	u8 st_width;
> +	int ret;
> +
> +	st_width = (pdev->tph_req_type == PCI_TPH_REQ_EXT_TPH) ? 16 : 8;

tph_req_type is defined under CONFIG_PCIE_TPH; how about adding a wrapper
function to query it?

> +
> +	dmabuf = dma_buf_get(fd);
> +	if (IS_ERR(dmabuf))
> +		return;
> +
> +	if (!dmabuf->ops->get_tph)
> +		goto end_dbuf_put;
> +
> +	ret = dmabuf->ops->get_tph(dmabuf, &steering_tag, ph, st_width);
> +	if (ret) {
> +		mlx5_ib_dbg(dev, "get_tph failed (%d)\n", ret);
> +		goto end_dbuf_put;
> +	}
> +
> +	ret = mlx5_st_alloc_index_by_tag(dev->mdev, steering_tag, st_index);
> +	if (ret) {
> +		*ph = MLX5_IB_NO_PH;
> +		mlx5_ib_dbg(dev, "st_alloc_index_by_tag failed (%d)\n", ret);
> +	}
> +
> +end_dbuf_put:
> +	dma_buf_put(dmabuf);
> +}
> +
>  static struct ib_mr *
>  reg_user_mr_dmabuf(struct ib_pd *pd, struct device *dma_device,
>  		   u64 offset, u64 length, u64 virt_addr,
> @@ -941,6 +977,8 @@ reg_user_mr_dmabuf(struct ib_pd *pd, struct device *dma_device,
>  			ph = dmah->ph;
>  		if (dmah->valid_fields & BIT(IB_DMAH_CPU_ID_EXISTS))
>  			st_index = mdmah->st_index;
> +	} else {
> +		get_tph_mr_dmabuf(dev, fd, &st_index, &ph);
>  	}
>
>  	mr = alloc_cacheable_mr(pd, &umem_dmabuf->umem, virt_addr,
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/st.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/st.c
> --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/st.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/st.c
> @@ -29,7 +29,7 @@ struct mlx5_st *mlx5_st_create(struct mlx5_core_dev *dev)
>  	u8 direct_mode = 0;
>  	u16 num_entries;
>  	u32 tbl_loc;
> -	int ret;
> +	int ret = 0;
>
>  	if (!MLX5_CAP_GEN(dev, mkey_pcie_tph))
>  		return NULL;
> @@ -92,23 +92,18 @@ void mlx5_st_destroy(struct mlx5_core_dev *dev)
>  	kfree(st);
>  }
>
> -int mlx5_st_alloc_index(struct mlx5_core_dev *dev, enum tph_mem_type mem_type,
> -			unsigned int cpu_uid, u16 *st_index)
> +int mlx5_st_alloc_index_by_tag(struct mlx5_core_dev *dev, u16 tag,
> +			       u16 *st_index)
>  {
>  	struct mlx5_st_idx_data *idx_data;
>  	struct mlx5_st *st = dev->st;
>  	unsigned long index;
>  	u32 xa_id;
> -	u16 tag;
> -	int ret;
> +	int ret = 0;
>
>  	if (!st)
>  		return -EOPNOTSUPP;
>
> -	ret = pcie_tph_get_cpu_st(dev->pdev, mem_type, cpu_uid, &tag);
> -	if (ret)
> -		return ret;
> -
>  	if (st->direct_mode) {
>  		*st_index = tag;
>  		return 0;
> @@ -152,6 +147,20 @@ int mlx5_st_alloc_index(struct mlx5_core_dev *dev, enum tph_mem_type mem_type,
>  	mutex_unlock(&st->lock);
>  	return ret;
>  }
> +EXPORT_SYMBOL_GPL(mlx5_st_alloc_index_by_tag);
> +
> +int mlx5_st_alloc_index(struct mlx5_core_dev *dev, enum tph_mem_type mem_type,
> +			unsigned int cpu_uid, u16 *st_index)
> +{
> +	u16 tag;
> +	int ret;
> +
> +	ret = pcie_tph_get_cpu_st(dev->pdev, mem_type, cpu_uid, &tag);
> +	if (ret)
> +		return ret;
> +
> +	return mlx5_st_alloc_index_by_tag(dev, tag, st_index);
> +}
>  EXPORT_SYMBOL_GPL(mlx5_st_alloc_index);
>
>  int mlx5_st_dealloc_index(struct mlx5_core_dev *dev, u16 st_index)
> diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
> --- a/include/linux/mlx5/driver.h
> +++ b/include/linux/mlx5/driver.h
> @@ -1166,10 +1166,17 @@ int mlx5_dm_sw_icm_dealloc(struct mlx5_core_dev *dev, enum mlx5_sw_icm_type type
>  			   u64 length, u16 uid, phys_addr_t addr, u32 obj_id);
>
>  #ifdef CONFIG_PCIE_TPH
> +int mlx5_st_alloc_index_by_tag(struct mlx5_core_dev *dev, u16 tag,
> +			       u16 *st_index);
>  int mlx5_st_alloc_index(struct mlx5_core_dev *dev, enum tph_mem_type mem_type,
>  			unsigned int cpu_uid, u16 *st_index);
>  int mlx5_st_dealloc_index(struct mlx5_core_dev *dev, u16 st_index);
>  #else
> +static inline int mlx5_st_alloc_index_by_tag(struct mlx5_core_dev *dev,
> +					     u16 tag, u16 *st_index)
> +{
> +	return -EOPNOTSUPP;
> +}
>  static inline int mlx5_st_alloc_index(struct mlx5_core_dev *dev,
>  				      enum tph_mem_type mem_type,
>  				      unsigned int cpu_uid, u16 *st_index)
>
* Re: [PATCH v2 2/2] RDMA/mlx5: get tph for p2p access when registering dma-buf mr
  2026-05-06 7:04 ` [PATCH v2 2/2] RDMA/mlx5: get tph for p2p access when registering dma-buf mr fengchengwen
@ 2026-05-06 18:13 ` Zhiping Zhang
  0 siblings, 0 replies; 8+ messages in thread
From: Zhiping Zhang @ 2026-05-06 18:13 UTC
To: fengchengwen
Cc: Alex Williamson, Jason Gunthorpe, Leon Romanovsky, Bjorn Helgaas,
	linux-rdma, linux-pci, netdev, dri-devel, Keith Busch,
	Yochai Cohen, Yishai Hadas, kvm

On Wed, May 6, 2026 at 12:04 AM fengchengwen <fengchengwen@huawei•com> wrote:
>
>
> On 5/1/2026 4:06 AM, Zhiping Zhang wrote:
> > Query dma-buf TPH metadata when registering a dma-buf MR for peer to
> > peer access and translate the raw steering tag into an mlx5 steering tag
> > index. Factor mlx5_st_alloc_index() so callers that already have a raw
> > steering tag can allocate the corresponding mlx5 index directly. Keep the
> > DMAH path as the first priority and only fall back to dma-buf metadata when
> > no DMAH is supplied.
> >
> > Pass the device's supported ST width (8 or 16 bit, derived from
> > pdev->tph_req_type) to get_tph() so the exporter can reject tags that
> > exceed the consumer's capability. Initialize ret in mlx5_st_create() so the
> > cached steering-tag path returns success cleanly under clang builds.
> >
> > Signed-off-by: Zhiping Zhang <zhipingz@meta•com>
> >
> > diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> > --- a/drivers/infiniband/hw/mlx5/mr.c
> > +++ b/drivers/infiniband/hw/mlx5/mr.c
> > @@ -46,6 +46,8 @@
> >  #include "data_direct.h"
> >  #include "dmah.h"
> >
> > +MODULE_IMPORT_NS("DMA_BUF");
> > +
> >  static int mkey_max_umr_order(struct mlx5_ib_dev *dev)
> >  {
> >  	if (MLX5_CAP_GEN(dev->mdev, umr_extended_translation_offset))
> > @@ -899,6 +901,40 @@ static struct dma_buf_attach_ops mlx5_ib_dmabuf_attach_ops = {
> >  	.invalidate_mappings = mlx5_ib_dmabuf_invalidate_cb,
> >  };
> >
> > +static void get_tph_mr_dmabuf(struct mlx5_ib_dev *dev, int fd, u16 *st_index,
> > +			      u8 *ph)
> > +{
> > +	struct pci_dev *pdev = dev->mdev->pdev;
> > +	struct dma_buf *dmabuf;
> > +	u16 steering_tag;
> > +	u8 st_width;
> > +	int ret;
> > +
> > +	st_width = (pdev->tph_req_type == PCI_TPH_REQ_EXT_TPH) ? 16 : 8;
>
> The tph_req_type is defined under CONFIG_PCIE_TPH, how about add a wrap function
> to query it.
>

Good catch! The direct dereference here will break the build when TPH
is disabled. I'll add a small wrapper in include/linux/pci-tph.h
alongside the existing helpers, e.g.:

    #ifdef CONFIG_PCIE_TPH
    u8 pcie_tph_get_st_width(struct pci_dev *pdev);
    #else
    static inline u8 pcie_tph_get_st_width(struct pci_dev *pdev)
    { return 0; }
    #endif

with the implementation in drivers/pci/pcie/tph.c returning 16 for
PCI_TPH_REQ_EXT_TPH and 8 otherwise. Then get_tph_mr_dmabuf() becomes:

    st_width = pcie_tph_get_st_width(pdev);
    if (!st_width)
            goto end_dbuf_put;

which also gives us a clean early-out when TPH isn't supported on the
device. Will fix in v3.

Thanks,
Zhiping
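
The wrapper pattern proposed above can be tried outside the kernel with mock types. The sketch below is an assumption-laden illustration: `struct mock_pci_dev`, the `MOCK_TPH_REQ_*` constants, `mock_tph_get_st_width()` and `importer_st_width()` are all hypothetical stand-ins, and `CONFIG_PCIE_TPH` is defined by hand here only to select the non-stub branch.

```c
#include <stdint.h>

/* Defined by hand for this sketch; remove it to exercise the stub branch. */
#define CONFIG_PCIE_TPH 1

/* Mock requester-type values; hypothetical stand-ins for the kernel's. */
enum { MOCK_TPH_REQ_DISABLE = 0, MOCK_TPH_REQ_TPH_ONLY = 1, MOCK_TPH_REQ_EXT_TPH = 3 };

/* Mock device: the field only exists when the option is enabled. */
struct mock_pci_dev {
#ifdef CONFIG_PCIE_TPH
	uint8_t tph_req_type;
#endif
	int unused;
};

#ifdef CONFIG_PCIE_TPH
/* 16 for the extended requester type and 8 otherwise, as described above. */
static inline uint8_t mock_tph_get_st_width(const struct mock_pci_dev *pdev)
{
	return pdev->tph_req_type == MOCK_TPH_REQ_EXT_TPH ? 16 : 8;
}
#else
/* Stub: callers never touch the conditional field, so they still build. */
static inline uint8_t mock_tph_get_st_width(const struct mock_pci_dev *pdev)
{
	(void)pdev;
	return 0;
}
#endif

/* Caller side: a width of 0 means "skip TPH", mirroring the early-out. */
static int importer_st_width(const struct mock_pci_dev *pdev, uint8_t *st_width)
{
	*st_width = mock_tph_get_st_width(pdev);
	return *st_width ? 0 : -1;
}
```

The caller compiles identically in both configurations because it never names the conditional field. Note that, with "8 otherwise" taken literally, this sketch returns 0 only from the stub; whether the real helper should also return 0 for a device with TPH disabled is left open in the thread.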
Thread overview: 8+ messages
[not found] <20260430200704.352228-1-zhipingz@meta.com>
[not found] ` <20260430200704.352228-2-zhipingz@meta.com>
2026-05-04 21:44 ` [PATCH v2 1/2] vfio: add dma-buf get_tph callback and DMA_BUF_TPH feature Alex Williamson
2026-05-05 6:54 ` Zhiping Zhang
2026-05-06 6:58 ` fengchengwen
2026-05-06 18:23 ` Zhiping Zhang
2026-05-13 6:31 ` Leon Romanovsky
2026-05-14 5:52 ` Zhiping Zhang
[not found] ` <20260430200704.352228-3-zhipingz@meta.com>
2026-05-06 7:04 ` [PATCH v2 2/2] RDMA/mlx5: get tph for p2p access when registering dma-buf mr fengchengwen
2026-05-06 18:13 ` Zhiping Zhang