From: Alexey Kardashevskiy <aik@au1•ibm.com>
To: Wei Yang <weiyang@linux•vnet.ibm.com>
Cc: linuxppc-dev@lists•ozlabs.org, gwshan@linux•vnet.ibm.com
Subject: Re: [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device()
Date: Tue, 29 Apr 2014 17:55:48 +1000 [thread overview]
Message-ID: <535F5B04.5000102@au1.ibm.com> (raw)
In-Reply-To: <20140429064955.GA5066@richard>
On 04/29/2014 04:49 PM, Wei Yang wrote:
> On Mon, Apr 28, 2014 at 11:35:32PM +1000, Alexey Kardashevskiy wrote:
>> On 04/23/2014 12:26 PM, Wei Yang wrote:
>>> During the EEH hotplug event, iommu_add_device() will be invoked three times
>>> and two of them will trigger warning or error.
>>>
>>> The three times to invoke the iommu_add_device() are:
>>>
>>> pci_device_add
>>> ...
>>> set_iommu_table_base_and_group <- 1st time, fail
>>> device_add
>>> ...
>>> tce_iommu_bus_notifier <- 2nd time, succees
>>> pcibios_add_pci_devices
>>> ...
>>> pcibios_setup_bus_devices <- 3rd time, re-attach
>>>
>>> The first time fails, since the dev->kobj->sd is not initialized. The
>>> dev->kobj->sd is initialized in device_add().
>>> The third time's warning is triggered by the re-attach of the iommu_group.
>>>
>>> After applying this patch, the error
>>
>> Nack.
>>
>> The patch still seems incorrect and we actually need to remove
>> tce_iommu_bus_notifier completely as pcibios_setup_bus_devices is called
>>from another notifier anyway. Could you please test it?
>>
>>
>
> Hi, Alexey,
>
> Nice to see your comment. Let me show what I got fist.
>
> Generally, when kernel enumerate on the pci device, following functions will
> be invoked.
>
> pci_device_add
> pcibios_setup_bus_device
> ...
> set_iommu_table_base_and_group
> device_add
> ...
> tce_iommu_bus_notifier
> pcibios_fixp_bus
> pcibios_add_pci_devices
> ...
> pcibios_setup_bus_devices
>
>>From the call flow, we see for a normall pci device, the
> pcibios_setup_bus_device() will be invoked twice.
tce_iommu_bus_notifier should not be called for devices during boot-time
PCI discovery as the discovery is done from subsys_initcall handler and
tce_iommu_bus_notifier is registered as subsys_initcall_sync. Are you sure
this is happening? You should see warnings as for PF's EEH but you do not
say that.
> At the bootup time, none of them succeed to setup the dma, since the PE is not
> assigned or the tbl is not set. The iommu tbl and group is setup in
> pnv_pci_ioda_setup_DMA().
>
> This call flow maintains the same when EEH error happens on Bus PE, while the
> behavior is a little different.
>
> pci_device_add
> pcibios_setup_bus_device
> ...
> set_iommu_table_base_and_group <- fail, kobj->sd is not initialized
Sorry, what is uninitialized? The only kobj here is the one in iommu_group
and it must have been taken care of before adding a device.
> device_add
> ...
> tce_iommu_bus_notifier <- succeed
> pcibios_fixp_bus
> pcibios_add_pci_devices
> ...
> pcibios_setup_bus_devices <- warning, re-attach
This is why I am suggesting getting rid of tce_iommu_bus_notifier - we will
avoid the warning.
> While this call flow will change a little on a VF. For a VF,
> pcibios_fixp_bus() will not be invoked. Current behavior is this.
You meant pcibios_fixup_bus? I would expect it to get called (from
pci_rescan_bus() or something like that?) and this seems to be the problem
here.
How are VFs added in terms of code flow? Is there any git tree to look at?
> pci_device_add
> pcibios_setup_bus_device
> ...
> set_iommu_table_base_and_group <- fail, kobj->sd is not initialized
> device_add
> ...
> tce_iommu_bus_notifier <- succeed
>
> And if an EEH error happens just on a VF, I believe the same call flow should
> go for recovery. (This is not set down, still under discussion with Gavin.)
>
> My conclusion is:
> 1. remove the tce_iommu_bus_notifier completely will make the VF not work.
> So I choose to revert the code and attach make the iommu group attachment
> just happens in one place.
>
> BTW, I know my patch is not a perfect one. For a PF, the tbl will still be set
> twice. I am not sure why we need to invoke pcibios_setup_device() twice on a
> PF, maybe some platform need to fixup some thing after the pci_bus is added.
> So I don't remove one of them to solve the problem.
>
> If you have a better idea, I am glad to take it.
>
--
Alexey Kardashevskiy
IBM OzLabs, LTC Team
e-mail: aik@au1•ibm.com
notes: Alexey Kardashevskiy/Australia/IBM
next prev parent reply other threads:[~2014-04-29 7:55 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-23 2:26 [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device() Wei Yang
2014-04-23 2:26 ` [PATCH 2/2] powerpc/powernv: release the refcount for pci_dev Wei Yang
2014-04-28 13:35 ` [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device() Alexey Kardashevskiy
2014-04-29 6:49 ` Wei Yang
2014-04-29 7:55 ` Alexey Kardashevskiy [this message]
2014-04-29 9:37 ` Wei Yang
2014-04-29 13:11 ` Alexey Kardashevskiy
2014-04-30 1:31 ` Wei Yang
2014-04-30 0:28 ` Gavin Shan
2014-04-30 3:28 ` Wei Yang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=535F5B04.5000102@au1.ibm.com \
--to=aik@au1$(echo .)ibm.com \
--cc=gwshan@linux$(echo .)vnet.ibm.com \
--cc=linuxppc-dev@lists$(echo .)ozlabs.org \
--cc=weiyang@linux$(echo .)vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox