From: Zhu Yanjun <yanjun.zhu@linux•dev>
To: Xuan Zhuo <xuanzhuo@linux•alibaba.com>, Jason Wang <jasowang@redhat•com>
Cc: Andrew Lunn <andrew@lunn•ch>, Heng Qi <hengqi@linux•alibaba.com>,
Paolo Abeni <pabeni@redhat•com>,
Zhu Yanjun <yanjun.zhu@intel•com>,
mst@redhat•com, davem@davemloft•net, edumazet@google•com,
kuba@kernel•org, virtualization@lists•linux.dev,
netdev@vger•kernel.org
Subject: Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
Date: Fri, 26 Jan 2024 11:13:48 +0800 [thread overview]
Message-ID: <68ebd92e-cf7a-43af-af71-386495606d90@linux.dev> (raw)
In-Reply-To: <e46d04d7-4eb7-4fcd-821a-d558c07531b7@linux.dev>
在 2024/1/26 11:11, Zhu Yanjun 写道:
>
>
> 在 2024/1/22 15:02, Xuan Zhuo 写道:
>> On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang<jasowang@redhat•com> wrote:
>>> On Mon, Jan 22, 2024 at 2:55 PM Jason Wang<jasowang@redhat•com> wrote:
>>>> On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo<xuanzhuo@linux•alibaba.com> wrote:
>>>>> On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang<jasowang@redhat•com> wrote:
>>>>>> On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo<xuanzhuo@linux•alibaba.com> wrote:
>>>>>>> On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang<jasowang@redhat•com> wrote:
>>>>>>>> On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun<yanjun.zhu@linux•dev> wrote:
>>>>>>>>> 在 2024/1/20 1:29, Andrew Lunn 写道:
>>>>>>>>>>>>>> while (!virtqueue_get_buf(vi->cvq, &tmp) &&
>>>>>>>>>>>>>> - !virtqueue_is_broken(vi->cvq))
>>>>>>>>>>>>>> + !virtqueue_is_broken(vi->cvq)) {
>>>>>>>>>>>>>> + if (timeout)
>>>>>>>>>>>>>> + timeout--;
>>>>>>>>>>>>> This is not really a timeout, just a loop counter. 200 iterations could
>>>>>>>>>>>>> be a very short time on reasonable H/W. I guess this avoid the soft
>>>>>>>>>>>>> lockup, but possibly (likely?) breaks the functionality when we need to
>>>>>>>>>>>>> loop for some non negligible time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I fear we need a more complex solution, as mentioned by Micheal in the
>>>>>>>>>>>>> thread you quoted.
>>>>>>>>>>>> Got it. I also look forward to the more complex solution to this problem.
>>>>>>>>>>> Can we add a device capability (new feature bit) such as ctrq_wait_timeout
>>>>>>>>>>> to get a reasonable timeout?
>>>>>>>>>> The usual solution to this is include/linux/iopoll.h. If you can sleep
>>>>>>>>>> read_poll_timeout() otherwise read_poll_timeout_atomic().
>>>>>>>>> I read carefully the functions read_poll_timeout() and
>>>>>>>>> read_poll_timeout_atomic(). The timeout is set by the caller of the 2
>>>>>>>>> functions.
>>>>>>>> FYI, in order to avoid a swtich of atomic or not, we need convert rx
>>>>>>>> mode setting to workqueue first:
>>>>>>>>
>>>>>>>> https://www.mail-archive.com/virtualization@lists.linux-foundation.org/msg60298.html
>>>>>>>>
>>>>>>>>> As such, can we add a module parameter to customize this timeout value
>>>>>>>>> by the user?
>>>>>>>> Who is the "user" here, or how can the "user" know the value?
>>>>>>>>
>>>>>>>>> Or this timeout value is stored in device register, virtio_net driver
>>>>>>>>> will read this timeout value at initialization?
>>>>>>>> See another thread. The design needs to be general, or you can post a RFC.
>>>>>>>>
>>>>>>>> In another thought, we've already had a tx watchdog, maybe we can have
>>>>>>>> something similar to cvq and use timeout + reset in that case.
>>>>>>> But we may block by the reset ^_^ if the device is broken?
>>>>>> I mean vq reset here.
>>>>> I see.
>>>>>
>>>>> I mean when the deivce is broken, the vq reset also many be blocked.
>>>>>
>>>>> void vp_modern_set_queue_reset(struct virtio_pci_modern_device *mdev, u16 index)
>>>>> {
>>>>> struct virtio_pci_modern_common_cfg __iomem *cfg;
>>>>>
>>>>> cfg = (struct virtio_pci_modern_common_cfg __iomem *)mdev->common;
>>>>>
>>>>> vp_iowrite16(index, &cfg->cfg.queue_select);
>>>>> vp_iowrite16(1, &cfg->queue_reset);
>>>>>
>>>>> while (vp_ioread16(&cfg->queue_reset))
>>>>> msleep(1);
>>>>>
>>>>> while (vp_ioread16(&cfg->cfg.queue_enable))
>>>>> msleep(1);
>>>>> }
>>>>> EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
>>>>>
>>>>> In this function, for the broken device, we can not expect something.
>>>> Yes, it's best effort, there's no guarantee then. But it doesn't harm to try.
>>>>
>>>> Thanks
>>>>
>>>>>> It looks like we have multiple goals here
>>>>>>
>>>>>> 1) avoid lockups, using workqueue + cond_resched() seems to be
>>>>>> sufficient, it has issue but nothing new
>>>>>> 2) recover from the unresponsive device, the issue for timeout is that
>>>>>> it needs to deal with false positives
>>>>> I agree.
>>>>>
>>>>> But I want to add a new goal, cvq async. In the netdim, we will
>>>>> send many requests via the cvq, so the cvq async will be nice.
>>> Then you need an interrupt for cvq.
>>>
>>> FYI, I've posted a series that use interrupt for cvq in the past:
>>>
>>> https://lore.kernel.org/lkml/6026e801-6fda-fee9-a69b-d06a80368621@redhat.com/t/
>> I know this. But the interrupt maybe not a good solution without new space.
>>
>>> Haven't found time in working on this anymore, maybe we can start from
>>> this or not.
>> I said async, but my aim is to put many requests to the cvq before getting the
>> response.
>>
>> Heng Qi posted thishttps://lore.kernel.org/all/1705410693-118895-4-git-send-email-hengqi@linux.alibaba.com/
Sorry. This mail is rejected by netdev maillist. So I have to resend it.
Thanks a lot. I read Heng Qi's commits carefully. This patch series are
similiar with the NIC feature xmit_more.
But if cvq command is urgent, can we let this urgent cvq command be
passed ASAP?
I mean, can we set a flag similar to xmit_more? if cvq command is not
urgent, it can be queued. If it is urgent,
this cvq command is passed ASAP.
Zhu Yanjun
> Zhu Yanjun
>
>> Thanks.
>>
>>
>>> Thanks
>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>>
>>>>>>>> Thans
>>>>>>>>
>>>>>>>>> Zhu Yanjun
>>>>>>>>>
>>>>>>>>>> Andrew
next prev parent reply other threads:[~2024-01-26 3:13 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-15 1:29 [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang Zhu Yanjun
2024-01-15 2:20 ` Jason Wang
2024-01-15 10:25 ` Zhu Yanjun
[not found] ` <6cf2699a-483d-4124-9782-b6a771a41e70@linux.dev>
2024-01-22 3:06 ` Jason Wang
2024-01-16 12:04 ` Paolo Abeni
2024-01-18 12:01 ` Zhu Yanjun
2024-01-19 14:27 ` Heng Qi
2024-01-19 17:29 ` Andrew Lunn
2024-01-20 4:21 ` Zhu Yanjun
2024-01-22 2:12 ` Zhu Yanjun
2024-01-22 3:14 ` Jason Wang
2024-01-22 3:58 ` Xuan Zhuo
2024-01-22 4:16 ` Jason Wang
2024-01-22 6:16 ` Xuan Zhuo
2024-01-22 6:55 ` Jason Wang
2024-01-22 6:58 ` Jason Wang
2024-01-22 7:02 ` Xuan Zhuo
2024-01-22 7:19 ` Jason Wang
2024-01-22 7:25 ` Xuan Zhuo
2024-01-22 7:57 ` Jason Wang
2024-01-22 8:01 ` Xuan Zhuo
2024-01-22 8:32 ` Jason Wang
2024-01-22 9:11 ` Xuan Zhuo
[not found] ` <e46d04d7-4eb7-4fcd-821a-d558c07531b7@linux.dev>
2024-01-26 3:13 ` Zhu Yanjun [this message]
2024-01-22 7:01 ` Xuan Zhuo
2024-01-22 3:08 ` Jason Wang
2024-01-22 4:42 ` Heng Qi
2024-01-18 13:14 ` Michael S. Tsirkin
2024-01-19 1:42 ` Xuan Zhuo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=68ebd92e-cf7a-43af-af71-386495606d90@linux.dev \
--to=yanjun.zhu@linux$(echo .)dev \
--cc=andrew@lunn$(echo .)ch \
--cc=davem@davemloft$(echo .)net \
--cc=edumazet@google$(echo .)com \
--cc=hengqi@linux$(echo .)alibaba.com \
--cc=jasowang@redhat$(echo .)com \
--cc=kuba@kernel$(echo .)org \
--cc=mst@redhat$(echo .)com \
--cc=netdev@vger$(echo .)kernel.org \
--cc=pabeni@redhat$(echo .)com \
--cc=virtualization@lists$(echo .)linux.dev \
--cc=xuanzhuo@linux$(echo .)alibaba.com \
--cc=yanjun.zhu@intel$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox