public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: <Claudiu.Beznea@microchip•com>
To: <tomas.melin@vaisala•com>, <netdev@vger•kernel.org>
Cc: <Nicolas.Ferre@microchip•com>, <davem@davemloft•net>,
	<kuba@kernel•org>, <pabeni@redhat•com>
Subject: Re: [PATCH] net: macb: Restart tx only if queue pointer is lagging
Date: Fri, 25 Mar 2022 13:41:50 +0000	[thread overview]
Message-ID: <b643b825-e68a-875e-f4ac-edddae613705@microchip.com> (raw)
In-Reply-To: <7310fea8-8f55-a198-5317-a7ad95980beb@vaisala.com>

On 25.03.2022 11:35, Tomas Melin wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the
> content is safe
> 
> Hi,
> 
> On 25/03/2022 10:57, Claudiu.Beznea@microchip•com wrote:
>> On 25.03.2022 08:50, Tomas Melin wrote:
>>> [Some people who received this message don't often get email from
>>> tomas.melin@vaisala•com. Learn why this is important at
>>> http://aka.ms/LearnAboutSenderIdentification.]
>>>
>>> EXTERNAL EMAIL: Do not click links or open attachments unless you know
>>> the content is safe
>>>
>>> commit 5ea9c08a8692 ("net: macb: restart tx after tx used bit read")
>>> added support for restarting transmission. Restarting tx does not work
>>> in case controller asserts TXUBR interrupt and TQBP is already at the end
>>> of the tx queue. In that situation, restarting tx will immediately cause
>>> assertion of another TXUBR interrupt. The driver will end up in an infinite
>>> interrupt loop which it cannot break out of.
>>>
>>> For cases where TQBP is at the end of the tx queue, instead
>>> only clear TXUBR interrupt. As more data gets pushed to the queue,
>>> transmission will resume.
>>>
>>> This issue was observed on a Xilinx Zynq based board. During stress test of
>>> the network interface, driver would get stuck on interrupt loop
>>> within seconds or minutes causing CPU to stall.
>>>
>>> Signed-off-by: Tomas Melin <tomas.melin@vaisala•com>
>>> ---
>>>   drivers/net/ethernet/cadence/macb_main.c | 8 ++++++++
>>>   1 file changed, 8 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/cadence/macb_main.c
>>> b/drivers/net/ethernet/cadence/macb_main.c
>>> index 800d5ced5800..e475be29845c 100644
>>> --- a/drivers/net/ethernet/cadence/macb_main.c
>>> +++ b/drivers/net/ethernet/cadence/macb_main.c
>>> @@ -1658,6 +1658,7 @@ static void macb_tx_restart(struct macb_queue *queue)
>>>          unsigned int head = queue->tx_head;
>>>          unsigned int tail = queue->tx_tail;
>>>          struct macb *bp = queue->bp;
>>> +       unsigned int head_idx, tbqp;
>>>
>>>          if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
>>>                  queue_writel(queue, ISR, MACB_BIT(TXUBR));
>>> @@ -1665,6 +1666,13 @@ static void macb_tx_restart(struct macb_queue
>>> *queue)
>>>          if (head == tail)
>>>                  return;
>>>
>>> +       tbqp = queue_readl(queue, TBQP) / macb_dma_desc_get_size(bp);
>>> +       tbqp = macb_adj_dma_desc_idx(bp, macb_tx_ring_wrap(bp, tbqp));
>>> +       head_idx = macb_adj_dma_desc_idx(bp, macb_tx_ring_wrap(bp, head));
>>> +
>>> +       if (tbqp == head_idx)
>>> +               return;
>>> +
>>
>> This looks like TBQP is not advancing though there are packets in the
>> software queues (head != tail). Packets are added in the software queues on
>> TX path and removed when TX was done for them.
> 
> TBQP is at the end of the queue, and that matches with tx_head
> maintained by driver. So seems controller is happily at end marker,
> and when restarted immediately sees that end marker used tag and
> triggers an interrupt again.
> 
> Also when looking at the buffer descriptor memory it shows that all
> frames between tx_tail and tx_head have been marked as used.

I see. Controller sets TX_USED on the 1st descriptor of the transmitted
frame. If there were packets with one descriptor enqueued that should mean
controller did its job.

head != tail on software data structures when receiving TXUBR interrupt and
all descriptors in queue have TX_USED bit set might signal that  a
descriptor is not updated to CPU on TCOMP interrupt when CPU uses it and
thus driver doesn't treat a TCOMP interrupt. See the above code on
macb_tx_interrupt():

desc = macb_tx_desc(queue, tail);

/* Make hw descriptor updates visible to CPU */
rmb();

ctrl = desc->ctrl;

/* TX_USED bit is only set by hardware on the very first buffer
* descriptor of the transmitted frame.
*/

if (!(ctrl & MACB_BIT(TX_USED)))
	break;


> 
> GEM documentation says "transmission is restarted from
> the first buffer descriptor of the frame being transmitted when the
> transmit start bit is rewritten" but since all frames are already marked
> as transmitted, restarting wont help. Adding this additional check will
> help for the issue we have.
> 

I see but according to your description (all descriptors treated by
controller) if no packets are enqueued for TX after:

+       if (tbqp == head_idx)
+               return;
+

there are some SKBs that were correctly treated by controller but not freed
by software (they are freed on macb_tx_unmap() called from
macb_tx_interrupt()). They will be freed on next TCOMP interrupt for other
packets being transmitted.

> 
>>
>> Maybe TX_WRAP is missing on one TX descriptor? Few months ago while
>> investigating some other issues on this I found that this might be missed
>> on one descriptor [1] but haven't managed to make it break at that point
>> anyhow.
>>
>> Could you check on your side if this is solving your issue?
> 
> I have seen that we can get stuck at any location in the ring buffer, so
> this does not seem to be the case here. I can try though if it would
> have any effect.

I was thinking that having small packets there is high chance that TBQP to
not reach a descriptor with wrap bit set due to the code pointed in my
previous email.

Thank you,
Claudiu Beznea

> 
> thanks,
> Tomas
> 
> 
>>
>>       /* Set 'TX_USED' bit in buffer descriptor at tx_head position
>>        * to set the end of TX queue
>>        */
>>       i = tx_head;
>>       entry = macb_tx_ring_wrap(bp, i);
>>       ctrl = MACB_BIT(TX_USED);
>> +     if (entry == bp->tx_ring_size - 1)
>> +             ctrl |= MACB_BIT(TX_WRAP);
>>       desc = macb_tx_desc(queue, entry);
>>       desc->ctrl = ctrl;
>>
>> [1]
>> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Ftree%2Fdrivers%2Fnet%2Fethernet%2Fcadence%2Fmacb_main.c%23n1958&amp;data=04%7C01%7Ctomas.melin%40vaisala.com%7C2fe72e2a6a874b5279a708da0e3d7852%7C6d7393e041f54c2e9b124c2be5da5c57%7C0%7C0%7C637837954434714462%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=hijcfj3TnOxj12dhG0Q8d0AJNFNBJSxtEjOTkCoZThI%3D&amp;reserved=0
>>
>>
>>>          macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
>>>   }
>>>
>>> -- 
>>> 2.35.1
>>>
>>


  reply	other threads:[~2022-03-25 13:42 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-25  6:50 [PATCH] net: macb: Restart tx only if queue pointer is lagging Tomas Melin
2022-03-25  8:57 ` Claudiu.Beznea
2022-03-25  9:35   ` Tomas Melin
2022-03-25 13:41     ` Claudiu.Beznea [this message]
2022-03-25 14:41       ` Tomas Melin
2022-03-25 15:19         ` Claudiu.Beznea
2022-03-28  4:16           ` Melin Tomas
2022-03-30 14:27             ` Claudiu.Beznea
2022-03-31 10:08               ` Tomas Melin
2022-03-31 18:31 ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b643b825-e68a-875e-f4ac-edddae613705@microchip.com \
    --to=claudiu.beznea@microchip$(echo .)com \
    --cc=Nicolas.Ferre@microchip$(echo .)com \
    --cc=davem@davemloft$(echo .)net \
    --cc=kuba@kernel$(echo .)org \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=pabeni@redhat$(echo .)com \
    --cc=tomas.melin@vaisala$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox