The receive path of the RTL93xx SoCs is currently discarding packets
in software. Analysis gives the following explanation:
- RX ring size registers are set up with the full software ring size
- When packets are received the packet counter registers are increased
- After RX processing the counter registers are updated without regard
  to the work actually done
- From then on the SoC is allowed to receive more packets than the
  software allows
- Overflow interrupts are fired
- As a reaction to that the software drops packets
Change the processing as follows:
- Set up the ring size registers with a headroom of 2 buffers
- Decrease the counter registers by the real work done
With this change no more overflow interrupts occur because the SoC
disables the queues before they can overflow or hit a buffer that is
still owned by the CPU.
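
The register arithmetic behind the two changes can be sketched in isolation. This is an illustration only, not the driver code: it assumes three 10-bit fields per 32-bit register, as the "(r % 3) * 10" and 0x3ff expressions in the patch suggest. All names below (ring_field_pos, ring_size_field, set_ring_field, cntr_write_value and the RING_* constants) are hypothetical, and the clamp to 0 for rings shorter than the headroom is an added assumption that the patch does not make.

```c
#include <stdint.h>

/* Hypothetical constants: three 10-bit ring fields share one register. */
#define RING_FIELD_MASK 0x3ffu
#define RINGS_PER_REG   3u
#define RING_FIELD_BITS 10u
#define RING_HEADROOM   2u

/* Bit offset of a ring's field inside its 32-bit register. */
static unsigned int ring_field_pos(unsigned int ring)
{
	return (ring % RINGS_PER_REG) * RING_FIELD_BITS;
}

/*
 * Ring size to program: software ring length minus the headroom,
 * capped to the 10-bit field. Clamping short rings to 0 is an
 * assumption of this sketch.
 */
static uint32_t ring_size_field(unsigned int ring_len)
{
	uint32_t cnt = ring_len > RING_HEADROOM ? ring_len - RING_HEADROOM : 0;

	return cnt > RING_FIELD_MASK ? RING_FIELD_MASK : cnt;
}

/* Read-modify-write of one field; the other two fields are preserved. */
static uint32_t set_ring_field(uint32_t reg_val, unsigned int ring,
			       uint32_t field)
{
	unsigned int pos = ring_field_pos(ring);

	reg_val &= ~(RING_FIELD_MASK << pos);
	return reg_val | ((field & RING_FIELD_MASK) << pos);
}

/* Value written to the counter register after handling work_done packets. */
static uint32_t cntr_write_value(unsigned int ring, uint32_t work_done)
{
	return (work_done & RING_FIELD_MASK) << ring_field_pos(ring);
}
```

For a software ring of 300 buffers, ring_size_field(300) gives 298, so the hardware stops two buffers short of the software ring end.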
Benchmark from a single-stream iperf3 run, with the server process
running on a ZyXEL XGS1210 (RTL930x).
iperf3 run before:
-----------------------------------------------------------
Server listening on 5201 (test #1)
-----------------------------------------------------------
Accepted connection from 192.168.2.86, port 54412
[ 5] local 192.168.2.71 port 5201 connected to 192.168.2.86 port 54418
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 384 KBytes 3.14 Mbits/sec
[ 5] 1.00-2.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 2.00-3.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 3.00-4.01 sec 5.12 MBytes 42.8 Mbits/sec
[ 5] 4.01-5.00 sec 11.4 MBytes 95.8 Mbits/sec
[ 5] 5.00-6.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 6.00-7.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 7.00-8.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 8.00-9.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 9.00-10.00 sec 0.00 Bytes 0.00 bits/sec
iperf3 run after:
-----------------------------------------------------------
Server listening on 5201 (test #1)
-----------------------------------------------------------
Accepted connection from 192.168.2.86, port 55228
[ 5] local 192.168.2.71 port 5201 connected to 192.168.2.86 port 55232
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 22.8 MBytes 191 Mbits/sec
[ 5] 1.00-2.01 sec 25.4 MBytes 211 Mbits/sec
[ 5] 2.01-3.00 sec 25.4 MBytes 215 Mbits/sec
[ 5] 3.00-4.01 sec 26.5 MBytes 220 Mbits/sec
[ 5] 4.01-5.00 sec 26.2 MBytes 222 Mbits/sec
[ 5] 5.00-6.00 sec 26.9 MBytes 225 Mbits/sec
[ 5] 6.00-7.00 sec 27.0 MBytes 226 Mbits/sec
[ 5] 7.00-8.01 sec 26.9 MBytes 224 Mbits/sec
[ 5] 8.01-9.00 sec 26.5 MBytes 223 Mbits/sec
[ 5] 9.00-10.00 sec 26.8 MBytes 225 Mbits/sec
[ 5] 10.00-10.02 sec 640 KBytes 224 Mbits/sec
Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
Link: https://github.com/openwrt/openwrt/pull/19960
Signed-off-by: Robert Marko <robimarko@gmail.com>
static void rtl930x_update_cntr(int r, int released)
{
+ u32 reg = rtl930x_dma_if_rx_ring_cntr(r);
int pos = (r % 3) * 10;
- u32 reg = RTL930X_DMA_IF_RX_RING_CNTR + ((r / 3) << 2);
- u32 v = sw_r32(reg);
- v = (v >> pos) & 0x3ff;
- pr_debug("RX: Work done %d, old value: %d, pos %d, reg %04x\n", released, v, pos, reg);
- sw_w32_mask(0x3ff << pos, released << pos, reg);
- sw_w32(v, reg);
+ sw_w32(released << pos, reg);
}
static void rtl931x_update_cntr(int r, int released)
{
+ u32 reg = rtl931x_dma_if_rx_ring_cntr(r);
int pos = (r % 3) * 10;
- u32 reg = RTL931X_DMA_IF_RX_RING_CNTR + ((r / 3) << 2);
- u32 v = sw_r32(reg);
- v = (v >> pos) & 0x3ff;
- sw_w32_mask(0x3ff << pos, released << pos, reg);
- sw_w32(v, reg);
+ sw_w32(released << pos, reg);
}
struct dsa_tag {
sw_w32((DEFAULT_MTU << 16) | RX_TRUNCATE_EN_93XX, priv->r->dma_if_ctrl);
for (int i = 0; i < priv->rxrings; i++) {
+ int cnt = min(priv->rxringlen - 2, 0x3ff);
int pos = (i % 3) * 10;
u32 v;
- sw_w32_mask(0x3ff << pos, priv->rxringlen << pos, priv->r->dma_if_rx_ring_size(i));
+ sw_w32_mask(0x3ff << pos, cnt << pos, priv->r->dma_if_rx_ring_size(i));
/* Some SoCs have issues with missing underflow protection */
v = (sw_r32(priv->r->dma_if_rx_ring_cntr(i)) >> pos) & 0x3ff;
netif_receive_skb_list(&rx_list);
/* Update counters */
- priv->r->update_cntr(r, 0);
+ priv->r->update_cntr(r, work_done);
spin_unlock_irqrestore(&priv->lock, flags);