net/mlx5e: SHAMPO, Allow high order pages in zerocopy mode

Allow high order pages only when SHAMPO mode is enabled (hw-gro) and the
queue is used for zerocopy (has memory provider ops set). The limit is
128K and it was chosen for the following reasons:
- 256K size requires a special case during MTT calculation to split the
  page in two. That's because two MTTs are needed to form an octword.
- Higher sizes require increasing WQE size and/or reducing the number
  of WQEs.
- Having the RQ lined with too few large pages can lead to refill
  issues.

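The policy described above can be sketched as a small predicate. This is
only an illustration under stated assumptions: the function name, the 4K
base page size and the parameter names are hypothetical and are not taken
from the driver; only the two conditions and the 128K limit come from the
text.

```python
PAGE_SIZE = 4096                # assumption: 4K base pages
MAX_ZC_PAGE_SIZE = 128 * 1024   # the 128K limit stated in this message


def rx_page_size_allowed(size, shampo_enabled, has_mp_ops):
    """Hypothetical check, not the driver's actual function.

    size            requested RX buffer/page size in bytes
    shampo_enabled  True when SHAMPO mode (hw-gro) is on
    has_mp_ops      True when the queue has memory provider ops set
    """
    # Assumption: the base page size is always acceptable.
    if size <= PAGE_SIZE:
        return True
    # High order pages need both SHAMPO and a zerocopy queue.
    if not (shampo_enabled and has_mp_ops):
        return False
    # Even then, sizes above 128K are rejected.
    return size <= MAX_ZC_PAGE_SIZE
```

With this sketch, a 128K request on a SHAMPO zerocopy queue passes, while
a 256K request, or a 128K request on a queue without memory provider ops,
does not.
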
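Results show an increase in BW and a decrease in CPU usage: throughput
goes from 13621 MB/s to 15335 MB/s (about +13%) in the oncpu case and
from 14033 MB/s to 16707 MB/s (about +19%) in the offcpu case, while
%sys on the CPU running zcrx drops from 18.09 to 7.61 (oncpu) and from
68.09 to 43.68 (offcpu).

The benchmark was done with the zcrx samples from liburing [0].
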
rx_buf_len=4K, oncpu [1]:
  packets=3358832 (MB=820027), rps=55794 (MB/s=13621)

  Average:  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
  Average:    9    1.56    0.00   18.09   13.42    0.00   66.80    0.00    0.00    0.00    0.12

rx_buf_len=128K, oncpu [2]:
  packets=3781376 (MB=923187), rps=62813 (MB/s=15335)

  Average:  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
  Average:    9    0.33    0.00    7.61   18.86    0.00   73.08    0.00    0.00    0.00    0.12

rx_buf_len=4K, offcpu [3]:
  packets=3460368 (MB=844816), rps=57481 (MB/s=14033)

  Average:  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
  Average:    9    0.00    0.00    0.26    0.00    0.00   92.63    0.00    0.00    0.00    7.11
  Average:   11    3.04    0.00   68.09   28.87    0.00    0.00    0.00    0.00    0.00    0.00

rx_buf_len=128K, offcpu [4]:
  packets=4119840 (MB=1005820), rps=68435 (MB/s=16707)

  Average:  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
  Average:    9    0.00    0.00    0.87    0.00    0.00   63.77    0.00    0.00    0.00   35.36
  Average:   11    1.96    0.00   43.68   54.37    0.00    0.00    0.00    0.00    0.00    0.00

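For readers who want the relative gains rather than the raw figures, the
MB/s values reported above can be compared with a few lines of Python.
This is a convenience sketch and not part of the benchmark; the helper
name is made up for illustration and the inputs are copied from the
results above.

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# MB/s figures copied from the results above: (4K, 128K)
results = {
    "oncpu": (13621, 15335),
    "offcpu": (14033, 16707),
}

for mode, (mbps_4k, mbps_128k) in results.items():
    gain = pct_change(mbps_4k, mbps_128k)
    print(f"{mode}: {mbps_4k} -> {mbps_128k} MB/s ({gain:+.1f}%)")
```
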
[0] https://github.com/isilence/liburing/tree/zcrx/rx-buf-len
[1] commands:
    $> taskset -c 9 ./zcrx 6 -i eth2 -q 9 -A 1 -B 4096 -S 33554432
    $> ./send-zerocopy tcp -6 -D 2001:db8::1 -t 60 -C 0 -l 1 -b 1 -n 1 -z 1 -d -s 256000
[2] commands:
    $> taskset -c 9 ./zcrx 6 -i eth2 -q 9 -A 1 -B 131072 -S 33554432
    $> ./send-zerocopy tcp -6 -D 2001:db8::1 -t 60 -C 0 -l 1 -b 1 -n 1 -z 1 -d -s 256000
[3] commands:
    $> taskset -c 11 ./zcrx 6 -i eth2 -q 9 -A 1 -B 4096 -S 33554432
    $> ./send-zerocopy tcp -6 -D 2001:db8::1 -t 60 -C 0 -l 1 -b 1 -n 1 -z 1 -d -s 256000
[4] commands:
    $> taskset -c 11 ./zcrx 6 -i eth2 -q 9 -A 1 -B 131072 -S 33554432
    $> ./send-zerocopy tcp -6 -D 2001:db8::1 -t 60 -C 0 -l 1 -b 1 -n 1 -z 1 -d -s 256000

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260223204155.1783580-16-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>