net/mlx5e: Set default burst period for TX and RX reporters
author Shahar Shitrit <shshitrit@nvidia.com>
Sun, 24 Aug 2025 08:43:54 +0000 (11:43 +0300)
committer Jakub Kicinski <kuba@kernel.org>
Wed, 27 Aug 2025 00:24:16 +0000 (17:24 -0700)
commit 2d5ccb93bbb4f161ccb677ce0b2ebcfe4a089d62
tree d4f5f425e3f1ef240c3b0d846f7eaca8aca7a157
parent da0e2197645c8e01bb6080c7a2b86d9a56cc64a9

System errors can sometimes cause multiple errors to be reported
to the TX reporter at the same time. For instance, lost interrupts
may cause several SQs to time out simultaneously. When dev_watchdog
notifies the driver of the timeout, the driver iterates over all SQs
and triggers recovery for the timed-out ones via the TX health reporter.
However, the grace period allows only one recovery at a time, so only
the first SQ recovers while others remain blocked. Since no further
recoveries are allowed during the grace period, subsequent errors
cause the reporter to enter an ERROR state, requiring manual
intervention.

To address this, set the TX reporter's default burst period
to 0.5 seconds. This allows the reporter to detect and handle all
timed-out SQs within this window before initiating the grace period.
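
The interaction between the burst period and the grace period can be
illustrated with a small user-space model. This is only a sketch under
stated assumptions, not the devlink health code: the struct, the enum and
reporter_model_report() are hypothetical names, timestamps are assumed to
be monotonic milliseconds, and the 500 ms grace period used in the usage
note is an illustrative value. Only the 500 ms burst period comes from
this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum reporter_state { REPORTER_HEALTHY, REPORTER_ERROR };

struct reporter_model {
	uint64_t burst_period_ms;  /* window in which every recovery is allowed */
	uint64_t grace_period_ms;  /* quiet time that follows the burst window */
	uint64_t window_start_ms;  /* time of the recovery that opened the window */
	bool window_open;          /* false until the first recovery happens */
	enum reporter_state state;
};

/*
 * Report one error at time now_ms (assumed to be non-decreasing).
 * Returns true if a recovery is allowed, false if it is blocked.
 */
static bool reporter_model_report(struct reporter_model *r, uint64_t now_ms)
{
	uint64_t elapsed;

	assert(r != NULL);

	if (r->window_open) {
		elapsed = now_ms - r->window_start_ms;

		/* Still inside the burst window: recover this error as well. */
		if (elapsed < r->burst_period_ms) {
			r->state = REPORTER_HEALTHY;
			return true;
		}

		/* Burst window is over and the grace period is running: block. */
		if (elapsed < r->burst_period_ms + r->grace_period_ms) {
			r->state = REPORTER_ERROR;
			return false;
		}
	}

	/* First error ever, or burst + grace elapsed: open a new window. */
	r->window_start_ms = now_ms;
	r->window_open = true;
	r->state = REPORTER_HEALTHY;
	return true;
}
```

In this model, a reporter with burst_period_ms = 500 allows recoveries
for errors reported at 0, 5 and 10 ms, which corresponds to several SQs
timing out together. With burst_period_ms = 0, the second error at 5 ms
is blocked and the model enters the ERROR state, which is the behaviour
described above.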

To account for the possibility of a similar issue in the RX reporter,
its default burst period is also configured.

Additionally, while here, align the TX definition prefix with the RX
one, as these definitions are used only in the EN driver.

Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250824084354.533182-6-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c