rcu: Latch normal synchronize_rcu() path on flood

Currently, rcu_normal_wake_from_gp is enabled by default only on
small systems (<= 16 CPUs) or when a user explicitly enables it.

Introduce an adaptive latching mechanism:
* Track the number of in-flight synchronize_rcu() requests
using a new rcu_sr_normal_count counter;
* If the count reaches or exceeds RCU_SR_NORMAL_LATCH_THR (64),
set rcu_sr_normal_latched, reverting new requests to the
scaled wait_rcu_gp() path;
* The latch is cleared only when the pending requests are fully
drained (nr == 0);
* Enable rcu_normal_wake_from_gp by default for all systems,
relying on this dynamic throttling instead of static CPU
limits.
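
The latching policy above can be sketched in userspace as follows.
This is a minimal model that uses C11 atomics instead of the kernel
atomic API, not the code from this patch. The helper names
sr_normal_try_enter() and sr_normal_done() are hypothetical; only
rcu_sr_normal_count, rcu_sr_normal_latched and
RCU_SR_NORMAL_LATCH_THR are taken from the description:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RCU_SR_NORMAL_LATCH_THR 64

/* Number of in-flight requests on the normal path. */
static atomic_long rcu_sr_normal_count;
/* Non-zero while new requests must use the fallback path. */
static atomic_int rcu_sr_normal_latched;

/*
 * Hypothetical helper: returns true if the caller may take the
 * normal (wake-from-GP) path, false if it must fall back to the
 * wait_rcu_gp() path. Fallback requests are not counted.
 */
static bool sr_normal_try_enter(void)
{
	long nr;

	if (atomic_load(&rcu_sr_normal_latched))
		return false;

	/* atomic_fetch_add() returns the old value, hence the +1. */
	nr = atomic_fetch_add(&rcu_sr_normal_count, 1) + 1;

	/* Latch once the threshold is reached or exceeded. */
	if (nr >= RCU_SR_NORMAL_LATCH_THR)
		atomic_store(&rcu_sr_normal_latched, 1);

	return true;
}

/*
 * Hypothetical helper: called when a normal-path request completes.
 * The latch is cleared only when the count drains to zero.
 */
static void sr_normal_done(void)
{
	long nr = atomic_fetch_sub(&rcu_sr_normal_count, 1) - 1;

	assert(nr >= 0);
	if (nr == 0)
		atomic_store(&rcu_sr_normal_latched, 0);
}
```

In this model the request that reaches the threshold still takes the
normal path; only requests arriving after the latch is set are
diverted, and they keep being diverted until every counted request
has completed.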

Testing (synthetic flood workload):
* Kernel version: 6.19.0-rc6
* Number of CPUs: 1536
* 60K concurrent synchronize_rcu() calls

Perf (cycles, system-wide):
  total cycles:            932020263832
  rcu_sr_normal_add_req(): 2650282811 cycles (~0.28%)

Perf report excerpt:
  0.01%  0.01%  sync_test/...  [k] rcu_sr_normal_add_req

The measured overhead of rcu_sr_normal_add_req() stayed at ~0.28%
of total CPU cycles in this synthetic stress test.

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Tested-by: Samir M <samir@linux.ibm.com>
Suggested-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>