net/sched: add qstats_cpu_drop_inc() helper

1) Using this_cpu_inc() is better than going through this_cpu_ptr():
   - Single instruction on x86.
   - Store tearing prevention.

2) Change tcf_action_update_stats() to use this_cpu_add().

3) Add WRITE_ONCE() to __qdisc_qstats_drop() and qstats_drop_inc()
   in preparation for lockless "tc qdisc show".

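For illustration only, the three points above can be sketched in userspace C. The macros below are simplified single-CPU stand-ins for the kernel's this_cpu_ptr(), this_cpu_inc(), this_cpu_add(), WRITE_ONCE() and READ_ONCE(); struct qstats_sketch and every function name in the block are hypothetical and do not reproduce the patch itself.

```c
/*
 * Userspace sketch only. These macros are simplified stand-ins so the
 * example builds outside the kernel (GCC/Clang, uses __typeof__).
 * They model a single CPU and are NOT the real kernel implementations.
 */
#include <assert.h>

#define WRITE_ONCE(x, val)     (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)           (*(volatile __typeof__(x) *)&(x))
#define this_cpu_ptr(ptr)      (ptr)            /* stand-in: one "CPU" */
#define this_cpu_inc(pcp)      ((pcp)++)        /* stand-in for the per-CPU op */
#define this_cpu_add(pcp, val) ((pcp) += (val)) /* stand-in for the per-CPU op */

/* Hypothetical, trimmed-down stats structure: only the field used here. */
struct qstats_sketch {
	unsigned int drops;
};

/* Older pattern: resolve this CPU's pointer, then a plain increment. */
static inline void drop_inc_via_ptr(struct qstats_sketch *pcpu)
{
	this_cpu_ptr(pcpu)->drops++;
}

/* Pattern from point 1: operate on the per-CPU field directly. */
static inline void cpu_drop_inc_sketch(struct qstats_sketch *pcpu)
{
	this_cpu_inc(pcpu->drops);
}

/* Pattern from point 2: add an arbitrary count with this_cpu_add(). */
static inline void cpu_drop_add_sketch(struct qstats_sketch *pcpu,
				       unsigned int n)
{
	this_cpu_add(pcpu->drops, n);
}

/*
 * Pattern from point 3, for a counter that is not per-CPU. Writers are
 * assumed to be serialized elsewhere (an assumption of this sketch);
 * WRITE_ONCE() only keeps the store from being torn.
 */
static inline void drop_inc_once_sketch(struct qstats_sketch *q)
{
	WRITE_ONCE(q->drops, q->drops + 1);
}

/* A lockless reader would pair with the annotated store like this. */
static inline unsigned int drops_read_sketch(const struct qstats_sketch *q)
{
	return READ_ONCE(q->drops);
}

/* Small usage check exercising every helper above. */
static unsigned int sketch_demo(void)
{
	struct qstats_sketch pcpu = { 0 }, q = { 0 };

	drop_inc_via_ptr(&pcpu);       /* pcpu.drops == 1 */
	cpu_drop_inc_sketch(&pcpu);    /* pcpu.drops == 2 */
	cpu_drop_add_sketch(&pcpu, 3); /* pcpu.drops == 5 */
	drop_inc_once_sketch(&q);      /* q.drops == 1 */

	return pcpu.drops + drops_read_sketch(&q);
}
```

The stand-in macros cannot reproduce the single-instruction property of this_cpu_inc() on x86; the sketch only shows how the call sites differ in shape.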
$ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
add/remove: 0/0 grow/shrink: 3/17 up/down: 72/-216 (-144)
Function                                     old     new   delta
dualpi2_enqueue_skb                          462     511     +49
tcf_ife_act                                 1061    1077     +16
taprio_enqueue                               613     620      +7
codel_qdisc_enqueue                          149     143      -6
tcf_vlan_act                                 684     676      -8
tcf_skbedit_act                              626     618      -8
tcf_police_act                               725     717      -8
tcf_mpls_act                                1297    1289      -8
tcf_gate_act                                 310     302      -8
tcf_gact_act                                 222     214      -8
tcf_csum_act                                2438    2430      -8
tcf_bpf_act                                  709     701      -8
tcf_action_update_stats                      124     115      -9
pie_qdisc_enqueue                            865     856      -9
pfifo_enqueue                                116     107      -9
choke_enqueue                               2069    2059     -10
plug_enqueue                                 139     128     -11
bfifo_enqueue                                121     110     -11
tcf_nat_act                                 1501    1489     -12
gred_enqueue                                1743    1668     -75
Total: Before=24388609, After=24388465, chg -0.00%

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260501135916.2566766-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>