Now we store and retrieve only the counters for the current tgid when more
than one thread group is supported. This significantly reduces contention
on shared stats. The haterm utility saw its performance increase from
4.9 to 5.8M req/s for H1, and from 6.0 to 7.6M for H2, both with 5 groups
of 16 threads, showing that we don't necessarily need very large numbers
of groups to benefit.
((void *)(*(counters)->datap + (mod)->counters_off[(counters)->type])) : \
(trash_counters))
+/* retrieve the pointer to the extra counters storage for module <mod> for the
+ * current TGID.
+ */
#define EXTRA_COUNTERS_GET(counters, mod) \
(likely(counters) ? \
- ((void *)(*(counters)->datap + (mod)->counters_off[(counters)->type])) : \
+ ((void *)(counters)->datap[(counters)->tgrp_step * (tgid - 1)] + \
+ (mod)->counters_off[(counters)->type]) : \
(trash_counters))
#define EXTRA_COUNTERS_REGISTER(counters, ctype, alloc_failed_label, storage, step) \