From: Willy Tarreau
Date: Thu, 26 Jun 2025 14:01:55 +0000 (+0200)
Subject: MEDIUM: cpu-topo: switch to the "performance" cpu-policy by default
X-Git-Tag: v3.3-dev2~5
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=b74336984dff1db9a1afd6b1e4eb2c49b428ce25;p=thirdparty%2Fhaproxy.git

MEDIUM: cpu-topo: switch to the "performance" cpu-policy by default

As mentioned during the NUMA series development, the goal is to use all
available cores in the most efficient way by default, which normally
corresponds to "cpu-policy performance". The previous default choice of
"cpu-policy first-usable-node" was only meant to stay 100% identical to
before cpu-policy. So let's switch the default cpu-policy to "performance"
right now. The doc was updated to reflect this.
---

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 999564e5e..cbb44aa01 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -2174,7 +2174,7 @@ cpu-policy
 
   The "cpu-policy" directive chooses between a small number of allocation
   policies which one to use instead, when "cpu-map" is not used. The following
-  policies are currently supported:
+  policies are currently supported, with "performance" being the default one:
 
     - none             no particular post-selection is performed. All enabled
                        CPUs will be usable, and if the number of threads is
@@ -2202,8 +2202,7 @@ cpu-policy
                        node with enabled CPUs will be used, and this number of
                        CPUs will be used as the number of threads. A single
                        thread group will be enabled with all of them, within
-                       the limit of 32 or 64 depending on the system. This is
-                       the default policy.
+                       the limit of 32 or 64 depending on the system.
 
     - group-by-2-ccx   same as "group-by-ccx" below but create a group every
                        two CCX. This can make sense on CPUs having many CCX of
@@ -2299,7 +2298,7 @@ cpu-policy
                        such as network handling is much more effective. On
                        development systems, these can also be used to run
                        auxiliary tools such as load generators and monitoring
-                       tools.
+                       tools. This is the default policy.
 
     - resource         this is like "group-by-cluster" above, except that only
                        the smallest and most efficient CPU cluster will be
@@ -2904,18 +2903,15 @@ no-quic
   processed by haproxy. See also "quic_enabled" sample fetch.
 
 numa-cpu-mapping
-  When running on a NUMA-aware platform with the cpu-policy is set to
-  "first-usable-node" (the default one), HAProxy inspects on startup the CPU
-  topology of the machine. If a multi-socket machine is detected, the affinity
-  is automatically calculated to run on the CPUs of a single node. This is done
-  in order to not suffer from the performance penalties caused by the
-  inter-socket bus latency. However, if the applied binding is non optimal on a
-  particular architecture, it can be disabled with the statement 'no
-  numa-cpu-mapping'. This automatic binding is also not applied if a nbthread
-  statement is present in the configuration, if the affinity of the process is
-  already specified, for example via the 'cpu-map' directive or the taskset
-  utility, or if the cpu-policy is set to any other value. See also "cpu-map",
-  "cpu-policy", "cpu-set".
+  When running on a NUMA-aware platform, this enables the "cpu-policy"
+  directive to inspect the topology and figure the best set of CPUs to use and
+  the corresponding number of threads. However, if the applied binding is non
+  optimal on a particular architecture, it can be disabled with the statement
+  'no numa-cpu-mapping'. This automatic binding is also not applied if a
+  'nbthread' statement is present in the configuration, if the affinity of the
+  process is already specified, for example via the 'cpu-map' directive or the
+  taskset utility, or if the cpu-policy is set to any other value. See also
+  "cpu-map", "cpu-policy", "cpu-set".
 
 ocsp-update.disable [ on | off ]
   Disable completely the ocsp-update in HAProxy. Any ocsp-update configuration
diff --git a/src/cpu_topo.c b/src/cpu_topo.c
index 45cff86ee..7422046c3 100644
--- a/src/cpu_topo.c
+++ b/src/cpu_topo.c
@@ -60,7 +60,7 @@ static int cpu_policy_resource(int policy, int tmin, int tmax, int gmin, int gma
 
 static struct ha_cpu_policy ha_cpu_policy[] = {
 	{ .name = "none",               .desc = "use all available CPUs",                     .fct = NULL },
-	{ .name = "first-usable-node",  .desc = "use only first usable node if nbthreads not set", .fct = cpu_policy_first_usable_node, .arg = 0 },
+	{ .name = "performance",        .desc = "make one thread group per perf. core cluster", .fct = cpu_policy_performance    , .arg = 0 },
 	{ .name = "group-by-ccx",       .desc = "make one thread group per CCX",              .fct = cpu_policy_group_by_ccx     , .arg = 1 },
 	{ .name = "group-by-2-ccx",     .desc = "make one thread group per 2 CCX",            .fct = cpu_policy_group_by_ccx     , .arg = 2 },
 	{ .name = "group-by-3-ccx",     .desc = "make one thread group per 3 CCX",            .fct = cpu_policy_group_by_ccx     , .arg = 3 },
@@ -69,9 +69,9 @@ static struct ha_cpu_policy ha_cpu_policy[] = {
 	{ .name = "group-by-2-clusters",.desc = "make one thread group per 2 core clusters",  .fct = cpu_policy_group_by_cluster , .arg = 2 },
 	{ .name = "group-by-3-clusters",.desc = "make one thread group per 3 core clusters",  .fct = cpu_policy_group_by_cluster , .arg = 3 },
 	{ .name = "group-by-4-clusters",.desc = "make one thread group per 4 core clusters",  .fct = cpu_policy_group_by_cluster , .arg = 4 },
-	{ .name = "performance",        .desc = "make one thread group per perf. core cluster", .fct = cpu_policy_performance    , .arg = 0 },
 	{ .name = "efficiency",         .desc = "make one thread group per eff. core cluster", .fct = cpu_policy_efficiency     , .arg = 0 },
 	{ .name = "resource",           .desc = "make one thread group from the smallest cluster", .fct = cpu_policy_resource    , .arg = 0 },
+	{ .name = "first-usable-node",  .desc = "use only first usable node if nbthreads not set", .fct = cpu_policy_first_usable_node, .arg = 0 },
 	{ 0 } /* end */
 };
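
For deployments that depended on the previous single-node default, the old behavior stays selectable explicitly after this change. A minimal sketch of the directive documented above (assuming, as for other CPU-tuning keywords, that "cpu-policy" goes in the "global" section; not an official recommendation from this commit):

```
global
    # pin the pre-change default: use only the first usable NUMA node,
    # instead of the new "performance" default that uses all perf. cores
    cpu-policy first-usable-node
```

Conversely, a configuration that already states "cpu-policy performance" explicitly is unaffected by this commit and can keep the line or drop it.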