git.ipfire.org Git - thirdparty/kernel/linux.git/commit
sched/cache: Limit the scan number of CPUs when calculating task occupancy
author Chen Yu <yu.c.chen@intel.com>
Wed, 1 Apr 2026 21:52:14 +0000 (14:52 -0700)
committer Peter Zijlstra <peterz@infradead.org>
Thu, 9 Apr 2026 13:49:47 +0000 (15:49 +0200)
commit b4606faab3188beeacc2287b8a369cca943cc8eb
tree 67b83ffb834f9cbf236ecdb730f2848e8c883c54
parent df0d98475954d655571979aa061ecb07d7e00392

When NUMA balancing is enabled, the kernel currently iterates over all
online CPUs to aggregate process-wide occupancy data. On large systems,
this global scan introduces significant overhead.

To reduce scan latency, limit the scan to the CPUs of the following nodes:
1. The task's preferred NUMA node.
2. The node where the task is currently running.
3. The node that contains the task's current preferred LLC.

While scanning only the preferred NUMA node would be ideal, a
process-wide scan must remain flexible because the "preferred node"
is a per-task attribute. Different threads within the same process may
have different preferred nodes, causing the process-wide preference to
migrate. Maintaining a mask that covers both the preferred and the
currently running nodes preserves accuracy while significantly reducing
the number of CPUs inspected.
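
The mask construction described above can be illustrated with a minimal
userspace sketch. It is not the code from kernel/sched/fair.c: the toy
topology (4 nodes of 4 contiguous CPUs), the 32-bit mask type, and every
sketch_* name are assumptions made only for illustration. It shows how the
union of the three nodes' CPUs yields a scan set smaller than the full
machine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy topology: 4 NUMA nodes with 4 contiguous CPUs each. */
#define SKETCH_NR_NODES      4
#define SKETCH_CPUS_PER_NODE 4
#define SKETCH_NR_CPUS       (SKETCH_NR_NODES * SKETCH_CPUS_PER_NODE)
#define SKETCH_NO_NODE       (-1)	/* task has no preferred node yet */
#define SKETCH_NO_CPU        (-1)	/* task has no preferred LLC yet */

/* One bit per CPU; a stand-in for a real cpumask, not the kernel type. */
typedef uint32_t sketch_mask_t;

/* Map a CPU to its node under the contiguous-numbering assumption. */
static int sketch_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return cpu / SKETCH_CPUS_PER_NODE;
}

/* Return the CPUs of one node, or an empty mask for SKETCH_NO_NODE. */
static sketch_mask_t sketch_node_cpus(int node)
{
	if (node == SKETCH_NO_NODE)
		return 0;
	assert(node >= 0 && node < SKETCH_NR_NODES);
	return (sketch_mask_t)((1u << SKETCH_CPUS_PER_NODE) - 1)
		<< (node * SKETCH_CPUS_PER_NODE);
}

/*
 * Union of: the preferred node, the node of the CPU the task runs on,
 * and the node containing the preferred LLC (identified here by one of
 * its CPUs, which is an assumption of this sketch).
 */
static sketch_mask_t sketch_scan_mask(int preferred_node, int curr_cpu,
				      int llc_cpu)
{
	sketch_mask_t mask = sketch_node_cpus(preferred_node);

	mask |= sketch_node_cpus(sketch_cpu_to_node(curr_cpu));
	if (llc_cpu != SKETCH_NO_CPU)
		mask |= sketch_node_cpus(sketch_cpu_to_node(llc_cpu));
	return mask;
}

/* Count the CPUs that a scan over the mask would visit. */
static int sketch_mask_weight(sketch_mask_t mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}
```

In this toy setup, a task that prefers node 0 but currently runs on CPU 9
(node 2) is scanned over 8 of the 16 CPUs, and a task whose three nodes
coincide is scanned over only 4.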

Future work may integrate numa_group to further refine task aggregation.

Suggested-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/57ed5fcec9b242803fe4ea2ce6e7f3de6a6efc6b.1775065312.git.tim.c.chen@linux.intel.com
kernel/sched/fair.c