git.ipfire.org Git - thirdparty/kernel/linux.git/commit

author	Chen Yu <yu.c.chen@intel.com>
	Wed, 1 Apr 2026 21:52:14 +0000 (14:52 -0700)
committer	Peter Zijlstra <peterz@infradead.org>
	Thu, 9 Apr 2026 13:49:47 +0000 (15:49 +0200)
commit	b4606faab3188beeacc2287b8a369cca943cc8eb
tree	67b83ffb834f9cbf236ecdb730f2848e8c883c54
parent	df0d98475954d655571979aa061ecb07d7e00392

sched/cache: Limit the scan number of CPUs when calculating task occupancy

When NUMA balancing is enabled, the kernel currently iterates over all
online CPUs to aggregate process-wide occupancy data. On large systems,
this global scan introduces significant overhead.

To reduce scan latency, limit the scan to the CPUs of the relevant nodes:
1. The task's preferred NUMA node.
2. The node where the task is currently running.
3. The node that contains the task's current preferred LLC.

Scanning only the preferred NUMA node would be ideal, but the
process-wide scan has to stay flexible because the "preferred node" is
a per-task attribute. Threads of the same process may have different
preferred nodes, so the process-wide preference can move between
nodes. A mask that covers both the preferred and the currently running
nodes keeps the aggregate accurate while significantly reducing the
number of CPUs inspected.
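
The masking idea described above can be sketched in user space as follows.
This is an illustration only, not the code from the patch: the topology
constants, the NO_NODE sentinel, and the helpers node_cpus(), scan_mask()
and sum_occupancy() are hypothetical names, and a plain 64-bit word stands
in for the kernel's cpumask.

```c
#include <assert.h>
#include <stdint.h>

#define NR_NODES      4    /* hypothetical topology: 4 NUMA nodes */
#define CPUS_PER_NODE 8    /* hypothetical: 8 CPUs per node, 32 in total */
#define NO_NODE       (-1) /* hypothetical sentinel: node not known */

/* Bitmask of the CPUs that belong to one node (toy stand-in for a cpumask). */
static uint64_t node_cpus(int node)
{
	assert(node >= 0 && node < NR_NODES);
	return ((UINT64_C(1) << CPUS_PER_NODE) - 1) << (node * CPUS_PER_NODE);
}

/*
 * Union of the CPUs on the three nodes named in the commit message:
 * the preferred node, the running node, and the preferred LLC's node.
 * Nodes that are not known (NO_NODE) contribute nothing.
 */
static uint64_t scan_mask(int preferred_node, int running_node, int llc_node)
{
	uint64_t mask = 0;

	if (preferred_node != NO_NODE)
		mask |= node_cpus(preferred_node);
	if (running_node != NO_NODE)
		mask |= node_cpus(running_node);
	if (llc_node != NO_NODE)
		mask |= node_cpus(llc_node);
	return mask;
}

/* Sum per-CPU occupancy, visiting only the CPUs set in the mask. */
static unsigned long sum_occupancy(const unsigned long *occ, uint64_t mask)
{
	unsigned long total = 0;

	for (int cpu = 0; cpu < NR_NODES * CPUS_PER_NODE; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			total += occ[cpu];
	return total;
}
```

With this toy topology, a task whose three nodes coincide is scanned over
8 CPUs instead of 32, and at most 24 CPUs are visited when all three nodes
differ.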

Future work may integrate numa_group to further refine task aggregation.

Suggested-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/57ed5fcec9b242803fe4ea2ce6e7f3de6a6efc6b.1775065312.git.tim.c.chen@linux.intel.com