git.ipfire.org Git - thirdparty/systemd.git/commit

author	Chris Down <chris@chrisdown.name>
	Tue, 17 Feb 2026 06:58:44 +0000 (14:58 +0800)
committer	Yu Watanabe <watanabe.yu+github@gmail.com>
	Tue, 17 Feb 2026 09:26:04 +0000 (18:26 +0900)
commit	31620807e19e889c278ac847da9fffc7f2bd2d96
tree	1fd2cd0dd9b082399754c7356f78606692bc2602
parent	4fb8cc53b3438eaafe27e22f168d3ae333ca037c

oomd: Fix unnecessary delays during OOM kills with pending kills present

Let's say a user has two services with ManagedOOMMemoryPressure=kill,
perhaps a web server under system.slice and a batch job under
user.slice. Both exceed their pressure limits. On the previous timer
tick, oomd has already queued the web server's candidate for killing,
but the prekill hook has not yet responded, so the kill is still
pending.

In the code, monitor_memory_pressure_contexts_handler() iterates over
all pressure targets that have exceeded their limits. When it reaches
the web server target, it calls oomd_cgroup_kill_mark(), which returns 0
because that cgroup is already queued. The code treats this the same as
a successful new kill: it resets the 15 second delay timer and returns
from the function, exiting the loop.

This loop is handled by SET_FOREACH and the iteration order is
hash-dependent. As such, if the web server target happens to be visited
first, oomd never evaluates the batch job target at all.

The effect is twofold:

1. oomd stalls for 15 seconds despite not having initiated any new kill.
   That can unnecessarily delay further action to stem increases in
   memory pressure. The delay exists to let stale pressure counters
   settle after a kill, but no kill has happened here.
2. It non-deterministically skips pressure targets that may have
   unqueued candidates, dangerously allowing memory pressure to persist
   for longer than it should.

Fix this by skipping cgroups that are already queued so the loop
proceeds to try other pressure targets. We should only delay when a new
kill mark is actually created.
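
As a rough illustration only, the control flow described above can be
modelled with the standalone sketch below. It is not the oomd code:
Target, Result, kill_mark() and handle_tick() are hypothetical names, a
fixed array order stands in for the hash-dependent SET_FOREACH order,
and the positive return value for a newly created mark is an assumption
(the only documented fact used here is that 0 means "already queued").

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pressure target; not systemd's real type. */
typedef struct {
    const char *name;
    bool over_limit;     /* pressure exceeded the configured limit */
    bool already_queued; /* a kill is pending, prekill hook has not replied */
} Target;

/* What one timer tick did, in this simplified model. */
typedef struct {
    bool delay_armed; /* the post-kill delay timer was (re)started */
    int new_marks;    /* number of kill marks newly created */
} Result;

/* Models the return convention from the text: 0 if already queued.
 * Returning 1 for a newly created mark is an assumption. */
static int kill_mark(Target *t) {
    if (t->already_queued)
        return 0;
    t->already_queued = true;
    return 1;
}

/* skip_queued == false models the old behaviour, true models the fix. */
static Result handle_tick(Target *targets, size_t n, bool skip_queued) {
    Result res = { false, 0 };

    assert(targets != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        Target *t = &targets[i];

        if (!t->over_limit)
            continue;

        int r = kill_mark(t);
        if (r == 0 && skip_queued)
            continue; /* nothing new was queued: try the next target */

        /* The old behaviour also lands here when r == 0. */
        if (r > 0)
            res.new_marks++;
        res.delay_armed = true;
        return res;
    }

    return res;
}
```

With the web server target queued and visited first, the old behaviour
(skip_queued == false) arms the delay without creating any mark and
never looks at the batch job target. With skip_queued == true the loop
moves on, marks the batch job, and only then arms the delay. If every
over-limit target is already queued, no delay is armed at all.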