The recently added test case TEST-07-PID1.subgroup-kill.sh surfaced a
race: if we enumerate the PIDs in a cgroup while the cgroup is being
removed, reading from cgroup.procs fails with ENODEV. Handle that
gracefully: a removed cgroup contains no processes anymore, so treat
ENODEV as the end of the enumeration rather than as an error.
Noticed while looking at:
https://github.com/systemd/systemd/actions/runs/16143084441/job/45554929264?pr=38120
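
The idea described above can be sketched in isolation roughly as follows. This is a minimal illustration, not the code from this patch: read_one_pid() and count_pids() are hypothetical names, and the real helpers (cg_read_pid(), cg_read_pidref()) differ in signature and behavior. The sketch assumes only that a failed read on the stream leaves the kernel's error code in errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper (not systemd's cg_read_pid()): reads one PID from an
 * already opened cgroup.procs stream. Returns 1 if a PID was read, 0 on EOF,
 * and a negative errno-style value on failure. */
static int read_one_pid(FILE *f, pid_t *ret) {
        unsigned long ul;

        assert(f);
        assert(ret);

        errno = 0;
        if (fscanf(f, "%lu", &ul) != 1) {
                if (feof(f))
                        return 0;

                /* A read error (such as ENODEV) is reported via errno; a
                 * parse failure leaves errno at 0, map that to EIO. */
                return errno > 0 ? -errno : -EIO;
        }

        if (ul == 0)
                return -EIO;

        *ret = (pid_t) ul;
        return 1;
}

/* Hypothetical caller: counts the PIDs in the stream. ENODEV is treated like
 * EOF, on the assumption that a removed cgroup has no processes left. */
static int count_pids(FILE *f) {
        int n = 0;

        assert(f);

        for (;;) {
                pid_t pid;
                int r;

                r = read_one_pid(f, &pid);
                if (r == 0 || r == -ENODEV)
                        break;
                if (r < 0)
                        return r;

                n++;
        }

        return n;
}
```

A caller would pass a stream opened on a cgroup's cgroup.procs file to count_pids(). Folding ENODEV into the EOF case keeps the loop body unchanged and confines the special case to a single comparison, which is the same shape the hunks below give the IN_SET() checks.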
assert(f);
assert(ret);
+ /* NB: The kernel returns ENODEV if we tried to read from cgroup.procs of a cgroup that has been
+ * removed already. Callers should handle that! */
+
for (;;) {
errno = 0;
if (fscanf(f, "%lu", &ul) != 1) {
_cleanup_(pidref_done) PidRef pidref = PIDREF_NULL;
r = cg_read_pidref(f, &pidref, flags);
+ if (r == -ENODEV) {
+ /* Reading from cgroup.procs will result in ENODEV if the cgroup is
+ * concurrently removed. Just leave in that case, because a removed cgroup
+ * contains no processes anymore. */
+ done = true;
+ break;
+ }
if (r < 0)
return RET_GATHER(ret, log_debug_errno(r, "Failed to read pidref from cgroup '%s': %m", path));
if (r == 0)
/* libvirt / qemu uses threaded mode and cgroup.procs cannot be read at the lower levels.
* From https://docs.kernel.org/admin-guide/cgroup-v2.html#threads, “cgroup.procs” in a
* threaded domain cgroup contains the PIDs of all processes in the subtree and is not
- * readable in the subtree proper. */
+ * readable in the subtree proper.
+ *
+ * We'll see ENODEV when trying to enumerate processes and the cgroup is removed at the same
+ * time. Handle this gracefully. */
r = cg_read_pidref(f, &pidref, /* flags = */ 0);
- if (IN_SET(r, 0, -EOPNOTSUPP))
+ if (IN_SET(r, 0, -EOPNOTSUPP, -ENODEV))
break;
if (r < 0)
return r;
if (r < 0)
return RET_GATHER(ret, r);
}
+ if (r == -ENODEV)
+ continue;
if (r < 0)
return RET_GATHER(ret, r);
} while (!done);
/* libvirt / qemu uses threaded mode and cgroup.procs cannot be read at the lower levels.
* From https://docs.kernel.org/admin-guide/cgroup-v2.html#threads,
* “cgroup.procs” in a threaded domain cgroup contains the PIDs of all processes in
- * the subtree and is not readable in the subtree proper. */
+ * the subtree and is not readable in the subtree proper.
+ *
+ * ENODEV is generated when we enumerate processes from a cgroup and the cgroup is removed
+ * concurrently. */
r = cg_read_pid(f, &pid, /* flags = */ 0);
- if (IN_SET(r, 0, -EOPNOTSUPP))
+ if (IN_SET(r, 0, -EOPNOTSUPP, -ENODEV))
break;
if (r < 0)
return r;