git.ipfire.org Git - thirdparty/systemd.git/commitdiff
core/exec-invoke: log write() allow-list widening
author Chris Mason <clm@meta.com>
Fri, 22 May 2026 16:15:50 +0000 (09:15 -0700)
committer Lennart Poettering <lennart@amutable.com>
Fri, 22 May 2026 20:19:35 +0000 (22:19 +0200)
apply_syscall_filter() unconditionally inserts the write() syscall
into c->syscall_filter when exec_fd or handoff_timestamp_fd is in
use, so the parent can receive the exec status / handoff timestamp
from the child. When the unit configured a positive
SystemCallFilter= allow-list that deliberately omits write(), the
resulting widening of the operator's policy happens silently with
no trace in the journal.

Emit a log_debug() before the seccomp_filter_set_add_by_name() call
when syscall_allow_list is true, so the widening is at least
observable to operators inspecting the unit's debug log.
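
As a rough illustration of the behaviour described above, here is a
minimal, self-contained sketch. It is not systemd code: struct filter,
filter_contains() and ensure_write_allowed() are hypothetical names, a
small string array stands in for the c->syscall_filter hashmap, and
fprintf() stands in for log_debug(). It assumes an allow-list gains
write() while a deny-list has write() removed from it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_SYSCALLS 16

/* Hypothetical stand-in for the unit's syscall filter; not systemd's real data structure. */
struct filter {
        bool allow_list;                 /* true: listed syscalls are permitted, all others denied */
        const char *names[MAX_SYSCALLS];
        size_t n;
};

static bool filter_contains(const struct filter *f, const char *name) {
        for (size_t i = 0; i < f->n; i++)
                if (strcmp(f->names[i], name) == 0)
                        return true;
        return false;
}

/* Returns 1 if the filter was changed, 0 if it was left alone, -1 if the table is full. */
static int ensure_write_allowed(struct filter *f, int exec_fd, int handoff_timestamp_fd) {
        if (exec_fd < 0 && handoff_timestamp_fd < 0)
                return 0;                /* no handoff fd in use, policy untouched */

        if (f->allow_list) {
                /* Make the widening visible before it happens. */
                fprintf(stderr, "allow-list in effect; adding 'write' required for exec handoff\n");
                if (filter_contains(f, "write"))
                        return 0;
                if (f->n >= MAX_SYSCALLS)
                        return -1;
                f->names[f->n++] = "write";
                return 1;
        }

        /* Deny-list: drop write() from the blocked set if it is listed. */
        for (size_t i = 0; i < f->n; i++)
                if (strcmp(f->names[i], "write") == 0) {
                        f->names[i] = f->names[--f->n];
                        return 1;
                }
        return 0;
}
```

Calling ensure_write_allowed() on an allow-list of { "read",
"exit_group" } with exec_fd >= 0 prints the message and appends
"write"; with both fds negative the filter is left as configured.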

While here, document that mutating c->syscall_filter through a
'const ExecContext *c' is intentional: apply_syscall_filter() runs
only in the post-fork child, which owns a private copy of the
address space, so the hashmap change is never observed by the
manager.
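
The reasoning about the private address space can be checked with a
small sketch, assuming a plain fork(). The names struct filter_set,
struct ctx and mutate_in_child() are hypothetical and only mimic the
shape of ExecContext: the const qualifier covers the pointer member,
not the object it points to, so the child can change the pointed-to
set without a cast, and the parent's copy stays as it was.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins; the real ExecContext holds a hashmap pointer. */
struct filter_set {
        bool has_write;
};

struct ctx {
        struct filter_set *syscall_filter;
};

/* Forks and lets the child modify the set reached through a const context.
 * Returns the child's exit status (1 if the child observed its own change),
 * or -1 on error. The caller can then inspect the parent's copy. */
static int mutate_in_child(const struct ctx *c) {
        pid_t pid = fork();
        if (pid < 0)
                return -1;

        if (pid == 0) {
                /* Child: the write lands in this process's private copy of the page. */
                c->syscall_filter->has_write = true;
                _exit(c->syscall_filter->has_write ? 1 : 0);
        }

        int status;
        if (waitpid(pid, &status, 0) < 0)
                return -1;

        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

After mutate_in_child() returns 1, the caller's filter_set still has
has_write == false, which is the property the new comment relies on.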

No functional change for the allow-list itself; write() is still
added exactly as before.

Fixes: 84b79215ccc5 ("core: do not filter out write() if required in the very late stage")
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason <clm@meta.com>
src/core/exec-invoke.c

index 183442aef3c07d6ff0276c213c942586da052306..7c16ffc7f030e0d27283a117f687999ef8841dfc 100644
@@ -1638,8 +1638,16 @@ static int apply_syscall_filter(const ExecContext *c, const ExecParameters *p) {
                 action = negative_action;
         }
 
-        /* Sending over exec_fd or handoff_timestamp_fd requires write() syscall. */
+        /* Sending over exec_fd or handoff_timestamp_fd requires write() syscall.
+         *
+         * Note: this mutates c->syscall_filter despite the 'const ExecContext *c' qualifier.
+         * That is intentional and safe here because apply_syscall_filter() runs only in the
+         * post-fork child, which holds a private copy of the address space; the hashmap
+         * change is never visible to the manager process. */
         if (p->exec_fd >= 0 || p->handoff_timestamp_fd >= 0) {
+                if (c->syscall_allow_list)
+                        log_debug("SystemCallFilter= allow-list in effect; adding 'write' syscall required for exec handoff.");
+
                 r = seccomp_filter_set_add_by_name(c->syscall_filter, c->syscall_allow_list, "write");
                 if (r < 0)
                         return r;