From: Chris Mason
Date: Fri, 22 May 2026 16:15:50 +0000 (-0700)
Subject: core/exec-invoke: log write() allow-list widening
X-Git-Tag: v261-rc2~34^2~1
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=99aff237831f318a817ab2dd786ab3d80979c7bb;p=thirdparty%2Fsystemd.git

core/exec-invoke: log write() allow-list widening

apply_syscall_filter() unconditionally inserts the write() syscall into
c->syscall_filter when exec_fd or handoff_timestamp_fd is in use, so the
parent can receive the exec status / handoff timestamp from the child.
When the unit configured a positive SystemCallFilter= allow-list that
deliberately omits write(), the resulting widening of the operator's
policy happens silently with no trace in the journal.

Emit a log_debug() before the seccomp_filter_set_add_by_name() call when
syscall_allow_list is true, so the widening is at least observable to
operators inspecting the unit's debug log.

While here, document that mutating c->syscall_filter through a
'const ExecContext *c' is intentional: apply_syscall_filter() runs only
in the post-fork child, which owns a private copy of the address space,
so the hashmap change is never observed by the manager.

No functional change for the allow-list itself; write() is still added
exactly as before.

Fixes: 84b79215ccc5 ("core: do not filter out write() if required in the very late stage")
Assisted-by: kres (claude-opus-4-7)
Signed-off-by: Chris Mason
---

diff --git a/src/core/exec-invoke.c b/src/core/exec-invoke.c
index 183442aef3c..7c16ffc7f03 100644
--- a/src/core/exec-invoke.c
+++ b/src/core/exec-invoke.c
@@ -1638,8 +1638,16 @@ static int apply_syscall_filter(const ExecContext *c, const ExecParameters *p) {
                 action = negative_action;
         }
 
-        /* Sending over exec_fd or handoff_timestamp_fd requires write() syscall. */
+        /* Sending over exec_fd or handoff_timestamp_fd requires write() syscall.
+         *
+         * Note: this mutates c->syscall_filter despite the 'const ExecContext *c' qualifier.
+         * That is intentional and safe here because apply_syscall_filter() runs only in the
+         * post-fork child, which holds a private copy of the address space; the hashmap
+         * change is never visible to the manager process. */
         if (p->exec_fd >= 0 || p->handoff_timestamp_fd >= 0) {
+                if (c->syscall_allow_list)
+                        log_debug("SystemCallFilter= allow-list in effect; adding 'write' syscall required for exec handoff.");
+
                 r = seccomp_filter_set_add_by_name(c->syscall_filter, c->syscall_allow_list, "write");
                 if (r < 0)
                         return r;
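
The new comment's reasoning rests on two points: a const-qualified struct pointer does not make the object behind a pointer member const, and a change made after fork() lands in the child's private copy of memory. The following is a minimal sketch of both points, not systemd code. The demo_filter and demo_context types and the function names are hypothetical stand-ins for the syscall filter hashmap and ExecContext.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the syscall filter hashmap. */
struct demo_filter {
        bool write_allowed;
};

/* Hypothetical stand-in for ExecContext: it holds a pointer to the filter. */
struct demo_context {
        struct demo_filter *filter;
};

/* 'const' here protects the pointer member itself, not the object it
 * points at, so this assignment compiles without any cast. */
static void widen_filter(const struct demo_context *c) {
        assert(c != NULL);
        c->filter->write_allowed = true;
}

/* Returns 1 if the parent observes the child's mutation, 0 if it does not,
 * and -1 on error. */
int parent_sees_child_mutation(void) {
        struct demo_filter f = { .write_allowed = false };
        struct demo_context c = { .filter = &f };
        int status;

        pid_t pid = fork();
        if (pid < 0)
                return -1;

        if (pid == 0) {
                /* Child: mutate our private copy, then report that it took effect. */
                widen_filter(&c);
                _exit(f.write_allowed ? 0 : 1);
        }

        /* Parent: wait for the child, then inspect our own copy. */
        if (waitpid(pid, &status, 0) < 0)
                return -1;
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
                return -1;

        return f.write_allowed ? 1 : 0;
}
```

Under these assumptions, parent_sees_child_mutation() is expected to return 0: the child sees its own change, while the parent's copy of the flag stays false. This only holds for ordinary private memory; a mapping shared between the processes would behave differently.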