From: Luca Boccassi Date: Fri, 8 May 2026 14:09:25 +0000 (+0100) Subject: test: workaroud flaky TEST-53-TIMER.restart-trigger against journald cgroup attributi... X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=cd57308303ecffaef01a5b27d7fa4ef7e7d6a9ce;p=thirdparty%2Fsystemd.git test: workaroud flaky TEST-53-TIMER.restart-trigger against journald cgroup attribution race The restart-trigger subtest occasionally fails on CI with: + assert_eq 0 1 FAIL: expected: '1' actual: '0' even though the timer fires correctly and the echo message is in fact written to the journal. The failure happens because the test relies on `journalctl --unit=$UNIT_NAME` to find the message, and that filter is based on the cgroup journald looks up for the writer PID at the time the stdout message is received. For very short-lived processes spawned via systemd-executor (like `echo`), that lookup is racy: the writer's `/proc/$PID/cgroup` can still resolve to `/init.scope` (systemd-executor's own cgroup) rather than the service's cgroup, so the message ends up attributed to `init.scope` and `--unit=` filtering misses it. __CURSOR=s=6f90ff5b6a0e47c3a527a9b4892af965;i=f8ed;b=3dad0cc689a04781879e4dd846d24432;m=17703dc;t=6513be1be2506;x=8d3009a687724b5e __REALTIME_TIMESTAMP=1778167492519174 __MONOTONIC_TIMESTAMP=24576988 __SEQNUM=63725 __SEQNUM_ID=6f90ff5b6a0e47c3a527a9b4892af965 _BOOT_ID=3dad0cc689a04781879e4dd846d24432 _HOSTNAME=H PRIORITY=6 SYSLOG_FACILITY=3 _UID=0 _GID=0 _CAP_EFFECTIVE=1ffffffffff _SYSTEMD_CGROUP=/init.scope _SYSTEMD_UNIT=init.scope _SYSTEMD_SLICE=-.slice _EXE=/usr/lib/systemd/systemd-executor _TRANSPORT=stdout _COMM=18 _MACHINE_ID=89ef83adc0bc4a33a83a227201b57203 _RUNTIME_SCOPE=system _PID=816 _CMDLINE=/usr/lib/systemd/systemd-executor --deserialize 50 --log-level debug,console:info --log-target journal-or-kmsg _STREAM_ID=8e8e4166c99e40afaa58bcd04a50a7f4 SYSLOG_IDENTIFIER=echo MESSAGE=Hello from timer 29581 Note _SYSTEMD_UNIT=init.scope / _SYSTEMD_CGROUP=/init.scope on the echo output: this is what causes `--unit=timer-restart-14362` to return 0 hits. The test failure logs from the same run confirm this: + JOURNAL_TS=1778160292 + journalctl -p info --since=@1778160292 --unit=timer-restart-14362 '--grep=Hello from timer 29581' -- No entries -- + systemctl restart timer-restart-14362.timer ... + date '--set=+2 hours' Thu May 7 15:24:52 UTC 2026 + sleep 1 ... echo[816]: Hello from timer 29581 ... ++ journalctl -q -p info --since=@1778160292 --unit=timer-restart-14362 '--grep=Hello from timer 29581' ++ wc -l + assert_eq 0 1 FAIL: expected: '1' actual: '0' For comparison, in a passing local run the same message is attributed correctly to the service unit (_SYSTEMD_UNIT=timer-restart-24147.service), so `--unit=` matches. Work around the underlying journald race in the test by setting an explicit `SyslogIdentifier=` on the service and matching with `-t` plus the unique grep pattern: `SyslogIdentifier` is carried over the stdout stream protocol and is not affected by the cgroup lookup race. Co-developed-by: Claude Opus 4.7 --- diff --git a/test/units/TEST-53-TIMER.restart-trigger.sh b/test/units/TEST-53-TIMER.restart-trigger.sh index 71de99cf276..6a4add9d638 100755 --- a/test/units/TEST-53-TIMER.restart-trigger.sh +++ b/test/units/TEST-53-TIMER.restart-trigger.sh @@ -25,13 +25,15 @@ EOF cat >"/run/systemd/system/$UNIT_NAME.service" <