Mike Yuan [Sun, 26 May 2024 19:23:37 +0000 (03:23 +0800)]
man/run0: remove @ syntax for --machine=
For run0 (as opposed to systemd-run in general), connecting to
the system bus (of localhost or container) as a different user
than root and then trying to elevate privilege from that
makes little sense:
https://github.com/systemd/systemd/issues/32997#issuecomment-2127992973
The @ syntax is mostly useful when connecting to the user bus,
which is not a use case for run0. Hence, let's remove the example.
The syntax will be properly refused in #32999.
Yu Watanabe [Sun, 26 May 2024 21:01:05 +0000 (06:01 +0900)]
blockdev-util: also check loop/partscan sysattr
With https://github.com/torvalds/linux/commit/b9684a71fca793213378dd410cd11675d973eaa1 (v5.19),
we cannot check partition scanning is enabled for a loopback block device
without checking the attribute.
Yu Watanabe [Mon, 27 May 2024 00:21:41 +0000 (09:21 +0900)]
blockdev-util: also check newer value of GENHD_FL_NO_PART flag
With https://github.com/torvalds/linux/commit/430cc5d3ab4d0ba0bd011cfbb0035e46ba92920c,
the value of GENHD_FL_NO_PART, previously named as GENHD_FL_NO_PART_SCAN,
is changed from 0x0200 to 0x0004. So, we need to check both flags.
Yu Watanabe [Sun, 26 May 2024 01:05:57 +0000 (10:05 +0900)]
test: use SYSLOG_IDENTIFIER= filter instead of "journalctl -u"
"journalctl -u foo.service" may not work as expected, especially entries
for _TRANSPORT=stdout, for short-living services or when the service manager
generates debugging logs. Instead, SYSLOG_IDENTIFIER= should be reliable for
stdout. Let's use it.
Before this commit, if WorkingDirectory= is empty or literally "-",
'simplified' is not populated, resulting in the ASSERT_PTR
in unit_write_settingf() below getting triggered.
Also, do not accept "-", so that the parser is consistent
with load-fragment.c
Yu Watanabe [Fri, 24 May 2024 21:09:52 +0000 (06:09 +0900)]
unit: also stop systemd-journal-flush.service on soft-reboot
After soft-reboot, /var/log/journal may be initially read-only,
and becomes writable a bit later. In such case, runtime journal is
initially opened by journald. Hence, we need to flush to /var when it is
ready.
Yu Watanabe [Fri, 24 May 2024 21:02:39 +0000 (06:02 +0900)]
journald: always unset flushed flag when the runtime journal is opened
If the runtime journal is opened, we will anyway write journal entries
to the runtime journal, even if the persistent journal is writable.
Hence, we need to flush the runtime journal file later.
Yu Watanabe [Fri, 24 May 2024 16:32:21 +0000 (01:32 +0900)]
test: applying timezone is asynchronous
So, we need to try to read timezone several times.
Also, on failure, show journal of timedated instead of hostnamed,
as the timezone is handled by timedated.
Yu Watanabe [Fri, 24 May 2024 16:47:23 +0000 (01:47 +0900)]
machine-id-setup: update comment
If an initrd has an empty or uninitialized /etc/machine-id file,
then PID1 write a valid machine ID. So, the logic is important only on
soft-reboot. Let's mention that explicitly.
Yu Watanabe [Fri, 24 May 2024 17:01:53 +0000 (02:01 +0900)]
man: update machine-id-setup(1)
- mention that /run/machine-id is used if exist.
- mention system.machine_id credential,
- credential, VM uuid, and container uuid are not read when --root=
is specified or running in a chroot environment.
https://github.com/systemd/systemd/pull/32915#discussion_r1608258136
> In many cases we allow --root=/ as a mechanism for forcing an "offline" mode,
> while still operating on the root dir. if we do the getenv_for_pid() thing
> below I'd claim this is very much an "online" operation, and hence --root=/
> should really disable that.
cryptenroll: explicitly pick PCR bank if literal PCR binding is off, but signed PCR binding is on
We so far derived the PCR bank to use from the PCR values specified fr
literal PCR binding. However, when that's not used then we left the bank
uninitialized – which will break if signed PCR binds are used (where we
need to pick a bank too after all).
Hence, let's explicitly pick a bank to use if literal PCR values are not
used, to make things just work.
Michal Sekletar [Wed, 22 May 2024 15:15:07 +0000 (17:15 +0200)]
libsystemd: link with '-z nodelete'
We want to avoid reinitialization of our global variables with static
storage duration in case we get dlopened multiple times by the same
application. This will avoid potential resource leaks that could have
happened otherwise (e.g. leaking journal socket fd).
varlinkctl: when operating in --more mode, fail correcly on Varlink method error
In varlink.c we generally do not make failing callback functions fatal,
since that should be up to the app. Hence, in case of varlinkctl (where
we want failures to be fatal), make sure to propagate the error back
explicitly.
Before this change a failing call to "varlinkctl --more call …" would result in
a zero exit code. With this it will correctly exit with a non-zero exit
code.
Luca Boccassi [Tue, 21 May 2024 00:43:24 +0000 (01:43 +0100)]
test: do not fail network namespace test with permission issues
When running in LXC with AppArmor we'll most likely get an error when creating
a network namespace due to a kernel regression in < v6.2 affecting AppArmor,
resulting in denials. Like other tests, avoid failing in case of permission
issues and handle it gracefully.
Yu Watanabe [Wed, 22 May 2024 15:03:42 +0000 (00:03 +0900)]
units: stop systemd-journald before systemd-soft-reboot.service
Typically, soft-reboot.target is never reached. So, without this change,
systemd-journald may be killed by PID1 on soft-reboot, and may cause
journal corruption.
Still I think this is the way to go. But the change was merged after -rc2,
and still discussion is continued. So, at least now let's revert it,
and do that after v256-final is released if approved.
F_OFD_SETLK is documented to only return EAGAIN, and F_SETLKW/F_OFD_SETLKW
are blocking operations so this logic doesn't apply to them in the
first place.
Hence, only automatically convert EACCES into EAGAIN for F_SETLK
operations, and propagate the original error in the other cases.
This is important because in some cases we catch permission errors
and gracefully fallback, which is not possible if the original error
is lost.
This is an issue in practice because, due to a kernel bug present
before v6.2, AppArmor denies locking on file descriptors to LXC
containers. We support all currently maintained LTS kernels,
including v6.1, where despite a lot of effort and attempts over almost
a year, the bugfix still hasn't been backported, as it is complex and
requires large changes to AppArmor.
On affected kernels, all services running with PrivateNetwork=yes
fail and do not recover, instead of the normal behaviour of gracefully
downgrading to PrivateNetwork=no.
The integration tests in the Debian CI fail due to this issue:
Yu Watanabe [Wed, 22 May 2024 03:26:58 +0000 (12:26 +0900)]
test: replace journal checkers with journalctl --follow + grep -m
Recently, for slow test environments, journalctl --sync was added to the
loop in the timeout. However, journalctl --sync may be slow in such systems,
and timeout easily triggered during syncing.
Hopefully, reading journal with --follow and grep the output with an expected
line should be efficient.
Yu Watanabe [Tue, 21 May 2024 20:24:05 +0000 (05:24 +0900)]
test: lock device during running cryptsetup
On running cryptsetup, udevd detects two inotify events for the
underlying device. Running the test on enough fast host, the expected
symlinks based on UUID and disk label are created by the second event.
During processing a uevent for a device, udevd disables the inotify
watch for the device. If the test runs on slow system, the second
inotify event may comes during a udev worker processing the synthesized
uevent triggered by the first inotify event. Hence, no synthesized
uevent for the second inotify event will be generated, and the expected
symlinks will be never created.
To prevent the issue, we need to lock the device during cryptsetup
command is running.
Luca Boccassi [Tue, 21 May 2024 17:19:04 +0000 (18:19 +0100)]
pkg/opensuse: switch to SHA1 fork
src.opensuse.org switched to SHA256, which means it can no longer be
used as a submodule in a SHA1 repository. Switch to a fork on Pagure
that gets synced across and is still SHA1:
Yu Watanabe [Tue, 21 May 2024 08:57:59 +0000 (17:57 +0900)]
test: wait a bit before stopping/killing service
Otherwise, when stopping the service, the last command may not be
started yet, and the service manager may not send SIGTERM signal to the
last command, but send SIGKILL on timeout.
===
May 21 08:23:24 test19-exit-cgroup.sh[437]: + disown
May 21 08:23:24 test19-exit-cgroup.sh[438]: + sleep infinity
May 21 08:23:24 test19-exit-cgroup.sh[437]: + systemd-notify --ready
May 21 08:23:24 test19-exit-cgroup.sh[437]: + sleep infinity
May 21 08:23:24 test19-exit-cgroup.sh[441]: + systemctl stop one
May 21 08:23:24 test19-exit-cgroup.sh[443]: + sleep infinity
(snip)
May 21 08:23:24 systemd[1]: one.service: Changed running -> stop-sigterm
May 21 08:23:24 systemd[1]: Stopping one.service - /tmp/test19-exit-cgroup.sh "systemctl stop one"...
May 21 08:23:24 systemd[1]: Received SIGCHLD from PID 441 (systemctl).
May 21 08:23:24 systemd[1]: Child 437 (bash) died (code=killed, status=15/TERM)
May 21 08:23:24 systemd[1]: one.service: Child 437 belongs to one.service.
May 21 08:23:24 systemd[1]: one.service: Main process exited, code=killed, status=15/TERM (success)
May 21 08:23:24 systemd[1]: Child 439 (bash) died (code=killed, status=15/TERM)
May 21 08:23:24 systemd[1]: one.service: Child 439 belongs to one.service.
May 21 08:23:24 systemd[1]: Child 441 (systemctl) died (code=killed, status=15/TERM)
May 21 08:23:24 systemd[1]: one.service: Child 441 belongs to one.service.
May 21 08:23:24 systemd[1]: Child 442 (bash) died (code=killed, status=15/TERM)
May 21 08:23:24 systemd[1]: one.service: Child 442 belongs to one.service.
(snip)
May 21 08:24:54 systemd[1]: one.service: State 'stop-sigterm' timed out. Killing.
May 21 08:24:54 systemd[1]: one.service: Killing process 443 (sleep) with signal SIGKILL.
May 21 08:24:54 systemd[1]: one.service: Changed stop-sigterm -> stop-sigkill
May 21 08:24:54 systemd[1]: Received SIGCHLD from PID 443 (sleep).
May 21 08:24:54 systemd[1]: Child 443 (sleep) died (code=killed, status=9/KILL)
May 21 08:24:54 systemd[1]: one.service: Child 443 belongs to one.service.
May 21 08:24:54 systemd[1]: one.service: Control group is empty.
May 21 08:24:54 systemd[1]: one.service: Failed with result 'timeout'.
May 21 08:24:54 systemd[1]: one.service: Service restart not allowed.
May 21 08:24:54 systemd[1]: one.service: Changed stop-sigkill -> failed
May 21 08:24:54 systemd[1]: one.service: Job 738 one.service/stop finished, result=done
May 21 08:24:54 systemd[1]: Stopped one.service - /tmp/test19-exit-cgroup.sh "systemctl stop one".
May 21 08:24:54 systemd[1]: one.service: Unit entered failed state.
May 21 08:24:54 systemd[1]: one.service: Releasing resources...
===
As requested in post-merge review
https://github.com/systemd/systemd/pull/32869#pullrequestreview-2068161094:
> NotInControl error is really about session controllers, but this here really
> is different.
Yu Watanabe [Mon, 20 May 2024 19:48:42 +0000 (04:48 +0900)]
test: wait for unit generated from /proc/self/mountinfo to be unloaded
Fixes https://github.com/systemd/systemd/issues/32680#issuecomment-2120974685.
===
May 21 02:45:08 TEST-74-AUX-UTILS.sh[2475]: + mountpoint /tmp/tmp.eaRV7lSbX2/mnt
May 21 02:45:08 TEST-74-AUX-UTILS.sh[2476]: /tmp/tmp.eaRV7lSbX2/mnt is not a mountpoint
May 21 02:45:08 TEST-74-AUX-UTILS.sh[2449]: + systemd-mount /dev/loop0 /tmp/tmp.eaRV7lSbX2/mnt
May 21 02:45:08 systemd-mount[2477]: Failed to start transient mount unit: Unit tmp-tmp.eaRV7lSbX2-mnt.mount was already loaded or has a fragment file.
===
The change in behavior was partly intentional, as I think
if both --wait and --pty are used, manually disconnecting
from PTY forwarder should not result in systemd-run exiting
with "Finished with ..." log. But we should check for
--wait here.