Yu Watanabe [Tue, 15 Nov 2022 13:59:01 +0000 (22:59 +0900)]
core/unit: fix log message
As you can see in the below, the dropped dependency Before=issue-24990.service
is not logged, but the dependency Before=test1.service which is not owned by
the units generated by the TEST-26 is logged.
Before:
systemd[1]: issue-24990.service: Dependency After=test1.service dropped, merged into issue-24990.service
systemd[1]: issue-24990.service: Dependency Before=test1.service dropped, merged into issue-24990.service
After:
systemd[1]: issue-24990.service: Dependency After=test1.service is dropped, as test1.service is merged into issue-24990.service.
systemd[1]: issue-24990.service: Dependency Before=issue-24990.service in test1.service is dropped, as test1.service is merged into issue-24990.service.
Richard Phibel [Mon, 5 Dec 2022 12:40:41 +0000 (13:40 +0100)]
log: Switch logging to runtime when FS becomes read-only
The journal has a mechanism to log to the runtime journal if it fails to
log to the system journal. This mechanism is not triggered when the file
system becomes read-only. We enable it here.
When appending an entry fails if shall_try_append_again returns true,
the journal is rotated. If the FS is read-only, rotation will fail and
s->system_journal will be set to NULL. After that, when find_journal
will try to open the journal since s->system_journal will be NULL, it
will open the runtime journal.
Before we supported pivot_root() nspawn used to make the rootfs shared
before setting up the mount tunnel. So it was safe for it to just turn
it into a dependent mount during setup.
However, we cannot do this anymore because of the requirements
pivot_root() has. After the pivot_root() we will make the rootfs shared
recursively. If we turned the mount tunnel into dependent mount before
mount_switch_root() this will have the consequence that it becomes a
shared mount within the same peer group as the rootfs. So no mounts will
propagate into the container from the host anymore.
To fix this we split setting up the mount tunnel and making it active
into two steps. Setting up the mount tunnel is performed before
mount_switch_root() and activating it afterwards. Note that this works
because turning a shared mount into a shared mount is a nop. IOW, no new
peer group will be allocated.
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
nspawn: mount temporary visible procfs and sysfs instance
In order to mount procfs and sysfs in an unprivileged container the
kernel requires that a fully visible instance is already present in the
target mount namespace. Mount one here so the inner child can mount its
own instances. Later we umount the temporary instances created here
before we actually exec the payload. Since the rootfs is shared the
umount will propagate into the container. Note, the inner child wouldn't
be able to unmount the instances on its own since it doesn't own the
originating mount namespace. IOW, the outer child needs to do this.
So far nspawn didn't run into this issue because it used MS_MOVE which
meant that the shadow mount tree pinned a procfs and sysfs instance
which the kernel would find. The shadow mount tree is gone with proper
pivot_root() semantics.
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
In order to support pivot_root() we need to move mount propagation
changes after the pivot_root(). While MS_MOVE requires the source mount
to not be a shared mount pivot_root() also requires the target mount to
not be a shared mount. This guarantees that pivot_root() doesn't leak
any mounts.
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Yu Watanabe [Mon, 5 Dec 2022 06:32:32 +0000 (15:32 +0900)]
acl-util: several cleanups
- add missing assertions,
- rename function arguments for storing result,
- rename variables which conflict our macros,
- always initialize function arguments for results on success.
Previously, chase_symlinks() always returned an absolute path, which
changed after 5bc244aaa90211ccd8370535274c266cdff6a1cb. This commit
fixes chase_symlinks() so it returns absolute paths all the time again.
Eric DeVolder [Mon, 21 Nov 2022 16:27:27 +0000 (11:27 -0500)]
pstore: fixes for dmesg.txt reconstruction
This patch fixes problems with the re-assembly of the dmesg
from the records stored in pstore.
The current code simply ignores the last 6 characters of the
file name to form a base record id, which then groups any
pstore files with this base id into the reconstructed dmesg.txt.
This approach fails when the following oops generated the
following in pstore:
-rw-------. 1 root root 1808 Oct 27 22:07 dmesg-efi-166692286101001
-rw-------. 1 root root 1341 Oct 27 22:07 dmesg-efi-166692286101002
-rw-------. 1 root root 1812 Oct 27 22:07 dmesg-efi-166692286102001
-rw-------. 1 root root 1820 Oct 27 22:07 dmesg-efi-166692286102002
-rw-------. 1 root root 1807 Oct 27 22:07 dmesg-efi-166692286103001
-rw-------. 1 root root 1791 Oct 27 22:07 dmesg-efi-166692286103002
-rw-------. 1 root root 1773 Oct 27 22:07 dmesg-efi-166692286104001
-rw-------. 1 root root 1801 Oct 27 22:07 dmesg-efi-166692286104002
-rw-------. 1 root root 1821 Oct 27 22:07 dmesg-efi-166692286105001
-rw-------. 1 root root 1809 Oct 27 22:07 dmesg-efi-166692286105002
-rw-------. 1 root root 1804 Oct 27 22:07 dmesg-efi-166692286106001
-rw-------. 1 root root 1817 Oct 27 22:07 dmesg-efi-166692286106002
-rw-------. 1 root root 1792 Oct 27 22:07 dmesg-efi-166692286107001
-rw-------. 1 root root 1810 Oct 27 22:07 dmesg-efi-166692286107002
-rw-------. 1 root root 1717 Oct 27 22:07 dmesg-efi-166692286108001
-rw-------. 1 root root 1808 Oct 27 22:07 dmesg-efi-166692286108002
-rw-------. 1 root root 1764 Oct 27 22:07 dmesg-efi-166692286109001
-rw-------. 1 root root 1765 Oct 27 22:07 dmesg-efi-166692286109002
-rw-------. 1 root root 1796 Oct 27 22:07 dmesg-efi-166692286110001
-rw-------. 1 root root 1816 Oct 27 22:07 dmesg-efi-166692286110002
-rw-------. 1 root root 1793 Oct 27 22:07 dmesg-efi-166692286111001
-rw-------. 1 root root 1751 Oct 27 22:07 dmesg-efi-166692286111002
-rw-------. 1 root root 1813 Oct 27 22:07 dmesg-efi-166692286112001
-rw-------. 1 root root 1786 Oct 27 22:07 dmesg-efi-166692286112002
-rw-------. 1 root root 1754 Oct 27 22:07 dmesg-efi-166692286113001
-rw-------. 1 root root 1752 Oct 27 22:07 dmesg-efi-166692286113002
-rw-------. 1 root root 1803 Oct 27 22:07 dmesg-efi-166692286114001
-rw-------. 1 root root 1759 Oct 27 22:07 dmesg-efi-166692286114002
-rw-------. 1 root root 1805 Oct 27 22:07 dmesg-efi-166692286115001
-rw-------. 1 root root 1787 Oct 27 22:07 dmesg-efi-166692286115002
-rw-------. 1 root root 1815 Oct 27 22:07 dmesg-efi-166692286116001
-rw-------. 1 root root 1771 Oct 27 22:07 dmesg-efi-166692286116002
-rw-------. 1 root root 1816 Oct 27 22:07 dmesg-efi-166692286117002
-rw-------. 1 root root 1388 Oct 27 22:07 dmesg-efi-166692286701003
-rw-------. 1 root root 1824 Oct 27 22:07 dmesg-efi-166692286702003
-rw-------. 1 root root 1795 Oct 27 22:07 dmesg-efi-166692286703003
-rw-------. 1 root root 1805 Oct 27 22:07 dmesg-efi-166692286704003
-rw-------. 1 root root 1813 Oct 27 22:07 dmesg-efi-166692286705003
-rw-------. 1 root root 1821 Oct 27 22:07 dmesg-efi-166692286706003
-rw-------. 1 root root 1814 Oct 27 22:07 dmesg-efi-166692286707003
-rw-------. 1 root root 1812 Oct 27 22:07 dmesg-efi-166692286708003
-rw-------. 1 root root 1769 Oct 27 22:07 dmesg-efi-166692286709003
-rw-------. 1 root root 1820 Oct 27 22:07 dmesg-efi-166692286710003
-rw-------. 1 root root 1755 Oct 27 22:07 dmesg-efi-166692286711003
-rw-------. 1 root root 1790 Oct 27 22:07 dmesg-efi-166692286712003
-rw-------. 1 root root 1756 Oct 27 22:07 dmesg-efi-166692286713003
-rw-------. 1 root root 1763 Oct 27 22:07 dmesg-efi-166692286714003
-rw-------. 1 root root 1791 Oct 27 22:07 dmesg-efi-166692286715003
-rw-------. 1 root root 1775 Oct 27 22:07 dmesg-efi-166692286716003
-rw-------. 1 root root 1820 Oct 27 22:07 dmesg-efi-166692286717003
The "reconstructed" dmesg.txt that resulted from the above contained
the following (ignoring actual contents, just providing the Part info):
The above is a interleaved mess of three dmesg dumps.
This patch fixes the above problems, and simplifies the dmesg
reconstruction process. The code now distinguishes between
records on EFI vs ERST, which have differently formatted
record identifiers. Using knowledge of the format of the
record ids allows vastly improved reconstruction process.
With this change in place, the above pstore records now
result in the following:
# ls -alR /var/lib/systemd/pstore 1666922861:
total 8
drwxr-xr-x. 4 root root 28 Nov 18 14:58 .
drwxr-xr-x. 7 root root 144 Nov 18 14:58 ..
drwxr-xr-x. 2 root root 4096 Nov 18 14:58 001
drwxr-xr-x. 2 root root 4096 Nov 18 14:58 002
1666922861/001:
total 100
drwxr-xr-x. 2 root root 4096 Nov 18 14:58 .
drwxr-xr-x. 4 root root 28 Nov 18 14:58 ..
-rw-------. 1 root root 1808 Oct 27 22:07 dmesg-efi-166692286101001
-rw-------. 1 root root 1812 Oct 27 22:07 dmesg-efi-166692286102001
-rw-------. 1 root root 1807 Oct 27 22:07 dmesg-efi-166692286103001
-rw-------. 1 root root 1773 Oct 27 22:07 dmesg-efi-166692286104001
-rw-------. 1 root root 1821 Oct 27 22:07 dmesg-efi-166692286105001
-rw-------. 1 root root 1804 Oct 27 22:07 dmesg-efi-166692286106001
-rw-------. 1 root root 1792 Oct 27 22:07 dmesg-efi-166692286107001
-rw-------. 1 root root 1717 Oct 27 22:07 dmesg-efi-166692286108001
-rw-------. 1 root root 1764 Oct 27 22:07 dmesg-efi-166692286109001
-rw-------. 1 root root 1796 Oct 27 22:07 dmesg-efi-166692286110001
-rw-------. 1 root root 1793 Oct 27 22:07 dmesg-efi-166692286111001
-rw-------. 1 root root 1813 Oct 27 22:07 dmesg-efi-166692286112001
-rw-------. 1 root root 1754 Oct 27 22:07 dmesg-efi-166692286113001
-rw-------. 1 root root 1803 Oct 27 22:07 dmesg-efi-166692286114001
-rw-------. 1 root root 1805 Oct 27 22:07 dmesg-efi-166692286115001
-rw-------. 1 root root 1815 Oct 27 22:07 dmesg-efi-166692286116001
-rw-r-----. 1 root root 28677 Nov 18 14:58 dmesg.txt
1666922861/002:
total 104
drwxr-xr-x. 2 root root 4096 Nov 18 14:58 .
drwxr-xr-x. 4 root root 28 Nov 18 14:58 ..
-rw-------. 1 root root 1341 Oct 27 22:07 dmesg-efi-166692286101002
-rw-------. 1 root root 1820 Oct 27 22:07 dmesg-efi-166692286102002
-rw-------. 1 root root 1791 Oct 27 22:07 dmesg-efi-166692286103002
-rw-------. 1 root root 1801 Oct 27 22:07 dmesg-efi-166692286104002
-rw-------. 1 root root 1809 Oct 27 22:07 dmesg-efi-166692286105002
-rw-------. 1 root root 1817 Oct 27 22:07 dmesg-efi-166692286106002
-rw-------. 1 root root 1810 Oct 27 22:07 dmesg-efi-166692286107002
-rw-------. 1 root root 1808 Oct 27 22:07 dmesg-efi-166692286108002
-rw-------. 1 root root 1765 Oct 27 22:07 dmesg-efi-166692286109002
-rw-------. 1 root root 1816 Oct 27 22:07 dmesg-efi-166692286110002
-rw-------. 1 root root 1751 Oct 27 22:07 dmesg-efi-166692286111002
-rw-------. 1 root root 1786 Oct 27 22:07 dmesg-efi-166692286112002
-rw-------. 1 root root 1752 Oct 27 22:07 dmesg-efi-166692286113002
-rw-------. 1 root root 1759 Oct 27 22:07 dmesg-efi-166692286114002
-rw-------. 1 root root 1787 Oct 27 22:07 dmesg-efi-166692286115002
-rw-------. 1 root root 1771 Oct 27 22:07 dmesg-efi-166692286116002
-rw-------. 1 root root 1816 Oct 27 22:07 dmesg-efi-166692286117002
-rw-r-----. 1 root root 30000 Nov 18 14:58 dmesg.txt
1666922867:
total 4
drwxr-xr-x. 3 root root 17 Nov 18 14:58 .
drwxr-xr-x. 7 root root 144 Nov 18 14:58 ..
drwxr-xr-x. 2 root root 4096 Nov 18 14:58 003
1666922867/003:
total 104
drwxr-xr-x. 2 root root 4096 Nov 18 14:58 .
drwxr-xr-x. 3 root root 17 Nov 18 14:58 ..
-rw-------. 1 root root 1388 Oct 27 22:07 dmesg-efi-166692286701003
-rw-------. 1 root root 1824 Oct 27 22:07 dmesg-efi-166692286702003
-rw-------. 1 root root 1795 Oct 27 22:07 dmesg-efi-166692286703003
-rw-------. 1 root root 1805 Oct 27 22:07 dmesg-efi-166692286704003
-rw-------. 1 root root 1813 Oct 27 22:07 dmesg-efi-166692286705003
-rw-------. 1 root root 1821 Oct 27 22:07 dmesg-efi-166692286706003
-rw-------. 1 root root 1814 Oct 27 22:07 dmesg-efi-166692286707003
-rw-------. 1 root root 1812 Oct 27 22:07 dmesg-efi-166692286708003
-rw-------. 1 root root 1769 Oct 27 22:07 dmesg-efi-166692286709003
-rw-------. 1 root root 1820 Oct 27 22:07 dmesg-efi-166692286710003
-rw-------. 1 root root 1755 Oct 27 22:07 dmesg-efi-166692286711003
-rw-------. 1 root root 1790 Oct 27 22:07 dmesg-efi-166692286712003
-rw-------. 1 root root 1756 Oct 27 22:07 dmesg-efi-166692286713003
-rw-------. 1 root root 1763 Oct 27 22:07 dmesg-efi-166692286714003
-rw-------. 1 root root 1791 Oct 27 22:07 dmesg-efi-166692286715003
-rw-------. 1 root root 1775 Oct 27 22:07 dmesg-efi-166692286716003
-rw-------. 1 root root 1820 Oct 27 22:07 dmesg-efi-166692286717003
-rw-r-----. 1 root root 30111 Nov 18 14:58 dmesg.txt
Furthemore, pstore records on ERST are now able to accurately
identify the change in timestamp sequence in order to start a
new dmesg.txt, as needed.
Ivan Shapovalov [Tue, 29 Nov 2022 12:20:48 +0000 (16:20 +0400)]
import: wire up SYSTEMD_IMPORT_BTRFS_{SUBVOL,QUOTA} to importd
Btrfs quotas are actually being enabled in systemd-importd via
setup_machine_directory(), not in systemd-{import,pull} where those
environment variables are checked. Therefore, also check them in
systemd-importd and avoid enabling quotas if requested by the user.
Ivan Shapovalov [Sat, 3 Dec 2022 16:31:36 +0000 (20:31 +0400)]
machine-pool: simplify return values from setup_machine_directory()
Non-negative return values of setup_machine_directory() were never used
and never had clear meaning, so do not distinguish between various
non-error conditions and just return 0 in all cases.
Mike Yuan [Sun, 27 Nov 2022 13:18:44 +0000 (21:18 +0800)]
systemctl: allow suppress the warning of no install info using --no-warn
In cases like packaging scripts, it might be desired to use
enable/disable on units without install info. So, adding an
option '--no-warn' to suppress the warning.
Mike Yuan [Fri, 18 Nov 2022 07:43:34 +0000 (15:43 +0800)]
systemctl: warn if trying to disable a unit with no install info
Trying to disable a unit with no install info is mostly useless, so
adding a warning like we do for enable (with the new dbus method
'DisableUnitFilesWithFlagsAndInstallInfo()'). Note that it would
still find and remove symlinks to the unit in /etc, regardless of
whether it has install info or not, just like before. And if there are
actually files to remove, we suppress the warning.
manager: do not append '\n' when writing sysctl settings
When booting with debug logs, we print:
Setting '/proc/sys/fs/file-max' to '9223372036854775807
'
Setting '/proc/sys/fs/nr_open' to '2147483640
'
Couldn't write fs.nr_open as 2147483640, halving it.
Setting '/proc/sys/fs/nr_open' to '1073741816
'
Successfully bumped fs.nr_open to 1073741816
The strange formatting is because we explicitly appended a newline in those two
places. It seems that the kernel doesn't care. In fact, we have a few dozen other
writes to sysctl where we don't append a newline. So let's just drop those here
too, to make the code a bit simpler and avoid strange output in the logs.
test: check if we can use SHA1 MD for signing before using it
Some distributions have started phasing out SHA1, which breaks
the systemd-measure test case in its current form. Let's make sure we
can use SHA1 for signing beforehand to mitigate this.
Spotted on RHEL 9, where SHA1 signatures are disallowed by [0]:
```
openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 -out "/tmp/pcrsign-private.pem"
...
openssl rsa -pubout -in "/tmp/pcrsign-private.pem" -out "/tmp/pcrsign-public.pem"
writing RSA key
/usr/lib/systemd/systemd-measure sign --current --bank=sha1 --private-key="/tmp/pcrsign-private.pem" --public-key="/tmp/pcrsign-public.pem"
Failed to initialize signature context.
```
Daan De Meyer [Fri, 2 Dec 2022 09:44:56 +0000 (10:44 +0100)]
mkosi: Drop explicit Format=
Once mkosi migrates to systemd-repart, only "disk" will be supported
for making disk images with mkosi and the filesystem will have to be
specified in repart partition definition files. To accomodate this
change, let's remove the explicit Format= assignment which means we'll
default to a disk image with ext4 until we add our own mkosi.repart/
directory.
When an outdated address or route is passed to link_request_address()/route(),
then they return 0 and the address or route will not be assigned. Such
situation can happen when we receive RA with zero lifetime. In that
case, we should not unset Link.ndisc_configured flag, otherwise even
no new address nor route will assigned, the interface will enter to the
configuring state, and unnecessary DBus property change is emit and the state
file will be updated. That makes resolved or timesyncd triggered to
reconfigure the interface.
dissect-image: probe file system via main block device fd/image file fd
let's make sure we can probe file systems also when unprivileged:
instead of probing the partition block devices for file system
signatures, let's go via the original "whole" fd.
libblkid makes this easy actually, as it allows us to specify the
offset/size of the area to probe. And we have the partition
offsets/sizes anyway, so it's trivial for us to make use of.
This thus enables fs probing also when lacking privs and operating on
naked regular files without loopback devices or anything like this.
test-loop-block: let's explicitly flush buffer cache on whole block device
Let's explicitly flush the kernel's buffer cache on the whole block
device once we ran "mkfs". This is necessary, because partition and
whole block devices maintain separate buffer caches, and thus writing
to one will not be visible on the other if cached there already, until
the latter's cache is explicitly flushed.
This is preparation for later adding support for probing file sytems
also if we have no open partition block devices, and hence want to use
the whole block device instead.
test-loop-block: also test dissection without ADD/PIN of partition block devices
Let's extend the test further, and try the codepaths where we do not
pin/add the partition block devices (i.e. which is the codepaths we use
when running without privs)
blkid-util: define enum for blkid_do_safeprobe() return values
libblkid really should define an enum for this on its own, but it
currently doesn't and returns literal numeric values. Lets make this
more readable by adding our own symbolic names via an enum.
Daan De Meyer [Wed, 30 Nov 2022 16:04:14 +0000 (17:04 +0100)]
repart: Ignore copy failures for unsupported file types
e.g. vfat doesn't support symlinks, sockets, fifos, etc so let's ignore
any copy failures related to unsupported file types when populating
filesystems.
Curently, these two flags were implied by dissect_loop_device(), but
that's not right, because this means systemd-gpt-auto-generator will
dissect the root block device with these flags set and that's not
desirable: the generator should not cause the partition devices to be
created (we don't intend to use them right-away after all, but expect
udev to find/probe them first, and then mount them though .mount units).
And there's no point in opening the partition devices, since we do not
intend to mount them via fds either.
Hence, rework this: instead of implying the flags, specify them
explicitly.
While we are at it, let's also rename the flags to make them more
descriptive:
DISSECT_IMAGE_MANAGE_PARTITION_DEVICES becomes
DISSECT_IMAGE_ADD_PARTITION_DEVICES, since that's really all this does:
add the partition devices via BLKPG.
DISSECT_IMAGE_OPEN_PARTITION_DEVICES becomes
DISSECT_IMAGE_PIN_PARTITION_DEVICES, since we not only open the devices,
but keep the devices open continously (i.e. we "pin" them).
Also, drop the DISSECT_IMAGE_BLOCK_DEVICE combination flag, since it is
misleading, i.e. it suggests it was appropriate to specify on all
dissected blocking devices, but that's precisely not the case, see the
systemd-gpt-auto-generator case. My guess is that the confusion around
this was actually the cause for this bug we are addressing here.
Ray Strode [Wed, 30 Nov 2022 19:07:29 +0000 (14:07 -0500)]
terminal-util: Set OPOST when setting ONLCR
reset_terminal_fd sets certain minimum required terminal attributes
that systemd relies on.
One of those attributes is `ONLCR` which ensures that when a new line
is sent to the terminal, that the cursor not only moves to the next
line, but also moves to the very beginning of that line.
In order for `ONLCR` to work, the terminal needs to perform output
post-processing. That requires an additional attribute, `OPOST`,
which reset_terminal_fd currently fails to ensure is set.
In most cases `OPOST` (and `ONLCR` actually) are both set anyway, so
it's not an issue, but it could be a problem if, e.g., the terminal was
put in raw mode by a program and the program unexpectedly died before
restoring settings.
This commit ensures when `ONLCR` is set `OPOST` is set too, which is
the only thing that really makes sense to do.
Daan De Meyer [Wed, 30 Nov 2022 16:01:09 +0000 (17:01 +0100)]
copy: Add COPY_GRACEFUL_WARN
When copying between filesystems, sometimes the target filesystem
might not support symlinks/fifos/sockets/... and we want to log and
ignore any failures to copy such files when copying. Let's introduce
a new flag to enable this behavior.
Frantisek Sumsal [Wed, 30 Nov 2022 15:13:19 +0000 (16:13 +0100)]
test: give the container time to properly shut down on exception
Otherwise the `terminate()` method sends SIGKILL rather quickly (~0.3s),
which then leaves a dangling scope on the host system, breaking further
test executions.