Ronan Pigott [Wed, 21 Jun 2023 02:47:47 +0000 (19:47 -0700)]
systemd-analyze: allow --quiet for condition checks
I figure these messages are rather unnecessary, so let the user quiet
them with the existing --quiet flag if desired. Makes systemd-analyze
condition a little more ergonomic in scripts.
Romain Geissler [Tue, 20 Jun 2023 16:06:31 +0000 (16:06 +0000)]
elf-util: discard PT_LOAD segment early based on the start address.
When looking for the ELF header of a given module, we iterate over all
the PT_LOAD segments of the core dump and use the first one for which we
can parse package metadata. The start address is never taken into
account, so nothing guarantees that we actually parse the ELF header of
the module we are currently iterating on.
This was tested like this:
- Create a core dump using sleep on a fedora 37 container, with an
explicit LD_PRELOAD of a library having a valid package metadata:
- Then from a fedora 38 container with systemd installed, the resulting
core dump has been passed to systemd-coredump with and without this
patch. Without this patch, we get:
Module /usr/bin/sleep from rpm bash-5.2.15-3.fc38.x86_64
Module /usr/lib64/libtinfo.so.6.3 from rpm coreutils-9.1-8.fc37.x86_64
Module /usr/lib64/libc.so.6 from rpm coreutils-9.1-8.fc37.x86_64
Module /usr/lib64/libreadline.so.8.2 from rpm coreutils-9.1-8.fc37.x86_64
Module /usr/lib64/ld-linux-x86-64.so.2 from rpm coreutils-9.1-8.fc37.x86_64
While with this patch we get:
Module /usr/bin/sleep from rpm bash-5.2.15-3.fc38.x86_64
Module /usr/lib64/libtinfo.so.6.3 from rpm ncurses-6.3-5.20220501.fc37.x86_64
Module /usr/lib64/libreadline.so.8.2 from rpm readline-8.2-2.fc37.x86_64
So the package metadata reported by systemd-coredump when the module
files are not found on the host (i.e. the case of a crash inside a
container) is now correct. The inconsistency of the first module in the
above example (sleep is indeed not provided by the bash package) can be
ignored as it is a consequence of how this was tested.
In addition, this also fixes the performance issue of systemd-coredump
when the crashing process uses a large number of shared libraries that
have no package metadata, as reported in
https://sourceware.org/pipermail/elfutils-devel/2023q2/006225.html.
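A minimal sketch of the early-discard idea, written in Python rather than
the C of elf-util. The LoadSegment type, the function names and the sample
addresses are hypothetical, and the sketch assumes the check is whether the
module's start address falls inside the segment:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoadSegment:
    vaddr: int               # start address of the PT_LOAD segment
    memsz: int               # size of the segment in memory
    package: Optional[str]   # package metadata parseable from it, if any


def first_with_metadata(segments):
    """Old behaviour: take the first segment with parseable metadata."""
    for seg in segments:
        if seg.package is not None:
            return seg.package
    return None


def metadata_for_module(segments, module_start):
    """New behaviour: skip segments that do not contain the start address."""
    for seg in segments:
        if not (seg.vaddr <= module_start < seg.vaddr + seg.memsz):
            continue  # discarded early, nothing is parsed for this segment
        return seg.package
    return None


# Hypothetical core dump with two modules (addresses are made up).
segments = [
    LoadSegment(vaddr=0x1000, memsz=0x1000, package="coreutils"),
    LoadSegment(vaddr=0x4000, memsz=0x2000, package="ncurses"),
]
old_result = first_with_metadata(segments)
new_result = metadata_for_module(segments, 0x4000)
```

Skipping non-matching segments before any parsing is also what avoids the
repeated work when many modules carry no metadata at all.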
Daan De Meyer [Tue, 6 Jun 2023 15:44:09 +0000 (17:44 +0200)]
core: Add RootEphemeral= setting
This setting allows services to run in an ephemeral copy of the root
directory or root image. To make sure the ephemeral copies are always
cleaned up, we add a tmpfiles snippet to unconditionally clean up
/var/lib/systemd/ephemeral. To prevent in-use ephemeral copies from
being cleaned up by tmpfiles, we use the newly added COPY_LOCK_BSD
and BTRFS_SNAPSHOT_LOCK_BSD flags to take a BSD lock on the ephemeral
copies, which instructs tmpfiles to not touch those ephemeral copies as
long as the BSD lock is held.
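The locking protocol can be sketched with flock(2) from Python on Linux.
This is an illustration only: try_cleanup() is a hypothetical stand-in for
the cleaner, and the use of an exclusive lock on both sides is an
assumption, not something the commit message states:

```python
import fcntl
import os
import tempfile


def try_cleanup(path):
    """Return True if the directory may be cleaned up, False if it is locked."""
    fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # Non-blocking attempt: fails at once if someone else holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return False
    else:
        return True
    finally:
        os.close(fd)  # closing the descriptor also drops our lock


with tempfile.TemporaryDirectory() as copy:
    # The service side keeps a BSD lock on its ephemeral copy while in use.
    holder = os.open(copy, os.O_RDONLY | os.O_DIRECTORY)
    fcntl.flock(holder, fcntl.LOCK_EX)
    while_in_use = try_cleanup(copy)

    # Once the holder goes away, the lock is gone and cleanup may proceed.
    os.close(holder)
    after_release = try_cleanup(copy)
```

Because BSD locks belong to the open file description, the lock vanishes
automatically when the holder exits, so a crashed service cannot leave a
copy protected forever.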
It's highly interesting to see if tools such as systemd-sysupdate
consider a version valid, hence let's output that too (though
gracefully, not fatally)
string-util: move version_is_valid() into generic code
While we are at it, replace the sloppy use of filename_is_valid() by the
less sloppy filename_part_is_valid() (as added by the preceding
commit), since we don't want to be too restrictive here. (After all,
version strings invalid as standalone filenames might be valid as part
of filenames, and hence we should allow them.)
Add a helper filename_part_is_valid() which does half of what
filename_is_valid() does: it checks for valid chars and length, but does
not filter out ".", ".." and "", as these are OK as parts of filenames,
just not alone.
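The split between the two helpers might look roughly like this in Python.
The function names mirror the C ones, but the exact character rules (no "/"
and no NUL) and the 255 byte limit are assumptions made for the sketch:

```python
NAME_MAX = 255  # assumed length limit, in bytes


def filename_part_is_valid(s):
    """Only check characters and length; ".", ".." and "" are accepted."""
    if len(s.encode()) > NAME_MAX:
        return False
    return "/" not in s and "\0" not in s


def filename_is_valid(s):
    """Additionally refuse names that are not usable as standalone filenames."""
    if s in ("", ".", ".."):
        return False
    return filename_part_is_valid(s)
```

With this split, ".." is rejected as a filename but accepted as a part of
one, while "a/b" is rejected by both helpers.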
hostnamectl: show age of firmware as time span, too
This converts the date into a timespan relative to the current time
and outputs it. It marks it yellow if older than two years, since
old firmware is probably a security risk. We don't make it red, since we
don't know that for sure.
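A rough Python sketch of the age display. The output format, the
365-day year and the ANSI escape sequences are assumptions chosen for
illustration, not what hostnamectl actually prints:

```python
from datetime import datetime, timedelta

TWO_YEARS = timedelta(days=2 * 365)  # assumed approximation of "two years"
ANSI_YELLOW = "\x1b[33m"
ANSI_RESET = "\x1b[0m"


def format_firmware_age(firmware_date, now):
    """Render the firmware age, highlighted in yellow when older than two years."""
    age = now - firmware_date
    text = f"{age.days // 365}y {age.days % 365}d"
    if age > TWO_YEARS:
        return f"{ANSI_YELLOW}{text}{ANSI_RESET}"
    return text


old = format_firmware_age(datetime(2020, 1, 1), datetime(2023, 6, 21))
recent = format_firmware_age(datetime(2023, 1, 1), datetime(2023, 6, 21))
```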
Daan De Meyer [Thu, 23 Mar 2023 12:48:42 +0000 (13:48 +0100)]
namespace: Load sidecar verity settings in apply_mount_namespace()
Let's reduce the argument count of setup_namespace() a bit by loading
the sidecar verity settings in apply_mount_namespace(). This will also
make it possible to pass file descriptors to the root image/directory
into setup_namespace(). Before, this wasn't possible because the
verity settings logic looks for sidecar files next to the root image,
which requires the path to be available.
hostnamed: when parsing day/month of firmware date, force decimal parsing
safe_atou() by default determines the base from the prefix 0x, 0b, 0o
and for compat with just 0 for octal. This is not what we want here,
since the date components are padded with zeroes yet still decimal.
Hence force decimal parsing (and while we are at it, prohibit a couple
of unexpected decorations).
Without this we'd fail to parse the 8th and 9th day of each month, as
well as August and September of every year, because these look like octal
numbers but cannot actually be parsed as such.
Let's change the testcase to check for a date that exposes this
behaviour.
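The failure mode can be modelled in a few lines of Python. parse_auto_base()
is a hypothetical stand-in that imitates the prefix rules described above; it
is not safe_atou() itself:

```python
def parse_auto_base(s):
    """Pick the base from the prefix; a bare leading 0 means octal."""
    lower = s.lower()
    if lower.startswith("0x"):
        return int(s[2:], 16)
    if lower.startswith("0b"):
        return int(s[2:], 2)
    if lower.startswith("0o"):
        return int(s[2:], 8)
    if len(s) > 1 and s.startswith("0"):
        return int(s[1:], 8)  # "08" and "09" are not valid octal
    return int(s, 10)


def parse_decimal(s):
    """Forced decimal parsing: zero padding is harmless."""
    return int(s, 10)


def parses(func, s):
    """Return True if func accepts the string, False if it rejects it."""
    try:
        func(s)
    except ValueError:
        return False
    return True


# Zero-padded day/month fields as they appear in a firmware date.
auto_ok = [field for field in ("07", "08", "09", "10") if parses(parse_auto_base, field)]
decimal_ok = [field for field in ("07", "08", "09", "10") if parses(parse_decimal, field)]
```

Only "08" and "09" are rejected by the prefix-based variant, which is why
the bug shows up on exactly those days and months.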
rules: split out DMI related rules from udev-default.rules
The DMI rules were so far guarded by an ACTION=="add" rule, but that
doesn't really make sense for setting properties (only for setting
access modes/ownership of nodes).
Hence let's move this into its own file, that guards properly on
ACTION!="remove".
Before this change the hardware vendor/model info would be dropped
whenever the device was retriggered.
The file long ceased to be exclusively about configuration of the sleep
operation. It contains many many calls for other purposes, hence give it
a more generic name.
sleep-config: reduce scope of DMI object path a bit
We need this in a single function only, hence move it there, and make it
a static field so that it has local scope.
While we are at it, rename s/readsize to buf/bufsize, to make
relationship clear. In particular as the data read is actually binary
and "s" hence a misnomer, since it suggests it was a string.
Daan De Meyer [Tue, 28 Mar 2023 10:32:51 +0000 (12:32 +0200)]
btrfs-util: Add BTRFS_SNAPSHOT_LOCK_BSD
When making ephemeral snapshots of subvolumes whose cleanup depends on
whether they're locked or not, it's necessary to have the lock from the
very beginning, so let's support that with a new BTRFS_SNAPSHOT_LOCK_BSD
flag.
This was badly named, given that the path most likely refers not to a
device but to a regular file. Hence let's be more precise with naming.
(.device kinda suggests this was an sd_device object of sorts, but it
really isn't.)
sleep-config: don't use 'device_id' moniker for a dev_t entity
We usually call dev_t entities "devnum" or "devno". That's redundant
enough, let's not call this "device_id". In particular as that's
something else (in udev context).
sleep-config: replace useless fstat() by useful fd_verify_regular()
For some reason there was an fstat() call here whose result was
entirely ignored. Let's remove it. Let's add a call to
fd_verify_regular() instead, because this is a code path for swap files,
hence let's make sure we actually operate on a file, and nothing else.
licunlong [Mon, 19 Jun 2023 13:56:33 +0000 (21:56 +0800)]
basic/env-file: also change to state PRE_KEY if we see NEWLINE in state COMMENT_ESCAPE
When we see a "\" in COMMENT state, we change the state to COMMENT_ESCAPE. When we then
read the next character, we reset the state to COMMENT, but this character is not
dispatched. Usually that character is NEWLINE, in which case we stay in COMMENT state
until we find the next NEWLINE.
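A toy state machine in Python shows the effect. It is a heavily simplified
model of the parser: the LINE state and the parse_lines() function are
inventions for this sketch, only the COMMENT, COMMENT_ESCAPE and PRE_KEY
state names come from the description above:

```python
def parse_lines(text, fixed=True):
    """Return the non-comment lines of text."""
    state = "PRE_KEY"
    current = []
    out = []
    for c in text:
        if state == "PRE_KEY":
            if c == "#":
                state = "COMMENT"
            elif c not in " \t\n":
                current = [c]
                state = "LINE"
        elif state == "LINE":
            if c == "\n":
                out.append("".join(current))
                state = "PRE_KEY"
            else:
                current.append(c)
        elif state == "COMMENT":
            if c == "\\":
                state = "COMMENT_ESCAPE"
            elif c == "\n":
                state = "PRE_KEY"
        elif state == "COMMENT_ESCAPE":
            if fixed and c == "\n":
                state = "PRE_KEY"  # the change: NEWLINE ends the comment
            else:
                state = "COMMENT"  # old behaviour: the NEWLINE is swallowed
    if state == "LINE":
        out.append("".join(current))
    return out


sample = "# note \\\nA=1\nB=2\n"
before = parse_lines(sample, fixed=False)
after = parse_lines(sample, fixed=True)
```

In the old model the assignment following the comment is lost because the
comment only ends at the second NEWLINE; with the change both assignments
are seen.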
Frantisek Sumsal [Mon, 19 Jun 2023 15:12:37 +0000 (17:12 +0200)]
journal-remote: make MHD_OPTION_EXTERNAL_LOGGER the first option
To suppress a warning on journal-remote startup:
systemd-journal-remote[691]: microhttpd:
MHD_OPTION_EXTERNAL_LOGGER is not the first option specified for
the daemon. Some messages may be printed by the standard MHD
logger.
This change is incorrect as we don't want to mark the PID as invalid but
only mark it as dead.
The change in question also breaks user level socket activation for
`podman.service` as the termination of the main `podman system service`
process is not properly handled, causing any application accessing the
socket to hang.
This is because the user-level `podman.service` unit also hosts two
non-main processes: `rootlessport` and `rootlessport-child` which causes
the `cgroup_good` check to still succeed.
The original submitter of this commit is recommended to find another
more correct way to fix the cgroupsv1 issue on CentOS 8.
man: place options in some limited form of subsections
Let's visually separate the options associated with cpu, io, memory, …
in subsections.
This patch tries to be minimal. It just adds the section titles, and
does minimal reordering to make sure the options on the same kind of
resource are placed close to each other.
Sam Morris [Mon, 19 Jun 2023 11:30:43 +0000 (12:30 +0100)]
Resource control manpage fixup (#28046)
The order of the description of each item should match the order in
which they are declared. Un-document effect of deprecated non-unified
CGroup hierarchy on DefaultCPUAccounting=. Mention that the default value
for DefaultCPUAccounting= is affected by the kernel version.
Gibeom Gwon [Wed, 19 Oct 2022 09:12:29 +0000 (18:12 +0900)]
homework: resize to maximum disk space if disk size is not specified
If the backing storage is LUKS2 on a block device, auto resize mode
is enabled, and disk size is not specified, resize the partition to
the maximum expandable size.
Daan De Meyer [Thu, 15 Jun 2023 15:31:23 +0000 (17:31 +0200)]
mkosi: Update to latest
We now run repart before starting systemd-nspawn to make sure that
the root partition is also generated when we boot the image in a
container instead of a VM.
To make sure we start from scratch for both the container boot and
the VM boot, we also enable Ephemeral to make sure all changes to
the image are ephemeral.
Luca Boccassi [Fri, 16 Jun 2023 21:31:04 +0000 (22:31 +0100)]
journal: avoid infinite recursion when closing bad journal FD
When trying to log, if we fail we try to close the journal FD. If
it is bad, safe_close() will fail and assert, which will try to log,
which will fail, which will try to close the journal FD...
Infinite recursion looks very pretty live in gdb, but let's avoid
that by immediately invalidating the journal FD before closing it.
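The pattern of invalidating a handle before closing it can be modelled in
Python. JournalClient is a hypothetical toy in which every send fails and
every close failure is logged again; it is not the journal client code:

```python
class JournalClient:
    """Toy model of a logger whose journal fd has gone bad."""

    def __init__(self, fd, invalidate_first):
        self.fd = fd
        self.invalidate_first = invalidate_first
        self.close_calls = 0

    def log(self, message):
        if self.fd < 0:
            return  # no journal connection: nothing to send, nothing to close
        # Sending is modelled as always failing, so we drop the connection.
        if self.invalidate_first:
            fd, self.fd = self.fd, -1  # mark invalid before closing
            self._close(fd)
        else:
            self._close(self.fd)       # fd still looks valid while we close it
            self.fd = -1

    def _close(self, fd):
        self.close_calls += 1
        # Closing a bad fd fails, and reporting that failure logs again.
        self.log(f"failed to close fd {fd}")


def run(invalidate_first):
    client = JournalClient(fd=3, invalidate_first=invalidate_first)
    try:
        client.log("hello")
    except RecursionError:
        return "recursed"
    return client.close_calls


with_fix = run(invalidate_first=True)
without_fix = run(invalidate_first=False)
```

With the fd invalidated first, the nested log call finds no connection and
returns at once, so the close is attempted exactly one time.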
Jan Janssen [Sun, 18 Jun 2023 08:54:20 +0000 (10:54 +0200)]
boot: Improve device_path_to_str_internal()
The UEFI spec has a generic `Path` node representation that can be used
for device path nodes that are unknown. So we can use that instead of
giving up when we see a node other than FilePath.
This also simplifies the FilePath case by just using xasprintf(). The
code is really just a fallback for silly firmware that does not
implement EFI_DEVICE_PATH_TO_TEXT_PROTOCOL (looking at you, Apple).
The correctness of this was tested by round-tripping it through
EFI_DEVICE_PATH_FROM_TEXT_PROTOCOL, which yielded a device path
identical to our input path.
Frantisek Sumsal [Fri, 16 Jun 2023 17:05:57 +0000 (19:05 +0200)]
socket-activate: make a copy of the command name and arguments
When we call safe_fork() with the first argument set (process name), we
call rename_process() that zeroes out saved argv (that was saved by
save_argc_argv() in the main func defined by DEFINE_MAIN_FUNC()). In this
case this means that with --accept both the target executable name and
its arguments will be empty strings:
```
$ systemd-socket-activate --accept --listen 1111 cat &
Listening on [::]:1111 as 3.
$ curl localhost:1111
Communication attempt on fd 3.
Connection from 127.0.0.1:52948 to [::ffff:127.0.0.1]:1111
Spawned cat (cat) as PID 10576.
Execing ()
Failed to execp (): No such file or directory
Child 10576 died with code 1
curl: (56) Recv failure: Connection reset by peer
```
Let's make a copy of the necessary arguments beforehand and use it
instead to fix this.
Kiran Vemula [Fri, 16 Jun 2023 12:04:37 +0000 (17:34 +0530)]
resolved: Initialize until_valid while storing negative/NXDOMAIN response in the cache
Initialize until_valid properly for negative responses, so that cached
negative responses can be used to answer queries before contacting the
upstream server.
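Why the expiry timestamp matters can be sketched with a tiny negative cache
in Python. The NegativeCache class and its methods are hypothetical, and the
sketch assumes that an unset until_valid (zero here) makes an entry look
expired:

```python
class NegativeCache:
    """Maps a name to the time until which its NXDOMAIN answer stays valid."""

    def __init__(self):
        self.entries = {}

    def put(self, name, now, ttl, set_until_valid=True):
        # Without initialization the entry carries a zero expiry time.
        self.entries[name] = now + ttl if set_until_valid else 0

    def is_cached_negative(self, name, now):
        until_valid = self.entries.get(name)
        return until_valid is not None and now < until_valid


cache = NegativeCache()
cache.put("missing.example", now=100, ttl=30)
cache.put("broken.example", now=100, ttl=30, set_until_valid=False)

hit_while_valid = cache.is_cached_negative("missing.example", now=110)
hit_after_ttl = cache.is_cached_negative("missing.example", now=131)
hit_uninitialized = cache.is_cached_negative("broken.example", now=110)
```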