Yu Watanabe [Thu, 10 Aug 2023 04:58:54 +0000 (13:58 +0900)]
core/namespace: reimplement mount_private_sysfs() in the same logic to mount private procfs
Previously, mount_private_sysfs() was implemented by using open_tree()
and move_mount() to keep submounts. But these syscalls are slightly new
and supported since kernel version 5.2.
We already do the same thing for /proc/, but without the new syscalls.
Let's use the same logic to mount private procfs. Then, we can mount
new instance of sysfs with older kernels.
Yu Watanabe [Tue, 22 Aug 2023 07:06:01 +0000 (16:06 +0900)]
network: several follow-ups for TCP-RTO setting
- rename TCPRetransmissionTimeOutSec= -> TCPRetransmissionTimeoutSec,
- refuse infinity,
- fix the input value verifier (USEC_PER_SEC -> USEC_PER_MSEC),
- use DIV_ROUND_UP() when assigning the value.
Yu Watanabe [Sat, 12 Aug 2023 06:18:41 +0000 (15:18 +0900)]
core: do not leak mount for credentials directory if mount namespace is enabled
Since kernel v5.2, open_tree() and move_mount() are added. If a service
loads or sets credentials, then let's try to clone the mount that contains
credentials with open_tree(), then mount it after a (private) mount
namespace is initialized for the service. Then, we can setup a mount for
credentials directory without leaking it to the main shared mount
namespace.
With this change, the credentials for services that request their own
private mount namespace become much much safer. And, the number of mount
events triggered by setting up credential directories can be decreased.
Unfortunately, this does not 'fix' the original issue #25527, as the
reported service does not requests private mount namespace, but the
situation should be better now.
verbs: make a helpful suggestion when user types unrecognized verb
I have been mistyping commands too often myself, and I think the tools
could simply be more helpful, by suggesting to me what I probably wanted
to write. Copy/Paste FTW, after all!
"static inline" makes sense in .h files. But in .c files it's useless
decoration, the compiler should just make its own decisions there, and
it can do that.
hence, replace all remaining uses of "static line" by a simple" static"
in all .c files (but keep them in .h files, where they make sense)
tree-wide: don't ifdef seccomp-util.h, drop seccomp.h inclusion everywhere
seccomp-util.h doesn't need ifdeffing, hence don't. It has worked since
quite a while with HAVE_SECCOMP is off, hence use it everywhere.
Also drop explicit seccomp.h inclusion everywhere (which needs
HAVE_SECCOMP ifdeffery everywhere). seccomp-util.h includes it anyway,
automatically, which we can just rely on, and it deals with HAVE_SECCOMP
at one central place.
seccomp: move seccomp_parse_errno_or_action() into common definitions
Let's remove some HAVE_SECCOMP ifdeffery by simply defining the funcion
in question (seccomp_parse_errno_or_action() + related calls) into
common code that is also compiled if HAVE_SECCOMP is off.
This is generally the better approach anyway, since we want as much as
possible and easily feasible parsers work even if the code implementing
them is disabled. THis is easy to achieve here, hence do.
Luca Boccassi [Wed, 16 Aug 2023 01:00:47 +0000 (02:00 +0100)]
sd-mount: allow creating tmpfs
Mount units can do it, but the command line tool cannot, as it needs a
valid 'what'. If --tmpfs/-T if passed, parse the argument as 'where'
and send a literal 'tmpfs' as the 'what' if not specified.
This metadata (EXTENSION_RELOAD_MANAGER) can be set to "1" to reload the manager
when merging/refreshing/unmerging a system extension image. This can be useful in case the sysext
image provides systemd units that need to be loaded.
With `--no-reload`, one can deactivate the EXTENSION_RELOAD_MANAGER metadata interpretation.
The specs call this TCG PC Client Platform Firmware Profile
Specification says this PCR is owned by the Host Platform Manufacturer,
at various places. Hence let's give it that name.
Daan De Meyer [Wed, 16 Aug 2023 19:22:57 +0000 (21:22 +0200)]
meson: Use rsync to copy test data directories
install_subdir() does not copy symlinks but copies the file they
point to. We also get a very ugly warning in the meson install
output:
"""
Warning: trying to copy a symlink that points to a file. This will copy the file,
but this will be changed in a future version of Meson to copy the symlink as is. Please update your
build definitions so that it will not break when the change happens.
"""
Let's fix both problems at once by using rsync which does the right
thing. Verified by running systemd-dissect --mtree on both the install
output before and after and all the symlinks are now correctly preserved.
David Tardon [Thu, 17 Aug 2023 05:49:35 +0000 (07:49 +0200)]
bus-polkit: don't propagate error from polkit
An error reply from polkit is a valid case and should not be propagated
as failure of async_polkit_callback(). It should only be saved here.
It'll be returned by bus_verify_polkit_async() later, when it's called
for the same method again.
Luca Boccassi [Sun, 13 Aug 2023 21:29:25 +0000 (22:29 +0100)]
core: stage /run/host/os-release with a symlink to avoid possible race condition
If someone reads /run/host/os-release at the exact same time it is being updated, and it
is large enough, they might read a half-written file. This is very unlikely as
os-release is typically small and very rarely changes, but it is not
impossible.
Bind mount a staging directory instead of the file, and symlink the file
into into, so that we can do atomic file updates and close this gap.
Atomic replacement creates a new inode, so existing bind mounts would
continue to see the old file, and only new services would see the new file.
The indirection via the directory allows to work around this, as the
directory is fixed and never changes so the bind mount is always valid,
and its content is shared with all existing services.
Mike Yuan [Thu, 10 Aug 2023 17:41:03 +0000 (01:41 +0800)]
journalctl: support --lines=+N for showing the oldest N entries
After f58269510727964cb5c10e7d2f9849c442ea1f80, the wrong behavior
occurred when --since= and --lines= are both specified is fixed.
However, it seems that the old behavior is already being somewhat
widely used, and the function itself makes sense, i.e. to allow --lines=
to output the first N journal entries.
Therefore, let's support prefixing the number for --lines= with '+',
and provide such functionality.
manager: fix error handling after failure to set up child
exec_child() is supposed to set *exit_status when returning failure.
Unfortunately, we didn't do that in two cases. The result would be:
- a bogus error message "Failed at step SUCCESS spawning foo: …",
- a bogus success exit status.
errno-util: allow ERRNO_IS_* to accept types wider than int
This is useful if the variable is ssize_t and we don't want to trigger a
warning or truncation.
With gcc (gcc-13.2.1-1.fc38.x86_64), the resulting systemd binary is identical,
so I assume that the compiler is able to completely optimize away the type.
basic/errno-util: add wrappers which only accept negative errno
We do 'IN_SET(r, -CONST1, -CONST2)', instead of 'IN_SET(-r, CONST1, CONST2)'
because -r is undefined if r is the minimum value (i.e. INT_MIN). But we know
that the constants are small, so their negative values are fine.