Yu Watanabe [Tue, 7 Jan 2025 18:23:29 +0000 (03:23 +0900)]
sd-device: make sd_device_new_from_path() accept relative path to device node
Even though udevadm accepts relative syspath, previously, udevadm
could not use relative path to device node:
===
$ cd /dev
$ udevadm info sda
Bad argument "sda", expected an absolute path in /dev/ or /sys/ or a unit name: Invalid argument
$ udevadm info /usr/../dev/sda
Unknown device "/usr/../dev/sda": No such device
===
With this change, both the above cases work fine.
Note, still sd_device_new_from_devname() requires absolute path starts
with /dev/, for safety.
Daan De Meyer [Wed, 8 Jan 2025 21:20:42 +0000 (22:20 +0100)]
fmf: Use different heuristic for number of process with many CPUs
Downstream we sometimes end up with machines with lots of CPUs which
leads to running out of memory when trying to run the tests in VMs.
So let's switch to a different heuristic when we have lots of CPUs to
avoid running out of memory.
hwids: add a new efi firmware type of device entry (#35747)
This change adds a new firmware type device entry for the .hwids
section.
It also adds compile time validations and appropriate unit tests for
them.
chid_match() and related helpers have been updated accordingly.
Duplicate of https://github.com/systemd/systemd/pull/35281
Last review feedback's from this above PR has been incorporated and
merged.
Remove no longer needed login-options override. Fixes agetty autologin.
The need for -o was introduced in db6aeda to set the -p flag for login.
Setting -o overrides agettys built-in handling of arguments, so "-- \\u" was needed to mimic it.
This broke the autologin-feature, since the -f (noauth) flag is not passed to login [1].
But with 3d2157e, the -p flag is dropped, but the full change wasn't reverted,
leaving autologin still broken - But for no reason since agetty does the right thing.
nsresource: optionally mangle userns names passed to nsresourced (#35900)
We enforce quite strict rules on naming userns we assign uid ranges to
for users. So strict that they are hard to get right for clients. hence,
let's optionally mangle provided strings so that they work for us.
This should make it much easier to work with the API, as something
reasonable happens regarldess what kind of garbage a client sets as
name.
mangling the name is opt-in for clients, so that there's tight control
for the client on the name, but also "fire and forget".
pid1: allow removal of foreign-owned subcgroups of cgroups owned by some user (#35922)
This improves operation in unprivileged userns environments, where
unpriv user code might invoke a container with a delegated userns UID
range, and thus ends up with a subcgroup owned by another UID. With this
patch any user is always allowed to remove their own cgroups even if it
has subcgroups owned by other users.
This removes a DoS of sorts, and enforces the rule that users strictly
own everything below cgroups they own.
test: add testcase that verifies we can safely delete subcgroups owned by other users if we own the parent
This is a test for the previous commits: we create an unpriv, delegated cgroup in
--user mode, then create a subcgroup that is owned by some other user
(to mimic the case where an unpriv user got a userns with delegated UIDs
assigned), and then try to stop the unit. traditionally this would fail,
because our unpriv systemd --user instance can't remove the subcrroup
owned by someone else. With the earlier patches this is addressed.
pid1: add D-Bus API for removing delegated subcgroups
When running unprivileged containers, we run into a scenario where an
unpriv owned cgroup has a subcgroup delegated to another user (i.e. the
container's own UIDs). When the owner of that cgroup dies without
cleaning it up then the unpriv service manager might encounter a cgroup
it cannot delete anymore.
Let's address that: let's expose a method call on the service manager
(primarly in PID1) that can be used to delete a subcgroup of a unit one
owns. This would then allow the unpriv service manager to ask the priv
service manager to get rid of such a cgroup.
This commit only adds the method call, the next commit then adds the
code that makes use of this.
pid1: allow moving processes in a userns owned by the user, too
Let's liberalize process migration a bit. Previously, PID 1 would only
allow you to move processes into your own cgroups, if those processes
are owned by you too. This is now slightly relaxed: it's now also OK if
the processes are in a userns owned by you.
This makes process migration more useful in context of unpriv userns.
nl6720 [Wed, 8 Jan 2025 14:19:33 +0000 (16:19 +0200)]
dissect-image: mount the ESP with fmask=0177 (#35871)
Avoid showing the files on the ESP (i.e. a FAT formatted volume) as
executable by removing the execute permission from them.
IMO this makes the colored output of `ls` more sensible since the file
system will be mounted with `noexec` anyway.
Add a `fstype_can_fmask_dmask` function that checks if a file system
type can use the `fmask` and `dmask` mount options.
This replaces `fstype_can_umask` since it was only used in
`partition_pick_mount_options` which only cares about the file system
support for fmask & dmask now.
It somewhat reduces the coverage of the feature since there are more file
systems that support umask as opposed to those supporting dmask & dmask,
but it should not be much of an issue since fmask & dmask are supported
by vfat, exfat and ntfs3.
nsresourced: add ability to mangle specified name if necessary
Let's optionally mangle any passed name on the server side so that it is
useful for identifying a userns, if it isn't suitable for that
right-away. This mostly means truncating it if too long.
It's just too nasty to leave this to the client side, since they'd have
to understand the precise rules for naming userns then.
While we are at it, add full Varlink IDL comments.
namespace-util: return recognizable error if namespace_open_by_type() fails because ns type not supported
This makes sure the the codepath that derives an nsfd from a pid works
the same for the pidfd case and the non-pidfd case: if we can verify
that /proc/ is mounted but the /proc/$PID/ns/ files are missing, we can
assume the ns type is not supported by the kernel. Hence return the same
ENOPKG error in this case as we already do in the pidfd ioctl based
codepath.
Mike Yuan [Tue, 7 Jan 2025 17:28:33 +0000 (18:28 +0100)]
Bump minimum kernel baseline to 5.4, recommended version to 5.7
As requested, a list of kernel version to feature mapping
for kernels older than minimum baseline is also included,
in order to ease potential backport work.
Luca Boccassi [Tue, 7 Jan 2025 00:40:02 +0000 (00:40 +0000)]
obs: also trigger Fedora package builds
The package is logistically separated, as the rpm sources conflict from Fedora
conflict with the rpm sources from SUSE (some files have the same name and
location but different, incompatible content), so Fedora builds can't be
triggered from the same package. The result is the same.
This is something I think we should have added a long time ago: a
flavour of open() that safely ensures the inode we are opening is a
regular file, before we open it. It does this by means of pinning the
inode via O_PATH first, and after verification actually opening it.
This ports some code over to this, but sooner or later we should
probably use this a lot more, so that we don't accidentally open weird
stuff such as device nodes or pipes, where we should not.
pretty-print: drop extra ';' from progress reporting end sequence
This corrects the closing sequence for the ConEmu progress reporting
final sequence. We by mistake sent two final ;;, where only one was
expected. The terminals I tested this with didn't care, but Ghostty
apparently does. Let's fix things and generate the closing sequence as
per doc:
machine: transition back to host mount ns before copying files from/to container
When copying files from or to a container we so far opened the host side
fd first, then entered the container (specifically, joined it's mount
namespace) in a forked off child process, and opened the other side
there, followed by the (potentially slow) copying from inside the
container mount namespace.
This commit changes this so that we rejoin the host mount namespace
before doing the copying routine. This is relevant, so that we can rely
on /proc/self/fd/… to work, which is not the case otherwise, as we'll
see /proc/ from a pidns that is not our own, in wich case
/proc/self/fd/… is refused. By moving back to the host mount namespace
our own pidns and the pidns the /proc/ mount belongs to will be in sync
again, and all is good.
This is in particular preparation for the next commit, that makes the
copy routine strictly depending on /proc/ being accessible and working.
This PR introduces io.systemd.Machine.CopyFrom and CopyTo method which
are DBus alternatives of:
- CopyFromMachine
- CopyToMachine
- CopyFromMachineWithFlags
- CopyToMachineWithFlags
Daan De Meyer [Mon, 6 Jan 2025 15:30:23 +0000 (16:30 +0100)]
fmf: Support being used downstream in dist-git tests
We can use our upstream fmf definitions to run downstream tests in
the Fedora systemd dist-git repository
(https://src.fedoraproject.org/rpms/systemd). To have access to the
dist-git sources when running the tests, we enable dist-git-source: true
downstream which makes the sources available in $TMT_SOURCE_DIR so
let's make sure we use those sources if they're available.
Yu Watanabe [Mon, 6 Jan 2025 13:13:50 +0000 (22:13 +0900)]
sd-varlink: add flag for sd_varlink_server for creating connections w… (#35841)
…ith fd passing enabled
Let's add a simple flag that enables fd passing for all connections of a
server. It's much easier to use this than to install a connect handler
which manually enables this for each connection.
Luca Boccassi [Mon, 6 Jan 2025 11:06:23 +0000 (11:06 +0000)]
sd-device: fix validation for devices under /sys/firmware/ in sd_device_new_from_subsystem_sysname() (#35863)
Devices under /sys/firmware/ do not have subsystems. Hence, the
validation in sd_device_new_from_subsystem_sysname() ->
device_new_from_path_join() always failed.