/etc/pacman.d/gnupg is already made available by mkosi's internal
logic so we don't need to copy it in. This prevents failures when
running unprivileged as /etc/pacman.d/gnupg can have rather strict
permissions.
mkosi-initrd: Only set --cacheonly=metadata when running as root
If we're not running as root, we don't use the host's package cache,
but we still use the host's repositories. It's very unlikely that the
user's default package cache directory will have an up-to-date repository
metadata snapshot, so let's update the repository metadata if we're not
running as root.
The login package as provided by util-linux is 'Protected' but no
longer 'Essential' and that's intentional, so it will not be pulled
in by default. Add it to the list.
For the existing verbs, those are the same. But for a verb with
a dash, which I want to add next, the name has the underscore,
which will not match the verb on the commandline.
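A minimal sketch of the mismatch, assuming verbs are modelled as a Python enum. The member `example_verb` and its value are hypothetical and only stand in for a verb with a dash:

```python
import enum


class Verb(enum.Enum):
    # Hypothetical subset of verbs, for illustration only.
    build = "build"
    # Member names cannot contain a dash, so the name needs an underscore
    # while the value keeps the spelling used on the command line.
    example_verb = "example-verb"


def parse_verb(arg: str) -> Verb:
    # Look the verb up by value (what the user types), not by member name.
    return Verb(arg)
```

For `build`, name and value are identical, so either lookup works. For the dashed verb only the value matches what is typed on the command line.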
Daan De Meyer [Wed, 28 Aug 2024 06:53:57 +0000 (08:53 +0200)]
Don't mount stuff twice from different sources in sandbox
We were mounting /var/tmp and /etc/resolv.conf twice in chroot_cmd(),
let's make sure we avoid doing that by moving the CLI options into
the respective _script_cmd() functions.
Daan De Meyer [Tue, 27 Aug 2024 12:57:48 +0000 (14:57 +0200)]
Use python3 in sandbox if host interpreter is not in /usr
We only mount /usr into the sandbox, so if mkosi is invoked from a
venv we'll fail to execute the apivfs script or chroot script in the
sandbox as it will try to use an interpreter that isn't available.
Let's check if the used interpreter is relative to /usr and only use
it to execute the chroot and apivfs scripts in the sandbox if it is.
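The check described above could look roughly like this. The helper name `sandbox_interpreter` is an assumption made for illustration and not necessarily what mkosi uses:

```python
import sys
from pathlib import Path


def sandbox_interpreter(executable: str = sys.executable) -> str:
    # Only /usr is mounted into the sandbox, so an interpreter living
    # elsewhere (for example in a venv under /home) would not exist there.
    if Path(executable).is_relative_to("/usr"):
        return executable
    # Otherwise fall back to whatever python3 the sandbox provides.
    return "python3"
```

Note that `Path.is_relative_to()` requires Python 3.9 or newer.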
The autogenerated help for --distribution/--format/… looks like {a,b,…,} (with
an empty arg at the end), and it is not obvious what this means. Describe the
empty args in the man page.
Daan De Meyer [Fri, 23 Aug 2024 18:34:23 +0000 (20:34 +0200)]
Make more trees required
- Tools tree is a universal setting and has to be available at the start
- Sandbox trees are a universal setting and have to be available at the start
- Skeleton trees should be available at the start to make sure caching works
properly
Daan De Meyer [Fri, 23 Aug 2024 17:01:46 +0000 (19:01 +0200)]
Rework repository metadata handling
- Stop copying repository metadata into the image.
This is too fragile to ever work properly. If the image is ever
used as a base tree, the caller would also need the exact same package
manager configuration for this to be remotely useful, as well as constantly
rebuild the image to keep the repository metadata up to date.
- Stop picking up repository metadata from the image
For the same reason, we can't use repository metadata from any
base trees automatically. It should be explicitly provided by the user
along with the required package manager configuration.
- Share the same repository metadata snapshot between the main and subimage
builds
Let's ensure that the main image and all subimages are built off the
exact same repository metadata snapshot. Now that we enforce that all
subimages use the same distro, release, architecture, repositories and
everything else that's package manager related, we can use the same
metadata snapshot for every build.
- Rename package_cache_dir in Context to metadata_dir as there's only
metadata in this directory and never any packages.
Daan De Meyer [Fri, 23 Aug 2024 17:00:09 +0000 (19:00 +0200)]
Do not allow configuring universal collection based settings in subimages
For the next commit, we want to enforce that all subimages use the same
package manager trees, repositories and package directories, so let's
not allow adding any extra of those in subimages anymore.
Daan De Meyer [Fri, 23 Aug 2024 16:44:04 +0000 (18:44 +0200)]
Simplify run_verb() logic
- Handle Verb.clean separately from builds
- Move most checks to the front before we clean up the previous results.
- Get rid of check_outputs()
- If main image needs a build, clean all subimages as well
Daan De Meyer [Thu, 22 Aug 2024 10:10:50 +0000 (12:10 +0200)]
fedora: Get rawhide GPG key from github
fedora.gpg is always out-of-date when rawhide branches, so let's
instead fetch the rawhide key from distribution-gpg-keys on GitHub
which does seem to get updated before rawhide branches.
Daan De Meyer [Thu, 22 Aug 2024 11:42:20 +0000 (13:42 +0200)]
Move creation of context.root out of Context()
On btrfs systems, we're unnecessarily creating a subvolume only to
remove it again immediately afterwards if we're building from a cached
image. So let's move the creation of root outside of Context() so
that we only create it as a subvolume after we've checked the caches
first.
Daan De Meyer [Thu, 22 Aug 2024 11:18:30 +0000 (13:18 +0200)]
Optimize copy_tree() a little
Only run cp_version() if we absolutely need to. If we do a btrfs
snapshot or the destination does not exist or is empty, there's no
need to add --keep-directory-symlink and thus we don't need to run
cp_version() either.
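The decision can be sketched as a small predicate. The function name and the `use_snapshot` parameter are hypothetical, and the destination is assumed to be a directory whenever it exists:

```python
from pathlib import Path


def needs_keep_directory_symlink(dst: Path, use_snapshot: bool) -> bool:
    # A btrfs snapshot does not go through cp at all.
    if use_snapshot:
        return False
    # A missing destination has no directory symlinks to preserve.
    if not dst.is_dir():
        return False
    # Neither has an empty one.
    if not any(dst.iterdir()):
        return False
    # Only now is the option needed, and only now would the (comparatively
    # expensive) cp version check have to run.
    return True
```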
Azure Linux looks a lot like Fedora Linux so we opt to share configuration
between Azure and Fedora/CentOS and inherit the Azure definition from
Fedora.
Daan De Meyer [Fri, 16 Aug 2024 21:41:49 +0000 (23:41 +0200)]
Introduce mkosi-sandbox and stop using subuids for image builds
Over the last years, we've accumulated a rather nasty set of workarounds
for various issues in bubblewrap:
- We contributed setpgid to util-linux and use it if available because
bubblewrap does not support making its child process the foreground
process.
- We added the innerpid logic to run() because bubblewrap does not forward
signals to the separate child process it runs in the sandbox which meant
they were getting SIGKILLed when we killed bubblewrap, preventing proper
cleanup from happening.
- bubblewrap does not provide a proper way to detect whether the command
was found in the sandbox or not, which meant we had to execute command -v
within the sandbox separately to check whether the command exists or not.
- We had to add extra logic to make sure / was a mount in the initramfs to
allow running mkosi in the initramfs as bubblewrap does not fall back to
MS_MOVE if pivot_root() doesn't work.
- We had to stitch together shell invocations after bubblewrap but before
executing the actual command we want to run to make sure directories had
the correct mode as bubblewrap creates everything with mode 0700 which was
too restrictive in many cases for us. This was fixed with new --perms and
--chmod options in bubblewrap 0.5 but we had to keep compat with 0.4
because that's what's shipped in CentOS Stream 9.
- We had to figure out a shell hack to do overlayfs mounts as these are not
supported by bubblewrap (even though a PR for the feature has been open for
years).
- We had to introduce a Mount struct to pass around mounts so we could deduplicate
and sort them before passing them to bubblewrap as bubblewrap did not do this
itself.
- Debugging all the above was made all the harder by the fact that bubblewrap's
source code is full of tech debt from its history of being a setuid tool
instead of using user namespaces. Getting any fixes into upstream is almost
impossible as the tool is practically unmaintained.
Aside from bubblewrap, our other source of troubles has been newuidmap/newgidmap.
Running as a user within the subuid range configured in /etc/sub{u,g}id has
meant we're constantly fixing ownership and permissions issues where stuff needs
to be chowned and chmodded everywhere to make sure the current user and the
subuid user can access the proper files. Another unfortunate side effect is that
users end up with many files owned by the subuid root user in their home
directories when building images with mkosi.
Let's fix all these issues at once by getting rid of bubblewrap and
newuidmap/newgidmap.
bubblewrap is replaced with a new tool mkosi-sandbox. It looks and behaves a
lot like bubblewrap, except it's much less code and much more flexible to fit
our needs, allowing us to get rid of all the hacks we've built up over the years to
work around issues that didn't get fixed in bubblewrap.
To get rid of newuidmap/newgidmap, a rework of our user namespacing was needed.
The need to use newuidmap/newgidmap came from the assumption that we need a full
65k subuid range to do unprivileged image builds, as distributions ship packages
containing files and directories that are not owned by the root user. After some
investigation, it turns out that there are very few files and directories not owned
by root in distribution packages if you ignore /var. If we could temporarily
ignore the ownership on these files and directories until we can get distributions
to only ship root owned files in /usr and /etc of their packages, we could simply
map the current user to root in a user namespace and get rid of the subuid range
completely.
Turns out that's possible with a seccomp filter. seccomp allows you to make all
chown() syscalls succeed without actually doing anything. The files and directories
end up owned by the root user instead. If we assume this is OK and are OK with
instructing users to use tmpfiles to fix up the permissions on first boot if needed,
a seccomp filter like this is sufficient to allow us to get rid of doing image
builds within a subuid user namespace.
We can go one step further: for the majority of
the image build, one doesn't actually need to be the root user. Only package
managers and systemd-repart need the current user to be mapped to root to do their
job correctly. The reason we did the entire build mapped to root until now was
that we need to do a few mounts as part of the image build process and until now
I was under the assumption that you needed to be root for that. It turns out that
when you unshare a user namespace, you get a full set of capabilities regardless
of whether you're root or some other uid in the user namespace. The only difference
is that when you exec a subprocess as root, the capabilities aren't lost, whereas
they are when you exec a subprocess as a non-root user. This can be avoided by
adding the capabilities of the non-root user to the inheritable and ambient set.
Once that's done, any subprocess exec'd by a non-root user in the user namespace
can mount as many bind and overlay mounts as they can think of.
The above allows us to run most of the image build under the current user uid
instead of root, only switching to root when running package managers, invoking
systemd-repart or systemd-tmpfiles, or when chroot-ing into the image. This allows
us to get rid of various hacks we had to look up the proper user name or home
directory.
Specifically, we can get rid of the following:
- mkosi-as-caller can become a noop since we now by default run the build as the
caller.
- Lots of chmod()'s and chown()'s can be removed
- All uses of INVOKING_USER.uid/gid can be removed, and most can be replaced with
simple os.getuid()/os.getgid()
- We can use /etc/passwd and /etc/group from the host instead of building our own
- We can get rid of the Acl= option as the user will now be able to remove (almost)
all files written by mkosi.
- We don't have to rchown the package manager cache directory anymore after each
build. Root user builds will now use the system cache instead of the per user
cache.
- We can get rid of the Mount struct as mkosi-sandbox dedups and sorts operations
itself.
One thing to note is that if we're invoked as root, none of the seccomp or capabilities
stuff applies and it is all skipped as it's not required in that case. This means that
when building as root it's still possible to have more than one user in the generated
image unlike when building unprivileged. Also note that users can still be added to
/etc/passwd and such, they just can't own any files or directories in the image itself
until the image is booted.
Michael Ferrari [Wed, 7 Aug 2024 09:37:48 +0000 (11:37 +0200)]
Add executable `mkosi.version` support
If `mkosi.version` is executable, it is executed during configuration
parsing and its output is used as the version, as opposed to reading the
contents of `mkosi.version`. This allows querying the version before the
build without needing to manually adjust the version beforehand.
This allows using date based versioning by writing a script outputting
`date '+%Y-%m-%d'` or using git tag based versioning by outputting
`git describe --tags`.
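As an illustration, an executable `mkosi.version` for date based versioning might look like the following Python script. This is only a sketch; a one-line shell script calling `date` would do the same:

```python
#!/usr/bin/env python3
import datetime


def version(today=None):
    # Same format as `date '+%Y-%m-%d'`.
    today = today or datetime.date.today()
    return today.strftime("%Y-%m-%d")


if __name__ == "__main__":
    # The script's standard output is what ends up as the version.
    print(version())
```

The file has to be marked executable (for example with `chmod +x mkosi.version`) for it to be run rather than read.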
kali: A distribution based on Debian: https://www.kali.org/
Kali includes many packages suitable for offensive security tasks.
It follows a rolling release model and serves fewer architectures
than Debian.
Building a kali image requires installing kali-archive-keyring:
- Source: https://gitlab.com/kalilinux/packages/kali-archive-keyring
- Packages: https://pkg.kali.org/pkg/kali-archive-keyring
Markus Weippert [Sat, 10 Aug 2024 07:36:56 +0000 (09:36 +0200)]
Fix loaded host modules filter
Module filenames might use dashes instead of underscores.
Also, anchoring the filename to a directory avoids including unrelated
modules (e.g. exfat vs fat).
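Both points can be sketched with a regular expression built per loaded module. The helper names and the example paths are assumptions for illustration, not the actual mkosi code:

```python
import re


def module_pattern(name: str) -> str:
    # Loaded module names use underscores, but the file on disk may use
    # dashes instead, so accept either separator.
    parts = [re.escape(part) for part in re.split(r"[-_]", name)]
    # The leading "/" anchors the filename to its directory, so that "fat"
    # does not also match "exfat.ko".
    return "/" + "[-_]".join(parts) + r"\.ko"


def filter_modules(loaded: list[str], paths: list[str]) -> list[str]:
    patterns = [re.compile(module_pattern(name)) for name in loaded]
    return [p for p in paths if any(rx.search(p) for rx in patterns)]
```

With `fat` loaded, `kernel/fs/fat/fat.ko.zst` is kept while `kernel/fs/exfat/exfat.ko.zst` is not.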
Daan De Meyer [Fri, 9 Aug 2024 10:15:09 +0000 (12:15 +0200)]
Add --wipe-build-dir to allow clearing the build directory independently
Currently, to clear the build directory, -ff has to be used which
also clears the image cache. Let's add --wipe-build-dir (-w) to allow
clearing only the build directory without clearing the image cache.
Luca Boccassi [Wed, 7 Aug 2024 22:39:06 +0000 (23:39 +0100)]
distributions: drop Debian workaround for lack of VERSION_CODENAME
It has been present since Debian 9, so we can rely on it now.
It is wrong on sid, but that's a separate issue that this old
workaround doesn't solve anyway.
Daan De Meyer [Thu, 8 Aug 2024 11:09:46 +0000 (13:09 +0200)]
debian: Fix up os-release for unstable/sid builds
The version codename for unstable/sid builds is indistinguishable from
testing. Let's make sure we fix that up ourselves so that unstable image
builds can be properly distinguished from testing builds.