Luca Boccassi [Sat, 16 Oct 2021 23:27:41 +0000 (00:27 +0100)]
sysext: support building deb extensions
debootstrap will fail if the root is already populated, so skip it when building
an extension.
While there, also skip other tasks that apply only to the base image (kernel, etc.).
Inspired by https://sethmlarson.dev/blog/2021-10-18/tests-arent-enough-case-study-after-adding-types-to-urllib3:
> Don’t expose Generators unless you want Generator functionality
>
> Generators have additional behaviors over iterables so if the API isn’t meant
> to be used like a generator then it’s best to keep this fact a secret and
> annotate with Iterable[X] instead of Generator[X, None, None].
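A minimal sketch of the advice quoted above. The function name is made up
for illustration and is not taken from mkosi: the body is a generator, but
the annotation only promises an Iterable.

```python
from typing import Iterable


def non_empty_lines(text: str) -> Iterable[str]:
    """Yield stripped, non-empty lines.

    Implemented as a generator, but annotated as Iterable[str] so callers
    do not come to depend on send()/throw()/close().
    """
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            yield stripped
```

Callers can still loop over the result or pass it to list(), which is all
the API is meant to offer.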
This is useful to reduce the size of the final image by dropping
packages which are required during the installation, but are not needed
in the final image. So in particular, this includes packages pulled
in as Requires(pre) on rpm-based systems.
man: describe the syntax that Packages= accepts a bit more
What exactly is supported differs between distros.
But let's at least list the various options, so that people are aware
of the possibilities. Many people might not even think about the more
esoteric ones.
We create a custom repo file from scratch, with a bunch of repos.
We would then pass --disablerepo=* --enablerepo=… --enablerepo=…
to dnf to enable the repos we configured. This is pointlessly complicated:
let's instead just enable the repos in the config file we write.
(The config file is only used by the dnf commands we invoke.)
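A rough sketch of the idea, not the actual mkosi code: the Repo type, the
render_repo_file() helper and the example URLs are assumptions. Each repo
written to the private config file is simply marked enabled=1, so no
--disablerepo/--enablerepo switches are needed on the command line.

```python
from typing import NamedTuple, Sequence


class Repo(NamedTuple):
    id: str   # hypothetical repo id, e.g. "fedora"
    url: str  # hypothetical base URL


def render_repo_file(repos: Sequence[Repo]) -> str:
    """Render a .repo file in which every listed repo is enabled."""
    sections = []
    for repo in repos:
        sections.append(
            f"[{repo.id}]\n"
            f"name={repo.id}\n"
            f"baseurl={repo.url}\n"
            "enabled=1\n"
        )
    return "\n".join(sections)
```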
This changes the behaviour of UseHostRepositories=true a bit:
we now use the repos that are enabled on the host, instead of enabling
select repos. I think this actually makes more sense: the list of
repos on the host and their names is something that we have no control
over. So we don't really know which ones to enable. And the user can
still use Repositories= to select some repos. So UseHostRepositories=true
alone means "use host repositories as configured", and
UseHostRepositories=true + Repositories=… means "use the specified
host repositories".
Each function to install packages is split in two:
an invoke_[t]dnf() function that accepts a verb and a list of packages,
and an install_packages_[t]dnf() function that calls the first
one with 'install' as the verb.
The idea is to allow other verbs to be called in the future.
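The split could look roughly like the sketch below. The helper names are
hypothetical and only build the argument list. The real functions also run
the command and pass more options.

```python
from typing import List, Sequence


def dnf_cmdline(verb: str, packages: Sequence[str], command: str = "dnf") -> List[str]:
    """Build a command line for an arbitrary verb (install, remove, ...)."""
    return [command, "-y", verb, *packages]


def install_packages_cmdline(packages: Sequence[str], command: str = "dnf") -> List[str]:
    """Thin wrapper that calls the generic helper with 'install' as the verb."""
    return dnf_cmdline("install", packages, command)
```

Adding another verb later only needs a second thin wrapper around the
generic helper.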
I also merged invoke_dnf, invoke_yum, invoke_yum_or_dnf into a single
function. Nowadays, yum is just an alias to dnf, so they both accept
the same options and provide the same functionality. They actually were
called with different options, but I think this was by mistake: people
added new functionality and forgot to update the callpath for yum.
And if the ancient yum in EPEL doesn't support some option, we can easily
conditionalize on the command name internally.
(Effectively, this changes how yum is called, more options are passed to
it now.)
Vishal Verma [Wed, 6 Oct 2021 07:36:19 +0000 (01:36 -0600)]
mkosi: Fix autologin configs for different PAM versions
Some PAM versions require full /dev/<tty> paths for the autologin setup
done by mkosi, whereas others only need the <tty> portion.
If full paths are required, the <tty> only setup breaks, and vice versa.
However, having both variants in the config does no harm.
As distros upgrade their PAM versions, distro-based checks would
have to constantly play whack-a-mole to switch between the
'prefix-required' and 'no-prefix' variants.
Add both variants unconditionally - this way we solve the problem for
all distros, regardless of when they update.
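As an illustration only: the exact PAM rule shown here (pam_succeed_if.so
matching on tty) and the helper name are assumptions, not copied from the
patch. The point is that both spellings are emitted for every tty.

```python
from typing import List, Sequence


def autologin_pam_lines(ttys: Sequence[str]) -> List[str]:
    """Return one rule per tty spelling: bare name and /dev/-prefixed path."""
    lines = []
    for tty in ttys:
        # Some PAM versions match on "<tty>", others on "/dev/<tty>".
        lines.append(f"auth sufficient pam_succeed_if.so tty = {tty}")
        lines.append(f"auth sufficient pam_succeed_if.so tty = /dev/{tty}")
    return lines
```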
Daan De Meyer [Wed, 13 Oct 2021 10:18:41 +0000 (11:18 +0100)]
nspawn: Copy RLIMIT_CORE and RLIMIT_NOFILE in non-booted nspawn containers
Avoid surprises by copying open files and coredump limits from the user
running mkosi. Most notably, this makes sure core dumps in non-booted
mkosi containers actually end up on the host. Previously the coredump
size limit was zero in non-booted mkosi nspawn containers, so no
coredumps were generated on the host for processes that dumped
core in the build containers (e.g. tests that raise SIGABRT).
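A sketch of how the limits of the calling process could be turned into
systemd-nspawn's --rlimit= switches. The helper names are made up here and
the actual patch may format things differently.

```python
import resource
from typing import List


def format_limit(value: int) -> str:
    """systemd spells an unlimited resource limit as 'infinity'."""
    return "infinity" if value == resource.RLIM_INFINITY else str(value)


def nspawn_rlimit_args() -> List[str]:
    """Build --rlimit= arguments mirroring the current process limits."""
    args = []
    for name in ("RLIMIT_CORE", "RLIMIT_NOFILE"):
        soft, hard = resource.getrlimit(getattr(resource, name))
        args.append(f"--rlimit={name}={format_limit(soft)}:{format_limit(hard)}")
    return args
```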
Since F34, glibc changed from 'Suggests: glibc-all-langpacks' to
'Suggests: glibc-minimal-langpack', so we don't need to do this ourselves.
The advantage is that we avoid one line of output from dnf (about
glibc-minimal-langpack being already installed, which is confusing when
the user didn't request glibc-minimal-langpack explicitly).
manifest: when recording packages, ignore packages from the base image
This solves the problem that the generated package list included all packages
visible in the combined overlay when building something on top of a base image.
Instead, we want just the stuff that was added in the overlay.
I considered some other approaches:
- use 'dnf history info' to query what the last transaction was.
The output is human-readable tabular text, and would have to be parsed.
This could be done, but there's a bigger problem: we don't necessarily
know that the last transaction is all that matters. And in fact, as
raised by mdomonko in #rpm-ecosystem, dnf is not the only way to install
rpm packages. Using rpm directly also covers direct rpm invocations,
which could be done from the build scripts.
- look at rpm transaction id. This still has the problem that we don't
know if the last transaction is all that matters.
So overall, the simple time-based approach should be no worse than the other
ones, and is trivially easy to implement.
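A minimal sketch of the time-based filter, assuming the package list comes
from something like rpm -qa --qf '%{NEVRA}\t%{INSTALLTIME}\n'. The helper
names and the choice of cutoff timestamp are assumptions for illustration.

```python
from typing import Iterable, List, Tuple


def parse_rpm_query(output: str) -> List[Tuple[str, int]]:
    """Parse 'NEVRA<TAB>INSTALLTIME' lines into (name, timestamp) pairs."""
    entries = []
    for line in output.splitlines():
        if not line.strip():
            continue
        name, installtime = line.rsplit("\t", 1)
        entries.append((name, int(installtime)))
    return entries


def packages_added_since(installed: Iterable[Tuple[str, int]], cutoff: int) -> List[str]:
    """Keep only packages installed at or after the cutoff timestamp."""
    return sorted(name for name, installtime in installed if installtime >= cutoff)
```

Anything installed before the cutoff is treated as part of the base image
and left out of the manifest.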
manifest: change ".packages" suffix to ".changelog"
".packages" was inspired by kiwi, where the original idea came from. But the
format is different anyway, and I'm constantly confused by the format name and
suffix being different, even though I wrote the feature. Let's just make them
the same.
Daan De Meyer [Mon, 11 Oct 2021 14:00:34 +0000 (15:00 +0100)]
Silence sfdisk when refreshing partition tables
Let's try to keep the output minimal when building from cached images.
Currently sfdisk is the only program producing non-trivial amounts of
output when building from cached images so let's pass the --quiet flag
when refreshing partition tables.
We would create a file with 0 padding to force it to a certain size.
This seems fairly ugly, in particular when we later want to use the
blob in other contexts. Also, since 9e0b115e0c, this shouldn't be necessary.
And if we're just copying bytes from a file we already have open, with no
padding, we might just as well do this without forking.
mkosi: add support for verity also for generated roots
In a sysext, I have a squashfs partition that I want to do verity for.
Before this change, we'd fail with an assertion that the root device partition
is not set.
The conditional is a bit busy, but I couldn't find a shorter form that
would make mypy happy.
The return value currently isn't used. By returning the Partition
object we can get access to things like the blockdev path, but also
size, so this is more flexible.
Overlay uses char(0,0) files to mark items that are present in the lower
layers, but were removed in the upper layer. At least for the case
of sysexts, we want to get rid of those. Maybe if other uses of this
feature appear, we might want to make this removal optional.
Hat tip to @brau_ner for explaining what those files are.
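For illustration, detecting and dropping such whiteout entries could be
sketched as follows. The function names are hypothetical and the real code
may walk the tree differently.

```python
import os
import stat


def is_whiteout(mode: int, rdev: int) -> bool:
    """An overlayfs whiteout is a character device with device number 0,0."""
    return stat.S_ISCHR(mode) and rdev == 0


def remove_whiteouts(root: str) -> int:
    """Remove all whiteout entries below root and return how many were removed."""
    removed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        # os.walk() lists device nodes among the non-directory entries.
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if is_whiteout(st.st_mode, st.st_rdev):
                os.unlink(path)
                removed += 1
    return removed
```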
Add support for creating "sysexts" with the BaseImage option
The general approach is to first create a base image with the base
set of packages, and then install the new layer on top. To make this
work nicely, the base layer should have --clean-package-metadata=false:
This produces a sysext, except that the release files are missing.
I currently create them through a finalize script. At some point we might
move this functionality into mkosi, but I think it's better to get some
experience with sysexts first.
This will have very bad results unless one controls *all* places where
a file can be created. This is possible in systemd and C code, but is
unlikely to work well here, where we have lots of high-level code and
call helpers which can also create files.
So let's use a reasonable value, i.e. 0o022 as the umask during runtime.
This will let us create files with the expected permissions in the image.
After we produce our outputs, we already chown and chmod using the
original umask, so the outputs have the expected permissions anyway.
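A small demonstration of the effect, not mkosi code: the helper name is
made up. With a umask of 0o022, a file requested with mode 0o666 ends up
as 0o644, and the original umask is restored afterwards.

```python
import os
import stat
import tempfile


def mode_of_new_file(umask: int = 0o022) -> int:
    """Create a file under the given umask and return its permission bits."""
    original = os.umask(umask)  # os.umask() returns the previous mask
    try:
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "example")
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o666)
            os.close(fd)
            return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.umask(original)  # put the caller's umask back
```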
indentation: keep each condition on a separate line
We would have two conditions joined by 'and', and we would have
the first operand and half of the second on the first line, and
the remainder of the second operand on multiple lines.
The general idea is that those classes collect attributes and information
about the abstract partition table, but are independent of any underlying
block device. The definitions exist independently of a block device
and can be applied later on.
Partitions are referred to by enum PartitionIdentifier.
The calculation of the space necessary for those partitions is centralized
in the class, so we don't have multiple places where we arrive at slightly
different formulas for the expected disk size.
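A simplified sketch of the shape this could take. Only the name
PartitionIdentifier comes from the text above. The enum members, field
names and default values are assumptions, and per-partition alignment is
ignored.

```python
import enum
from dataclasses import dataclass, field
from typing import Dict


class PartitionIdentifier(enum.Enum):
    esp = "esp"    # hypothetical members
    root = "root"


@dataclass
class Partition:
    n_sectors: int


@dataclass
class PartitionTable:
    sector_size: int = 512
    first_lba: int = 40       # assumed space reserved for the GPT header
    footer_sectors: int = 33  # assumed space reserved for the backup GPT
    partitions: Dict[PartitionIdentifier, Partition] = field(default_factory=dict)

    def add(self, ident: PartitionIdentifier, size_bytes: int) -> Partition:
        """Record a partition, rounding its size up to whole sectors."""
        n_sectors = -(-size_bytes // self.sector_size)
        part = Partition(n_sectors)
        self.partitions[ident] = part
        return part

    def disk_size(self) -> int:
        """The single place where the expected disk size is computed."""
        used = sum(p.n_sectors for p in self.partitions.values())
        return (self.first_lba + used + self.footer_sectors) * self.sector_size
```

No block device is involved: the table can be built and sized first and
applied to a device later.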
This adds support for creating signed GPT disk images. If Verity=signed
is set this will not only generate and insert Verity data into the
image, but then use the resulting root hash, sign it and include it in
an additional partition. It will also write the resulting PKCS7
signature out into a new .roothash.p7s file.
This scheme is compatible with kernel 5.4's PKCS7 signature logic for
dm-verity: the resulting .p7s file can be passed as-is to the kernel (or
systemd's RootHashSignature= setting).
The partition this embeds contains a simple JSON object containing
three fields: the verity root hash, the PKCS7 data (i.e. the same data
as in the .p7s file, but in base64), and the SHA256 fingerprint of the
signing key. This partition is supposed to be read by the image
dissection logic of systemd, to implement signed single-file images.
(I am still working on the corresponding PR for systemd.)
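A sketch of how such a JSON object could be assembled. The field names and
the choice of hashing the DER-encoded certificate are assumptions for
illustration, as is the helper name.

```python
import base64
import hashlib
import json


def signature_partition_payload(root_hash: str, pkcs7: bytes, cert_der: bytes) -> str:
    """Serialize root hash, base64 PKCS7 signature and key fingerprint as JSON."""
    return json.dumps(
        {
            "rootHash": root_hash,
            "certificateFingerprint": hashlib.sha256(cert_der).hexdigest(),
            "signature": base64.b64encode(pkcs7).decode("ascii"),
        }
    )
```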
This opens up two avenues for image verification:
1. Everything in one file: the single, "unified" GPT disk image contains
three partitions, for payload data, verity data and verity signature.
2. Split out: root hash and its signature are stored in two "sidecar"
files.
(Of course I'd personally always go the "unified" way, but given the
RootHashSignature= logic exists already, and it's easy to support, let's
support it.)
This uses the key already used for doing secureboot image signing.
Conceptually this makes a ton of sense: we sign the same stuff here
after all: the contents of the image, supporting two different
entrypoints to the image: one via UEFI booting the image, and one for
attaching directly to an image from a running system. Admittedly, the
"mkosi.secure-boot.key" and "mkosi.secure-boot.certificate" monikers for
this key pair might be a bit surprising though.
Joerg Behrmann [Mon, 4 Oct 2021 15:17:06 +0000 (17:17 +0200)]
typing: appease pyright and disable mypy unused import warning
Since pyright reuses the same comments to ignore types, one can get into
unsatisfiable situations, where one type checker accepts something but the other
doesn't. At this point we only have three instances of type ignore hints anyway,
so let's relax our mypy settings somewhat, so that we can shut up pyright, when
necessary.
dracut: make sure images with IMAGE_VERSION but without IMAGE_ID can boot
If an image version is set but no image ID, mkosi's table currently
places generic labels in the partition labels, instead of the image ids
(because we have none...). This means we cannot reference the root
partition via root=PARTLABEL=… on the kernel cmdline. Hence do not do
that.
This change ensures we don't try to use IMAGE_ID-based root=PARTLABEL=
kernel cmdline switches without IMAGE_ID being set.
(Note that the main reason IMAGE_ID/IMAGE_VERSION exists is to allow
versioned setups, i.e. where multiple versions of the same thing exist.
In such a case it's important to reference the right rootfs version that
matches the whole setup we are building here. But if IMAGE_ID isn't set
then this multi-version logic is not desired and we can assume that only
a single version of the OS is in the partition table, and thus rely on
gpt-auto-generator's automatic root file system discovery.)
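The rule can be sketched like this. The function name and the label format
are hypothetical. Only the condition on IMAGE_ID reflects the change
described above.

```python
from typing import List, Optional


def root_cmdline(image_id: Optional[str], image_version: Optional[str]) -> List[str]:
    """Return root= switches for the kernel cmdline, or nothing without IMAGE_ID."""
    if image_id is None:
        # No image ID: partitions carry generic labels, so leave root
        # discovery to gpt-auto-generator.
        return []
    # Hypothetical label scheme, for illustration only.
    label = image_id if image_version is None else f"{image_id}_{image_version}"
    return [f"root=PARTLABEL={label}"]
```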
Ray Sit [Sat, 11 Sep 2021 05:09:52 +0000 (15:09 +1000)]
Renamed argument UseSystemRepositories to UseHostRepositories
The use of the term system can be misleading as system can refer
to many different things. The term host is more accurate for what the
option is. Also updated the command help to indicate that the option
applies to dnf-based distros.
The option is there, we should have some documentation for it.
But I think the current split is not very useful (e.g. why
"workspace-command", what does this even mean?), and I expect that
we'll want to review the list and hide some options before documenting
this. So the choices are not described in the man page yet.
Add --debug=disk and show sectors and raw sfdisk configs if selected
The low-level sfdisk configs are quite useful when trying to figure
out what sfdisk doesn't like. But we shouldn't show this by default,
so this adds a new --debug selector and hides the detailed output
otherwise.
mkosi: reserve much less area for GPT header and footer
We would reserve 1MB on both ends. This is what sfdisk (and other
tools) do by default. It probably makes sense for real disks, which
are large enough that 1–2 MB don't matter, and one might want to add
partitions later. For our images this isn't very useful. In fact, we
could probably go lower than 128 partitions, since we generally know
exactly how many we will create. But I'm leaving that for later,
because the savings are not large, and there might be compatibility
issues involved. If it turns out that the changes done here don't
cause problems, we could consider making the max partition count
smaller, to save another 12 kb or so.
The "grain" (partition alignment) is set to 4096 bytes, even on
devices with 512 byte sectors. 4k devices are becoming more popular,
and we could trigger bad performance if the partitions were misaligned.
The code is reworked to take the specified first-lba into account.
(In the initial version of this patch that was posted, 'grain:4096'
was passed to sfdisk, and this seemed enough. But with various other
combinations of image sizes, sfdisk sometimes refuses to create the
expected layout when first-lba is not specified. So this version of
the patch specifies both values.)
Before: ‣ Resulting image size is 1.8M, consumes 848.0K.
After: ‣ Resulting image size is 808.0K, consumes 808.0K.
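For reference, the arithmetic behind a minimal GPT reservation can be
sketched as below. This is not necessarily the exact formula the patch
uses. It assumes the standard 128-byte partition entry and made-up helper
names.

```python
def roundup(value: int, step: int) -> int:
    """Round value up to the next multiple of step."""
    return ((value + step - 1) // step) * step


def gpt_first_lba(sector_size: int = 512, max_partitions: int = 128, grain: int = 4096) -> int:
    """First sector usable for partitions, aligned to the grain.

    The start of the disk holds the protective MBR (one sector), the GPT
    header (one sector) and the partition entry array (128 bytes each).
    """
    header_bytes = 2 * sector_size + max_partitions * 128
    return roundup(header_bytes, grain) // sector_size


def gpt_footer_sectors(sector_size: int = 512, max_partitions: int = 128) -> int:
    """Sectors needed at the end: backup entry array plus backup header."""
    return roundup(max_partitions * 128, sector_size) // sector_size + 1
```

With 512-byte sectors and 128 entries this gives a first LBA of 40 (20 KiB)
and a 33-sector footer, instead of 1 MB on each end.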
mkosi: use blockdev --rereadpt instead of partx --update
With smaller devices, we trigger a bug in partx. It was fixed in
util-linux-2.37, but at least Fedora 34 still has 2.36. With
blockdev --rereadpt, it's the kernel which parses the table, which
should avoid the issue.