Daan De Meyer [Fri, 17 Jun 2022 14:16:18 +0000 (10:16 -0400)]
machine: Add retries for ssh
We've been seeing quite a bit of "connection refused" errors in CI.
These are likely happening because sshd hasn't finished starting
yet.
The proper fix for this is to add notify socket support for systemd
running qemu VMs via virtio sockets, but even if that's added, it
will be a very long time before we can rely on it.
For now, let's add a retry mechanism for SSH connections to make
our CI setup more reliable.
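For illustration, a retry loop of this kind could look like the minimal
sketch below. The helper name retry() and its parameters are hypothetical
and are not taken from the actual change; the sketch only shows the idea of
repeating an operation that fails with "connection refused" until sshd is up.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    fn: Callable[[], T],
    attempts: int = 5,
    delay: float = 1.0,
    exceptions: Tuple[Type[BaseException], ...] = (ConnectionRefusedError,),
) -> T:
    """Call fn() until it succeeds or the attempts are used up (hypothetical helper)."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except exceptions:
            # Give up and re-raise on the last attempt, otherwise wait and retry.
            if attempt == attempts:
                raise
            time.sleep(delay)
    raise AssertionError("unreachable")
```

A fixed delay keeps the sketch short; a real setup might prefer a growing
delay or an overall timeout.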
Daan De Meyer [Fri, 17 Jun 2022 14:13:49 +0000 (10:13 -0400)]
Refactor command running in integration tests
Let's move run_command_image() into Machine.run(), introduce
run_systemd_cmdline() to get the systemd-run command line, and
remove all arguments from run_ssh() that are no longer required.
mkosi: optimize/fix patching of root part-type uuid
The bug was that the part-type write we did would get overwritten when the
partition table was subsequently rewritten while adding the verity and
verity-sig partitions. We don't need to write out the part-type manually; it's
enough to store the right value in our partition list. This makes things a bit
faster too.
We know that if we calculated the verity info, we'll insert a partition soon
after and then it'll get written correctly.
Daan De Meyer [Wed, 15 Jun 2022 20:24:41 +0000 (16:24 -0400)]
arch: Use gpgdir from host system
Instead of setting up the keyring in the image, let's reuse the
keyring from the host. If users want to use pacman in the image,
they just have to run pacman-key themselves in a postinst script
or such.
This speeds up building of images and hopefully also gets rid of
our CI issues with Arch where there's something keeping files open
in the root mount (which I expect is gpg-agent).
Daan De Meyer [Fri, 20 May 2022 13:05:09 +0000 (15:05 +0200)]
Fix losetup race condition with initializing partition devices
This fixes the same issue we've seen in the systemd repo where
using PARTSCAN introduces a race condition with trying to use
the partition device since the kernel initializes the partition
devices asynchronously. To avoid the issue, we initialize partition
devices manually using the BLKPG ioctl(). We also avoid the same
problem on detaching loop devices by removing partition devices
explicitly using the BLKPG ioctl().
See https://github.com/systemd/systemd/pull/22992,
https://github.com/systemd/systemd/pull/23427,
https://github.com/systemd/systemd/issues/23174 and
https://github.com/systemd/systemd/issues/17469 for more context.
The default formatter would wrap all text into a single paragraph,
which is rather hard to read when we have a list of options and an
explanation for each of the values. Let's add a custom formatter.
Also, never split option names or other words.
Part of output wrapped to the default 80 columns:
  --source-file-transfer-final METHOD
                        How to copy build sources to the final image:
                        'copy-all': normal file copy
                        'copy-git-cached': use git ls-files --cached, ignoring
                        any file that git itself ignores
                        'copy-git-others': use git ls-files --others, ignoring
                        any file that git itself ignores
                        'copy-git-more': use git ls-files --cached, ignoring
                        any file that git itself ignores, but include the
                        .git/ directory
                        (default: None)
  --source-resolve-symlinks [BOOL]
                        If true, symbolic links in the build sources are
                        followed and the file contents copied to the build
                        image. If false, they are left as symbolic links. Only
                        applies if --source-file-transfer-final is set to
                        'copy-all'.
                        (default: false)
  --source-resolve-symlinks-final [BOOL]
                        If true, symbolic links in the build sources are
                        followed and the file contents copied to the final
                        image. If false, they are left as symbolic links in
                        the final image. Only applies if
                        --source-file-transfer-final is set to 'copy-all'.
                        (default: false)
  --with-network [WITH_NETWORK]
                        Run build and postinst scripts with network access
                        (instead of private network)
  --settings PATH       Add in .nspawn settings file
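A formatter with this behaviour can be sketched roughly as follows. This is
an assumption-laden illustration, not the formatter from the change itself:
the class name ParagraphHelpFormatter is made up, and the sketch relies on
overriding argparse.HelpFormatter._split_lines(), which is a private hook of
the standard library and may change between Python versions.

```python
import argparse
import textwrap
from typing import List


class ParagraphHelpFormatter(argparse.HelpFormatter):
    """Keep explicit newlines in help strings and never break inside words."""

    def _split_lines(self, text: str, width: int) -> List[str]:
        lines: List[str] = []
        # Wrap every source line on its own, so that each option value
        # explanation starts on a fresh line.
        for paragraph in text.splitlines():
            wrapped = textwrap.wrap(
                paragraph,
                width,
                break_long_words=False,   # never split long words
                break_on_hyphens=False,   # never split option names at "-"
            )
            lines.extend(wrapped or [""])
        return lines
```

It would be passed to the parser with
argparse.ArgumentParser(formatter_class=ParagraphHelpFormatter).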
--help would print something like "--source-file-transfer-final SOURCE_FILE_TRANSFER_FINAL"
which takes a lot of space but is not very helpful. In particular it
might not be clear whether this expects some custom string or just a
yes/no boolean. Let's use "BOOL" instead to tell the user the type of
the argument, which immediately implies what values can be specified.
Similarly, say "--source-file-transfer METHOD", "--source-file-transfer-final METHOD".
Also drop metavar= when it matches the default value anyway.
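As a rough sketch of the effect, the following standalone parser uses
metavar= to replace the default uppercased destination name. The parser, the
parse_boolean() helper and the help strings are illustrative assumptions and
do not reproduce the real option definitions.

```python
import argparse


def parse_boolean(value: str) -> bool:
    """Hypothetical helper: accept a few common spellings of true/false."""
    if value.lower() in ("1", "true", "yes"):
        return True
    if value.lower() in ("0", "false", "no"):
        return False
    raise argparse.ArgumentTypeError(f"invalid boolean: {value!r}")


parser = argparse.ArgumentParser(prog="example")
parser.add_argument(
    "--source-resolve-symlinks",
    nargs="?",
    const=True,
    default=False,
    type=parse_boolean,
    metavar="BOOL",  # shown as "[BOOL]" instead of "[SOURCE_RESOLVE_SYMLINKS]"
    help="Follow symbolic links in the build sources",
)
parser.add_argument(
    "--source-file-transfer-final",
    metavar="METHOD",  # shown instead of "SOURCE_FILE_TRANSFER_FINAL"
    help="How to copy build sources to the final image",
)

help_text = parser.format_help()
```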
Creation of bmap files needs to take place before any compression
happens, since bmaptool has to know where the "holes" of the image lie.
Compression removes the holes, preventing bmap from recreating the
original raw image. Move the bmap calculation step before the
compression.
Joerg Behrmann [Wed, 1 Jun 2022 17:35:39 +0000 (19:35 +0200)]
ssh: make parse_ssh_agent only handle strings
pyright complains (wrongly I think) about value being None when passed to Path
to create the socket variable. Let's work around this by eliminating Nones as
values.
Daan De Meyer [Mon, 16 May 2022 13:57:53 +0000 (15:57 +0200)]
mkosi: Always use the embedded default version when no release is specified
Let's not have the host system determine the image distribution release.
Instead, let's always default to the default release embedded within mkosi.
This gives more consistent results when building images for a single distro
regardless of the host distribution.
Daan De Meyer [Tue, 17 May 2022 09:45:08 +0000 (11:45 +0200)]
machine: Translate \r\n to \n in logfile
Output lines from pexpect sent to the logfile will always end with
"\r\n" (a side effect of working with pseudo-TTYs). On GitHub Actions,
this results in blank lines in the test output. Let's add a simple
adapter that translates "\r\n" back to "\n" before actually writing
to the logfile.
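Such an adapter might look like the minimal sketch below. The class name is
hypothetical, and the sketch assumes the logfile receives str data and only
needs write() and flush(); a "\r\n" pair split across two write() calls is
not handled here.

```python
from typing import TextIO


class NewlineTranslatingLog:
    """Hypothetical wrapper that rewrites "\\r\\n" to "\\n" before writing."""

    def __init__(self, target: TextIO) -> None:
        self.target = target

    def write(self, data: str) -> int:
        # Translate the pseudo-TTY line endings back to plain newlines.
        return self.target.write(data.replace("\r\n", "\n"))

    def flush(self) -> None:
        self.target.flush()
```

An instance wrapping sys.stdout could then be handed to whatever expects a
logfile object.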
Daan De Meyer [Wed, 11 May 2022 11:54:24 +0000 (13:54 +0200)]
Install util-linux explicitly on Fedora
In Fedora 36, only util-linux-core is pulled in by default. It lacks
/bin/login, which /sbin/agetty requires to function properly. Let's
pull in util-linux explicitly until the bug is resolved.
Joerg Behrmann [Wed, 11 May 2022 07:41:35 +0000 (09:41 +0200)]
debian: include ca-certificates for bootstrap packages
apt throws warnings because it cannot verify the certificates for the security
repositories we included recently. We could add this to extra-packages, but then
ca-certificates is missing when we call apt update for the first time, so add it
to the debootstrap call.
Daan De Meyer [Thu, 5 May 2022 09:23:29 +0000 (11:23 +0200)]
Add nspawn version check to check_native()
From systemd-nspawn v250 onwards, it's possible to run build scripts
on non-native architectures (as long as binfmt.d is configured correctly)
so update the native check to consider that.
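The version comparison can be sketched as below. This is an illustration
only: the function names are hypothetical, and it assumes that the first line
of the "systemd-nspawn --version" output starts with "systemd" followed by
the major version number.

```python
import re
from typing import Optional


def parse_systemd_version(version_output: str) -> Optional[int]:
    """Extract the major version from output such as 'systemd 250 (...)'."""
    match = re.match(r"systemd (\d+)", version_output)
    return int(match.group(1)) if match else None


def nspawn_supports_foreign_arch(version_output: str) -> bool:
    """Return True for v250 and later, the threshold named in the message above."""
    version = parse_systemd_version(version_output)
    return version is not None and version >= 250
```

The output string itself would come from running the command, for example
via subprocess.run(["systemd-nspawn", "--version"], ...).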
Right now mkosi.skeleton cannot be used for dpkg-based distributions, since
debootstrap will not work on a non-empty target. This adds a parameter to
install_skeleton_trees to hack around this for Debian and Ubuntu, so that the
call before install_distribution is skipped and we only add skeletons before
invoking apt again after doing the initial debootstrap.
py3.11: fix Enum formatting to work with python3.11-a7
Something strange is happening with .__repr__() access in python3.11:
>>> mkosi.backend.ManifestFormat.mro()
[<enum 'ManifestFormat'>, <class 'mkosi.backend.Parseable'>, <enum 'Enum'>, <class 'object'>]
>>> mkosi.backend.ManifestFormat.changelog.__repr__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.11/enum.py", line 1194, in __repr__
    return "<%s.%s: %s>" % (self.__class__.__name__, self._name_, v_repr(self._value_))
                                                                   ^^^^^^^^^^^^^^^^^^^^
  File "/home/zbyszek/src/mkosi/mkosi/backend.py", line 95, in __repr__
    return cast(str, getattr(self, "name"))
                     ^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'name'
Enum somehow subverts normal lookup and makes its own __repr__ function be
used, even though Parseable is listed first in MRO. This seems to be related to
PEP 663, which was rejected, and the changes reverted for -a4 [1], but then the revert
was reverted [2].
Let's just sidestep MRO with a method redefinition:
>>> mkosi.backend.ManifestFormat.changelog.__repr__
<bound method ManifestFormat.__repr__ of changelog>
>>> mkosi.backend.ManifestFormat.changelog.__repr__()
'changelog'
This should work on all python versions. If python3.11 returns to previous
semantics before the final release, we can remove the workaround.
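A reduced sketch of the redefinition, using a cut-down stand-in for Parseable
and only two enum members (both simplifications are assumptions, not the real
definitions in mkosi/backend.py):

```python
import enum


class Parseable:
    """Stand-in mixin whose repr is just the member name."""

    def __repr__(self) -> str:
        return str(getattr(self, "name"))


class ManifestFormat(Parseable, enum.Enum):
    json = "json"
    changelog = "changelog"

    # Redefine the method in the enum's own class body so that it is found
    # there directly, without depending on how Enum resolves mixin methods.
    __repr__ = Parseable.__repr__
```

Because the assigned value is a plain function, Enum treats it as a method
rather than as a new member.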
ubuntu/debian: Set up locale correctly on Debian/Ubuntu
Let's make sure we configure the locale. Also, some programs
expect /etc/default/locale to exist on Ubuntu/Debian so let's
create a symlink from there to /etc/locale.conf as well.
Let's not capture output by default. Instead, let's forward it
directly to stdout/stderr to simplify debugging. Similar to the
subprocess.run() function, let's add a capture_output argument
to allow configuring whether to capture the output.
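The idea can be sketched with a small wrapper around subprocess.run(). The
standalone run() function below is hypothetical and only mirrors the
described behaviour; it is not the Machine.run() method itself.

```python
import subprocess
import sys
from typing import Sequence


def run(cmd: Sequence[str], capture_output: bool = False) -> "subprocess.CompletedProcess[str]":
    """Run cmd; by default its output goes straight to our stdout/stderr."""
    return subprocess.run(
        cmd,
        check=True,                      # raise CalledProcessError on failure
        text=True,                       # decode output as str when captured
        capture_output=capture_output,   # opt in to capturing, as in subprocess.run()
    )


if __name__ == "__main__":
    # Forwarded directly to the terminal, nothing is captured.
    run([sys.executable, "-c", "print('forwarded')"])
    # Captured and available on the result object.
    result = run([sys.executable, "-c", "print('captured')"], capture_output=True)
    print(repr(result.stdout))
```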
We also remove the debug argument from Machine since logging to
stdout/stderr is now the default.
Drop --hostonly-initrd from test machine bootable images
It causes a few boot issues with Rocky and Alma bootable images, so
let's remove the option, as the speed improvement shouldn't matter
too much for integration tests.
ci: Remove systemd.volatile from kernel command line
volatile doesn't work on many distros. We initially added it to
support booting GPT squashfs images but since we don't test those
in CI anymore, we can safely remove volatile from the kernel
command line as well.
The 'asyncio_mode' default value will change to 'strict' in
future, please explicitly use 'asyncio_mode=strict' or
'asyncio_mode=auto' in pytest configuration file.
The difference is whether pytest considers async tests to be
asyncio-driven even when they are not marked @pytest.mark.asyncio
with 'auto' meaning yes, consider them even when not marked, and
'strict' requiring the marking.
This doesn't really make a difference for us, since we don't have
any async tests, but it's nevertheless nice to silence the warning.
Run dracut for unified kernel images instead of objcopy
objcopy is faster but it doesn't apply any changes in installed
files when running in incremental mode which is a regression. Let's
run dracut again to fix that regression.
If users want that, we can add an option later to use objcopy instead
of dracut.
Rémi Palancher [Mon, 24 Jan 2022 08:59:29 +0000 (09:59 +0100)]
Add option to run nspawn in current unit
This commit adds the --nspawn-keep-unit option, which passes --keep-unit to
the underlying systemd-nspawn commands. This makes systemd-nspawn use the
current unit scope and its allocated resources. This is notably useful
when mkosi is run by a system service.
Luca Boccassi [Tue, 29 Mar 2022 00:43:40 +0000 (01:43 +0100)]
Fix bootstrapping RPM distro on Debian
The Debian rpm/dnf packages store the db in the home directory, so
the bootstrapped image has a broken rpm database.
Move it to the right place if it happens.