Dumping everything to console slows the test quite considerably on
slower machines, so let's forward nspawn logs to the journal to still
have them available in case something goes south.
This should, hopefully, help with TEST-13 timeouts in Ubuntu CI and
maybe with CPU soft lockups in CentOS CI.
This should make the test faster on fast machines and more reliable on
slower/under-load machines, where the 4 sec sleep was sometimes not enough.
Spotted on C8S machines under load:
```
test_added_after (__main__.ExecutionResumeTest) ... FAIL
test_added_before (__main__.ExecutionResumeTest) ... ok
test_interleaved (__main__.ExecutionResumeTest) ... ok
test_issue_6533 (__main__.ExecutionResumeTest) ... ok
test_no_change (__main__.ExecutionResumeTest) ... ok
test_removal (__main__.ExecutionResumeTest) ... ok
test_swapped (__main__.ExecutionResumeTest) ... ok
======================================================================
FAIL: test_added_after (__main__.ExecutionResumeTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./test/test-exec-deserialization.py", line 101, in check_output
    with open(self.output_file, 'r') as log:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpjnec1dj4'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./test/test-exec-deserialization.py", line 150, in test_added_after
    self.check_output(expected_output)
  File "./test/test-exec-deserialization.py", line 104, in check_output
    self.fail()
AssertionError: None
----------------------------------------------------------------------
Ran 7 tests in 44.270s
```
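A minimal sketch of the general idea of replacing a fixed sleep with a bounded poll, assuming the test only needs to wait until its output file appears. The helper name `wait_for_file` and its timeout values are hypothetical and not taken from the test itself.

```python
import os
import time


def wait_for_file(path, timeout=30.0, interval=0.1):
    """Poll until `path` exists or `timeout` seconds have passed.

    Hypothetical helper: returns True as soon as the file shows up,
    False if the deadline expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

On a fast machine this returns almost immediately, while a loaded machine gets the full timeout before the check is declared a failure.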
In sd-device, `devpath` is a kind of syspath without the '/sys' prefix, e.g.
/devices/pci0000:00/0000:00:1c.4/0000:3c:00.0/nvme/nvme0/nvme0n1,
and `devname` is a path to the device node, e.g. /dev/nvme0n1.
Let's use a consistent name for the helper function.
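To make the distinction concrete, here is a small sketch of the relationship between the two path kinds. The function `devpath_from_syspath` is a hypothetical illustration, not the sd-device helper this commit renames.

```python
SYSFS_PREFIX = "/sys"


def devpath_from_syspath(syspath):
    """Strip the '/sys' prefix from a syspath to obtain the devpath."""
    if not syspath.startswith(SYSFS_PREFIX + "/"):
        raise ValueError(f"not a syspath: {syspath!r}")
    return syspath[len(SYSFS_PREFIX):]


syspath = "/sys/devices/pci0000:00/0000:00:1c.4/0000:3c:00.0/nvme/nvme0/nvme0n1"
devpath = devpath_from_syspath(syspath)

# The devname is the device node path; it is a separate property and is
# not derived from the devpath in this sketch.
devname = "/dev/nvme0n1"
```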
condition: change operator logic to use $= instead of =$ for glob comparisons
So this is a bit of a bikeshedding thing. But I think we should do this
nonetheless, before this is released.
Playing around with the glob matches I realized that "=$" is really hard
to grep for, since in shell code it's an often seen construct. Also,
when reading code I often found myself thinking first that the "$"
belongs to the rvalue instead of the operator, in a variable expansion
scheme.
If we move the $ character to the left-hand side, I think we are on the safer
side, since usually lvalues are much more restricted in character sets
than rvalues (at least most programming languages do enforce limits on
the character set for identifiers).
It makes it much easier to grep for the new operator, and easier to read
too. Example:
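As a rough illustration of the grep argument, the following sketch checks which lines of a made-up shell-like fragment contain each operator spelling. The fragment, the field name `board_name`, and its value are hypothetical.

```python
# Hypothetical fragment: three ordinary shell assignments plus one
# glob comparison written with the new operator spelling.
SAMPLE = """\
FOO=$BAR
PATH=$PATH:/usr/local/bin
out=$(uname -r)
board_name $= "Custom*"
"""

lines = SAMPLE.splitlines()

# Searching for the old spelling hits every plain variable expansion.
old_hits = [line for line in lines if "=$" in line]

# Searching for the new spelling hits only the comparison.
new_hits = [line for line in lines if "$=" in line]
```

In this fragment the old spelling matches all three assignments and the new spelling matches only the comparison line.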
None of our other fnmatch() calls make use of this, and the concept was
new to me at least. Given that this is only used for the recently added
SMBIOS field matches (and is not included in any release) let's disable
"extended" matches for now. We can certainly revisit this, and enable it
later if there is real demand, but if we do, we should probably add that
all over the place, not just for smbios matches.
Let's move the operator enum into its own .c/.h file, so that we can
reuse it elsewhere, in particular systemd-analyze's compare-versions
logic.
Let's rename the concept CompareOperator, since it is nowadays
generically about both order *and* fnmatch comparisons, hence just
naming it "order" is misleading.
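A sketch of what a shared operator type covering both kinds of comparison could look like. The member names, the exact operator set, and the helpers `parse_compare_operator` and `evaluate` are assumptions for illustration; the real enum lives in C and may differ.

```python
import enum
import fnmatch


class CompareOperator(enum.Enum):
    # Longer spellings come first so that prefix parsing never stops
    # early at a shorter operator (e.g. "<" before "<=").
    FNMATCH_UNEQUAL = "!$="
    FNMATCH_EQUAL = "$="
    LOWER_OR_EQUAL = "<="
    GREATER_OR_EQUAL = ">="
    UNEQUAL = "!="
    LOWER = "<"
    GREATER = ">"
    EQUAL = "="


def parse_compare_operator(s):
    """Split a leading operator off `s`; return (operator, rest) or (None, s)."""
    for op in CompareOperator:  # iterates in definition order
        if s.startswith(op.value):
            return op, s[len(op.value):].lstrip()
    return None, s


def evaluate(op, a, b):
    """Evaluate the string and glob operators only."""
    if op is CompareOperator.FNMATCH_EQUAL:
        return fnmatch.fnmatchcase(a, b)
    if op is CompareOperator.FNMATCH_UNEQUAL:
        return not fnmatch.fnmatchcase(a, b)
    if op is CompareOperator.EQUAL:
        return a == b
    if op is CompareOperator.UNEQUAL:
        return a != b
    # Order operators need a version comparison, which this sketch omits.
    raise NotImplementedError(op)
```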
loop-util: lock the control device around clearing the loopback device and deleting it
This mirrors what we already do during allocation. We lock the control
device first, and then release the block device and then delete it.
This makes things substantially more robust as long as all participants do
such locking: we won't attempt to delete a block device somebody else
is already using.
Back when I wrote this code I wasn't aware of BLKPG and what it can do.
Hence I came up with this hack to attach an empty file to delete all
partitions. But today we can do better with BLKPG: let's just explicitly
remove all partitions, and then try again.
loop-util: rework how we lock loopback block devices
Let's rework how we lock loopback block devices in two ways:
1. Lock a separate fd, instead of the main block device fd. We already
did that for our internal locking when allocating loopback block
devices, but do so for the exposed locking (i.e.
loop_device_flock()), too, so that the lock is independent of the
main fd we actually use for IO.
2. Instead of locking the device during allocation of the loopback
device, then unlocking it (which will make udev run), and then
re-locking things if we need to, let's instead just keep the lock the
whole time, to make things a bit safer and faster, and not have to
wait for udev at all. This is done by adding a "lock_op" parameter to
loop device allocation functions that declares the initial state of
the lock, and is one of LOCK_UN/LOCK_SH/LOCK_EX. This change also
shortens a lot of code, since we allocate + immediately lock loopback
devices pretty much everywhere.
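The two points above can be sketched as follows, using a temporary regular file as a stand-in for a loopback block device. The class `LoopDeviceSketch` and its methods are hypothetical; only the `lock_op` idea and the LOCK_UN/LOCK_SH/LOCK_EX values come from the description.

```python
import fcntl
import os
import tempfile


class LoopDeviceSketch:
    """Hypothetical handle: one fd for IO, a separate fd for locking."""

    def __init__(self, path, lock_op=fcntl.LOCK_UN):
        self.io_fd = os.open(path, os.O_RDWR)
        # A second open() creates a separate open file description, so a
        # BSD lock taken on it is independent of the fd used for IO.
        self.lock_fd = os.open(path, os.O_RDONLY)
        # Take the requested lock right away and keep it, instead of
        # locking, unlocking, and re-locking later.
        if lock_op != fcntl.LOCK_UN:
            fcntl.flock(self.lock_fd, lock_op)

    def flock(self, lock_op):
        fcntl.flock(self.lock_fd, lock_op)

    def close(self):
        os.close(self.lock_fd)  # closing the lock fd drops the lock
        os.close(self.io_fd)


with tempfile.NamedTemporaryFile() as backing:
    dev = LoopDeviceSketch(backing.name, lock_op=fcntl.LOCK_EX)
    other = os.open(backing.name, os.O_RDONLY)
    try:
        # Another opener cannot take the lock while we hold it.
        fcntl.flock(other, fcntl.LOCK_EX | fcntl.LOCK_NB)
        contended = False
    except BlockingIOError:
        contended = True
    finally:
        os.close(other)
        dev.close()
```

Because the lock is held from the moment of allocation, there is no window in which another lock-respecting participant can grab the device between allocation and first use.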
Now that the loopback device code already destroys the partitions we
don't have to do this here anymore.
I am sure the right place to delete the partitions is in the loopback
code, since we really only should do that for loopback devices, see
bug #24431, and not on "real" block devices.
I am also not convinced dropping partitions the dissection logic doesn't
care about is a good idea, after all. The dissection stuff should
probably not consider itself the "owner" of the block devices it
analyzes, but take a more passive role: figure out what is what, but not
modify it.
loop-util: when clearing a loopback device delete partitions first, and take BSD lock
Whenever we release a loopback device, let's first synchronously delete
all partitions, so that we know that's complete and not done
asynchronously in the background. Take a BSD lock on the device while
doing so, so that udev won't make the devices busy while we do this.
Colin Walters [Wed, 31 Aug 2022 20:39:03 +0000 (16:39 -0400)]
tree-wide: Use "unmet" for condition checks, not "failed"
Often I end up debugging a problem on a system, and I
do e.g. `journalctl --grep=failed|error`. The use of the term
"failed" for condition checks adds a *lot* of unnecessary noise into
this.
Now, I know this regexp search isn't precise, but it has proven
to be useful to me.
I think "failed" is too strong of a term as a baseline, and also
just stands out to e.g. humans watching their servers boot or
whatever.
The term "met condition" is fairly widely used, e.g.
https://stackoverflow.com/questions/63751794/what-does-the-condition-is-met-exactly-mean-in-programming-languages
Jan Janssen [Tue, 30 Aug 2022 07:28:56 +0000 (09:28 +0200)]
tree-wide: Fix format specifier warnings for %x
Unfortunately, hex output can only be produced with unsigned types. Some
cases can be fixed by producing the correct type, but a few simply have
to be cast. At least casting makes it explicit.
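As an analogy only (the commit itself concerns C format strings), the sketch below shows the effect of an explicit reinterpretation as unsigned before producing hex output. The helper `as_unsigned` and the 32-bit width are assumptions for illustration.

```python
def as_unsigned(value, bits=32):
    """Reinterpret a signed integer as an unsigned one of the given width."""
    return value & ((1 << bits) - 1)


# Formatting the signed value directly keeps the sign.
signed_hex = format(-1, "x")

# Making the conversion explicit yields the two's-complement bit pattern.
unsigned_hex = format(as_unsigned(-1), "x")
```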