Yu Watanabe [Wed, 22 Feb 2023 22:31:01 +0000 (07:31 +0900)]
sd-event: always initialize sd_event.perturb
If the boot ID cannot be obtained, let's first fallback to the machine
ID, and if still cannot, then let's use 0.
Otherwise, no timer event source cannot be triggered.
Frantisek Sumsal [Wed, 22 Feb 2023 19:43:52 +0000 (20:43 +0100)]
sd-journal: fix build with older glibc
In older glibc (like 2.28 on CentOS Stream 8) there is no wrapper
for the gettid() syscall, so we need to provide our own.
../src/libsystemd/sd-journal/journal-send.c: In function ‘close_journal_fd’:
../src/libsystemd/sd-journal/journal-send.c:88:25: error: implicit declaration of function ‘gettid’; did you mean ‘getgid’? [-Werror=implicit-function-declaration]
if (getpid() != gettid())
^~~~~~
getgid
../src/libsystemd/sd-journal/journal-send.c:88:25: warning: nested extern declaration of ‘gettid’ [-Wnested-externs]
cc1: some warnings being treated as errors
usec_t is also a uint64_t internally, hence this doesn't actually change
anything. However, on the conceptual level, sd-bus expects a uint64_t
hence give it one.
systemctl: right-align left/passed columns in list-timers
Timespans are probably best right-aligned, in particular if they
systematically end in either " ago" or " left" because they are used as
"relative timestamps".
Yu Watanabe [Wed, 22 Feb 2023 04:26:28 +0000 (13:26 +0900)]
systemctl: show "Until:" field only for service and scope units
Only service and scope units have RuntimeMaxUSec bus property.
To suppress the "Until:" field for other unit types, the entry must be
initialized with USEC_INFINITY.
Typically, in reasonably complex programs we want to realease various
caches (such as glibc allocation caches) in case of memory pressure.
Let's add explicit infrastructure for that to sd-event, that can hook
Linux' Pressure Stall Information (PSI) logic with our event loop.
This adds sd_event_add_memory_pressure() as easy, one-step API to
install an even source that is called under memory pressure.
The parameters which file to watch (the per-cgroup PSI file, or the
system-wide file /proc/pressure/memory) can be configured via env vars.
The idea is that the service manager sooner or later gains controls for
setting this up correctly.
Alternatively to the PSI a similar logic is supported but instead of
waiting for POLLPRI on a procfs/cgroupfs fd we'll wait for POLLIN on
FIFO or AF_UNIX sockets. This is useful for testing, and possibly in
other environments, for example to hook up this protocol directly with
GNOME's low memory monitor.
By default this watches on the cgroup-local PSI so that we aren't
affected by pressure on cgroups we are not related to.
Daan De Meyer [Mon, 20 Feb 2023 19:30:44 +0000 (20:30 +0100)]
copy: Support both inode exclusion and contents exclusion
In some cases, we want to exclude a directory's contents but not
the directory itself. In other cases, we want to exclude a directory
and its contents. Let's extend the denylist logic in copy.h to support
both by changing the denylist from a set to hashmap so we can store the
deny type as the value.
We also modify the repart ExcludeFiles= option to make use of this. If
a directory to exclude ends with a "/", we'll only exclude its contents.
Otherwise, we'll exclude the full directory.
I left this one as a separate commit because it is more involved.
We want people to compile with valgrind support, but we don't want to
use a slow hash function unless we're actually running under valgrind.
So the compile-time check is changed to a runtime check. When compiled
with optimization, the compiler should elide the checks on the constants,
and only leave the check for RUNNING_ON_VALGRIND. It is wrapped with
_unlikely_ so that the else branch is put in the hot path.
meson: merge our two valgrind configuration conditions into one
Most of the support for valgrind was under HAVE_VALGRIND_VALGRIND_H, i.e. we
would enable if the valgrind headers were found. The operations then we be
conditionalized on RUNNING_UNDER_VALGRIND.
But in a few places we had code which was conditionalized on VALGRIND, i.e. the
config option. I noticed because I compiled with -Dvalgrind=true on a machine
that didn't have valgrind.h, and the build failed because
RUNNING_UNDER_VALGRIND was not defined. My first idea was to add a check that
the header is present if the option is set, but it seems better to just remove
the option. The code to support valgrind is trivial, and if we're
!RUNNING_UNDER_VALGRIND, it has negligible cost. And the case of running under
valgrind is always some special testing/debugging mode, so we should just do
those extra steps to make valgrind output cleaner. Removing the option makes
things simpler and we don't have to think if something should be covered by the
one or the other configuration bit.
I had a vague recollection that in some places we used -Dvalgrind=true not
for valgrind support, but to enable additional cleanup under other sanitizers.
But that code would fail to build without the valgrind headers anyway, so
I'm not sure if that was still used. If there are uses like that, we can
extend the condition for cleanup_pools().
There are three conditionalizations in the status quo ante function,
which kinda indicates this should not be the same function in the first
place. Hence split this up, simplify it, and have two distinct functions
without conditionalizations.
Jan Janssen [Sat, 7 Jan 2023 08:19:23 +0000 (09:19 +0100)]
boot: Do not use errno.h/inttypes.h
These are provided by libc instead of the compiler and are not supposed
to be used in freestanding environments.
When cross-compiling with clang and the corresponding gcc
cross-toolchain is not around, clang may pick up the wrong header from
the host system.
On aarch64 we end up with a nonexecutable stack, but on ia32 and x64 we get one,
so this might be just a matter of defaults in the linker. It doesn't matter
greatly, but let's mark the stack as non-executable to avoid the warning.
Note: '-Wl,-z' is not needed, things work with just '-z'.
Mike Yuan [Mon, 20 Feb 2023 12:12:19 +0000 (20:12 +0800)]
sleep: check if we're on AC power before checking battery capacity
Before this commit, battery_is_low() returns
true if there's no battery on the system.
It's now modified to check if the system is
on AC power first, and returns false early
if that's the case.
We already have this nice code in system that determines the block
device backing the root file system, but it's only used internally in
systemd-gpt-generator. Let's make this more accessible and expose it
directly in bootctl.
It doesn't fit immediately into the topic of bootctl, but I think it's
close enough and behaves very similar to the existing "bootctl
--print-boot-path" and "--print-esp-path" tools.
If --print-root-device (or -R) is specified once, will show the block device
backing the root fs, and if specified twice (probably easier: -RR) it
will show the whole block device that block device belongs to in case it
is a partition block device.
Suggested use:
# cfdisk `bootctl -RR`
To get access to the partition table, behind the OS install, for
whatever it might be.
Daan De Meyer [Tue, 21 Feb 2023 14:09:38 +0000 (15:09 +0100)]
mkosi: Move more logic to the postinst script
Let's move stuff that only applies to the final image to the
postinst script. Let's also move out some of the static files to
mkosi.extra/ instead of hardcoding them in scripts.
Jan Janssen [Fri, 27 Jan 2023 11:57:35 +0000 (12:57 +0100)]
meson: Add simple_tests list
A lot of tests can be defined by just their filename. Moving into their
own list keeps things simpler, especially with the next commit. It also
makes it easier to keep the lists sorted.
Jan Janssen [Fri, 6 Jan 2023 17:07:18 +0000 (18:07 +0100)]
boot: Provide our own EFI API headers
We want to get away from gnu-efi and the only really usable source of
EFI headers would be EDK2, which is somewhat impractical to use and
quite large to require to be around just for some headers.
As a bonus point, the new headers are safe to be included in userspace
code.
This should not have any behavior changes as it is mostly changing
header includes. There are some renames to conform to standard names
and a few minor device path fixups as the struct is defined slightly
different.
Of note is that this removes usage of uchar.h and wchar.h as they are
not guaranteed to be available in a freestanding environment. Instead
efi.h will provide the needed types.
I'm not sure what "suffix" was meant by this comment, but the file has the usual suffix.
The file was added with the current name back in c4708f132381e4bbc864d5241381b5cde4f54878.
Maybe an earlier version of the patch did something different.
journal-file: allow opening journal files for write when machine ID is not initialized
We allow reading them, and we allow creating them, but we so far did not
allow opening existing ones for write – if the machine ID is not
initialized.
Let's fix that.
(This is just to fix an asymmetry. I have no immediate use for this. But
test code should in theory be able to use this, if it runs in an
incompletely initialized environment.)
journal-file: lazily fill in machine ID into journal header, if needed
Previously, if we ran in an environment where /etc/machine-id was
not defined, we'd never bother to write it ever again. So it would stay
at all zeroes till the end of times.
Let's make this more robust: whenever we try to append an entry, let's
try to refresh it from the status quo if not initialized yet. Moreover,
when copying records from a different journal file, let's propagate the
machine ID from there.
This should make things more robust and systematic, and match how we
propagate the boot ID and the seqnum ID to some level.
journal-file: write machine ID when create the file, not when we open it for writing
This doesn't actually change much, but makes the code less surprising.
Status quo ante:
1. Open a journal file
2. If newly created set header machine ID to zero
3. If existing and open for write check if machine ID in header matches
local one, if not, refuse.
4. if open for writing, now refresh the machine ID from the local system
Of course, step 4 is pretty much pointless for existing files, as the
check in 3 made sure it is already in order or we'd refuse operating on
it anyway. With this patch this is simplified to:
1. Open a journal file
2. If newly created initialized machine ID to local machine ID
3. If existing, compare machine ID in header with local one, if not
matching refuse.
journal-file: don't update boot_id in journal header on open
The header of the journal file contains a boot ID field that is
currently updated whenever we open the journal file. This is not ideal:
pretty often we want to archive a journal file, and need to open it for
that. Archiving a foreign journal file should not mark it as ours, it
should just change the status flag in the file header.
The boot ID in the header is aleady rewritten whenever we write a
journal entry to the file anyway, hence all this patch effectively does
is slightly "delay" when the boot ID in the header is updated: instead
of immediately on open it is updated on the first entry that is written.
Net effect: archived journal files don't all look like they were written
to on a boot newer then they actually were
And more importantly: the "tail_entry_monotonic" field suddenly becomes
useful, since we know which boot it belongs to. Generally, monotonic
timestamps without boot ID information are useless, and this fixes it.
A new (compatible) header flag marks file where the boot_id can be
understood this way. This can be used by code that wants to make use of
the "tail_entry_monotonic" field to ensure it actually can do so safely.
This also renames the structure definition in journal-def accordingly,
to indicate we now follow the stricter semantics for it.
meson: adjust whitespace handling in jinja2 rendering
In 6abe882bae1bb12827ef395c60f21ab8bb1bc61b the renderer was made to
unconditionally append a newline to output. This works, but is ugly. A nicer
solution is to tell jinja2 to not strip the newline in the first place, via
keep_trailing_newline=True. It seems that the result is unchanged because all
our source files have exactly one trailing newline.
Also, enable lstrip_blocks=True. This would cause whitespace on the line before
an {%if block to be automatically stripped. It seems reasonable to enable that
if trim_blocks=True.
Overall, no change is expected, though I didn't test combinations of
configurations, so there might be a change in some cases. But now the rules of
rendering are more logical, e.g. we should be able to indent nested conditional
statements without getting unexpected whitespace in the output.
We should be more careful with distinguishing the cases "all bits set in
caps mask" from "cap mask invalid". We so far mostly used UINT64_MAX for
both, which is not correct though (as it would mean
AmbientCapabilities=~0 followed by AmbientCapabilities=0) would result
in capability 63 to be set (which we don't really allow, since that
means unset).