We might end up allocating mempools, and when we are unloaded we might
orphan them, thus leaking them. Hence, let's just stick around for good,
so the mempools remain referenced continuously, and thus no
memory is leaked (though the memory isn't cleaned up either).
ott [Tue, 12 Dec 2017 15:30:12 +0000 (16:30 +0100)]
resolve: add support for RFC 8080 (#7600)
RFC 8080 describes how to use EdDSA keys and signatures in DNSSEC. It
uses the curves Ed25519 and Ed448. Libgcrypt 1.8.1 does not support
Ed448, so only the Ed25519 is supported at the moment. Once Libgcrypt
supports Ed448, support for it can be trivially added to resolve.
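A minimal sketch of the gating described above, not the resolved code itself. The values 15 and 16 are the DNSSEC algorithm numbers RFC 8080 assigns to Ed25519 and Ed448; the names `SUPPORTED_EDDSA` and `eddsa_supported()` are hypothetical and only illustrate that Ed448 stays disabled until the crypto library can handle it.

```python
# DNSSEC algorithm numbers assigned by RFC 8080.
DNSSEC_ALGORITHM_ED25519 = 15
DNSSEC_ALGORITHM_ED448 = 16

# Hypothetical table: only Ed25519 is usable until Libgcrypt gains Ed448.
SUPPORTED_EDDSA = {DNSSEC_ALGORITHM_ED25519}


def eddsa_supported(algorithm: int) -> bool:
    """Return True if signatures using this EdDSA algorithm can be validated."""
    return algorithm in SUPPORTED_EDDSA
```

Under this model, adding Ed448 later would amount to adding one more entry to the set.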
networkd: Fix race condition in [RoutingPolicyRule] handling (#7615)
The routing policy rule setup logic is moved to the routes setup phase (rather than the addresses setup phase as it is now). Additionally, a call to `link_check_ready` is added to the routing policy rules setup handler. This prevents a race condition with the routes setup handler.
Also give each async handler its own message counter to prevent race conditions when logging successes.
resolved: when a server consistently returns SERVFAIL, try another one
Currently, we accept SERVFAIL after downgrading fully, cache it and move
on. Let's extend this a bit: after downgrading fully, if the SERVFAIL
logic continues to be an issue, then use a different DNS server if there
are any.
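The decision described above can be sketched roughly as follows. This is an illustrative model only: the `Transaction` class, its fields and the returned strings are hypothetical, and the real feature-level downgrade logic is reduced to a single boolean.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    """Hypothetical model of a lookup that may retry on another server."""
    servers: list
    current: int = 0
    fully_downgraded: bool = False
    tried: set = field(default_factory=set)

    def on_servfail(self) -> str:
        """Decide what to do after a SERVFAIL reply from the current server."""
        if not self.fully_downgraded:
            # Keep lowering the feature level on the same server first.
            return "downgrade"
        self.tried.add(self.current)
        for i in range(len(self.servers)):
            if i not in self.tried:
                self.current = i  # switch to a server we have not tried yet
                return "retry"
        # No alternative left: accept and cache the SERVFAIL as before.
        return "accept"
```

With two configured servers, a SERVFAIL after a full downgrade yields one retry on the second server, and only then is the failure accepted.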
bootctl: don't trip up in "bootctl status" when we can't find the ESP because of lack of privileges
On my system the boot and EFI partitions are protected, hence "bootctl
status" can't find the ESP, and then the tool continues with arg_path ==
NULL, which it really should not. Handle these cases, and simply
suppress all output that needs arg_path.
efi: rework find_esp() error propagation/logging a bit
This renames find_esp() to find_esp_and_warn() and tries to normalize its
behaviour:
1. Change the error that is returned when we can't find the ESP to
ENOKEY (from ENOENT). This way the error code can only mean one
thing: that our search loop didn't find a good candidate.
2. Really log about all errors, except for ENOKEY and EACCES, and
document the latter cases.
3. Normalize parameters to the call: separate out the path parameter in
two: an input path and an output path. That way the memory management
is clear: we will access the input parameter only for reading, and
only write out the output parameter, using malloc() memory.
Before, the calling convention was quite surprising for internal API
code, as the path parameter had to be malloc() memory and might or
might not have changed.
4. Rename bootctl's find_esp_warn() to acquire_esp(), and make it a
simple wrapper around find_esp_and_warn() that basically just adds the
friendly logging for the ENOKEY case. This rework removes double
logging in a number of error cases, as we no longer log here in
anything but ENOKEY, and leave that entirely to find_esp_and_warn().
5. find_esp_and_warn() now takes a bool flag parameter
"unprivileged_mode", which disables logging in the EACCES case, and
skips privileged validation of the path. This makes the function less
magic, and doesn't hide this internal silencing automatism from the
caller anymore.
With all that in place "bootctl list" and "bootctl status" work properly
(or as well as they can) when I invoke the tools without privileges on
my system, where /boot is not world-readable.
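One way to read points 2 and 5 together is sketched below. The function `should_log()` is hypothetical and only models which errors find_esp_and_warn() would log itself: ENOKEY is left to the caller, and EACCES is silenced only when the caller passes "unprivileged_mode". Treat this as an interpretation of the description, not as the actual implementation.

```python
import errno

# ENOKEY is Linux-specific; fall back to its Linux value elsewhere.
ENOKEY = getattr(errno, "ENOKEY", 126)


def should_log(err: int, unprivileged_mode: bool) -> bool:
    """Sketch of when find_esp_and_warn() itself logs an error."""
    if err == ENOKEY:
        # "No candidate found" is left to the caller, e.g. acquire_esp().
        return False
    if err == errno.EACCES and unprivileged_mode:
        # Silenced only when the caller explicitly asks for it.
        return False
    return True
```

The point of the explicit flag is that the silencing decision is visible at the call site rather than hidden inside the helper.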
tree-wide: drop a few == NULL and != NULL comparisons
Our CODING_STYLE suggests not comparing with NULL, but relying on C's
downgrade-to-bool feature for that. Fix up some code to match these
guidelines. (This is not comprehensive; the coccinelle output for this
is unfortunately kinda borked.)
Alan Jenkins [Sun, 10 Dec 2017 10:58:01 +0000 (10:58 +0000)]
core: fix undefined behaviour due to uninitialized string buffer (#7597)
Failure of systemd to respond on the bus interface was bisected to af6b0ecc
"core: make "taint" string logic a bit more generic and output it at boot".
Failure was presumably caused by trying to append strings to an
uninitialized buffer, leading to writing outside the unterminated buffer
and hence undefined behaviour.
Olaf Hering [Fri, 8 Dec 2017 21:21:42 +0000 (22:21 +0100)]
virt: use XENFEAT_dom0 to detect the hardware domain (#6442, #6662) (#7581)
The detection of ConditionVirtualization= relies on the presence of
/proc/xen/capabilities. If the file exists and contains the string
"control_d", the running system is a dom0 and VIRTUALIZATION_NONE should
be set. In case /proc/xen exists, or some sysfs files indicate "xen",
VIRTUALIZATION_XEN should be set to indicate the system is a domU.
With an (old) xenlinux based kernel, /proc/xen/capabilities is always
available and the detection described above always works. But with a
pvops based kernel, xenfs must be mounted on /proc/xen to get
"capabilities". This is done by a proc-xen.mount unit, which is part of
xen.git. Since the mounting happens "late", other units may be scheduled
before "proc-xen.mount". If these other units make use of
"ConditionVirtualization=", the virtualization detection returns
incorrect results. detect_vm() will set VIRTUALIZATION_XEN because "xen"
is found in sysfs. This value will be cached. Once xenfs is mounted, the
next process that runs detect_vm() will get VIRTUALIZATION_NONE.
This misdetection can be fixed by using
/sys/hypervisor/properties/features, which exports the value returned by
the "XENVER_get_features" hypercall. If the bit XENFEAT_dom0 is set, the
domain is the "hardware domain". It is supposed to have permissions to
access all hardware. The sysfs file used here has been available since v2.6.31.
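The bit test described above can be sketched like this. The sketch assumes that the sysfs file holds a single hexadecimal number and that XENFEAT_dom0 is bit 11, as defined in Xen's public features.h; the function name `is_hardware_domain()` is hypothetical.

```python
XENFEAT_DOM0 = 11  # bit index of XENFEAT_dom0 in Xen's public features.h


def is_hardware_domain(features_text: str) -> bool:
    """Check the XENFEAT_dom0 bit in the sysfs feature mask.

    Assumes the file content is a single hexadecimal number.
    """
    features = int(features_text.strip(), 16)
    return bool(features & (1 << XENFEAT_DOM0))
```

A caller would pass in the content of /sys/hypervisor/properties/features and treat a true result as "not virtualized" for the purposes of the detection above.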
The commonly used term "dom0" refers to the control domain which runs
the toolstack and has access to all hardware. But the virtualization
host may be configured such that one dedicated domain becomes the
"hardware domain", and another one the "toolstack domain".
Edward A. James [Fri, 8 Dec 2017 17:26:44 +0000 (11:26 -0600)]
core: Add WatchdogDevice config option and implement it
This option allows a device path to be specified for the systemd
watchdog (both runtime and shutdown).
If a system requires a watchdog other than /dev/watchdog (pointing to
/dev/watchdog0) to be used to reboot the system, this setting should be
changed to the relevant watchdog device path (e.g. /dev/watchdog1).
Edward A. James [Fri, 8 Dec 2017 17:26:30 +0000 (11:26 -0600)]
watchdog: allow a device path to be specified
Currently systemd hardcodes the use of /dev/watchdog. This is a legacy
chardev that points to watchdog0 in the system.
Modify the watchdog API to allow a different device path to be passed
and stored. Opening the watchdog defaults to /dev/watchdog, maintaining
existing behavior.
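The defaulting behaviour can be modelled as below. This is a sketch only: the module-level variable and both function names are illustrative stand-ins and do not reproduce the C API touched by this commit.

```python
DEFAULT_WATCHDOG_DEVICE = "/dev/watchdog"

_watchdog_device = None  # hypothetical module-level state


def watchdog_set_device(path):
    """Remember which watchdog device to open; None restores the default."""
    global _watchdog_device
    _watchdog_device = path


def watchdog_device_path() -> str:
    """Return the device that would be opened."""
    return _watchdog_device or DEFAULT_WATCHDOG_DEVICE
```

Callers that never set a path keep getting /dev/watchdog, which is what preserves the existing behavior.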
It would be nicer to use <footnote> to place the notes directly in the table,
but docbook renders this improperly.
v2:
- also add "RequiredBy=" to the notes section
- remove duplicated paragraph
v3:
- clarify the description
- drop References/ReferenceBy which are only shown in systemd-analyze dump
Patrik Flykt [Fri, 8 Dec 2017 12:33:40 +0000 (14:33 +0200)]
networkd: Ignore DNS information when uplink is not managed (#7571)
When another networking daemon or configuration is handling the
uplink connection, systemd-networkd won't have a network configuration
associated with the link, and therefore link->network will be NULL.
An assert will be triggered later on in the code when link->network is
NULL.
Dmitry Rozhkov [Mon, 16 Oct 2017 14:25:17 +0000 (17:25 +0300)]
resolved: add authority section to mDNS probing queries
According to RFC 6762 Section 8.2 "Simultaneous Probe Tiebreaking"
probing queries' Authority Section is populated with proposed
resource records in order to resolve possible race conditions.
Dmitry Rozhkov [Tue, 31 Oct 2017 08:34:58 +0000 (10:34 +0200)]
resolved: set cache-flush bit on mDNS responses
From RFC 6762, Section 10.2
"They (the rules about when to set the cache-flush bit) apply to
startup announcements as described in Section 8.3, "Announcing",
and to responses generated as a result of receiving query messages."
So, set the cache-flush bit for mDNS answers except for DNS-SD
service enumeration PTRs described in RFC 6763, Section 4.1.
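As a rough illustration: RFC 6762 defines the cache-flush bit as the most significant bit of the 16-bit rrclass field in a resource record. The helper `rrclass_on_wire()` below is hypothetical, and how a record is recognised as a DNS-SD service enumeration PTR is deliberately left to the caller.

```python
MDNS_RR_CACHE_FLUSH = 0x8000  # top bit of the 16-bit rrclass field (RFC 6762)
DNS_CLASS_IN = 1


def rrclass_on_wire(rrclass: int, is_service_enumeration_ptr: bool) -> int:
    """Return the rrclass value to serialize for an mDNS answer.

    Service enumeration PTRs are left unmarked; every other answer
    gets the cache-flush bit set.
    """
    if is_service_enumeration_ptr:
        return rrclass
    return rrclass | MDNS_RR_CACHE_FLUSH
```

For an ordinary class IN record this yields 0x8001, while an enumeration PTR keeps the plain value 1.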
Olaf Hering [Thu, 7 Dec 2017 20:09:32 +0000 (21:09 +0100)]
virt: propagate errors in detect_vm_xen_dom0 (#7553)
Update detect_vm_xen_dom0 to propagate errors in case reading
/proc/xen/capabilities fails. This does not fix any bugs, it just makes
it consistent with other functions called by detect_vm.
sulogin-shell: do daemon-reload before starting default target
If the user modifies configuration, e.g. /etc/fstab, they might forget to tell
systemd about the changes. Let's do a reload for them.
Note that doing a reload should be safe, because emergency and rescue modes are
"single threaded" and nothing should be doing changes at the point where we are
exiting from the sushell. Also, daemon-reload can be implicitly called at
various moments, so we can ignore the case where the user did some incompatible
changes on disk and is counting on systemd never reloading and picking them up.
This is actually slightly safer because it allows gcc to make sure that all code
paths either call return or are noreturn. But the real motivation is just to
follow the usual style and make it a bit shorter.
core: make "taint" string logic a bit more generic and output it at boot
The tainting logic existed for a long time, but was hidden inside the
bus interfaces. Let's give it a small bit more coverage, by logging its
value early at boot during initialization.
units: delegate only "cpu" and "pids" controllers by default (#7564)
Now that we can configure which controllers to delegate precisely, let's
limit what we delegate to the user session: only "cpu" and "pids" as a
minimal baseline.
Yu Watanabe [Thu, 7 Dec 2017 05:21:13 +0000 (14:21 +0900)]
bootspec: fix debug message about default entry
When no entry matches entry_oneshot, entry_default or
default_pattern, the log message shows the wrong entry.
Moreover, if none of entry_oneshot, entry_default and default_pattern
is set, then the index `i` is uninitialized.
This fixes both problems.
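The selection problem can be illustrated with the sketch below. It is not the bootspec code: the function name, the search order within each step and the fallback to the last entry are all assumptions made for illustration. The point is that the reported reason and index are always derived from what actually matched.

```python
from fnmatch import fnmatch


def select_default(entries, entry_oneshot=None, entry_default=None,
                   default_pattern=None):
    """Return (index, reason) for the entry to boot, or (None, None)."""
    candidates = (
        ("oneshot", entry_oneshot, lambda name, want: name == want),
        ("default", entry_default, lambda name, want: name == want),
        ("pattern", default_pattern, fnmatch),
    )
    for reason, wanted, matches in candidates:
        if wanted is None:
            continue  # this setting is not configured at all
        for i, name in enumerate(entries):
            if matches(name, wanted):
                return i, reason
    if entries:
        # Assumed fallback: nothing matched, so report the last entry and
        # say so, instead of reusing a stale or uninitialized index.
        return len(entries) - 1, "last"
    return None, None
```

Returning the reason alongside the index means the debug message cannot describe a match that never happened.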
core: move write_container_id() invocation into initialize_runtime()
This moves the invocation a bit later, but that shouldn't matter. By
moving it we gain two things: first of all, it's closer to other code
where it belongs; secondly, it's naturally conditioned properly, as we no
longer will rewrite the container ID file on every reexecution again,
and not in test mode either.
No real functional changes, just some rearranging to shorten the overly
long main() function a bit.
This gets rid of the arm_reboot_watchdog variable, as it can be directly
derived from shutdown_verb, and we need it only one time. By dropping it
we can reduce the number of arguments we need to pass around.
```
$ ./src/test/test-systemd-tmpfiles.py valgrind --leak-check=full --error-exitcode=1 ./build/systemd-tmpfiles
...
Running valgrind --leak-check=full --error-exitcode=1 ./build/systemd-tmpfiles on 'w /unresolved/argument - - - - "%Y"'
...
[<stdin>:1] Failed to substitute specifiers in argument: Invalid slot
...
==22602== 5 bytes in 1 blocks are definitely lost in loss record 1 of 2
==22602== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==22602== by 0x4ECA7D4: malloc_multiply (alloc-util.h:74)
==22602== by 0x4ECA909: specifier_printf (specifier.c:59)
==22602== by 0x113490: specifier_expansion_from_arg (tmpfiles.c:1923)
==22602== by 0x1144E7: parse_line (tmpfiles.c:2159)
==22602== by 0x11551C: read_config_file (tmpfiles.c:2425)
==22602== by 0x115AB0: main (tmpfiles.c:2529)
```
Olaf Hering [Wed, 6 Dec 2017 18:59:30 +0000 (19:59 +0100)]
virt: use /proc/xen as indicator for a Xen domain (#6442, #6662) (#7555)
The file /proc/xen/capabilities is only available if xenfs is mounted.
With a classic xenlinux based kernel that file is available
unconditionally. But with a modern pvops based kernel, xenfs must be
mounted before the "capabilities" may appear. xenfs is mounted very late
via .service files provided by the Xen toolstack. Other units may be
scheduled before xenfs is mounted, which will confuse the detection of
VIRTUALIZATION_XEN.
In all Xen enabled kernels, and if that kernel is actually running on
the Xen hypervisor, the "/proc/xen" directory is the reliable indicator
that this instance runs in a "Xen guest".
Adjust the code to check for /proc/xen instead of
/proc/xen/capabilities.
Fixes commit 3f61278b5 ("basic: Bugfix Detect XEN Dom0 as no virtualization")
Max Resch [Wed, 6 Dec 2017 14:29:52 +0000 (15:29 +0100)]
Set secure_boot flag in Kernel Zero-Page (#7482)
Setting the secure_boot flag avoids getting the printout
"EFI stub: UEFI Secure Boot is enabled." when booting
a Linux kernel with linuxx64.efi.stub and EFI SecureBoot enabled.
This is mainly a cosmetic fixup, as the "quiet" kernel parameter does
not silence pr_efi printouts in the Linux kernel (this only works using
the EFI stub from the Linux source tree).