man: tone down the note that reboot/halt/poweroff are legacy
They may be old (or rather compatible implementations of old commands), but
they certainly are not going away. Apart from privilege escalation through
polkit, they are mostly equivalent.
We might end up allocating mempools, and when we are unloaded we might
orphan them, thus leaking them. Hence, let's just stick around for good,
so the mempools remain referenced continuously, and thus no
memory is leaked (though the memory isn't cleaned up either).
ott [Tue, 12 Dec 2017 15:30:12 +0000 (16:30 +0100)]
resolve: add support for RFC 8080 (#7600)
RFC 8080 describes how to use EdDSA keys and signatures in DNSSEC. It
uses the curves Ed25519 and Ed448. Libgcrypt 1.8.1 does not support
Ed448, so only the Ed25519 is supported at the moment. Once Libgcrypt
supports Ed448, support for it can be trivially added to resolve.
networkd: Fix race condition in [RoutingPolicyRule] handling (#7615)
The routing policy rule setup logic is moved to the routes setup phase (rather than the addresses setup phase as it is now). Additionally, a call to `link_check_ready` is added to the routing policy rules setup handler. This prevents a race condition with the routes setup handler.
Also give each async handler its own message counter to prevent race conditions when logging successes.
Alan Jenkins [Sun, 10 Dec 2017 10:58:01 +0000 (10:58 +0000)]
core: fix undefined behaviour due to uninitialized string buffer (#7597)
Failure of systemd to respond on the bus interface was bisected to af6b0ecc
"core: make "taint" string logic a bit more generic and output it at boot".
Failure was presumably caused by trying to append strings to an
uninitialized buffer, leading to writing outside the unterminated buffer
and hence undefined behaviour.
Olaf Hering [Fri, 8 Dec 2017 21:21:42 +0000 (22:21 +0100)]
virt: use XENFEAT_dom0 to detect the hardware domain (#6442, #6662) (#7581)
The detection of ConditionVirtualisation= relies on the presence of
/proc/xen/capabilities. If the file exists and contains the string
"control_d", the running system is a dom0 and VIRTUALIZATION_NONE should
be set. In case /proc/xen exists, or some sysfs files indicate "xen",
VIRTUALIZATION_XEN should be set to indicate the system is a domU.
With an (old) xenlinux based kernel, /proc/xen/capabilities is always
available and the detection described above always works. But with a
pvops based kernel, xenfs must be mounted on /proc/xen to get
"capabilities". This is done by a proc-xen.mount unit, which is part of
xen.git. Since the mounting happens "late", other units may be scheduled
before "proc-xen.mount". If these other units make use of
"ConditionVirtualisation=", the virtualization detection returns
incorrect results. detect_vm() will set VIRTUALIZATION_XEN because "xen"
is found in sysfs. This value will be cached. Once xenfs is mounted, the
next process that runs detect_vm() will get VIRTUALIZATION_NONE.
This misdetection can be fixed by using
/sys/hypervisor/properties/features, which exports the value returned by
the "XENVER_get_features" hypercall. If the bit XENFEAT_dom0 is set, the
domain is the "hardware domain". It is supposed to have permissions to
access all hardware. The sysfs file used has been available since v2.6.31.
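The bit test described above can be sketched as follows. This is a minimal illustration, not the code from the commit: it assumes the sysfs file holds the feature mask as a hexadecimal string and that XENFEAT_dom0 is bit 11, as in Xen's public features.h. The function names are hypothetical.

```python
XENFEAT_DOM0 = 11  # assumed bit index, per Xen's public features.h


def is_hardware_domain(features_text):
    """Return True if the XENFEAT_dom0 bit is set in a hex feature mask."""
    features = int(features_text.strip(), 16)
    return bool(features & (1 << XENFEAT_DOM0))


def read_features(path="/sys/hypervisor/properties/features"):
    """Read the raw feature mask; return None when the file cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return None
```

A caller would treat a None result from read_features() as "no information" and fall back to the other detection methods.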
The commonly used term "dom0" refers to the control domain which runs
the toolstack and has access to all hardware. But the virtualization
host may be configured such that one dedicated domain becomes the
"hardware domain", and another one the "toolstack domain".
Edward A. James [Fri, 8 Dec 2017 17:26:44 +0000 (11:26 -0600)]
core: Add WatchdogDevice config option and implement it
This option allows a device path to be specified for the systemd
watchdog (both runtime and shutdown).
If a system requires a watchdog other than /dev/watchdog (pointing to
/dev/watchdog0) to be used to reboot the system, this setting should be
changed to the relevant watchdog device path (e.g. /dev/watchdog1).
Edward A. James [Fri, 8 Dec 2017 17:26:30 +0000 (11:26 -0600)]
watchdog: allow a device path to be specified
Currently systemd hardcodes the use of /dev/watchdog. This is a legacy
chardev that points to watchdog0 in the system.
Modify the watchdog API to allow a different device path to be passed
and stored. Opening the watchdog defaults to /dev/watchdog, maintaining
existing behavior.
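The "stored path with a default" behavior can be modeled roughly like this. The class and method names are hypothetical and only illustrate the fallback to /dev/watchdog; they are not the actual watchdog API.

```python
DEFAULT_WATCHDOG_DEVICE = "/dev/watchdog"


class Watchdog:
    """Hypothetical model of a watchdog handle with a configurable device path."""

    def __init__(self):
        self._device = None  # None means "use the default device"

    def set_device(self, path):
        """Store a device path; passing None restores the default."""
        self._device = path

    @property
    def device(self):
        """Path that would be opened: the stored one, else the default."""
        return self._device or DEFAULT_WATCHDOG_DEVICE
```

Callers that never set a path keep opening /dev/watchdog, which is how the existing behavior is preserved.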
Patrik Flykt [Fri, 8 Dec 2017 12:33:40 +0000 (14:33 +0200)]
networkd: Ignore DNS information when uplink is not managed (#7571)
When another networking daemon or configuration is handling the
uplink connection, systemd-networkd won't have a network configuration
associated with the link, and therefore link->network will be NULL.
An assert will be triggered later on in the code when link->network is
NULL.
Dmitry Rozhkov [Mon, 16 Oct 2017 14:25:17 +0000 (17:25 +0300)]
resolved: add authority section to mDNS probing queries
According to RFC 6762 Section 8.2 "Simultaneous Probe Tiebreaking"
probing queries' Authority Section is populated with proposed
resource records in order to resolve possible race conditions.
Dmitry Rozhkov [Tue, 31 Oct 2017 08:34:58 +0000 (10:34 +0200)]
resolved: set cache-flush bit on mDNS responses
From RFC 6762, Section 10.2
"They (the rules about when to set the cache-flush bit) apply to
startup announcements as described in Section 8.3, "Announcing",
and to responses generated as a result of receiving query messages."
So, set the cache-flush bit for mDNS answers except for DNS-SD
service enumeration PTRs described in RFC 6763, Section 4.1.
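As a rough sketch of the rule: the cache-flush bit is the top bit of the rrclass field in an mDNS resource record. The check used here to recognize DNS-SD PTR records (a PTR whose name ends in _tcp.local or _udp.local) is an assumption made for illustration, and the function names are hypothetical.

```python
CACHE_FLUSH = 0x8000  # top bit of the rrclass field (RFC 6762, Section 10.2)
CLASS_IN = 1
TYPE_PTR = 12


def is_dnssd_enumeration_ptr(name, rrtype):
    """Assumed heuristic: PTR records under _tcp.local or _udp.local are shared."""
    labels = name.lower().rstrip(".").split(".")
    return rrtype == TYPE_PTR and labels[-2:] in (["_tcp", "local"], ["_udp", "local"])


def wire_class(name, rrtype, rrclass=CLASS_IN):
    """Return the class value to put on the wire for an mDNS answer."""
    if is_dnssd_enumeration_ptr(name, rrtype):
        return rrclass  # shared record: leave the cache-flush bit clear
    return rrclass | CACHE_FLUSH
```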
Olaf Hering [Thu, 7 Dec 2017 20:09:32 +0000 (21:09 +0100)]
virt: propagate errors in detect_vm_xen_dom0 (#7553)
Update detect_vm_xen_dom0 to propagate errors in case reading
/proc/xen/capabilities fails. This does not fix any bugs, it just makes
it consistent with other functions called by detect_vm.
sulogin-shell: do daemon-reload before starting default target
If the user modifies configuration, e.g. /etc/fstab, they might forget to tell
systemd about the changes. Let's do a reload for them.
Note that doing a reload should be safe, because emergency and rescue modes are
"single threaded" and nothing should be doing changes at the point where we are
exiting from the sushell. Also, daemon-reload can be implicitly called at
various moments, so we can ignore the case where the user did some incompatible
changes on disk and is counting on systemd never reloading and picking them up.
This is actually slightly safer because it allows gcc to make sure that all code
paths either call return or are noreturn. But the real motivation is just to
follow the usual style and make it a bit shorter.
core: make "taint" string logic a bit more generic and output it at boot
The tainting logic existed for a long time, but was hidden inside the
bus interfaces. Let's give it a small bit more coverage, by logging its
value early at boot during initialization.
units: delegate only "cpu" and "pids" controllers by default (#7564)
Now that we can configure which controllers to delegate precisely, let's
limit what we delegate to the user session: only "cpu" and "pids" as a
minimal baseline.
Yu Watanabe [Thu, 7 Dec 2017 05:21:13 +0000 (14:21 +0900)]
bootspec: fix debug message about default entry
When no entry matches entry_oneshot, entry_default or
default_pattern, the log message shows the wrong entry.
Moreover, if none of entry_oneshot, entry_default and default_pattern
is set, the index `i` is uninitialized.
This fixes both problems.
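A selection routine that avoids both problems returns the index together with the reason it was chosen, so the log message can never refer to an index that was not set. The sketch below is illustrative only: the function name, the plain string entry names, and the fallback to the last entry are assumptions, not a copy of the bootspec code.

```python
from fnmatch import fnmatchcase


def select_default(entries, oneshot=None, default=None, pattern=None):
    """Return (index, reason); index is -1 when there are no entries."""
    # Exact-name settings are tried first, in priority order.
    for wanted, reason in ((oneshot, "oneshot"), (default, "default")):
        if wanted is not None:
            for i, name in enumerate(entries):
                if name == wanted:
                    return i, reason
    # Then the glob pattern, if one is configured.
    if pattern is not None:
        for i, name in enumerate(entries):
            if fnmatchcase(name, pattern):
                return i, "pattern"
    # Assumed fallback: pick the last entry when nothing matched.
    if entries:
        return len(entries) - 1, "last entry"
    return -1, "no entries"
```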
core: move write_container_id() invocation into initialize_runtime()
This moves the invocation a bit later, but that shouldn't matter. By
moving it we gain two things: first of all, it's closer to other code
where it belongs, secondly it's naturally conditioned properly, as we no
longer will rewrite the container ID file on every reexecution again,
and not in test mode either.
No real functional changes, just some rearranging to shorten the overly
long main() function a bit.
This gets rid of the arm_reboot_watchdog variable, as it can be directly
derived from shutdown_verb, and we need it only one time. By dropping it
we can reduce the number of arguments we need to pass around.
```
$ ./src/test/test-systemd-tmpfiles.py valgrind --leak-check=full --error-exitcode=1 ./build/systemd-tmpfiles
...
Running valgrind --leak-check=full --error-exitcode=1 ./build/systemd-tmpfiles on 'w /unresolved/argument - - - - "%Y"'
...
[<stdin>:1] Failed to substitute specifiers in argument: Invalid slot
...
==22602== 5 bytes in 1 blocks are definitely lost in loss record 1 of 2
==22602== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==22602== by 0x4ECA7D4: malloc_multiply (alloc-util.h:74)
==22602== by 0x4ECA909: specifier_printf (specifier.c:59)
==22602== by 0x113490: specifier_expansion_from_arg (tmpfiles.c:1923)
==22602== by 0x1144E7: parse_line (tmpfiles.c:2159)
==22602== by 0x11551C: read_config_file (tmpfiles.c:2425)
==22602== by 0x115AB0: main (tmpfiles.c:2529)
```
Olaf Hering [Wed, 6 Dec 2017 18:59:30 +0000 (19:59 +0100)]
virt: use /proc/xen as indicator for a Xen domain (#6442, #6662) (#7555)
The file /proc/xen/capabilities is only available if xenfs is mounted.
With a classic xenlinux based kernel that file is available
unconditionally. But with a modern pvops based kernel, xenfs must be
mounted before the "capabilities" may appear. xenfs is mounted very late
via .service files provided by the Xen toolstack. Other units may be
scheduled before xenfs is mounted, which will confuse the detection of
VIRTUALIZATION_XEN.
In all Xen enabled kernels, and if that kernel is actually running on
the Xen hypervisor, the "/proc/xen" directory is the reliable indicator
that this instance runs in a "Xen guest".
Adjust the code to check for /proc/xen instead of
/proc/xen/capabilities.
Fixes commit 3f61278b5 ("basic: Bugfix Detect XEN Dom0 as no virtualization")
Max Resch [Wed, 6 Dec 2017 14:29:52 +0000 (15:29 +0100)]
Set secure_boot flag in Kernel Zero-Page (#7482)
Setting the secure_boot flag avoids getting the printout
"EFI stub: UEFI Secure Boot is enabled." when booting
a Linux kernel with linuxx64.efi.stub and EFI SecureBoot enabled.
This is mainly a cosmetic fixup, as the "quiet" kernel parameter does
not silence pr_efi printouts in the Linux kernel (this only works using
the EFI stub from the Linux source tree).
test-execute: use the "nogroup" group if it exists for testing
We currently look for "nobody" and "nfsnobody" when testing groups, both
of which do not exist on Ubuntu, our main testing environment. Let's
extend the tests slightly to also use "nogroup" if it exists.
journal,coredump: do not do ACL magic for "nobody" user either
The "nobody" user might possibly be seen by the journal or coredumping
code if unmapped userns-using processes are somehow visible to them.
Let's make sure we don't do the ACL magic for this user either, since
this is a special system user that might be backed by different real
users in different contexts.
user-util: synthesize user records for "nobody" the same way as for "root"
We already synthesize records for both "root" and "nobody" in
nss-systemd. Let's do the same in our own NSS wrappers that are supposed
to bypass NSS if possible. Previously this was done for "root" only, but
let's clean this up, and do the same for "nobody" too, so that we
synthesize records the same way everywhere, regardless whether in NSS or
internally.
nss-systemd: tweak checks when we consult PID 1 for dynamic UID/GID lookups
Instead of contacting PID 1 for dynamic UID/GID lookups for all
UIDs/GIDs that do not qualify as "system", do the more precise check:
check if they actually qualify for the "dynamic" range.
This adds uid_is_system() and gid_is_system(), similar in style to
uid_is_dynamic(). That a helper like this is useful is illustrated by
the fact that test-condition.c didn't get the check right so far, which
this patch fixes.
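The difference between the two checks can be sketched as below. The numeric bounds are assumptions chosen for illustration (in practice they are build-time settings), and should_ask_pid1() is a hypothetical name for the decision being described.

```python
SYSTEM_UID_MAX = 999      # assumed upper bound of the "system" range
DYNAMIC_UID_MIN = 0xEF00  # assumed lower bound of the "dynamic" range (61184)
DYNAMIC_UID_MAX = 0xFFEF  # assumed upper bound of the "dynamic" range (65519)


def uid_is_system(uid):
    """True for UIDs in the (assumed) system range."""
    return 0 <= uid <= SYSTEM_UID_MAX


def uid_is_dynamic(uid):
    """True for UIDs in the (assumed) dynamic range."""
    return DYNAMIC_UID_MIN <= uid <= DYNAMIC_UID_MAX


def should_ask_pid1(uid):
    """Precise check: consult PID 1 only for UIDs in the dynamic range."""
    return uid_is_dynamic(uid)
```

With the older "not system" test, a regular user such as UID 1000 would also have triggered a lookup; with the range check it does not.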
resolved: downgrade log messages about incoming LLMNR/mDNS packets on unexpected scopes
This might very well happen due to races between joining multicast
groups and network configuration and such, let's not complain, but just
drop the messages at debug level.
test-systemd-tmpfiles: respect $HOME in test for %h expansion
%h is a special specifier because we look at $HOME (unless running suid, but
let's say that this case does not apply to tmpfiles, since the code is
completely unready to be run suid). For all other specifiers we query the user
db and use those values directly. I'm not sure if this exception is good, but
let's just "document" status quo for now. If this is changed, it should be in
a separate PR.