nspawn: do not emit any warning when $UNIFIED_CGROUP_HIERARCHY is used
Initially I thought this is a good idea, but when reviewing a different PR
(https://github.com/systemd/systemd/pull/13862#discussion_r340604313) I changed
my mind about this. At some point we probably should start warning about the
old option name, and yet later remove it. But it'll make it easier for people
to transition to the new option name if there's a period of support for both
names without any fuss. There's nothing particularly wrong about the old name,
and there is no support cost.
Martin Wilck [Tue, 12 Nov 2019 15:43:42 +0000 (16:43 +0100)]
udevd: fix crash when workers time out after exit is signal caught
If udevd receives an exit signal, it releases its reference on the udev
monitor in manager_exit(). If at this time a worker is hanging, and if
the event timeout for this worker expires before udevd exits, udevd
crashes in on_sigchld()->udev_monitor_send_device(), because the monitor
has already been freed.
Fix this by releasing the main process's monitor ref later, in
manager_free().
Franck Bui [Fri, 18 Oct 2019 10:44:51 +0000 (12:44 +0200)]
logind: fix (again) the race that might happen when logind restores VT
This patch is a new attempt to fix the race originally described in issue #9754.
The initial fix (commit ad96887a1205bad9656d280c5681f482e6d04838) consisted in
spawning a sub process that became the controlling process of the VT and hence
kicked the old controlling process off to make sure that the VT wouldn't have
entered in HUP state while logind restored the VT.
But it introduced a regression (see issue #11269) and thus was reverted. But
unlike it was described in the revert commit message, commit adb8688b3ff445d9c48ed0d72208c7844c2acc01 alone doen't fix the initial race.
This patch fixes the race in a simpler way by trying to restore the VT a second
time after making sure to re-open it if the first attempt fails.
Indeed if the old controlling process dies before or during the first attempt,
logind will fail to restore the VT. At this point the VT is in HUP state but
we're sure that it won't enter in a HUP state a second time. Therefore we will
retry by re-opening the VT to clear the HUP state and by restoring the VT a
second time, which should be safe this time.
Martin Wilck [Wed, 6 Nov 2019 11:24:41 +0000 (12:24 +0100)]
udevd: wait for workers to finish when exiting
On some systems with lots of devices, device probing for certain drivers can
take a very long time. If systemd-udevd detects a timeout and kills the worker
running modprobe using SIGKILL, some devices will not be probed, or end up in
unusable state. The --event-timeout option can be used to modify the maximum
time spent in an uevent handler. But if systemd-udevd exits, it uses a
different timeout, hard-coded to 30s, and exits when this timeout expires,
causing all workers to be KILLed by systemd afterwards. In practice, this may
lead to workers being killed after significantly less time than specified with
the event-timeout. This is particularly significant during initrd processing:
systemd-udevd will be stopped by systemd when initrd-switch-root.target is
about to be isolated, which usually happens quickly after finding and mounting
the root FS.
If systemd-udevd is started by PID 1 (i.e. basically always), systemd will
kill both udevd and the workers after expiry of TimeoutStopSec. This is
actually better than the built-in udevd timeout, because it's more transparent
and configurable for users. This way users can avoid the mentioned boot problem
by simply increasing StopTimeoutSec= in systemd-udevd.service.
If udevd is not started by systemd (standalone), this is still an
improvement. udevd will kill hanging workers when the event timeout is
reached, which is configurable via the udev.event_timeout= kernel
command line parameter. Before this patch, udevd would simply exit with
workers still running, which would then become zombie processes.
With the timeout removed, the sd_event_now() assertion in manager_exit() can be
dropped.
test-unit-name: add usual headers and add more verbose output
This makes it easier to see what unit_name_is_valid() returns at a glance.
The output is not whitespace clean, but I think it's good enough for a test.
We compile some c++ code for tests. We would simply use the default options for
those. When the previous commit raised the default warning level, we started
getting warnings from c++ code. Let's add the most important options to the c++
command, so that we get a compilation without any warnings again.
I don't think it makes sense to add *all* the options that we add for c to the
c++ flags, because testing them takes quite a while, and the c++ compilations
are for small amounts of code, mostly to check that the headers have compatible
syntax.
Let's bump up the warning level, and not add by -Wextra by hand. This is the
approach recommended by meson. The idea is that all projects should be as
similar as possible to make it easier for users to switch between projects.
With meson-0.52.0-1.module_f31+6771+f5d842eb.noarch I get:
src/test/meson.build:19: WARNING: Overriding previous value of environment variable 'PATH' with a new one
When we're using *prepend*, the whole point is to modify an existing variable,
so meson shouldn't warn. But let's set avoid the warning and shorten things by
setting the final value immediately.
The code in cgroup.c has support for all hierarchies, but the test,
as written, will only work on unified. Since the test is really about
bpf code, and not the legacy devices controller, let's just skip
the test.
(Yes, Ubuntu should install the UTC timezone data unconditionally: it
should not be an option, even if all other timezone data is excluded,
but since it's our business to validate user input but not out business
to validate distros, let's just accept "UTC" unconditionally, it's magic
after all)
bpf: make sure the kernel do not submit an invalid program if no pattern matched
It turns out that the kernel verifier would reject a program we would build
if there was a whitelist, but no entries in the whitelist matched.
The program would approximately like this:
0: (61) r2 = *(u32 *)(r1 +0)
1: (54) w2 &= 65535
2: (61) r3 = *(u32 *)(r1 +0)
3: (74) w3 >>= 16
4: (61) r4 = *(u32 *)(r1 +4)
5: (61) r5 = *(u32 *)(r1 +8)
48: (b7) r0 = 0
49: (05) goto pc+1
50: (b7) r0 = 1
51: (95) exit
and insn 50 is unreachable, which is illegal. We would then either keep a
previous version of the program or allow everything. Make sure we build a
valid program that simply rejects everything.
This makes the code a bit longer, but easier to read I think, because
the cgroup v1 and v2 code paths are more similar. And whent he type is
a char, any backtrace is easier to interpret.
core: move bpf devices implementation to bpf-devices.[ch] and rename
The naming of the functions was a complete mess: the most specific functions
which don't know anything about cgroups had "cgroup_" prefix, while more
general functions which took a node path and a cgroup for reporting had no
prefix. Let's use "bpf_devices_" for the latter group, and "bpf_prog_*" for the
rest.
The main goal of this move is to split the implementation from the calling code
and add unit tests in a later patch.
Inspired by https://bugzilla.redhat.com/show_bug.cgi?id=1769299.
This change doesn't solve the issue, but makes it easier to whitelist the
syscall group.
From https://bugzilla.redhat.com/show_bug.cgi?id=1770154:
> utime is an obsolete system call. The current kernel interface is
> utimensat_time64. New 32-bit architectures do not even provide the utime
> system call.
Also add all other *time64 syscalls listed in
https://fedora.juszkiewicz.com.pl/syscalls.html.
Michal Suchanek [Mon, 4 Nov 2019 20:23:15 +0000 (21:23 +0100)]
libblkid: open device in nonblock mode.
When autoclose is set (kernel default but many distributions reverse the
setting) opening a CD-rom device causes the tray to close.
The function of blkid is to report the current state of the device and
not to change it. Hence it should use O_NONBLOCK when opening the
device to avoid closing a CD-rom tray.
blkid is used liberally in scripts so it can potentially interfere with
the user operating the CD-rom hardware.
[kzak@redhat.com: add O_NONBLOCK also to:
- wipefs
- blkid_new_probe_from_filename()
- blkid_evaluate_tag()]
Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Karel Zak <kzak@redhat.com>
(cherry picked from commit 39f5af25982d8b0244000e92a9d0e0e6557d0e17)
Anita Zhang [Thu, 7 Nov 2019 06:25:43 +0000 (22:25 -0800)]
include missing_fcntl.h where needed
f5947a5e925117c55b390460d592f57504277bf9 dropped missing.h and
replaced with the more specific headers but did not add
missing_fcntl.h in places that use O_TMPFILE. This is needed for
some older versions of glibc.
Anita Zhang [Tue, 5 Nov 2019 02:29:55 +0000 (18:29 -0800)]
core: change top-level drop-in from -.service.d to service.d
Discussed in #13743, the -.service semantic conflicts with the
existing root mount and slice names, making this feature not
uniformly extensible to all types. Change the name to be
<type>.d instead.
Updating to this format also extends the top-level dropin to
unit types.
We want users to use Wants, but we'd describe Requires first and ask users to
look for Wants instead. While at it, let's split the wall of text into sensible
paragraphs: syntax first, followed by semantics and longer description, and
finally hints and comparison to other configuration items last.
meson: remove strange dep that causes meson to enter infinite loop
The value is obviously bogus, but didn't seem to cause problems so far.
With meson-0.52.0, it causes a hang. The number of aliases is always rather
small (usually just one or two, possibly up to a dozen in a few cases), so
even if this causes some looping, it is strange that it has such a huge impact.
But let's just remove it.
Fixes #13742.
Tested with meson-0.52.0-1.module_f31+6771+f5d842eb.noarch,
meson-0.51.1-1.fc29.noarch.
Kevin Kuehler [Mon, 4 Nov 2019 22:48:06 +0000 (14:48 -0800)]
systemctl: Add TriggeredBy and Triggers to status
For all units that aren't timers, if it is activated by another unit,
add the triggering unit under the "TriggeredBy:" header. If a unit can
trigger other units, print the units it triggers other the "Triggers:"
header.
Fixes #13756. We were returning things that didn't make much sense:
we would always use the exit_code value as the exit code. But it sometimes
contains a exit code from the process, and sometimes the number of a signal
that was used to kill the process. We would also ignore SuccessExitStatus=
and in general whether systemd thinks the service exited successfully
(hence the issue in #13756, where systemd would return success/SIGTERM,
but we'd just look at the SIGTERM part.)
If we are doing --wait, let's always propagate the exit code/status from
the child.
Lorenz Bauer [Mon, 4 Nov 2019 16:35:46 +0000 (16:35 +0000)]
journal: refresh cached credentials of stdout streams
journald assumes that getsockopt(SO_PEERCRED) correctly identifies the
process on the remote end of the socket. However, this is incorrect
according to man 7 socket:
The returned credentials are those that were in effect at the
time of the call to connect(2) or socketpair(2).
This becomes a problem when a new process inherits the stdout stream
from a parent. First, log messages from the child process will
be attributed to the parent. Second, the struct ucred used by journald
becomes invalid as soon as the parent exits. Further sendmsg calls then
fail with ENOENT. Logs for the child process then vanish from the journal.
Fix this by using recvmsg on the stdout stream, and refreshing the cached
struct ucred if SCM_CREDENTIALS indicate a new process.
Sebastian Wick [Thu, 31 Oct 2019 13:27:24 +0000 (14:27 +0100)]
hwdb: add XKB_FIXED_MODEL to the keyboard hwdb
Chromebook keyboards have a top row which generates f1-f10 key codes but
the keys have media symbols printed on them. A simple scan code to key
code mapping to the correct media keys makes the f1-f10 inaccessible. To
properly use the keyboard a custom key code to symbol mapping in xbk is
required (a variant of the chromebook xkb model is already upstream).
Other devices have similar problems.
This commit makes it possible to specify which xkb model should be used
for a specific device by setting XKB_FIXED_MODEL.