core: skip whitespace after "|" and "!" in the condition parser
We'd skip any whitespace immediately after "=", but then we'd treat whitespace
that is between "|" or "!" and the value as significant. This is rather
confusing, let's ignore it too.
Zach Smith [Wed, 26 Jun 2019 13:55:37 +0000 (06:55 -0700)]
systemd-sleep: refuse to calculate swapfile offset on Btrfs
If hibernation is requested but /sys/power/resume and
/sys/power/resume_offset are not configured correctly, systemd-sleep
attempts to calculate swapfile offset using fstat and fiemap.
Btrfs returns virtual device number for stat and a virtual offset
for fiemap which results in incorrect offset calculations. In the
case where offset would be calculated and the user is using Btrfs,
log a debug message and fail to write device and offset values.
Zach Smith [Sun, 9 Jun 2019 03:44:34 +0000 (20:44 -0700)]
systemd-sleep: (bug) use resume_offset value if set
Use hibernation configuration as defined in
/sys/power/resume and /sys/power/resume_offset
if present before inspecting /proc/swaps and
attempting to calculate swapfile offset
core: do not enumerate units in MANAGER_TEST_RUN_MINIMAL mode
In this mode we are not supposed to "interact with the environment", so loading
all units and printing warnings about syntax errors and /var/run usage seems
inappropriate.
man: move description of how conditions are combined to the beginning
Originally the description of conditions was brief, so it was acceptable
to put this part at the end. But now we have a myriad conditions, and
this crucial bit of information is easy to miss.
camoz [Tue, 25 Jun 2019 08:28:19 +0000 (10:28 +0200)]
systemd-nspawn(1): update example section
Remove the retired flag -d from Example 4. "Boot a minimal Arch Linux
distribution in a container". It has been retired here:
https://git.archlinux.org/arch-install-scripts.git/commit/pacstrap.in?id=0af6884aca68dcb7eed0b85fbc2960903df3d968
bootctl: fix display of options with embedeed newlines
I have an .efi image with embedded newlinews. Now I don't even remember if it
was created for testing or by accident, but it doesn't really matter. We should
display such files correctly.
(This isn't a problem with normal BLS entries, because input is split into lines
so newlines are consumed.)
Lubomir Rintel [Mon, 24 Jun 2019 17:23:13 +0000 (19:23 +0200)]
udevd: fix a reversed conditional on global property set
# udevadm control --property=HELLO=WORLD
Received udev control message (ENV), unsetting 'HELLO'
# udevadm control --property=HELLO=
Received udev control message (ENV), setting 'HELLO='
Oh no, it's busted. Let's try removing this one little negation real quick
to see if it helps...
# udevadm control --property=HELLO=WORLD
Received udev control message (ENV), setting 'HELLO=WORLD'
# udevadm control --property=HELLO=
Received udev control message (ENV), unsetting 'HELLO'
networkd: rework warning and debug messages about address addition and removal
Those messages were quite confusing. In particular "adding address" suggests
that we are assiging a new address to an interface, but in fact we're just
reacting to a notification about an addition. So let's call that "remembering"
and "forgetting". It's not fully gramatically correct, but I think it's much
clearer than "adding"/"removing" in this context.
And "received address without address" is too cryptic, let's say "address
message" to distinguish the message from its content.
Also, make failure to format address non-fatal, and print more details in
various places.
logind: log operation details when starting actions
For some reason, systemd-logind is trying to handle idle action in one of my containers:
Jun 07 10:28:08 rawhide systemd-logind[42]: System idle. Taking action.
Jun 07 10:28:08 rawhide systemd-logind[42]: Requested operation not supported, ignoring.
But we didn't log what exactly was being done. Let's put the name of the action in messages.
All callers pass either a fixed action, or HANDLE_IGNORE is explicitly filtered
out. Let's remove this case here, because we cannot properly log what opreation
we are ignoring.
Michal Sekletar [Tue, 12 Mar 2019 17:58:26 +0000 (18:58 +0100)]
core: introduce NUMAPolicy and NUMAMask options
Make possible to set NUMA allocation policy for manager. Manager's
policy is by default inherited to all forked off processes. However, it
is possible to override the policy on per-service basis. Currently we
support, these policies: default, prefer, bind, interleave, local.
See man 2 set_mempolicy for details on each policy.
Overall NUMA policy actually consists of two parts. Policy itself and
bitmask representing NUMA nodes where is policy effective. Node mask can
be specified using related option, NUMAMask. Default mask can be
overwritten on per-service level.
man: drop references to "syslog" and "syslog+console" from man page
These options are pretty much equivalent to "journal" and
"journal+console" anyway, let's simplify things, and drop them from the
documentation hence.
For compat reasons let's keep them in the code.
(Note that they are not 100% identical to 'journal', but I doubt the
distinction in behaviour is really relevant to keep this in the docs.
And we should probably should drop 'syslog' entirely from our codebase
eventually, but it's problematic as long as we semi-support udev on
non-systemd systems still.)
Anita Zhang [Mon, 20 May 2019 21:43:53 +0000 (14:43 -0700)]
bpf-firewall: optimization for IPAddressXYZ="any" (and unprivileged users)
This is a workaround to make IPAddressDeny=any/IPAddressAllow=any work
for non-root users that have CAP_NET_ADMIN. "any" was chosen since
all or nothing network access is one of the most common use cases for
isolation.
Allocating BPF LPM TRIE maps require CAP_SYS_ADMIN while BPF_PROG_TYPE_CGROUP_SKB
only needs CAP_NET_ADMIN. In the case of IPAddressXYZ="any" we can just
consistently return false/true to avoid allocating the map and limit the user
to having CAP_NET_ADMIN.
Topi Miettinen [Mon, 20 May 2019 09:20:58 +0000 (12:20 +0300)]
cgroup-util: kill also threads
It's possible for a zombie process to have live threads. These are not listed
in /sys in "cgroup.procs" for cgroupsv2, but they show up in
"cgroup.threads" (cgroupv2) or "tasks" (cgroupv1) nodes. When killing a
cgroup (v2 only) with SIGKILL, let's also kill threads after killing processes,
so the live threads of a zombie get killed too.
prefix_root() is equivalent to path_join() in almost all ways, hence
let's remove it.
There are subtle differences though: prefix_root() will try shorten
multiple "/" before and after the prefix. path_join() doesn't do that.
This means prefix_root() might return a string shorter than both its
inputs combined, while path_join() never does that. I like the
path_join() semantics better, hence I think dropping prefix_root() is
totally OK. In the end the strings generated by both functon should
always be identical in terms of path_equal() if not streq().
This leaves prefix_roota() in place. Ideally we'd have path_joina(), but
I don't think we can reasonably implement that as a macro. or maybe we
can? (if so, sounds like something for a later PR)
Anita Zhang [Mon, 3 Jun 2019 23:25:43 +0000 (16:25 -0700)]
nspawn: don't hard fail when setting capabilities
The OCI changes in #9762 broke a use case in which we use nspawn from
inside a container that has dropped capabilities from the bounding set
that nspawn expected to retain. In an attempt to keep OCI compliance
and support our use case, I made hard failing on setting capabilities
not in the bounding set optional (hard fail if using OCI and log only
if using nspawn cmdline).
cap_last_cap() returns the last valid cap (instead of the number of
valid caps). to iterate through all known caps we hence need to use a <=
check, and not a < check like for all other cases. We got this right
usually, but in three cases we did not.
Topi Miettinen [Wed, 1 May 2019 12:28:36 +0000 (15:28 +0300)]
units: deny access to block devices
While the need for access to character devices can be tricky to determine for
the general case, it's obvious that most of our services have no need to access
block devices. For logind and timedated this can be tightened further.
Donald Buczek [Thu, 25 Apr 2019 07:39:41 +0000 (09:39 +0200)]
cgroup: Continue unit reset if cgroup is busy
When part of the cgroup hierarchy cannot be deleted (e.g. because there
are still processes in it), do not exit unit_prune_cgroup early, but
continue so that u->cgroup_realized is reset.
Log the known case of non-empty cgroups at debug level and other errors
at warning level.