Franck Bui [Fri, 21 Jun 2019 11:12:41 +0000 (13:12 +0200)]
coredump: rely on /proc exclusively to get the name of the crashing process
I couldn't see any reason why the kernel could provide COMM to the coredump
handler via the core_pattern command line but could not make it available in
/proc. So let's assume that this info is always available in /proc.
For "backtrace" mode (when --backtrace option is passed), I assumed that the
crashing process still exists at the time systemd-coredump is called.
Also changing the core_pattern line is an API breakage for any users of the
backtrace mode but given that systemd-coredump is installed in
/usr/lib/systemd, it's a private tool which has no internal users. At least no
one complained when the hostname was added to the core_pattern line
(f45b8015513)...
Indeed it's much easier to get it from /proc since the kernel substitutes '%e'
specifier with multiple strings if the process name contains spaces (!).
camoz [Tue, 25 Jun 2019 08:28:19 +0000 (10:28 +0200)]
systemd-nspawn(1): update example section
Remove the retired flag -d from Example 4. "Boot a minimal Arch Linux
distribution in a container". It has been retired here:
https://git.archlinux.org/arch-install-scripts.git/commit/pacstrap.in?id=0af6884aca68dcb7eed0b85fbc2960903df3d968
bootctl: fix display of options with embedeed newlines
I have an .efi image with embedded newlinews. Now I don't even remember if it
was created for testing or by accident, but it doesn't really matter. We should
display such files correctly.
(This isn't a problem with normal BLS entries, because input is split into lines
so newlines are consumed.)
Lubomir Rintel [Mon, 24 Jun 2019 17:23:13 +0000 (19:23 +0200)]
udevd: fix a reversed conditional on global property set
# udevadm control --property=HELLO=WORLD
Received udev control message (ENV), unsetting 'HELLO'
# udevadm control --property=HELLO=
Received udev control message (ENV), setting 'HELLO='
Oh no, it's busted. Let's try removing this one little negation real quick
to see if it helps...
# udevadm control --property=HELLO=WORLD
Received udev control message (ENV), setting 'HELLO=WORLD'
# udevadm control --property=HELLO=
Received udev control message (ENV), unsetting 'HELLO'
networkd: rework warning and debug messages about address addition and removal
Those messages were quite confusing. In particular "adding address" suggests
that we are assiging a new address to an interface, but in fact we're just
reacting to a notification about an addition. So let's call that "remembering"
and "forgetting". It's not fully gramatically correct, but I think it's much
clearer than "adding"/"removing" in this context.
And "received address without address" is too cryptic, let's say "address
message" to distinguish the message from its content.
Also, make failure to format address non-fatal, and print more details in
various places.
logind: log operation details when starting actions
For some reason, systemd-logind is trying to handle idle action in one of my containers:
Jun 07 10:28:08 rawhide systemd-logind[42]: System idle. Taking action.
Jun 07 10:28:08 rawhide systemd-logind[42]: Requested operation not supported, ignoring.
But we didn't log what exactly was being done. Let's put the name of the action in messages.
All callers pass either a fixed action, or HANDLE_IGNORE is explicitly filtered
out. Let's remove this case here, because we cannot properly log what opreation
we are ignoring.
Michal Sekletar [Tue, 12 Mar 2019 17:58:26 +0000 (18:58 +0100)]
core: introduce NUMAPolicy and NUMAMask options
Make possible to set NUMA allocation policy for manager. Manager's
policy is by default inherited to all forked off processes. However, it
is possible to override the policy on per-service basis. Currently we
support, these policies: default, prefer, bind, interleave, local.
See man 2 set_mempolicy for details on each policy.
Overall NUMA policy actually consists of two parts. Policy itself and
bitmask representing NUMA nodes where is policy effective. Node mask can
be specified using related option, NUMAMask. Default mask can be
overwritten on per-service level.
man: drop references to "syslog" and "syslog+console" from man page
These options are pretty much equivalent to "journal" and
"journal+console" anyway, let's simplify things, and drop them from the
documentation hence.
For compat reasons let's keep them in the code.
(Note that they are not 100% identical to 'journal', but I doubt the
distinction in behaviour is really relevant to keep this in the docs.
And we should probably should drop 'syslog' entirely from our codebase
eventually, but it's problematic as long as we semi-support udev on
non-systemd systems still.)
Anita Zhang [Mon, 20 May 2019 21:43:53 +0000 (14:43 -0700)]
bpf-firewall: optimization for IPAddressXYZ="any" (and unprivileged users)
This is a workaround to make IPAddressDeny=any/IPAddressAllow=any work
for non-root users that have CAP_NET_ADMIN. "any" was chosen since
all or nothing network access is one of the most common use cases for
isolation.
Allocating BPF LPM TRIE maps require CAP_SYS_ADMIN while BPF_PROG_TYPE_CGROUP_SKB
only needs CAP_NET_ADMIN. In the case of IPAddressXYZ="any" we can just
consistently return false/true to avoid allocating the map and limit the user
to having CAP_NET_ADMIN.
Topi Miettinen [Mon, 20 May 2019 09:20:58 +0000 (12:20 +0300)]
cgroup-util: kill also threads
It's possible for a zombie process to have live threads. These are not listed
in /sys in "cgroup.procs" for cgroupsv2, but they show up in
"cgroup.threads" (cgroupv2) or "tasks" (cgroupv1) nodes. When killing a
cgroup (v2 only) with SIGKILL, let's also kill threads after killing processes,
so the live threads of a zombie get killed too.
prefix_root() is equivalent to path_join() in almost all ways, hence
let's remove it.
There are subtle differences though: prefix_root() will try shorten
multiple "/" before and after the prefix. path_join() doesn't do that.
This means prefix_root() might return a string shorter than both its
inputs combined, while path_join() never does that. I like the
path_join() semantics better, hence I think dropping prefix_root() is
totally OK. In the end the strings generated by both functon should
always be identical in terms of path_equal() if not streq().
This leaves prefix_roota() in place. Ideally we'd have path_joina(), but
I don't think we can reasonably implement that as a macro. or maybe we
can? (if so, sounds like something for a later PR)
Anita Zhang [Mon, 3 Jun 2019 23:25:43 +0000 (16:25 -0700)]
nspawn: don't hard fail when setting capabilities
The OCI changes in #9762 broke a use case in which we use nspawn from
inside a container that has dropped capabilities from the bounding set
that nspawn expected to retain. In an attempt to keep OCI compliance
and support our use case, I made hard failing on setting capabilities
not in the bounding set optional (hard fail if using OCI and log only
if using nspawn cmdline).
cap_last_cap() returns the last valid cap (instead of the number of
valid caps). to iterate through all known caps we hence need to use a <=
check, and not a < check like for all other cases. We got this right
usually, but in three cases we did not.
Topi Miettinen [Wed, 1 May 2019 12:28:36 +0000 (15:28 +0300)]
units: deny access to block devices
While the need for access to character devices can be tricky to determine for
the general case, it's obvious that most of our services have no need to access
block devices. For logind and timedated this can be tightened further.
Donald Buczek [Thu, 25 Apr 2019 07:39:41 +0000 (09:39 +0200)]
cgroup: Continue unit reset if cgroup is busy
When part of the cgroup hierarchy cannot be deleted (e.g. because there
are still processes in it), do not exit unit_prune_cgroup early, but
continue so that u->cgroup_realized is reset.
Log the known case of non-empty cgroups at debug level and other errors
at warning level.
Iwan Timmer [Sat, 15 Jun 2019 20:54:41 +0000 (22:54 +0200)]
resolved: move TLS data shared by all servers to manager
Instead of having a context and/or trusted CA list per server this is now moved to the server. Ensures future TLS configuration options are global instead of per server.