tmpfiles: check if not too many symbolic links. (#7423)
Some filesystems do not set d_type value when
readdir is called, so entry type is unknown.
Therefore check if accessing entry does not
return ELOOP error.
Michael Vogt [Fri, 24 Nov 2017 20:03:05 +0000 (21:03 +0100)]
networkd: auto promote links if "promote_secondaries" is unset (#7167)
The DHCP code in systemd-networkd relies on the
`net.ipv4.conf.{default,all,<if>}.promote_secondaries` sysctl to be set
(the kernels default is that it is unset). If this sysctl is not set
DHCP will work most of the time, however when the IP address changes
between leases then the system will loose its IP.
Because some distributions decided to not ship these defaults (Debian
is an example and via downstream Ubuntu) networkd by default will now
enable this sysctl opton automatically.
nspawn: hash the machine name, when looking for a suitable UID base (#7437)
When "-U" is used we look for a UID range we can use for our container.
We start with the UID the tree is already assigned to, and if that
didn't work we'd pick random ranges so far. With this change we'll first
try to hash a suitable range from the container name, and use that if it
works, in order to make UID assignments more likely to be stable.
This follows a similar logic PID 1 follows when using DynamicUser=1.
meson: restore building of man pages on demand even if -Dman=false
I want to configure -Dman=false for speed, but be able to build a specific
man page sometimes to check my edits. Commit 5b316b9ea6c broke this by mistake.
Let's adjust the condition to better match the logic of disabling tests only
if xsltproc is really not found.
All other places where libkmod.h is included are guarded. Build would
fail with:
In file included from ../src/core/kmod-setup.c:35:0:
../src/basic/module-util.h:23:10: fatal error: libkmod.h: No such file or directory
#include <libkmod.h>
^~~~~~~~~~~
compilation terminated.
logind: don't propagate firmware misbehaviours to bus clients
If for some reason we can't query the firmware state, don't propagate
that to clients, but instead log about it, and claim that
reboot-to-firmware is not available (which is the right answer, since it
is not working).
Let's log about this though, as this is certainly relevant to know, even
though not for the client.
This watches controllers on the bus, and unsets them automatically when
they disappear.
Note that this is primarily a cosmetical fix. Since unique bus names are not
recycled, there's strictly no need to forget about them, but it's a lot
nicer to do so.
nspawn: make use of the RequestStop logic of scope units
Since time began, scope units had a concept of "Controllers", a bus peer
that would be notified when somebody requested a unit to stop. None of
our code used that facility so far, let's change that.
This way, nspawn can print a nice message when somebody invokes
"systemctl stop" on the container's scope unit, and then react with the
right action to shut it down.
test: fix test-mount-util when handling duplicate mounts on the same location
The test was written so far under the assumption that if two mounts are
placed onto the same location the "upper" mount is listed later in
/proc/self/mountinfo. This appears not to be guaranteed however, as
running the tests in a normal nspawn shows.
This patch fixes that: it reverses the hashmap of mounts we build:
instead of keying by path, we key by mnt_id, and if we notice that
path_get_mnt_id() doesn't match what a line in /proc/self/mountinfo
says, we use the returned ID to check if maybe another line agrees.
mount-util: EOVERFLOW might have other causes than buffer size issues
When we get EOVERFLOW this might be caused by untriggered nfs4 mounts
(see discussion at
https://github.com/systemd/systemd/pull/7395#issuecomment-346164481 and
further down).
Handle this nicely by falling back to fdinfo-based mntid determination.
mount-util: drop exponential buffer growing in name_to_handle_at_loop()
So, it appears name_to_handle_at() always returns the right buffer size
on EOVERFLOW, when it's returned due to a too small buffer. Let's rely
on that exclusively for sizing the buffer, and let's drop the
exponential buffer growing.
The new logic is now: if we see EOVERFLOW and the returned size has
increased, resize our buffer and try again. But if it didn't increase,
then propagate the EOVERFLOW as it likely has other causes.
Yu Watanabe [Thu, 23 Nov 2017 12:25:56 +0000 (21:25 +0900)]
core/manager: check the existance of the special units (#7433)
In the user mode, not all special units exist.
So, we need to check whether the units exist or not before operate
something to the units.
Such the check was mistakenly dropped by e68537f0ba1a4433ecdf58e609b1701ed7091abc.
cgroup: check whether unified hierarchy is writable
When systemd is running inside a container employing user
namespaces it currently mounts the unified cgroup hierarchy
without being able to write to it. This causes systemd to
freeze during boot.
This patch checks whether the unified cgroup hierarchy
is writable. If it is not it will not mount it.
This solution is based on a patch by Evgeny Vereshchagin.
path_prepend returned a status code, but it wasn't looked at anywhere.
Adding checks for the return value in all the bazillion places where it
is called is not very attractive, so let's just make the whole program
abort cleanly if the (very unlikely) oom is encountered.
Susant Sahani [Wed, 22 Nov 2017 07:23:22 +0000 (12:53 +0530)]
networkd: introduce vxcan netdev. (#7150)
Similar to the virtual ethernet driver veth, vxcan implements a
local CAN traffic tunnel between two virtual CAN network devices.
When creating a vxcan, two vxcan devices are created as pair
When one end receives the packet it appears on its pair and vice
versa. The vxcan can be used for cross namespace communication.
Smack LSM needs the capability CAP_MAC_ADMIN to allow
setting of the current Smack exec label. Consequently,
dropping capabilities must be done after changing the
current exec label.
This is only related to Smack LSM. But for clarity and
regularity, all setting of security context moved before
dropping capabilities.
test: add a test case that validates cgroup delegation
This test runs on the unified hierarchy, and ensures that cgroup
delegation works properly, i.e. writ access is granted and the requested
controllers are enabled.
Make sure to add the delegation mask to the mask of controllers we have
to enable on our own unit. Do not claim it was a members mask, as such
a logic would mean we'd collide with cgroupv2's "no processes on inner
nodes policy".
This change does the right thing: it means any controller enabled
through Controllers= will be made available to subcrgoups of our unit,
but the unit itself has to still enable it through
cgroup.subtree_control (which it can since that file is delegated too)
to be inherited further down.
Or to say this differently: we only should manipulate
cgroup.subtree_control ourselves for inner nodes (i.e. slices), and
for leaves we need to provide a way to enable controllers in the slices
above, but stay away from the cgroup's own cgroup.subtree_control —
which is what this patch ensures.
cgroup: properly determine cgroups zombie processes belong to
When a process becomes a zombie its cgroup might be deleted. Let's add
some minimal code to detect cases like this, so that we can still
attribute this back to the original cgroup.
core: unify common code for preparing for forking off unit processes
This introduces a new function unit_prepare_exec() that encapsulates a
number of calls we do in preparation for spawning off some processes in
all our unit types that do so.
This allows us to neatly unify a bit of code between unit types and
shorten our code.
Let's rename get_controllers() → get_process_controllers(), in order to
underline the difference to cg_kernel_controllers(). After all, one
returns the controllers available to the process, the other the
controllers enabled in the kernel at all).
Let's also update the code to use read_line() and set_put_strdup() to
shorten the code a bit, and make it more robust.
nspawn: rework mount_systemd_cgroup_writable() a bit
We shouldn't call alloca() as part of function calls, that's not really
defined in C. Hence, let's first do our stack allocations, and then
invoke functions.
Also, some coding style fixes, and minor shuffling around.
mount-util: add new path_get_mnt_id() call that queries the mnt ID of a path
This is a simple wrapper around name_to_handle_at_loop() and
fd_fdinfo_mnt_id() to query the mnt ID of a path. It uses
name_to_handle_at() where it can, and falls back to to
fd_fdinfo_mnt_id() where that doesn't work.
This is a best-effort thing of course, since neither name_to_handle_at()
nor the fdinfo logic work on all kernels.
mount-util: add name_to_handle_at_loop() wrapper around name_to_handle_at()
As it turns out MAX_HANDLE_SZ is a lie, the handle buffer we pass into
name_to_handle_at() might need to be larger than MAX_HANDLE_SZ, and we
thus need to invoke name_to_handle_at() in a loop, growing the buffer as
needed.
This adds a new wrapper name_to_handle_at_loop() around
name_to_handle_at() that does the necessary looping, and ports over all
users.
core: make use of unit_active_or_pending() where we can
Let's make use of unit_active_or_pending() where we can. Note that this
change changes beaviour in one specific case: when shutdown.target is
active we'll now also return that the system is in "stopping" state, not
only when we try to get into it. That makes sense as shutdown.target is
ordered before the actually shutdown units such as
"systemd-poweroff.service", and if the state is queried between reaching
those we should also report "stopping".
Let's make our finished checks a bit more readable. Checking the
timestamp is not entirely obvious, hence let's abstract that a bit by
adding a macro that shows what we are doing here, not how we doing it.
This is particularly useful if we want to change the definition of
"finished" later on, in particular, when we try to fix #7023.
It's like manager_dump(), but returns a string. This allows us to reduce
some duplicate code. Also, while we are at it, turn off stdio locking
while we write to the memory FILE *f.
manager: rework the timestamps logic, so that they are an enum-index array
This makes things quite a bit more systematic I think, as we can
systematically operate on all timestamps, for example for the purpose of
serialization/deserialization.
This rework doesn't necessarily make things shorter in the individual
lines, but it does reduce the line count a bit.
(This is useful particularly when we want to add additional timestamps,
for example to solve #7023)
Shawn Landden [Tue, 21 Nov 2017 07:24:12 +0000 (23:24 -0800)]
shared: silence gcc warning (#7402)
[346/1860] Compiling C object 'src/shared/systemd-shared-235@sha/firewall-util.c.o'.
../src/shared/firewall-util.c: In function ‘entry_fill_basics’:
../src/shared/firewall-util.c:81:79: warning: logical ‘and’ of equal expressions [-Wlogical-op]
[543/1860] Compiling C object 'src/shared/systemd-shared-235@sta/firewall-util.c.o'.
../src/shared/firewall-util.c: In function ‘entry_fill_basics’:
../src/shared/firewall-util.c:81:79: warning: logical ‘and’ of equal expressions [-Wlogical-op]