test: fix test-mount-util when handling duplicate mounts on the same location
The test was written so far under the assumption that if two mounts are
placed onto the same location the "upper" mount is listed later in
/proc/self/mountinfo. This appears not to be guaranteed however, as
running the tests in a normal nspawn shows.
This patch fixes that: it reverses the hashmap of mounts we build:
instead of keying by path, we key by mnt_id, and if we notice that
path_get_mnt_id() doesn't match what a line in /proc/self/mountinfo
says, we use the returned ID to check if maybe another line agrees.
mount-util: EOVERFLOW might have other causes than buffer size issues
When we get EOVERFLOW this might be caused by untriggered nfs4 mounts
(see discussion at
https://github.com/systemd/systemd/pull/7395#issuecomment-346164481 and
further down).
Handle this nicely by falling back to fdinfo-based mntid determination.
mount-util: drop exponential buffer growing in name_to_handle_at_loop()
So, it appears name_to_handle_at() always returns the right buffer size
on EOVERFLOW, when it's returned due to a too small buffer. Let's rely
on that exclusively for sizing the buffer, and let's drop the
exponential buffer growing.
The new logic is now: if we see EOVERFLOW and the returned size has
increased, resize our buffer and try again. But if it didn't increase,
then propagate the EOVERFLOW as it likely has other causes.
Yu Watanabe [Thu, 23 Nov 2017 12:25:56 +0000 (21:25 +0900)]
core/manager: check the existance of the special units (#7433)
In the user mode, not all special units exist.
So, we need to check whether the units exist or not before operate
something to the units.
Such the check was mistakenly dropped by e68537f0ba1a4433ecdf58e609b1701ed7091abc.
cgroup: check whether unified hierarchy is writable
When systemd is running inside a container employing user
namespaces it currently mounts the unified cgroup hierarchy
without being able to write to it. This causes systemd to
freeze during boot.
This patch checks whether the unified cgroup hierarchy
is writable. If it is not it will not mount it.
This solution is based on a patch by Evgeny Vereshchagin.
Susant Sahani [Wed, 22 Nov 2017 07:23:22 +0000 (12:53 +0530)]
networkd: introduce vxcan netdev. (#7150)
Similar to the virtual ethernet driver veth, vxcan implements a
local CAN traffic tunnel between two virtual CAN network devices.
When creating a vxcan, two vxcan devices are created as pair
When one end receives the packet it appears on its pair and vice
versa. The vxcan can be used for cross namespace communication.
Smack LSM needs the capability CAP_MAC_ADMIN to allow
setting of the current Smack exec label. Consequently,
dropping capabilities must be done after changing the
current exec label.
This is only related to Smack LSM. But for clarity and
regularity, all setting of security context moved before
dropping capabilities.
test: add a test case that validates cgroup delegation
This test runs on the unified hierarchy, and ensures that cgroup
delegation works properly, i.e. writ access is granted and the requested
controllers are enabled.
Make sure to add the delegation mask to the mask of controllers we have
to enable on our own unit. Do not claim it was a members mask, as such
a logic would mean we'd collide with cgroupv2's "no processes on inner
nodes policy".
This change does the right thing: it means any controller enabled
through Controllers= will be made available to subcrgoups of our unit,
but the unit itself has to still enable it through
cgroup.subtree_control (which it can since that file is delegated too)
to be inherited further down.
Or to say this differently: we only should manipulate
cgroup.subtree_control ourselves for inner nodes (i.e. slices), and
for leaves we need to provide a way to enable controllers in the slices
above, but stay away from the cgroup's own cgroup.subtree_control —
which is what this patch ensures.
cgroup: properly determine cgroups zombie processes belong to
When a process becomes a zombie its cgroup might be deleted. Let's add
some minimal code to detect cases like this, so that we can still
attribute this back to the original cgroup.
core: unify common code for preparing for forking off unit processes
This introduces a new function unit_prepare_exec() that encapsulates a
number of calls we do in preparation for spawning off some processes in
all our unit types that do so.
This allows us to neatly unify a bit of code between unit types and
shorten our code.
Let's rename get_controllers() → get_process_controllers(), in order to
underline the difference to cg_kernel_controllers(). After all, one
returns the controllers available to the process, the other the
controllers enabled in the kernel at all).
Let's also update the code to use read_line() and set_put_strdup() to
shorten the code a bit, and make it more robust.
nspawn: rework mount_systemd_cgroup_writable() a bit
We shouldn't call alloca() as part of function calls, that's not really
defined in C. Hence, let's first do our stack allocations, and then
invoke functions.
Also, some coding style fixes, and minor shuffling around.
mount-util: add new path_get_mnt_id() call that queries the mnt ID of a path
This is a simple wrapper around name_to_handle_at_loop() and
fd_fdinfo_mnt_id() to query the mnt ID of a path. It uses
name_to_handle_at() where it can, and falls back to to
fd_fdinfo_mnt_id() where that doesn't work.
This is a best-effort thing of course, since neither name_to_handle_at()
nor the fdinfo logic work on all kernels.
mount-util: add name_to_handle_at_loop() wrapper around name_to_handle_at()
As it turns out MAX_HANDLE_SZ is a lie, the handle buffer we pass into
name_to_handle_at() might need to be larger than MAX_HANDLE_SZ, and we
thus need to invoke name_to_handle_at() in a loop, growing the buffer as
needed.
This adds a new wrapper name_to_handle_at_loop() around
name_to_handle_at() that does the necessary looping, and ports over all
users.
core: make use of unit_active_or_pending() where we can
Let's make use of unit_active_or_pending() where we can. Note that this
change changes beaviour in one specific case: when shutdown.target is
active we'll now also return that the system is in "stopping" state, not
only when we try to get into it. That makes sense as shutdown.target is
ordered before the actually shutdown units such as
"systemd-poweroff.service", and if the state is queried between reaching
those we should also report "stopping".
Let's make our finished checks a bit more readable. Checking the
timestamp is not entirely obvious, hence let's abstract that a bit by
adding a macro that shows what we are doing here, not how we doing it.
This is particularly useful if we want to change the definition of
"finished" later on, in particular, when we try to fix #7023.
It's like manager_dump(), but returns a string. This allows us to reduce
some duplicate code. Also, while we are at it, turn off stdio locking
while we write to the memory FILE *f.
manager: rework the timestamps logic, so that they are an enum-index array
This makes things quite a bit more systematic I think, as we can
systematically operate on all timestamps, for example for the purpose of
serialization/deserialization.
This rework doesn't necessarily make things shorter in the individual
lines, but it does reduce the line count a bit.
(This is useful particularly when we want to add additional timestamps,
for example to solve #7023)
Shawn Landden [Tue, 21 Nov 2017 07:24:12 +0000 (23:24 -0800)]
shared: silence gcc warning (#7402)
[346/1860] Compiling C object 'src/shared/systemd-shared-235@sha/firewall-util.c.o'.
../src/shared/firewall-util.c: In function ‘entry_fill_basics’:
../src/shared/firewall-util.c:81:79: warning: logical ‘and’ of equal expressions [-Wlogical-op]
[543/1860] Compiling C object 'src/shared/systemd-shared-235@sta/firewall-util.c.o'.
../src/shared/firewall-util.c: In function ‘entry_fill_basics’:
../src/shared/firewall-util.c:81:79: warning: logical ‘and’ of equal expressions [-Wlogical-op]
Susant Sahani [Mon, 20 Nov 2017 16:50:48 +0000 (22:20 +0530)]
networkd: Set RoutingPolicyRule in link_configure (#7235)
The RoutingPolicyRules are not added when we are calling from set_address
the link->message++ and link->message-- never reaches to zero in the callback function
resulting routes are never gets added.
machined: port machined's bus APIs to use new image metadata API
Let's rework the D-Bus APIs GetImageOSRelease() to use the new internal
metadata API, to query what it needs to know. Augment it with
GetImageHostname(), GetImageMachineID(), GetImageMachineInfo(), that
expose the other new APIS.
machine-image: add a generic API to determine metadata of any image
This adds an internal API that permits querying metadata from any type
of image, including both subvol/dir images, and raw/block images. In the
latter case we use the new dissection API we just added.
hostname-util: rework read_hostname_config() a bit
First of all, let's rename it to read_etc_hostname(), to make clearer
what kind of configuration it actually reads: the file format defined in
/etc/hostname and nothing else.
Secondly: let's port this to use read_line(), i.e. the new way to read
lines from a file in a safe, bounded way.
Thirdly: let's strip leading/trailing whitespace from what we are
reading. Given that we are already pretty lenient what we read (comments
and empty lines), let's be permissive regarding whitespace too.
Fourthly: let's actually validate the hostname when reading it. So far
we tried to make it valid, but that's not always possible (for example,
we can't make an empty hostname valid, ever).
core: introduce SuccessAction= as unit file property
SuccessAction= is similar to FailureAction= but declares what to do on
success of a unit, rather than on failure. This is useful for running
commands in qemu/nspawn images, that shall power down on completion. We
frequently see "ExecStopPost=/usr/bin/systemctl poweroff" or so in unit
files like this. Offer a simple, more declarative alternative for this.
While we are at it, hook up failure action with unit_dump() and
transient units too.
meson: add -Wimplicit-fallthrough=3 to compilation options (#7393)
At some point before gcc-7 was released, -Wimplicit-fallthrough=3 was included
in -Wextra. The documentation for gcc-7.2.1-2.fc27.x86_64 still says that, but
empirical testing shows that it's not. The documentation also misstates that
-Wimplicit-fallthrough is equivalent to -Wimplicit-fallthrough=3.
Let's add -Wimplicit-fallthrough=3 explicitly to get the warnings if we regress.
This little new command can parse, validate, normalize calendar events,
and calculate when they will elapse next. This should be useful for
anyone writing calendar events and who'd like to validate the expression
before running them as timer units.
Yu Watanabe [Sun, 19 Nov 2017 16:00:34 +0000 (01:00 +0900)]
core/swap: load() should fail when neither of corresponding unit file nor /proc/swap entry does not exist
It is not necessary to label as loaded to a swap unit when neither of
corresponding unit file nor entry in /proc/swap does not exist.
This makes swap_load() to fail such a case.