Susant Sahani [Sun, 26 Nov 2017 14:21:45 +0000 (19:51 +0530)]
networkd: DHCP client do not get into a loop while setting MTU (#7460)
Some devices reset themselves when the MTU is set, which sends us into
a loop: once the MTU has changed, the DHCP client never stops talking to
the DHCP server, and networkd generates endless DHCP requests.
core: warn about left-over processes in cgroup on unit start
Now that we don't kill control processes anymore, let's at least warn
about any processes left-over in the unit cgroup at the moment of
starting the unit.
Let's move the cgroup empty check for all unit types into the generic
unit_check_gc() call, out of the per-unit-type _check_gc() type. This
not only allows us to share some code, but also hooks up mount and
socket units with this kind of check, for free, as it was missing there
previously.
cgroup: remove logic for maintaining /control subcgroup for the service unit type
Previously, in the service unit type we ran all control processes in a
special subcgroup /control of the unit's main cgroup. Remove that, and
run the control program in the main cgroup instead.
The concept conflicts with cgroupv2's logic of "no processes in inner
nodes": if a unit has a main daemon process running in the main cgroup,
and a reload control process would be started in the /control subcgroup,
then this would necessarily fail, as the main daemon process would
become an inner node process that way.
We could in theory continue to support this in cgroupv1, but in the
interest of keeping behaviour similar in both hierarchies, let's drop
this altogether.
Philosophically maybe it wasn't the greatest idea anyway to just go
berserk and SIGKILL all those processes — loud warning logging might
have sufficed, too.
unit: initialize bpf cgroup realization state properly
Before this patch, the bpf cgroup realization state was implicitly set
to "NO", meaning that the bpf configuration was realized but was turned
off. That means invalidation requests for the bpf stuff (which we issue
in blanket fashion when doing a daemon reload) would actually later
result in us re-realizing the unit, under the assumption it was
already realized once, even though in reality it never was realized
before.
This had the effect that after each daemon-reload we'd end up realizing
*all* defined units, even the unloaded ones, populating cgroupfs with
lots of unneeded empty cgroups.
With this fix we properly set the realization state to "INVALIDATED",
i.e. indicating the bpf stuff was never set up for the unit, and hence
when we try to invalidate it later we won't do anything.
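
The effect of the initial value can be modelled in a few lines. This is
only a sketch in Python, not the C code in PID 1: the names BpfState,
Unit and invalidate_bpf() are hypothetical, and the "YES" value is
assumed as the counterpart of the "NO" state described above.

```python
from enum import Enum


class BpfState(Enum):
    NO = 0           # realized, but turned off
    YES = 1          # realized and turned on (assumed counterpart of NO)
    INVALIDATED = 2  # never set up, or needs to be set up again


class Unit:
    def __init__(self, name, initial=BpfState.INVALIDATED):
        self.name = name
        self.bpf_state = initial


def invalidate_bpf(unit, queue):
    """Queue the unit for re-realization only if it was realized before."""
    if unit.bpf_state is BpfState.INVALIDATED:
        return False  # nothing was ever set up, so there is nothing to redo
    unit.bpf_state = BpfState.INVALIDATED
    queue.append(unit.name)
    return True
```

With the old implicit default (NO) every unit, loaded or not, would take
the second branch on daemon-reload; with INVALIDATED as the initial
value, units that were never realized are left alone.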
cgroup: when dispatching the cgroup realization queue, check again if we shall actually realize
We add units to the cgroup realization queue when propagating realizing
requests to sibling units, and when invalidating cgroup settings because
some cgroup setting changed. Between the time we add the unit to the
queue and the time the queue is actually dispatched, the unit's state
might have changed however, so that the unit doesn't actually need to be
realized anymore, for example because the unit went down. To handle
that, check the unit state again, to see whether realization still makes
sense.
Redundant realization is usually not a problem, except when the unit is
not actually running, hence check exactly for that.
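
As a rough illustration of the re-check at dispatch time (a Python
sketch, not the actual C implementation; dispatch_realize_queue() and
its is_running/realize callbacks are hypothetical names):

```python
from collections import deque


def dispatch_realize_queue(queue, is_running, realize):
    """Drain the queue, realizing only units that are still running."""
    realized = []
    while queue:
        unit = queue.popleft()
        if not is_running(unit):
            # The unit went down after it was queued: skip it instead of
            # creating a cgroup that nothing will use.
            continue
        realize(unit)
        realized.append(unit)
    return realized
```

The check is deliberately made when the entry is taken off the queue,
not when it is put on, since the unit state can change in between.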
cgroup-util: merge cg_set_tasks_access() and cg_set_group_access() into one
We never use these functions separately, hence don't bother splitting
them into two.
Also, simplify things a bit, and maintain tables for the attribute files
to chown. Let's also update those tables a bit, and include the new
"cgroup.threads" file in it, that needs to be delegated too, according
to the documentation.
Yu Watanabe [Sat, 25 Nov 2017 15:01:55 +0000 (00:01 +0900)]
test: set log_level to info in test-hwdb and check-help-*
These tests check stderr, so they fail if systemd.log_level=debug is set
on the kernel command line.
This sets log_level to info in hwdb-test.sh and meson-check-help.sh, so
that the kernel command line does not change the output of the target
programs.
tmpfiles: check if not too many symbolic links. (#7423)
Some filesystems do not set the d_type value when
readdir is called, so the entry type is unknown.
Therefore check that accessing the entry does not
return an ELOOP error.
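
The idea of probing for ELOOP when the entry type is unknown can be
sketched as follows. This is a Python illustration only, not the code
in tmpfiles; has_symlink_loop() is a hypothetical helper, and os.stat()
stands in for whatever access the real code performs.

```python
import errno
import os


def has_symlink_loop(path):
    """Return True if resolving path fails with ELOOP."""
    try:
        os.stat(path)  # follows symlinks, so a cycle surfaces as ELOOP
    except OSError as e:
        if e.errno == errno.ELOOP:
            return True
        raise  # any other error is left to the caller
    return False
```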
Michael Vogt [Fri, 24 Nov 2017 20:03:05 +0000 (21:03 +0100)]
networkd: auto promote links if "promote_secondaries" is unset (#7167)
The DHCP code in systemd-networkd relies on the
`net.ipv4.conf.{default,all,<if>}.promote_secondaries` sysctl to be set
(the kernel's default is that it is unset). If this sysctl is not set,
DHCP will work most of the time, however when the IP address changes
between leases then the system will lose its IP.
Because some distributions decided not to ship these defaults (Debian,
for example, and Ubuntu as its downstream), networkd will now enable
this sysctl option automatically by default.
nspawn: hash the machine name, when looking for a suitable UID base (#7437)
When "-U" is used we look for a UID range we can use for our container.
We start with the UID the tree is already assigned to, and if that
didn't work we'd pick random ranges so far. With this change we'll first
try to hash a suitable range from the container name, and use that if it
works, in order to make UID assignments more likely to be stable.
This follows logic similar to what PID 1 uses for DynamicUser=1.
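
The selection order described above can be sketched like this. It is a
Python illustration under loud assumptions: the range bounds, the 64K
block size and the use of SHA-256 are made up for the example and are
not taken from nspawn, and hashed_base(), pick_uid_base() and is_free
are hypothetical names.

```python
import hashlib
import random

UID_MIN = 0x00080000   # assumed lower bound, for illustration only
UID_MAX = 0x6FFF0000   # assumed upper bound, for illustration only
RANGE_SIZE = 0x10000   # assumed: one 64K block of UIDs per container
SLOTS = (UID_MAX - UID_MIN) // RANGE_SIZE + 1


def hashed_base(machine_name):
    """Derive a stable, block-aligned UID base from the machine name."""
    digest = hashlib.sha256(machine_name.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "little")
    return UID_MIN + (value % SLOTS) * RANGE_SIZE


def pick_uid_base(machine_name, tree_base, is_free, attempts=32):
    """Try the tree's own base, then the hashed one, then random ones."""
    for candidate in (tree_base, hashed_base(machine_name)):
        if candidate is not None and is_free(candidate):
            return candidate
    for _ in range(attempts):
        candidate = UID_MIN + random.randrange(SLOTS) * RANGE_SIZE
        if is_free(candidate):
            return candidate
    return None  # no free range found
```

Because the hashed candidate depends only on the name, the same
container gets the same range on every start as long as that range is
free.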
meson: restore building of man pages on demand even if -Dman=false
I want to configure -Dman=false for speed, but be able to build a specific
man page sometimes to check my edits. Commit 5b316b9ea6c broke this by mistake.
Let's adjust the condition to better match the logic of disabling tests only
if xsltproc is really not found.
All other places where libkmod.h is included are guarded. Build would
fail with:
In file included from ../src/core/kmod-setup.c:35:0:
../src/basic/module-util.h:23:10: fatal error: libkmod.h: No such file or directory
#include <libkmod.h>
^~~~~~~~~~~
compilation terminated.
logind: don't propagate firmware misbehaviours to bus clients
If for some reason we can't query the firmware state, don't propagate
that to clients, but instead log about it, and claim that
reboot-to-firmware is not available (which is the right answer, since it
is not working).
Let's log about this though, as this is certainly relevant to know, even
though not for the client.
This watches controllers on the bus, and unsets them automatically when
they disappear.
Note that this is primarily a cosmetic fix. Since unique bus names are not
recycled, there's strictly no need to forget about them, but it's a lot
nicer to do so.
nspawn: make use of the RequestStop logic of scope units
Since time began, scope units had a concept of "Controllers", a bus peer
that would be notified when somebody requested a unit to stop. None of
our code used that facility so far, let's change that.
This way, nspawn can print a nice message when somebody invokes
"systemctl stop" on the container's scope unit, and then react with the
right action to shut it down.
test: fix test-mount-util when handling duplicate mounts on the same location
The test was written so far under the assumption that if two mounts are
placed onto the same location the "upper" mount is listed later in
/proc/self/mountinfo. This appears not to be guaranteed however, as
running the tests in a normal nspawn shows.
This patch fixes that: it reverses the hashmap of mounts we build:
instead of keying by path, we key by mnt_id, and if we notice that
path_get_mnt_id() doesn't match what a line in /proc/self/mountinfo
says, we use the returned ID to check if maybe another line agrees.
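
A sketch of the reversed lookup, in Python rather than the test's C:
parse_mountinfo() and mnt_id_matches() are hypothetical names, and the
parser is simplified (it ignores octal escapes in mount points).

```python
def parse_mountinfo(text):
    """Map mount ID -> mount point from /proc/self/mountinfo content."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue
        # Field 1 is the mount ID, field 5 is the mount point.
        mounts[int(fields[0])] = fields[4]
    return mounts


def mnt_id_matches(mounts, line_id, path, returned_id):
    """Accept the ID returned for path if this or another line agrees."""
    if returned_id == line_id:
        return True
    # Another mount may be stacked on the same path: check whether the
    # line carrying the returned ID names the same mount point.
    return mounts.get(returned_id) == path
```

Keying by mnt_id keeps both entries when two mounts share a path, which
a map keyed by path cannot do.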
mount-util: EOVERFLOW might have other causes than buffer size issues
When we get EOVERFLOW this might be caused by untriggered nfs4 mounts
(see discussion at
https://github.com/systemd/systemd/pull/7395#issuecomment-346164481 and
further down).
Handle this nicely by falling back to fdinfo-based mntid determination.
mount-util: drop exponential buffer growing in name_to_handle_at_loop()
So, it appears name_to_handle_at() always returns the right buffer size
on EOVERFLOW, when it's returned due to a too small buffer. Let's rely
on that exclusively for sizing the buffer, and let's drop the
exponential buffer growing.
The new logic is now: if we see EOVERFLOW and the returned size has
increased, resize our buffer and try again. But if it didn't increase,
then propagate the EOVERFLOW as it likely has other causes.
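
The new sizing logic, as a Python sketch. The call parameter is a
hypothetical stand-in for name_to_handle_at(): it takes the buffer size
and returns a pair (errno or 0, size reported back by the kernel). The
initial size of 64 is an arbitrary value chosen for the example.

```python
import errno


def name_to_handle_loop(call, initial_size=64):
    """Retry with the size the kernel reports, never by guessing."""
    size = initial_size
    while True:
        err, reported = call(size)
        if err == 0:
            return size
        if err != errno.EOVERFLOW:
            raise OSError(err, "name_to_handle_at failed")
        if reported <= size:
            # The kernel did not ask for a larger buffer, so the
            # overflow has some other cause: pass it on.
            raise OSError(errno.EOVERFLOW, "EOVERFLOW without size increase")
        size = reported  # resize to exactly what was requested
```

Since the buffer only ever grows to a size the kernel asked for, the
loop ends as soon as the reported size stops increasing.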
Yu Watanabe [Thu, 23 Nov 2017 12:25:56 +0000 (21:25 +0900)]
core/manager: check the existence of the special units (#7433)
In user mode, not all special units exist.
So we need to check whether the units exist before operating on
them.
This check was mistakenly dropped by e68537f0ba1a4433ecdf58e609b1701ed7091abc.
cgroup: check whether unified hierarchy is writable
When systemd is running inside a container employing user
namespaces it currently mounts the unified cgroup hierarchy
without being able to write to it. This causes systemd to
freeze during boot.
This patch checks whether the unified cgroup hierarchy
is writable. If it is not, the hierarchy is not mounted.
This solution is based on a patch by Evgeny Vereshchagin.
path_prepend returned a status code, but it wasn't looked at anywhere.
Adding checks for the return value in all the bazillion places where it
is called is not very attractive, so let's just make the whole program
abort cleanly if the (very unlikely) oom is encountered.
Susant Sahani [Wed, 22 Nov 2017 07:23:22 +0000 (12:53 +0530)]
networkd: introduce vxcan netdev. (#7150)
Similar to the virtual ethernet driver veth, vxcan implements a
local CAN traffic tunnel between two virtual CAN network devices.
When creating a vxcan, two vxcan devices are created as a pair.
When one end receives a packet, it appears on its peer, and vice
versa. The vxcan can be used for cross-namespace communication.
Smack LSM needs the capability CAP_MAC_ADMIN to allow
setting of the current Smack exec label. Consequently,
dropping capabilities must be done after changing the
current exec label.
This is only related to Smack LSM. But for clarity and
regularity, all setting of security context is moved before
dropping capabilities.