With Symlinks= we can manage one or more symlinks to AF_UNIX or FIFO
nodes in the file system, with the same lifecycle as the socket itself.
This has two benefits: first, this allows us to remove /dev/log and
/dev/initctl from /dev, thus leaving only symlinks, device nodes and
directories in the /dev tree. More importantly however, this allows us
to move /dev/log out of /dev, while still making it accessible there, so
that PrivateDevices= can provide /dev/log too.
Kay Sievers [Wed, 4 Jun 2014 10:16:28 +0000 (12:16 +0200)]
udev: synthesize "change' events for partitions when tools change the disk
This should make sure that fdisk-like programs will automatically
cause an update of all partitions, just like mkfs-like programs cause
an update of the partition.
Either become uid/gid of the client we have been forked for, or become
the "systemd-bus-proxy" user if the client was root. We retain
CAP_IPC_OWNER so that we can tell kdbus we are actually our own client.
core: add new ReadOnlySystem= and ProtectedHome= settings for service units
ReadOnlySystem= uses fs namespaces to mount /usr and /boot read-only for
a service.
ProtectedHome= uses fs namespaces to mount /home and /run/user
inaccessible or read-only for a service.
This patch also enables these settings for all our long-running services.
Together they should be good building block for a minimal service
sandbox, removing the ability for services to modify the operating
system or access the user's private data.
Camilo Aguilar [Wed, 28 May 2014 18:43:37 +0000 (14:43 -0400)]
sd-dhcp-client: allways request broadcast
On systems which cannot receive unicast packets until its IP stack has been configured
we need to request broadcast packets. We are currently not able to reliably detect when
this is necessary, so set it unconditionally for now.
This is set on all packets, but the DHCP server will only broadcast the packets that are
necessary, and unicast the rest.
For more information please refer to this thread in CoreOS: https://github.com/coreos/bugs/issues/12
Tom Gundersen [Mon, 2 Jun 2014 19:50:50 +0000 (21:50 +0200)]
networkd: drop CAP_SYS_MODULE
Rely on modules being built-in or autoloaded on-demand.
As networkd is a network facing service, we want to limits its capabilities,
as much as possible. Also, we may not have CAP_SYS_MODULE in a container,
and we want networkd to work the same there.
Module autoloading does not always work, but should be fixed by the kernel
patch f98f89a0104454f35a: 'net: tunnels - enable module autoloading', which
is currently in net-next and which people may consider backporting if they
want tunneling support without compiling in the modules.
Early adopters may also use a module-load.d snippet and order
systemd-modules-load.service before networkd to force the module
loading of tunneling modules.
This sholud fix the various build issues people have reported.
networkd: run as unpriviliged "systemd-network" user
This allows us to run networkd mostly unpriviliged with the exception of
CAP_NET_* and CAP_SYS_MODULE. I'd really like to get rid of the latter
though...
units: remove CAP_SYS_PTRACE capability from hostnamed/networkd
The ptrace capability was only necessary to detect virtualizations
environments. Since we changed the logic to determine this to not
require priviliges, there's no need to carry the CAP_SYS_PTRACE
capability anymore.
udev-builtin-keyboard: do tell on which device EVIOCSKEYCODE failed.
I am getting
"Error calling EVIOCSKEYCODE (scan code 0xc022d, key code 418): Invalid
argument", the error message does not tell on which specific device the
problem is, add that info.
Instead of accessing /proc/1/environ directly, trying to read the
$container variable from it, let's make PID 1 save the contents of that
variable to /run/systemd/container. This allows us to detect containers
without the need for CAP_SYS_PTRACE, which allows us to drop it from a
number of daemons and from the file capabilities of systemd-detect-virt.
Also, don't consider chroot a container technology anymore. After all,
we don't consider file system namespaces container technology anymore,
and hence chroot() should be considered a container even less.
Stef Walter [Wed, 12 Feb 2014 08:46:31 +0000 (09:46 +0100)]
hostnamed: Fix the way that static and transient host names interact
It is almost always incorrect to allow DHCP or other sources of
transient host names to override an explicitly configured static host
name.
This commit changes things so that if a static host name is set, this
will override the transient host name (eg: provided via DHCP). Transient
host names can still be used to provide host names for machines that have
not been explicitly configured with a static host name.
The exception to this rule is if the static host name is set to
"localhost". In those cases we act as if no
static host name has been explicitly set.
As discussed elsewhere, systemd may want to have an fd based ownership
of the transient name. That part is not included in this commit.
Thomas Bächler [Fri, 21 Feb 2014 10:55:24 +0000 (11:55 +0100)]
analyze/run: use bus_open_transport_systemd instead of bus_open_transport
Both systemd-analyze and systemd-run only access org.freedesktop.systemd1
on the bus. This patch allows using systemd-run --user and systemd-analyze
--user even if the user session's bus is not properly integrated with the
systemd user unit.
https://bugs.freedesktop.org/show_bug.cgi?id=79252 and other reports...
Kay Sievers [Mon, 26 May 2014 01:30:21 +0000 (09:30 +0800)]
udev: keyboard - also hook into "change" events
Re-apply the keymaps when "udevadm trigger" is called. Hooking into
"add" only would just remove all keymap content from the udev database
instead of applying the new config.
Djalal Harouni [Sat, 24 May 2014 13:58:55 +0000 (14:58 +0100)]
nspawn: make nspawn robust to container failure
nspawn and the container child use eventfd to wait and notify each other
that they are ready so the container setup can be completed.
However in its current form the wait/notify event ignore errors that
may especially affect the child (container).
On errors the child will jump to the "child_fail" label and terminate
with _exit(EXIT_FAILURE) without notifying the parent. Since the eventfd
is created without the "EFD_NONBLOCK" flag, this leaves the parent
blocking on the eventfd_read() call. The container can also be killed
at any moment before execv() and the parent will not receive
notifications.
We can fix this by using cheap mechanisms, the new high level eventfd
API and handle SIGCHLD signals:
* Keep the cheap eventfd and EFD_NONBLOCK flag.
* Introduce eventfd states for parent and child to sync.
Child notifies parent with EVENTFD_CHILD_SUCCEEDED on success or
EVENTFD_CHILD_FAILED on failure and before _exit(). This prevents the
parent from waiting on an event that will never come.
* If the child is killed before execv() or before notifying the parent,
we install a NOP handler for SIGCHLD which will interrupt blocking calls
with EINTR. This gives a chance to the parent to call wait() and
terminate in main().
* If there are no errors, parent will block SIGCHLD, restore default
handler and notify child which will do execv(), then parent will pass
control to process_pty() to do its magic.
This was exposed in part by:
https://bugs.freedesktop.org/show_bug.cgi?id=76193
Djalal Harouni [Sat, 24 May 2014 13:58:54 +0000 (14:58 +0100)]
nspawn: move container wait logic into wait_for_container()
Move the container wait logic into its own wait_for_container() function
and add two status codes: CONTAINER_TERMINATED or CONTAINER_REBOOTED.
The status will be stored in its argument, this way we handle:
a) Return negative on failures.
b) Return zero on success and set the status to either
CONTAINER_REBOOTED or CONTAINER_TERMINATED.
These status codes are used to terminate nspawn or loop again in case of
CONTAINER_REBOOTED.
Tanu Kaskinen [Sat, 24 May 2014 09:01:12 +0000 (12:01 +0300)]
path-util: fix missing terminating zero
There was this code:
if (to_path_len > 0)
memcpy(p, to_path, to_path_len);
That didn't add the terminating zero, so the resulting string was
corrupt if this code path was taken.
Using strcpy() instead of memcpy() solves this issue, and also
simplifies the code.
Previously there was special handling for shortening "../../" to
"../..", but that has now been replaced by a path_kill_slashes() call,
which also makes the result prettier in case the input contains
redundant slashes that would otherwise be copied to the result.
Tom Gundersen [Fri, 23 May 2014 22:46:30 +0000 (00:46 +0200)]
sd-network: avoid false positive compiler warning caused by LTO
Djalal Harouni <tixxdz@opendz.org>:
There is also this one genrated by LTO, IMO it's a false positive since
we do *check* for "lease" but the code is not consistent since in that
code path, "lease" is initialized to NULL in other places, except for
this one:
src/resolve/resolved-manager.c: In function 'manager_update_resolv_conf':
src/libsystemd-network/sd-dhcp-lease.c:67:18: warning: 'lease' may be used uninitialized in this function [-Wmaybe-uninitialized]
if (lease->dns_size) {
^
src/network/sd-network.c:146:24: note: 'lease' was declared here
sd_dhcp_lease *lease;
^
Only accept cpu quota values in percentages, get rid of period
definition.
It's not clear whether the CFS period controllable per-cgroup even has a
future in the kernel, hence let's simplify all this, hardcode the period
to 100ms and only accept percentage based quota values.
Kay Sievers [Thu, 22 May 2014 01:08:04 +0000 (10:08 +0900)]
build-sys: let libsystemd_network pull in libudev-internal.la
On Thu, May 22, 2014 at 9:53 AM, Jan Engelhardt <jengelh@inai.de> wrote:
>
> If libsystemd-network.la is relying on that udev function, it ought
> to specify libudev(-internal).la in libsystemd_network_la_LIBADD.
nspawn: allow to bind mount journal on top of a non empty container journal dentry
Currently if nspawn was called with --link-journal=host or
--link-journal=auto and the right /var/log/journal/machine-id/ exists
then the bind mount the subdirectory into the container might fail due
to the ~/mycontainer/var/log/journal/machine-id/ of the container not
being empty.
There is no reason to check if the container journal subdir is empty
since there will be a bind mount on top of it. The user asked for a bind
mount so give it.
Note: a next call with --link-journal=guest may fail due to the
/var/log/journal/machine-id/ on the host not being empty.
Kay Sievers [Thu, 22 May 2014 00:43:22 +0000 (09:43 +0900)]
build-sys: do not run symbol list export test for compat-only libs
The verbose link-time deprecation warnings are annoying. These libs
will never change or be extended; there is no need to test the list
of exported symbols.
Kay Sievers [Thu, 22 May 2014 00:41:32 +0000 (09:41 +0900)]
build-sys: fix linking order
./.libs/libsystemd-network.a(libsystemd_network_la-network-internal.o):
network-internal.c:function net_get_unique_predictable_data:
error: undefined reference to 'udev_device_get_property_value'
collect2: error: ld returned 1 exit status