Susant Sahani [Mon, 13 Jun 2016 13:57:38 +0000 (19:27 +0530)]
networkd: fix NULL pointer (#3523)
Not every link has kind associated with it.
(gdb) r
Starting program: /home/sus/tt/systemd/systemd-networkd
Missing separate debuginfos, use: dnf debuginfo-install
glibc-2.23.1-7.fc24.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
vboxnet0: Gained IPv6LL
wlp3s0: Gained IPv6LL
enp0s25: Gained IPv6LL
Enumeration completed
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6e27ade in __strcmp_sse2_unaligned () from /lib64/libc.so.6
(gdb) bt
src/network/networkd-link.c:2008
src/network/networkd-link.c:2059
src/network/networkd-link.c:2442
m=0x555555704a30, userdata=0x55555570bfe0) at src/network/networkd-link.c:2497
at src/libsystemd/sd-netlink/sd-netlink.c:347
src/libsystemd/sd-netlink/sd-netlink.c:402
src/libsystemd/sd-netlink/sd-netlink.c:432
userdata=0x5555556f7470) at src/libsystemd/sd-netlink/sd-netlink.c:739
src/libsystemd/sd-event/sd-event.c:2275
src/libsystemd/sd-event/sd-event.c:2626
timeout=18446744073709551615) at src/libsystemd/sd-event/sd-event.c:2685
bus=0x5555556f9af0, name=0x555555692315 "org.freedesktop.network1",
timeout=30000000,
check_idle=0x55555556ac84 <manager_check_idle>, userdata=0x5555556f6b20) at
src/shared/bus-util.c:134
src/network/networkd-manager.c:1128
src/network/networkd.c:127
(gdb) f 1
src/network/networkd-link.c:2008
2008 if (link->network->bridge || streq("bridge", link->kind)) {
(gdb) p link->kind
$1 = 0x0
Jouke Witteveen [Mon, 13 Jun 2016 10:50:12 +0000 (12:50 +0200)]
core/execute: pass env vars to PAM session setup (#3503)
Move the merger of environment variables before setting up the PAM
session and pass the aggregate environment to PAM setup. This allows
control over the PAM session hooks through environment variables.
PAM session initiation may update the environment. On successful
initiation of a PAM session, we adopt the environment of the
PAM context.
... as well as halt/poweroff/kexec/suspend/hibernate/hybrid-sleep.
Running those commands will fail in user mode, but we try to set the wall
message first, which might even succeed for privileged users. Best to nip
the whole sequence in the bud.
Our functions that query /proc/pid/ support using pid==0 to mean
self. get_process_id also seemed to support that, but it was not implemented
correctly: the result should be in *uid, not returned, and also it gave
completely bogus result when called from get_process_gid(). But afaict,
get_process_{uid,gid} were never called with pid==0, so it's not an actual
bug. Remove the broken code to avoid confusion.
resolved: move verification that link is unmanaged into the proper bus calls
Previously, we checked only for the various SetLinkXYZ() calls on the Manager
object exposed on the bus if the specified interface is managed/unmanaged by
networkd (as we don't permit overriding DNS configuration via bus calls if
networkd owns the device), but the equivalent SetXYZ() calls on the Link object
did not have such a check. Fix that by moving the appropriate check into the
latter, as the former just calls that anyway.
systemctl: prolong timeout of "systemctl daemon-reload"
Reloading or reexecuting PID 1 means the unit generators are rerun, which are
timed out at 90s. Make sure the method call asking for the reload is timed out
at twice that, so that the generators have 90s and the reload operation has 90s
too.
This reworks the daemon_reload() call in systemctl, and makes it exclusively
about reloading/reexecing. Previously it was used for other trivial method
calls too, which didn't really help readability. As the code paths are now
sufficiently different, split out the old code into a new function
trivial_method().
systemctl: don't suppress error code when handling legacy commands
For legacy commands such as /sbin/halt or /sbin/poweroff we support legacy
fallbacks that talk via traditional SysV way with PID 1 to issue the desired
operation. We do this on any kind of error if the primary method of operation
fails. When this is the case we suppress any error message that is normally
generated, in order to not confuse the user. When suppressing this log message,
don't suppress the original error code, because there's really no reason to.
core/execute: add the magic character '!' to allow privileged execution (#3493)
This patch implements the new magic character '!'. By putting '!' in front
of a command, systemd executes it with full privileges ignoring paramters
such as User, Group, SupplementaryGroups, CapabilityBoundingSet,
AmbientCapabilities, SecureBits, SystemCallFilter, SELinuxContext,
AppArmorProfile, SmackProcessLabel, and RestrictAddressFamilies.
Fixes partially https://github.com/systemd/systemd/issues/3414
Related to https://github.com/coreos/rkt/issues/2482
Testing:
1. Create a user 'bob'
2. Create the unit file /etc/systemd/system/exec-perm.service
(You can use the example below)
3. sudo systemctl start ext-perm.service
4. Verify that the commands starting with '!' were not executed as bob,
4.1 Looking to the output of ls -l /tmp/exec-perm
4.2 Each file contains the result of the id command.
This the patch implements a notificaiton mechanism from the init process
in the container to systemd-nspawn.
The switch --notify-ready=yes configures systemd-nspawn to wait the "READY=1"
message from the init process in the container to send its own to systemd.
--notify-ready=no is equivalent to the previous behavior before this patch,
systemd-nspawn notifies systemd with a "READY=1" message when the container is
created. This notificaiton mechanism uses socket file with path relative to the contanier
"/run/systemd/nspawn/notify". The default values it --notify-ready=no.
It is also possible to configure this mechanism from the .nspawn files using
NotifyReady. This parameter takes the same options of the command line switch.
Before this patch, systemd-nspawn notifies "ready" after the inner child was created,
regardless the status of the service running inside it. Now, with --notify-ready=yes,
systemd-nspawn notifies when the service is ready. This is really useful when
there are dependencies between different contaniers.
Fixes https://github.com/systemd/systemd/issues/1369
Based on the work from https://github.com/systemd/systemd/pull/3022
Testing:
Boot a OS inside a container with systemd-nspawn.
Note: modify the commands accordingly with your filesystem.
1. Create a filesystem where you can boot an OS.
2. sudo systemd-nspawn -D ${HOME}/distros/fedora-23/ sh
2.1. Create the unit file /etc/systemd/system/sleep.service inside the container
(You can use the example below)
2.2. systemdctl enable sleep
2.3 exit
3. sudo systemd-run --service-type=notify --unit=notify-test
${HOME}/systemd/systemd-nspawn --notify-ready=yes
-D ${HOME}/distros/fedora-23/ -b
4. In a different shell run "systemctl status notify-test"
When using --notify-ready=yes the service status is "activating" for 20 seconds
before being set to "active (running)". Instead, using --notify-ready=no
the service status is marked "active (running)" quickly, without waiting for
the 20 seconds.
This patch was also test with --private-users=yes, you can test it just adding it
at the end of the command at point 3.
Let's add a generic parser for VLAN ids, which should become handy as
preparation for PR #3428. Let's also make sure we use uint16_t for the vlan ID
type everywhere, and that validity checks are already applied at the time of
parsing, and not only whne we about to prepare a netdev.
Also, establish a common definition VLANID_INVALID we can use for
non-initialized VLAN id fields.
execute: check whether the specified fd is a tty before chowning/chmoding it (#3457)
Let's add an extra safety check before we chmod/chown a TTY to the right user,
as we might end up having connected something to STDIN/STDOUT that is actually
not a TTY, even though this might have been requested, due to permissive
StandardInput= settings or transient service activation with fds passed in.
Topi Miettinen [Thu, 9 Jun 2016 07:32:04 +0000 (07:32 +0000)]
units: add a basic SystemCallFilter (#3471)
Add a line
SystemCallFilter=~@clock @module @mount @obsolete @raw-io ptrace
for daemons shipped by systemd. As an exception, systemd-timesyncd
needs @clock system calls and systemd-localed is not privileged.
ptrace(2) is blocked to prevent seccomp escapes.
The changes in 788d2b088b13a2444b9eb2ea82c0cc57d9f0980f weren't complete, only
half the code that dealt with K links was removed. This is a follow-up patch
that removes the rest too.
networkd: rename IPv6AcceptRouterAdvertisements to IPv6AcceptRA
The long name is just too hard to type. We generally should avoid using
acronyms too liberally, if they aren't established enough, but it appears that
"RA" is known well enough. Internally we call the option "ipv6_accept_ra"
anyway, and the kernel also exposes it under this name. Hence, let's rename the
IPv6AcceptRouterAdvertisements= setting and the
[IPv6AcceptRouterAdvertisements] section to IPv6AcceptRA= and [IPv6AcceptRA].
The old setting IPv6AcceptRouterAdvertisements= is kept for compatibility with
older configuration. (However the section [IPv6AcceptRouterAdvertisements] is
not, as it was never available in a published version of systemd.
David Herrmann [Tue, 7 Jun 2016 08:38:33 +0000 (10:38 +0200)]
sd-netlink: fix deep recursion in message destruction (#3455)
On larger systems we might very well see messages with thousands of parts.
When we free them, we must avoid recursing into each part, otherwise we
very likely get stack overflows.
Fix sd_netlink_message_unref() to use an iterative approach rather than
recursion (also avoid tail-recursion in case it is not optimized by the
compiler).
michaelolbrich [Mon, 6 Jun 2016 19:59:51 +0000 (21:59 +0200)]
mount: make sure got into MOUNT_DEAD state after a successful umount (#3444)
Without this code the following can happen:
1. Open a file to keep a mount busy
2. Try to stop the corresponding mount unit with systemctl
-> umount fails and the failure is remembered in mount->result
3. Close the file and umount the filesystem manually
-> mount_dispatch_io() calls "mount_enter_dead(mount, MOUNT_SUCCESS)"
-> Old error in mount->result is reused and the mount unit enters a
failed state
Clear the old error result when 'mountinfo' reports a successful umount to
fix this.
This reworks sd-ndisc and networkd substantially to support IPv6 RA much more
comprehensively. Since the API is extended quite a bit networkd has been ported
over too, and the patch is not as straight-forward as one could wish. The
rework includes:
- Support for DNSSL, RDNSS and RA routing options in sd-ndisc and networkd. Two
new configuration options have been added to networkd to make this
configurable.
- sd-ndisc now exposes an sd_ndisc_router object that encapsulates a full RA
message, and has direct, friendly acessor functions for the singleton RA
properties, as well as an iterative interface to iterate through known and
unsupported options. The router object may either be retrieved from the wire,
or generated from raw data. In many ways the sd-ndisc API now matches the
sd-lldp API, except that no implicit database of seen data is kept. (Note
that sd-ndisc actually had a half-written, but unused implementaiton of such
a store, which is removed now.)
- sd-ndisc will now collect the reception timestamps of RA, which is useful to
make sd_ndisc_router fully descriptive of what it covers.
lldp: deal properly with recv() returning EAGAIN/EINTR
It might very well return EAGAIN in case of packet checksum problems and
suchlike, hence let's better handle this nicely, the same way as we do it in
the other sd-network libraries for incoming datagrams.
Let's make sure the inline functions for retrieving TLV data actually carry TLV
in the name, so that we don#t assume they retrieve the whole, raw packet data.
sd-lldp: take triple timestamp when reading LLDP packets
It's a good idea to store away the recption time of LLDP packets in the
neighbor object, simply because the LLDP data only has a validity of a certain
amount of time.
Hence, let's record the timestamp when we receive the datagram and expose an
API for it. Also, automatically expire LLDP neighbors based on this new
timestamp.
We already have a double timestamp object that we use whenever we need both a
MONOTONIC and a REALTIME timestamp taken and stored. With this change we
also add a triple timestamp object that in addition stores a BOOTTIME
timestamp, which is useful for a few usecases.
Note that we keep dual_timestamp around, as it is useful in many cases where
triple_timestamp is not, in particular because retrieving the monotonic and
realtime timestamps is much cheaper on Linux that getting the boottime
timestamp.
resolved: also rewrite private /etc/resolv.conf when configuration is changed via bus calls
This also moves log message generation into manager_write_resolv_conf(), so
that it is shorter to invoke the function, given that we have to invoke it at a
couple of additional places now.
resolved: support IPv6 DNS servers on the local link
Make sure we can parse DNS server addresses that use the "zone id" syntax for
local link addresses, i.e. "fe80::c256:27ff:febb:12f%wlp3s0", when reading
/etc/resolv.conf.
Also make sure we spit this out correctly again when writing /etc/resolv.conf
and via the bus.
This was not duplicate code. The expire_tokens need to be handled as well:
Send 0 == success for MOUNT_DEAD (umount successful), do nothing for
MOUNT_UNMOUNTING (not yet done) and an error for everything else.
Otherwise the automount logic will assume unmounting is not done and will
not send any new requests for mounting. As a result, the corresponding
mount unit is never mounted.
Without this, automounts with TimeoutIdleSec= are broken. Once the idle
timeout triggered a umount, any access to the corresponding filesystem
hangs forever.
networkd: drop weird "const" usage in function parameters
We generally only use "const" to constify the destination of pointers, but not
the pointers themselves, as they are copied anyway during C function
invocation. Hence, drop this usage of "const".
Topi Miettinen [Fri, 3 Jun 2016 15:58:18 +0000 (15:58 +0000)]
core: Restrict mmap and mprotect with PAGE_WRITE|PAGE_EXEC (#3319) (#3379)
New exec boolean MemoryDenyWriteExecute, when set, installs
a seccomp filter to reject mmap(2) with PAGE_WRITE|PAGE_EXEC
and mprotect(2) with PAGE_EXEC.