Richard Maw [Tue, 30 Jun 2015 13:41:41 +0000 (13:41 +0000)]
nspawn: Communicate determined UID shift to parent
There is logic to determine the UID shift from the file-system, rather
than having it be explicitly passed in.
However, this needs to happen in the child process that sets up the
mounts, as what's important is the UID of the mounted root, rather than
the mount-point.
Setting up the UID map needs to happen in the parent becuase the inner
child needs to have been started, and the outer child is no longer able
to access the uid_map file, since it lost access to it when setting up
the mounts for the inner child.
So we need to communicate the uid shift back out, along with the PID of
the inner child process.
Failing to communicate this means that the invalid UID shift, which is
the value used to specify "this needs to be determined from the file
system" is left invalid, so setting up the user namespace's UID shift
fails.
networkd: be more defensive when writing to ipv4/ipv6 forwarding settings
1) never bother with setting the flag for loopback devices
2) if we fail to write the flag due to EROFS (which is likely to happen
in containers where /proc/sys is read-only) or any other error, check
if the flag already has the right value. If so, don't complain.
David Herrmann [Sun, 5 Jul 2015 09:25:38 +0000 (11:25 +0200)]
core: don't mount kdbusfs if not wanted
Just like we conditionalize loading kdbus.ko, we should conditionalize
mounting kdbusfs. Otherwise, we might run with kdbus if it is builtin,
even though the user didn't want this.
This patch add support for ipv6 privacy extensions.
The variable /proc/sys/net/ipv6/conf/<if>/use_tempaddr
can be changed via the boolean
IPv6PrivacyExtensions=[yes/no/prefer-temporary]
When true enables privacy extensions, but prefer public addresses over
temporary addresses.
prefer-temporary prefers temporary adresses over public addresses.
Defaults to false.
David Herrmann [Sat, 4 Jul 2015 11:08:29 +0000 (13:08 +0200)]
man: fix sysctl references in networkd-manpage
We refer to the same sysctl-setting twice, which is misleading. Correctly
list all global forwarding options. As we _always_ change the forwarding
setting on links, they will get disabled by default. The global sysctl
defaults thus will not have any effect.
David Herrmann [Sat, 4 Jul 2015 10:14:45 +0000 (12:14 +0200)]
core: harden cgroups-agent forwarding
On dbus1, we receive systemd1.Agent signals via the private socket, hence
it's trusted. However, on kdbus we receive it on the system bus. We must
make sure it's sent by UID=0, otherwise unprivileged users can fake it.
Furthermore, never forward broadcasts we sent ourself. This might happen
on kdbus, as we forward the message on the same bus we received it on,
thus ending up in an endless loop.
David Herrmann [Sat, 4 Jul 2015 10:11:22 +0000 (12:11 +0200)]
busctl: flush stdout after dumping data
Running `busctl monitor` currently buffers data for several seconds /
kilobytes before writing stdout. This is highly confusing if you dump in a
file, ^C busctl and then end up with a file with data of the last few
_seconds_ missing.
Fix this by explicitly flushing after each signal.
sd-bus: introduce new sd_bus_flush_close_unref() call
sd_bus_flush_close_unref() is a call that simply combines sd_bus_flush()
(which writes all unwritten messages out) + sd_bus_close() (which
terminates the connection, releasing all unread messages) +
sd_bus_unref() (which frees the connection).
The combination of this call is used pretty frequently in systemd tools
right before exiting, and should also be relevant for most external
clients, and is hence useful to cover in a call of its own.
Previously the combination of the three calls was already done in the
_cleanup_bus_close_unref_ macro, but this was only available internally.
journal: in persistent mode create /var/log/journal, with all parents.
systemd-journald races with systemd-tmpfiles-setup, and hence both are
started at about the same time. On a bare-bones system (e.g. with
empty /var, or even non-existent /var), systemd-tmpfiles will create
/var/log. But it can happen too late, that is systemd-journald already
attempted to mkdir /var/log/journal, ignoring the error. Thus failing
to create /var/log/journal. One option, without modifiying the
dependency graph is to create /var/log/journal directory with parents,
when persistent storage has been requested.
Gerd Hoffmann [Mon, 29 Jun 2015 07:42:11 +0000 (09:42 +0200)]
login: add rule for qemu's pci-bridge-seat
Qemu provides a separate pci-bridge exclusively for multi-seat setups.
The normal pci-pci bridge ("-device pci-bridge") has 1b36:0001. The new
pci-bridge-seat was specifically added to simplify guest-side
multiseat configuration. It is identical to the normal pci-pci bridge,
except that it has a different id (1b36:000a) so we can match it and
configure multiseating automatically.
Make sure we always treat this as separate seat if we detect this, just
like other "Pluggable" devices.
Richard Maw [Thu, 2 Jul 2015 13:04:34 +0000 (13:04 +0000)]
util: fall back in rename_noreplace when renameat2 isn't implemented
According to README we only need 3.7, and while it may also make sense
to bump that requirement when appropriate, it's trivial to fall back
when renameat2 is not available.
David Herrmann [Thu, 2 Jul 2015 10:14:27 +0000 (12:14 +0200)]
sd-bus: don't leak kdbus notifications
When we get notifications from the kernel, we always turn them into
synthetic dbus1 messages. This means, we do *not* consume the kdbus
message, and as such have to free the offset.
Right now, the translation-helpers told the caller that they consumed the
message, which is wrong. Fix this by explicitly releasing all kernel
messages that are translated.
David Herrmann [Wed, 1 Jul 2015 17:25:30 +0000 (19:25 +0200)]
udev: destroy manager before cleaning environment
Due to our _cleanup_ usage for the udev manager, it will be destroyed
after the "exit:" label has finished. Therefore, it is the last
destruction done in main(). This has two side-effects:
- mac_selinux is destroyed before the udev manager is, possible causing
use-after-free if the manager-cleanup accesses selinux data
- log_close() is called *before* the manager is destroyed, possibly
re-opening the log if you use --debug (and thus not re-applying the
--debug option)
Avoid this by moving the manager-handling into a new function called
run(). This function will be left before we enter the "exit:" label in
main(), hence, the manager object will be destroyed early.
David Herrmann [Wed, 1 Jul 2015 16:31:18 +0000 (18:31 +0200)]
bus-proxy: never apply policy when sending signals
Unlike dbus-daemon, the bus-proxy does not know the receiver of a
broadcast (as the kernel has exclusive access on the bus connections).
Hence, and "destination=" matches in dbus1 policies cannot be applied.
But kdbus does not place any restrictions on *SENDING* broadcasts, anyway.
The kernel never returns EPERM to KDBUS_CMD_SEND if KDBUS_MSG_SIGNAL is
set. Instead, receiver policies are checked. Hence, stop checking sender
policies for signals in bus-proxy and leave it up to the kernel.
This fixes some network-manager bus-proxy issues where NM uses weird
dst-based matches against interface-based matches. As we cannot perform
dst-based matches, our bus-proxy cannot properly implement this policy.
David Herrmann [Wed, 1 Jul 2015 13:05:01 +0000 (15:05 +0200)]
login: re-use VT-sessions if they already exist
Right now, if you start a session via 'su' or 'sudo' from within a
session, we make sure to re-use the existing session instead of creating a
new one. We detect this by reading the session of the requesting PID.
However, with gnome-terminal running as a busname-unit, and as such
running outside the session of the user, this will no longer work.
Therefore, this patch makes sure to return the existing session of a VT if
you start a new one.
This has the side-effect, that you will re-use a session which your PID is
not part of. This works fine, but will break assumptions if the parent
session dies (and as such close your session even though you think you're
part of it). However, this should be perfectly fine. If you run multiple
logins on the same session, you should really know what you're doing. The
current way of silently accepting it but choosing the last registered
session is just weird.
David Herrmann [Wed, 1 Jul 2015 11:02:58 +0000 (13:02 +0200)]
sysv-generator: fix coding-style
Fix weird coding-style:
- proper white-space
- no if (func() >= 0) bail-outs
- fix braces
- avoid 'r' for anything but errno
- init _cleanup_ variables unconditionally, even if not needed
kmod was fixed upstream to no longer return ENOSYS. Also see:
https://git.kernel.org/cgit/utils/kernel/kmod/kmod.git/commit/?id=114ec87c85c35a2bd3682f9f891e494127be6fb5
The kmod fix is marked for backport, so no reason to bump the kmod
version we depend on.
Tom Gundersen [Tue, 23 Jun 2015 11:13:20 +0000 (13:13 +0200)]
sd-netlink: respect attribute type flags
Though currently unused by us, netlink attribute types support embedding flags to indicate
if the type is encoded in network byte-order and if it is a nested attribute. Read out
these flags when parsing the message.
We will now swap the byteorder in case it is non-native when reading out integers (though
this is not needed by any of the types we currently support). We do not enforce the NESTED
flag, as the kernel gets this wrong in many cases.
Richard Maw [Tue, 30 Jun 2015 13:21:14 +0000 (13:21 +0000)]
nspawn: Don't remount with fewer options
When we do a MS_BIND mount, it inherits the flags of its parent mount.
When we do a remount, it sets the flags to exactly what is specified.
If we are in a user namespace then these mount points have their flags
locked, so you can't reduce the protection.
As a consequence, the default setup of mount_all doesn't work with user
namespaces. However if we ensure we add the mount flags of the parent
mount when remounting, then we aren't removing mount options, so we
aren't trying to unlock an option that we aren't allowed to.
build-sys: use wildcard glob in update-man-list again
The idea is that after adding a new man page, make update-man-list
will be used to regenerate part of the makefile. So the data already
present in the makefile cannot be used to do that.
Also, renames filter out generated xml files in make-man-rules.py
itself in order to make Makefile.am a bit simpler, and rename files
to dist_files to better reflect new meaning.
Tom Gundersen [Sun, 28 Jun 2015 21:42:52 +0000 (23:42 +0200)]
udev: event - simplify udev_event_spawn() logic
Push the extraction of the envp + argv as close as possible to their use, to avoid code
duplication. As a sideeffect fix logging when delaing execution.
Andrew Eikum [Fri, 26 Jun 2015 16:02:00 +0000 (11:02 -0500)]
man: Remove instances of pseudo-English "resp."
Me again :) Just noticed one of these in a manpage and did another pass
to clean them up. See 16dad32e437fdf2ffca03cc60a083d84bd31886f for
explanation, though the link needs updating:
<http://transblawg.eu/2004/02/26/resp-and-other-non-existent-english-wordsnicht-existente-englische-worter/>
Tom Gundersen [Mon, 29 Jun 2015 12:24:40 +0000 (14:24 +0200)]
networkd: netdev - avoid hanging transactions in failure cases
If a link is attempted t obe enslaved by a netdev that has already failed, we
must fail immediately and not save the callback for later, as it will then
never get triggered.
Tom Gundersen [Mon, 29 Jun 2015 12:23:17 +0000 (14:23 +0200)]
networkd: fix segfault when cancelling callbacks
This only happens when something has gone wrong, so is not easy to hit. However,
if a bridge (say) is configured on a system without bridge support we will hit
this.
bootchart: reset list_sample_data head before generating SVG
Until commit 1f2ecb0 ("bootchart: kill a bunch of global variables")
variable "head" was declared global and this action was performed by svg_header.
Now that "head" is local and passed to each function called by svg_do(...)
move the code at the beginning of svg_do(...) to restore the correct behaviour.
Tom Gundersen [Thu, 25 Jun 2015 22:02:55 +0000 (00:02 +0200)]
netlink: rework containers
Instead of representing containers as several arrays, make a new
netlink_container struct and keep one array of these structs. We
also introduce netlink_attribute structs that in the future will
hold meta-information about each atribute.