Alan Jenkins [Thu, 18 Jan 2018 13:58:13 +0000 (13:58 +0000)]
core: clone_device_node(): add debug message
For people who use debug messages, maybe it is helpful to know that
PrivateDevices= failed due to mknod(), and which device node.
(The other (un-logged) failures could be while mounting filesystems e.g. no
CAP_SYS_ADMIN which is the common case, or missing /dev/shm or /dev/pts,
or missing /dev/ptmx).
Alan Jenkins [Thu, 18 Jan 2018 12:07:31 +0000 (12:07 +0000)]
core: un-break PrivateDevices= by allowing it to mknod /dev/ptmx
#7886 caused PrivateDevices= to silently fail-open.
https://github.com/systemd/systemd/pull/7886#issuecomment-358542849
Allow PrivateDevices= to succeed, in creating /dev/ptmx, even though
DeviceControl=closed applies.
No specific justification was given for blocking mknod of /dev/ptmx. Only
that we didn't seem to need it, because we weren't creating it correctly as
a device node.
Add a new -Dllvm-fuzz=true option that can be used to build against
libFuzzer and update the oss-fuzz script to work outside of the
oss-fuzz build environment.
fuzz: disable all deps when building with oss-fuzz
The fuzz targets are intended to be fast and only target systemd
code, so they don't need to call out to any dependencies. They also
shouldn't depend on shared libraries outside of libc, so we disable
every dependency when compiling against oss-fuzz. This also
simplifies the upstream build environment significantly.
Alan Jenkins [Wed, 17 Jan 2018 12:53:26 +0000 (12:53 +0000)]
core: namespace: remove unnecessary mode on /dev/shm mount target
This should have no behavioural effect; it just confused me.
All the other mount directories in this function are created as 0755.
Some of the mounts are allowed to fail - mqueue and hugepages.
If the /dev/mqueue mount target was created with the permissive mode 01777,
to match the filesystem we're trying to mount there, then a mount failure
would allow unprivileged users to write to the /dev filesystem, e.g. to
exhaust the available space. There is no reason to allow this.
(Allowing the user read access (0755) seems a reasonable idea though, e.g. for
quicker troubleshooting.)
We do not allow failure of the /dev/shm mount, so it doesn't matter that
it is created as 01777. But on the same grounds, we have no *reason* to
create it as any specific mode. 0755 is equally fine.
This function will be clearer by using 0755 throughout, to avoid
unintentionally implying some connection between the mode of the mount
target, and the mode of the mounted filesystem.
Alan Jenkins [Mon, 15 Jan 2018 16:55:11 +0000 (16:55 +0000)]
README: fix context for CONFIG_DEVPTS_MULTIPLE_INSTANCES
`newinstance` (and `ptmxmode`) options of devpts are _not_ used by
PrivateDevices=. (/dev/pts is shared, similar to how /dev/shm and
/dev/mqueue are handled). It is used by nspawn containers though.
Also CONFIG_DEVPTS_MULTIPLE_INSTANCES was removed in 4.7-rc2
https://github.com/torvalds/linux/commit/eedf265aa003b4781de24cfed40a655a664457e6
and no longer needs to be set, so make that clearer to avoid confusion.
Alan Jenkins [Wed, 17 Jan 2018 13:28:04 +0000 (13:28 +0000)]
core: namespace: nitpick /dev/ptmx error handling
If /dev/tty did not exist, or had st_rdev == 0, we ignored it. And the
same is true for null, zero, full, random, urandom.
If /dev/ptmx did not exist, we treated this as a failure. If /dev/ptmx had
st_rdev == 0, we ignored it.
This was a very recent change, but there was no reason for ptmx creation
specifically to treat st_rdev == 0 differently from non-existence. This
confuses me when reading it.
Change the creation of /dev/ptmx so that st_rdev == 0 is
treated as failure.
This still leaves /dev/ptmx as a special case with stricter handling.
However it is consistent with the immediately preceding creation of
/dev/pts/, which is treated as essential, and is directly related to ptmx.
I don't know why we check st_rdev. But I'd prefer to have only one
unanswered question here, and not to have a second unanswered question
added on top.
fs-util: refuse taking a relative path to chase if "root" is specified and CHASE_PREFIX_ROOT is set
If we take a relative path we first make it absolute, based on the
current working directory. But if CHASE_PREFIX_ROOT is passe we are
supposed to make the path absolute taking the specified root path into
account, but that makes no sense if we talk about the current working
directory as that is relative to the host's root in any case. Hence,
let's refuse this politely.
Some newer BIOS versions of the TrekStor SurfTab wintron 7.0 tablet use
different (better) DMI strings, update the existing 60-sensors.hwdb
entry for this tablet to also work with the newer BIOS.
namespace: only make the symlink /dev/ptmx if it was already a symlink
…otherwise try to clone it as a device node
On most contemporary distros /dev/ptmx is a device node, and
/dev/pts/ptmx has 000 inaccessible permissions. In those cases
the symlink /dev/ptmx -> /dev/pts/ptmx breaks the pseudo tty support.
In that case we better clone the device node.
OTOH, in nspawn containers (and possibly others), /dev/pts/ptmx has
normal permissions, and /dev/ptmx is a symlink. In that case make the
same symlink.
Yu Watanabe [Tue, 16 Jan 2018 18:35:25 +0000 (03:35 +0900)]
network: create runtime sub-directories after drop_privileges()
For old kernels not supporting AmbientCapabilities=, networkd is
started as root with limited capabilities. Then, networkd cannot
chown the directories under runtime directory as
CapabilityBoundingSet= does not contains enough capabilities.
This makes these directories are created after dropping privileges.
Thus, networkd does not need to chown them anymore.
Yu Watanabe [Tue, 16 Jan 2018 16:53:00 +0000 (01:53 +0900)]
dhcp6: fix warnings by clang with -Waddress-of-packed-member
This fixes the following warnings:
```
[194/1521] Compiling C object 'src/libsystemd-network/systemd-network@sta/dhcp6-option.c.o'.
../../git/systemd/src/libsystemd-network/dhcp6-option.c:110:25: warning: taking address of packed member 'id' of class or structure 'ia_na' may result in an unaligned pointer value [-Waddress-of-packed-member]
iaid = &ia->ia_na.id;
^~~~~~~~~~~~
../../git/systemd/src/libsystemd-network/dhcp6-option.c:115:25: warning: taking address of packed member 'id' of class or structure 'ia_ta' may result in an unaligned pointer value [-Waddress-of-packed-member]
iaid = &ia->ia_ta.id;
^~~~~~~~~~~~
2 warnings generated.
```
Olaf Hering [Tue, 16 Jan 2018 09:24:37 +0000 (10:24 +0100)]
Fix parsing of features in detect_vm_xen_dom0 (#7890)
Use sscanf instead of the built-in safe_atolu because the scanned string
lacks the leading "0x", it is generated with snprintf(b, "%08x", val).
As a result strtoull handles it as octal, and parsing fails.
The initial submission already used sscanf, then parsing was replaced by
safe_atolu without retesting the updated PR.
Fixes 575e6588d ("virt: use XENFEAT_dom0 to detect the hardware domain
(#6442, #6662) (#7581)")
Patrik Flykt [Mon, 15 Jan 2018 15:37:52 +0000 (17:37 +0200)]
sd-dhcp6-client: Use offsetof() instead of sizeof()
The slightly modified review comments say that "...in theory
offsetof(DHCP6Option, data) is nicer than sizeof(DHCP6Option)
because the former removes alignment artifacts. In this
specific case there are no alignment whitespaces hence it's
fine, but out of a matter of principle offsetof() is preferred
over sizeof() in cases like this..."
Patrik Flykt [Mon, 15 Jan 2018 15:15:13 +0000 (17:15 +0200)]
dhcp6: Fix valgrind nitpick about returned test case value
Calling dhcp6_option_parse_address() will always return a value
< 0 on error even though lt_valid remains unset. This is more
than valgrind can safely detect, but let's fix the valgrind
nitpick anyway.
While fixing, use UINT32_MAX instead of ~0 on the same line.
Adam Duskett [Mon, 15 Jan 2018 11:25:46 +0000 (06:25 -0500)]
add false option for tests (#7778)
Currently there is no way to prevent tests from building using meson.
This introduces two problems:
1) It adds a extra 381 files to compile.
2) One of these tests explicitly requires libgcrypt to be built even if systemd
is not using it.
3) It adds C++ to the requirements to build systemd.
Alan Jenkins [Sat, 13 Jan 2018 17:22:46 +0000 (17:22 +0000)]
core: prevent spurious retries of umount
Testing the previous commit with `systemctl stop tmp.mount` logged the
reason for failure as expected, but unexpectedly the message was repeated
32 times.
The retry is a special case for umount; it is only supposed to cover the
case where the umount command was _successful_, but there was still some
remaining mount(s) underneath. Fix it by making sure to test the first
condition :).
Re-tested with and without a preceding `mount --bind /mnt /tmp`,
and using `findmnt` to check the end result.
Wieland Hoffmann [Sat, 13 Jan 2018 14:23:28 +0000 (15:23 +0100)]
zsh/coredumpctl: Never sort the completion candidates
That way, they're always sorted by date. I do not know how to make ZSH sort
them by PID through some option, but that doesn't seem very useful in the first
place.
Alan Jenkins [Sat, 13 Jan 2018 12:30:43 +0000 (12:30 +0000)]
core: fix output (logging) for mount units (#7603)
Documentation - systemd.exec - strongly implies mount units get logging.
It is safe for mounts to depend on systemd-journald.socket. There is no
cyclic dependency generated. This is because the root, -.mount, was
already deliberately set to EXEC_OUTPUT_NULL. See comment in
mount_load_root_mount(). And /run is excluded from being a mount unit.
Nor does systemd-journald depend on /var. It starts earlier, initially
logging to /run.
Tested before/after using `systemctl stop tmp.mount`.
Michal Sekletar [Fri, 12 Jan 2018 12:05:48 +0000 (13:05 +0100)]
process-util: make our freeze() routine do something useful
When we crash we freeze() our-self (or possibly we reboot the machine if
that is configured). However, calling pause() is very unhelpful thing to
do. We should at least continue to do what init systems being doing
since 70's and that is reaping zombies. Otherwise zombies start to
accumulate on the system which is a very bad thing. As that can prevent
admin from taking manual steps to reboot the machine in somewhat
graceful manner (e.g. manually stopping services, unmounting data
volumes and calling reboot -f).
cocci: there's not ENOTSUP, there's only EOPNOTSUPP
On Linux the former is a compat alias to the latter, and that's really
weird, as inside the kernel the two are distinct. Which means we really
should stay away from it.
core: be stricter when handling PID files and MAINPID sd_notify() messages
Let's be more restrictive when validating PID files and MAINPID=
messages: don't accept PIDs that make no sense, and if the configuration
source is not trusted, don't accept out-of-cgroup PIDs. A configuratin
source is considered trusted when the PID file is owned by root, or the
message was received from root.
This should lock things down a bit, in case service authors write out
PID files from unprivileged code or use NotifyAccess=all with
unprivileged code. Note that doing so was always problematic, just now
it's a bit less problematic.
When we open the PID file we'll now use the CHASE_SAFE chase_symlinks()
logic, to ensure that we won't follow an unpriviled-owned symlink to a
privileged-owned file thinking this was a valid privileged PID file,
even though it really isn't.
sd-dameon: also sent ucred when our UID differs from EUID
Let's be explicit, and always send the messages from our UID and never
our EUID. Previously this behaviour was conditionalized only on whether
the PID was specified, which made this non-obvious.
The new flag returns the O_PATH fd of the final component, which may be
converted into a proper fd by open()ing it again through the
/proc/self/fd/xyz path.
Together with O_SAFE this provides us with a somewhat safe way to open()
files in directories potentially owned by unprivileged code, where we
want to refuse operation if any symlink tricks are played pointing to
privileged files.
fs-util: add new CHASE_SAFE flag to chase_symlinks()
When the flag is specified we won't transition to a privilege-owned
file or directory from an unprivileged-owned one. This is useful when
privileged code wants to load data from a file unprivileged users have
write access to, and validates the ownership, but want's to make sure
that no symlink games are played to read a root-owned system file
belonging to a different context.