Note: in check_triggering_units 'path' will be allocated twice. This is a
conscious choice, this way the implementation is simpler and not worth
optimizing.
Yu Watanabe [Wed, 28 Mar 2018 11:37:27 +0000 (20:37 +0900)]
bus-util: add flags for bus_map_all_properties() (#8546)
This adds flags BUS_MAP_STRDUP and BUS_MAP_BOOLEAN_AS_BOOL.
If BUS_MAP_STRDUP is set, then each "s" message is duplicated.
If BUS_MAP_BOOLEAN_AS_BOOL is set, then each "b" message is
written to a bool pointer.
Follow-up for #8488.
See https://github.com/systemd/systemd/pull/8488#discussion_r175816270.
core: dont't remount /sys/fs/cgroup for relabel if not needed (#8595)
The initial fix for relabelling the cgroup filesystem for
SELinux delivered in commit 8739f23e3 was based on the assumption that
the cgroup filesystem is already populated once mount_setup() is
executed, which was true for my system. What I wasn't aware is that this
is the case only when another instance of systemd was running before
this one, which can happen if systemd is used in the initrd (for ex. by
dracut).
In case of a clean systemd start-up the cgroup filesystem is actually
being populated after mount_setup() and does not need relabelling as at
that moment the SELinux policy is already loaded. Since however the root
cgroup filesystem was remounted read-only in the meantime this operation
will now fail.
To fix this check for the filesystem mount flags before relabelling and
only remount ro->rw->ro if necessary and leave the filesystem read-write
otherwise.
shared/specifier: use realloc to free some memory after specifier expansion
This is a separate commit only because it actually *increases* memory allocations:
==3256== total heap usage: 100,120 allocs, 100,120 frees, 13,097,140 bytes allocated
to
==4690== total heap usage: 100,121 allocs, 100,121 frees, 14,198,329 bytes allocated
Essentially, we do a little more work to reduce the memory footprint a bit. For a
test where we just allocate the memory and drop it soon afterwards, this is not
beneficial, but it should still be useful for a long running program.
core: use setreuid/setregid trick to create session keyring with right ownership (#8447)
Re-use the hacks used to link user keyring, when creating the session
keyring. This way changing ownership of the keyring is not required, and thus
incovation_id can be correctly created in restricted environments.
Creating invocation_id with root permissions works and linking it into session
keyring works, as at that point session keyring is possessed.
Simple way to validate this is with following commands:
journal-file: we can't use a chain cache entry if we don't know where it starts (#8542)
It might happen that we try to bisect through a chain of offset arrays in the
journal whose last element was just allocated but no item yet written
to. In that case that array will be all NUL, but it might still end up
in our array chain cache. If it does, we cannot use it for bisection,
since for bisection we need to know the value of the first entry in that
array, but if it's uninitialized it does not have a first value. Hence,
as a simple fix, in this unlikely case, simply ignore the chain cache.
This is supposed to fix the issue pointed out in #8432, but in a more
permissive way, as this case isn't strictly a badly formatted journal
but actually a valid state (though one within a very short time window),
and we should make the best of it, and handle it gracefully.
Background: in each journal file entries are linked up in large arrays
of offsets. In each array the entries are strictly ordered by the
offsets of the entries, which permits search by bisection. These arrays
are allocated with a fixed size and then filled up as entries are added
to the journal file. If an array is fully filled up, a new array
(double in size as the old one) is appended to the journal file, and
linked up. This means, the journal file will contain a series of chained
up arrays, each time doubling in size, and strictly ordered. When
looking for an entry we maintain a "chain cache", which allows us to
bypass traversing the chain in full if we look for entries close to each
other in a short time. With the fix above we make sure we don't
erroneously use a chain cache item that doesn't carry enough information
for this bisection to work.
This reworks the SELinux and SMACK label fixing calls in a number of
ways:
1. The two separate boolean arguments of these functions are converted
into a flags type LabelFixFlags.
2. The operations are now implemented based on O_PATH. This should
resolve TTOCTTOU races between determining the label for the file
system object and applying it, as it it allows to pin the object
while we are operating on it.
3. When changing a label fails we'll query the label previously set, and
if matches what we want to set anyway we'll suppress the error.
Also, all calls to label_fix() are now (void)ified, when we ignore the
return values.
Stuart Hayes [Thu, 18 Jan 2018 20:14:56 +0000 (15:14 -0500)]
udev: net_id: Improve predictable names for NPAR devices
NPAR is a technology that allows a single network interface to
be divided into number of partitions. The partitions show up
as functions on the same PCI device... when there are more than
8 functions, ARI (alternative routing-ID interpretation) is
used. With ARI is enabled, the 8 bit field that normally has 5
bits for the PCI device and 3 bits for the PCI function is instead
interpreted as (implicit) device 0, with 8 bits for the function
number.
Because the linux kernel exposes the PCI device/function numbers
to userspace the same regardless of whether ARI is enabled,
systemd predictable device naming can generate unpredictable
names in this case, because network names using the PCI slot use
the function number, but not the device number, causing systemd
to generate the same name for mulitple network devices (so some
will revert to the "ethX" names).
With this patch, device naming code checks if ARI is enabled for
a PCI network device, and uses the full 8-bit function number
for naming to avoid this situation. This should improve
readability and predictability of device names.
Here is an example of how this change would affect naming:
Stuart Hayes [Wed, 17 Jan 2018 19:31:55 +0000 (14:31 -0500)]
udev: net_id: Improve predictable names for SR-IOV virtual devices
With PCI SR-IOV, a number of virtual network devices can be enabled,
all of which share the same physical network device. Currently,
udev generates names for SR-IOV virtual functions as if they were
independent network devices.
With this change, the predictable network device naming code will
check if a network device is an SR-IOV virtual device, and will
generate a name based on the physical PCI device plus a "v%u"
suffix. This should improve readability and predictability of
device names.
Here is an example of how this change would affect naming:
Stuart Hayes [Tue, 16 Jan 2018 21:08:10 +0000 (16:08 -0500)]
udev: net_id: search parent devices for PCI slot number
To generate predictable network device names, the code in
udev-builting-net_id.c tries to match the PCI device address
of the network device to the entries in /sys/bus/pci/slots.
However, sometimes the slot number is not associated the
network controller PCI device itself, but rather with one of
its parents.
This change will try to find a match in /sys/bus/pci/slots for
the parents of the PCI network device, if it doesn't find a
match for the device itself.
mourikwa [Mon, 26 Mar 2018 15:50:35 +0000 (17:50 +0200)]
Fix for alphabetical ordering (#8581)
I read the addition of the purism laptop keyboard and noticed
that the 60-keyboard.hwdb file could/should have an alphabetical ordering.
I scratched that itch with this commit.
Michael Olbrich [Mon, 26 Mar 2018 15:34:53 +0000 (17:34 +0200)]
core: don't include libmount.h in a header file (#8580)
linux/fs.h sys/mount.h, libmount.h and missing.h all include MS_*
definitions.
To avoid problems, only one of linux/fs.h, sys/mount.h and libmount.h
should be included. And missing.h must be included last.
Without this, building systemd may fail with:
In file included from [...]/libmount/libmount.h:31:0,
from ../systemd-238/src/core/manager.h:23,
from ../systemd-238/src/core/emergency-action.h:37,
from ../systemd-238/src/core/unit.h:34,
from ../systemd-238/src/core/dbus-timer.h:25,
from ../systemd-238/src/core/timer.c:26:
[...]/sys/mount.h:57:2: error: expected identifier before numeric constant
fuzz-unit-file: add __has_feature(memory_sanitizer) when skipping ListenNetlink=
https://clang.llvm.org/docs/MemorySanitizer.html#id5 documents this
check as the way to detect MemorySanitizer at compilation time. We
only need to skip the test if MemorySanitizer is used.
Also, use this condition in cg_slice_to_path(). There, the code that is
conditionalized is not harmful in any way (it's just unnecessary), so remove
the FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION condition.
fuzz-unit-file: adjust check for ListenNetlink yet again
The test for ListenNetlink would abort the loop if a line longer then LINE_MAX
was encountered (read_line() returns -ENOBUFS in that case). Let's use the
the line length limit that the unit file parses uses.
I kept the _SYSTEMD_UNIT= example because it is easy to understand and
not very verbose. _SYSTEMD_CGROUP has much longer entries which do not
fit well in the narrow man page. Instead, I added an explanation of what
-u is translated into.
man: add a note about "archived" journal files and when files can be copied
Issue #6673 requests advice on backup strategy. But the right backup strategy
depends on many factors, too many to describe in a man page. So let's just
provide some general information which files are mutable and that it is always
safe to use/copy files.
man: add a note about $XDG_SEAT and $XDG_VTNR to pam_systemd(8)
Issue #6499 requests that a mention that those varibles can be set in the
environment is added. But the man page already says that. There isn't much
detail, but a man page does not need to and in this case should not include
all the details. Instead a note is added that those vars can be derived from
$DISPLAY.
We're moving towards just SPDX license identifiers, and the boilerplate
is especially annoying in a man page. Also adjust to the smaller indentation
to make the code fit better on a page.
man: move examples out of sd_journal_get_fd into separate files
man/.dir-locals is to keep indentation under control.
This makes it much easier to compile and run those examples, c.f. #7578.
v2:
- copy more of .dir-locals.el from the root to man/.dir-locals.el
(I though emacs would inherit from the one in the parent dir, but
it seems it just uses its own broken defaults, including
indent-tabs-mode by default.)
Unfortunately the MIPS toolchains still do not implement PT_GNU_STACK.
This means that while the commit to restrict mmap on MIPS was "correct",
it had the side effect of causing pthread_create to fail because glibc tries
to allocate an executable stack for new threads in the absense of
PT_GNU_STACK. We should wait until PT_GNU_STACK is implemented in all
the relevant parts of the toolchain (at least gcc and glibc) before
enabling this again.
test: bypass selinux integration test if selinux policy devel package is not installed
With this "sudo ./run-integration-tests.sh" should work fully without
exception, even on systems lacking SELinux (in which case that test will
just be skipped)
Michal Sekletar [Fri, 23 Mar 2018 14:28:06 +0000 (15:28 +0100)]
core: delay adding target dependencies until all units are loaded and aliases resolved (#8381)
Currently we add target dependencies while we are loading units. This
can create ordering loops even if configuration doesn't contain any
loop. Take for example following configuration,
If we encounter such unit file early during manager start-up (e.g. load
queue is dispatched while enumerating devices due to SYSTEMD_WANTS in
udev rules) we would add stub unit default.target and we order it Before
test.service. At the same time we add implicit Before to
multi-user.target. Later we merge two units and we create ordering cycle
in the process.
To fix the issue we will now never add any target dependencies until we
loaded all the unit files and resolved all the aliases.
Peter Hutterer [Fri, 23 Mar 2018 14:15:41 +0000 (00:15 +1000)]
udev: don't label high-button mice as joysticks (#8493)
If a device exposes more than 16 mouse buttons, we run into the BTN_JOYSTICK
range, also labelling it as joystick. And since 774ff9b this results in only
ID_INPUT_JOYSTICK but no ID_INPUT_MOUSE.
tree-wide: warn when a directory path already exists but has bad mode/owner/type
When we are attempting to create directory somewhere in the bowels of /var/lib
and get an error that it already exists, it can be quite hard to diagnose what
is wrong (especially for a user who is not aware that the directory must have
the specified owner, and permissions not looser than what was requested). Let's
print a warning in most cases. A warning is appropriate, because such state is
usually a sign of borked installation and needs to be resolved by the adminstrator.
$ build/test-fs-util
Path "/tmp/test-readlink_and_make_absolute" already exists and is not a directory, refusing.
(or)
Directory "/tmp/test-readlink_and_make_absolute" already exists, but has mode 0775 that is too permissive (0755 was requested), refusing.
(or)
Directory "/tmp/test-readlink_and_make_absolute" already exists, but is owned by 1001:1000 (1000:1000 was requested), refusing.
Assertion 'mkdir_safe(tempdir, 0755, getuid(), getgid(), MKDIR_WARN_MODE) >= 0' failed at ../src/test/test-fs-util.c:320, function test_readlink_and_make_absolute(). Aborting.
No functional change except for the new log lines.
This macro will read a pointer of any type, return it, and set the
pointer to NULL. This is useful as an explicit concept of passing
ownership of a memory area between pointers.
It drops ~160 lines of code from our codebase, which makes me like it.
Also, I think it clarifies passing of ownership, and thus helps
readability a bit (at least for the initiated who know the new macro)
fs-util: add new CHASE_TRAIL_SLASH flag for chase_symlinks()
This rearranges chase_symlinks() a bit: if no special flags are
specified it will now revert to behaviour before b12d25a8d631af00b200e7aa9dbba6ba4a4a59ff. However, if the new
CHASE_TRAIL_SLASH flag is specified it will follow the behaviour
introduced by that commit.
I wasn't sure which one to make the beaviour that requires specification
of a flag to enable. I opted to make the "append trailing slash"
behaviour the one to enable by a flag, following the thinking that the
function should primarily be used to generate a normalized path, and I
am pretty sure a path without trailing slash is the more "normalized"
one, as the trailing slash is not really a part of it, but merely a
"decorator" that tells various system calls to generate ENOTDIR if the
path doesn't refer to a path.
Or to say this differently: if the slash was part of normalization then
we really should add it in all cases when the final path is a directory,
not just when the user originally specified it.
test-execute: allow sit0@ to exist in private network namespace
It's always visible:
$ sudo modprobe sit
$ sudo unshare -n ip l
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
...
2: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
...
test-execute: simplify checks if grep output is empty
grep already indicates if it matched anything by return value.
Additional advantage is then that if the test fails, the unexpected
matching lines are visible in the log output.
safe_atou16_full() is like safe_atou16() but also takes a base
parameter. safe_atou16() is then implemented as inline function on top
of it, passing 0 as base. Similar safe_atoux16() is reworked as inline
function too, with 16 as base.
Let's use first_word() instead of startswith(), it's more explanatory
and a bit more correct. Also, let's use the return value instead of
adding +9 when looking for the second part of the directive.