basic/hashmap,set: move pointer symbol adjacent to the returned value
I think this is nicer in general, and here in particular we have a lot
of code like:
static inline IteratedCache* hashmap_iterated_cache_new(Hashmap *h) {
return (IteratedCache*) _hashmap_iterated_cache_new(HASHMAP_BASE(h));
}
and it's visually appealing to use the same whitespace in the function
signature and the cast in the body of the function.
The compiler would do this too, esp. with LTO, but we can short-circuit the
whole process and make everything a bit simpler by avoiding the separate
definition.
(It would be nice to do the same for _set_new(), _set_ensure_allocated()
and other similar functions which are one-line trivial wrappers too. Unfortunately
that would require enum HashmapType to be made public, which we don't want
to do.)
Tobias Kaufmann [Mon, 31 Aug 2020 11:48:31 +0000 (13:48 +0200)]
core: fix securebits setting
Desired functionality:
Set securebits for services started as non-root user.
Failure:
The starting of the service fails if no ambient capability shall be
raised.
... systemd[217941]: ...: Failed to set process secure bits: Operation
not permitted
... systemd[217941]: ...: Failed at step SECUREBITS spawning
/usr/bin/abc.service: Operation not permitted
... systemd[1]: abc.service: Failed with result 'exit-code'.
Reason:
For setting securebits the capability CAP_SETPCAP is required. However
the securebits (if no ambient capability shall be raised) are set after
setresuid.
When setresuid is invoked all capabilities are dropped from the
permitted, effective and ambient capability set. If the securebit
SECBIT_KEEP_CAPS is set the permitted capability set is retained, but
the effective and the ambient set are cleared.
If ambient capabilities shall be set, the securebit SECBIT_KEEP_CAPS is
added to the securebits configured in the service file and set together
with the securebits from the service file before setresuid is executed
(in enforce_user).
Before setresuid is executed the capabilities are the same as for pid1.
This means that all capabilities in the effective, permitted and
bounding set are set. Thus the capability CAP_SETPCAP is in the
effective set and the prctl(PR_SET_SECUREBITS, ...) succeeds.
However, if the secure bits aren't set before setresuid is invoked they
shall be set shortly after the uid change in enforce_user.
This fails as SECBIT_KEEP_CAPS wasn't set before setresuid and in
consequence the effective and permitted sets were cleared, hence
CAP_SETPCAP is not set in the effective set (and cannot be raised any
longer) and prctl(PR_SET_SECUREBITS, ...) fails with EPERM.
Proposed solution:
The proposed solution consists of three parts:
1. Check in enforce_user, if securebits are configured in the service
file. If securebits are configured, set SECBIT_KEEP_CAPS
before invoking setresuid.
2. Don't set any other securebits than SECBIT_KEEP_CAPS in enforce_user,
but set all requested ones after enforce_user.
This has the advantage that securebits are set at the same place for
root and non-root services.
3. Raise CAP_SETPCAP to the effective set (if not already set) before
setting the securebits to avoid EPERM during the prctl syscall.
For gaining CAP_SETPCAP the function capability_bounding_set_drop is
split into two functions:
- The first one raises CAP_SETPCAP (required for dropping bounding
capabilities)
- The second drops the bounding capabilities
Why are ambient capabilities not affected by this change?
Ambient capabilities get cleared during setresuid, no matter if
SECBIT_KEEP_CAPS is set or not.
For raising ambient capabilities for a user different to root, the
requested capability has to be raised in the inheritable set first. Then
the SECBIT_KEEP_CAPS securebit needs to be set before setresuid is
invoked. Afterwards the ambient capability can be raised, because it is
in the inheritable and permitted set.
Security considerations:
Although the manpage is ambiguous, SECBIT_KEEP_CAPS is cleared during
execve no matter if SECBIT_KEEP_CAPS_LOCKED is set or not. If both are
set only SECBIT_KEEP_CAPS_LOCKED is set after execve.
Setting SECBIT_KEEP_CAPS in enforce_user for being able to set
securebits is no security risk, as the effective and permitted set are
set to the value of the ambient set during execve (if the executed file
has no file capabilities. For details check man 7 capabilities).
Remark:
capability-util.c contains a comment complaining about the capability
CAP_SETPCAP missing from the effective set after the kernel has executed
/sbin/init. Thus it is checked there if this capability has to be raised
in the effective set before dropping capabilities from the bounding set.
If this were true all the time, ambient capabilities couldn't be set
without dropping at least one capability from the bounding set, as the
capability CAP_SETPCAP would be missing and setting SECBIT_KEEP_CAPS would
fail with EPERM.
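Parts 1 and 2 of the proposed solution boil down to a small decision made
before the uid change. The following is only a sketch of that decision, not
the code from this commit: the helper name and its parameters are
hypothetical, and skipping uid 0 is an assumption. It returns the securebits
to apply before setresuid; everything else from the service file is applied
later, once CAP_SETPCAP has been raised in the effective set.

```c
#include <stdint.h>
#include <sys/types.h>
#include <linux/securebits.h>

/* Hypothetical helper (not systemd's API): which securebits should be set
 * *before* setresuid() so that the permitted set survives the uid change?
 * Only SECBIT_KEEP_CAPS is ever returned; the bits configured in the
 * service file are meant to be set afterwards, in one place. */
static int securebits_before_setresuid(uid_t uid, int configured_securebits, uint64_t ambient_set) {
        /* Assumption: staying root means no uid change, so nothing is dropped. */
        if (uid == 0)
                return 0;

        /* Either requested securebits or requested ambient capabilities
         * require capabilities to still be present after setresuid(). */
        if (configured_securebits != 0 || ambient_set != 0)
                return SECBIT_KEEP_CAPS;

        return 0;
}
```

A caller would pass the result to prctl(PR_SET_SECUREBITS, ...) before
setresuid and skip the call when the result is 0.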
Rework how we cache mtime to figure out if units changed
Instead of assuming that more-recently modified directories have higher mtime,
just look for any mtime changes, up or down. Since we don't want to remember
individual mtimes, hash them to obtain a single value.
This should help us behave properly in the case when the time jumps backwards
during boot: various files might have mtimes that are in the future, but we won't
care. This fixes the following scenario:
We have /etc/systemd/system with T1. T1 is initially far in the past.
We have /run/systemd/generator with time T2.
The time is adjusted backwards, so T2 will always be in the future for a while.
Now the user writes new files to /etc/systemd/system, and T1 is updated to T1'.
Nevertheless, T1 < T1' << T2.
We would consider our cache to be up-to-date, falsely.
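The idea of hashing all directory mtimes into one value can be sketched as
follows. This is an illustration under assumptions, not the code from this
commit: the function names are made up, and FNV-1a is used only as a simple
stand-in for whatever hash function is actually used.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: fold all directory mtimes (in usec) into one value.
 * FNV-1a is a stand-in hash chosen for brevity. */
static uint64_t hash_mtimes(const uint64_t *mtimes_usec, size_t n) {
        uint64_t h = UINT64_C(0xcbf29ce484222325); /* FNV-1a offset basis */

        for (size_t i = 0; i < n; i++)
                for (unsigned b = 0; b < 8; b++) {
                        h ^= (mtimes_usec[i] >> (8 * b)) & 0xff;
                        h *= UINT64_C(0x100000001b3); /* FNV-1a prime */
                }

        return h;
}

/* The cache is stale if any mtime changed, in either direction. No
 * ordering between the individual timestamps is assumed. */
static int cache_is_stale(uint64_t cached_hash, const uint64_t *mtimes_usec, size_t n) {
        return hash_mtimes(mtimes_usec, n) != cached_hash;
}
```

In the scenario above, updating T1 to T1' changes the hash even though T1'
is still far below T2, so the cache is correctly treated as out of date.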
This check was added in d904afc730268d50502f764dfd55b8cf4906c46f. It would only
apply in the case where the cache hasn't been loaded yet. I think we pretty
much always have the cache loaded when we reach this point, but even if we
didn't, it seems better to try to reload the unit. So let's drop this check.
pid1: use the cache mtime not clock to "mark" load attempts
We really only care if the cache has been reloaded between the time when we
last attempted to load this unit and now. So instead of recording the actual
time we try to load the unit, just store the timestamp of the cache. This has
the advantage that we'll notice if the cache mtime jumps forward or backward.
Also rename fragment_loadtime to fragment_not_found_time. It only gets set when
we failed to load the unit and the old name was suggesting it is always set.
In https://bugzilla.redhat.com/show_bug.cgi?id=1871327
(and most likely https://bugzilla.redhat.com/show_bug.cgi?id=1867930
and most likely https://bugzilla.redhat.com/show_bug.cgi?id=1872068) we try
to load a non-existent unit over and over from transaction_add_job_and_dependencies().
My understanding is that the clock was in the future during initial boot,
so cache_mtime is always in the future (since we don't touch the fs after initial boot),
so no matter how many times we try to load the unit and set
fragment_loadtime / fragment_not_found_time, it is always higher than cache_mtime,
so manager_unit_cache_should_retry_load() always returns true.
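A minimal sketch of the new marking scheme, assuming simplified stand-in
structs (the struct and function names below are hypothetical; only the
field names fragment_not_found_time and cache_mtime come from the
description above):

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the real manager and unit objects. */
struct manager_sketch {
        uint64_t cache_mtime;             /* timestamp of the unit cache */
};

struct unit_sketch {
        uint64_t fragment_not_found_time; /* cache timestamp at the last failed load */
};

/* On a failed load, remember which cache state we looked at, not the
 * wall-clock time of the attempt. */
static void mark_load_failed(struct unit_sketch *u, const struct manager_sketch *m) {
        u->fragment_not_found_time = m->cache_mtime;
}

/* Retry only if the cache changed since the failed attempt. A plain
 * inequality notices jumps both forward and backward. */
static bool should_retry_load(const struct unit_sketch *u, const struct manager_sketch *m) {
        return u->fragment_not_found_time != m->cache_mtime;
}
```

With this comparison a cache mtime that lies in the future no longer causes
endless retries: after one failed attempt the two values are equal until the
cache is actually rebuilt.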
The name is misleading, since we aren't really loading the unit from cache — if
this function returns true, we'll try to load the unit from disk, updating the
cache in the process.
Florian Klink [Sat, 29 Aug 2020 17:57:24 +0000 (19:57 +0200)]
homed: fix log message to honor real homework path
This seems to be overridable by setting the SYSTEMD_HOMEWORK_PATH env
variable, but the error message always printed the SYSTEMD_HOMEWORK_PATH
constant.
However, this variable is only defined if HAVE_BLKID is set resulting in
the following build failure if cryptsetup is enabled but not libblkid:
../src/shared/dissect-image.c:1336:34: error: 'N_DEVICE_NODE_LIST_ATTEMPTS' undeclared (first use in this function)
1336 | for (unsigned i = 0; i < N_DEVICE_NODE_LIST_ATTEMPTS; i++) {
|
resolved: make sure we initialize t->answer_errno before completing the transaction
We must have the error number around when completing the transaction.
Let's hence make sure we always initialize it *first* (we accidentally
did it once after).
Michael Biebl [Fri, 28 Aug 2020 15:21:27 +0000 (17:21 +0200)]
test-network: stop networkd and its socket
With the changes from 2c0dffe82db574b6b9e850e48f444674e4e1d7ea, starting
systemd-networkd.service will also activate systemd-networkd.socket.
When tearing down a test, we need to stop the socket as well, to make
sure networkd can't be activated accidentally with the wrong
configuration.
Daniel Mack [Fri, 28 Aug 2020 14:14:12 +0000 (16:14 +0200)]
clock-util: read timestamp from /usr/lib/clock-epoch
On systems without an RTC, systemd currently sets the clock to a
compile-time epoch value, derived from the NEWS file in the
repository. This is not ideal as the initial clock hence depends
on the last time systemd was built, not when the image was compiled.
Let's provide a different way here and look at `/usr/lib/clock-epoch`.
If that file exists, its timestamp for the last modification will be
used instead of the compile-time default.
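The lookup can be sketched roughly like this. It is an illustration, not the
code from this commit: the function name and the fallback parameter are
hypothetical, and sub-second precision is ignored for simplicity.

```c
#include <stdint.h>
#include <sys/stat.h>

/* Hypothetical sketch: return the epoch to fall back to on systems
 * without an RTC, in microseconds. If the file at 'path' (for example
 * /usr/lib/clock-epoch) exists, its modification time wins; otherwise
 * the compile-time default passed by the caller is used. */
static uint64_t clock_epoch_usec(const char *path, uint64_t compile_time_default_usec) {
        struct stat st;

        if (stat(path, &st) < 0)
                return compile_time_default_usec;

        /* Seconds only; the sub-second part of the mtime is dropped here. */
        return (uint64_t) st.st_mtime * UINT64_C(1000000);
}
```

An image builder would then only need to touch the file at image creation
time to move the epoch forward, without rebuilding systemd.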
Let's document the discrepancy between the Sec and USec suffixing of
unit files and D-Bus properties at three places: in "systemctl show"
(where it already was briefly mentioned), in the D-Bus interface
description (at one place at least, i.e. the most prominent of
properties that encapsulate time values, there are many more) and in the
general man page explaining time values.
By documenting this in all three places, I think we now do as much as we
can to highlight the naming discrepancy and the reasons behind it.
This allows us to properly detect mount points, for free. (Also, allows
us to respect btimes that are newer than the cutoff, which should be
useful when people untar file trees in /var/tmp)
device: propagate reload events from devices on everything but "add" and "remove"
Any uevent other than the initial and the last uevent we see for a
device (which are "add" and "remove") should result in a reload being
triggered, including "bind" and "unbind". Hence, let's fix up the check.
("move" is kinda a combined "remove" + "add", hence cover that too)
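The fixed-up check can be sketched as a predicate on the uevent action
string. The function name is hypothetical, and reading "cover that too" as
"treat move like add and remove, i.e. do not propagate a reload" is an
assumption about the intent.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical sketch: should a uevent with this action propagate a
 * reload to the units bound to the device? Everything except the first
 * and last events of a device does, which includes "change", "bind"
 * and "unbind". Assumption: "move" is handled like "remove" + "add". */
static bool uevent_should_propagate_reload(const char *action) {
        return strcmp(action, "add") != 0 &&
               strcmp(action, "remove") != 0 &&
               strcmp(action, "move") != 0;
}
```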
Jérémy Nouhaud [Thu, 27 Aug 2020 19:59:23 +0000 (21:59 +0200)]
hwdb: fix size lenovo x240 touchpad (#16871)
As discussed in https://gitlab.freedesktop.org/libinput/libinput/-/issues/521, it adds a narrower
match that only applies to X240. Other laptops that match `pvrThinkPad??40` are not affected:
This makes use of the developer mode switch: the test is only done
if the user opted in to developer mode.
Previously, man/update-dbus-docs used the argument form where
we don't need to run find_command(), but that doesn't work with test(),
so find_command() is used and we get one more line in the config log.
tests/TEST-50: support the case when /etc/os-release is present
We have four legal cases:
1. /usr/lib/os-release exists and /etc/os-release is a symlink to it
2. both exist but /etc/os-release is not a symlink to /usr/lib/os-release
3. only /usr/lib/os-release exists
4. only /etc/os-release exists
The generic setup code in test-functions and create-busybox-image didn't handle
case 3.
The test-specific code in TEST-50 didn't handle 2 (because the general setup
code would only install /etc/os-release in the image and
grep -f /usr/lib/os-release would not work) and 4 (same reason) and would fail
in case 3 in generic setup.
Add replacement defines so that when acl/libacl.h is not available, the
ACL_{READ,WRITE,EXECUTE} constants are also defined. Those constants were
declared in the kernel headers already in 1da177e4c3f41524e886b7f1b8a0c1f,
so they should be the same pretty much everywhere.
Chris Down [Wed, 26 Aug 2020 17:49:27 +0000 (18:49 +0100)]
path: Improve $PATH search directory case
Previously:
1. last_error wouldn't be updated with errors from is_dir;
2. We'd always issue a stat(), even for binaries without the execute bit;
3. We used stat() instead of access(), which is cheaper.
This change avoids all of those, by only checking inside the X_OK-positive
case whether access() works on the path with an extra slash appended.
Thanks to Lennart for the suggestion.
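The trailing-slash trick can be sketched as a standalone helper. This is an
illustration rather than the code from this commit: the function name, the
return convention and the fixed buffer size are assumptions.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical sketch. Returns 1 if 'path' is executable and not a
 * directory, 0 if it is a directory, and a negative errno otherwise
 * (so the caller can record it as its last error). */
static int executable_non_directory(const char *path) {
        char with_slash[4096];
        int n;

        /* Cheap first check: no further syscall if this already fails. */
        if (access(path, X_OK) < 0)
                return -errno;

        n = snprintf(with_slash, sizeof(with_slash), "%s/", path);
        if (n < 0 || (size_t) n >= sizeof(with_slash))
                return -ENAMETOOLONG;

        /* With a slash appended, access() can only succeed on a directory. */
        if (access(with_slash, X_OK) >= 0)
                return 0;

        /* ENOTDIR is the expected answer for a regular executable; any
         * other error is reported as is. */
        return errno == ENOTDIR ? 1 : -errno;
}
```

A $PATH search loop would accept the candidate on 1, skip it on 0, and store
any negative value as the last error before moving to the next directory.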
Michael Biebl [Wed, 26 Aug 2020 14:54:45 +0000 (16:54 +0200)]
networkd: use socket activation when starting networkd
Add After=systemd-networkd.socket to avoid a race condition and networkd
falling back to the non-socket activation code.
Also add Wants=systemd-networkd.socket, so the socket is started when
networkd is started via `systemctl start systemd-networkd.service`.
A Requires is not strictly necessary, as networkd still ships the
non-socket activation code. Should this code be removed one day, the
Wants should be bumped to Requires accordingly.
resolved: add minimal varlink api for resolving hostnames/addresses
This allows us to later port nss-resolve to use Varlink rather than
D-Bus for resolution. This has the benefit that nss-resolve based
resolution works even without D-Bus being up. And it's faster too.
Let's prepare for adding a new varlink interface, and thus rename the
"request" field to "bus_request", so that we can later add a
varlink_request field too.
in-addr-util: add byte accessor array to union in_addr_union
It's pretty useful to be able to access the bytes generically, without
acknowledging a specific family, hence let's add a third way to access an
in_addr_union.
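A sketch of what such a union can look like, and of the kind of
family-agnostic code it enables. The type and function names below are
hypothetical (suffixed with _sketch to avoid confusion with the real
definitions), and sizing the byte array after struct in6_addr is an
assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical sketch of an address union with a generic byte view. */
union in_addr_union_sketch {
        struct in_addr in;                      /* IPv4 view */
        struct in6_addr in6;                    /* IPv6 view */
        uint8_t bytes[sizeof(struct in6_addr)]; /* family-agnostic view */
};

/* Number of meaningful bytes for a given address family, 0 if unknown. */
static size_t family_address_size_sketch(int family) {
        return family == AF_INET ? sizeof(struct in_addr) :
               family == AF_INET6 ? sizeof(struct in6_addr) : 0;
}

/* Example of generic access: check for the all-zero address without
 * branching on the family for the comparison itself. */
static bool in_addr_is_null_sketch(int family, const union in_addr_union_sketch *a) {
        for (size_t i = 0; i < family_address_size_sketch(family); i++)
                if (a->bytes[i] != 0)
                        return false;
        return true;
}
```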