Daan De Meyer [Tue, 24 Aug 2021 15:46:47 +0000 (16:46 +0100)]
core: Check unit start rate limiting earlier
Fixes #17433. Currently, if any of the validations we do before we
check start rate limiting fail, we can still enter a busy loop as
no rate limiting gets applied. A common occurrence of this scenario
is path units triggering a service that fails a condition check.
To fix the issue, we simply move up start rate limiting checks to
be the first thing we do when starting a unit. To achieve this,
we add a new method to the unit vtable and implement it for the
relevant unit types so that we can do the start rate limit checks
earlier on.
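The idea can be sketched as follows. This is a minimal illustration, not the actual systemd code: the struct layout, the can_start hook name, the toy counter-based rate limit and the chosen errno values are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the real unit types. */
typedef struct Unit Unit;

typedef struct UnitVTable {
        /* New hook: returns 0 if the unit may start, negative errno otherwise. */
        int (*can_start)(Unit *u);
        int (*start)(Unit *u);
} UnitVTable;

struct Unit {
        const UnitVTable *vtable;
        unsigned starts_left;   /* toy rate limit: number of start attempts still allowed */
        bool condition_result;  /* outcome of a later validation step */
        unsigned n_started;     /* how often the unit was really started */
};

static int service_can_start(Unit *u) {
        assert(u);

        if (u->starts_left == 0)
                return -ECANCELED; /* rate limit hit (errno choice is illustrative) */

        u->starts_left--;
        return 0;
}

static int service_start(Unit *u) {
        assert(u);

        u->n_started++;
        return 1;
}

static const UnitVTable service_vtable = {
        .can_start = service_can_start,
        .start = service_start,
};

static int unit_start(Unit *u) {
        int r;

        assert(u);

        /* Rate limiting comes first, so every attempt is counted,
         * including attempts that fail one of the later checks. */
        if (u->vtable->can_start) {
                r = u->vtable->can_start(u);
                if (r < 0)
                        return r;
        }

        if (!u->condition_result)
                return -EINVAL; /* condition check failed (illustrative errno) */

        return u->vtable->start(u);
}
```

With a limit of two attempts and a failing condition, the third call to unit_start() is stopped by the rate limit instead of reaching the condition check again, which is what breaks the busy loop.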
json: rework JSON_BUILD_XYZ() macros to use compound literals instead of compound statements
Compound statements is this stuff: ({ … })
Compound literals is this stuff: (type) { … }
We use compound statements a lot in macro definitions: they have one
drawback though: they define a code block of their own, hence if macro
invocations are nested within them that use compound literals their
lifetime is limited to the code block, which might be unexpected.
Thankfully, we can rework things from compound statements to compound
literals in the case of json.h: they don't open a new code block, and
hence do not suffer from the problem explained above.
The interesting thing about compound literals is that they also work
for simple types, not just for structs/unions/arrays. We can use this
here for a typechecked implicit conversion: we want to superficially
typecheck arguments to the json_build() varargs function, and we do that
by assigning the specified arguments to our compound literals, which
does the minimal amount of typechecks and ensures that types are
propagated correctly.
We need one special tweak for this: sd_id128_t is not a simple type but
a union. Using compound literals for initializing that would mean
specifying the components of the union, not a complete sd_id128_t. Our
hack around that: instead of passing the object directly via the stack
we now take a pointer (and thus a simple type) instead.
Nice side-effect of all this: compound literals is C99, while compound
statements are a GCC extension, hence we move closer to standard C.
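A minimal sketch of the technique, under stated assumptions: the macro names, the tag enum, the id128 union (a stand-in for sd_id128_t) and the consume() function are all hypothetical and much simpler than the real json.h code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for sd_id128_t: a union, not a simple type. */
typedef union id128 {
        uint8_t bytes[16];
        uint64_t qwords[2];
} id128;

enum { ARG_INTEGER = 1, ARG_STRING, ARG_ID128 };

/* Each macro expands to a tag followed by a compound literal of the expected
 * type. Initializing the literal from the argument applies the usual implicit
 * conversion, so the vararg always has exactly the type the consumer reads,
 * and a grossly mismatched argument is diagnosed by the compiler. */
#define BUILD_INTEGER(i) ARG_INTEGER, (int64_t) { i }
#define BUILD_STRING(s)  ARG_STRING, (const char*) { s }
/* For the union we pass a pointer, which is a simple type again. */
#define BUILD_ID128(id)  ARG_ID128, (const id128*) { &(id) }

/* Toy consumer: adds up integers, string lengths and the first byte of each
 * ID until a 0 tag terminates the list. */
static int64_t consume(int tag, ...) {
        va_list ap;
        int64_t sum = 0;

        va_start(ap, tag);
        while (tag != 0) {
                switch (tag) {
                case ARG_INTEGER:
                        sum += va_arg(ap, int64_t);
                        break;
                case ARG_STRING:
                        sum += (int64_t) strlen(va_arg(ap, const char*));
                        break;
                case ARG_ID128:
                        sum += va_arg(ap, const id128*)->bytes[0];
                        break;
                default:
                        assert(!"unexpected tag");
                }
                tag = va_arg(ap, int);
        }
        va_end(ap);

        return sum;
}
```

Note that BUILD_INTEGER(40) passes an int64_t even though 40 is an int; without the compound literal the callee would read a type different from the one that was passed.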
Yu Watanabe [Fri, 20 Aug 2021 18:51:39 +0000 (03:51 +0900)]
network: fix logic for checking gateway address is ready
This fixes the following:
- The route or address corresponding to the gateway address must be on
the same link.
- An IPv6 link-local address does not need to be reachable.
Fixes an issue reported in https://github.com/systemd/systemd/issues/8686#issuecomment-902562324.
Andreas Rammhold [Mon, 26 Jul 2021 15:20:34 +0000 (17:20 +0200)]
login: respect install_sysconfdir_samples in meson file
The refactoring done in c900d89faa0 caused the configuration files to be
installed into the pkgsysconfdir regardless of the state of the
install_sysconfdir_samples boolean that indicates whether or not the
sample files should be installed.
Andreas Rammhold [Mon, 26 Jul 2021 14:57:43 +0000 (16:57 +0200)]
core: respect install_sysconfdir_samples in meson file
The refactoring done in e11a25cadbe caused the configuration files to be
installed into the pkgsysconfdir regardless of the state of the
install_sysconfdir_samples boolean that indicates whether or not the
sample files should be installed.
macro: handle overflow in ALIGN_TO() somewhat reasonably
The helper rounds up to the next multiple of the specified boundary. If one
passes a very large value as first argument, then there might not be a
next multiple. So far we ignored that. Let's handle this now and return
SIZE_MAX in this case, as a special indicator that we reached the end.
Of course, IRL this should not happen. With this new change we at least
do something somewhat reasonable, leaving it to the caller to handle it
further.
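A minimal sketch of such an overflow-aware rounding helper. It is written as a function named align_to() rather than as the ALIGN_TO() macro, and it assumes the boundary is a power of two; both are choices made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounds l up to the next multiple of ali, which must be a power of two.
 * Returns SIZE_MAX if no such multiple is representable in size_t. */
static inline size_t align_to(size_t l, size_t ali) {
        assert(ali != 0 && (ali & (ali - 1)) == 0);

        if (l > SIZE_MAX - (ali - 1))
                return SIZE_MAX; /* special indicator: there is no next multiple */

        return (l + ali - 1) & ~(ali - 1);
}
```

Callers that may see very large inputs need to compare the result against SIZE_MAX before using it as a size.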
import: enable sparse file writing logic only for files we create
Only if we create a file we know for sure that it is empty and hence our
sparse file logic of skipping over NUL bytes can work. If, however, we
are called to write data to some existing file/block device, we must do
regular writes to overwrite everything that might have been in place before.
Hence, conditionalize sparse file writing on the write offset not being
configured (which is how we internally distinguish writes to an existing
file from writes to a new file).
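The decision can be sketched like this. OFFSET_UNSET, should_skip_chunk() and write_chunk() are hypothetical names for this illustration and do not mirror the real import code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical marker for "no write offset configured", i.e. we created the file. */
#define OFFSET_UNSET UINT64_MAX

static bool all_zero(const uint8_t *p, size_t n) {
        for (size_t i = 0; i < n; i++)
                if (p[i] != 0)
                        return false;
        return true;
}

/* A chunk may be skipped (leaving a hole) only if the target is a file we
 * created ourselves and the chunk consists of NUL bytes only. */
static bool should_skip_chunk(uint64_t configured_offset, const uint8_t *p, size_t n) {
        assert(p || n == 0);

        return configured_offset == OFFSET_UNSET && all_zero(p, n);
}

/* Writes one chunk at the current file position; returns 0 or -1. */
static int write_chunk(int fd, uint64_t configured_offset, const uint8_t *p, size_t n) {
        if (should_skip_chunk(configured_offset, p, n))
                return lseek(fd, (off_t) n, SEEK_CUR) < 0 ? -1 : 0;

        while (n > 0) {
                ssize_t k = write(fd, p, n);
                if (k < 0)
                        return -1;
                p += k;
                n -= (size_t) k;
        }

        return 0;
}
```

A real implementation that ends on a skipped chunk additionally has to extend the file to its final size (for example with ftruncate()), since seeking alone does not grow it.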
Previously we only allowed http/https URLs; let's open this up a bit.
Why? Because it makes testing *so* *much* *easier* as we don't need to
run an HTTP server all the time.
CURL mostly abstracts the differences of http/https away from us, hence
we can get away with very little extra work.
Let's lock things down a bit and not allow curl's weirder protocols to
be used by us, i.e. stick to http:// + https:// + file:// and
turn everything else off. (Gopher!)
This is code that interfaces with the network after all, and we had better
not needlessly support protocols that are much less tested.
(Given that HTTP redirects (and other redirects) exist, this should give
us a security benefit, since we will then be sure that no one can forward
us to a weird protocol which we never tested, and which other people do
not test either.)
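To illustrate the allowlist idea without depending on libcurl, here is a plain C sketch with a hypothetical helper, url_scheme_allowed(). It is a different technique from what the commit does: libcurl has its own options for restricting protocols (CURLOPT_PROTOCOLS and CURLOPT_REDIR_PROTOCOLS), which also cover redirects, and a check on the initial URL alone cannot do that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Returns true if the URL starts with one of the three permitted schemes.
 * URL schemes are case-insensitive, hence strncasecmp(). */
static bool url_scheme_allowed(const char *url) {
        static const char *const allowed[] = { "http://", "https://", "file://" };

        assert(url);

        for (size_t i = 0; i < sizeof(allowed) / sizeof(allowed[0]); i++)
                if (strncasecmp(url, allowed[i], strlen(allowed[i])) == 0)
                        return true;

        return false;
}
```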
Maanya Goenka [Tue, 17 Aug 2021 17:40:15 +0000 (10:40 -0700)]
systemd-analyze: add new 'security' option to compare unit's overall exposure level with
--threshold option added to work with security verb and with the --offline option so that
users can determine what qualifies as a security threat. The threshold set by the user is
compared with the overall exposure level assigned to a unit file and if the exposure is
higher than the threshold, 'security' will return a non-zero exit status. The default value
of the --threshold option is 100.
Example Run:
1. testcase.service is a unit file created for testing the --threshold option
For the purposes of this demo, the security table outputted below has been cut to show only the first two security settings.
maanya-goenka@debian:~/systemd (systemd-security)$ sudo build/systemd-analyze security --offline=true testcase.service
/usr/lib/systemd/system/plymouth-start.service:15: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's
process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'.
Support for KillMode=none is deprecated and will eventually be removed.
/usr/lib/systemd/system/gdm.service:30: Standard output type syslog is obsolete, automatically updating to journal. Please update your
unit file, and consider removing the setting altogether.
/usr/lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating
/var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
NAME DESCRIPTION EXPOSURE
✗ PrivateNetwork= Service has access to the host's network 0.5
✗ User=/DynamicUser= Service runs as root user 0.4
→ Overall exposure level for testcase.service: 9.6 UNSAFE 😨
2. Next, we use the same testcase.service file but add an additional --threshold=60 parameter. We would expect 'security' to exit
with a non-zero status because the overall exposure level (= 96) is higher than the set threshold (= 60).
maanya-goenka@debian:~/systemd (systemd-security)$ sudo build/systemd-analyze security --offline=true --threshold=60 testcase.service
/usr/lib/systemd/system/plymouth-start.service:15: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's
process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'.
Support for KillMode=none is deprecated and will eventually be removed.
/usr/lib/systemd/system/gdm.service:30: Standard output type syslog is obsolete, automatically updating to journal. Please update your
unit file, and consider removing the setting altogether.
/usr/lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating
/var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
NAME DESCRIPTION EXPOSURE
✗ PrivateNetwork= Service has access to the host's network 0.5
✗ User=/DynamicUser= Service runs as root user 0.4
→ Overall exposure level for testcase.service: 9.6 UNSAFE 😨
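The comparison described above boils down to the following sketch. The function name is hypothetical, and the assumption that both values live on a 0..100 scale is taken from the example, where the displayed level 9.6 is compared as 96.

```c
#include <assert.h>
#include <stdlib.h>

/* Both values are assumed to be on a 0..100 scale (displayed level times 10). */
static int security_exit_status(unsigned exposure, unsigned threshold) {
        assert(exposure <= 100 && threshold <= 100);

        /* Only an exposure strictly above the threshold counts as a failure. */
        return exposure > threshold ? EXIT_FAILURE : EXIT_SUCCESS;
}
```

With the default threshold of 100 the check can never fail, so the exit status only changes once the user lowers the threshold.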
Maanya Goenka [Tue, 17 Aug 2021 17:25:38 +0000 (10:25 -0700)]
systemd-analyze: 'security' option to perform offline reviews of the specified unit file(s)
New option --offline which works with the 'security' command and takes in a boolean value. When set to true,
it performs an offline security review of the specified unit file(s). It does not rely on PID 1 to acquire
security information for the files like 'security' when used by itself does. It makes use of the refactored
security_info struct instead (commit #8cd669d3d3cf1b5e8667acc46ba290a9e8a8e529). This means that --offline can be
used with --image and --root as well. When used with --threshold, if a unit's overall exposure level is above
that set by the user, the default value being 100, --offline returns a non-zero exit status.
Example Run:
1. testcase.service is a unit file created for testing the --offline option
For the purposes of this demo, the security table outputted below has been cut to show only the first two security settings.
maanya-goenka@debian:~/systemd (systemd-security)$ sudo build/systemd-analyze security --offline=true testcase.service
/usr/lib/systemd/system/plymouth-start.service:15: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's
process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'.
Support for KillMode=none is deprecated and will eventually be removed.
/usr/lib/systemd/system/gdm.service:30: Standard output type syslog is obsolete, automatically updating to journal. Please update your
unit file, and consider removing the setting altogether.
/usr/lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating
/var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
NAME DESCRIPTION EXPOSURE
✗ PrivateNetwork= Service has access to the host's network 0.5
✗ User=/DynamicUser= Service runs as root user 0.4
→ Overall exposure level for testcase.service: 9.6 UNSAFE 😨
maanya-goenka@debian:~/systemd (systemd-security)$ sudo build/systemd-analyze security --offline=true testcase.service
/usr/lib/systemd/system/plymouth-start.service:15: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's
process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'.
Support for KillMode=none is deprecated and will eventually be removed.
/usr/lib/systemd/system/gdm.service:30: Standard output type syslog is obsolete, automatically updating to journal. Please update your
unit file, and consider removing the setting altogether.
/usr/lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating
/var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
NAME DESCRIPTION EXPOSURE
✓ PrivateNetwork= Service has access to the host's network
✗ User=/DynamicUser= Service runs as root user 0.4
→ Overall exposure level for testcase.service: 9.1 UNSAFE 😨
3. Next, we use the same testcase.service unit file but add the additional --threshold=60 option to see how --threshold works with
--offline. Since the overall exposure level is 91 which is greater than the threshold value set by the user (= 60), we can expect
a non-zero exit status.
maanya-goenka@debian:~/systemd (systemd-security)$ sudo build/systemd-analyze security --offline=true --threshold=60 testcase.service
/usr/lib/systemd/system/plymouth-start.service:15: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's
process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'.
Support for KillMode=none is deprecated and will eventually be removed.
/usr/lib/systemd/system/gdm.service:30: Standard output type syslog is obsolete, automatically updating to journal. Please update your
unit file, and consider removing the setting altogether.
/usr/lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating
/var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
NAME DESCRIPTION EXPOSURE
✓ PrivateNetwork= Service has access to the host's network
✗ User=/DynamicUser= Service runs as root user 0.4
→ Overall exposure level for testcase.service: 9.1 UNSAFE 😨
Maanya Goenka [Tue, 10 Aug 2021 21:00:23 +0000 (14:00 -0700)]
systemd-analyze: refactor security_info to make use of existing struct variables
In the original implementation of the security_info struct, the struct variables receive their values
via the D-Bus protocol. We want to make use of the existing structs ExecContext, Unit, and CGroupContext to
assign values to the security_info variables instead of relying on D-Bus. This is possible since these
pre-defined structs already contain all the variables that security_info needs to perform security reviews on
unit files that are passed to it on the command line.
Callers to linux_exec() are actually passing an EFI_HANDLE, not a pointer to
it. linux_efi_handover(), which is called by linux_exec(), also expects an
EFI_HANDLE.
Daan De Meyer [Thu, 19 Aug 2021 14:09:44 +0000 (15:09 +0100)]
sd-bus: Improve (sd-buscntr) error logging
We're only doing one thing in the child process which is connecting
to the D-Bus socket so let's mention that in the error message when
something goes wrong instead of having a generic error message.
Daan De Meyer [Thu, 19 Aug 2021 14:09:34 +0000 (15:09 +0100)]
sd-bus: Return detailed (sd-buscntr) error from bus_container_connect_socket()
Previously, when the connect() call in (sd-buscntr) failed, we returned
-EPROTO without ever reading the actual errno from the error pipe. To fix
the issue, delay checking the process exit status until after we've read
and processed any error from the error pipe.
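The ordering can be sketched with plain POSIX calls. This is an illustration only: the function name is hypothetical, and the child merely pretends that connect() failed with a given errno instead of connecting to anything.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that reports child_errno (0 means success) via an error pipe.
 * Returns 0 on success, or a negative errno describing what went wrong. */
static int run_child_and_collect_error(int child_errno) {
        int pipefd[2], error_buf = 0, status;
        ssize_t n;
        pid_t pid;

        assert(child_errno >= 0);

        if (pipe(pipefd) < 0)
                return -errno;

        pid = fork();
        if (pid < 0) {
                int e = errno;
                close(pipefd[0]);
                close(pipefd[1]);
                return -e;
        }
        if (pid == 0) {
                /* Child: send the errno to the parent, then exit with failure. */
                close(pipefd[0]);
                if (child_errno != 0) {
                        if (write(pipefd[1], &child_errno, sizeof(child_errno)) < 0)
                                _exit(2);
                        _exit(EXIT_FAILURE);
                }
                _exit(EXIT_SUCCESS);
        }

        close(pipefd[1]);

        /* First read whatever the child sent over the error pipe ... */
        n = read(pipefd[0], &error_buf, sizeof(error_buf));
        close(pipefd[0]);

        /* ... then reap the child ... */
        if (waitpid(pid, &status, 0) < 0)
                return -errno;

        /* ... and prefer the detailed error over the generic one. */
        if (n == (ssize_t) sizeof(error_buf) && error_buf > 0)
                return -error_buf;
        if (n < 0)
                return -EIO;
        if (!WIFEXITED(status) || WEXITSTATUS(status) != EXIT_SUCCESS)
                return -EPROTO;

        return 0;
}
```

If the exit status were checked before the pipe is read, the failing child would always be reported as -EPROTO and the errno it sent would be lost.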
Ondrej Kozina [Thu, 20 May 2021 13:37:08 +0000 (15:37 +0200)]
Add support for systemd-pkcs11 libcryptsetup plugin.
Add support for systemd-pkcs11 based LUKS2 device activation
via libcryptsetup plugin. This makes the feature (pkcs11 sealed
LUKS2 keyslot passphrase) usable from both systemd utilities
and cryptsetup cli.
The feature is configured via -Dlibcryptsetup-plugins combo
with default value set to 'auto'. It gets enabled automatically
when cryptsetup 2.4.0 or later is installed on the build system.
Ondrej Kozina [Fri, 4 Jun 2021 14:21:30 +0000 (16:21 +0200)]
pkcs11-util: split pkcs11_token_login function
The future systemd-pkcs11 plugin requires unlocking via a single
call with a supplied PIN. To avoid needless code duplication
in the plugin itself, split the original pkcs11_token_login() call into
two calls: the new pkcs11_token_login_by_pin(), and the former one, where
the loop for retrying via the PIN query callback remains.
Ondrej Kozina [Mon, 17 May 2021 13:26:14 +0000 (15:26 +0200)]
Add support for systemd-fido2 libcryptsetup plugin.
Add support for systemd-fido2 based LUKS2 device activation
via libcryptsetup plugin. This makes the feature (fido2 sealed
LUKS2 keyslot passphrase) usable from both systemd utilities
and cryptsetup cli.
The feature is configured via -Dlibcryptsetup-plugins combo
with default value set to 'auto'. It gets enabled automatically
when cryptsetup 2.4.0 or later is installed on the build system.
As per the docs, snprintf() can fail, in which case it returns -1. The
snprintf_ok() macro so far unconditionally cast the return value of
snprintf() to size_t, which would turn -1 into a huge value (SIZE_MAX,
since conversion to an unsigned type wraps around).
Let's be more careful with types here, and first check if the return value
is non-negative, before casting to size_t.
Also, while we are at it, let's return the input buffer as return value
or NULL instead of 1 or 0. It's marginally more useful, but more
importantly, is more in line with most of our other codebase that
typically doesn't use booleans to signal success.
All uses of snprintf_ok() don't care for the type of the return, hence
this change does not propagate anywhere else.
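A minimal sketch of the stricter check, written as a variadic function with the hypothetical name snprintf_checked() instead of the snprintf_ok() macro, so that it stays within standard C.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Formats into buf and returns buf on success, or NULL if formatting failed
 * or the output (including the trailing NUL) did not fit into len bytes. */
static char *snprintf_checked(char *buf, size_t len, const char *fmt, ...) {
        va_list ap;
        int r;

        assert(buf);
        assert(fmt);

        va_start(ap, fmt);
        r = vsnprintf(buf, len, fmt, ap);
        va_end(ap);

        /* Check the sign first: a negative int cast to size_t becomes a huge
         * value, so the cast is only meaningful once r >= 0 is known. */
        if (r < 0 || (size_t) r >= len)
                return NULL;

        return buf;
}
```

Callers that only test the result for truth keep working unchanged, since a non-NULL pointer is true and NULL is false.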
Mauricio Vásquez [Thu, 21 Jan 2021 15:45:38 +0000 (10:45 -0500)]
core: add RestrictNetworkInterfaces= BPF program source code
The code is composed of two BPF_PROG_TYPE_CGROUP_SKB programs that
are loaded in the cgroup inet ingress and egress hooks
(BPF_CGROUP_INET_{INGRESS|EGRESS}).
The decision to let a packet pass or not is based on a map that contains
the indexes of the interfaces.
Franck Bui [Tue, 3 Aug 2021 06:44:47 +0000 (08:44 +0200)]
test: don't try to find BUILD_DIR when NO_BUILD is set
NO_BUILD=1 indicates that we want to test systemd from the local system and not
the one from the local build. Hence there should be no need to call
find-build-dir.sh when NO_BUILD=1 especially since it's likely that the script
will fail to find a local build in this case.
This prevents find-build-dir.sh from emitting the 'Specify build directory with $BUILD_DIR'
message when NO_BUILD=1 and no local build can be found.
This introduces a behavior change though: systemd from the local system will
always be preferred when NO_BUILD=1 even if a local build can be found.
Also, this
- makes the settings accept an empty string,
- also uses the advertised maximum value if the specified value is too
large,
- mentions the range of the value in the man page.
Daan De Meyer [Wed, 18 Aug 2021 06:59:13 +0000 (07:59 +0100)]
udev: Support "max" string for BufferSize options (#20458)
"max" indicates the hardware advertised maximum queue buffer size
should be used.
The max sizes can be checked by running `ethtool -g <dev>` (Preset maximums).
Since the buffer sizes can't be set to 0 by users, internally we use 0 to
indicate that the hardware advertised maximum should be used.
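The "0 means max" convention can be sketched as below. Both function names are hypothetical, and the parsing is deliberately simpler than what a real config parser would do.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parses a buffer size setting. "max" is stored as 0, which is free to use as
 * a marker because users cannot configure a size of 0 themselves. */
static int parse_buffer_size(const char *s, uint32_t *ret) {
        unsigned long v;
        char *end;

        assert(s);
        assert(ret);

        if (strcmp(s, "max") == 0) {
                *ret = 0;
                return 0;
        }

        /* Accept plain decimal digits only (no sign, no leading whitespace). */
        if (s[0] < '0' || s[0] > '9')
                return -EINVAL;

        errno = 0;
        v = strtoul(s, &end, 10);
        if (errno != 0 || *end != '\0' || v == 0 || v > UINT32_MAX)
                return -EINVAL;

        *ret = (uint32_t) v;
        return 0;
}

/* Resolves the stored value against the maximum advertised by the hardware. */
static uint32_t effective_buffer_size(uint32_t configured, uint32_t hw_max) {
        return configured == 0 ? hw_max : configured;
}
```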
Yu Watanabe [Tue, 17 Aug 2021 05:03:19 +0000 (14:03 +0900)]
network: do not assume the highest priority when Priority= is unspecified
Previously, when Priority= was unspecified, networkd configured the rule with
the highest (=0) priority. This commit makes networkd distinguish the case where
the setting is unspecified from the case where it is explicitly specified as Priority=0.
Notes:
1) If the priority is unspecified when the rule is configured, the kernel dynamically picks
a priority for the rule.
2) The new behavior is consistent with 'ip rule' command.
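One way to model the distinction is a separate "set" flag next to the value. The struct and function below are simplified, hypothetical stand-ins and not the actual networkd code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified rule: the flag records whether Priority= was given. */
typedef struct RoutingPolicyRule {
        bool priority_set;
        uint32_t priority;
} RoutingPolicyRule;

/* Returns true and stores the priority if one should be sent to the kernel.
 * Returns false if the attribute should be omitted, in which case the kernel
 * picks a priority on its own. */
static bool rule_get_priority(const RoutingPolicyRule *rule, uint32_t *ret) {
        assert(rule);
        assert(ret);

        if (!rule->priority_set)
                return false;

        *ret = rule->priority;
        return true;
}
```

An explicit Priority=0 and a missing Priority= both leave the numeric field at 0; only the flag tells them apart.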