Before this commit:
mode=1777,size=10%,nr_inodes=400k,uid=496107520,gid=496107520,context=,sys.id:sys.role:systemd.nspawn.container.fs:s0,
After this commit:
mode=1777,size=10%,nr_inodes=400k,uid=496107520,gid=496107520,context=sys.id:sys.role:systemd.nspawn.container.fs:s0
Anders Wenhaug [Sun, 20 Jun 2021 19:43:07 +0000 (21:43 +0200)]
time-util: don't use plural units indiscriminately
format_timestamp_relative currently returns the plural form of
years and months no matter the quantity, and in many cases (for
durations > 1 week) this is the same with days.
This patch changes this so that the function takes the quantity into account,
returning "1 month 1 week ago" instead of "1 months 1 weeks ago".
repart: make No-Auto GPT partition flag configurable too
This is useful for provisioning initially empty secondary A/B root file
systems. We don't want those to ever be considered for automatic
mounting, for example in "systemd-nspawn --image=", hence we should
create them with the No-Auto flag turned on. Once a file system image is
dropped into the partition the flag may be turned off by the updater
tool, so that it is considered from then on.
Thew new option for this is called NoAuto. I dislike negated options
like this, but this is taken from the naming in the spec, which in turn
inherited the name from the same flag for Microsoft Data Partitions. To
minimize confusion, let's stick to the name hence.
path-util: make path_equal() an inline wrapper around path_compare()
The two are completely identical, only the return code is inverted.
let's hence make it easy for the compiler to make it the same function
call even in lowest optimization modes.
Frantisek Sumsal [Thu, 17 Jun 2021 18:17:25 +0000 (20:17 +0200)]
test: wait until the unit leaves the 'inactive' state as well
In many CI runs I noticed a race where we check the "active" state a bit
too early where the unit is still in the "inactive" state, causing the
`is-failed` check to fail. Mitigate this by waiting even if the unit is
in the inactive state and introduce a "safe net" which checks whether
the unit is not restarting indefinitely or more than it should (as
described in the original issue #3166).
Example:
```
[ 5.757784] testsuite-11.sh[216]: + systemctl --no-block start fail-on-restart.service
[ 5.853657] testsuite-11.sh[222]: ++ systemctl show --value --property ActiveState fail-on-restart.service
[ 5.946044] testsuite-11.sh[216]: + active_state=inactive
[ 5.946044] testsuite-11.sh[216]: + [[ inactive == \a\c\t\i\v\a\t\i\n\g ]]
[ 5.946044] testsuite-11.sh[216]: + [[ inactive == \a\c\t\i\v\e ]]
[ 5.946044] testsuite-11.sh[216]: + systemctl is-failed fail-on-restart.service
[ 5.946816] systemd[1]: fail-on-restart.service: Passing 0 fds to service
[ 5.946913] systemd[1]: fail-on-restart.service: About to execute false
[ 5.947011] systemd[1]: fail-on-restart.service: Forked false as 228
[ 5.947093] systemd[1]: fail-on-restart.service: Changed dead -> start
[ 5.947172] systemd[1]: Starting Fail on restart...
[ 5.947272] systemd[228]: fail-on-restart.service: Executing: false
[ 5.960553] testsuite-11.sh[227]: activating
[ 5.965188] testsuite-11.sh[216]: + exit 1
[ 6.011838] systemd[1]: Received SIGCHLD from PID 228 (4).
[ 6.012510] systemd[1]: fail-on-restart.service: Main process exited, code=exited, status=1/FAILURE
[ 6.012638] systemd[1]: fail-on-restart.service: Failed with result 'exit-code'.
[ 6.012834] systemd[1]: fail-on-restart.service: Service will restart (restart setting)
[ 6.012963] systemd[1]: fail-on-restart.service: Changed running -> failed
[ 6.013081] systemd[1]: fail-on-restart.service: Unit entered failed state.
```
plattrap [Fri, 18 Jun 2021 00:32:02 +0000 (12:32 +1200)]
Update systemd-resolved.service.8 help
Text currently refers to `/etc/nsswitch.conf` where it should refer to `/etc/resolv.conf`.
This is in the context of defining a nameserver IP and search domains.
Frantisek Sumsal [Thu, 17 Jun 2021 12:38:21 +0000 (14:38 +0200)]
test: drop the mawk-incompatible expression
The three-argument match() is a GNU AWK extension, thus breaking the
compatibility with mawk (used on Ubuntu/Debian, for example). Let's
replace it with a (hopefully) more portable sed expression to drop the
inadvertently introduced gawk dependency.
Jan Macku [Thu, 27 May 2021 10:25:51 +0000 (12:25 +0200)]
core: Hide "Deactivated successfully" message
Show message "Deactivated successfully" in debug mode (when manager is
user) rather than in info mode. This message has low information value
for regular users and it might be a bit overwhelming on a system with
a lot of devices.
meson: allow "soft-static" allocations for uids and gids in the initrd
The general idea with users and groups created through sysusers is that an
appropriate number is picked when the allocation is made. The number that is
selected will be different on each system based on the order of creation of
users, installed packages, etc. Since system users and groups are not shared
between installations, this generally is not an issue. But it becomes a problem
for initrd: some file systems are shared between the initrd and the host (/run
and /dev are probably the only ones that matter). If the allocations are
different in the host and the initrd, and files survive switch-root, they will
have wrong ownership.
This makes the gids build-time-configurable for all groups and users where
state may survive the switch from initrd to the host.
In particular, all "hardware access" groups are like this: files in /dev will
be owned by them. Eventually the new udev would change ownership, but there
would be a momemnt where the files were owned by the wrong group. The
allocations are "soft-static" in the language of Fedora packaging guidelines:
the uid/gid will be used if possible, but we'll fall back to a different
one. TTY_GID is the exception, because the number is used directly.
Similarly, the possibility to configure "soft-static" uids is added for daemons
which may usefully run in the initramfs: systemd-network (lease information and
interface state is serialized to /run), systemd-resolve (stub files and
interface state), systemd-timesync (/run/systemd/timesync).
Journal files are owned by the group systemd-journal, and acls are granted
for wheel and adm.
systemd-oom and systemd-coredump are excluded from this patch: I assume that
oomd is not useful in the initrd, and coredump leaves no state (it only creates
a pipe in /run?).
The defaults are not changed: if nothing is configured, dynamic allocation will
be used. I looked at a Debian system, and the numbers are all different than
on Fedora.
For Fedora, see the list of uids and gids at https://pagure.io/setup/blob/master/f/uidgid.
In particular, systemd-network and systemd-resolve got soft-static numbers to
make it easy to transition from a non-host-specific initrd to a host system
already a few years back (https://bugzilla.redhat.com/show_bug.cgi?id=1102002).
I also requested static allocations for sgx, input, render in
https://pagure.io/packaging-committee/issue/1078,
https://pagure.io/setup/pull-request/27.
Julia Kartseva [Tue, 15 Jun 2021 18:58:54 +0000 (11:58 -0700)]
dbus: extend SocktBind{Allow|Deny}= with ip proto
Support filtering by ip protocol (L4) in SocketBind{Allow|Deny}=
properties.
The signature of dbus methods must be finalized before new release is
cut, hence reserve a parameter for ip protocol.
Implementation will follow.
bootctl: print SystemdOptions from efivarfs if newer than our cache
The logic is that if the options are updated after boot, we *don't* use
the new value. But we still want to print out the changed contents in
bootctl as to not confuse people.
Fixes #19597.
Also https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=988450.
$ build/bootctl systemd-efi-options
quiet
Note: SystemdOptions EFI variable has been modified since boot. New value: debug
The hint is printed to stderr, so scripts should not be confused.
basic/efivars: replace dynanamic creation of efivar names with static strings
Creating those string dynamically at runtime is slow and unnecessary.
Let's use static strings with a bit of macro magic and the let the compiler
coalesce as much as possible.
$ size build/src/shared/libsystemd-shared-248.so{.old,}
text data bss dec hex filename 2813453 94572 4584 2912609 2c7161 build/src/shared/libsystemd-shared-248.so.old 2812309 94564 4584 2911457 2c6ce1 build/src/shared/libsystemd-shared-248.so
A nice side-effect is that the same form is used everywhere, so it's easier to
figure out all variables that are used, and where each specific variable is
used.
Note: 'const char *foo = alloca(…);' seems OK. Our coding style document and
alloca(3) only warn against using alloca() in function invocations. Declaring
both stack variable and alloca at the same time should be fine: no matter in
which order they happen, i.e. if the pointer variable is above the contents,
or the contents are above the pointer, or even if the pointer is elided by the
compiler, everything should be fine.
seccomp: drop quotactl_path() again from filter sets
In the light of https://lwn.net/Articles/859679/ let's drop
quotactl_path() again from the filter set list, as it got backed out
again in 5.13-rc3.
It's likely going to be replaced by quotactl_fd() eventually, but that
hasn't made its way into the tree yet, hence let's not replace the entry
for now.
sd-id128: document everywhere that we treat all UUIDs as Variant 1
So in theory UUID Variant 2 (i.e. microsoft GUIDs) are supposed to be
displayed in native endian. That is of course a bad idea, and Linux
userspace generally didn't implement that, i.e. uuidd and similar.
Hence, let's not bother either, but let's document that we treat
everything the same as Variant 1, even if it declares something else.
Yu Watanabe [Mon, 14 Jun 2021 10:46:33 +0000 (19:46 +0900)]
network: use void* to correctly store SetLinkOperation in Request
Previously, when `link_request_queue()` is called in link_request_set_link(),
`SetLinkOperation` is casted with INT_TO_PTR(), and the value is assigned to
`void *object`. However the value was read directly through the member
`SetLinkOperation set_link_operation` of the union which `object`
beloging to. Thus, read value was always 0 on big-endian systems.
Michal Sekletár [Mon, 24 Aug 2020 09:21:44 +0000 (11:21 +0200)]
udev: add basic set of user-space defined tracepoints (USDT)
Debugging udev issues especially during the early boot is fairly
difficult. Currently, you need to enable (at least) debug logging and
start monitoring uevents, try to reproduce the issue and then analyze
and correlate two (usually) huge log files. This is not ideal.
This patch aims to provide much more focused debugging tool,
tracepoints. More often then not we tend to have at least the basic idea
about the issue we are trying to debug further, e.g. we know it is
storage related. Hence all of the debug data generated for network
devices is useless, adds clutter to the log files and generally
slows things down.
Using this set of tracepoints you can start asking very specific
questions related to event processing for given device or subsystem.
Tracepoints can be used with various tracing tools but I will provide
examples using bpftrace.
Another important aspect to consider is that using tracepoints you can
debug production systems. There is no need to install test packages with
added logging, no debuginfo packages, etc...
Example usage (you might be asking such questions during the debug session),
Q: How can I list all tracepoints?
A: bpftrace -l 'usdt:/usr/lib/systemd/systemd-udevd:udev:*'
Q: What are the arguments for each tracepoint?
A: Look at the code and search for use of DEVICE_TRACE_POINT macro.
Q: How many times we have executed external binary?
A: bpftrace -e 'usdt:/usr/lib/systemd/systemd-udevd:udev:spawn_exec { @cnt = count(); }'
Q: What binaries where executed while handling events for "dm-0" device?
A bpftrace -e 'usdt:/usr/lib/systemd/systemd-udevd:udev:spawn_exec / str(arg1) == "dm-0"/ { @cmds[str(arg4)] = count(); }'
Thanks to Thomas Weißschuh <thomas@t-8ch.de> for reviewing this patch
and contributions that allowed us to drop the dependency on dtrace tool
and made the resulting code much more concise.
repart: show partitions we don't grow/create as "unchanged"
The previous string was "unknown", but that's wrong, because we *do*
know what we are going to do with those partitions: we leave them
unmodified, hence say "unchanged" in the output, to be clearer.
Yu Watanabe [Mon, 14 Jun 2021 17:13:59 +0000 (02:13 +0900)]
sd-event: always reshuffle time prioq on changing online/offline state
Before 81107b8419c39f726fd2805517a5b9faab204e59, the compare functions
for the latest or earliest prioq did not handle ratelimited flag.
So, it was ok to not reshuffle the time prioq when changing the flag.
But now, those two compare functions also compare the source is
ratelimited or not. So, it is necessary to reshuffle the time prioq
after changing the ratelimited flag.
A poorly documented fact is that SELinux unfortunately uses nosuid mount flag
to specify that also a fundamental feature of SELinux, domain transitions, must
not be allowed either. While this could be mitigated case by case by changing
the SELinux policy to use `nosuid_transition`, such mitigations would probably
have to be added everywhere if systemd used automatic nosuid mount flags when
`NoNewPrivileges=yes` would be implied. This isn't very desirable from SELinux
policy point of view since also untrusted mounts in service's mount namespaces
could start triggering domain transitions.
Alternatively there could be directives to override this behavior globally or
for each service (for example, new directives `SUIDPaths=`/`NoSUIDPaths=` or
more generic mount flag applicators), but since there's little value of the
commit by itself (setting NNP already disables most setuid functionality), it's
simpler to revert the commit. Such new directives could be used to implement
the original goal.
Yu Watanabe [Mon, 14 Jun 2021 06:43:43 +0000 (15:43 +0900)]
network: drop misleading debugging logs about MTU
This fixes the following spurious logs on enumerating links:
```
wlan0: Saved original MTU 1500 (min: 256, max: 2304)
wlan0: MTU is changed: 0 → 1500 (min: 256, max: 2304)
```
Yu Watanabe [Sat, 12 Jun 2021 20:12:03 +0000 (05:12 +0900)]
network: try to bring down before setting MAC address
Most real network devices refuse to set MAC address when its operstate
is not down. So, setting MAC address once failed, then let's bring down
the interface and retry to set.
Yu Watanabe [Fri, 11 Jun 2021 11:34:17 +0000 (20:34 +0900)]
network: always check dynamic address assignments before entering configured state
Previously (v248 or earlier), even if no static address is configured,
the link did not enter configured state, as e.g. Link::static_addresses_configured
is false until the link gained its carrier.
But, after the commit 1187fc337577cecd685d331eeab656be186ba3b2, the
situation was changed. Static addresses, routes, and etc are requested even
if the link does not have its carrier, and thus the link enters configured
state when no static address and etc are specified.
This makes the link does not enter configured state before it gains its
carrier when at least one of dynamic address assignment protocols (e.g.
DHCP) except for NDISC is enabled.
Note that, unfortunately, netplan always enables ConfigureWithoutCarrier=
for all virtual devices, e.g. bridge. See,
https://github.com/canonical/netplan/commit/978e20f902f6b92a46dc6e0050e2172e834e4617
So, we need to support e.g. the following strange config:
```
[Netowkr]
ConfigureWithoutCarrier=yes
DHCP=yes
```