oomd: increase accuracy of SwapUsedLimit= to permyriads too
oomd.conf has two parameters with fractionals: SwapUsedLimit= and
DefaultMemoryPressureLimit=, but one accepts permyriads, the other only
percentages, for no apparent reason. One carries the "Percent" in the
name, the other doesn't.
Let's clean this up: always accept permyriads, and drop the suffix,
given that it is misleading.
I figure we should internally try to focus on scaling everything
relative to UINT32_MAX, and if that isn't in the cards at least 10000,
but never permille nor percent unless there's a really really good
reason for it (e.g. interface defined by someone else).
core: use our usual UINT32_MAX scaling for OOMD limits
So far OOMD limits used permyriads, as an upgrade from the original
percent.
The rest of our codebase typically scales stuff relative to UINT32_MAX.
Let's clean this up, an make sure this happens here too. This is
particularly relevant, as this is exposed in unit files and API, and
before we mark this stable we should get the APIs right.
parse-util: add format string macro for outputting permyriad
Let's define a set of macros for making output of permyriad values easy.
They are printed in pure ASCII, i.e. without the permille/permyriad
suffix, using just percent and two places after the dot.
util: add some helpers for converting percent/permille/permyriad to parts of 2^32-1
At various places we accept values scaled to the range 0…2^32-1 which
are exposed to the user as percentages/permille/permyriad. Let's add
some helper macros (actually: typesafe macro-like functions) that help
with converting our internal encoding to the external encodings.
benefits: some of the previous code rounded up, some down. let's always
round to nearest, to ensure that our conversions are reversible. Also,
check for overflows correctly.
This also adds a test that makes sure that for the full
percent/permille/permyriad ranges we can convert forth and back without
loss of accuracy.
percent-util: when parsing permyriads, permit percents too with 1 place after the dot
Previously, when parsing myriads, we'd support:
x% → percent, no places after the dot
x.yz% → percent, two places after the dot
x‰ → permille, no places after the dot
x.y‰ → permille, one place after the dot
x‱ → permyriad, no places after the dot
Given that we now have a parser for permyriads, let's use it everywhere
for greater accuracy. This means wherever we previously supported % and
‰, we now also support ‱.
udevadm: after validating action, use our internal string instead of optarg
This doesn't really change anything, but feels nicer, since it abstracts
away what device_action_from_string()/device_action_to_string() do
internally, and always uses a normalized action string (yes, there's no
ambiguity, but it's nice to stay abstract, maybe one day there is
ambiguity around this)
To make sd-device properly usable for all programs we need to provide an
API for the "action" field of an event, it's one of the most relevant
ones, and it was so far missing.
This also adds sd_device_get_seqnum(), which isn't that interesting,
except for generating pretty debug output, which we use it ourselves
for.
This also makes device_new_from_stat_rdev() public, as it is truly
useful, as we can see in our own uses of it, and I think is fairly
generic to show up in the public APIs.
Apparently, in our current public headers (i.e. those called sd-*.h) we
suffixed typedefs that we use as values with _t, but we didn't do this
for enum typedefs. Fix that while this stuff is not actually public yet.
With this scheme "value typedefs" now end systematically in _t, and
"object typedefs" (i.e. structures that are typically passed around via
pointers and not values) do not.
resolved: tweak how we calculate MTU for sending packets
Let's take all MTU info we possibly have into account, i.e. the one
reported via netlink, as before and the one the socket might now (from
PMTUD and such), clamped by our own ideas.
resolved: collect incoming fragment size when receiving UDP datagrams
We can later use this to adapt our announced EDNS buffer size in order
to avoid fragmentation to make the best of large datagrams while still
avoiding he security weaknesses of it.
I'm seeing the following with kernel-core-5.10.16-200.fc33.x86_64:
$ sudo SYSTEMD_LOG_LEVEL=debug build/systemd-rfkill
Reading struct rfkill_event: got 8 bytes.
A new rfkill device has been added with index 0 and type bluetooth.
Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
Found container virtualization none.
rfkill0: Operating on rfkill device 'tpacpi_bluetooth_sw'.
Writing struct rfkill_event successful (8 of 9 bytes).
Loaded state '0' from /var/lib/systemd/rfkill/platform-thinkpad_acpi:bluetooth.
Reading struct rfkill_event: got 8 bytes.
A new rfkill device has been added with index 1 and type wwan.
rfkill1: Operating on rfkill device 'tpacpi_wwan_sw'.
Writing struct rfkill_event successful (8 of 9 bytes).
Loaded state '0' from /var/lib/systemd/rfkill/platform-thinkpad_acpi:wwan.
Reading struct rfkill_event: got 8 bytes.
A new rfkill device has been added with index 2 and type bluetooth.
rfkill2: Operating on rfkill device 'hci0'.
Writing struct rfkill_event successful (8 of 9 bytes).
Loaded state '0' from /var/lib/systemd/rfkill/pci-0000:00:14.0-usb-0:7:1.0:bluetooth.
Reading struct rfkill_event: got 8 bytes.
A new rfkill device has been added with index 3 and type wlan.
rfkill3: Operating on rfkill device 'phy0'.
Writing struct rfkill_event successful (8 of 9 bytes).
Loaded state '0' from /var/lib/systemd/rfkill/pci-0000:04:00.0:wlan.
All events read and idle, exiting.
We were expecting a read of exactly RFKILL_EVENT_SIZE_V1==8 bytes. But the
structure has 9 after [1].
/* we don't need the 'hard' variable but accept it */
if (count < RFKILL_EVENT_SIZE_V1 - 1)
return -EINVAL;
/*
* Copy as much data as we can accept into our 'ev' buffer,
* but tell userspace how much we've copied so it can determine
* our API version even in a write() call, if it cares.
*/
count = min(count, sizeof(ev));
if (copy_from_user(&ev, buf, count))
return -EFAULT;
... so it should accept the full size. I'm not sure what is going on here.
But we don't care about the extra fields, so let's accept a write as long as
it's at least RFKILL_EVENT_SIZE_V1.
Luca Boccassi [Fri, 12 Feb 2021 15:30:10 +0000 (15:30 +0000)]
os-util: allow missing VERSION_ID on the host
Rolling releases, like ArchLinux, do not set VERSION_ID in
their os-release files, so allow matching simply on ID if the host
does not provide anything.
shell-completion: complete --legend=no for resolvectl and systemctl
I don't think it makes sense to complete --legend=yes. It is the default, and
it would be only used very rarely (and then it is easy enough to just remove
the '=no' part from the suggested string).
systemctl: hide legends with --quiet, allow overriding
--no-legend is replaced by --legend=no.
--quiet now implies --legend=no, but --legend=yes may be used to override that.
--quiet controls hints and warnings and such, and --legend controls just the
legends. I think it makes sense to allow both to controlled independently, in
particular --quiet --legend makes sense when using systemctl in a script to
provide some user-visible output.
journal-remote: convert to parse_boolean_argument() and fix type confusion
We were passing a reference to 'int arg_seal' to config_parse_bool(),
which expects a 'bool *'. Luckily, this would work, because 'bool'
is smaller than 'int', so config_parse_bool() would set the least-significant
byte of arg_seal. At least I think so. But let's use consistent types ;)
Also, modernize style a bit and don't use integers in boolean context.
This nicely covers the case when optarg is optional. The same parser can be
used when the option string passed to getopt_long() requires a parameter and
when it doesn't.
The error messages are made consistent.
Also fixes a log error c&p in --crash-reboot message.
We almost never use the named enum type, in almost all cases we use
"int" instead, since we overload it with negative errnos. To simplify
things, let's use "int" really everywhere.
Moreover, let's rename the fields for this enum to "type_or_errno", to
make the overloading clear. And let's ad some assertions that things are
in the right range.
resolved: in DNSSEC permissive mode, check if DO bit wasn't copied from request to response
If the server doesn't copy the DO bit from request to response, this is
a very early and easy indication that it doesn#t support DNSSEC
properly. Hence, let's immediately downgrade to non-DNSSEC mode if we
see this – if permissive mode is on and this is allowed.
Luca Boccassi [Tue, 16 Feb 2021 23:47:34 +0000 (23:47 +0000)]
test: avoid leaking open loop devices
When a subshell is used ('make' or 'make all') the LOOPDEV environment
variable, which is used to store the opened loop device, is lost.
So the cleanup on trap/exit doesn't do anything, and the loop
device used to mount the test image is left around.
This is an updated version of #8608 with more restrictive logic. To
quite the original bug:
Some captive portals, lie and do not respond with the captive portal
IP address, if the query is with EDNS0 enabled and D0 bit set to
zero. Thus retry "secure" domain name look ups with less secure
methods, upon NXDOMAIN.
Vito Caputo [Tue, 27 Oct 2020 06:24:34 +0000 (23:24 -0700)]
logs-show: move show_journal_by_unit _BOOT_ID match
In scrutinizing the journal overhead of `systemctl status $service`
it became apparent that the matching engine was performing the unit
matches on every journal in my system, even ones containing nothing
relevant to the current boot.
This seemed strange and likely suboptimal to me, since there's likely
far more unit data to rifle through than boot IDs in any given
journal. The _BOOT_ID match seemed like it should be serving as an
early exit match on irrelevant journals, but that wasn't what seemed
to be happening.
As a quick experiment to see if I could get the _BOOT_ID match to be
something along the lines of a higher priority when matching, and try
early exit on these unrelated journals, I moved add_match_this_boot()
to after the unit match adds, inserting a conjunction between them.
The end result seems to be a very substantial performance gain in my
simple uncached tests, and I still get the expected journal output
from the `systemctl status $service` command:
root@localhost:/# echo 2 > /proc/sys/vm/drop_caches
root@localhost:/# time systemctl --no-pager status dbus
● dbus.service - D-Bus System Message Bus
Loaded: loaded (/lib/systemd/system/dbus.service; static; vendor preset: enabled)
Active: active (running) since Sun 2020-10-25 17:03:05 PDT; 1 day 6h ago
Docs: man:dbus-daemon(1)
Main PID: 572 (dbus-daemon)
Memory: 2.8M
CPU: 110ms
CGroup: /system.slice/dbus.service
└─572 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
Oct 25 17:03:05 localhost systemd[1]: Started D-Bus System Message Bus.
Oct 25 17:06:26 localhost dbus[572]: [system] Activating via systemd: service name='org.freedesktop.machine1' unit='dbus-org.freedesktop.machine1.service'
Oct 25 17:06:26 localhost dbus[572]: [system] Successfully activated service 'org.freedesktop.machine1'
real 0m0.695s
user 0m0.005s
sys 0m0.043s
root@localhost:/# echo 2 > /proc/sys/vm/drop_caches
root@localhost:/# time systemctl --no-pager status dbus
● dbus.service - D-Bus System Message Bus
Loaded: loaded (/lib/systemd/system/dbus.service; static; vendor preset: enabled)
Active: active (running) since Sun 2020-10-25 17:03:05 PDT; 1 day 6h ago
Docs: man:dbus-daemon(1)
Main PID: 572 (dbus-daemon)
Memory: 2.8M
CPU: 110ms
CGroup: /system.slice/dbus.service
└─572 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
Oct 25 17:03:05 localhost systemd[1]: Started D-Bus System Message Bus.
Oct 25 17:06:26 localhost dbus[572]: [system] Activating via systemd: service name='org.freedesktop.machine1' unit='dbus-org.freedesktop.machine1.service'
Oct 25 17:06:26 localhost dbus[572]: [system] Successfully activated service 'org.freedesktop.machine1'
real 0m0.696s
user 0m0.003s
sys 0m0.046s
root@localhost:/# echo 2 > /proc/sys/vm/drop_caches
root@localhost:/# time systemctl --no-pager status dbus
● dbus.service - D-Bus System Message Bus
Loaded: loaded (/lib/systemd/system/dbus.service; static; vendor preset: enabled)
Active: active (running) since Sun 2020-10-25 17:03:05 PDT; 1 day 6h ago
Docs: man:dbus-daemon(1)
Main PID: 572 (dbus-daemon)
Memory: 2.8M
CPU: 110ms
CGroup: /system.slice/dbus.service
└─572 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
Oct 25 17:03:05 localhost systemd[1]: Started D-Bus System Message Bus.
Oct 25 17:06:26 localhost dbus[572]: [system] Activating via systemd: service name='org.freedesktop.machine1' unit='dbus-org.freedesktop.machine1.service'
Oct 25 17:06:26 localhost dbus[572]: [system] Successfully activated service 'org.freedesktop.machine1'
root@localhost:/home/vc/gh/systemd/build# echo 2 > /proc/sys/vm/drop_caches
root@localhost:/home/vc/gh/systemd/build# time ./systemctl --no-pager status dbus
● dbus.service - D-Bus System Message Bus
Loaded: loaded (/lib/systemd/system/dbus.service; static)
Active: active (running) since Sun 2020-10-25 17:03:05 PDT; 1 day 6h ago
TriggeredBy: ● dbus.socket
Docs: man:dbus-daemon(1)
Main PID: 572 (dbus-daemon)
Memory: 2.8M
CPU: 110ms
CGroup: /system.slice/dbus.service
└─572 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
Oct 25 17:03:05 localhost systemd[1]: Started D-Bus System Message Bus.
Oct 25 17:06:26 localhost dbus[572]: [system] Activating via systemd: service name='org.freedesktop.machine1' unit='dbus-org.freedesktop.machine1.service'
Oct 25 17:06:26 localhost dbus[572]: [system] Successfully activated service 'org.freedesktop.machine1'
real 0m0.168s
user 0m0.003s
sys 0m0.016s
root@localhost:/home/vc/gh/systemd/build# echo 2 > /proc/sys/vm/drop_caches
root@localhost:/home/vc/gh/systemd/build# time ./systemctl --no-pager status dbus
● dbus.service - D-Bus System Message Bus
Loaded: loaded (/lib/systemd/system/dbus.service; static)
Active: active (running) since Sun 2020-10-25 17:03:05 PDT; 1 day 6h ago
TriggeredBy: ● dbus.socket
Docs: man:dbus-daemon(1)
Main PID: 572 (dbus-daemon)
Memory: 2.8M
CPU: 110ms
CGroup: /system.slice/dbus.service
└─572 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
Oct 25 17:03:05 localhost systemd[1]: Started D-Bus System Message Bus.
Oct 25 17:06:26 localhost dbus[572]: [system] Activating via systemd: service name='org.freedesktop.machine1' unit='dbus-org.freedesktop.machine1.service'
Oct 25 17:06:26 localhost dbus[572]: [system] Successfully activated service 'org.freedesktop.machine1'
real 0m0.167s
user 0m0.005s
sys 0m0.013s
root@localhost:/home/vc/gh/systemd/build# echo 2 > /proc/sys/vm/drop_caches
root@localhost:/home/vc/gh/systemd/build# time ./systemctl --no-pager status dbus
● dbus.service - D-Bus System Message Bus
Loaded: loaded (/lib/systemd/system/dbus.service; static)
Active: active (running) since Sun 2020-10-25 17:03:05 PDT; 1 day 6h ago
TriggeredBy: ● dbus.socket
Docs: man:dbus-daemon(1)
Main PID: 572 (dbus-daemon)
Memory: 2.8M
CPU: 110ms
CGroup: /system.slice/dbus.service
└─572 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
Oct 25 17:03:05 localhost systemd[1]: Started D-Bus System Message Bus.
Oct 25 17:06:26 localhost dbus[572]: [system] Activating via systemd: service name='org.freedesktop.machine1' unit='dbus-org.freedesktop.machine1.service'
Oct 25 17:06:26 localhost dbus[572]: [system] Successfully activated service 'org.freedesktop.machine1'