resolved: don't redundantly switch DNS servers because of transaction failures
When a transaction fails and we decide to switch DNS servers, don#t do
so unconditionally. Check if the current DNS server is still the same as
when the transaction was initiated. And if not, do not do anything.
That should reduce the number of redundant DNS server switches if many
parallel transactions fail simultaneously (which is pretty likely if
DNSSEC is on).
resolved: reuse check for link-local IP address lookups
Let's reuse accept_link_local_reverse_lookups() at one more place, where
we check for the list of link local reverase address domains. Since we
don't actually accept the domains here (but rather the opposite, not
accept), let's rename the function a bit more generically with accept_ →
match_.
While we are at it invert the if branches, to make things more easily
understandable: filter out the unwatnted stuff and have the "all good"
state as main codepath.
This fixes a long-standing issue in packaging scriptlets: daemon-reload
was moved to the end of the transaction, but restarting services was still
straightaway after package installation.
Note that daemon-reload is called twice. This wouldn't be hardly noticable,
except that now a bunch of units (at least in Fedora) generate very verbose
warnings about deprecated features. So we get those warnings twice…
reload-or-restart --needing-restart is also called twice, but the second call
is usually a noop, because the first clears the flag for restarted units. The
second call is necessary for the case where we only uninstall packages, and the
%transfiletriggerpostun trigger fires, but not the %transfiletriggerin
scriptlet.
Also note that this assumes that units are marked only for restart if paths
under @systemunitdir@ or /etc/systemd/system have been touched. I would prefer
make the trigger that does 'restart --needing-restart' fire always, but it
seems rpm doesn't have such functionality. (Except as a %transfiletrigger that
would trigger on "/*" to catch all transactions, but that seems ineffiecient
and ugly.)
rpm: order sysctl/sysusers/tmpfiles execution before package scriptlets
P>1000000 is *before* "normal" scriptlets, P<1000000 is *after*. I think it
makes sense to do stuff like execution of sysctl/sysusers/tmpfiles configuration
before package scriptlets. I think that was the intent, but a single digit got
dropped ;(
Also, let's reorder the scriptlets in the file to match execution order, to
make it easier to see what is going on.
Most of those may happen in any order, but there are some exceptions:
tmpfiles should be after sysusers,
udevadm --reload should be after hwdb.
The trigger was initially written to use %transfiletriggerun instead
of %transfiletriggerpostun because the latter would not fire. It turned
out to a buffer overread in rpm that since has been long fixed:
https://bugzilla.redhat.com/show_bug.cgi?id=1284645
https://github.com/rpm-software-management/rpm/commit/f6521c50f6836374a0f7995f8f393aaf36e178ea
rpm: sync the shell version of triggers.systemd with the lua version
Note that this goes both ways: in particular the lua version had udev
scriptlets in the wrong package, fixed in
https://src.fedoraproject.org/rpms/systemd/c/3c9433d7cf4afc8d76660402f6c3d9d991596b83.
rpm: pull in the alternative trigger implementation in sh
From https://src.fedoraproject.org/rpms/systemd/blob/master/f/triggers.systemd.
In 12dde791d519bc80d5cca4ab6f088763cd481015 scriptlets were converted to lua.
This is not only faster and cleaner, but also avoids a nasty dependency loop:
rpm implements the lua scripting internally, so we don't need a working shell
for the scriplets. This is nice and all, but unfortunately ostree wants to
capture scriptlets and execute them at a later time and does not support lua.
So in Fedora we ended up with a revert back to a shell-based implementation
[1]. At the time I hoped this would only be a temporary workaround, but three
years later I think it's fair to assume that this will not happen any time
soon. But carrying the upstream lua version and the downstream sh version is
error prone. So let's import the other version into our tree too so that they
can be kept in sync.
This is almost equivalent to 'busctl call-method org.freedesktop.systemd1
/org/freedesktop/systemd1 org.freedesktop.systemd1.Manager EnqueueMarkedJobs',
but waits for the jobs to finish.
core: add EnqueueMarkedJobs method to reload/restart marked units
We support two return types for methods that start jobs. EnqueueJob support the
full-monty mode with affected jobs. I didn't do this here, since it seems
unlikely to be used. In the common case there'd be a huge list of jobs and
affected jobs. EnqueueMarkedJobs() just returns a list of jobs that we can wait
upon.
The name of the method is generic in case we decide to add something other than
just reload/restart later on.
When errors occur, resource errors are treated as fatal, but for other error
types we queue up other jobs, and only return an error at the end. The
assumption is that the caller will ignore the result error anyway, so it's
better to try to reload/restart as much as possible.
The property is never set by systemd, only reset after a stop or restart or
reload. It may externally be set to mark the unit for a later restart/reload.
I wasn't sure whether to configure the property only for the types where this
makes sense (Service, Swap, etc). But Restart() method is defined on the unit,
and also having this always under the same property name is more convenient.
The ELN composes are quite unstable and take a while to refresh. Let's
drop them again and revisit this once they get more mature to reduce
the CI noise.
Let's suppress repeated stub queries coming in, to minimize resource
usage. Many DNS clients are pretty aggressive regarding repeating DNS
requests, hence let's find them and suppress the follow-ups should we
need more time to fulfill the queries.
Luca Boccassi [Sun, 14 Feb 2021 19:29:42 +0000 (19:29 +0000)]
test: install binaries from local d/control file
The source package in the apt cache might be older than the
packaging from salsa.debian.org/systemd-team/systemd so it might not
list all the current binary packages.
This is currently the case for systemd-timesyncd, so TEST-30 fails.
Simply grep the control file rather than using apt-cache when iterating
over the packages contents.
systemctl: use argv[0] not program_invocation_short_name for arg dispatch
The immediate motivation is to allow fuzz-systemctl-parse-argv to cover also
the other code paths. p_i_s_n is not getting set (and it probably shouldn't),
so the fuzzer would only cover the paths for ./systemctl, and not ./reboot,
etc. Looking at argv[0] instead, which is passed as part of the fuzzer data,
fixes that.
But I think in general it's more correct to look at argv[0] here: after all we
have all the information available through local variables and shouldn't go out
of our way to look at a global.
This lists numerical signal values:
$ systemctl --signal list
SIGNAL NAME
1 SIGHUP
2 SIGINT
3 SIGQUIT
...
62 SIGRTMIN+28
63 SIGRTMIN+29
64 SIGRTMIN+30
This is useful when trying to kill e.g. systemd with a specific signal number
using kill. kill doesn't accept our fancy signal names like RTMIN+4, so one
would have to calculate that value somehow. Doing
systemctl --signal list | grep -F RTMIN+4
is a nice way of doing that.
resolved: refuse sending packets to our own stub listeners
A previous commit made sure that when one of our own packets is looped
back to us, we ignore it. But let's go one step further, and refuse
operation if we notice the server we talk to is our own. This way we
won't generate unnecessary traffic and can return a cleaner error.
Let's be more precise in naming this function, after all this doesn#t
actually check if the packet is really ours, but just that the source IP
address is a local one. Hence name it that way.
(This is preparation to add a helper that checks if packet belongs to
local transaction later on)
Let's add some overflow checks. Also, if 0 records are reserved, use
this as indication that a copy shall be done and do not grow the answer
beyond the current size.
resolved: gracefully handle with packets with too large RR count
Apparently, there are plenty routers in place that report an incorrect
RR count in the packets: they declare more RRs than are actually
included.
Let's accept these responses, but let's downgrade them to baseline, i.e.
let's suppress OPT in this case: if they don't even get the RR count
right, let's operate on the absolute baseline, and not bother with
anything fancier such as EDNS.
Prompted-by: https://github.com/systemd/systemd/issues/12841#issuecomment-724063973 Fixes: #3980
Most likely fixes: #12841
units: turn off DNSSEC validation when timesyncd resolves hostnames
We have a chicken and egg problem: validation of DNSSEC signatures
doesn't work without a correct clock, but to set the correct clock we
need to contact NTP servers which requires resolving a hostname, which
would normally require DNSSEC validation.
Let's break the cycle by excluding NTP hostname resolution from
validation for now.
Of course, this leaves NTP traffic unprotected. To cover that we need
NTPSEC support, which we can add later.
basic/unit-file: when loading linked unit files, use link source as "fragment path"
The general idea is that when a unit file is "linked" (i.e. installed by
symlinking from outside of the search paths), the *destination* name is
irrelevant. It doesn't even have to be a valid unit name, or to match the type
or instance value. The obvious collorary is that we shouldn't look at the
symlink destination name to derive the unit name, instance value, or anything
else at all.
When building the name map, when we find a linked unit (possibly at the end
of a series of alias redirects), store the *source* of the final symlink as the
fragment path. This has two effects:
- we stop looking at the *target* file name to derive unit info, i.e. actually
implement the stuff described in the first paragraph.
- we load the unit fragment through the symlink. If someone were to remove the
symlink, we'll not load the unit. This seems like the right thing.
Fixes #18058.
Before this change, we were generally quite confused about unit alises for
linked units. Fortunately most poeple use the same symlink source and target,
so in practice we wouldn't hit this too often.
In unit_load_fragment() a comment is added to explain what we're doing there.
core: slightly improve error message on load errors
Let's be a bit more helpful when refusing jobs on units that failed to
load properly. We already have explicit D-Bus errors for the error
conditions that are common and expected (such as "not found"), but for
the rest we so far generate a fairly cryptic message.
Let's try to be friendlier towards users and suggest what to do on such
errors.
Yu Watanabe [Fri, 12 Feb 2021 05:44:42 +0000 (14:44 +0900)]
network: address: do not set IFA_F_PERMANENT flag
The flag is automatically set by kernel when the valid lifetime is
infinite. Note that the flag in netlink message for IPv4 address is
ignored. See set_ifa_lifetime() in kernel's net/ipv4/devinet.c.
But the flag is honored for IPv6 address. And if the flag is set with
finite valid lifetime, the address will not removed automatically by
the kernel.
Yu Watanabe [Thu, 11 Feb 2021 17:56:43 +0000 (02:56 +0900)]
network: address: also set IFA_FLAGS on remove
If an address is assigned with IFA_F_MANAGETEMPADDR, then the flag must
be also set on remove. Otherwise, temporary addresses will not be
removed. See also inet6_rtm_deladdr() in kernel's net/ipv6/addrconf.c.
We had a lone 'bool job_running_timeout_set:1', which generated a hole. Let's
move things around a bit. The structure is a tiny bit smaller and has less
holes:
/* size: 1192, cachelines: 19, members: 149 */
/* sum members: 1175, holes: 3, sum holes: 11 */
/* sum bitfield members: 27 bits, bit holes: 1, sum bit holes: 7 bits */
/* bit_padding: 14 bits */
/* last cacheline: 40 bytes */
/* size: 1184, cachelines: 19, members: 149 */
/* sum members: 1175, holes: 1, sum holes: 4 */
/* sum bitfield members: 27 bits (3 bytes) */
/* bit_padding: 13 bits */
/* last cacheline: 32 bytes */
A helper function would seem more natural, but there are two reasons why a
macro is needed:
- many bool fields are bitfields, so we can't take a pointer, and using a macro
allows us to avoid taking a pointer.
- we have a few diffent types (bool, uint64_t, FreezerState), and we can have
type safety without specifying the type by using the macro.
This also makes the error messages more informative: they print the exact field
identifier that failed, which is more useful for debugging than a description.
sd-bus: extend sd_bus_message_read_strv() to paths and signatures
It's rather convenient to be able to read all three types with this function.
Strictly speaking this change is not fully compatible, in case someone was
relying on sd_bus_message_read_strv() returning an error for anything except
"as", but I hope nobody was doing that.