NeilBrown [Thu, 4 Oct 2018 05:49:22 +0000 (15:49 +1000)]
core/mount: minimize impact on mount storm.
If we create 2000 mounts (on a 1-CPU qemu VM) with
mkdir -p /MNT/{1..2000}
time for i in {1..2000}; do mount --bind /etc /MNT/$i ; done
it takes around 20 seconds to complete. Much of this time is taken up
by systemd repeatedly processing /proc/self/mountinfo.
If I disable the processing, the time drops to about 4 seconds.
I have reports that on a larger system with multiple active user sessions, each
with it's own systemd, the impact can be higher.
One particular use-case where a large number of mounts can be expected in quick
succession is when the "clearcase" SCM starts up.
This patch modifies the handling up events from /proc/self/mountinfo so
that systemd backs off when a storm is detected. Specifically the time to process
mountinfo is measured, and the process will not be repeated until 10 times
that duration has passed. This ensures systemd won't use more than 10% of
real time processing mountinfo.
With this patch, my test above takes about 5 seconds.
lldp: simplify compare_func, using ?: to chain comparisons
The ?: operator is very useful for chaining comparison functions
(strcmp, memcmp, CMP), since its behavior is to return the result
of the comparison function call if non-zero, or continue evaluating
the chain of comparison functions.
This simplifies the code in that using a temporary `r` variable
to store the function results is no longer necessary and the checks
for non-zero to return are no longer needed either, resulting in a
typical three-fold reduction to the number of lines in the code.
Introduce a new memcmp_nn() to compare two memory buffers in
lexicographic order, taking length in consideration.
Tested: $ ninja -C build/ test
All test cases pass. In particular, test_multiple_neighbors_sorted()
in test-lldp would catch regressions introduced by this commit.
fileio: fail early if we can't return the number of bytes we read anymore in an int
This is mostly paranoia, but let's better be safer than sorry. This of
course means there's always an implicit limit to how much we can read at
a time of 2G. But that should be ample.
time-out
n 1: a brief suspension of play; "each team has two time-outs left"
From The Free On-line Dictionary of Computing (18 March 2015) [foldoc]:
timeout
A period of time after which an error condition is raised if
some event has not occured. A common example is sending a
message. If the receiver does not acknowledge the message
within some preset timeout period, a transmission error is
assumed to have occured.
Thomas Haller [Thu, 13 Dec 2018 18:59:46 +0000 (19:59 +0100)]
in-addr-util: fix undefined result for in4_addr_netmask_to_prefixlen(<0.0.0.0>)
u32ctz() was undefined for zero due to __builtin_ctz() [1].
Explicitly check for zero to make the behavior defined.
Note that this issue only affected in4_addr_netmask_to_prefixlen()
which is the only caller.
It may seem slightly odd, to return 32 (bits) for utz(0). But that
is what in4_addr_netmask_to_prefixlen() needs, and it probably makes
the most sense here.
units: replace symlinks in units/user/ by real files
We already *install* those as real files since de78fa9ba0be55b01066ca5a716c6673d76b817b.
Meson will start to copy symlinks as-is, so we would get dangling symlinks in
/usr/lib/systemd/user/.
I considered the layout in our sources to match the layout in the installation
filesystem (i.e. creating units/system/ and moving all files from units/ to
units/system/), but that seems overkill. By using normal files for both we get
some duplication, but those files change rarely, so it's not a big downside in
practice.
shared/install: ignore symlinks which have lower priority than the unit file
In #10583, a unit file lives in ~/.config/systemd/user, and
'systemctl --runtime --user mask' is used to create a symlink in /run.
This symlink has lower priority than the config file, so
'systemctl --user' will happily load the unit file, and does't care about
the symlink at all.
But when asked if the unit is enabled, we'd look for all symlinks, find the
symlink in the runtime directory, and report that the unit is runtime-enabled.
In this particular case the fact that the symlink points at /dev/null, creates
additional confusion, but it doesn't really matter: *any* symlink (or regular
file) that is lower in the priority order is "covered" by the unit fragment,
and should be ignored.
Franck Bui [Wed, 12 Dec 2018 12:46:32 +0000 (13:46 +0100)]
vconsole-setup: fonts copy will fail if the current terminal is in graphical mode
If the terminal is in graphical mode, the kernel will refuse to copy the fonts
and will return -EINVAL.
Also having the graphical mode in effect probably indicates that the terminal
is in used by another application and we shouldn't interfer in such cases.
shared/install: remove two conditionals which are always false
The name argument in UnitFileInstallInfo (i->name) should always be a unit
file name, so the conditional always takes the 'else' branch.
The only call chain that links to find_symlinks_fd() is unit_file_lookup_state
→ find_symlinks_in_scope → find_symlinks → find_symlinks_fd. But
unit_file_lookup_state calls unit_name_is_valid(name), and then name is used
to construct the UnitFileInstallInfo object in install_info_discover, which just
uses the name it was given.
It would work when the generator was run by systemd, since generators
are always started in "/", but when running the generator for debugging
purposes the result would be ... different.
generators: define custom main func definer and use it where applicable
There should be no functional difference, except that the error message
is changd from "three or no arguments" to "zero or three arguments". Somehow
the inverted form always seemed strange.
umask() call is also dropped from run-generator. I think it wasn't dropped in 053254e3cb215df3b8c905bc39b920f8817e1c7d because the run generator was merged
around the same time.
Sam Morris [Mon, 8 Oct 2018 11:03:28 +0000 (12:03 +0100)]
resolved: have the stub resolver listen on both TCP and UDP by default
RFC7766 section 4 states that in the absence of EDNS0, a response that
is too large for a 512-byte UDP packet will have the 'truncated' bit
set. The client is expected to retry the query over TCP.
Chris Down [Wed, 12 Dec 2018 10:49:35 +0000 (10:49 +0000)]
cgroup: Don't explicitly check for member in UNIT_BEFORE
The parent slice is always filtered ahead of time from UNIT_BEFORE, so
checking if the current member is the same as the parent unit will never
pass.
I may also write a SLICE_FOREACH_CHILD macro to remove some more of the
parent slice checks, but this requires a bit of a rework and general
refactoring and may not be worth it, so let's just do this for now.
Mark *data and *userdata params to specifier_printf() as const
It would be very wrong if any of the specfier printf calls modified
any of the objects or data being printed. Let's mark all arguments as const
(primarily to make it easier for the reader to see where modifications cannot
occur).
core: when a unit state changes only propagate to jobs after reloading is complete
Previously, we'd immediately propagate unit state changes into any jobs
pending for them, always. With this we only do this if the manager is
out of the "reload" state. This fixes the problem #8803 tried to
address, by simply not completing jobs until after the reload (and thus
reestablishment of the dbus connection) is complete.
Note that there's no need to later on explicitly catch up with the
missed job state changes (i.e. there's no need to call
unit_process_job() later one explicitly). That's because for jobs in
JOB_WAITING state on deserialization all jobs are requeued into the run
queue anyway, and thus checked again if they can complete now. And for
JOB_RUNNING jobs unit_catchup() phase is going to trigger missed out
state changes *after* the reload complete anyway (after all that's what
distinguishes from unit_coldplug()).
Let's add a helper call unit_deserialize_job() for this purpose, and
let's move registration in the global jobs hash table into
job_install_deserialized() so that it it is done after all superficial
checks are done, and before transitioning into installed states, so that
rollback code is not necessary anymore.
job: be more careful when removing job object from jobs hash table
Let's validate that the ID is actually allocated to us before remove a
job.
This is relevant as various bits of code will call job_free() on
partially set up Job objects, and we really shouldn't remove another job
object accidentally from the hash table, when the set up didn't
complete.
Memory management is borked for this, and moreover this is unnecessary
since f0831ed2a03, i.e. since coldplug() and catchup() are two different
concepts: the former restoring the state from before a reload, the
latter than adjusting it again to the actual status in effect after the
reload.
meson: make net.naming-scheme= default configurable
This is useful for distributions, where the stability of interface names should
be preseved after an upgrade of systemd. So when some specific release of the
distro is made available, systemd defaults to the latest & greatest naming
scheme, and subsequent updates set the same default. This default may still
be overriden through the kernel and env var options.
A special value "latest" is also allowed. Without a specific name, it is harder
to verride from meson. In case of 'combo' options, meson reads the default
during the initial configuration, and "remembers" this choice. When systemd is
updated, old build/ directories could keep the old default, which would be
annoying. Hence, "latest" is introduced to make it explicit, yet follow the
upstream. This is actually useful for the user too, because it may be used
as an override, without having to actually specify a version.
With this we can stabilize how naming works for network interfaces. A
user can request through a kernel cmdline option or an env var which
scheme to follow. The idea is that installers use this to set into stone
(a very soft stone though) the scheme used during installation so that
interface naming doesn't change afterwards anymore.
Why use env vars and kernel cmdline options, and not a config file of
its own?
Well, first of all there's no obvious existing one to use. But more
importantly: I have the feeling that this logic is kind of an incomplete
hack, and I simply don't want to do advertise this as a perfectly
working solution. So far we used env vars for the non-so-official
options and proper config files for the official stuff. Given how
incomplete this logic is (i.e. the big variable for naming remains the
kernel, which might expose sysfs attributes in newer versions that we
check for and didn't exist in older versions — and other problems like
this), I am simply not confident in giving this first-class exposure in
a primary configuration file.
We would describe tmpfiles.d through what systemd-tmpfiles does with them, but
I think it's better to start with a geneneral statement what they are. Also,
let's make the description of volatile file systems less prominent.
Also, strenghten the advice to use RuntimeDirectory and mention
{Cache,Logs,Configuration,State}Directory=.
man: reword tmpfiles.d descriptions to refer less to previous descriptions
I think it is OK if some option is described as "similar to ..., but in
addition ...", as long as the "in addition" part is strictly additive this is
unambiguous. Otherwise, we'd have to repeat a lot of text, and then we'd
probably forget to adjust some of the descriptions when doing changes.
But when the "in addition" part is about replacing or removing parts of
functionality, it is better to avoid this pattern and describe the later option
from scratch.
Some paragraph breaks are added and minor changes made. UID/GID is changed to
user/group, since we generally expect user/group names to be used, not numeric
ids.
cg_ns_supported() caches, so the condition was really checked just once, but
it looks weird to assign the return value to arg_use_cgns (if the variable is not present),
because then the other checks are effectively equivalent to
if (cg_ns_supported() && cg_ns_supported()) { ...
and later
if (!cg_ns_supported() || !cg_ns_supported()) { ...