varlink: port varlink code over to use getdtablesize() for sizing number of concurrent connections
Use the official glibc API for determining this parameter. In most other
cases in our tree it's better to go directly for RLIMIT_NOFILE since
it's semantically what we want, but for this case it appears more
appropriate to use the friendlier, shorter, explicit API.
We want to use this code in NSS modules, and we never know the execution
environment we are run in there, hence let's move our fds up to ensure
we won't step into dangerous fd territory.
This is similar to how we already do it in sd-bus for client connection
fds.
Before, we'd unref from machine_stop_unit, still keeping the unit name around,
and only forget the name later, when garbage collecting. If we didn't call
manager_stop_unit(), then we wouldn't do the unref. Let's unref at the same
point where we do garbage collection, so that it is always true that
iff we have the name generated with AddRef=1, then have a reference to the unit,
and as soon as we forget the name, we drop the reference.
This should fix the issue when repeated systemd-nspawn --register=yes fails
with "scope already exists" error.
Incidentally, this fixes an error in the code path where r was used instead of q.
Having this function which is called only from one place in a separate file
makes the code harder to follow. In preparation for subsequent changes, let's
make it static.
man: sort options without "=" in the directives index
Some options would appear twice in the index, e.g. --collect= and
--collect. Some man pages use one form, some the other, and the argument
might be mandatory for some commands but not others. Anyway, let's display
them as one entry, to reduce the total number of items listed.
When wrong element types are used, directives are sometimes placed in the wrong
section. Also, strip part of text starting with "'", which is used in a few
places and which is displayed improperly in the index.
bootctl: make 'random-seed' handle inability to write system token EFI variable gracefully
Apparently some firmwares don't allow us to write this token, and refuse
it with EINVAL. We should normally consider that a fatal error, but not
really in the case of "bootctl random-seed" when called from the
systemd-boot-system-token.service since it's called as "best effort"
service after boot on various systems, and hence we shouldn't fail
loudly.
Similar, when we cannot find the ESP don't fail either, since there are
systems (arch install ISOs) that carry a boot loader capable of the
random seed logic but don't mount it after boot.
cgls: show delegation boundaries by underlining the cgroup in the output
This should help visualize where one manager's territory begins and
another's starts. Do this by underlining (since it's a "cut" point an
underline made most sense to me). Since underlining is not visible on
the console let's also show an ellipses for all lines that are
delegation boundaries.
Unfortunately this all is not as useful as it appears. The
"trusted.delegate" xattr is only visible to roo, which means
"systemd-cgls" has be called as root to show the boundaries.
Unfortunately cgroupfs doesn't support unprivileged xattrs on cgroups.
core: set "trusted.delegate" xattr on cgroups that are delegation boundaries
Let's mark cgroups that are delegation boundaries to us. This can then
be used by tools such as "systemd-cgls" to show where the next manager
takes over.
core: don't insist on ProtectHostname= if unshare() is blocked
Previously we'd only skip ProtectHostname= if kernel support for
namespaces was lacking. With this change we also accept if unshare()
fails because it is blocked.
core: be more lenient when checking whether sandboxing is necessary
In some containers unshare() is made unavailable entirely. Let's deal
with this that more gracefully and disable our sandboxing of services
then, so that we work in a container, under the assumption the container
manager is then responsible for sandboxing if we can't do it ourselves.
Previously, we'd insist on sandboxing as soon as any form of BindPath=
is used. With this change we only insist on it if we have a setting like
that where source and destination differ, i.e. there's a mapping
established that actually rearranges things, and thus would result in
systematically different behaviour if skipped (as opposed to mappings
that just make stuff read-only/writable that otherwise arent').
(Let's also update a test that intended to test for this behaviour with
a more specific configuration that still triggers the behaviour with
this change in place)
Fixes: #13955
(For testing purposes unshare() can easily be blocked with
systemd-nspawn --system-call-filter=~unshare.)
test: make sure our tests get exclusive TTY access
This sould make our test suite a bit more robust if it is slow running.
A few of our test services use StandardOutput=tty or StandardError=tty
in the tests in order to connect test services to the container console.
This gets into conflict with the container getty which wants exclusive
access to the console. Since the container getty is started with
Type=idle it typically gets started after a timeout only if the TTY is
already used, which hence introduces a race: if the test finishes
earlier all is good, if not, then the test gets kicked off the TTY which
then causes bash to abort since it cannot write any error messages
anymore.
Let's fix this hence: all tests that connect to the tty are now
synchronized to getty-pre.target, so they finish before any getty is
started.
Note that this slightly changes behaviour: "none" is only allowed as
option, if it's the only option specified, but not in combination with
other options. I think this makes more sense, since it's the choice when
no options shall be specified.
pam_systemd: don't use PAM_SYSTEM_ERR for something that isn't precisely a system error
It's not really clear which PAM errors to use for which conditions, but
something called PAM_SYSTEM_ERR should probably not be used when the
error is not the result of some system call failure.
Move the explanation of options three columns to the right: then almost
all options fit and we do not need to break lines so often.
When a multi-line explanation precedes a section break, i.e. there is a
half-line on the right side, do not use an empty space. This saves a line,
and actually looks visually better because the text is still clearly
separated, but we don't get the big vertical white space.
This copies the commands log-level and log-target (to query and set the current
settings) from systemd-analyze to systemctl, essentially reverting a65615ca5d78be0dcd7d9c9b4a663fa75f758606. Controllling the log level settings
of the manager is basic functionality, that should be available even if
systemd-analyze (which is more of an analysis tool) is not installed. This is
like dmesg and journalctl, which should be available even if a debugger and
more advanced tools to analyze the kernel are not available. (Note that dmesg
is used to control the log level too, not just to browse the kernel logs.)
I chose to copy&paste the methods from analyze.c to the new location. There
isn't enough code to share, because acquire_bus() in both places has a
different signature despite the same name, so the only part that is common
is the invocation of sd_bus_set_property().
This cleans up and unifies the outut of --help texts a bit:
1. Highlight the human friendly description string, not the command
line via ANSI sequences. Previously both this description string and
the brief command line summary was marked with the same ANSI
highlight sequence, but given we auto-page to less and less does not
honour multi-line highlights only the command line summary was
affectively highlighted. Rationale: for highlighting the description
instead of the command line: the command line summary is relatively
boring, and mostly the same for out tools, the description on the
other hand is pregnant, important and captions the whole thing and
hence deserves highlighting.
2. Always suffix "Options" with ":" in the help text
3. Rename "Flags" → "Options" in one case
4. Move commands to the top in a few cases
5. add coloring to many more help pages
6. Unify on COMMAND instead of {COMMAND} in the command line summary.
Some tools did it one way, others the other way. I am not sure what
precisely {} is supposed to mean, that uppercasing doesn't, hence
let's simplify and stick to the {}-less syntax
core/path: fix spurious triggering of PathExists= on restart/reload
Our handling of the condition was inconsistent. Normally, we'd only fire when
the file was created (or removed and subsequently created again). But on restarts,
we'd do a "recheck" from path_coldplug(), and if the file existed, we'd
always trigger. Daemon restarts and reloads should not be observeable, in
the sense that they should not trigger units which were already triggered and
would not be started again under normal circumstances.
Note that the mechanism for checks is racy: we get a notification from inotify,
and by the time we check, the file could have been created and removed again,
or removed and created again. It would be better if we inotify would give as
an unambiguous signal that the file was created, but it doesn't: IN_DELETE_SELF
triggers on inode removal, not directory entry, so we need to include IN_ATTRIB,
which obviously triggers on other conditions.