Likewise, it's useful to provide an alternate value during an
environment expansion, if the environment variable isn't already set.
For instance, $LD_LIBRARY_PATH will inadvertently search the current
working directory if it starts or ends with a colon, so the following
is usually wrong:
LD_LIBRARY_PATH=/foo/lib:${LD_LIBRARY_PATH}
To address that, this changes replace_env to support the following
shell compatible alternate value syntax:
We have ./configure switches for various parts of non-essential functionality,
let's add one for this new stuff too. Support for environment generators is
not conditional — if you don't want them, just don't install any.
In the future we might want to allow additional syntax (for example
"unset VAR". But let's check that the data we're getting does not contain
anything unexpected.
We have only basic compatibility with shell syntax, but specifying variables
without using braces is probably more common, and I think a lot of people would
be surprised if this didn't work.
environment-generator: new generator to peruse environment.d
Why the strange name: the prefix is necessary to follow our own advice that
environment generators should have numerical prefixes. I also put -d- in the
name because otherwise the name was very easy to mistake with
systemd.environment-generator. This additional letter clarifies that this
on special generator that supports environment.d files.
Ray Strode [Thu, 4 Aug 2016 16:00:00 +0000 (12:00 -0400)]
basic: add new merge_env_file function
merge_env_file is a new function, that's like load_env_file, but takes a
pre-existing environment as an input argument. New environment entries are
merged. Variable expansion is performed.
Falling back to the process environment is supported (when a flag is set).
Alternatively this could be implemented as passing an additional fallback
environment array, but later on we're adding another flag to allow braceless
expansion, and the two flags can be combined in one arg, so there's less
stuff to pass around.
Ray Strode [Wed, 3 Aug 2016 18:35:50 +0000 (14:35 -0400)]
basic: fix strv_env_get_n for unclean arrays
If an environment array has duplicates, strv_env_get_n returns
the results for the first match. This is wrong, because later
entries in the environment are supposed to replace earlier
entries.
man: add systemd.environment-generator(7) with two examples
v2:
- add example files to EXTRA_DIST
v3:
- rework for the new scheme where nothing is written to disk
v4:
- use separate dirs for system and user env generators
Environment file generators are a lot like unit file generators, but not
exactly:
1. environment file generators are run for each manager instance, and their
output is (or at least can be) individualized.
The generators themselves are system-wide, the same for all users.
2. environment file generators are run sequentially, in priority order.
Thus, the lifetime of those files is tied to lifecycle of the manager
instance. Because generators are run sequentially, later generators can use or
modify the output of earlier generators.
Each generator is run with no arguments, and the whole state is stored in the
environment variables. The generator can echo a set of variable assignments to
standard output:
VAR_A=something
VAR_B=something else
This output is parsed, and the next and subsequent generators run with those
updated variables in the environment. After the last generator is done, the
environment that the manager itself exports is updated.
Each generator must return 0, otherwise the output is ignored.
The generators in */user-env-generator are for the user session managers,
including root, and the ones in */system-env-generator are for pid1.
env-util,fileio: immediately replace variables in load_env_file_push()
strv_env_replace was calling env_match(), which in effect allowed multiple
values for the same key to be inserted into the environment block. That's
pointless, because APIs to access variables only return a single value (the
latest entry), so it's better to keep the block clean, i.e. with just a single
entry for each key.
Add a new helper function that simply tests if the part before '=' is equal in
two strings and use that in strv_env_replace.
In load_env_file_push, use strv_env_replace to immediately replace the previous
assignment with a matching name.
Afaict, none of the callers are materially affected by this change, but it
seems like some pointless work was being done, if the same value was set
multiple times. We'd go through parsing and assigning the value for each
entry. With this change, we handle just the last one.
core/manager: move environment serialization out to basic/env-util.c
This protocol is generally useful, we might just as well reuse it for the
env. generators.
The implementation is changed a bit: instead of making a new strv and freeing
the old one, just mutate the original. This is much faster with larger arrays,
while in fact atomicity is preserved, since we only either insert the new
entry or not, without being in inconsistent state.
basic/exec-util: add support for synchronous (ordered) execution
The output of processes can be gathered, and passed back to the callee.
(This commit just implements the basic functionality and tests.)
After the preparation in previous commits, the change in functionality is
relatively simple. For coding convenience, alarm is prepared *before* any
children are executed, and not before. This shouldn't matter usually, since
just forking of the children should be pretty quick. One could also argue that
this is more correct, because we will also catch the case when (for whatever
reason), forking itself is slow.
Three callback functions and three levels of serialization are used:
- from individual generator processes to the generator forker
- from the forker back to the main process
- deserialization in the main process
v2:
- replace an structure with an indexed array of callbacks
core/manager: split out creation of serialization fd out to a helper
There is a slight change in behaviour: the user manager for root will create a
temporary file in /run/systemd, not /tmp. I don't think this matters, but
simplifies implementation.
sd-device: replace lstat() + open() with open(O_NOFOLLOW)
Coverity was complaining about TOCTOU (CID #745806). Indeed, it seems better
to open the file and avoid the stat altogether:
- O_NOFOLLOW means we'll get ELOOP, which we can translate to EINVAL as before,
- similarly, open(O_WRONLY) on a directory will fail with EISDIR,
- and finally, it makes no sense to check access mode ourselves: just let
the kernel do it and propagate the error.
systemctl: give a hint about --force --force when communication with manager fails
The hint is not too explicit, and just refers to the man page, because this
option is slightly dangereous. This was we don't have to discuss the limitation
in the hint itself.
"systemctl --user edit --force --full tmp.mount" would crash, when we'd do
basename(NULL). Fix this by creating a new unit or a new override even if
not path is found.
journalctl: add reference to sd-id128(3) to output (#5382)
SD_ID128_MAKE is clearly not a standard C macro, so let’s point the user
to its documentation to let them know which header they need and what
they can then do with MESSAGE_XYZ.
nspawn: tweak check whether resolved is around a bit
Let's check D-Bus instead of files in /run to see if resolved is
running. This is a bit nicer as bus names are automatically cleaned up
when resolved dies, which is not the case for files in /run.
Martin Pitt [Fri, 17 Feb 2017 20:29:02 +0000 (21:29 +0100)]
test: re-drop assumption that /run is a mount point (#5377)
Commit 436e916ea introduced the assumption into test-stat-util that /run
is a tmpfs mount point. This is not the case in build chroots such as
Fedora's mock or Debian's sbuild. So only assert that /run is a tmpfs
and not a btrfs if /run is actually a mount point. This will then still
be asserted with installed tests.
udev: fix id_net_name_path for virtio-ccw interfaces (#5357)
The CCW id_net_name_path detection didn't account for virtio
interfaces on the CCW bus. As a result the default interface
names for virtio-ccw interfaces would use the old eth<x>
format instead of enc<busid>.
Since virtio-pci interface naming follows the naming rules
of the parent bus, the names_ccw() logic was changed to apply
the CCW interface naming rules to virtio interfaces as well,
e.g. enc2000 for an interface with a CCW bus id 0.0.2000.
As virtio interfaces are apt to get the otherwise unusual
CCW bus id 0.0.0000, the last '0' is now preserved in this
case.
The virtio subsystem skipping loop has been moved from
names_pci() into a function skip_virtio() that can be reused
for all bus types with virtio network devices.
Since virtio-ccw interfaces use single CCW addresses the ccwgroup
requirement was relaxed and the C definitions were changed
accordingly.
network: change condition in if testing section presence
section_line and filename should be set together or not at all. Change the
if to test filename, since it's the first of the pair and it seems more natural
to test that.
networkd: immediately transfer ownership of route->section
The code was not incorrect previously, but I think it's easier to follow the
ownership (and the code is more likely to remain correct when updated later on),
if freeing of NetworkConfigSection* is immediately made the responsibility of
route_free(), so instead of relying on route_free() not freeing ->section
if adding to the network hashmap failed, make this freeing unconditional.
coredump: when reconstructing original kernel coredump context, chop off trailing zeroes
Our coredump handler operates on a "context" supplied by the kernel via
the core_pattern arguments. When we pass off a coredump for processing
to coredumpd we pass along enough information for this context to be
reconstructed. This information is passed in the usual journal fields,
and that means we extended the 1s granularity timestamp to 1µs
granularity by appending 6 zeroes. We need to chop them off again when
reconstructing the original kernel context.
udevd: use signal_to_string() instead of strsignal() at one place
strsignal() sucks, as it tries to generate human readable strings from
something that isn't really human readable by concept. Let's use
signal_to_string() instead, making this more grokkable. Difference is:
SIGINT gets translated → "SIGINT" rather than → "Interrupted".
(Note that we only do this for the journal metadata, not for the xattrs,
as the xattrs are only supposed to store the original 1:1 info we
acquired from the kernel.)
resolved: try to authenticate SOA on negative replies
For caching negative replies we need the SOA TTL information. Hence,
let's authenticate all auxiliary SOA RRs through DS requests on all
negative requests.
1) The first UDP retry we increase 500ms → 750ms. This is a good idea,
since some servers need relatively long responses for trivial lookups,
and giving up our first attempt also has the effect of trying a
different server for the next attempt which has the side effect that
we'll run two down-grade iterations in parallel, on both servers.
Hence, let's give servers a bit more time in the first iteration.
2) Permit 24 retries instead of just 16 per transactions. If we end up
downgrading all the way down to UDP for a lookup we already need 5
iterations for that. If we want permit a couple of lost packages for
each (let's say 4), then we already need 20 iterations.
3) Increase the overall query timeout on the service side to 60s (from
45s), simply because very long and slow DNSSEC + CNAME chains (such as
us.ynuf.alipay.com) hit this boundary too easily. The client side
timeout for the bus method call is increased to 90s, in order to have
room for the dbus reply to go through
resolved: initialize all return values on successful exit of dns_cache_lookup()
Following our coding style on success we should initialize all return
parameters of a function. We missed to cases for dns_cache_lookup() (but
covered all others), fix them too.
resolved: don't downgrade feature level if we get RCODE on UDP level
Retrying a transaction via TCP is a good approach for mitigating
packet loss. However, it's not a good away way to fix a bad RCODE if we
already downgraded to UDP level for it. Hence, don't do this.
This is a small tweak only, but shortens the time we spend on
downgrading when a specific domain continously returns a bad rcode.
Some domains (such as us.ynuf.alipay.com) almost appear as if they actively
want to sabotage our DNSSEC work. Specifically, they unconditionally
return SERVFAIL on SOA lookups and always only after a 1s delay (at
least). This is pretty bad for our validation logic, as we use SOA
lookups to distuingish zones from non-terminal names. Moreover, SERVFAIL
is an error that is typically returned if we send requests a server
doesn't grok, and thus is reason for us to downgrade our protocol and
try again. In case of these zones this means we'll accept the SERVFAIL
response only after a full iterative downgrade to our lowest feature
level: TCP. In combination with the 1s delays this has the effect of
making us hit our transaction timeout way to easily.
As first attempt to improve the situation: let's start caching SERVFAIL
responses in our cache, after the full downgrade for a short period of
time.
Conceptually this is exposed as "weird rcode" caching, but for now we
only consider SERVFAIL a "weird rcode" worthy of caching. Later on we
might want to add more.
When we are doing a TCP transaction the kernel will automatically resend
all packets for us, there's no need to do that ourselves. Hence:
increase the timeout for TCP transactions substantially, to give the
kernel enough time to connect to the peer, without interrupting it when
we become impatient.
resolved: when accepted a query candidate as final answer, propagate authentication bool even on failure
Let's make sure that if we accept a query candidate, then let's also
propagate the authenticated flag for it, so that we can properly report
back to the clients whether lookups failed due to non-existance that can
be proven.
resolved: when the dns server feature level grace period elapses, flush caches
The cache might contain all kinds of unauthenticated data that we really
shouldn't be using if we upgrade our feature level and suddenly are able
to get authenticated data again.
For the wildcard NSEC check we need to generate an "asterisk" domain, by
prepend the common ancestor with "*.". So far we did that with a simple
strappenda() which is fine for most domains, but doesn't work if the
common ancestor is the root domain as we usually write that as "." in
normalized form, and "*." joined with "." is "*.." and not "*." as it
should be.
Hence, use the clean way out, let's just use dns_name_concat() which
only exists precisely for this reason, to properly concatenate labels.
There's a good chance this actually fixes #5029, as this NSEC proof is
triggered by lookups in the TLD "example", which doesn't exist in the
Internet.
resolved: make sure configured NTAs affect subdomains too
This ensures that configured NTAs exclude not only the listed domain but
also all domains below it from DNSSEC validation -- except if a positive
trust anchor is defined below (as suggested by RFC7647, section 1.1)
machined: when copying files from/to userns containers chown to root
This changes the file copy logic of machined to set the UID/GID of all
copied files to 0 if the host and container do not share the same user
namespace.
copy: change the various copy_xyz() calls to take a unified flags parameter
This adds a unified "copy_flags" parameter to all copy_xyz() function
calls, replacing the various boolean flags so far used. This should make
many invocations more readable as it is clear what behaviour is
precisely requested. This also prepares ground for adding support for
more modes later on.
machinectl: tweak address output in "machinectl status"
With this change we'll not show an "Addresses" field for machines that
we don't know any addresses for.
This changes print_addresses() to never suffix its output with a
newline, leaving that to the caller. That's a good idea since depending
on who the caller is, different rules apply: if no addresses are found,
then the list view still wants a newline, but the status view does not.
This also changes the function to return the number of found addresses,
which can be used to decide when to add a newline or not.
machined: expose "UID shift" concept for containers
UID/GID mapping with userns can be arbitrarily complex. Let's break this
down to a single admin-friendly parameter: let's expose the UID/GID
shift of a container via a new bus call for each container, and let's
show this as part of "machinectl status" if it is not 0.
This should work for pretty much all real-life full OS container setups
(i.e. the stuff machined is suppose to be useful for). For everything
else we generate a clean error, clarifying that we can't expose the
mapping.
resolved: default to the compile-time fallback hostname
This changes resolved to use the compile-time fallback hostname the
configured one is not set. Note that if the local hostname is set to
"localhost" then we'll instead default to "linux" here, as for
mDNS/LLMNR exposing "localhost" is actively dangerous.
hostname-util: default to the compile time default hostname in gethostname_malloc()
Currently, if the hostname is not set gethostname_malloc() defaults to
the "sysname", which is "linux" on Linux. Let's change that to also
honour the compile-time fallback hostname as specified on the configure
command line.
Martin Pitt [Wed, 15 Feb 2017 22:37:25 +0000 (23:37 +0100)]
test: drop TEST_DATA_DIR, fold into get_testdata_dir()
Drop the TEST_DATA_DIR macro as this was using alloca() within a
function call which is allegedly unsafe. So add a "suffix" argument to
get_testdata_dir() instead and call that directly.
Martin Pitt [Wed, 15 Feb 2017 07:52:17 +0000 (08:52 +0100)]
test: show error message if $SYSTEMD_TEST_DATA does not exist
Rename get_exe_relative_testdata_dir() to get_testdata_dir() and move
the env var check into that, so that everything interesting happens at
the same place.
tests: look for tests relative to source dir when running from build dir
automake helpfully sets a few variables for during build. When our executable
is in a directory underneath $(abs_top_builddir), we know that we're in the
build environment $(abs_top_srcdir) contains the sources, and test data is
under $(abs_top_srcdir)/test. This remains true no matter where the build
directory is relative to the source directory. It also works if the test
executable is invoked as ./test-whatever or .libs/test-whatever, since the
relative path is not used at all.
When running from outside of the build directory, we should be running from the
installed location and we can look for ../testdata relative to the location of
the exe file.
Of course, $SYSTEMD_TEST_DATA always overrides this logic.