Martin Pitt [Tue, 22 Nov 2016 07:41:51 +0000 (08:41 +0100)]
networkd: allow networkd to start in early boot
With the previous improvements, networkd.service's "After=dbus.service" can now
be dropped. That ordering effectively forced networkd.service to run in late
boot only (running dbus.service in early boot was rejected in
https://bugs.freedesktop.org/show_bug.cgi?id=98254).
Martin Pitt [Tue, 22 Nov 2016 07:36:20 +0000 (08:36 +0100)]
networkd: set DHCP-acquired timezone and hostname after connecting to D-Bus
If setting the received timezone or transient hostname fails because D-Bus is
not (yet) up, store the data in the Manager object and try again after
connecting to D-Bus.
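The store-and-retry behaviour described above can be sketched roughly as follows. This is an illustration in Python, not networkd's C code: the Manager class, its field names and the two setter callables are all hypothetical, and the sketch checks a "connected" flag instead of reacting to a failed D-Bus call.

```python
class Manager:
    """Hypothetical stand-in for networkd's Manager object (all names assumed)."""

    def __init__(self, set_timezone, set_hostname):
        self.bus_connected = False
        self.pending_timezone = None   # value received while D-Bus was down
        self.pending_hostname = None
        self._set_timezone = set_timezone
        self._set_hostname = set_hostname

    def got_timezone(self, tz):
        """Apply the timezone now, or remember it if D-Bus is not up yet."""
        if not self.bus_connected:
            self.pending_timezone = tz
            return False
        self._set_timezone(tz)
        return True

    def got_hostname(self, name):
        """Apply the transient hostname now, or remember it for later."""
        if not self.bus_connected:
            self.pending_hostname = name
            return False
        self._set_hostname(name)
        return True

    def on_bus_connected(self):
        """Retry whatever was stored while D-Bus was unavailable."""
        self.bus_connected = True
        if self.pending_timezone is not None:
            self._set_timezone(self.pending_timezone)
            self.pending_timezone = None
        if self.pending_hostname is not None:
            self._set_hostname(self.pending_hostname)
            self.pending_hostname = None
```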
Martin Pitt [Tue, 22 Nov 2016 07:05:18 +0000 (08:05 +0100)]
networkd: allow networkd to set the timezone in timedated
systemd-networkd runs as user "systemd-network" and thus is not privileged to
set the timezone acquired from DHCP:
systemd-networkd[4167]: test_eth42: Could not set timezone: Interactive authentication required.
Similarly to commit e8c0de912, add a polkit rule to grant
org.freedesktop.timedate1.set-timezone to the "systemd-network" system user.
Move the polkit rules from src/hostname/ to src/network/ to avoid too many
small distributed policy snippets (there might be more in the future), as it's
easier to specify the privileges for a particular subject in this case.
Add NetworkdClientTest.test_dhcp_timezone() test case to verify this (for
all people except those in Pacific/Honolulu, there the test doesn't prove
anything -- sorry ☺ ).
build-sys: check for lz4 in the old and new numbering scheme (#4717)
lz4 upstream decided to switch to an incompatible numbering scheme
(1.7.3 follows 131, to match the so version).
PKG_CHECK_MODULES does not allow two version matches for the same package,
so e.g. lz4 < 10 || lz4 >= 125 cannot be used. Check twice, once for
"new" numbers (anything below 10 is assumed to be new), once for the "old"
numbers (anything >= 125). This assumes that the "new" versioning
will not get to 10 too quickly. I think that's a safe assumption, lz4 is a
mature project.
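The combined condition that the two separate checks stand in for can be written out as a small Python sketch. The function name is made up for illustration, and comparing only the leading version component is an assumption made to keep the example short; the real check is done by two PKG_CHECK_MODULES calls in the build system.

```python
def lz4_version_acceptable(version: str) -> bool:
    """Return True if the version is acceptable under either numbering scheme.

    Only the first dot-separated component is compared (an assumption made
    for this sketch).
    """
    major = int(version.split(".")[0])
    new_scheme = major < 10      # e.g. "1.7.3"
    old_scheme = major >= 125    # e.g. "131"
    return new_scheme or old_scheme
```

With this, "1.7.3" and "131" are both accepted, while an old-scheme release such as "124" is rejected.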
Janne Heß [Wed, 23 Nov 2016 04:19:56 +0000 (05:19 +0100)]
Document an edge-case with resume and mounting (#4581)
When trying to read keyfiles from an encrypted partition to unlock the swap,
a cyclic dependency is generated because systemd can not mount the
filesystem before it has checked if there is a swap to resume from.
nspawn: add fallback to normal copy/reflink when we cannot btrfs snapshot
Given that other file systems (notably: xfs) support reflinks these days, let's
extend the file system snapshotting logic to fall back to plain copies or
reflinks when full btrfs subvolume snapshots are not available.
This essentially makes "systemd-nspawn --ephemeral" and "systemd-nspawn
--template=" available on non-btrfs subvolumes. Of course, both operations will
still be slower on non-btrfs than on btrfs (simply because reflinking each file
individually in a directory tree is still slower than doing this in one step
for a whole subvolume), but it's probably good enough for many cases, and we
should provide the users with the tools, they have to figure out what's good
for them.
Note that "machinectl clone" already had a fallback like this in place, this
patch generalizes this, and adds similar support to our other cases.
When mounting a loopback image, we need a temporary root directory we can mount
stuff to. Make sure to actually remove it when exiting, so that we don't leave
stuff around in /tmp unnecessarily.
Previously --ephemeral was only supported with container trees in btrfs
subvolumes (i.e. in combination with --directory=). This adds support for
--ephemeral in conjunction with disk images (i.e. --image=) too.
As side effect this fixes that --ephemeral was accepted but ignored when using
-M on a container that turned out to be an image.
@filesystem groups various file system operations, such as opening files and
directories for read/write and stat()ing them, plus renaming, deleting,
symlinking, hardlinking.
This changes the return value a bit: 1 will be returned if the value is
changed. But the return value was not documented, and the change should
be for the good anyway. Current callers don't care.
networkd: do not automatically propagate bogus DNS/NTP servers
Never propagate DNS/NTP servers on the local link via the DHCP server. The
DNS/NTP servers 0.0.0.0 and 127.0.0.1 only make sense in the local context,
hence never propagate them automatically to other hosts.
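A rough Python sketch of such a filter, using the standard ipaddress module. The function name is hypothetical, and treating every unspecified or loopback address (not just 0.0.0.0 and 127.0.0.1) as non-propagatable is an assumption of this sketch, not something the commit message states.

```python
import ipaddress


def propagatable_servers(servers):
    """Drop addresses that only make sense on the local host.

    Assumption: any unspecified or loopback address is skipped.
    """
    result = []
    for server in servers:
        addr = ipaddress.ip_address(server)
        if addr.is_unspecified or addr.is_loopback:
            continue  # meaningless to other hosts on the link
        result.append(server)
    return result
```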
networkd: store DNS servers configured per-network as parsed addresses
DNS servers must be specified as IP addresses, hence let's store them as that
internally, so that they are guaranteed to be fully normalized always, and
invalid data cannot be stored.
Let's reorder them a bit, so that stuff that belongs together semantically is
placed together (in particular, move the various DHCP "use" booleans together).
This adds in4_addr_is_localhost() and in4_addr_is_link_local() that only take
an IPv4 "struct in_addr", to match in_addr_is_localhost() and
in_addr_is_link_local() that take a "union in_addr_union".
This matches the existing in4_addr_is_null() call that already exists.
For IPv6 glibc already exports a set of macros, hence we don't add similar
functions in6_addr_is_localhost(). We also drop in6_addr_is_null() as
IN6_IS_ADDR_UNSPECIFIED() already provides that.
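For illustration, the two IPv4 checks can be expressed in Python on a packed 4-byte address. The names are borrowed from the C helpers mentioned above, but this is only a sketch of the bit tests (127.0.0.0/8 for localhost, 169.254.0.0/16 for link-local), not the systemd implementation.

```python
import socket
import struct


def in4_addr_is_localhost(packed: bytes) -> bool:
    """True if the packed IPv4 address lies in 127.0.0.0/8."""
    (host_order,) = struct.unpack("!I", packed)
    return (host_order & 0xFF000000) == 0x7F000000


def in4_addr_is_link_local(packed: bytes) -> bool:
    """True if the packed IPv4 address lies in 169.254.0.0/16."""
    (host_order,) = struct.unpack("!I", packed)
    return (host_order & 0xFFFF0000) == 0xA9FE0000
```

Both functions take the 4-byte form returned by socket.inet_aton(), which plays the role of "struct in_addr" here.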
Martin Pitt [Fri, 18 Nov 2016 15:17:01 +0000 (16:17 +0100)]
hostnamed: allow networkd to set the transient hostname
systemd-networkd runs as user "systemd-network" and thus is not privileged to
set the transient hostname:
systemd-networkd[516]: ens3: Could not set hostname: Interactive authentication required.
Standard polkit *.policy files do not have a syntax for granting privileges to
a user, so ship a pklocalauthority (for polkit < 106) and a JavaScript rules
file (for polkit >= 106) that grants the "systemd-network" system user that
privilege.
Add DnsmasqClientTest.test_transient_hostname() test to networkd-test.py to
cover this. Make do_test() a bit more flexible by interpreting "coldplug==None"
as "test sets up the interface by itself". Change DnsmasqClientTest to set up
test_eth42 with a fixed MAC address so that we can configure dnsmasq to send a
special host name for that.
Add MSI VR420 (model MS-1422) to the list of MSI models which need to
ignore brightness hotkey presses, as these are already reported through
the acpi-video interface.
This commit adds the possibility to leave /sys, and /proc/sys read-write.
It introduces a new (undocumented) env var SYSTEMD_NSPAWN_API_VFS_WRITABLE
to enable this feature.
If set to "yes", /sys, and /proc/sys will be read-write.
If set to "no", /sys, and /proc/sys will be read-only.
If set to "network" /proc/sys/net will be read-write. This is useful in
use-cases, where systemd-nspawn is used in an external network
namespace.
This adds the possibility to start privileged containers which need more
control over settings in the /proc, and /sys filesystem.
This is also a follow-up on the discussion from
https://github.com/systemd/systemd/pull/4018#r76971862 where an
introduction of a simple env var to enable R/W support for those
directories was already discussed.
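The three values of SYSTEMD_NSPAWN_API_VFS_WRITABLE described above could be interpreted along these lines. This is a Python sketch only: the function name is hypothetical, treating an unset variable like "no" is an assumption, and the real parser may accept more spellings than the three literal values listed in this commit message.

```python
import os


def writable_api_vfs_paths(environ=os.environ):
    """Return the set of API VFS paths that should stay read-write."""
    # Assumption: an unset variable behaves like "no".
    value = environ.get("SYSTEMD_NSPAWN_API_VFS_WRITABLE", "no")
    if value == "yes":
        return {"/sys", "/proc/sys"}
    if value == "no":
        return set()
    if value == "network":
        return {"/proc/sys/net"}
    raise ValueError("unsupported value: %r" % value)
```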
basic/process-util: we need to take the shorter of two strings
==30496== Conditional jump or move depends on uninitialised value(s)
==30496== at 0x489F654: memcmp (vg_replace_strmem.c:1091)
==30496== by 0x49BF203: getenv_for_pid (process-util.c:678)
==30496== by 0x4993ACB: detect_container (virt.c:442)
==30496== by 0x182DFF: test_get_process_comm (test-process-util.c:98)
==30496== by 0x185847: main (test-process-util.c:368)
==30496==
Franck Bui [Mon, 7 Nov 2016 16:14:59 +0000 (17:14 +0100)]
core: limit the length of the confirmation question
When "confirmation_spawn=1", the confirmation question can look like:
Execute /usr/bin/kmod static-nodes --format=tmpfiles --output=/run/tmpfiles.d/kmod.conf? [Yes, No, Skip]
which is pretty verbose and might not fit in the console width (which is
usually 80 chars), and thus the question will be split across 2 consecutive lines.
However since the question is now refreshed every 2 secs, the reprinted
question will overwrite the second line of the previous one...
To prevent this, this patch makes sure that the command line won't be longer
than 60 chars by ellipsizing it if the command is longer:
Execute /usr/bin/kmod static-nodes --format=tmpfiles --output=/ru…nf? [Yes, No, View, Skip]
A following patch will introduce a new choice that will allow the user to get
details on the command to be executed so it will still be possible to see the
full command line.
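A minimal Python sketch of ellipsizing a command line to a fixed width. The function name, the "percent" parameter and the way the budget is divided between head and tail are assumptions for illustration; the split will not necessarily match the example output shown above.

```python
def ellipsize(text: str, width: int = 60, percent: int = 90) -> str:
    """Shorten text to at most `width` characters, eliding the middle.

    `percent` (an assumed parameter) is the share of the remaining
    budget given to the head of the string.
    """
    if len(text) <= width:
        return text
    keep = width - 1              # one column is used by the ellipsis
    head = keep * percent // 100
    tail = keep - head
    suffix = text[-tail:] if tail > 0 else ""
    return text[:head] + "…" + suffix
```

Strings that already fit are returned unchanged, so short commands are still printed in full.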
Franck Bui [Mon, 7 Nov 2016 16:14:59 +0000 (17:14 +0100)]
core: reprint the question every 2 sec in ask_char()
ask_char() now reprints the question every 2sec automatically.
It prefixes its output with '\r' to bring the cursor to the
beginning of the terminal line, and then prints the message, redoing it
every 2sec.
As long as nothing interferes with our output this logic will have no
visible effect as we constantly overprint the visible text with the
exact same text.
However, if something is dumped in the middle, then our question won't
get lost, as we'll ask soon again.
This is useful if the question is asked to a terminal that is also
used to dump some other status messages/logs. For example when
confirmation messages are enabled during the boot
(systemd.confirm_spawn=1), the question can easily be lost if the
kernel logs are also enabled and both use the same console.
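The reprint loop can be sketched in Python as below. This is an illustration of the idea, not the C implementation: the signature is made up, input is read from a plain file descriptor, and putting a real terminal into a mode where single key presses are delivered immediately is left out.

```python
import os
import select


def ask_char(question, in_fd, out, interval=2.0):
    """Print `question`, reprinting it every `interval` seconds until a
    character can be read from `in_fd`. Returns that character."""
    while True:
        # '\r' moves the cursor to the start of the line, so an undisturbed
        # question is simply overprinted with identical text.
        out.write("\r" + question)
        out.flush()
        ready, _, _ = select.select([in_fd], [], [], interval)
        if ready:
            data = os.read(in_fd, 1)
            if not data:
                raise EOFError("input closed while waiting for an answer")
            return data.decode(errors="replace")
```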
Franck Bui [Wed, 2 Nov 2016 12:51:02 +0000 (13:51 +0100)]
core: rework ask_for_confirmation()
Now the responses are handled by ask_for_confirmation() as well as the reporting of
any errors occurring during the process of retrieving the confirmation response.
One benefit of this is that there's no need to open/close the console one more
time when reporting error/status messages.
The caller now just needs to care about the return values whose meanings are:
- don't execute and pretend that the command failed
- don't execute and pretend that the command succeeded
- positive answer, execute the command
Also some slight code reorganization, and introduce write_confirm_error() and
write_confirm_error_fd(). write_confirm_message() becomes unneeded.
Franck Bui [Wed, 2 Nov 2016 09:38:22 +0000 (10:38 +0100)]
core: allow to redirect confirmation messages to a different console
It's rather hard to parse the confirmation messages (enabled with
systemd.confirm_spawn=true) amongst the status messages and the kernel
ones (if enabled).
This patch gives the possibility to the user to redirect the confirmation
message to a different virtual console, either by giving its name or its path,
so those messages are separated from the other ones and easier to read.
Franck Bui [Wed, 2 Nov 2016 09:50:20 +0000 (10:50 +0100)]
core: prevent the cylon when confirmation_spawn=yes (#2194)
When booting with systemd.confirm_spawn=true, the eye of cylon
animation kicks in pretty quickly, so the user doesn't have any chance to
answer the questions about which services to start before the confirmation
message is overwritten by the cylon.
This basically breaks the confirm_spawn functionality completely.
This patch prevents the cylon animation from kicking in when
confirmation_spawn=yes.
namespace: simplify, optimize and extend handling of mounts for namespace
This changes a couple of things in the namespace handling:
It merges the BindMount and TargetMount structures. They are mostly the same,
hence let's just use the same structure, and rely on C's implicit zero
initialization of partially initialized structures for the unneeded fields.
This reworks memory management of each entry a bit. It now contains one "const"
and one "malloc" path. We use the former whenever we can, but use the latter
when we have to, which is the case when we have to chase symlinks or prefix a
root directory. This means in the common case we don't actually need to
allocate any dynamic memory. To make this easy to use we add an accessor
function bind_mount_path() which retrieves the right path string from a
BindMount structure.
While we are at it, also permit "+" as prefix for dirs configured with
ReadOnlyPaths= and friends: if specified the root directory of the unit is
implicitly prefixed.
This also drops set_bind_mount() and uses C99 structure initialization instead,
which I think is more readable and clarifies what is being done.
This drops append_protect_kernel_tunables() and
append_protect_kernel_modules() as append_static_mounts() is now simple enough
to be called directly.
Prefixing with the root dir is now done in an explicit step in
prefix_where_needed(). It will prepend the root directory on each entry that
doesn't have it prefixed yet. The latter is determined depending on an extra
bit in the BindMount structure.
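To illustrate the "const path plus optional allocated path" idea and the explicit prefixing step, here is a loose Python model. Python has no const/malloc distinction, so the two fields only mimic the C layout; all field names are assumptions, and only bind_mount_path() and prefix_where_needed() take their names from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BindMount:
    path_const: str                      # static string, never modified
    path_malloc: Optional[str] = None    # set only when a new string is needed
    has_prefix: bool = False             # the "extra bit": root already prepended


def bind_mount_path(m: BindMount) -> str:
    """Return whichever path is currently in effect for this entry."""
    return m.path_malloc if m.path_malloc is not None else m.path_const


def prefix_where_needed(mounts: List[BindMount], root: Optional[str]) -> None:
    """Prepend the root directory to every entry that lacks it."""
    if not root:
        return
    for m in mounts:
        if m.has_prefix:
            continue
        m.path_malloc = root.rstrip("/") + "/" + bind_mount_path(m).lstrip("/")
        m.has_prefix = True
```

In the common case without a root directory no entry ever gets a second string, which mirrors the "no dynamic memory needed" point made above.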
Merge pull request #4678 from poettering/gc-device
Automatically GC device jobs when there's no need to keep them in the job queue anymore.
Implement systemctl list-jobs --before/--after.
Allow systemd-run -p After/Before/Wants/Requires= ...
systemctl: shorten list-jobs --before/--after output a bit
(before)$ systemctl list-jobs --before --after
JOB UNIT TYPE STATE
8769 foobar.device start running
A job waits for this job: 8669 (run-rb6da596d0cfa4e36b7c594cd973e795a.service/start)
8669 run-rb6da596d0cfa4e36b7c594cd973e795a.service start waiting
This job waits for a job: 8769 (foobar.device/start)
2 jobs listed.
(after)$ systemctl list-jobs --before --after
JOB UNIT TYPE STATE
8769 foobar.device start running
waiting for job 8669 (run-rb6da596d0cfa4e36b7c594cd973e795a.service/start)
8669 run-rb6da596d0cfa4e36b7c594cd973e795a.service start waiting
blocking job 8769 (foobar.device/start)
systemctl: add env var to force connection to system manager via the bus
Sometimes it is useful for debugging purposes to force systemctl to connect to
PID 1 via the bus instead of direct connection, even if the direct connection
is possible.
In contrast to all other unit types device units when queued just track
external state, they cannot effect state changes on their own. Hence unless a
client or other job waits for them there's no reason to keep them in the job
queue. This adds a concept of GC'ing jobs of this type as soon as no client or
other job waits for them anymore.
To ensure this works correctly we need to track which clients actually
reference a job (i.e. which ones enqueued it). Unfortunately that's pretty
nasty to do for direct connections, as sd_bus_track doesn't work for
them. For now, work around this, by simply remembering in a boolean that a job
was requested by a direct connection, and reset it when we notice the direct
connection is gone. This means the GC logic works fine, except that jobs are
not immediately removed when direct connections disconnect.
In the longer term, a rework of the bus logic should fix this properly. For now
this should be good enough, as GC works fine for all cases except this one, and
thus is a clear improvement over the previous behaviour.
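The GC condition described above boils down to a predicate like the following. This is a Python sketch under stated assumptions: the Job fields are hypothetical names, and the actual check in PID 1 may consider further conditions not mentioned in this message.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Job:
    unit_type: str                                         # e.g. "device", "service"
    bus_clients: Set[str] = field(default_factory=set)     # tracked bus peers
    requested_by_direct_connection: bool = False           # the boolean workaround
    waiting_jobs: Set[int] = field(default_factory=set)    # ids of jobs waiting on us


def job_may_gc(job: Job) -> bool:
    """True if nothing references the job and it cannot change state itself."""
    if job.unit_type != "device":
        return False   # other unit types can effect state changes on their own
    if job.bus_clients or job.requested_by_direct_connection:
        return False   # some client still cares about the result
    if job.waiting_jobs:
        return False   # another job is ordered after this one
    return True
```

The direct-connection flag is only cleared once the disconnect is noticed, which is why such jobs are not removed immediately.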
core: rename "clients" field of Job structure to "bus_track"
Let's make semantics of this field more similar to the same functionality in
the Unit object, in particular as we add new functionality to it later on.
Djalal Harouni [Mon, 14 Nov 2016 08:12:21 +0000 (09:12 +0100)]
core:gperf: pass the exec_context struct directly to parse restrict namespaces
RestrictNamespaces= takes yes, no or a list of namespace types,
therefore config_parse_restrict_namespaces() is a bit complex and it
operates on the ExecContext. Fix this by passing the offset of
ExecContext directly, otherwise restricting namespaces won't work.
Djalal Harouni [Tue, 15 Nov 2016 09:15:27 +0000 (10:15 +0100)]
core: improve the logic that implies no new privileges
The no_new_privileged_set variable is not used any more since commit
9b232d3241fcfbf60af that fixed another thing. So remove it. Also no
need to check if we are under user manager, remove that part too.
nspawn: restart the whole systemd-nspawn@.service unit on container reboot (#4613)
Since 133 is now used in a few places, add a #define for it.
Also make the status message a bit more informative.
Another issue introduced in b006762. The logic was borked, we were supposed
to return 0 to break the loop, and 133 to restart the container, not the other
way around.
But this doesn't seem to work, reboot fails with:
Nov 08 00:41:32 laptop systemd-nspawn[26564]: Failed to register machine: Machine 'fedora-rawhide' already exists
So actually the version before this patch worked better, since 133 > 0 and we'd
at least loop internally.
build-sys: do not install ctrl-alt-del.target symlink twice
It was a harmless but pointless duplication. Fixes #4655.
Note: in general we try to install as little as possible in
/etc/systemd/{system,user}. We only install .wants links there for units which
are "user configurable", i.e. which have an [Install] section. Most of our units
and aliases are not user configurable, do not have an [Install] section, and
must be symlinked statically during installation. A few units do have an
[Install] section, and are enabled through symlinks in /etc/ during
installation using GENERAL_ALIASES. It *would* be possible to not create those
symlinks, and instead require 'systemctl preset' to be invoked after
installation, but GENERAL_ALIASES works well enough.