Tom Gundersen [Fri, 13 Nov 2015 14:05:58 +0000 (15:05 +0100)]
networkd: check explicit state rather than link->network
When deserializing we can now have an attached network without the various clients yet
having been configured. Hence, don't misused the link->network as a check to determine
if a link is ready to be used, but check the state explicitly.
networkd: stop managing per-interface IP forwarding settings
As it turns out the kernel does not support per-interface IPv6 packet
forwarding controls (unlike as it does for IPv4), but only supports a
global option (#1597). Also, the current per-interface management of the
setting isn't really useful, as you want it to propagate to at least one
more interface than the one you configure it on. This created much grief
(#1411, #1808).
Hence, let's roll this logic back and simplify this again, so that we
can expose the same behaviour on IPv4 and IPv6 and things start to work
automatically again for most folks: if a network with this setting set
is set up we propagate the setting into the global setting, but this is
strictly one-way: we never reset it again, and we do nothing for network
interfaces where this setting is not enabled.
networkd: rearrange checks when to write something into sysctl a bit
Move check whether ipv6 is available into link_ipv6_privacy_extensions()
to keep it as internal and early as possible.
Always check if there's a network attached to a link before we apply
sysctls. We do this for most of the sysctl functions already, with this
change we do it for all.
util-lib: optionally, when writing a string to a file, verify string on failure
With this change, the idiom:
r = write_string_file(p, buf, 0);
if (r < 0) {
if (verify_one_line_file(p, buf) > 0)
r = 0;
}
gets reduced to:
r = write_string_file(p, buf, WRITE_STRING_FILE_VERIFY_ON_FAILURE);
i.e. when writing the string fails and the new flag
WRITE_STRING_FILE_VERIFY_ON_FAILURE is specified we'll not return a
failure immediately, but check the contents of the file. If it matches
what we wanted to write we suppress the error and exit cleanly.
Michael Marineau [Fri, 13 Nov 2015 02:10:57 +0000 (18:10 -0800)]
generator: order initrd fsck-root after local-fs-pre
The initrd version of systemd-fsck-root.service must wait for
local-fs-pre.target just like systemd-fsck@.service to prevent
modifications to the filesystem prior to resuming from hibernation.
As-is my laptop routinely fails to resume due to fsck errors. The rest
of the time it is probably silently corrupting the filesystem.
Unlike normal boot, in the initrd systemd-fsck-root.service has no
special significance so it needs to be kept in sync with
systemd-fsck@.service. The name systemd-fsck-root.service is only used
to preserve state across switch-root.
nspawn: add new --network-veth-extra= switch for defining additional veth links
The new switch operates like --network-veth, but may be specified
multiple times (to define multiple link pairs) and allows flexible
definition of the interface names.
This is an independent reimplementation of #1678, but defines different
semantics, keeping the behaviour completely independent of
--network-veth. It also comes will full hook-up for .nspawn files, and
the matching documentation.
core: drop "override" flag when building transactions
Now that we don't have RequiresOverridable= and RequisiteOverridable=
dependencies anymore, we can get rid of tracking the "override" boolean
for jobs in the job engine, as it serves no purpose anymore.
While we are at it, fix some error messages we print when invoking
functions that take the override parameter.
This removes the two XyzOverridable= unit dependencies, that were
basically never used, and do not enhance user experience in any way.
Most folks looking for the functionality this provides probably opt for
the "ignore-dependencies" job mode, and that's probably a good idea.
Hence, let's simplify systemd's dependency engine and remove these two
dependency types (and their inverses).
The unit file parser and the dbus property parser will now redirect
the settings/properties to result in an equivalent non-overridable
dependency. In the case of the unit file parser we generate a warning,
to inform the user.
The dbus properties for this unit type stay available on the unit
objects, but they are now hidden from usual introspection and will
always return the empty list when queried.
This should provide enough compatibility for the few unit files that
actually ever made use of this.
core: simplify handling of %u, %U, %s and %h unit file specifiers
Previously, the %u, %U, %s and %h specifiers would resolve to the user
name, numeric user ID, shell and home directory of the user configured
in the User= setting of a unit file, or the user of the manager instance
if no User= setting was configured. That at least was the theory. In
real-life this was not ever actually useful:
- For the systemd --user instance it made no sense to ever set User=,
since the instance runs in user context after all, and hence the
privileges to change user IDs don't even exist. The four specifiers
were actually not useful at all in this case.
- For the systemd --system instance we did not allow any resolving that
would require NSS. Hence, %s and %h were not supported, unless
User=root was set, in which case they would be hardcoded to /bin/sh
and /root, to avoid NSS. Then, %u would actually resolve to whatever
was set with User=, but %U would only resolve to the numeric UID of
that setting if the User= was specified in numeric form, or happened
to be root (in which case 0 was hardcoded as mapping). Two of the
specifiers are entirely useless in this case, one is realistically
also useless, and one is pretty pointless.
- Resolving of these settings would only happen if User= was actually
set *before* the specifiers where resolved. This behaviour was
undocumented and is really ugly, as specifiers should actually be
considered something that applies to the whole file equally,
independently of order...
With this change, %u, %U, %s and %h are drastically simplified: they now
always refer to the user that is running the service instance, and the
user configured in the unit file is irrelevant. For the system instance
of systemd this means they always resolve to "root", "0", "/bin/sh" and
"/root", thus avoiding NSS. For the user instance, to the data for the
specific user.
The new behaviour is identical to the old behaviour in all --user cases
and for all units that have no User= set (or set to "0" or "root").
install: follow unit file symlinks in /usr, but not /etc when looking for [Install] data
Some distributions use alias unit files via symlinks in /usr to cover
for legacy service names. With this change we'll allow "systemctl
enable" on such aliases.
Previously, our rule was that symlinks are user configuration that
"systemctl enable" + "systemctl disable" creates and removes, while unit
files is where the instructions to do so are store. As a result of the
rule we'd never read install information through symlinks, since that
would mix enablement state with installation instructions.
Now, the new rule is that only symlinks inside of /etc are
configuration. Unit files, and symlinks in /usr are now valid for
installation instructions.
This patch is quite a rework of the whole install logic, and makes the
following addional changes:
- Adds a complete test "test-instal-root" that tests the install logic
pretty comprehensively.
- Never uses canonicalize_file_name(), because that's incompatible with
operation relative to a specific root directory.
- unit_file_get_state() is reworked to return a proper error, and
returns the state in a call-by-ref parameter. This cleans up confusion
between the enum type and errno-like errors.
- The new logic puts a limit on how long to follow unit file symlinks:
it will do so only for 64 steps at max.
- The InstallContext object's fields are renamed to will_process and
has_processed (will_install and has_installed) since they are also
used for deinstallation and all kinds of other operations.
- The root directory is always verified before use.
- install.c is reordered to place the exported functions together.
- Stricter rules are followed when traversing symlinks: the unit suffix
must say identical, and it's not allowed to link between regular units
and templated units.
- Various modernizations
- The "invalid" unit file state has been renamed to "bad", in order to
avoid confusion between UNIT_FILE_INVALID and
_UNIT_FILE_STATE_INVALID. Given that the state should normally not be
seen and is not documented this should not be a problematic change.
The new name is now documented however.
Instead, let the caller do that. Fix this by moving masked unit messages
into the caller, by returning a clear error code (ESHUTDOWN) by which
this may be detected.
Apparently, util-linux' mount command implicitly drops the smack-related
options anyway before passing them to the kernel, if the kernel doesn't
know SMACK, hence there's no point in duplicating this in systemd.
Adding 3/4th of the watchdog frequency as accuracy on top of 1/2 of the
watchdog frequency means we might end up at 5/4th of the frequency which
means we might miss the message from time to time.
journald: rework --sync/--rotate logic to use CLOCK_MONOTONIC timestamp files
Previously, we'd rely on the mtime timestamps of the touch files to see
if our sync/rotation requests were already suppressed. This means we
rely on CLOCK_REALTIME timestamps. With this patch we instead store the
CLOCK_MONOTONIC timestamp *in* the touch files, and avoid relying on
mtime.
This should make things more reliable when the clock or underlying mtime
granularity is not very good.
This also adds warning messages if writing any of the flag files fails.
3d793d29059a7ddf5282efa6b32b953c183d7a4d broke parsing of unit file
names that include backslashes, as extract_first_word() strips those.
Fix this, by introducing a new EXTRACT_RETAIN_ESCAPE flag which disables
looking at any flags, thus being compatible with the classic
FOREACH_WORD() behaviour.
For all units ensure there's an "Automatic Dependencies" section in the
man page, and explain which dependencies are automatically added in all
cases, and which ones are added on top if DefaultDependencies=yes is
set.
This is also done for systemd.exec(5), systemd.resource-control(5) and
systemd.unit(5) as these pages describe common behaviour of various unit
types.
core: simplify things a bit by checking default_dependencies boolean in callee, not caller
It's nicer to hide the check away in the various
xyz_add_default_dependencies() calls, rather than making it explicit in
the caller, and thus require deeper nesing.
core: change default deps of services to require sysinit.target instead of basic.target
With this change services by default will no longer require
basic.target, but instead only after it it via After=basic.target.
However, they will still Require= on sysinit.target.
This has the benefit that when booting into emergency mode it is
relatively safe to actviate individual services, as this will not pull
the entirety of basic.target anymore, thus avoid everything listed in
sockets.target and suchlike. However, during the usual boot no change
should be noticed.
test-execute: Clarify interaction of PassEnvironment= and MANAGER_USER
@evverx brought up that test-execute runs under MANAGER_USER which
forwards all its environment variables to the services. It turns out it
only forwards those that were in the environment at the time of manager
creation, so this test was still working.
It was still possible to attack it by running something like:
$ sudo VAR1=a VAR2=b VAR3=c ./test-execute
Prevent that attack by unsetting the three variables explicitly before
creating the manager for the test case.
Also add comments explaining the interactions with MANAGER_USER and,
while it has some caveats, this tests are still valid in that context.
Tested by checking that the test running with the variables set from the
external environment will still pass.
test-execute: Add tests for new PassEnvironment= directive
Check the base case, plus erasing the list, listing the same variable
name more than once and when variables are absent from the manager
execution environment.
Confirmed that `sudo ./test-execute` passes and that modifying the test
cases (or the values of the set variables in test-execute.c) is enough
to make the test cases fail.
This directive allows passing environment variables from the system
manager to spawned services. Variables in the system manager can be set
inside a container by passing `--set-env=...` options to systemd-spawn.
Tested with an on-disk test.service unit. Tested using multiple variable
names on a single line, with an empty setting to clear the current list
of variables, with non-existing variables.
Tested using `systemd-run -p PassEnvironment=VARNAME` to confirm it
works with transient units.
Confirmed that `systemctl show` will display the PassEnvironment
settings.
Marcin Bachry [Wed, 11 Nov 2015 14:45:26 +0000 (15:45 +0100)]
test: fix failing test-socket-util when running with ipv6.disable=1 kernel param
The ability to use inet_pton(AF_INET6, ...) doesn't depend on kernel
ipv6 support (inet_pton is a pure libc function), so make ipv6 address
parsing tests unconditional.
Tom Gundersen [Fri, 2 Oct 2015 14:52:49 +0000 (16:52 +0200)]
networkd: link - drop foreign config when configuring link
This is a change in behavior:
Before we would never remove any state, only add to it, we now drop unwanted
state from any link the moment we start managing it.
Note however, that we still will not remove any foreign state added at runtime,
to avoid any feedback loops. However, we make no guarantees about coexisting
with third-party tools that change the state of the links we manage.
Tom Gundersen [Tue, 10 Nov 2015 20:30:59 +0000 (21:30 +0100)]
networkd: link - track state of IPv6LL address
This is managed by the kernel, but we should track whether or not we have
a configured IPv6LL address. This fixes two issues:
- we now wait for IPv6LL before considering the link ready
- we now wait for IPv6LL before attempting to do NDisc or DHCPv6
these protocols relies on an LL address being available.
Tom Gundersen [Tue, 3 Nov 2015 12:02:16 +0000 (13:02 +0100)]
networkd: ndisc - handle router advertisement in userspace
Router Discovery is a core part of IPv6, which by default is handled by the kernel.
However, the kernel implementation is meant as a fall-back, and to fully support
the protocol a userspace implementation is desired.
The protocol essentially listens for Router Advertisement packets from routers
on the local link and use these to configure the client automatically. The four
main pieces of information are: what kind (if any) of DHCPv6 configuration should
be performed; a default gateway; the prefixes that should be considered to be on
the local link; and the prefixes with which we can preform SLAAC in order to pick
a global IPv6 address.
A lot of additional information is also available, which we do not yet fully
support, but which will eventually allow us to avoid the need for DHCPv6 in the
common case.
Short-term, the reason for wanting this is in userspace was the desire to fully
track all the addresses on links we manage, and that is not possible for addresses
managed by the kernel (as the kernel does not expose to us the fact that it
manages these addresses). Moreover, we would like to support stable privacy
addresses, which will soon be mandated and the legacy MAC-based global addresses
deprecated, to do this well we need to handle the generation in userspace. Lastly,
more long-term we wish to support more RA options than what the kernel exposes.
The previous behavior:
When DHCPv6 was enabled, router discover was performed first, and then DHCPv6 was
enabled only if the relevant flags were passed in the Router Advertisement message.
Moreover, router discovery was performed even if AcceptRouterAdvertisements=false,
moreover, even if router advertisements were accepted (by the kernel) the flags
indicating that DHCPv6 should be performed were ignored.
New behavior:
If RouterAdvertisements are accepted, and either no routers are found, or an
advertisement is received indicating DHCPv6 should be performed, the DHCPv6
client is started. Moreover, the DHCP option now truly enables the DHCPv6
client regardless of router discovery (though it will probably not be
very useful to get a lease withotu any routes, this seems the more consistent
approach).
The recommended default setting should be to set DHCP=ipv4 and to leave
IPv6AcceptRouterAdvertisements unset.
Tom Gundersen [Tue, 10 Nov 2015 14:43:52 +0000 (15:43 +0100)]
networkd: dhcp6 - split up configure() method
Enabling address acquisition, configuring the client and starting the client are now
split out. This to better handle the client being repeatedly enabled due to router
advertisements.
Tom Gundersen [Mon, 19 Oct 2015 20:15:50 +0000 (22:15 +0200)]
sd-ndisc: introduce separate callbacks
As the data passed is very different, we introduce four different callbacks:
- Generic - router discovery timed out or state machine stopped
- Router - router and link configuration received
- Prefix onlink - configuration for an onlink prefix received
- Prefix autonomous - configuration for to configure a SLAAC address for a prefix received
Tom Gundersen [Wed, 11 Nov 2015 14:17:29 +0000 (15:17 +0100)]
sd-netlink: types - let tables be sized implicitly
This way we do not rely on the size MAX* constants from the kernel headers, as these will
be out-of-sync in case we have old headers and new defines in missing.h.
Of course, ideally we'd just use normal synchronous bus calls, but this
is out of the question as long as we rely on dbus-daemon (which logs to
journald, and thus cannot use to avoid cyclic sync loops). Hence,
instead, reuse the wait logic already implemented for --sync, and use a
signal in one direction, and a mtime watch file for the reply.
journalctl: add new --sync switch for syncing the journal to disk
With this new "--sync" switch we add a synchronous way to sync
everything queued to disk, and return only after that's complete. This
command gives the guarantee that anything queued before has hit the disk
before the command returns.
While we are at it, also improve the man pages and help text for
journalctl a bit.
Snapshots were never useful or used for anything. Many systemd
developers that I spoke to at systemd.conf2015, didn't even know they
existed, so it is fairly safe to assume that this type can be deleted
without harm.
The fundamental problem with snapshots is that the state of the system
is dynamic, devices come and go, users log in and out, timers fire...
and restoring all units to some state from the past would "undo"
those changes, which isn't really possible.
Tested by creating a snapshot, running the new binary, and checking
that the transition did not cause errors, and the snapshot is gone,
and snapshots cannot be created anymore.
New systemctl says:
Unknown operation snapshot.
Old systemctl says:
Failed to create snapshot: Support for snapshots has been removed.
IgnoreOnSnaphost settings are warned about and ignored:
Support for option IgnoreOnSnapshot= has been removed and it is ignored