core: Execute first boot presets in an enable-only preset-mode.
This means any existing enabled units well be preserved and no
pre-created symlinks will be removed. This is done on first boot, when
the assumption is that /etc is not populated at all (no machine-id
setup). For minimal containers that gives a significant first boot
speed up, approximately ~20ms / ~16% in my trials.
It turns out we don't actually need to set the global ip_forward setting.
The only relevant setting is the one on each interface.
What the global toggle actually does is switch forwarding on/off for all
currently present interfaces and change the default for new ones.
That means that by setting the global ip_forward we
- Introduce a race condition, because if the interface with IPForward=yes
is brought up after one with IPForward=no, both will have forwarding
enabled, because the global switch turns it on for all interfaces.
If the other interface comes up first networkd correctly sets forward=0
and it doesn't get overridden.
- Change the forwarding setting for interfaces that networkd is not
configured to touch, even if the user disabled forwarding via sysctl,
either globally or per-interface
As forwarding works fine without this, as long as all relevant interfacest
individually set IPForward=yes: just drop it
This means that non-networkd interfaces use the global default while
networkd interfaces default to off if IPForward isn't given.
Building with address sanitizer enabled on GCC 5.1.x a memory leak
is reported because we never close the bus, fix it by using
cleanup variable attribute.
GNU memmem() requires a nonnull first parameter. Let's introduce
memmem_safe() that removes this restriction for zero-length parameters,
and make use of it where appropriate.
util: add generic calls for prefixing a root directory to a path
So far a number of utilities implemented their own calls for this, unify
them in prefix_root() and prefix_roota(). The former uses heap memory,
the latter allocates from the stack via alloca().
If systemd is built with GCC address sanitizer or leak sanitizer
the following memory leak ocurs:
May 12 02:02:46 linux.site systemd[326]: =================================================================
May 12 02:02:46 linux.site systemd[326]: ==326==ERROR: LeakSanitizer: detected memory leaks
May 12 02:02:46 linux.site systemd[326]: Direct leak of 101 byte(s) in 3 object(s) allocated from:
May 12 02:02:46 linux.site systemd[326]: #0 0x7fd1f504993f in strdup (/usr/lib64/libasan.so.2+0x6293f)
May 12 02:02:46 linux.site systemd[326]: #1 0x55d6ffac5336 in strv_new_ap src/shared/strv.c:163
May 12 02:02:46 linux.site systemd[326]: #2 0x55d6ffac56a9 in strv_new src/shared/strv.c:185
May 12 02:02:46 linux.site systemd[326]: #3 0x55d6ffa80272 in generator_paths src/shared/path-lookup.c:223
May 12 02:02:46 linux.site systemd[326]: #4 0x55d6ff9bdb0f in manager_run_generators src/core/manager.c:2828
May 12 02:02:46 linux.site systemd[326]: #5 0x55d6ff9b1a10 in manager_startup src/core/manager.c:1121
May 12 02:02:46 linux.site systemd[326]: #6 0x55d6ff9a78e3 in main src/core/main.c:1667
May 12 02:02:46 linux.site systemd[326]: #7 0x7fd1f394e8c4 in __libc_start_main (/lib64/libc.so.6+0x208c4)
May 12 02:02:46 linux.site systemd[326]: Direct leak of 29 byte(s) in 1 object(s) allocated from:
May 12 02:02:46 linux.site systemd[326]: #0 0x7fd1f504993f in strdup (/usr/lib64/libasan.so.2+0x6293f)
May 12 02:02:46 linux.site systemd[326]: #1 0x55d6ffac5288 in strv_new_ap src/shared/strv.c:152
May 12 02:02:46 linux.site systemd[326]: #2 0x55d6ffac56a9 in strv_new src/shared/strv.c:185
May 12 02:02:46 linux.site systemd[326]: #3 0x55d6ffa80272 in generator_paths src/shared/path-lookup.c:223
May 12 02:02:46 linux.site systemd[326]: #4 0x55d6ff9bdb0f in manager_run_generators src/core/manager.c:2828
May 12 02:02:46 linux.site systemd[326]: #5 0x55d6ff9b1a10 in manager_startup src/core/manager.c:1121
May 12 02:02:46 linux.site systemd[326]: #6 0x55d6ff9a78e3 in main src/core/main.c:1667
May 12 02:02:46 linux.site systemd[326]: #7 0x7fd1f394e8c4 in __libc_start_main (/lib64/libc.so.6+0x208c4)
May 12 02:02:46 linux.site systemd[326]: SUMMARY: AddressSanitizer: 130 byte(s) leaked in 4 allocation(s).
There is a leak due to the the use of cleanup_free instead _cleanup_strv_free_
nspawn: skip symlink to a combined cgroup hierarchy if it already exists
If a symlink to a combined cgroup hierarchy already exists and points to
the right path, skip it. This avoids an error when the cgroups are set
manually before calling nspawn.
nspawn: rework custom mount point order, and add support for overlayfs
Previously all bind mount mounts were applied in the order specified,
followed by all tmpfs mounts in the order specified. This is
problematic, if bind mounts shall be placed within tmpfs mounts.
This patch hence reworks the custom mount point logic, and alwas applies
them in strict prefix-first order. This means the order of mounts
specified on the command line becomes irrelevant, the right operation
will always be executed.
While we are at it this commit also adds native support for overlayfs
mounts, as supported by recent kernels.
Direct leak of 32 byte(s) in 1 object(s) allocated from:
#0 0x7f623c961c4a in malloc (/usr/lib64/libasan.so.2+0x96c4a)
#1 0x5651f79ad34e in malloc_multiply (/home/crrodriguez/scm/systemd/systemd-modules-load+0x2134e)
#2 0x5651f79b02d6 in strjoin (/home/crrodriguez/scm/systemd/systemd-modules-load+0x242d6)
#3 0x5651f79be1f5 in files_add (/home/crrodriguez/scm/systemd/systemd-modules-load+0x321f5)
#4 0x5651f79be6a3 in conf_files_list_strv_internal (/home/crrodriguez/scm/systemd/systemd-modules-load+0x326a3)
#5 0x5651f79bea24 in conf_files_list_nulstr (/home/crrodriguez/scm/systemd/systemd-modules-load+0x32a24)
#6 0x5651f79ad01a in main (/home/crrodriguez/scm/systemd/systemd-modules-load+0x2101a)
#7 0x7f623c11586f in __libc_start_main (/lib64/libc.so.6+0x2086f)
SUMMARY: AddressSanitizer: 32 byte(s) leaked in 1 allocation(s).
This happens due to the wrong cleanup attribute is used (free vs strv_free)
Tom Gundersen [Tue, 12 May 2015 14:55:29 +0000 (16:55 +0200)]
udevd: explicitly update queue file before answering to ping
This avoids updating the flag files twice for every loop, and also removes another dependency
in the main-loop, so we are freer to reshufle it as we want.
Tom Gundersen [Tue, 12 May 2015 14:51:31 +0000 (16:51 +0200)]
udevd: explicitly read out uevents we create ourselves
Rather than skippling ctrl handling whenever we have handlede inotify events
(and hence may have synthesized a 'change' event), just call the uevent
handling explicitly from on_inotify() so that the event queue is up-to-date.
It's primarily just a property of the Manager object after all, and we
try to refer to PID 1 as "manager" instead of "systemd", hence let's to
stick to this here too.
unit: move unit_warn_if_dir_nonempty() and friend to unit.c
The call is only used by the mount and automount unit types, but that's
already enough to consider it generic unit functionality, hence move it
out of mount.c and into unit.c.
This changes log_unit_info() (and friends) to take a real Unit* object
insted of just a unit name as parameter. The call will now prefix all
logged messages with the unit name, thus allowing the unit name to be
dropped from the various passed romat strings, simplifying invocations
drastically, and unifying log output across messages. Also, UNIT= vs.
USER_UNIT= is now derived from the Manager object attached to the Unit
object, instead of getpid(). This has the benefit of correcting the
field for --test runs.
Also contains a couple of other logging improvements:
- Drops a couple of strerror() invocations in favour of using %m.
- Not only .mount units now warn if a symlinks exist for the mount
point already, .automount units do that too, now.
- A few invocations of log_struct() that didn't actually pass any
additional structured data have been replaced by simpler invocations
of log_unit_info() and friends.
- For structured data a new LOG_UNIT_MESSAGE() macro has been added,
that works like LOG_MESSAGE() but prefixes the message with the unit
name. Similar, there's now LOG_LINK_MESSAGE() and
LOG_NETDEV_MESSAGE().
- For structured data new LOG_UNIT_ID(), LOG_LINK_INTERFACE(),
LOG_NETDEV_INTERFACE() macros have been added that generate the
necessary per object fields. The old log_unit_struct() call has been
removed in favour of these new macros used in raw log_struct()
invocations. In addition to removing one more function call this
allows generated structured log messages that contain two object
fields, as necessary for example for network interfaces that are
joined into another network interface, and whose messages shall be
indexed by both.
- The LOG_ERRNO() macro has been removed, in favour of
log_struct_errno(). The latter has the benefit of ensuring that %m in
format strings is properly resolved to the specified error number.
- A number of logging messages have been converted to use
log_unit_info() instead of log_info()
- The client code in sysv-generator no longer #includes core code from
src/core/.
- log_unit_full_errno() has been removed, log_unit_full() instead takes
an errno now, too.
- log_unit_info(), log_link_info(), log_netdev_info() and friends, now
avoid double evaluation of their parameters
Generate systemd-fsck-root.service in the initramfs
In the initrafms, generate a systemd-fsck-root.service to replace
systemd-fsck@<sysroot-device>.service. This way, after we transition
to the real root, systemd-fsck-root.service is marked as already done.
This introduces an unnecessary synchronization point, because
systemd-fsck@* is ordered after systemd-fsck-root also in the
initramfs. In practice this shouldn't be a problem.
Tom Gundersen [Mon, 27 Apr 2015 09:43:31 +0000 (11:43 +0200)]
udevd: worker - drop reference counting
Make the worker context have the same life-span as the worker process. It is created on fork()
and free'd on SIGCHLD.
The change means that we can get worker_returned() for a worker context that is no longer around,
this is not a problem and we can just drop the message. The only use for worker_returned() is to
know to reschedule events to workers that are still around, so if the worker has already exited
it is not important to keep track of. We still print a debug statement in this case to be on the
safe side.
David Herrmann [Wed, 6 May 2015 16:18:43 +0000 (18:18 +0200)]
bus: don't switch to kdbus if not requested
Whenever systemd is re-executed, it tries to create a system bus via
kdbus. If the system did not have kdbus loaded during bootup, but the
module is loaded later on manually, this will cause two system buses
running (kdbus and dbus-daemon in parallel).
This patch makes sure we never try to create kdbus buses if it wasn't
explicitly requested on the command-line.
Michael Olbrich [Thu, 30 Apr 2015 18:50:38 +0000 (20:50 +0200)]
tmpfiles: try to handle read-only file systems gracefully
On read-only filesystems trying to create the target will not fail with
EEXIST but with EROFS. Handle EROFS by checking if the target already
exists, and if empty when truncating.
This avoids reporting errors if tmpfiles doesn't actually needs to do
anything.
[zj: revert condition to whitelist rather then blacklisting, and add goto
to avoid stat'ting twice.]
Colin Walters [Mon, 4 May 2015 20:12:46 +0000 (16:12 -0400)]
lockfile-util.[ch]: Split out from util.[ch]
Continuing the general trend of splitting up util.[ch]. I specifically
want to reuse this code in https://github.com/GNOME/libglnx and
having it split up will make future copy-pasting easier.
man: nspawn is used in production these days, admit that
Previously, the man page suggested to only use nspawn for testing,
building, and debugging things. However, it is nowadays used in
production and used as building block for rocket, hence let's just admit
that it's pretty much production ready.
We should be more strict when verifying paths with path_is_safe() for
potentially dangerous constructs, and that includes lengths of
PATH_MAX-1 and larger. Be more accurate here.
Add VARIANT as a standard value for /etc/os-release
Some distributions (such as Fedora) are using the VARIANT field to
indicate to select packages which of several default configurations
they should be using. For example, VARIANT=Server provides a
different default firewall configuration (blocking basically
everything but SSH and the management console) whereas
VARIANT=Workstation opens many other ports for application
compatibility.
By adding this patch to the manual pages, we can standardize on a
cross-distribution mechanism for accomplishing this.
Fedora implementation details are available at
https://fedoraproject.org/wiki/Packaging:Per-Product_Configuration
generators: rename add_{root,usr}_mount to add_{sysroot,sysroot_usr}_mount
This makes it obvious that those functions are only usable in the
initramfs.
Also, add a warning when noauto, nofail, or automount is used for the
root fs, instead of silently ignoring. Using those options would be a
sign of significant misconfiguration, and if we bother to check for
them, than let's go all the way and complain.
Other various small cleanups and reformattings elsewhere.
network: Implement fallback DHCPv6 prefix handling for older kernels
When setting IPv6 addresses acquired by DHCPv6, systemd-networkd sets
the IFA_F_NOPREFIXROUTE flag in the IFA_FLAGS netlink attribute. As
the flag and the attribute are present starting with Linux 3.14, older
kernels will need systemd-network to manage prefix route expiry.
By default, DHCPv6 addresses are first assigned setting the
IFA_F_NOPREFIXROUTE flag in the IFA_FLAGS netlink attribute. Should
the address assignment fail, the same assignment is tried without
the IFA_FLAGS attribute. Should also the second attempt fail, an error
is printed and address assignment ends with failure. As successful use
of the IFA_FLAGS netlink attribute is recorded in the Link structure,
the DHCPv6 code will know if the kernel or systemd-network fallback
code handles expiring prefixes.
filtered was used to store an allocated string twice. The first allocation was
thus lost. The string is not needed for anything, so simply skip the allocation.