machined: also make image removal operation asynchronous
If we remove a directory image (i.e. not a btrfs snapshot) then things might
get quite expensive, hence run this asynchronous in a forked off process, too.
machined: support non-btrfs file systems with "machinectl clone"
Fall back to a normal copy operation when the backing file system isn't btrfs,
and hence doesn't support cheap snapshotting. Of course, this will be slow, but
given that the execution is asynchronous now, this should be OK.
copy: adjust directory times after writing to the directory
When recursively copying a directory tree, fix up the file times after having
created all contents in it, so that our changes don't end up altering any of
the directory times.
util: rework sigkill_wait() to not require pid_t pointer
Let's make sigkill_wait() take a normal pid_t, and add sigkill_waitp() that
takes a pointer (which is useful for usage in _cleanup_), following the usual
logic we have for this.
machined: run clone operation asynchronously in the background
Cloning an image can be slow, if the image is not on a btrfs subvolume, hence
let's make sure we do this asynchronously in a child process, so that machined
isn't blocked as long as we process the client request.
This adds a new, generic "Operation" object to machined, that is used to track
these kind of background processes.
This is inspired by the MachineOperation object that already exists to make
copy operations asynchronous. A later patch will rework the MachineOperation
logic to use the generic Operation instead.
shared/install: ignore Alias in [Install] of units which don't allow aliases
A downside is that a warning about missing [Install] is printed:
$ systemctl --root=/ enable mnt-test.mount
[/etc/systemd/system/mnt-test.mount:5] Aliases are not allowed for mount units, ignoring.
The unit files have no installation config (WantedBy, RequiredBy, Also, Alias
settings in the [Install] section, and DefaultInstance for template units).
This means they are not meant to be enabled using systemctl.
Possible reasons for having this kind of units are:
1) A unit may be statically enabled by being symlinked from another unit's
.wants/ or .requires/ directory.
2) A unit's purpose may be to act as a helper for some other unit which has
a requirement dependency on it.
3) A unit may be started when needed via activation (socket, path, timer,
D-Bus, udev, scripted systemctl call, ...).
4) In case of template units, the unit is meant to be enabled with some
instance name specified.
That's a bit misleading, but I don't see an easy way to fix this. But
the situation is similar for many other parsing errors, so maybe that's
OK.
core: refuse merging on units when the unit type does not support alias
The concept of merging units exists so that we can create Unit objects for a
number of names early, and then load them only later, possibly merging units
which then turn out to be symlinked to other names. This of course only makes
sense for unit types where multiple names per unit are supported. For all
others, let's refuse the merge operation early.
core: merge service_connection_unref() into service_close_socket_fd()
We always call one after the other anyway, and this way service_set_socket_fd()
and service_close_socket_fd() nicely match each other as one undoes the effect
of the other.
There's no need to set the no_gc bit for service units that socket units
prepare, as we always keep a proper reference (as maintained by unit_ref_set())
on them, and such references are honoured by the GC logic anyway. Moreover,
explicitly setting the no_gc bit is problematic if the socket gets GC'ed for a
reason, as the service might then leak with the bit set.
In service_set_socket_fd(), let's make sure that if we can't add the requested
dependencies we take no possession of the passed connection fd.
This way, we follow the strict rule: we take possession of the passed fd on
success, but on failure we don't, and the fd remains in possession of the
caller.
core: rename StartLimitInterval= to StartLimitIntervalSec=
We generally follow the rule that for time settings we suffix the setting name
with "Sec" to indicate the default unit if none is specified. The only
exception was the rate limiting interval settings. Fix this, and keep the old
names for compatibility.
Do the same for journald's RateLimitInterval= setting
core: move start ratelimiting check after condition checks
With #2564 unit start rate limiting was moved from after the condition checks
are to before they are made, in an attempt to fix #2467. This however resulted
in #2684. However, with a previous commit a concept of per socket unit trigger
rate limiting has been added, to fix #2467 more comprehensively, hence the
start limit can be moved after the condition checks again, thus fixing #2684.
core: introduce activation rate limiting for socket units
This adds two new settings TriggerLimitIntervalSec= and TriggerLimitBurst= that
define a rate limit for activation of socket units. When the limit is hit, the
socket is is put into a failure mode. This is an alternative fix for #2467,
since the original fix resulted in issue #2684.
In a later commit the StartLimitInterval=/StartLimitBurst= rate limiter will be
changed to be applied after any start conditions checks are made. This way,
there are two separate rate limiters enforced: one at triggering time, before
any jobs are queued with this patch, as well as the start limit that is moved
again to be run immediately before the unit is activated. Condition checks are
done in between the two, and thus no longer affect the start limit.
build-sys: improve compat with older kernel headers
In 4.2 kernel headers, some netlink defines are missing that we need. missing.h
already can add them in, but currently makes this dependent on a definition
that these kernels already have. Change the check hence to check for the newest
definition in the table, so that the whole bunch of definitions as added in on
all kernels lacking this.
path-util: also support ".old" and ".new" suffixes and recommend them
~ suffix works fine, but looks to much like it the file is supposed to be
automatically cleaned up. For new versions of configuration files installers
might want to using something that looks more permanent like foobar.new.
So let's add treat ".old" and ".new" as special.
We previously would fail with EOPNOTSUPP when encountering an AF_UNIX socket in
the directory tree to copy. Fix that, and copy them too (even if they are dead
in the result).
Let's move DUID configuration into the [DHCP] section, since it only makes
sense in a DHCP context, and should be close to the configuration of
ClientIdentifier= and suchlike.
This really shouldn't be a section of its own, we don't have any for any of our
other per-protocol specific identifiers...
parse-util: fix conversion from size_t on s390 (#3147)
On s390 size_t is an unsigned long, nor an unsigned int. They both are
of the same size and can be cast to each other safely, but the compiler
still seems unhappy about incompatible pointers.
Running cgtop on a system, which lacks expecting stat file, results in a
segfault. For example, a system with blkio tree but without cfq io scheduler,
lacks "blkio.io_service_bytes".
When the targeting cgroup's file does not exist, process() returns 0 and
also does not modify `*ret' value (which is `*ours'). As a result,
callers of refresh_one() can have bogus pointer, which result in SEGV.
This patch just properly initialize the variable to NULL.
tree-wide: rename hidden_file to hidden_or_backup_file and optimize
In standard linux parlance, "hidden" usually means that the file name starts
with ".", and nothing else. Rename the function to convey what the function does
better to casual readers.
Stop exposing hidden_file_allow_backup which is rather ugly and rewrite
hidden_file to extract the suffix first. Note that hidden_file_allow_backup
excluded files with "~" at the end, which is quite confusing. Let's get
rid of it before it gets used in the wrong place.
basic/dirent-util: do not call hidden_file_allow_backup from dirent_is_file_with_suffix
If the file name is supposed to end in a suffix, there's not need to check the
name against a list of "special" file names, which is slow. Instead, just check
that the name doens't start with a period.
Martin Pitt [Wed, 27 Apr 2016 08:34:24 +0000 (10:34 +0200)]
Stop syslog.socket when entering emergency mode (#3130)
When enabling ForwardToSyslog=yes, the syslog.socket is active when entering
emergency mode. Any log message then triggers the start of rsyslog.service (or
other implementation) along with its dependencies such as local-fs.target and
sysinit.target. As these might fail themselves (e. g. faulty /etc/fstab), this
breaks the emergency mode.
This causes syslog.socket to fail with "Failed to queue service startup job:
Transition is destructive".
Add Conflicts=syslog.socket to emergency.service to make sure the socket is
stopped when emergency.service is started.
Martin Pitt [Wed, 27 Apr 2016 07:58:42 +0000 (09:58 +0200)]
path-util: Add hidden suffixes for ucf (#3131)
ucf is a standard Debian helper for managing configuration file upgrades which
need more interaction or elaborate merging than conffiles managed by dpkg.
Ignore its temporary and backup files similarly to the *.dpkg-* ones to avoid
creating units for them in generators.
journal: set STATE_ARCHIVED as part of offlining (#2740)
The only code path which makes a journal durable is via
journal_file_set_offline().
When we perform a rotate the journal's header->state is being set to
STATE_ARCHIVED prior to journal_file_set_offline() being called.
In journal_file_set_offline(), we short-circuit the entire offline when
f->header->state != STATE_ONLINE.
This all results in none of the journal_file_set_offline() fsync() calls
being reached when rotate archives a journal, so archived journals are
never explicitly made durable.
What we do now is instead of setting the f->header->state to
STATE_ARCHIVED directly in journal_file_rotate() prior to
journal_file_close(), we set an archive flag in f->archive for the
journal_file_set_offline() machinery to honor by committing
STATE_ARCHIVED instead of STATE_OFFLINE when set.
Prior to this, rotated journals were never getting fsync() explicitly
performed on them, since journal_file_set_offline() short-circuited.
Obviously this is undesirable, and depends entirely on the underlying
filesystem as to how much durability was achieved when simply closing
the file.
Note that this problem existed prior to the recent asynchronous fsync
changes, but those changes do facilitate our performing this durable
offline on rotate without blocking, regardless of the underlying
filesystem sync-on-close semantics.
* sd-journal: detect earlier if we try to read an object from an invalid offset
Specifically, detect early if we try to read from offset 0, i.e. are using
uninitialized offset data.
* journal: when dumping journal contents, react nicer to lines we can't read
If journal files are not cleanly closed it might happen that intermediaery
journal entries cannot be read. Handle this nicely, skip over the unreadable
entries, and log a debug message about it; after all we generally follow the
logic that we try to make the best of corrupted files.
* journal-file: always generate the same error when encountering corrupted files
Let's make sure EBADMSG is the one error we throw when we encounter corrupted
data, so that we can neatly test for it.
* journal-file: when iterating through a partly corruped journal file, treat error like EOF
When we linearly iterate through a corrupted journal file, and we encounter a
read error, don't consider this fatal, but merely as EOF condition (and log
about it).
* journal-file: make seeking in corrupted files work
Previously, when we used a bisection table for seeking through a corrupted
file, and the end of the bisection table was corrupted we'd most likely fail
the entire seek operation. Improve the situation: if we encounter invalid
entries in a bisection table, linearly go backwards until we find a working
entry again.
* man: elaborate on the automatic systemd-journald.socket service dependencies
journal-file: make seeking in corrupted files work
Previously, when we used a bisection table for seeking through a corrupted
file, and the end of the bisection table was corrupted we'd most likely fail
the entire seek operation. Improve the situation: if we encounter invalid
entries in a bisection table, linearly go backwards until we find a working
entry again.
journal-file: when iterating through a partly corruped journal file, treat error like EOF
When we linearly iterate through a corrupted journal file, and we encounter a
read error, don't consider this fatal, but merely as EOF condition (and log
about it).
journal: when dumping journal contents, react nicer to lines we can't read
If journal files are not cleanly closed it might happen that intermediaery
journal entries cannot be read. Handle this nicely, skip over the unreadable
entries, and log a debug message about it; after all we generally follow the
logic that we try to make the best of corrupted files.