This change documents the existance of the systemd-nspawn@.service template
unit file, which was previously not mentioned at all. Since the unit file uses
slightly different default than nspawn invoked from the command line, these
defaults are now explicitly documented too.
A couple of further additions and changes are made, too.
Elias Probst [Wed, 22 Jun 2016 15:10:52 +0000 (17:10 +0200)]
machinectl: do not escape the unit name (#3554)
Otherwise starting a machine named `foo-bar-baz` will end up in
machinectl attempting to start the service unit
`systemd-nspawn@foo\x2dbar\x2dbaz` instead of
`systemd-nspawn@foo-bar-baz`.
Minkyung [Wed, 22 Jun 2016 11:26:05 +0000 (20:26 +0900)]
watchdog: Support changing watchdog_usec during runtime (#3492)
Add sd_notify() parameter to change watchdog_usec during runtime.
Application can change watchdog_usec value by
sd_notify like this. Example. sd_notify(0, "WATCHDOG_USEC=20000000").
To reset watchdog_usec as configured value in service file,
restart service.
Notice.
sd_event is not currently supported. If application uses
sd_event_set_watchdog, or sd_watchdog_enabled, do not use
"WATCHDOG_USEC" option through sd_notify.
Franck Bui [Mon, 20 Jun 2016 16:54:21 +0000 (18:54 +0200)]
pid1: initialize TERM environment variable correctly
When systemd is started by the kernel, the kernel set the TERM
environment variable unconditionnally to "linux" no matter the console
device used. This might be an issue for dumb devices with no colors
support.
This patch uses default_term_for_tty() for getting a more accurate
value. But it makes sure to keep the user preferences (if any) which
might be passed via the kernel command line. For that purpose /proc
should be mounted.
emergency.service: Don't say "Welcome" when it's an emergency (#3569)
Quoting @cgwalters:
Just uploading this as an RFC. Now I know reading the code that systemd says
`Welcome to $OS` as a generic thing, but my initial impression on seeing this
was that it was almost sarcastic =)
Let's say "You are in emergency mode" as a more neutral/less excited phrase.
This patch is based on #3556, but makes the same change for rescue mode.
Lukáš Nykrýn [Sun, 19 Jun 2016 17:22:46 +0000 (19:22 +0200)]
man: match runlevel symlinks recommendation with our makefile (#3563)
In makefile we create symlinks runlevel5.target to graphical.target and
runlevel2-4.target to multi-user.target. Let's say the same thing in
systemd.special manpage.
Fixes:
==27917== 3 bytes in 1 blocks are definitely lost in loss record 1 of 1
==27917== at 0x4C28BF6: malloc (vg_replace_malloc.c:299)
==27917== by 0x55083D9: strdup (in /usr/lib64/libc-2.22.so)
==27917== by 0x1140DA: find_converted_keymap (keymap-util.c:524)
==27917== by 0x110844: test_find_converted_keymap (test-keymap-util.c:52)
==27917== by 0x1124FE: main (test-keymap-util.c:213)
==27917==
Dave Reisner [Fri, 10 Jun 2016 13:50:16 +0000 (09:50 -0400)]
Ensure kdbus isn't used (#3501)
Delete the dbus1 generator and some critical wiring. This prevents
kdbus from being loaded or detected. As such, it will never be used,
even if the user still has a useful kdbus module loaded on their system.
Sort of fixes #3480. Not really, but it's better than the current state.
resolved: when restarting a transaction make sure to not touch it anymore (#3553)
dns_transaction_maybe_restart() is supposed to return 1 if the the transaction
has been restarted and 0 otherwise. dns_transaction_process_dnssec() relies on
this behaviour. Before this change in case of restart we'd call
dns_transaction_go() when restarting the lookup, returning its return value
unmodified. This is wrong however, as that function returns 1 if the
transaction is pending, and 0 if it completed immediately, which is a very
different set of return values. Fix this, by always returning 1 on redirection.
The wrong return value resulted in all kinds of bad memory accesses as we might
continue processing a transaction that was redirected and completed immediately
(and thus freed).
This patch also adds comments to the two functions to clarify the return values
for the future.
systemctl: delay pager/polkit agent opening as much as possible
In https://github.com/systemd/systemd/issues/3543, we would open the pager
before starting ssh, and the pipe fd was "leaked" into the ssh child as the
stderr fd. Previous commit fixes bus-socket to nullify stderr before launching
the child, but it seems reasonable to also delay starting the pager.
If we are going to croak when trying to open the transport, it seems better
to do this before starting the pager.
systemctl: make sure we terminate the bus connection first, and then close the pager (#3550)
If "systemctl -H" is used, let's make sure we first terminate the bus
connection, and only then close the pager. If done in this order ssh will get
an EOF on stdin (as we speak D-Bus through ssh's stdin/stdout), and then
terminate. This makes sure the standard error we were invoked on is released by
ssh, and only that makes sure we don't deadlock on the pager which waits for
all clients closing its input pipe.
(Similar fixes for the various other xyzctl tools that support both pagers and
-H)
If for whatever reason the file system is "corrupted", we want
to be resilient and ignore the error, as long as we can load the units
from a different place.
Arch bug https://bugs.archlinux.org/task/49547.
A user had an ntfs symlink (essentially a file) instead of a directory after
restoring from backup. We should just ignore that like we would treat a missing
directory, for general resiliency.
We should treat permission errors similarly. For example an unreadable
/usr/local/lib directory would prevent (user) instances of systemd from
loading any units. It seems better to continue.
core: set $JOURNAL_STREAM to the dev_t/ino_t of the journal stream of executed services
This permits services to detect whether their stdout/stderr is connected to the
journal, and if so talk to the journal directly, thus permitting carrying of
metadata.
- The passed max_length is also applied to the "comm" name, if comm_fallback is
set.
- The right thing happens if max_length == 1 is specified
- when the cmdline "foobar" is abbreviated to 6 characters the result is not
"foobar" instead of "foo...".
- trailing whitespace are removed before the ... suffix is appended. The 7
character abbreviation of "foo barz" is hence "foo..." instead of "foo ...".
- leading whitespace are suppressed from the cmdline
resolved: in the ResolveHostname() bus call, accept IP addresses with scope
When we get a literal IP address as string that includes a zone suffix, process
this properly and return the parsed ifindex back to the client, and include it
in the canonical name in case of a link-local IP address.
Apparently newer gcc versions are a bit more forgiving when assigning an
"unsigned char*" pointer to something of a different type. Let's add the
missing cast so that old gcc versions are fine, too.
systemctl: allow percent-based MemoryLimit= settings via systemctl set-property
The unit files already accept relative, percent-based memory limit
specification, let's make sure "systemctl set-property" support this too.
Since we want the physical memory size of the destination machine to apply we
pass the percentage in a new set of properties that only exist for this
purpose, and can only be set.
core: optionally, accept a percentage value for MemoryLimit= and related settings
If a percentage is used, it is taken relative to the installed RAM size. This
should make it easier to write generic unit files that adapt to the local system.
util: when determining the amount of memory on this system, take cgroup limit into account
When determining the amount of RAM in the system, let's make sure we also read
the root-level cgroup memory limit into account. This isn't particularly useful
on the host, but in containers it makes sure that whatever memory the container
got assigned is actually used for RAM size calculations.
Lukáš Nykrýn [Tue, 14 Jun 2016 12:20:56 +0000 (14:20 +0200)]
manager: reduce complexity of unit_gc_sweep (#3507)
When unit is marked as UNSURE, we are trying to find if it state was
changed over and over again. So lets not go through the UNSURE states
again. Also when we find a GOOD unit lets propagate the GOOD state to
all units that this unit reference.
This is a problem on machines with a lot of initscripts with different
starting priority, since those units will reference each other and the
original algorithm might get to n! complexity.
Thanks HATAYAMA Daisuke for the expand_good_state code.
Andrew Jeddeloh [Tue, 14 Jun 2016 09:09:06 +0000 (02:09 -0700)]
build: fix missing symbol for old kernel headers (#3530)
Fix issue where IN6_ADDR_GEN_MODE_STABLE_PRIVACY is undefined but
IFLA_INET6_ADDR_GEN_MODE is defined and thus the former does not get
fixed in missing.h. This occurs with kernel headers new enough to have
the IFLA_INET6_ADDR_GEN_MODE but old enough to not yet have
IN6_ADDR_GEN_MODE_STABLE_PRIVACY (e.g. 3.18).
This reworks "systemctl status" and "systemctl show" a bit. It removes the
definition of the `property_info` structure, because we can simply reuse the
existing UnitStatusInfo type for that.
The "could not be found" message is now printed by show_one() itself (and not
its caller), so that it is shown regardless by who the function is called.
(This makes it necessary to pass the unit name to the function.)
This also adds all properties found to a set, and then checks if any of the
properties passed via "--property=" is mising in it, if so, a proper error is
generated.
Support for checking the PID file of a unit is removed, as this cannot be done
reasonably client side (since the systemd instance we are talking to might sit
on another host)
Let's block access to the kernel keyring and a number of obsolete system calls.
Also, update list of syscalls that may alter the system clock, and do raw IO
access. Filter ptrace() if CAP_SYS_PTRACE is not passed to the container and
acct() if CAP_SYS_PACCT is not passed.
This also changes things so that kexec(), some profiling calls, the swap calls
and quotactl() is never available to containers, not even if CAP_SYS_ADMIN is
passed. After all we currently permit CAP_SYS_ADMIN to containers by default,
but these calls should not be available, even then.
This adds three new seccomp syscall groups: @keyring for kernel keyring access,
@cpu-emulation for CPU emulation features, for exampe vm86() for dosemu and
suchlike, and @debug for ptrace() and related calls.
Also, the @clock group is updated with more syscalls that alter the system
clock. capset() is added to @privileged, and pciconfig_iobase() is added to
@raw-io.
Finally, @obsolete is a cleaned up. A number of syscalls that never existed on
Linux and have no number assigned on any architecture are removed, as they only
exist in the man pages and other operating sytems, but not in code at all.
create_module() is moved from @module to @obsolete, as it is an obsolete system
call. mem_getpolicy() is removed from the @obsolete list, as it is not
obsolete, but simply a NUMA API.
Susant Sahani [Mon, 13 Jun 2016 13:57:38 +0000 (19:27 +0530)]
networkd: fix NULL pointer (#3523)
Not every link has kind associated with it.
(gdb) r
Starting program: /home/sus/tt/systemd/systemd-networkd
Missing separate debuginfos, use: dnf debuginfo-install
glibc-2.23.1-7.fc24.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
vboxnet0: Gained IPv6LL
wlp3s0: Gained IPv6LL
enp0s25: Gained IPv6LL
Enumeration completed
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6e27ade in __strcmp_sse2_unaligned () from /lib64/libc.so.6
(gdb) bt
src/network/networkd-link.c:2008
src/network/networkd-link.c:2059
src/network/networkd-link.c:2442
m=0x555555704a30, userdata=0x55555570bfe0) at src/network/networkd-link.c:2497
at src/libsystemd/sd-netlink/sd-netlink.c:347
src/libsystemd/sd-netlink/sd-netlink.c:402
src/libsystemd/sd-netlink/sd-netlink.c:432
userdata=0x5555556f7470) at src/libsystemd/sd-netlink/sd-netlink.c:739
src/libsystemd/sd-event/sd-event.c:2275
src/libsystemd/sd-event/sd-event.c:2626
timeout=18446744073709551615) at src/libsystemd/sd-event/sd-event.c:2685
bus=0x5555556f9af0, name=0x555555692315 "org.freedesktop.network1",
timeout=30000000,
check_idle=0x55555556ac84 <manager_check_idle>, userdata=0x5555556f6b20) at
src/shared/bus-util.c:134
src/network/networkd-manager.c:1128
src/network/networkd.c:127
(gdb) f 1
src/network/networkd-link.c:2008
2008 if (link->network->bridge || streq("bridge", link->kind)) {
(gdb) p link->kind
$1 = 0x0
Jouke Witteveen [Mon, 13 Jun 2016 10:50:12 +0000 (12:50 +0200)]
core/execute: pass env vars to PAM session setup (#3503)
Move the merger of environment variables before setting up the PAM
session and pass the aggregate environment to PAM setup. This allows
control over the PAM session hooks through environment variables.
PAM session initiation may update the environment. On successful
initiation of a PAM session, we adopt the environment of the
PAM context.
... as well as halt/poweroff/kexec/suspend/hibernate/hybrid-sleep.
Running those commands will fail in user mode, but we try to set the wall
message first, which might even succeed for privileged users. Best to nip
the whole sequence in the bud.
Our functions that query /proc/pid/ support using pid==0 to mean
self. get_process_id also seemed to support that, but it was not implemented
correctly: the result should be in *uid, not returned, and also it gave
completely bogus result when called from get_process_gid(). But afaict,
get_process_{uid,gid} were never called with pid==0, so it's not an actual
bug. Remove the broken code to avoid confusion.