Felipe Sateler [Wed, 17 Aug 2016 18:11:27 +0000 (15:11 -0300)]
sysv-generator: better error reporting (#3977)
Currently in the journal you get messages without context like:
systemd-sysv-generator[$pid]: Failed to build name: Invalid argument
When parsing the init script, show the file and line number where the
error was found. At the same time, add more context information if
available.
Thus turning the message into something like:
systemd-sysv-generator[$pid]: [/etc/init.d/root-system-proofd:13] Could not build name for facility $network,: Invalid argument
Vito Caputo [Wed, 17 Aug 2016 12:51:07 +0000 (05:51 -0700)]
journal: ensure open journals from find_journal() (#3973)
If journals get into a closed state like when rotate fails due to
ENOSPC, when space is made available it currently goes unnoticed leaving
the journals in a closed state indefinitely.
By calling system_journal_open() on entry to find_journal() we ensure
the journal has been opened/created if possible.
Also moved system_journal_open() up to after open_journal(), before
find_journal().
Daniel Hahler [Tue, 16 Aug 2016 16:42:41 +0000 (18:42 +0200)]
zsh: _journalctl: do not complete exclusive modes (#3957)
After `journalctl -D /var/log/journal` "--directory", "--file",
"--machine" and "--root" should not be available for completion, because
they are exclusive. But multiple `--file` arguments are allowed.
However, it should not use `--system` in that case anyway, so this patch
removes the part that should cause a default to be used and adds some
comments.
Daniel Hahler [Sat, 13 Aug 2016 14:41:22 +0000 (16:41 +0200)]
zsh: _journalctl: improve support for handling mode args (#3952)
This only completes fields from `journalctl --user` in _journal_fields when `--user`
is used.
It also changes $_sys_service_mgr to include both `--system` and `--user`,
because `journalctl` behaves different from `systemctl` in this regard.
No attempt is made to filter out invalid combinations, e.g. when using both
`--directory` and `--system` (see https://github.com/systemd/systemd/issues/3949).
journalctl: allow --root argument for journal watching
It is useful to look at a (possibly inactive) container or other os tree
with --root=/path/to/container. This is similar to specifying
--directory=/path/to/container/var/log/journal --directory=/path/to/container/run/systemd/journal
(if using --directory multiple times was allowed), but doesn't require
as much typing.
sd-journal: fix sd_journal_open_directory with SD_JOURNAL_OS_ROOT
The directory argument that is given to sd_j_o_d was ignored when
SD_JOURNAL_OS_ROOT was given, and directories relative to the root of the host
file system were used. With that flag, sd_j_o_d should do the same as
sd_j_open_container: use the path as "prefix", i.e. the directory relative to
which everything happens.
Instead of touching sd_j_o_d, journal_new is fixed to do what sd_j_o_c
was doing, and treat the specified path as prefix when SD_JOURNAL_OS_ROOT is
specified.
Daniel Hahler [Thu, 11 Aug 2016 16:52:13 +0000 (18:52 +0200)]
zsh: _journalctl: handle --user in _journal_none
This uses the same mechanism from _systemctl to inject `--user` into the
`journalctrl -F _EXE` call to list executables.
Before this patch the "commands" section would list executables from
system units always.
Daniel Hahler [Thu, 11 Aug 2016 16:46:31 +0000 (18:46 +0200)]
zsh: _filter_units_by_property: respect --user
Use `$_sys_service_mgr` to handle `--user`, so that `systemctl --user
stop` will correctly filter the active (user) units. Before this patch,
only user units that also exist as system units and are stoppable there
would be listed.
coredump: treat RLIMIT_CORE below page size as disabling coredumps (#3932)
The kernel treats values below a certain threshold (minfmt->min_coredump
which is initialized do ELF_EXEC_PAGESIZE, which varies between architectures,
but is usually the same as PAGE_SIZE) as disabling coredumps [1].
Any core image below ELF_EXEC_PAGESIZE will yield an invalid backtrace anyway [2],
so follow the kernel and not try to parse or store such images.
[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c#n660
[2] systemd-coredump[16260]: Process 16258 (sleep) of user 1002 dumped core.
Stack trace of thread 16258:
#0 0x00007f1d8b3d3810 n/a (n/a)
Rhys [Tue, 9 Aug 2016 13:33:46 +0000 (23:33 +1000)]
install: follow config_path symlink (#3362)
Under NixOS, the config_path /etc/systemd/system is a symlink to
/etc/static/systemd/system. Commands such as `systemctl list-unit-files`
and `systemctl is-enabled` did not work as the symlink was not followed.
This does not affect how symlinks are treated within the config_path
directory.
It's hard to say which one of the two mappings should stay. But the later
one would win (when both very present), and nobody complained, so let's
assume that that's the one.
hwdb: remove duplicated matches for old Logitech unifying receiver
Quoting https://github.com/systemd/systemd/pull/3906#discussion_r73828368:
> According to
> http://support.logitech.com/en_us/product/v220-cordless-optical-mouse-for-notebooks
> it seems the mouse is using a pre-version of the small unifying receiver we
> know now. If there are 2 mice with the same receiver, that means that the
> values should both be dropped IMO.
This works for hwdb/[67]0-*.hwdb. I also added code to parse hwdb/20-*, but those
files are huge, and parsing them using this parser is annoyingly slow (about one
minute for the biggest files). So I removed the support for hwdb/20-*, a much simpler
hand-generated parser should suffice for those.
Current output:
hwdb/60-evdev.hwdb: 24 match groups, 35 matches, 88 properties, 0.19323015213012695s to parse
Match 'evdev:input:b0003v05ACp0259*' is duplicated
Match 'evdev:input:b0003v05ACp025A*' is duplicated
Match 'evdev:input:b0003v05ACp025B*' is duplicated
hwdb/60-keyboard.hwdb: 122 match groups, 188 matches, 638 properties, 1.0906572341918945s to parse
Failed to parse: 'KEYBOARD_KEY_8F=switchvideomode'
Failed to parse: 'KEYBOARD_KEY_C0183=media'
Failed to parse: 'KEYBOARD_KEY_C0201=new'
Failed to parse: 'KEYBOARD_KEY_C0289=reply'
Failed to parse: 'KEYBOARD_KEY_C028B=forwardmail'
Failed to parse: 'KEYBOARD_KEY_C028C=send'
Failed to parse: 'KEYBOARD_KEY_C021A=undo'
Failed to parse: 'KEYBOARD_KEY_C0279=redo'
Failed to parse: 'KEYBOARD_KEY_C0208=print'
Failed to parse: 'KEYBOARD_KEY_C0207=save'
Failed to parse: 'KEYBOARD_KEY_C0194=file'
Failed to parse: 'KEYBOARD_KEY_C01A7=documents'
Failed to parse: 'KEYBOARD_KEY_C01B6=images'
Failed to parse: 'KEYBOARD_KEY_C01B7=sound'
Property KEYBOARD_KEY_c7 is duplicated
Failed to parse: 'KEYBOARD_KEY_cF=end'
hwdb/70-mouse.hwdb: 62 match groups, 93 matches, 68 properties, 0.34186625480651855s to parse
Match 'mouse:usb:v046dpc51b:name:Logitech USB Receiver:' is duplicated
hwdb/70-pointingstick.hwdb: 5 match groups, 14 matches, 7 properties, 0.06518816947937012s to parse
hwdb/70-touchpad.hwdb: 3 match groups, 5 matches, 3 properties, 0.039690494537353516s to parse
This way it's clear that the property block does not end at the comment.
The python checker will complain if this is not the case.
We had a few bugs before where two match blocks were merged by mistake,
and this change should help avoid that.
Tejun Heo [Sun, 7 Aug 2016 13:45:39 +0000 (09:45 -0400)]
core: add cgroup CPU controller support on the unified hierarchy
Unfortunately, due to the disagreements in the kernel development community,
CPU controller cgroup v2 support has not been merged and enabling it requires
applying two small out-of-tree kernel patches. The situation is explained in
the following documentation.
While it isn't clear what will happen with CPU controller cgroup v2 support,
there are critical features which are possible only on cgroup v2 such as
buffered write control making cgroup v2 essential for a lot of workloads. This
commit implements systemd CPU controller support on the unified hierarchy so
that users who choose to deploy CPU controller cgroup v2 support can easily
take advantage of it.
On the unified hierarchy, "cpu.weight" knob replaces "cpu.shares" and "cpu.max"
replaces "cpu.cfs_period_us" and "cpu.cfs_quota_us". [Startup]CPUWeight config
options are added with the usual compat translation. CPU quota settings remain
unchanged and apply to both legacy and unified hierarchies.
v2: - Error in man page corrected.
- CPU config application in cgroup_context_apply() refactored.
- CPU accounting now works on unified hierarchy.
buildsys,journal: allow -fsanitize=address without VALGRIND defined
Fixed (master) versions of libtool pass -fsanitize=address correctly
into CFLAGS and LDFLAGS allowing ASAN to be used without any special
configure tricks..however ASAN triggers in lookup3.c for the same
reasons valgrind does. take the alternative codepath if
__SANITIZE_ADDRESS__ is defined as well.
It was meant to write to q instead of t
FAIL: test-id128
================
=================================================================
==125770==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffd4615bd31 at pc 0x7a2f41b1bf33 bp 0x7ffd4615b750 sp 0x7ffd4615b748
WRITE of size 1 at 0x7ffd4615bd31 thread T0
#0 0x7a2f41b1bf32 in id128_to_uuid_string src/libsystemd/sd-id128/id128-util.c:42
#1 0x401f73 in main src/test/test-id128.c:147
#2 0x7a2f41336341 in __libc_start_main (/lib64/libc.so.6+0x20341)
#3 0x401129 in _start (/home/crrodriguez/scm/systemd/.libs/test-id128+0x401129)
Address 0x7ffd4615bd31 is located in stack of thread T0 at offset 1409 in frame
#0 0x401205 in main src/test/test-id128.c:37
This adds parse_nice() that parses a nice level and ensures it is in the right
range, via a new nice_is_valid() helper. It then ports over a number of users
to this.
core: remember first unit failure, not last unit failure
Previously, the result value of a unit was overriden with each failure that
took place, so that the result always reported the last failure that took
place.
With this commit this is changed, so that the first failure taking place is
stored instead. This should normally not matter much as multiple failures are
sufficiently uncommon. However, it improves one behaviour: if we send SIGABRT
to a service due to a watchdog timeout, then this currently would be reported
as "coredump" failure, rather than the "watchodg" failure it really is. Hence,
in order to report information about the type of the failure, and not about
the effect of it, let's change this from all unit type to store the first, not
the last failure.
Let's extend nss-systemd to also synthesize user/group entries for the
UIDs/GIDs 0 and 65534 which have special kernel meaning. Given that nss-systemd
is listed in /etc/nsswitch.conf only very late any explicit listing in
/etc/passwd or /etc/group takes precedence.
This functionality is useful in minimal container-like setups that lack
/etc/passwd files (or only have incompletely populated ones).
core: set $SERVICE_RESULT, $EXIT_CODE and $EXIT_STATUS in ExecStop=/ExecStopPost= commands
This should simplify monitoring tools for services, by passing the most basic
information about service result/exit information via environment variables,
thus making it unnecessary to retrieve them explicitly via the bus.
Vito Caputo [Thu, 4 Aug 2016 20:52:02 +0000 (13:52 -0700)]
fileio: fix read_full_stream() bugs (#3887)
read_full_stream() _always_ allocated twice the memory needed, due to
only breaking the realloc() && fread() loop when fread() returned 0,
requiring another iteration and exponentially enlarged buffer just to
discover the EOF condition.
This also caused file sizes >2MiB && <= 4MiB to erroneously be treated
as E2BIG, due to the inappropriately doubled buffer size exceeding
4*1024*1024.
Also made the 4*1024*1024 magic number a READ_FULL_BYTES_MAX constant.
core: turn various execution flags into a proper flags parameter
The ExecParameters structure contains a number of bit-flags, that were so far
exposed as bool:1, change this to a proper, single binary bit flag field. This
makes things a bit more expressive, and is helpful as we add more flags, since
these booleans are passed around in various callers, for example
service_spawn(), whose signature can be made much shorter now.
Not all bit booleans from ExecParameters are moved into the flags field for
now, but this can be added later.
Beef up the existing var_tmp() call, rename it to var_tmp_dir() and add a
matching tmp_dir() call (the former looks for the place for /var/tmp, the
latter for /tmp).
Both calls check $TMPDIR, $TEMP, $TMP, following the algorithm Python3 uses.
All dirs are validated before use. secure_getenv() is used in order to limite
exposure in suid binaries.
This also ports a couple of users over to these new APIs.
The var_tmp() return parameter is changed from an allocated buffer the caller
will own to a const string either pointing into environ[], or into a static
const buffer. Given that environ[] is mostly considered constant (and this is
exposed in the very well-known getenv() call), this should be OK behaviour and
allows us to avoid memory allocations in most cases.
Note that $TMPDIR and friends override both /var/tmp and /tmp usage if set.
journalctl: add new output mode "short-full" (#3880)
This new output mode formats all timestamps using the usual format_timestamp()
call we use pretty much everywhere else. Timestamps formatted this way are some
ways more useful than traditional syslog timestamps as they include weekday,
month and timezone information, while not being much longer. They are also not
locale-dependent. The primary advantage however is that they may be passed
directly to journalctl's --since= and --until= switches as soon as #3869 is
merged.
While we are at it, let's also add "short-unix" to shell completion.
util-lib: make timestamp generation and parsing reversible (#3869)
This patch improves parsing and generation of timestamps and calendar
specifications in two ways:
- The week day is now always printed in the abbreviated English form, instead
of the locale's setting. This makes sure we can always parse the week day
again, even if the locale is changed. Given that we don't follow locale
settings for printing timestamps in any other way either (for example, we
always use 24h syntax in order to make uniform parsing possible), it only
makes sense to also stick to a generic, non-localized form for the timestamp,
too.
- When parsing a timestamp, the local timezone (in its DST or non-DST name)
may be specified, in addition to "UTC". Other timezones are still not
supported however (not because we wouldn't want to, but mostly because libc
offers no nice API for that). In itself this brings no new features, however
it ensures that any locally formatted timestamp's timezone is also parsable
again.
These two changes ensure that the output of format_timestamp() may always be
passed to parse_timestamp() and results in the original input. The related
flavours for usec/UTC also work accordingly. Calendar specifications are
extended in a similar way.
The man page is updated accordingly, in particular this removes the claim that
timestamps systemd prints wouldn't be parsable by systemd. They are now.
The man page previously showed invalid timestamps as examples. This has been
removed, as the man page shouldn't be a unit test, where such negative examples
would be useful. The man page also no longer mentions the names of internal
functions, such as format_timestamp_us() or UNIX error codes such as EINVAL.
core: add new PrivateUsers= option to service execution
This setting adds minimal user namespacing support to a service. When set the invoked
processes will run in their own user namespace. Only a trivial mapping will be
set up: the root user/group is mapped to root, and the user/group of the
service will be mapped to itself, everything else is mapped to nobody.
If this setting is used the service runs with no capabilities on the host, but
configurable capabilities within the service.
This setting is particularly useful in conjunction with RootDirectory= as the
need to synchronize /etc/passwd and /etc/group between the host and the service
OS tree is reduced, as only three UID/GIDs need to match: root, nobody and the
user of the service itself. But even outside the RootDirectory= case this
setting is useful to substantially reduce the attack surface of a service.