coredump: when reconstructing original kernel coredump context, chop off trailing zeroes
Our coredump handler operates on a "context" supplied by the kernel via
the core_pattern arguments. When we pass off a coredump for processing
to coredumpd we pass along enough information for this context to be
reconstructed. This information is passed in the usual journal fields,
and that means we extended the 1s granularity timestamp to 1µs
granularity by appending 6 zeroes. We need to chop them off again when
reconstructing the original kernel context.
udevd: use signal_to_string() instead of strsignal() at one place
strsignal() sucks, as it tries to generate human readable strings from
something that isn't really human readable by concept. Let's use
signal_to_string() instead, making this more grokkable. Difference is:
SIGINT gets translated → "SIGINT" rather than → "Interrupted".
(Note that we only do this for the journal metadata, not for the xattrs,
as the xattrs are only supposed to store the original 1:1 info we
acquired from the kernel.)
Martin Pitt [Wed, 15 Feb 2017 22:37:25 +0000 (23:37 +0100)]
test: drop TEST_DATA_DIR, fold into get_testdata_dir()
Drop the TEST_DATA_DIR macro as this was using alloca() within a
function call which is allegedly unsafe. So add a "suffix" argument to
get_testdata_dir() instead and call that directly.
Martin Pitt [Wed, 15 Feb 2017 07:52:17 +0000 (08:52 +0100)]
test: show error message if $SYSTEMD_TEST_DATA does not exist
Rename get_exe_relative_testdata_dir() to get_testdata_dir() and move
the env var check into that, so that everything interesting happens at
the same place.
tests: look for tests relative to source dir when running from build dir
automake helpfully sets a few variables for during build. When our executable
is in a directory underneath $(abs_top_builddir), we know that we're in the
build environment $(abs_top_srcdir) contains the sources, and test data is
under $(abs_top_srcdir)/test. This remains true no matter where the build
directory is relative to the source directory. It also works if the test
executable is invoked as ./test-whatever or .libs/test-whatever, since the
relative path is not used at all.
When running from outside of the build directory, we should be running from the
installed location and we can look for ../testdata relative to the location of
the exe file.
Of course, $SYSTEMD_TEST_DATA always overrides this logic.
Martin Pitt [Tue, 14 Feb 2017 21:33:52 +0000 (22:33 +0100)]
test: setup test data dir before fake runtime dir
That way, if the test directory does not exist we don't leave behind
temporary files (as in that case or on test failure the cleanup actions
don't run).
Martin Pitt [Tue, 14 Feb 2017 07:58:19 +0000 (08:58 +0100)]
test: clarify error message if test data directory does not exist
When trying to directly run a test executable in the build tree without
setting $TEST_DIR, some tests fail with a non-obvious error message.
Print an useful one instead.
A bug exists where the conflict counter is cleared
regardless of whether or not the next probe attempt leads to
a successful address acquisition. This causes 'bursts' of
MAX_CONFLICTS probes followed by a delay of
RATE_LIMIT_INTERVAL instead of a single probe each
RATE_LIMIT_INTERVAL when beyond MAX_CONFLICTS.
The conflict counter should only be cleared after an
address is successfully acquired. This commit achieves that
goal.
From RFC3927:
A host should maintain a counter of the number of address
conflicts it has experienced in the process of trying to
acquire an address, and if the number of conflicts exceeds
MAX_CONFLICTS then the host MUST limit the rate at which it
probes for new addresses to no more than one new address per
RATE_LIMIT_INTERVAL. This is to prevent catastrophic ARP
storms in pathological failure cases, such as a rogue host
that answers all ARP probes, causing legitimate hosts to go
into an infinite loop attempting to select a usable address.
Signed-off-by: Jason Reeder <jasonreeder@gmail.com>
Maarten de Vries [Thu, 16 Feb 2017 09:52:04 +0000 (10:52 +0100)]
nss-resolve: report ERANGE for small buffers. (#5359)
The correct error code to report when a provided buffer is too small is
ERANGE. This is recognized by glibc, which will then try again with a
larger buffer. The old behaviour of reporting ENOMEM has no special
meaning for glibc. The error will simply be propagated to the
application, and a later retry will trigger the same error again.
Additionally, h_errnop must be set to NETDB_INTERNAL to have glibc look
at errnop for details.
More information at:
https://www.gnu.org/software/libc/manual/html_node/NSS-Modules-Interface.html
Christian Hesse [Wed, 15 Feb 2017 22:51:31 +0000 (23:51 +0100)]
virt: swap order of cpuid and dmi again, but properly detect oracle (#5355)
This breaks again, this time for setups where Qemu is not reported via DMI for whatever
reason. So swap order of cpuid and dmi again, but properly detect oracle.
coredumpctl: display non-coredump coredump entries too
$ ./coredumpctl --no-pager -1
TIME PID UID GID SIG COREFILE EXE
Sun 2016-11-06 10:10:51 EST 29514 1002 1002 - - /usr/bin/python3.5
$ ./coredumpctl info 29514
PID: 29514 (python3)
UID: 1002 (zbyszek)
GID: 1002 (zbyszek)
Reason: ZeroDivisionError
Timestamp: Sun 2016-11-06 10:10:51 EST (3h 22min ago)
Command Line: python3 systemd_coredump_exception_handler.py
Executable: /usr/bin/python3.5
Control Group: /user.slice/user-1002.slice/user@1002.service/gnome-terminal-server.service
Unit: user@1002.service
User Unit: gnome-terminal-server.service
Slice: user-1002.slice
Owner UID: 1002 (zbyszek)
Boot ID: 1531fd22ec84429e85ae888b12fadb91
Machine ID: 519a16632fbd4c71966ce9305b360c9c
Hostname: laptop
Storage: none
Message: Process 29514 (systemd_coredump_exception_handler.py) of user zbyszek failed with ZeroDivisionError: division by
Traceback (most recent call last):
File "systemd_coredump_exception_handler.py", line 134, in <module>
g()
File "systemd_coredump_exception_handler.py", line 133, in g
f()
File "systemd_coredump_exception_handler.py", line 131, in f
div0 = 1 / 0
ZeroDivisionError: division by zero
Local variables in innermost frame:
a=3
h=<function f at 0x7efdc14b6ea0>
Embedding sd_id128_t's in constant strings was rather cumbersome. We had
SD_ID128_CONST_STR which returned a const char[], but it had two problems:
- it wasn't possible to statically concatanate this array with a normal string
- gcc wasn't really able to optimize this, and generated code to perform the
"conversion" at runtime.
Because of this, even our own code in coredumpctl wasn't using
SD_ID128_CONST_STR.
Add a new macro to generate a constant string: SD_ID128_MAKE_STR.
It is not as elegant as SD_ID128_CONST_STR, because it requires a repetition
of the numbers, but in practice it is more convenient to use, and allows gcc
to generate smarter code:
coredump: with --backtrace accept a journal entry on stdin
The entry must be a single entry in the journal export format, including the
terminating double newline. The MESSAGE field is now generated on the sender
side.
The advantage is that the reporter can easily pass additional metadata.
Continuing with the example of the python excepthook:
COREDUMP_PYTHON_EXECUTABLE=/usr/bin/python3
COREDUMP_PYTHON_VERSION=3.5.2 (default, Sep 14 2016, 11:28:32)
[GCC 6.2.1 20160901 (Red Hat 6.2.1-1)]
COREDUMP_PYTHON_THREAD_INFO=sys.thread_info(name='pthread', lock='semaphore', version='NPTL 2.24')
COREDUMP_PYTHON_EXCEPTION_TYPE=ZeroDivisionError
COREDUMP_PYTHON_EXCEPTION_VALUE=division by zero
MESSAGE=Process 29514 (systemd_coredump_exception_handler.py) of user zbyszek failed with ZeroDivisionError: division by zero
Traceback (most recent call last):
File "systemd_coredump_exception_handler.py", line 134, in <module>
g()
File "systemd_coredump_exception_handler.py", line 133, in g
f()
File "systemd_coredump_exception_handler.py", line 131, in f
div0 = 1 / 0
ZeroDivisionError: division by zero
Local variables in innermost frame:
a=3
h=<function f at 0x7efdc14b6ea0>
One consideration is whether to use the Journal Export Format, or send packets
over a UNIX socket instead. The advantage of current solution is that although
parsing is more complicated on the receiver side, it is much easier to use on the
sender side. I hope this can be used by various languages for which writing
binary structures to a UNIX socket is harder and more likely to be done wrong
than piping of a simple textyish format.
test-journal-importer: new test file to check the newly exported importer code
Only one test case is added, but it is enough to check basic sanity of the
code (single-line and binary fields and trusted fields, allocation and freeing).
coredump: implement logging of external backtraces with --backtrace
This is useful for example for Python progams. By installing a python
sys.execepthook we can store the backtrace in the journal. We gather the
backtrace in the python process, and call systemd-coredump to attach additional
fields (COREDUMP_COMM, COREDUMP_EXE, COREDUMP_UNIT, COREDUMP_USER_UNIT,
COREDUMP_OWNER_UID, COREDUMP_SLICE, COREDUMP_CMDLINE, COREDUMP_CGROUP,
COREDUMP_OPEN_FDS, COREDUMP_PROC_STATUS, COREDUMP_PROC_MAPS,
COREDUMP_PROC_LIMITS, COREDUMP_PROC_MOUNTINFO, COREDUMP_CWD, COREDUMP_ROOT,
COREDUMP_ENVIRON, COREDUMP_CONTAINER_CMDLINE). This could also be done in the
python process, but doing this in systemd-coredump saves quite a bit of
duplicate work and unifies the handling of various tricky fields like
COREDUMP_CONTAINER_CMDLINE in one place.
(Of course this applies to any other language which does not dump cores
but wants to log a traceback, e.g. ruby.)
Traceback (most recent call last):
File "systemd_coredump_exception_handler.py", line 89, in <module>
g()
File "systemd_coredump_exception_handler.py", line 88, in g
f()
File "systemd_coredump_exception_handler.py", line 86, in f
div0 = 1 / 0 # pylint: disable=W0612
ZeroDivisionError: division by zero
Local variables in innermost frame:
h=<function f at 0x7f4da3606e18>
a=3
_PID=14499
_SOURCE_REALTIME_TIMESTAMP=1477877460025975
coredump: split out metadata gathering to a separate function
In preparation for subsequenct changes...
Various stack allocations are changed to use the heap. This might be minimally
slower, but probably doesn't matter. The upside is that we will now properly
free all memory that is allocated.
Christian Hesse [Tue, 14 Feb 2017 13:51:12 +0000 (14:51 +0100)]
virt: detect qemu/kvm as 'kvm'
In commit 050e65a we swapped order of detect_vm_{cpuid,dmi}(). That
fixed Virtualbox but broke qemu with kvm, which is expected to return
'kvm'. So check for qemu/kvm first, then DMI, CPUID last.
core: explicitly verify that BindsTo= deps are in order before dispatch start operation of a unit
Let's make sure we verify that all BindsTo= are in order before we actually go
and dispatch a start operation to a unit. Normally the job queue should already
have made sure all deps are in order, but this might not have been sufficient
in two cases: a) when the user changes deps during runtime and reloads the
daemon, and b) when the user placed BindsTo= dependencies without matching
After= dependencies, so that we don't actually wait for the bound to unit to be
up before upping also the binding unit.
resolved: size the mdns announce answer array properly
The array doesn't grow dynamically, hence pick the right size at the
moment of allocation. Let's simply multiply the number of addresses of
this link by 2, as that's how many RRs we maintain for it.
Commit ae3251851 changed the fprintf() format argument into a variable
which triggers a gcc 6.3 warning/error:
src/fstab-generator/fstab-generator.c:243:17: error: format not a string literal,
argument types not checked [-Werror=format-nonliteral]
fprintf(f, format, res);
This is a false positive, as the function is only being called with
constant (not user-definable) arguments which are valid format strings.
Martin Pitt [Sun, 12 Feb 2017 22:53:53 +0000 (23:53 +0100)]
buildsys: add "install-tests" target
Add a new "install-tests" make target that installs our unit test-*
executables and their test data files into /usr/lib/systemd/tests/.
This is useful for packaging the tests to run them with root privileges
or in CI.
Martin Pitt [Sun, 12 Feb 2017 22:14:43 +0000 (23:14 +0100)]
test: make unit tests relocatable
It is useful to package test-* binaries and run them as root under
autopkgtest or manually on particular machines. They currently have a
built-in hardcoded absolute path to their test data, which does not work
when running the test programs from any other path than the original
build directory.
By default, make the tests look for their data in
<test_exe_directory>/testdata/ so that they can be called from any
directory (provided that the corresponding test data is installed
correctly). As we don't have a fixed static path in the build tree (as
build and source tree are independent), set $TEST_DIR with "make check"
to point to <srcdir>/test/, as we previously did with an automake
variable.
Martin Pitt [Sun, 12 Feb 2017 21:39:21 +0000 (22:39 +0100)]
test: move resolved test data into test/
Moe test-resolve's test data from src/resolve/test-data to
test/test-resolve/ to be consistent with test/test-{execute,path}/. This
will make it easier to make the tests relocatable.
Ruslan Bilovol [Mon, 13 Feb 2017 19:50:22 +0000 (21:50 +0200)]
fstab-generator: add x-systemd.before and x-systemd.after fstab options (#5330)
Currently fstab entries with 'nofail' option are mounted
asynchronously and there is no way how to specify dependencies
between such fstab entry and another units. It means that
users are forced to write additional dependency units manually.
The patch introduces new systemd fstab options:
x-systemd.before=<PATH>
x-systemd.after=<PATH>
- to specify another mount dependency (PATH is translated to unit name)
x-systemd.before=<UNIT>
x-systemd.after=<UNIT>
- to specify arbitrary UNIT dependency
For example mount where A should be mounted before local-fs.target unit:
On classic DNS and LLMNR ANY requests may be replied to with any kind of
RR, and the reply does not have to be comprehensive: these protocols
simply define that if there's an RRset that can answer the question,
then at least one should be sent as reply, but not necessarily all. This
means it's not safe to "merge" transactions for arbitrary RR types into
ANY requests, as the reply might not answer the specific question.
As the merging is primarily an optimization, let's undo this for now.
This logic may be readded later, in a way that only applies to mDNS.
Also, there's an OOM problem with this chunk: dns_resource_key_new()
might fail due to OOM and this is not handled. (This is easily removed
though, by using DNS_RESOURCE_KEY_CONST()).
core/dbus: silence gcc warning about unitialized variable
src/core/dbus.c: In function 'find_unit':
src/core/dbus.c:334:15: warning: 'u' may be used uninitialized in this function [-Wmaybe-uninitialized]
*unit = u;
^
src/core/dbus.c:301:15: note: 'u' was declared here
Unit *u;
^
core/manager: silence gcc warning about unitialized variable
At -O3, this was printed a hundred times for various callers of
manager_add_job_by_name(). AFAICT, there is no error and `unit` is always
intialized. Nevertheless, add explicit initialization to silence the noise.
src/core/manager.c: In function 'manager_start_target':
src/core/manager.c:1413:16: warning: 'unit' may be used uninitialized in this function [-Wmaybe-uninitialized]
return manager_add_job(m, type, unit, mode, e, ret);
^
src/core/manager.c:1401:15: note: 'unit' was declared here
Unit *unit;
^
core/manager: make manager_load_unit*() functions always take output arg
We were inconsistent, manager_load_unit_prepare() would crash if _ret was ever NULL.
But none of the callers use NULL. So simplify things and require it to be non-NULL.
Hans de Goede [Sun, 12 Feb 2017 11:33:22 +0000 (12:33 +0100)]
rules: Add extended evdev/input match rules for event nodes with the same name
Sometimes a system may have 2 input event nodes with the same name where
we only want to apply keyboard hwdb rules to 1 of the 2 devices.
This problem happens e.g. on devices where the soc_button_array driver is
used (e.g. intel atom based tablets) which registers 2 event nodes with
the name "gpio-keys".
This commit adds a new extended match rule which extends the match to also
check $attr{phys} and $attr{capabilities/ev}, allowing to differentiate
between devices with an identical name.
Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
core: skip ReadOnlyPaths= and other permission-related mounts on PermissionsStartOnly= (#5309)
ReadOnlyPaths=, ProtectHome=, InaccessiblePaths= and ProtectSystem= are
about restricting access and little more, hence they should be disabled
if PermissionsStartOnly= is used or ExecStart= lines are prefixed with a
"+". Do that.
(Note that we will still create namespaces and stuff, since that's about
a lot more than just permissions. We'll simply disable the effect of
the four options mentioned above, but nothing else mount related.)
This also adds a test for this, to ensure this works as intended.
No documentation updates, as the documentation are already vague enough
to support the new behaviour ("If true, the permission-related execution
options…"). We could clarify this further, but I think we might want to
extend the switches' behaviour a bit more in future, hence leave it at
this for now.
shared: pass *unsigned_long to namespace_flag_from_string_many (#5315)
Fixes:
```
src/shared/bus-unit-util.c: In function ‘bus_append_unit_property_assignment’:
src/shared/bus-unit-util.c:570:65: warning: passing argument 2 of ‘namespace_flag_from_string_many’ from incompatible pointer type [-Wincompatible-pointer-types]
r = namespace_flag_from_string_many(eq, &flags);
^
In file included from src/shared/bus-unit-util.c:31:0:
src/shared/nsflags.h:41:5: note: expected ‘long unsigned int *’ but argument is of type ‘uint64_t * {aka long long unsigned int *}’
int namespace_flag_from_string_many(const char *name, unsigned long *ret);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
Florian Klink [Fri, 10 Feb 2017 23:47:55 +0000 (00:47 +0100)]
networkd: add IPv6ProxyNDPAddress support (#5174)
IPv6 Neighbor discovery proxy is the IPv6 equivalent to proxy ARP for IPv4.
It is required when ISPs do not unconditional route IPv6 subnets
to their designated target, but expect neighbor solicitation messages
for every address on a link.
A variable IPv6ProxyNDPAddress= is introduced to the [Network] section,
each representing a IPv6 neighbour proxy entry in the neighbour table.
Dan Streetman [Fri, 10 Feb 2017 20:29:23 +0000 (15:29 -0500)]
test: create sys-script.py script
The script contains the contents of all sys/ test files, and creates
all dirs/links/files when run. This replaces the sys.tar.xz tarball
that contained sys/, so changes to sys files only require a simple
commit in git, instead of checking in an entire new tarball for each
sys/ change.
Dan Streetman [Fri, 10 Feb 2017 20:27:18 +0000 (15:27 -0500)]
test: add script to convert sys/ into sys-script.py
Instead of keeping all sys/ nodes in a tarball, use a script
"sys-script.py" to create all the sys/ entries.
This adds a script to create that initial "sys-script.py" script, using
an existing sys/ directory, created from the sys.tar.xz contents.
The "sys-script.py" can then be edited or recreated later, when any sys/
files are added or modified; the change will be only a patch to the
"sys-script.py" script in git, instead of forcing git to store a new
binary tarball.
path-lookup: if $HOME can be determined but $XDG_RUNTIME_DIR can't, is it
So far, if either $HOME or $XDG_RUNTIME_DIR is not set we wouldn't use
either, and fail acquire_config_dirs() and acquire_control_dirs() in
their entireties. With this change, let's make use of the variables we
can acquire, and don't bother with the other.
Specifically this means: in both acquire_config_dirs() and
acquire_control_dirs() handle ENXIO from user_config_dir() and
user_runtime_dir() directly, instead of propagating it up and handling
it in the caller.
path-lookup: try harder acquiring them $HOME of a user
Let's use get_home_dir() for figuring out the home directory, so that
there's a good chance we succeed figuring out unit locations even if
$HOME isn't set.