Don't ever permit successful user or group lookups if no UID/GID mapping is
actually applied. THis way, we can be sure that nss-mymachines cannot be used
to insert invalid cache entries into nscd's cache.
Daniel Mack [Wed, 10 Feb 2016 14:44:01 +0000 (15:44 +0100)]
cgroup: remove support for NetClass= directive
Support for net_cls.class_id through the NetClass= configuration directive
has been added in v227 in preparation for a per-unit packet filter mechanism.
However, it turns out the kernel people have decided to deprecate the net_cls
and net_prio controllers in v2. Tejun provides a comprehensive justification
for this in his commit, which has landed during the merge window for kernel
v4.5:
As we're aiming for full support for the v2 cgroup hierarchy, we can no
longer support this feature. Userspace tool such as nftables are moving over
to setting rules that are specific to the full cgroup path of a task, which
obsoletes these controllers anyway.
This commit removes support for tweaking details in the net_cls controller,
but keeps the NetClass= directive around for legacy compatibility reasons.
coredump: dump priviliges when processing system coredumps
Let's add an extra-safety net and change UID/GID to the "systemd-coredump" user when processing coredumps from system
user. For coredumps of normal users we keep the current logic of processing the coredumps from the user id the coredump
was created under.
The kernel sets RLIMIT_CORE to 0 by default. Let's bump this to unlimited by
default (for systemd itself and all processes we fork off), so that the
coredump hooks have an effect if they honour it.
Bumping RLIMIT_CORE of course would have the effect that "core" files will end
up on the system at various places, if no coredump hook is used. To avoid this,
make sure PID1 sets the core pattern to the empty string by default, so that
this logic is disabled.
This change in defaults should be useful for all systems where coredump hooks
are used, as it allows useful usage of RLIMIT_CORE from these hooks again. OTOH
systems that expect that coredumps are placed under the name "core" in the
current directory will break with this change. Given how questionnable this
behaviour is, and given that no common distro makes use of this by default it
shouldn't be too much of a loss. Also, the old behaviour may be restored by
explicitly configuring a "core_pattern" of "core", and setting the default
system RLIMIT_CORE to 0 again via system.conf.
coredump: honour RLIMIT_CORE when saving/processing coredumps
With this change processing/saving of coredumps takes the RLIMIT_CORE resource limit of the crashing process into
account, given the user control whether specific processes shall core dump or not, and how large to make the core dump.
Note that this effectively disables core-dumping for now, as RLIMIT_CORE defaults to 0 (i.e. is disabled) for all
system processes.
This reworks the coredumping logic so that the coredump handler invoked from the kernel only collects runtime data
about the crashed process, and then submits it for processing to a socket-activate coredump service, which extracts a
stacktrace and writes the coredump to disk.
This has a number of benefits: the disk IO and stack trace generation may take a substantial amount of resources, and
hence should better be managed by PID 1, so that resource management applies. This patch uses RuntimeMaxSec=, Nice=, OOMScoreAdjust=
and various sandboxing settings to ensure that the coredump handler doesn't take away unbounded resources from normally
priorized processes.
This logic is also nice since this makes sure the coredump processing and storage is delayed correctly until
/var/systemd/coredump is mounted and writable.
activate: add a new switch --inetd to enable inetd-style socket activation
Previously, using --accept would enable inetd-style socket activation in addition to per-connection operation. This is
now split into two switches: --accept only switches between per-connection or single-instance operation. --inetd
switches between inetd-style or new-style fd passing.
This breaks the interface of the tool, but given that it is a debugging tool shipped in /usr/lib/systemd/ it's not
really a public interface.
This change allows testing new-style per-connection daemons.
core: make the StartLimitXYZ= settings generic and apply to any kind of unit, not just services
This moves the StartLimitBurst=, StartLimitInterval=, StartLimitAction=, RebootArgument= from the [Service] section
into the [Unit] section of unit files, and thus support it in all unit types, not just in services.
This way we can enforce the start limit much earlier, in particular before testing the unit conditions, so that
repeated start-up failure due to failed conditions is also considered for the start limit logic.
For compatibility the four options may also be configured in the [Service] section still, but we only document them in
their new section [Unit].
This also renamed the socket unit failure code "service-failed-permanent" into "service-start-limit-hit" to express
more clearly what it is about, after all it's only triggered through the start limit being hit.
Finally, the code in busname_trigger_notify() and socket_trigger_notify() is altered to become more alike.
editors: only extend line width to 119 for C and XML files
For all other files leave the line width at 79 as before. This is a good idea
since we generally don't want text files such as catalog files, unit files or
README/NEWS files to be line-broken at 119 since they are regularly browsed on
text terminals.
While we are at it, also add a couple of comments to the various files.
(Note that .editorconfig doesn't carry line-width information, simply because
the specification doesn't know the concept.)
core: clarify which unit file is masked in error message
After all, the masked unit file error might be returned when enqueuing a unit that is not masked but requires a masked
unit. In this case it should really be clear which unit is meant here.
units: downgrade dependency on /tmp in basic.target to Wants=
Now that requiring of a masked unit results in failure again, downgrade the dependency on /tmp to Wants= again, so that
our suggested way to disable /tmp-on-tmpfs by masking doesn't result in a failing boot.
core: change internal error code for masked units from EBADR to ESHUTDOWN
This commit changes the mapping of the BUS_ERROR_UNIT_MASKED error to ESHUTDOWN. This error is used whenever the
transaction engine is asked to operate on a masked unit. ESHUTDOWN is what is used for the similar case when the unit
file enable/disable logic hits a masked unit file, hence is a natural candidate to be used here too.
Background: before this patch both "job type not applicable" and "unit masked" where mapped to EBADR, which
transaction_add_job_and_dependencies() then checked for. It actually wanted to check exclusively for the former error
condition, not the latter but due to the same mapping this failed to work.
This patch semi-undoes an accidental change made in caffa4ef700fdd0eadd6c0b2ef9925611672a1bc, however restores the
error number to ESHUTDOWN instead of the original ENOSYS (for the reasons indicated above).
To make this easier to grok for the future, I added comments to explaining which error conditions are checked for.
Michal Sekletar [Tue, 9 Feb 2016 08:57:45 +0000 (09:57 +0100)]
path_id: reintroduce by-path links for virtio block devices
Enumeration of virtio buses is global and hence
non-deterministic. However, we are guaranteed there is never going to be
more than one virtio bus per parent PCI device. While populating
ID_PATH we simply skip virtio part of the syspath and we extend the path
using the sysname of the parent PCI device.
With this patch udev creates following by-path links for virtio-blk
device /dev/vda which contains two partitions.
ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx 1 root root 9 Feb 9 10:47 virtio-pci-0000:00:05.0 -> ../../vda
lrwxrwxrwx 1 root root 10 Feb 9 10:47 virtio-pci-0000:00:05.0-part1 -> ../../vda1
lrwxrwxrwx 1 root root 10 Feb 9 10:47 virtio-pci-0000:00:05.0-part2 -> ../../vda2
journal: Drop monotonicity check when appending to journal file
Remove the check that triggers rotation of the journal file when the arriving log entry had a monotonic timestamp smaller that the previous log entry. This check causes unnecessary rotations when journal-remote was receiving from multiple senders, therefore monotonicity can not be guaranteed. Also, it does not offer any useful functionality for systemd-journald.
The dual_timestamp_from_realtime(), dual_timestamp_from_monotonic()
and dual_timestamp_from_boottime_or_monotonic() shares the same
code for comparison given ts with delta. Let's move it to the
separate inline function to prevent code duplication.
time-util: merge format_timestamp_internal() and format_timestamp_internal_us()
The time_util.c provides format_timestamp_internal() and
format_timestamp_internal_us() functions for a timestamp formating. Both
functions are very similar and differ only in formats handling.
We can add additional boolean parameter to the format_timestamp_internal()
function which will represent is a format for us timestamp or not.
This allows us to get rid of format_timestamp_internal_us() that is prevent
code duplication.
We can remove format_timestamp_internal_us() safely, because it is static and
has no users outside of the time_util.c. New fourth parameter will be passed
inside of the format_timestamp(), format_timestamp_us() and etc, functions,
but the public API is not changed.
random-seed: provide nicer error message when unable to open file
If /var is read-only, and the seed file does not exist, we would print
a misleading error message for ENOENT. Print both messages instead, to
make it easy to diagonose.
Also, treat the cases of missing seed file the same as empty seed file
and exit successfully. Initialize the return code properly.
Michał Górny [Sat, 6 Feb 2016 12:47:30 +0000 (13:47 +0100)]
build-sys: Perform flag tests in context to existing flags
Fix the CC_CHECK_FLAG_APPEND macro to test appended flags in context to
current flag values. Otherwise, it is possible to append flags colliding
with user's *FLAGS or even previously appended flags that will cause
the build to fail.
The time-util.c provides dual_timestamp_get() function for getting
realtime and monotonic timestamps. Let's use it instead of direct
realtime/monotonic calculation.
Vito Caputo [Fri, 5 Feb 2016 15:26:18 +0000 (07:26 -0800)]
journal: remove template from open_journal args
None of the callers take advantage of this parameter, it's always NULL,
this is just a private helper function to simplify the call sites so
drop the template parameter altogether. If a caller emerges later who
needs it, it can be restored.
Stef Walter [Sun, 8 Nov 2015 10:20:01 +0000 (11:20 +0100)]
journal: Combine journal catalog entries with the same id
Instead of discarding duplicate catalog entries, we now combine
them. This allows software or admins to add or override catalog
headers, or add additional text to the catalog message.
Set the MHD_OPTION_CONNECTION_MEMORY_LIMIT to 128KB. The precious value was DATA_SIZE_MAX, which was defined as 1024*1024*768. This caused journal-remote to allocate 756MB for each journal-upload connection, thus exhausting the available memory.
Let's use bitfields for our booleans, and don't try to apply binary OR or addition on them, because that's weird and we
should instead use logical OR only.
We really shouldn't fail silently, but print a log message about these errors. Also make sure to attach error codes to
all log messages where that makes sense.
(While we are at it, add a couple of (void) casts to functions where we knowingly ignore return values.)
core: when a service's ExecStartPre= times out, skip ExecStop=
This makes sure we never run two control processes at the same time, we cannot keep track off.
This introduces a slight change of behaviour but cleans up the definition of ExecStop= and ExecStopPost=. The former is
now invoked only if the service managed to start-up correctly. The latter is called even if start-up failed half-way.
Thus, ExecStopPost= may be used as clean-up step for both successful and failed start-up attempts, but ExecStop='s
purpose is clearly defined as being responsible for shutting down the service and nothing else.
The precise behaviour of this was not documented yet. This commit adds the necessary docs.