seccomp: drop mincore() from @system-service syscall filter group
Previously, this system call was included in @system-service since it is
a "getter" only, i.e. only queries information, and doesn't change
anything, and hence was considered not risky.
However, as it turns out, mincore() is actually security sensitive, see
the discussion here:
https://lwn.net/Articles/776034/
Hence, let's adjust the system call filter and drop mincore() from it.
This constitues a compatibility break to some level, however I presume
we can get away with this as the systemcall is pretty exotic. The fact
that it is pretty exotic is also reflected by the fact that the kernel
intends to majorly change behaviour of the system call soon (see the
linked LWN article)
bl33pbl0p [Wed, 16 Jan 2019 00:03:22 +0000 (00:03 +0000)]
Log the job being merged
Makes it easier to understand what was merged (and easier to realize why).
Example is a start job running, and another unit triggering a verify-active job. It is not clear what job was it that from baz.service that merged into the installed job for bar.service in the debug logs. This makes it useful when debugging issues.
Jan 15 11:45:58 jupiter systemd[1218]: baz.service: Trying to enqueue job baz.service/start/replace
Jan 15 11:45:58 jupiter systemd[1218]: baz.service: Installed new job baz.service/start as 498
Jan 15 11:45:58 jupiter systemd[1218]: bar.service: Merged into installed job bar.service/start as 497
Jan 15 11:45:58 jupiter systemd[1218]: baz.service: Enqueued job baz.service/start as 498
It becomes:
Jan 15 11:45:58 jupiter systemd[1218]: bar.service: Merged bar.service/verify-active into installed job bar.service/start as 497
So it's apparently problematic that we use STRV_MAKE() (i.e. a compound
initializer) outside of the {} block we use it in (and that includes
outside of the ({}) block, too). Hence, let's rework the macro to not
need that.
This also makes the macro shorter, which is definitely a good and more
readable. Moreover, it will now complain if the iterator is a "char*"
instead of a "const char*", which is good too.
These sysctls were added in Linux 4.19 (torvalds/linux@30aba6656f), and
we should enable them just like we enable the older hardlink/symlink
protection since v199. Implements #11414.
Yu Watanabe [Sun, 13 Jan 2019 15:30:37 +0000 (00:30 +0900)]
network: make Link and NetDev always have the valid poiter to Manager
c4397d94c3d94909188d82e086ebedf5d3690569 introduces
link_detach_from_manager() and netdev_detach_from_manager(), and they
set Link::manager or NetDev::manager NULL.
But, at the time e.g. link is removed, hence link_drop() is called,
there may be still some asynchronous netlink call is waiting, and
their callbacks hit assertion.
This make {link,netdev}_detach_from_manager() just drop all references
from manager, but keep the pointer to manager.
Mikhail Kasimov [Tue, 15 Jan 2019 12:52:34 +0000 (14:52 +0200)]
Update systemd-system.conf.xml
Updating due to phrase "Defaults to DefaultTimeoutStartSec= from the manager configuration file, except when Type=oneshot is used, in which case the timeout is disabled by default (see systemd-system.conf)" from [0] https://github.com/systemd/systemd/blob/master/man/systemd.service.xml
Yu Watanabe [Fri, 11 Jan 2019 20:24:54 +0000 (05:24 +0900)]
sd-device-monitor: fix ordering of setting buffer size
By b1c097af8df58a94cba031a347061b7cb9b62d9b (#10239), the receive buffer
size for uevents was set by SO_RCVBUF at first, and fallback to
use SO_RCVBUFFORCE. So, as SO_RCVBUF limits to the buffer size
net.core.rmem_max, which is usually much smaller than 128MB udevd requests,
uevents buffer size was not sufficient.
This fixes the ordering of the request: SO_RCVBUFFORCE first, and
fallback to SO_RCVBUF. Then, udevd's uevent buffer size can be set to
128MB.
Michael Biebl [Thu, 10 Jan 2019 11:58:27 +0000 (12:58 +0100)]
meson: stop setting -fPIE globally
Setting -fPIE globally can lead to miscompilations on certain
architectures.
This is caused by both -fPIE and -fPIC options being added to various
compilation commands. Only -fPIC is being recorded in the LTO options
section of the object. The gcc-8 LTO plugin merges -fPIC + -fPIE to
nothing. So, the compilations done by the plugin are not
position-independent and fail to link with -pie.
The simplest solution is to stop setting -fPIE globally and instead
using meson's b_pie=true option. This requires meson 0.49 or later.
Since we don't set this option in meson.build but leave it up to the
distro maintainer to set this option, do not bump the meson version
requirement.
pam_systemd: reword message about not creating a session
The message is changed from
Cannot create session: Already running in a session...
to
Not creating session: Already running in a session...
This is more neutral and avoids suggesting a problem.
"Will not create session: ..." was suggested, but it sounds like the action
would have yet to be performed. I think Using present continuous is better.
udev: open control and netlink sockets before daemonization
c4b69e990f962128cc6975e36e91e9ad838fa2c4 effectively moved the initalization of socket.
Before that commit:
run → listen_fds → udev_ctrl_new → udev_ctrl_new_from_fd → socket()
After:
run → main_loop → manager_new → udev_ctrl_new_from_fd → socket()
The problem is that main_loop was called after daemonization. Move manager_new
out of main_loop and before daemonization.
Fixes #11314 (hopefully ;)).
v2: Yu Watanabe
sd_event is initialized in main_loop().
journal-remote: set a limit on the number of fields in a message
Existing use of E2BIG is replaced with ENOBUFS (entry too long), and E2BIG is
reused for the new error condition (too many fields).
This matches the change done for systemd-journald, hence forming the second
part of the fix for CVE-2018-16865
(https://bugzilla.redhat.com/show_bug.cgi?id=1653861).
Calling mhd_respond(), which ulimately calls MHD_queue_response() is
ineffective at point, becuase MHD_queue_response() immediately returns
MHD_NO signifying an error, because the connection is in state
MHD_CONNECTION_CONTINUE_SENT.
As Christian Grothoff kindly explained:
> You are likely calling MHD_queue_repsonse() too late: once you are
> receiving upload_data, HTTP forces you to process it all. At this time,
> MHD has already sent "100 continue" and cannot take it back (hence you
> get MHD_NO!).
>
> In your request handler, the first time when you are called for a
> connection (and when hence *upload_data_size == 0 and upload_data ==
> NULL) you must check the content-length header and react (with
> MHD_queue_response) based on this (to prevent MHD from automatically
> generating 100 continue).
If we ever encounter this kind of error, print a warning and immediately
abort the connection. (The alternative would be to keep reading the data,
but ignore it, and return an error after we get to the end of data.
That is possible, but of course puts additional load on both the
sender and reciever, and doesn't seem important enough just to return
a good error message.)
Note that sending of the error does not work (the connection is always aborted
when MHD_queue_response is used with MHD_RESPMEM_MUST_FREE, as in this case)
with libµhttpd 0.59, but works with 0.61:
https://src.fedoraproject.org/rpms/libmicrohttpd/pull-request/1
journald: lower the maximum entry size limit to ½ for non-sealed fds
We immediately read the whole contents into memory, making thigs much more
expensive. Sealed fds should be used instead since they are more efficient
on our side.
journald: when processing a native message, bail more quickly on overbig messages
We'd first parse all or most of the message, and only then consider if it
is not too large. Also, when encountering a single field over the limit,
we'd still process the preceding part of the message. Let's be stricter,
and check size limits early, and let's refuse the whole message if it fails
any of the size limits.
journald: set a limit on the number of fields (1k)
We allocate a iovec entry for each field, so with many short entries,
our memory usage and processing time can be large, even with a relatively
small message size. Let's refuse overly long entries.
What from I can see, the problem is not from an alloca, despite what the CVE
description says, but from the attack multiplication that comes from creating
many very small iovecs: (void* + size_t) for each three bytes of input message.
coredump: fix message when we fail to save a journald coredump
If creation of the message failed, we'd write a bogus entry:
systemd-coredump[1400]: Cannot store coredump of 416 (systemd-journal): No space left on device
systemd-coredump[1400]: MESSAGE=Process 416 (systemd-journal) of user 0 dumped core.
systemd-coredump[1400]: Coredump diverted to
2MB might be hard for some clients to use meaningfully, but OTOH, it is
important to log the full commandline sometimes. For example, when the program
is crashing, the exact argument list is useful.
journald: do not store the iovec entry for process commandline on stack
This fixes a crash where we would read the commandline, whose length is under
control of the sending program, and then crash when trying to create a stack
allocation for it.
coredump: remove duplicate MESSAGE= prefix from message
systemd-coredump[9982]: MESSAGE=Process 771 (systemd-journal) of user 0 dumped core.
systemd-coredump[9982]: Coredump diverted to /var/lib/systemd/coredump/core...
log_dispatch() calls log_dispatch_internal() which calls write_to_journal()
which appends MESSAGE= on its own.
timesyncd,resolved,machinectl: drop calls to sd_event_get_exit_code()
In all three cases, sd_event_loop() will return the exit code anyway.
If sd_event_loop() returns negative, failure is logged and results in an
immediate return. Otherwise, we don't care if sd_event_loop() returns 0
or positive, because the return value feeds into DEFINE_MAIN_FUNCTION(), which
doesn't make the distinction.
Empty line between setting the output parameter and return is removed. I like
to think about both steps as part of returning from the function, and there's
no need to separate them.
Similarly, if we need to unset a pointer after successfully passing ownership,
use TAKE_PTR and do it immediately after the ownership change, without an empty
line inbetween.
dana [Mon, 24 Dec 2018 11:15:38 +0000 (05:15 -0600)]
zsh completion: Prevent functions from clobbering each other, &c.
- Don't redefine helpers on every call
- Prefix helper names with main function name
- Adjust some helper names for consistency and convention adherance