basic/missing: move syscall definitions to basic/missing_syscall.h
We have a bunch of syscall wrapper definitions and it's easier to
see that they follow the same pattern if they are not interspersed
with other defines.
Change the wrappers to be uniform:
- if __NR_XXX is not defined, do not bother to call the syscall,
and return -1/ENOSYS immediately.
- do not check __NR_XXX defines if we detect the symbol as defined,
since we don't need them anyway
- reindent stuff for readability
New file basic/missing_syscall.h is included at the end of missing.h
because it might make use of some of the definitions in missing.h.
For btrfs, c_f_r() is like BTRFS_IOC_CLONE which we already used, but also
works when max_bytes is set. We do call copy_bytes in coredump code with
max_bytes set, and for large files, so we might see some benefit from using
c_f_r() on btrfs.
For other filesystems, c_f_r() falls back to do_splice_direct(), the same as
sendfile, which we already call, so there shouldn't be much difference.
Tested with test-copy and systemd-coredump on Linux 4.3 (w/o c_f_r)
and 4.5 (w/ c_f_r).
We called sendfile with 16kb (a.k.a. COPY_BUFFER_SIZE) as the maximum
number of bytes to copy. This seems rather inefficient, especially with
large files. Instead, call sendfile with a "large" maximum.
What "large" max means is a bit tricky: current file offset + max
must fit in loff_t. This means that as we call sendfile more than once,
we have to lower the max size.
With this patch, test-copy calls sendfile twice, e.g.:
sendfile(4, 3, NULL, 9223372036854775807) = 738760
sendfile(4, 3, NULL, 9223372036854037047) = 0
The second call is necessary to determine EOF.
test-copy: add a test shuffling bytes between normal files
I started looking into adding copy_file_range support, and discovered
that we can improve the way we call sendfile:
- sendfile(2) man page is missing an important bit: the number of bytes to
copy cannot be too big (SSIZE_MAX actually), and the description of EINVAL
return code does not mention this either,
- our implementation works but calls sendfile over and over with a small
size, which seems suboptimal.
First add a test which (under strace) can be used to see current behaviour.
Colin Guthrie [Mon, 14 Mar 2016 09:42:07 +0000 (09:42 +0000)]
device: Ensure we have sysfs path before comparing.
In some cases we do not have a udev device when setting up a unit
(certainly the code gracefully handles this). However, we do
then go on to compare the path via path_equal which will assert
if a null value is passed in.
See https://bugs.mageia.org/show_bug.cgi?id=17766
Not sure if this is the correct fix, but it avoids the crash
Petr Lautrbach [Thu, 10 Mar 2016 09:19:56 +0000 (10:19 +0100)]
socket_address_listen - do not rely on errno
Currently socket_address_listen() calls mac_selinux_bind() to bind a UNIX
socket and checks its return value and errno for EADDRINUSE. This is not
correct. When there's an SELinux context change made for the new socket,
bind() is not the last function called in mac_selinux_bind(). In that
case the last call is setfscreatecon() from libselinux which can change
errno as it uses access() to check if /proc/thread-self is available.
It fails on kernels before 3.17 and errno is set to ENOENT.
It's safe to check only the return value at it's set to -errno.
Dan Walsh [Wed, 9 Mar 2016 14:29:25 +0000 (09:29 -0500)]
/dev/console must be labeled with SELinux label
If the user specifies an selinux_apifs_context all content created in
the container including /dev/console should use this label.
Currently when this uses the default label it gets labeled user_devpts_t,
which would require us to write a policy allowing container processes to
manage user_devpts_t. This means that an escaped process would be allowed
to attack all users terminals as well as other container terminals. Changing
the label to match the apifs_context, means the processes would only be allowed
to manage their specific tty.
This change fixes a problem preventing RKT containers from working with systemd-nspawn.
networkctl: avoid reading past end of input buffer
name is IFNAMSIZ bytes, but we would copy sizeof(info->name) bytes,
which is IFNAMSIZ + 1. In effect we would go outside of the source
buffer and possibly leave a non-null terminated string in info->name.
Rename test-boot-timestamp to test-boot-timestamps and enable by default
The source file name and the binary name were mismatched.
Rename binary to match.
Make the test exit with TEST_SKIP if the data is missing or we
have no permissions. Otherwise, the data will be printed, which
should be safe to enable by default.
Franck Bui [Tue, 1 Dec 2015 17:01:44 +0000 (18:01 +0100)]
fstab-generator: fix automount option and don't start associated mount unit at boot
Without this patch applied the mount unit with 'automount' option was still
pulled by local-fs.target and thus was activated during the boot process which
defeats the purpose of the 'automount' option:
$ mount | grep mnt
systemd-1 on /mnt type autofs (rw,relatime,fd=34,pgrp=1,timeout=0,minproto=5,maxproto=5,direct)
/dev/vdb1 on /mnt type ext2 (rw,relatime)
$ systemctl status mnt.mount | grep Active
Active: active (mounted) since Thu 2016-03-03 21:36:22 CET; 42s ago
With the patch applied:
$ reboot
...
$ mount | grep mnt
systemd-1 on /mnt type autofs (rw,relatime,fd=22,pgrp=1,timeout=0,minproto=5,maxproto=5,direct)
$ systemctl status mnt.mount | grep Active
Active: inactive (dead)
$ ls /mnt
lost+found
$ systemctl status mnt.mount | grep Active
Active: active (mounted) since Thu 2016-03-03 21:47:32 CET; 4s ago
Joel Holdsworth [Thu, 3 Mar 2016 17:25:53 +0000 (17:25 +0000)]
core/mount: Don't unmount initramfs mounts
A mount within /run/initramfs is indicative that the mount was
created by initramfs init and will be unmounted by initramfs
shutdown.
It is unlikely that such a mount point would even be unmountable
by the the main system, for example in the case of the root file-
system being loop-mounted from a file in a /run/initramfs mount.
Joel Holdsworth [Thu, 3 Mar 2016 20:40:01 +0000 (20:40 +0000)]
core/failure-action: Set job-modes to replace-irreversibly
Up until now, the failure action has launched reboot.target and
poweroff.target with a less aggressive job mode than
"systemctl reboot" does. This has meant that the reboot and power-
off operations can stall if there are any conflicts with the target
during rebooting.
src/udev/udevadm-monitor.c: In function ‘print_device’:
src/udev/udevadm-monitor.c:44:16: warning: format ‘%u’ expects argument of type ‘unsigned int’, but argument 3 has type ‘__time_t {aka long int}’ [-Wformat=]
printf("%-6s[%"PRI_TIME".%06ld] %-8s %s (%s)\n",
^
Christian Hesse [Mon, 29 Feb 2016 20:04:02 +0000 (21:04 +0100)]
ask-password: add option --no-output to not print password to stdout
systemd-ask-password can store passwords in kernel keyring. However it
uses to print the passwords to standard output nevertheless. Depending
on where systemd-ask-password is called passwords may end on display
or in log, leaking sensitive information.
This allows to make systemd-ask-password quiet, effectively disabling
printing passwords to standard output.