Change cunescape() to return a normal error code, so that we can
distuingish OOM errors from parse errors.
This also adds a flags parameter to control whether "relaxed" or normal
parsing shall be done. If set no parse failures are generated, and the
only reason why cunescape() can fail is OOM.
David Herrmann [Tue, 7 Apr 2015 12:03:44 +0000 (14:03 +0200)]
core: fix mount setup to work with non-existing mount points
We must not fail on ENOENT. We properly create the mount-point in
mount-setup, so there's really no reason to skip the mount. Make sure we
just skip the mount on unexpected failures or if it's already mounted.
Hans de Goede [Fri, 3 Apr 2015 10:07:32 +0000 (12:07 +0200)]
udev: input_id: tag accelerometers as ID_INPUT_ACCELEROMETER
input_id already (tries to) tag accelerometers as such, but this only works
for absolute accelerometers. Recent kernels mark accelerometers through an
input prop. Trust that prop and always tag devices with it with
ID_INPUT_ACCELEROMETER.
Note that detection by the prop bit works the same as the existing detection
and will ensure that no other tags get set on the device.
Peter Hutterer [Thu, 26 Mar 2015 04:08:35 +0000 (14:08 +1000)]
udev: input_id: tag pointing sticks as ID_INPUT_POINTINGSTICK
Also referred to as trackpoint, trackstick. These are marked by recent kernels
through an input prop. Forward that prop as udev property so userspace can
easily determine whether there is a pointing stick present.
These devices were previously marked as ID_INPUT_MOUSE, for backwards
compatibility we keep that in place, the new property is an addition.
Commit e792e890f ("path-util: don't eat up ENOENT in
path_is_mount_point()") changed path_is_mount_point() so it doesn't hide
-ENOENT from its caller. This causes all boots to fail early in case
any of the mount points does not exist (for instance, when kdbus isn't
loaded, /sys/fs/kdbus is missing).
Fix this by returning 0 from mount_one() if path_is_mount_point()
returned -ENOENT.
Tom Gundersen [Sun, 5 Apr 2015 10:17:29 +0000 (12:17 +0200)]
sd-device: don't use alloca() within loops
I shall not use alloca() within loops
I shall not use alloca() within loops
I shall not use alloca() within loops
I shall not use alloca() within loops
...
Daniel Mack [Thu, 2 Apr 2015 22:40:01 +0000 (00:40 +0200)]
bootchart: assorted coding style fixes
* kill unnecessary {}
* add newlines where appropriate
* remove dead code
* reorder variable declarations
* fix more return code logic
* pass O_CLOEXEC to all open*() calles
* use safe_close() where possible
Daniel Mack [Thu, 2 Apr 2015 12:15:33 +0000 (14:15 +0200)]
bootchart: clean up sysfd and proc handling
Retrieve the handle to procfs in main(), and pass it functions
that need it. Kill the global variables.
Also, refactor lots of code in svg_title(). There's no need to access any
global variables from there either, and we really should return proper
errors from there as well.
units: explicitly require /var, /tmp and /var/tmp to be mounted before basic.target
We support /var, /tmp and /var/tmp on NFS. NFS shares however are by
default ordered only before remote-fs.target which is a late-boot
service. /var, /tmp, /var/tmp need to be around earlier though, hence
explicitly order them before basic.target.
Note that this change simply makes explicit what was implicit before,
since many early-boot services pulled in parts of /var anyway early.
units: move After=systemd-hwdb-update.service dependency from udev to udev-trigger
Let's move the hwdb regeneration a bit later. Given that hwdb is
non-essential it should be OK to allow udev to run without it until we
do the full trigger.
Tom Gundersen [Wed, 1 Apr 2015 11:50:31 +0000 (13:50 +0200)]
libsystemd: add sd-device library
This provides equivalent functionality to libudev-device, but in the
systemd style. The public API only caters to creating sd_device objects
from for devices that already exist in /sys, there is no support for
listening for monitoring events or creating devices received over
the udev netlink protocol.
The private API contains the necessary functionality to make sd-device
a drop-in replacement for libudev-device, but which we would not
otherwise want to export.
Lukas Nykryn [Mon, 30 Mar 2015 12:42:02 +0000 (14:42 +0200)]
mount: don't run quotaon only for network filesystems
If you have for example ext4 on iscsi devices it is possible to setup
qoutas there. Unfortunately, because such fstab entry contains _netdev,
systemd will not add dependency to quotaon.service.
Alban Crequy [Tue, 31 Mar 2015 15:14:48 +0000 (17:14 +0200)]
nspawn: fallback on bind mount when mknod fails
Some systems abusively restrict mknod, even when the device node already
exists in /dev. This is unfortunate because it prevents systemd-nspawn
from creating the basic devices in /dev in the container.
This patch implements a workaround: when mknod fails, fallback on bind
mounts.
Additionally, /dev/console was created with a mknod with the same
major/minor as /dev/null before bind mounting a pts on it. This patch
removes the mknod and creates an empty regular file instead.
In order to test this patch, I used the following configuration, which I
think should replicate the system with the abusive restriction on mknod:
# grep devices /proc/self/cgroup
4:devices:/user.slice/restrict
# cat /sys/fs/cgroup/devices/user.slice/restrict/devices.list
c 1:9 r
c 5:2 rw
c 136:* rw
# systemd-nspawn --register=false -D .
v2:
- remove "bind", it is not needed since there is already MS_BIND
v3:
- fix error management when calling touch()
- fix lowercase in error message
We have no such check in any of the other tools, hence don't have one in
nspawn either.
(This should make things nicer for Rocket, among other things)
Note: removing this check does not mean that we support running nspawn
on non-systemd. We explicitly don't. It just means that we remove the
check for running it like that. You are still on your own if you do...
Andrew Jones [Tue, 31 Mar 2015 09:08:13 +0000 (11:08 +0200)]
ARM: detect-virt: detect QEMU/KVM
QEMU/KVM guests do not have hypervisor nodes, but they do have
fw-cfg nodes (since qemu v2.3.0-rc0). fw-cfg nodes are documented,
see kernel doc Documentation/devicetree/bindings/arm/fw-cfg.txt,
and therefore we should be able to rely on it in this detection.
Unfortunately, we currently don't have enough information in the
DT, or elsewhere, to determine if we're using KVM acceleration
with QEMU or not, so we can only report 'qemu' at this time, even
if KVM is in use. This shouldn't really matter in practice though,
because if detect-virt is used interactively it will be clear to
the user whether or not KVM acceleration is present by the overall
speed of the guest. If used by a script, then the script's behavior
should not change whether it's 'qemu' or 'kvm'. QEMU emulated
guests and QEMU/KVM guests of the same type should behave
identically, only the speed at which they run should differ.
Andrew Jones [Tue, 31 Mar 2015 09:08:11 +0000 (11:08 +0200)]
detect-virt: use /proc/device-tree
Kernel doc Documentation/ABI/testing/sysfs-firmware-ofw says that
the /proc/device-tree symlink should be used, as opposed to
directly accessing /sys/firmware/devicetree/base. The former is
ABI, but not the later.
Entropy Graph code doesn't handle the error condition if open() of /proc entry
fails. Moreover, the file is only opened once and only first sample will contain
the correct value because the return value of pread() is also not handled
properly and file is not re-opened. Fix both problems.
systemd-bootchart: Prevent leaking file descriptors in open-fdopen combination
Correctly handle the potential failure of fdopen() (because of OOM, for instance)
after potentially successful open(). Prevent leaking open fd in such case.
systemd-bootchart: Prevent closing random file descriptors
If the kernel has no CONFIG_SCHED_DEBUG option set, systemd-bootchart produces
empty .svg file. The reason for this is very fragile file descriptor logic in
log_sample() and main() (/* do some cleanup, close fd's */ block). There are
many places where file descriptors are closed on failure (missing SCHED_DEBUG
provokes it), but there are several problems with it:
- following iterations in the loop see that the descriptor is non zero and do
not open the corresponding file again;
- "some cleanup" code closes already closed files and the descriptors are reused
already, in particular for resulting .svg file;
- static "vmstat" and "schedstat" variables in log_sample() made the situation
even worse.
Harald Hoyer [Fri, 27 Mar 2015 12:47:32 +0000 (13:47 +0100)]
cdrom_id: unroll and simplify data check loop
also removes this warning:
src/udev/cdrom_id/cdrom_id.c: In function ‘cd_media_info.isra.13’:
src/udev/cdrom_id/cdrom_id.c:612:12: warning: assuming signed overflow
does not occur when assuming that (X + c) >= X is always true
[-Wstrict-overflow]
static int cd_media_info(struct udev *udev, int fd)
^
Harald Hoyer [Fri, 27 Mar 2015 11:02:49 +0000 (12:02 +0100)]
fix gcc warnings about uninitialized variables
like:
src/shared/install.c: In function ‘unit_file_lookup_state’:
src/shared/install.c:1861:16: warning: ‘r’ may be used uninitialized in
this function [-Wmaybe-uninitialized]
return r < 0 ? r : state;
^
src/shared/install.c:1796:13: note: ‘r’ was declared here
int r;
^
Patrik Flykt [Wed, 25 Mar 2015 11:22:43 +0000 (13:22 +0200)]
networkd-dhcp6: Do not handle prefix expiry
Expiring prefixes need not be handled anymore as the kernel has been
instructed not to create routes for DHCPv6 assigned addresses via the
IFA_F_NOPREFIXROUTE flag.
Patrik Flykt [Mon, 2 Feb 2015 11:13:17 +0000 (13:13 +0200)]
systemd-networkd: Use IFA_F_NOPREFIXROUTE with IPv6 addresses
The IFA_F_NOPREFIXROUTE flag prevents the kernel from creating new onlink
prefixes when a DHCPv6 IPv6 address with a prefix length is set from user
space. IPv6 routing will follow the onlink status from Router Advertisment
Prefix Information options or any manually set route, which is the correct
thing to do.
As this flag has a larger value than what fits into an unsigned char, update
the flag attribute to an uint32_t and set it with an IFA_FLAGS attribute
when writing netlink messages to the kernel.
When parsing words from input files, optionally automatically unescape
the passed strings, controllable via a new flags parameter.
Make use of this in tmpfiles, and port everything else over, too.
This improves parsing quite a bit, since we no longer have to process the
same string multiple times with different calls, where an earlier call
might corrupt the input for a later call.
Tobias Hunger [Tue, 24 Mar 2015 23:05:38 +0000 (00:05 +0100)]
fstab-generator: don't accept missing root=, but accept root=none
And other non-device entries (like fstab does).
Mount whatever the user asked to be mounted on / on the kernel
command line. Do less sanity check and do *not* bail out
when the mount device looks strange or does not exist.
This basically makes the changes for deviceless filesystems
from yesterday unnecessary and is in line with what we do for
filesystems set up in fstab.
[tomegun:
- change patch title/description a bit.
- don't touch the /usr logic, that would be a separate change and
we don't currently have a convincing use-case for that.
- don't bail out on /sys ro. This only makes sense in containers,
where we would not be doing this anyway. If there is a use-case
we could consider that as a separate patch.]
Kay Sievers [Tue, 24 Mar 2015 22:28:25 +0000 (23:28 +0100)]
rules: storage - whitelist partitioned MS & MMC devices
On Mon, Mar 23, 2015 at 8:55 AM, Mantas Mikulėnas <grawity@gmail.com> wrote:
> On Tue, Mar 17, 2015 at 11:50 PM, Kay Sievers <kay@vrfy.org> wrote:
>> On Tue, Mar 17, 2015 at 5:00 PM, Mantas Mikulėnas <grawity@gmail.com>
>> wrote:
>> > Accidentally dropped in 1aff20687f4868575.
>> > ---
>> > rules/60-persistent-storage.rules | 2 +-
>> > 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> > +KERNEL!="loop*|mmcblk[0-9]*|mspblk[0-9]*|nvme*|sd*|sr*|vd*",
>> > GOTO="persistent_storage_end"
>>
>> We can't do that, we need to ignore the mmc*rpmb devices:
>>
>> http://cgit.freedesktop.org/systemd/systemd/commit/?id=b87b01cf83947f467f3c46d9831cd67955fc46b9
>>
>> Maybe "mmcblk*[0-9]" will work?
>
> Yeah, that would probably work (the names are like mmcblk0p1 etc.)
Kay Sievers [Tue, 24 Mar 2015 12:52:04 +0000 (13:52 +0100)]
timedate: remove daylight saving time handling and tzfile parser
We planned to support (the conceptually broken) daylight saving
time/local time features in the kernel, SCSI, networking, FAT
filesystem, but it turned out to be a race we cannot win and do
not want to get involved. Systemd should not fiddle with daylight
saving time or parse timezone information itself.
Leave everything to glibc or tools like date(1) and do not make any
promises or raise expectations that systemd should handle anything
like this.
Alin Rauta [Wed, 18 Mar 2015 12:06:19 +0000 (05:06 -0700)]
sd-rtnl: handle empty multi-part message from the kernel
We strips out NLMSG_DONE piece from a multi-part message adding into the
receive queue only the messages containing actual data.
If we send a request to the kernel for getting the forwarding database table (just an example),
the response will be a multi-part message like below:
1. FDB entry 1;
2. FDB entry 2;
3. NLMSG_DONE;
We strip out "3. NLMSG_DONE;" part and places into the receive queue a pointer to
"1. FDB entry 1; 2. FDB entry 2".
But if the FDB table is empty, the respose from the kernel will look like below:
1. NLMSG_DONE;
We strip out "1. NLMSG_DONE;" part and since there is no actual data got, it continues
waiting until reaching timeout.
Therefore, a call to "sd_rtnl_call" to send and wait for a response from kernel will exit
with timeout which is interpreted as error in communication.
This patch puts the NLMSG_DONE message on the receive queue if it ends an empty multi-part
message. This situation is detected in sd_rtnl_call() and in the callback code and NULL is
returned to the caller instead.
[tomegun:
- added/reworded commit message
- extend the same support to sd_rtnl_call_async()
- drop debug logging from library, we only do this if something is really wrong, but an
empty multi-part message is perfectly normal
- modernize the code we touch whilst we are at it]
timedated: flip internal status after executing operation
timedated would set the internal status before calling out to systemd to do
the actual change. When the operation was refused because of a SELinux denial,
the state kept in timedated would get out of sync, and the second call from
timedatectl would appear to succeed.