Stéphane Graber [Tue, 20 Jan 2015 23:40:17 +0000 (18:40 -0500)]
Set kmsg to 0 by default
It's now been proven over and over again that the symlink from /dev/kmsg
to /dev/console is harmful for everything but upstart systems. As Ubuntu
is now switching over to systemd too, let's switch the default.
Upstart users wishing to see boot messages can always set lxc.kmsg = 1
manually in their config (so long as they don't expect to then
dist-upgrade the container to systemd successfully).
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Serge Hallyn [Mon, 19 Jan 2015 05:06:55 +0000 (05:06 +0000)]
yet another problem with new overlay fs
It turns out that the new upstream overlay fs requires that the delta
and work dirs be under the same mount. So create a $lxcpath/tmpfs
and create delta0 and work0 under that. If the user asks for a
tmpfs that'll be mounted under $lxcpath/tmpfs and workdir and delta0
both created under that.
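As a rough illustration of the layout described above, the following Python sketch builds an 'overlay' option string with delta0 and work0 under one shared directory. The function name and the example paths are hypothetical; only the lowerdir/upperdir/workdir option names come from the overlay filesystem itself.

```python
import os


def overlay_mount_options(lower, base):
    """Build a mount option string for the upstream 'overlay' fs.

    `base` is one directory (e.g. $lxcpath/tmpfs) that holds both
    delta0 (upper) and work0 (work), so they live under the same mount.
    """
    upper = os.path.join(base, "delta0")
    work = os.path.join(base, "work0")
    return "lowerdir={},upperdir={},workdir={}".format(lower, upper, work)


# Hypothetical paths, for illustration only.
print(overlay_mount_options("/var/lib/lxc/base/rootfs",
                            "/var/lib/lxc/eph/tmpfs"))
```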
This isn't heavily tested. But it fixes mounting of 'overlay' fs
for me.
It's "not backward compatible", since it moves delta0, but that
shouldn't matter since ephemeral containers are either destroyed
on exit, or re-started with lxc-start.
S.Çağlar Onur [Sun, 18 Jan 2015 00:08:01 +0000 (19:08 -0500)]
restore the dropped bits of 1c1bb85ad2b6 and also implement the logic
suggested at
https://lists.linuxcontainers.org/pipermail/lxc-devel/2014-December/010985.html
Serge Hallyn [Tue, 20 Jan 2015 16:59:27 +0000 (16:59 +0000)]
update hwaddr to fill in xx at create time
Commit 67702c21 regressed the case where lxc-create uses a config
file with 'xx:xx' in lxc.network.hwaddr, so that the 'xx' were
preserved in the container's configuration file. Expand those
in the unexpanded_config file whenever we are reading a
config file which is not coming from a 'lxc.include'.
The config file will have \n-terminated lines, so update
rand_complete_hwaddr to also stop on \n.
Add a test case to make sure xx gets expanded at lxc-create.
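A minimal Python sketch of the expansion described above, assuming it is applied to the value part of the lxc.network.hwaddr line only. The function name expand_hwaddr is hypothetical; the real code is the C function rand_complete_hwaddr.

```python
import random


def expand_hwaddr(value, rng=random):
    """Replace each 'x' or 'X' with a random hex digit.

    Scanning stops at the first newline, so anything after the end of
    the line is copied through untouched.
    """
    out = []
    for i, ch in enumerate(value):
        if ch == "\n":
            out.append(value[i:])  # keep the newline and the rest as-is
            break
        if ch in "xX":
            out.append("{:x}".format(rng.randrange(16)))
        else:
            out.append(ch)
    return "".join(out)


# The fixed prefix is preserved, only the 'xx' templates are filled in.
print(expand_hwaddr("00:16:3e:xx:xx:xx\n"), end="")
```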
Serge Hallyn [Mon, 12 Jan 2015 23:56:28 +0000 (23:56 +0000)]
fill_autodev: bind-mount if mknod fails (v3)
First, rename setup_autodev to fill_autodev, since all it
does is populate it, not fully set it up.
Secondly, if mknod of a device fails, then try bind-mounting
it from the host rather than failing immediately.
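The fallback can be sketched in Python as below. This is only an outline of the control flow: the mknod and bind_mount callables are hypothetical stand-ins injected by the caller, not LXC functions.

```python
def fill_device(path, host_path, mknod, bind_mount):
    """Create a device node, falling back to a bind mount from the host.

    `mknod` and `bind_mount` are caller-supplied callables (assumptions
    for this sketch); `mknod` is expected to raise OSError on failure.
    Returns which strategy succeeded.
    """
    try:
        mknod(path)
        return "mknod"
    except OSError:
        # mknod is typically refused in an unprivileged user namespace,
        # so bind the host's node over the target instead.
        bind_mount(host_path, path)
        return "bind"
```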
Note that this isn't an urgent patch because the common.userns
configuration hook already specifies bind,create=file mount
entries for all the devices we would want.
Serge Hallyn [Mon, 12 Jan 2015 23:54:36 +0000 (23:54 +0000)]
autodev: switch strategies (v3)
Do not keep container devs under /dev/.lxc. Instead, always
keep them in a small tmpfs mounted at $(mounted_root)/dev.
The tmpfs is mounted in the container monitor's namespace. This
means that at every reboot it will get re-created. It seems to
me this better replicates what happens on a real host.
If we want devices persisting across reboots, then perhaps we can
implement a $lxcpath/$name/keepdev directory containing devices to
bind into the container at each startup.
Changelog (v2): don't bother with the $lxcpath/$name/rootfs.dev
directory, just mount the tmpfs straight into the container.
Changelog (v3): Don't create /dev if it doesn't exist
Serge Hallyn [Tue, 13 Jan 2015 06:02:26 +0000 (06:02 +0000)]
close-all-fds: fix behavior
We want to close all inherited fds in three cases - one, if a container
is daemonized. Two, if the user specifies -C on the lxc-start command
line. Three, in src/lxc/monitor.c. The presence of -C is passed in the
lxc_conf, which may not always exist.
One call to lxc_check_inherited was being done from lxc_start(), which
doesn't know whether we are daemonized. Move that call to its caller,
lxcapi_start(), which does know.
Pass an explicit closeall boolean as second argument to lxc_check_inherited.
If it is true, then all fds are closed. If it is false, then we check
the lxc_conf->close_all_fds.
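The decision described above can be sketched in Python as follows. The function name is hypothetical and the conf object is modelled with SimpleNamespace; the real logic lives in the C function lxc_check_inherited.

```python
from types import SimpleNamespace


def should_close_inherited_fds(conf, closeall):
    """Return True if inherited fds should be closed rather than warned about.

    An explicit `closeall` always wins; otherwise fall back to the
    close_all_fds setting on the conf object, which may be missing.
    """
    if closeall:
        return True
    return bool(conf is not None and getattr(conf, "close_all_fds", False))


# Daemonized start: the caller passes closeall=True regardless of conf.
print(should_close_inherited_fds(None, True))
# Foreground start with -C recorded in the conf.
print(should_close_inherited_fds(SimpleNamespace(close_all_fds=True), False))
```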
With this, all tests pass, and the logic appears correct.
Note that when -C is not true, then we only warn about inherited fds,
but we do not abort the container start. This appears to have been the case
since commit 92c7f6295518 in 2011. Unfortunately the referenced URL with
the justification is no longer valid. We may want to consider becoming
stricter about this again. (Note that the commit did say "for now")
Serge Hallyn [Tue, 13 Jan 2015 00:08:37 +0000 (00:08 +0000)]
lxc-start-ephemeral: handle the overlayfs workdir option (v2)
We fixed this some time ago for basic lxc-start, but never did
lxc-start-ephemeral.
Since the lxc-start patches were pushed, Miklos has given us a
way to detect whether we need the workdir= option. So the
bdev.c code could be simplified to check for "overlay\n" in
/proc/filesystems just as lxc-start-ephemeral does. This
patch doesn't do that.
Changelog (v2):
1. use 'overlay' fstype for new overlay upstream module
2. avoid using unneeded readlines().
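A minimal sketch of the detection mentioned above, assuming the contents of /proc/filesystems are passed in as text. The helper name is hypothetical and this is not the code from lxc-start-ephemeral itself.

```python
def has_upstream_overlay(filesystems_text):
    """Return True if a filesystem named exactly 'overlay' is listed.

    Each line of /proc/filesystems is an optional 'nodev' column, a tab,
    and the filesystem name, so the name is the last field on the line.
    """
    for line in filesystems_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "overlay":
            return True
    return False


def read_proc_filesystems(path="/proc/filesystems"):
    """Read the kernel's filesystem list (Linux only)."""
    with open(path) as fd:
        return fd.read()
```

Matching the whole name rather than a substring keeps the older 'overlayfs' module from being mistaken for the upstream one.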
Serge Hallyn [Fri, 9 Jan 2015 22:00:28 +0000 (22:00 +0000)]
Fix reversed args in mount call
Riya Khanna reported that with a ramfs rootfs the mount to make
/ rprivate was returning -EFAULT. NULL was being passed as the
mount target. Pass "/" instead.
Serge Hallyn [Fri, 9 Jan 2015 16:33:42 +0000 (16:33 +0000)]
set close-all-fds by default
When containers request to be daemonized, close-all-fd is
set to true. But when we switched to daemonize-by-default we didn't
set close-all-fd by default.
Fix that. In order to do that we have to always have a lxc_conf
object. As a consequence, after this patch we can drop a bunch
of checks for c->lxc_conf existing. We should consider removing
those. This patch does not do that.
This should close https://github.com/lxc/lxc/issues/354
Martin Pitt [Thu, 8 Jan 2015 12:09:37 +0000 (13:09 +0100)]
apparmor: Fix slave bind mounts
The permission to make a mount "slave" is spelt "make-slave", not "slave", see
https://launchpad.net/bugs/1401619. Also, we need to make all mounts slave, not
just the root dir.
Serge Hallyn [Fri, 19 Dec 2014 18:23:52 +0000 (18:23 +0000)]
Enable seccomp by default for unprivileged users.
In contrast to what the comment above the line disabling it said,
it seems to work just fine. It also is needed on current kernels
(until Eric's patch hits upstream) to prevent unprivileged containers
from hosing fuse filesystems they inherit.
Serge Hallyn [Fri, 19 Dec 2014 18:22:55 +0000 (18:22 +0000)]
seccomp: add rule to reject umount -f
If a container has a bind mount from a host nfs or fuse
filesystem, and does 'umount -f', it will disconnect the
host's filesystem. This patch adds a seccomp rule to
block umount -f from a container. It also adds that rule
to the default seccomp profile.
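Conceptually, such a rule has to inspect the flags argument of umount2 and match when the force bit is set. The Python sketch below shows only that masked comparison; how the rule is spelled in the seccomp profile is not given in this message, and the function name is hypothetical. The flag values are those of MNT_FORCE and MNT_DETACH on Linux.

```python
MNT_FORCE = 1   # Linux value of MNT_FORCE (umount -f)
MNT_DETACH = 2  # Linux value of MNT_DETACH (umount -l), for contrast


def is_forced_umount(flags):
    """Masked equality test: true when the MNT_FORCE bit is set in flags."""
    return (flags & MNT_FORCE) == MNT_FORCE


print(is_forced_umount(MNT_FORCE))               # forced unmount
print(is_forced_umount(MNT_DETACH))              # lazy unmount only
print(is_forced_umount(MNT_FORCE | MNT_DETACH))  # both bits set
```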
Shuai Zhang [Sun, 30 Nov 2014 13:03:37 +0000 (21:03 +0800)]
audit: added capacity and reserve() to nlmsg
There are now two (permitted) ways to add data to a netlink message:
1. put_xxx()
2. call nlmsg_reserve() to get a pointer to newly reserved room within the
original netlink message, then write or memcpy data to that area.
Both of them guarantee that adding data of the requested length does not
overflow the pre-allocated message buffer, by checking against its cap field first.
There may also be no need to access nlmsg_len outside the nl module, because both
put_xxx() and nlmsg_reserve() already do that for us.
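The capacity check can be illustrated with a small Python model of such a buffer. The class and method names are hypothetical, and netlink alignment padding is deliberately left out; this is a sketch of the idea, not the C implementation.

```python
class NlMsg:
    """Toy message buffer with a fixed capacity and a running length."""

    def __init__(self, cap):
        self.buf = bytearray(cap)
        self.cap = cap
        self.len = 0

    def reserve(self, n):
        """Reserve n bytes and return a writable view, or None if it won't fit."""
        if n < 0 or self.len + n > self.cap:
            return None
        start = self.len
        self.len += n  # length is updated here, callers never touch it
        return memoryview(self.buf)[start:start + n]

    def put(self, data):
        """Append data via reserve(); returns False instead of overflowing."""
        area = self.reserve(len(data))
        if area is None:
            return False
        area[:] = data
        return True


msg = NlMsg(8)
print(msg.put(b"abcd"), msg.len)
print(msg.reserve(5))  # does not fit in the remaining 4 bytes
```

Routing put() through reserve() keeps the bounds check and the length update in one place.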
Cameron Norman [Mon, 1 Dec 2014 21:29:26 +0000 (13:29 -0800)]
lxc-debian: adjust init system configurations
Do as much as possible to allow containers switching from non-systemd to
systemd to work as intended (but nothing that will cause side effects).
Use update-rc.d disable instead of remove so the init scripts are not
re-enabled when the package is updated
Signed-off-by: Cameron Norman <camerontnorman@gmail.com> Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Johannes Kastl [Wed, 26 Nov 2014 19:20:05 +0000 (20:20 +0100)]
lxc-opensuse: Disable on 13.2
Disabled building openSUSE containers on openSUSE 13.2 and openSUSE
Tumbleweed due to faulty behaviour with newer versions of
init_buildsystem.
Signed-off-by: Johannes Kastl <git@ojkastl.de> Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Abin Shahab [Wed, 12 Nov 2014 00:06:52 +0000 (00:06 +0000)]
Remounts bind mounts if read-only flag is provided
Bind mounts do not honor filesystem mount options. This change will
remount filesystems that are bind mounted if there are changes to
filesystem mount options, specifically if the mount is readonly.
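The two-step sequence can be sketched as a Python function that computes the flag sets for each mount(2) call. The function name is hypothetical; the constants are the Linux values of MS_RDONLY, MS_REMOUNT and MS_BIND.

```python
MS_RDONLY = 1
MS_REMOUNT = 32
MS_BIND = 4096


def bind_mount_steps(flags):
    """Return the flag sets for the mount calls needed to honor `flags`.

    The initial bind mount does not apply MS_RDONLY, so a read-only
    bind mount needs a second call that adds MS_REMOUNT.
    """
    steps = [flags]
    if flags & MS_BIND and flags & MS_RDONLY:
        steps.append(flags | MS_REMOUNT)
    return steps


print(bind_mount_steps(MS_BIND | MS_RDONLY))
print(bind_mount_steps(MS_BIND))
```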
Signed-off-by: Abin Shahab <ashahab@altiscale.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Antonio Terceiro [Mon, 24 Nov 2014 01:51:06 +0000 (23:51 -0200)]
lxc-debian: support systemd as PID 1
Containers with systemd need a somewhat special setup, which I borrowed
and adapted from lxc-fedora. These changes are required so that Debian 8
(jessie) containers work properly, and are a no-op for previous Debian
versions.
Signed-off-by: Antonio Terceiro <terceiro@debian.org> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Silvio Fricke [Fri, 14 Nov 2014 19:56:12 +0000 (20:56 +0100)]
lxc/utils: bugfix freed pointer return value
We allocate a pointer and save its address in a static variable. After
this we free the pointer and still return it.
Here is an excerpt from a valgrind report:
[...]
==11568== Invalid read of size 1
==11568== at 0x4C2D524: strlen (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11568== by 0x5961C9B: puts (in /usr/lib/libc-2.20.so)
==11568== by 0x400890: main (lxc_config.c:73)
==11568== Address 0x6933e21 is 1 bytes inside a block of size 32 free'd
==11568== at 0x4C2B200: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11568== by 0x4E654F2: lxc_global_config_value (utils.c:415)
==11568== by 0x4E92177: lxc_get_global_config_item (lxccontainer.c:2287)
==11568== by 0x400883: main (lxc_config.c:71)
[...]
Gu1 [Tue, 28 Oct 2014 01:14:28 +0000 (02:14 +0100)]
lxc-debian: Fix default mirrors
Fix a typo in the lines inserted in the default sources.list.
Change the default mirror to http.debian.net which is (supposedly) more
accurate and better than cdn.debian.net for a generic configuration.
Use security.debian.org directly for the {release}/updates repository.
KATOH Yasufumi [Wed, 5 Nov 2014 07:03:34 +0000 (16:03 +0900)]
Fix clone issues
This commit fixes two issues at the time of clone:
* an unnecessary directory is created when cloning between overlayfs/aufs
* clone fails when the rootfs path does not end in "/rootfs"
Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>