When setting lxc.network.veth.pair to get a fixed interface
name the recreation of it after a reboot caused an EEXIST.
-) The reboot flag is now a three-state value. It's set to
1 to request a reboot, and 2 during a reboot until after
lxc_spawn where it is reset to 0.
-) If the reboot is set (!= 0) within instantiate_veth and
a fixed name is used, the interface is now deleted before
being recreated.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Serge Hallyn [Thu, 11 Jun 2015 04:08:15 +0000 (23:08 -0500)]
daemonized start: exit children on failure, don't return
When starting a daemonized container, only the original parent
thread should return to the caller. The first forked child
immediately exits after forking, but the grandparent child
was in some places returning on error - causing a second instance
of the calling function.
Tycho Andersen [Wed, 10 Jun 2015 21:57:50 +0000 (21:57 +0000)]
uniformly nullify std fds
In various places throughout the code, we want to "nullify" the std fds,
opening them to /dev/null or zero or so. Instead, let's unify this code and do
it in such a way that Coverity (probably) won't complain.
v2: use /dev/null for stdin as well
v3: add a comment about use of C's short circuiting
v4: axe comment, check errors on dup2, s/quiet/need_null_stdfds
Doing this requires some btrfs functions from bdev to be used in
utils.c Because utils.h is imported by lxc_init.c, I had to create
a new initutils.[ch] which are used by both lxc_init.c and utils.c
We could instead put the btrfs functions into utils.c, which would
be a shorter patch, but it really doesn't belong there. So I went
the other way figuring there may be more such cases coming up of
fns in utils.c needing code from bdev.c which can't go into lxc_init.
Currently, if we detect a btrfs subvolume we just remove it. The
st_dev on that dir is different, so we cannot detect if this is
bound in from another fs easily. If we care, we should check
whether this is a mountpoint, this patch doesn't do that.
Kien Truong [Mon, 6 Apr 2015 16:20:43 +0000 (17:20 +0100)]
Properly free memory of sorted cgroup settings
We need to use lxc_list_for_each_safe, otherwise de-allocation
will fail with a list size bigger than 2. The pointer to the head
of the list also need freeing after we've freed all other elements
of the list.
Signed-off-by: Kien Truong <duckientruong@gmail.com>
Kien Truong [Sun, 5 Apr 2015 23:46:22 +0000 (23:46 +0000)]
Sort the cgroup memory settings before applying.
Add a function to sort the cgroup settings before applying.
Currently, the function will put memory.memsw.limit_in_bytes after
memory.limit_in_bytes setting so the container will start
regardless of the order specified in the input. Fix #453
Signed-off-by: Kien Truong <duckientruong@gmail.com>
Fix incomplete destruction of unprivileged ephemeral containers
If an unprivileged ephemeral container is started as follows,
lxc-start-ephemeral -o trusty -n test_ephemeral
Then an empty directory remains upon exit from the container,
~/.local/share/lxc/test_ephemeral/tmpfs/delta0
(The tmpfs filesystem is successfully unmounted, but we seem to lack
permission to delete the delta0 directory).
This issue arose following commits 4799a1e and dd2271e .
The following patch resolves the issue. It has been tested on ubuntu
14.04 with the lxc-daily ppa.
Since gmail screws up the formatting of the patch via line-wrapping
etc, please copy the patch from the issue-tracker rather than from
this email.
Serge Hallyn [Mon, 16 Mar 2015 17:02:12 +0000 (17:02 +0000)]
lxc-destroy: actually work if underlying fs is overlayfs
One of the 'features' of overlayfs is that depending on whether a file
is on the upper or lower dir you get back a different device from stat.
That breaks our lxc_rmdir_onedev.
So at lxc_rmdir_ondev check the device of the directory being deleted.
If it is overlayfs, then skip the device check.
Note this is unrelated to overlayfs snapshots - in those cases when you
delete a container, /var/lib/lxc/$container/ does not actually have an
overlayfs under it. Rather, to reproduce this you would
Tomas Pospisek [Sun, 25 Jan 2015 15:27:10 +0000 (16:27 +0100)]
improve "lxc-create -t debian -h" help text
- document environment variables
- add missing --packages switch to command line
- describe how to pass template options to lxc-create (since
lxc-create -h doesn't tell you)
- render help text in the same pretty format as lxc-create does
Signed-off-by: Tomáš Posíšek <tpo_deb@sourcepole.ch> Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Markus Elfring [Sat, 24 Jan 2015 19:38:49 +0000 (20:38 +0100)]
Bug #158: Deletion of unnecessary checks before a few calls of LXC functions
The following functions return immediately if a null pointer was passed.
* container_destroy
* lxc_cgroup_process_info_free_and_remove
* lxc_cgroup_put_meta
* toss_list
It is therefore not needed that a function caller repeats a corresponding check.
This issue was fixed by using the software Coccinelle 1.0.0-rc23.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Markus Elfring [Sat, 24 Jan 2015 18:55:36 +0000 (19:55 +0100)]
Bug #158: Deletion of unnecessary checks before calls of the function "free"
The function "free" is documented in the way that no action shall occur for
a passed null pointer. It is therefore not needed that a function caller
repeats a corresponding check.
http://stackoverflow.com/questions/18775608/free-a-null-pointer-anyway-or-check-first
This issue was fixed by using the software Coccinelle 1.0.0-rc23.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Serge Hallyn [Mon, 19 Jan 2015 05:06:55 +0000 (05:06 +0000)]
yet another problem with new overlay fs
It turns out that the new upstream overlay fs requires that the delta
and work dirs be under the same mount. So create a $lxcpath/tmpfs
and create delta0 and work0 under that. If the user asks for a
tmpfs that'll be mounted under $lxcpath/tmpfs and workdir and delta0
both created under that.
This isn't heavily tested. But if fixes mounting of 'overlay' fs
for me.
It's "not backward compatible", since it moves delta0, but that
shouldn't matter since ephemeral containers are either destroyed
on exit, or re-started with lxc-start.
Serge Hallyn [Tue, 13 Jan 2015 00:08:37 +0000 (00:08 +0000)]
lxc-start-ephemeral: handle the overlayfs workdir option (v2)
We fixed this some time ago for basic lxc-start, but never did
lxc-start-ephemeral.
Since the lxc-start patches were pushed, Miklos has given us a
way to detect whether we need the workdir= option. So the
bdev.c code could be simplified to check for "overlay\n" in
/proc/filesystems just as lxc-start-ephemeral does. This
patch doesn't do that.
Changelog (v2):
1. use 'overlay' fstype for new overlay upstream module
2. avoid using unneeded readlines().
David Ward [Tue, 23 Jun 2015 14:57:37 +0000 (10:57 -0400)]
Allow autodev without a rootfs
A container without a rootfs is useful for running a collection of
processes in separate namespaces (to provide separate networking as
an example), while sharing the host filesystem (except for specific
paths that are re-mounted as needed). For multiple processes to run
automatically when such a container is started, it can be launched
using lxc-start, and a separate instance of systemd can manage just
the processes inside the container. (This assumes that the path to
the systemd unit files is re-mounted and only contains the services
that should run inside the container.) For this use case, autodev
should be permitted for a container that does not have a rootfs.
Signed-off-by: David Ward <david.ward@ll.mit.edu> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
David Ward [Tue, 23 Jun 2015 14:57:33 +0000 (10:57 -0400)]
Only mount /proc if needed, even without a rootfs
Use the same code with and without a rootfs to check if mounting
/proc is necessary before doing so. If mounting it is unsuccessful
and there is no rootfs, continue as before.
Signed-off-by: David Ward <david.ward@ll.mit.edu> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Robert Schiele [Fri, 21 Aug 2015 05:35:34 +0000 (07:35 +0200)]
check for NULL pointers before calling setenv()
Latest glibc release actually honours calling setenv with a NULL
pointer by causing SIGSEGV but checking pointers before submitting
to any system function is a good idea anyway.
Signed-off-by: Robert Schiele <rschiele@gmail.com>
reuse label cleanup since free(NULL) is a no-op Signed-off-by: Arjun Sreedharan <arjun024@gmail.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Robert LeBlanc [Thu, 13 Aug 2015 19:36:55 +0000 (13:36 -0600)]
Caps are getting lost when cloning an LXC. Adding the -X parameter copies the extended attributes. This allows things like ping to continue to be used by a non-privilged user in Debian at least.
Jiri Slaby [Wed, 5 Aug 2015 08:32:54 +0000 (10:32 +0200)]
templates: lxc-opensuse, use rpm to determine build version
zypper info's output is not usable for several reasons:
* it is localized -- there is no "Version: " in my output
* it shows results both from the repo and local system
So use plain rpm to determine whether build is installed and if proper
version is in place.
1) Two checks on amd64 for whether compat_ctx has already
been generated were redundant, as compat_ctx is generally
generated before entering the parsing loop.
2) With introduction of reject_force_umount the check for
whether the syscall has the same id on both native and
compat archs results in false behavior as this is an
internal keyword and thus produces a -1 on
seccomp_syscall_resolve_name_arch().
The result was that it was added to the native architecture
twice and never to the 32 bit architecture, causing it to
have no effect on 32 bit containers on 64 bit hosts.
3) I do not see a reason to care about whether the syscalls
have the same number on the two architectures. On the one
hand this check was there to avoid adding it to two archs
(and effectively leaving one arch unprotected), while on
the other hand it seemed to be okay to add it to the
same arch *twice*.
The entire architecture checking branches are now reduced to
three simple cases: 'native', 'non-native' and 'all'. With
'all' adding to both architectures regardless of the syscall
ID.
Also note that libseccomp had a bug in its architecture
checking, so architecture related filters weren't working as
expected before version 2.2.2, which may have contributed to
the confusion in the original architecture-related code.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
The Fedora 22 squashfs doesn't appear to work, the Fedora 21 isn't
available, so lets use the fedora archive mirror and pull the good old
Fedora 20 squashfs.
Loop devices can be added on the fly when needed, they're
not always created beforehand. The loop-control device can
be used to find and allocate the next available number
instead of going through the /dev directory contents (which
is now only a fallback mechanism).
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
KATOH Yasufumi [Thu, 25 Jun 2015 09:14:04 +0000 (18:14 +0900)]
Support unprivileged ephemeral container using aufs
As the commit 31a882e, an unprivileged container can use aufs.
This patch removes the check for unpriv aufs, and change the path of
xino file as an unprivileged user can mount aufs.
Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Lenz Grimmer [Fri, 12 Jun 2015 23:08:41 +0000 (01:08 +0200)]
use `hostname` for DHCP_HOSTNAME in ifcfg-eth0
Updated centos/fedora/oracle templates to use `hostname` for DHCP_HOSTNAME in
/etc/sysconfig/network/ifcfg-eth0, so the container's host name is propagated
to the host's DHCP server (e.g. dnsmasq, which also acts as the DNS server).
This resolves lxc/lxd#756
Daniel Golle [Tue, 9 Jun 2015 10:58:12 +0000 (12:58 +0200)]
fix build on mpc85xx
Initialize ret to 0 so compiler no longer complains about
monitor.c: In function 'lxc_monitor_open':
monitor.c:212:5: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]
https://github.com/openwrt/packages/issues/1356
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Serge Hallyn [Wed, 27 May 2015 10:05:16 +0000 (10:05 +0000)]
cgmanager: attach: never use 'all' controller
We were using 'all' controller if current was in all the
same cgroup. That doesn't suffice. We'd have to check
the target. At that point we may as well just attach
controller by controller.
An optimization to consider is to check the /proc/initpid/cgroup
for all identical controllers. Let's start by just getting it
right.
Dwight Schauer [Tue, 2 Jun 2015 04:41:09 +0000 (23:41 -0500)]
The yum in Centos 5.11 does not know about '--releasever', which is used by: lxc-create ... -- release=VERSION
The release version only needs to be set in the outer bootstrap, not the inner one.
With this change an lxc-create bootstrap of CentOS 5.11 completes enough to be usable.
CentOS 5.11 containers can be created, started, stopped, and networking works. Signed-off-by: Dwight Schauer <das@teegra.net>
Serge Hallyn [Fri, 1 May 2015 21:11:28 +0000 (21:11 +0000)]
Use 'cgm listcontrollers' list rather than /proc/self/cgroups
to populate the list of subsystems to use.
Cgmanager can be started with some subsystems disabled (i.e.
cgmanager -M cpuset). If lxc using cgmanager then uses the
/proc/self/cgroup output to determine which controllers to use,
it will fail when trying to do things to cpuset. Instead, ask
cgmanager which controllers to use.
This still defers (per patch 1/1) to the lxc.cgroup.use values.
Use POSIX-compliant function names in bash completion
When running in posix mode (for example, because it was invoked as `sh`,
or with the --posix option), bash rejects the function names previously
used because they contain hyphens, which are not legal POSIX names, and
exits immediately.
This is a particularly serious problem on a system in which the
following three conditions hold:
1. The `sh` executable is provided by bash, e. g. via a symlink
2. Gnome Display Manager is used to launch X sessions
3. Bash completion is loaded in the (system or user) profile file
instead of in the bashrc file
In that case, GDM's Xsession script (run with `sh`, i. e., bash in posix
mode) sources the profile files, thus causing the shell to load the bash
completion files. Upon encountering the non-POSIX-compliant function
names, bash would then exit, immediately ending the X session.
Fixes #521.
Signed-off-by: Lucas Werkmeister <mail@lucaswerkmeister.de>
Cyril Bitterich [Sat, 9 May 2015 19:57:14 +0000 (21:57 +0200)]
lxc-debian.in: Fixed errors if dbus is not installed
The lxc-debian template debootstraps a minimum debian system which does not contain dbus.
If systemd is used this will result in getty-static.service to be used instead of getty@ .
The systemd default files uses 6 tty's instead of the 4 the script creates.
This will lead to repeated error messages in the systemd journal.
Martin Pitt [Thu, 7 May 2015 11:38:50 +0000 (13:38 +0200)]
Call /lib/apparmor/profile-load directly instead of the wrapper
AppArmor ships /lib/apparmor/profile-load. /lib/init/apparmor-profile-load is
merely a wrapper which calls the former, so just call it directly to avoid the
dependency on the wrapper.
Make lxc-checkconfig work with kernel versions > 3
(1) Add test for kernel version greater 3.
(2) Use && and || instead of -a and -o as suggested in
http://www.unix.com/man-page/posix/1p/test/.
lxc-checkconfig will currently report "missing" on "Cgroup memory controller"
for kernel versions greater 3. This happens because the script, before checking
for the corresponding memory variable in the kernel config, currently will test
whether we have a major kernel version greater- or equal to 3 and a minor kernel
version greater- or equal to 6. This adds an additional test whether we have a
major kernel version greater than 3.
Signed-off-by: Christian Brauner <christianvanbrauner@gmail.com>
Particularly when using the go-lxc api with lots of threads, it
happens that if the open files limit is > 1024, we will try to
select on fd > 1024 which breaks on glibc.
In the past, lxc-cmd-stop would wait until the command pipe was closed
before returning, ensuring that the container monitor had exited.
Now that we accept the actual success return value, lxcapi_stop can
return success before the monitor has fully exited.
So explicitly wait for the container to stop, when lxc-cmd-stop returned
success.
1. When we stop a container from the lxc_cmd stop handler, we kill its
init task, then we unfreeze the container to make sure it receives the
signal. When that unfreeze succeeds, we were immediately returning 0,
without sending a response to the invoker.
2. lxc_cmd returns the length of the field received. In the case of
an lxc_cmd_stop this is 16. But a comment claims we expect no response,
only a 0. In fact the handler does send a response, which may or may
not include an error. So don't call an error just because we got back a
response.