Daniel Lezcano [Thu, 13 Jan 2011 15:25:14 +0000 (16:25 +0100)]
substitute the absolute rootfs mount path
Change the mount points inside the rootfs, because the rootfs is
mounted in ROOTFSDIR for the pivot. The real mount path has to be
replaced with the corresponding path located in ROOTFSDIR.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
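A minimal sketch of the substitution described above, assuming the
rewrite is a plain path-prefix replacement. The function name and the
example paths are hypothetical and are not taken from the patch:

```python
import os.path


def substitute_rootfs_path(dest, rootfs, rootfs_mount):
    """Rewrite a mount destination under `rootfs` to live under `rootfs_mount`.

    Hypothetical helper: `rootfs` is the configured rootfs path and
    `rootfs_mount` stands in for ROOTFSDIR.
    """
    dest = os.path.normpath(dest)
    rootfs = os.path.normpath(rootfs)
    if dest == rootfs:
        return rootfs_mount
    prefix = rootfs + os.sep
    if dest.startswith(prefix):
        # Keep the part below the rootfs and re-anchor it in ROOTFSDIR.
        return os.path.join(rootfs_mount, dest[len(prefix):])
    # Not located under the rootfs: leave the path untouched.
    return dest
```

Destinations that merely share a textual prefix with the rootfs (for
example a sibling directory named rootfs2) are left alone, because the
comparison is done on whole path components.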
Daniel Lezcano [Sun, 9 Jan 2011 22:53:19 +0000 (23:53 +0100)]
fix the ns_cgroup vs clone_children
The following patch fixes the bug that occurs when the clone_children
compatibility flag is available together with the ns_cgroup subsystem.
The 2.6.37 kernel version should be the only one affected by this
modification; please refer to
Documentation/feature-removal-schedule.txt and look for ns_cgroup.
The problem is that we check for clone_children, set it automatically,
and then try to create a new cgroup. As the ns_cgroup is present, the
cgroup already exists and we are not allowed to attach our pid to a
new cgroup. The next error occurs when we try to create a new
container: because we enabled the clone_children flag while the
ns_cgroup is present, the kernel does not allow it.
The patch fixes this by checking the mount options.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
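A rough sketch of what checking the mount options could look like,
assuming the options are read from text in /proc/mounts format and that
the ns_cgroup shows up as the "ns" option of a cgroup mount. The
function names are hypothetical and this is not the code of the patch:

```python
def cgroup_mount_flags(mounts_text):
    """Report which of the two options appear on any cgroup mount.

    `mounts_text` is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    flags = {"ns": False, "clone_children": False}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "cgroup":
            continue
        options = fields[3].split(",")
        for name in flags:
            if name in options:
                flags[name] = True
    return flags


def must_create_cgroup_manually(mounts_text):
    # Assumption: when ns_cgroup is mounted the kernel creates the cgroup
    # itself, so clone_children must not be touched in that case.
    return not cgroup_mount_flags(mounts_text)["ns"]
```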
Daniel Lezcano [Fri, 17 Dec 2010 10:43:37 +0000 (11:43 +0100)]
use clone_children cgroup's flag
If the ns_cgroup does not exist, we use the clone_children feature.
Every time a cgroup is created, we set this compatibility flag, create
the cgroup manually, and add the child task to the cgroup.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
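The sequence above can be sketched as follows. This assumes a cgroup
v1 hierarchy in which the parent directory exposes a
cgroup.clone_children file and each cgroup has a tasks file; the
function name and arguments are hypothetical:

```python
import os


def create_cgroup(cgroup_root, name, pid):
    """Set clone_children, create the cgroup, then attach `pid` to it."""
    # 1. Set the compatibility flag in the parent cgroup.
    with open(os.path.join(cgroup_root, "cgroup.clone_children"), "w") as f:
        f.write("1")
    # 2. Create the cgroup manually.
    path = os.path.join(cgroup_root, name)
    os.mkdir(path)
    # 3. Add the child task to the new cgroup.
    with open(os.path.join(path, "tasks"), "w") as f:
        f.write(str(pid))
    return path
```

On a real cgroup mount the kernel provides these files; the sketch
only shows the order of the three steps.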
Michael Tokarev [Fri, 17 Dec 2010 10:43:36 +0000 (11:43 +0100)]
Make mount paths relative to rootfs
Why not chdir into the root of the container right when
the root filesystem is (bind-)mounted, and let all
mount entries be relative to the container root?
Even more, to warn if lxc.mount[.entry] contains
absolute path for the destination directory (or a
variation of this, absolute and does not start with
container root mount point)?
This way, all mounts will look much more sane, and
it will be much easier to move/clone containers -
by changing only lxc.rootfs.
I have done it this way locally from the beginning, by
chdir'ing to the proper directory (rootfs) before
running lxc-start (in a startup script), but this
is now broken in 0.7.3, which bind-mounts the rootfs
somewhere in /usr/lib/lxc.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
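One way to picture the proposal is the following sketch, assuming
relative destinations are joined to the container root and absolute
ones only produce a warning when they fall outside it. The function
name, the return convention and the example paths are hypothetical:

```python
import os.path


def resolve_mount_dest(dest, rootfs):
    """Return (resolved_destination, warning_or_None) for a mount entry."""
    rootfs = os.path.normpath(rootfs)
    if not os.path.isabs(dest):
        # Relative entries are interpreted from the container root.
        return os.path.normpath(os.path.join(rootfs, dest)), None
    dest = os.path.normpath(dest)
    if dest == rootfs or dest.startswith(rootfs + os.sep):
        return dest, None
    # Absolute and not below the container root: keep it, but warn.
    return dest, "absolute destination %s is outside rootfs %s" % (dest, rootfs)
```

With relative entries, moving or cloning a container only requires
changing the rootfs argument.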
Daniel Lezcano [Tue, 26 Oct 2010 15:42:38 +0000 (17:42 +0200)]
fix multiple console for a container
Don't close the socket when we ask for a console, otherwise the
console slot is freed and the next console reuses the same slot,
leading to erratic behavior.
Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
Daniel Lezcano [Tue, 26 Oct 2010 15:42:37 +0000 (17:42 +0200)]
don't play with the capabilities when we are root
We don't want to drop the capabilities when we are root because that
leads to some problems. For example, sudo lxc-start -n foo -o $(tty) fails with
"permission denied".
Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
Stefan Tomanek [Tue, 12 Oct 2010 08:52:47 +0000 (10:52 +0200)]
add lxc.network.script.up configuration hook
This commit adds a configuration option to specify a script to be
executed after creating and configuring the network used by the
container. The following arguments are passed to the script:
* container name
* config section name (net)
Additional arguments depend on the config section employing a
script hook; the following are used by the network system:
* execution context (up)
* network type (empty/veth/macvlan/phys)
Depending on the network type, other arguments may be passed:
veth/macvlan/phys:
* (host-sided) device name
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
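As an illustration of the argument order listed above, a small sketch
that builds the argument vector for the hook. The function name and
the example values are hypothetical; only the order of the arguments
comes from the commit message:

```python
def network_script_argv(script, container, net_type, ifname=None):
    """Build the argv for an lxc.network.script.up hook."""
    # container name, config section name, execution context, network type
    argv = [script, container, "net", "up", net_type]
    # veth/macvlan/phys additionally receive the host-sided device name.
    if net_type in ("veth", "macvlan", "phys") and ifname is not None:
        argv.append(ifname)
    return argv
```

For a veth device this yields something like
['/etc/lxc/up.sh', 'foo', 'net', 'up', 'veth', 'vethAbc'], where the
script path and the device name are made-up values.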
Daniel Lezcano [Sun, 3 Oct 2010 21:09:36 +0000 (23:09 +0200)]
add rootfs mount dir variable to pkg-config
When an image is used for the rootfs and extra mounts from the host
into the rootfs are needed, we have to specify the place where the
image is mounted. This value is configured by the user with
lxc.rootfs.mount and otherwise defaults to @LXCROOTFSMOUNT@. Let's
export this variable to pkg-config, so the user can use it to build
a correct path to the rootfs.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Daniel Lezcano [Sun, 3 Oct 2010 21:09:36 +0000 (23:09 +0200)]
Don't display an error in lxc_file_for_each_line
Don't display an error when the callback returns a value different
from zero. A value greater than zero may mean "stop". Let the caller
check the error.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
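The convention can be sketched like this, with a Python stand-in for
the C helper. The name file_for_each_line and the use of a text
stream are assumptions made for the example:

```python
def file_for_each_line(stream, callback):
    """Call `callback` on each line; stop at the first non-zero return.

    The return value is passed back unchanged and nothing is logged
    here: a value > 0 may simply mean "stop", so only the caller can
    tell whether it is an error.
    """
    for line in stream:
        ret = callback(line.rstrip("\n"))
        if ret != 0:
            return ret
    return 0
```

A caller searching for a line can return 1 from the callback once the
line is found and treat only negative values as failures.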
Daniel Lezcano [Mon, 13 Sep 2010 13:36:20 +0000 (15:36 +0200)]
configure container architecture
When a container is installed with 32-bit binaries while we are
running on a 64-bit host, the architecture seen inside the container
is 64-bit. That leads to problems for package updates, because
the scripts will download 64-bit packages instead of 32-bit ones.
This patch defines a configuration variable to set the architecture
of the container.
lxc.arch = i686 | x86 | x86_64 | amd64
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
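A sketch of how the four accepted values could be interpreted,
assuming the architecture is applied through the Linux personality(2)
mechanism; the commit message itself does not say how the value is
used. The PER_LINUX and PER_LINUX32 constants are the kernel's values;
the table and function names are hypothetical:

```python
PER_LINUX = 0x0000    # native personality
PER_LINUX32 = 0x0008  # 32-bit personality (uname reports a 32-bit machine)

# Assumed mapping from lxc.arch values to a personality.
ARCH_TO_PERSONALITY = {
    "i686": PER_LINUX32,
    "x86": PER_LINUX32,
    "x86_64": PER_LINUX,
    "amd64": PER_LINUX,
}


def personality_for_arch(arch):
    """Translate an lxc.arch value; reject anything not listed above."""
    try:
        return ARCH_TO_PERSONALITY[arch]
    except KeyError:
        raise ValueError("unsupported lxc.arch value: %r" % (arch,))
```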
Daniel Lezcano [Fri, 23 Jul 2010 13:10:38 +0000 (15:10 +0200)]
Fix bad returned value
In case of error, the message will always be truncated.
We check whether the message was truncated using the total size
received, which means the kernel has more info to give.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Daniel Lezcano [Tue, 20 Jul 2010 11:45:44 +0000 (13:45 +0200)]
remove/restore effective capabilities
This patch adds the functions to drop the 'effective' capabilities and
restore them from the 'permitted' capabilities.
When the command is run as 'root' we do nothing.
When the command is run as 'lambda' user, we drop the effective capabilities.
When the command is run as 'root' but real uid is not root, we keep the capabilities,
switch to real uid, and drop the effective capabilities.
This approach is compatible for root user, lambda + file capabilities
and lambda + setuid.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
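The three cases can be summarized by the following decision sketch. It
only returns the names of the steps and does not call any capability
API; the function and step names are hypothetical:

```python
def capability_plan(real_uid, effective_uid):
    """Return the ordered steps for the three cases described above."""
    if real_uid == 0:
        # Run as root: do nothing.
        return []
    if effective_uid == 0:
        # setuid root binary started by an ordinary user.
        return ["keep_capabilities", "switch_to_real_uid", "drop_effective"]
    # Ordinary ('lambda') user, possibly with file capabilities.
    return ["drop_effective"]
```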
Daniel Lezcano [Tue, 13 Jul 2010 12:51:45 +0000 (14:51 +0200)]
lxc-init finishes the remaining processes with SIGKILL
If lxc-init receives a SIGALRM (a timeout), it kills all the processes
of the container with SIGKILL. That prevents the container from getting
stuck when a process ignores the SIGTERM signal.
Each time a process exits, the timeout is reset.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Daniel Lezcano [Tue, 13 Jul 2010 12:51:45 +0000 (14:51 +0200)]
lxc-init kills all processes with SIGTERM
When lxc-init receives a SIGTERM, let's kill all the processes of
the pid namespace with kill -1, so that the container exits
gracefully through a cascade of process deaths.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
As pointed out by Dan Smith, when a container is being stopped, it must
also be unfrozen after posting the SIGKILL. Otherwise, if the container
is frozen when the SIGKILL is posted, the SIGKILL will remain pending
and the lxc-stop command will block until lxc-unfreeze is explicitly
called.
(lxc-stop waits for the container to exit and close the socket, but since
the container is frozen, lxc-stop will block.)
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Acked-by: Dan Smith <danms@us.ibm.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
A write to the freezer.state file does not guarantee that the state has
changed. To ensure that the freezer state is either FROZEN or THAWED,
read the freezer state and if it has not changed, repeat the write.
Changelog[v2]:
- Minor reorg of code
- Comments from Daniel Lezcano:
- lseek() before each read/write of freezer.state
- Have lxc_freeze_unfreeze() return -1 on error
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
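A minimal sketch of the write-then-verify loop described above,
including the lseek() before each read and write and the -1 return on
error. The function name and the max_tries bound are assumptions added
for the example; the commit message does not mention a retry limit:

```python
import os


def set_freezer_state(path, state, max_tries=10):
    """Write `state` to freezer.state until a read confirms it.

    Returns 0 on success and -1 on error or when `max_tries`
    (a hypothetical bound) is exhausted.
    """
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError:
        return -1
    try:
        for _ in range(max_tries):
            os.lseek(fd, 0, os.SEEK_SET)   # rewind before the write
            os.write(fd, state.encode())
            os.lseek(fd, 0, os.SEEK_SET)   # rewind before the read
            current = os.read(fd, 64).decode().strip()
            if current == state:
                return 0
        return -1
    except OSError:
        return -1
    finally:
        os.close(fd)
```

A caller would pass the path of the container's freezer.state file
together with "FROZEN" or "THAWED".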
Daniel Lezcano [Tue, 6 Jul 2010 19:26:31 +0000 (21:26 +0200)]
close prctl window
If the pdeath signal is set after the synchro, there is a window in
which the parent can exit while the pdeath signal is not yet set.
To avoid that, we have to move the prctl before the synchro with
the parent, so that if the parent exits before we can set the pdeath
signal, the synchro fails in any case and the container startup is aborted.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
This bug stalked me for a while, but only now it bit me quite
badly... (Lost about an hour of work...)
So the culprit: inside the fstab file for the `lxc.mount` option I
can use options like `ro` together with `bind`. Unfortunately the
kernel just laughs in my face and ignores any options I've put in
there... :) But not any more: I've updated `./src/lxc/conf.c`
(`mount_file_entries` function) so that when it encounters a `bind`
option it executes the mount twice (once without any extra options,
and a second time with the remount flag set).
I've tested it marginally (only in my particular case) and it works.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
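The two-pass idea can be sketched as a function that computes the
flags for each mount call. The MS_* values are the Linux mount(2)
constants; the function name is hypothetical and the sketch does not
call mount itself:

```python
MS_RDONLY = 1
MS_REMOUNT = 32
MS_BIND = 4096


def bind_mount_passes(flags):
    """Return the mount flags to use for each mount call, in order."""
    if not flags & MS_BIND:
        # Not a bind mount: a single call with the flags as given.
        return [flags]
    # First pass: plain bind without any extra options.
    # Second pass: same flags plus MS_REMOUNT so that options such as
    # `ro` are actually applied.
    return [MS_BIND, flags | MS_REMOUNT]
```

For an fstab entry with the options `bind,ro`, the first call uses
MS_BIND alone and the second uses MS_BIND | MS_RDONLY | MS_REMOUNT.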
Andrew Phillips [Mon, 14 Jun 2010 09:34:50 +0000 (11:34 +0200)]
support shutdown/reboot with upstart within a system container
Improve resiliency of utmp.c to removal of /var/run/utmp.
Add a shutdown timer, started as we transition from running to shutdown,
to check the number of tasks remaining. Improve container state handling.
We can't rely on the previous runlevel being maintained properly.
Signed-off-by: Andrew Phillips <Andrew.Phillips@lmax.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Ferenc Wagner [Fri, 11 Jun 2010 13:56:25 +0000 (15:56 +0200)]
change pivotdir default to mnt
The mnt directory has a good chance to already exist in the new root
filesystem, so creation and removal can be avoided. This also eases
use of read only root filesystems (no configuration necessary).
Signed-off-by: Ferenc Wagner <wferi@niif.hu>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Daniel Lezcano [Mon, 7 Jun 2010 11:25:30 +0000 (13:25 +0200)]
fix ipv6 acast / mcast restriction
The pointer comparison is buggy because the pointers are never null.
For an ipv6 address configuration, we always zero the structure,
hence the bcast and acast structures are equal to in6addr_any.
Any change of this value means the user specified something different
in the configuration file, so we fail gracefully.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
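The value comparison described above can be sketched as follows, with
Python's ipaddress module standing in for the C structures. The
function name and the -1/0 return convention are assumptions made for
the example:

```python
import ipaddress

# Equivalent of in6addr_any: the all-zero IPv6 address.
IN6ADDR_ANY = ipaddress.IPv6Address("::")


def check_inet6_restrictions(bcast=IN6ADDR_ANY, acast=IN6ADDR_ANY):
    """Return -1 if the user set bcast or acast, 0 otherwise.

    The fields are compared by value against in6addr_any; testing the
    pointers themselves would never detect anything.
    """
    if bcast != IN6ADDR_ANY or acast != IN6ADDR_ANY:
        return -1
    return 0
```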