setup_netdev: re-read ifindex in LXC_NET_PHYS case
When moving an interface from the host netns to a container's,
the ifindex might not remain the same. This happens when the
index of the host interface is already assigned to another interface
in the new netns.
For veth/vlan/macvlan, virtual interfaces are first created on the host,
and then moved into the container. Since they are created after all other
interfaces are discovered, there is no chance for their assigned ifindex
to be already present in a freshly created netns, because it's a greater
number.
However, when moving a physical interface, there is a chance that its
ifindex in the host netns is not free in the new netns. The patch
forces ifindex re-read for the LXC_NET_PHYS case to update the
lxc_netdev structure.
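A minimal sketch of the re-read step, assuming it runs inside the container's netns after the move. The struct and function names here are hypothetical stand-ins, not the actual lxc_netdev code; only if_nametoindex() is a real libc call.

```c
#include <assert.h>
#include <net/if.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields of lxc_netdev that matter here. */
struct netdev_sketch {
    char name[IF_NAMESIZE];
    int ifindex;
};

/*
 * Re-read the index of dev->name in the *current* network namespace.
 * Returns 0 and updates dev->ifindex on success, or -1 (ifindex left
 * untouched) if no interface with that name exists here.
 */
static int netdev_refresh_ifindex(struct netdev_sketch *dev)
{
    unsigned int idx;

    assert(dev != NULL);
    idx = if_nametoindex(dev->name);
    if (idx == 0)
        return -1; /* errno was set by if_nametoindex() */
    dev->ifindex = (int)idx;
    return 0;
}
```

The lookup is by name because the name survives the move while the index may not.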
pc-wurm [Fri, 8 Nov 2013 11:45:51 +0000 (12:45 +0100)]
Update lxc_create.c: correct argument usage example for -t
I think '-t timeout' was mistakenly written, so I corrected it to '-t
template', since the -t argument is used for setting templates, not
timeout as far as I know.
Dwight Engen [Tue, 12 Nov 2013 19:04:45 +0000 (14:04 -0500)]
fix multithreaded create()
We were calling save_config() twice within the create() flow, each
from a different process. Depending on order of scheduling, sometimes
the data from the first save_config() (which was just the stuff from
LXC_DEFAULT_CONFIG) would overwrite the config we wanted (the full
config), causing a truncated config file which would then cause lxc
to segfault once it read it back in because no rootfs.path was set.
This fixes it by only calling save_config() once in the create()
flow. A rejected alternative was to call fsync(fileno(fout)) before
the fclose in save_config.
Serge Hallyn [Mon, 11 Nov 2013 18:32:14 +0000 (12:32 -0600)]
lxc_abstract_unix_connect: accommodate containers started before Oct 28
commit aae93dd3dd20dd12c6b8f9f0490e2fb877ee3f09 fixed the command socket
name to use the right pathlen instead of always passing in the max
socket namelen. However, this breaks lxc-info/lxc-list/etc for
containers started before that commit. So if the correct command
sock name doesn't work, try the preexisting one.
Note we can probably undo this "after awhile". Maybe in August 2014.
Serge Hallyn [Fri, 25 Oct 2013 23:03:57 +0000 (18:03 -0500)]
lxc-user-nic: rename nic inside container to desired name
To do so we do a quick setns into the container's netns. This
(unexpectedly) turns out cleaner than trying to rename it from
lxc_setup(), because we don't know the original nic name in
the container until we created it which we do in the parent
after the init has been cloned.
Serge Hallyn [Fri, 1 Nov 2013 20:27:49 +0000 (15:27 -0500)]
create_run_template: tell the template what caller's uid was mapped to
conf.c/conf.h: replace bool hostid_is_mapped() with int mapped_hostid(),
which returns the mapped uid for the caller's uid on the host, or -1 if
none
create_run_template: pass caller's uid into template.
lxc-ubuntu-cloud:
1. accept --mapped-uid argument
2. don't write to devices cgroup - not allowed.
3. if running in userns, use $HOME/.cache
4. chown cached files to the uid to which our caller was
mapped
5. ignore /dev when extracting rootfs in a userns
Dwight Engen [Tue, 5 Nov 2013 18:17:02 +0000 (13:17 -0500)]
fix leak in list_active_containers()
Found by running the lxc-test-list test with valgrind. The names were
put into a local array, and never freed in the success case where the
caller didn't want the names returned and in the early out failure case.
Note we don't need to check the return from remove_from_array() because
we just successfully added the name above.
Dwight Engen [Mon, 4 Nov 2013 22:35:15 +0000 (17:35 -0500)]
allow lxcapi_get_cgroup_item() on lxc-execute containers
Containers started with lxc-execute may not have a conf, but
nothing in the implementation of lxcapi_get_cgroup_item()
actually needs/uses it, and it can be useful to get items out
of the containers' cgroup items.
Dwight Engen [Thu, 31 Oct 2013 20:38:30 +0000 (16:38 -0400)]
lua: fix stats collection using get_cgroup_item
Previously, the lua stats collection was building its own paths to the
cgroup files, which could be wrong depending on what --with-cgroup-pattern
was passed to configure. Fix it to use the get_cgroup_item api so it
always finds the files.
Remove cgroup_path_get since it is not used anymore.
S.Çağlar Onur [Fri, 1 Nov 2013 20:16:10 +0000 (16:16 -0400)]
valgrind drd tool shows conflicting stores happening at lxc_global_config_value@src/lxc/utils.c (v2)
Conflict occurs between following lines
[...]
269 if (values[i])
270 return values[i];
[...]
and
[...]
309 /* could not find value, use default */
310 values[i] = (*ptr)[1];
[...]
Fix it by using a lock dedicated to that problem, as Serge suggested.
Also introduce a new autoconf parameter (--enable-mutex-debugging) to convert mutexes to error reporting type and to provide a stacktrace when locking fails.
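The dedicated-lock idea can be sketched as follows. This is an illustration under assumptions, not the code from utils.c: the cache layout, the default strings and the function name are all hypothetical. The point is that the read of a slot and the lazy store of its default happen under the same mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NVALUES 2

/* Hypothetical cache: one slot per option, filled lazily with a default. */
static const char *cached_values[NVALUES];
static const char *const default_values[NVALUES] = { "default-a", "default-b" };

/* Lock used only for the cache, so the check and the store cannot race. */
static pthread_mutex_t values_lock = PTHREAD_MUTEX_INITIALIZER;

static const char *config_value_sketch(size_t i)
{
    const char *v;

    assert(i < NVALUES);
    pthread_mutex_lock(&values_lock);
    if (!cached_values[i])
        cached_values[i] = default_values[i]; /* could not find value, use default */
    v = cached_values[i];
    pthread_mutex_unlock(&values_lock);
    return v;
}
```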
Serge Hallyn [Fri, 1 Nov 2013 17:17:52 +0000 (12:17 -0500)]
always remount / rslave before running creation template (if root)
If we're not root, our mounts in private userns won't get pushed
back anyway. If we are root, we need to make sure that anything
the template does gets cleaned up.
Dwight Engen [Tue, 29 Oct 2013 20:46:16 +0000 (16:46 -0400)]
fix cgpath test
Commit 1ea59ad28 sets memory.use_hierarchy, which means that this test
cannot use memory.swappiness as its dummy cgroup item to set/unset since
writing to it with use_hierarchy set gets -EINVAL. Change test to use
memory.soft_limit_in_bytes instead.
Dwight Engen [Tue, 29 Oct 2013 18:38:00 +0000 (14:38 -0400)]
fix free() of args to startl
Coverity 1076328 marked this as "Use after free", which it isn't really;
it's actually just free()ing the wrong 2nd, 3rd, etc... pointers. Test by
passing two or more args to startl; without this change you get a segfault
when free()ing the second pointer/arg.
Serge Hallyn [Tue, 29 Oct 2013 17:48:46 +0000 (12:48 -0500)]
rpm spec: fix version numbering when building alpha, beta, rc
We want to ensure smooth upgrades when doing rpm -U throughout the
release cycle so this change implements the scheme documented at:
http://fedoraproject.org/wiki/Packaging%3aNamingGuidelines#NonNumericRelease
Dwight Engen [Tue, 29 Oct 2013 13:24:29 +0000 (09:24 -0400)]
coverity: ifr_name buffer not NULL terminated
The kernel (net/core/dev_ioctl.c:dev_ioctl()) is going to NULL terminate
this name after the copy-in of the ifr, so even though this is a fixed
sized array the last byte isn't usable as part of the name. All the ioctls
we're using go through this code path.
Use the ifr name in the DEBUG message in case it was possibly truncated.
Changes since v1:
* check the length of passed-in string
Changes since v2:
* remove non-abstract socket code path to simplify functions
* rename lxc_af_unix_* family to lxc_abstract_unix_*
On this system list_active_containers returns 14 containers while only 10 containers are running.
The following patch:
* Introduces array_contains function to do a binary search on given array,
* Starts to sort arrays inside the add_to_clist and add_to_names functions,
* Consumes array_contains in list_active_containers to eliminate duplicates,
* Replaces the linear search code in lxcapi_get_interfaces with the new function.
Changes since v1:
* Do not load containers if a container list is not passed in
* Fix possible memory leaks in lxcapi_get_ips and lxcapi_get_interfaces if realloc fails
Serge Hallyn [Wed, 23 Oct 2013 15:52:37 +0000 (10:52 -0500)]
start: use lxc-user-nic if we are not root
Note this results in nics named things like 'lxcuser-0p'. We'll
likely want to pass the requested name to lxc-user-nic, but let's
do that in a separate patch.
If we're not root, we can't create new network interfaces to pass
into the container. Instead wait until the container is started,
and call lxc-user-nic to create and assign the nics.
Serge Hallyn [Thu, 24 Oct 2013 16:35:55 +0000 (11:35 -0500)]
strtoul: check errno
In a few places we checked for LONG_MIN or LONG_MAX as indication
that strtoul failed. That's not reliable. As suggested in the
manpage, switch to checking errno value.
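For illustration, the errno-based check might look like the sketch below. The wrapper name parse_ulong is hypothetical, and the extra end-pointer checks go slightly beyond what the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a base-10 unsigned long from s.
 * Returns 0 and stores the value in *out on success, -1 on failure.
 */
static int parse_ulong(const char *s, unsigned long *out)
{
    char *end;
    unsigned long v;

    assert(s != NULL && out != NULL);
    errno = 0; /* must be cleared first: strtoul() only sets it on error */
    v = strtoul(s, &end, 10);
    if (errno != 0)
        return -1; /* e.g. ERANGE on overflow */
    if (end == s || *end != '\0')
        return -1; /* no digits at all, or trailing garbage */
    *out = v;
    return 0;
}
```

Comparing the return value against a limit cannot distinguish an overflow from a legitimate input equal to that limit, which is why errno is the signal used here.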
Stéphane Graber [Thu, 24 Oct 2013 01:50:43 +0000 (21:50 -0400)]
clang: Remaining changes
Those are a bit less obvious than those I pushed directly to master.
All those changes were required to build LXC under clang here.
With this, gcc can be replaced by clang to build LXC so long as you're
not using the python3 binding (as python extensions can't be built under
clang at the moment).
For reference, the clang output for those is: http://paste.ubuntu.com/6292460/
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Serge Hallyn [Thu, 24 Oct 2013 01:54:13 +0000 (20:54 -0500)]
apparmor: cache the are-we-enabled decision
Since we check /sys/kernel/security/ files when deciding whether
apparmor is enabled, and that might not be mounted in the container,
we cannot re-make the decision at apparmor_process_label_set() time.
Luckily we don't have to - just cache the decision made at
lsm_apparmor_drv_init().
Dwight Engen [Wed, 23 Oct 2013 21:03:40 +0000 (17:03 -0400)]
oracle template: restrict writeability in /proc and /sys
Note that since we don't drop CAP_SYS_ADMIN, root in the container can
remount proc or sys however they want to, however this at least improves
the default situation.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Dwight Engen [Tue, 22 Oct 2013 20:33:26 +0000 (16:33 -0400)]
update rpm .spec file
The following changes were made to fix rpmlint warnings/errors
- use %global instead of %define
http://fedoraproject.org/wiki/PackagingDrafts/global_preferred_over_define
- change Summary to match .deb
- update License
- do not mention the libcap dependency explicitly, rpm will fill it in
- fix Summary, Description for libs and devel packages
- pass -q to %setup
- add %post for libs to run ldconfig
- explicitly name lxc man paths so pkg doesn't "own" /usr/share/man
- mark /etc/lxc/default.conf as a config file
In addition, while I was here:
- split lua bits into separate lxc-lua package
- change Description to match .deb
- remove "Version" in changelog entries to follow
http://fedoraproject.org/wiki/Packaging:Guidelines#Changelogs
Sidnei da Silva [Mon, 19 Aug 2013 22:34:19 +0000 (19:34 -0300)]
Add a --thinpool argument to lxc-create, to use thin pool backed lvm when creating the container. When cloning a container backed by a thin pool, the clone will default to the same thin pool.
Dwight Engen [Fri, 18 Oct 2013 18:31:53 +0000 (14:31 -0400)]
use proper config item depending on which lsm is enabled
On a system with AppArmor enabled, if lxc.se_context is configured but
lxc.aa_profile is not (because the user just wants to use the default
AppArmor profile) lxc was passing the lxc.se_context to be set as the
new AppArmor profile. Determine which configuration item to use based
on which lsm is enabled.
Stéphane Graber [Fri, 18 Oct 2013 17:27:46 +0000 (13:27 -0400)]
lxc-start-ephemeral: Fix broken mount logic
This reworks the mount logic for lxc-start-ephemeral to be as follows:
- Any real (non-bind) entry gets copied to the target fstab
- Any bind-mount from a virtual fs gets copied to the target fstab
- Any remaining bind-mount, if confirmed to be valid, gets set up as an
overlay.
Extra bind-mounts passed through the -b option are mounted by the
pre-mount script and don't need processing by the fstab generator.
Serge Hallyn [Fri, 18 Oct 2013 15:31:27 +0000 (10:31 -0500)]
parse.c: don't print error message on callback rv > 0
A callback return value < 0 means there was an error, so print
out an error message. But a rv > 0 is used by the mount_unknown_fs
functions to say "we found the one we want, stop here."
Document this, and only print an error message if rv < 0. Otherwise,
lxc-create -B lvm --fstype ext3 -t ubuntu -n u1
will print an (innocuous) error message about being unable to parse
the config value 'ext3'.
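The return-value convention can be sketched as below. The iterator and callback are hypothetical simplifications and do not reproduce the parse.c functions; they only show an error being reported for rv < 0 while rv > 0 stops the walk quietly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef int (*line_cb)(const char *line, void *data);

/*
 * Call cb on each line in turn.
 *   rv < 0: error, print a message and stop
 *   rv > 0: the callback found what it wanted, stop without a message
 *   rv == 0: keep going
 * Returns the last callback return value (0 if every line was visited).
 */
static int for_each_line(const char *const *lines, size_t n,
                         line_cb cb, void *data)
{
    int rv = 0;
    size_t i;

    assert(cb != NULL);
    for (i = 0; i < n; i++) {
        rv = cb(lines[i], data);
        if (rv < 0)
            fprintf(stderr, "failed to process line: %s\n", lines[i]);
        if (rv != 0)
            break;
    }
    return rv;
}

/* Example callback: returns 1 ("found, stop here") on the wanted string. */
static int find_fstype(const char *line, void *data)
{
    return strcmp(line, (const char *)data) == 0;
}
```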