[thirdparty/kernel/stable-queue.git] / queue-4.9 / ipvlan-disallow-userns-cap_net_admin-to-change-global-mode-flags.patch
From foo@baz Fri Mar 15 20:48:31 PDT 2019
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Wed, 20 Feb 2019 00:15:30 +0100
Subject: ipvlan: disallow userns cap_net_admin to change global mode/flags

From: Daniel Borkmann <daniel@iogearbox.net>

[ Upstream commit 7cc9f7003a969d359f608ebb701d42cafe75b84a ]

When running Docker with userns isolation, e.g. --userns-remap="default",
and spawning some containers with CAP_NET_ADMIN under this realm, I
noticed that link changes on an ipvlan slave device inside such a container
can affect all devices of this ipvlan group, including those in other net
namespaces where the container should have no permission to make changes,
such as the init netns.

This effectively allows undoing ipvlan private mode and switching globally
to bridge mode, where slaves can communicate directly without going through
hostns, or switching the global operation mode (l2/l3/l3s) for everyone
bound to the given ipvlan master device. The libnetwork plugin here
creates an ipvlan master and an ipvlan slave in hostns, plus one slave per
container that is moved into the container's netns upon its creation event.

  * In hostns:

    # ip -d a
    [...]
    8: cilium_host@bond0: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
        link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
        ipvlan mode l3 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
        inet 10.41.0.1/32 scope link cilium_host
           valid_lft forever preferred_lft forever
    [...]

  * Spawn container & change ipvlan mode setting inside of it:

    # docker run -dt --cap-add=NET_ADMIN --network cilium-net --name client -l app=test cilium/netperf
    9fff485d69dcb5ce37c9e33ca20a11ccafc236d690105aadbfb77e4f4170879c

    # docker exec -ti client ip -d a
    [...]
    10: cilium0@if4: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
        link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
        ipvlan mode l3 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
        inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
           valid_lft forever preferred_lft forever

    # docker exec -ti client ip link change link cilium0 name cilium0 type ipvlan mode l2

    # docker exec -ti client ip -d a
    [...]
    10: cilium0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
        link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
        ipvlan mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
        inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
           valid_lft forever preferred_lft forever

  * In hostns (mode switched to l2):

    # ip -d a
    [...]
    8: cilium_host@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
        link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
        ipvlan mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
        inet 10.41.0.1/32 scope link cilium_host
           valid_lft forever preferred_lft forever
    [...]

The same l3 -> l2 switch also happens when creating another slave inside
the container's network namespace and specifying the existing cilium0
link, from which the actual (bond0) master is derived:

    # docker exec -ti client ip link add link cilium0 name cilium1 type ipvlan mode l2

    # docker exec -ti client ip -d a
    [...]
    2: cilium1@if4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
        link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
        ipvlan mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    10: cilium0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
        link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
        ipvlan mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
        inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
           valid_lft forever preferred_lft forever

  * In hostns:

    # ip -d a
    [...]
    8: cilium_host@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
        link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
        ipvlan mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
        inet 10.41.0.1/32 scope link cilium_host
           valid_lft forever preferred_lft forever
    [...]

One way to mitigate this is to check for CAP_NET_ADMIN in the user
namespace that owns the ipvlan master device's netns, and only then allow
changing mode or flags for all devices bound to it. With the patch
applied, both cases above are disallowed.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ipvlan/ipvlan_main.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -463,7 +463,12 @@ static int ipvlan_nl_changelink(struct n
 	struct ipvl_port *port = ipvlan_port_get_rtnl(ipvlan->phy_dev);
 	int err = 0;
 
-	if (data && data[IFLA_IPVLAN_MODE]) {
+	if (!data)
+		return 0;
+	if (!ns_capable(dev_net(ipvlan->phy_dev)->user_ns, CAP_NET_ADMIN))
+		return -EPERM;
+
+	if (data[IFLA_IPVLAN_MODE]) {
 		u16 nmode = nla_get_u16(data[IFLA_IPVLAN_MODE]);
 
 		err = ipvlan_set_port_mode(port, nmode);
@@ -530,6 +535,8 @@ static int ipvlan_link_new(struct net *s
 		struct ipvl_dev *tmp = netdev_priv(phy_dev);
 
 		phy_dev = tmp->phy_dev;
+		if (!ns_capable(dev_net(phy_dev)->user_ns, CAP_NET_ADMIN))
+			return -EPERM;
 	} else if (!netif_is_ipvlan_port(phy_dev)) {
 		err = ipvlan_port_create(phy_dev);
 		if (err < 0)