git.ipfire.org Git - thirdparty/kernel/stable-queue.git/commitdiff
4.4-stable patches
author Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thu, 10 Nov 2016 15:46:37 +0000 (16:46 +0100)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thu, 10 Nov 2016 15:46:37 +0000 (16:46 +0100)
added patches:
bridge-multicast-restore-perm-router-ports-on-multicast-enable.patch
ip6_gre-fix-flowi6_proto-value-in-ip6gre_xmit_other.patch
ip6_tunnel-fix-ip6_tnl_lookup.patch
ipmr-ip6mr-fix-scheduling-while-atomic-and-a-deadlock-with-ipmr_get_route.patch
ipv4-disable-bh-in-set_ping_group_range.patch
ipv4-use-the-right-lock-for-ping_group_range.patch
ipv6-correctly-add-local-routes-when-lo-goes-up.patch
ipv6-tcp-restore-ip6cb-for-pktoptions-skbs.patch
net-add-netdev-all_adj_list-refcnt-propagation-to-fix-panic.patch
net-add-recursion-limit-to-gro.patch
net-avoid-sk_forward_alloc-overflows.patch
net-fec-set-mac-address-unconditionally.patch
net-pktgen-fix-pkt_size.patch
net-pktgen-remove-rcu-locking-in-pktgen_change_name.patch
net-sched-act_vlan-push-skb-data-to-mac_header-prior-calling-skb_vlan_-functions.patch
net-sched-filters-fix-notification-of-filter-delete-with-proper-handle.patch
net-sctp-forbid-negative-length.patch
netlink-do-not-enter-direct-reclaim-from-netlink_dump.patch
packet-call-fanout_release-while-unregistering-a-netdev.patch
packet-on-direct_xmit-limit-tso-and-csum-to-supported-devices.patch
rtnetlink-add-rtnexthop-offload-flag-to-compare-mask.patch
sctp-validate-chunk-len-before-actually-using-it.patch
tcp-fix-a-compile-error-in-dbgundo.patch
tcp-fix-overflow-in-__tcp_retransmit_skb.patch
tcp-fix-wrong-checksum-calculation-on-mtu-probing.patch
tg3-avoid-null-pointer-dereference-in-tg3_io_error_detected.patch
udp-fix-ip_checksum-handling.patch

29 files changed:
queue-4.4/bridge-multicast-restore-perm-router-ports-on-multicast-enable.patch [new file with mode: 0644]
queue-4.4/ip6_gre-fix-flowi6_proto-value-in-ip6gre_xmit_other.patch [new file with mode: 0644]
queue-4.4/ip6_tunnel-fix-ip6_tnl_lookup.patch [new file with mode: 0644]
queue-4.4/ipmr-ip6mr-fix-scheduling-while-atomic-and-a-deadlock-with-ipmr_get_route.patch [new file with mode: 0644]
queue-4.4/ipv4-disable-bh-in-set_ping_group_range.patch [new file with mode: 0644]
queue-4.4/ipv4-use-the-right-lock-for-ping_group_range.patch [new file with mode: 0644]
queue-4.4/ipv6-correctly-add-local-routes-when-lo-goes-up.patch [new file with mode: 0644]
queue-4.4/ipv6-tcp-restore-ip6cb-for-pktoptions-skbs.patch [new file with mode: 0644]
queue-4.4/net-add-netdev-all_adj_list-refcnt-propagation-to-fix-panic.patch [new file with mode: 0644]
queue-4.4/net-add-recursion-limit-to-gro.patch [new file with mode: 0644]
queue-4.4/net-avoid-sk_forward_alloc-overflows.patch [new file with mode: 0644]
queue-4.4/net-fec-set-mac-address-unconditionally.patch [new file with mode: 0644]
queue-4.4/net-pktgen-fix-pkt_size.patch [new file with mode: 0644]
queue-4.4/net-pktgen-remove-rcu-locking-in-pktgen_change_name.patch [new file with mode: 0644]
queue-4.4/net-sched-act_vlan-push-skb-data-to-mac_header-prior-calling-skb_vlan_-functions.patch [new file with mode: 0644]
queue-4.4/net-sched-filters-fix-notification-of-filter-delete-with-proper-handle.patch [new file with mode: 0644]
queue-4.4/net-sctp-forbid-negative-length.patch [new file with mode: 0644]
queue-4.4/netlink-do-not-enter-direct-reclaim-from-netlink_dump.patch [new file with mode: 0644]
queue-4.4/packet-call-fanout_release-while-unregistering-a-netdev.patch [new file with mode: 0644]
queue-4.4/packet-on-direct_xmit-limit-tso-and-csum-to-supported-devices.patch [new file with mode: 0644]
queue-4.4/rtnetlink-add-rtnexthop-offload-flag-to-compare-mask.patch [new file with mode: 0644]
queue-4.4/sctp-validate-chunk-len-before-actually-using-it.patch [new file with mode: 0644]
queue-4.4/series [new file with mode: 0644]
queue-4.4/tcp-fix-a-compile-error-in-dbgundo.patch [new file with mode: 0644]
queue-4.4/tcp-fix-overflow-in-__tcp_retransmit_skb.patch [new file with mode: 0644]
queue-4.4/tcp-fix-wrong-checksum-calculation-on-mtu-probing.patch [new file with mode: 0644]
queue-4.4/tg3-avoid-null-pointer-dereference-in-tg3_io_error_detected.patch [new file with mode: 0644]
queue-4.4/udp-fix-ip_checksum-handling.patch [new file with mode: 0644]
queue-4.8/series [new file with mode: 0644]

diff --git a/queue-4.4/bridge-multicast-restore-perm-router-ports-on-multicast-enable.patch b/queue-4.4/bridge-multicast-restore-perm-router-ports-on-multicast-enable.patch
new file mode 100644 (file)
index 0000000..995e5cc
--- /dev/null
@@ -0,0 +1,115 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
+Date: Tue, 18 Oct 2016 18:09:48 +0200
+Subject: bridge: multicast: restore perm router ports on multicast enable
+
+From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
+
+
+[ Upstream commit 7cb3f9214dfa443c1ccc2be637dcc6344cc203f0 ]
+
+Satish reported a problem with the perm multicast router ports not getting
+re-enabled after a certain series of events. In particular, if multicast
+snooping has been disabled and the port goes to the disabled state, it
+will be deleted from the router port list; but if it then moves into a
+non-disabled state it will not be re-added because mcast snooping is
+still disabled, and enabling snooping later does nothing.
+
+Here are the steps to reproduce, setup br0 with snooping enabled and eth1
+added as a perm router (multicast_router = 2):
+1. $ echo 0 > /sys/class/net/br0/bridge/multicast_snooping
+2. $ ip l set eth1 down
+^ This step deletes the interface from the router list
+3. $ ip l set eth1 up
+^ This step does not add it again because mcast snooping is disabled
+4. $ echo 1 > /sys/class/net/br0/bridge/multicast_snooping
+5. $ bridge -d -s mdb show
+<empty>
+
+At this point we have mcast enabled and eth1 as a perm router (value = 2)
+but it is not in the router list which is incorrect.
+
+After this change:
+1. $ echo 0 > /sys/class/net/br0/bridge/multicast_snooping
+2. $ ip l set eth1 down
+^ This step deletes the interface from the router list
+3. $ ip l set eth1 up
+^ This step does not add it again because mcast snooping is disabled
+4. $ echo 1 > /sys/class/net/br0/bridge/multicast_snooping
+5. $ bridge -d -s mdb show
+router ports on br0: eth1
+
+Note: we can directly do br_multicast_enable_port for all because the
+querier timer already has checks for the port state and will simply
+expire if it's in blocking/disabled. See the comment added by
+commit 9aa66382163e7 ("bridge: multicast: add a comment to
+br_port_state_selection about blocking state")
+
+Fixes: 561f1103a2b7 ("bridge: Add multicast_snooping sysfs toggle")
+Reported-by: Satish Ashok <sashok@cumulusnetworks.com>
+Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/bridge/br_multicast.c |   23 ++++++++++++++---------
+ 1 file changed, 14 insertions(+), 9 deletions(-)
+
+--- a/net/bridge/br_multicast.c
++++ b/net/bridge/br_multicast.c
+@@ -951,13 +951,12 @@ static void br_multicast_enable(struct b
+               mod_timer(&query->timer, jiffies);
+ }
+-void br_multicast_enable_port(struct net_bridge_port *port)
++static void __br_multicast_enable_port(struct net_bridge_port *port)
+ {
+       struct net_bridge *br = port->br;
+-      spin_lock(&br->multicast_lock);
+       if (br->multicast_disabled || !netif_running(br->dev))
+-              goto out;
++              return;
+       br_multicast_enable(&port->ip4_own_query);
+ #if IS_ENABLED(CONFIG_IPV6)
+@@ -965,8 +964,14 @@ void br_multicast_enable_port(struct net
+ #endif
+       if (port->multicast_router == 2 && hlist_unhashed(&port->rlist))
+               br_multicast_add_router(br, port);
++}
+-out:
++void br_multicast_enable_port(struct net_bridge_port *port)
++{
++      struct net_bridge *br = port->br;
++
++      spin_lock(&br->multicast_lock);
++      __br_multicast_enable_port(port);
+       spin_unlock(&br->multicast_lock);
+ }
+@@ -1905,8 +1910,9 @@ static void br_multicast_start_querier(s
+ int br_multicast_toggle(struct net_bridge *br, unsigned long val)
+ {
+-      int err = 0;
+       struct net_bridge_mdb_htable *mdb;
++      struct net_bridge_port *port;
++      int err = 0;
+       spin_lock_bh(&br->multicast_lock);
+       if (br->multicast_disabled == !val)
+@@ -1934,10 +1940,9 @@ rollback:
+                       goto rollback;
+       }
+-      br_multicast_start_querier(br, &br->ip4_own_query);
+-#if IS_ENABLED(CONFIG_IPV6)
+-      br_multicast_start_querier(br, &br->ip6_own_query);
+-#endif
++      br_multicast_open(br);
++      list_for_each_entry(port, &br->port_list, list)
++              __br_multicast_enable_port(port);
+ unlock:
+       spin_unlock_bh(&br->multicast_lock);
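The bridge patch above uses a common kernel refactoring pattern: split a locked entry point into a lock-free `__`-prefixed helper so that call sites already holding the lock (here, `br_multicast_toggle()`) can reuse the logic without a recursive lock attempt. A minimal user-space sketch of the pattern, with hypothetical names (this is not the bridge code itself, and a plain flag stands in for the spinlock):

```c
#include <assert.h>

/* Sketch of the lock-split pattern from the patch: __enable_port() assumes
 * the caller already holds the lock; the public wrapper takes and releases
 * it. Names are hypothetical. */
static int lock_held;          /* stand-in for br->multicast_lock */
static int enabled_count;

static void __enable_port(void)
{
    assert(lock_held);         /* precondition: caller holds the lock */
    enabled_count++;
}

static void enable_port(void)
{
    lock_held = 1;             /* spin_lock(&br->multicast_lock) */
    __enable_port();
    lock_held = 0;             /* spin_unlock(&br->multicast_lock) */
}

static void enable_all_ports(int nports)
{
    /* One lock acquisition for the whole walk, as br_multicast_toggle()
     * now does via list_for_each_entry(). */
    lock_held = 1;
    while (nports-- > 0)
        __enable_port();
    lock_held = 0;
}
```

The point of the split is that the loop body runs with the lock held exactly once, instead of each iteration re-taking a lock it already owns.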
diff --git a/queue-4.4/ip6_gre-fix-flowi6_proto-value-in-ip6gre_xmit_other.patch b/queue-4.4/ip6_gre-fix-flowi6_proto-value-in-ip6gre_xmit_other.patch
new file mode 100644 (file)
index 0000000..ddd8b58
--- /dev/null
@@ -0,0 +1,41 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Lance Richardson <lrichard@redhat.com>
+Date: Fri, 23 Sep 2016 15:50:29 -0400
+Subject: ip6_gre: fix flowi6_proto value in ip6gre_xmit_other()
+
+From: Lance Richardson <lrichard@redhat.com>
+
+
+[ Upstream commit db32e4e49ce2b0e5fcc17803d011a401c0a637f6 ]
+
+Similar to commit 3be07244b733 ("ip6_gre: fix flowi6_proto value in
+xmit path"), set flowi6_proto to IPPROTO_GRE for output route lookup.
+
+Up until now, ip6gre_xmit_other() has set flowi6_proto to a bogus value.
+This affected output route lookup for packets sent on an ip6gretap device
+in cases where routing was dependent on the value of flowi6_proto.
+
+Since the correct proto is already set in the tunnel flowi6 template via
+commit 252f3f5a1189 ("ip6_gre: Set flowi6_proto as IPPROTO_GRE in xmit
+path."), simply delete the line setting the incorrect flowi6_proto value.
+
+Suggested-by: Jiri Benc <jbenc@redhat.com>
+Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
+Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
+Signed-off-by: Lance Richardson <lrichard@redhat.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/ipv6/ip6_gre.c |    1 -
+ 1 file changed, 1 deletion(-)
+
+--- a/net/ipv6/ip6_gre.c
++++ b/net/ipv6/ip6_gre.c
+@@ -886,7 +886,6 @@ static int ip6gre_xmit_other(struct sk_b
+               encap_limit = t->parms.encap_limit;
+       memcpy(&fl6, &t->fl.u.ip6, sizeof(fl6));
+-      fl6.flowi6_proto = skb->protocol;
+       err = ip6gre_xmit2(skb, dev, 0, &fl6, encap_limit, &mtu);
diff --git a/queue-4.4/ip6_tunnel-fix-ip6_tnl_lookup.patch b/queue-4.4/ip6_tunnel-fix-ip6_tnl_lookup.patch
new file mode 100644 (file)
index 0000000..8237756
--- /dev/null
@@ -0,0 +1,47 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Vadim Fedorenko <junk@yandex-team.ru>
+Date: Tue, 11 Oct 2016 22:47:20 +0300
+Subject: ip6_tunnel: fix ip6_tnl_lookup
+
+From: Vadim Fedorenko <junk@yandex-team.ru>
+
+
+[ Upstream commit 68d00f332e0ba7f60f212be74ede290c9f873bc5 ]
+
+The commit ea3dc9601bda ("ip6_tunnel: Add support for wildcard tunnel
+endpoints.") introduces support for wildcards in tunnel endpoints,
+but in some rare circumstances ip6_tnl_lookup selects the wrong tunnel
+interface, relying only on the source or destination address of the
+packet and not checking for the presence of a wildcard in the tunnel
+endpoints. Later in ip6_tnl_rcv these packets can be discarded because
+of a difference in ipproto, even if the fallback device has a proper
+ipproto configuration.
+
+This patch adds checks for a wildcard endpoint in the tunnel, avoiding
+such behavior.
+
+Fixes: ea3dc9601bda ("ip6_tunnel: Add support for wildcard tunnel endpoints.")
+Signed-off-by: Vadim Fedorenko <junk@yandex-team.ru>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/ipv6/ip6_tunnel.c |    2 ++
+ 1 file changed, 2 insertions(+)
+
+--- a/net/ipv6/ip6_tunnel.c
++++ b/net/ipv6/ip6_tunnel.c
+@@ -246,6 +246,7 @@ ip6_tnl_lookup(struct net *net, const st
+       hash = HASH(&any, local);
+       for_each_ip6_tunnel_rcu(ip6n->tnls_r_l[hash]) {
+               if (ipv6_addr_equal(local, &t->parms.laddr) &&
++                  ipv6_addr_any(&t->parms.raddr) &&
+                   (t->dev->flags & IFF_UP))
+                       return t;
+       }
+@@ -253,6 +254,7 @@ ip6_tnl_lookup(struct net *net, const st
+       hash = HASH(remote, &any);
+       for_each_ip6_tunnel_rcu(ip6n->tnls_r_l[hash]) {
+               if (ipv6_addr_equal(remote, &t->parms.raddr) &&
++                  ipv6_addr_any(&t->parms.laddr) &&
+                   (t->dev->flags & IFF_UP))
+                       return t;
+       }
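The two added `ipv6_addr_any()` checks tighten the wildcard fallback buckets: a tunnel matched only on its local address must actually have a wildcard (any) remote address, and vice versa, or the lookup can return a tunnel whose other endpoint conflicts with the packet. A toy version of the corrected match predicates, with hypothetical types standing in for the kernel structures (0 plays the role of the any-address):

```c
#include <assert.h>
#include <stdbool.h>

struct tnl { int laddr, raddr; };   /* 0 stands for the wildcard address */

/* Mirrors ip6_tnl_lookup() after the fix: a local-only bucket entry only
 * matches when its remote side really is a wildcard... */
static bool match_local_wildcard(const struct tnl *t, int local)
{
    return t->laddr == local && t->raddr == 0;   /* added raddr check */
}

/* ...and a remote-only entry only matches when its local side is. */
static bool match_remote_wildcard(const struct tnl *t, int remote)
{
    return t->raddr == remote && t->laddr == 0;  /* added laddr check */
}
```

Before the fix, a fully specified tunnel sitting in the wildcard hash bucket could satisfy the single-address comparison and shadow the correct fallback device.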
diff --git a/queue-4.4/ipmr-ip6mr-fix-scheduling-while-atomic-and-a-deadlock-with-ipmr_get_route.patch b/queue-4.4/ipmr-ip6mr-fix-scheduling-while-atomic-and-a-deadlock-with-ipmr_get_route.patch
new file mode 100644 (file)
index 0000000..8ff1c33
--- /dev/null
@@ -0,0 +1,160 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
+Date: Sun, 25 Sep 2016 23:08:31 +0200
+Subject: ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route
+
+From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
+
+
+[ Upstream commit 2cf750704bb6d7ed8c7d732e071dd1bc890ea5e8 ]
+
+Since the commit below, the ipmr/ip6mr rtnl_unicast() code uses the portid
+instead of the previous dst_pid which was copied from in_skb's portid.
+Since the skb is new the portid is 0 at that point so the packets are sent
+to the kernel and we get scheduling while atomic or a deadlock (depending
+on where it happens) by trying to acquire rtnl two times.
+Also since this is RTM_GETROUTE, it can be triggered by a normal user.
+
+Here's the sleeping while atomic trace:
+[ 7858.212557] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
+[ 7858.212748] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0
+[ 7858.212881] 2 locks held by swapper/0/0:
+[ 7858.213013]  #0:  (((&mrt->ipmr_expire_timer))){+.-...}, at: [<ffffffff810fbbf5>] call_timer_fn+0x5/0x350
+[ 7858.213422]  #1:  (mfc_unres_lock){+.....}, at: [<ffffffff8161e005>] ipmr_expire_process+0x25/0x130
+[ 7858.213807] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc7+ #179
+[ 7858.213934] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
+[ 7858.214108]  0000000000000000 ffff88005b403c50 ffffffff813a7804 0000000000000000
+[ 7858.214412]  ffffffff81a1338e ffff88005b403c78 ffffffff810a4a72 ffffffff81a1338e
+[ 7858.214716]  000000000000026c 0000000000000000 ffff88005b403ca8 ffffffff810a4b9f
+[ 7858.215251] Call Trace:
+[ 7858.215412]  <IRQ>  [<ffffffff813a7804>] dump_stack+0x85/0xc1
+[ 7858.215662]  [<ffffffff810a4a72>] ___might_sleep+0x192/0x250
+[ 7858.215868]  [<ffffffff810a4b9f>] __might_sleep+0x6f/0x100
+[ 7858.216072]  [<ffffffff8165bea3>] mutex_lock_nested+0x33/0x4d0
+[ 7858.216279]  [<ffffffff815a7a5f>] ? netlink_lookup+0x25f/0x460
+[ 7858.216487]  [<ffffffff8157474b>] rtnetlink_rcv+0x1b/0x40
+[ 7858.216687]  [<ffffffff815a9a0c>] netlink_unicast+0x19c/0x260
+[ 7858.216900]  [<ffffffff81573c70>] rtnl_unicast+0x20/0x30
+[ 7858.217128]  [<ffffffff8161cd39>] ipmr_destroy_unres+0xa9/0xf0
+[ 7858.217351]  [<ffffffff8161e06f>] ipmr_expire_process+0x8f/0x130
+[ 7858.217581]  [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180
+[ 7858.217785]  [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180
+[ 7858.217990]  [<ffffffff810fbc95>] call_timer_fn+0xa5/0x350
+[ 7858.218192]  [<ffffffff810fbbf5>] ? call_timer_fn+0x5/0x350
+[ 7858.218415]  [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180
+[ 7858.218656]  [<ffffffff810fde10>] run_timer_softirq+0x260/0x640
+[ 7858.218865]  [<ffffffff8166379b>] ? __do_softirq+0xbb/0x54f
+[ 7858.219068]  [<ffffffff816637c8>] __do_softirq+0xe8/0x54f
+[ 7858.219269]  [<ffffffff8107a948>] irq_exit+0xb8/0xc0
+[ 7858.219463]  [<ffffffff81663452>] smp_apic_timer_interrupt+0x42/0x50
+[ 7858.219678]  [<ffffffff816625bc>] apic_timer_interrupt+0x8c/0xa0
+[ 7858.219897]  <EOI>  [<ffffffff81055f16>] ? native_safe_halt+0x6/0x10
+[ 7858.220165]  [<ffffffff810d64dd>] ? trace_hardirqs_on+0xd/0x10
+[ 7858.220373]  [<ffffffff810298e3>] default_idle+0x23/0x190
+[ 7858.220574]  [<ffffffff8102a20f>] arch_cpu_idle+0xf/0x20
+[ 7858.220790]  [<ffffffff810c9f8c>] default_idle_call+0x4c/0x60
+[ 7858.221016]  [<ffffffff810ca33b>] cpu_startup_entry+0x39b/0x4d0
+[ 7858.221257]  [<ffffffff8164f995>] rest_init+0x135/0x140
+[ 7858.221469]  [<ffffffff81f83014>] start_kernel+0x50e/0x51b
+[ 7858.221670]  [<ffffffff81f82120>] ? early_idt_handler_array+0x120/0x120
+[ 7858.221894]  [<ffffffff81f8243f>] x86_64_start_reservations+0x2a/0x2c
+[ 7858.222113]  [<ffffffff81f8257c>] x86_64_start_kernel+0x13b/0x14a
+
+Fixes: 2942e9005056 ("[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts")
+Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ include/linux/mroute.h  |    2 +-
+ include/linux/mroute6.h |    2 +-
+ net/ipv4/ipmr.c         |    3 ++-
+ net/ipv4/route.c        |    3 ++-
+ net/ipv6/ip6mr.c        |    5 +++--
+ net/ipv6/route.c        |    4 +++-
+ 6 files changed, 12 insertions(+), 7 deletions(-)
+
+--- a/include/linux/mroute.h
++++ b/include/linux/mroute.h
+@@ -103,5 +103,5 @@ struct mfc_cache {
+ struct rtmsg;
+ extern int ipmr_get_route(struct net *net, struct sk_buff *skb,
+                         __be32 saddr, __be32 daddr,
+-                        struct rtmsg *rtm, int nowait);
++                        struct rtmsg *rtm, int nowait, u32 portid);
+ #endif
+--- a/include/linux/mroute6.h
++++ b/include/linux/mroute6.h
+@@ -115,7 +115,7 @@ struct mfc6_cache {
+ struct rtmsg;
+ extern int ip6mr_get_route(struct net *net, struct sk_buff *skb,
+-                         struct rtmsg *rtm, int nowait);
++                         struct rtmsg *rtm, int nowait, u32 portid);
+ #ifdef CONFIG_IPV6_MROUTE
+ extern struct sock *mroute6_socket(struct net *net, struct sk_buff *skb);
+--- a/net/ipv4/ipmr.c
++++ b/net/ipv4/ipmr.c
+@@ -2192,7 +2192,7 @@ static int __ipmr_fill_mroute(struct mr_
+ int ipmr_get_route(struct net *net, struct sk_buff *skb,
+                  __be32 saddr, __be32 daddr,
+-                 struct rtmsg *rtm, int nowait)
++                 struct rtmsg *rtm, int nowait, u32 portid)
+ {
+       struct mfc_cache *cache;
+       struct mr_table *mrt;
+@@ -2237,6 +2237,7 @@ int ipmr_get_route(struct net *net, stru
+                       return -ENOMEM;
+               }
++              NETLINK_CB(skb2).portid = portid;
+               skb_push(skb2, sizeof(struct iphdr));
+               skb_reset_network_header(skb2);
+               iph = ip_hdr(skb2);
+--- a/net/ipv4/route.c
++++ b/net/ipv4/route.c
+@@ -2492,7 +2492,8 @@ static int rt_fill_info(struct net *net,
+                   IPV4_DEVCONF_ALL(net, MC_FORWARDING)) {
+                       int err = ipmr_get_route(net, skb,
+                                                fl4->saddr, fl4->daddr,
+-                                               r, nowait);
++                                               r, nowait, portid);
++
+                       if (err <= 0) {
+                               if (!nowait) {
+                                       if (err == 0)
+--- a/net/ipv6/ip6mr.c
++++ b/net/ipv6/ip6mr.c
+@@ -2276,8 +2276,8 @@ static int __ip6mr_fill_mroute(struct mr
+       return 1;
+ }
+-int ip6mr_get_route(struct net *net,
+-                  struct sk_buff *skb, struct rtmsg *rtm, int nowait)
++int ip6mr_get_route(struct net *net, struct sk_buff *skb, struct rtmsg *rtm,
++                  int nowait, u32 portid)
+ {
+       int err;
+       struct mr6_table *mrt;
+@@ -2322,6 +2322,7 @@ int ip6mr_get_route(struct net *net,
+                       return -ENOMEM;
+               }
++              NETLINK_CB(skb2).portid = portid;
+               skb_reset_transport_header(skb2);
+               skb_put(skb2, sizeof(struct ipv6hdr));
+--- a/net/ipv6/route.c
++++ b/net/ipv6/route.c
+@@ -3140,7 +3140,9 @@ static int rt6_fill_node(struct net *net
+       if (iif) {
+ #ifdef CONFIG_IPV6_MROUTE
+               if (ipv6_addr_is_multicast(&rt->rt6i_dst.addr)) {
+-                      int err = ip6mr_get_route(net, skb, rtm, nowait);
++                      int err = ip6mr_get_route(net, skb, rtm, nowait,
++                                                portid);
++
+                       if (err <= 0) {
+                               if (!nowait) {
+                                       if (err == 0)
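The core of the ipmr/ip6mr fix is the one-line `NETLINK_CB(skb2).portid = portid;` in each path: the reply skb is freshly allocated, so its netlink control block carries portid 0, which addresses the kernel itself and makes rtnl_unicast() loop back into rtnetlink_rcv() (hence the rtnl deadlock or sleeping-in-atomic). A toy model of the bug and the fix, with hypothetical types in place of the real skb:

```c
#include <assert.h>

/* Toy model: a new reply buffer defaults to portid 0 (the kernel's own
 * address); the fix copies the requester's portid into it so the unicast
 * goes back to user space instead of re-entering the kernel. */
struct toy_skb { unsigned portid; };

static struct toy_skb make_reply(unsigned requester_portid)
{
    struct toy_skb skb2 = { 0 };      /* fresh skb: portid is 0 */
    skb2.portid = requester_portid;   /* NETLINK_CB(skb2).portid = portid */
    return skb2;
}
```

Because RTM_GETROUTE is available to unprivileged users, the missing assignment was a user-triggerable deadlock, which is why the portid now has to be threaded through ipmr_get_route()/ip6mr_get_route() as an extra parameter.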
diff --git a/queue-4.4/ipv4-disable-bh-in-set_ping_group_range.patch b/queue-4.4/ipv4-disable-bh-in-set_ping_group_range.patch
new file mode 100644 (file)
index 0000000..ef80f20
--- /dev/null
@@ -0,0 +1,38 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Eric Dumazet <edumazet@google.com>
+Date: Thu, 20 Oct 2016 10:26:48 -0700
+Subject: ipv4: disable BH in set_ping_group_range()
+
+From: Eric Dumazet <edumazet@google.com>
+
+
+[ Upstream commit a681574c99be23e4d20b769bf0e543239c364af5 ]
+
+In commit 4ee3bd4a8c746 ("ipv4: disable BH when changing ip local port
+range") Cong added BH protection in set_local_port_range() but missed
+that the same fix was needed in set_ping_group_range().
+
+Fixes: b8f1a55639e6 ("udp: Add function to make source port for UDP tunnels")
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Reported-by: Eric Salo <salo@google.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/ipv4/sysctl_net_ipv4.c |    4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+--- a/net/ipv4/sysctl_net_ipv4.c
++++ b/net/ipv4/sysctl_net_ipv4.c
+@@ -110,10 +110,10 @@ static void set_ping_group_range(struct
+       kgid_t *data = table->data;
+       struct net *net =
+               container_of(table->data, struct net, ipv4.ping_group_range.range);
+-      write_seqlock(&net->ipv4.ip_local_ports.lock);
++      write_seqlock_bh(&net->ipv4.ip_local_ports.lock);
+       data[0] = low;
+       data[1] = high;
+-      write_sequnlock(&net->ipv4.ip_local_ports.lock);
++      write_sequnlock_bh(&net->ipv4.ip_local_ports.lock);
+ }
+ /* Validate changes from /proc interface. */
diff --git a/queue-4.4/ipv4-use-the-right-lock-for-ping_group_range.patch b/queue-4.4/ipv4-use-the-right-lock-for-ping_group_range.patch
new file mode 100644 (file)
index 0000000..5fa85a3
--- /dev/null
@@ -0,0 +1,61 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: WANG Cong <xiyou.wangcong@gmail.com>
+Date: Thu, 20 Oct 2016 14:19:46 -0700
+Subject: ipv4: use the right lock for ping_group_range
+
+From: WANG Cong <xiyou.wangcong@gmail.com>
+
+
+[ Upstream commit 396a30cce15d084b2b1a395aa6d515c3d559c674 ]
+
+This reverts commit a681574c99be23e4d20b769bf0e543239c364af5
+("ipv4: disable BH in set_ping_group_range()") because we never
+read ping_group_range in BH context (unlike local_port_range).
+
+Then, since we already have a lock for ping_group_range, those
+using ip_local_ports.lock for ping_group_range are clearly typos.
+
+We might consider to share a same lock for both ping_group_range
+and local_port_range w.r.t. space saving, but that should be for
+net-next.
+
+Fixes: a681574c99be ("ipv4: disable BH in set_ping_group_range()")
+Fixes: ba6b918ab234 ("ping: move ping_group_range out of CONFIG_SYSCTL")
+Cc: Eric Dumazet <edumazet@google.com>
+Cc: Eric Salo <salo@google.com>
+Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/ipv4/sysctl_net_ipv4.c |    8 ++++----
+ 1 file changed, 4 insertions(+), 4 deletions(-)
+
+--- a/net/ipv4/sysctl_net_ipv4.c
++++ b/net/ipv4/sysctl_net_ipv4.c
+@@ -97,11 +97,11 @@ static void inet_get_ping_group_range_ta
+               container_of(table->data, struct net, ipv4.ping_group_range.range);
+       unsigned int seq;
+       do {
+-              seq = read_seqbegin(&net->ipv4.ip_local_ports.lock);
++              seq = read_seqbegin(&net->ipv4.ping_group_range.lock);
+               *low = data[0];
+               *high = data[1];
+-      } while (read_seqretry(&net->ipv4.ip_local_ports.lock, seq));
++      } while (read_seqretry(&net->ipv4.ping_group_range.lock, seq));
+ }
+ /* Update system visible IP port range */
+@@ -110,10 +110,10 @@ static void set_ping_group_range(struct
+       kgid_t *data = table->data;
+       struct net *net =
+               container_of(table->data, struct net, ipv4.ping_group_range.range);
+-      write_seqlock_bh(&net->ipv4.ip_local_ports.lock);
++      write_seqlock(&net->ipv4.ping_group_range.lock);
+       data[0] = low;
+       data[1] = high;
+-      write_sequnlock_bh(&net->ipv4.ip_local_ports.lock);
++      write_sequnlock(&net->ipv4.ping_group_range.lock);
+ }
+ /* Validate changes from /proc interface. */
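Both ping_group_range patches revolve around the seqlock pattern these sysctl handlers use: readers spin in a `read_seqbegin()`/`read_seqretry()` loop until they observe a stable sequence, while the writer bumps the sequence around its update; the second patch's point is only that the reader and writer must use the *same* lock (`ping_group_range.lock`, not `ip_local_ports.lock`). A single-threaded user-space sketch of the retry protocol, using a bare counter in place of the kernel's `seqlock_t`:

```c
#include <assert.h>

/* Toy seqlock: even sequence = stable, odd = writer in progress. */
static unsigned seq;
static int range[2];

static void write_pair(int low, int high)
{
    seq++;              /* now odd: concurrent readers will retry */
    range[0] = low;
    range[1] = high;
    seq++;              /* even again: the pair is consistent */
}

static void read_pair(int *low, int *high)
{
    unsigned s;
    do {
        s = seq;                    /* read_seqbegin() */
        *low  = range[0];
        *high = range[1];
    } while (s != seq || (s & 1));  /* read_seqretry(): raced a writer */
}
```

If reader and writer consult different sequence counters, as in the typo'd code, the retry loop can never detect a concurrent write and a torn low/high pair can be returned.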
diff --git a/queue-4.4/ipv6-correctly-add-local-routes-when-lo-goes-up.patch b/queue-4.4/ipv6-correctly-add-local-routes-when-lo-goes-up.patch
new file mode 100644 (file)
index 0000000..2745427
--- /dev/null
@@ -0,0 +1,57 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
+Date: Wed, 12 Oct 2016 10:10:40 +0200
+Subject: ipv6: correctly add local routes when lo goes up
+
+From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
+
+
+[ Upstream commit a220445f9f4382c36a53d8ef3e08165fa27f7e2c ]
+
+The goal of the patch is to fix this scenario:
+ ip link add dummy1 type dummy
+ ip link set dummy1 up
+ ip link set lo down ; ip link set lo up
+
+After that sequence, the local route to the link layer address of dummy1 is
+not there anymore.
+
+When the loopback is set down, all local routes are deleted by
+addrconf_ifdown()/rt6_ifdown(). At this time, the rt6_info entry still
+exists, because the corresponding idev has a reference on it. After the rcu
+grace period, dst_rcu_free() is called, and thus ___dst_free(), which will
+set obsolete to DST_OBSOLETE_DEAD.
+
+In this case, init_loopback() is called before dst_rcu_free(), thus
+obsolete is still set to something <= 0. So, the function doesn't add the
+route again. To avoid that race, let's check the rt6 refcnt instead.
+
+Fixes: 25fb6ca4ed9c ("net IPv6 : Fix broken IPv6 routing table after loopback down-up")
+Fixes: a881ae1f625c ("ipv6: don't call addrconf_dst_alloc again when enable lo")
+Fixes: 33d99113b110 ("ipv6: reallocate addrconf router for ipv6 address when lo device up")
+Reported-by: Francesco Santoro <francesco.santoro@6wind.com>
+Reported-by: Samuel Gauthier <samuel.gauthier@6wind.com>
+CC: Balakumaran Kannan <Balakumaran.Kannan@ap.sony.com>
+CC: Maruthi Thotad <Maruthi.Thotad@ap.sony.com>
+CC: Sabrina Dubroca <sd@queasysnail.net>
+CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
+CC: Weilong Chen <chenweilong@huawei.com>
+CC: Gao feng <gaofeng@cn.fujitsu.com>
+Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/ipv6/addrconf.c |    2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/net/ipv6/addrconf.c
++++ b/net/ipv6/addrconf.c
+@@ -2916,7 +2916,7 @@ static void init_loopback(struct net_dev
+                                * lo device down, release this obsolete dst and
+                                * reallocate a new router for ifa.
+                                */
+-                              if (sp_ifa->rt->dst.obsolete > 0) {
++                              if (!atomic_read(&sp_ifa->rt->rt6i_ref)) {
+                                       ip6_rt_put(sp_ifa->rt);
+                                       sp_ifa->rt = NULL;
+                               } else {
diff --git a/queue-4.4/ipv6-tcp-restore-ip6cb-for-pktoptions-skbs.patch b/queue-4.4/ipv6-tcp-restore-ip6cb-for-pktoptions-skbs.patch
new file mode 100644 (file)
index 0000000..c65be14
--- /dev/null
@@ -0,0 +1,97 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Eric Dumazet <edumazet@google.com>
+Date: Wed, 12 Oct 2016 19:01:45 +0200
+Subject: ipv6: tcp: restore IP6CB for pktoptions skbs
+
+From: Eric Dumazet <edumazet@google.com>
+
+
+[ Upstream commit 8ce48623f0cf3d632e32448411feddccb693d351 ]
+
+Baozeng Ding reported the following KASAN splat:
+
+BUG: KASAN: use-after-free in ip6_datagram_recv_specific_ctl+0x13f1/0x15c0 at addr ffff880029c84ec8
+Read of size 1 by task poc/25548
+Call Trace:
+ [<ffffffff82cf43c9>] dump_stack+0x12e/0x185 /lib/dump_stack.c:15
+ [<     inline     >] print_address_description /mm/kasan/report.c:204
+ [<ffffffff817ced3b>] kasan_report_error+0x48b/0x4b0 /mm/kasan/report.c:283
+ [<     inline     >] kasan_report /mm/kasan/report.c:303
+ [<ffffffff817ced9e>] __asan_report_load1_noabort+0x3e/0x40 /mm/kasan/report.c:321
+ [<ffffffff85c71da1>] ip6_datagram_recv_specific_ctl+0x13f1/0x15c0 /net/ipv6/datagram.c:687
+ [<ffffffff85c734c3>] ip6_datagram_recv_ctl+0x33/0x40
+ [<ffffffff85c0b07c>] do_ipv6_getsockopt.isra.4+0xaec/0x2150
+ [<ffffffff85c0c7f6>] ipv6_getsockopt+0x116/0x230
+ [<ffffffff859b5a12>] tcp_getsockopt+0x82/0xd0 /net/ipv4/tcp.c:3035
+ [<ffffffff855fb385>] sock_common_getsockopt+0x95/0xd0 /net/core/sock.c:2647
+ [<     inline     >] SYSC_getsockopt /net/socket.c:1776
+ [<ffffffff855f8ba2>] SyS_getsockopt+0x142/0x230 /net/socket.c:1758
+ [<ffffffff8685cdc5>] entry_SYSCALL_64_fastpath+0x23/0xc6
+Memory state around the buggy address:
+ ffff880029c84d80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
+ ffff880029c84e00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
+> ffff880029c84e80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
+                                              ^
+ ffff880029c84f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
+ ffff880029c84f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
+
+He also provided a syzkaller reproducer.
+
+Issue is that ip6_datagram_recv_specific_ctl() expects to find IP6CB
+data that was moved to a different place in tcp_v6_rcv().
+
+This patch moves tcp_v6_restore_cb() up and calls it from
+tcp_v6_do_rcv() when np->pktoptions is set.
+
+Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Reported-by: Baozeng Ding <sploving1@gmail.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/ipv6/tcp_ipv6.c |   20 +++++++++++---------
+ 1 file changed, 11 insertions(+), 9 deletions(-)
+
+--- a/net/ipv6/tcp_ipv6.c
++++ b/net/ipv6/tcp_ipv6.c
+@@ -1179,6 +1179,16 @@ out:
+       return NULL;
+ }
++static void tcp_v6_restore_cb(struct sk_buff *skb)
++{
++      /* We need to move header back to the beginning if xfrm6_policy_check()
++       * and tcp_v6_fill_cb() are going to be called again.
++       * ip6_datagram_recv_specific_ctl() also expects IP6CB to be there.
++       */
++      memmove(IP6CB(skb), &TCP_SKB_CB(skb)->header.h6,
++              sizeof(struct inet6_skb_parm));
++}
++
+ /* The socket must have it's spinlock held when we get
+  * here, unless it is a TCP_LISTEN socket.
+  *
+@@ -1308,6 +1318,7 @@ ipv6_pktoptions:
+                       np->flow_label = ip6_flowlabel(ipv6_hdr(opt_skb));
+               if (ipv6_opt_accepted(sk, opt_skb, &TCP_SKB_CB(opt_skb)->header.h6)) {
+                       skb_set_owner_r(opt_skb, sk);
++                      tcp_v6_restore_cb(opt_skb);
+                       opt_skb = xchg(&np->pktoptions, opt_skb);
+               } else {
+                       __kfree_skb(opt_skb);
+@@ -1341,15 +1352,6 @@ static void tcp_v6_fill_cb(struct sk_buf
+       TCP_SKB_CB(skb)->sacked = 0;
+ }
+-static void tcp_v6_restore_cb(struct sk_buff *skb)
+-{
+-      /* We need to move header back to the beginning if xfrm6_policy_check()
+-       * and tcp_v6_fill_cb() are going to be called again.
+-       */
+-      memmove(IP6CB(skb), &TCP_SKB_CB(skb)->header.h6,
+-              sizeof(struct inet6_skb_parm));
+-}
+-
+ static int tcp_v6_rcv(struct sk_buff *skb)
+ {
+       const struct tcphdr *th;
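The memmove in tcp_v6_restore_cb() can be illustrated outside the kernel. Below is a minimal user-space sketch; the struct layouts, field names, and sizes are illustrative stand-ins for inet6_skb_parm and the TCP control block, not the kernel's real definitions:

```c
#include <assert.h>
#include <string.h>

/* skb->cb is a fixed scratch area shared by layers: IPv6 code expects
 * its inet6_skb_parm (IP6CB) at the start, while TCP lays its own
 * struct there and tucks the saved IPv6 parameters into header.h6 at a
 * different offset. */
#define CB_SIZE 48

struct inet6_parm { int iif; int flags; };   /* stand-in for inet6_skb_parm */

struct tcp_cb {                              /* stand-in for tcp_skb_cb */
    int seq;                                 /* TCP-private field first */
    struct { struct inet6_parm h6; } header; /* saved IPv6 info */
};

/* mimic tcp_v6_restore_cb(): move h6 back to the front of cb so code
 * expecting IP6CB (like ip6_datagram_recv_specific_ctl()) finds it.
 * memmove (not memcpy) is used because source and destination overlap. */
static void restore_cb(unsigned char cb[CB_SIZE])
{
    struct tcp_cb *tcb = (struct tcp_cb *)cb;
    memmove(cb, &tcb->header.h6, sizeof(struct inet6_parm));
}

/* read the interface index the IP6CB way: from the start of cb */
static int ip6cb_iif(const unsigned char *cb)
{
    return ((const struct inet6_parm *)cb)->iif;
}
```

Filling the TCP layout, calling restore_cb(), and then reading through the IP6CB view returns the saved values; handing the skb over without the restore is exactly the mismatch the crash dump above exposed.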
diff --git a/queue-4.4/net-add-netdev-all_adj_list-refcnt-propagation-to-fix-panic.patch b/queue-4.4/net-add-netdev-all_adj_list-refcnt-propagation-to-fix-panic.patch
new file mode 100644 (file)
index 0000000..821b712
--- /dev/null
@@ -0,0 +1,271 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Andrew Collins <acollins@cradlepoint.com>
+Date: Mon, 3 Oct 2016 13:43:02 -0600
+Subject: net: Add netdev all_adj_list refcnt propagation to fix panic
+
+From: Andrew Collins <acollins@cradlepoint.com>
+
+
+[ Upstream commit 93409033ae653f1c9a949202fb537ab095b2092f ]
+
+This is a respin of a patch to fix a relatively easily reproducible kernel
+panic related to the all_adj_list handling for netdevs in recent kernels.
+
+The following sequence of commands will reproduce the issue:
+
+ip link add link eth0 name eth0.100 type vlan id 100
+ip link add link eth0 name eth0.200 type vlan id 200
+ip link add name testbr type bridge
+ip link set eth0.100 master testbr
+ip link set eth0.200 master testbr
+ip link add link testbr mac0 type macvlan
+ip link delete dev testbr
+
+This creates an upper/lower tree of (excuse the poor ASCII art):
+
+            /---eth0.100-eth0
+mac0-testbr-
+            \---eth0.200-eth0
+
+When testbr is deleted, the all_adj_lists are walked, and eth0 is deleted twice from
+the mac0 list. Unfortunately, during setup in __netdev_upper_dev_link, only one
+reference to eth0 is added, so this results in a panic.
+
+This change propagates reference counts through the adjacency insert and
+remove paths so link and unlink operations stay balanced.
+
+Matthias Schiffer reported a similar crash in batman-adv:
+
+https://github.com/freifunk-gluon/gluon/issues/680
+https://www.open-mesh.org/issues/247
+
+which this patch also seems to resolve.
+
+Signed-off-by: Andrew Collins <acollins@cradlepoint.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/core/dev.c |   68 +++++++++++++++++++++++++++++++--------------------------
+ 1 file changed, 37 insertions(+), 31 deletions(-)
+
+--- a/net/core/dev.c
++++ b/net/core/dev.c
+@@ -5204,6 +5204,7 @@ static inline bool netdev_adjacent_is_ne
+ static int __netdev_adjacent_dev_insert(struct net_device *dev,
+                                       struct net_device *adj_dev,
++                                      u16 ref_nr,
+                                       struct list_head *dev_list,
+                                       void *private, bool master)
+ {
+@@ -5213,7 +5214,7 @@ static int __netdev_adjacent_dev_insert(
+       adj = __netdev_find_adj(adj_dev, dev_list);
+       if (adj) {
+-              adj->ref_nr++;
++              adj->ref_nr += ref_nr;
+               return 0;
+       }
+@@ -5223,7 +5224,7 @@ static int __netdev_adjacent_dev_insert(
+       adj->dev = adj_dev;
+       adj->master = master;
+-      adj->ref_nr = 1;
++      adj->ref_nr = ref_nr;
+       adj->private = private;
+       dev_hold(adj_dev);
+@@ -5262,6 +5263,7 @@ free_adj:
+ static void __netdev_adjacent_dev_remove(struct net_device *dev,
+                                        struct net_device *adj_dev,
++                                       u16 ref_nr,
+                                        struct list_head *dev_list)
+ {
+       struct netdev_adjacent *adj;
+@@ -5274,10 +5276,10 @@ static void __netdev_adjacent_dev_remove
+               BUG();
+       }
+-      if (adj->ref_nr > 1) {
+-              pr_debug("%s to %s ref_nr-- = %d\n", dev->name, adj_dev->name,
+-                       adj->ref_nr-1);
+-              adj->ref_nr--;
++      if (adj->ref_nr > ref_nr) {
++              pr_debug("%s to %s ref_nr-%d = %d\n", dev->name, adj_dev->name,
++                       ref_nr, adj->ref_nr-ref_nr);
++              adj->ref_nr -= ref_nr;
+               return;
+       }
+@@ -5296,21 +5298,22 @@ static void __netdev_adjacent_dev_remove
+ static int __netdev_adjacent_dev_link_lists(struct net_device *dev,
+                                           struct net_device *upper_dev,
++                                          u16 ref_nr,
+                                           struct list_head *up_list,
+                                           struct list_head *down_list,
+                                           void *private, bool master)
+ {
+       int ret;
+-      ret = __netdev_adjacent_dev_insert(dev, upper_dev, up_list, private,
+-                                         master);
++      ret = __netdev_adjacent_dev_insert(dev, upper_dev, ref_nr, up_list,
++                                         private, master);
+       if (ret)
+               return ret;
+-      ret = __netdev_adjacent_dev_insert(upper_dev, dev, down_list, private,
+-                                         false);
++      ret = __netdev_adjacent_dev_insert(upper_dev, dev, ref_nr, down_list,
++                                         private, false);
+       if (ret) {
+-              __netdev_adjacent_dev_remove(dev, upper_dev, up_list);
++              __netdev_adjacent_dev_remove(dev, upper_dev, ref_nr, up_list);
+               return ret;
+       }
+@@ -5318,9 +5321,10 @@ static int __netdev_adjacent_dev_link_li
+ }
+ static int __netdev_adjacent_dev_link(struct net_device *dev,
+-                                    struct net_device *upper_dev)
++                                    struct net_device *upper_dev,
++                                    u16 ref_nr)
+ {
+-      return __netdev_adjacent_dev_link_lists(dev, upper_dev,
++      return __netdev_adjacent_dev_link_lists(dev, upper_dev, ref_nr,
+                                               &dev->all_adj_list.upper,
+                                               &upper_dev->all_adj_list.lower,
+                                               NULL, false);
+@@ -5328,17 +5332,19 @@ static int __netdev_adjacent_dev_link(st
+ static void __netdev_adjacent_dev_unlink_lists(struct net_device *dev,
+                                              struct net_device *upper_dev,
++                                             u16 ref_nr,
+                                              struct list_head *up_list,
+                                              struct list_head *down_list)
+ {
+-      __netdev_adjacent_dev_remove(dev, upper_dev, up_list);
+-      __netdev_adjacent_dev_remove(upper_dev, dev, down_list);
++      __netdev_adjacent_dev_remove(dev, upper_dev, ref_nr, up_list);
++      __netdev_adjacent_dev_remove(upper_dev, dev, ref_nr, down_list);
+ }
+ static void __netdev_adjacent_dev_unlink(struct net_device *dev,
+-                                       struct net_device *upper_dev)
++                                       struct net_device *upper_dev,
++                                       u16 ref_nr)
+ {
+-      __netdev_adjacent_dev_unlink_lists(dev, upper_dev,
++      __netdev_adjacent_dev_unlink_lists(dev, upper_dev, ref_nr,
+                                          &dev->all_adj_list.upper,
+                                          &upper_dev->all_adj_list.lower);
+ }
+@@ -5347,17 +5353,17 @@ static int __netdev_adjacent_dev_link_ne
+                                               struct net_device *upper_dev,
+                                               void *private, bool master)
+ {
+-      int ret = __netdev_adjacent_dev_link(dev, upper_dev);
++      int ret = __netdev_adjacent_dev_link(dev, upper_dev, 1);
+       if (ret)
+               return ret;
+-      ret = __netdev_adjacent_dev_link_lists(dev, upper_dev,
++      ret = __netdev_adjacent_dev_link_lists(dev, upper_dev, 1,
+                                              &dev->adj_list.upper,
+                                              &upper_dev->adj_list.lower,
+                                              private, master);
+       if (ret) {
+-              __netdev_adjacent_dev_unlink(dev, upper_dev);
++              __netdev_adjacent_dev_unlink(dev, upper_dev, 1);
+               return ret;
+       }
+@@ -5367,8 +5373,8 @@ static int __netdev_adjacent_dev_link_ne
+ static void __netdev_adjacent_dev_unlink_neighbour(struct net_device *dev,
+                                                  struct net_device *upper_dev)
+ {
+-      __netdev_adjacent_dev_unlink(dev, upper_dev);
+-      __netdev_adjacent_dev_unlink_lists(dev, upper_dev,
++      __netdev_adjacent_dev_unlink(dev, upper_dev, 1);
++      __netdev_adjacent_dev_unlink_lists(dev, upper_dev, 1,
+                                          &dev->adj_list.upper,
+                                          &upper_dev->adj_list.lower);
+ }
+@@ -5420,7 +5426,7 @@ static int __netdev_upper_dev_link(struc
+               list_for_each_entry(j, &upper_dev->all_adj_list.upper, list) {
+                       pr_debug("Interlinking %s with %s, non-neighbour\n",
+                                i->dev->name, j->dev->name);
+-                      ret = __netdev_adjacent_dev_link(i->dev, j->dev);
++                      ret = __netdev_adjacent_dev_link(i->dev, j->dev, i->ref_nr);
+                       if (ret)
+                               goto rollback_mesh;
+               }
+@@ -5430,7 +5436,7 @@ static int __netdev_upper_dev_link(struc
+       list_for_each_entry(i, &upper_dev->all_adj_list.upper, list) {
+               pr_debug("linking %s's upper device %s with %s\n",
+                        upper_dev->name, i->dev->name, dev->name);
+-              ret = __netdev_adjacent_dev_link(dev, i->dev);
++              ret = __netdev_adjacent_dev_link(dev, i->dev, i->ref_nr);
+               if (ret)
+                       goto rollback_upper_mesh;
+       }
+@@ -5439,7 +5445,7 @@ static int __netdev_upper_dev_link(struc
+       list_for_each_entry(i, &dev->all_adj_list.lower, list) {
+               pr_debug("linking %s's lower device %s with %s\n", dev->name,
+                        i->dev->name, upper_dev->name);
+-              ret = __netdev_adjacent_dev_link(i->dev, upper_dev);
++              ret = __netdev_adjacent_dev_link(i->dev, upper_dev, i->ref_nr);
+               if (ret)
+                       goto rollback_lower_mesh;
+       }
+@@ -5453,7 +5459,7 @@ rollback_lower_mesh:
+       list_for_each_entry(i, &dev->all_adj_list.lower, list) {
+               if (i == to_i)
+                       break;
+-              __netdev_adjacent_dev_unlink(i->dev, upper_dev);
++              __netdev_adjacent_dev_unlink(i->dev, upper_dev, i->ref_nr);
+       }
+       i = NULL;
+@@ -5463,7 +5469,7 @@ rollback_upper_mesh:
+       list_for_each_entry(i, &upper_dev->all_adj_list.upper, list) {
+               if (i == to_i)
+                       break;
+-              __netdev_adjacent_dev_unlink(dev, i->dev);
++              __netdev_adjacent_dev_unlink(dev, i->dev, i->ref_nr);
+       }
+       i = j = NULL;
+@@ -5475,7 +5481,7 @@ rollback_mesh:
+               list_for_each_entry(j, &upper_dev->all_adj_list.upper, list) {
+                       if (i == to_i && j == to_j)
+                               break;
+-                      __netdev_adjacent_dev_unlink(i->dev, j->dev);
++                      __netdev_adjacent_dev_unlink(i->dev, j->dev, i->ref_nr);
+               }
+               if (i == to_i)
+                       break;
+@@ -5559,16 +5565,16 @@ void netdev_upper_dev_unlink(struct net_
+        */
+       list_for_each_entry(i, &dev->all_adj_list.lower, list)
+               list_for_each_entry(j, &upper_dev->all_adj_list.upper, list)
+-                      __netdev_adjacent_dev_unlink(i->dev, j->dev);
++                      __netdev_adjacent_dev_unlink(i->dev, j->dev, i->ref_nr);
+       /* remove also the devices itself from lower/upper device
+        * list
+        */
+       list_for_each_entry(i, &dev->all_adj_list.lower, list)
+-              __netdev_adjacent_dev_unlink(i->dev, upper_dev);
++              __netdev_adjacent_dev_unlink(i->dev, upper_dev, i->ref_nr);
+       list_for_each_entry(i, &upper_dev->all_adj_list.upper, list)
+-              __netdev_adjacent_dev_unlink(dev, i->dev);
++              __netdev_adjacent_dev_unlink(dev, i->dev, i->ref_nr);
+       call_netdevice_notifiers_info(NETDEV_CHANGEUPPER, dev,
+                                     &changeupper_info.info);
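The refcount propagation can be modeled with a single toy adjacency entry. This is a simplified illustration of the balance the patch restores, not kernel code; the pre-fix code inserted links with an implicit count of 1 while the mesh-unlink paths removed with the propagated count, so a still-referenced link could be freed:

```c
#include <assert.h>

/* One adjacency entry with an explicit reference count, mirroring the
 * fixed __netdev_adjacent_dev_insert()/__netdev_adjacent_dev_remove():
 * both paths now take a ref_nr, so inserts and removes balance. */
struct adj { int linked; int ref_nr; };

/* insert ref_nr references to an adjacent device */
static void adj_insert(struct adj *a, int ref_nr)
{
    if (a->linked) {
        a->ref_nr += ref_nr;   /* already linked: just bump the count */
        return;
    }
    a->linked = 1;
    a->ref_nr = ref_nr;
}

/* drop ref_nr references; tear the link down only at zero */
static void adj_remove(struct adj *a, int ref_nr)
{
    assert(a->linked && a->ref_nr >= ref_nr);  /* the kernel BUG()s here */
    a->ref_nr -= ref_nr;
    if (a->ref_nr == 0)
        a->linked = 0;
}
```

In the reproducer topology, mac0 reaches eth0 through both eth0.100 and eth0.200; with counts propagated, the two link operations are matched by two unlink operations during teardown, and the entry survives until the last reference is gone.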
diff --git a/queue-4.4/net-add-recursion-limit-to-gro.patch b/queue-4.4/net-add-recursion-limit-to-gro.patch
new file mode 100644 (file)
index 0000000..bac5fe8
--- /dev/null
@@ -0,0 +1,230 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Sabrina Dubroca <sd@queasysnail.net>
+Date: Thu, 20 Oct 2016 15:58:02 +0200
+Subject: net: add recursion limit to GRO
+
+From: Sabrina Dubroca <sd@queasysnail.net>
+
+
+[ Upstream commit fcd91dd449867c6bfe56a81cabba76b829fd05cd ]
+
+Currently, GRO can do unlimited recursion through the gro_receive
+handlers.  This was fixed for tunneling protocols by limiting tunnel GRO
+to one level with encap_mark, but both VLAN and TEB still have this
+problem.  Thus, the kernel is vulnerable to a stack overflow, if we
+receive a packet composed entirely of VLAN headers.
+
+This patch adds a recursion counter to the GRO layer to prevent stack
+overflow.  When a gro_receive function hits the recursion limit, GRO is
+aborted for this skb and it is processed normally.  This recursion
+counter is put in the GRO CB, but could be turned into a percpu counter
+if we run out of space in the CB.
+
+Thanks to Vladimír Beneš <vbenes@redhat.com> for the initial bug report.
+
+Fixes: CVE-2016-7039
+Fixes: 9b174d88c257 ("net: Add Transparent Ethernet Bridging GRO support.")
+Fixes: 66e5133f19e9 ("vlan: Add GRO support for non hardware accelerated vlan")
+Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
+Reviewed-by: Jiri Benc <jbenc@redhat.com>
+Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
+Acked-by: Tom Herbert <tom@herbertland.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ drivers/net/geneve.c      |    2 +-
+ drivers/net/vxlan.c       |    2 +-
+ include/linux/netdevice.h |   40 +++++++++++++++++++++++++++++++++++++++-
+ net/8021q/vlan.c          |    2 +-
+ net/core/dev.c            |    1 +
+ net/ethernet/eth.c        |    2 +-
+ net/ipv4/af_inet.c        |    2 +-
+ net/ipv4/fou.c            |    4 ++--
+ net/ipv4/gre_offload.c    |    2 +-
+ net/ipv4/udp_offload.c    |    4 ++--
+ net/ipv6/ip6_offload.c    |    2 +-
+ 11 files changed, 51 insertions(+), 12 deletions(-)
+
+--- a/drivers/net/geneve.c
++++ b/drivers/net/geneve.c
+@@ -440,7 +440,7 @@ static struct sk_buff **geneve_gro_recei
+       skb_gro_pull(skb, gh_len);
+       skb_gro_postpull_rcsum(skb, gh, gh_len);
+-      pp = ptype->callbacks.gro_receive(head, skb);
++      pp = call_gro_receive(ptype->callbacks.gro_receive, head, skb);
+ out_unlock:
+       rcu_read_unlock();
+--- a/drivers/net/vxlan.c
++++ b/drivers/net/vxlan.c
+@@ -593,7 +593,7 @@ static struct sk_buff **vxlan_gro_receiv
+               }
+       }
+-      pp = eth_gro_receive(head, skb);
++      pp = call_gro_receive(eth_gro_receive, head, skb);
+ out:
+       skb_gro_remcsum_cleanup(skb, &grc);
+--- a/include/linux/netdevice.h
++++ b/include/linux/netdevice.h
+@@ -2003,7 +2003,10 @@ struct napi_gro_cb {
+       /* Used in foo-over-udp, set in udp[46]_gro_receive */
+       u8      is_ipv6:1;
+-      /* 7 bit hole */
++      /* Number of gro_receive callbacks this packet already went through */
++      u8 recursion_counter:4;
++
++      /* 3 bit hole */
+       /* used to support CHECKSUM_COMPLETE for tunneling protocols */
+       __wsum  csum;
+@@ -2014,6 +2017,25 @@ struct napi_gro_cb {
+ #define NAPI_GRO_CB(skb) ((struct napi_gro_cb *)(skb)->cb)
++#define GRO_RECURSION_LIMIT 15
++static inline int gro_recursion_inc_test(struct sk_buff *skb)
++{
++      return ++NAPI_GRO_CB(skb)->recursion_counter == GRO_RECURSION_LIMIT;
++}
++
++typedef struct sk_buff **(*gro_receive_t)(struct sk_buff **, struct sk_buff *);
++static inline struct sk_buff **call_gro_receive(gro_receive_t cb,
++                                              struct sk_buff **head,
++                                              struct sk_buff *skb)
++{
++      if (unlikely(gro_recursion_inc_test(skb))) {
++              NAPI_GRO_CB(skb)->flush |= 1;
++              return NULL;
++      }
++
++      return cb(head, skb);
++}
++
+ struct packet_type {
+       __be16                  type;   /* This is really htons(ether_type). */
+       struct net_device       *dev;   /* NULL is wildcarded here           */
+@@ -2059,6 +2081,22 @@ struct udp_offload {
+       struct udp_offload_callbacks callbacks;
+ };
++typedef struct sk_buff **(*gro_receive_udp_t)(struct sk_buff **,
++                                            struct sk_buff *,
++                                            struct udp_offload *);
++static inline struct sk_buff **call_gro_receive_udp(gro_receive_udp_t cb,
++                                                  struct sk_buff **head,
++                                                  struct sk_buff *skb,
++                                                  struct udp_offload *uoff)
++{
++      if (unlikely(gro_recursion_inc_test(skb))) {
++              NAPI_GRO_CB(skb)->flush |= 1;
++              return NULL;
++      }
++
++      return cb(head, skb, uoff);
++}
++
+ /* often modified stats are per cpu, other are shared (netdev->stats) */
+ struct pcpu_sw_netstats {
+       u64     rx_packets;
+--- a/net/8021q/vlan.c
++++ b/net/8021q/vlan.c
+@@ -659,7 +659,7 @@ static struct sk_buff **vlan_gro_receive
+       skb_gro_pull(skb, sizeof(*vhdr));
+       skb_gro_postpull_rcsum(skb, vhdr, sizeof(*vhdr));
+-      pp = ptype->callbacks.gro_receive(head, skb);
++      pp = call_gro_receive(ptype->callbacks.gro_receive, head, skb);
+ out_unlock:
+       rcu_read_unlock();
+--- a/net/core/dev.c
++++ b/net/core/dev.c
+@@ -4240,6 +4240,7 @@ static enum gro_result dev_gro_receive(s
+               NAPI_GRO_CB(skb)->flush = 0;
+               NAPI_GRO_CB(skb)->free = 0;
+               NAPI_GRO_CB(skb)->encap_mark = 0;
++              NAPI_GRO_CB(skb)->recursion_counter = 0;
+               NAPI_GRO_CB(skb)->gro_remcsum_start = 0;
+               /* Setup for GRO checksum validation */
+--- a/net/ethernet/eth.c
++++ b/net/ethernet/eth.c
+@@ -436,7 +436,7 @@ struct sk_buff **eth_gro_receive(struct
+       skb_gro_pull(skb, sizeof(*eh));
+       skb_gro_postpull_rcsum(skb, eh, sizeof(*eh));
+-      pp = ptype->callbacks.gro_receive(head, skb);
++      pp = call_gro_receive(ptype->callbacks.gro_receive, head, skb);
+ out_unlock:
+       rcu_read_unlock();
+--- a/net/ipv4/af_inet.c
++++ b/net/ipv4/af_inet.c
+@@ -1372,7 +1372,7 @@ static struct sk_buff **inet_gro_receive
+       skb_gro_pull(skb, sizeof(*iph));
+       skb_set_transport_header(skb, skb_gro_offset(skb));
+-      pp = ops->callbacks.gro_receive(head, skb);
++      pp = call_gro_receive(ops->callbacks.gro_receive, head, skb);
+ out_unlock:
+       rcu_read_unlock();
+--- a/net/ipv4/fou.c
++++ b/net/ipv4/fou.c
+@@ -201,7 +201,7 @@ static struct sk_buff **fou_gro_receive(
+       if (!ops || !ops->callbacks.gro_receive)
+               goto out_unlock;
+-      pp = ops->callbacks.gro_receive(head, skb);
++      pp = call_gro_receive(ops->callbacks.gro_receive, head, skb);
+ out_unlock:
+       rcu_read_unlock();
+@@ -360,7 +360,7 @@ static struct sk_buff **gue_gro_receive(
+       if (WARN_ON_ONCE(!ops || !ops->callbacks.gro_receive))
+               goto out_unlock;
+-      pp = ops->callbacks.gro_receive(head, skb);
++      pp = call_gro_receive(ops->callbacks.gro_receive, head, skb);
+ out_unlock:
+       rcu_read_unlock();
+--- a/net/ipv4/gre_offload.c
++++ b/net/ipv4/gre_offload.c
+@@ -219,7 +219,7 @@ static struct sk_buff **gre_gro_receive(
+       /* Adjusted NAPI_GRO_CB(skb)->csum after skb_gro_pull()*/
+       skb_gro_postpull_rcsum(skb, greh, grehlen);
+-      pp = ptype->callbacks.gro_receive(head, skb);
++      pp = call_gro_receive(ptype->callbacks.gro_receive, head, skb);
+ out_unlock:
+       rcu_read_unlock();
+--- a/net/ipv4/udp_offload.c
++++ b/net/ipv4/udp_offload.c
+@@ -339,8 +339,8 @@ unflush:
+       skb_gro_pull(skb, sizeof(struct udphdr)); /* pull encapsulating udp header */
+       skb_gro_postpull_rcsum(skb, uh, sizeof(struct udphdr));
+       NAPI_GRO_CB(skb)->proto = uo_priv->offload->ipproto;
+-      pp = uo_priv->offload->callbacks.gro_receive(head, skb,
+-                                                   uo_priv->offload);
++      pp = call_gro_receive_udp(uo_priv->offload->callbacks.gro_receive,
++                                head, skb, uo_priv->offload);
+ out_unlock:
+       rcu_read_unlock();
+--- a/net/ipv6/ip6_offload.c
++++ b/net/ipv6/ip6_offload.c
+@@ -247,7 +247,7 @@ static struct sk_buff **ipv6_gro_receive
+       skb_gro_postpull_rcsum(skb, iph, nlen);
+-      pp = ops->callbacks.gro_receive(head, skb);
++      pp = call_gro_receive(ops->callbacks.gro_receive, head, skb);
+ out_unlock:
+       rcu_read_unlock();
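The guard that call_gro_receive() adds can be shown standalone. In this sketch a toy packet struct stands in for the GRO control block, and a self-recursing handler plays the role of stacked-VLAN processing; the limit value mirrors the patch:

```c
#include <assert.h>

/* Per-packet recursion counter, as in the patch: bumped on every nested
 * gro_receive call, and once it reaches the limit GRO is aborted for
 * that packet (flush set) instead of recursing further. */
#define GRO_RECURSION_LIMIT 15

struct pkt { int recursion_counter; int flush; };

static int gro_recursion_inc_test(struct pkt *p)
{
    return ++p->recursion_counter == GRO_RECURSION_LIMIT;
}

typedef int (*gro_receive_t)(struct pkt *);

/* wrapper every indirect gro_receive call goes through */
static int call_gro_receive(gro_receive_t cb, struct pkt *p)
{
    if (gro_recursion_inc_test(p)) {
        p->flush = 1;   /* give up on GRO; process the skb normally */
        return -1;
    }
    return cb(p);
}

/* a handler that keeps recursing, as a packet composed entirely of
 * stacked VLAN headers would before the fix */
static int vlan_like_receive(struct pkt *p)
{
    return call_gro_receive(vlan_like_receive, p);
}
```

Without the guard, vlan_like_receive() would recurse without bound and overflow the stack; with it, the chain stops after GRO_RECURSION_LIMIT nested calls and the packet falls back to the normal receive path.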
diff --git a/queue-4.4/net-avoid-sk_forward_alloc-overflows.patch b/queue-4.4/net-avoid-sk_forward_alloc-overflows.patch
new file mode 100644 (file)
index 0000000..20ade1e
--- /dev/null
@@ -0,0 +1,55 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Eric Dumazet <edumazet@google.com>
+Date: Thu, 15 Sep 2016 08:48:46 -0700
+Subject: net: avoid sk_forward_alloc overflows
+
+From: Eric Dumazet <edumazet@google.com>
+
+
+[ Upstream commit 20c64d5cd5a2bdcdc8982a06cb05e5e1bd851a3d ]
+
+A malicious TCP receiver, sending SACK, can force the sender to split
+skbs in write queue and increase its memory usage.
+
+Then, when the socket is closed and its write queue purged, we might
+overflow sk_forward_alloc (it becomes negative).
+
+sk_mem_reclaim() does nothing in this case, and more than 2 GB are
+leaked from TCP's perspective (tcp_memory_allocated is not changed).
+
+Warnings then trigger from inet_sock_destruct() and
+sk_stream_kill_queues(), which see a non-zero sk_forward_alloc.
+
+The whole TCP stack can then stall, because TCP believes it is under
+memory pressure.
+
+A simple fix is to preemptively reclaim from sk_mem_uncharge().
+
+This makes sure a socket won't have more than 2 MB forward allocated,
+after burst and idle period.
+
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ include/net/sock.h |   10 ++++++++++
+ 1 file changed, 10 insertions(+)
+
+--- a/include/net/sock.h
++++ b/include/net/sock.h
+@@ -1425,6 +1425,16 @@ static inline void sk_mem_uncharge(struc
+       if (!sk_has_account(sk))
+               return;
+       sk->sk_forward_alloc += size;
++
++      /* Avoid a possible overflow.
++       * TCP send queues can make this happen, if sk_mem_reclaim()
++       * is not called and more than 2 GBytes are released at once.
++       *
++       * If we reach 2 MBytes, reclaim 1 MBytes right now, there is
++       * no need to hold that much forward allocation anyway.
++       */
++      if (unlikely(sk->sk_forward_alloc >= 1 << 21))
++              __sk_mem_reclaim(sk, 1 << 20);
+ }
+ static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb)
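The bounded-uncharge idea is easy to check in isolation. A toy model follows — the socket and the reclaim function are stand-ins, but the thresholds (1 << 21 and 1 << 20) mirror the patch:

```c
#include <assert.h>

/* Once forward allocation crosses 2 MB, 1 MB is reclaimed immediately,
 * so repeated uncharges can never creep toward the 2 GB signed-overflow
 * point the changelog describes. */
struct toy_sock { long forward_alloc; };

static void toy_reclaim(struct toy_sock *sk, long amount)
{
    sk->forward_alloc -= amount;   /* stand-in for __sk_mem_reclaim() */
}

static void toy_uncharge(struct toy_sock *sk, long size)
{
    sk->forward_alloc += size;
    if (sk->forward_alloc >= 1L << 21)   /* reached 2 MB ... */
        toy_reclaim(sk, 1L << 20);       /* ... give 1 MB back now */
}
```

Whatever the uncharge pattern, forward_alloc stays between 0 and roughly 2 MB after every call, which is the invariant the real sk_mem_uncharge() now enforces.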
diff --git a/queue-4.4/net-fec-set-mac-address-unconditionally.patch b/queue-4.4/net-fec-set-mac-address-unconditionally.patch
new file mode 100644 (file)
index 0000000..aea7166
--- /dev/null
@@ -0,0 +1,46 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Gavin Schenk <g.schenk@eckelmann.de>
+Date: Fri, 30 Sep 2016 11:46:10 +0200
+Subject: net: fec: set mac address unconditionally
+
+From: Gavin Schenk <g.schenk@eckelmann.de>
+
+
+[ Upstream commit b82d44d78480faff7456e9e0999acb9d38666057 ]
+
+If the MAC address does not originate from the device tree, it can only
+be safely assigned after "link up" of the device. When the link is down
+the clocks are disabled, and because registers cannot reliably be
+written while the clocks are off, the new MAC address cannot be set in
+.ndo_set_mac_address() on some SoCs. This fix sets the MAC address
+unconditionally in fec_restart(...) and ensures consistency between the
+FEC registers and the network layer.
+
+Signed-off-by: Gavin Schenk <g.schenk@eckelmann.de>
+Acked-by: Fugang Duan <fugang.duan@nxp.com>
+Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
+Fixes: 9638d19e4816 ("net: fec: add netif status check before set mac address")
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ drivers/net/ethernet/freescale/fec_main.c |   10 +++++-----
+ 1 file changed, 5 insertions(+), 5 deletions(-)
+
+--- a/drivers/net/ethernet/freescale/fec_main.c
++++ b/drivers/net/ethernet/freescale/fec_main.c
+@@ -944,11 +944,11 @@ fec_restart(struct net_device *ndev)
+        * enet-mac reset will reset mac address registers too,
+        * so need to reconfigure it.
+        */
+-      if (fep->quirks & FEC_QUIRK_ENET_MAC) {
+-              memcpy(&temp_mac, ndev->dev_addr, ETH_ALEN);
+-              writel(cpu_to_be32(temp_mac[0]), fep->hwp + FEC_ADDR_LOW);
+-              writel(cpu_to_be32(temp_mac[1]), fep->hwp + FEC_ADDR_HIGH);
+-      }
++      memcpy(&temp_mac, ndev->dev_addr, ETH_ALEN);
++      writel((__force u32)cpu_to_be32(temp_mac[0]),
++             fep->hwp + FEC_ADDR_LOW);
++      writel((__force u32)cpu_to_be32(temp_mac[1]),
++             fep->hwp + FEC_ADDR_HIGH);
+       /* Clear any outstanding interrupt. */
+       writel(0xffffffff, fep->hwp + FEC_IEVENT);
diff --git a/queue-4.4/net-pktgen-fix-pkt_size.patch b/queue-4.4/net-pktgen-fix-pkt_size.patch
new file mode 100644 (file)
index 0000000..73d8f0c
--- /dev/null
@@ -0,0 +1,108 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Paolo Abeni <pabeni@redhat.com>
+Date: Fri, 30 Sep 2016 16:56:45 +0200
+Subject: net: pktgen: fix pkt_size
+
+From: Paolo Abeni <pabeni@redhat.com>
+
+
+[ Upstream commit 63d75463c91a5b5be7c0aca11ceb45ea5a0ae81d ]
+
+The commit 879c7220e828 ("net: pktgen: Observe needed_headroom
+of the device") increased the 'pkt_overhead' field value by
+LL_RESERVED_SPACE.
+As a side effect the generated packet size, computed as:
+
+       /* Eth + IPh + UDPh + mpls */
+       datalen = pkt_dev->cur_pkt_size - 14 - 20 - 8 -
+                 pkt_dev->pkt_overhead;
+
+is decreased by the same value.
+The above slightly changed the behavior of existing pktgen users,
+and made the procfs interface somewhat inconsistent.
+Fix it by restoring the previous pkt_overhead value and using
+LL_RESERVED_SPACE as extralen in skb allocation.
+Also, change pktgen_alloc_skb() to only partially reserve
+the headroom to allow the caller to prefetch from ll header
+start.
+
+v1 -> v2:
+ - fixed some typos in the comments
+
+Fixes: 879c7220e828 ("net: pktgen: Observe needed_headroom of the device")
+Suggested-by: Ben Greear <greearb@candelatech.com>
+Signed-off-by: Paolo Abeni <pabeni@redhat.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/core/pktgen.c |   21 ++++++++++-----------
+ 1 file changed, 10 insertions(+), 11 deletions(-)
+
+--- a/net/core/pktgen.c
++++ b/net/core/pktgen.c
+@@ -2278,7 +2278,7 @@ static void spin(struct pktgen_dev *pkt_
+ static inline void set_pkt_overhead(struct pktgen_dev *pkt_dev)
+ {
+-      pkt_dev->pkt_overhead = LL_RESERVED_SPACE(pkt_dev->odev);
++      pkt_dev->pkt_overhead = 0;
+       pkt_dev->pkt_overhead += pkt_dev->nr_labels*sizeof(u32);
+       pkt_dev->pkt_overhead += VLAN_TAG_SIZE(pkt_dev);
+       pkt_dev->pkt_overhead += SVLAN_TAG_SIZE(pkt_dev);
+@@ -2769,13 +2769,13 @@ static void pktgen_finalize_skb(struct p
+ }
+ static struct sk_buff *pktgen_alloc_skb(struct net_device *dev,
+-                                      struct pktgen_dev *pkt_dev,
+-                                      unsigned int extralen)
++                                      struct pktgen_dev *pkt_dev)
+ {
++      unsigned int extralen = LL_RESERVED_SPACE(dev);
+       struct sk_buff *skb = NULL;
+-      unsigned int size = pkt_dev->cur_pkt_size + 64 + extralen +
+-                          pkt_dev->pkt_overhead;
++      unsigned int size;
++      size = pkt_dev->cur_pkt_size + 64 + extralen + pkt_dev->pkt_overhead;
+       if (pkt_dev->flags & F_NODE) {
+               int node = pkt_dev->node >= 0 ? pkt_dev->node : numa_node_id();
+@@ -2788,8 +2788,9 @@ static struct sk_buff *pktgen_alloc_skb(
+                skb = __netdev_alloc_skb(dev, size, GFP_NOWAIT);
+       }
++      /* the caller pre-fetches from skb->data and reserves for the mac hdr */
+       if (likely(skb))
+-              skb_reserve(skb, LL_RESERVED_SPACE(dev));
++              skb_reserve(skb, extralen - 16);
+       return skb;
+ }
+@@ -2822,16 +2823,14 @@ static struct sk_buff *fill_packet_ipv4(
+       mod_cur_headers(pkt_dev);
+       queue_map = pkt_dev->cur_queue_map;
+-      datalen = (odev->hard_header_len + 16) & ~0xf;
+-
+-      skb = pktgen_alloc_skb(odev, pkt_dev, datalen);
++      skb = pktgen_alloc_skb(odev, pkt_dev);
+       if (!skb) {
+               sprintf(pkt_dev->result, "No memory");
+               return NULL;
+       }
+       prefetchw(skb->data);
+-      skb_reserve(skb, datalen);
++      skb_reserve(skb, 16);
+       /*  Reserve for ethernet and IP header  */
+       eth = (__u8 *) skb_push(skb, 14);
+@@ -2951,7 +2950,7 @@ static struct sk_buff *fill_packet_ipv6(
+       mod_cur_headers(pkt_dev);
+       queue_map = pkt_dev->cur_queue_map;
+-      skb = pktgen_alloc_skb(odev, pkt_dev, 16);
++      skb = pktgen_alloc_skb(odev, pkt_dev);
+       if (!skb) {
+               sprintf(pkt_dev->result, "No memory");
+               return NULL;
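The payload computation quoted in the changelog can be checked in isolation. A small sketch with illustrative numbers — the constants 14, 20 and 8 are the Ethernet, IPv4 and UDP header sizes from the formula above:

```c
#include <assert.h>

/* UDP payload length of a generated IPv4 packet, as in the changelog:
 * requested size minus Ethernet (14), IPv4 (20) and UDP (8) headers and
 * the per-device overhead (MPLS labels, VLAN tags, etc.). */
static int udp_datalen(int cur_pkt_size, int pkt_overhead)
{
    return cur_pkt_size - 14 - 20 - 8 - pkt_overhead;
}
```

Before the fix, pkt_overhead also contained LL_RESERVED_SPACE (often 16 on a plain Ethernet device, though the exact value depends on needed_headroom), so every generated packet came out that much smaller than the size the user configured via procfs.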
diff --git a/queue-4.4/net-pktgen-remove-rcu-locking-in-pktgen_change_name.patch b/queue-4.4/net-pktgen-remove-rcu-locking-in-pktgen_change_name.patch
new file mode 100644 (file)
index 0000000..1874eda
--- /dev/null
@@ -0,0 +1,91 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Eric Dumazet <edumazet@google.com>
+Date: Sat, 15 Oct 2016 17:50:49 +0200
+Subject: net: pktgen: remove rcu locking in pktgen_change_name()
+
+From: Eric Dumazet <edumazet@google.com>
+
+
+[ Upstream commit 9a0b1e8ba4061778897b544afc898de2163382f7 ]
+
+After Jesper's commit back in linux-3.18, we trigger a lockdep
+splat in proc_create_data() while allocating memory from
+pktgen_change_name().
+
+This patch converts t->if_lock to a mutex, since it is now only
+used from the control path, and adds proper locking to
+pktgen_change_name():
+
+1) pktgen_thread_lock to protect the outer loop (iterating threads)
+2) t->if_lock to protect the inner loop (iterating devices)
+
+Note that before Jesper patch, pktgen_change_name() was lacking proper
+protection, but lockdep was not able to detect the problem.
+
+Fixes: 8788370a1d4b ("pktgen: RCU-ify "if_list" to remove lock in next_to_run()")
+Reported-by: John Sperbeck <jsperbeck@google.com>
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Cc: Jesper Dangaard Brouer <brouer@redhat.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/core/pktgen.c |   17 ++++++++++-------
+ 1 file changed, 10 insertions(+), 7 deletions(-)
+
+--- a/net/core/pktgen.c
++++ b/net/core/pktgen.c
+@@ -215,8 +215,8 @@
+ #define M_NETIF_RECEIVE       1       /* Inject packets into stack */
+ /* If lock -- protects updating of if_list */
+-#define   if_lock(t)           spin_lock(&(t->if_lock));
+-#define   if_unlock(t)           spin_unlock(&(t->if_lock));
++#define   if_lock(t)           mutex_lock(&(t->if_lock));
++#define   if_unlock(t)           mutex_unlock(&(t->if_lock));
+ /* Used to help with determining the pkts on receive */
+ #define PKTGEN_MAGIC 0xbe9be955
+@@ -422,7 +422,7 @@ struct pktgen_net {
+ };
+ struct pktgen_thread {
+-      spinlock_t if_lock;             /* for list of devices */
++      struct mutex if_lock;           /* for list of devices */
+       struct list_head if_list;       /* All device here */
+       struct list_head th_list;
+       struct task_struct *tsk;
+@@ -2002,11 +2002,13 @@ static void pktgen_change_name(const str
+ {
+       struct pktgen_thread *t;
++      mutex_lock(&pktgen_thread_lock);
++
+       list_for_each_entry(t, &pn->pktgen_threads, th_list) {
+               struct pktgen_dev *pkt_dev;
+-              rcu_read_lock();
+-              list_for_each_entry_rcu(pkt_dev, &t->if_list, list) {
++              if_lock(t);
++              list_for_each_entry(pkt_dev, &t->if_list, list) {
+                       if (pkt_dev->odev != dev)
+                               continue;
+@@ -2021,8 +2023,9 @@ static void pktgen_change_name(const str
+                                      dev->name);
+                       break;
+               }
+-              rcu_read_unlock();
++              if_unlock(t);
+       }
++      mutex_unlock(&pktgen_thread_lock);
+ }
+ static int pktgen_device_event(struct notifier_block *unused,
+@@ -3726,7 +3729,7 @@ static int __net_init pktgen_create_thre
+               return -ENOMEM;
+       }
+-      spin_lock_init(&t->if_lock);
++      mutex_init(&t->if_lock);
+       t->cpu = cpu;
+       INIT_LIST_HEAD(&t->if_list);
diff --git a/queue-4.4/net-sched-act_vlan-push-skb-data-to-mac_header-prior-calling-skb_vlan_-functions.patch b/queue-4.4/net-sched-act_vlan-push-skb-data-to-mac_header-prior-calling-skb_vlan_-functions.patch
new file mode 100644 (file)
index 0000000..677c926
--- /dev/null
@@ -0,0 +1,87 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Shmulik Ladkani <shmulik.ladkani@gmail.com>
+Date: Thu, 29 Sep 2016 12:10:40 +0300
+Subject: net/sched: act_vlan: Push skb->data to mac_header prior calling skb_vlan_*() functions
+
+From: Shmulik Ladkani <shmulik.ladkani@gmail.com>
+
+
+[ Upstream commit f39acc84aad10710e89835c60d3b6694c43a8dd9 ]
+
+Generic skb_vlan_push/skb_vlan_pop functions don't properly handle the
+case where the input skb data pointer does not point at the mac header:
+
+- They're doing push/pop, but fail to properly unwind data back to its
+  original location.
+  For example, in the skb_vlan_push case, any subsequent
+  'skb_push(skb, skb->mac_len)' calls make the skb->data point 4 bytes
+  BEFORE start of frame, leading to bogus frames that may be transmitted.
+
+- They update rcsum per the added/removed 4 bytes tag.
+  Alas if data is originally after the vlan/eth headers, then these
+  bytes were already pulled out of the csum.
+
+OTOH calling skb_vlan_push/skb_vlan_pop with skb->data at mac_header
+presents no issues.
+
+act_vlan is the only caller to skb_vlan_*() that has skb->data pointing
+at network header (upon ingress).
+Other callers (ovs, bpf) already adjust skb->data at mac_header.
+
+This patch fixes act_vlan to point to the mac_header prior to calling
+skb_vlan_*() functions, as other callers do.
+
+Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
+Cc: Daniel Borkmann <daniel@iogearbox.net>
+Cc: Pravin Shelar <pshelar@ovn.org>
+Cc: Jiri Pirko <jiri@mellanox.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ include/net/sch_generic.h |    9 +++++++++
+ net/sched/act_vlan.c      |    9 +++++++++
+ 2 files changed, 18 insertions(+)
+
+--- a/include/net/sch_generic.h
++++ b/include/net/sch_generic.h
+@@ -408,6 +408,15 @@ bool tcf_destroy(struct tcf_proto *tp, b
+ void tcf_destroy_chain(struct tcf_proto __rcu **fl);
+ int skb_do_redirect(struct sk_buff *);
++static inline bool skb_at_tc_ingress(const struct sk_buff *skb)
++{
++#ifdef CONFIG_NET_CLS_ACT
++      return G_TC_AT(skb->tc_verd) & AT_INGRESS;
++#else
++      return false;
++#endif
++}
++
+ /* Reset all TX qdiscs greater then index of a device.  */
+ static inline void qdisc_reset_all_tx_gt(struct net_device *dev, unsigned int i)
+ {
+--- a/net/sched/act_vlan.c
++++ b/net/sched/act_vlan.c
+@@ -33,6 +33,12 @@ static int tcf_vlan(struct sk_buff *skb,
+       bstats_update(&v->tcf_bstats, skb);
+       action = v->tcf_action;
++      /* Ensure 'data' points at mac_header prior calling vlan manipulating
++       * functions.
++       */
++      if (skb_at_tc_ingress(skb))
++              skb_push_rcsum(skb, skb->mac_len);
++
+       switch (v->tcfv_action) {
+       case TCA_VLAN_ACT_POP:
+               err = skb_vlan_pop(skb);
+@@ -54,6 +60,9 @@ drop:
+       action = TC_ACT_SHOT;
+       v->tcf_qstats.drops++;
+ unlock:
++      if (skb_at_tc_ingress(skb))
++              skb_pull_rcsum(skb, skb->mac_len);
++
+       spin_unlock(&v->tcf_lock);
+       return action;
+ }
diff --git a/queue-4.4/net-sched-filters-fix-notification-of-filter-delete-with-proper-handle.patch b/queue-4.4/net-sched-filters-fix-notification-of-filter-delete-with-proper-handle.patch
new file mode 100644 (file)
index 0000000..3a1aca6
--- /dev/null
@@ -0,0 +1,80 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Jamal Hadi Salim <jhs@mojatatu.com>
+Date: Mon, 24 Oct 2016 20:18:27 -0400
+Subject: net sched filters: fix notification of filter delete with proper handle
+
+From: Jamal Hadi Salim <jhs@mojatatu.com>
+
+
+[ Upstream commit 9ee7837449b3d6f0fcf9132c6b5e5aaa58cc67d4 ]
+
+Daniel says:
+
+While trying out [1][2], I noticed that tc monitor doesn't show the
+correct handle on delete:
+
+$ tc monitor
+qdisc clsact ffff: dev eno1 parent ffff:fff1
+filter dev eno1 ingress protocol all pref 49152 bpf handle 0x2a [...]
+deleted filter dev eno1 ingress protocol all pref 49152 bpf handle 0xf3be0c80
+
+some context to explain the above:
+The user identity of any tc filter is represented by a 32-bit
+identifier encoded in tcm->tcm_handle. Example 0x2a in the bpf filter
+above. A user wishing to delete, get or even modify a specific filter
+uses this handle to reference it.
+Every classifier is free to provide its own semantics for the 32 bit handle.
+Example: classifiers like u32 use schemes like 800:1:801 to describe
+the semantics of their filters represented as hash table, bucket and
+node ids etc.
+Classifiers also have internal per-filter representation which is different
+from this externally visible identity. Most classifiers set this
+internal representation to be a pointer address (which allows fast retrieval
+of said filters in their implementations). This internal representation
+is referenced with the "fh" variable in the kernel control code.
+
+When a user successfully deletes a specific filter, by specifying the correct
+tcm->tcm_handle, an event is generated to user space which indicates
+which specific filter was deleted.
+
+Before this patch, the "fh" value was sent to user space as the identity.
+As an example what is shown in the sample bpf filter delete event above
+is 0xf3be0c80. This is in fact a 32-bit truncation of 0xffff8807f3be0c80
+which happens to be a 64-bit memory address of the internal filter
+representation (address of the corresponding filter's struct cls_bpf_prog);
+
+After this patch the appropriate user identifiable handle as encoded
+in the originating request tcm->tcm_handle is generated in the event.
+One of the cardinal rules of netlink is to be able to take an
+event (such as a delete in this case) and reflect it back to the
+kernel and successfully delete the filter. This patch achieves that.
+
+Note, this issue has existed since the original TC action
+infrastructure code patch back in 2004 as found in:
+https://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/
+
+[1] http://patchwork.ozlabs.org/patch/682828/
+[2] http://patchwork.ozlabs.org/patch/682829/
+
+Fixes: 4e54c4816bfe ("[NET]: Add tc extensions infrastructure.")
+Reported-by: Daniel Borkmann <daniel@iogearbox.net>
+Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
+Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/sched/cls_api.c |    3 ++-
+ 1 file changed, 2 insertions(+), 1 deletion(-)
+
+--- a/net/sched/cls_api.c
++++ b/net/sched/cls_api.c
+@@ -315,7 +315,8 @@ replay:
+                       if (err == 0) {
+                               struct tcf_proto *next = rtnl_dereference(tp->next);
+-                              tfilter_notify(net, skb, n, tp, fh, RTM_DELTFILTER);
++                              tfilter_notify(net, skb, n, tp,
++                                             t->tcm_handle, RTM_DELTFILTER);
+                               if (tcf_destroy(tp, false))
+                                       RCU_INIT_POINTER(*back, next);
+                       }
diff --git a/queue-4.4/net-sctp-forbid-negative-length.patch b/queue-4.4/net-sctp-forbid-negative-length.patch
new file mode 100644 (file)
index 0000000..5266943
--- /dev/null
@@ -0,0 +1,79 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Jiri Slaby <jslaby@suse.cz>
+Date: Fri, 21 Oct 2016 14:13:24 +0200
+Subject: net: sctp, forbid negative length
+
+From: Jiri Slaby <jslaby@suse.cz>
+
+
+[ Upstream commit a4b8e71b05c27bae6bad3bdecddbc6b68a3ad8cf ]
+
+Most of getsockopt handlers in net/sctp/socket.c check len against
+sizeof some structure like:
+        if (len < sizeof(int))
+                return -EINVAL;
+
+On the first look, the check seems to be correct. But since len is int
+and sizeof returns size_t, int gets promoted to unsigned size_t too. So
+the test returns false for negative lengths. Yes, (-1 < sizeof(long)) is
+false.
+
+Fix this in sctp by explicitly checking len < 0 before any getsockopt
+handler is called.
+
+Note that sctp_getsockopt_events already handled the negative case.
+Since we added the < 0 check elsewhere, this one can be removed.
+
+If not checked, this is the result:
+UBSAN: Undefined behaviour in ../mm/page_alloc.c:2722:19
+shift exponent 52 is too large for 32-bit type 'int'
+CPU: 1 PID: 24535 Comm: syz-executor Not tainted 4.8.1-0-syzkaller #1
+Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
+ 0000000000000000 ffff88006d99f2a8 ffffffffb2f7bdea 0000000041b58ab3
+ ffffffffb4363c14 ffffffffb2f7bcde ffff88006d99f2d0 ffff88006d99f270
+ 0000000000000000 0000000000000000 0000000000000034 ffffffffb5096422
+Call Trace:
+ [<ffffffffb3051498>] ? __ubsan_handle_shift_out_of_bounds+0x29c/0x300
+...
+ [<ffffffffb273f0e4>] ? kmalloc_order+0x24/0x90
+ [<ffffffffb27416a4>] ? kmalloc_order_trace+0x24/0x220
+ [<ffffffffb2819a30>] ? __kmalloc+0x330/0x540
+ [<ffffffffc18c25f4>] ? sctp_getsockopt_local_addrs+0x174/0xca0 [sctp]
+ [<ffffffffc18d2bcd>] ? sctp_getsockopt+0x10d/0x1b0 [sctp]
+ [<ffffffffb37c1219>] ? sock_common_getsockopt+0xb9/0x150
+ [<ffffffffb37be2f5>] ? SyS_getsockopt+0x1a5/0x270
+
+Signed-off-by: Jiri Slaby <jslaby@suse.cz>
+Cc: Vlad Yasevich <vyasevich@gmail.com>
+Cc: Neil Horman <nhorman@tuxdriver.com>
+Cc: "David S. Miller" <davem@davemloft.net>
+Cc: linux-sctp@vger.kernel.org
+Cc: netdev@vger.kernel.org
+Acked-by: Neil Horman <nhorman@tuxdriver.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/sctp/socket.c |    5 ++++-
+ 1 file changed, 4 insertions(+), 1 deletion(-)
+
+--- a/net/sctp/socket.c
++++ b/net/sctp/socket.c
+@@ -4371,7 +4371,7 @@ static int sctp_getsockopt_disable_fragm
+ static int sctp_getsockopt_events(struct sock *sk, int len, char __user *optval,
+                                 int __user *optlen)
+ {
+-      if (len <= 0)
++      if (len == 0)
+               return -EINVAL;
+       if (len > sizeof(struct sctp_event_subscribe))
+               len = sizeof(struct sctp_event_subscribe);
+@@ -5972,6 +5972,9 @@ static int sctp_getsockopt(struct sock *
+       if (get_user(len, optlen))
+               return -EFAULT;
++      if (len < 0)
++              return -EINVAL;
++
+       lock_sock(sk);
+       switch (optname) {
diff --git a/queue-4.4/netlink-do-not-enter-direct-reclaim-from-netlink_dump.patch b/queue-4.4/netlink-do-not-enter-direct-reclaim-from-netlink_dump.patch
new file mode 100644 (file)
index 0000000..7410d2d
--- /dev/null
@@ -0,0 +1,72 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Eric Dumazet <edumazet@google.com>
+Date: Thu, 6 Oct 2016 04:13:18 +0900
+Subject: netlink: do not enter direct reclaim from netlink_dump()
+
+From: Eric Dumazet <edumazet@google.com>
+
+
+[ Upstream commit d35c99ff77ecb2eb239731b799386f3b3637a31e ]
+
+Since linux-3.15, netlink_dump() can use up to 16384 bytes skb
+allocations.
+
+Due to struct skb_shared_info ~320 bytes overhead, we end up using
+order-3 (on x86) page allocations, that might trigger direct reclaim and
+add stress.
+
+The intent was really to attempt a large allocation but immediately
+fallback to a smaller one (order-1 on x86) in case of memory stress.
+
+On recent kernels (linux-4.4), we can remove __GFP_DIRECT_RECLAIM to
+meet the goal. Old kernels would need to remove __GFP_WAIT.
+
+While we are at it, since we do an order-3 allocation, allow using
+all the allocated bytes instead of 16384 to reduce syscalls during
+large dumps.
+
+iproute2 already uses 32KB recvmsg() buffer sizes.
+
+Alexei provided an initial patch downsizing to SKB_WITH_OVERHEAD(16384)
+
+Fixes: 9063e21fb026 ("netlink: autosize skb lengthes")
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Reported-by: Alexei Starovoitov <ast@kernel.org>
+Cc: Greg Thelen <gthelen@google.com>
+Reviewed-by: Greg Rose <grose@lightfleet.com>
+Acked-by: Alexei Starovoitov <ast@kernel.org>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/netlink/af_netlink.c |    9 ++++-----
+ 1 file changed, 4 insertions(+), 5 deletions(-)
+
+--- a/net/netlink/af_netlink.c
++++ b/net/netlink/af_netlink.c
+@@ -2557,7 +2557,7 @@ static int netlink_recvmsg(struct socket
+       /* Record the max length of recvmsg() calls for future allocations */
+       nlk->max_recvmsg_len = max(nlk->max_recvmsg_len, len);
+       nlk->max_recvmsg_len = min_t(size_t, nlk->max_recvmsg_len,
+-                                   16384);
++                                   SKB_WITH_OVERHEAD(32768));
+       copied = data_skb->len;
+       if (len < copied) {
+@@ -2810,14 +2810,13 @@ static int netlink_dump(struct sock *sk)
+       if (alloc_min_size < nlk->max_recvmsg_len) {
+               alloc_size = nlk->max_recvmsg_len;
+               skb = netlink_alloc_skb(sk, alloc_size, nlk->portid,
+-                                      GFP_KERNEL |
+-                                      __GFP_NOWARN |
+-                                      __GFP_NORETRY);
++                                      (GFP_KERNEL & ~__GFP_DIRECT_RECLAIM) |
++                                      __GFP_NOWARN | __GFP_NORETRY);
+       }
+       if (!skb) {
+               alloc_size = alloc_min_size;
+               skb = netlink_alloc_skb(sk, alloc_size, nlk->portid,
+-                                      GFP_KERNEL);
++                                      (GFP_KERNEL & ~__GFP_DIRECT_RECLAIM));
+       }
+       if (!skb)
+               goto errout_skb;
diff --git a/queue-4.4/packet-call-fanout_release-while-unregistering-a-netdev.patch b/queue-4.4/packet-call-fanout_release-while-unregistering-a-netdev.patch
new file mode 100644 (file)
index 0000000..0a85220
--- /dev/null
@@ -0,0 +1,36 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Anoob Soman <anoob.soman@citrix.com>
+Date: Wed, 5 Oct 2016 15:12:54 +0100
+Subject: packet: call fanout_release, while UNREGISTERING a netdev
+
+From: Anoob Soman <anoob.soman@citrix.com>
+
+
+[ Upstream commit 6664498280cf17a59c3e7cf1a931444c02633ed1 ]
+
+If a socket has FANOUT sockopt set, a new prot_hook is registered
+as part of fanout_add(). When processing a NETDEV_UNREGISTER event in
+af_packet, __fanout_unlink is called for all sockets, but prot_hook which was
+registered as part of fanout_add is not removed. Call fanout_release, on a
+NETDEV_UNREGISTER, which removes prot_hook and removes fanout from the
+fanout_list.
+
+This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo()
+
+Signed-off-by: Anoob Soman <anoob.soman@citrix.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/packet/af_packet.c |    1 +
+ 1 file changed, 1 insertion(+)
+
+--- a/net/packet/af_packet.c
++++ b/net/packet/af_packet.c
+@@ -3855,6 +3855,7 @@ static int packet_notifier(struct notifi
+                               }
+                               if (msg == NETDEV_UNREGISTER) {
+                                       packet_cached_dev_reset(po);
++                                      fanout_release(sk);
+                                       po->ifindex = -1;
+                                       if (po->prot_hook.dev)
+                                               dev_put(po->prot_hook.dev);
diff --git a/queue-4.4/packet-on-direct_xmit-limit-tso-and-csum-to-supported-devices.patch b/queue-4.4/packet-on-direct_xmit-limit-tso-and-csum-to-supported-devices.patch
new file mode 100644 (file)
index 0000000..2af1542
--- /dev/null
@@ -0,0 +1,96 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Willem de Bruijn <willemb@google.com>
+Date: Wed, 26 Oct 2016 11:23:07 -0400
+Subject: packet: on direct_xmit, limit tso and csum to supported devices
+
+From: Willem de Bruijn <willemb@google.com>
+
+
+[ Upstream commit 104ba78c98808ae837d1f63aae58c183db5505df ]
+
+When transmitting on a packet socket with PACKET_VNET_HDR and
+PACKET_QDISC_BYPASS, validate device support for features requested
+in vnet_hdr.
+
+Drop TSO packets sent to devices that do not support TSO or have the
+feature disabled. Note that the latter currently do process those
+packets correctly, regardless of not advertising the feature.
+
+Because of SKB_GSO_DODGY, it is not sufficient to test device features
+with netif_needs_gso. Full validate_xmit_skb is needed.
+
+Switch to software checksum for non-TSO packets that request checksum
+offload if that device feature is unsupported or disabled. Note that
+similar to the TSO case, device drivers may perform checksum offload
+correctly even when not advertising it.
+
+When switching to software checksum, packets hit skb_checksum_help,
+which has two BUG_ON checks for a checksum not in the linear segment. Packet sockets
+always allocate at least up to csum_start + csum_off + 2 as linear.
+
+Tested by running github.com/wdebruij/kerneltools/psock_txring_vnet.c
+
+  ethtool -K eth0 tso off tx on
+  psock_txring_vnet -d $dst -s $src -i eth0 -l 2000 -n 1 -q -v
+  psock_txring_vnet -d $dst -s $src -i eth0 -l 2000 -n 1 -q -v -N
+
+  ethtool -K eth0 tx off
+  psock_txring_vnet -d $dst -s $src -i eth0 -l 1000 -n 1 -q -v -G
+  psock_txring_vnet -d $dst -s $src -i eth0 -l 1000 -n 1 -q -v -G -N
+
+v2:
+  - add EXPORT_SYMBOL_GPL(validate_xmit_skb_list)
+
+Fixes: d346a3fae3ff ("packet: introduce PACKET_QDISC_BYPASS socket option")
+Signed-off-by: Willem de Bruijn <willemb@google.com>
+Acked-by: Eric Dumazet <edumazet@google.com>
+Acked-by: Daniel Borkmann <daniel@iogearbox.net>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/core/dev.c         |    1 +
+ net/packet/af_packet.c |    9 ++++-----
+ 2 files changed, 5 insertions(+), 5 deletions(-)
+
+--- a/net/core/dev.c
++++ b/net/core/dev.c
+@@ -2836,6 +2836,7 @@ struct sk_buff *validate_xmit_skb_list(s
+       }
+       return head;
+ }
++EXPORT_SYMBOL_GPL(validate_xmit_skb_list);
+ static void qdisc_pkt_len_init(struct sk_buff *skb)
+ {
+--- a/net/packet/af_packet.c
++++ b/net/packet/af_packet.c
+@@ -249,7 +249,7 @@ static void __fanout_link(struct sock *s
+ static int packet_direct_xmit(struct sk_buff *skb)
+ {
+       struct net_device *dev = skb->dev;
+-      netdev_features_t features;
++      struct sk_buff *orig_skb = skb;
+       struct netdev_queue *txq;
+       int ret = NETDEV_TX_BUSY;
+@@ -257,9 +257,8 @@ static int packet_direct_xmit(struct sk_
+                    !netif_carrier_ok(dev)))
+               goto drop;
+-      features = netif_skb_features(skb);
+-      if (skb_needs_linearize(skb, features) &&
+-          __skb_linearize(skb))
++      skb = validate_xmit_skb_list(skb, dev);
++      if (skb != orig_skb)
+               goto drop;
+       txq = skb_get_tx_queue(dev, skb);
+@@ -279,7 +278,7 @@ static int packet_direct_xmit(struct sk_
+       return ret;
+ drop:
+       atomic_long_inc(&dev->tx_dropped);
+-      kfree_skb(skb);
++      kfree_skb_list(skb);
+       return NET_XMIT_DROP;
+ }
diff --git a/queue-4.4/rtnetlink-add-rtnexthop-offload-flag-to-compare-mask.patch b/queue-4.4/rtnetlink-add-rtnexthop-offload-flag-to-compare-mask.patch
new file mode 100644 (file)
index 0000000..ba38afe
--- /dev/null
@@ -0,0 +1,33 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Jiri Pirko <jiri@mellanox.com>
+Date: Tue, 18 Oct 2016 18:59:34 +0200
+Subject: rtnetlink: Add rtnexthop offload flag to compare mask
+
+From: Jiri Pirko <jiri@mellanox.com>
+
+
+[ Upstream commit 85dda4e5b0ee1f5b4e8cc93d39e475006bc61ccd ]
+
+The offload flag is a status flag and should not be used by
+FIB semantics for comparison.
+
+Fixes: 37ed9493699c ("rtnetlink: add RTNH_F_EXTERNAL flag for fib offload")
+Signed-off-by: Jiri Pirko <jiri@mellanox.com>
+Reviewed-by: Andy Gospodarek <andy@greyhouse.net>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ include/uapi/linux/rtnetlink.h |    2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/include/uapi/linux/rtnetlink.h
++++ b/include/uapi/linux/rtnetlink.h
+@@ -343,7 +343,7 @@ struct rtnexthop {
+ #define RTNH_F_OFFLOAD                8       /* offloaded route */
+ #define RTNH_F_LINKDOWN               16      /* carrier-down on nexthop */
+-#define RTNH_COMPARE_MASK     (RTNH_F_DEAD | RTNH_F_LINKDOWN)
++#define RTNH_COMPARE_MASK     (RTNH_F_DEAD | RTNH_F_LINKDOWN | RTNH_F_OFFLOAD)
+ /* Macros to handle hexthops */
diff --git a/queue-4.4/sctp-validate-chunk-len-before-actually-using-it.patch b/queue-4.4/sctp-validate-chunk-len-before-actually-using-it.patch
new file mode 100644 (file)
index 0000000..b203908
--- /dev/null
@@ -0,0 +1,58 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
+Date: Tue, 25 Oct 2016 14:27:39 -0200
+Subject: sctp: validate chunk len before actually using it
+
+From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
+
+
+[ Upstream commit bf911e985d6bbaa328c20c3e05f4eb03de11fdd6 ]
+
+Andrey Konovalov reported that KASAN detected that SCTP was using a slab
+beyond the boundaries. It was caused because when handling out of the
+blue packets in function sctp_sf_ootb() it was checking the chunk len
+only after already processing the first chunk, validating only for the
+2nd and subsequent ones.
+
+The fix is to just move the check upwards so it's also validated for the
+1st chunk.
+
+Reported-by: Andrey Konovalov <andreyknvl@google.com>
+Tested-by: Andrey Konovalov <andreyknvl@google.com>
+Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
+Reviewed-by: Xin Long <lucien.xin@gmail.com>
+Acked-by: Neil Horman <nhorman@tuxdriver.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/sctp/sm_statefuns.c |   12 ++++++------
+ 1 file changed, 6 insertions(+), 6 deletions(-)
+
+--- a/net/sctp/sm_statefuns.c
++++ b/net/sctp/sm_statefuns.c
+@@ -3426,6 +3426,12 @@ sctp_disposition_t sctp_sf_ootb(struct n
+                       return sctp_sf_violation_chunklen(net, ep, asoc, type, arg,
+                                                 commands);
++              /* Report violation if chunk len overflows */
++              ch_end = ((__u8 *)ch) + WORD_ROUND(ntohs(ch->length));
++              if (ch_end > skb_tail_pointer(skb))
++                      return sctp_sf_violation_chunklen(net, ep, asoc, type, arg,
++                                                commands);
++
+               /* Now that we know we at least have a chunk header,
+                * do things that are type appropriate.
+                */
+@@ -3457,12 +3463,6 @@ sctp_disposition_t sctp_sf_ootb(struct n
+                       }
+               }
+-              /* Report violation if chunk len overflows */
+-              ch_end = ((__u8 *)ch) + WORD_ROUND(ntohs(ch->length));
+-              if (ch_end > skb_tail_pointer(skb))
+-                      return sctp_sf_violation_chunklen(net, ep, asoc, type, arg,
+-                                                commands);
+-
+               ch = (sctp_chunkhdr_t *) ch_end;
+       } while (ch_end < skb_tail_pointer(skb));
diff --git a/queue-4.4/series b/queue-4.4/series
new file mode 100644 (file)
index 0000000..b4b0e0c
--- /dev/null
@@ -0,0 +1,27 @@
+tcp-fix-overflow-in-__tcp_retransmit_skb.patch
+net-avoid-sk_forward_alloc-overflows.patch
+tcp-fix-wrong-checksum-calculation-on-mtu-probing.patch
+tcp-fix-a-compile-error-in-dbgundo.patch
+ip6_gre-fix-flowi6_proto-value-in-ip6gre_xmit_other.patch
+ipmr-ip6mr-fix-scheduling-while-atomic-and-a-deadlock-with-ipmr_get_route.patch
+tg3-avoid-null-pointer-dereference-in-tg3_io_error_detected.patch
+net-fec-set-mac-address-unconditionally.patch
+net-pktgen-fix-pkt_size.patch
+net-sched-act_vlan-push-skb-data-to-mac_header-prior-calling-skb_vlan_-functions.patch
+net-add-netdev-all_adj_list-refcnt-propagation-to-fix-panic.patch
+packet-call-fanout_release-while-unregistering-a-netdev.patch
+netlink-do-not-enter-direct-reclaim-from-netlink_dump.patch
+ipv6-tcp-restore-ip6cb-for-pktoptions-skbs.patch
+ip6_tunnel-fix-ip6_tnl_lookup.patch
+ipv6-correctly-add-local-routes-when-lo-goes-up.patch
+net-pktgen-remove-rcu-locking-in-pktgen_change_name.patch
+bridge-multicast-restore-perm-router-ports-on-multicast-enable.patch
+rtnetlink-add-rtnexthop-offload-flag-to-compare-mask.patch
+net-add-recursion-limit-to-gro.patch
+ipv4-disable-bh-in-set_ping_group_range.patch
+ipv4-use-the-right-lock-for-ping_group_range.patch
+net-sctp-forbid-negative-length.patch
+udp-fix-ip_checksum-handling.patch
+net-sched-filters-fix-notification-of-filter-delete-with-proper-handle.patch
+sctp-validate-chunk-len-before-actually-using-it.patch
+packet-on-direct_xmit-limit-tso-and-csum-to-supported-devices.patch
diff --git a/queue-4.4/tcp-fix-a-compile-error-in-dbgundo.patch b/queue-4.4/tcp-fix-a-compile-error-in-dbgundo.patch
new file mode 100644 (file)
index 0000000..6d813c9
--- /dev/null
@@ -0,0 +1,35 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Eric Dumazet <edumazet@google.com>
+Date: Thu, 22 Sep 2016 17:54:00 -0700
+Subject: tcp: fix a compile error in DBGUNDO()
+
+From: Eric Dumazet <edumazet@google.com>
+
+
+[ Upstream commit 019b1c9fe32a2a32c1153e31375f87ec3e591273 ]
+
+If DBGUNDO() is enabled (FASTRETRANS_DEBUG > 1), a compile
+error will happen, since inet6_sk(sk)->daddr became sk->sk_v6_daddr
+
+Fixes: efe4208f47f9 ("ipv6: make lookups simpler and faster")
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/ipv4/tcp_input.c |    3 +--
+ 1 file changed, 1 insertion(+), 2 deletions(-)
+
+--- a/net/ipv4/tcp_input.c
++++ b/net/ipv4/tcp_input.c
+@@ -2324,10 +2324,9 @@ static void DBGUNDO(struct sock *sk, con
+       }
+ #if IS_ENABLED(CONFIG_IPV6)
+       else if (sk->sk_family == AF_INET6) {
+-              struct ipv6_pinfo *np = inet6_sk(sk);
+               pr_debug("Undo %s %pI6/%u c%u l%u ss%u/%u p%u\n",
+                        msg,
+-                       &np->daddr, ntohs(inet->inet_dport),
++                       &sk->sk_v6_daddr, ntohs(inet->inet_dport),
+                        tp->snd_cwnd, tcp_left_out(tp),
+                        tp->snd_ssthresh, tp->prior_ssthresh,
+                        tp->packets_out);
diff --git a/queue-4.4/tcp-fix-overflow-in-__tcp_retransmit_skb.patch b/queue-4.4/tcp-fix-overflow-in-__tcp_retransmit_skb.patch
new file mode 100644 (file)
index 0000000..4a21bc2
--- /dev/null
@@ -0,0 +1,39 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Eric Dumazet <edumazet@google.com>
+Date: Thu, 15 Sep 2016 08:12:33 -0700
+Subject: tcp: fix overflow in __tcp_retransmit_skb()
+
+From: Eric Dumazet <edumazet@google.com>
+
+
+[ Upstream commit ffb4d6c8508657824bcef68a36b2a0f9d8c09d10 ]
+
+If a TCP socket gets a large write queue, an overflow can happen
+in a test in __tcp_retransmit_skb() preventing all retransmits.
+
+The flow then stalls and resets after timeouts.
+
+Tested:
+
+sysctl -w net.core.wmem_max=1000000000
+netperf -H dest -- -s 1000000000
+
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/ipv4/tcp_output.c |    3 ++-
+ 1 file changed, 2 insertions(+), 1 deletion(-)
+
+--- a/net/ipv4/tcp_output.c
++++ b/net/ipv4/tcp_output.c
+@@ -2569,7 +2569,8 @@ int __tcp_retransmit_skb(struct sock *sk
+        * copying overhead: fragmentation, tunneling, mangling etc.
+        */
+       if (atomic_read(&sk->sk_wmem_alloc) >
+-          min(sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2), sk->sk_sndbuf))
++          min_t(u32, sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2),
++                sk->sk_sndbuf))
+               return -EAGAIN;
+       if (skb_still_in_host_queue(sk, skb))
diff --git a/queue-4.4/tcp-fix-wrong-checksum-calculation-on-mtu-probing.patch b/queue-4.4/tcp-fix-wrong-checksum-calculation-on-mtu-probing.patch
new file mode 100644 (file)
index 0000000..0dd3c48
--- /dev/null
@@ -0,0 +1,51 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Douglas Caetano dos Santos <douglascs@taghos.com.br>
+Date: Thu, 22 Sep 2016 15:52:04 -0300
+Subject: tcp: fix wrong checksum calculation on MTU probing
+
+From: Douglas Caetano dos Santos <douglascs@taghos.com.br>
+
+
+[ Upstream commit 2fe664f1fcf7c4da6891f95708a7a56d3c024354 ]
+
+With TCP MTU probing enabled and offload TX checksumming disabled,
+tcp_mtu_probe() calculated the wrong checksum when a fragment being copied
+into the probe's SKB had an odd length. This was caused by the direct use
+of skb_copy_and_csum_bits() to calculate the checksum, as it pads the
+fragment being copied, if needed. When this fragment was not the last, a
+subsequent call used the previous checksum without considering this
+padding.
+
+The effect was a stale connection in one way, as even retransmissions
+wouldn't solve the problem, because the checksum was never recalculated for
+the full SKB length.
+
+Signed-off-by: Douglas Caetano dos Santos <douglascs@taghos.com.br>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ net/ipv4/tcp_output.c |   12 +++++++-----
+ 1 file changed, 7 insertions(+), 5 deletions(-)
+
+--- a/net/ipv4/tcp_output.c
++++ b/net/ipv4/tcp_output.c
+@@ -1950,12 +1950,14 @@ static int tcp_mtu_probe(struct sock *sk
+       len = 0;
+       tcp_for_write_queue_from_safe(skb, next, sk) {
+               copy = min_t(int, skb->len, probe_size - len);
+-              if (nskb->ip_summed)
++              if (nskb->ip_summed) {
+                       skb_copy_bits(skb, 0, skb_put(nskb, copy), copy);
+-              else
+-                      nskb->csum = skb_copy_and_csum_bits(skb, 0,
+-                                                          skb_put(nskb, copy),
+-                                                          copy, nskb->csum);
++              } else {
++                      __wsum csum = skb_copy_and_csum_bits(skb, 0,
++                                                           skb_put(nskb, copy),
++                                                           copy, 0);
++                      nskb->csum = csum_block_add(nskb->csum, csum, len);
++              }
+               if (skb->len <= copy) {
+                       /* We've eaten all the data from this skb.
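The padding pitfall fixed above can be reproduced in plain user space. Below is a minimal sketch, not the kernel's actual implementation (it folds eagerly to 16 bits and treats words as big-endian): seeding the next fragment's checksum with a zero-padded odd-length checksum goes wrong, while combining per-fragment checksums with an offset-aware csum_block_add(), as the fix does, matches the checksum of the whole buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified csum_partial(): ones'-complement sum over big-endian
 * 16-bit words, folded to 16 bits. An odd trailing byte is padded
 * with a zero low byte -- the padding at the heart of the bug. */
static uint32_t csum_partial(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)buf[i] << 8 | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return sum;
}

/* Simplified csum_block_add(): fold in a fragment checksum computed
 * from seed 0. If the fragment starts at an odd byte offset, its
 * checksum must be byte-swapped before being added. */
static uint32_t csum_block_add(uint32_t sum, uint32_t part, size_t offset)
{
	if (offset & 1)
		part = (part & 0xff) << 8 | part >> 8;
	sum += part;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return sum;
}
```

With the 5-byte buffer {1,2,3,4,5} split after 3 bytes, seeding the second csum_partial() call with the first fragment's padded checksum yields 0x0807, while both the whole-buffer checksum and the csum_block_add() combination yield 0x0906 — the mismatch the patch eliminates.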
diff --git a/queue-4.4/tg3-avoid-null-pointer-dereference-in-tg3_io_error_detected.patch b/queue-4.4/tg3-avoid-null-pointer-dereference-in-tg3_io_error_detected.patch
new file mode 100644 (file)
index 0000000..1e05d58
--- /dev/null
@@ -0,0 +1,86 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Milton Miller <miltonm@us.ibm.com>
+Date: Thu, 29 Sep 2016 13:24:08 -0300
+Subject: tg3: Avoid NULL pointer dereference in tg3_io_error_detected()
+
+From: Milton Miller <miltonm@us.ibm.com>
+
+
+[ Upstream commit 1b0ff89852d79354e8a091c81a88df21f5aa9f0a ]
+
+While the driver is probing the adapter, an error may occur before the
+netdev structure is allocated and attached to pci_dev. In this case,
+not only is netdev unavailable, but the tg3 private structure is also
+unusable, since it is derived by plain pointer math from the NULL
+pointer, so dereferences must be skipped.
+
+The following trace is seen when the error is triggered:
+
+  [1.402247] Unable to handle kernel paging request for data at address 0x00001a99
+  [1.402410] Faulting instruction address: 0xc0000000007e33f8
+  [1.402450] Oops: Kernel access of bad area, sig: 11 [#1]
+  [1.402481] SMP NR_CPUS=2048 NUMA PowerNV
+  [1.402513] Modules linked in:
+  [1.402545] CPU: 0 PID: 651 Comm: eehd Not tainted 4.4.0-36-generic #55-Ubuntu
+  [1.402591] task: c000001fe4e42a20 ti: c000001fe4e88000 task.ti: c000001fe4e88000
+  [1.402742] NIP: c0000000007e33f8 LR: c0000000007e3164 CTR: c000000000595ea0
+  [1.402787] REGS: c000001fe4e8b790 TRAP: 0300   Not tainted  (4.4.0-36-generic)
+  [1.402832] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 28000422  XER: 20000000
+  [1.403058] CFAR: c000000000008468 DAR: 0000000000001a99 DSISR: 42000000 SOFTE: 1
+  GPR00: c0000000007e3164 c000001fe4e8ba10 c0000000015c5e00 0000000000000000
+  GPR04: 0000000000000001 0000000000000000 0000000000000039 0000000000000299
+  GPR08: 0000000000000000 0000000000000001 c000001fe4e88000 0000000000000006
+  GPR12: 0000000000000000 c00000000fb40000 c0000000000e6558 c000003ca1bffd00
+  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
+  GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000000d52768
+  GPR24: c000000000d52740 0000000000000100 c000003ca1b52000 0000000000000002
+  GPR28: 0000000000000900 0000000000000000 c00000000152a0c0 c000003ca1b52000
+  [1.404226] NIP [c0000000007e33f8] tg3_io_error_detected+0x308/0x340
+  [1.404265] LR [c0000000007e3164] tg3_io_error_detected+0x74/0x340
+
+This patch avoids the NULL pointer dereference by moving the access after
+the netdev NULL pointer check on tg3_io_error_detected(). Also, we add a
+check for netdev being NULL on tg3_io_resume() [suggested by Michael Chan].
+
+Fixes: 0486a063b1ff ("tg3: prevent ifup/ifdown during PCI error recovery")
+Fixes: dfc8f370316b ("net/tg3: Release IRQs on permanent error")
+Tested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
+Signed-off-by: Milton Miller <miltonm@us.ibm.com>
+Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
+Acked-by: Michael Chan <michael.chan@broadcom.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ drivers/net/ethernet/broadcom/tg3.c |   10 +++++-----
+ 1 file changed, 5 insertions(+), 5 deletions(-)
+
+--- a/drivers/net/ethernet/broadcom/tg3.c
++++ b/drivers/net/ethernet/broadcom/tg3.c
+@@ -18142,14 +18142,14 @@ static pci_ers_result_t tg3_io_error_det
+       rtnl_lock();
+-      /* We needn't recover from permanent error */
+-      if (state == pci_channel_io_frozen)
+-              tp->pcierr_recovery = true;
+-
+       /* We probably don't have netdev yet */
+       if (!netdev || !netif_running(netdev))
+               goto done;
++      /* We needn't recover from permanent error */
++      if (state == pci_channel_io_frozen)
++              tp->pcierr_recovery = true;
++
+       tg3_phy_stop(tp);
+       tg3_netif_stop(tp);
+@@ -18246,7 +18246,7 @@ static void tg3_io_resume(struct pci_dev
+       rtnl_lock();
+-      if (!netif_running(netdev))
++      if (!netdev || !netif_running(netdev))
+               goto done;
+       tg3_full_lock(tp, 0);
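Why a NULL netdev poisons the private pointer follows from how netdev_priv() works: it is pure pointer arithmetic, so on a NULL netdev it yields a small bogus address that no NULL check on tp itself would catch. A user-space sketch of the ordering the patch enforces, using simplified stand-in structs rather than the real tg3 types:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins: the kernel allocates the driver's private
 * area in the same block, directly after struct net_device. */
struct net_device {
	char name[16];
	int running;
};

struct tg3 {
	int pcierr_recovery;
};

/* Like the kernel's netdev_priv(): pure pointer math, no dereference,
 * so netdev_priv(NULL) "succeeds" and returns a bogus non-NULL address. */
static struct tg3 *netdev_priv(struct net_device *dev)
{
	return (struct tg3 *)((char *)dev + sizeof(struct net_device));
}

/* Sketch of the fixed ordering in tg3_io_error_detected():
 * check netdev before the first dereference of tp. */
static int io_error_detected(struct net_device *netdev)
{
	struct tg3 *tp = netdev_priv(netdev); /* computing tp is harmless */

	/* We probably don't have netdev yet */
	if (!netdev || !netdev->running)
		return -1;

	/* only now is tp known to point at real memory */
	tp->pcierr_recovery = 1;
	return 0;
}
```

The sketch assumes the private struct sits exactly at offset sizeof(struct net_device), which holds for these two simple structs on common ABIs; the real kernel adds alignment padding via netdev_priv()/alloc_netdev().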
diff --git a/queue-4.4/udp-fix-ip_checksum-handling.patch b/queue-4.4/udp-fix-ip_checksum-handling.patch
new file mode 100644 (file)
index 0000000..09c9aad
--- /dev/null
@@ -0,0 +1,117 @@
+From foo@baz Thu Nov 10 16:42:45 CET 2016
+From: Eric Dumazet <edumazet@google.com>
+Date: Sun, 23 Oct 2016 18:03:06 -0700
+Subject: udp: fix IP_CHECKSUM handling
+
+From: Eric Dumazet <edumazet@google.com>
+
+
+[ Upstream commit 10df8e6152c6c400a563a673e9956320bfce1871 ]
+
+The first bug was added in commit ad6f939ab193 ("ip: Add offset parameter
+to ip_cmsg_recv"): Tom missed that IPv4 UDP messages can be received on
+an AF_INET6 socket, so ip_cmsg_recv(msg, skb) should have been replaced
+by ip_cmsg_recv_offset(msg, skb, sizeof(struct udphdr)).
+
+Then commit e6afc8ace6dd ("udp: remove headers from UDP packets before
+queueing") forgot to adjust the offsets now that UDP headers are pulled
+before skbs are put in the receive queue.
+
+Fixes: ad6f939ab193 ("ip: Add offset parameter to ip_cmsg_recv")
+Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Cc: Sam Kumar <samanthakumar@google.com>
+Cc: Willem de Bruijn <willemb@google.com>
+Tested-by: Willem de Bruijn <willemb@google.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ include/net/ip.h       |    4 ++--
+ net/ipv4/ip_sockglue.c |   10 ++++++----
+ net/ipv4/udp.c         |    2 +-
+ net/ipv6/udp.c         |    3 ++-
+ 4 files changed, 11 insertions(+), 8 deletions(-)
+
+--- a/include/net/ip.h
++++ b/include/net/ip.h
+@@ -553,7 +553,7 @@ int ip_options_rcv_srr(struct sk_buff *s
+  */
+ void ipv4_pktinfo_prepare(const struct sock *sk, struct sk_buff *skb);
+-void ip_cmsg_recv_offset(struct msghdr *msg, struct sk_buff *skb, int offset);
++void ip_cmsg_recv_offset(struct msghdr *msg, struct sk_buff *skb, int tlen, int offset);
+ int ip_cmsg_send(struct net *net, struct msghdr *msg,
+                struct ipcm_cookie *ipc, bool allow_ipv6);
+ int ip_setsockopt(struct sock *sk, int level, int optname, char __user *optval,
+@@ -575,7 +575,7 @@ void ip_local_error(struct sock *sk, int
+ static inline void ip_cmsg_recv(struct msghdr *msg, struct sk_buff *skb)
+ {
+-      ip_cmsg_recv_offset(msg, skb, 0);
++      ip_cmsg_recv_offset(msg, skb, 0, 0);
+ }
+ bool icmp_global_allow(void);
+--- a/net/ipv4/ip_sockglue.c
++++ b/net/ipv4/ip_sockglue.c
+@@ -98,7 +98,7 @@ static void ip_cmsg_recv_retopts(struct
+ }
+ static void ip_cmsg_recv_checksum(struct msghdr *msg, struct sk_buff *skb,
+-                                int offset)
++                                int tlen, int offset)
+ {
+       __wsum csum = skb->csum;
+@@ -106,7 +106,9 @@ static void ip_cmsg_recv_checksum(struct
+               return;
+       if (offset != 0)
+-              csum = csum_sub(csum, csum_partial(skb->data, offset, 0));
++              csum = csum_sub(csum,
++                              csum_partial(skb->data + tlen,
++                                           offset, 0));
+       put_cmsg(msg, SOL_IP, IP_CHECKSUM, sizeof(__wsum), &csum);
+ }
+@@ -152,7 +154,7 @@ static void ip_cmsg_recv_dstaddr(struct
+ }
+ void ip_cmsg_recv_offset(struct msghdr *msg, struct sk_buff *skb,
+-                       int offset)
++                       int tlen, int offset)
+ {
+       struct inet_sock *inet = inet_sk(skb->sk);
+       unsigned int flags = inet->cmsg_flags;
+@@ -215,7 +217,7 @@ void ip_cmsg_recv_offset(struct msghdr *
+       }
+       if (flags & IP_CMSG_CHECKSUM)
+-              ip_cmsg_recv_checksum(msg, skb, offset);
++              ip_cmsg_recv_checksum(msg, skb, tlen, offset);
+ }
+ EXPORT_SYMBOL(ip_cmsg_recv_offset);
+--- a/net/ipv4/udp.c
++++ b/net/ipv4/udp.c
+@@ -1342,7 +1342,7 @@ try_again:
+               *addr_len = sizeof(*sin);
+       }
+       if (inet->cmsg_flags)
+-              ip_cmsg_recv_offset(msg, skb, sizeof(struct udphdr));
++              ip_cmsg_recv_offset(msg, skb, sizeof(struct udphdr), off);
+       err = copied;
+       if (flags & MSG_TRUNC)
+--- a/net/ipv6/udp.c
++++ b/net/ipv6/udp.c
+@@ -498,7 +498,8 @@ try_again:
+       if (is_udp4) {
+               if (inet->cmsg_flags)
+-                      ip_cmsg_recv(msg, skb);
++                      ip_cmsg_recv_offset(msg, skb,
++                                          sizeof(struct udphdr), off);
+       } else {
+               if (np->rxopt.all)
+                       ip6_datagram_recv_specific_ctl(sk, msg, skb);
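The fixed ip_cmsg_recv_checksum() subtracts the checksum of the offset bytes that follow the tlen-byte transport header, instead of subtracting bytes starting at skb->data. A user-space sketch of that subtraction — simplified to 16-bit folded sums over big-endian words, with even tlen and offset, and with cmsg_checksum() being my name for the helper, not the kernel's:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified csum_partial(): ones'-complement sum over big-endian
 * 16-bit words, folded to 16 bits; an odd tail byte is zero-padded. */
static uint32_t csum_partial(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)buf[i] << 8 | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return sum;
}

/* Simplified csum_sub(): in ones'-complement arithmetic, removing a
 * previously included chunk means adding its complement. */
static uint32_t csum_sub(uint32_t sum, uint32_t part)
{
	sum += ~part & 0xffff;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return sum;
}

/* Sketch of the fixed ip_cmsg_recv_checksum() logic: drop the offset
 * bytes that follow the tlen-byte transport header from the sum. */
static uint32_t cmsg_checksum(const uint8_t *data, uint32_t csum,
			      size_t tlen, size_t offset)
{
	if (offset != 0)
		csum = csum_sub(csum, csum_partial(data + tlen, offset, 0));
	return csum;
}
```

For a 10-byte packet with tlen = 2 and offset = 4, the result equals the checksum of the remaining bytes (the 2-byte header plus the 4-byte tail) — the quantity the buggy version, subtracting from data + 0, got wrong.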
diff --git a/queue-4.8/series b/queue-4.8/series
new file mode 100644 (file)
index 0000000..d0d9400
--- /dev/null
@@ -0,0 +1,34 @@
+net-fec-set-mac-address-unconditionally.patch
+net-pktgen-fix-pkt_size.patch
+net-sched-act_vlan-push-skb-data-to-mac_header-prior-calling-skb_vlan_-functions.patch
+net-add-netdev-all_adj_list-refcnt-propagation-to-fix-panic.patch
+packet-call-fanout_release-while-unregistering-a-netdev.patch
+netlink-do-not-enter-direct-reclaim-from-netlink_dump.patch
+drivers-ptp-fix-kernel-memory-disclosure.patch
+net_sched-reorder-pernet-ops-and-act-ops-registrations.patch
+ipv6-tcp-restore-ip6cb-for-pktoptions-skbs.patch
+net-phy-trigger-state-machine-on-state-change-and-not-polling.patch
+ip6_tunnel-fix-ip6_tnl_lookup.patch
+ipv6-correctly-add-local-routes-when-lo-goes-up.patch
+ib-ipoib-move-back-ib-ll-address-into-the-hard-header.patch
+net-mlx4_en-fixup-xdp-tx-irq-to-match-rx.patch
+net-pktgen-remove-rcu-locking-in-pktgen_change_name.patch
+bridge-multicast-restore-perm-router-ports-on-multicast-enable.patch
+switchdev-execute-bridge-ndos-only-for-bridge-ports.patch
+rtnetlink-add-rtnexthop-offload-flag-to-compare-mask.patch
+net-core-correctly-iterate-over-lower-adjacency-list.patch
+net-add-recursion-limit-to-gro.patch
+ipv4-disable-bh-in-set_ping_group_range.patch
+ipv4-use-the-right-lock-for-ping_group_range.patch
+net-fec-call-swap_buffer-prior-to-ip-header-alignment.patch
+net-sctp-forbid-negative-length.patch
+sctp-fix-the-panic-caused-by-route-update.patch
+udp-fix-ip_checksum-handling.patch
+netvsc-fix-incorrect-receive-checksum-offloading.patch
+macsec-fix-header-length-if-sci-is-added-if-explicitly-disabled.patch
+net-ipv6-do-not-consider-link-state-for-nexthop-validation.patch
+net-sched-filters-fix-notification-of-filter-delete-with-proper-handle.patch
+sctp-validate-chunk-len-before-actually-using-it.patch
+ip6_tunnel-update-skb-protocol-to-eth_p_ipv6-in-ip6_tnl_xmit.patch
+packet-on-direct_xmit-limit-tso-and-csum-to-supported-devices.patch
+arch-powerpc-update-parameters-for-csum_tcpudp_magic-csum_tcpudp_nofold.patch