From: Greg Kroah-Hartman
Date: Thu, 19 Sep 2019 13:03:48 +0000 (+0200)
Subject: 5.2-stable patches
X-Git-Tag: v4.4.194~45
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=b28a49d2ee2738f76c8f7dd719f65baa68b7bee3;p=thirdparty%2Fkernel%2Fstable-queue.git

5.2-stable patches

added patches:
    ip6_gre-fix-a-dst-leak-in-ip6erspan_tunnel_xmit.patch
    net-dsa-fix-load-order-between-dsa-drivers-and-taggers.patch
    net-sched-fix-race-between-deactivation-and-dequeue-for-nolock-qdisc.patch
    net_sched-let-qdisc_put-accept-null-pointer.patch
    udp-correct-reuseport-selection-with-connected-sockets.patch
    xen-netfront-do-not-assume-sk_buff_head-list-is-empty-in-error-handling.patch
---

diff --git a/queue-5.2/ip6_gre-fix-a-dst-leak-in-ip6erspan_tunnel_xmit.patch b/queue-5.2/ip6_gre-fix-a-dst-leak-in-ip6erspan_tunnel_xmit.patch
new file mode 100644
index 00000000000..2b87f823a0c
--- /dev/null
+++ b/queue-5.2/ip6_gre-fix-a-dst-leak-in-ip6erspan_tunnel_xmit.patch
@@ -0,0 +1,35 @@
+From foo@baz Thu 19 Sep 2019 02:58:44 PM CEST
+From: Xin Long
+Date: Fri, 13 Sep 2019 17:45:47 +0800
+Subject: ip6_gre: fix a dst leak in ip6erspan_tunnel_xmit
+
+From: Xin Long
+
+[ Upstream commit 28e486037747c2180470b77c290d4090ad42f259 ]
+
+In ip6erspan_tunnel_xmit(), if the skb will not be sent out, it has to
+be freed on the tx_err path. Otherwise when deleting a netns, it would
+cause dst/dev to leak, and dmesg shows:
+
+ unregister_netdevice: waiting for lo to become free. Usage count = 1
+
+Fixes: ef7baf5e083c ("ip6_gre: add ip6 erspan collect_md mode")
+Signed-off-by: Xin Long
+Acked-by: William Tu
+Signed-off-by: David S. Miller
+Signed-off-by: Greg Kroah-Hartman
+---
+ net/ipv6/ip6_gre.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/net/ipv6/ip6_gre.c
++++ b/net/ipv6/ip6_gre.c
+@@ -968,7 +968,7 @@ static netdev_tx_t ip6erspan_tunnel_xmit
+ if (unlikely(!tun_info ||
+ !(tun_info->mode & IP_TUNNEL_INFO_TX) ||
+ ip_tunnel_info_af(tun_info) != AF_INET6))
+- return -EINVAL;
++ goto tx_err;
+
+ key = &tun_info->key;
+ memset(&fl6, 0, sizeof(fl6));
diff --git a/queue-5.2/net-dsa-fix-load-order-between-dsa-drivers-and-taggers.patch b/queue-5.2/net-dsa-fix-load-order-between-dsa-drivers-and-taggers.patch
new file mode 100644
index 00000000000..c4eed2d020c
--- /dev/null
+++ b/queue-5.2/net-dsa-fix-load-order-between-dsa-drivers-and-taggers.patch
@@ -0,0 +1,39 @@
+From foo@baz Thu 19 Sep 2019 02:58:44 PM CEST
+From: Andrew Lunn
+Date: Thu, 12 Sep 2019 15:16:45 +0200
+Subject: net: dsa: Fix load order between DSA drivers and taggers
+
+From: Andrew Lunn
+
+[ Upstream commit 23426a25e55a417dc104df08781b6eff95e65f3f ]
+
+The DSA core, DSA taggers and DSA drivers all make use of
+module_init(). Hence they get initialised at device_initcall() time.
+The ordering is non-deterministic. It can be a DSA driver is bound to
+a device before the needed tag driver has been initialised, resulting
+in the message:
+
+No tagger for this switch
+
+Rather than have this be fatal, return -EPROBE_DEFER so that it is
+tried again later once all the needed drivers have been loaded.
+
+Fixes: d3b8c04988ca ("dsa: Add boilerplate helper to register DSA tag driver modules")
+Signed-off-by: Andrew Lunn
+Signed-off-by: David S. Miller
+Signed-off-by: Greg Kroah-Hartman
+---
+ net/dsa/dsa2.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+--- a/net/dsa/dsa2.c
++++ b/net/dsa/dsa2.c
+@@ -577,6 +577,8 @@ static int dsa_port_parse_cpu(struct dsa
+ tag_protocol = ds->ops->get_tag_protocol(ds, dp->index);
+ tag_ops = dsa_tag_driver_get(tag_protocol);
+ if (IS_ERR(tag_ops)) {
++ if (PTR_ERR(tag_ops) == -ENOPROTOOPT)
++ return -EPROBE_DEFER;
+ dev_warn(ds->dev, "No tagger for this switch\n");
+ return PTR_ERR(tag_ops);
+ }
diff --git a/queue-5.2/net-sched-fix-race-between-deactivation-and-dequeue-for-nolock-qdisc.patch b/queue-5.2/net-sched-fix-race-between-deactivation-and-dequeue-for-nolock-qdisc.patch
new file mode 100644
index 00000000000..f5b5417d075
--- /dev/null
+++ b/queue-5.2/net-sched-fix-race-between-deactivation-and-dequeue-for-nolock-qdisc.patch
@@ -0,0 +1,97 @@
+From foo@baz Thu 19 Sep 2019 02:58:44 PM CEST
+From: Paolo Abeni
+Date: Thu, 12 Sep 2019 12:02:42 +0200
+Subject: net/sched: fix race between deactivation and dequeue for NOLOCK qdisc
+
+From: Paolo Abeni
+
+[ Upstream commit d518d2ed8640c1cbbbb6f63939e3e65471817367 ]
+
+The test implemented by some_qdisc_is_busy() is somewhat loosy for
+NOLOCK qdisc, as we may hit the following scenario:
+
+CPU1 CPU2
+// in net_tx_action()
+clear_bit(__QDISC_STATE_SCHED...);
+ // in some_qdisc_is_busy()
+ val = (qdisc_is_running(q) ||
+ test_bit(__QDISC_STATE_SCHED,
+ &q->state));
+ // here val is 0 but...
+qdisc_run(q)
+// ... CPU1 is going to run the qdisc next
+
+As a conseguence qdisc_run() in net_tx_action() can race with qdisc_reset()
+in dev_qdisc_reset(). Such race is not possible for !NOLOCK qdisc as
+both the above bit operations are under the root qdisc lock().
+
+After commit 021a17ed796b ("pfifo_fast: drop unneeded additional lock on dequeue")
+the race can cause use after free and/or null ptr dereference, but the root
+cause is likely older.
+
+This patch addresses the issue explicitly checking for deactivation under
+the seqlock for NOLOCK qdisc, so that the qdisc_run() in the critical
+scenario becomes a no-op.
+
+Note that the enqueue() op can still execute concurrently with dev_qdisc_reset(),
+but that is safe due to the skb_array() locking, and we can't avoid that
+for NOLOCK qdiscs.
+
+Fixes: 021a17ed796b ("pfifo_fast: drop unneeded additional lock on dequeue")
+Reported-by: Li Shuang
+Reported-and-tested-by: Davide Caratti
+Signed-off-by: Paolo Abeni
+Signed-off-by: David S. Miller
+Signed-off-by: Greg Kroah-Hartman
+---
+ include/net/pkt_sched.h | 7 ++++++-
+ net/core/dev.c | 16 ++++++++++------
+ 2 files changed, 16 insertions(+), 7 deletions(-)
+
+--- a/include/net/pkt_sched.h
++++ b/include/net/pkt_sched.h
+@@ -118,7 +118,12 @@ void __qdisc_run(struct Qdisc *q);
+ static inline void qdisc_run(struct Qdisc *q)
+ {
+ if (qdisc_run_begin(q)) {
+- __qdisc_run(q);
++ /* NOLOCK qdisc must check 'state' under the qdisc seqlock
++ * to avoid racing with dev_qdisc_reset()
++ */
++ if (!(q->flags & TCQ_F_NOLOCK) ||
++ likely(!test_bit(__QDISC_STATE_DEACTIVATED, &q->state)))
++ __qdisc_run(q);
+ qdisc_run_end(q);
+ }
+ }
+--- a/net/core/dev.c
++++ b/net/core/dev.c
+@@ -3475,18 +3475,22 @@ static inline int __dev_xmit_skb(struct
+ qdisc_calculate_pkt_len(skb, q);
+
+ if (q->flags & TCQ_F_NOLOCK) {
+- if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
+- __qdisc_drop(skb, &to_free);
+- rc = NET_XMIT_DROP;
+- } else if ((q->flags & TCQ_F_CAN_BYPASS) && q->empty &&
+- qdisc_run_begin(q)) {
++ if ((q->flags & TCQ_F_CAN_BYPASS) && q->empty &&
++ qdisc_run_begin(q)) {
++ if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED,
++ &q->state))) {
++ __qdisc_drop(skb, &to_free);
++ rc = NET_XMIT_DROP;
++ goto end_run;
++ }
+ qdisc_bstats_cpu_update(q, skb);
+
++ rc = NET_XMIT_SUCCESS;
+ if (sch_direct_xmit(skb, q, dev, txq, NULL, true))
+ __qdisc_run(q);
+
++end_run:
+ qdisc_run_end(q);
+- rc = NET_XMIT_SUCCESS;
+ } else {
+ rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
+ qdisc_run(q);
diff --git a/queue-5.2/net_sched-let-qdisc_put-accept-null-pointer.patch b/queue-5.2/net_sched-let-qdisc_put-accept-null-pointer.patch
new file mode 100644
index 00000000000..9b10213e876
--- /dev/null
+++ b/queue-5.2/net_sched-let-qdisc_put-accept-null-pointer.patch
@@ -0,0 +1,44 @@
+From foo@baz Thu 19 Sep 2019 02:58:44 PM CEST
+From: Cong Wang
+Date: Thu, 12 Sep 2019 10:22:30 -0700
+Subject: net_sched: let qdisc_put() accept NULL pointer
+
+From: Cong Wang
+
+[ Upstream commit 6efb971ba8edfbd80b666f29de12882852f095ae ]
+
+When tcf_block_get() fails in sfb_init(), q->qdisc is still a NULL
+pointer which leads to a crash in sfb_destroy(). Similar for
+sch_dsmark.
+
+Instead of fixing each separately, Linus suggested to just accept
+NULL pointer in qdisc_put(), which would make callers easier.
+
+(For sch_dsmark, the bug probably exists long before commit
+6529eaba33f0.)
+
+Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure")
+Reported-by: syzbot+d5870a903591faaca4ae@syzkaller.appspotmail.com
+Suggested-by: Linus Torvalds
+Cc: Jamal Hadi Salim
+Cc: Jiri Pirko
+Signed-off-by: Cong Wang
+Acked-by: Jiri Pirko
+Signed-off-by: David S. Miller
+Signed-off-by: Greg Kroah-Hartman
+---
+ net/sched/sch_generic.c | 3 +++
+ 1 file changed, 3 insertions(+)
+
+--- a/net/sched/sch_generic.c
++++ b/net/sched/sch_generic.c
+@@ -985,6 +985,9 @@ static void qdisc_destroy(struct Qdisc *
+
+ void qdisc_put(struct Qdisc *qdisc)
+ {
++ if (!qdisc)
++ return;
++
+ if (qdisc->flags & TCQ_F_BUILTIN ||
+ !refcount_dec_and_test(&qdisc->refcnt))
+ return;
diff --git a/queue-5.2/series b/queue-5.2/series
index 1fc0de6ccaa..03a40696c40 100644
--- a/queue-5.2/series
+++ b/queue-5.2/series
@@ -12,3 +12,9 @@ powerpc-mm-radix-use-the-right-page-size-for-vmemmap-mapping.patch
 scripts-decode_stacktrace-match-basepath-using-shell-prefix-operator-not-regex.patch
 net-hns-fix-led-configuration-for-marvell-phy.patch
 net-aquantia-fix-limit-of-vlan-filters.patch
+ip6_gre-fix-a-dst-leak-in-ip6erspan_tunnel_xmit.patch
+net-sched-fix-race-between-deactivation-and-dequeue-for-nolock-qdisc.patch
+net_sched-let-qdisc_put-accept-null-pointer.patch
+udp-correct-reuseport-selection-with-connected-sockets.patch
+xen-netfront-do-not-assume-sk_buff_head-list-is-empty-in-error-handling.patch
+net-dsa-fix-load-order-between-dsa-drivers-and-taggers.patch
diff --git a/queue-5.2/udp-correct-reuseport-selection-with-connected-sockets.patch b/queue-5.2/udp-correct-reuseport-selection-with-connected-sockets.patch
new file mode 100644
index 00000000000..ab5c015ca5b
--- /dev/null
+++ b/queue-5.2/udp-correct-reuseport-selection-with-connected-sockets.patch
@@ -0,0 +1,176 @@
+From foo@baz Thu 19 Sep 2019 02:58:44 PM CEST
+From: Willem de Bruijn
+Date: Thu, 12 Sep 2019 21:16:39 -0400
+Subject: udp: correct reuseport selection with connected sockets
+
+From: Willem de Bruijn
+
+[ Upstream commit acdcecc61285faed359f1a3568c32089cc3a8329 ]
+
+UDP reuseport groups can hold a mix unconnected and connected sockets.
+Ensure that connections only receive all traffic to their 4-tuple.
+
+Fast reuseport returns on the first reuseport match on the assumption
+that all matches are equal. Only if connections are present, return to
+the previous behavior of scoring all sockets.
+
+Record if connections are present and if so (1) treat such connected
+sockets as an independent match from the group, (2) only return
+2-tuple matches from reuseport and (3) do not return on the first
+2-tuple reuseport match to allow for a higher scoring match later.
+
+New field has_conns is set without locks. No other fields in the
+bitmap are modified at runtime and the field is only ever set
+unconditionally, so an RMW cannot miss a change.
+
+Fixes: e32ea7e74727 ("soreuseport: fast reuseport UDP socket selection")
+Link: http://lkml.kernel.org/r/CA+FuTSfRP09aJNYRt04SS6qj22ViiOEWaWmLAwX0psk8-PGNxw@mail.gmail.com
+Signed-off-by: Willem de Bruijn
+Acked-by: Paolo Abeni
+Acked-by: Craig Gallek
+Signed-off-by: Willem de Bruijn
+Signed-off-by: David S. Miller
+Signed-off-by: Greg Kroah-Hartman
+---
+ include/net/sock_reuseport.h | 21 ++++++++++++++++++++-
+ net/core/sock_reuseport.c | 15 +++++++++++++--
+ net/ipv4/datagram.c | 2 ++
+ net/ipv4/udp.c | 5 +++--
+ net/ipv6/datagram.c | 2 ++
+ net/ipv6/udp.c | 5 +++--
+ 6 files changed, 43 insertions(+), 7 deletions(-)
+
+--- a/include/net/sock_reuseport.h
++++ b/include/net/sock_reuseport.h
+@@ -21,7 +21,8 @@ struct sock_reuseport {
+ unsigned int synq_overflow_ts;
+ /* ID stays the same even after the size of socks[] grows. */
+ unsigned int reuseport_id;
+- bool bind_inany;
++ unsigned int bind_inany:1;
++ unsigned int has_conns:1;
+ struct bpf_prog __rcu *prog; /* optional BPF sock selector */
+ struct sock *socks[0]; /* array of sock pointers */
+ };
+@@ -35,6 +36,24 @@ extern struct sock *reuseport_select_soc
+ struct sk_buff *skb,
+ int hdr_len);
+ extern int reuseport_attach_prog(struct sock *sk, struct bpf_prog *prog);
++
++static inline bool reuseport_has_conns(struct sock *sk, bool set)
++{
++ struct sock_reuseport *reuse;
++ bool ret = false;
++
++ rcu_read_lock();
++ reuse = rcu_dereference(sk->sk_reuseport_cb);
++ if (reuse) {
++ if (set)
++ reuse->has_conns = 1;
++ ret = reuse->has_conns;
++ }
++ rcu_read_unlock();
++
++ return ret;
++}
++
+ int reuseport_get_id(struct sock_reuseport *reuse);
+
+ #endif /* _SOCK_REUSEPORT_H */
+--- a/net/core/sock_reuseport.c
++++ b/net/core/sock_reuseport.c
+@@ -295,8 +295,19 @@ struct sock *reuseport_select_sock(struc
+
+ select_by_hash:
+ /* no bpf or invalid bpf result: fall back to hash usage */
+- if (!sk2)
+- sk2 = reuse->socks[reciprocal_scale(hash, socks)];
++ if (!sk2) {
++ int i, j;
++
++ i = j = reciprocal_scale(hash, socks);
++ while (reuse->socks[i]->sk_state == TCP_ESTABLISHED) {
++ i++;
++ if (i >= reuse->num_socks)
++ i = 0;
++ if (i == j)
++ goto out;
++ }
++ sk2 = reuse->socks[i];
++ }
+ }
+
+ out:
+--- a/net/ipv4/datagram.c
++++ b/net/ipv4/datagram.c
+@@ -15,6 +15,7 @@
+ #include
+ #include
+ #include
++#include
+
+ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
+ {
+@@ -69,6 +70,7 @@ int __ip4_datagram_connect(struct sock *
+ }
+ inet->inet_daddr = fl4->daddr;
+ inet->inet_dport = usin->sin_port;
++ reuseport_has_conns(sk, true);
+ sk->sk_state = TCP_ESTABLISHED;
+ sk_set_txhash(sk);
+ inet->inet_id = jiffies;
+--- a/net/ipv4/udp.c
++++ b/net/ipv4/udp.c
+@@ -434,12 +434,13 @@ static struct sock *udp4_lib_lookup2(str
+ score = compute_score(sk, net, saddr, sport,
+ daddr, hnum, dif, sdif, exact_dif);
+ if (score > badness) {
+- if (sk->sk_reuseport) {
++ if (sk->sk_reuseport &&
++ sk->sk_state != TCP_ESTABLISHED) {
+ hash = udp_ehashfn(net, daddr, hnum,
+ saddr, sport);
+ result = reuseport_select_sock(sk, hash, skb,
+ sizeof(struct udphdr));
+- if (result)
++ if (result && !reuseport_has_conns(sk, false))
+ return result;
+ }
+ badness = score;
+--- a/net/ipv6/datagram.c
++++ b/net/ipv6/datagram.c
+@@ -27,6 +27,7 @@
+ #include
+ #include
+ #include
++#include
+
+ #include
+ #include
+@@ -254,6 +255,7 @@ ipv4_connected:
+ goto out;
+ }
+
++ reuseport_has_conns(sk, true);
+ sk->sk_state = TCP_ESTABLISHED;
+ sk_set_txhash(sk);
+ out:
+--- a/net/ipv6/udp.c
++++ b/net/ipv6/udp.c
+@@ -168,13 +168,14 @@ static struct sock *udp6_lib_lookup2(str
+ score = compute_score(sk, net, saddr, sport,
+ daddr, hnum, dif, sdif, exact_dif);
+ if (score > badness) {
+- if (sk->sk_reuseport) {
++ if (sk->sk_reuseport &&
++ sk->sk_state != TCP_ESTABLISHED) {
+ hash = udp6_ehashfn(net, daddr, hnum,
+ saddr, sport);
+
+ result = reuseport_select_sock(sk, hash, skb,
+ sizeof(struct udphdr));
+- if (result)
++ if (result && !reuseport_has_conns(sk, false))
+ return result;
+ }
+ result = sk;
diff --git a/queue-5.2/xen-netfront-do-not-assume-sk_buff_head-list-is-empty-in-error-handling.patch b/queue-5.2/xen-netfront-do-not-assume-sk_buff_head-list-is-empty-in-error-handling.patch
new file mode 100644
index 00000000000..4a1c6be34c4
--- /dev/null
+++ b/queue-5.2/xen-netfront-do-not-assume-sk_buff_head-list-is-empty-in-error-handling.patch
@@ -0,0 +1,53 @@
+From foo@baz Thu 19 Sep 2019 02:58:44 PM CEST
+From: Dongli Zhang
+Date: Mon, 16 Sep 2019 11:46:59 +0800
+Subject: xen-netfront: do not assume sk_buff_head list is empty in error handling
+
+From: Dongli Zhang
+
+[ Upstream commit 00b368502d18f790ab715e055869fd4bb7484a9b ]
+
+When skb_shinfo(skb) is not able to cache extra fragment (that is,
+skb_shinfo(skb)->nr_frags >= MAX_SKB_FRAGS), xennet_fill_frags() assumes
+the sk_buff_head list is already empty. As a result, cons is increased only
+by 1 and returns to error handling path in xennet_poll().
+
+However, if the sk_buff_head list is not empty, queue->rx.rsp_cons may be
+set incorrectly. That is, queue->rx.rsp_cons would point to the rx ring
+buffer entries whose queue->rx_skbs[i] and queue->grant_rx_ref[i] are
+already cleared to NULL. This leads to NULL pointer access in the next
+iteration to process rx ring buffer entries.
+
+Below is how xennet_poll() does error handling. All remaining entries in
+tmpq are accounted to queue->rx.rsp_cons without assuming how many
+outstanding skbs are remained in the list.
+
+ 985 static int xennet_poll(struct napi_struct *napi, int budget)
+... ...
+1032 if (unlikely(xennet_set_skb_gso(skb, gso))) {
+1033 __skb_queue_head(&tmpq, skb);
+1034 queue->rx.rsp_cons += skb_queue_len(&tmpq);
+1035 goto err;
+1036 }
+
+It is better to always have the error handling in the same way.
+
+Fixes: ad4f15dc2c70 ("xen/netfront: don't bug in case of too many frags")
+Signed-off-by: Dongli Zhang
+Signed-off-by: David S. Miller
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/net/xen-netfront.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/drivers/net/xen-netfront.c
++++ b/drivers/net/xen-netfront.c
+@@ -906,7 +906,7 @@ static RING_IDX xennet_fill_frags(struct
+ __pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+ }
+ if (unlikely(skb_shinfo(skb)->nr_frags >= MAX_SKB_FRAGS)) {
+- queue->rx.rsp_cons = ++cons;
++ queue->rx.rsp_cons = ++cons + skb_queue_len(list);
+ kfree_skb(nskb);
+ return ~0U;
+ }