This patch simplifies the checksum verification in tcpX_gro_receive
by reusing the CHECKSUM_COMPLETE code for CHECKSUM_NONE. All it
does for CHECKSUM_NONE is compute the partial checksum and then
treat it as if it came from the hardware (CHECKSUM_COMPLETE).
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cheers, Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In some cases we may receive IP packets that are longer than
their stated lengths. Such packets are never merged in GRO.
However, we may end up computing their checksums incorrectly
and end up allowing packets with a bogus checksum enter our
stack with the checksum status set as verified.
Since such packets are rare and not performance-critical, this
patch simply skips the checksum verification for them.
Reported-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Thanks, Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Recently GRO started generating packets with frag_lists of frags.
This was not handled by GSO, thus leading to a crash.
Thankfully these packets are of a regular form and are easy to
handle. This patch handles them in two ways. For completely
non-linear frag_list entries, we simply continue to iterate over
the frag_list frags once we exhaust the normal frags. For frag_list
entries with linear parts, we call pskb_trim on the first part
of the frag_list skb, and then process the rest of the frags in
the usual way.
This patch also kills a chunk of dead frag_list code that has
obviously never ever been run since it ends up generating a bogus
GSO-segmented packet with a frag_list entry.
Future work is planned to split super big packets into TSO
ones.
Fixes: 8a29111c7ca6 ("net: gro: allow to build full sized skb") Reported-by: Christoph Paasch <christoph.paasch@uclouvain.be> Reported-by: Jerry Chu <hkchu@google.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Tested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently we're using plain spin_lock() in prb_shutdown_retire_blk_timer(),
however the timer might fire right in the middle and thus try to re-aquire
the same spinlock, leaving us in a endless loop.
To fix that, use the spin_lock_bh() to block it.
Fixes: f6fb8f100b80 ("af-packet: TPACKET_V3 flexible buffer implementation.") CC: "David S. Miller" <davem@davemloft.net> CC: Daniel Borkmann <dborkman@redhat.com> CC: Willem de Bruijn <willemb@google.com> CC: Phil Sutter <phil@nwl.cc> CC: Eric Dumazet <edumazet@google.com> Reported-by: Jan Stancek <jstancek@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Salam reported a use after free bug in PF_PACKET that occurs when
we're sending out frames on a socket bound device and suddenly the
net device is being unregistered. It appears that commit 827d9780
introduced a possible race condition between {t,}packet_snd() and
packet_notifier(). In the case of a bound socket, packet_notifier()
can drop the last reference to the net_device and {t,}packet_snd()
might end up suddenly sending a packet over a freed net_device.
To avoid reverting 827d9780 and thus introducing a performance
regression compared to the current state of things, we decided to
hold a cached RCU protected pointer to the net device and maintain
it on write side via bind spin_lock protected register_prot_hook()
and __unregister_prot_hook() calls.
In {t,}packet_snd() path, we access this pointer under rcu_read_lock
through packet_cached_dev_get() that holds reference to the device
to prevent it from being freed through packet_notifier() while
we're in send path. This is okay to do as dev_put()/dev_hold() are
per-cpu counters, so this should not be a performance issue. Also,
the code simplifies a bit as we don't need need_rls_dev anymore.
Fixes: 827d978037d7 ("af-packet: Use existing netdev reference for bound sockets.") Reported-by: Salam Noureddine <noureddine@aristanetworks.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com> Cc: Ben Greear <greearb@candelatech.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
--------------------------- cut here -------------------------------
The reason is that when the bridge dev's address is changed, the
br_fdb_change_mac_address() will add new address in fdb, but when
the bridge was removed, the address entry in the fdb did not free,
the bridge_fdb_cache still has objects when destroy the cache, Fix
this by flushing the bridge address entry when removing the bridge.
v2: according to the Toshiaki Makita and Vlad's suggestion, I only
delete the vlan0 entry, it still have a leak here if the vlan id
is other number, so I need to call fdb_delete_by_port(br, NULL, 1)
to flush all entries whose dst is NULL for the bridge.
Suggested-by: Toshiaki Makita <toshiaki.makita1@gmail.com> Suggested-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tried to fix a problem with VLAN devices and promiscuouse flag setting.
The issue was that VLAN device was setting a flag on an interface that
was down, thus resulting in bad promiscuity count.
This commit blocked flag propagation to any device that is currently
down.
fixed VLAN code to only propagate flags when the VLAN interface is up,
thus fixing the same issue as above, only localized to VLAN.
The problem we have now is that if we have create a complex stack
involving multiple software devices like bridges, bonds, and vlans,
then it is possible that the flags would not propagate properly to
the physical devices. A simple examle of the scenario is the
following:
eth0----> bond0 ----> bridge0 ---> vlan50
If bond0 or eth0 happen to be down at the time bond0 is added to
the bridge, then eth0 will never have promisc mode set which is
currently required for operation as part of the bridge. As a
result, packets with vlan50 will be dropped by the interface.
The only 2 devices that implement the special flag handling are
VLAN and DSA and they both have required code to prevent incorrect
flag propagation. As a result we can remove the generic solution
introduced in b6c40d68ff6498b7f63ddf97cf0aa818d748dee7 and leave
it to the individual devices to decide whether they will block
flag propagation or not.
Reported-by: Stefan Priebe <s.priebe@profihost.ag> Suggested-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CPUs can ask for local route via ip_route_input_noref() concurrently.
if nh_rth_input is not cached yet, CPUs will proceed to allocate
equivalent DSTs on 'lo' and then will try to cache them in nh_rth_input
via rt_cache_route()
Most of the time they succeed, but on occasion the following two lines:
orig = *p;
prev = cmpxchg(p, orig, rt);
in rt_cache_route() do race and one of the cpus fails to complete cmpxchg.
But ip_route_input_slow() doesn't check the return code of rt_cache_route(),
so dst is leaking. dst_destroy() is never called and 'lo' device
refcnt doesn't go to zero, which can be seen in the logs as:
unregister_netdevice: waiting for lo to become free. Usage count = 1
Adding mdelay() between above two lines makes it easily reproducible.
Fix it similar to nh_pcpu_rth_output case.
Fixes: d2d68ba9fe8b ("ipv4: Cache input routes in fib_info nexthops.") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
snd_nxt must be updated synchronously with sk_send_head. Otherwise
tp->packets_out may be updated incorrectly, what may bring a kernel panic.
Here is a kernel panic from my host.
[ 103.043194] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[ 103.044025] IP: [<ffffffff815aaaaf>] tcp_rearm_rto+0xcf/0x150
...
[ 146.301158] Call Trace:
[ 146.301158] [<ffffffff815ab7f0>] tcp_ack+0xcc0/0x12c0
Before this panic a tcp socket was restored. This socket had sent and
unsent data in the write queue. Sent data was restored in repair mode,
then the socket was switched from reapair mode and unsent data was
restored. After that the socket was switched back into repair mode.
In that moment we had a socket where write queue looks like this:
snd_una snd_nxt write_seq
|_________|________|
|
sk_send_head
After a second switching from repair mode the state of socket was
changed:
This state is inconsistent, because snd_nxt and sk_send_head are not
synchronized.
Bellow you can find a call trace, how packets_out can be incremented
twice for one skb, if snd_nxt and sk_send_head are not synchronized.
In this case packets_out will be always positive, even when
sk_write_queue is empty.
I think update of snd_nxt isn't required, when a socket is switched from
repair mode. Because it's initialized in tcp_connect_init. Then when a
write queue is restored, snd_nxt is incremented in tcp_event_new_data_sent,
so it's always is in consistent state.
I have checked, that the bug is not reproduced with this patch and
all tests about restoring tcp connections work fine.
Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrey Vagin <avagin@openvz.org> Acked-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
init_card() calls dev_get_by_name() to get a network deceive. But it
doesn't decrease network device reference count after the device is
used.
Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After searching rt by the vti tunnel dst/src parameter,
if this rt has neither attached to any transformation
nor the transformation is not tunnel oriented, this rt
should be released back to ip layer.
otherwise causing dst memory leakage.
Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
and on HOSTB you do:
ping6 HOSTA -s2000 (MTU is 1500)
Incoming echo requests will be filtered out on HOSTA. This issue does
not occur with smaller packets than MTU (where fragmentation does not happen)
</example>
As was discussed previously, the only correct solution seems to be to use
reassembled skb instead of separete frags. Doing this has positive side
effects in reducing sk_buff by one pointer (nfct_reasm) and also the reams
dances in ipvs and conntrack can be removed.
Future plan is to remove net/ipv6/netfilter/nf_conntrack_reasm.c
entirely and use code in net/ipv6/reassembly.c instead.
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If reassembled packet would fit into outdev MTU, it is not fragmented
according the original frag size and it is send as single big packet.
The second case is if skb is gso. In that case fragmentation does not happen
according to the original frag size.
This patch fixes these.
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Offenders don't have port numbers, so set it to 0.
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If kmsg->msg_namelen > sizeof(struct sockaddr_storage) then in the
original code that would lead to memory corruption in the kernel if you
had audit configured. If you didn't have audit configured it was
harmless.
There are some programs such as beta versions of Ruby which use too
large of a buffer and returning an error code breaks them. We should
clamp the ->msg_namelen value instead.
Fixes: 1661bf364ae9 ("net: heap overflow in __audit_sockaddr()") Reported-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Tested-by: Eric Wong <normalperson@yhbt.net> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit bceaa90240b6019ed73b49965eac7d167610be69 ("inet: prevent leakage
of uninitialized memory to user in recv syscalls") conditionally updated
addr_len if the msg_name is written to. The recv_error and rxpmtu
functions relied on the recvmsg functions to set up addr_len before.
As this does not happen any more we have to pass addr_len to those
functions as well and set it to the size of the corresponding sockaddr
length.
This broke traceroute and such.
Fixes: bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls") Reported-by: Brad Spengler <spender@grsecurity.net> Reported-by: Tom Labanowski Cc: mpb <mpb.mail@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In that case it is probable that kernel code overwrote part of the
stack. So we should bail out loudly here.
The BUG_ON may be removed in future if we are sure all protocols are
conformant.
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
to return msg_name to the user.
This prevents numerous uninitialized memory leaks we had in the
recvmsg handlers and makes it harder for new code to accidentally leak
uninitialized memory.
Optimize for the case recvfrom is called with NULL as address. We don't
need to copy the address at all, so set it to NULL before invoking the
recvmsg handler. We can do so, because all the recvmsg handlers must
cope with the case a plain read() is called on them. read() also sets
msg_name to NULL.
Also document these changes in include/linux/net.h as suggested by David
Miller.
Changes since RFC:
Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
affect sendto as it would bail out earlier while trying to copy-in the
address. It also more naturally reflects the logic by the callers of
verify_iovec.
With this change in place I could remove "
if (!uaddr || msg_sys->msg_namelen == 0)
msg->msg_name = NULL
".
This change does not alter the user visible error logic as we ignore
msg_namelen as long as msg_name is NULL.
Also remove two unnecessary curly brackets in ___sys_recvmsg and change
comments to netdev style.
Cc: David Miller <davem@davemloft.net> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A plain read() on a socket does set msg->msg_name to NULL. So check for
NULL pointer first.
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Only update *addr_len when we actually fill in sockaddr, otherwise we
can return uninitialized memory from the stack to the caller in the
recvfrom, recvmmsg and recvmsg syscalls. Drop the the (addr_len == NULL)
checks because we only get called with a valid addr_len pointer either
from sock_common_recvmsg or inet_recvmsg.
If a blocking read waits on a socket which is concurrently shut down we
now return zero and set msg_msgnamelen to 0.
Reported-by: mpb <mpb.mail@gmail.com> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For performance reasons, sch_fq tried hard to not setup timers for every
sent packet, using a quantum based heuristic : A delay is setup only if
the flow exhausted its credit.
Problem is that application limited flows can refill their credit
for every queued packet, and they can evade pacing.
This problem can also be triggered when TCP flows use small MSS values,
as TSO auto sizing builds packets that are smaller than the default fq
quantum (3028 bytes)
This patch adds a 40 ms delay to guard flow credit refill.
Fixes: afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 7eec4174ff29 ("pkt_sched: fq: fix non TCP flows pacing")
obsoleted TCA_FQ_FLOW_DEFAULT_RATE without notice for the users.
Suggested by David Miller
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ip4_datagram_connect() being called from process context,
it should use IP_INC_STATS() instead of IP_INC_STATS_BH()
otherwise we can deadlock on 32bit arches, or get corruptions of
SNMP counters.
Fixes: 584bdf8cbdf6 ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In af3e095a1fb4, Erik Jacobsen fixed one type of unaligned access
bug for ia64 by converting a 64-bit write to use put_unaligned().
Unfortunately, since gcc will convert a short memset() to a series
of appropriately-aligned stores, the problem is now visible again
on tilegx, where the memset that zeros out proc_event is converted
to three 64-bit stores, causing an unaligned access panic.
A better fix for the original problem is to ensure that proc_event
is aligned to 8 bytes here. We can do that relatively easily by
arranging to start the struct cn_msg aligned to 8 bytes and then
offset by 4 bytes. Doing so means that the immediately following
proc_event structure is then correctly aligned to 8 bytes.
The result is that the memset() stores are now aligned, and as an
added benefit, we can remove the put_unaligned() calls in the code.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Initial sch_fq implementation copied code from pfifo_fast to classify
a packet as a high prio packet.
This clashes with setups using PRIO with say 7 bands, as one of the
band could be incorrectly (mis)classified by FQ.
Packets would be queued in the 'internal' queue, and no pacing ever
happen for this special queue.
Fixes: afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler") Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Willem de Bruijn <willemb@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug has been introduced by commit bb8140947a24 ("ip6tnl: allow to use rtnl ops
on fb tunnel").
When ip6_tunnel.ko is unloaded, FB device is delete by rtnl_link_unregister()
and then we try to use the pointer in ip6_tnl_destroy_tunnels().
Let's add an handler for dellink, which will never remove the FB tunnel. With
this patch it will no more be possible to remove it via 'ip link del ip6tnl0',
but it's safer.
The same fix was already proposed by Willem de Bruijn <willemb@google.com> for
sit interfaces.
CC: Willem de Bruijn <willemb@google.com> Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
These strings come from a copy_from_user() and there is no way to be
sure they are NUL terminated.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: The fallback device is created in sit_init_net and assumed to be
freed in sit_exit_net. First, it is dereferenced in that function, in
sit_destroy_tunnels:
struct net *net = dev_net(sitn->fb_tunnel_dev);
Prior to this, rtnl_unlink_register has removed all devices that match
rtnl_link_ops == sit_link_ops.
which cases the fallback device to match here and be freed before it
is last dereferenced.
Fix: This commit adds an explicit .delllink callback to sit_link_ops
that skips deallocation at rtnl_unlink_register for the fallback
device. This mechanism is comparable to the one in ip_tunnel.
It also modifies sit_destroy_tunnels and its only caller sit_exit_net
to avoid the offending dereference in the first place. That double
lookup is more complicated than required.
Test: The bug is only triggered when CONFIG_NET_NS is enabled. It
causes a GPF only when CONFIG_DEBUG_SLAB is enabled. Verified that
this bug exists at the mentioned commit, at davem-net HEAD and at
3.11.y HEAD. Verified that it went away after applying this patch.
Fixes: 205983c43700 ("sit: allow to use rtnl ops on fb tunnel") Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We had some reports of crashes using TCP fastopen, and Dave Jones
gave a nice stack trace pointing to the error.
Issue is that tcp_get_metrics() should not be called with a NULL dst
Fixes: 1fe4c481ba637 ("net-tcp: Fast Open client - cookie cache") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dave Jones <davej@redhat.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Tested-by: Dave Jones <davej@fedoraproject.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes two race conditions between bond_store_updelay/downdelay
and bond_store_miimon which could lead to division by zero as miimon can
be set to 0 while either updelay/downdelay are being set and thus miss the
zero check in the beginning, the zero div happens because updelay/downdelay
are stored as new_value / bond->params.miimon. Use rtnl to synchronize with
miimon setting.
CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After commit c9eeec26e32e ("tcp: TSQ can use a dynamic limit"), several
users reported throughput regressions, notably on mvneta and wifi
adapters.
802.11 AMPDU requires a fair amount of queueing to be effective.
This patch partially reverts the change done in tcp_write_xmit()
so that the minimal amount is sysctl_tcp_limit_output_bytes.
It also remove the use of this sysctl while building skb stored
in write queue, as TSO autosizing does the right thing anyway.
Users with well behaving NICS and correct qdisc (like sch_fq),
can then lower the default sysctl_tcp_limit_output_bytes value from
128KB to 8KB.
This new usage of sysctl_tcp_limit_output_bytes permits each driver
authors to check how their driver performs when/if the value is set
to a minimum of 4KB.
Normally, line rate for a single TCP flow should be possible,
but some drivers rely on timers to perform TX completion and
too long TX completion delays prevent reaching full throughput.
Fixes: c9eeec26e32e ("tcp: TSQ can use a dynamic limit") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Sujith Manoharan <sujith@msujith.org> Reported-by: Arnaud Ebalard <arno@natisbad.org> Tested-by: Sujith Manoharan <sujith@msujith.org> Cc: Felix Fietkau <nbd@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We currently use hdr_len as a hint of head length which is advertised by
guest. But when guest advertise a very big value, it can lead to an 64K+
allocating of kmalloc() which has a very high possibility of failure when host
memory is fragmented or under heavy stress. The huge hdr_len also reduce the
effect of zerocopy or even disable if a gso skb is linearized in guest.
To solves those issues, this patch introduces an upper limit (PAGE_SIZE) of the
head, which guarantees an order 0 allocation each time.
Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We currently use hdr_len as a hint of head length which is advertised by
guest. But when guest advertise a very big value, it can lead to an 64K+
allocating of kmalloc() which has a very high possibility of failure when host
memory is fragmented or under heavy stress. The huge hdr_len also reduce the
effect of zerocopy or even disable if a gso skb is linearized in guest.
To solves those issues, this patch introduces an upper limit (PAGE_SIZE) of the
head, which guarantees an order 0 allocation each time.
Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If priority/traffic class field in IPv6 header is set (seen when
using ssh), the uncompression sets the TC and Flow fields incorrectly.
Example:
This is IPv6 header of a sent packet. Note the priority/TC (=1) in
the first byte.
00000000: 61 00 00 00 00 2c 06 40 fe 80 00 00 00 00 00 00 00000010: 02 02 72 ff fe c6 42 10 fe 80 00 00 00 00 00 00 00000020: 02 1e ab ff fe 4c 52 57
This gets compressed like this in the sending side
00000000: 72 31 04 06 02 1e ab ff fe 4c 52 57 ec c2 00 16 00000010: aa 2d fe 92 86 4e be c6 ....
In the receiving end, the packet gets uncompressed to this
IPv6 header
00000000: 60 06 06 02 00 2a 1e 40 fe 80 00 00 00 00 00 00 00000010: 02 02 72 ff fe c6 42 10 fe 80 00 00 00 00 00 00 00000020: ab ff fe 4c 52 57 ec c2
First four bytes are set incorrectly and we have also lost
two bytes from destination address.
The fix is to switch the case values in switch statement
when checking the TC field.
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 06a23fe31ca3
("core/dev: set pkt_type after eth_type_trans() in dev_forward_skb()")
and refactoring 64261f230a91
("dev: move skb_scrub_packet() after eth_type_trans()")
are forcing pkt_type to be PACKET_HOST when skb traverses veth.
which means that ip forwarding will kick in inside netns
even if skb->eth->h_dest != dev->dev_addr
Fix order of eth_type_trans() and skb_scrub_packet() in dev_forward_skb()
and in ip_tunnel_rcv()
Fixes: 06a23fe31ca3 ("core/dev: set pkt_type after eth_type_trans() in dev_forward_skb()") CC: Isaku Yamahata <yamahatanetdev@gmail.com> CC: Maciej Zenczykowski <zenczykowski@gmail.com> CC: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since commit 7b0c5f21f348a66de495868b8df0284e8dfd6bbf
"sierra_net: keep status interrupt URB active", sierra_net triggers
status interrupt polling before the net_device is opened (in order to
properly receive the sync message response).
To be able to receive further interrupts, the interrupt urb needs to be
re-submitted, so this patch removes the bogus check for netif_running().
Signed-off-by: Felix Fietkau <nbd@openwrt.org> Tested-by: Dan Williams <dcbw@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently the ARP monitoring is not supported with 802.3ad, and it's
prohibited to use it via the module params.
However we still can set it afterwards via sysfs, cause we only check for
*LB modes there.
To fix this - add a check for 802.3ad mode in bonding_store_arp_interval.
CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For properly initialising the Tausworthe generator [1], we have
a strict seeding requirement, that is, s1 > 1, s2 > 7, s3 > 15.
Commit 697f8d0348 ("random32: seeding improvement") introduced
a __seed() function that imposes boundary checks proposed by the
errata paper [2] to properly ensure above conditions.
However, we're off by one, as the function is implemented as:
"return (x < m) ? x + m : x;", and called with __seed(X, 1),
__seed(X, 7), __seed(X, 15). Thus, an unwanted seed of 1, 7, 15
would be possible, whereas the lower boundary should actually
be of at least 2, 8, 16, just as GSL does. Fix this, as otherwise
an initialization with an unwanted seed could have the effect
that Tausworthe's PRNG properties cannot not be ensured.
Note that this PRNG is *not* used for cryptography in the kernel.
Cc: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As the rfc 4191 said, the Router Preference and Lifetime values in a
::/0 Route Information Option should override the preference and lifetime
values in the Router Advertisement header. But when the kernel deals with
a ::/0 Route Information Option, the rt6_get_route_info() always return
NULL, that means that overriding will not happen, because those default
routers were added without flag RTF_ROUTEINFO in rt6_add_dflt_router().
In order to deal with that condition, we should call rt6_get_dflt_router
when the prefix length is 0.
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When trying to delete a table >= 256 using iproute2 the local table
will be deleted.
The table id is specified as a netlink attribute when it needs more then
8 bits and iproute2 then sets the table field to RT_TABLE_UNSPEC (0).
Preconditions to matching the table id in the rule delete code
doesn't seem to take the "table id in netlink attribute" into condition
so the frh_get_table helper function never gets to do its job when
matching against current rule.
Use the helper function twice instead of peaking at the table value directly.
Originally reported at: http://bugs.debian.org/724783
Reported-by: Nicolas HICHER <nhicher@avencall.com> Signed-off-by: Andreas Henriksson <andreas@fatal.se> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
timecounter_init() was was called only after first potential
timecounter_read().
Moved mlx4_en_init_timestamp() before mlx4_en_init_netdev()
Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reported-by: Joe Lawrence <joe.lawrence@stratus.com> Tested-by: Joe Lawrence <joe.lawrence@stratus.com> CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 1e2bd517c108816220f262d7954b697af03b5f9c ("udp6: Fix udp
fragmentation for tunnel traffic.") changed the calculation if
there is enough space to include a fragment header in the skb from a
skb->mac_header dervived one to skb_headroom. Because we already peeled
off the skb to transport_header this is wrong. Change this back to check
if we have enough room before the mac_header.
This fixes a panic Saran Neti reported. He used the tbf scheduler which
skb_gso_segments the skb. The offsets get negative and we panic in memcpy
because the skb was erroneously not expanded at the head.
Reported-by: Saran Neti <Saran.Neti@telus.com> Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We assume that "mp->phy" can be NULL a couple lines before the
dereference.
Fixes: 1cce16d37d0f ('net: mv643xx_eth: Add missing phy_addr_set in DT mode') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Acked-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jason Gunthorpe [Tue, 5 Nov 2013 00:27:19 +0000 (17:27 -0700)]
net: mv643xx_eth: Add missing phy_addr_set in DT mode
Commit cc9d4598 'net: mv643xx_eth: use of_phy_connect if phy_node
present' made the call to phy_scan optional, if the DT has a link to
the phy node.
However phy_scan has the side effect of calling phy_addr_set, which
writes the phy MDIO address to the ethernet controller. If phy_addr_set
is not called, and the bootloader has not set the correct address then
the driver will fail to function.
Tested on Kirkwood.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Tested-by: Arnaud Ebalard <arno@natisbad.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is a patch that adds the new Mayflash Gamecube Controller to USB adapter
(ID 1a34:f705 ACRUX) to the ACRUX driver (drivers/hid/hid-axff.c) with full
force feedback support.
Added id, bindings and comments for Holtek USB ID 04d9:a072
LEETGION Hellion Gaming mouse to use the same corrections of the report
descriptor as Holtek 04d9:a067. As the mouse exceed HID_MAX_USAGES at the
same offsets in the reported descriptor.
Tested on the hardware.
Signed-off-by: Anders F. U. Kiær <ablacksheep@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove the certificate date checks that are performed when a certificate is
parsed. There are two checks: a valid from and a valid to. The first check is
causing a lot of problems with system clocks that don't keep good time and the
second places an implicit expiry date upon the kernel when used for module
signing, so do we really need them?
Signed-off-by: David Howells <dhowells@redhat.com>
cc: David Woodhouse <dwmw2@infradead.org>
cc: Rusty Russell <rusty@rustcorp.com.au>
cc: Josh Boyer <jwboyer@redhat.com>
cc: Alexander Holler <holler@ahsoftware.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/dvb-frontends/s5h1420.c:851:1: warning: 's5h1420_tuner_i2c_tuner_xfer' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer.
In the specific case of this frontend, only ttpci uses it. The maximum
number of messages there is two, on I2C read operations. As the logic
can add an extra operation, change the size to 3.
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/dvb-frontends/bcm3510.c:230:1: warning: 'bcm3510_do_hab_cmd' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/itd1000.c:69:1: warning: 'itd1000_write_regs.constprop.0' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/mt312.c:126:1: warning: 'mt312_write' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/nxt200x.c:111:1: warning: 'nxt200x_writebytes' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/stb6100.c:216:1: warning: 'stb6100_write_reg_range.constprop.3' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/stv6110.c:98:1: warning: 'stv6110_write_regs' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/stv6110x.c:85:1: warning: 'stv6110x_write_regs' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/tda18271c2dd.c:147:1: warning: 'WriteRegs' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/zl10039.c:119:1: warning: 'zl10039_write' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer. Considering that I2C
transfers are generally limited, and that devices used on USB has a
max data length of 64 bytes for the control URBs.
So, it seem safe to use 64 bytes as the hard limit for all those devices.
On most cases, the limit is a way lower than that, but this limit
is small enough to not affect the Kernel stack, and it is a no brain
limit, as using smaller ones would require to either carefully each
driver or to take a look on each datasheet.
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/dvb-frontends/af9013.c:77:1: warning: 'af9013_wr_regs_i2c' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/af9033.c:188:1: warning: 'af9033_wr_reg_val_tab' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/af9033.c:68:1: warning: 'af9033_wr_regs' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/bcm3510.c:230:1: warning: 'bcm3510_do_hab_cmd' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/cxd2820r_core.c:84:1: warning: 'cxd2820r_rd_regs_i2c.isra.1' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/rtl2830.c:56:1: warning: 'rtl2830_wr' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/rtl2832.c:187:1: warning: 'rtl2832_wr' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/tda10071.c:52:1: warning: 'tda10071_wr_regs' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/tda10071.c:84:1: warning: 'tda10071_rd_regs' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer. Considering that I2C
transfers are generally limited, and that devices used on USB has a
max data length of 64 bytes for the control URBs.
So, it seem safe to use 64 bytes as the hard limit for all those devices.
On most cases, the limit is a way lower than that, but this limit
is small enough to not affect the Kernel stack, and it is a no brain
limit, as using smaller ones would require to either carefully each
driver or to take a look on each datasheet.
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/dvb-frontends/stb0899_drv.c:540:1: warning: 'stb0899_write_regs' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer. Considering that I2C
transfers are generally limited, and that devices used on USB has a
max data length of 64 bytes for the control URBs.
So, it seem safe to use 64 bytes as the hard limit for all those devices.
On most cases, the limit is a way lower than that, but this limit
is small enough to not affect the Kernel stack, and it is a no brain
limit, as using smaller ones would require to either carefully each
driver or to take a look on each datasheet.
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/dvb-frontends/stv0367.c:791:1: warning: 'stv0367_writeregs.constprop.4' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer. Considering that I2C
transfers are generally limited, and that devices used on USB has a
max data length of 64 bytes for the control URBs.
So, it seem safe to use 64 bytes as the hard limit for all those devices.
On most cases, the limit is a way lower than that, but this limit
is small enough to not affect the Kernel stack, and it is a no brain
limit, as using smaller ones would require to either carefully each
driver or to take a look on each datasheet.
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/dvb-frontends/stv090x.c:750:1: warning: 'stv090x_write_regs.constprop.6' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer. Considering that I2C
transfers are generally limited, and that devices used on USB has a
max data length of 64 bytes for the control URBs.
So, it seem safe to use 64 bytes as the hard limit for all those devices.
On most cases, the limit is a way lower than that, but this limit
is small enough to not affect the Kernel stack, and it is a no brain
limit, as using smaller ones would require to either carefully each
driver or to take a look on each datasheet.
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/tuners/e4000.c:50:1: warning: 'e4000_wr_regs' uses dynamic stack allocation [enabled by default]
drivers/media/tuners/e4000.c:83:1: warning: 'e4000_rd_regs' uses dynamic stack allocation [enabled by default]
drivers/media/tuners/fc2580.c:66:1: warning: 'fc2580_wr_regs.constprop.1' uses dynamic stack allocation [enabled by default]
drivers/media/tuners/fc2580.c:98:1: warning: 'fc2580_rd_regs.constprop.0' uses dynamic stack allocation [enabled by default]
drivers/media/tuners/tda18212.c:57:1: warning: 'tda18212_wr_regs' uses dynamic stack allocation [enabled by default]
drivers/media/tuners/tda18212.c:90:1: warning: 'tda18212_rd_regs.constprop.0' uses dynamic stack allocation [enabled by default]
drivers/media/tuners/tda18218.c:60:1: warning: 'tda18218_wr_regs' uses dynamic stack allocation [enabled by default]
drivers/media/tuners/tda18218.c:92:1: warning: 'tda18218_rd_regs.constprop.0' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer. Considering that I2C
transfers are generally limited, and that devices used on USB has a
max data length of 64 bytes for the control URBs.
So, it seem safe to use 64 bytes as the hard limit for all those devices.
On most cases, the limit is a way lower than that, but this limit
is small enough to not affect the Kernel stack, and it is a no brain
limit, as using smaller ones would require to either carefully each
driver or to take a look on each datasheet.
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/tuners/tuner-xc2028.c:651:1: warning: 'load_firmware' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer.
In the specific case of this driver, the maximum limit is 80, used only
on tm6000 driver. This limit is due to the size of the USB control URBs.
Ok, it would be theoretically possible to use a bigger size on PCI
devices, but the firmware load time is already good enough. Anyway,
if some usage requires more, it is just a matter of also increasing
the buffer size at load_firmware().
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/v4l2-core/v4l2-async.c:238:1: warning: 'v4l2_async_notifier_unregister' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer.
In this specific case, there's a hard limit imposed by V4L2_MAX_SUBDEVS,
with is currently 128. That means that the buffer size can be up to
128x8 = 1024 bytes (on a 64bits kernel), with is too big for stack.
Worse than that, someone could increase it and cause real troubles.
So, let's use dynamically allocated data, instead.
Dynamic static allocation is evil, as Kernel stack is too low, and
ompilation complains about it on some archs:
drivers/staging/media/lirc/lirc_zilog.c:967:1: warning: 'read' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer to be 64. That should
be more than enough.
drivers/media/pci/cx18/cx18-driver.c: In function 'cx18_read_eeprom':
drivers/media/pci/cx18/cx18-driver.c:357:1: warning: the frame size of 1072 bytes is larger than 1024 bytes [-Wframe-larger-than=]
That happens because the routine allocates 256 bytes for an eeprom buffer, plus
the size of struct i2c_client, with is big.
Change the logic to dynamically allocate/deallocate space for struct i2c_client,
instead of using the stack.
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/pci/cx23885/cimax2.c:149:1: warning: 'netup_write_i2c' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer. Considering that I2C
transfers are generally limited, and that devices used on USB has a
max data length of 64 bytes for the control URBs.
So, it seem safe to use 64 bytes as the hard limit for all those devices.
On most cases, the limit is a way lower than that, but this limit
is small enough to not affect the Kernel stack, and it is a no brain
limit, as using smaller ones would require to either carefully each
driver or to take a look on each datasheet.
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/pci/ttpci/av7110_hw.c:510:1: warning: 'av7110_fw_cmd' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer.
In the specific case of this driver, the maximum fw command size
is 6 + 2, as checked using:
$ git grep -A1 av7110_fw_cmd drivers/media/pci/ttpci/
So, use 8 for the buffer size.
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/usb/dvb-usb/cxusb.c:209:1: warning: 'cxusb_i2c_xfer' uses dynamic stack allocation [enabled by default]
drivers/media/usb/dvb-usb/cxusb.c:69:1: warning: 'cxusb_ctrl_msg' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer to be the max size of
a control URB payload data (64 bytes).
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/usb/dvb-usb/dibusb-common.c:124:1: warning: 'dibusb_i2c_msg' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer to be the max size of
a control URB payload data (64 bytes).
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/usb/dvb-usb/dw2102.c:368:1: warning: 'dw2102_earda_i2c_transfer' uses dynamic stack allocation [enabled by default]
drivers/media/usb/dvb-usb/dw2102.c:449:1: warning: 'dw2104_i2c_transfer' uses dynamic stack allocation [enabled by default]
drivers/media/usb/dvb-usb/dw2102.c:512:1: warning: 'dw3101_i2c_transfer' uses dynamic stack allocation [enabled by default]
drivers/media/usb/dvb-usb/dw2102.c:621:1: warning: 's6x0_i2c_transfer' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer to be the max size of
a control URB payload data (64 bytes).
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/usb/dvb-usb-v2/af9015.c:433:1: warning: 'af9015_eeprom_hash' uses dynamic stack allocation [enabled by default]
In this specific case, it is a gcc bug, as the size is a const, but
it is easy to just change it from const to a #define, getting rid of
the gcc warning.
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/usb/dvb-usb-v2/af9035.c:142:1: warning: 'af9035_wr_regs' uses dynamic stack allocation [enabled by default]
drivers/media/usb/dvb-usb-v2/af9035.c:305:1: warning: 'af9035_i2c_master_xfer' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer to be the max size of
a control URB payload data (64 bytes).
Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/usb/dvb-usb-v2/mxl111sf.c:74:1: warning: 'mxl111sf_ctrl_msg' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer to be the max size of
a control URB payload data (64 bytes).
If we fail to allocate an indirect buffer (ib) when updating
the ptes, return an error instead of trying to use the ib.
Avoids a null pointer dereference.
The PL061 driver had the irqdomain initialization in an unfortunate
place: when used with device tree (and thus passing the base IRQ
0) the driver would work, as this registers an irqdomain and waits
for mappings to be done dynamically as the devices request their
IRQs, whereas when booting using platform data the irqdomain core
would attempt to allocate IRQ descriptors dynamically (which works
fine) but also to associate the irq_domain_associate_many() on all
IRQs, which in turn will call the mapping function which at this
point will try to set the type of the IRQ and then tries to acquire
a non-initialized spinlock yielding a backtrace like this:
This moves the irqdomain initialization to a point where the spinlock
and GPIO chip are both fully propulated, so the callbacks can be used
without crashes.
I had some problem reproducing the crash, as the devm_kzalloc():ed
zeroed memory would seemingly mask the spinlock as something OK,
but by poisoning the lock like this:
u32 *dum;
dum = (u32 *) &chip->lock;
*dum = 0xaaaaaaaaU;
I could reproduce, fix and test the patch.
Reported-by: Russell King <linux@arm.linux.org.uk> Cc: Rob Herring <robherring2@gmail.com> Cc: Haojian Zhuang <haojian.zhuang@linaro.org> Cc: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
By default the Logitech Formula Vibration presents a combined accel/brake
axis ('Y'). This patch modifies the HID descriptor to present seperate
accel/brake axes ('Y' and 'Z').
Most of the hid sensor field size is reported in report_size field
in the report descriptor. For rotation fusion sensor the quaternion
data is 16 byte field, the report size was set to 4 and report
count field is set to 4. So the total size is 16 bytes. But the current
driver has a bug and not taking account for report count field. This
causes user space to see only 4 bytes of data sent via IIO interface.
The number of bytes in a field needs to take account of report_count
field. Need to multiply report_size and report_count to get total
number of bytes.
In the case that atomic_open calls finish_no_open() with
the dentry that was supplied to gfs2_atomic_open() an
extra reference count is required. This patch fixes that
issue preventing a bug trap triggering at umount time.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
afa2c9407f8908 ("sh: ecovec24: Use MMC/SDHI CD and RO GPIO") added
.tmio_flags = TMIO_MMC_USE_GPIO_CD on sh_mobile_sdhi_info, but it needs
<linux/mfd/tmio.h> header. This patch adds it.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Yusuke Goda <yusuke.goda.sx@renesas.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Chris Ball <cjb@laptop.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In some platforms, specially Thinkpad series, rts5249 won't be
initialized properly. So we need adjust some phy parameters to
improve the compatibility issue.
It is a little different between simulation and real chip. We have
no idea about which configuration is better before tape-out. We set
default settings according to simulation, but need to tune these
parameters after getting the real chip.
I can't explain every change in detail here. The below information is
just a rough description:
PHY_REG_REV: Disable internal clkreq_tx, enable rx_pwst
PHY_BPCR: No change, just turn the magic number to macro definitions
PHY_PCR: Change OOBS sensitivity, from 60mV to 90mV
PHY_RCR2: Control charge-pump current automatically
PHY_FLD4: Use TX cmu reference clock
PHY_RDR: Change RXDSEL from 30nF to 1.9nF
PHY_RCR1: Change the duration between adp_st and asserting cp_en from
0.32 us to 0.64us
PHY_FLD3: Adjust internal timers
PHY_TUNE: Fine tune the regulator12 output voltage
Signed-off-by: Wei WANG <wei_wang@realsil.com.cn> Signed-off-by: Lee Jones <lee.jones@linaro.org> Cc: Chris Ball <cjb@laptop.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Certain devices with class HID, protocol None did not work with the HID
driver at one point, and as a result were bound to usbtouchscreen
instead as of commit 139ebe8 ("Input: usbtouchscreen - fix eGalax HID
ignoring"). This change was prompted by the following report:
https://lkml.org/lkml/2009/1/25/127
Unfortunately, the device mentioned in this report is no longer
available for testing.
We've recently discovered that some devices with class HID, protocol
None do not work with usbtouchscreen, but do work with usbhid. Here is
the report that made this evident:
Driver binding for these devices has flip-flopped a few times, so both
of the above reports were regressions.
This situation would appear to leave us with no easy way to bind every
device to the right driver. However, in my own testing with several
devices I have not found a device with class HID that does not work with
the current HID driver. It is my belief that changes to the HID driver
since the original report have likely fixed the issue(s) that made it
unsuitable at the time, and that we should prefer it over usbtouchscreen
for these devices. In particular, HID quirks affecting these devices
were added/removed in the following commits since then:
fe6065d HID: add multi-input quirk for eGalax Touchcontroller 77933c3 Merge branch 'egalax' into for-linus ebd11fe HID: Add quirk for eGalax touch controler. d34c4aa HID: add no-get quirk for eGalax touch controller
This patch makes the HID driver no longer ignore eGalax/D-Wav/EETI
devices with class HID. If there are in fact devices with class HID
that still do not work with the HID driver, we will see another round of
regressions. In that case I propose we investigate why the device is
not working with the HID driver rather than re-introduce regressions for
functioning HID devices by again binding them to usbtouchscreen.
The corresponding change to usbtouchscreen will be made separately.
If hardware (or firmware) detects palm on the surface of the device it does
not mean that the data packet is bad from the protocol standpoint. Instead
of reporting PSMOUSE_BAD_DATA in this case simply threat it as if nothing
touches the surface.
BugLink: http://bugs.launchpad.net/bugs/1229361 Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Tested-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
evdev always tries to allocate the event buffer for clients using
kzalloc rather than vmalloc, presumably to avoid mapping overhead where
possible. However, drivers like bcm5974, which claims support for
reporting 16 fingers simultaneously, can have an extraordinarily large
buffer. The resultant contiguous order-4 allocation attempt fails due
to fragmentation, and the device is thus unusable until reboot.
Try kzalloc if we can to avoid the mapping overhead, but if that fails,
fall back to vzalloc.
Signed-off-by: Daniel Stone <daniels@collabora.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
New tracepoints have been added that have allowed for NULL pointers
being assigned to strings. To fix this, change the TRACE_EVENT() code
to check for NULL and if it is, it will assign "(null)" to it instead
(similar to what glibc printf does).
The current generic parser assumes blindly that the volume and mute
amps are found in the aamix node itself. But on some codecs,
typically Analog Devices ones, the aamix amps are separately
implemented in each leaf node of the aamix node, and the current
driver can't establish the correct amp controls. This is a regression
compared with the previous static quirks.
This patch extends the search for the amps to the leaf nodes for
allowing the aamix controls again on such codecs.
In this implementation, I didn't code to loop through the whole paths,
since usually one depth should suffice, and we can't search too
deeply, as it may result in the conflicting control assignments.
Add a fixup entry for the missing bass speaker pin 0x16 on ASUS ET2700
AiO desktop. The channel map will be added in the next patch, so that
this can be backported easily to stable kernels.
When a headphone jack is configurable as input, the generic parser
tries to make it retaskable as Headphone Mic. The switching can be
done smoothly if Capture Source control exists (i.e. there is another
input source). Or when user explicitly enables the creation of jack
mode controls, "Headhpone Mic Jack Mode" will be created accordingly.
However, if the headphone mic is the only input source, we have to
create "Headphone Mic Jack Mode" control because there is no capture
source selection. Otherwise, the generic parser assumes that the
input is constantly enabled, thus the headphone is permanently set
as input. This situation happens on the old MacBook Airs where no
input is supported properly, for example.
This patch fixes the problem: now "Headphone Mic Jack Mode" is created
when such an input selection isn't possible.