When destroying an mrt table the static mfc entries and the static
devices are kept, which leads to devices that can never be destroyed
(because of refcnt taken) and leaked memory, for example:
unreferenced object 0xffff880034c144c0 (size 192):
comm "mfc-broken", pid 4777, jiffies 4320349055 (age 46001.964s)
hex dump (first 32 bytes):
98 53 f0 34 00 88 ff ff 98 53 f0 34 00 88 ff ff .S.4.....S.4....
ef 0a 0a 14 01 02 03 04 00 00 00 00 01 00 00 00 ................
backtrace:
[<ffffffff815c1b9e>] kmemleak_alloc+0x4e/0xb0
[<ffffffff811ea6e0>] kmem_cache_alloc+0x190/0x300
[<ffffffff815931cb>] ip_mroute_setsockopt+0x5cb/0x910
[<ffffffff8153d575>] do_ip_setsockopt.isra.11+0x105/0xff0
[<ffffffff8153e490>] ip_setsockopt+0x30/0xa0
[<ffffffff81564e13>] raw_setsockopt+0x33/0x90
[<ffffffff814d1e14>] sock_common_setsockopt+0x14/0x20
[<ffffffff814d0b51>] SyS_setsockopt+0x71/0xc0
[<ffffffff815cdbf6>] entry_SYSCALL_64_fastpath+0x16/0x7a
[<ffffffffffffffff>] 0xffffffffffffffff
Make sure that everything is cleaned on netns destruction.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Further investigation showed that this can happen when an *odd* number of
fds are being passed over AF_UNIX sockets.
In these cases CMSG_LEN(i * sizeof(int)) and CMSG_SPACE(i * sizeof(int)),
where i is the number of successfully passed fds, differ by 4 bytes due
to the extra CMSG_ALIGN() padding in CMSG_SPACE() to an 8 byte boundary
on 64 bit. The padding is used to align subsequent cmsg headers in the
control buffer.
When the control buffer passed in from the receiver side *lacks* these 4
bytes (e.g. due to buggy/wrong API usage), then msg->msg_controllen will
overflow in scm_detach_fds():
int cmlen = CMSG_LEN(i * sizeof(int)); <--- cmlen w/o tail-padding
err = put_user(SOL_SOCKET, &cm->cmsg_level);
if (!err)
err = put_user(SCM_RIGHTS, &cm->cmsg_type);
if (!err)
err = put_user(cmlen, &cm->cmsg_len);
if (!err) {
cmlen = CMSG_SPACE(i * sizeof(int)); <--- cmlen w/ 4 byte extra tail-padding
msg->msg_control += cmlen;
msg->msg_controllen -= cmlen; <--- iff no tail-padding space here ...
} ... wrap-around
F.e. it will wrap to a length of 18446744073709551612 bytes in case the
receiver passed in msg->msg_controllen of 20 bytes, and the sender
properly transferred 1 fd to the receiver, so that its CMSG_LEN results
in 20 bytes and CMSG_SPACE in 24 bytes.
In case of MSG_CMSG_COMPAT (scm_detach_fds_compat()), I haven't seen an
issue in my tests as alignment seems always on 4 byte boundary. Same
should be in case of native 32 bit, where we end up with 4 byte boundaries
as well.
In practice, passing msg->msg_controllen of 20 to recvmsg() while receiving
a single fd would mean that on successful return, msg->msg_controllen is
being set by the kernel to 24 bytes instead, thus more than the input
buffer advertised. It could f.e. become an issue if such application later
on zeroes or copies the control buffer based on the returned msg->msg_controllen
elsewhere.
Maximum number of fds we can send is a hard upper limit SCM_MAX_FD (253).
Going over the code, it seems like msg->msg_controllen is not being read
after scm_detach_fds() in scm_recv() anymore by the kernel, good!
Relevant recvmsg() handler are unix_dgram_recvmsg() (unix_seqpacket_recvmsg())
and unix_stream_recvmsg(). Both return back to their recvmsg() caller,
and ___sys_recvmsg() places the updated length, that is, new msg_control -
old msg_control pointer into msg->msg_controllen (hence the 24 bytes seen
in the example).
Long time ago, Wei Yongjun fixed something related in commit 1ac70e7ad24a
("[NET]: Fix function put_cmsg() which may cause usr application memory
overflow").
RFC3542, section 20.2. says:
The fields shown as "XX" are possible padding, between the cmsghdr
structure and the data, and between the data and the next cmsghdr
structure, if required by the implementation. While sending an
application may or may not include padding at the end of last
ancillary data in msg_controllen and implementations must accept both
as valid. On receiving a portable application must provide space for
padding at the end of the last ancillary data as implementations may
copy out the padding at the end of the control message buffer and
include it in the received msg_controllen. When recvmsg() is called
if msg_controllen is too small for all the ancillary data items
including any trailing padding after the last item an implementation
may set MSG_CTRUNC.
Since we didn't place MSG_CTRUNC for already quite a long time, just do
the same as in 1ac70e7ad24a to avoid an overflow.
Btw, even man-page author got this wrong :/ See db939c9b26e9 ("cmsg.3: Fix
error in SCM_RIGHTS code sample"). Some people must have copied this (?),
thus it got triggered in the wild (reported several times during boot by
David and HacKurx).
No Fixes tag this time as pre 2002 (that is, pre history tree).
Reported-by: David Sterba <dave@jikos.cz> Reported-by: HacKurx <hackurx@gmail.com> Cc: PaX Team <pageexec@freemail.hu> Cc: Emese Revfy <re.emese@gmail.com> Cc: Brad Spengler <spender@grsecurity.net> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: Eric Dumazet <edumazet@google.com> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
His program is specifically attempting a Cross SYN TCP exchange,
that we support (for the pleasure of hackers ?), but it looks we
lack proper tcp->copied_seq initialization.
Thanks again Dmitry for your report and testings.
Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tcp_send_rcvq() is used for re-injecting data into tcp receive queue.
Problems :
- No check against size is performed, allowed user to fool kernel in
attempting very large memory allocations, eventually triggering
OOM when memory is fragmented.
- In case of fault during the copy we do not return correct errno.
Lets use alloc_skb_with_frags() to cook optimal skbs.
Fixes: 292e8d8c8538 ("tcp: Move rcvq sending to tcp_input.c") Fixes: c0e88ff0f256 ("tcp: Repair socket queues") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pavel Emelyanov <xemul@parallels.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some middle-boxes black-hole the data after the Fast Open handshake
(https://www.ietf.org/proceedings/94/slides/slides-94-tcpm-13.pdf).
The exact reason is unknown. The work-around is to disable Fast Open
temporarily after multiple recurring timeouts with few or no data
delivered in the established state.
Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a passive TCP is created, we eventually call tcp_md5_do_add()
with sk pointing to the child. It is not owner by the user yet (we
will add this socket into listener accept queue a bit later anyway)
But we do own the spinlock, so amend the lockdep annotation to avoid
following splat :
Fixes: a8afca0329988 ("tcp: md5: protects md5sig_info with RCU") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thomas reports
"
4gsystems sells two total different LTE-surfsticks under the same name.
..
The newer version of XS Stick W100 is from "omega"
..
Under windows the driver switches to the same ID, and uses MI03\6 for
network and MI01\6 for modem.
..
echo "1c9e 9b01" > /sys/bus/usb/drivers/qmi_wwan/new_id
echo "1c9e 9b01" > /sys/bus/usb-serial/drivers/option1/new_id
There is also ttyUSB0, but it is not usable, at least not for at.
The device works well with qmi and ModemManager-NetworkManager.
"
Reported-by: Thomas Schäfer <tschaefer@t-online.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
the OUTMCAST stat is double incremented, getting bumped once in the mcast code
itself, and again in the common ip output path. Remove the mcast bump, as its
not needed
Validated by the reporter, with good results
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Claus Jensen <claus.jensen@microsemi.com> CC: Claus Jensen <claus.jensen@microsemi.com> CC: David Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In some cases the crash is caused by nicvf_remove() being called from
outside. For example, if we try to feed the device to vfio after the
probe has failed for some reason. So, move the check to better place.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
rtnl_fdb_dump always expects an index to be returned by the ndo_fdb_dump op,
but when CONFIG_NET_SWITCHDEV is off, it returns an error.
Fix that by returning the given unmodified idx.
A similar fix was 0890cf6cb6ab ("switchdev: fix return value of
switchdev_port_fdb_dump in case of error") but for the CONFIG_NET_SWITCHDEV=y
case.
Fixes: 45d4122ca7cd ("switchdev: add support for fdb add/del/dump via switchdev_port_obj ops.") Signed-off-by: Dragos Tatulea <dragos@endocode.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Drivers like vxlan use the recently introduced
udp_tunnel_xmit_skb/udp_tunnel6_xmit_skb APIs. udp_tunnel6_xmit_skb
makes use of ip6tunnel_xmit, and ip6tunnel_xmit, after sending the
packet, updates the struct stats using the usual
u64_stats_update_begin/end calls on this_cpu_ptr(dev->tstats).
udp_tunnel_xmit_skb makes use of iptunnel_xmit, which doesn't touch
tstats, so drivers like vxlan, immediately after, call
iptunnel_xmit_stats, which does the same thing - calls
u64_stats_update_begin/end on this_cpu_ptr(dev->tstats).
While vxlan is probably fine (I don't know?), calling a similar function
from, say, an unbound workqueue, on a fully preemptable kernel causes
real issues:
The solution would be to protect the whole
this_cpu_ptr(dev->tstats)/u64_stats_update_begin/end blocks with
disabling preemption and then reenabling it.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When cleaning slave's counter resources, we hold a spinlock that
protects the slave's counters list. As part of the clean, we call
__mlx4_clear_if_stat which calls mlx4_alloc_cmd_mailbox which is a
sleepable function.
In order to fix this issue, hold the spinlock, and copy all counter
indices into a temporary array, and release the spinlock. Afterwards,
iterate over this array and free every counter. Repeat this scenario
until the original list is empty (a new counter might have been added
while releasing the counters from the temporary array).
Fixes: b72ca7e96acf ("net/mlx4_core: Reset counters data when freed") Reported-by: Moni Shoua <monis@mellanox.com> Tested-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
now sctp auth cannot work well when setting a hmacid manually, which
is caused by that we didn't use the network order for hmacid, so fix
it by adding the transformation in sctp_auth_ep_set_hmacs.
even we set hmacid with the network order in userspace, it still
can't work, because of this condition in sctp_auth_ep_set_hmacs():
if (id > SCTP_AUTH_HMAC_ID_MAX)
return -EOPNOTSUPP;
so this wasn't working before and thus it won't break compatibility.
Fixes: 65b07e5d0d09 ("[SCTP]: API updates to suport SCTP-AUTH extensions.") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since it's introduction in commit 69e3c75f4d54 ("net: TX_RING and
packet mmap"), TX_RING could be used from SOCK_DGRAM and SOCK_RAW
side. When used with SOCK_DGRAM only, the size_max > dev->mtu +
reserve check should have reserve as 0, but currently, this is
unconditionally set (in it's original form as dev->hard_header_len).
I think this is not correct since tpacket_fill_skb() would then
take dev->mtu and dev->hard_header_len into account for SOCK_DGRAM,
the extra VLAN_HLEN could be possible in both cases. Presumably, the
reserve code was copied from packet_snd(), but later on missed the
check. Make it similar as we have it in packet_snd().
Fixes: 69e3c75f4d54 ("net: TX_RING and packet mmap") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case no struct sockaddr_ll has been passed to packet
socket's sendmsg() when doing a TX_RING flush run, then
skb->protocol is set to po->num instead, which is the protocol
passed via socket(2)/bind(2).
Applications only xmitting can go the path of allocating the
socket as socket(PF_PACKET, <mode>, 0) and do a bind(2) on the
TX_RING with sll_protocol of 0. That way, register_prot_hook()
is neither called on creation nor on bind time, which saves
cycles when there's no interest in capturing anyway.
That leaves us however with po->num 0 instead and therefore
the TX_RING flush run sets skb->protocol to 0 as well. Eric
reported that this leads to problems when using tools like
trafgen over bonding device. I.e. the bonding's hash function
could invoke the kernel's flow dissector, which depends on
skb->protocol being properly set. In the current situation, all
the traffic is then directed to a single slave.
Fix it up by inferring skb->protocol from the Ethernet header
when not set and we have ARPHRD_ETHER device type. This is only
done in case of SOCK_RAW and where we have a dev->hard_header_len
length. In case of ARPHRD_ETHER devices, this is guaranteed to
cover ETH_HLEN, and therefore being accessed on the skb after
the skb_store_bits().
Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Packet sockets can be used by various net devices and are not
really restricted to ARPHRD_ETHER device types. However, when
currently checking for the extra 4 bytes that can be transmitted
in VLAN case, our assumption is that we generally probe on
ARPHRD_ETHER devices. Therefore, before looking into Ethernet
header, check the device type first.
This also fixes the issue where non-ARPHRD_ETHER devices could
have no dev->hard_header_len in TX_RING SOCK_RAW case, and thus
the check would test unfilled linear part of the skb (instead
of non-linear).
Fixes: 57f89bfa2140 ("network: Allow af_packet to transmit +4 bytes for VLAN packets.") Fixes: 52f1454f629f ("packet: allow to transmit +4 byte in TX_RING slot for VLAN case") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We concluded that the skb_probe_transport_header() should better be
called unconditionally. Avoiding the call into the flow dissector has
also not really much to do with the direct xmit mode.
While it seems that only virtio_net code makes use of GSO from non
RX/TX ring packet socket paths, we should probe for a transport header
nevertheless before they hit devices.
Reference: http://thread.gmane.org/gmane.linux.network/386173/ Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In tpacket_fill_skb() commit c1aad275b029 ("packet: set transport
header before doing xmit") and later on 40893fd0fd4e ("net: switch
to use skb_probe_transport_header()") was probing for a transport
header on the skb from a ring buffer slot, but at a time, where
the skb has _not even_ been filled with data yet. So that call into
the flow dissector is pretty useless. Lets do it after we've set
up the skb frags.
Fixes: c1aad275b029 ("packet: set transport header before doing xmit") Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
All DST_NOCACHE rt6_info used to have rt->dst.from set to
its parent.
After commit 8e3d5be73681 ("ipv6: Avoid double dst_free"),
DST_NOCACHE is also set to rt6_info which does not have
a parent (i.e. rt->dst.from is NULL).
This patch catches the rt->dst.from == NULL case.
Fixes: 8e3d5be73681 ("ipv6: Avoid double dst_free") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since the expires of the DST_NOCACHE rt can be set during
the ip6_rt_update_pmtu(), we also need to consider the expires
value when doing ip6_dst_check().
This patches creates __rt6_check_expired() to only
check the expire value (if one exists) of the current rt.
In rt6_dst_from_check(), it adds __rt6_check_expired() as
one of the condition check.
Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The original bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1272571
The setup has a IPv4 GRE tunnel running in a IPSec. The bug
happens when ndisc starts sending router solicitation at the gre
interface. The simplified oops stack is like:
The rt passed to ip6_rt_update_pmtu() is created by
icmp6_dst_alloc() and it is not managed by the fib6 tree,
so its rt6i_table == NULL. When __ip6_rt_update_pmtu() creates
a RTF_CACHE clone, the newly created clone also has rt6i_table == NULL
and it causes the ip6_ins_rt() oops.
During pmtu update, we only want to create a RTF_CACHE clone
from a rt which is currently managed (or owned) by the
fib6 tree. It means either rt->rt6i_node != NULL or
rt is a RTF_PCPU clone.
It is worth to note that rt6i_table may not be NULL even it is
not (yet) managed by the fib6 tree (e.g. addrconf_dst_alloc()).
Hence, rt6i_node is a better check instead of rt6i_table.
Fixes: 45e4fd26683c ("ipv6: Only create RTF_CACHE routes after encountering pmtu") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Reported-by: Chris Siebenmann <cks-rhbugzilla@cs.toronto.edu> Cc: Chris Siebenmann <cks-rhbugzilla@cs.toronto.edu> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sendpage did not care about credentials at all. This could lead to
situations in which because of fd passing between processes we could
append data to skbs with different scm data. It is illegal to splice those
skbs together. Instead we have to allocate a new skb and if requested
fill out the scm details.
Fixes: 869e7c62486ec ("net: af_unix: implement stream sendpage support") Reported-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rainer Weikusat <rweikusat@mobileactivedefense.com> writes:
An AF_UNIX datagram socket being the client in an n:1 association with
some server socket is only allowed to send messages to the server if the
receive queue of this socket contains at most sk_max_ack_backlog
datagrams. This implies that prospective writers might be forced to go
to sleep despite none of the message presently enqueued on the server
receive queue were sent by them. In order to ensure that these will be
woken up once space becomes again available, the present unix_dgram_poll
routine does a second sock_poll_wait call with the peer_wait wait queue
of the server socket as queue argument (unix_dgram_recvmsg does a wake
up on this queue after a datagram was received). This is inherently
problematic because the server socket is only guaranteed to remain alive
for as long as the client still holds a reference to it. In case the
connection is dissolved via connect or by the dead peer detection logic
in unix_dgram_sendmsg, the server socket may be freed despite "the
polling mechanism" (in particular, epoll) still has a pointer to the
corresponding peer_wait queue. There's no way to forcibly deregister a
wait queue with epoll.
Based on an idea by Jason Baron, the patch below changes the code such
that a wait_queue_t belonging to the client socket is enqueued on the
peer_wait queue of the server whenever the peer receive queue full
condition is detected by either a sendmsg or a poll. A wake up on the
peer queue is then relayed to the ordinary wait queue of the client
socket via wake function. The connection to the peer wait queue is again
dissolved if either a wake up is about to be relayed or the client
socket reconnects or a dead peer is detected or the client socket is
itself closed. This enables removing the second sock_poll_wait from
unix_dgram_poll, thus avoiding the use-after-free, while still ensuring
that no blocked writer sleeps forever.
Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com> Fixes: ec0d215f9420 ("af_unix: fix 'poll for write'/connected DGRAM sockets") Reviewed-by: Jason Baron <jbaron@akamai.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While possibly in future we don't necessarily need to use
sk_buff_head.lock this is a rather larger change, as it affects the
af_unix fd garbage collector, diag and socket cleanups. This is too much
for a stable patch.
For the time being grab sk_buff_head.lock without disabling bh and irqs,
so don't use locked skb_queue_tail.
Fixes: 869e7c62486e ("net: af_unix: implement stream sendpage support") Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Reported-by: Eric Dumazet <edumazet@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case multiple writes to a unix stream socket race we could end up in a
situation where we pre-allocate a new skb for use in unix_stream_sendpage
but have to free it again in the locked section because another skb
has been appended meanwhile, which we must use. Accidentally we didn't
clear the pointer after consuming it and so we touched freed memory
while appending it to the sk_receive_queue. So, clear the pointer after
consuming the skb.
This bug has been found with syzkaller
(http://github.com/google/syzkaller) by Dmitry Vyukov.
Fixes: 869e7c62486e ("net: af_unix: implement stream sendpage support") Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During splicing an af-unix socket to a pipe we have to drop all
af-unix socket locks. While doing so we allow another reader to enter
unix_stream_read_generic which can read, copy and finally free another
skb. If exactly this skb is just in process of being spliced we get a
use-after-free report by kasan.
First, we must make sure to not have a free while the skb is used during
the splice operation. We simply increment its use counter before unlocking
the reader lock.
Stream sockets have the nice characteristic that we don't care about
zero length writes and they never reach the peer socket's queue. That
said, we can take the UNIXCB.consumed field as the indicator if the
skb was already freed from the socket's receive queue. If the skb was
fully consumed after we locked the reader side again we know it has been
dropped by a second reader. We indicate a short read to user space and
abort the current splice operation.
This bug has been found with syzkaller
(http://github.com/google/syzkaller) by Dmitry Vyukov.
Fixes: 2b514574f7e8 ("net: af_unix: implement splice for stream af_unix sockets") Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When building with allmodconfig the build was failing with the error:
arch/tile/kernel/usb.c:70:1: warning: data definition has no type or storage class [enabled by default]
arch/tile/kernel/usb.c:70:1: error: type defaults to 'int' in declaration of 'arch_initcall' [-Werror=implicit-int]
arch/tile/kernel/usb.c:70:1: warning: parameter names (without types) in function declaration [enabled by default]
arch/tile/kernel/usb.c:63:19: warning: 'tilegx_usb_init' defined but not used [-Wunused-function]
Include linux/module.h to resolve the build failure.
We should never allow to enable/disable any facilities for the guest
when other VCPUs were already created.
kvm_arch_vcpu_(load|put) relies on SIMD not changing during runtime.
If somebody would create and run VCPUs and then decides to enable
SIMD, undefined behaviour could be possible (e.g. vector save area
not being set up).
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After commit 8c058b0b9c34 ("x86/irq: Probe for PIC presence before
allocating descs for legacy IRQs") early_irq_init() will no longer
preallocate descriptors for legacy interrupts if PIC does not
exist, which is the case for Xen PV guests.
Therefore we may need to allocate those descriptors ourselves.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The recently introduced lnet_peer_set_alive() function uses
get_seconds() to read the current time into a shared variable,
but all other uses of that variable compare it to jiffies values.
This changes the current use to jiffies as well for consistency.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: af3fa7c71bf ("staging/lustre/lnet: peer aliveness status and NI status") Cc: Liang Zhen <liang.zhen@intel.com> Cc: James Simmons <uja.ornl@gmail.com> Cc: Isaac Huang <he.huang@intel.com> Signed-off-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Existing Intel xHCI controllers require a delay of 1 mS,
after setting the CMD_RESET bit in command register, before
accessing any HC registers. This allows the HC to complete
the reset operation and be ready for HC register access.
Without this delay, the subsequent HC register access,
may result in a system hang, very rarely.
Verified CherryView / Braswell platforms go through over
5000 warm reboot cycles (which was not possible without
this patch), without any xHCI reset hang.
Signed-off-by: Rajmohan Mani <rajmohan.mani@intel.com> Tested-by: Joe Lawrence <joe.lawrence@stratus.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 0fd972a7d91d (module: relocate module_init from init.h to
module.h) broke the build of ttyFDC driver due to that driver's (mis)use
of module_mips_cdmm_driver() without first including module.h, for
example:
In file included from ./arch/mips/include/asm/cdmm.h +11 :0,
from drivers/tty/mips_ejtag_fdc.c +34 :
include/linux/device.h +1295 :1: warning: data definition has no type or storage class
./arch/mips/include/asm/cdmm.h +84 :2: note: in expansion of macro ‘module_driver’
drivers/tty/mips_ejtag_fdc.c +1157 :1: note: in expansion of macro ‘module_mips_cdmm_driver’
include/linux/device.h +1295 :1: error: type defaults to ‘int’ in declaration of ‘module_init’ [-Werror=implicit-int]
./arch/mips/include/asm/cdmm.h +84 :2: note: in expansion of macro ‘module_driver’
drivers/tty/mips_ejtag_fdc.c +1157 :1: note: in expansion of macro ‘module_mips_cdmm_driver’
drivers/tty/mips_ejtag_fdc.c +1157 :1: warning: parameter names (without types) in function declaration
Instead of just adding the module.h include, switch to using the new
builtin_mips_cdmm_driver() helper macro and drop the remove callback,
since it isn't needed. If module support is added later, the code can
always be resurrected.
One of the many faults of the QinHeng CH345 USB MIDI interface chip is
that it does not handle received SysEx messages correctly -- every second
event packet has a wrong code index number, which is the one from the last
seen message, instead of 4. For example, the two messages "FE F0 01 02 03
04 05 06 07 08 09 0A 0B 0C 0D 0E F7" result in the following event
packets:
A class-compliant driver must interpret an event packet with CIN 15 as
having a single data byte, so the other two bytes would be ignored. The
message received by the host would then be missing two bytes out of six;
in this example, "F0 01 02 03 06 07 08 09 0C 0D 0E F7".
These corrupted SysEx event packages contain only data bytes, while the
CH345 uses event packets with a correct CIN value only for messages with
a status byte, so it is possible to distinguish between these two cases by
checking for the presence of this status byte.
(Other bugs in the CH345's input handling, such as the corruption resulting
from running status, cannot be worked around.)
The CH345 USB MIDI chip has two output ports. However, they are
multiplexed through one pin, and the number of ports cannot be reduced
even for hardware that implements only one connector, so for those
devices, data sent to either port ends up on the same hardware output.
This becomes a problem when both ports are used at the same time, as
longer MIDI commands (such as SysEx messages) are likely to be
interrupted by messages from the other port, and thus to get lost.
It would not be possible for the driver to detect how many ports the
device actually has, except that in practice, _all_ devices built with
the CH345 have only one port. So we can just ignore the device's
descriptors, and hardcode one output port.
Thomas reports
"
4gsystems sells two total different LTE-surfsticks under the same name.
..
The newer version of XS Stick W100 is from "omega"
..
Under windows the driver switches to the same ID, and uses MI03\6 for
network and MI01\6 for modem.
..
echo "1c9e 9b01" > /sys/bus/usb/drivers/qmi_wwan/new_id
echo "1c9e 9b01" > /sys/bus/usb-serial/drivers/option1/new_id
The callback registered by the musb driver has to comply to the latter,
but up to now had "offset" first which effectively made the function
broken for correct users. So flip the order and while at it also
switch to the parameter names of struct usb_phy_io_ops's write.
The DEVICE_HWI type was added under the faulty assumption that Huawei
devices based on Qualcomm chipsets and firmware use the static USB
interface numbering known from Gobi devices. But this model does
not apply to Huawei devices like the HP branded lt4112 (Huawei me906e).
Huawei firmwares will dynamically assign interface numbers. Functions
are renumbered when the firmware is reconfigured.
Fix by changing the DEVICE_HWI type to use a simplified version
of Huawei's subclass + protocol scheme: Blacklisting known network
interface combinations and assuming the rest are serial.
Reported-and-tested-by: Muri Nicanor <muri+libqmi@immerda.ch> Tested-by: Martin Hauke <mardnh@gmx.de> Fixes: e7181d005e84 ("USB: qcserial: Add support for HP lt4112 LTE/HSPA+ Gobi 4G Modem") Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It seems like this device has same vendor and product IDs as G2K
devices, but it has different number of interfaces(4 vs 5) and also
different interface layout which makes it currently unusable:
usbcore: registered new interface driver qcserial
usbserial: USB Serial support registered for Qualcomm USB modem
usb 2-1.2: unknown number of interfaces: 5
It is not permitted to set task state before lock. usblp_wwait sets
the state to TASK_INTERRUPTIBLE and calls mutex_lock_interruptible.
Upon return from that function, the state will be TASK_RUNNING again.
Commit d445913ce0ab7f ("usb: ehci-orion: add optional PHY support")
added support for optional phys, but devm_phy_optional_get returns
-ENOSYS if GENERIC_PHY is not enabled.
This causes probe failures, even when there are no phys specified:
[ 1.443365] orion-ehci f1058000.usb: init f1058000.usb fail, -38
[ 1.449403] orion-ehci: probe of f1058000.usb failed with error -38
Some i.mx platforms need three clocks to let controller work, but
others only need one, refine clock operation to adapt for all
platforms, it fixes a regression found at i.mx27.
Certain Synopsys prototyping PHY boards are not able to meet timings
constraints for LPM. This allows the PHY to meet those timings by
leaving the PHY clock running during suspend.
Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch allows the dwc3 driver to run on the new Synopsys USB 3.1
IP core, albeit in USB 3.0 mode only.
The Synopsys USB 3.1 IP (DWC_usb31) retains mostly the same register
interface and programming model as the existing USB 3.0 controller IP
(DWC_usb3). However the GSNPSID and version numbers are different.
Add checking for the new ID to pass driver probe.
Also, since the DWC_usb31 version number is lower in value than the
full GSNPSID of the DWC_usb3 IP, we set the high bit to identify
DWC_usb31 and to ensure the values are higher.
Finally, add a documentation note about the revision numbering scheme.
Any future revision checks (for STARS, workarounds, and new features)
should take into consideration how it applies to both the 3.1/3.0 IP.
Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This ID is for the Synopsys DWC_usb3 core with AXI interface on PCIe
HAPS platform. This core has the debug registers mapped at a separate
BAR in order to support enhanced hibernation.
Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch is to support load and unload gadget driver in full OTG mode.
Signed-off-by: Li Jun <jun.li@freescale.com> Signed-off-by: Peter Chen <peter.chen@freescale.com> Tested-by: Jiada Wang <jiada_wang@mentor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In some SoCs, dwc3 is implemented as a USB2.0 only
core, meaning that it can't ever achieve SuperSpeed.
Currect driver always sets gadget.max_speed to
USB_SPEED_SUPER unconditionally. This can causes
issues to some Host stacks where the host will issue
a GetBOS() request and we will reply with a BOS
containing Superspeed Capability Descriptor.
At least Windows seems to be upset by this fact and
prints a warning that we should connect $this device
to another port.
[ balbi@ti.com : rewrote entire commit, including
source code comment to make a lot clearer what the
problem is ]
Signed-off-by: Ben McCauley <ben.mccauley@garmin.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Following changes that appeared in lk 4.0.0, the gadget udc driver for
some ARM based Atmel SoCs (e.g. at91sam9x5 and sama5d3 families)
incorrectly deduced full-speed USB link speed even when the hardware
had negotiated a high-speed link. The fix is to make sure that the
UDPHS Interrupt Enable Register value does not mask the SPEED bit
in the Interrupt Status Register.
For a mass storage gadget this problem lead to failures when the host
had a USB 3 port with the xhci_hcd driver. If the host was a USB 2
port using the ehci_hcd driver then the mass storage gadget worked
(but probably at a lower speed than it should have).
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com> Fixes: 9870d895ad87 ("usb: atmel_usba_udc: Mask status with enabled irqs") Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Defect 7374 workaround enables all GPEP as endpoint 0. Restore
endpoint number when defect 7374 workaround is disabled. Otherwise,
check to match USB endpoint number to hardware endpoint number in
net2280_enable() fails.
Reported-by: Paul Jones <p.jones@teclyn.com> Signed-off-by: Mian Yousaf Kaukab <yousaf.kaukab@intel.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As it breaks g_ether on my Baytrail FFRD8 device. Everything starts out
fine, but after a bit of data has been transferred it just stops
flowing.
Note that I do get a bunch of these "NOHZ: local_softirq_pending 08"
when booting the machine, but I'm not really sure if they're related
to this problem.
Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-usb@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 383d0b050106 ("KVM: s390: handle pending local interrupts via
bitmap") introduced a possible memory overwrite from user space.
User space could pass an invalid emergency signal code (sending VCPU)
and therefore exceed the bitmap. Let's take care of this case and
check that the id is in the valid range.
Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We seemed to have missed a few corner cases in commit f6c137ff00a4
("KVM: s390: randomize sca address").
The SCA has a maximum size of 2112 bytes. By setting the sca_offset to
some unlucky numbers, we exceed the page.
0x7c0 (1984) -> Fits exactly
0x7d0 (2000) -> 16 bytes out
0x7e0 (2016) -> 32 bytes out
0x7f0 (2032) -> 48 bytes out
One VCPU entry is 32 bytes long.
For the last two cases, we actually write data to the other page.
1. The address of the VCPU.
2. Injection/delivery/clearing of SIGP externall calls via SIGP IF.
Especially the 2. happens regularly. So this could produce two problems:
1. The guest losing/getting external calls.
2. Random memory overwrites in the host.
So this problem happens on every 127 + 128 created VM with 64 VCPUs.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The number of spatial streams that are derived from chain mask
for 4x4 devices is using wrong bitmask and conditional check.
This is affecting downlink throughput for QCA99x0 devices. Earlier
cfg_tx_chainmask is not filled by default until user configured it
and so get_nss_from_chainmask never be called. This issue is exposed
by recent commit 166de3f1895d ("ath10k: remove supported chain mask").
By default maximum supported chain mask is filled in cfg_tx_chainmask.
Fixes: 5572a95b4b ("ath10k: apply chainmask settings to vdev on creation") Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current number of spatial streams used by the client is advertised
as a separate IE in assoc request. Use this information to set
the NSS operating mode.
Fixes: 45c9abc059fa ("ath10k: implement more versatile set_bitrate_mask"). Signed-off-by: Vivek Natarajan <nataraja@qti.qualcomm.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A kernel built with DEBUG_RO_DATA && !CONFIG_DEBUG_ALIGN_RODATA doesn't
have .text aligned to a page boundary, though fixup_executable works at
page-granularity thanks to its use of create_mapping. If .text is not
page-aligned, the first page it exists in may be marked non-executable,
leading to failures when an attempt is made to execute code in said
page.
This patch upgrades ALIGN_DEBUG_RO and ALIGN_DEBUG_RO_MIN to force page
alignment for DEBUG_RO_DATA && !CONFIG_DEBUG_ALIGN_RODATA kernels,
ensuring that all sections with specific RWX permission requirements are
mapped with the correct permissions.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Laura Abbott <laura@labbott.name> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will.deacon@arm.com> Fixes: da141706aea52c1a ("arm64: add better page protections to arm64") Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For reasons not entirely apparent, but now enshrined in history, the
architectural mapping of AArch32 banked registers to AArch64 registers
actually orders SP_<mode> and LR_<mode> backwards compared to the
intuitive r13/r14 order, for all modes except FIQ.
Fix the compat_<reg>_<mode> macros accordingly, in the hope of avoiding
subtle bugs with KVM and AArch32 guests.
Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to SJA1000 data sheet error-warning (EI) interrupt is not
cleared by setting the controller in to reset-mode.
Then if we have the following case:
- system is suspended (echo mem > /sys/power/state) and SJA1000 is left
in operating state
- A bus error condition occurs which activates EI interrupt, system is
still suspended which means EI interrupt will be not be handled nor
cleared.
If the above two events occur, on resume there is no way to return the
SJA1000 to operating state, except to cycle power to it.
By simply reading the IR register on start we will clear any previous
conditions that could be present.
Signed-off-by: Mirza Krak <mirza.krak@hostmobility.com> Reported-by: Christian Magnusson <Christian.Magnusson@semcon.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The commit 89cbb0638e9b7 introduced support for deferred connection
parameter removal when unpairing by removing them only once an
existing connection gets disconnected. However, it failed to address
the scenario when we're *not* connected and do an unpair operation.
What makes things worse is that most user space BlueZ versions will
first issue a disconnect request and only then unpair, meaning the
buggy code will be triggered every time. This effectively causes the
kernel to resume scanning and reconnect to a device for which we've
removed all keys and GATT database information.
This patch fixes the issue by adding the missing call to the
hci_conn_params_del() function to a branch which handles the case of
no existing connection.
The HIDP specs define an idle-timeout which automatically disconnects a
device. This has always been implemented in the HIDP layer and forced a
synchronous shutdown of the hidp-scheduler. This works just fine, but
lacks a forced disconnect on the underlying l2cap channels. This has been
broken since:
The old session-management always forced an l2cap error on the ctrl/intr
channels when shutting down. The new session-management skips this, as we
don't want to enforce channel policy on the caller. In other words, if
user-space removes an HIDP device, the underlying channels (which are
*owned* and *referenced* by user-space) are still left active. User-space
needs to call shutdown(2) or close(2) to release them.
Unfortunately, this does not work with idle-timeouts. There is no way to
signal user-space that the HIDP layer has been stopped. The API simply
does not support any event-passing except for poll(2). Hence, we restore
old behavior and force EUNATCH on the sockets if the HIDP layer is
disconnected due to idle-timeouts (behavior of explicit disconnects
remains unmodified). User-space can still call
getsockopt(..., SO_ERROR, ...)
..to retrieve the EUNATCH error and clear sk_err. Hence, the channels can
still be re-used (which nobody does so far, though). Therefore, the API
still supports the new behavior, but with this patch it's also compatible
to the old implicit channel shutdown.
Reported-by: Mark Haun <haunma@keteu.org> Reported-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This adds the USB ID for the Sitecom WLA2100. The Windows 10 inf file
was checked to verify that the addition is correct.
Reported-by: Frans van de Wiel <fvdw@fvdw.eu> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Frans van de Wiel <fvdw@fvdw.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1) The done label was in the wrong place so we didn't copy any
information out when there was no command given.
2) We were using PAGE_SIZE as the size of the buffer instead of
"PAGE_SIZE - pos".
3) snprintf() returns the number of characters that would have been
printed if there were enough space. If there was not enough space
(and we had fixed the memory corruption bug #2) then it would result
in an information leak when we do simple_read_from_buffer(). I've
changed it to use scnprintf() instead.
I also removed the initialization at the start of the function, because
I thought it made the code a little more clear.
Fixes: 5e6e3a92b9a4 ('wireless: mwifiex: initial commit for Marvell mwifiex driver') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 68bab8662f49 ("mfd: twl6040: Optional clk32k clock handling")
added clock handling for the 32k clock from palmas-clk. However, that
patch did not consider a typical situation where twl6040 is built-in,
and palmas-clk is a loadable module like we have in omap2plus_defconfig.
If palmas-clk is not loaded before twl6040 probes, we will get a
"clk32k is not handled" warning during booting. This means that any
drivers relying on this clock will mysteriously fail, including
omap5-uevm WLAN and audio.
Note that for WLAN, we probably should also eventually get
the clk32kgaudio for MMC3 directly as that's shared between
audio and WLAN SDIO at least for omap5-uevm. It seems the
WLAN chip cannot get it as otherwise MMC3 won't get properly
probed.
Fixes: 68bab8662f49 ("mfd: twl6040: Optional clk32k clock handling") Signed-off-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch affects the clocks that use fractional ndivider in their
PLL output frequency calculation. Instead of 2^20 divide factor, the
clock's ndiv integer shift was used. Fixed the bug by replacing ndiv
integer shift with 2^20 factor.
Signed-off-by: Simran Rai <ssimran@broadcom.com> Signed-off-by: Ray Jui <rjui@broadcom.com> Reviewed-by: Scott Branden <sbranden@broadcom.com> Fixes: 5fe225c105fd ("clk: iproc: add initial common clock support") Signed-off-by: Michael Turquette <mturquette@baylibre.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
So the /proc/PID/stat 'wchan' field (the 30th field, which contains
the absolute kernel address of the kernel function a task is blocked in)
leaks absolute kernel addresses to unprivileged user-space:
seq_put_decimal_ull(m, ' ', wchan);
The absolute address might also leak via /proc/PID/wchan as well, if
KALLSYMS is turned off or if the symbol lookup fails for some reason:
static int proc_pid_wchan(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
{
unsigned long wchan;
char symname[KSYM_NAME_LEN];
There's one compatibility quirk here: procps relies on whether the
absolute value is non-zero - and we can provide that functionality
by outputing "0" or "1" depending on whether the task is blocked
(whether there's a wchan address).
These days there appears to be very little legitimate reason
user-space would be interested in the absolute address. The
absolute address is mostly historic: from the days when we
didn't have kallsyms and user-space procps had to do the
decoding itself via the System.map.
So this patch sets all numeric output to "0" or "1" and keeps only
symbolic output, in /proc/PID/wchan.
( The absolute sleep address can generally still be profiled via
perf, by tasks with sufficient privileges. )
In the actual RX processing, there is same error path for both descriptor
ring refilling and building skb fails. This is not correct, because after
successful refill, the ring is already updated with newly allocated
buffer. Then, in case of build_skb() fail, hitherto code left the original
buffer unmapped.
This patch fixes above situation by swapping error check of skb build with
DMA-unmap of original buffer.
Signed-off-by: Marcin Wojtas <mw@semihalf.com> Acked-by: Simon Guinot <simon.guinot@sequanux.org>
Fixes a84e32894191 ("net: mvneta: fix refilling for Rx DMA buffers") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The CPU_MAP register is duplicated for each CPUs at different addresses,
each instance being at a different address.
However, the code so far was using CONFIG_NR_CPUS to initialise the CPU_MAP
registers for each registers, while the SoCs embed at most 4 CPUs.
This is especially an issue with multi_v7_defconfig, where CONFIG_NR_CPUS
is currently set to 16, resulting in writes to registers that are not
CPU_MAP.
Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit") Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add some new PCI IDs for the 8260 series which were missing.
The following sub-system IDs were added:
0x0130, 0x1130, 0x0132, 0x1132, 0x1150, 0x8110, 0x9110, 0x8130,
0x9130, 0x8132, 0x9132, 0x8150, 0x9150, 0x0044, 0x0930
The hardware bug in the commit mentioned below forces us
not to re-enable the clock gating in the Host Cluster.
The impact on the power consumption is minimal and it allows
the WAKE_ME interrupt to propagate.
When sending HCI data over NCI, HCI return code is part
of the NCI data. In order to get correctly the HCI return
code, we assume the NCI communication is successful and
extract the return code for the nci_hci functions return code.
This is done because nci_to_errno does not match hci return
code value.
When sending HCI data over NCI, cmd information should be
present only on the first packet.
Each packet shall be specifically allocated and sent to the
NCI layer.
If parse_acl_data succeeds but the subsequent parsing of smps
attributes fails, there will be a memory leak due to early returns.
Fix that by moving the ACL parsing later.
Fixes: 18998c381b19b ("cfg80211: allow requesting SMPS mode on ap start") Signed-off-by: Ola Olsson <ola.olsson@sonymobile.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In TDLS channel-switch operations the chandef can sometimes be NULL.
Avoid an oops in the trace code for these cases and just print a
chandef full of zeros.
The ifmgd->ave_beacon_signal value cannot be taken as is for
comparisons, it must be divided by since it's represented
like that for better accuracy of the EWMA calculations. This
would lead to invalid driver RSSI events. Fix the used value.
I received a bug report that running 32-bit MPX binaries on
64-bit kernels was broken. I traced it down to this little code
snippet. We were switching our "number of bounds directory
entries" calculation correctly. But, we didn't switch the other
side of the calculation: the virtual space size.
This meant that we were calculating an absurd size for
bd_entry_virt_space() on 32-bit because we used the 64-bit
virt_space.
This was _also_ broken for 32-bit kernels running on 64-bit
hardware since boot_cpu_data.x86_virt_bits=48 even when running
in 32-bit mode.
Correct that and properly handle all 3 possible cases:
1. 32-bit binary on 64-bit kernel
2. 64-bit binary on 64-bit kernel
3. 32-bit binary on 32-bit kernel
This manifested in having bounds tables not properly unmapped.
It "leaked" memory but had no functional impact otherwise.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20151111181934.FA7FAC34@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When you call get_user(foo, bar), you effectively do a
copy_from_user(&foo, bar, sizeof(*bar));
Note that the sizeof() is implicit.
When we reach out to userspace to try to zap an entire "bounds
table" we need to go read a "bounds directory entry" in order to
locate the table's address. The size of a "directory entry"
depends on the binary being run and is always the size of a
pointer.
But, when we have a 64-bit kernel and a 32-bit application, the
directory entry is still only 32-bits long, but we fetch it with
a 64-bit pointer which makes get_user() does a 64-bit fetch.
Reading 4 extra bytes isn't harmful, unless we are at the end of
and run off the table. It might also cause the zero page to get
faulted in unnecessarily even if you are not at the end.
Fix it up by doing a special 32-bit get_user() via a cast when
we have 32-bit userspace.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20151111181931.3ACF6822@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(This should have gone to LKML originally. Sorry for the extra
noise, folks on the cc.)
Background:
Signal frames on x86 have two formats:
1. For 32-bit executables (whether on a real 32-bit kernel or
under 32-bit emulation on a 64-bit kernel) we have a
'fpregset_t' that includes the "FSAVE" registers.
2. For 64-bit executables (on 64-bit kernels obviously), the
'fpregset_t' is smaller and does not contain the "FSAVE"
state.
When creating the signal frame, we have to be aware of whether
we are running a 32 or 64-bit executable so we create the
correct format signal frame.
Problem:
save_xstate_epilog() uses 'fx_sw_reserved_ia32' whenever it is
called for a 32-bit executable. This is for real 32-bit and
ia32 emulation.
But, fpu__init_prepare_fx_sw_frame() only initializes
'fx_sw_reserved_ia32' when emulation is enabled, *NOT* for real
32-bit kernels.
This leads to really wierd situations where 32-bit programs
lose their extended state when returning from a signal handler.
The kernel copies the uninitialized (zero) 'fx_sw_reserved_ia32'
out to userspace in save_xstate_epilog(). But when returning
from the signal, the kernel errors out in check_for_xstate()
when it does not see FP_XSTATE_MAGIC1 present (because it was
zeroed). This leads to the FPU/XSAVE state being initialized.
For MPX, this leads to the most permissive state and means we
silently lose bounds violations. I think this would also mean
that we could lose *ANY* FPU/SSE/AVX state. I'm not sure why
no one has spotted this bug.
I believe this was broken by:
72a671ced66d ("x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels")
way back in 2012.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave@sr71.net Cc: fenghua.yu@intel.com Cc: yu-cheng.yu@intel.com Link: http://lkml.kernel.org/r/20151111002354.A0799571@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
KVM uses the get_xsave_addr() function in a different fashion from
the native kernel, in that the 'xsave' parameter belongs to guest vcpu,
not the currently running task.
But 'xsave' is replaced with current task's (host) xsave structure, so
get_xsave_addr() will incorrectly return the bad xsave address to KVM.
Fix it so that the passed in 'xsave' address is used - as intended
originally.
Signed-off-by: Huaitong Han <huaitong.han@intel.com> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave.hansen@intel.com Link: http://lkml.kernel.org/r/1446800423-21622-1-git-send-email-huaitong.han@intel.com
[ Tidied up the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There appears to be no formal statement of what pv_irq_ops.save_fl() is
supposed to return precisely. Native returns the full flags, while lguest and
Xen only return the Interrupt Flag, and both have comments by the
implementations stating that only the Interrupt Flag is looked at. This may
have been true when initially implemented, but no longer is.
To make matters worse, the Xen PVOP leaves the upper bits undefined, making
the BUG_ON() undefined behaviour. Experimentally, this now trips for 32bit PV
guests on Broadwell hardware. The BUG_ON() is consistent for an individual
build, but not consistent for all builds. It has also been a sitting timebomb
since SMAP support was introduced.
Use native_save_fl() instead, which will obtain an accurate view of the AC
flag.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Tested-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: <lguest@lists.ozlabs.org> Cc: Xen-devel <xen-devel@lists.xen.org> Link: http://lkml.kernel.org/r/1433323874-6927-1-git-send-email-andrew.cooper3@citrix.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we get loaded by a 64-bit bootloader, kernel entry point is
startup_64 in head_64.S. We don't trust any and all bootloaders because
some will fiddle with CPU configuration so we go ahead and massage each
CPU into sanity again.
For example, some dell BIOSes have this XD disable feature which set
IA32_MISC_ENABLE[34] and disable NX. This might be some dumb workaround
for other OSes but Linux sure doesn't need it.
A similar thing is present in the Surface 3 firmware - see
https://bugzilla.kernel.org/show_bug.cgi?id=106051 - which sets this bit
only on the BSP:
Commit d32932d02e18 ("x86/irq: Convert IOAPIC to use hierarchical irqdomain
interfaces") brought a regression for Hyper-V Gen2 instances. These
instances don't have i8259 legacy PIC but they use legacy IRQs for serial
port, rtc, and acpi. With this commit included we end up with these IRQs
not initialized. Earlier, there was a special workaround for legacy IRQs
in mp_map_pin_to_irq() doing mp_irqdomain_map() without looking at
nr_legacy_irqs() and now we fail in __irq_domain_alloc_irqs() when
irq_domain_alloc_descs() returns -EEXIST.
The essence of the issue seems to be that early_irq_init() calls
arch_probe_nr_irqs() to figure out the number of legacy IRQs before
we probe for i8259 and gets 16. Later when init_8259A() is called we switch
to NULL legacy PIC and nr_legacy_irqs() starts to return 0 but we already
have 16 descs allocated.
Solve the issue by separating i8259 probe from init and calling it in
arch_probe_nr_irqs() before we actually use nr_legacy_irqs() information.
Fixes: d32932d02e18 ("x86/irq: Convert IOAPIC to use hierarchical irqdomain interfaces") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Link: http://lkml.kernel.org/r/1446543614-3621-1-git-send-email-vkuznets@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The commit f5f3497cad8c extended the low identity mapping. However, if
the kernel uses more than 2 GB (VMSPLIT_2G_OPT or VMSPLIT_1G memory
split), the normal memory mapping is overwritten by the low identity
mapping causing a crash. To avoid overwritting, limit the low identity
map to cover only memory before kernel range (PAGE_OFFSET).
Fixes: f5f3497cad8c "x86/setup: Extend low identity map to cover whole kernel range Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lkml.kernel.org/r/1446815916-22105-1-git-send-email-krzysiek@podlesie.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>