Marvell has not published the chip's datasheet yet, so it's very hard
to find the relevant bits to manipulate to change the IRQ behaviour.
Fortunately, these bits are described in the proprietary LSP patch set
which is publicly available here :
http://www.plugcomputer.org/downloads/mirabox/
So let's put them back in the driver in order to reduce the burden of
current and future maintenance.
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Gregory CLEMENT <gregory.clement@free-electrons.com> Tested-by: Arnaud Ebalard <arno@natisbad.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a queue timeout is reported, we can oops because of some
schedules while the caller is atomic, as shown below :
mvneta d0070000.ethernet eth0: tx timeout
BUG: scheduling while atomic: bash/1528/0x00000100
Modules linked in: slhttp_ethdiv(C) [last unloaded: slhttp_ethdiv]
CPU: 2 PID: 1528 Comm: bash Tainted: G WC 3.13.0-rc4-mvebu-nf #180
[<c0011bd9>] (unwind_backtrace+0x1/0x98) from [<c000f1ab>] (show_stack+0xb/0xc)
[<c000f1ab>] (show_stack+0xb/0xc) from [<c02ad323>] (dump_stack+0x4f/0x64)
[<c02ad323>] (dump_stack+0x4f/0x64) from [<c02abe67>] (__schedule_bug+0x37/0x4c)
[<c02abe67>] (__schedule_bug+0x37/0x4c) from [<c02ae261>] (__schedule+0x325/0x3ec)
[<c02ae261>] (__schedule+0x325/0x3ec) from [<c02adb97>] (schedule_timeout+0xb7/0x118)
[<c02adb97>] (schedule_timeout+0xb7/0x118) from [<c0020a67>] (msleep+0xf/0x14)
[<c0020a67>] (msleep+0xf/0x14) from [<c01dcbe5>] (mvneta_stop_dev+0x21/0x194)
[<c01dcbe5>] (mvneta_stop_dev+0x21/0x194) from [<c01dcfe9>] (mvneta_tx_timeout+0x19/0x24)
[<c01dcfe9>] (mvneta_tx_timeout+0x19/0x24) from [<c024afc7>] (dev_watchdog+0x18b/0x1c4)
[<c024afc7>] (dev_watchdog+0x18b/0x1c4) from [<c0020b53>] (call_timer_fn.isra.27+0x17/0x5c)
[<c0020b53>] (call_timer_fn.isra.27+0x17/0x5c) from [<c0020cad>] (run_timer_softirq+0x115/0x170)
[<c0020cad>] (run_timer_softirq+0x115/0x170) from [<c001ccb9>] (__do_softirq+0xbd/0x1a8)
[<c001ccb9>] (__do_softirq+0xbd/0x1a8) from [<c001cfad>] (irq_exit+0x61/0x98)
[<c001cfad>] (irq_exit+0x61/0x98) from [<c000d4bf>] (handle_IRQ+0x27/0x60)
[<c000d4bf>] (handle_IRQ+0x27/0x60) from [<c000843b>] (armada_370_xp_handle_irq+0x33/0xc8)
[<c000843b>] (armada_370_xp_handle_irq+0x33/0xc8) from [<c000fba9>] (__irq_usr+0x49/0x60)
Ben Hutchings attempted to propose a better fix consisting in using a
scheduled work for this, but while it fixed this panic, it caused other
random freezes and panics proving that the reset sequence in the driver
is unreliable and that additional fixes should be investigated.
When sending multiple streams over a link limited to 100 Mbps, Tx timeouts
happen from time to time, and the driver correctly recovers only when the
function is disabled.
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Gregory CLEMENT <gregory.clement@free-electrons.com> Cc: Ben Hutchings <ben@decadent.org.uk> Tested-by: Arnaud Ebalard <arno@natisbad.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stats writers are mvneta_rx() and mvneta_tx(). They don't lock anything
when they update the stats, and as a result, it randomly happens that
the stats freeze on SMP if two updates happen during stats retrieval.
This is very easily reproducible by starting two HTTP servers and binding
each of them to a different CPU, then consulting /proc/net/dev in loops
during transfers, the interface should immediately lock up. This issue
also randomly happens upon link state changes during transfers, because
the stats are collected in this situation, but it takes more attempts to
reproduce it.
The comments in netdevice.h suggest using per_cpu stats instead to get
rid of this issue.
This patch implements this. It merges both rx_stats and tx_stats into
a single "stats" member with a single syncp. Both mvneta_rx() and
mvneta_rx() now only update the a single CPU's counters.
In turn, mvneta_get_stats64() does the summing by iterating over all CPUs
to get their respective stats.
With this change, stats are still correct and no more lockup is encountered.
Note that this bug was present since the first import of the mvneta
driver. It might make sense to backport it to some stable trees. If
so, it depends on "d33dc73 net: mvneta: increase the 64-bit rx/tx stats
out of the hot path".
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Gregory CLEMENT <gregory.clement@free-electrons.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Tested-by: Arnaud Ebalard <arno@natisbad.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Better count packets and bytes in the stack and on 32 bit then
accumulate them at the end for once. This saves two memory writes
and two memory barriers per packet. The incoming packet rate was
increased by 4.7% on the Openblocks AX3 thanks to this.
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Gregory CLEMENT <gregory.clement@free-electrons.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Tested-by: Arnaud Ebalard <arno@natisbad.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Marcelo Ricardo Leitner reported problems when the forwarding link path
has a lower mtu than the incoming one if the inbound interface supports GRO.
Given:
Host <mtu1500> R1 <mtu1200> R2
Host sends tcp stream which is routed via R1 and R2. R1 performs GRO.
In this case, the kernel will fail to send ICMP fragmentation needed
messages (or pkt too big for ipv6), as GSO packets currently bypass dstmtu
checks in forward path. Instead, Linux tries to send out packets exceeding
the mtu.
When locking route MTU on Host (i.e., no ipv4 DF bit set), R1 does
not fragment the packets when forwarding, and again tries to send out
packets exceeding R1-R2 link mtu.
This alters the forwarding dstmtu checks to take the individual gso
segment lengths into account.
For ipv6, we send out pkt too big error for gso if the individual
segments are too big.
For ipv4, we either send icmp fragmentation needed, or, if the DF bit
is not set, perform software segmentation and let the output path
create fragments when the packet is leaving the machine.
It is not 100% correct as the error message will contain the headers of
the GRO skb instead of the original/segmented one, but it seems to
work fine in my (limited) tests.
Eric Dumazet suggested to simply shrink mss via ->gso_size to avoid
sofware segmentation.
However it turns out that skb_segment() assumes skb nr_frags is related
to mss size so we would BUG there. I don't want to mess with it considering
Herbert and Eric disagree on what the correct behavior should be.
Hannes Frederic Sowa notes that when we would shrink gso_size
skb_segment would then also need to deal with the case where
SKB_MAX_FRAGS would be exceeded.
This uses sofware segmentation in the forward path when we hit ipv4
non-DF packets and the outgoing link mtu is too small. Its not perfect,
but given the lack of bug reports wrt. GRO fwd being broken this is a
rare case anyway. Also its not like this could not be improved later
once the dust settles.
Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Reported-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Will be used by upcoming ipv4 forward path change that needs to
determine feature mask using skb->dst->dev instead of skb->dev.
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This moves part of Eric Dumazets skb_gso_seglen helper from tbf sched to
skbuff core so it may be reused by upcoming ip forwarding path patch.
Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
SCTP's sctp_connectx() abi breaks for 64bit kernels compiled with 32bit
emulation (e.g. ia32 emulation or x86_x32). Due to internal usage of
'struct sctp_getaddrs_old' which includes a struct sockaddr pointer,
sizeof(param) check will always fail in kernel as the structure in
64bit kernel space is 4bytes larger than for user binaries compiled
in 32bit mode. Thus, applications making use of sctp_connectx() won't
be able to run under such circumstances.
Introduce a compat interface in the kernel to deal with such
situations by using a 'struct compat_sctp_getaddrs_old' structure
where user data is copied into it, and then sucessively transformed
into a 'struct sctp_getaddrs_old' structure with the help of
compat_ptr(). That fixes sctp_connectx() abi without any changes
needed in user space, and lets the SCTP test suite pass when compiled
in 32bit and run on 64bit kernels.
Fixes: f9c67811ebc0 ("sctp: Fix regression introduced by new sctp_connectx api") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
since commit 89aef8921bf("ipv4: Delete routing cache."), the counter
in_slow_tot can't work correctly.
The counter in_slow_tot increase by one when fib_lookup() return successfully
in ip_route_input_slow(), but actually the dst struct maybe not be created and
cached, so we can increase in_slow_tot after the dst struct is created.
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
aggregator_identifier is used to assign unique aggregator identifiers
to aggregators of a bond during device enslaving.
aggregator_identifier is currently a global variable that is zeroed in
bond_3ad_initialize().
This sequence will lead to duplicate aggregator identifiers for eth1 and eth3:
create bond0
change bond0 mode to 802.3ad
enslave eth0 to bond0 //eth0 gets agg id 1
enslave eth1 to bond0 //eth1 gets agg id 2
create bond1
change bond1 mode to 802.3ad
enslave eth2 to bond1 //aggregator_identifier is reset to 0
//eth2 gets agg id 1
enslave eth3 to bond0 //eth3 gets agg id 2
Fix this by making aggregator_identifier private to the bond.
Signed-off-by: Jiri Bohac <jbohac@suse.cz> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch removes a generic hard_header_len check from the usbnet
module that is causing dropped packages under certain circumstances
for devices that send rx packets that cross urb boundaries.
One example is the AX88772B which occasionally send rx packets that
cross urb boundaries where the remaining partial packet is sent with
no hardware header. When the buffer with a partial packet is of less
number of octets than the value of hard_header_len the buffer is
discarded by the usbnet module.
With AX88772B this can be reproduced by using ping with a packet
size between 1965-1976.
The bug has been reported here:
https://bugzilla.kernel.org/show_bug.cgi?id=29082
This patch introduces the following changes:
- Removes the generic hard_header_len check in the rx_complete
function in the usbnet module.
- Introduces a ETH_HLEN check for skbs that are not cloned from
within a rx_fixup callback.
- For safety a hard_header_len check is added to each rx_fixup
callback function that could be affected by this change.
These extra checks could possibly be removed by someone
who has the hardware to test.
- Removes a call to dev_kfree_skb_any() and instead utilizes the
dev->done list to queue skbs for cleanup.
The changes place full responsibility on the rx_fixup callback
functions that clone skbs to only pass valid skbs to the
usbnet_skb_return function.
Signed-off-by: Emil Goode <emilgoode@gmail.com> Reported-by: Igor Gnatenko <i.gnatenko.brain@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This bug was reported by Steinar H. Gunderson and was introduced by commit f7cb8886335d ("sit/gre6: don't try to add the same route two times").
root@morgental:~# ip tunnel add foo mode gre remote 1.2.3.4 ttl 64
root@morgental:~# ip link set foo up mtu 1468
root@morgental:~# ip -6 route show dev foo
fe80::/64 proto kernel metric 256
but after the above commit, no such route shows up.
There is no link local route because dev->dev_addr is 0 (because local ipv4
address is 0), hence no link local address is configured.
In this scenario, the link local address is added manually: 'ip -6 addr add
fe80::1 dev foo' and because prefix is /128, no link local route is added by the
kernel.
Even if the right things to do is to add the link local address with a /64
prefix, we need to restore the previous behavior to avoid breaking userpace.
Reported-by: Steinar H. Gunderson <sesse@samfundet.no> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The struct driver_info ax88178_info is assigned the function
asix_rx_fixup_common as it's rx_fixup callback. This means that
FLAG_MULTI_PACKET must be set as this function is cloning the
data and calling usbnet_skb_return. Not setting this flag leads
to usbnet_skb_return beeing called a second time from within
the rx_process function in the usbnet module.
Signed-off-by: Emil Goode <emilgoode@gmail.com> Reported-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Without this patch, the "cat /sys/class/net/ethN/operstate" shows
"unknown", and "ethtool ethN" shows "Link detected: yes", when VM
boots up with or without vNIC connected.
This patch fixed the problem.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
vhost checked the counter within the refcnt before decrementing. It
really wanted to know that it is the one that has the last reference, as
a way to batch freeing resources a bit more efficiently.
Note: we only let refcount go to 0 on device release.
This works well but we now access the ref counter twice so there's a
race: all users might see a high count and decide to defer freeing
resources.
In the end no one initiates freeing resources until the last reference
is gone (which is on VM shotdown so might happen after a looooong time).
Let's do what we probably should have done straight away:
switch from kref to plain atomic, documenting the
semantics, return the refcount value atomically after decrement,
then use that to avoid the deadlock.
Reported-by: Qin Chuanyu <qinchuanyu@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Quoting David Vrabel -
"5780 cards cannot have jumbo frames and TSO enabled together. When
jumbo frames are enabled by setting the MTU, the TSO feature must be
cleared. This is done indirectly by calling netdev_update_features()
which will call tg3_fix_features() to actually clear the flags.
netdev_update_features() will also trigger a new netlink message for the
feature change event which will result in a call to tg3_get_stats64()
which deadlocks on the tg3 lock."
tg3_set_mtu() does not need to be under the tg3 lock since converting
the flags to use set_bit(). Move it out to after tg3_netif_stop().
Reported-by: David Vrabel <david.vrabel@citrix.com> Tested-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 46d3ceabd8d9 ("tcp: TCP Small Queues") introduced a possible
regression for applications using TCP_NODELAY.
If TCP session is throttled because of tsq, we should consult
tp->nonagle when TX completion is done and allow us to send additional
segment, especially if this segment is not a full MSS.
Otherwise this segment is sent after an RTO.
[edumazet] : Cooked the changelog, added another fix about testing
sk_wmem_alloc twice because TX completion can happen right before
setting TSQ_THROTTLED bit.
This problem is particularly visible with recent auto corking,
but might also be triggered with low tcp_limit_output_bytes
values or NIC drivers delaying TX completion by hundred of usec,
and very low rtt.
Thomas Glanzmann for example reported an iscsi regression, caused
by tcp auto corking making this bug quite visible.
Fixes: 46d3ceabd8d9 ("tcp: TCP Small Queues") Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Thomas Glanzmann <thomas@glanzmann.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This device was mentioned in an OpenWRT forum. Seems to have a "standard"
Sierra Wireless ifnumber to function layout:
0: qcdm
2: nmea
3: modem
8: qmi
9: storage
Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, to make netconsole start over IPv6, the source address
needs to be specified. Without a source address, netpoll_parse_options
assumes we're setting up over IPv4 and the destination IPv6 address is
rejected.
Check if the IP version has been forced by a source address before
checking for a version mismatch when parsing the destination address.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ip rules with iif/oif references do not update:
(detach/attach) across interface renames.
Signed-off-by: Maciej Żenczykowski <maze@google.com> CC: Willem de Bruijn <willemb@google.com> CC: Eric Dumazet <edumazet@google.com> CC: Chris Davis <chrismd@google.com> CC: Carlo Contavalli <ccontavalli@google.com>
Google-Bug-Id: 12936021 Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Apparently commit 5c766d642bcaffd0c2a5b354db2068515b3846cf ("ipv4:
introduce address lifetime") forgot to take into account the addition of
struct ifa_cacheinfo in inet_nlmsg_size(). Hence add it, like is already
done for ipv6.
Suggested-by: Cong Wang <cwang@twopensource.com> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Self generated skbuffs in net/can/bcm.c are setting a skb->sk reference but
no explicit destructor which is enforced since Linux 3.11 with commit 376c7311bdb6 (net: add a temporary sanity check in skb_orphan()).
This patch adds some helper functions to make sure that a destructor is
properly defined when a sock reference is assigned to a CAN related skb.
To create an unshared skb owned by the original sock a common helper function
has been introduced to replace open coded functions to create CAN echo skbs.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Tested-by: Andre Naujoks <nautsch2@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 93d8bf9fb8f3 ("bridge: cleanup netpoll code") introduced
a check in br_netpoll_enable(), but this check is incorrect for
br_netpoll_setup(). This patch moves the code after the check
into __br_netpoll_enable() and calls it in br_netpoll_setup().
For br_add_if(), the check is still needed.
Fixes: 93d8bf9fb8f3 ("bridge: cleanup netpoll code") Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Tested-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 9p-virtio transport does zero copy on things larger than 1024 bytes
in size. It accomplishes this by returning the physical addresses of
pages to the virtio-pci device. At present, the translation is usually a
bit shift.
That approach produces an invalid page address when we read/write to
vmalloc buffers, such as those used for Linux kernel modules. Any
attempt to load a Linux kernel module from 9p-virtio produces the
following stack.
qemu-system-x86_64: virtio: trying to map MMIO memory
This patch enables 9p-virtio to correctly handle this case. This not
only enables us to load Linux kernel modules off virtfs, but also
enables ZFS file-based vdevs on virtfs to be used without killing QEMU.
Special thanks to both Avi Kivity and Alexander Graf for their
interpretation of QEMU backtraces. Without their guidence, tracking down
this bug would have taken much longer. Also, special thanks to Linus
Torvalds for his insightful explanation of why this should use
is_vmalloc_addr() instead of is_vmalloc_or_module_addr():
https://lkml.org/lkml/2014/2/8/272
Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a device ndo_start_xmit() calls again dev_queue_xmit(),
lockdep can complain because dev_queue_xmit() is re-entered and the
spinlocks protecting tx queues share a common lockdep class.
Same issue was fixed for bonding/l2tp/ppp in commits
0daa2303028a6 ("[PATCH] bonding: lockdep annotation") 49ee49202b4ac ("bonding: set qdisc_tx_busylock to avoid LOCKDEP splat") 23d3b8bfb8eb2 ("net: qdisc busylock needs lockdep annotations ") 303c07db487be ("ppp: set qdisc_tx_busylock to avoid LOCKDEP splat ")
Reported-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit aa9c2669626c (NFS: Client implementation of Labeled-NFS) introduces
a performance regression. When nfs_zap_caches_locked is called, it sets
the NFS_INO_INVALID_LABEL flag irrespectively of whether or not the
NFS server supports security labels. Since that flag is never cleared,
it means that all calls to nfs_revalidate_inode() will now trigger
an on-the-wire GETATTR call.
This patch ensures that we never set the NFS_INO_INVALID_LABEL unless the
server advertises support for labeled NFS.
It also causes nfs_setsecurity() to clear NFS_INO_INVALID_LABEL when it
has successfully set the security label for the inode.
Finally it gets rid of the NFS_INO_INVALID_LABEL cruft from nfs_update_inode,
which has nothing to do with labeled NFS.
Reported-by: Neil Brown <neilb@suse.de> Tested-by: Neil Brown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1. wpa_supplicant send a start_scan request to the nl80211 driver
2. mac80211 module call rtl_op_config with IEEE80211_CONF_CHANGE_IDLE
3. rtl_ips_nic_on is called which disable local irqs
4. rtl92c_phy_set_rf_power_state() is called
5. rtl_ps_enable_nic() is called and hw_init()is executed and then the interrupts on the device are enabled
A good solution could be to refactor the code to avoid calling rtl92ce_hw_init() with the irqs disabled
but a quick and dirty solution that has proven to work is
to reenable the irqs during the function rtl92ce_hw_init().
I think that it is safe doing so since the device interrupt will only be enabled after the init function succeed.
Signed-off-by: Olivier Langlois <olivier@trillion01.com> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes regression caused by commit a16dad77634 "MIPS: Fix
potencial corruption". That commit fixes one corruption scenario in
cost of adding another one, which actually start to cause crashes
on Yeeloong laptop when rtl8187 driver is used.
For correct DMA read operation on machines without DMA coherence, kernel
have to invalidate cache, such it will refill later with new data that
device wrote to memory, when that data is needed to process. We can only
invalidate full cache line. Hence when cache line includes both dma
buffer and some other data (written in cache, but not yet in main
memory), the other data can not hit memory due to invalidation. That
happen on rtl8187 where struct rtl8187_priv fields are located just
before and after small buffers that are passed to USB layer and DMA
is performed on them.
To fix the problem we align buffers and reserve space after them to make
them match cache line.
This patch does not resolve all possible MIPS problems entirely, for
that we have to assure that we always map cache aligned buffers for DMA,
what can be complex or even not possible. But patch fixes visible and
reproducible regression and seems other possible corruptions do not
happen in practice, since Yeeloong laptop works stable without rtl8187
driver.
Reported-by: Petr Pisar <petr.pisar@atlas.cz> Bisected-by: Tom Li <biergaizi2009@gmail.com> Reported-and-tested-by: Tom Li <biergaizi2009@gmail.com> Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl> Acked-by: Larry Finger <Larry.Finger@lwfinger.next> Acked-by: Hin-Tak Leung <htl10@users.sourceforge.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
SMB3 servers can respond with MaxTransactSize of more than 4M
that can cause a memory allocation error returned from kmalloc
in a lock codepath. Also the client doesn't support multicredit
requests now and allows buffer sizes of 65536 bytes only. Set
MaxTransactSize to this maximum supported value.
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's possible for userland to pass down an iovec via writev() that has a
bogus user pointer in it. If that happens and we're doing an uncached
write, then we can end up getting less bytes than we expect from the
call to iov_iter_copy_from_user. This is CVE-2014-0069
cifs_iovec_write isn't set up to handle that situation however. It'll
blindly keep chugging through the page array and not filling those pages
with anything useful. Worse yet, we'll later end up with a negative
number in wdata->tailsz, which will confuse the sending routines and
cause an oops at the very least.
Fix this by having the copy phase of cifs_iovec_write stop copying data
in this situation and send the last write as a short one. At the same
time, we want to avoid sending a zero-length write to the server, so
break out of the loop and set rc to -EFAULT if that happens. This also
allows us to handle the case where no address in the iovec is valid.
[Note: Marking this for stable on v3.4+ kernels, but kernels as old as
v2.6.38 may have a similar problem and may need similar fix]
Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru> Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For avr32 cross compiler, do not define '__linux__' internally, so it
will cause issue with allmodconfig.
The related error:
CC [M] fs/coda/psdev.o
In file included from include/linux/coda.h:64,
from fs/coda/psdev.c:45:
include/uapi/linux/coda.h:221: error: expected specifier-qualifier-list before 'u_quad_t'
The related toolchain version (which only download, not re-compile):
In file included from arch/avr32/boards/mimc200/fram.c:13:
include/linux/miscdevice.h:51: error: field 'list' has incomplete type
include/linux/miscdevice.h:55: error: expected specifier-qualifier-list before 'mode_t'
arch/avr32/boards/mimc200/fram.c:42: error: 'THIS_MODULE' undeclared here (not in a function)
When doing reset in order to recover the affected PE, we issue
hot reset on PE primary bus if it's not root bus. Otherwise, we
issue hot or fundamental reset on root port or PHB accordingly.
For the later case, we didn't cover the situation where PE only
includes root port and it potentially causes kernel crash upon
EEH error to the PE.
The patch reworks the logic of EEH reset to improve the code
readability and also avoid the kernel crash.
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The reason is that we have used the wrong register to calculate the
ksp_limit in commit cbc9565ee826 (powerpc: Remove ksp_limit on ppc64).
Just fix it.
As suggested by Benjamin Herrenschmidt, also add the C prototype of the
function in the comment in order to avoid such kind of errors in the
future.
Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix tegra_init_cache() to check whether the system has a PL310 cache
before touching the PL310 registers. This prevents access to non-existent
registers on Tegra114 and later.
Note for stable kernels:
In <= v3.12, the file to patch is arch/arm/mach-tegra/common.c.
Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When building a kernel image with only CONFIG_CPU_IDLE but no CONFIG_PM,
we will get the following link error.
LD init/built-in.o
arch/arm/mach-imx/built-in.o: In function `imx6q_enter_wait':
platform-spi_imx.c:(.text+0x25c0): undefined reference to `imx6q_set_lpm'
platform-spi_imx.c:(.text+0x25d4): undefined reference to `imx6q_set_lpm'
arch/arm/mach-imx/built-in.o: In function `imx6q_cpuidle_init':
platform-spi_imx.c:(.init.text+0x75d4): undefined reference to `imx6q_set_chicken_bit'
make[1]: *** [vmlinux] Error 1
Since pm-imx6q.c has been a collection of library functions that access
CCM low-power registers used by not only suspend but also cpuidle and
other drivers, let's build pm-imx6q.c independently of CONFIG_PM to fix
above error.
Reported-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
OMAP SoC(s) depend on GPMC controller driver to parse GPMC DT child nodes and
register them platform_device for ONENAND driver to probe later. However this does
not happen if generic MTD_ONENAND framework is built as module (CONFIG_MTD_ONENAND=m).
Therefore, when MTD/ONENAND and MTD/ONENAND/OMAP2 modules are loaded, they are unable
to find any matching platform_device and remain un-binded. This causes on board
ONENAND flash to remain un-detected.
This patch causes GPMC controller to parse DT nodes when
CONFIG_MTD_ONENAND=y || CONFIG_MTD_ONENAND=m
Fixes: commit bc6b1e7b86f5d8e4a6fc1c0189e64bba4077efe0
ARM: OMAP: gpmc: add DT bindings for GPMC timings and NAND
OMAP SoC(s) depend on GPMC controller driver to parse GPMC DT child nodes and
register them platform_device for NAND driver to probe later. However this does
not happen if generic MTD_NAND framework is built as module (CONFIG_MTD_NAND=m).
Therefore, when MTD/NAND and MTD/NAND/OMAP2 modules are loaded, they are unable
to find any matching platform_device and remain un-binded. This causes on board
NAND flash to remain un-detected.
This patch causes GPMC controller to parse DT nodes when
CONFIG_MTD_NAND=y || CONFIG_MTD_NAND=m
Whilst the code does indeed reflect this in terms of the architecture,
the final <barrier> + <sev> have been contracted into a single inline
asm without a "memory" clobber, therefore the compiler is at liberty to
reorder the unlock to the end of the above sequence. In such a case,
a waiting CPU may be woken up before the lock has been unlocked, leading
to extremely poor performance.
This patch reworks the dsb_sev() function to make use of the dsb()
macro and ensure ordering against the unlock.
Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During __v{6,7}_setup, we invalidate the TLBs since we are about to
enable the MMU on return to head.S. Unfortunately, without a subsequent
dsb instruction, the invalidation is not guaranteed to have completed by
the time we write to the sctlr, potentially exposing us to junk/stale
translations cached in the TLB.
This patch reworks the init functions so that the dsb used to ensure
completion of cache/predictor maintenance is also used to ensure
completion of the TLB invalidation.
Reported-by: Albin Tonnerre <Albin.Tonnerre@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The stage-2 memory attributes are distinct from the Hyp memory
attributes and the Stage-1 memory attributes. We were using the stage-1
memory attributes for stage-2 mappings causing device mappings to be
mapped as normal memory. Add the S2 equivalent defines for memory
attributes and fix the comments explaining the defines while at it.
Add a prot_pte_s2 field to the mem_type struct and fill out the field
for device mappings accordingly.
Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
GFP_ATOMIC is not a single gfp flag, but a macro which expands to the other
flags and LACK of __GFP_WAIT flag. To check if caller wanted to perform an
atomic allocation, the code must test __GFP_WAIT flag presence. This patch
fixes the issue introduced in v3.6-rc5
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The set_flexbg_block_bitmap() function assumed that the number of
blocks in a blockgroup was sb->blocksize * 8, which is normally true,
but not always! Use EXT4_BLOCKS_PER_GROUP(sb) instead, to fix block
bitmap corruption after:
If a file system has a large number of inodes per block group, all of
the metadata blocks in a flex_bg may be larger than what can fit in a
single block group. Unfortunately, ext4_alloc_group_tables() in
resize.c was never tested to see if it would handle this case
correctly, and there were a large number of bugs which caused the
following sequence to result in a BUG_ON:
To fix this, we need to make sure the right thing happens when a block
group's inode table straddles two block groups, which means the
following bugs had to be fixed:
1) Not clearing the BLOCK_UNINIT flag in the second block group in
ext4_alloc_group_tables --- the was proximate cause of the BUG_ON.
2) Incorrectly determining how many block groups contained contiguous
free blocks in ext4_alloc_group_tables().
3) Incorrectly setting the start of the next block range to be marked
in use after a discontinuity in setup_new_flex_group_blocks().
If an ext4 file system is created by some tool other than mke2fs
(perhaps by someone who has a pathalogical fear of the GPL) that
doesn't set one or the other of the EXT2_FLAGS_{UN}SIGNED_HASH flags,
and that file system is then mounted read-only, don't try to modify
the s_flags field. Otherwise, if dm_verity is in use, the superblock
will change, causing an dm_verity failure.
In swap_inode_boot_loader() we forgot to release ->i_mutex and resume
unlocked dio for inode and inode_bl if there is an error starting the
journal handle. This commit fixes this issue.
Reported-by: Ahmed Tamrawi <ahmedtamrawi@gmail.com> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Dr. Tilmann Bubeck <t.bubeck@reinform.de> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit a115f749c1 (ext4: remove wait for unwritten extent conversion from
ext4_truncate) exposed a bug in ext4_ext_handle_uninitialized_extents().
It can be triggered by xfstest generic/299 when run on a test file
system created without a journal. This test continuously fallocates and
truncates files to which random dio/aio writes are simultaneously
performed by a separate process. The test completes successfully, but
if the test filesystem is mounted with the block_validity option, a
warning message stating that a logical block has been mapped to an
illegal physical block is posted in the kernel log.
The bug occurs when an extent is being converted to the written state
by ext4_end_io_dio() and ext4_ext_handle_uninitialized_extents()
discovers a mapping for an existing uninitialized extent. Although it
sets EXT4_MAP_MAPPED in map->m_flags, it fails to set map->m_pblk to
the discovered physical block number. Because map->m_pblk is not
otherwise initialized or set by this function or its callers, its
uninitialized value is returned to ext4_map_blocks(), where it is
stored as a bogus mapping in the extent status tree.
Since map->m_pblk can accidentally contain illegal values that are
larger than the physical size of the file system, calls to
check_block_validity() in ext4_map_blocks() that are enabled if the
block_validity mount option is used can fail, resulting in the logged
warning message.
intel_ring_cachline_align() emits MI_NOOPs until the ring tail is
aligned to a cacheline boundary.
Cc: Bjoern C <lkml@call-home.ch> Cc: Alexandru DAMIAN <alexandru.damian@intel.com> Cc: Enrico Tagliavini <enrico.tagliavini@gmail.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 0a0afd282f ("drm/nv50-/disp: move DP link training to core and
train from supervisor") added code that uses the wrong register for
computing the display bpp, used for bandwidth calculation. Adjust to use
the same register as used by exec_clkcmp and nv50_disp_intr_unk20_2_dp.
Reported-by: Torsten Wagner <torsten.wagner@gmail.com> Reported-by: Michael Gulick <mgulick@mathworks.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67628 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drm/nouveau/fb: remove ram oclass argument from base fb constructor
Introduced a unfortunate regression by using nv10 ram oclass for nv1a
hardware, causing corruption and eventually system lockup.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74866 Reported-by: John F. Godfrey <jfgodfrey@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since commit 0fa9061ae8c ("drm/nouveau/mc: handle irq-related setup
ourselves"), drm_device->irq_enabled remained unset. This is needed in
order to properly wait for a vblank event in the generic drm code.
See https://bugs.freedesktop.org/show_bug.cgi?id=74195
Reported-by: Jan Janecek <janjanjanx@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The CP semaphore queue on CIK has a bug that triggers if uncompleted
waits use the same address while a signal is still pending. Work around
this by using different addresses for each sync.
Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We're using edac_mc_workq_setup() both on the init path, when
we load an edac driver and when we change the polling period
(edac_mc_reset_delay_period) through /sys/.../edac_mc_poll_msec.
On that second path we don't need to init the workqueue which has been
initialized already.
Sanitize code even more to accept unsigned longs only and to not allow
polling intervals below 1 second as this is unnecessary and doesn't make
much sense anyway for polling errors.
In allmodconfig builds for sparc and any other arch which does
not set CONFIG_SPARSE_IRQ, the following will be seen at modpost:
CC [M] lib/cpu-notifier-error-inject.o
CC [M] lib/pm-notifier-error-inject.o
ERROR: "irq_to_desc" [drivers/gpio/gpio-mcp23s08.ko] undefined!
make[2]: *** [__modpost] Error 1
This happens because commit 3911ff30f5 ("genirq: export
handle_edge_irq() and irq_to_desc()") added one export for it, but
there were actually two instances of it, in an if/else clause for
CONFIG_SPARSE_IRQ. Add the second one.
Each sub-buffer (buffer page) has a full 64 bit timestamp. The events on
that page use a 27 bit delta against that timestamp in order to save on
bits written to the ring buffer. If the time between events is larger than
what the 27 bits can hold, a "time extend" event is added to hold the
entire 64 bit timestamp again and the events after that hold a delta from
that timestamp.
As a "time extend" is always paired with an event, it is logical to just
allocate the event with the time extend, to make things a bit more efficient.
Unfortunately, when the pairing code was written, it removed the "delta = 0"
from the first commit on a page, causing the events on the page to be
slightly skewed.
Fixes: 69d1b839f7ee "ring-buffer: Bind time extend and data events together" Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix NULL pointer dereference of "chip->pdata" if platform_data was not
supplied to the driver.
The driver during probe stored the pointer to the platform_data:
chip->pdata = client->dev.platform_data;
Later it was dereferenced in max17040_get_online() and
max17040_get_status().
If platform_data was not supplied, the NULL pointer exception would
happen:
When compiling for the IA-64 ski emulator, HZ is set to 32 because the
emulation is slow and we don't want to waste too many cycles processing
timers. Alpha also has an option to set HZ to 32.
This causes integer underflow in
kernel/time/jiffies.c:
kernel/time/jiffies.c:66:2: warning: large integer implicitly truncated to unsigned type [-Woverflow]
.mult = NSEC_PER_JIFFY << JIFFIES_SHIFT, /* details above */
^
This patch reduces the JIFFIES_SHIFT value to avoid the overflow.
Because the offload mechanism can fall back to a standard transfer,
having two seperate initialization states is unfortunate. Let's just
have one state which does things consistently. This fixes a bug where
some preparation was missing when the fallback happened. And it makes
the code much easier to follow. To implement this, we put the check
if offload is possible at the top of the offload setup function.
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:
get_online_cpus();
for_each_online_cpu(cpu)
init_cpu(cpu);
register_cpu_notifier(&foobar_cpu_notifier);
put_online_cpus();
This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
Interestingly, the raid5 code can actually prevent double initialization and
hence can use the following simplified form of callback registration:
register_cpu_notifier(&foobar_cpu_notifier);
get_online_cpus();
for_each_online_cpu(cpu)
init_cpu(cpu);
put_online_cpus();
A hotplug operation that occurs between registering the notifier and calling
get_online_cpus(), won't disrupt anything, because the code takes care to
perform the memory allocations only once.
So reorganize the code in raid5 this way to fix the deadlock with callback
registration.
Cc: linux-raid@vger.kernel.org Fixes: 36d1c6476be51101778882897b315bd928c8c7b5 Signed-off-by: Oleg Nesterov <oleg@redhat.com>
[Srivatsa: Fixed the unregister_cpu_notifier() deadlock, added the
free_scratch_buffer() helper to condense code further and wrote the changelog.] Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AMD systems which use the C1E workaround in the amd_e400_idle routine
trigger the WARN_ON_ONCE in the broadcast code when onlining a CPU.
The reason is that the idle routine of those AMD systems switches the
cpu into forced broadcast mode early on before the newly brought up
CPU can switch over to high resolution / NOHZ mode. The timer related
CPU1 bringup looks like this:
So while we remove CPU1 from the broadcast_oneshot_mask when we switch
over to highres mode, we do not clear the pending bit, which then
triggers the warning when we go back to idle.
The reason why this is only visible on C1E affected AMD systems is
that the other machines enter the deep sleep states via
acpi_idle/intel_idle and exit the broadcast mode before executing the
remote triggered local_apic_timer_interrupt. So the pending bit is
already cleared when the switch over to highres mode is clearing the
oneshot mask.
The solution is simple: Clear the pending bit together with the mask
bit when we switch over to highres mode.
Stanislaw came up independently with the same patch by enforcing the
C1E workaround and debugging the fallout. I picked mine, because mine
has a changelog :)
Reported-by: poma <pomidorabelisima@gmail.com> Debugged-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Olaf Hering <olaf@aepfle.de> Cc: Dave Jones <davej@redhat.com> Cc: Justin M. Forbes <jforbes@redhat.com> Cc: Josh Boyer <jwboyer@redhat.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1402111434180.21991@ionos.tec.linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If kvm_io_bus_register_dev() fails then it returns success but it should
return an error code.
I also did a little cleanup like removing an impossible NULL test.
Fixes: 2b3c246a682c ('KVM: Make coalesced mmio use a device per zone') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
iovcnt is declared as a signed integer in both the userspace API and
as a local variable in mic_virtio.c. The while() loop in mic_virtio.c
iterates until the local variable iovcnt reaches the value 0. If
userspace passes e.g. INT_MIN as iovcnt field, this loop then appears
to depend on an undefined behavior (signed underflow) to complete.
The fix is to use unsigned integers in both the userspace API and
the local variable.
This issue was reported @ https://lkml.org/lkml/2014/1/10/10
I started noticing problems with KVM guest destruction on Linux
3.12+, where guest memory wasn't being cleaned up. I bisected it
down to the commit introducing the new 'asm goto'-based atomics,
and found this quirk was later applied to those.
Unfortunately, even with GCC 4.8.2 (which ostensibly fixed the
known 'asm goto' bug) I am still getting some kind of
miscompilation. If I enable the asm_volatile_goto quirk for my
compiler, KVM guests are destroyed correctly and the memory is
cleaned up.
So make the quirk unconditional for now, until bug is found
and fixed.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Noonan <steven@uplinklabs.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/1392274867-15236-1-git-send-email-steven@uplinklabs.net Link: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670 Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The ACPI specification (ACPI 5.0A, Section 6.3.7) says:
_STA may return bit 0 clear (not present) with bit 3 set (device is
functional). This case is used to indicate a valid device for which
no device driver should be loaded (for example, a bridge device.)
Children of this device may be present and valid. OSPM should
continue enumeration below a device whose _STA returns this bit
combination.
Evidently, some BIOSes follow that and return 0x0A from _STA, which
causes problems to happen when they trigger bus check or device check
notifications for those devices too. Namely, ACPIPHP thinks that they
are gone and may drop them, for example, if such a notification is
triggered during a resume from system suspend.
To fix that, modify ACPICA to regard devies as present and
functioning if _STA returns both the ACPI_STA_DEVICE_ENABLED
and ACPI_STA_DEVICE_FUNCTIONING bits set for them.
Reported-and-tested-by: Peter Wu <lekensteyn@gmail.com>
[rjw: Subject and changelog, minor code modifications] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When mkfs issues a full device discard and the device only
supports discards of a smallish size, we can loop in
blkdev_issue_discard() for a long time. If preempt isn't enabled,
this can turn into a softlock situation and the kernel will
start complaining.
Add an explicit cond_resched() at the end of the loop to avoid
that.
Commit 9f060e2231ca changed the way we handle allocations for the
integrity vectors. When the vectors are inline there is no associated
slab and consequently bvec_nr_vecs() returns 0. Ensure that we check
against BIP_INLINE_VECS in that case.
Reported-by: David Milburn <dmilburn@redhat.com> Tested-by: David Milburn <dmilburn@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
request_queue bypassing is used to suppress higher-level function of a
request_queue so that they can be switched, reconfigured and shut
down. A request_queue does the followings while bypassing.
* bypasses elevator and io_cq association and queues requests directly
to the FIFO dispatch queue.
* bypasses block cgroup request_list lookup and always uses the root
request_list.
Once confirmed to be bypassing, specific elevator and block cgroup
policy implementations can assume that nothing is in flight for them
and perform various operations which would be dangerous otherwise.
Such confirmation is acheived by short-circuiting all new requests
directly to the dispatch queue and waiting for all the requests which
were issued before to finish. Unfortunately, while the request
allocating and draining sides were properly handled, we forgot to
actually plug the request dispatch path. Even after bypassing mode is
confirmed, if the attached driver tries to fetch a request and the
dispatch queue is empty, __elv_next_request() would invoke the current
elevator's elevator_dispatch_fn() callback. As all in-flight requests
were drained, the elevator wouldn't contain any request but once
bypass is confirmed we don't even know whether the elevator is even
there. It might be in the process of being switched and half torn
down.
Frank Mayhar reports that this actually happened while switching
elevators, leading to an oops.
Let's fix it by making __elv_next_request() avoid invoking the
elevator_dispatch_fn() callback if the queue is bypassing. It already
avoids invoking the callback if the queue is dying. As a dying queue
is guaranteed to be bypassing, we can simply replace blk_queue_dying()
check with blk_queue_bypass().
Reported-by: Frank Mayhar <fmayhar@google.com>
References: http://lkml.kernel.org/g/1390319905.20232.38.camel@bobble.lax.corp.google.com Tested-by: Frank Mayhar <fmayhar@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit afe2dab4f6 ("USB: add hex/bcd detection to usb modalias generation")
changed the routine that generates alias ranges. Before that change, only
digits 0-9 were supported; the commit tried to fix the case when the range
includes higher values than 0x9.
Unfortunately, the commit didn't fix the case when the range includes both
0x9 and 0xA, meaning that the final range must look like [x-9A-y] where
x <= 0x9 and y >= 0xA -- instead the [x-9A-x] range was produced.
Modprobe doesn't complain as it sees no difference between no-match and
bad-pattern results of fnmatch().
Fixing this simple bug to fix the aliases.
Also changing the hardcoded beginning of the range to uppercase as all the
other letters are also uppercase in the device version numbers.
Fortunately, this affects only the dvb-usb-dib0700 module, AFAIK.
Signed-off-by: Jan Moskyto Matejka <mq@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 9df89d85b407690afa46ddfbccc80bec6869971d "usbcore: set
lpm_capable field for LPM capable root hubs" was created under the
assumption that all USB host controllers should have USB 3.0 Link PM
enabled for all devices under the hosts.
Unfortunately, that's not the case. The xHCI driver relies on knowledge
of the host hardware scheduler to calculate the LPM U1/U2 timeout
values, and it only sets lpm_capable to one for Intel host controllers
(that have the XHCI_LPM_SUPPORT quirk set).
When LPM is enabled for some Fresco Logic hosts, it causes failures with
a AgeStar 3UBT USB 3.0 hard drive dock:
Jan 11 13:59:03 sg-laptop kernel: usb 3-1: new SuperSpeed USB device number 2 using xhci_hcd
Jan 11 13:59:03 sg-laptop kernel: usb 3-1: Set SEL for device-initiated U1 failed.
Jan 11 13:59:08 sg-laptop kernel: usb 3-1: Set SEL for device-initiated U2 failed.
Jan 11 13:59:08 sg-laptop kernel: usb-storage 3-1:1.0: USB Mass Storage device detected
Jan 11 13:59:08 sg-laptop mtp-probe[613]: checking bus 3, device 2: "/sys/devices/pci0000:00/0000:00:1c.3/0000:04:00.0/usb3/3-1"
Jan 11 13:59:08 sg-laptop mtp-probe[613]: bus: 3, device: 2 was not an MTP device
Jan 11 13:59:08 sg-laptop kernel: scsi6 : usb-storage 3-1:1.0
Jan 11 13:59:13 sg-laptop kernel: usb 3-1: Set SEL for device-initiated U1 failed.
Jan 11 13:59:18 sg-laptop kernel: usb 3-1: Set SEL for device-initiated U2 failed.
Jan 11 13:59:18 sg-laptop kernel: usbcore: registered new interface driver usb-storage
Jan 11 13:59:40 sg-laptop kernel: usb 3-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 11 13:59:41 sg-laptop kernel: usb 3-1: device descriptor read/8, error -71
Jan 11 13:59:41 sg-laptop kernel: usb 3-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 11 13:59:46 sg-laptop kernel: usb 3-1: device descriptor read/8, error -110
Jan 11 13:59:46 sg-laptop kernel: scsi 6:0:0:0: Device offlined - not ready after error recovery
Jan 11 13:59:46 sg-laptop kernel: usb 3-1: USB disconnect, device number 2
This reverts commit 35773dac5f862cb1c82ea151eba3e2f6de51ec3e. It's a
hack that caused regressions in the usb-storage and userspace USB
drivers that use usbfs and libusb. Commit 70cabb7d992f "xhci 1.0: Limit
arbitrarily-aligned scatter gather." should fix the issues seen with the
ax88179_178a driver on xHCI 1.0 hosts, without causing regressions.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We are ripping out commit 35773dac5f862cb1c82ea151eba3e2f6de51ec3e "usb:
xhci: Link TRB must not occur within a USB payload burst" because it's a
hack that caused regressions in the usb-storage and userspace USB
drivers that use usbfs and libusb. This commit attempted to fix the
issues with that patch.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We are ripping out commit 35773dac5f862cb1c82ea151eba3e2f6de51ec3e "usb:
xhci: Link TRB must not occur within a USB payload burst" because it's a
hack that caused regressions in the usb-storage and userspace USB
drivers that use usbfs and libusb. This commit attempted to fix the
issues with that patch.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
xHCI 1.0 hosts have a set of requirements on how to align transfer
buffers on the endpoint rings called "TD fragment" rules. When the
ax88179_178a driver added support for scatter gather in 3.12, with
commit 804fad45411b48233b48003e33a78f290d227c8 "USBNET: ax88179_178a:
enable tso if usb host supports sg dma", it broke the device under xHCI
1.0 hosts. Under certain network loads, the device would see an
unexpected short packet from the host, which would cause the device to
stop sending ethernet packets, even through USB packets would still be
sent.
Commit 35773dac5f86 "usb: xhci: Link TRB must not occur within a USB
payload burst" attempted to fix this. It was a quick hack to partially
implement the TD fragment rules. However, it caused regressions in the
usb-storage layer and userspace USB drivers using libusb. The patches
to attempt to fix this are too far reaching into the USB core, and we
really need to implement the TD fragment rules correctly in the xHCI
driver, instead of continuing to wallpaper over the issues.
Disable arbitrarily-aligned scatter-gather in the xHCI driver for 1.0
hosts. Only the ax88179_178a driver checks the no_sg_constraint flag,
so don't set it for 1.0 hosts. This should not impact usb-storage or
usbfs behavior, since they pass down max packet sized aligned sg-list
entries (512 for USB 2.0 and 1024 for USB 3.0).
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Tested-by: Mark Lord <mlord@pobox.com> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Bjørn Mork <bjorn@mork.no> Cc: Freddy Xin <freddy@asix.com.tw> Cc: Ming Lei <ming.lei@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add support for ANT USB-m Stick from Dynastream Innovations, by listing
USB pid
[34366.944805] usb 6-1: New USB device found, idVendor=0fcf, idProduct=1009
[34366.944817] usb 6-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[34366.944824] usb 6-1: Product: ANT USB-m Stick
[34366.944831] usb 6-1: Manufacturer: Dynastream Innovations
Device reported (https://code.google.com/p/antpm/issues/detail?id=5) to
work through:
$ modprobe usbserial vendor=0x0fcf product=0x1009
People sometimes create their own custom-configured kernels and forget
to enable CONFIG_SCSI_MULTI_LUN. This causes problems when they plug
in a USB storage device (such as a card reader) with more than one
LUN.
Fortunately, we can tell fairly easily when a storage device claims to
have more than one LUN. When that happens, this patch asks the SCSI
layer to probe all the LUNs automatically, regardless of the config
setting.
The patch also updates the Kconfig help text for usb-storage,
explaining that CONFIG_SCSI_MULTI_LUN may be necessary.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Thomas Raschbacher <lordvan@lordvan.com> CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net> CC: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Cypress ATACB unusual-devs entry for the Super Top SATA bridge
causes problems. Although it was originally reported only for
bcdDevice = 0x160, its range was much larger. This resulted in a bug
report for bcdDevice 0x220, so the range was capped at 0x219. Now
Milan reports errors with bcdDevice 0x150.
Therefore this patch restricts the range to just 0x160.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: Milan Svoboda <milan.svoboda@centrum.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the guest attempts to connect with the host when there may already be a
connection with the host (as would be the case during the kdump/kexec path),
it is difficult to guarantee timely response from the host. Starting with
WS2012 R2, the host supports this ability to re-connect with the host
(explicitly to support kexec). Prior to responding to the guest, the host
needs to ensure that device states based on the previous connection to
the host have been properly torn down. This may introduce unbounded delays.
To deal with this issue, don't do a timed wait during the initial connect
with the host.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During the initial VMBUS connect phase, starting with WS2012 R2, we should
specify the VPCU in the guest that should receive the notification. Fix this
issue. This fix is required to properly connect to the host in the kexeced
kernel.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In order to ensure the correct width cycles on the VME bus, the VME bridge
drivers implement an algorithm to utilise the largest possible width reads and
writes whilst maintaining natural alignment constraints. The algorithm
currently looks at the start address rather than the current read/write address
when determining whether a 16-bit width cycle is required to get to 32-bit
alignment. This results in incorrect alignment,
Reported-by: Jim Strouth <james.strouth@ge.com> Tested-by: Jim Strouth <james.strouth@ge.com> Signed-off-by: Martyn Welch <martyn.welch@ge.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>