The ll_add_cpu_to_smp_group(), ll_enable_coherency() and
ll_disable_coherency() are used on Armada XP to control the coherency
fabric. However, they make the assumption that the coherency fabric is
always available, which is currently a correct assumption but will no
longer be true with a followup commit that disables the usage of the
coherency fabric when the conditions are not met to use it.
Therefore, this commit modifies those functions so that they check the
return value of ll_get_coherency_base(), and if the return value is 0,
they simply return without configuring anything in the coherency
fabric.
The ll_get_coherency_base() function is also modified to properly
return 0 when the function is called with the MMU disabled. In this
case, it normally returns the physical address of the coherency
fabric, but we now check if the virtual address is 0, and if that's
case, return a physical address of 0 to indicate that the coherency
fabric is not enabled.
Commit d127e9c ("ARM: tegra: make tegra_resume can work with current and later
chips") removed tegra_get_soc_id macro leaving used cpu register corrupted after
branching to v7_invalidate_l1() and as result causing execution of unintended
code on tegra20. Possibly it was expected that r6 would be SoC id func argument
since common cpu reset handler is setting r6 before branching to tegra_resume(),
but neither tegra20_lp1_reset() nor tegra30_lp1_reset() aren't setting r6
register before jumping to resume function. Fix it by re-adding macro.
Fixes: d127e9c (ARM: tegra: make tegra_resume can work with current and later chips) Reviewed-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When creating a dumb buffer object using the DRM_IOCTL_MODE_CREATE_DUMB
IOCTL, only the width, height, bpp and flags parameters are inputs. The
caller is not guaranteed to zero out or set handle, pitch and size, so
the driver must not treat these values as possible inputs.
Fixes a bug where running the Weston compositor on Tegra DRM would cause
an attempt to allocate a 3 GiB framebuffer to be allocated.
Commit a469abd0f868 (ARM: elf: add new hwcap for identifying atomic
ldrd/strd instructions) introduces HWCAP_ELF for 32-bit ARM
applications. As LPAE is always present on arm64, report the
corresponding compat HWCAP to user space.
As long as struct thin_c is in the list, anyone can grab a reference of
it. Consequently, we must wait for the reference count to drop to zero
*after* we remove the structure from the list, not before.
Discard bios and thin device deletion have the potential to release data
blocks. If the thin-pool is in out-of-data-space mode, and blocks were
released, transition the thin-pool back to full write mode.
The correct time to do this is just after the thin-pool metadata commit.
It cannot be done before the commit because the space maps will not
allow immediate reuse of the data blocks in case there's a rollback
following power failure.
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the pool was in PM_OUT_OF_SPACE mode its process_prepared_discard
function pointer was incorrectly being set to
process_prepared_discard_passdown rather than process_prepared_discard.
This incorrect function pointer meant the discard was being passed down,
but not effecting the mapping. As such any discard that was issued, in
an attempt to reclaim blocks, would not successfully free data space.
Reported-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We never bother caching a partial block that is at the back end of the
origin device. No cell ever gets locked, but the calling code was
assuming it was and trying to release it.
Now the code only releases if the cell has been set to a non NULL
value.
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the incoming bio is a WRITE and completely covers a block then we
don't bother to do any copying for a promotion operation. Once this is
done the cache block and origin block will be different, so we need to
set it to 'dirty'.
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When dm-bufio sets out to use the bio built into a struct dm_buffer to
issue an IO, it needs to call bio_reset after it's done with the bio
so that we can free things attached to the bio such as the integrity
payload. Therefore, inject our own endio callback to take care of
the bio_reset after calling submit_io's end_io callback.
Test case:
1. modprobe scsi_debug delay=0 dif=1 dix=199 ato=1 dev_size_mb=300
2. Set up a dm-bufio client, e.g. dm-verity, on the scsi_debug device
3. Repeatedly read metadata and watch kmalloc-192 leak!
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes kmemcheck warning in switch_names. The function
switch_names swaps inline names of two dentries. It swaps full arrays
d_iname, no matter how many bytes are really used by the strings. Reading
data beyond string ends results in kmemcheck warning.
We fix the bug by marking both arrays as fully initialized.
f2fs_write_begin() doesn't initialize the 'dn' variable if the inode has
inline data. However it uses its contents to decide whether it should
just zero out the page or load data to it. Thus if we are unlucky we can
zero out page contents instead of loading inline data into a page.
CC: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The code reads the default voltage selector from its register. If the
bootloader disables the regulator, the default voltage selector will be
0 which results in faulty behaviour of this regulator driver.
This patch sets a default voltage selector for vddpu if it is not set in
the register.
Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit f1d2736c8156 (mmc: dw_mmc: control card read threshold) added
dw_mci_ctrl_rd_thld() with an unconditional write to the CDTHRCTL
register at offset 0x100. However before version 240a, the FIFO region
started at 0x100, so the write messes with the FIFO and completely
breaks the driver.
If the version id < 240A, return early from dw_mci_ctl_rd_thld() so as
not to hit this problem.
Some boards with TC6393XB chip require full state restore during system
resume thanks to chip's VCC being cut off during suspend (Sharp SL-6000
tosa is one of them). Failing to do so would result in ohci Oops on
resume due to internal memory contentes being changed. Fail ohci suspend
on tc6393xb is full state restore is required.
Recommended workaround is to unbind tmio-ohci driver before suspend and
rebind it after resume.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit e7cd1d1eb16f ("mfd: twl4030-power: Add generic reset
configuration") accidentally removed the compatible flag for
"ti,twl4030-power" that should be there as documented in the
binding.
If "ti,twl4030-power" only the poweroff configuration is done
by the driver.
Fixes: e7cd1d1eb16f ("mfd: twl4030-power: Add generic reset configuration") Reported-by: "Dr. H. Nikolaus Schaller" <hns@goldelico.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If two threads call bitmap_unplug at the same time, then
one might schedule all the writes, and the other might
decide that it doesn't need to wait. But really it does.
It rarely hurts to wait when it isn't absolutely necessary,
and the current code doesn't really focus on 'absolutely necessary'
anyway. So just wait always.
This can potentially lead to data corruption if a crash happens
at an awkward time and data was written before the bitmap was
updated. It is very unlikely, but this should go to -stable
just to be safe. Appropriate for any -stable.
- Disables the F00F bug workaround warning. There is no F00F bug
workaround any more because Linux's standard IDT handling already
works around the F00F bug, but the warning still exists. This
is only cosmetic, and, in any event, there is no such thing as
KVM on a CPU with the F00F bug.
- Disables 32-bit APM BIOS detection. On a KVM paravirt system,
there should be no APM BIOS anyway.
- Disables tboot. I think that the tboot code should check the
CPUID hypervisor bit directly if it matters.
- paravirt_enabled disables espfix32. espfix32 should *not* be
disabled under KVM paravirt.
The last point is the purpose of this patch. It fixes a leak of the
high 16 bits of the kernel stack address on 32-bit KVM paravirt
guests. Fixes CVE-2014-8134.
Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Users have no business installing custom code segments into the
GDT, and segments that are not present but are otherwise valid
are a historical source of interesting attacks.
For completeness, block attempts to set the L bit. (Prior to
this patch, the L bit would have been silently dropped.)
This is an ABI break. I've checked glibc, musl, and Wine, and
none of them look like they'll have any trouble.
Note to stable maintainers: this is a hardening patch that fixes
no known bugs. Given the possibility of ABI issues, this
probably shouldn't be backported quickly.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rock Ridge extensions define so called Continuation Entries (CE) which
define where is further space with Rock Ridge data. Corrupted isofs
image can contain arbitrarily long chain of these, including a one
containing loop and thus causing kernel to end in an infinite loop when
traversing these entries.
Limit the traversal to 32 entries which should be more than enough space
to store all the Rock Ridge data.
Reported-by: P J P <ppandit@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In snd_usbmidi_error_timer(), the driver tries to resubmit MIDI input
URBs to reactivate the MIDI stream, but this causes the error when
some of URBs are still pending like:
Not just the userspace relocs, otherwise we won't wait
for a swapped out page tables to be swapped in again.
v2: rebased on Alex current drm-fixes-3.18
Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sorry for the confusion, this got applied twice, and reverted once, this
is the second revert and I hope to never touch it again...
Reported-by: Lv Zheng <lv.zheng@intel.com> Cc: Alexander Mezin <mezin.alexander@gmail.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For netlink, we shouldn't be using arch_fast_hash() as a hashing
discipline, but rather jhash() instead.
Since netlink sockets can be opened by any user, a local attacker
would be able to easily create collisions with the DPDK-derived
arch_fast_hash(), which trades off performance for security by
using crc32 CPU instructions on x86_64.
While it might have a legimite use case in other places, it should
be avoided in netlink context, though. As rhashtable's API is very
flexible, we could later on still decide on other hashing disciplines,
if legitimate.
Reference: http://thread.gmane.org/gmane.linux.kernel/1844123 Fixes: e341694e3eb5 ("netlink: Convert netlink_lookup() to use RCU protected hash table") Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 97a6d1bb2b658ac85ed88205ccd1ab809899884d (xen-netfront: Fix
handling packets on compound pages with skb_linearize) attempted to
fix a problem where an skb that would have required too many slots
would be dropped causing TCP connections to stall.
However, it filled in the first slot using the original buffer and not
the new one and would use the wrong offset and grant access to the
wrong page.
Netback would notice the malformed request and stop all traffic on the
VIF, reporting:
Reported-by: Anthony Wright <anthony@overnetdata.com> Tested-by: Anthony Wright <anthony@overnetdata.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To accomodate for enough headroom for tunnels, use MAX_HEADER instead
of LL_MAX_HEADER. Robert reported that he has hit after roughly 40hrs
of trinity an skb_under_panic() via SCTP output path (see reference).
I couldn't reproduce it from here, but not using MAX_HEADER as elsewhere
in other protocols might be one possible cause for this.
In any case, it looks like accounting on chunks themself seems to look
good as the skb already passed the SCTP output path and did not hit
any skb_over_panic(). Given tunneling was enabled in his .config, the
headroom would have been expanded by MAX_HEADER in this case.
Reported-by: Robert Święcki <robert@swiecki.net>
Reference: https://lkml.org/lkml/2014/12/1/507 Fixes: 594ccc14dfe4d ("[SCTP] Replace incorrect use of dev_alloc_skb with alloc_skb in sctp_packet_transmit().") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mvneta_tx() dereferences skb to get skb->len too late,
as hardware might have completed the transmit and TX completion
could have freed the skb from another cpu.
Fixes: 71f6d1b31fb1 ("net: mvneta: replace Tx timer with a real interrupt") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The mvneta driver sets the amount of Tx coalesce packets to 16 by
default. Normally that does not cause any trouble since the driver
uses a much larger Tx ring size (532 packets). But some sockets
might run with very small buffers, much smaller than the equivalent
of 16 packets. This is what ping is doing for example, by setting
SNDBUF to 324 bytes rounded up to 2kB by the kernel.
The problem is that there is no documented method to force a specific
packet to emit an interrupt (eg: the last of the ring) nor is it
possible to make the NIC emit an interrupt after a given delay.
In this case, it causes trouble, because when ping sends packets over
its raw socket, the few first packets leave the system, and the first
15 packets will be emitted without an IRQ being generated, so without
the skbs being freed. And since the socket's buffer is small, there's
no way to reach that amount of packets, and the ping ends up with
"send: no buffer available" after sending 6 packets. Running with 3
instances of ping in parallel is enough to hide the problem, because
with 6 packets per instance, that's 18 packets total, which is enough
to grant a Tx interrupt before all are sent.
The original driver in the LSP kernel worked around this design flaw
by using a software timer to clean up the Tx descriptors. This timer
was slow and caused terrible network performance on some Tx-bound
workloads (such as routing) but was enough to make tools like ping
work correctly.
Instead here, we simply set the packet counts before interrupt to 1.
This ensures that each packet sent will produce an interrupt. NAPI
takes care of coalescing interrupts since the interrupt is disabled
once generated.
No measurable performance impact nor CPU usage were observed on small
nor large packets, including when saturating the link on Tx, and this
fixes tools like ping which rely on too small a send buffer. If one
wants to increase this value for certain workloads where it is safe
to do so, "ethtool -C $dev tx-frames" will override this default
setting.
This fix needs to be applied to stable kernels starting with 3.10.
Tested-By: Maggie Mae Roxas <maggie.mae.roxas@gmail.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove optimize_div() from BPF_MOD | BPF_K case
since we don't know the dividend and fix the
emit_mod() by reading the mod operation result from HI register
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Reviewed-by: Markos Chandras <markos.chandras@imgtec.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Set the inner mac header to point to the GRE payload when
doing GRO. This is needed if we proceed to send the packet
through GRE GSO which now uses the inner mac header instead
of inner network header to determine the length of encapsulation
headers.
Fixes: 14051f0452a2 ("gre: Use inner mac length when computing tunnel length") Reported-by: Wolfgang Walter <linux@stwm.de> Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
rtnl_link_get_net() holds a reference on the 'struct net', we need to release
it in case of error.
CC: Eric W. Biederman <ebiederm@xmission.com> Fixes: b51642f6d77b ("net: Enable a userns root rtnl calls that are safe for unprivilged users") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes: 7f28fa10 ("bonding: add arp_ip_target netlink support") Reported-by: John Fastabend <john.fastabend@gmail.com> Cc: Scott Feldman <sfeldma@cumulusnetworks.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
TCP timestamping introduced MSG_ERRQUEUE handling for TCP sockets.
If the socket is of family AF_INET6, call ipv6_recv_error instead
of ip_recv_error.
This change is more complex than a single branch due to the loadable
ipv6 module. It reuses a pre-existing indirect function call from
ping. The ping code is safe to call, because it is part of the core
ipv6 module and always present when AF_INET6 sockets are active.
Fixes: 4ed2d765 (net-timestamp: TCP timestamping) Signed-off-by: Willem de Bruijn <willemb@google.com>
----
It may also be worthwhile to add WARN_ON_ONCE(sk->family == AF_INET6)
to ip_recv_error. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some VF drivers use the upper byte of "param1" (the qp count field)
in mlx4_qp_reserve_range() to pass flags which are used to optimize
the range allocation.
Under the current code, if any of these flags are set, the 32-bit
count field yields a count greater than 2^24, which is out of range,
and this VF fails.
As these flags represent a "best-effort" allocation hint anyway, they may
safely be ignored. Therefore, the PF driver may simply mask out the bits.
Fixes: c82e9aa0a8 "mlx4_core: resource tracking for HCA resources used by guests" Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If TX channels are set to 4 and RX channels are set to less than 4,
using ethtool -L, the driver will try to initialize more RX channels
than it has allocated, causing an oops.
This fix only initializes the RX ring if it has been allocated.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, when trying to reuse a socket, vxlan_sock_add will grab
vn->sock_lock, locate a reusable socket, inc refcount and release
vn->sock_lock.
But vxlan_sock_release() will first decrement refcount, and then grab
that lock. refcnt operations are atomic but as currently we have
deferred works which hold vs->refcnt each, this might happen, leading to
a use after free (specially after vxlan_igmp_leave):
CPU 1 CPU 2
deferred work vxlan_sock_add
... ...
spin_lock(&vn->sock_lock)
vs = vxlan_find_sock();
vxlan_sock_release
dec vs->refcnt, reaches 0
spin_lock(&vn->sock_lock)
vxlan_sock_hold(vs), refcnt=1
spin_unlock(&vn->sock_lock)
hlist_del_rcu(&vs->hlist);
vxlan_notify_del_rx_port(vs)
spin_unlock(&vn->sock_lock)
So when we look for a reusable socket, we check if it wasn't freed
already before reusing it.
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Fixes: 7c47cedf43a8b3 ("vxlan: move IGMP join/leave to work queue") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In "vxlan: Call udp_sock_create" there was a logic error that resulted in
the default for IPv6 VXLAN tunnels going from using checksums to not using
checksums. Since there is currently no support in iproute2 for setting
these values it means that a kernel after the change cannot talk over a IPv6
VXLAN tunnel to a kernel prior the change.
Fixes: 3ee64f3 ("vxlan: Call udp_sock_create") Cc: Tom Herbert <therbert@google.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When using GRE redirection in WCCP, it sets the wrong skb->protocol,
that is, ETH_P_IP instead of ETH_P_IPV6 for the encapuslated traffic.
Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Cc: Dmitry Kozlov <xeb@mail.ru> Signed-off-by: Yuri Chislov <yuri.chislov@gmail.com> Tested-by: Yuri Chislov <yuri.chislov@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now the vti_link_ops do not point the .dellink, for fb tunnel device
(ip_vti0), the net_device will be removed as the default .dellink is
unregister_netdevice_queue,but the tunnel still in the tunnel list,
then if we add a new vti tunnel, in ip_tunnel_find():
hlist_for_each_entry_rcu(t, head, hash_node) {
if (local == t->parms.iph.saddr &&
remote == t->parms.iph.daddr &&
link == t->parms.link &&
==> type == t->dev->type &&
ip_tunnel_key_match(&t->parms, flags, key))
break;
}
modprobe ip_vti
ip link del ip_vti0 type vti
ip link add ip_vti0 type vti
rmmod ip_vti
do that one or more times, kernel will panic.
fix it by assigning ip_tunnel_dellink to vti_link_ops' dellink, in
which we skip the unregister of fb tunnel device. do the same on ip6_vti.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Just like 0x1600 which got blacklisted by 66a7cbc303f4 ("ahci: disable
MSI instead of NCQ on Samsung pci-e SSDs on macbooks"), 0xa800 chokes
on NCQ commands if MSI is enabled. Disable MSI.
Setting a non-settable selection target caused BUG() to be called. The check
for valid selections only takes the selection target into account, but does
not tell whether it may be set, or only get. Fix the issue by simply
returning an error to the user.
commit e6023367d779 'x86, kaslr: Prevent .bss from overlaping initrd'
broke the cross compile of x86. It added a objdump invocation, which
invokes the host native objdump and ignores an active cross tool
chain.
Use $(OBJDUMP) instead which takes the CROSS_COMPILE prefix into
account.
[ tglx: Massage changelog and use $(OBJDUMP) ]
Fixes: e6023367d779 'x86, kaslr: Prevent .bss from overlaping initrd' Signed-off-by: Chris Clayton <chris2553@googlemail.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Borislav Petkov <bp@suse.de> Cc: Junjie Mao <eternal.n08@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/54705C8E.1080400@googlemail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Apparently PCH fifo underruns are tricky, we have plenty reports that
we see the occasional underrun (especially at boot-up).
So for a change let's see what happens when we don't re-enable pch
fifo underrun reporting when the pipe is disabled. This means that the
kernel can't catch pch fifo underruns when they happen (except when
all pipes are on on the pch). But we'll still catch underruns when
disabling the pipe again. So not a terrible reduction in test
coverage.
Since the DRM_ERROR is new and hence a regression plan B would be to
revert it back to a debug output. Which would be a lot worse than this
hack for underrun test coverage in the wild. See the referenced
discussions for more.
References: http://mid.gmane.org/CA+gsUGRfGe3t4NcjdeA=qXysrhLY3r4CEu7z4bjTwxi1uOfy+g@mail.gmail.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85898
References: https://bugs.freedesktop.org/show_bug.cgi?id=85898
References: https://bugs.freedesktop.org/show_bug.cgi?id=86233
References: https://bugs.freedesktop.org/show_bug.cgi?id=86478 Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Tested-by: lu hua <huax.lu@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
memblock_is_region_reserved() returns true in the case of a partial
overlap, meaning that the current code fails to reserve the
non-overlapping portion.
This call was introduced as part of d1552ce449eb "of/fdt: move
memreserve and dtb memory reservations into core" which went into
v3.16.
I observed this causing a Midway system with a buggy fdt (the header
declares itself to be larger than it really is) failing to boot
because the over-inflated size of the fdt was causing it to seem to
run into the swapper_pg_dir region, meaning the DT wasn't reserved.
The symptoms were failing to find an disks or network and failing to
boot.
However given the ambiguity of whether things like the initrd are
covered by /memreserve/ and similar I think it is best to also
register the region rather than just ignoring it.
Since memblock_reserve() handles overlaps just fine lets just warn and
carry on.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cadence I2C controller has bug wherein it generates invalid read transactions
after timeout in master receiver mode. This driver does not use the HW
timeout and this interrupt is disabled but the feature itself cannot be
disabled. Hence, this patch writes the maximum value (0xFF) to this register.
This is one of the workarounds to this bug and it will not avoid the issue
completely but reduces the chances of error.
According to I2C specification the NACK should be handled as follows:
"When SDA remains HIGH during this ninth clock pulse, this is defined as the Not
Acknowledge signal. The master can then generate either a STOP condition to
abort the transfer, or a repeated START condition to start a new transfer."
[I2C spec Rev. 6, 3.1.6: http://www.nxp.com/documents/user_manual/UM10204.pdf]
Currently the Davinci i2c driver interrupts the transfer on receipt of a
NACK but fails to send a STOP in some situations and so makes the bus
stuck until next I2C IP reset (idle/enable).
For example, the issue will happen during SMBus read transfer which
consists from two i2c messages write command/address and read data:
S Slave Address Wr A Command Code A Sr Slave Address Rd A D1..Dn A P
<--- write -----------------------> <--- read --------------------->
The I2C client device will send NACK if it can't recognize "Command Code"
and it's expected from I2C master to generate STP in this case.
But now, Davinci i2C driver will just exit with -EREMOTEIO and STP will
not be generated.
Hence, fix it by generating Stop condition (STP) always when NACK is received.
This patch fixes Davinci I2C in the same way it was done for OMAP I2C
commit cda2109a26eb ("i2c: omap: query STP always when NACK is received").
commit 6d9939f651419a63e091105663821f9c7d3fec37 (i2c: omap: split out [XR]DR
and [XR]RDY) changed the way how errata i207 (I2C: RDR Flag May Be Incorrectly
Set) get handled. 6d9939f6514 code doesn't correspond to workaround provided by
errata.
According to errata ISR must filter out spurious RDR before data read not after.
ISR must read RXSTAT to get number of bytes available to read. Because RDR
could be set while there could no data in the receive FIFO.
Found by code review. Real impact haven't seen.
Tested on Beagleboard XM C.
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com> Fixes: 6d9939f651419a63e09110 i2c: omap: split out [XR]DR and [XR]RDY Tested-by: Felipe Balbi <balbi@ti.com> Reviewed-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d7afc95946487945cc7f5019b41255b72224b70 (i2c: omap: ack IRQ in parts)
changed the interrupt handler to complete transfers without clearing
XRDY (AL case) and ARDY (NACK case) flags. XRDY or ARDY interrupts will be
fired again. As a result, ISR keep processing transfer after it was already
complete (from the driver code point of view).
A didn't see real impacts of the 1d7afc9, but it is really bad idea to
have ISR running on user data after transfer was complete.
It looks, what 1d7afc9 violate TI specs in what how AL and NACK should be
handled (see Note 1, sprugn4r, Figure 17-31 and Figure 17-32).
According to specs (if I understood correctly), in case of NACK and AL driver
must reset NACK, AL, ARDY, RDR, and RRDY (Master Receive Mode), and
NACK, AL, ARDY, and XDR (Master Transmitter Mode).
All that is done down the code under the if condition:
if (stat & (OMAP_I2C_STAT_ARDY | OMAP_I2C_STAT_NACK | OMAP_I2C_STAT_AL)) ...
The patch restore pre 1d7afc9 logic of handling NACK and AL interrupts, so
no interrupts is fired after ISR informs the rest of driver what transfer
complete.
Note: instead of removing break under NACK case, we could just replace 'break'
with 'continue' and allow NACK transfer to finish using ARDY event. I found
that NACK and ARDY bits usually set together. That case confirm TI wiki:
http://processors.wiki.ti.com/index.php/I2C_Tips#Detecting_and_handling_NACK
In order if someone interested in the event traces for NACK and AL cases,
I sent them to mailing list.
Tested on Beagleboard XM C.
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com> Fixes: 1d7afc9 i2c: omap: ack IRQ in parts Acked-by: Felipe Balbi <balbi@ti.com> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Indications are that no GF116's actually have a copy engine there, but
actually have the decompression engine. This engine can be made to do
copies, but that should be done separately.
Unclear why this didn't turn up on all GF116's, but perhaps the
non-mobile ones came with enough VRAM to not trigger ttm migrations in
test scenarios.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=85465 Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=59168 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
These BUGs can be erroneously triggered by frags which refer to
tail pages within a compound page. The data in these pages may
overrun the hardware page while still being contained within the
compound page, but since compound_order() evaluates to 0 for tail
pages the assertion fails. The code already iterates through
subsequent pages correctly in this scenario, so the BUGs are
unnecessary and can be removed.
Fixes: f36c374782e4 ("xen/netfront: handle compound page fragments on transmit") Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The bounds check for nodeid in ____cache_alloc_node gives false
positives on machines where the node IDs are not contiguous, leading to
a panic at boot time. For example, on a POWER8 machine the node IDs are
typically 0, 1, 16 and 17. This means that num_online_nodes() returns
4, so when ____cache_alloc_node is called with nodeid = 16 the VM_BUG_ON
triggers, like this:
To fix this, we instead compare the nodeid with MAX_NUMNODES, and
additionally make sure it isn't negative (since nodeid is an int). The
check is there mainly to protect the array dereference in the get_node()
call in the next line, and the array being dereferenced is of size
MAX_NUMNODES. If the nodeid is in range but invalid (for example if the
node is off-line), the BUG_ON in the next line will catch that.
Fixes: 14e50c6a9bc2 ("mm: slab: Verify the nodeid passed to ____cache_alloc_node") Signed-off-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Reviewed-by: Pekka Enberg <penberg@kernel.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andrew Morton noticed that the error return from anon_vma_clone() was
being dropped and replaced with -ENOMEM (which is not itself a bug
because the only error return value from anon_vma_clone() is -ENOMEM).
I did an audit of callers of anon_vma_clone() and discovered an actual
bug where the error return was being lost. In __split_vma(), between
Linux 3.11 and 3.12 the code was changed so the err variable is used
before the call to anon_vma_clone() and the default initial value of
-ENOMEM is overwritten. So a failure of anon_vma_clone() will return
success since err at this point is now zero.
Below is a patch which fixes this bug and also propagates the error
return value from anon_vma_clone() in all cases.
Fixes: ef0855d334e1 ("mm: mempolicy: turn vma_set_policy() into vma_dup_policy()") Signed-off-by: Daniel Forrest <dan.forrest@ssec.wisc.edu> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Tim Hartrick <tim@edgecast.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michel Lespinasse <walken@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
I've been seeing swapoff hangs in recent testing: it's cycling around
trying unsuccessfully to find an mm for some remaining pages of swap.
I have been exercising swap and page migration more heavily recently,
and now notice a long-standing error in copy_one_pte(): it's trying to
add dst_mm to swapoff's mmlist when it finds a swap entry, but is doing
so even when it's a migration entry or an hwpoison entry.
Which wouldn't matter much, except it adds dst_mm next to src_mm,
assuming src_mm is already on the mmlist: which may not be so. Then if
pages are later swapped out from dst_mm, swapoff won't be able to find
where to replace them.
There's already a !non_swap_entry() test for stats: move that up before
the swap_duplicate() and the addition to mmlist.
If a frontswap dup-store failed, it should invalidate the expired page
in the backend, or it could trigger some data corruption issue.
Such as:
1. use zswap as the frontswap backend with writeback feature
2. store a swap page(version_1) to entry A, success
3. dup-store a newer page(version_2) to the same entry A, fail
4. use __swap_writepage() write version_2 page to swapfile, success
5. zswap do shrink, writeback version_1 page to swapfile
6. version_2 page is overwrited by version_1, data corrupt.
This patch fixes this issue by invalidating expired data immediately
when meet a dup-store failure.
Signed-off-by: Weijie Yang <weijie.yang@samsung.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Seth Jennings <sjennings@variantweb.net> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Bob Liu <bob.liu@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Daniel Borkmann <dborkman@redhat.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After removal of the central spinlock nf_conntrack_lock, in
commit 93bb0ceb75be2 ("netfilter: conntrack: remove central
spinlock nf_conntrack_lock"), it is possible to race against
get_next_corpse().
The race is against the get_next_corpse() cleanup on
the "unconfirmed" list (a per-cpu list with seperate locking),
which set the DYING bit.
Fix this race, in __nf_conntrack_confirm(), by removing the CT
from unconfirmed list before checking the DYING bit. In case
race occured, re-add the CT to the dying list.
While at this, fix coding style of the comment that has been
updated.
Fixes: 93bb0ceb75be2 ("netfilter: conntrack: remove central spinlock nf_conntrack_lock") Reported-by: bill bonaparte <programme110@gmail.com> Signed-off-by: bill bonaparte <programme110@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On x86_64, kernel text mappings are mapped read-only with CONFIG_DEBUG_RODATA.
In that case, KVM will fail to patch VMCALL instructions to VMMCALL
as required on AMD processors.
The failure mode is currently a divide-by-zero exception, which obviously
is a KVM bug that has to be fixed. However, picking the right instruction
between VMCALL and VMMCALL will be faster and will help if you cannot upgrade
the hypervisor.
Reported-by: Chris Webb <chris@arachsys.com> Tested-by: Chris Webb <chris@arachsys.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Chris J Arges <chris.j.arges@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Turning vdd on/off can generate a long hpd pulse on eDP ports. In order
to handle hpd we would need to turn on vdd to perform aux transfers.
This would lead to an endless cycle of
"vdd off -> long hpd -> vdd on -> detect -> vdd off -> ..."
So ignore long hpd pulses on eDP ports. eDP panels should be physically
tied to the machine anyway so they should not actually disappear and
thus don't need long hpd handling. Short hpds are still needed for link
re-train and whatnot so we can't just turn off the hpd interrupt
entirely for eDP ports. Perhaps we could turn it off whenever the panel
is disabled, but just ignoring the long hpd seems sufficient.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Todd Previte <tprevite@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Older firmwares do not provide support for the HOT_SPOT_CMD command.
Check for the appropriate TLV flag that declares hotspot support in
the firmware to prevent a firmware assertion failure that can be
triggered from the userspace,
Some radeon ASICs don't support all 64 address bits of MSIs despite
advertising support for 64-bit MSIs in their configuration space.
This breaks on systems such as IBM POWER7/8, where 64-bit MSIs can
be assigned with some of the high address bits set.
This makes use of the newly introduced "no_64bit_msi" flag in structure
pci_dev to allow the MSI allocation code to fallback to 32-bit MSIs
on those adapters.
Adding Alex's review tag. Patch to the driver is identical to the
reviewed one, I dropped the arch/powerpc hunk rewrote the subject
and cset comment.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If ddc fails, presumably the i2c mux (and hopefully the signal
mux) are switched to the other GPU so don't fetch the edid from
the vbios so that the connector reports disconnected.
During a GPU reset we need to get pending page flip cleared out
since the ring contents are gone and flip will never complete
on its own. This used to work until the mmio vs. CS flip race
detection came about. That piece of code is looking for a
specific surface address in the SURFLIVE register, but as
a flip to that address may never happen the check may never
pass. So we should just skip the SURFLIVE and flip counter
checks when the GPU gets reset.
intel_display_handle_reset() tries to effectively complete
the flip anyway by calling .update_primary_plane(). But that
may not satisfy the conditions of the mmio vs. CS race
detection since there's no guarantee that a modeset didn't
sneak in between the GPU reset and intel_display_handle_reset().
Such a modeset will not wait for pending flips due to the ongoing GPU
reset, and then the primary plane updates performed by
intel_display_handle_reset() will already use the new surface
address, and thus the surface address the flip is waiting for
might never appear in SURFLIVE. The result is that the flip
will never complete and attempts to perform further page flips
will fail with -EBUSY.
During the GPU reset intel_crtc_has_pending_flip() will return
false regardless, so the deadlock with a modeset vs. the error
work acquiring crtc->mutex was avoided. And the reset_counter
check in intel_crtc_has_pending_flip() actually made this bug
even less severe since it allowed normal modesets to go through
even though there's a pending flip.
This is a regression introduced by me here:
commit 75f7f3ec600524c9544cc31695155f1a9ddbe1d9
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Tue Apr 15 21:41:34 2014 +0300
drm/i915: Fix mmio vs. CS flip race on ILK+
Testcase: igt/kms_flip/flip-vs-panning-vs-hang Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reported-by: Leo Wolf <jclw@ymail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79996 Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
[Jani: amended the commit message slightly.] Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Just use the acpi interface. That's what windows uses on this
generation and it's the only thing that seems to work reliably
on these generation parts.
You can still force the native backlight interface by setting
radeon.backlight=1
Commit 79c6ab509558 (clk: divider: add CLK_DIVIDER_READ_ONLY flag) in
v3.16 introduced the CLK_DIVIDER_READ_ONLY flag which caused the
recalc_rate() and round_rate() clock callbacks to be omitted.
However using this flag has the unfortunate side effect of causing the
clock recalculation code when a clock rate change is attempted to always
treat it as a pass-through clock, i.e. with a fixed divide of 1, which
may not be the case. Child clock rates are then recalculated using the
wrong parent rate.
Therefore instead of dropping the recalc_rate() and round_rate()
callbacks, alter clk_divider_bestdiv() to always report the current
divider as the best divider so that it is never altered.
For me the read only clock was the system clock, which divided the PLL
rate by 2, from which both the UART and the SPI clocks were divided.
Initial setting of the UART rate set it correctly, but when the SPI
clock was set, the other child clocks were miscalculated. The UART clock
was recalculated using the PLL rate as the parent rate, resulting in a
UART new_rate of double what it should be, and a UART which spewed forth
garbage when the rate changes were propagated.
Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Thomas Abraham <thomas.ab@samsung.com> Cc: Tomasz Figa <t.figa@samsung.com> Cc: Max Schwarz <max.schwarz@online.de> Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> Signed-off-by: Michael Turquette <mturquette@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>