git.ipfire.org Git - thirdparty/kernel/stable.git/log

wl18xx: show rx_frames_per_rates as an array as it really is

commit a3fa71c40f1853d0c27e8f5bc01a722a705d9682 upstream.

In struct wl18xx_acx_rx_rate_stat, rx_frames_per_rates field is an
array, not a number.  This means WL18XX_DEBUGFS_FWSTATS_FILE can't be
used to display this field in debugfs (it would display a pointer, not
the actual data).  Use WL18XX_DEBUGFS_FWSTATS_FILE_ARRAY instead.

This bug has been found by adding a __printf attribute to
wl1271_format_buffer.  gcc complained about "format '%u' expects
argument of type 'unsigned int', but argument 5 has type 'u32 *'".

Fixes: c5d94169e818 ("wl18xx: use new fw stats structures")
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

lib: memzero_explicit: use barrier instead of OPTIMIZER_HIDE_VAR

commit 0b053c9518292705736329a8fe20ef4686ffc8e9 upstream.

OPTIMIZER_HIDE_VAR(), as defined when using gcc, is insufficient to
ensure protection from dead store optimization.

For the random driver and crypto drivers, calls are emitted ...

  $ gdb vmlinux
  (gdb) disassemble memzero_explicit
  Dump of assembler code for function memzero_explicit:
    0xffffffff813a18b0 <+0>: push   %rbp
    0xffffffff813a18b1 <+1>: mov    %rsi,%rdx
    0xffffffff813a18b4 <+4>: xor    %esi,%esi
    0xffffffff813a18b6 <+6>: mov    %rsp,%rbp
    0xffffffff813a18b9 <+9>: callq  0xffffffff813a7120 <memset>
    0xffffffff813a18be <+14>: pop    %rbp
    0xffffffff813a18bf <+15>: retq
  End of assembler dump.

  (gdb) disassemble extract_entropy
  [...]
    0xffffffff814a5009 <+313>: mov    %r12,%rdi
    0xffffffff814a500c <+316>: mov    $0xa,%esi
    0xffffffff814a5011 <+321>: callq  0xffffffff813a18b0 <memzero_explicit>
    0xffffffff814a5016 <+326>: mov    -0x48(%rbp),%rax
  [...]

... but in case in future we might use facilities such as LTO, then
OPTIMIZER_HIDE_VAR() is not sufficient to protect gcc from a possible
eviction of the memset(). We have to use a compiler barrier instead.

Minimal test example when we assume memzero_explicit() would *not* be
a call, but would have been *inlined* instead:

  static inline void memzero_explicit(void *s, size_t count)
  {
    memset(s, 0, count);
    <foo>
  }

  int main(void)
  {
    char buff[20];

    snprintf(buff, sizeof(buff) - 1, "test");
    printf("%s", buff);

    memzero_explicit(buff, sizeof(buff));
    return 0;
  }

With <foo> := OPTIMIZER_HIDE_VAR():

  (gdb) disassemble main
  Dump of assembler code for function main:
  [...]
   0x0000000000400464 <+36>: callq  0x400410 <printf@plt>
   0x0000000000400469 <+41>: xor    %eax,%eax
   0x000000000040046b <+43>: add    $0x28,%rsp
   0x000000000040046f <+47>: retq
  End of assembler dump.

With <foo> := barrier():

  (gdb) disassemble main
  Dump of assembler code for function main:
  [...]
   0x0000000000400464 <+36>: callq  0x400410 <printf@plt>
   0x0000000000400469 <+41>: movq   $0x0,(%rsp)
   0x0000000000400471 <+49>: movq   $0x0,0x8(%rsp)
   0x000000000040047a <+58>: movl   $0x0,0x10(%rsp)
   0x0000000000400482 <+66>: xor    %eax,%eax
   0x0000000000400484 <+68>: add    $0x28,%rsp
   0x0000000000400488 <+72>: retq
  End of assembler dump.

As can be seen, movq, movq, movl are being emitted inlined
via memset().

Reference: http://thread.gmane.org/gmane.linux.kernel.cryptoapi/13764/
Fixes: d4c5efdb9777 ("random: add and use memzero_explicit() for clearing data")
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: mancha security <mancha1@zoho.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

e1000: add dummy allocator to fix race condition between mtu change and netpoll

commit 08e8331654d1d7b2c58045e549005bc356aa7810 upstream.

There is a race condition between e1000_change_mtu's cleanups and
netpoll, when we change the MTU across jumbo size:

Changing MTU frees all the rx buffers:
    e1000_change_mtu -> e1000_down -> e1000_clean_all_rx_rings ->
        e1000_clean_rx_ring

Then, close to the end of e1000_change_mtu:
    pr_info -> ... -> netpoll_poll_dev -> e1000_clean ->
        e1000_clean_rx_irq -> e1000_alloc_rx_buffers -> e1000_alloc_frag

And when we come back to do the rest of the MTU change:
    e1000_up -> e1000_configure -> e1000_configure_rx ->
        e1000_alloc_jumbo_rx_buffers

alloc_jumbo finds the buffers already != NULL, since data (shared with
page in e1000_rx_buffer->rxbuf) has been re-alloc'd, but it's garbage,
or at least not what is expected when in jumbo state.

This results in an unusable adapter (packets don't get through), and a
NULL pointer dereference on the next call to e1000_clean_rx_ring
(other mtu change, link down, shutdown):

BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff81194d6e>] put_compound_page+0x7e/0x330

    [...]

Call Trace:
[<ffffffff81195445>] put_page+0x55/0x60
[<ffffffff815d9f44>] e1000_clean_rx_ring+0x134/0x200
[<ffffffff815da055>] e1000_clean_all_rx_rings+0x45/0x60
[<ffffffff815df5e0>] e1000_down+0x1c0/0x1d0
[<ffffffff811e2260>] ? deactivate_slab+0x7f0/0x840
[<ffffffff815e21bc>] e1000_change_mtu+0xdc/0x170
[<ffffffff81647050>] dev_set_mtu+0xa0/0x140
[<ffffffff81664218>] do_setlink+0x218/0xac0
[<ffffffff814459e9>] ? nla_parse+0xb9/0x120
[<ffffffff816652d0>] rtnl_newlink+0x6d0/0x890
[<ffffffff8104f000>] ? kvm_clock_read+0x20/0x40
[<ffffffff810a2068>] ? sched_clock_cpu+0xa8/0x100
[<ffffffff81663802>] rtnetlink_rcv_msg+0x92/0x260

By setting the allocator to a dummy version, netpoll can't mess up our
rx buffers.  The allocator is set back to a sane value in
e1000_configure_rx.

Fixes: edbbb3ca1077 ("e1000: implement jumbo receive with partial descriptors")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ksoftirqd: Enable IRQs and call cond_resched() before poking RCU

commit 28423ad283d5348793b0c45cc9b1af058e776fd6 upstream.

While debugging an issue with excessive softirq usage, I encountered the
following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
infrastructure"):

[ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]

...but despite this note, the patch still calls RCU with IRQs disabled.

This seemingly innocuous change caused a significant regression in softirq
CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
introducing 0.01% packet loss, the softirq usage would jump to around 25%,
spiking as high as 50%. Before the change, the usage would never exceed 5%.

Moving the call to rcu_note_context_switch() after the cond_sched() call,
as it was originally before the hotplug patch, completely eliminated this
problem.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

RCU pathwalk breakage when running into a symlink overmounting something

commit 3cab989afd8d8d1bc3d99fef0e7ed87c31e7b647 upstream.

Calling unlazy_walk() in walk_component() and do_last() when we find
a symlink that needs to be followed doesn't acquire a reference to vfsmount.
That's fine when the symlink is on the same vfsmount as the parent directory
(which is almost always the case), but it's not always true - one _can_
manage to bind a symlink on top of something. And in such cases we end up
with excessive mntput().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

drm/i915: cope with large i2c transfers

commit 9535c4757b881e06fae72a857485ad57c422b8d2 upstream.

The hardware, according to the specs, is limited to 256 byte transfers,
and current driver has no protections in case users attempt to do larger
transfers. The code will just stomp over status register and mayhem
ensues.

Let's split larger transfers into digestable chunks. Doing this allows
Atmel MXT driver on Pixel 1 function properly (it hasn't since commit
9d8dc3e529a19e427fd379118acd132520935c5d "Input: atmel_mxt_ts -
implement T44 message handling" which tries to consume multiple
touchscreen/touchpad reports in a single transaction).

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

drm/radeon: fix doublescan modes (v2)

commit fd99a0943ffaa0320ea4f69d09ed188f950c0432 upstream.

Use the correct flags for atom.

v2: handle DRM_MODE_FLAG_DBLCLK

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

i2c: core: Export bus recovery functions

commit c1c21f4e60ed4523292f1a89ff45a208bddd3849 upstream.

Current -next fails to link an ARM allmodconfig because drivers that use
the core recovery functions can be built as modules but those functions
are not exported:

ERROR: "i2c_generic_gpio_recovery" [drivers/i2c/busses/i2c-davinci.ko] undefined!
ERROR: "i2c_generic_scl_recovery" [drivers/i2c/busses/i2c-davinci.ko] undefined!
ERROR: "i2c_recover_bus" [drivers/i2c/busses/i2c-davinci.ko] undefined!

Add exports to fix this.

Fixes: 5f9296ba21b3c (i2c: Add bus recovery infrastructure)
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

IB/mlx4: Fix WQE LSO segment calculation

commit ca9b590caa17bcbbea119594992666e96cde9c2f upstream.

The current code decreases from the mss size (which is the gso_size
from the kernel skb) the size of the packet headers.

It shouldn't do that because the mss that comes from the stack
(e.g IPoIB) includes only the tcp payload without the headers.

The result is indication to the HW that each packet that the HW sends
is smaller than what it could be, and too many packets will be sent
for big messages.

An easy way to demonstrate one more aspect of the problem is by
configuring the ipoib mtu to be less than 2*hlen (2*56) and then
run app sending big TCP messages. This will tell the HW to send packets
with giant (negative value which under unsigned arithmetics becomes
a huge positive one) length and the QP moves to SQE state.

Fixes: b832be1e4007 ('IB/mlx4: Add IPoIB LSO support')
Reported-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

IB/core: don't disallow registering region starting at 0x0

commit 66578b0b2f69659f00b6169e6fe7377c4b100d18 upstream.

In a call to ib_umem_get(), if address is 0x0 and size is
already page aligned, check added in commit 8494057ab5e4
("IB/uverbs: Prevent integer overflow in ib_umem_get address
arithmetic") will refuse to register a memory region that
could otherwise be valid (provided vm.mmap_min_addr sysctl
and mmap_low_allowed SELinux knobs allow userspace to map
something at address 0x0).

This patch allows back such registration: ib_umem_get()
should probably don't care of the base address provided it
can be pinned with get_user_pages().

There's two possible overflows, in (addr + size) and in
PAGE_ALIGN(addr + size), this patch keep ensuring none
of them happen while allowing to pin memory at address
0x0. Anyway, the case of size equal 0 is no more (partially)
handled as 0-length memory region are disallowed by an
earlier check.

Link: http://mid.gmane.org/cover.1428929103.git.ydroneaud@opteya.com
Cc: Shachar Raindel <raindel@mellanox.com>
Cc: Jack Morgenstein <jackm@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

IB/core: disallow registering 0-sized memory region

commit 8abaae62f3fdead8f4ce0ab46b4ab93dee39bab2 upstream.

If ib_umem_get() is called with a size equal to 0 and an
non-page aligned address, one page will be pinned and a
0-sized umem will be returned to the caller.

This should not be allowed: it's not expected for a memory
region to have a size equal to 0.

This patch adds a check to explicitly refuse to register
a 0-sized region.

Link: http://mid.gmane.org/cover.1428929103.git.ydroneaud@opteya.com
Cc: Shachar Raindel <raindel@mellanox.com>
Cc: Jack Morgenstein <jackm@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

stk1160: Make sure current buffer is released

commit aeff09276748b66072f2db2e668cec955cf41959 upstream.

The available (i.e. not used) buffers are returned by stk1160_clear_queue(),
on the stop_streaming() path. However, this is insufficient and the current
buffer must be released as well. Fix it.

Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

mvsas: fix panic on expander attached SATA devices

commit 56cbd0ccc1b508de19561211d7ab9e1c77e6b384 upstream.

mvsas is giving a General protection fault when it encounters an expander
attached ATA device. Analysis of mvs_task_prep_ata() shows that the driver is
assuming all ATA devices are locally attached and obtaining the phy mask by
indexing the local phy table (in the HBA structure) with the phy id. Since
expanders have many more phys than the HBA, this is causing the index into the
HBA phy table to overflow and returning rubbish as the pointer.

mvs_task_prep_ssp() instead does the phy mask using the port properties.
Mirror this in mvs_task_prep_ata() to fix the panic.

Reported-by: Adam Talbot <ajtalbot1@gmail.com>
Tested-by: Adam Talbot <ajtalbot1@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

Drivers: hv: vmbus: Fix a bug in the error path in vmbus_open()

commit 40384e4bbeb9f2651fe9bffc0062d9f31ef625bf upstream.

Correctly rollback state if the failure occurs after we have handed over
the ownership of the buffer to the host.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

gpio: mvebu: Fix mask/unmask managment per irq chip type

commit 61819549f572edd7fce53f228c0d8420cdc85f71 upstream.

Level IRQ handlers and edge IRQ handler are managed by tow different
sets of registers. But currently the driver uses the same mask for the
both registers. It lead to issues with the following scenario:

First, an IRQ is requested on a GPIO to be triggered on front. After,
this an other IRQ is requested for a GPIO of the same bank but
triggered on level. Then the first one will be also setup to be
triggered on level. It leads to an interrupt storm.

The different kind of handler are already associated with two
different irq chip type. With this patch the driver uses a private
mask for each one which solves this issue.

It has been tested on an Armada XP based board and on an Armada 375
board. For the both boards, with this patch is applied, there is no
such interrupt storm when running the previous scenario.

This bug was already fixed but in a different way in the legacy
version of this driver by Evgeniy Dushistov:
9ece8839b1277fb9128ff6833411614ab6c88d68 "ARM: orion: Fix for certain
sequence of request_irq can cause irq storm". The fact the new version
of the gpio drive could be affected had been discussed there:
http://thread.gmane.org/gmane.linux.ports.arm.kernel/344670/focus=364012

Reported-by: Evgeniy A. Dushistov <dushistov@mail.ru>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

xtensa: ISS: fix locking in TAP network adapter

commit 24e94454c8cb6a13634f5a2f5a01da53a546a58d upstream.

- don't lock lp->lock in the iss_net_timer for the call of iss_net_poll,
  it will lock it itself;
- invert order of lp->lock and opened_lock acquisition in the
  iss_net_open to make it consistent with iss_net_poll;
- replace spin_lock with spin_lock_bh when acquiring locks used in
  iss_net_timer from non-atomic context;
- replace spin_lock_irqsave with spin_lock_bh in the iss_net_start_xmit
  as the driver doesn't use lp->lock in the hard IRQ context;
- replace __SPIN_LOCK_UNLOCKED(lp.lock) with spin_lock_init, otherwise
  lockdep is unhappy about using non-static key.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

xtensa: provide __NR_sync_file_range2 instead of __NR_sync_file_range

commit 01e84c70fe40c8111f960987bcf7f931842e6d07 upstream.

xtensa actually uses sync_file_range2 implementation, so it should
define __NR_sync_file_range2 as other architectures that use that
function. That fixes userspace interface (that apparently never worked)
and avoids special-casing xtensa in libc implementations.
See the thread ending at
http://lists.busybox.net/pipermail/uclibc/2015-February/048833.html
for more details.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

xtensa: xtfpga: fix hardware lockup caused by LCD driver

commit 4949009eb8d40a441dcddcd96e101e77d31cf1b2 upstream.

LCD driver is always built for the XTFPGA platform, but its base address
is not configurable, and is wrong for ML605/KC705. Its initialization
locks up KC705 board hardware.

Make the whole driver optional, and its base address and bus width
configurable. Implement 4-bit bus access method.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ACPICA: Utilities: split IO address types from data type models.

commit 2b8760100e1de69b6ff004c986328a82947db4ad upstream.

ACPICA commit aacf863cfffd46338e268b7415f7435cae93b451

It is reported that on a physically 64-bit addressed machine, 32-bit kernel
can trigger crashes in accessing the memory regions that are beyond the
32-bit boundary. The region field's start address should still be 32-bit
compliant, but after a calculation (adding some offsets), it may exceed the
32-bit boundary. This case is rare and buggy, but there are real BIOSes
leaked with such issues (see References below).

This patch fixes this gap by always defining IO addresses as 64-bit, and
allows OSPMs to optimize it for a real 32-bit machine to reduce the size of
the internal objects.

Internal acpi_physical_address usages in the structures that can be fixed
by this change include:
1. struct acpi_object_region:
    acpi_physical_address address;
2. struct acpi_address_range:
    acpi_physical_address start_address;
    acpi_physical_address end_address;
3. struct acpi_mem_space_context;
    acpi_physical_address address;
4. struct acpi_table_desc
    acpi_physical_address address;
See known issues 1 for other usages.

Note that acpi_io_address which is used for ACPI_PROCESSOR may also suffer
from same problem, so this patch changes it accordingly.

For iasl, it will enforce acpi_physical_address as 32-bit to generate
32-bit OSPM compatible tables on 32-bit platforms, we need to define
ACPI_32BIT_PHYSICAL_ADDRESS for it in acenv.h.

Known issues:
1. Cleanup of mapped virtual address
   In struct acpi_mem_space_context, acpi_physical_address is used as a virtual
   address:
    acpi_physical_address                   mapped_physical_address;
   It is better to introduce acpi_virtual_address or use acpi_size instead.
   This patch doesn't make such a change. Because this should be done along
   with a change to acpi_os_map_memory()/acpi_os_unmap_memory().
   There should be no functional problem to leave this unchanged except
   that only this structure is enlarged unexpectedly.

Link: https://github.com/acpica/acpica/commit/aacf863c
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=87971
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=79501
Reported-and-tested-by: Paul Menzel <paulepanter@users.sourceforge.net>
Reported-and-tested-by: Sial Nije <sialnije@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

powerpc: Fix missing L2 cache size in /sys/devices/system/cpu

commit f7e9e358362557c3aa2c1ec47490f29fe880a09e upstream.

This problem appears to have been introduced in 2.6.29 by commit
93197a36a9c1 "Rewrite sysfs processor cache info code".

This caused lscpu to error out on at least e500v2 devices, eg:

error: cannot open /sys/devices/system/cpu/cpu0/cache/index2/size: No such file or directory

Some embedded powerpc systems use cache-size in DTS for the unified L2
cache size, not d-cache-size, so we need to allow for both DTS names.
Added a new CACHE_TYPE_UNIFIED_D cache_type_info structure to handle
this.

Fixes: 93197a36a9c1 ("powerpc: Rewrite sysfs processor cache info code")
Signed-off-by: Dave Olson <olson@cumulusnetworks.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

Bluetooth: ath3k: Add support Atheros AR5B195 combo Mini PCIe card

commit 2eeff0b4317a02f0e281df891d990194f0737aae upstream.

Add 04f2:aff1 to ath3k.c supported devices list and btusb.c blacklist, so
that the device can load the ath3k firmware and re-enumerate itself as an
AR3011 device.

T:  Bus=05 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=12   MxCh= 0
D:  Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=04f2 ProdID=aff1 Rev= 0.01
C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms

Signed-off-by: Alexander Ploumistos <alexpl@fedoraproject.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

target: Fix COMPARE_AND_WRITE with SG_TO_MEM_NOALLOC handling

commit c8e639852ad720499912acedfd6b072325fd2807 upstream.

This patch fixes a bug for COMPARE_AND_WRITE handling with
fabrics using SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC.

It adds the missing allocation for cmd->t_bidi_data_sg within
transport_generic_new_cmd() that is used by COMPARE_AND_WRITE
for the initial READ payload, even if the fabric is already
providing a pre-allocated buffer for cmd->t_data_sg.

Also, fix zero-length COMPARE_AND_WRITE handling within the
compare_and_write_callback() and target_complete_ok_work()
to queue the response, skipping the initial READ.

This fixes COMPARE_AND_WRITE emulation with loopback, vhost,
and xen-backend fabric drivers using SG_TO_MEM_NOALLOC.

Reported-by: Christoph Hellwig <hch@lst.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

scsi: storvsc: Fix a bug in copy_from_bounce_buffer()

commit 8de580742fee8bc34d116f57a20b22b9a5f08403 upstream.

We may exit this function without properly freeing up the maapings
we may have acquired. Fix the bug.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

UBI: fix check for "too many bytes"

commit 299d0c5b27346a77a0777c993372bf8777d4f2e5 upstream.

The comparison from the previous line seems to have been erroneously
(partially) copied-and-pasted onto the next. The second line should be
checking req.bytes, not req.lnum.

Coverity CID #139400

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
[rw: Fixed comparison]
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

UBI: initialize LEB number variable

commit f16db8071ce18819fbd705ddcc91c6f392fb61f8 upstream.

In some of the 'out_not_moved' error paths, lnum may be used
uninitialized. Don't ignore the warning; let's fix it.

This uninitialized variable doesn't have much visible effect in the end,
since we just schedule the PEB for erasure, and its LEB number doesn't
really matter (it just gets printed in debug messages). But let's get it
straight anyway.

Coverity CID #113449

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

UBI: fix out of bounds write

commit d74adbdb9abf0d2506a6c4afa534d894f28b763f upstream.

If aeb->len >= vol->reserved_pebs, we should not be writing aeb into the
PEB->LEB mapping.

Caught by Coverity, CID #711212.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

UBI: account for bitflips in both the VID header and data

commit 8eef7d70f7c6772c3490f410ee2bceab3b543fa1 upstream.

We are completely discarding the earlier value of 'bitflips', which
could reflect a bitflip found in ubi_io_read_vid_hdr(). Let's use the
bitwise OR of header and data 'bitflip' statuses instead.

Coverity CID #1226856

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

tools/power turbostat: Use $(CURDIR) instead of $(PWD) and add support for O= option in Makefile

commit f82263c6989c31ae9b94cecddffb29dcbec38710 upstream.

Since commit ee0778a30153
("tools/power: turbostat: make Makefile a bit more capable")
turbostat's Makefile is using

  [...]
  BUILD_OUTPUT    := $(PWD)
  [...]

which obviously causes trouble when building "turbostat" with

  make -C /usr/src/linux/tools/power/x86/turbostat ARCH=x86 turbostat

because GNU make does not update nor guarantee that $PWD is set.

This patch changes the Makefile to use $CURDIR instead, which GNU make
guarantees to set and update (i.e. when using "make -C ...") and also
adds support for the O= option (see "make help" in your root of your
kernel source tree for more details).

Link: https://bugs.gentoo.org/show_bug.cgi?id=533918
Fixes: ee0778a30153 ("tools/power: turbostat: make Makefile a bit more capable")
Signed-off-by: Thomas D. <whissi@whissi.de>
Cc: Mark Asselstine <mark.asselstine@windriver.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

tools lib traceevent kbuffer: Remove extra update to data pointer in PADDING

commit c5e691928bf166ac03430e957038b60adba3cf6c upstream.

When a event PADDING is hit (a deleted event that is still in the ring
buffer), translate_data() sets the length of the padding and also updates
the data pointer which is passed back to the caller.

This is unneeded because the caller also updates the data pointer with
the passed back length. translate_data() should not update the pointer,
only set the length.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20150324135923.461431960@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

powerpc/perf: Cap 64bit userspace backtraces to PERF_MAX_STACK_DEPTH

commit 9a5cbce421a283e6aea3c4007f141735bf9da8c3 upstream.

We cap 32bit userspace backtraces to PERF_MAX_STACK_DEPTH
(currently 127), but we forgot to do the same for 64bit backtraces.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ext4: make fsync to sync parent dir in no-journal for real this time

commit e12fb97222fc41e8442896934f76d39ef99b590a upstream.

Previously commit 14ece1028b3ed53ffec1b1213ffc6acaf79ad77c added a
support for for syncing parent directory of newly created inodes to
make sure that the inode is not lost after a power failure in
no-journal mode.

However this does not work in majority of cases, namely:
- if the directory has inline data
- if the directory is already indexed
- if the directory already has at least one block and:
- the new entry fits into it
- or we've successfully converted it to indexed

So in those cases we might lose the inode entirely even after fsync in
the no-journal mode. This also includes ext2 default mode obviously.

I've noticed this while running xfstest generic/321 and even though the
test should fail (we need to run fsck after a crash in no-journal mode)
I could not find a newly created entries even when if it was fsynced
before.

Fix this by adjusting the ext4_add_entry() successful exit paths to set
the inode EXT4_STATE_NEWENTRY so that fsync has the chance to fsync the
parent directory as well.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Frank Mayhar <fmayhar@google.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm64: vdso: fix build error when switching from LE to BE

commit 1915e2ad1cf548217c963121e4076b3d44dd0169 upstream.

Building a kernel with CPU_BIG_ENDIAN fails if there are stale objects
from a !CPU_BIG_ENDIAN build. Due to a missing FORCE prerequisite on an
if_changed rule in the VDSO Makefile, we attempt to link a stale LE
object into the new BE kernel.

According to Documentation/kbuild/makefiles.txt, FORCE is required for
if_changed rules and forgetting it is a common mistake, so fix it by
'Forcing' the build of vdso. This patch fixes build errors like these:

arch/arm64/kernel/vdso/note.o: compiled for a little endian system and target is big endian
failed to merge target specific data of file arch/arm64/kernel/vdso/note.o

arch/arm64/kernel/vdso/sigreturn.o: compiled for a little endian system and target is big endian
failed to merge target specific data of file arch/arm64/kernel/vdso/sigreturn.o

Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Arun Chandran <achandran@mvista.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

rtlwifi: rtl8192cu: Add new device ID

commit 9374e7d2fdcad3c36dafc8d3effd554bc702c4b6 upstream.

Add new ID for ASUS N10 WiFi dongle.

Signed-off-by: Marek Vasut <marex@denx.de>
Tested-by: Marek Vasut <marex@denx.de>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Cc: John W. Linville <linville@tuxdriver.com>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

rtlwifi: rtl8192cu: Add new USB ID

commit 2f92b314f4daff2117847ac5343c54d3d041bf78 upstream.

USB ID 2001:330d is used for a D-Link DWA-131.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ptrace: fix race between ptrace_resume() and wait_task_stopped()

commit b72c186999e689cb0b055ab1c7b3cd8fffbeb5ed upstream.

ptrace_resume() is called when the tracee is still __TASK_TRACED. We set
tracee->exit_code and then wake_up_state() changes tracee->state. If the
tracer's sub-thread does wait() in between, task_stopped_code(ptrace => T)
wrongly looks like another report from tracee.

This confuses debugger, and since wait_task_stopped() clears ->exit_code
the tracee can miss a signal.

Test-case:

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <pthread.h>
#include <assert.h>

int pid;

void *waiter(void *arg)
{
int stat;

for (;;) {
assert(pid == wait(&stat));
assert(WIFSTOPPED(stat));
if (WSTOPSIG(stat) == SIGHUP)
continue;

assert(WSTOPSIG(stat) == SIGCONT);
printf("ERR! extra/wrong report:%x\n", stat);
}
}

int main(void)
{
pthread_t thread;

pid = fork();
if (!pid) {
assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
for (;;)
kill(getpid(), SIGHUP);
}

assert(pthread_create(&thread, NULL, waiter, NULL) == 0);

for (;;)
ptrace(PTRACE_CONT, pid, 0, SIGCONT);

return 0;
}

Note for stable: the bug is very old, but without 9899d11f6544 "ptrace:
ensure arch_ptrace/ptrace_request can never race with SIGKILL" the fix
should use lock_task_sighand(child).

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Pavel Labath <labath@google.com>
Tested-by: Pavel Labath <labath@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

fs/binfmt_elf.c: fix bug in loading of PIE binaries

commit a87938b2e246b81b4fb713edb371a9fa3c5c3c86 upstream.

With CONFIG_ARCH_BINFMT_ELF_RANDOMIZE_PIE enabled, and a normal top-down
address allocation strategy, load_elf_binary() will attempt to map a PIE
binary into an address range immediately below mm->mmap_base.

Unfortunately, load_elf_ binary() does not take account of the need to
allocate sufficient space for the entire binary which means that, while
the first PT_LOAD segment is mapped below mm->mmap_base, the subsequent
PT_LOAD segment(s) end up being mapped above mm->mmap_base into the are
that is supposed to be the "gap" between the stack and the binary.

Since the size of the "gap" on x86_64 is only guaranteed to be 128MB this
means that binaries with large data segments > 128MB can end up mapping
part of their data segment over their stack resulting in corruption of the
stack (and the data segment once the binary starts to run).

Any PIE binary with a data segment > 128MB is vulnerable to this although
address randomization means that the actual gap between the stack and the
end of the binary is normally greater than 128MB. The larger the data
segment of the binary the higher the probability of failure.

Fix this by calculating the total size of the binary in the same way as
load_elf_interp().

Signed-off-by: Michael Davidson <md@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

Input: elantech - fix absolute mode setting on some ASUS laptops

commit bd884149aca61de269fd9bad83fe2a4232ffab21 upstream.

On ASUS TP500LN and X750JN, the touchpad absolute mode is reset each
time set_rate is done.

In order to fix this, we will verify the firmware version, and if it
matches the one in those laptops, the set_rate function is overloaded
with a function elantech_set_rate_restore_reg_07 that performs the
set_rate with the original function, followed by a restore of reg_07
(the register that sets the absolute mode on elantech v4 hardware).

Also the ASUS TP500LN and X750JN firmware version, capabilities, and
button constellation is added to elantech.c

Reported-and-tested-by: George Moutsopoulos <gmoutso@yahoo.co.uk>
Signed-off-by: Ulrik De Bie <ulrik.debie-os@e2big.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ALSA: emu10k1: don't deadlock in proc-functions

commit 91bf0c2dcb935a87e5c0795f5047456b965fd143 upstream.

The functions snd_emu10k1_proc_spdif_read and snd_emu1010_fpga_read
acquire the emu_lock before accessing the FPGA. The function used
to access the FPGA (snd_emu1010_fpga_read) also tries to take
the emu_lock which causes a deadlock.
Remove the outer locking in the proc-functions (guarding only the
already safe fpga read) to prevent this deadlock.

[removed superfluous flags variables too -- tiwai]

Signed-off-by: Michael Gernoth <michael@gernoth.net>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: core: hub: use new USB_RESUME_TIMEOUT

commit bbc78c07a51f6fd29c227b1220a9016e585358ba upstream.

Make sure we're using the new macro, so our
resume signaling will always pass certification.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: dwc2: hcd: use new USB_RESUME_TIMEOUT

commit 74bd7b69801819707713b88e9d0bc074efa2f5e7 upstream.

Make sure we're using the new macro, so our
resume signaling will always pass certification.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: host: sl811: use new USB_RESUME_TIMEOUT

commit 08debfb13b199716da6153940c31968c556b195d upstream.

Make sure we're using the new macro, so our
resume signaling will always pass certification.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: host: xhci: use new USB_RESUME_TIMEOUT

commit b9e451885deb6262dbaf5cd14aa77d192d9ac759 upstream.

Make sure we're using the new macro, so our
resume signaling will always pass certification.

Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: host: isp116x: use new USB_RESUME_TIMEOUT

commit 8c0ae6574ccfd3d619876a65829aad74c9d22ba5 upstream.

Make sure we're using the new macro, so our
resume signaling will always pass certification.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: host: r8a66597: use new USB_RESUME_TIMEOUT

commit 7a606ac29752a3e571b83f9b3fceb1eaa1d37781 upstream.

While this driver was already using a 50ms resume
timeout, let's make sure everybody uses the same
macro so it's easy to fix later should anything
go wrong.

It also gives a more "stable" expectation to Linux
users.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: host: fotg210: use new USB_RESUME_TIMEOUT

commit 7e136bb71a08e8b8be3bc492f041d9b0bea3856d upstream.

Make sure we're using the new macro, so our
resume signaling will always pass certification.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: host: uhci: use new USB_RESUME_TIMEOUT

commit b8fb6f79f76f478acbbffccc966daa878f172a0a upstream.

Make sure we're using the new macro, so our
resume signaling will always pass certification.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: host: fusbh200: use new USB_RESUME_TIMEOUT

commit 595227db1f2d98bfc33f02a55842f268e12b247d upstream.

Make sure we're using the new macro, so our
resume signaling will always pass certification.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: define a generic USB_RESUME_TIMEOUT macro

commit 62f0342de1f012f3e90607d39e20fce811391169 upstream.

Every USB Host controller should use this new
macro to define for how long resume signalling
should be driven on the bus.

Currently, almost every single USB controller
is using a 20ms timeout for resume signalling.

That's problematic for two reasons:

a) sometimes that 20ms timer expires a little
before 20ms, which makes us fail certification

b) some (many) devices actually need more than
20ms resume signalling.

Sure, in case of (b) we can state that the device
is against the USB spec, but the fact is that
we have no control over which device the certification
lab will use. We also have no control over which host
they will use. Most likely they'll be using a Windows
PC which, again, we have no control over how that
USB stack is written and how long resume signalling
they are using.

At the end of the day, we must make sure Linux passes
electrical compliance when working as Host or as Device
and currently we don't pass compliance as host because
we're driving resume signallig for exactly 20ms and
that confuses certification test setup resulting in
Certification failure.

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Peter Chen <peter.chen@freescale.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: phy: Find the right match in devm_usb_phy_match

commit 869aee0f31429fa9d94d5aef539602b73ae0cf4b upstream.

The res parameter passed to devm_usb_phy_match() is the location where the
pointer to the usb_phy is stored, hence it needs to be dereferenced before
comparing to the match data in order to find the correct match.

Fixes: 410219dcd2ba ("usb: otg: utils: devres: Add API's to associate a device with the phy")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ARM: dts: dove: Fix uart[23] reg property

commit a74cd13b807029397f7232449df929bac11fb228 upstream.

Fix Dove's register addresses of uart2 and uart3 nodes that seem to
be broken since ages due to a copy-and-paste error.

Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ARM: S3C64XX: Use fixed IRQ bases to avoid conflicts on Cragganmore

commit 4e330ae4ab2915444f1e6dca1358a910aa259362 upstream.

There are two PMICs on Cragganmore, currently one dynamically assign
its IRQ base and the other uses a fixed base. It is possible for the
statically assigned PMIC to fail if its IRQ is taken by the dynamically
assigned one. Fix this by statically assigning both the IRQ bases.

Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ARM: 8320/1: fix integer overflow in ELF_ET_DYN_BASE

commit 8defb3367fcd19d1af64c07792aade0747b54e0f upstream.

Usually ELF_ET_DYN_BASE is 2/3 of TASK_SIZE. With 3G/1G user/kernel
split this is not so, because 2*TASK_SIZE overflows 32 bits,
so the actual value of ELF_ET_DYN_BASE is:
(2 * TASK_SIZE / 3) = 0x2a000000

When ASLR is disabled PIE binaries will load at ELF_ET_DYN_BASE address.
On 32bit platforms AddressSanitzer uses addresses [0x20000000 - 0x40000000]
for shadow memory [1]. So ASan doesn't work for PIE binaries when ASLR disabled
as it fails to map shadow memory.
Also after Kees's 'split ET_DYN ASLR from mmap ASLR' patchset PIE binaries
has a high chance of loading somewhere in between [0x2a000000 - 0x40000000]
even if ASLR enabled. This makes ASan with PIE absolutely incompatible.

Fix overflow by dividing TASK_SIZE prior to multiplying.
After this patch ELF_ET_DYN_BASE equals to (for CONFIG_VMSPLIT_3G=y):
(TASK_SIZE / 3 * 2) = 0x7f555554

[1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm#Mapping

Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Reported-by: Maria Guseva <m.guseva@samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

power_supply: lp8788-charger: Fix leaked power supply on probe fail

commit a7117f81e8391e035c49b3440792f7e6cea28173 upstream.

Driver forgot to unregister charger power supply if registering of
battery supply failed in probe(). In such case the memory associated
with power supply leaked.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Fixes: 98a276649358 ("power_supply: Add new lp8788 charger driver")
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

power_supply: twl4030_madc: Check return value of power_supply_register

commit 68c3ed6fa7e0d69529ced772d650ab128916a81d upstream.

The return value of power_supply_register() call was not checked and
even on error probe() function returned 0. If registering failed then
during unbind the driver tried to unregister power supply which was not
actually registered.

This could lead to memory corruption because power_supply_unregister()
unconditionally cleans up given power supply.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Fixes: da0a00ebc239 ("power: Add twl4030_madc battery driver.")
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ring-buffer: Replace this_cpu_*() with __this_cpu_*()

commit 80a9b64e2c156b6523e7a01f2ba6e5d86e722814 upstream.

It has come to my attention that this_cpu_read/write are horrible on
architectures other than x86. Worse yet, they actually disable
preemption or interrupts! This caused some unexpected tracing results
on ARM.

101.356868: preempt_count_add <-ring_buffer_lock_reserve
101.356870: preempt_count_sub <-ring_buffer_lock_reserve

The ring_buffer_lock_reserve has recursion protection that requires
accessing a per cpu variable. But since preempt_disable() is traced, it
too got traced while accessing the variable that is suppose to prevent
recursion like this.

The generic version of this_cpu_read() and write() are:

#define this_cpu_generic_read(pcp) \
({ typeof(pcp) ret__; \
preempt_disable(); \
ret__ = *this_cpu_ptr(&(pcp)); \
preempt_enable(); \
ret__; \
})

#define this_cpu_generic_to_op(pcp, val, op) \
do { \
unsigned long flags; \
raw_local_irq_save(flags); \
*__this_cpu_ptr(&(pcp)) op val; \
raw_local_irq_restore(flags); \
} while (0)

Which is unacceptable for locations that know they are within preempt
disabled or interrupt disabled locations.

Paul McKenney stated that __this_cpu_() versions produce much better code on
other architectures than this_cpu_() does, if we know that the call is done in
a preempt disabled location.

I also changed the recursive_unlock() to use two local variables instead
of accessing the per_cpu variable twice.

Link: http://lkml.kernel.org/r/20150317114411.GE3589@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/20150317104038.312e73d1@gandalf.local.home
Acked-by: Christoph Lameter <cl@linux.com>
Reported-by: Uwe Kleine-Koenig <u.kleine-koenig@pengutronix.de>
Tested-by: Uwe Kleine-Koenig <u.kleine-koenig@pengutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

compal-laptop: Check return value of power_supply_register

commit 1915a718b1872edffcb13e5436a9f7302d3d36f0 upstream.

The return value of power_supply_register() call was not checked and
even on error probe() function returned 0. If registering failed then
during unbind the driver tried to unregister power supply which was not
actually registered.

This could lead to memory corruption because power_supply_unregister()
unconditionally cleans up given power supply.

Fix this by checking return status of power_supply_register() call. In
case of failure, clean up sysfs entries and fail the probe.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Fixes: 9be0fcb5ed46 ("compal-laptop: add JHL90, battery & hwmon interface")
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz> [backport to 3.12]

spi: spidev: fix possible arithmetic overflow for multi-transfer message

commit f20fbaad7620af2df36a1f9d1c9ecf48ead5b747 upstream.

`spidev_message()` sums the lengths of the individual SPI transfers to
determine the overall SPI message length.  It restricts the total
length, returning an error if too long, but it does not check for
arithmetic overflow.  For example, if the SPI message consisted of two
transfers and the first has a length of 10 and the second has a length
of (__u32)(-1), the total length would be seen as 9, even though the
second transfer is actually very long.  If the second transfer specifies
a null `rx_buf` and a non-null `tx_buf`, the `copy_from_user()` could
overrun the spidev's pre-allocated tx buffer before it reaches an
invalid user memory address.  Fix it by checking that neither the total
nor the individual transfer lengths exceed the maximum allowed value.

Thanks to Dan Carpenter for reporting the potential integer overflow.

Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

cdc-wdm: fix endianness bug in debug statements

commit 323ece54e0761198946ecd0c2091f1d2bfdfcb64 upstream.

Values directly from descriptors given in debug statements
must be converted to native endianness.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

MIPS: Hibernate: flush TLB entries earlier

commit a843d00d038b11267279e3b5388222320f9ddc1d upstream.

We found that TLB mismatch not only happens after kernel resume, but
also happens during snapshot restore. So move it to the beginning of
swsusp_arch_suspend().

Signed-off-by: Huacai Chen <chenhc@lemote.com>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/9621/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

KVM: use slowpath for cross page cached accesses

commit ca3f0874723fad81d0c701b63ae3a17a408d5f25 upstream.

kvm_write_guest_cached() does not mark all written pages as dirty and
code comments in kvm_gfn_to_hva_cache_init() talk about NULL memslot
with cross page accesses. Fix all the easy way.

The check is '<= 1' to have the same result for 'len = 0' cache anywhere
in the page. (nr_pages_needed is 0 on page boundary.)

Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Message-Id: <20150408121648.GA3519@potion.brq.redhat.com>
Reviewed-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

s390/hibernate: fix save and restore of kernel text section

commit d74419495633493c9cd3f2bbeb7f3529d0edded6 upstream.

Sebastian reported a crash caused by a jump label mismatch after resume.
This happens because we do not save the kernel text section during suspend
and therefore also do not restore it during resume, but use the kernel image
that restores the old system.

This means that after a suspend/resume cycle we lost all modifications done
to the kernel text section.
The reason for this is the pfn_is_nosave() function, which incorrectly
returns that read-only pages don't need to be saved. This is incorrect since
we mark the kernel text section read-only.
We still need to make sure to not save and restore pages contained within
NSS and DCSS segment.
To fix this add an extra case for the kernel text section and only save
those pages if they are not contained within an NSS segment.

Fixes the following crash (and the above bugs as well):

Jump label code mismatch at netif_receive_skb_internal+0x28/0xd0
Found:    c0 04 00 00 00 00
Expected: c0 f4 00 00 00 11
New:      c0 04 00 00 00 00
Kernel panic - not syncing: Corrupted kernel text
CPU: 0 PID: 9 Comm: migration/0 Not tainted 3.19.0-01975-gb1b096e70f23 #4
Call Trace:
  [<0000000000113972>] show_stack+0x72/0xf0
  [<000000000081f15e>] dump_stack+0x6e/0x90
  [<000000000081c4e8>] panic+0x108/0x2b0
  [<000000000081be64>] jump_label_bug.isra.2+0x104/0x108
  [<0000000000112176>] __jump_label_transform+0x9e/0xd0
  [<00000000001121e6>] __sm_arch_jump_label_transform+0x3e/0x50
  [<00000000001d1136>] multi_cpu_stop+0x12e/0x170
  [<00000000001d1472>] cpu_stopper_thread+0xb2/0x168
  [<000000000015d2ac>] smpboot_thread_fn+0x134/0x1b0
  [<0000000000158baa>] kthread+0x10a/0x110
  [<0000000000824a86>] kernel_thread_starter+0x6/0xc

Reported-and-tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

KVM: s390: Zero out current VMDB of STSI before including level3 data.

commit b75f4c9afac2604feb971441116c07a24ecca1ec upstream.

s390 documentation requires words 0 and 10-15 to be reserved and stored as
zeros. As we fill out all other fields, we can memset the full structure.

Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

nosave: consolidate __nosave_{begin,end} in <asm/sections.h>

commit 7f8998c7aef3ac9c5f3f2943e083dfa6302e90d0 upstream.

The different architectures used their own (and different) declarations:

    extern __visible const void __nosave_begin, __nosave_end;
    extern const void __nosave_begin, __nosave_end;
    extern long __nosave_begin, __nosave_end;

Consolidate them using the first variant in <asm/sections.h>.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
[js -- port to 3.12: arm does not have hibernation yet]

usb: gadget: composite: enable BESL support

commit a6615937bcd9234e6d6bb817c3701fce44d0a84d upstream.

According to USB 2.0 ECN Errata for Link Power
Management (USB2-LPM-Errata-final.pdf), BESL
must be enabled if LPM is enabled.

This helps with USB30CV TD 9.21 LPM L1
Suspend Resume Test.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

drivers: parport: Kconfig: exclude arm64 for PARPORT_PC

Fix build problem seen with arm64:allmodconfig.

drivers/parport/parport_pc.c:67:25: fatal error: asm/parport.h: No such file or
directory

arm64 does not support PARPORT_PC.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

mm/hugetlb: use pmd_page() in follow_huge_pmd()

commit 97534127012f0e396eddea4691f4c9b170aed74b upstream.

Commit 61f77eda9bbf ("mm/hugetlb: reduce arch dependent code around
follow_huge_*") broke follow_huge_pmd() on s390, where pmd and pte
layout differ and using pte_page() on a huge pmd will return wrong
results. Using pmd_page() instead fixes this.

All architectures that were touched by that commit have pmd_page()
defined, so this should not break anything on other architectures.

Fixes: 61f77eda "mm/hugetlb: reduce arch dependent code around follow_huge_*"
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@suse.cz>, Andrea Arcangeli <aarcange@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

sched/idle/x86: Restore mwait_idle() to fix boot hangs, to improve power savings and to improve performance

commit b253149b843f89cd300cbdbea27ce1f847506f99 upstream.

In Linux-3.9 we removed the mwait_idle() loop:

  69fb3676df33 ("x86 idle: remove mwait_idle() and "idle=mwait" cmdline param")

The reasoning was that modern machines should be sufficiently
happy during the boot process using the default_idle() HALT
loop, until cpuidle loads and either acpi_idle or intel_idle
invoke the newer MWAIT-with-hints idle loop.

But two machines reported problems:

1. Certain Core2-era machines support MWAIT-C1 and HALT only.
    MWAIT-C1 is preferred for optimal power and performance.
    But if they support just C1, cpuidle never loads and
    so they use the boot-time default idle loop forever.

2. Some laptops will boot-hang if HALT is used,
    but will boot successfully if MWAIT is used.
    This appears to be a hidden assumption in BIOS SMI,
    that is presumably valid on the proprietary OS
    where the BIOS was validated.

       https://bugzilla.kernel.org/show_bug.cgi?id=60770

So here we effectively revert the patch above, restoring
the mwait_idle() loop.  However, we don't bother restoring
the idle=mwait cmdline parameter, since it appears to add
no value.

Maintainer notes:

  For 3.9, simply revert 69fb3676df
  for 3.10, patch -F3 applies, fuzz needed due to __cpuinit use in
  context For 3.11, 3.12, 3.13, this patch applies cleanly

Tested-by: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Mike Galbraith <bitbucket@online.de>
Cc: <stable@vger.kernel.org> # 3.9+
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ian Malone <ibmalone@gmail.com>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/345254a551eb5a6a866e048d7ab570fd2193aca4.1389763084.git.len.brown@intel.com
[ Ported to recent kernels. ]
[ Mike: 3.10 backport ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

Btrfs: fix inode eviction infinite loop after extent_same ioctl

commit 113e8283869b9855c8b999796aadd506bbac155f upstream.

If we pass a length of 0 to the extent_same ioctl, we end up locking an
extent range with a start offset greater then its end offset (if the
destination file's offset is greater than zero). This results in a warning
from extent_io.c:insert_state through the following call chain:

  btrfs_extent_same()
    btrfs_double_lock()
      lock_extent_range()
        lock_extent(inode->io_tree, offset, offset + len - 1)
          lock_extent_bits()
            __set_extent_bit()
              insert_state()
                --> WARN_ON(end < start)

This leads to an infinite loop when evicting the inode. This is the same
problem that my previous patch titled
"Btrfs: fix inode eviction infinite loop after cloning into it" addressed
but for the extent_same ioctl instead of the clone ioctl.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

Btrfs: fix inode eviction infinite loop after cloning into it

commit ccccf3d67294714af2d72a6fd6fd7d73b01c9329 upstream.

If we attempt to clone a 0 length region into a file we can end up
inserting a range in the inode's extent_io tree with a start offset
that is greater then the end offset, which triggers immediately the
following warning:

[ 3914.619057] WARNING: CPU: 17 PID: 4199 at fs/btrfs/extent_io.c:435 insert_state+0x4b/0x10b [btrfs]()
[ 3914.620886] BTRFS: end < start 4095 4096
(...)
[ 3914.638093] Call Trace:
[ 3914.638636]  [<ffffffff81425fd9>] dump_stack+0x4c/0x65
[ 3914.639620]  [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
[ 3914.640789]  [<ffffffffa03ca44f>] ? insert_state+0x4b/0x10b [btrfs]
[ 3914.642041]  [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48
[ 3914.643236]  [<ffffffffa03ca44f>] insert_state+0x4b/0x10b [btrfs]
[ 3914.644441]  [<ffffffffa03ca729>] __set_extent_bit+0x107/0x3f4 [btrfs]
[ 3914.645711]  [<ffffffffa03cb256>] lock_extent_bits+0x65/0x1bf [btrfs]
[ 3914.646914]  [<ffffffff8142b2fb>] ? _raw_spin_unlock+0x28/0x33
[ 3914.648058]  [<ffffffffa03cbac4>] ? test_range_bit+0xcc/0xde [btrfs]
[ 3914.650105]  [<ffffffffa03cb3c3>] lock_extent+0x13/0x15 [btrfs]
[ 3914.651361]  [<ffffffffa03db39e>] lock_extent_range+0x3d/0xcd [btrfs]
[ 3914.652761]  [<ffffffffa03de1fe>] btrfs_ioctl_clone+0x278/0x388 [btrfs]
[ 3914.654128]  [<ffffffff811226dd>] ? might_fault+0x58/0xb5
[ 3914.655320]  [<ffffffffa03e0909>] btrfs_ioctl+0xb51/0x2195 [btrfs]
(...)
[ 3914.669271] ---[ end trace 14843d3e2e622fc1 ]---

This later makes the inode eviction handler enter an infinite loop that
keeps dumping the following warning over and over:

[ 3915.117629] WARNING: CPU: 22 PID: 4228 at fs/btrfs/extent_io.c:435 insert_state+0x4b/0x10b [btrfs]()
[ 3915.119913] BTRFS: end < start 4095 4096
(...)
[ 3915.137394] Call Trace:
[ 3915.137913]  [<ffffffff81425fd9>] dump_stack+0x4c/0x65
[ 3915.139154]  [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
[ 3915.140316]  [<ffffffffa03ca44f>] ? insert_state+0x4b/0x10b [btrfs]
[ 3915.141505]  [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48
[ 3915.142709]  [<ffffffffa03ca44f>] insert_state+0x4b/0x10b [btrfs]
[ 3915.143849]  [<ffffffffa03ca729>] __set_extent_bit+0x107/0x3f4 [btrfs]
[ 3915.145120]  [<ffffffffa038c1e3>] ? btrfs_kill_super+0x17/0x23 [btrfs]
[ 3915.146352]  [<ffffffff811548f6>] ? deactivate_locked_super+0x3b/0x50
[ 3915.147565]  [<ffffffffa03cb256>] lock_extent_bits+0x65/0x1bf [btrfs]
[ 3915.148785]  [<ffffffff8142b7e2>] ? _raw_write_unlock+0x28/0x33
[ 3915.149931]  [<ffffffffa03bc325>] btrfs_evict_inode+0x196/0x482 [btrfs]
[ 3915.151154]  [<ffffffff81168904>] evict+0xa0/0x148
[ 3915.152094]  [<ffffffff811689e5>] dispose_list+0x39/0x43
[ 3915.153081]  [<ffffffff81169564>] evict_inodes+0xdc/0xeb
[ 3915.154062]  [<ffffffff81154418>] generic_shutdown_super+0x49/0xef
[ 3915.155193]  [<ffffffff811546d1>] kill_anon_super+0x13/0x1e
[ 3915.156274]  [<ffffffffa038c1e3>] btrfs_kill_super+0x17/0x23 [btrfs]
(...)
[ 3915.167404] ---[ end trace 14843d3e2e622fc2 ]---

So just bail out of the clone ioctl if the length of the region to clone
is zero, without locking any extent range, in order to prevent this issue
(same behaviour as a pwrite with a 0 length for example).

This is trivial to reproduce. For example, the steps for the test I just
made for fstests:

  mkfs.btrfs -f SCRATCH_DEV
  mount SCRATCH_DEV $SCRATCH_MNT

  touch $SCRATCH_MNT/foo
  touch $SCRATCH_MNT/bar

  $CLONER_PROG -s 0 -d 4096 -l 0 $SCRATCH_MNT/foo $SCRATCH_MNT/bar
  umount $SCRATCH_MNT

A test case for fstests follows soon.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

btrfs: don't accept bare namespace as a valid xattr

commit 3c3b04d10ff1811a27f86684ccd2f5ba6983211d upstream.

Due to insufficient check in btrfs_is_valid_xattr, this unexpectedly
works:

$ touch file
$ setfattr -n user. -v 1 file
$ getfattr -d file
user.="1"

ie. the missing attribute name after the namespace.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=94291
Reported-by: William Douglas <william.douglas@intel.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

Btrfs: fix log tree corruption when fs mounted with -o discard

commit dcc82f4783ad91d4ab654f89f37ae9291cdc846a upstream.

While committing a transaction we free the log roots before we write the
new super block. Freeing the log roots implies marking the disk location
of every node/leaf (metadata extent) as pinned before the new super block
is written. This is to prevent the disk location of log metadata extents
from being reused before the new super block is written, otherwise we
would have a corrupted log tree if before the new super block is written
a crash/reboot happens and the location of any log tree metadata extent
ended up being reused and rewritten.

Even though we pinned the log tree's metadata extents, we were issuing a
discard against them if the fs was mounted with the -o discard option,
resulting in corruption of the log tree if a crash/reboot happened before
writing the new super block - the next time the fs was mounted, during
the log replay process we would find nodes/leafs of the log btree with
a content full of zeroes, causing the process to fail and require the
use of the tool btrfs-zero-log to wipeout the log tree (and all data
previously fsynced becoming lost forever).

Fix this by not doing a discard when pinning an extent. The discard will
be done later when it's safe (after the new super block is committed) at
extent-tree.c:btrfs_finish_extent_commit().

Fixes: e688b7252f78 (Btrfs: fix extent pinning bugs in the tree log)
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

net: fix crash in build_skb()

[ Upstream commit 2ea2f62c8bda242433809c7f4e9eae1c52c40bbe ]

When I added pfmemalloc support in build_skb(), I forgot netlink
was using build_skb() with a vmalloc() area.

In this patch I introduce __build_skb() for netlink use,
and build_skb() is a wrapper handling both skb->head_frag and
skb->pfmemalloc

This means netlink no longer has to hack skb->head_frag

[ 1567.700067] kernel BUG at arch/x86/mm/physaddr.c:26!
[ 1567.700067] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
[ 1567.700067] Dumping ftrace buffer:
[ 1567.700067]    (ftrace buffer empty)
[ 1567.700067] Modules linked in:
[ 1567.700067] CPU: 9 PID: 16186 Comm: trinity-c182 Not tainted 4.0.0-next-20150424-sasha-00037-g4796e21 #2167
[ 1567.700067] task: ffff880127efb000 ti: ffff880246770000 task.ti: ffff880246770000
[ 1567.700067] RIP: __phys_addr (arch/x86/mm/physaddr.c:26 (discriminator 3))
[ 1567.700067] RSP: 0018:ffff8802467779d8  EFLAGS: 00010202
[ 1567.700067] RAX: 000041000ed8e000 RBX: ffffc9008ed8e000 RCX: 000000000000002c
[ 1567.700067] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffffb3fd6049
[ 1567.700067] RBP: ffff8802467779f8 R08: 0000000000000019 R09: ffff8801d0168000
[ 1567.700067] R10: ffff8801d01680c7 R11: ffffed003a02d019 R12: ffffc9000ed8e000
[ 1567.700067] R13: 0000000000000f40 R14: 0000000000001180 R15: ffffc9000ed8e000
[ 1567.700067] FS:  00007f2a7da3f700(0000) GS:ffff8801d1000000(0000) knlGS:0000000000000000
[ 1567.700067] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1567.700067] CR2: 0000000000738308 CR3: 000000022e329000 CR4: 00000000000007e0
[ 1567.700067] Stack:
[ 1567.700067]  ffffc9000ed8e000 ffff8801d0168000 ffffc9000ed8e000 ffff8801d0168000
[ 1567.700067]  ffff880246777a28 ffffffffad7c0a21 0000000000001080 ffff880246777c08
[ 1567.700067]  ffff88060d302e68 ffff880246777b58 ffff880246777b88 ffffffffad9a6821
[ 1567.700067] Call Trace:
[ 1567.700067] build_skb (include/linux/mm.h:508 net/core/skbuff.c:316)
[ 1567.700067] netlink_sendmsg (net/netlink/af_netlink.c:1633 net/netlink/af_netlink.c:2329)
[ 1567.774369] ? sched_clock_cpu (kernel/sched/clock.c:311)
[ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273)
[ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273)
[ 1567.774369] sock_sendmsg (net/socket.c:614 net/socket.c:623)
[ 1567.774369] sock_write_iter (net/socket.c:823)
[ 1567.774369] ? sock_sendmsg (net/socket.c:806)
[ 1567.774369] __vfs_write (fs/read_write.c:479 fs/read_write.c:491)
[ 1567.774369] ? get_lock_stats (kernel/locking/lockdep.c:249)
[ 1567.774369] ? default_llseek (fs/read_write.c:487)
[ 1567.774369] ? vtime_account_user (kernel/sched/cputime.c:701)
[ 1567.774369] ? rw_verify_area (fs/read_write.c:406 (discriminator 4))
[ 1567.774369] vfs_write (fs/read_write.c:539)
[ 1567.774369] SyS_write (fs/read_write.c:586 fs/read_write.c:577)
[ 1567.774369] ? SyS_read (fs/read_write.c:577)
[ 1567.774369] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[ 1567.774369] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2594 kernel/locking/lockdep.c:2636)
[ 1567.774369] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42)
[ 1567.774369] system_call_fastpath (arch/x86/kernel/entry_64.S:261)

Fixes: 79930f5892e ("net: do not deplete pfmemalloc reserve")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

net: do not deplete pfmemalloc reserve

[ Upstream commit 79930f5892e134c6da1254389577fffb8bd72c66 ]

build_skb() should look at the page pfmemalloc status.
If set, this means page allocator allocated this page in the
expectation it would help to free other pages. Networking
stack can do that only if skb->pfmemalloc is also set.

Also, we must refrain using high order pages from the pfmemalloc
reserve, so __page_frag_refill() must also use __GFP_NOMEMALLOC for
them. Under memory pressure, using order-0 pages is probably the best
strategy.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

tcp: avoid looping in tcp_send_fin()

[ Upstream commit 845704a535e9b3c76448f52af1b70e4422ea03fd ]

Presence of an unbound loop in tcp_send_fin() had always been hard
to explain when analyzing crash dumps involving gigantic dying processes
with millions of sockets.

Lets try a different strategy :

In case of memory pressure, try to add the FIN flag to last packet
in write queue, even if packet was already sent. TCP stack will
be able to deliver this FIN after a timeout event. Note that this
FIN being delivered by a retransmit, it also carries a Push flag
given our current implementation.

By checking sk_under_memory_pressure(), we anticipate that cooking
many FIN packets might deplete tcp memory.

In the case we could not allocate a packet, even with __GFP_WAIT
allocation, then not sending a FIN seems quite reasonable if it allows
to get rid of this socket, free memory, and not block the process from
eventually doing other useful work.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

tcp: fix possible deadlock in tcp_send_fin()

[ Upstream commit d83769a580f1132ac26439f50068a29b02be535e ]

Using sk_stream_alloc_skb() in tcp_send_fin() is dangerous in
case a huge process is killed by OOM, and tcp_mem[2] is hit.

To be able to free memory we need to make progress, so this
patch allows FIN packets to not care about tcp_mem[2], if
skb allocation succeeded.

In a follow-up patch, we might abort tcp_send_fin() infinite loop
in case TIF_MEMDIE is set on this thread, as memory allocator
did its best getting extra memory already.

This patch reverts d22e15371811 ("tcp: fix tcp fin memory accounting")

Fixes: d22e15371811 ("tcp: fix tcp fin memory accounting")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ip_forward: Drop frames with attached skb->sk

[ Upstream commit 2ab957492d13bb819400ac29ae55911d50a82a13 ]

Initial discussion was:
[FYI] xfrm: Don't lookup sk_policy for timewait sockets

Forwarded frames should not have a socket attached. Especially
tw sockets will lead to panics later-on in the stack.

This was observed with TPROXY assigning a tw socket and broken
policy routing (misconfigured). As a result frame enters
forwarding path instead of input. We cannot solve this in
TPROXY as it cannot know that policy routing is broken.

v2:
Remove useless comment

Signed-off-by: Sebastian Poehn <sebastian.poehn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

Linux 3.12.42

arm/arm64: KVM: Keep elrsr/aisr in sync with software model

commit ae705930fca6322600690df9dc1c7d0516145a93 upstream.

There is an interesting bug in the vgic code, which manifests itself
when the KVM run loop has a signal pending or needs a vmid generation
rollover after having disabled interrupts but before actually switching
to the guest.

In this case, we flush the vgic as usual, but we sync back the vgic
state and exit to userspace before entering the guest. The consequence
is that we will be syncing the list registers back to the software model
using the GICH_ELRSR and GICH_EISR from the last execution of the guest,
potentially overwriting a list register containing an interrupt.

This showed up during migration testing where we would capture a state
where the VM has masked the arch timer but there were no interrupts,
resulting in a hung test.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Alex Bennee <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm64: KVM: Do not use pgd_index to index stage-2 pgd

commit 04b8dc85bf4a64517e3cf20e409eeaa503b15cc1 upstream.

The kernel's pgd_index macro is designed to index a normal, page
sized array. KVM is a bit diffferent, as we can use concatenated
pages to have a bigger address space (for example 40bit IPA with
4kB pages gives us an 8kB PGD.

In the above case, the use of pgd_index will always return an index
inside the first 4kB, which makes a guest that has memory above
0x8000000000 rather unhappy, as it spins forever in a page fault,
whist the host happilly corrupts the lower pgd.

The obvious fix is to get our own kvm_pgd_index that does the right
thing(tm).

Tested on X-Gene with a hacked kvmtool that put memory at a stupidly
high address.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm64: KVM: Fix HCR setting for 32bit guests

commit 801f6772cecea6cfc7da61aa197716ab64db5f9e upstream.

Commit b856a59141b1 (arm/arm64: KVM: Reset the HCR on each vcpu
when resetting the vcpu) moved the init of the HCR register to
happen later in the init of a vcpu, but left out the fixup
done in kvm_reset_vcpu when preparing for a 32bit guest.

As a result, the 32bit guest is run as a 64bit guest, but the
rest of the kernel still manages it as a 32bit. Fun follows.

Moving the fixup to vcpu_reset_hcr solves the problem for good.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm64: KVM: Fix TLB invalidation by IPA/VMID

commit 55e858b75808347378e5117c3c2339f46cc03575 upstream.

It took about two years for someone to notice that the IPA passed
to TLBI IPAS2E1IS must be shifted by 12 bits. Clearly our reviewing
is not as good as it should be...

Paper bag time for me.

Reported-by: Mario Smarduch <m.smarduch@samsung.com>
Tested-by: Mario Smarduch <m.smarduch@samsung.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm/arm64: KVM: Require in-kernel vgic for the arch timers

commit 05971120fca43e0357789a14b3386bb56eef2201 upstream.

It is curently possible to run a VM with architected timers support
without creating an in-kernel VGIC, which will result in interrupts from
the virtual timer going nowhere.

To address this issue, move the architected timers initialization to the
time when we run a VCPU for the first time, and then only initialize
(and enable) the architected timers if we have a properly created and
initialized in-kernel VGIC.

When injecting interrupts from the virtual timer to the vgic, the
current setup should ensure that this never calls an on-demand init of
the VGIC, which is the only call path that could return an error from
kvm_vgic_inject_irq(), so capture the return value and raise a warning
if there's an error there.

We also change the kvm_timer_init() function from returning an int to be
a void function, since the function always succeeds.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm/arm64: KVM: Don't allow creating VCPUs after vgic_initialized

commit 716139df2517fbc3f2306dbe8eba0fa88dca0189 upstream.

When the vgic initializes its internal state it does so based on the
number of VCPUs available at the time. If we allow KVM to create more
VCPUs after the VGIC has been initialized, we are likely to error out in
unfortunate ways later, perform buffer overflows etc.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm/arm64: KVM: Introduce stage2_unmap_vm

commit 957db105c99792ae8ef61ffc9ae77d910f6471da upstream.

Introduce a new function to unmap user RAM regions in the stage2 page
tables. This is needed on reboot (or when the guest turns off the MMU)
to ensure we fault in pages again and make the dcache, RAM, and icache
coherent.

Using unmap_stage2_range for the whole guest physical range does not
work, because that unmaps IO regions (such as the GIC) which will not be
recreated or in the best case faulted in on a page-by-page basis.

Call this function on secondary and subsequent calls to the
KVM_ARM_VCPU_INIT ioctl so that a reset VCPU will detect the guest
Stage-1 MMU is off when faulting in pages and make the caches coherent.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm/arm64: KVM: Reset the HCR on each vcpu when resetting the vcpu

commit b856a59141b1066d3c896a0d0231f84dabd040af upstream.

When userspace resets the vcpu using KVM_ARM_VCPU_INIT, we should also
reset the HCR, because we now modify the HCR dynamically to
enable/disable trapping of guest accesses to the VM registers.

This is crucial for reboot of VMs working since otherwise we will not be
doing the necessary cache maintenance operations when faulting in pages
with the guest MMU off.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm/arm64: KVM: Correct KVM_ARM_VCPU_INIT power off option

commit 3ad8b3de526a76fbe9466b366059e4958957b88f upstream.

The implementation of KVM_ARM_VCPU_INIT is currently not doing what
userspace expects, namely making sure that a vcpu which may have been
turned off using PSCI is returned to its initial state, which would be
powered on if userspace does not set the KVM_ARM_VCPU_POWER_OFF flag.

Implement the expected functionality and clarify the ABI.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm/arm64: KVM: Don't clear the VCPU_POWER_OFF flag

commit 03f1d4c17edb31b41b14ca3a749ae38d2dd6639d upstream.

If a VCPU was originally started with power off (typically to be brought
up by PSCI in SMP configurations), there is no need to clear the
POWER_OFF flag in the kernel, as this flag is only tested during the
init ioctl itself.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm/arm64: kvm: drop inappropriate use of kvm_is_mmio_pfn()

commit 07a9748c78cfc39b54f06125a216b67b9c8f09ed upstream.

Instead of using kvm_is_mmio_pfn() to decide whether a host region
should be stage 2 mapped with device attributes, add a new static
function kvm_is_device_pfn() that disregards RAM pages with the
reserved bit set, as those should usually not be mapped as device
memory.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm64/kvm: Fix assembler compatibility of macros

commit 286fb1cc32b11c18da3573a8c8c37a4f9da16e30 upstream.

Some of the macros defined in kvm_arm.h are useful in assembly files, but are
not compatible with the assembler.  Change any C language integer constant
definitions using appended U, UL, or ULL to the UL() preprocessor macro.  Also,
add a preprocessor include of the asm/memory.h file which defines the UL()
macro.

Fixes build errors like these when using kvm_arm.h in assembly
source files:

  Error: unexpected characters following instruction at operand 3 -- `and x0,x1,#((1U<<25)-1)'

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm: kvm: STRICT_MM_TYPECHECKS fix for user_mem_abort

commit 3d08c629244257473450a8ba17cb8184b91e68f8 upstream.

Commit:
b886576 ARM: KVM: user_mem_abort: support stage 2 MMIO page mapping

introduced some code in user_mem_abort that failed to compile if
STRICT_MM_TYPECHECKS was enabled.

This patch fixes up the failing comparison.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Reviewed-by: Kim Phillips <kim.phillips@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE

commit c3058d5da2222629bc2223c488a4512b59bb4baf upstream.

When creating or moving a memslot, make sure the IPA space is within the
addressable range of the guest. Otherwise, user space can create too
large a memslot and KVM would try to access potentially unallocated page
table entries when inserting entries in the Stage-2 page tables.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm: kvm: fix CPU hotplug

commit 37a34ac1d4775aafbc73b9db53c7daebbbc67e6a upstream.

On some platforms with no power management capabilities, the hotplug
implementation is allowed to return from a smp_ops.cpu_die() call as a
function return. Upon a CPU onlining event, the KVM CPU notifier tries
to reinstall the hyp stub, which fails on platform where no reset took
place following a hotplug event, with the message:

CPU1: smp_ops.cpu_die() returned, trying to resuscitate
CPU1: Booted secondary processor
Kernel panic - not syncing: unexpected prefetch abort in Hyp mode at: 0x80409540
unexpected data abort in Hyp mode at: 0x80401fe8
unexpected HVC/SVC trap in Hyp mode at: 0x805c6170

since KVM code is trying to reinstall the stub on a system where it is
already configured.

To prevent this issue, this patch adds a check in the KVM hotplug
notifier that detects if the HYP stub really needs re-installing when a
CPU is onlined and skips the installation call if the stub is already in
place, which means that the CPU has not been reset.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm/arm64: KVM: Fix VTTBR_BADDR_MASK and pgd alloc

commit dbff124e29fa24aff9705b354b5f4648cd96e0bb upstream.

The current aarch64 calculation for VTTBR_BADDR_MASK masks only 39 bits
and not all the bits in the PA range. This is clearly a bug that
manifests itself on systems that allocate memory in the higher address
space range.

[ Modified from Joel's original patch to be based on PHYS_MASK_SHIFT
   instead of a hard-coded value and to move the alignment check of the
   allocation to mmu.c.  Also added a comment explaining why we hardcode
   the IPA range and changed the stage-2 pgd allocation to be based on
   the 40 bit IPA range instead of the maximum possible 48 bit PA range.
   - Christoffer ]

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Joel Schopp <joel.schopp@amd.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

KVM: ARM: vgic: plug irq injection race

commit 71afaba4a2e98bb7bdeba5078370ab43d46e67a1 upstream.

As it stands, nothing prevents userspace from injecting an interrupt
before the guest's GIC is actually initialized.

This goes unnoticed so far (as everything is pretty much statically
allocated), but ends up exploding in a spectacular way once we switch
to a more dynamic allocation (the GIC data structure isn't there yet).

The fix is to test for the "ready" flag in the VGIC distributor before
trying to inject the interrupt. Note that in order to avoid breaking
userspace, we have to ignore what is essentially an error.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ARM/arm64: KVM: fix use of WnR bit in kvm_is_write_fault()

commit a7d079cea2dffb112e26da2566dd84c0ef1fce97 upstream.

The ISS encoding for an exception from a Data Abort has a WnR
bit[6] that indicates whether the Data Abort was caused by a
read or a write instruction. While there are several fields
in the encoding that are only valid if the ISV bit[24] is set,
WnR is not one of them, so we can read it unconditionally.

Instead of fixing both implementations of kvm_is_write_fault()
in place, reimplement it just once using kvm_vcpu_dabt_iswrite(),
which already does the right thing with respect to the WnR bit.
Also fix up the callers to pass 'vcpu'

Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm/arm64: KVM: Complete WFI/WFE instructions

commit 05e0127f9e362b36aa35f17b1a3d52bca9322a3a upstream.

The architecture specifies that when the processor wakes up from a WFE
or WFI instruction, the instruction is considered complete, however we
currrently return to EL1 (or EL0) at the WFI/WFE instruction itself.

While most guests may not be affected by this because their local
exception handler performs an exception returning setting the event bit
or with an interrupt pending, some guests like UEFI will get wedged due
this little mishap.

Simply skip the instruction when we have completed the emulation.

Cc: <stable@vger.kernel.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ARM/ARM64: KVM: Nuke Hyp-mode tlbs before enabling MMU

commit f6edbbf36da3a27b298b66c7955fc84e1dcca305 upstream.

X-Gene u-boot runs in EL2 mode with MMU enabled hence we might
have stale EL2 tlb enteris when we enable EL2 MMU on each host CPU.

This can happen on any ARM/ARM64 board running bootloader in
Hyp-mode (or EL2-mode) with MMU enabled.

This patch ensures that we flush all Hyp-mode (or EL2-mode) TLBs
on each host CPU before enabling Hyp-mode (or EL2-mode) MMU.

Cc: <stable@vger.kernel.org>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

KVM: vgic: return int instead of bool when checking I/O ranges

commit 1fa451bcc67fa921a04c5fac8dbcde7844d54512 upstream.

vgic_ioaddr_overlap claims to return a bool, but in reality it returns
an int. Shut sparse up by fixing the type signature.

Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

KVM: ARM/arm64: avoid returning negative error code as bool

commit 18d457661fb9fa69352822ab98d39331c3d0e571 upstream.

is_valid_cache returns true if the specified cache is valid.
Unfortunately, if the parameter passed it out of range, we return
-ENOENT, which ends up as true leading to potential hilarity.

This patch returns false on the failure path instead.

Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

KVM: ARM/arm64: fix broken __percpu annotation

commit 4000be423cb01a8d09de878bb8184511c49d4238 upstream.

Running sparse results in a bunch of noisy address space mismatches
thanks to the broken __percpu annotation on kvm_get_running_vcpus.

This function returns a pcpu pointer to a pointer, not a pointer to a
pcpu pointer. This patch fixes the annotation, which kills the warnings
from sparse.

Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>