git.ipfire.org Git - thirdparty/kernel/stable.git/log

IB/rxe: Fix a memory leak in rxe_qp_cleanup()

commit e259934d4df7f99f2a5c2c4f074f6a55bd4b1722 upstream.

A socket is associated with every QP by the rxe driver but sock_release()
is never called. Add a call to sock_release() in rxe_qp_cleanup().

Fixes: commit 8700e3e7c48A5 ("Add Soft RoCE driver")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Moni Shoua <monis@mellanox.com>
Cc: Kamal Heib <kamalh@mellanox.com>
Cc: Amir Vadai <amirv@mellanox.com>
Cc: Haggai Eran <haggaie@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/multicast: Check ib_find_pkey() return value

commit d3a2418ee36a59bc02e9d454723f3175dcf4bfd9 upstream.

This patch avoids that Coverity complains about not checking the
ib_find_pkey() return value.

Fixes: commit 547af76521b3 ("IB/multicast: Report errors on multicast groups if P_key changes")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IPoIB: Avoid reading an uninitialized member variable

commit 11b642b84e8c43e8597de031678d15c08dd057bc upstream.

This patch avoids that Coverity reports the following:

Using uninitialized value port_attr.state when calling printk

Fixes: commit 94232d9ce817 ("IPoIB: Start multicast join process only on active ports")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/mad: Fix an array index check

commit 2fe2f378dd45847d2643638c07a7658822087836 upstream.

The array ib_mad_mgmt_class_table.method_table has MAX_MGMT_CLASS
(80) elements. Hence compare the array index with that value instead
of with IB_MGMT_MAX_METHODS (128). This patch avoids that Coverity
reports the following:

Overrunning array class->method_table of 80 8-byte elements at element index 127 (byte offset 1016) using index convert_mgmt_class(mad_hdr->mgmt_class) (which evaluates to 127).

Fixes: commit b7ab0b19a85f ("IB/mad: Verify mgmt class in received MADs")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fgraph: Handle a case where a tracer ignores set_graph_notrace

commit 794de08a16cf1fc1bf785dc48f66d36218cf6d88 upstream.

Both the wakeup and irqsoff tracers can use the function graph tracer when
the display-graph option is set. The problem is that they ignore the notrace
file, and record the entry of functions that would be ignored by the
function_graph tracer. This causes the trace->depth to be recorded into the
ring buffer. The set_graph_notrace uses a trick by adding a large negative
number to the trace->depth when a graph function is to be ignored.

On trace output, the graph function uses the depth to record a stack of
functions. But since the depth is negative, it accesses the array with a
negative number and causes an out of bounds access that can cause a kernel
oops or corrupt data.

Have the print functions handle cases where a tracer still records functions
even when they are in set_graph_notrace.

Also add warnings if the depth is below zero before accessing the array.

Note, the function graph logic will still prevent the return of these
functions from being recorded, which means that they will be left hanging
without a return. For example:

   # echo '*spin*' > set_graph_notrace
   # echo 1 > options/display-graph
   # echo wakeup > current_tracer
   # cat trace
   [...]
      _raw_spin_lock() {
        preempt_count_add() {
        do_raw_spin_lock() {
      update_rq_clock();

Where it should look like:

      _raw_spin_lock() {
        preempt_count_add();
        do_raw_spin_lock();
      }
      update_rq_clock();

Cc: Namhyung Kim <namhyung.kim@lge.com>
Fixes: 29ad23b00474 ("ftrace: Add set_graph_notrace filter")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

platform/x86: asus-nb-wmi.c: Add X45U quirk

commit e74e259939275a5dd4e0d02845c694f421e249ad upstream.

Without this patch, the Asus X45U wireless card can't be turned
on (hard-blocked), but after a suspend/resume it just starts working.

Following this bug report[1], there are other cases like this one, but
this Asus is the only model that I can test.

[1] https://ubuntuforums.org/showthread.php?t=2181558

Signed-off-by: Marcos Paulo de Souza <marcos.souza.org@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ftrace/x86_32: Set ftrace_stub to weak to prevent gcc from using short jumps to it

commit 847fa1a6d3d00f3bdf68ef5fa4a786f644a0dd67 upstream.

With new binutils, gcc may get smart with its optimization and change a jmp
from a 5 byte jump to a 2 byte one even though it was jumping to a global
function. But that global function existed within a 2 byte radius, and gcc
was able to optimize it. Unfortunately, that jump was also being modified
when function graph tracing begins. Since ftrace expected that jump to be 5
bytes, but it was only two, it overwrote code after the jump, causing a
crash.

This was fixed for x86_64 with commit 8329e818f149, with the same subject as
this commit, but nothing was done for x86_32.

Fixes: d61f82d06672 ("ftrace: use dynamic patching for updating mcount calls")
Reported-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

vsock/virtio: fix src/dst cid format

commit f83f12d660d11718d3eed9d979ee03e83aa55544 upstream.

These fields are 64 bit, using le32_to_cpu and friends
on these will not do the right thing.
Fix this up.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fsnotify: Fix possible use-after-free in inode iteration on umount

commit 5716863e0f8251d3360d4cbfc0e44e08007075df upstream.

fsnotify_unmount_inodes() plays complex tricks to pin next inode in the
sb->s_inodes list when iterating over all inodes. Furthermore the code has a
bug that if the current inode is the last on i_sb_list that does not have e.g.
I_FREEING set, then we leave next_i pointing to inode which may get removed
from the i_sb_list once we drop s_inode_list_lock thus resulting in
use-after-free issues (usually manifesting as infinite looping in
fsnotify_unmount_inodes()).

Fix the problem by keeping current inode pinned somewhat longer. Then we can
make the code much simpler and standard.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

kvm: nVMX: Allow L1 to intercept software exceptions (#BP and #OF)

commit ef85b67385436ddc1998f45f1d6a210f935b3388 upstream.

When L2 exits to L0 due to "exception or NMI", software exceptions
(#BP and #OF) for which L1 has requested an intercept should be
handled by L1 rather than L0. Previously, only hardware exceptions
were forwarded to L1.

Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM: PPC: Book3S HV: Don't lose hardware R/C bit updates in H_PROTECT

commit f064a0de1579fabded8990bed93971e30deb9ecb upstream.

The hashed page table MMU in POWER processors can update the R
(reference) and C (change) bits in a HPTE at any time until the
HPTE has been invalidated and the TLB invalidation sequence has
completed.  In kvmppc_h_protect, which implements the H_PROTECT
hypercall, we read the HPTE, modify the second doubleword,
invalidate the HPTE in memory, do the TLB invalidation sequence,
and then write the modified value of the second doubleword back
to memory.  In doing so we could overwrite an R/C bit update done
by hardware between when we read the HPTE and when the TLB
invalidation completed.  To fix this we re-read the second
doubleword after the TLB invalidation and OR in the (possibly)
new values of R and C.  We can use an OR since hardware only ever
sets R and C, never clears them.

This race was found by code inspection.  In principle this bug could
cause occasional guest memory corruption under host memory pressure.

Fixes: a8606e20e41a ("KVM: PPC: Handle some PAPR hcalls in the kernel", 2011-06-29)
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM: PPC: Book3S HV: Save/restore XER in checkpointed register state

commit 0d808df06a44200f52262b6eb72bcb6042f5a7c5 upstream.

When switching from/to a guest that has a transaction in progress,
we need to save/restore the checkpointed register state. Although
XER is part of the CPU state that gets checkpointed, the code that
does this saving and restoring doesn't save/restore XER.

This fixes it by saving and restoring the XER. To allow userspace
to read/write the checkpointed XER value, we also add a new ONE_REG
specifier.

The visible effect of this bug is that the guest may see its XER
value being corrupted when it uses transactions.

Fixes: e4e38121507a ("KVM: PPC: Book3S HV: Add transactional memory support")
Fixes: 0a8eccefcb34 ("KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: aacraid: remove wildcard for series 9 controllers

commit ae2aae2421983f6f68eb7c4692624bc43ea50712 upstream.

Controllers with this PCI ID never shipped outside of
PMCS/Microsemi. Remove the ID from the aacraid driver. smartpqi is the
correct driver for these controllers.

[mkp: patch description]

Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

md/raid5: limit request size according to implementation limits

commit e8d7c33232e5fdfa761c3416539bc5b4acd12db5 upstream.

Current implementation employ 16bit counter of active stripes in lower
bits of bio->bi_phys_segments. If request is big enough to overflow
this counter bio will be completed and freed too early.

Fortunately this not happens in default configuration because several
other limits prevent that: stripe_cache_size * nr_disks effectively
limits count of active stripes. And small max_sectors_kb at lower
disks prevent that during normal read/write operations.

Overflow easily happens in discard if it's enabled by module parameter
"devices_handle_discard_safely" and stripe_cache_size is set big enough.

This patch limits requests size with 256Mb - 8Kb to prevent overflows.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Shaohua Li <shli@kernel.org>
Cc: Neil Brown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sc16is7xx: Drop bogus use of IRQF_ONESHOT

commit 04da73803c05dc1150ccc31cbf93e8cd56679c09 upstream.

The use of IRQF_ONESHOT when registering an interrupt handler with
request_irq() is non-sensical.

Not only that, it also prevents the handler from being threaded when it
otherwise should be w/ IRQ_FORCED_THREADING is enabled.  This causes the
following deadlock observed by Sean Nyekjaer on -rt:

Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[..]
   rt_spin_lock_slowlock from queue_kthread_work
   queue_kthread_work from sc16is7xx_irq
   sc16is7xx_irq [sc16is7xx] from handle_irq_event_percpu
   handle_irq_event_percpu from handle_irq_event
   handle_irq_event from handle_level_irq
   handle_level_irq from generic_handle_irq
   generic_handle_irq from mxc_gpio_irq_handler
   mxc_gpio_irq_handler from mx3_gpio_irq_handler
   mx3_gpio_irq_handler from generic_handle_irq
   generic_handle_irq from __handle_domain_irq
   __handle_domain_irq from gic_handle_irq
   gic_handle_irq from __irq_svc
   __irq_svc from rt_spin_unlock
   rt_spin_unlock from kthread_worker_fn
   kthread_worker_fn from kthread
   kthread from ret_from_fork

Fixes: 9e6f4ca3e567 ("sc16is7xx: use kthread_worker for tx_work and irq")
Reported-by: Sean Nyekjaer <sean.nyekjaer@prevas.dk>
Signed-off-by: Josh Cartwright <joshc@ni.com>
Cc: linux-rt-users@vger.kernel.org
Cc: Jakub Kicinski <moorray3@wp.pl>
Cc: linux-serial@vger.kernel.org
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Julia Cartwright <julia@ni.com>
Acked-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: KVM: pmu: Reset PMSELR_EL0.SEL to a sane value before entering the guest

commit 21cbe3cc8a48ff17059912e019fbde28ed54745a upstream.

The ARMv8 architecture allows the cycle counter to be configured
by setting PMSELR_EL0.SEL==0x1f and then accessing PMXEVTYPER_EL0,
hence accessing PMCCFILTR_EL0. But it disallows the use of
PMSELR_EL0.SEL==0x1f to access the cycle counter itself through
PMXEVCNTR_EL0.

Linux itself doesn't violate this rule, but we may end up with
PMSELR_EL0.SEL being set to 0x1f when we enter a guest. If that
guest accesses PMXEVCNTR_EL0, the access may UNDEF at EL1,
despite the guest not having done anything wrong.

In order to avoid this unfortunate course of events (haha!), let's
sanitize PMSELR_EL0 on guest entry. This ensures that the guest
won't explode unexpectedly.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

s390/kexec: use node 0 when re-adding crash kernel memory

commit 9f88eb4df728aebcd2ddd154d99f1d75b428b897 upstream.

When re-adding crash kernel memory within setup_resources() the
function memblock_add() is used. That function will add memory by
default to node "MAX_NUMNODES" instead of node 0, like the memory
detection code does. In case of !NUMA this will trigger this warning
when the kernel generates the vmemmap:

Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead
WARNING: CPU: 0 PID: 0 at mm/memblock.c:1261 memblock_virt_alloc_internal+0x76/0x220
CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc6 #16
Call Trace:
[<0000000000d0b2e8>] memblock_virt_alloc_try_nid+0x88/0xc8
[<000000000083c8ea>] __earlyonly_bootmem_alloc.constprop.1+0x42/0x50
[<000000000083e7f4>] vmemmap_populate+0x1ac/0x1e0
[<0000000000840136>] sparse_mem_map_populate+0x46/0x68
[<0000000000d0c59c>] sparse_init+0x184/0x238
[<0000000000cf45f6>] paging_init+0xbe/0xf8
[<0000000000cf1d4a>] setup_arch+0xa02/0xae0
[<0000000000ced75a>] start_kernel+0x72/0x450
[<0000000000100020>] _stext+0x20/0x80

If NUMA is selected numa_setup_memory() will fix the node assignments
before the vmemmap will be populated; so this warning will only appear
if NUMA is not selected.

To fix this simply use memblock_add_node() and re-add crash kernel
memory explicitly to node 0.

Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: 4e042af463f8 ("s390/kexec: fix crash on resize of reserved memory")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

s390/vmlogrdr: fix IUCV buffer allocation

commit 5457e03de918f7a3e294eb9d26a608ab8a579976 upstream.

The buffer for iucv_message_receive() needs to be below 2 GB. In
__iucv_message_receive(), the buffer address is casted to an u32, which
would result in either memory corruption or an addressing exception when
using addresses >= 2 GB.

Fix this by using GFP_DMA for the buffer allocation.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

firmware: fix usermode helper fallback loading

commit 2e700f8d85975f516ccaad821278c1fe66b2cc98 upstream.

When you use the firmware usermode helper fallback with a timeout value set to a
value greater than INT_MAX (2147483647) a cast overflow issue causes the
timeout value to go negative and breaks all usermode helper loading. This
regression was introduced through commit 68ff2a00dbf5 ("firmware_loader:
handle timeout via wait_for_completion_interruptible_timeout()") on kernel
v4.0.

The firmware_class drivers relies on the firmware usermode helper
fallback as a mechanism to look for firmware if the direct filesystem
search failed only if:

  a) You've enabled CONFIG_FW_LOADER_USER_HELPER_FALLBACK (not many distros):

  Then all of these callers will rely on the fallback mechanism in case
  the firmware is not found through an initial direct filesystem lookup:

  o request_firmware()
  o request_firmware_into_buf()
  o request_firmware_nowait()

  b) If you've only enabled CONFIG_FW_LOADER_USER_HELPER (most distros):

  Then only callers using request_firmware_nowait() with the second
  argument set to false, this explicitly is requesting the UMH firmware
  fallback to be relied on in case the first filesystem lookup fails.

  Using Coccinelle SmPL grammar we have identified only two drivers
  explicitly requesting the UMH firmware fallback mechanism:

  - drivers/firmware/dell_rbu.c
  - drivers/leds/leds-lp55xx-common.c

Since most distributions only enable CONFIG_FW_LOADER_USER_HELPER the
biggest impact of this regression are users of the dell_rbu and
leds-lp55xx-common device driver which required the UMH to find their
respective needed firmwares.

The default timeout for the UMH is set to 60 seconds always, as of
commit 68ff2a00dbf5 ("firmware_loader: handle timeout via
wait_for_completion_interruptible_timeout()") the timeout was bumped
to MAX_JIFFY_OFFSET ((LONG_MAX >> 1)-1). Additionally the MAX_JIFFY_OFFSET
value was also used if the timeout was configured by a user to 0.

The following works:

echo 2147483647 > /sys/class/firmware/timeout

But both of the following set the timeout to MAX_JIFFY_OFFSET even if
we display 0 back to userspace:

echo 2147483648 > /sys/class/firmware/timeout
cat /sys/class/firmware/timeout
0

echo 0> /sys/class/firmware/timeout
cat /sys/class/firmware/timeout
0

A max value of INT_MAX (2147483647) seconds is therefore implicit due to the
another cast with simple_strtol().

This fixes the secondary cast (the first one is simple_strtol() but its an
issue only by forcing an implicit limit) by re-using the timeout variable and
only setting retval in appropriate cases.

Lastly worth noting systemd had ripped out the UMH firmware fallback
mechanism from udev since udev 2014 via commit be2ea723b1d023b3d
("udev: remove userspace firmware loading support"), so as of systemd v217.

Signed-off-by: Yves-Alexis Perez <corsac@corsac.net>
Fixes: 68ff2a00dbf5 "firmware_loader: handle timeout via wait_for_completion_interruptible_timeout()"
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
[mcgrof@kernel.org: gave commit log a whole lot of love]
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARC: mm: arc700: Don't assume 2 colours for aliasing VIPT dcache

commit 08fe007968b2b45e831daf74899f79a54d73f773 upstream.

An ARC700 customer reported linux boot crashes when upgrading to bigger
L1 dcache (64K from 32K). Turns out they had an aliasing VIPT config and
current code only assumed 2 colours, while theirs had 4. So default to 4
colours and complain if there are fewer. Ideally this needs to be a
Kconfig option, but heck that's too much of hassle for a single user.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: avoid a permanent stop of the scsi device's request queue

commit d2a145252c52792bc59e4767b486b26c430af4bb upstream.

A race between scanning and fc_remote_port_delete() may result in a
permanent stop if the device gets blocked before scsi_sysfs_add_sdev()
and unblocked after.  The reason is that blocking a device sets both the
SDEV_BLOCKED state and the QUEUE_FLAG_STOPPED.  However,
scsi_sysfs_add_sdev() unconditionally sets SDEV_RUNNING which causes the
device to be ignored by scsi_target_unblock() and thus never have its
QUEUE_FLAG_STOPPED cleared leading to a device which is apparently
running but has a stopped queue.

We actually have two places where SDEV_RUNNING is set: once in
scsi_add_lun() which respects the blocked flag and once in
scsi_sysfs_add_sdev() which doesn't.  Since the second set is entirely
spurious, simply remove it to fix the problem.

Reported-by: Zengxi Chen <chenzengxi@huawei.com>
Signed-off-by: Wei Fang <fangwei1@huawei.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: zfcp: fix rport unblock race with LUN recovery

commit 6f2ce1c6af37191640ee3ff6e8fc39ea10352f4c upstream.

It is unavoidable that zfcp_scsi_queuecommand() has to finish requests
with DID_IMM_RETRY (like fc_remote_port_chkready()) during the time
window when zfcp detected an unavailable rport but
fc_remote_port_delete(), which is asynchronous via
zfcp_scsi_schedule_rport_block(), has not yet blocked the rport.

However, for the case when the rport becomes available again, we should
prevent unblocking the rport too early.  In contrast to other FCP LLDDs,
zfcp has to open each LUN with the FCP channel hardware before it can
send I/O to a LUN.  So if a port already has LUNs attached and we
unblock the rport just after port recovery, recoveries of LUNs behind
this port can still be pending which in turn force
zfcp_scsi_queuecommand() to unnecessarily finish requests with
DID_IMM_RETRY.

This also opens a time window with unblocked rport (until the followup
LUN reopen recovery has finished).  If a scsi_cmnd timeout occurs during
this time window fc_timed_out() cannot work as desired and such command
would indeed time out and trigger scsi_eh. This prevents a clean and
timely path failover.  This should not happen if the path issue can be
recovered on FC transport layer such as path issues involving RSCNs.

Fix this by only calling zfcp_scsi_schedule_rport_register(), to
asynchronously trigger fc_remote_port_add(), after all LUN recoveries as
children of the rport have finished and no new recoveries of equal or
higher order were triggered meanwhile.  Finished intentionally includes
any recovery result no matter if successful or failed (still unblock
rport so other successful LUNs work).  For simplicity, we check after
each finished LUN recovery if there is another LUN recovery pending on
the same port and then do nothing.  We handle the special case of a
successful recovery of a port without LUN children the same way without
changing this case's semantics.

For debugging we introduce 2 new trace records written if the rport
unblock attempt was aborted due to still unfinished or freshly triggered
recovery. The records are only written above the default trace level.

Benjamin noticed the important special case of new recovery that can be
triggered between having given up the erp_lock and before calling
zfcp_erp_action_cleanup() within zfcp_erp_strategy().  We must avoid the
following sequence:

ERP thread                 rport_work      other context
-------------------------  --------------  --------------------------------
port is unblocked, rport still blocked,
due to pending/running ERP action,
so ((port->status & ...UNBLOCK) != 0)
and (port->rport == NULL)
unlock ERP
zfcp_erp_action_cleanup()
case ZFCP_ERP_ACTION_REOPEN_LUN:
zfcp_erp_try_rport_unblock()
((status & ...UNBLOCK) != 0) [OLD!]
                                           zfcp_erp_port_reopen()
                                           lock ERP
                                           zfcp_erp_port_block()
                                           port->status clear ...UNBLOCK
                                           unlock ERP
                                           zfcp_scsi_schedule_rport_block()
                                           port->rport_task = RPORT_DEL
                                           queue_work(rport_work)
                           zfcp_scsi_rport_work()
                           (port->rport_task != RPORT_ADD)
                           port->rport_task = RPORT_NONE
                           zfcp_scsi_rport_block()
                           if (!port->rport) return
zfcp_scsi_schedule_rport_register()
port->rport_task = RPORT_ADD
queue_work(rport_work)
                           zfcp_scsi_rport_work()
                           (port->rport_task == RPORT_ADD)
                           port->rport_task = RPORT_NONE
                           zfcp_scsi_rport_register()
                           (port->rport == NULL)
                           rport = fc_remote_port_add()
                           port->rport = rport;

Now the rport was erroneously unblocked while the zfcp_port is blocked.
This is another situation we want to avoid due to scsi_eh
potential. This state would at least remain until the new recovery from
the other context finished successfully, or potentially forever if it
failed.  In order to close this race, we take the erp_lock inside
zfcp_erp_try_rport_unblock() when checking the status of zfcp_port or
LUN.  With that, the possible corresponding rport state sequences would
be: (unblock[ERP thread],block[other context]) if the ERP thread gets
erp_lock first and still sees ((port->status & ...UNBLOCK) != 0),
(block[other context],NOP[ERP thread]) if the ERP thread gets erp_lock
after the other context has already cleard ...UNBLOCK from port->status.

Since checking fields of struct erp_action is unsafe because they could
have been overwritten (re-used for new recovery) meanwhile, we only
check status of zfcp_port and LUN since these are only changed under
erp_lock elsewhere. Regarding the check of the proper status flags (port
or port_forced are similar to the shown adapter recovery):

[zfcp_erp_adapter_shutdown()]
zfcp_erp_adapter_reopen()
zfcp_erp_adapter_block()
  * clear UNBLOCK ---------------------------------------+
zfcp_scsi_schedule_rports_block()                       |
write_lock_irqsave(&adapter->erp_lock, flags);-------+  |
zfcp_erp_action_enqueue()                            |  |
  zfcp_erp_setup_act()                                |  |
   * set ERP_INUSE -----------------------------------|--|--+
write_unlock_irqrestore(&adapter->erp_lock, flags);--+  |  |
.context-switch.                                         |  |
zfcp_erp_thread()                                        |  |
zfcp_erp_strategy()                                     |  |
  write_lock_irqsave(&adapter->erp_lock, flags);------+  |  |
  ...                                                 |  |  |
  zfcp_erp_strategy_check_target()                    |  |  |
   zfcp_erp_strategy_check_adapter()                  |  |  |
    zfcp_erp_adapter_unblock()                        |  |  |
     * set UNBLOCK -----------------------------------|--+  |
  zfcp_erp_action_dequeue()                           |     |
   * clear ERP_INUSE ---------------------------------|-----+
  ...                                                 |
  write_unlock_irqrestore(&adapter->erp_lock, flags);-+

Hence, we should check for both UNBLOCK and ERP_INUSE because they are
interleaved.  Also we need to explicitly check ERP_FAILED for the link
down case which currently does not clear the UNBLOCK flag in
zfcp_fsf_link_down_info_eval().

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 8830271c4819 ("[SCSI] zfcp: Dont fail SCSI commands when transitioning to blocked fc_rport")
Fixes: a2fa0aede07c ("[SCSI] zfcp: Block FC transport rports early on errors")
Fixes: 5f852be9e11d ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI")
Fixes: 338151e06608 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable")
Fixes: 3859f6a248cb ("[PATCH] zfcp: add rports to enable scsi_add_device to work again")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: zfcp: do not trace pure benign residual HBA responses at default level

commit 56d23ed7adf3974f10e91b643bd230e9c65b5f79 upstream.

Since quite a while, Linux issues enough SCSI commands per scsi_device
which successfully return with FCP_RESID_UNDER, FSF_FCP_RSP_AVAILABLE,
and SAM_STAT_GOOD. This floods the HBA trace area and we cannot see
other and important HBA trace records long enough.

Therefore, do not trace HBA response errors for pure benign residual
under counts at the default trace level.

This excludes benign residual under count combined with other validity
bits set in FCP_RSP_IU, such as FCP_SNS_LEN_VAL. For all those other
cases, we still do want to see both the HBA record and the corresponding
SCSI record by default.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: zfcp: fix use-after-"free" in FC ingress path after TMF

commit dac37e15b7d511e026a9313c8c46794c144103cd upstream.

When SCSI EH invokes zFCP's callbacks for eh_device_reset_handler() and
eh_target_reset_handler(), it expects us to relent the ownership over
the given scsi_cmnd and all other scsi_cmnds within the same scope - LUN
or target - when returning with SUCCESS from the callback ('release'
them).  SCSI EH can then reuse those commands.

We did not follow this rule to release commands upon SUCCESS; and if
later a reply arrived for one of those supposed to be released commands,
we would still make use of the scsi_cmnd in our ingress tasklet. This
will at least result in undefined behavior or a kernel panic because of
a wrong kernel pointer dereference.

To fix this, we NULLify all pointers to scsi_cmnds (struct zfcp_fsf_req
*)->data in the matching scope if a TMF was successful. This is done
under the locks (struct zfcp_adapter *)->abort_lock and (struct
zfcp_reqlist *)->lock to prevent the requests from being removed from
the request-hashtable, and the ingress tasklet from making use of the
scsi_cmnd-pointer in zfcp_fsf_fcp_cmnd_handler().

For cases where a reply arrives during SCSI EH, but before we get a
chance to NULLify the pointer - but before we return from the callback
-, we assume that the code is protected from races via the CAS operation
in blk_complete_request() that is called in scsi_done().

The following stacktrace shows an example for a crash resulting from the
previous behavior:

Unable to handle kernel pointer dereference at virtual kernel address fffffee17a672000
Oops: 0038 [#1] SMP
CPU: 2 PID: 0 Comm: swapper/2 Not tainted
task: 00000003f7ff5be0 ti: 00000003f3d38000 task.ti: 00000003f3d38000
Krnl PSW : 0404d00180000000 00000000001156b0 (smp_vcpu_scheduled+0x18/0x40)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 EA:3
Krnl GPRS: 000000200000007e 0000000000000000 fffffee17a671fd8 0000000300000015
           ffffffff80000000 00000000005dfde8 07000003f7f80e00 000000004fa4e800
           000000036ce8d8f8 000000036ce8d9c0 00000003ece8fe00 ffffffff969c9e93
           00000003fffffffd 000000036ce8da10 00000000003bf134 00000003f3b07918
Krnl Code: 00000000001156a2: a7190000        lghi    %r1,0
           00000000001156a6: a7380015        lhi    %r3,21
          #00000000001156aa: e32050000008    ag    %r2,0(%r5)
          >00000000001156b0: 482022b0        lh    %r2,688(%r2)
           00000000001156b4: ae123000        sigp    %r1,%r2,0(%r3)
           00000000001156b8: b2220020        ipm    %r2
           00000000001156bc: 8820001c        srl    %r2,28
           00000000001156c0: c02700000001    xilf    %r2,1
Call Trace:
([<0000000000000000>] 0x0)
[<000003ff807bdb8e>] zfcp_fsf_fcp_cmnd_handler+0x3de/0x490 [zfcp]
[<000003ff807be30a>] zfcp_fsf_req_complete+0x252/0x800 [zfcp]
[<000003ff807c0a48>] zfcp_fsf_reqid_check+0xe8/0x190 [zfcp]
[<000003ff807c194e>] zfcp_qdio_int_resp+0x66/0x188 [zfcp]
[<000003ff80440c64>] qdio_kick_handler+0xdc/0x310 [qdio]
[<000003ff804463d0>] __tiqdio_inbound_processing+0xf8/0xcd8 [qdio]
[<0000000000141fd4>] tasklet_action+0x9c/0x170
[<0000000000141550>] __do_softirq+0xe8/0x258
[<000000000010ce0a>] do_softirq+0xba/0xc0
[<000000000014187c>] irq_exit+0xc4/0xe8
[<000000000046b526>] do_IRQ+0x146/0x1d8
[<00000000005d6a3c>] io_return+0x0/0x8
[<00000000005d6422>] vtime_stop_cpu+0x4a/0xa0
([<0000000000000000>] 0x0)
[<0000000000103d8a>] arch_cpu_idle+0xa2/0xb0
[<0000000000197f94>] cpu_startup_entry+0x13c/0x1f8
[<0000000000114782>] smp_start_secondary+0xda/0xe8
[<00000000005d6efe>] restart_int_handler+0x56/0x6c
[<0000000000000000>] 0x0
Last Breaking-Event-Address:
[<00000000003bf12e>] arch_spin_lock_wait+0x56/0xb0

Suggested-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Fixes: ea127f9754 ("[PATCH] s390 (7/7): zfcp host adapter.") (tglx/history.git)
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iscsi-target: Return error if unable to add network portal

commit 83337e544323a8bd7492994d64af339175ac7107 upstream.

If iscsit_tpg_add_network_portal() fails then
return error code instead of 0 to user space.

If iscsi-target returns 0 then user space keeps
on retrying same command infinitely, targetcli or
echo hangs till command completes with non zero
return value. In some cases it is possible that
add network portal command never completes with
success even after retrying multiple times,
for example - cxgbit_setup_np() always returns
-EINVAL if portal IP does not belong to Chelsio
adapter interface.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
[ bvanassche: Added "Fixes:" and "Cc: stable" tags ]
Fixes: commit d4b3fa4b0881 ("iscsi-target: Make iscsi_tpg_np driver show/store use generic code")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: megaraid_sas: Do not set MPI2_TYPE_CUDA for JBOD FP path for FW which does not support JBOD sequence map

commit d5573584429254a14708cf8375c47092b5edaf2c upstream.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for 30secs before reset

commit 18e1c7f68a5814442abad849abe6eacbf02ffd7c upstream.

For SRIOV enabled firmware, if there is a OCR(online controller reset)
possibility driver set the convert flag to 1, which is not happening if
there are outstanding commands even after 180 seconds. As driver does
not set convert flag to 1 and still making the OCR to run, VF(Virtual
function) driver is directly writing on to the register instead of
waiting for 30 seconds. Setting convert flag to 1 will cause VF driver
will wait for 30 secs before going for reset.

Signed-off-by: Kiran Kumar Kasturi <kiran-kumar.kasturi@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

stm class: Fix device leak in open error path

commit a0ebf519b8a2666438d999c62995618c710573e5 upstream.

Make sure to drop the reference taken by class_find_device() also on
allocation errors in open().

Signed-off-by: Johan Hovold <johan@kernel.org>
Fixes: 7bd1d4093c2f ("stm class: Introduce an abstraction for...")
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

vt: fix Scroll Lock LED trigger name

commit 31b5929d533f5183972cf57a7844b456ed996f3c upstream.

There is a disagreement between drivers/tty/vt/keyboard.c and
drivers/input/input-leds.c with regard to what is a Scroll Lock LED
trigger name: input calls it "kbd-scrolllock", but vt calls it
"kbd-scrollock" (two l's).
This prevents Scroll Lock LED trigger from binding to this LED by default.

Since it is a scroLL Lock LED, this interface was introduced only about a
year ago and in an Internet search people seem to reference this trigger
only to set it to this LED let's simply rename it to "kbd-scrolllock".

Also, it looks like this was supposed to be changed before this code was
merged: https://lkml.org/lkml/2015/6/9/697 but it was done only on
the input side.

Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

block: protect iterate_bdevs() against concurrent close

commit af309226db916e2c6e08d3eba3fa5c34225200c4 upstream.

If a block device is closed while iterate_bdevs() is handling it, the
following NULL pointer dereference occurs because bdev->b_disk is NULL
in bdev_get_queue(), which is called from blk_get_backing_dev_info() (in
turn called by the mapping_cap_writeback_dirty() call in
__filemap_fdatawrite_range()):

BUG: unable to handle kernel NULL pointer dereference at 0000000000000508
IP: [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
PGD 9e62067 PUD 9ee8067 PMD 0
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in:
CPU: 1 PID: 2422 Comm: sync Not tainted 4.5.0-rc7+ #400
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
task: ffff880009f4d700 ti: ffff880009f5c000 task.ti: ffff880009f5c000
RIP: 0010:[<ffffffff81314790>]  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
RSP: 0018:ffff880009f5fe68  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88000ec17a38 RCX: ffffffff81a4e940
RDX: 7fffffffffffffff RSI: 0000000000000000 RDI: ffff88000ec176c0
RBP: ffff880009f5fe68 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88000ec17860
R13: ffffffff811b25c0 R14: ffff88000ec178e0 R15: ffff88000ec17a38
FS:  00007faee505d700(0000) GS:ffff88000fb00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000508 CR3: 0000000009e8a000 CR4: 00000000000006e0
Stack:
  ffff880009f5feb8 ffffffff8112e7f5 0000000000000000 7fffffffffffffff
  0000000000000000 0000000000000000 7fffffffffffffff 0000000000000001
  ffff88000ec178e0 ffff88000ec17860 ffff880009f5fec8 ffffffff8112e81f
Call Trace:
  [<ffffffff8112e7f5>] __filemap_fdatawrite_range+0x85/0x90
  [<ffffffff8112e81f>] filemap_fdatawrite+0x1f/0x30
  [<ffffffff811b25d6>] fdatawrite_one_bdev+0x16/0x20
  [<ffffffff811bc402>] iterate_bdevs+0xf2/0x130
  [<ffffffff811b2763>] sys_sync+0x63/0x90
  [<ffffffff815d4272>] entry_SYSCALL_64_fastpath+0x12/0x76
Code: 0f 1f 44 00 00 48 8b 87 f0 00 00 00 55 48 89 e5 <48> 8b 80 08 05 00 00 5d
RIP  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
  RSP <ffff880009f5fe68>
CR2: 0000000000000508
---[ end trace 2487336ceb3de62d ]---

The crash is easily reproducible by running the following command, if an
msleep(100) is inserted before the call to func() in iterate_devs():

while :; do head -c1 /dev/nullb0; done > /dev/null & while :; do sync; done

Fix it by holding the bd_mutex across the func() call and only calling
func() if the bdev is opened.

Fixes: 5c0d6b60a0ba ("vfs: Create function for iterating over block devices")
Reported-and-tested-by: Wei Fang <fangwei1@huawei.com>
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mei: me: add lewisburg device ids

commit 9ff2007bea1f1bfc53ac0bc7ccf8200bb275fd52 upstream.

Add MEI Lewisburg PCH IDs for Purley based workstations.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mei: request async autosuspend at the end of enumeration

commit d5f8e166c25750adc147b0adf64a62a91653438a upstream.

pm_runtime_autosuspend can take synchronous or asynchronous
paths, Because we are calling pm_runtime_mark_last_busy just before
this most of the cases it takes the asynchronous way. However,
when the FW or driver resets during already running runtime suspend,
the call will result in calling to the driver's rpm callback and results
in a deadlock on device_lock.
The simplest fix is to replace pm_runtime_autosuspend with
asynchronous pm_request_autosuspend.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drivers/gpu/drm/ast: Fix infinite loop if read fails

commit 298360af3dab45659810fdc51aba0c9f4097e4f6 upstream.

ast_get_dram_info() configures a window in order to access BMC memory.
A BMC register can be configured to disallow this, and if so, causes
an infinite loop in the ast driver which renders the system unusable.

Fix this by erroring out if an error is detected. On powerpc systems with
EEH, this leads to the device being fenced and the system continuing to
operate.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161215051241.20815-1-ruscur@russell.cc
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amdgpu: fix init save/restore list in gfx_v8.0

commit 202e0b227b906cb80a2791f21216a55d9468d61b upstream.

set valid data to mmRLC_SRM_INDEX_CNTL_ADDRx/DATAx.

Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/gma500: Add compat ioctl

commit 0a97c81a9717431e6c57ea845b59c3c345edce67 upstream.

Hook up drm_compat_ioctl to support 32-bit userspace on 64-bit kernels.
It turns out that N2600 and N2800 comes with 64-bit enabled. We
previously assumed there where no such systems out there.

Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20161101144315.2955-1-patrik.r.jakobsson@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon/si: load the proper firmware on 0x87 oland boards

commit abb2e3c1ce64c8bba678973800c34ea1dc97c42c upstream.

New variant.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: add additional pci revision to dpm workaround

commit 8729675c00a8d13cb2094d617d70a4a4da7d83c5 upstream.

New variant.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: Hide the HW cursor while it's out of bounds

commit 6b16cf7785a4200b1bddf4f70c9dda2efc49e278 upstream.

Fixes hangs in that case under some circumstances.

v2:
* Only use non-0 x/yorigin if the cursor is (partially) outside of the
top/left edge of the total surface with AVIVO/DCE

Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1000433
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: Also call cursor_move_locked when the cursor size changes

commit dcab0fa64e300afa18f39cd98d05e0950f652adf upstream.

The cursor size also affects the register programming.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/nouveau/fifo/gf100-: protect channel preempt with subdev mutex

commit b27add13f500469127afdf011dbcc9c649e16e54 upstream.

This avoids an issue that occurs when we're attempting to preempt multiple
channels simultaneously. HW seems to ignore preempt requests while it's
still processing a previous one, which, well, makes sense.

Fixes random "fifo: SCHED_ERROR 0d []" + GPCCS page faults during parallel
piglit runs on (at least) GM107.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/nouveau/i2c/gk110b,gm10x: use the correct implementation

commit 5b3800a6b763874e4a23702fb9628d3bd3315ce9 upstream.

DPAUX registers moved on Kepler, these chipsets were still using the
Fermi implementation for some reason.

This fixes detection of hotplug/sink IRQs on DP connectors.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/nouveau/ttm: wait for bo fence to signal before unmapping vmas

commit 10dcab3e7f477bffee88d518aad57d06777cfdf4 upstream.

TTM was changed a while back to allow for pipelining of buffer moves, and
part of this was the removal of waiting for a BO to idle before calling
move(), placing the responsibility on the driver to do this if required.

That's all well and good, except, we make use of move_notify() to handle
mapping/unmapping from the GPU VMM as move() isn't called on all paths.

This commit adds a wait before unmapping from a VMM in move_notify(), to
prevent GPU page faults where a buffer is still being accessed.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/nouveau/ltc: protect clearing of comptags with mutex

commit f4e65efc88b64c1dbca275d42a188edccedb56c6 upstream.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/nouveau/bios: require checksum to match for fast acpi shadow method

commit 5dc7f4aa9d84ea94b54a9bfcef095f0289f1ebda upstream.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/nouveau/kms: lvds panel strap moved again on maxwell

commit 768e847759d551c96e129e194588dbfb11a1d576 upstream.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/nouveau/gr: fallback to legacy paths during firmware lookup

commit e137040e0d0376b404fc5155eba44ea07126e3bd upstream.

Look for firmware files using the legacy ("nouveau/nvxx_fucxxxx") path
if they cannot be found in the new, "official" path. User setups were
broken by the switch, which is bad.

There are only 4 firmware files we may want to look up that way, so
hardcode them into the lookup function. All new firmware files should
use the standard "nvidia/<chip>/gr/" path.

Fixes: 8539b37acef7 ("drm/nouveau/gr: use NVIDIA-provided external firmwares")
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/amdgpu: enable GUI idle INT after enabling CGCG

commit dd31ae9ac933636c3712b7dd0f6152c1d71f81fe upstream.

GUI idle interrupts should be enabled only after we
have enabled coarse grain clock gating (CGCG). This
prevents GFX engine generating idle interrupt even
though CGCG is not completely enabled.

Most of the time this goes un-noticed, but on some
Stoney ASICs this results in GFX engine hang after
system resumes from suspend. The issue is not
particular to Stoney though and could have occured
on any ASIC. The patch fixes this issue.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Sunil Uttarwar <Sunil.Uttarwar1@amd.com>
Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI / video: Add force_native quirk for HP Pavilion dv6

commit 6276e53fa8c06a3a5cf7b95b77b079966de9ad66 upstream.

The HP Pavilion dv6 has a non-working acpi_video0 backlight interface
and an intel_backlight interface which works fine. Add a force_native
quirk for it so that the non-working acpi_video0 interface does not get
registered.

Note that there are quite a few HP Pavilion dv6 variants, some
woth ATI and some with NVIDIA hybrid gfx, both seem to need this
quirk to have working backlight control. There are also some versions
with only Intel integrated gfx, these may not need this quirk, but it
should not hurt there.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1204476
Link: https://bugs.launchpad.net/ubuntu/+source/linux-lts-trusty/+bug/1416940
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI / video: Add force_native quirk for Dell XPS 17 L702X

commit 350fa038c31b056fc509624efb66348ac2c1e3d0 upstream.

The Dell XPS 17 L702X has a non-working acpi_video0 backlight interface
and an intel_backlight interface which works fine. Add a force_native
quirk for it so that the non-working acpi_video0 interface does not get
registered.

Note that there also is an issue with the brightnesskeys on this laptop,
they do not generate key-press events in anyway. That is not solved by
this patch.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1123661
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: comedi: ni_mio_common: fix E series ni_ai_insn_read() data

commit 857a661020a2de3a0304edf33ad656abee100891 upstream.

Commit 0557344e2149 ("staging: comedi: ni_mio_common: fix local var for
32-bit read") changed the type of local variable `d` from `unsigned
short` to `unsigned int` to fix a bug introduced in
commit 9c340ac934db ("staging: comedi: ni_stc.h: add read/write
callbacks to struct ni_private") when reading AI data for NI PCI-6110
and PCI-6111 cards.  Unfortunately, other parts of the function rely on
the variable being `unsigned short` when an offset value in local
variable `signbits` is added to `d` before writing the value to the
`data` array:

d += signbits;
   data[n] = d;

The `signbits` variable will be non-zero in bipolar mode, and is used to
convert the hardware's 2's complement, 16-bit numbers to Comedi's
straight binary sample format (with 0 representing the most negative
voltage).  This breaks because `d` is now 32 bits wide instead of 16
bits wide, so after the addition of `signbits`, `data[n]` ends up being
set to values above 65536 for negative voltages.  This affects all
supported "E series" cards except PCI-6143 (and PXI-6143). Fix it by
ANDing the value written to the `data[n]` with the mask 0xffff.

Fixes: 0557344e2149 ("staging: comedi: ni_mio_common: fix local var for 32-bit read")
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: comedi: ni_mio_common: fix M Series ni_ai_insn_read() data mask

commit 655c4d442d1213b617926cc6d54e2a9a793fb46b upstream.

For NI M Series cards, the Comedi `insn_read` handler for the AI
subdevice is broken due to ANDing the value read from the AI FIFO data
register with an incorrect mask.  The incorrect mask clears all but the
most significant bit of the sample data.  It should preserve all the
sample data bits.  Correct it.

Fixes: 817144ae7fda ("staging: comedi: ni_mio_common: remove unnecessary use of 'board->adbits'")
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels()

commit abd1026da4a7700a8db370947f75cd17b6ae6f76 upstream.

"kernel BUG at drivers/hv/channel_mgmt.c:350!" is observed when hv_vmbus
module is unloaded. BUG_ON() was introduced in commit 85d9aa705184
("Drivers: hv: vmbus: add an API vmbus_hvsock_device_unregister()") as
vmbus_free_channels() codepath was apparently forgotten.

Fixes: 85d9aa705184 ("Drivers: hv: vmbus: add an API vmbus_hvsock_device_unregister()")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

docs: sphinx-extensions: make rstFlatTable work with docutils 0.13

commit 217e2bfab22e740227df09f22165e834cddd8a3b upstream.

In docutils 0.13, the return type of get_column_widths method of the
Table directive has changed [1], which breaks our flat-table directive
and leads to a TypeError when trying to build the docs [2].

This patch adds support for the new return type, while keeping support
for older docutils versions too.

[1] https://sourceforge.net/p/docutils/patches/120/
[2] https://sourceforge.net/p/docutils/bugs/303/

Signed-off-by: Dmitry Shachnev <mitya57@debian.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

thermal: hwmon: Properly report critical temperature in sysfs

commit f37fabb8643eaf8e3b613333a72f683770c85eca upstream.

In the critical sysfs entry the thermal hwmon was returning wrong
temperature to the user-space. It was reporting the temperature of the
first trip point instead of the temperature of critical trip point.

For example:
/sys/class/hwmon/hwmon0/temp1_crit:50000
/sys/class/thermal/thermal_zone0/trip_point_0_temp:50000
/sys/class/thermal/thermal_zone0/trip_point_0_type:active
/sys/class/thermal/thermal_zone0/trip_point_3_temp:120000
/sys/class/thermal/thermal_zone0/trip_point_3_type:critical

Since commit e68b16abd91d ("thermal: add hwmon sysfs I/F") the driver
have been registering a sysfs entry if get_crit_temp() callback was
provided. However when accessed, it was calling get_trip_temp() instead
of the get_crit_temp().

Fixes: e68b16abd91d ("thermal: add hwmon sysfs I/F")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: bcm2835: Avoid overwriting the div info when disabling a pll_div clk

commit 68af4fa8f39b542a6cde7ac19518d88e9b3099dc upstream.

bcm2835_pll_divider_off() is resetting the divider field in the A2W reg
to zero when disabling the clock.

Make sure we preserve this value by reading the previous a2w_reg value
first and ORing the result with A2W_PLL_CHANNEL_DISABLE.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Fixes: 41691b8862e2 ("clk: bcm2835: Add support for programming the audio domain clocks")
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: tegra: Add VDD_GPU regulator to Jetson TX1

commit 5e6b9a89afceadb1ee45472098f7d20af260335c upstream.

Add the VDD_GPU regulator (a GPIO-enabled PWM regulator) to the Jetson
TX1 board. This addition allows the GPU to be used provided the
bootloader properly enabled the GPU node.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
[as pointed out by Thierry on IRC, nobody has reported a bug
in the field, but using a new bootloader with a .dtb that
has the incorrect data, it will crash on boot]
Fixes: 336f79c7b6d7 ("arm64: tegra: Add NVIDIA Jetson TX1 Developer Kit support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

gpio: chardev: Return error for seek operations

commit f4e81c529767b9a33d1b27695c54dc84a14af30d upstream.

The GPIO chardev is used for management tasks (allocating line and event
handles) and does neither support read() nor write() operations. Hence it
does not make much sense to allow seek operations.

Currently the chardev uses noop_llseek() for its seek implementation. This
function does not move the pointer and simply returns the current position
(always 0 for the GPIO chardev). noop_llseek() is primarily meant for
devices that can not support seek, but where there might be a user that
depends on the seek() operation succeeding. For newly added devices that
can not support seek operations it is recommended to use no_llseek(), which
will return an error. For more information see commit 6038f373a3dc
("llseek: automatically add .llseek fop").

Unfortunately this was overlooked when the GPIO chardev ABI was introduced.
But it is highly unlikely that since then userspace applications have
appeared that rely on being able to perform non-failing seek operations on
a GPIO chardev file descriptor. So it should be safe to change from
noop_llseel() to no_seek(). Also use nonseekable_open() in the chardev
open() callback to clear the FMODE_SEEK, FMODE_PREAD and FMODE_PWRITE flags
from the file. Neither of these should be set on a file that does not
support seek operations.

Fixes: 3c702e9987e2 ("gpio: add a userspace chardev ABI for GPIOs")
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

timekeeping_Force_unsigned_clocksource_to_nanoseconds_conversion

commit 9c1645727b8fa90d07256fdfcc45bf831242a3ab upstream.

The clocksource delta to nanoseconds conversion is using signed math, but
the delta is unsigned. This makes the conversion space smaller than
necessary and in case of a multiplication overflow the conversion can
become negative. The conversion is done with scaled math:

    s64 nsec_delta = ((s64)clkdelta * clk->mult) >> clk->shift;

Shifting a signed integer right obvioulsy preserves the sign, which has
interesting consequences:

- Time jumps backwards

- __iter_div_u64_rem() which is used in one of the calling code pathes
   will take forever to piecewise calculate the seconds/nanoseconds part.

This has been reported by several people with different scenarios:

David observed that when stopping a VM with a debugger:

"It was essentially the stopped by debugger case.  I forget exactly why,
  but the guest was being explicitly stopped from outside, it wasn't just
  scheduling lag.  I think it was something in the vicinity of 10 minutes
  stopped."

When lifting the stop the machine went dead.

The stopped by debugger case is not really interesting, but nevertheless it
would be a good thing not to die completely.

But this was also observed on a live system by Liav:

"When the OS is too overloaded, delta will get a high enough value for the
  msb of the sum delta * tkr->mult + tkr->xtime_nsec to be set, and so
  after the shift the nsec variable will gain a value similar to
  0xffffffffff000000."

Unfortunately this has been reintroduced recently with commit 6bd58f09e1d8
("time: Add cycles to nanoseconds translation"). It had been fixed a year
ago already in commit 35a4933a8959 ("time: Avoid signed overflow in
timekeeping_get_ns()").

Though it's not surprising that the issue has been reintroduced because the
function itself and the whole call chain uses s64 for the result and the
propagation of it. The change in this recent commit is subtle:

   s64 nsec;

-  nsec = (d * m + n) >> s:
+  nsec = d * m + n;
+  nsec >>= s;

d being type of cycle_t adds another level of obfuscation.

This wouldn't have happened if the previous change to unsigned computation
would have made the 'nsec' variable u64 right away and a follow up patch
had cleaned up the whole call chain.

There have been patches submitted which basically did a revert of the above
patch leaving everything else unchanged as signed. Back to square one. This
spawned a admittedly pointless discussion about potential users which rely
on the unsigned behaviour until someone pointed out that it had been fixed
before. The changelogs of said patches added further confusion as they made
finally false claims about the consequences for eventual users which expect
signed results.

Despite delta being cycle_t, aka. u64, it's very well possible to hand in
a signed negative value and the signed computation will happily return the
correct result. But nobody actually sat down and analyzed the code which
was added as user after the propably unintended signed conversion.

Though in sensitive code like this it's better to analyze it proper and
make sure that nothing relies on this than hunting the subtle wreckage half
a year later. After analyzing all call chains it stands that no caller can
hand in a negative value (which actually would work due to the s64 cast)
and rely on the signed math to do the right thing.

Change the conversion function to unsigned math. The conversion of all call
chains is done in a follow up patch.

This solves the starvation issue, which was caused by the negative result,
but it does not solve the underlying problem. It merily procrastinates
it. When the timekeeper update is deferred long enough that the unsigned
multiplication overflows, then time going backwards is observable again.

It does neither solve the issue of clocksources with a small counter width
which will wrap around possibly several times and cause random time stamps
to be generated. But those are usually not found on systems used for
virtualization, so this is likely a non issue.

I took the liberty to claim authorship for this simply because
analyzing all callsites and writing the changelog took substantially
more time than just making the simple s/s64/u64/ change and ignore the
rest.

Fixes: 6bd58f09e1d8 ("time: Add cycles to nanoseconds translation")
Reported-by: David Gibson <david@gibson.dropbear.id.au>
Reported-by: Liav Rehana <liavr@mellanox.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Parit Bhargava <prarit@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: "Christopher S. Hall" <christopher.s.hall@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20161208204228.688545601@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

regulator: stw481x-vmmc: fix ages old enable error

commit 295070e9aa015abb9b92cccfbb1e43954e938133 upstream.

The regulator has never been properly enabled, it has been
dormant all the time. It's strange that MMC was working
at all, but it likely worked by the signals going through
the levelshifter and reaching the card anyways.

Fixes: 3615a34ea1a6 ("regulator: add STw481x VMMC driver")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: sdhci: Fix recovery from tuning timeout

commit 61e53bd0047d58caee0c7170613045bf96de4458 upstream.

Clearing the tuning bits should reset the tuning circuit. However there is
more to do. Reset the command and data lines for good measure, and then
for eMMC ensure the card is not still trying to process a tuning command by
sending a stop command.

Note the JEDEC eMMC specification says the stop command (CMD12) can be used
to stop a tuning command (CMD21) whereas the SD specification is silent on
the subject with respect to the SD tuning command (CMD19). Considering that
CMD12 is not a valid SDIO command, the stop command is sent only when the
tuning command is CMD21 i.e. for eMMC. That addresses cases seen so far
which have been on eMMC.

Note that this replaces the commit fe5fb2e3b58f ("mmc: sdhci: Reset cmd and
data circuits after tuning failure") which is being reverted for v4.9+.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Dan O'Donovan <dan@emutex.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ath9k: Really fix LED polarity for some Mini PCI AR9220 MB92 cards.

commit 79e57dd113d307a6c74773b8aaecf5442068988a upstream.

The active_high LED of my Wistron DNMA-92 is still being recognized as
active_low on 4.7.6 mainline. When I was preparing my former commit
0f9edcdd88a9 ("ath9k: Fix LED polarity for some Mini PCI AR9220 MB92
cards.") to fix that I must have somehow messed up with testing, because
I tested the final version of that patch before sending it, and it was
apparently working; but now it is not working on 4.7.6 mainline.

I initially added the PCI_DEVICE_SUB section for 0x0029/0x2096 above the
PCI_VDEVICE section for 0x0029; but then I moved the former below the
latter after seeing how 0x002A sections were sorted in the file.

This turned out to be wrong: if a generic PCI_VDEVICE entry (that has
both subvendor and subdevice IDs set to PCI_ANY_ID) is put before a more
specific one (PCI_DEVICE_SUB), then the generic PCI_VDEVICE entry will
match first and will be used.

With this patch, 0x0029/0x2096 has finally got active_high LED on 4.7.6.

While I'm at it, let's fix 0x002A too by also moving its generic definition
below its specific ones.

Fixes: 0f9edcdd88a9 ("ath9k: Fix LED polarity for some Mini PCI AR9220 MB92 cards.")
Signed-off-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
[kvalo@qca.qualcomm.com: improve the commit log based on email discussions]
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ath9k: fix ath9k_hw_gpio_get() to return 0 or 1 on success

commit 91851cc7a939039bd401adb6ca3da4402bec1d0c upstream.

Commit b2d70d4944c1 ("ath9k: make GPIO API to support both of WMAC and
SOC") refactored ath9k_hw_gpio_get() to support both WMAC and SOC GPIOs,
changing the return on success from 1 to BIT(gpio). This broke some callers
like ath_is_rfkill_set(). This doesn't fix any known bug in mainline at the
moment, but should be fixed anyway.

Instead of fixing all callers, change ath9k_hw_gpio_get() back to only
return 0 or 1.

Fixes: b2d70d4944c1 ("ath9k: make GPIO API to support both of WMAC and SOC")
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
[kvalo@qca.qualcomm.com: mention that doesn't fix any known bug]
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cfg80211/mac80211: fix BSS leaks when abandoning assoc attempts

commit e6f462df9acd2a3295e5d34eb29e2823220cf129 upstream.

When mac80211 abandons an association attempt, it may free
all the data structures, but inform cfg80211 and userspace
about it only by sending the deauth frame it received, in
which case cfg80211 has no link to the BSS struct that was
used and will not cfg80211_unhold_bss() it.

Fix this by providing a way to inform cfg80211 of this with
the BSS entry passed, so that it can clean up properly, and
use this ability in the appropriate places in mac80211.

This isn't ideal: some code is more or less duplicated and
tracing is missing. However, it's a fairly small change and
it's thus easier to backport - cleanups can come later.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rtl8xxxu: Work around issue with 8192eu and 8723bu devices not reconnecting

commit c59f13bbead475096bdfebc7ef59c12e180858de upstream.

The H2C MEDIA_STATUS_RPT command for some reason causes 8192eu and
8723bu devices not being able to reconnect.

Reported-by: Barry Day <briselec@gmail.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf/x86/intel/cstate: Prevent hotplug callback leak

commit 834fcd298003c10ce450e66960c78893cb1cc4b5 upstream.

If the pmu registration fails the registered hotplug callbacks are not
removed. Wrong in any case, but fatal in case of a modular driver.

Replace the nonsensical state names with proper ones while at it.

Fixes: 77c34ef1c319 ("perf/x86/intel/cstate: Convert Intel CSTATE to hotplug state machine")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf/x86: Fix exclusion of BTS and LBR for Goldmont

commit b0c1ef52959582144bbea9a2b37db7f4c9e399f7 upstream.

An earlier patch allowed enabling PT and LBR at the same
time on Goldmont. However it also allowed enabling BTS and LBR
at the same time, which is still not supported. Fix this by
bypassing the check only for PT.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: alexander.shishkin@intel.com
Cc: kan.liang@intel.com
Fixes: ccbebba4c6bf ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports it")
Link: http://lkml.kernel.org/r/20161209001417.4713-1-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rtlwifi: Fix enter/exit power_save

commit ba9f93f82abafe2552eac942ebb11c2df4f8dd7f upstream.

In commit a5ffbe0a1993 ("rtlwifi: Fix scheduling while atomic bug") and
commit a269913c52ad ("rtlwifi: Rework rtl_lps_leave() and rtl_lps_enter()
to use work queue"), an error was introduced in the power-save routines
due to the fact that leaving PS was delayed by the use of a work queue.

This problem is fixed by detecting if the enter or leave routines are
in interrupt mode. If so, the workqueue is used to place the request.
If in normal mode, the enter or leave routines are called directly.

Fixes: a269913c52ad ("rtlwifi: Rework rtl_lps_leave() and rtl_lps_enter() to use work queue")
Reported-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ssb: Fix error routine when fallback SPROM fails

commit 8052d7245b6089992343c80b38b14dbbd8354651 upstream.

When there is a CRC error in the SPROM read from the device, the code
attempts to handle a fallback SPROM. When this also fails, the driver
returns zero rather than an error code.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Linux 4.8.16

driver core: fix race between creating/querying glue dir and its cleanup

commit cebf8fd16900fdfd58c0028617944f808f97fe50 upstream.

The global mutex of 'gdp_mutex' is used to serialize creating/querying
glue dir and its cleanup. Turns out it isn't a perfect way because
part(kobj_kset_leave()) of the actual cleanup action() is done inside
the release handler of the glue dir kobject. That means gdp_mutex has
to be held before releasing the last reference count of the glue dir
kobject.

This patch moves glue dir's cleanup after kobject_del() in device_del()
for avoiding the race.

Cc: Yijing Wang <wangyijing@huawei.com>
Reported-by: Chandra Sekhar Lingutla <clingutla@codeaurora.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "netfilter: move nat hlist_head to nf_conn"

This reverts commit 7c9664351980aaa6a4b8837a314360b3a4ad382a as it is
not working properly. Please move to 4.9 to get the full fix.

Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "netfilter: nat: convert nat bysrc hash to rhashtable"

This reverts commit 870190a9ec9075205c0fa795a09fa931694a3ff1 as it is
not working properly. Please move to 4.9 to get the full fix.

Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: mark reserved memblock regions explicitly in iomem

commit e7cd190385d17790cc3eb3821b1094b00aacf325 upstream.

Kdump(kexec-tools) parses /proc/iomem to identify all the memory regions
on the system. Since the current kernel names "nomap" regions, like UEFI
runtime services code/data, as "System RAM," kexec-tools sets up elf core
header to include them in a crash dump file (/proc/vmcore).

Then crash dump kernel parses UEFI memory map again, re-marks those regions
as "nomap" and does not create a memory mapping for them unlike the other
areas of System RAM. In this case, copying /proc/vmcore through
copy_oldmem_page() on crash dump kernel will end up with a kernel abort,
as reported in [1].

This patch names all the "nomap" regions explicitly as "reserved" so that
we can exclude them from a crash dump file. acpi_os_ioremap() must also
be modified because those regions have WB attributes [2].

Apart from kdump, this change also matches x86's use of acpi (and
/proc/iomem).

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-August/448186.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-August/450089.html

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Matthias Brugger <mbrugger@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xfs: set AGI buffer type in xlog_recover_clear_agi_bucket

commit 6b10b23ca94451fae153a5cc8d62fd721bec2019 upstream.

xlog_recover_clear_agi_bucket didn't set the
type to XFS_BLFT_AGI_BUF, so we got a warning during log
replay (or an ASSERT on a debug build).

XFS (md0): Unknown buffer type 0!
XFS (md0): _xfs_buf_ioapply: no ops on block 0xaea8802/0x1

Fix this, as was done in f19b872b for 2 other locations
with the same problem.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm/xen: Use alloc_percpu rather than __alloc_percpu

commit 24d5373dda7c00a438d26016bce140299fae675e upstream.

The function xen_guest_init is using __alloc_percpu with an alignment
which are not power of two.

However, the percpu allocator never supported alignments which are not power
of two and has always behaved incorectly in thise case.

Commit 3ca45a4 "percpu: ensure requested alignment is power of two"
introduced a check which trigger a warning [1] when booting linux-next
on Xen. But in reality this bug was always present.

This can be fixed by replacing the call to __alloc_percpu with
alloc_percpu. The latter will use an alignment which are a power of two.

[1]

[    0.023921] illegal size (48) or align (48) for percpu allocation
[    0.024167] ------------[ cut here ]------------
[    0.024344] WARNING: CPU: 0 PID: 1 at linux/mm/percpu.c:892 pcpu_alloc+0x88/0x6c0
[    0.024584] Modules linked in:
[    0.024708]
[    0.024804] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
4.9.0-rc7-next-20161128 #473
[    0.025012] Hardware name: Foundation-v8A (DT)
[    0.025162] task: ffff80003d870000 task.stack: ffff80003d844000
[    0.025351] PC is at pcpu_alloc+0x88/0x6c0
[    0.025490] LR is at pcpu_alloc+0x88/0x6c0
[    0.025624] pc : [<ffff00000818e678>] lr : [<ffff00000818e678>]
pstate: 60000045
[    0.025830] sp : ffff80003d847cd0
[    0.025946] x29: ffff80003d847cd0 x28: 0000000000000000
[    0.026147] x27: 0000000000000000 x26: 0000000000000000
[    0.026348] x25: 0000000000000000 x24: 0000000000000000
[    0.026549] x23: 0000000000000000 x22: 00000000024000c0
[    0.026752] x21: ffff000008e97000 x20: 0000000000000000
[    0.026953] x19: 0000000000000030 x18: 0000000000000010
[    0.027155] x17: 0000000000000a3f x16: 00000000deadbeef
[    0.027357] x15: 0000000000000006 x14: ffff000088f79c3f
[    0.027573] x13: ffff000008f79c4d x12: 0000000000000041
[    0.027782] x11: 0000000000000006 x10: 0000000000000042
[    0.027995] x9 : ffff80003d847a40 x8 : 6f697461636f6c6c
[    0.028208] x7 : 6120757063726570 x6 : ffff000008f79c84
[    0.028419] x5 : 0000000000000005 x4 : 0000000000000000
[    0.028628] x3 : 0000000000000000 x2 : 000000000000017f
[    0.028840] x1 : ffff80003d870000 x0 : 0000000000000035
[    0.029056]
[    0.029152] ---[ end trace 0000000000000000 ]---
[    0.029297] Call trace:
[    0.029403] Exception stack(0xffff80003d847b00 to
                               0xffff80003d847c30)
[    0.029621] 7b00: 0000000000000030 0001000000000000
ffff80003d847cd0 ffff00000818e678
[    0.029901] 7b20: 0000000000000002 0000000000000004
ffff000008f7c060 0000000000000035
[    0.030153] 7b40: ffff000008f79000 ffff000008c4cd88
ffff80003d847bf0 ffff000008101778
[    0.030402] 7b60: 0000000000000030 0000000000000000
ffff000008e97000 00000000024000c0
[    0.030647] 7b80: 0000000000000000 0000000000000000
0000000000000000 0000000000000000
[    0.030895] 7ba0: 0000000000000035 ffff80003d870000
000000000000017f 0000000000000000
[    0.031144] 7bc0: 0000000000000000 0000000000000005
ffff000008f79c84 6120757063726570
[    0.031394] 7be0: 6f697461636f6c6c ffff80003d847a40
0000000000000042 0000000000000006
[    0.031643] 7c00: 0000000000000041 ffff000008f79c4d
ffff000088f79c3f 0000000000000006
[    0.031877] 7c20: 00000000deadbeef 0000000000000a3f
[    0.032051] [<ffff00000818e678>] pcpu_alloc+0x88/0x6c0
[    0.032229] [<ffff00000818ece8>] __alloc_percpu+0x18/0x20
[    0.032409] [<ffff000008d9606c>] xen_guest_init+0x174/0x2f4
[    0.032591] [<ffff0000080830f8>] do_one_initcall+0x38/0x130
[    0.032783] [<ffff000008d90c34>] kernel_init_freeable+0xe0/0x248
[    0.032995] [<ffff00000899a890>] kernel_init+0x10/0x100
[    0.033172] [<ffff000008082ec0>] ret_from_fork+0x10/0x50

Reported-by: Wei Chen <wei.chen@arm.com>
Link: https://lkml.org/lkml/2016/11/28/669
Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xen/gntdev: Use VM_MIXEDMAP instead of VM_IO to avoid NUMA balancing

commit 30faaafdfa0c754c91bac60f216c9f34a2bfdf7e upstream.

Commit 9c17d96500f7 ("xen/gntdev: Grant maps should not be subject to
NUMA balancing") set VM_IO flag to prevent grant maps from being
subjected to NUMA balancing.

It was discovered recently that this flag causes get_user_pages() to
always fail with -EFAULT.

check_vma_flags
__get_user_pages
__get_user_pages_locked
__get_user_pages_unlocked
get_user_pages_fast
iov_iter_get_pages
dio_refill_pages
do_direct_IO
do_blockdev_direct_IO
do_blockdev_direct_IO
ext4_direct_IO_read
generic_file_read_iter
aio_run_iocb

(which can happen if guest's vdisk has direct-io-safe option).

To avoid this let's use VM_MIXEDMAP flag instead --- it prevents
NUMA balancing just as VM_IO does and has no effect on
check_vma_flags().

Reported-by: Olaf Hering <olaf@aepfle.de>
Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Hugh Dickins <hughd@google.com>
Tested-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tpm xen: Remove bogus tpm_chip_unregister

commit 1f0f30e404b3d8f4597a2d9b77fba55452f8fd0e upstream.

tpm_chip_unregister can only be called after tpm_chip_register.
devm manages the allocation so no unwind is needed here.

Fixes: afb5abc262e96 ("tpm: two-phase chip management functions")
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

kernel/debug/debug_core.c: more properly delay for secondary CPUs

commit 2d13bb6494c807bcf3f78af0e96c0b8615a94385 upstream.

We've got a delay loop waiting for secondary CPUs.  That loop uses
loops_per_jiffy.  However, loops_per_jiffy doesn't actually mean how
many tight loops make up a jiffy on all architectures.  It is quite
common to see things like this in the boot log:

  Calibrating delay loop (skipped), value calculated using timer
  frequency.. 48.00 BogoMIPS (lpj=24000)

In my case I was seeing lots of cases where other CPUs timed out
entering the debugger only to print their stack crawls shortly after the
kdb> prompt was written.

Elsewhere in kgdb we already use udelay(), so that should be safe enough
to use to implement our timeout.  We'll delay 1 ms for 1000 times, which
should give us a full second of delay (just like the old code wanted)
but allow us to notice that we're done every 1 ms.

[akpm@linux-foundation.org: simplifications, per Daniel]
Link: http://lkml.kernel.org/r/1477091361-2039-1-git-send-email-dianders@chromium.org
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Brian Norris <briannorris@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

watchdog: qcom: fix kernel panic due to external abort on non-linefetch

commit f06f35c66fdbd5ac38901a3305ce763a0cd59375 upstream.

This patch fixes a off-by-one in the "watchdog: qcom: add option for
standalone watchdog not in timer block" patch that causes the
following panic on boot:

> Unhandled fault: external abort on non-linefetch (0x1008) at 0xc8874002
> pgd = c0204000
> [c8874002] *pgd=87806811, *pte=0b017653, *ppte=0b017453
> Internal error: : 1008 [#1] SMP ARM
> CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.8.6 #0
> Hardware name: Generic DT based system
> PC is at 0xc02222f4
> LR is at 0x1
> pc : [<c02222f4>]    lr : [<00000001>]    psr: 00000113
> sp : c782fc98  ip : 00000003  fp : 00000000
> r10: 00000004  r9 : c782e000  r8 : c04ab98c
> r7 : 00000001  r6 : c8874002  r5 : c782fe00  r4 : 00000002
> r3 : 00000000  r2 : c782fe00  r1 : 00100000  r0 : c8874002
> Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> Control: 10c5387d  Table: 8020406a  DAC: 00000051
> Process swapper/0 (pid: 1, stack limit = 0xc782e210)
> Stack: (0xc782fc98 to 0xc7830000)
> [...]

The WDT_STS (status) needs to be translated via wdt_addr as well.

fixes: f0d9d0f4b44a ("watchdog: qcom: add option for standalone watchdog not in timer block")
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

watchdog: mei_wdt: request stop on reboot to prevent false positive event

commit 9eff1140a82db8c5520f76e51c21827b4af670b3 upstream.

Systemd on reboot enables shutdown watchdog that leaves the watchdog
device open to ensure that even if power down process get stuck the
platform reboots nonetheless.
The iamt_wdt is an alarm-only watchdog and can't reboot system, but the
FW will generate an alarm event reboot was completed in time, as the
watchdog is not automatically disabled during power cycle.
So we should request stop watchdog on reboot to eliminate wrong alarm
from the FW.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

kernel/watchdog: use nmi registers snapshot in hardlockup handler

commit 4d1f0fb096aedea7bb5489af93498a82e467c480 upstream.

NMI handler doesn't call set_irq_regs(), it's set only by normal IRQ.
Thus get_irq_regs() returns NULL or stale registers snapshot with IP/SP
pointing to the code interrupted by IRQ which was interrupted by NMI.
NULL isn't a problem: in this case watchdog calls dump_stack() and
prints full stack trace including NMI. But if we're stuck in IRQ
handler then NMI watchlog will print stack trace without IRQ part at
all.

This patch uses registers snapshot passed into NMI handler as arguments:
these registers point exactly to the instruction interrupted by NMI.

Fixes: 55537871ef66 ("kernel/watchdog.c: perform all-CPU backtrace in case of hard lockup")
Link: http://lkml.kernel.org/r/146771764784.86724.6006627197118544150.stgit@buzz
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

CIFS: Fix a possible memory corruption in push locks

commit e3d240e9d505fc67f8f8735836df97a794bbd946 upstream.

If maxBuf is not 0 but less than a size of SMB2 lock structure
we can end up with a memory corruption.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

CIFS: Fix missing nls unload in smb2_reconnect()

commit 4772c79599564bd08ee6682715a7d3516f67433f upstream.

Acked-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

CIFS: Fix a possible memory corruption during reconnect

commit 53e0e11efe9289535b060a51d4cf37c25e0d0f2b upstream.

We can not unlock/lock cifs_tcp_ses_lock while walking through ses
and tcon lists because it can corrupt list iterator pointers and
a tcon structure can be released if we don't hold an extra reference.
Fix it by moving a reconnect process to a separate delayed work
and acquiring a reference to every tcon that needs to be reconnected.
Also do not send an echo request on newly established connections.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ASoC: intel: Fix crash at suspend/resume without card registration

commit 2fc995a87f2efcd803438f07bfecd35cc3d90d32 upstream.

When ASoC Intel SST Medfield driver is probed but without codec / card
assigned, it causes an Oops and freezes the kernel at suspend/resume,

PM: Suspending system (freeze)
Suspending console(s) (use no_console_suspend to debug)
BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
IP: [<ffffffffc09d9409>] sst_soc_prepare+0x19/0xa0 [snd_soc_sst_mfld_platform]
Oops: 0000 [#1] PREEMPT SMP
CPU: 0 PID: 1552 Comm: systemd-sleep Tainted: G W 4.9.0-rc6-1.g5f5c2ad-default #1
Call Trace:
  [<ffffffffb45318f9>] dpm_prepare+0x209/0x460
  [<ffffffffb4531b61>] dpm_suspend_start+0x11/0x60
  [<ffffffffb40d3cc2>] suspend_devices_and_enter+0xb2/0x710
  [<ffffffffb40d462e>] pm_suspend+0x30e/0x390
  [<ffffffffb40d2eba>] state_store+0x8a/0x90
  [<ffffffffb43c670f>] kobj_attr_store+0xf/0x20
  [<ffffffffb42b0d97>] sysfs_kf_write+0x37/0x40
  [<ffffffffb42b02bc>] kernfs_fop_write+0x11c/0x1b0
  [<ffffffffb422be68>] __vfs_write+0x28/0x140
  [<ffffffffb43728a8>] ? apparmor_file_permission+0x18/0x20
  [<ffffffffb433b2ab>] ? security_file_permission+0x3b/0xc0
  [<ffffffffb422d095>] vfs_write+0xb5/0x1a0
  [<ffffffffb422e3d6>] SyS_write+0x46/0xa0
  [<ffffffffb4719fbb>] entry_SYSCALL_64_fastpath+0x1e/0xad

Add proper NULL checks in the PM code of mdfld driver.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm space map metadata: fix 'struct sm_metadata' leak on failed create

commit 314c25c56c1ee5026cf99c570bdfe01847927acb upstream.

In dm_sm_metadata_create() we temporarily change the dm_space_map
operations from 'ops' (whose .destroy function deallocates the
sm_metadata) to 'bootstrap_ops' (whose .destroy function doesn't).

If dm_sm_metadata_create() fails in sm_ll_new_metadata() or
sm_ll_extend(), it exits back to dm_tm_create_internal(), which calls
dm_sm_destroy() with the intention of freeing the sm_metadata, but it
doesn't (because the dm_space_map operations is still set to
'bootstrap_ops').

Fix this by setting the dm_space_map operations back to 'ops' if
dm_sm_metadata_create() fails when it is set to 'bootstrap_ops'.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm raid: fix discard support regression

commit 11e2968478edc07a75ee1efb45011b3033c621c2 upstream.

Commit ecbfb9f118 ("dm raid: add raid level takeover support") moved the
configure_discard_support() call from raid_ctr() to raid_preresume().

Enabling/disabling discard _must_ happen during table load (through the
.ctr hook). Fix this regression by moving the
configure_discard_support() call back to raid_ctr().

Fixes: ecbfb9f118 ("dm raid: add raid level takeover support")
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm rq: fix a race condition in rq_completed()

commit d15bb3a6467e102e60d954aadda5fb19ce6fd8ec upstream.

It is required to hold the queue lock when calling blk_run_queue_async()
to avoid that a race between blk_run_queue_async() and
blk_cleanup_queue() is triggered.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm crypt: mark key as invalid until properly loaded

commit 265e9098bac02bc5e36cda21fdbad34cb5b2f48d upstream.

In crypt_set_key(), if a failure occurs while replacing the old key
(e.g. tfm->setkey() fails) the key must not have DM_CRYPT_KEY_VALID flag
set. Otherwise, the crypto layer would have an invalid key that still
has DM_CRYPT_KEY_VALID flag set.

Signed-off-by: Ondrej Kozina <okozina@redhat.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm flakey: return -EINVAL on interval bounds error in flakey_ctr()

commit bff7e067ee518f9ed7e1cbc63e4c9e01670d0b71 upstream.

Fix to return error code -EINVAL instead of 0, as is done elsewhere in
this function.

Fixes: e80d1c805a3b ("dm: do not override error code returned from dm_get_device()")
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm table: an 'all_blk_mq' table must be loaded for a blk-mq DM device

commit 301fc3f5efb98633115bd887655b19f42c6dfaa8 upstream.

When dm_table_set_type() is used by a target to establish a DM table's
type (e.g. DM_TYPE_MQ_REQUEST_BASED in the case of DM multipath) the
DM core must go on to verify that the devices in the table are
compatible with the established type.

Fixes: e83068a5 ("dm mpath: add optional "queue_mode" feature")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm table: fix 'all_blk_mq' inconsistency when an empty table is loaded

commit 6936c12cf809850180b24947271b8f068fdb15e9 upstream.

An earlier DM multipath table could have been build ontop of underlying
devices that were all using blk-mq. In that case, if that active
multipath table is replaced with an empty DM multipath table (that
reflects all paths have failed) then it is important that the
'all_blk_mq' state of the active table is transfered to the new empty DM
table. Otherwise dm-rq.c:dm_old_prep_tio() will incorrectly clone a
request that isn't needed by the DM multipath target when it is to issue
IO to an underlying blk-mq device.

Fixes: e83068a5 ("dm mpath: add optional "queue_mode" feature")
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

blk-mq: Do not invoke .queue_rq() for a stopped queue

commit bc27c01b5c46d3bfec42c96537c7a3fae0bb2cc4 upstream.

The meaning of the BLK_MQ_S_STOPPED flag is "do not call
.queue_rq()". Hence modify blk_mq_make_request() such that requests
are queued instead of issued if a queue has been stopped.

Reported-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

PM / OPP: Pass opp_table to dev_pm_opp_put_regulator()

commit 91291d9ad92faa65a56a9a19d658d8049b78d3d4 upstream.

Joonyoung Shim reported an interesting problem on his ARM octa-core
Odoroid-XU3 platform. During system suspend, dev_pm_opp_put_regulator()
was failing for a struct device for which dev_pm_opp_set_regulator() is
called earlier.

This happened because an earlier call to
dev_pm_opp_of_cpumask_remove_table() function (from cpufreq-dt.c file)
removed all the entries from opp_table->dev_list apart from the last CPU
device in the cpumask of CPUs sharing the OPP.

But both dev_pm_opp_set_regulator() and dev_pm_opp_put_regulator()
routines get CPU device for the first CPU in the cpumask. And so the OPP
core failed to find the OPP table for the struct device.

This patch attempts to fix this problem by returning a pointer to the
opp_table from dev_pm_opp_set_regulator() and using that as the
parameter to dev_pm_opp_put_regulator(). This ensures that the
dev_pm_opp_put_regulator() doesn't fail to find the opp table.

Note that similar design problem also exists with other
dev_pm_opp_put_*() APIs, but those aren't used currently by anyone and
so we don't need to update them for now.

Reported-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[ Viresh: Wrote commit log and tested on exynos 5250 ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: gadget: composite: always set ep->mult to a sensible value

commit eaa496ffaaf19591fe471a36cef366146eeb9153 upstream.

ep->mult is supposed to be set to Isochronous and
Interrupt Endapoint's multiplier value. This value
is computed from different places depending on the
link speed.

If we're dealing with HighSpeed, then it's part of
bits [12:11] of wMaxPacketSize. This case wasn't
taken into consideration before.

While at that, also make sure the ep->mult defaults
to one so drivers can use it unconditionally and
assume they'll never multiply ep->maxpacket to zero.

Cc: <stable@vger.kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm, page_alloc: keep pcp count and list contents in sync if struct page is corrupted

commit a6de734bc002fe2027ccc074fbbd87d72957b7a4 upstream.

Vlastimil Babka pointed out that commit 479f854a207c ("mm, page_alloc:
defer debugging checks of pages allocated from the PCP") will allow the
per-cpu list counter to be out of sync with the per-cpu list contents if
a struct page is corrupted.

The consequence is an infinite loop if the per-cpu lists get fully
drained by free_pcppages_bulk because all the lists are empty but the
count is positive.  The infinite loop occurs here

                do {
                        batch_free++;
                        if (++migratetype == MIGRATE_PCPTYPES)
                                migratetype = 0;
                        list = &pcp->lists[migratetype];
                } while (list_empty(list));

What the user sees is a bad page warning followed by a soft lockup with
interrupts disabled in free_pcppages_bulk().

This patch keeps the accounting in sync.

Fixes: 479f854a207c ("mm, page_alloc: defer debugging checks of pages allocated from the PCP")
Link: http://lkml.kernel.org/r/20161202112951.23346-2-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/vmscan.c: set correct defer count for shrinker

commit 5f33a0803bbd781de916f5c7448cbbbbc763d911 upstream.

Our system uses significantly more slab memory with memcg enabled with
the latest kernel.  With 3.10 kernel, slab uses 2G memory, while with
4.6 kernel, 6G memory is used.  The shrinker has problem.  Let's see we
have two memcg for one shrinker.  In do_shrink_slab:

1. Check cg1.  nr_deferred = 0, assume total_scan = 700.  batch size
   is 1024, then no memory is freed.  nr_deferred = 700

2. Check cg2.  nr_deferred = 700.  Assume freeable = 20, then
   total_scan = 10 or 40.  Let's assume it's 10.  No memory is freed.
   nr_deferred = 10.

The deferred share of cg1 is lost in this case.  kswapd will free no
memory even run above steps again and again.

The fix makes sure one memcg's deferred share isn't lost.

Link: http://lkml.kernel.org/r/2414be961b5d25892060315fbb56bb19d81d0c07.1476227351.git.shli@fb.com
Signed-off-by: Shaohua Li <shli@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nvmet: Fix possible infinite loop triggered on hot namespace removal

commit e4fcf07cca6a3b6c4be00df16f08be894325eaa3 upstream.

When removing a namespace we delete it from the subsystem namespaces
list with list_del_init which allows us to know if it is enabled or
not.

The problem is that list_del_init initialize the list next and does
not respect the RCU list-traversal we do on the IO path for locating
a namespace. Instead we need to use list_del_rcu which is allowed to
run concurrently with the _rcu list-traversal primitives (keeps list
next intact) and guarantees concurrent nvmet_find_naespace forward
progress.

By changing that, we cannot rely on ns->dev_link for knowing if the
namspace is enabled, so add enabled indicator entry to nvmet_ns for
that.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Solganik Alexander <sashas@lightbitslabs.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

loop: return proper error from loop_queue_rq()

commit b4a567e8114327518c09f5632339a5954ab975a3 upstream.

->queue_rq() should return one of the BLK_MQ_RQ_QUEUE_* constants, not
an errno.

Fixes: f4aa4c7bbac6 ("block: loop: convert to per-device workqueue")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

f2fs: fix overflow due to condition check order

commit e87f7329bbd6760c2acc4f1eb423362b08851a71 upstream.

In the last ilen case, i was already increased, resulting in accessing out-
of-boundary entry of do_replace and blkaddr.
Fix to check ilen first to exit the loop.

Fixes: 2aa8fbb9693020 ("f2fs: refactor __exchange_data_block for speed up")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>