git.ipfire.org Git - thirdparty/kernel/stable.git/log

Linux 3.16.7-ckt23

Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

KEYS: Fix keyring ref leak in join_session_keyring()

commit 23567fd052a9abb6d67fe8e7a9ccdd9800a540f2 upstream.

This fixes CVE-2016-0728.

If a thread is asked to join as a session keyring the keyring that's already
set as its session, we leak a keyring reference.

This can be tested with the following program:

#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <keyutils.h>

int main(int argc, const char *argv[])
{
int i = 0;
key_serial_t serial;

serial = keyctl(KEYCTL_JOIN_SESSION_KEYRING,
"leaked-keyring");
if (serial < 0) {
perror("keyctl");
return -1;
}

if (keyctl(KEYCTL_SETPERM, serial,
   KEY_POS_ALL | KEY_USR_ALL) < 0) {
perror("keyctl");
return -1;
}

for (i = 0; i < 100; i++) {
serial = keyctl(KEYCTL_JOIN_SESSION_KEYRING,
"leaked-keyring");
if (serial < 0) {
perror("keyctl");
return -1;
}
}

return 0;
}

If, after the program has run, there something like the following line in
/proc/keys:

3f3d898f I--Q---   100 perm 3f3f0000     0     0 keyring   leaked-keyring: empty

with a usage count of 100 * the number of times the program has been run,
then the kernel is malfunctioning.  If leaked-keyring has zero usages or
has been garbage collected, then the problem is fixed.

Reported-by: Yevgeny Pats <yevgeny@perception-point.io>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

udp: properly support MSG_PEEK with truncated buffers

commit 197c949e7798fbf28cfadc69d9ca0c2abbf93191 upstream.

Backport of this upstream commit into stable kernels :
89c22d8c3b27 ("net: Fix skb csum races when peeking")
exposed a bug in udp stack vs MSG_PEEK support, when user provides
a buffer smaller than skb payload.

In this case,
skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr),
msg->msg_iov);
returns -EFAULT.

This bug does not happen in upstream kernels since Al Viro did a great
job to replace this into :
skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr), msg);
This variant is safe vs short buffers.

For the time being, instead reverting Herbert Xu patch and add back
skb->ip_summed invalid changes, simply store the result of
udp_lib_checksum_complete() so that we avoid computing the checksum a
second time, and avoid the problematic
skb_copy_and_csum_datagram_iovec() call.

This patch can be applied on recent kernels as it avoids a double
checksumming, then backported to stable kernels as a bug fix.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

Revert "[stable-only] net: add length argument to skb_copy_and_csum_datagram_iovec"

This reverts commit fa89ae5548ed282f0ceb4660b3b93e4e2ee875f3.

As reported by Michal Kubecek, the backport of commit 89c22d8c3b27
("net: Fix skb csum races when peeking") exposed an upstream bug.
It is being reverted and replaced by a better fix.

Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

firmware: dmi_scan: Fix UUID endianness for SMBIOS >= 2.6

commit ff4319dc7cd58c92b389960e375038335d157a60 upstream.

The dmi_ver wasn't updated correctly before the dmi_decode method run
to save the uuid.

That resulted in "dmidecode -s system-uuid" and
/sys/class/dmi/id/product_uuid disagreeing. The latter was buggy and
this fixes it.

Reported-by: Federico Simoncelli <fsimonce@redhat.com>
Fixes: 9f9c9cbb6057 ("drivers/firmware/dmi_scan.c: fetch dmi version from SMBIOS if it exists")
Fixes: 79bae42d51a5 ("dmi_scan: refactor dmi_scan_machine(), {smbios,dmi}_present()")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

kvm: x86: only channel 0 of the i8254 is linked to the HPET

commit e5e57e7a03b1cdcb98e4aed135def2a08cbf3257 upstream.

While setting the KVM PIT counters in 'kvm_pit_load_count', if
'hpet_legacy_start' is set, the function disables the timer on
channel[0], instead of the respective index 'channel'. This is
because channels 1-3 are not linked to the HPET. Fix the caller
to only activate the special HPET processing for channel 0.

Reported-by: P J P <pjp@fedoraproject.org>
Fixes: 0185604c2d82c560dab2f2933a18f797e74ab5a8
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

net: possible use after free in dst_release

commit 07a5d38453599052aff0877b16bb9c1585f08609 upstream.

dst_release should not access dst->flags after decrementing
__refcnt to 0. The dst_entry may be in dst_busy_list and
dst_gc_task may dst_destroy it before dst_release gets a chance
to access dst->flags.

Fixes: d69bbf88c8d0 ("net: fix a race in dst_release()")
Fixes: 27b75c95f10d ("net: avoid RCU for NOCACHE dst")
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

commit 55795ef5469290f89f04e12e662ded604909e462 upstream.

The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
instructions since it XORs A with X while all the others replace A with
some loaded value.  All the BPF JITs fail to clear A if this is used as
the first instruction in a filter.  This was found using american fuzzy
lop.

Add a helper to determine if A needs to be cleared given the first
instruction in a filter, and use this in the JITs.  Except for ARM, the
rest have only been compile-tested.

Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
Signed-off-by: Rabin Vincent <rabin@rab.in>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ASoC: Use nested lock for snd_soc_dapm_mutex_lock

commit 783513eec3209542fcd6ac0cbcb030b3c17a4827 upstream.

snd_soc_dapm_mutex_lock currently uses the un-nested call which can
cause lockdep warnings when called from control handlers (a relatively
common usage) and using modules. As creating the control causes a
potential mutex inversion with the handler, creating the control will
take the controls_rwsem under the dapm_mutex and accessing the control
will take the dapm_mutex under controls_rwsem.

All the users look like they want to be using the runtime class of the
lock anyway, so this patch just changes snd_soc_dapm_mutex_lock to use
the nested call, with the SND_SOC_DAPM_CLASS_RUNTIME class.

Fixes: f6d5e586b416 ("ASoC: dapm: Add helpers to lock/unlock DAPM mutex")
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ftrace/module: Call clean up function when module init fails early

commit 049fb9bd416077b3622d317a45796be4f2431df3 upstream.

If the module init code fails after calling ftrace_module_init() and before
calling do_init_module(), we can suffer from a memory leak. This is because
ftrace_module_init() allocates pages to store the locations that ftrace
hooks are placed in the module text. If do_init_module() fails, it still
calls the MODULE_GOING notifiers which will tell ftrace to do a clean up of
the pages it allocated for the module. But if load_module() fails before
then, the pages allocated by ftrace_module_init() will never be freed.

Call ftrace_release_mod() on the module if load_module() fails before
getting to do_init_module().

Link: http://lkml.kernel.org/r/567CEA31.1070507@intel.com
Reported-by: "Qiu, PeiyangX" <peiyangx.qiu@intel.com>
Fixes: a949ae560a511 "ftrace/module: Hardcode ftrace_module_init() call into load_module()"
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

async_tx: use GFP_NOWAIT rather than GFP_IO

commit b02bab6b0f928d49dbfb03e1e4e9dd43647623d7 upstream.

These async_XX functions are called from md/raid5 in an atomic
section, between get_cpu() and put_cpu(), so they must not sleep.
So use GFP_NOWAIT rather than GFP_IO.

Dan Williams writes: Longer term async_tx needs to be merged into md
directly as we can allocate this unmap data statically per-stripe
rather than per request.

Fixed: 7476bd79fc01 ("async_pq: convert to dmaengine_unmap_data")
Reported-and-tested-by: Stanislav Samsonov <slava@annapurnalabs.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ASoC: arizona: Fix bclk for sample rates that are multiple of 4kHz

commit e73694d871867cae8471d2350ce89acb38bc2b63 upstream.

For a sample rate of 12kHz the bclk was taken from the 44.1kHz table as
we test for a multiple of 8kHz. This patch fixes this issue by testing
for multiples of 4kHz instead.

Signed-off-by: Nikesh Oswal <Nikesh.Oswal@cirrus.com>
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

x86/mce: Ensure offline CPUs don't participate in rendezvous process

commit d90167a941f62860f35eb960e1012aa2d30e7e94 upstream.

Intel's MCA implementation broadcasts MCEs to all CPUs on the
node. This poses a problem for offlined CPUs which cannot
participate in the rendezvous process:

  Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
  Kernel Offset: disabled
  Rebooting in 100 seconds..

More specifically, Linux does a soft offline of a CPU when
writing a 0 to /sys/devices/system/cpu/cpuX/online, which
doesn't prevent the #MC exception from being broadcasted to that
CPU.

Ensure that offline CPUs don't participate in the MCE rendezvous
and clear the RIP valid status bit so that a second MCE won't
cause a shutdown.

Without the patch, mce_start() will increment mce_callin and
wait for all CPUs. Offlined CPUs should avoid participating in
the rendezvous process altogether.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1449742346-21470-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

dts: vt8500: Add SDHC node to DTS file for WM8650

commit 0f090bf14e51e7eefb71d9d1c545807f8b627986 upstream.

Since WM8650 has the same 'WMT' SDHC controller as WM8505, and the driver
is already in the kernel, this node enables the controller support for
WM8650

Signed-off-by: Roman Volkov <rvolkov@v1ros.org>
Reviewed-by: Alexey Charkov <alchark@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

tracing: Fix setting of start_index in find_next()

commit f36d1be2930ede0a1947686e1126ffda5d5ee1bb upstream.

When we do cat /sys/kernel/debug/tracing/printk_formats, we hit kernel
panic at t_show.

general protection fault: 0000 [#1] PREEMPT SMP
CPU: 0 PID: 2957 Comm: sh Tainted: G W  O 3.14.55-x86_64-01062-gd4acdc7 #2
RIP: 0010:[<ffffffff811375b2>]
[<ffffffff811375b2>] t_show+0x22/0xe0
RSP: 0000:ffff88002b4ebe80  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000004
RDX: 0000000000000004 RSI: ffffffff81fd26a6 RDI: ffff880032f9f7b1
RBP: ffff88002b4ebe98 R08: 0000000000001000 R09: 000000000000ffec
R10: 0000000000000000 R11: 000000000000000f R12: ffff880004d9b6c0
R13: 7365725f6d706400 R14: ffff880004d9b6c0 R15: ffffffff82020570
FS:  0000000000000000(0000) GS:ffff88003aa00000(0063) knlGS:00000000f776bc40
CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 00000000f6c02ff0 CR3: 000000002c2b3000 CR4: 00000000001007f0
Call Trace:
[<ffffffff811dc076>] seq_read+0x2f6/0x3e0
[<ffffffff811b749b>] vfs_read+0x9b/0x160
[<ffffffff811b7f69>] SyS_read+0x49/0xb0
[<ffffffff81a3a4b9>] ia32_do_call+0x13/0x13
---[ end trace 5bd9eb630614861e ]---
Kernel panic - not syncing: Fatal exception

When the first time find_next calls find_next_mod_format, it should
iterate the trace_bprintk_fmt_list to find the first print format of
the module. However in current code, start_index is smaller than *pos
at first, and code will not iterate the list. Latter container_of will
get the wrong address with former v, which will cause mod_fmt be a
meaningless object and so is the returned mod_fmt->fmt.

This patch will fix it by correcting the start_index. After fixed,
when the first time calls find_next_mod_format, start_index will be
equal to *pos, and code will iterate the trace_bprintk_fmt_list to
get the right module printk format, so is the returned mod_fmt->fmt.

Link: http://lkml.kernel.org/r/5684B900.9000309@intel.com
Fixes: 102c9323c35a8 "tracing: Add __tracepoint_string() to export string pointers"
Signed-off-by: Qiu Peiyang <peiyangx.qiu@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ftrace/scripts: Fix incorrect use of sprintf in recordmcount

commit 713a3e4de707fab49d5aa4bceb77db1058572a7b upstream.

Fix build warning:

scripts/recordmcount.c:589:4: warning: format not a string
literal and no format arguments [-Wformat-security]
sprintf("%s: failed\n", file);

Fixes: a50bd43935586 ("ftrace/scripts: Have recordmcount copy the object file")
Link: http://lkml.kernel.org/r/1451516801-16951-1-git-send-email-colin.king@canonical.com
Cc: Li Bin <huawei.libin@huawei.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

genirq: Prevent chip buslock deadlock

commit abc7e40c81d113ef4bacb556f0a77ca63ac81d85 upstream.

If a interrupt chip utilizes chip->buslock then free_irq() can
deadlock in the following way:

CPU0 CPU1
interrupt(X) (Shared or spurious)
free_irq(X) interrupt_thread(X)
chip_bus_lock(X)
irq_finalize_oneshot(X)
chip_bus_lock(X)
synchronize_irq(X)

synchronize_irq() waits for the interrupt thread to complete,
i.e. forever.

Solution is simple: Drop chip_bus_lock() before calling
synchronize_irq() as we do with the irq_desc lock. There is nothing to
be protected after the point where irq_desc lock has been released.

This adds chip_bus_lock/unlock() to the remove_irq() code path, but
that's actually correct in the case where remove_irq() is called on
such an interrupt. The current users of remove_irq() are not affected
as none of those interrupts is on a chip which requires buslock.

Reported-by: Fredrik Markström <fredrik.markstrom@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

qlcnic: fix a loop exit condition better

commit 3358a5c0c1578fa215f90a0e750579cd6258ddd9 upstream.

In the original code, if we succeeded on the last iteration through the
loop then we still returned failure.

Fixes: 389e4e04ad2d ('qlcnic: fix a timeout loop')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ipv6/addrlabel: fix ip6addrlbl_get()

commit e459dfeeb64008b2d23bdf600f03b3605dbb8152 upstream.

ip6addrlbl_get() has never worked. If ip6addrlbl_hold() succeeded,
ip6addrlbl_get() will exit with '-ESRCH'. If ip6addrlbl_hold() failed,
ip6addrlbl_get() will use about to be free ip6addrlbl_entry pointer.

Fix this by inverting ip6addrlbl_hold() check.

Fixes: 2a8cc6c89039 ("[IPV6] ADDRCONF: Support RFC3484 configurable address selection policy table.")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Cong Wang <cwang@twopensource.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

net/mlx4_en: Fix HW timestamp init issue upon system startup

commit 90683061dd50b0d70f01466c2d694f4e928a86f3 upstream.

mlx4_en_init_timestamp was called before creation of netdev and port
init, thus used uninitialized values. Specifically - NIC frequency was
incorrect causing wrong calculations and later wrong HW timestamps.

Fixes: 1ec4864b1017 ('net/mlx4_en: Fixed crash when port type is changed')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Marina Varshaver <marinav@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

net/mlx4_en: Remove dependency between timestamping capability and service_task

commit fc9f5ea9b4ecbe9b7839c92f0a54261809c723d3 upstream.

Service task is responsible for other tasks in addition to timestamping
overflow check. Launch it even if timestamping is not supported by device.

Fixes: 07841f9d94c1 ('net/mlx4_en: Schedule napi when RX buffers allocation fails')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone()

commit 5f0f2887f4de9508dcf438deab28f1de8070c271 upstream.

test_pages_in_a_zone() does not account for the possibility of missing
sections in the given pfn range. pfn_valid_within always returns 1 when
CONFIG_HOLES_IN_ZONE is not set, allowing invalid pfns from missing
sections to pass the test, leading to a kernel oops.

Wrap an additional pfn loop with PAGES_PER_SECTION granularity to check
for missing sections before proceeding into the zone-check code.

This also prevents a crash from offlining memory devices with missing
sections. Despite this, it may be a good idea to keep the related patch
'[PATCH 3/3] drivers: memory: prohibit offlining of memory blocks with
missing sections' because missing sections in a memory block may lead to
other problems not covered by the scope of this fix.

Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Alex Thorlton <athorlton@sgi.com>
Cc: Russ Anderson <rja@sgi.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Greg KH <greg@kroah.com>
Cc: Seth Jennings <sjennings@variantweb.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ocfs2: fix BUG when calculate new backup super

commit 5c9ee4cbf2a945271f25b89b137f2c03bbc3be33 upstream.

When resizing, it firstly extends the last gd.  Once it should backup
super in the gd, it calculates new backup super and update the
corresponding value.

But it currently doesn't consider the situation that the backup super is
already done.  And in this case, it still sets the bit in gd bitmap and
then decrease from bg_free_bits_count, which leads to a corrupted gd and
trigger the BUG in ocfs2_block_group_set_bits:

    BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits);

So check whether the backup super is done and then do the updates.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

[PATCH] arm: fix handling of F_OFD_... in oabi_fcntl64()

commit 76cc404bfdc0d419c720de4daaf2584542734f42 upstream.

Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

MIPS: uaccess: Fix strlen_user with EVA

commit 5dc62fdd8383afbd2faca6b6e6ea1052b45b0124 upstream.

The strlen_user() function calls __strlen_kernel_asm in both branches of
the eva_kernel_access() conditional. For EVA it should be calling
__strlen_user_eva for user accesses, otherwise it will load from the
kernel address space instead of the user address space, and the access
checking will likely be ineffective at preventing it due to EVA's
overlapping user and kernel address spaces.

This was found after extending the test_user_copy module to cover user
string access functions, which gave the following error with EVA:

test_user_copy: illegal strlen_user passed

Fortunately the use of strlen_user() has been all but eradicated from
the mainline kernel, so only out of tree modules could be affected.

Fixes: e3a9b07a9caf ("MIPS: asm: uaccess: Add EVA support for str*_user operations")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10842/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ALSA: hda/realtek - Fix silent headphone output on MacPro 4,1 (v2)

commit 9f660a1c43890c2cdd1f423fd73654e7ca08fe56 upstream.

Without this patch, internal speaker and line-out work,
but front headphone output jack stays silent on the
Mac Pro 4,1.

This code path also gets executed on the MacPro 5,1 due
to identical codec SSID, but i don't know if it has any
positive or adverse effects there or not.

(v2) Implement feedback from Takashi Iwai: Reuse
alc889_fixup_mbp_vref and just add a new nid
0x19 for the MacPro 4,1.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

parisc: Fix syscall restarts

commit 71a71fb5374a23be36a91981b5614590b9e722c3 upstream.

On parisc syscalls which are interrupted by signals sometimes failed to
restart and instead returned -ENOSYS which in the worst case lead to
userspace crashes.
A similiar problem existed on MIPS and was fixed by commit e967ef02
("MIPS: Fix restart of indirect syscalls").

On parisc the current syscall restart code assumes that all syscall
callers load the syscall number in the delay slot of the ble
instruction. That's how it is e.g. done in the unistd.h header file:
ble 0x100(%sr2, %r0)
ldi #syscall_nr, %r20
Because of that assumption the current code never restored %r20 before
returning to userspace.

This assumption is at least not true for code which uses the glibc
syscall() function, which instead uses this syntax:
ble 0x100(%sr2, %r0)
copy regX, %r20
where regX depend on how the compiler optimizes the code and register
usage.

This patch fixes this problem by adding code to analyze how the syscall
number is loaded in the delay branch and - if needed - copy the syscall
number to regX prior returning to userspace for the syscall restart.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

s390/dis: Fix handling of format specifiers

commit 272fa59ccb4fc802af28b1d699c2463db6a71bf7 upstream.

The print_insn() function returns strings like "lghi %r1,0". To escape the
'%' character in sprintf() a second '%' is used. For example "lghi %%r1,0"
is converted into "lghi %r1,0".

After print_insn() the output string is passed to printk(). Because format
specifiers like "%r" or "%f" are ignored by printk() this works by chance
most of the time. But for instructions with control registers like
"lctl %c6,%c6,780" this fails because printk() interprets "%c" as
character format specifier.

Fix this problem and escape the '%' characters twice.

For example "lctl %%%%c6,%%%%c6,780" is then converted by sprintf()
into "lctl %%c6,%%c6,780" and by printk() into "lctl %c6,%c6,780".

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[ luis: backported to 3.16:
- drop condition with OPERAND_VR introduced only with commit
3585cb028065 ("s390/disassembler: add vector instructions") ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ALSA: hda - Set SKL+ hda controller power at freeze() and thaw()

commit 3e6db33aaf1d42a30339f831ec4850570d6cc7a3 upstream.

It takes three minutes to enter into hibernation on some OEM SKL
machines and we see many codec spurious response after thaw() opertion.
This is because HDA is still in D0 state after freeze() call and
pci_pm_freeze/pci_pm_freeze_noirq() don't set D3 hot in pci_bus driver.
It seems bios still access HDA when system enter into freeze state,
HDA will receive codec response interrupt immediately after thaw() call.
Because of this unexpected interrupt, HDA enter into a abnormal
state and slow down the system enter into hibernation.

In this patch, we put HDA into D3 hot state in azx_freeze_noirq() and
put HDA into D0 state in azx_thaw_noirq().

V2: Only apply this fix to SKL+
Fix compile error when CONFIG_PM_SLEEP isn't defined

[Yet another fix for CONFIG_PM_SLEEP ifdef and the additional comment
by tiwai]

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ftrace/scripts: Have recordmcount copy the object file

commit a50bd43935586420fb75f4558369eb08566fac5e upstream.

Russell King found that he had weird side effects when compiling the kernel
with hard linked ccache. The reason was that recordmcount modified the
kernel in place via mmap, and when a file gets modified twice by
recordmcount, it will complain about it. To fix this issue, Russell wrote a
patch that checked if the file was hard linked more than once and would
unlink it if it was.

Linus Torvalds was not happy with the fact that recordmcount does this in
place modification. Instead of doing the unlink only if the file has two or
more hard links, it does the unlink all the time. In otherwords, it always
does a copy if it changed something. That is, it does the write out if a
change was made.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

scripts: recordmcount: break hardlinks

commit dd39a26538e37f6c6131e829a4a510787e43c783 upstream.

recordmcount edits the file in-place, which can cause problems when
using ccache in hardlink mode. Arrange for recordmcount to break a
hardlinked object.

Link: http://lkml.kernel.org/r/E1a7MVT-0000et-62@rmk-PC.arm.linux.org.uk
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ARM: dts: imx6: Fix Ethernet PHY mode on Ventana boards

commit 3a35e470bc6bc4ce34c19c410ebbe4e3bbf0bafe upstream.

Gateworks Ventana boards seem to need "RGMII-ID" (internal delay)
PHY mode, instead of simple "RGMII", for their Marvell 88E1510
transceiver. Otherwise, the Ethernet MAC doesn't work with Marvell PHY
driver (TX doesn't seem to work correctly).

Tested on GW5400 rev. C.

This bug affects ARM Fedora 23.

Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl>
Acked-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ASoC: wm8974: set cache type for regmap

commit 1ea5998afe903384ddc16391d4c023cd4c867bea upstream.

Attempting to use this codec driver triggers a BUG() in regcache_sync()
since no cache type is set. The register map of this device is fairly
small and has few holes so a flat cache is suitable.

Signed-off-by: Mans Rullgard <mans@mansr.com>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

KVM: PPC: Book3S HV: Prohibit setting illegal transaction state in MSR

commit c20875a3e638e4a03e099b343ec798edd1af5cc6 upstream.

Currently it is possible for userspace (e.g. QEMU) to set a value
for the MSR for a guest VCPU which has both of the TS bits set,
which is an illegal combination. The result of this is that when
we execute a hrfid (hypervisor return from interrupt doubleword)
instruction to enter the guest, the CPU will take a TM Bad Thing
type of program interrupt (vector 0x700).

Now, if PR KVM is configured in the kernel along with HV KVM, we
actually handle this without crashing the host or giving hypervisor
privilege to the guest; instead what happens is that we deliver a
program interrupt to the guest, with SRR0 reflecting the address
of the hrfid instruction and SRR1 containing the MSR value at that
point. If PR KVM is not configured in the kernel, then we try to
run the host's program interrupt handler with the MMU set to the
guest context, which almost certainly causes a host crash.

This closes the hole by making kvmppc_set_msr_hv() check for the
illegal combination and force the TS field to a safe value (00,
meaning non-transactional).

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

vmstat: allocate vmstat_wq before it is used

commit 751e5f5c753e8d447bcf89f9e96b9616ac081628 upstream.

kernel test robot has reported the following crash:

  BUG: unable to handle kernel NULL pointer dereference at 00000100
  IP: [<c1074df6>] __queue_work+0x26/0x390
  *pdpt = 0000000000000000 *pde = f000ff53f000ff53 *pde = f000ff53f000ff53
  Oops: 0000 [#1] PREEMPT PREEMPT SMP SMP
  CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.4.0-rc4-00139-g373ccbe #1
  Workqueue: events vmstat_shepherd
  task: cb684600 ti: cb7ba000 task.ti: cb7ba000
  EIP: 0060:[<c1074df6>] EFLAGS: 00010046 CPU: 0
  EIP is at __queue_work+0x26/0x390
  EAX: 00000046 EBX: cbb37800 ECX: cbb37800 EDX: 00000000
  ESI: 00000000 EDI: 00000000 EBP: cb7bbe68 ESP: cb7bbe38
   DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
  CR0: 8005003b CR2: 00000100 CR3: 01fd5000 CR4: 000006b0
  Stack:
  Call Trace:
    __queue_delayed_work+0xa1/0x160
    queue_delayed_work_on+0x36/0x60
    vmstat_shepherd+0xad/0xf0
    process_one_work+0x1aa/0x4c0
    worker_thread+0x41/0x440
    kthread+0xb0/0xd0
    ret_from_kernel_thread+0x21/0x40

The reason is that start_shepherd_timer schedules the shepherd work item
which uses vmstat_wq (vmstat_shepherd) before setup_vmstat allocates
that workqueue so if the further initialization takes more than HZ we
might end up scheduling on a NULL vmstat_wq.  This is really unlikely
but not impossible.

Fixes: 373ccbe59270 ("mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress")
Reported-by: kernel test robot <ying.huang@linux.intel.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
[ luis: backported to 3.16: based on Ben's backport to 3.2:
  - as with 3.2, there's a similar race but with the CPU hotplug code ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

drm/i915: Fix SRC_COPY width on 830/845g

commit 611a7a4fd8b5fb6b25ab1f8bdcde61800a7feacf upstream.

One small change I forgot to make in

commit c4d69da167fa967749aeb70bc0e94a457e5d00c1
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Sep 8 14:25:41 2014 +0100

drm/i915: Evict CS TLBs between batches

was to update the copy width for the compact BLT copy instruction.

Reported-by: Thomas Richter <thor@math.tu-berlin.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Thomas Richter <thor@math.tu-berlin.de>
Cc: Jani Nikula <jani.nikula@intel.com>
Tested-by: Thomas Richter <thor@math.tu-berlin.de>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

include/linux/mmdebug.h: should include linux/bug.h

commit 1d5cda4076d930d6d52088ed2c7753f7c564cbd7 upstream.

mmdebug.h uses BUILD_BUG_ON_INVALID(), assuming someone else included
linux/bug.h.  Include it ourselves.

This saves build-failures such as:

  arch/arm64/include/asm/pgtable.h: In function 'set_pte_at':
  arch/arm64/include/asm/pgtable.h:281:3: error: implicit declaration of function 'BUILD_BUG_ON_INVALID' [-Werror=implicit-function-declaration]
   VM_WARN_ONCE(!pte_young(pte),

Fixes: 02602a18c32d7 ("bug: completely remove code generated by disabled VM_BUG_ON()")
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration

commit 7bbadd2d1009575dad675afc16650ebb5aa10612 upstream.

Docbook does not like the definition of macros inside a field declaration
and adds a warning. Move the definition out.

Fixes: 79462ad02e86180 ("net: add validation for the socket syscall protocol argument")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ser_gigaset: fix deallocation of platform device structure

commit 4c5e354a974214dfb44cd23fa0429327693bc3ea upstream.

When shutting down the device, the struct ser_cardstate must not be
kfree()d immediately after the call to platform_device_unregister()
since the embedded struct platform_device is still in use.
Move the kfree() call to the release method instead.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Fixes: 2869b23e4b95 ("drivers/isdn/gigaset: new M101 driver (v2)")
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

qlcnic: fix a timeout loop

commit 389e4e04ad2d4887c7bdd7c01a93d3dfa5c14a06 upstream.

The problem here is that at the end of the loop we test for if
idc->vnic_wait_limit is zero, but since idc->vnic_wait_limit-- is a
post-op, it actually ends up set to (u8)-1. I have fixed this by
moving the decrement inside the loop.

Fixes: 486a5bc77a4a ('qlcnic: Add support for 83xx suspend and resume.')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

amd-xgbe: fix a couple timeout loops

commit c7557e6a56510ff6636d40ad4ff64a3ef7d9e197 upstream.

At the end of the loop we test "if (!count)" but because "count--" is
a post-op then the loop will end with count set to -1. I have fixed
this by changing it to --count.

Fixes: c5aa9e3b8156 ('amd-xgbe: Initial AMD 10GbE platform driver')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

mISDN: fix a loop count

commit 40d24c4d8a7430aa4dfd7a665fa3faf3b05b673f upstream.

There are two issue here.
1)  cnt starts as maxloop + 1 so all these loops iterate one more time
    than intended.
2)  At the end of the loop we test for "if (maxloop && !cnt)" but for
    the first two loops, we end with cnt equal to -1.  Changing this to
    a pre-op means we end with cnt set to 0.

Fixes: cae86d4a4e56 ('mISDN: Add driver for Infineon ISDN chipset family')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

sh_eth: fix TX buffer byte-swapping

commit 3e2309937f1e5d538ff13da5fb8de41196927c61 upstream.

For the little-endian SH771x kernels the driver has to byte-swap the RX/TX
buffers, however yet unset physcial address from the TX descriptor is used
to call sh_eth_soft_swap(). Use 'skb->data' instead...

Fixes: 31fcb99d9958 ("net: sh_eth: remove __flush_purge_region")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

net: phy: mdio-mux: Check return value of mdiobus_alloc()

commit 20b08e1a793d898f0f13040d5418ee0955f678cf upstream.

mdiobus_alloc() might return NULL, but its return value is not
checked in mdio_mux_init(). This could potentially lead to a NULL
pointer dereference. Fix it by checking the return value

Fixes: 0ca2997d1452 ("netdev/of/phy: Add MDIO bus multiplexer support.")
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

pinctrl: bcm2835: Fix initial value for direction_output

commit 4c02cba18cc9de672a554ddda4f23dec8cb4b48e upstream.

Currently the provided initial value for bcm2835_gpio_direction_output
has no effect. So fix this issue by changing the value before
changing the GPIO direction. As a result we need to move the function below
bcm2835_gpio_set.

Suggested-by: Martin Sperl <kernel@martin.sperl.org>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Stephen Warren <swarren@wwwdotorg.org>
Fixes: e1b2dc70cd5b ("pinctrl: add bcm2835 driver")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
[ luis: backported to 3.16:
- file rename: drivers/pinctrl/bcm/pinctrl-bcm2835.c ->
drivers/pinctrl/pinctrl-bcm2835.c ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

USB: fix invalid memory access in hub_activate()

commit e50293ef9775c5f1cf3fcc093037dd6a8c5684ea upstream.

Commit 8520f38099cc ("USB: change hub initialization sleeps to
delayed_work") changed the hub_activate() routine to make part of it
run in a workqueue.  However, the commit failed to take a reference to
the usb_hub structure or to lock the hub interface while doing so.  As
a result, if a hub is plugged in and quickly unplugged before the work
routine can run, the routine will try to access memory that has been
deallocated.  Or, if the hub is unplugged while the routine is
running, the memory may be deallocated while it is in active use.

This patch fixes the problem by taking a reference to the usb_hub at
the start of hub_activate() and releasing it at the end (when the work
is finished), and by locking the hub interface while the work routine
is running.  It also adds a check at the start of the routine to see
if the hub has already been disconnected, in which nothing should be
done.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Alexandru Cornea <alexandru.cornea@intel.com>
Tested-by: Alexandru Cornea <alexandru.cornea@intel.com>
Fixes: 8520f38099cc ("USB: change hub initialization sleeps to delayed_work")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ luis: backported to 3.16:
  - Added forward declaration of hub_release() which mainline had with commit
    32a6958998c5 ("usb: hub: convert khubd into workqueue") ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

USB: ipaq.c: fix a timeout loop

commit abdc9a3b4bac97add99e1d77dc6d28623afe682b upstream.

The code expects the loop to end with "retries" set to zero but, because
it is a post-op, it will end set to -1. I have fixed this by moving the
decrement inside the loop.

Fixes: 014aa2a3c32e ('USB: ipaq: minor ipaq_open() cleanup.')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

xen/pciback: Don't allow MSI-X ops if PCI_COMMAND_MEMORY is not set.

commit 408fb0e5aa7fda0059db282ff58c3b2a4278baa0 upstream.

commit f598282f51 ("PCI: Fix the NIU MSI-X problem in a better way")
teaches us that dealing with MSI-X can be troublesome.

Further checks in the MSI-X architecture shows that if the
PCI_COMMAND_MEMORY bit is turned of in the PCI_COMMAND we
may not be able to access the BAR (since they are memory regions).

Since the MSI-X tables are located in there.. that can lead
to us causing PCIe errors. Inhibit us performing any
operation on the MSI-X unless the MEMORY bit is set.

Note that Xen hypervisor with:
"x86/MSI-X: access MSI-X table only after having enabled MSI-X"
will return:
xen_pciback: 0000:0a:00.1: error -6 enabling MSI-X for guest 3!

When the generic MSI code tries to setup the PIRQ without
MEMORY bit set. Which means with later versions of Xen
(4.6) this patch is not neccessary.

This is part of XSA-157

Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

xen/pciback: For XEN_PCI_OP_disable_msi[|x] only disable if device has MSI(X) enabled.

commit 7cfb905b9638982862f0331b36ccaaca5d383b49 upstream.

Otherwise just continue on, returning the same values as
previously (return of 0, and op->result has the PIRQ value).

This does not change the behavior of XEN_PCI_OP_disable_msi[|x].

The pci_disable_msi or pci_disable_msix have the checks for
msi_enabled or msix_enabled so they will error out immediately.

However the guest can still call these operations and cause
us to disable the 'ack_intr'. That means the backend IRQ handler
for the legacy interrupt will not respond to interrupts anymore.

This will lead to (if the device is causing an interrupt storm)
for the Linux generic code to disable the interrupt line.

Naturally this will only happen if the device in question
is plugged in on the motherboard on shared level interrupt GSI.

This is part of XSA-157

Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

xen/pciback: Do not install an IRQ handler for MSI interrupts.

commit a396f3a210c3a61e94d6b87ec05a75d0be2a60d0 upstream.

Otherwise an guest can subvert the generic MSI code to trigger
an BUG_ON condition during MSI interrupt freeing:

for (i = 0; i < entry->nvec_used; i++)
BUG_ON(irq_has_action(entry->irq + i));

Xen PCI backed installs an IRQ handler (request_irq) for
the dev->irq whenever the guest writes PCI_COMMAND_MEMORY
(or PCI_COMMAND_IO) to the PCI_COMMAND register. This is
done in case the device has legacy interrupts the GSI line
is shared by the backend devices.

To subvert the backend the guest needs to make the backend
to change the dev->irq from the GSI to the MSI interrupt line,
make the backend allocate an interrupt handler, and then command
the backend to free the MSI interrupt and hit the BUG_ON.

Since the backend only calls 'request_irq' when the guest
writes to the PCI_COMMAND register the guest needs to call
XEN_PCI_OP_enable_msi before any other operation. This will
cause the generic MSI code to setup an MSI entry and
populate dev->irq with the new PIRQ value.

Then the guest can write to PCI_COMMAND PCI_COMMAND_MEMORY
and cause the backend to setup an IRQ handler for dev->irq
(which instead of the GSI value has the MSI pirq). See
'xen_pcibk_control_isr'.

Then the guest disables the MSI: XEN_PCI_OP_disable_msi
which ends up triggering the BUG_ON condition in 'free_msi_irqs'
as there is an IRQ handler for the entry->irq (dev->irq).

Note that this cannot be done using MSI-X as the generic
code does not over-write dev->irq with the MSI-X PIRQ values.

The patch inhibits setting up the IRQ handler if MSI or
MSI-X (for symmetry reasons) code had been called successfully.

P.S.
Xen PCIBack when it sets up the device for the guest consumption
ends up writting 0 to the PCI_COMMAND (see xen_pcibk_reset_device).
XSA-120 addendum patch removed that - however when upstreaming said
addendum we found that it caused issues with qemu upstream. That
has now been fixed in qemu upstream.

This is part of XSA-157

Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

xen/pciback: Return error on XEN_PCI_OP_enable_msix when device has MSI or MSI-X enabled

commit 5e0ce1455c09dd61d029b8ad45d82e1ac0b6c4c9 upstream.

The guest sequence of:

a) XEN_PCI_OP_enable_msix
b) XEN_PCI_OP_enable_msix

results in hitting an NULL pointer due to using freed pointers.

The device passed in the guest MUST have MSI-X capability.

The a) constructs and SysFS representation of MSI and MSI groups.
The b) adds a second set of them but adding in to SysFS fails (duplicate entry).
'populate_msi_sysfs' frees the newly allocated msi_irq_groups (note that
in a) pdev->msi_irq_groups is still set) and also free's ALL of the
MSI-X entries of the device (the ones allocated in step a) and b)).

The unwind code: 'free_msi_irqs' deletes all the entries and tries to
delete the pdev->msi_irq_groups (which hasn't been set to NULL).
However the pointers in the SysFS are already freed and we hit an
NULL pointer further on when 'strlen' is attempted on a freed pointer.

The patch adds a simple check in the XEN_PCI_OP_enable_msix to guard
against that. The check for msi_enabled is not stricly neccessary.

This is part of XSA-157

Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

xen/pciback: Return error on XEN_PCI_OP_enable_msi when device has MSI or MSI-X enabled

commit 56441f3c8e5bd45aab10dd9f8c505dd4bec03b0d upstream.

The guest sequence of:

a) XEN_PCI_OP_enable_msi
b) XEN_PCI_OP_enable_msi
c) XEN_PCI_OP_disable_msi

results in hitting an BUG_ON condition in the msi.c code.

The MSI code uses an dev->msi_list to which it adds MSI entries.
Under the above conditions an BUG_ON() can be hit. The device
passed in the guest MUST have MSI capability.

The a) adds the entry to the dev->msi_list and sets msi_enabled.
The b) adds a second entry but adding in to SysFS fails (duplicate entry)
and deletes all of the entries from msi_list and returns (with msi_enabled
is still set). c) pci_disable_msi passes the msi_enabled checks and hits:

BUG_ON(list_empty(dev_to_msi_list(&dev->dev)));

and blows up.

The patch adds a simple check in the XEN_PCI_OP_enable_msi to guard
against that. The check for msix_enabled is not stricly neccessary.

This is part of XSA-157.

Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

xen/pciback: Save xen_pci_op commands before processing it

commit 8135cf8b092723dbfcc611fe6fdcb3a36c9951c5 upstream.

Double fetch vulnerabilities that happen when a variable is
fetched twice from shared memory but a security check is only
performed the first time.

The xen_pcibk_do_op function performs a switch statements on the op->cmd
value which is stored in shared memory. Interestingly this can result
in a double fetch vulnerability depending on the performed compiler
optimization.

This patch fixes it by saving the xen_pci_op command before
processing it. We also use 'barrier' to make sure that the
compiler does not perform any optimization.

This is part of XSA155.

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

xen-blkback: read from indirect descriptors only once

commit 18779149101c0dd43ded43669ae2a92d21b6f9cb upstream.

Since indirect descriptors are in memory shared with the frontend, the
frontend could alter the first_sect and last_sect values after they have
been validated but before they are recorded in the request.  This may
result in I/O requests that overflow the foreign page, possibly
overwriting local pages when the I/O request is executed.

When parsing indirect descriptors, only read first_sect and last_sect
once.

This is part of XSA155.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[ luis: backported to 3.16:
  - Use ACCESS_ONCE instead of READ_ONCE
  - Use PAGE_SIZE instead of XEN_PAGE_SIZE ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

xen-blkback: only read request operation from shared ring once

commit 1f13d75ccb806260079e0679d55d9253e370ec8a upstream.

A compiler may load a switch statement value multiple times, which could
be bad when the value is in memory shared with the frontend.

When converting a non-native request to a native one, ensure that
src->operation is only loaded once by using READ_ONCE().

This is part of XSA155.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[ luis: backported to 3.16:
- replaced READ_ONCE() by ACCESS_ONCE() ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

xen-netback: use RING_COPY_REQUEST() throughout

commit 68a33bfd8403e4e22847165d149823a2e0e67c9c upstream.

Instead of open-coding memcpy()s and directly accessing Tx and Rx
requests, use the new RING_COPY_REQUEST() that ensures the local copy
is correct.

This is more than is strictly necessary for guest Rx requests since
only the id and gref fields are used and it is harmless if the
frontend modifies these.

This is part of XSA155.

Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

xen-netback: don't use last request to determine minimum Tx credit

commit 0f589967a73f1f30ab4ac4dd9ce0bb399b4d6357 upstream.

The last from guest transmitted request gives no indication about the
minimum amount of credit that the guest might need to send a packet
since the last packet might have been a small one.

Instead allow for the worst case 128 KiB packet.

This is part of XSA155.

Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

xen: Add RING_COPY_REQUEST()

commit 454d5d882c7e412b840e3c99010fe81a9862f6fb upstream.

Using RING_GET_REQUEST() on a shared ring is easy to use incorrectly
(i.e., by not considering that the other end may alter the data in the
shared ring while it is being inspected). Safe usage of a request
generally requires taking a local copy.

Provide a RING_COPY_REQUEST() macro to use instead of
RING_GET_REQUEST() and an open-coded memcpy(). This takes care of
ensuring that the copy is done correctly regardless of any possible
compiler optimizations.

Use a volatile source to prevent the compiler from reordering or
omitting the copy.

This is part of XSA155.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

powerpc/powernv: pr_warn_once on unsupported OPAL_MSG type

commit 98da62b716a3b24ab8e77453c9a8a954124c18cd upstream.

When running on newer OPAL firmware that supports sending extra
OPAL_MSG types, we would print a warning on *every* message received.

This could be a problem for kernels that don't support OPAL_MSG_OCC
on machines that are running real close to thermal limits and the
OCC is throttling the chip. For a kernel that is paying attention to
the message queue, we could get these notifications quite often.

Conceivably, future message types could also come fairly often,
and printing that we didn't understand them 10,000 times provides
no further information than printing them once.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

powerpc/powernv: Fix the overflow of OPAL message notifiers head array

commit 792f96e9a769b799a2944e9369e4ea1e467135b2 upstream.

Fixes the condition check of incoming message type which can
otherwise shoot beyond the message notifiers head array.

Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ARC: dw2 unwind: Ignore CIE version !=1 gracefully instead of bailing

commit 323f41f9e7d0cb5b1d1586aded6682855f1e646d upstream.

ARC dwarf unwinder only supports CIE version == 1
The boot time dwarf sanitizer (part of binary lookup table constructor)
would simply bail if it saw CIE version == 3, rendering unwinder with a
NULL lookup table.

It seems libgcc linked with kernel does have such entries.

With fallback linear search removed, and a NULL binary lookup table,
unwinder fails to generate any stack trace.

So allow graceful ignoring of unsupported CIE entries.

This problem was initially seen in Alexey's setup (and not mine) as he
was using buildroot built toolchain (libgcc) which doesn't get built with
CFLAGS_FOR_TARGET="-gdwarf-2 which is my default

Fixes STAR 9000985048: "kernel unwinder broken with stock tools"

Fixes: 2e22502c080f ARC: dw2 unwind: Remove falllback linear search thru FDE entries
Reported-by Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ARC: dw2 unwind: Reinstante unwinding out of modules

commit bc79c9a7216562a2035d2f64f73626613c1300d0 upstream.

The fix which removed linear searching of dwarf (because binary lookup
data always exists) missed out on the fact that modules don't get the
binary lookup tables info. This caused unwinding out of modules to stop
working.

So add binary lookup header setup (equivalent of eh_frame_hdr setup) to
modules as well.

While at it, confine the header setup to within unwinder code,
reducing one API exposed out of unwinder code.

Fixes: 2e22502c080f ARC: dw2 unwind: Remove falllback linear search thru FDE entries
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

dma-debug: Fix dma_debug_entry offset calculation

commit 0354aec19ce3d355c6213b0434064efc25c9b22c upstream.

dma-debug uses struct dma_debug_entry to keep track of dma coherent
memory allocation requests. The virtual address is converted into a pfn
and an offset. Previously, the offset was calculated using an incorrect
bit mask. As a result, we saw incorrect error messages from dma-debug
like the following:

"DMA-API: exceeded 7 overlapping mappings of cacheline 0x03e00000"

Cacheline 0x03e00000 does not exist on our platform.

Fixes: 0abdd7a81b7e ("dma-debug: introduce debug_dma_assert_idle()")
Signed-off-by: Daniel Mentz <danielmentz@google.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

spi: fix parent-device reference leak

commit 157f38f993919b648187ba341bfb05d0e91ad2f6 upstream.

Fix parent-device reference leak due to SPI-core taking an unnecessary
reference to the parent when allocating the master structure, a
reference that was never released.

Note that driver core takes its own reference to the parent when the
master device is registered.

Fixes: 49dce689ad4e ("spi doesn't need class_device")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ALSA: hda - Add a fixup for Thinkpad X1 Carbon 2nd

commit b6903c0ed9f0bcbbe88f67f7ed43d1721cbc6235 upstream.

Apply the same fixup for Thinkpad with dock to Thinkpad X1 Carbon 2nd,
too. This reduces the annoying loud cracking noise problem, as well
as the support of missing docking port.

Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=958439
Reported-and-tested-by: Benjamin Poirier <bpoirier@suse.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ARM: 8471/1: need to save/restore arm register(r11) when it is corrupted

commit fa0708b320f6da4c1104fe56e01b7abf66fd16ad upstream.

In cpu_v7_do_suspend routine, r11 is used while it is NOT
saved/restored, different compiler may have different usage
of ARM general registers, so it may cause issues during
calling cpu_v7_do_suspend.

We meet kernel fault occurs when using GCC 4.8.3, r11 contains
valid value before calling into cpu_v7_do_suspend, but when returned
from this routine, r11 is corrupted and lead to kernel fault.
Doing save/restore for those corrupted registers is a must in
assemble code.

Signed-off-by: Anson Huang <Anson.Huang@freescale.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ALSA: usb-audio: Add a more accurate volume quirk for AudioQuest DragonFly

commit 42e3121d90f42e57f6dbd6083dff2f57b3ec7daa upstream.

AudioQuest DragonFly DAC reports a volume control range of 0..50
(0x0000..0x0032) which in USB Audio means a range of 0 .. 0.2dB, which
is obviously incorrect and would cause software using the dB information
in e.g. volume sliders to have a massive volume difference in 100..102%
range.

Commit 2d1cb7f658fb ("ALSA: usb-audio: add dB range mapping for some
devices") added a dB range mapping for it with range 0..50 dB.

However, the actual volume mapping seems to be neither linear volume nor
linear dB scale, but instead quite close to the cubic mapping e.g.
alsamixer uses, with a range of approx. -53...0 dB.

Replace the previous quirk with a custom dB mapping based on some basic
output measurements, using a 10-item range TLV (which will still fit in
alsa-lib MAX_TLV_RANGE_SIZE).

Tested on AudioQuest DragonFly HW v1.2. The quirk is only applied if the
range is 0..50, so if this gets fixed/changed in later HW revisions it
will no longer be applied.

v2: incorporated Takashi Iwai's suggestion for the quirk application
method

Signed-off-by: Anssi Hannula <anssi.hannula@iki.fi>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

tty: Fix GPF in flush_to_ldisc()

commit 9ce119f318ba1a07c29149301f1544b6c4bea52a upstream.

A line discipline which does not define a receive_buf() method can
can cause a GPF if data is ever received [1]. Oddly, this was known
to the author of n_tracesink in 2011, but never fixed.

[1] GPF report
    BUG: unable to handle kernel NULL pointer dereference at           (null)
    IP: [<          (null)>]           (null)
    PGD 3752d067 PUD 37a7b067 PMD 0
    Oops: 0010 [#1] SMP KASAN
    Modules linked in:
    CPU: 2 PID: 148 Comm: kworker/u10:2 Not tainted 4.4.0-rc2+ #51
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Workqueue: events_unbound flush_to_ldisc
    task: ffff88006da94440 ti: ffff88006db60000 task.ti: ffff88006db60000
    RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
    RSP: 0018:ffff88006db67b50  EFLAGS: 00010246
    RAX: 0000000000000102 RBX: ffff88003ab32f88 RCX: 0000000000000102
    RDX: 0000000000000000 RSI: ffff88003ab330a6 RDI: ffff88003aabd388
    RBP: ffff88006db67c48 R08: ffff88003ab32f9c R09: ffff88003ab31fb0
    R10: ffff88003ab32fa8 R11: 0000000000000000 R12: dffffc0000000000
    R13: ffff88006db67c20 R14: ffffffff863df820 R15: ffff88003ab31fb8
    FS:  0000000000000000(0000) GS:ffff88006dc00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000000000000000 CR3: 0000000037938000 CR4: 00000000000006e0
    Stack:
     ffffffff829f46f1 ffff88006da94bf8 ffff88006da94bf8 0000000000000000
     ffff88003ab31fb0 ffff88003aabd438 ffff88003ab31ff8 ffff88006430fd90
     ffff88003ab32f9c ffffed0007557a87 1ffff1000db6cf78 ffff88003ab32078
    Call Trace:
     [<ffffffff8127cf91>] process_one_work+0x8f1/0x17a0 kernel/workqueue.c:2030
     [<ffffffff8127df14>] worker_thread+0xd4/0x1180 kernel/workqueue.c:2162
     [<ffffffff8128faaf>] kthread+0x1cf/0x270 drivers/block/aoe/aoecmd.c:1302
     [<ffffffff852a7c2f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
    Code:  Bad RIP value.
    RIP  [<          (null)>]           (null)
     RSP <ffff88006db67b50>
    CR2: 0000000000000000
    ---[ end trace a587f8947e54d6ea ]---

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

n_tty: Fix poll() after buffer-limited eof push read

commit ac8f3bf8832a405cc6e4dccb1d26d5cb2994d234 upstream.

commit 40d5e0905a03 ("n_tty: Fix EOF push handling") fixed EOF push
for reads. However, that approach still allows a condition mismatch
between poll() and read(), where poll() returns POLLIN but read()
blocks. This state can happen when a previous read() returned because
the user buffer was full and the next character was an EOF not at the
beginning of the line. While the next read() will properly identify
the condition and advance the read buffer tail without improperly
indicating an EOF file condition (ie., read() will not mistakenly
return 0), poll() will mistakenly indicate POLLIN.

Although a possible solution would be to peek at the input buffer
in n_tty_poll(), the better solution in this patch is to eat the
EOF during the previous read() (ie., fix the problem by eliminating
the condition).

The current canon line buffer copy limits the scan for next end-of-line
to the smaller of either,
   a. the remaining user buffer size
   b. completed lines in the input buffer
When the remaining user buffer size is exactly one less than the
end-of-line marked by EOF push, the EOF is not scanned nor skipped
but left for subsequent reads. In the example below, the scan
index 'eol' has stopped at the EOF because it is past the scan
limit of 5 (not because it has found the next set bit in read_flags)

   user buffer [*nr = 5]    _ _ _ _ _

   read_flags               0 0 0 0 0   1
   input buffer             h e l l o [EOF]
                            ^           ^
                           /           /
                         tail        eol

   result: found = 0, tail += 5, *nr += 5

Instead, allow the scan to peek ahead 1 byte (while still limiting the
scan to completed lines in the input buffer). For the example above,

   result: found = 1, tail += 6, *nr += 5

Because the scan limit is now bumped +1 byte, when the scan is
completed, the tail advance and the user buffer copy limit is
re-clamped to *nr when EOF is _not_ found.

Fixes: 40d5e0905a03 ("n_tty: Fix EOF push handling")
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

powercap / RAPL: fix BIOS lock check

commit 79a21dbfae3cd40d5a801778071a9967b79c2c20 upstream.

Intel RAPL initialized on several systems where the BIOS lock bit (msr
0x610, bit 63) was set. This occured because the return value of
rapl_read_data_raw() was being checked, rather than the value of the variable
passed in, locked.

This patch properly implments the rapl_read_data_raw() call to check the
variable locked, and now the Intel RAPL driver outputs the warning:

intel_rapl: RAPL package 0 domain package locked by BIOS

and does not initialize for the package.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ses: fix additional element traversal bug

commit 5e1033561da1152c57b97ee84371dba2b3d64c25 upstream.

KASAN found that our additional element processing scripts drop off
the end of the VPD page into unallocated space. The reason is that
not every element has additional information but our traversal
routines think they do, leading to them expecting far more additional
information than is present. Fix this by adding a gate to the
traversal routine so that it only processes elements that are expected
to have additional information (list is in SES-2 section 6.1.13.1:
Additional Element Status diagnostic page overview)

Reported-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Tested-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

Revert "SCSI: Fix NULL pointer dereference in runtime PM"

commit 1c69d3b6eb73e466ecbb8edaf1bc7fd585b288da upstream.

This reverts commit 49718f0fb8c9 ("SCSI: Fix NULL pointer dereference in
runtime PM")

The old commit may lead to a issue that blk_{pre|post}_runtime_suspend and
blk_{pre|post}_runtime_resume may not be called in pairs.

Take sr device as example, when sr device goes to runtime suspend,
blk_{pre|post}_runtime_suspend will be called since sr device defined
pm->runtime_suspend. But blk_{pre|post}_runtime_resume will not be called
since sr device doesn't have pm->runtime_resume. so, sr device can not
resume correctly anymore.

More discussion can be found from below link.
http://marc.info/?l=linux-scsi&m=144163730531875&w=2

Signed-off-by: Ken Xue <Ken.Xue@amd.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Xiangliang Yu <Xiangliang.Yu@amd.com>
Cc: James E.J. Bottomley <JBottomley@odin.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Terry <Michael.terry@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ses: Fix problems with simple enclosures

commit 3417c1b5cb1fdc10261dbed42b05cc93166a78fd upstream.

Simple enclosure implementations (mostly USB) are allowed to return only
page 8 to every diagnostic query.  That really confuses our
implementation because we assume the return is the page we asked for and
end up doing incorrect offsets based on bogus information leading to
accesses outside of allocated ranges.  Fix that by checking the page
code of the return and giving an error if it isn't the one we asked for.
This should fix reported bugs with USB storage by simply refusing to
attach to enclosures that behave like this.  It's also good defensive
practise now that we're starting to see more USB enclosures.

Reported-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

rfkill: copy the name into the rfkill struct

commit b7bb110008607a915298bf0f47d25886ecb94477 upstream.

Some users of rfkill, like NFC and cfg80211, use a dynamic name when
allocating rfkill, in those cases dev_name(). Therefore, the pointer
passed to rfkill_alloc() might not be valid forever, I specifically
found the case that the rfkill name was quite obviously an invalid
pointer (or at least garbage) when the wiphy had been renamed.

Fix this by making a copy of the rfkill name in rfkill_alloc().

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

crypto: skcipher - Copy iv from desc even for 0-len walks

commit 70d906bc17500edfa9bdd8c8b7e59618c7911613 upstream.

Some ciphers actually support encrypting zero length plaintexts. For
example, many AEAD modes support this. The resulting ciphertext for
those winds up being only the authentication tag, which is a result of
the key, the iv, the additional data, and the fact that the plaintext
had zero length. The blkcipher constructors won't copy the IV to the
right place, however, when using a zero length input, resulting in
some significant problems when ciphers call their initialization
routines, only to find that the ->iv parameter is uninitialized. One
such example of this would be using chacha20poly1305 with a zero length
input, which then calls chacha20, which calls the key setup routine,
which eventually OOPSes due to the uninitialized ->iv member.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

video: fbdev: fsl: Fix kernel crash when diu_ops is not implemented

commit acfc1cc13fe5bc6d7a10afa624f1e560850ddad3 upstream.

If diu_ops is not implemented on platform, kernel will access a NULL
pointer. We need to check this pointer in DIU initialization.

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Acked-by: Timur Tabi <timur@tabi.org>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

xen/events/fifo: Consume unprocessed events when a CPU dies

commit 3de88d622fd68bd4dbee0f80168218b23f798fd0 upstream.

When a CPU is offlined, there may be unprocessed events on a port for
that CPU. If the port is subsequently reused on a different CPU, it
could be in an unexpected state with the link bit set, resulting in
interrupts being missed. Fix this by consuming any unprocessed events
for a particular CPU when that CPU dies.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

i2c: mv64xxx: The n clockdiv factor is 0 based on sunxi SoCs

commit bba61f50f76574ca5b84b310925be7c2e8e64275 upstream.

According to the datasheets the n factor for dividing the tclk is
2 to the power n on Allwinner SoCs, not 2 to the power n + 1 as it is
on other mv64xxx implementations.

I've contacted Allwinner about this and they have confirmed that the
datasheet is correct.

This commit fixes the clk-divider calculations for Allwinner SoCs
accordingly.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Tested-by: Olliver Schinagl <oliver@schinagl.nl>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

tools: Add a "make all" rule

commit f6ba98c5dc78708cb7fd29950c4a50c4c7e88f95 upstream.

Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Pali Rohar <pali.rohar@gmail.com>
Cc: Roberta Dobrescu <roberta.dobrescu@gmail.com>
Link: http://lkml.kernel.org/r/1447280736-2161-2-git-send-email-kamal@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
[ kamal: backport to 3.16-stable: build all tools for this version ]
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

MIPS: uaccess: Take EVA into account in [__]clear_user

commit d6a428fb583738ad685c91a684748cdee7b2a05f upstream.

__clear_user() (and clear_user() which uses it), always access the user
mode address space, which results in EVA store instructions when EVA is
enabled even if the current user address limit is KERNEL_DS.

Fix this by adding a new symbol __bzero_kernel for the normal kernel
address space bzero in EVA mode, and call that from __clear_user() if
eva_kernel_access().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10844/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
[james.hogan@imgtec.com: backport]
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

MIPS: uaccess: Take EVA into account in __copy_from_user()

commit 6f06a2c45d8d714ea3b11a360b4a7191e52acaa4 upstream.

When EVA is in use, __copy_from_user() was unconditionally using the EVA
instructions to read the user address space, however this can also be
used for kernel access. If the address isn't a valid user address it
will cause an address error or TLB exception, and if it is then user
memory may be read instead of kernel memory.

For example in the following stack trace from Linux v3.10 (changes since
then will prevent this particular one still happening) kernel_sendmsg()
set the user address limit to KERNEL_DS, and tcp_sendmsg() goes on to
use __copy_from_user() with a kernel address in KSeg0.

[<8002d434>] __copy_fromuser_common+0x10c/0x254
[<805710e0>] tcp_sendmsg+0x5f4/0xf00
[<804e8e3c>] sock_sendmsg+0x78/0xa0
[<804e8f28>] kernel_sendmsg+0x24/0x38
[<804ee0f8>] sock_no_sendpage+0x70/0x7c
[<8017c820>] pipe_to_sendpage+0x80/0x98
[<8017c6b0>] splice_from_pipe_feed+0xa8/0x198
[<8017cc54>] __splice_from_pipe+0x4c/0x8c
[<8017e844>] splice_from_pipe+0x58/0x78
[<8017e884>] generic_splice_sendpage+0x20/0x2c
[<8017d690>] do_splice_from+0xb4/0x110
[<8017d710>] direct_splice_actor+0x24/0x30
[<8017d394>] splice_direct_to_actor+0xd8/0x208
[<8017d51c>] do_splice_direct+0x58/0x7c
[<8014eaf4>] do_sendfile+0x1dc/0x39c
[<8014f82c>] SyS_sendfile+0x90/0xf8

Add the eva_kernel_access() check in __copy_from_user() like the one in
copy_from_user().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10843/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
[james.hogan@imgtec.com: backport]
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

efi: Disable interrupts around EFI calls, not in the epilog/prolog calls

commit 23a0d4e8fa6d3a1d7fb819f79bcc0a3739c30ba9 upstream.

Tapasweni Pathak reported that we do a kmalloc() in efi_call_phys_prolog()
on x86-64 while having interrupts disabled, which is a big no-no, as
kmalloc() can sleep.

Solve this by removing the irq disabling from the prolog/epilog calls
around EFI calls: it's unnecessary, as in this stage we are single
threaded in the boot thread, and we don't ever execute this from
interrupt contexts.

Reported-by: Tapasweni Pathak <tapaswenipathak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

usb: musb: USB_TI_CPPI41_DMA requires dmaengine support

commit 183e53e8ddf4165c3763181682189362d6b403f7 upstream.

The CPPI-4.1 driver selects TI_CPPI41, which is a dmaengine
driver and that may not be available when CONFIG_DMADEVICES
is not set:

warning: (USB_TI_CPPI41_DMA) selects TI_CPPI41 which has unmet direct dependencies (DMADEVICES && ARCH_OMAP)

This adds an extra dependency to avoid generating warnings in randconfig
builds. Ideally we'd remove the 'select' statement, but that has the
potential to break defconfig files.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 411dd19c682d ("usb: musb: Kconfig: Select the DMA driver if DMA mode of MUSB is enabled")
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

sh64: fix __NR_fgetxattr

commit 2d33fa1059da4c8e816627a688d950b613ec0474 upstream.

According to arch/sh/kernel/syscalls_64.S and common sense, __NR_fgetxattr
has to be defined to 259, but it doesn't. Instead, it's defined to 269,
which is of course used by another syscall, __NR_sched_setaffinity in this
case.

This bug was found by strace test suite.

Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ocfs2: fix SGID not inherited issue

commit 854ee2e944b4daf795e32562a7d2f9e90ab5a6a8 upstream.

Commit 8f1eb48758aa ("ocfs2: fix umask ignored issue") introduced an
issue, SGID of sub dir was not inherited from its parents dir. It is
because SGID is set into "inode->i_mode" in ocfs2_get_init_inode(), but
is overwritten by "mode" which don't have SGID set later.

Fixes: 8f1eb48758aa ("ocfs2: fix umask ignored issue")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Acked-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

drivers/base/memory.c: prohibit offlining of memory blocks with missing sections

commit 26bbe7ef6d5cdc7ec08cba6d433fca4060f258f3 upstream.

Commit bdee237c0343 ("x86: mm: Use 2GB memory block size on large-memory
x86-64 systems") and 982792c782ef ("x86, mm: probe memory block size for
generic x86 64bit") introduced large block sizes for x86. This made it
possible to have multiple sections per memory block where previously,
there was a only every one section per block.

Since blocks consist of contiguous ranges of section, there can be holes
in the blocks where sections are not present. If one attempts to
offline such a block, a crash occurs since the code is not designed to
deal with this.

This patch is a quick fix to gaurd against the crash by not allowing
blocks with non-present sections to be offlined.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=107781

Signed-off-by: Seth Jennings <sjennings@variantweb.net>
Reported-by: Andrew Banman <abanman@sgi.com>
Cc: Daniel J Blueman <daniel@numascale.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Greg KH <greg@kroah.com>
Cc: Russ Anderson <rja@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

mm: hugetlb: call huge_pte_alloc() only if ptep is null

commit 0d777df5d8953293be090d9ab5a355db893e8357 upstream.

Currently at the beginning of hugetlb_fault(), we call huge_pte_offset()
and check whether the obtained *ptep is a migration/hwpoison entry or
not. And if not, then we get to call huge_pte_alloc(). This is racy
because the *ptep could turn into migration/hwpoison entry after the
huge_pte_offset() check. This race results in BUG_ON in
huge_pte_alloc().

We don't have to call huge_pte_alloc() when the huge_pte_offset()
returns non-NULL, so let's fix this bug with moving the code into else
block.

Note that the *ptep could turn into a migration/hwpoison entry after
this block, but that's not a problem because we have another
!pte_present check later (we never go into hugetlb_no_page() in that
case.)

Fixes: 290408d4a250 ("hugetlb: hugepage migration core")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress

commit 373ccbe5927034b55bdc80b0f8b54d6e13fe8d12 upstream.

Tetsuo Handa has reported that the system might basically livelock in
OOM condition without triggering the OOM killer.

The issue is caused by internal dependency of the direct reclaim on
vmstat counter updates (via zone_reclaimable) which are performed from
the workqueue context.  If all the current workers get assigned to an
allocation request, though, they will be looping inside the allocator
trying to reclaim memory but zone_reclaimable can see stalled numbers so
it will consider a zone reclaimable even though it has been scanned way
too much.  WQ concurrency logic will not consider this situation as a
congested workqueue because it relies that worker would have to sleep in
such a situation.  This also means that it doesn't try to spawn new
workers or invoke the rescuer thread if the one is assigned to the
queue.

In order to fix this issue we need to do two things.  First we have to
let wq concurrency code know that we are in trouble so we have to do a
short sleep.  In order to prevent from issues handled by 0e093d99763e
("writeback: do not sleep on the congestion queue if there are no
congested BDIs or if significant congestion is not being encountered in
the current zone") we limit the sleep only to worker threads which are
the ones of the interest anyway.

The second thing to do is to create a dedicated workqueue for vmstat and
mark it WQ_MEM_RECLAIM to note it participates in the reclaim and to
have a spare worker thread for it.

Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Tejun Heo <tj@kernel.org>
Cc: Cristopher Lameter <clameter@sgi.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Arkadiusz Miskiewicz <arekm@maven.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
[ luis: backported to 3.16, based on Ben's backport to 3.2:
  - use queue_delayed_work instead of queue_delayed_work_on in function
    vmstat_update()
  - change start_cpu_timer() instead of vmstat_shepherd()
  - adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

mm: hugetlb: fix hugepage memory leak caused by wrong reserve count

commit a88c769548047b21f76fd71e04b6a3300ff17160 upstream.

When dequeue_huge_page_vma() in alloc_huge_page() fails, we fall back on
alloc_buddy_huge_page() to directly create a hugepage from the buddy
allocator.

In that case, however, if alloc_buddy_huge_page() succeeds we don't
decrement h->resv_huge_pages, which means that successful
hugetlb_fault() returns without releasing the reserve count.  As a
result, subsequent hugetlb_fault() might fail despite that there are
still free hugepages.

This patch simply adds decrementing code on that code path.

I reproduced this problem when testing v4.3 kernel in the following situation:
- the test machine/VM is a NUMA system,
- hugepage overcommiting is enabled,
- most of hugepages are allocated and there's only one free hugepage
   which is on node 0 (for example),
- another program, which calls set_mempolicy(MPOL_BIND) to bind itself to
   node 1, tries to allocate a hugepage,
- the allocation should fail but the reserve count is still hold.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ luis: backported to 3.16:
  - use 'chg' instead of 'gbl_chg'
  - adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

parisc iommu: fix panic due to trying to allocate too large region

commit e46e31a3696ae2d66f32c207df3969613726e636 upstream.

When using the Promise TX2+ SATA controller on PA-RISC, the system often
crashes with kernel panic, for example just writing data with the dd
utility will make it crash.

Kernel panic - not syncing: drivers/parisc/sba_iommu.c: I/O MMU @ 000000000000a000 is out of mapping resources

CPU: 0 PID: 18442 Comm: mkspadfs Not tainted 4.4.0-rc2 #2
Backtrace:
[<000000004021497c>] show_stack+0x14/0x20
[<0000000040410bf0>] dump_stack+0x88/0x100
[<000000004023978c>] panic+0x124/0x360
[<0000000040452c18>] sba_alloc_range+0x698/0x6a0
[<0000000040453150>] sba_map_sg+0x260/0x5b8
[<000000000c18dbb4>] ata_qc_issue+0x264/0x4a8 [libata]
[<000000000c19535c>] ata_scsi_translate+0xe4/0x220 [libata]
[<000000000c19a93c>] ata_scsi_queuecmd+0xbc/0x320 [libata]
[<0000000040499bbc>] scsi_dispatch_cmd+0xfc/0x130
[<000000004049da34>] scsi_request_fn+0x6e4/0x970
[<00000000403e95a8>] __blk_run_queue+0x40/0x60
[<00000000403e9d8c>] blk_run_queue+0x3c/0x68
[<000000004049a534>] scsi_run_queue+0x2a4/0x360
[<000000004049be68>] scsi_end_request+0x1a8/0x238
[<000000004049de84>] scsi_io_completion+0xfc/0x688
[<0000000040493c74>] scsi_finish_command+0x17c/0x1d0

The cause of the crash is not exhaustion of the IOMMU space, there is
plenty of free pages. The function sba_alloc_range is called with size
0x11000, thus the pages_needed variable is 0x11. The function
sba_search_bitmap is called with bits_wanted 0x11 and boundary size is
0x10 (because dma_get_seg_boundary(dev) returns 0xffff).

The function sba_search_bitmap attempts to allocate 17 pages that must not
cross 16-page boundary - it can't satisfy this requirement
(iommu_is_span_boundary always returns true) and fails even if there are
many free entries in the IOMMU space.

How did it happen that we try to allocate 17 pages that don't cross
16-page boundary? The cause is in the function iommu_coalesce_chunks. This
function tries to coalesce adjacent entries in the scatterlist. The
function does several checks if it may coalesce one entry with the next,
one of those checks is this:

if (startsg->length + dma_len > max_seg_size)
break;

When it finishes coalescing adjacent entries, it allocates the mapping:

sg_dma_len(contig_sg) = dma_len;
dma_len = ALIGN(dma_len + dma_offset, IOVP_SIZE);
sg_dma_address(contig_sg) =
PIDE_FLAG
| (iommu_alloc_range(ioc, dev, dma_len) << IOVP_SHIFT)
| dma_offset;

It is possible that (startsg->length + dma_len > max_seg_size) is false
(we are just near the 0x10000 max_seg_size boundary), so the funcion
decides to coalesce this entry with the next entry. When the coalescing
succeeds, the function performs
dma_len = ALIGN(dma_len + dma_offset, IOVP_SIZE);
And now, because of non-zero dma_offset, dma_len is greater than 0x10000.
iommu_alloc_range (a pointer to sba_alloc_range) is called and it attempts
to allocate 17 pages for a device that must not cross 16-page boundary.

To fix the bug, we must make sure that dma_len after addition of
dma_offset and alignment doesn't cross the segment boundary. I.e. change
if (startsg->length + dma_len > max_seg_size)
break;
to
if (ALIGN(dma_len + dma_offset + startsg->length, IOVP_SIZE) > max_seg_size)
break;

This patch makes this change (it precalculates max_seg_boundary at the
beginning of the function iommu_coalesce_chunks). I also added a check
that the mapping length doesn't exceed dma_get_seg_boundary(dev) (it is
not needed for Promise TX2+ SATA, but it may be needed for other devices
that have dma_get_seg_boundary lower than dma_get_max_seg_size).

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

USB: add quirk for devices with broken LPM

commit ad87e03213b552a5c33d5e1e7a19a73768397010 upstream.

Some USB device / host controller combinations seem to have problems
with Link Power Management. For example, Steinar found that his xHCI
controller wouldn't handle bandwidth calculations correctly for two
video cards simultaneously when LPM was enabled, even though the bus
had plenty of bandwidth available.

This patch introduces a new quirk flag for devices that should remain
disabled for LPM, and creates quirk entries for Steinar's devices.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

xhci: fix usb2 resume timing and races.

commit f69115fdbc1ac0718e7d19ad3caa3da2ecfe1c96 upstream.

According to USB 2 specs ports need to signal resume for at least 20ms,
in practice even longer, before moving to U0 state.
Both host and devices can initiate resume.

On device initiated resume, a port status interrupt with the port in resume
state in issued. The interrupt handler tags a resume_done[port]
timestamp with current time + USB_RESUME_TIMEOUT, and kick roothub timer.
Root hub timer requests for port status, finds the port in resume state,
checks if resume_done[port] timestamp passed, and set port to U0 state.

On host initiated resume, current code sets the port to resume state,
sleep 20ms, and finally sets the port to U0 state. This should also
be changed to work in a similar way as the device initiated resume, with
timestamp tagging, but that is not yet tested and will be a separate
fix later.

There are a few issues with this approach

1. A host initiated resume will also generate a resume event. The event
   handler will find the port in resume state, believe it's a device
   initiated resume, and act accordingly.

2. A port status request might cut the resume signalling short if a
   get_port_status request is handled during the host resume signalling.
   The port will be found in resume state. The timestamp is not set leading
   to time_after_eq(jiffies, timestamp) returning true, as timestamp = 0.
   get_port_status will proceed with moving the port to U0.

3. If an error, or anything else happens to the port during device
   initiated resume signalling it will leave all the device resume
   parameters hanging uncleared, preventing further suspend, returning
   -EBUSY, and cause the pm thread to busyloop trying to enter suspend.

Fix this by using the existing resuming_ports bitfield to indicate that
resume signalling timing is taken care of.
Check if the resume_done[port] is set before using it for timestamp
comparison, and also clear out any resume signalling related variables
if port is not in U0 or Resume state

This issue was discovered when a PM thread busylooped, trying to runtime
suspend the xhci USB 2 roothub on a Dell XPS

Reported-by: Daniel J Blueman <daniel@quora.org>
Tested-by: Daniel J Blueman <daniel@quora.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

vgaarb: fix signal handling in vga_get()

commit 9f5bd30818c42c6c36a51f93b4df75a2ea2bd85e upstream.

There are few defects in vga_get() related to signal hadning:

  - we shouldn't check for pending signals for TASK_UNINTERRUPTIBLE
    case;

  - if we found pending signal we must remove ourself from wait queue
    and change task state back to running;

  - -ERESTARTSYS is more appropriate, I guess.

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

dm btree: fix bufio buffer leaks in dm_btree_del() error path

commit ed8b45a3679eb49069b094c0711b30833f27c734 upstream.

If dm_btree_del()'s call to push_frame() fails, e.g. due to
btree_node_validator finding invalid metadata, the dm_btree_del() error
path must unlock all frames (which have active dm-bufio buffers) that
were pushed onto the del_stack.

Otherwise, dm_bufio_client_destroy() will BUG_ON() because dm-bufio
buffers have leaked, e.g.:
device-mapper: bufio: leaked buffer 3, hold count 1, list 0

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ipmi: move timer init to before irq is setup

commit 27f972d3e00b50639deb4cc1392afaeb08d3cecc upstream.

We encountered a panic on boot in ipmi_si on a dell per320 due to an
uninitialized timer as follows.

static int smi_start_processing(void       *send_info,
                                ipmi_smi_t intf)
{
        /* Try to claim any interrupts. */
        if (new_smi->irq_setup)
                new_smi->irq_setup(new_smi);

--> IRQ arrives here and irq handler tries to modify uninitialized timer

    which triggers BUG_ON(!timer->function) in __mod_timer().

Call Trace:
   <IRQ>
   [<ffffffffa0532617>] start_new_msg+0x47/0x80 [ipmi_si]
   [<ffffffffa053269e>] start_check_enables+0x4e/0x60 [ipmi_si]
   [<ffffffffa0532bd8>] smi_event_handler+0x1e8/0x640 [ipmi_si]
   [<ffffffff810f5584>] ? __rcu_process_callbacks+0x54/0x350
   [<ffffffffa053327c>] si_irq_handler+0x3c/0x60 [ipmi_si]
   [<ffffffff810efaf0>] handle_IRQ_event+0x60/0x170
   [<ffffffff810f245e>] handle_edge_irq+0xde/0x180
   [<ffffffff8100fc59>] handle_irq+0x49/0xa0
   [<ffffffff8154643c>] do_IRQ+0x6c/0xf0
   [<ffffffff8100ba53>] ret_from_intr+0x0/0x11

        /* Set up the timer that drives the interface. */
        setup_timer(&new_smi->si_timer, smi_timeout, (long)new_smi);

The following patch fixes the problem.

To: Openipmi-developer@lists.sourceforge.net
To: Corey Minyard <minyard@acm.org>
CC: linux-kernel@vger.kernel.org
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Tony Camuso <tcamuso@redhat.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

dm space map metadata: fix ref counting bug when bootstrapping a new space map

commit 50dd842ad83b43bed71790efb31cfb2f6c05c9c1 upstream.

When applying block operations (BOPs) do not remove them from the
uncommitted BOP ring-buffer until after they've been applied -- in case
we recurse.

Also, perform BOP_INC operation, in dm_sm_metadata_create() and
sm_metadata_extend(), in terms of the uncommitted BOP ring-buffer rather
than using direct calls to sm_ll_inc().

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

dm thin metadata: fix bug when taking a metadata snapshot

commit 49e99fc717f624aa75ca755d6e7bc029efd3f0e9 upstream.

When you take a metadata snapshot the btree roots for the mapping and
details tree need to have their reference counts incremented so they
persist for the lifetime of the metadata snap.

The roots being incremented were those currently written in the
superblock, which could possibly be out of date if concurrent IO is
triggering new mappings, breaking of sharing, etc.

Fix this by performing a commit with the metadata lock held while taking
a metadata snapshot.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

ALSA: hda - Fix noise problems on Thinkpad T440s

commit 9a811230481243f384b8036c6a558bfdbd961f78 upstream.

Lenovo Thinkpad T440s suffers from constant background noises, and it
seems to be a generic hardware issue on this model:
  https://forums.lenovo.com/t5/ThinkPad-T400-T500-and-newer-T/T440s-speaker-noise/td-p/1339883

As the noise comes from the analog loopback path, disabling the path
is the easy workaround.

Also, the machine gives significant cracking noises at PM suspend.  A
workaround found by trial-and-error is to disable the shutup callback
currently used for ALC269-variant.

This patch addresses these noise issues by introducing a new fixup
chain.  Although the same workaround might be applicable to other
Thinkpad models, it's applied only to T440s (17aa:220c) in this patch,
so far, just to be safe (you chicken!).  As a compromise, a new model
option string "tp440" is provided now, though, so that owners of other
Thinkpad models can test it more easily.

Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=958504
Reported-and-tested-by: Tim Hardeck <thardeck@suse.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

radeon: Fix VCE IB test on Big-Endian systems

commit 361c32d39087e7caa99e629c0d7fb00643cb2190 upstream.

This patch makes the VCE IB test pass on Big-Endian systems. It converts
to little-endian the contents of the VCE message.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

radeon: Fix VCE ring test for Big-Endian systems

commit 687f4b98d1f4e27508f7ad4bcce787c1ba58b289 upstream.

This patch fixes the VCE ring test when running on Big-Endian machines.
Every write to the ring needs to be translated to little-endian.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>