]> git.ipfire.org Git - thirdparty/linux.git/log
thirdparty/linux.git
5 years agoKVM: x86: Open code kvm_set_hflags
Sean Christopherson [Tue, 2 Apr 2019 15:03:10 +0000 (08:03 -0700)] 
KVM: x86: Open code kvm_set_hflags

Prepare for clearing HF_SMM_MASK prior to loading state from the SMRAM
save state map, i.e. kvm_smm_changed() needs to be called after state
has been loaded and so cannot be done automatically when setting
hflags from RSM.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoKVM: x86: Load SMRAM in a single shot when leaving SMM
Sean Christopherson [Tue, 2 Apr 2019 15:03:09 +0000 (08:03 -0700)] 
KVM: x86: Load SMRAM in a single shot when leaving SMM

RSM emulation is currently broken on VMX when the interrupted guest has
CR4.VMXE=1.  Rather than dance around the issue of HF_SMM_MASK being set
when loading SMSTATE into architectural state, ideally RSM emulation
itself would be reworked to clear HF_SMM_MASK prior to loading non-SMM
architectural state.

Ostensibly, the only motivation for having HF_SMM_MASK set throughout
the loading of state from the SMRAM save state area is so that the
memory accesses from GET_SMSTATE() are tagged with role.smm.  Load
all of the SMRAM save state area from guest memory at the beginning of
RSM emulation, and load state from the buffer instead of reading guest
memory one-by-one.

This paves the way for clearing HF_SMM_MASK prior to loading state,
and also aligns RSM with the enter_smm() behavior, which fills a
buffer and writes SMRAM save state in a single go.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoKVM: nVMX: Expose RDPMC-exiting only when guest supports PMU
Liran Alon [Mon, 25 Mar 2019 19:09:17 +0000 (21:09 +0200)] 
KVM: nVMX: Expose RDPMC-exiting only when guest supports PMU

Issue was discovered when running kvm-unit-tests on KVM running as L1 on
top of Hyper-V.

When vmx_instruction_intercept unit-test attempts to run RDPMC to test
RDPMC-exiting, it is intercepted by L1 KVM which it's EXIT_REASON_RDPMC
handler raise #GP because vCPU exposed by Hyper-V doesn't support PMU.
Instead of unit-test expectation to be reflected with EXIT_REASON_RDPMC.

The reason vmx_instruction_intercept unit-test attempts to run RDPMC
even though Hyper-V doesn't support PMU is because L1 expose to L2
support for RDPMC-exiting. Which is reasonable to assume that is
supported only in case CPU supports PMU to being with.

Above issue can easily be simulated by modifying
vmx_instruction_intercept config in x86/unittests.cfg to run QEMU with
"-cpu host,+vmx,-pmu" and run unit-test.

To handle issue, change KVM to expose RDPMC-exiting only when guest
supports PMU.

Reported-by: Saar Amar <saaramar@microsoft.com>
Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoKVM: x86: Raise #GP when guest vCPU do not support PMU
Liran Alon [Mon, 25 Mar 2019 19:10:17 +0000 (21:10 +0200)] 
KVM: x86: Raise #GP when guest vCPU do not support PMU

Before this change, reading a VMware pseduo PMC will succeed even when
PMU is not supported by guest. This can easily be seen by running
kvm-unit-test vmware_backdoors with "-cpu host,-pmu" option.

Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agox86/kvm: move kvm_load/put_guest_xcr0 into atomic context
WANG Chao [Fri, 12 Apr 2019 07:55:39 +0000 (15:55 +0800)] 
x86/kvm: move kvm_load/put_guest_xcr0 into atomic context

guest xcr0 could leak into host when MCE happens in guest mode. Because
do_machine_check() could schedule out at a few places.

For example:

kvm_load_guest_xcr0
...
kvm_x86_ops->run(vcpu) {
  vmx_vcpu_run
    vmx_complete_atomic_exit
      kvm_machine_check
        do_machine_check
          do_memory_failure
            memory_failure
              lock_page

In this case, host_xcr0 is 0x2ff, guest vcpu xcr0 is 0xff. After schedule
out, host cpu has guest xcr0 loaded (0xff).

In __switch_to {
     switch_fpu_finish
       copy_kernel_to_fpregs
         XRSTORS

If any bit i in XSTATE_BV[i] == 1 and xcr0[i] == 0, XRSTORS will
generate #GP (In this case, bit 9). Then ex_handler_fprestore kicks in
and tries to reinitialize fpu by restoring init fpu state. Same story as
last #GP, except we get DOUBLE FAULT this time.

Cc: stable@vger.kernel.org
Signed-off-by: WANG Chao <chao.wang@ucloud.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoKVM: x86: svm: make sure NMI is injected after nmi_singlestep
Vitaly Kuznetsov [Wed, 3 Apr 2019 14:06:42 +0000 (16:06 +0200)] 
KVM: x86: svm: make sure NMI is injected after nmi_singlestep

I noticed that apic test from kvm-unit-tests always hangs on my EPYC 7401P,
the hanging test nmi-after-sti is trying to deliver 30000 NMIs and tracing
shows that we're sometimes able to deliver a few but never all.

When we're trying to inject an NMI we may fail to do so immediately for
various reasons, however, we still need to inject it so enable_nmi_window()
arms nmi_singlestep mode. #DB occurs as expected, but we're not checking
for pending NMIs before entering the guest and unless there's a different
event to process, the NMI will never get delivered.

Make KVM_REQ_EVENT request on the vCPU from db_interception() to make sure
pending NMIs are checked and possibly injected.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agosvm/avic: Fix invalidate logical APIC id entry
Suthikulpanit, Suravee [Tue, 26 Mar 2019 03:57:37 +0000 (03:57 +0000)] 
svm/avic: Fix invalidate logical APIC id entry

Only clear the valid bit when invalidate logical APIC id entry.
The current logic clear the valid bit, but also set the rest of
the bits (including reserved bits) to 1.

Fixes: 98d90582be2e ('svm: Fix AVIC DFR and LDR handling')
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoRevert "svm: Fix AVIC incomplete IPI emulation"
Suthikulpanit, Suravee [Wed, 20 Mar 2019 08:12:28 +0000 (08:12 +0000)] 
Revert "svm: Fix AVIC incomplete IPI emulation"

This reverts commit bb218fbcfaaa3b115d4cd7a43c0ca164f3a96e57.

As Oren Twaig pointed out the old discussion:

  https://patchwork.kernel.org/patch/8292231/

that the change coud potentially cause an extra IPI to be sent to
the destination vcpu because the AVIC hardware already set the IRR bit
before the incomplete IPI #VMEXIT with id=1 (target vcpu is not running).
Since writting to ICR and ICR2 will also set the IRR. If something triggers
the destination vcpu to get scheduled before the emulation finishes, then
this could result in an additional IPI.

Also, the issue mentioned in the commit bb218fbcfaaa was misdiagnosed.

Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Oren Twaig <oren@scalemp.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agokvm: mmu: Fix overflow on kvm mmu page limit calculation
Ben Gardon [Mon, 8 Apr 2019 18:07:30 +0000 (11:07 -0700)] 
kvm: mmu: Fix overflow on kvm mmu page limit calculation

KVM bases its memory usage limits on the total number of guest pages
across all memslots. However, those limits, and the calculations to
produce them, use 32 bit unsigned integers. This can result in overflow
if a VM has more guest pages that can be represented by a u32. As a
result of this overflow, KVM can use a low limit on the number of MMU
pages it will allocate. This makes KVM unable to map all of guest memory
at once, prompting spurious faults.

Tested: Ran all kvm-unit-tests on an Intel Haswell machine. This patch
introduced no new failures.

Signed-off-by: Ben Gardon <bgardon@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoKVM: nVMX: always use early vmcs check when EPT is disabled
Paolo Bonzini [Mon, 15 Apr 2019 13:57:19 +0000 (15:57 +0200)] 
KVM: nVMX: always use early vmcs check when EPT is disabled

The remaining failures of vmx.flat when EPT is disabled are caused by
incorrectly reflecting VMfails to the L1 hypervisor.  What happens is
that nested_vmx_restore_host_state corrupts the guest CR3, reloading it
with the host's shadow CR3 instead, because it blindly loads GUEST_CR3
from the vmcs01.

For simplicity let's just always use hardware VMCS checks when EPT is
disabled.  This way, nested_vmx_restore_host_state is not reached at
all (or at least shouldn't be reached).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agosc16is7xx: move label 'err_spi' to correct section
Guoqing Jiang [Tue, 9 Apr 2019 08:16:38 +0000 (16:16 +0800)] 
sc16is7xx: move label 'err_spi' to correct section

err_spi is used when SERIAL_SC16IS7XX_SPI is enabled, so make
the label only available under SERIAL_SC16IS7XX_SPI option.
Otherwise, the below warning appears.

drivers/tty/serial/sc16is7xx.c:1523:1: warning: label ‘err_spi’ defined but not used [-Wunused-label]
 err_spi:
  ^~~~~~~

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Fixes: ac0cdb3d9901 ("sc16is7xx: missing unregister/delete driver on error in sc16is7xx_init()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoserial: sh-sci: Fix HSCIF RX sampling point adjustment
Geert Uytterhoeven [Fri, 29 Mar 2019 09:10:26 +0000 (10:10 +0100)] 
serial: sh-sci: Fix HSCIF RX sampling point adjustment

The calculation of the sampling point has min() and max() exchanged.
Fix this by using the clamp() helper instead.

Fixes: 63ba1e00f178a448 ("serial: sh-sci: Support for HSCIF RX sampling point adjustment")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Ulrich Hecht <uli+renesas@fpond.eu>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Dirk Behme <dirk.behme@de.bosch.com>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoserial: sh-sci: Fix HSCIF RX sampling point calculation
Geert Uytterhoeven [Mon, 1 Apr 2019 11:25:10 +0000 (13:25 +0200)] 
serial: sh-sci: Fix HSCIF RX sampling point calculation

There are several issues with the formula used for calculating the
deviation from the intended rate:
  1. While min_err and last_stop are signed, srr and baud are unsigned.
     Hence the signed values are promoted to unsigned, which will lead
     to a bogus value of deviation if min_err is negative,
  2. Srr is the register field value, which is one less than the actual
     sampling rate factor,
  3. The divisions do not use rounding.

Fix this by casting unsigned variables to int, adding one to srr, and
using a single DIV_ROUND_CLOSEST().

Fixes: 63ba1e00f178a448 ("serial: sh-sci: Support for HSCIF RX sampling point adjustment")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Ulrich Hecht <uli+renesas@fpond.eu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agostaging: erofs: fix unexpected out-of-bound data access
Gao Xiang [Fri, 12 Apr 2019 09:53:14 +0000 (17:53 +0800)] 
staging: erofs: fix unexpected out-of-bound data access

Unexpected out-of-bound data will be read in erofs_read_raw_page
after commit 07173c3ec276 ("block: enable multipage bvecs") since
one iovec could have multiple pages.

Let's fix as what Ming's pointed out in the previous email [1].

[1] https://lore.kernel.org/lkml/20190411080953.GE421@ming.t460p/

Suggested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Fixes: 07173c3ec276 ("block: enable multipage bvecs")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/Kconfig: Fix spelling mistake "effectivness" -> "effectiveness"
Colin Ian King [Tue, 16 Apr 2019 10:57:51 +0000 (11:57 +0100)] 
x86/Kconfig: Fix spelling mistake "effectivness" -> "effectiveness"

The Kconfig text contains a spelling mistake, fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20190416105751.18899-1-colin.king@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5 years agostaging: comedi: vmk80xx: Fix possible double-free of ->usb_rx_buf
Ian Abbott [Mon, 15 Apr 2019 11:52:30 +0000 (12:52 +0100)] 
staging: comedi: vmk80xx: Fix possible double-free of ->usb_rx_buf

`vmk80xx_alloc_usb_buffers()` is called from `vmk80xx_auto_attach()` to
allocate RX and TX buffers for USB transfers.  It allocates
`devpriv->usb_rx_buf` followed by `devpriv->usb_tx_buf`.  If the
allocation of `devpriv->usb_tx_buf` fails, it frees
`devpriv->usb_rx_buf`,  leaving the pointer set dangling, and returns an
error.  Later, `vmk80xx_detach()` will be called from the core comedi
module code to clean up.  `vmk80xx_detach()` also frees both
`devpriv->usb_rx_buf` and `devpriv->usb_tx_buf`, but
`devpriv->usb_rx_buf` may have already been freed, leading to a
double-free error.  Fix it by removing the call to
`kfree(devpriv->usb_rx_buf)` from `vmk80xx_alloc_usb_buffers()`, relying
on `vmk80xx_detach()` to free the memory.

Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agostaging: comedi: vmk80xx: Fix use of uninitialized semaphore
Ian Abbott [Mon, 15 Apr 2019 11:10:14 +0000 (12:10 +0100)] 
staging: comedi: vmk80xx: Fix use of uninitialized semaphore

If `vmk80xx_auto_attach()` returns an error, the core comedi module code
will call `vmk80xx_detach()` to clean up.  If `vmk80xx_auto_attach()`
successfully allocated the comedi device private data,
`vmk80xx_detach()` assumes that a `struct semaphore limit_sem` contained
in the private data has been initialized and uses it.  Unfortunately,
there are a couple of places where `vmk80xx_auto_attach()` can return an
error after allocating the device private data but before initializing
the semaphore, so this assumption is invalid.  Fix it by initializing
the semaphore just after allocating the private data in
`vmk80xx_auto_attach()` before any other errors can be returned.

I believe this was the cause of the following syzbot crash report
<https://syzkaller.appspot.com/bug?extid=54c2f58f15fe6876b6ad>:

usb 1-1: config 0 has no interface number 0
usb 1-1: New USB device found, idVendor=10cf, idProduct=8068, bcdDevice=e6.8d
usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
usb 1-1: config 0 descriptor??
vmk80xx 1-1:0.117: driver 'vmk80xx' failed to auto-configure device.
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.1.0-rc4-319354-g9a33b36 #3
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: usb_hub_wq hub_event
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xe8/0x16e lib/dump_stack.c:113
 assign_lock_key kernel/locking/lockdep.c:786 [inline]
 register_lock_class+0x11b8/0x1250 kernel/locking/lockdep.c:1095
 __lock_acquire+0xfb/0x37c0 kernel/locking/lockdep.c:3582
 lock_acquire+0x10d/0x2f0 kernel/locking/lockdep.c:4211
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0x44/0x60 kernel/locking/spinlock.c:152
 down+0x12/0x80 kernel/locking/semaphore.c:58
 vmk80xx_detach+0x59/0x100 drivers/staging/comedi/drivers/vmk80xx.c:829
 comedi_device_detach+0xed/0x800 drivers/staging/comedi/drivers.c:204
 comedi_device_cleanup.part.0+0x68/0x140 drivers/staging/comedi/comedi_fops.c:156
 comedi_device_cleanup drivers/staging/comedi/comedi_fops.c:187 [inline]
 comedi_free_board_dev.part.0+0x16/0x90 drivers/staging/comedi/comedi_fops.c:190
 comedi_free_board_dev drivers/staging/comedi/comedi_fops.c:189 [inline]
 comedi_release_hardware_device+0x111/0x140 drivers/staging/comedi/comedi_fops.c:2880
 comedi_auto_config.cold+0x124/0x1b0 drivers/staging/comedi/drivers.c:1068
 usb_probe_interface+0x31d/0x820 drivers/usb/core/driver.c:361
 really_probe+0x2da/0xb10 drivers/base/dd.c:509
 driver_probe_device+0x21d/0x350 drivers/base/dd.c:671
 __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778
 bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454
 __device_attach+0x223/0x3a0 drivers/base/dd.c:844
 bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514
 device_add+0xad2/0x16e0 drivers/base/core.c:2106
 usb_set_configuration+0xdf7/0x1740 drivers/usb/core/message.c:2021
 generic_probe+0xa2/0xda drivers/usb/core/generic.c:210
 usb_probe_device+0xc0/0x150 drivers/usb/core/driver.c:266
 really_probe+0x2da/0xb10 drivers/base/dd.c:509
 driver_probe_device+0x21d/0x350 drivers/base/dd.c:671
 __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778
 bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454
 __device_attach+0x223/0x3a0 drivers/base/dd.c:844
 bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514
 device_add+0xad2/0x16e0 drivers/base/core.c:2106
 usb_new_device.cold+0x537/0xccf drivers/usb/core/hub.c:2534
 hub_port_connect drivers/usb/core/hub.c:5089 [inline]
 hub_port_connect_change drivers/usb/core/hub.c:5204 [inline]
 port_event drivers/usb/core/hub.c:5350 [inline]
 hub_event+0x138e/0x3b00 drivers/usb/core/hub.c:5432
 process_one_work+0x90f/0x1580 kernel/workqueue.c:2269
 worker_thread+0x9b/0xe20 kernel/workqueue.c:2415
 kthread+0x313/0x420 kernel/kthread.c:253
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352

Reported-by: syzbot+54c2f58f15fe6876b6ad@syzkaller.appspotmail.com
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoMerge tag 'extcon-fixes-for-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel...
Greg Kroah-Hartman [Tue, 16 Apr 2019 10:46:09 +0000 (12:46 +0200)] 
Merge tag 'extcon-fixes-for-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-linus

Chanwoo writes:

Update extcon for v5.1-rc4

Detailed description for this pull request:
1. Fix the build issue of extcon-ptn5150.c driver by editing
   the module dependency in Kconfig.

* tag 'extcon-fixes-for-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon:
  extcon: ptn5150: fix COMPILE_TEST dependencies

5 years agoperf/x86: Fix incorrect PEBS_REGS
Kan Liang [Tue, 2 Apr 2019 19:44:58 +0000 (12:44 -0700)] 
perf/x86: Fix incorrect PEBS_REGS

PEBS_REGS used as mask for the supported registers for large PEBS.
However, the mask cannot filter the sample_regs_user/sample_regs_intr
correctly.

(1ULL << PERF_REG_X86_*) should be used to replace PERF_REG_X86_*, which
is only the index.

Rename PEBS_REGS to PEBS_GP_REGS, because the mask is only for general
purpose registers.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Fixes: 2fe1bc1f501d ("perf/x86: Enable free running PEBS for REGS_USER/INTR")
Link: https://lkml.kernel.org/r/20190402194509.2832-2-kan.liang@linux.intel.com
[ Renamed it to PEBS_GP_REGS - as 'GPRS' is used elsewhere ;-) ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5 years agoperf/ring_buffer: Fix AUX record suppression
Alexander Shishkin [Fri, 29 Mar 2019 09:13:38 +0000 (11:13 +0200)] 
perf/ring_buffer: Fix AUX record suppression

The following commit:

  1627314fb54a33e ("perf: Suppress AUX/OVERWRITE records")

has an unintended side-effect of also suppressing all AUX records with no flags
and non-zero size, so all the regular records in the full trace mode.
This breaks some use cases for people.

Fix this by restoring "regular" AUX records.

Reported-by: Ben Gainey <Ben.Gainey@arm.com>
Tested-by: Ben Gainey <Ben.Gainey@arm.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 1627314fb54a33e ("perf: Suppress AUX/OVERWRITE records")
Link: https://lkml.kernel.org/r/20190329091338.29999-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5 years agoperf/core: Fix the address filtering fix
Alexander Shishkin [Fri, 29 Mar 2019 09:12:12 +0000 (11:12 +0200)] 
perf/core: Fix the address filtering fix

The following recent commit:

  c60f83b813e5 ("perf, pt, coresight: Fix address filters for vmas with non-zero offset")

changes the address filtering logic to communicate filter ranges to the PMU driver
via a single address range object, instead of having the driver do the final bit of
math.

That change forgets to take into account kernel filters, which are not calculated
the same way as DSO based filters.

Fix that by passing the kernel filters the same way as file-based filters.
This doesn't require any additional changes in the drivers.

Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: c60f83b813e5 ("perf, pt, coresight: Fix address filters for vmas with non-zero offset")
Link: https://lkml.kernel.org/r/20190329091212.29870-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5 years agoKVM: nVMX: allow tests to use bad virtual-APIC page address
Paolo Bonzini [Mon, 15 Apr 2019 13:16:17 +0000 (15:16 +0200)] 
KVM: nVMX: allow tests to use bad virtual-APIC page address

As mentioned in the comment, there are some special cases where we can simply
clear the TPR shadow bit from the CPU-based execution controls in the vmcs02.
Handle them so that we can remove some XFAILs from vmx.flat.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agox86/mm/tlb: Revert "x86/mm: Align TLB invalidation info"
Peter Zijlstra [Tue, 16 Apr 2019 08:03:35 +0000 (10:03 +0200)] 
x86/mm/tlb: Revert "x86/mm: Align TLB invalidation info"

Revert the following commit:

  515ab7c41306: ("x86/mm: Align TLB invalidation info")

I found out (the hard way) that under some .config options (notably L1_CACHE_SHIFT=7)
and compiler combinations this on-stack alignment leads to a 320 byte
stack usage, which then triggers a KASAN stack warning elsewhere.

Using 320 bytes of stack space for a 40 byte structure is ludicrous and
clearly not right.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Nadav Amit <namit@vmware.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 515ab7c41306 ("x86/mm: Align TLB invalidation info")
Link: http://lkml.kernel.org/r/20190416080335.GM7905@worktop.programming.kicks-ass.net
[ Minor changelog edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5 years agox86/reboot, efi: Use EFI reboot for Acer TravelMate X514-51T
Jian-Hong Pan [Fri, 12 Apr 2019 08:01:53 +0000 (16:01 +0800)] 
x86/reboot, efi: Use EFI reboot for Acer TravelMate X514-51T

Upon reboot, the Acer TravelMate X514-51T laptop appears to complete the
shutdown process, but then it hangs in BIOS POST with a black screen.

The problem is intermittent - at some points it has appeared related to
Secure Boot settings or different kernel builds, but ultimately we have
not been able to identify the exact conditions that trigger the issue to
come and go.

Besides, the EFI mode cannot be disabled in the BIOS of this model.

However, after extensive testing, we observe that using the EFI reboot
method reliably avoids the issue in all cases.

So add a boot time quirk to use EFI reboot on such systems.

Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=203119
Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Cc: linux@endlessm.com
Link: http://lkml.kernel.org/r/20190412080152.3718-1-jian-hong@endlessm.com
[ Fix !CONFIG_EFI build failure, clarify the code and the changelog a bit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5 years agox86/mm: Prevent bogus warnings with "noexec=off"
Thomas Gleixner [Mon, 15 Apr 2019 08:46:07 +0000 (10:46 +0200)] 
x86/mm: Prevent bogus warnings with "noexec=off"

Xose Vazquez Perez reported boot warnings when NX is disabled on the kernel command line.

__early_set_fixmap() triggers this warning:

  attempted to set unsupported pgprot:    8000000000000163
       bits:      8000000000000000
       supported: 7fffffffffffffff

  WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/pgtable.h:537
    __early_set_fixmap+0xa2/0xff

because it uses __default_kernel_pte_mask to mask out unsupported bits.

Use __supported_pte_mask instead.

Disabling NX on the command line also triggers the NX warning in the page
table mapping check:

  WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:262 note_page+0x2ae/0x650
  ....

Make the warning depend on NX set in __supported_pte_mask.

Reported-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Tested-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1904151037530.1729@nanos.tec.linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5 years agokprobes: Fix error check when reusing optimized probes
Masami Hiramatsu [Mon, 15 Apr 2019 06:01:25 +0000 (15:01 +0900)] 
kprobes: Fix error check when reusing optimized probes

The following commit introduced a bug in one of our error paths:

  819319fc9346 ("kprobes: Return error if we fail to reuse kprobe instead of BUG_ON()")

it missed to handle the return value of kprobe_optready() as
error-value. In reality, the kprobe_optready() returns a bool
result, so "true" case must be passed instead of 0.

This causes some errors on kprobe boot-time selftests on ARM:

 [   ] Beginning kprobe tests...
 [   ] Probe ARM code
 [   ]     kprobe
 [   ]     kretprobe
 [   ] ARM instruction simulation
 [   ]     Check decoding tables
 [   ]     Run test cases
 [   ] FAIL: test_case_handler not run
 [   ] FAIL: Test andge r10, r11, r14, asr r7
 [   ] FAIL: Scenario 11
 ...
 [   ] FAIL: Scenario 7
 [   ] Total instruction simulation tests=1631, pass=1433 fail=198
 [   ] kprobe tests failed

This can happen if an optimized probe is unregistered and next
kprobe is registered on same address until the previous probe
is not reclaimed.

If this happens, a hidden aggregated probe may be kept in memory,
and no new kprobe can probe same address. Also, in that case
register_kprobe() will return "1" instead of minus error value,
which can mislead caller logic.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Naveen N . Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # v5.0+
Fixes: 819319fc9346 ("kprobes: Return error if we fail to reuse kprobe instead of BUG_ON()")
Link: http://lkml.kernel.org/r/155530808559.32517.539898325433642204.stgit@devnote2
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5 years agolocking/lockdep: Make lockdep_unregister_key() honor 'debug_locks' again
Bart Van Assche [Mon, 15 Apr 2019 17:05:38 +0000 (10:05 -0700)] 
locking/lockdep: Make lockdep_unregister_key() honor 'debug_locks' again

If lockdep_register_key() and lockdep_unregister_key() are called with
debug_locks == false then the following warning is reported:

  WARNING: CPU: 2 PID: 15145 at kernel/locking/lockdep.c:4920 lockdep_unregister_key+0x1ad/0x240

That warning is reported because lockdep_unregister_key() ignores the
value of 'debug_locks' and because the behavior of lockdep_register_key()
depends on whether or not 'debug_locks' is set. Fix this inconsistency
by making lockdep_unregister_key() take 'debug_locks' again into
account.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: shenghui <shhuiw@foxmail.com>
Fixes: 90c1cba2b3b3 ("locking/lockdep: Zap lock classes even with lock debugging disabled")
Link: http://lkml.kernel.org/r/20190415170538.23491-1-bvanassche@acm.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5 years agox86/build/lto: Fix truncated .bss with -fdata-sections
Sami Tolvanen [Mon, 15 Apr 2019 16:49:56 +0000 (09:49 -0700)] 
x86/build/lto: Fix truncated .bss with -fdata-sections

With CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y, we compile the kernel with
-fdata-sections, which also splits the .bss section.

The new section, with a new .bss.* name, which pattern gets missed by the
main x86 linker script which only expects the '.bss' name. This results
in the discarding of the second part and a too small, truncated .bss
section and an unhappy, non-working kernel.

Use the common BSS_MAIN macro in the linker script to properly capture
and merge all the generated BSS sections.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20190415164956.124067-1-samitolvanen@google.com
[ Extended the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5 years agobnx2x: fix spelling mistake "dicline" -> "decline"
Colin Ian King [Mon, 15 Apr 2019 15:47:03 +0000 (16:47 +0100)] 
bnx2x: fix spelling mistake "dicline" -> "decline"

There is a spelling mistake in a BNX2X_ERR message, fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'libnvdimm-fixes-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 15 Apr 2019 23:48:51 +0000 (16:48 -0700)] 
Merge tag 'libnvdimm-fixes-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fixes from Dan Williams:
 "I debated holding this back for the v5.2 merge window due to the size
  of the "zero-key" changes, but affected users would benefit from
  having the fixes sooner. It did not make sense to change the zero-key
  semantic in isolation for the "secure-erase" command, but instead
  include it for all security commands.

  The short background on the need for these changes is that some NVDIMM
  platforms enable security with a default zero-key rather than let the
  OS specify the initial key. This makes the security enabling that
  landed in v5.0 unusable for some users.

  Summary:

   - Compatibility fix for nvdimm-security implementations with a
     default zero-key.

   - Miscellaneous small fixes for out-of-bound accesses, cleanup after
     initialization failures, and missing debug messages"

* tag 'libnvdimm-fixes-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  tools/testing/nvdimm: Retain security state after overwrite
  libnvdimm/pmem: fix a possible OOB access when read and write pmem
  libnvdimm/security, acpi/nfit: unify zero-key for all security commands
  libnvdimm/security: provide fix for secure-erase to use zero-key
  libnvdimm/btt: Fix a kmemdup failure check
  libnvdimm/namespace: Fix a potential NULL pointer dereference
  acpi/nfit: Always dump _DSM output payload

5 years agoMerge tag 'fsdax-fix-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm...
Linus Torvalds [Mon, 15 Apr 2019 22:10:20 +0000 (15:10 -0700)] 
Merge tag 'fsdax-fix-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull fsdax fix from Dan Williams:
 "A single filesystem-dax fix. It has been lingering in -next for a long
  while and there are no other fsdax fixes on the horizon:

   - Avoid a crash scenario with architectures like powerpc that require
     'pgtable_deposit' for the zero page"

* tag 'fsdax-fix-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  fs/dax: Deposit pagetable even when installing zero page

5 years agoroute: Avoid crash from dereferencing NULL rt->from
Jonathan Lemon [Sun, 14 Apr 2019 21:21:29 +0000 (14:21 -0700)] 
route: Avoid crash from dereferencing NULL rt->from

When __ip6_rt_update_pmtu() is called, rt->from is RCU dereferenced, but is
never checked for null - rt6_flush_exceptions() may have removed the entry.

[ 1913.989004] RIP: 0010:ip6_rt_cache_alloc+0x13/0x170
[ 1914.209410] Call Trace:
[ 1914.214798]  <IRQ>
[ 1914.219226]  __ip6_rt_update_pmtu+0xb0/0x190
[ 1914.228649]  ip6_tnl_xmit+0x2c2/0x970 [ip6_tunnel]
[ 1914.239223]  ? ip6_tnl_parse_tlv_enc_lim+0x32/0x1a0 [ip6_tunnel]
[ 1914.252489]  ? __gre6_xmit+0x148/0x530 [ip6_gre]
[ 1914.262678]  ip6gre_tunnel_xmit+0x17e/0x3c7 [ip6_gre]
[ 1914.273831]  dev_hard_start_xmit+0x8d/0x1f0
[ 1914.283061]  sch_direct_xmit+0xfa/0x230
[ 1914.291521]  __qdisc_run+0x154/0x4b0
[ 1914.299407]  net_tx_action+0x10e/0x1f0
[ 1914.307678]  __do_softirq+0xca/0x297
[ 1914.315567]  irq_exit+0x96/0xa0
[ 1914.322494]  smp_apic_timer_interrupt+0x68/0x130
[ 1914.332683]  apic_timer_interrupt+0xf/0x20
[ 1914.341721]  </IRQ>

Fixes: a68886a69180 ("net/ipv6: Make from in rt6_info rcu protected")
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMAINTAINERS: normalize Woojung Huh's email address
Lukas Bulwahn [Sat, 13 Apr 2019 07:52:15 +0000 (09:52 +0200)] 
MAINTAINERS: normalize Woojung Huh's email address

MAINTAINERS contains a lower-case and upper-case variant of
Woojung Huh' s email address.

Only keep the lower-case variant in MAINTAINERS.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobonding: fix event handling for stacked bonds
Sabrina Dubroca [Fri, 12 Apr 2019 13:04:10 +0000 (15:04 +0200)] 
bonding: fix event handling for stacked bonds

When a bond is enslaved to another bond, bond_netdev_event() only
handles the event as if the bond is a master, and skips treating the
bond as a slave.

This leads to a refcount leak on the slave, since we don't remove the
adjacency to its master and the master holds a reference on the slave.

Reproducer:
  ip link add bondL type bond
  ip link add bondU type bond
  ip link set bondL master bondU
  ip link del bondL

No "Fixes:" tag, this code is older than git history.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoRevert "net-sysfs: Fix memory leak in netdev_register_kobject"
Wang Hai [Fri, 12 Apr 2019 20:36:33 +0000 (16:36 -0400)] 
Revert "net-sysfs: Fix memory leak in netdev_register_kobject"

This reverts commit 6b70fc94afd165342876e53fc4b2f7d085009945.

The reverted bugfix will cause another issue.
Reported by syzbot+6024817a931b2830bc93@syzkaller.appspotmail.com.
See https://syzkaller.appspot.com/x/log.txt?x=1737671b200000 for
details.

Signed-off-by: Wang Hai <wanghai26@huawei.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'wireless-drivers-for-davem-2019-04-15' of git://git.kernel.org/pub/scm...
David S. Miller [Mon, 15 Apr 2019 19:02:29 +0000 (12:02 -0700)] 
Merge tag 'wireless-drivers-for-davem-2019-04-15' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
wireless-drivers fixes for 5.1

Second set of fixes for 5.1.

iwlwifi

* add some new PCI IDs (plus a struct name change they depend on)

* fix crypto with new devices, namely 22560 and above

* fix for a potential deadlock in the TX path

* a fix for offloaded rate-control

* support new PCI HW IDs which use a new FW

mt76

* fix lock initialisation and a possible deadlock

* aggregation fixes

rt2x00

* fix sequence numbering during retransmits
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoKVM: x86/mmu: Fix an inverted list_empty() check when zapping sptes
Sean Christopherson [Sat, 13 Apr 2019 02:55:41 +0000 (19:55 -0700)] 
KVM: x86/mmu: Fix an inverted list_empty() check when zapping sptes

A recently introduced helper for handling zap vs. remote flush
incorrectly bails early, effectively leaking defunct shadow pages.
Manifests as a slab BUG when exiting KVM due to the shadow pages
being alive when their associated cache is destroyed.

==========================================================================
BUG kvm_mmu_page_header: Objects remaining in kvm_mmu_page_header on ...
--------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: Slab 0x00000000fc436387 objects=26 used=23 fp=0x00000000d023caee ...
CPU: 6 PID: 4315 Comm: rmmod Tainted: G    B             5.1.0-rc2+ #19
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Call Trace:
 dump_stack+0x46/0x5b
 slab_err+0xad/0xd0
 ? on_each_cpu_mask+0x3c/0x50
 ? ksm_migrate_page+0x60/0x60
 ? on_each_cpu_cond_mask+0x7c/0xa0
 ? __kmalloc+0x1ca/0x1e0
 __kmem_cache_shutdown+0x13a/0x310
 shutdown_cache+0xf/0x130
 kmem_cache_destroy+0x1d5/0x200
 kvm_mmu_module_exit+0xa/0x30 [kvm]
 kvm_arch_exit+0x45/0x60 [kvm]
 kvm_exit+0x6f/0x80 [kvm]
 vmx_exit+0x1a/0x50 [kvm_intel]
 __x64_sys_delete_module+0x153/0x1f0
 ? exit_to_usermode_loop+0x88/0xc0
 do_syscall_64+0x4f/0x100
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: a21136345cb6f ("KVM: x86/mmu: Split remote_flush+zap case out of kvm_mmu_flush_or_zap()")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agodrivers: power: supply: goldfish_battery: Fix bogus SPDX identifier
Thomas Gleixner [Tue, 19 Mar 2019 14:51:56 +0000 (15:51 +0100)] 
drivers: power: supply: goldfish_battery: Fix bogus SPDX identifier

spdxcheck.py complains:

 drivers/power/supply/goldfish_battery.c: 1:28 Invalid License ID: GPL

which is correct because GPL is not a valid identifier. Of course this
could have been caught by checkpatch.pl _before_ submitting or merging the
patch.

 WARNING: 'SPDX-License-Identifier: GPL' is not supported in LICENSES/...
 #19: FILE: drivers/power/supply/goldfish_battery.c:1:
 +// SPDX-License-Identifier: GPL

Which is absolutely hillarious as the commit introducing this wreckage says
in the changelog:

  There was a checkpatch complain:

    "Missing or malformed SPDX-License-Identifier tag".

Oh well. Replacing a checkpatch warning by a different checkpatch warning
is a really useful exercise.

Use the proper GPL-2.0 identifier which is what the boiler plate in the
file had originally.

Fixes: e75e3a125b40 ("drivers: power: supply: goldfish_battery: Put an SPDX tag")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agos390/pkey: add one more argument space for debug feature entry
Harald Freudenberger [Fri, 12 Apr 2019 09:04:50 +0000 (11:04 +0200)] 
s390/pkey: add one more argument space for debug feature entry

The debug feature entries have been used with up to 5 arguents
(including the pointer to the format string) but there was only
space reserved for 4 arguemnts. So now the registration does
reserve space for 5 times a long value.

This fixes a sometime appearing weired value as the last
value of an debug feature entry like this:

... pkey_sec2protkey zcrypt_send_cprb (cardnr=10 domain=12)
   failed with errno -2143346254

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reported-by: Christian Rund <Christian.Rund@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
5 years agodrm/amd/display: If one stream full updates, full update all planes
David Francis [Fri, 29 Mar 2019 17:23:15 +0000 (13:23 -0400)] 
drm/amd/display: If one stream full updates, full update all planes

[Why]
On some compositors, with two monitors attached, VT terminal
switch can cause a graphical issue by the following means:

There are two streams, one for each monitor. Each stream has one
plane

current state:
M1:S1->P1
M2:S2->P2

The user calls for a terminal switch and a commit is made to
change both planes to linear swizzle mode. In atomic check,
a new dc_state is constructed with new planes on each stream

new state:
M1:S1->P3
M2:S2->P4

In commit tail, each stream is committed, one at a time. The first
stream (S1) updates properly, triggerring a full update and replacing
the state

current state:
M1:S1->P3
M2:S2->P4

The update for S2 comes in, but dc detects that there is no difference
between the stream and plane in the new and current states, and so
triggers a fast update. The fast update does not program swizzle,
so the second monitor is corrupted

[How]
Add a flag to dc_plane_state that forces full updates

When a stream undergoes a full update, set this flag on all changed
planes, then clear it on the current stream

Subsequent streams will get full updates as a result

Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com>
Acked-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 years agoLinux 5.1-rc5 v5.1-rc5
Linus Torvalds [Sun, 14 Apr 2019 22:17:41 +0000 (15:17 -0700)] 
Linux 5.1-rc5

5 years agoMerge branch 'page-refs' (page ref overflow)
Linus Torvalds [Sun, 14 Apr 2019 22:09:40 +0000 (15:09 -0700)] 
Merge branch 'page-refs' (page ref overflow)

Merge page ref overflow branch.

Jann Horn reported that he can overflow the page ref count with
sufficient memory (and a filesystem that is intentionally extremely
slow).

Admittedly it's not exactly easy.  To have more than four billion
references to a page requires a minimum of 32GB of kernel memory just
for the pointers to the pages, much less any metadata to keep track of
those pointers.  Jann needed a total of 140GB of memory and a specially
crafted filesystem that leaves all reads pending (in order to not ever
free the page references and just keep adding more).

Still, we have a fairly straightforward way to limit the two obvious
user-controllable sources of page references: direct-IO like page
references gotten through get_user_pages(), and the splice pipe page
duplication.  So let's just do that.

* branch page-refs:
  fs: prevent page refcount overflow in pipe_buf_get
  mm: prevent get_user_pages() from overflowing page refcount
  mm: add 'try_get_page()' helper function
  mm: make page ref count overflow check tighter and more explicit

5 years agoMerge tag 'mlx5-fixes-2019-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Sun, 14 Apr 2019 22:07:30 +0000 (15:07 -0700)] 
Merge tag 'mlx5-fixes-2019-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2019-04-09

This series provides some fixes to mlx5 driver.

I've cc'ed some of the checksum fixes to Eric Dumazet and i would like to get
his feedback before you pull.

For -stable v4.19
('net/mlx5: FPGA, tls, idr remove on flow delete')
('net/mlx5: FPGA, tls, hold rcu read lock a bit longer')

For -stable v4.20
('net/mlx5e: Rx, Check ip headers sanity')
('Revert "net/mlx5e: Enable reporting checksum unnecessary also for L3 packets"')
('net/mlx5e: Rx, Fixup skb checksum for packets with tail padding')

For -stable v5.0
('net/mlx5e: Switch to Toeplitz RSS hash by default')
('net/mlx5e: Protect against non-uplink representor for encap')
('net/mlx5e: XDP, Avoid checksum complete when XDP prog is loaded')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agortnetlink: fix rtnl_valid_stats_req() nlmsg_len check
Eric Dumazet [Sun, 14 Apr 2019 18:02:05 +0000 (11:02 -0700)] 
rtnetlink: fix rtnl_valid_stats_req() nlmsg_len check

Jakub forgot to either use nlmsg_len() or nlmsg_msg_size(),
allowing KMSAN to detect a possible uninit-value in rtnl_stats_get

BUG: KMSAN: uninit-value in rtnl_stats_get+0x6d9/0x11d0 net/core/rtnetlink.c:4997
CPU: 0 PID: 10428 Comm: syz-executor034 Not tainted 5.1.0-rc2+ #24
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x173/0x1d0 lib/dump_stack.c:113
 kmsan_report+0x131/0x2a0 mm/kmsan/kmsan.c:619
 __msan_warning+0x7a/0xf0 mm/kmsan/kmsan_instr.c:310
 rtnl_stats_get+0x6d9/0x11d0 net/core/rtnetlink.c:4997
 rtnetlink_rcv_msg+0x115b/0x1550 net/core/rtnetlink.c:5192
 netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2485
 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5210
 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
 netlink_unicast+0xf3e/0x1020 net/netlink/af_netlink.c:1336
 netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1925
 sock_sendmsg_nosec net/socket.c:622 [inline]
 sock_sendmsg net/socket.c:632 [inline]
 ___sys_sendmsg+0xdb3/0x1220 net/socket.c:2137
 __sys_sendmsg net/socket.c:2175 [inline]
 __do_sys_sendmsg net/socket.c:2184 [inline]
 __se_sys_sendmsg+0x305/0x460 net/socket.c:2182
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2182
 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7

Fixes: 51bc860d4a99 ("rtnetlink: stats: validate attributes in get as well as dumps")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agox86/speculation: Prevent deadlock on ssb_state::lock
Thomas Gleixner [Sun, 14 Apr 2019 17:51:06 +0000 (19:51 +0200)] 
x86/speculation: Prevent deadlock on ssb_state::lock

Mikhail reported a lockdep splat related to the AMD specific ssb_state
lock:

  CPU0                       CPU1
  lock(&st->lock);
                             local_irq_disable();
                             lock(&(&sighand->siglock)->rlock);
                             lock(&st->lock);
  <Interrupt>
     lock(&(&sighand->siglock)->rlock);

  *** DEADLOCK ***

The connection between sighand->siglock and st->lock comes through seccomp,
which takes st->lock while holding sighand->siglock.

Make sure interrupts are disabled when __speculation_ctrl_update() is
invoked via prctl() -> speculation_ctrl_update(). Add a lockdep assert to
catch future offenders.

Fixes: 1f50ddb4f418 ("x86/speculation: Handle HT correctly on AMD")
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: Thomas Lendacky <thomas.lendacky@amd.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1904141948200.4917@nanos.tec.linutronix.de
5 years agoMerge branch 'qed-doorbell-overflow-recovery'
David S. Miller [Sun, 14 Apr 2019 20:59:49 +0000 (13:59 -0700)] 
Merge branch 'qed-doorbell-overflow-recovery'

Denis Bolotin says:

====================
qed: Fix the Doorbell Overflow Recovery mechanism

This patch series fixes and improves the doorbell recovery mechanism.
The main goals of this series are to fix missing attentions from the
doorbells block (DORQ) or not handling them properly, and execute the
recovery from periodic handler instead of the attention handler.

Please consider applying the series to net.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: Fix the DORQ's attentions handling
Denis Bolotin [Sun, 14 Apr 2019 14:23:08 +0000 (17:23 +0300)] 
qed: Fix the DORQ's attentions handling

Separate the overflow handling from the hardware interrupt status analysis.
The interrupt status is a single register and is common for all PFs. The
first PF reading the register is not necessarily the one who overflowed.
All PFs must check their overflow status on every attention.
In this change we clear the sticky indication in the attention handler to
allow doorbells to be processed again as soon as possible, but running
the doorbell recovery is scheduled for the periodic handler to reduce the
time spent in the attention handler.
Checking the need for DORQ flush was changed to "db_bar_no_edpm" because
qed_edpm_enabled()'s result could change dynamically and might have
prevented a needed flush.

Signed-off-by: Denis Bolotin <dbolotin@marvell.com>
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: Fix missing DORQ attentions
Denis Bolotin [Sun, 14 Apr 2019 14:23:07 +0000 (17:23 +0300)] 
qed: Fix missing DORQ attentions

When the DORQ (doorbell block) is overflowed, all PFs get attentions at the
same time. If one PF finished handling the attention before another PF even
started, the second PF might miss the DORQ's attention bit and not handle
the attention at all.
If the DORQ attention is missed and the issue is not resolved, another
attention will not be sent, therefore each attention is treated as a
potential DORQ attention.
As a result, the attention callback is called more frequently so the debug
print was moved to reduce its quantity.
The number of periodic doorbell recovery handler schedules was reduced
because it was the previous way to mitigating the missed attention issue.

Signed-off-by: Denis Bolotin <dbolotin@marvell.com>
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: Fix the doorbell address sanity check
Denis Bolotin [Sun, 14 Apr 2019 14:23:06 +0000 (17:23 +0300)] 
qed: Fix the doorbell address sanity check

Fix the condition which verifies that doorbell address is inside the
doorbell bar by checking that the end of the address is within range
as well.

Signed-off-by: Denis Bolotin <dbolotin@marvell.com>
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: Delete redundant doorbell recovery types
Denis Bolotin [Sun, 14 Apr 2019 14:23:05 +0000 (17:23 +0300)] 
qed: Delete redundant doorbell recovery types

DB_REC_DRY_RUN (running doorbell recovery without sending doorbells) is
never used. DB_REC_ONCE (send a single doorbell from the doorbell recovery)
is not needed anymore because by running the periodic handler we make sure
we check the overflow status later instead.
This patch is needed because in the next patches, the only doorbell
recovery type being used is DB_REC_REAL_DEAL, and the fixes are much
cleaner without this enum.

Signed-off-by: Denis Bolotin <dbolotin@marvell.com>
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: ensure rcu_read_lock() in ipv4_link_failure()
Eric Dumazet [Sun, 14 Apr 2019 00:32:21 +0000 (17:32 -0700)] 
ipv4: ensure rcu_read_lock() in ipv4_link_failure()

fib_compute_spec_dst() needs to be called under rcu protection.

syzbot reported :

WARNING: suspicious RCU usage
5.1.0-rc4+ #165 Not tainted
include/linux/inetdevice.h:220 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by swapper/0/0:
 #0: 0000000051b67925 ((&n->timer)){+.-.}, at: lockdep_copy_map include/linux/lockdep.h:170 [inline]
 #0: 0000000051b67925 ((&n->timer)){+.-.}, at: call_timer_fn+0xda/0x720 kernel/time/timer.c:1315

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.1.0-rc4+ #165
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5162
 __in_dev_get_rcu include/linux/inetdevice.h:220 [inline]
 fib_compute_spec_dst+0xbbd/0x1030 net/ipv4/fib_frontend.c:294
 spec_dst_fill net/ipv4/ip_options.c:245 [inline]
 __ip_options_compile+0x15a7/0x1a10 net/ipv4/ip_options.c:343
 ipv4_link_failure+0x172/0x400 net/ipv4/route.c:1195
 dst_link_failure include/net/dst.h:427 [inline]
 arp_error_report+0xd1/0x1c0 net/ipv4/arp.c:297
 neigh_invalidate+0x24b/0x570 net/core/neighbour.c:995
 neigh_timer_handler+0xc35/0xf30 net/core/neighbour.c:1081
 call_timer_fn+0x190/0x720 kernel/time/timer.c:1325
 expire_timers kernel/time/timer.c:1362 [inline]
 __run_timers kernel/time/timer.c:1681 [inline]
 __run_timers kernel/time/timer.c:1649 [inline]
 run_timer_softirq+0x652/0x1700 kernel/time/timer.c:1694
 __do_softirq+0x266/0x95a kernel/softirq.c:293
 invoke_softirq kernel/softirq.c:374 [inline]
 irq_exit+0x180/0x1d0 kernel/softirq.c:414
 exiting_irq arch/x86/include/asm/apic.h:536 [inline]
 smp_apic_timer_interrupt+0x14a/0x570 arch/x86/kernel/apic/apic.c:1062
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:807

Fixes: ed0de45a1008 ("ipv4: recompile ip options in ipv4_link_failure")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Stephen Suryaputra <ssuryaextr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agofs: prevent page refcount overflow in pipe_buf_get
Matthew Wilcox [Fri, 5 Apr 2019 21:02:10 +0000 (14:02 -0700)] 
fs: prevent page refcount overflow in pipe_buf_get

Change pipe_buf_get() to return a bool indicating whether it succeeded
in raising the refcount of the page (if the thing in the pipe is a page).
This removes another mechanism for overflowing the page refcount.  All
callers converted to handle a failure.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: prevent get_user_pages() from overflowing page refcount
Linus Torvalds [Thu, 11 Apr 2019 17:49:19 +0000 (10:49 -0700)] 
mm: prevent get_user_pages() from overflowing page refcount

If the page refcount wraps around past zero, it will be freed while
there are still four billion references to it.  One of the possible
avenues for an attacker to try to make this happen is by doing direct IO
on a page multiple times.  This patch makes get_user_pages() refuse to
take a new page reference if there are already more than two billion
references to the page.

Reported-by: Jann Horn <jannh@google.com>
Acked-by: Matthew Wilcox <willy@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: add 'try_get_page()' helper function
Linus Torvalds [Thu, 11 Apr 2019 17:14:59 +0000 (10:14 -0700)] 
mm: add 'try_get_page()' helper function

This is the same as the traditional 'get_page()' function, but instead
of unconditionally incrementing the reference count of the page, it only
does so if the count was "safe".  It returns whether the reference count
was incremented (and is marked __must_check, since the caller obviously
has to be aware of it).

Also like 'get_page()', you can't use this function unless you already
had a reference to the page.  The intent is that you can use this
exactly like get_page(), but in situations where you want to limit the
maximum reference count.

The code currently does an unconditional WARN_ON_ONCE() if we ever hit
the reference count issues (either zero or negative), as a notification
that the conditional non-increment actually happened.

NOTE! The count access for the "safety" check is inherently racy, but
that doesn't matter since the buffer we use is basically half the range
of the reference count (ie we look at the sign of the count).

Acked-by: Matthew Wilcox <willy@infradead.org>
Cc: Jann Horn <jannh@google.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: make page ref count overflow check tighter and more explicit
Linus Torvalds [Thu, 11 Apr 2019 17:06:20 +0000 (10:06 -0700)] 
mm: make page ref count overflow check tighter and more explicit

We have a VM_BUG_ON() to check that the page reference count doesn't
underflow (or get close to overflow) by checking the sign of the count.

That's all fine, but we actually want to allow people to use a "get page
ref unless it's already very high" helper function, and we want that one
to use the sign of the page ref (without triggering this VM_BUG_ON).

Change the VM_BUG_ON to only check for small underflows (or _very_ close
to overflowing), and ignore overflows which have strayed into negative
territory.

Acked-by: Matthew Wilcox <willy@infradead.org>
Cc: Jann Horn <jannh@google.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agox86/resctrl: Do not repeat rdtgroup mode initialization
Xiaochen Shen [Tue, 9 Apr 2019 19:53:49 +0000 (03:53 +0800)] 
x86/resctrl: Do not repeat rdtgroup mode initialization

When cache allocation is supported and the user creates a new resctrl
resource group, the allocations of the new resource group are
initialized to all regions that it can possibly use. At this time these
regions are all that are shareable by other resource groups as well as
regions that are not currently used. The new resource group's mode is
also initialized to reflect this initialization and set to "shareable".

The new resource group's mode is currently repeatedly initialized within
the loop that configures the hardware with the resource group's default
allocations.

Move the initialization of the resource group's mode outside the
hardware configuration loop. The resource group's mode is now
initialized only once as the final step to reflect that its configured
allocations are "shareable".

Fixes: 95f0b77efa57 ("x86/intel_rdt: Initialize new resource group with sane defaults")
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: pei.p.jia@intel.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/1554839629-5448-1-git-send-email-xiaochen.shen@intel.com
5 years agoMerge tag 'for-linus-20190412' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 13 Apr 2019 23:23:16 +0000 (16:23 -0700)] 
Merge tag 'for-linus-20190412' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Set of fixes that should go into this round. This pull is larger than
  I'd like at this time, but there's really no specific reason for that.
  Some are fixes for issues that went into this merge window, others are
  not. Anyway, this contains:

   - Hardware queue limiting for virtio-blk/scsi (Dongli)

   - Multi-page bvec fixes for lightnvm pblk

   - Multi-bio dio error fix (Jason)

   - Remove the cache hint from the io_uring tool side, since we didn't
     move forward with that (me)

   - Make io_uring SETUP_SQPOLL root restricted (me)

   - Fix leak of page in error handling for pc requests (Jérôme)

   - Fix BFQ regression introduced in this merge window (Paolo)

   - Fix break logic for bio segment iteration (Ming)

   - Fix NVMe cancel request error handling (Ming)

   - NVMe pull request with two fixes (Christoph):
       - fix the initial CSN for nvme-fc (James)
       - handle log page offsets properly in the target (Keith)"

* tag 'for-linus-20190412' of git://git.kernel.dk/linux-block:
  block: fix the return errno for direct IO
  nvmet: fix discover log page when offsets are used
  nvme-fc: correct csn initialization and increments on error
  block: do not leak memory in bio_copy_user_iov()
  lightnvm: pblk: fix crash in pblk_end_partial_read due to multipage bvecs
  nvme: cancel request synchronously
  blk-mq: introduce blk_mq_complete_request_sync()
  scsi: virtio_scsi: limit number of hw queues by nr_cpu_ids
  virtio-blk: limit number of hw queues by nr_cpu_ids
  block, bfq: fix use after free in bfq_bfqq_expire
  io_uring: restrict IORING_SETUP_SQPOLL to root
  tools/io_uring: remove IOCQE_FLAG_CACHEHIT
  block: don't use for-inside-for in bio_for_each_segment_all

5 years agoMerge tag 'nfs-for-5.1-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Linus Torvalds [Sat, 13 Apr 2019 21:47:06 +0000 (14:47 -0700)] 
Merge tag 'nfs-for-5.1-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "Highlights include:

  Stable fix:

   - Fix a deadlock in close() due to incorrect draining of RDMA queues

  Bugfixes:

   - Revert "SUNRPC: Micro-optimise when the task is known not to be
     sleeping" as it is causing stack overflows

   - Fix a regression where NFSv4 getacl and fs_locations stopped
     working

   - Forbid setting AF_INET6 to "struct sockaddr_in"->sin_family.

   - Fix xfstests failures due to incorrect copy_file_range() return
     values"

* tag 'nfs-for-5.1-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  Revert "SUNRPC: Micro-optimise when the task is known not to be sleeping"
  NFSv4.1 fix incorrect return value in copy_file_range
  xprtrdma: Fix helper that drains the transport
  NFS: Fix handling of reply page vector
  NFS: Forbid setting AF_INET6 to "struct sockaddr_in"->sin_family.

5 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 13 Apr 2019 21:37:49 +0000 (14:37 -0700)] 
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fix from James Bottomley:
 "One obvious fix for a ciostor data corruption on error bug"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: csiostor: fix missing data copy in csio_scsi_err_handler()

5 years agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 13 Apr 2019 21:33:56 +0000 (14:33 -0700)] 
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
 "Here's more than a handful of clk driver fixes for changes that came
  in during the merge window:

   - Fix the AT91 sama5d2 programmable clk prescaler formula

   - A bunch of Amlogic meson clk driver fixes for the VPU clks

   - A DMI quirk for Intel's Bay Trail SoC's driver to properly mark pmc
     clks as critical only when really needed

   - Stop overwriting CLK_SET_RATE_PARENT flag in mediatek's clk gate
     implementation

   - Use the right structure to test for a frequency table in i.MX's
     PLL_1416x driver"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: imx: Fix PLL_1416X not rounding rates
  clk: mediatek: fix clk-gate flag setting
  platform/x86: pmc_atom: Drop __initconst on dmi table
  clk: x86: Add system specific quirk to mark clocks as critical
  clk: meson: vid-pll-div: remove warning and return 0 on invalid config
  clk: meson: pll: fix rounding and setting a rate that matches precisely
  clk: meson-g12a: fix VPU clock parents
  clk: meson: g12a: fix VPU clock muxes mask
  clk: meson-gxbb: round the vdec dividers to closest
  clk: at91: fix programmable clock for sama5d2

5 years agoMerge tag 'pci-v5.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Linus Torvalds [Sat, 13 Apr 2019 21:29:21 +0000 (14:29 -0700)] 
Merge tag 'pci-v5.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

 - Add a DMA alias quirk for another Marvell SATA device (Andre
   Przywara)

 - Fix a pciehp regression that broke safe removal of devices (Sergey
   Miroshnichenko)

* tag 'pci-v5.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: pciehp: Ignore Link State Changes after powering off a slot
  PCI: Add function 1 DMA alias quirk for Marvell 9170 SATA controller

5 years agoMerge tag 'powerpc-5.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sat, 13 Apr 2019 16:03:09 +0000 (09:03 -0700)] 
Merge tag 'powerpc-5.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "A minor build fix for 64-bit FLATMEM configs.

  A fix for a boot failure on 32-bit powermacs.

  My commit to fix CLOCK_MONOTONIC across Y2038 broke the 32-bit VDSO on
  64-bit kernels, ie. compat mode, which is only used on big endian.

  The rewrite of the SLB code we merged in 4.20 missed the fact that the
  0x380 exception is also used with the Radix MMU to report out of range
  accesses. This could lead to an oops if userspace tried to read from
  addresses outside the user or kernel range.

  Thanks to: Aneesh Kumar K.V, Christophe Leroy, Larry Finger, Nicholas
  Piggin"

* tag 'powerpc-5.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/mm: Define MAX_PHYSMEM_BITS for all 64-bit configs
  powerpc/64s/radix: Fix radix segment exception handling
  powerpc/vdso32: fix CLOCK_MONOTONIC on PPC64
  powerpc/32: Fix early boot failure with RTAS built-in

5 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Sat, 13 Apr 2019 15:57:00 +0000 (08:57 -0700)] 
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "The main thing is a fix to our FUTEX_WAKE_OP implementation which was
  unbelievably broken, but did actually work for the one scenario that
  GLIBC used to use.

  Summary:

   - Fix stack unwinding so we ignore user stacks

   - Fix ftrace module PLT trampoline initialisation checks

   - Fix terminally broken implementation of FUTEX_WAKE_OP atomics"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value
  arm64: backtrace: Don't bother trying to unwind the userspace stack
  arm64/ftrace: fix inadvertent BUG() in trampoline check

5 years agoafs: Fix in-progess ops to ignore server-level callback invalidation
David Howells [Sat, 13 Apr 2019 07:37:37 +0000 (08:37 +0100)] 
afs: Fix in-progess ops to ignore server-level callback invalidation

The in-kernel afs filesystem client counts the number of server-level
callback invalidation events (CB.InitCallBackState* RPC operations) that it
receives from the server.  This is stored in cb_s_break in various
structures, including afs_server and afs_vnode.

If an inode is examined by afs_validate(), say, the afs_server copy is
compared, along with other break counters, to those in afs_vnode, and if
one or more of the counters do not match, it is considered that the
server's callback promise is broken.  At points where this happens,
AFS_VNODE_CB_PROMISED is cleared to indicate that the status must be
refetched from the server.

afs_validate() issues an FS.FetchStatus operation to get updated metadata -
and based on the updated data_version may invalidate the pagecache too.

However, the break counters are also used to determine whether to note a
new callback in the vnode (which would set the AFS_VNODE_CB_PROMISED flag)
and whether to cache the permit data included in the YFSFetchStatus record
by the server.

The problem comes when the server sends us a CB.InitCallBackState op.  The
first such instance doesn't cause cb_s_break to be incremented, but rather
causes AFS_SERVER_FL_NEW to be cleared - but thereafter, say some hours
after last use and all the volumes have been automatically unmounted and
the server has forgotten about the client[*], this *will* likely cause an
increment.

 [*] There are other circumstances too, such as the server restarting or
     needing to make space in its callback table.

Note that the server won't send us a CB.InitCallBackState op until we talk
to it again.

So what happens is:

 (1) A mount for a new volume is attempted, a inode is created for the root
     vnode and vnode->cb_s_break and AFS_VNODE_CB_PROMISED aren't set
     immediately, as we don't have a nominated server to talk to yet - and
     we may iterate through a few to find one.

 (2) Before the operation happens, afs_fetch_status(), say, notes in the
     cursor (fc.cb_break) the break counter sum from the vnode, volume and
     server counters, but the server->cb_s_break is currently 0.

 (3) We send FS.FetchStatus to the server.  The server sends us back
     CB.InitCallBackState.  We increment server->cb_s_break.

 (4) Our FS.FetchStatus completes.  The reply includes a callback record.

 (5) xdr_decode_AFSCallBack()/xdr_decode_YFSCallBack() check to see whether
     the callback promise was broken by checking the break counter sum from
     step (2) against the current sum.

     This fails because of step (3), so we don't set the callback record
     and, importantly, don't set AFS_VNODE_CB_PROMISED on the vnode.

This does not preclude the syscall from progressing, and we don't loop here
rechecking the status, but rather assume it's good enough for one round
only and will need to be rechecked next time.

 (6) afs_validate() it triggered on the vnode, probably called from
     d_revalidate() checking the parent directory.

 (7) afs_validate() notes that AFS_VNODE_CB_PROMISED isn't set, so doesn't
     update vnode->cb_s_break and assumes the vnode to be invalid.

 (8) afs_validate() needs to calls afs_fetch_status().  Go back to step (2)
     and repeat, every time the vnode is validated.

This primarily affects volume root dir vnodes.  Everything subsequent to
those inherit an already incremented cb_s_break upon mounting.

The issue is that we assume that the callback record and the cached permit
information in a reply from the server can't be trusted after getting a
server break - but this is wrong since the server makes sure things are
done in the right order, holding up our ops if necessary[*].

 [*] There is an extremely unlikely scenario where a reply from before the
     CB.InitCallBackState could get its delivery deferred till after - at
     which point we think we have a promise when we don't.  This, however,
     requires unlucky mass packet loss to one call.

AFS_SERVER_FL_NEW tries to paper over the cracks for the initial mount from
a server we've never contacted before, but this should be unnecessary.
It's also further insulated from the problem on an initial mount by
querying the server first with FS.GetCapabilities, which triggers the
CB.InitCallBackState.

Fix this by

 (1) Remove AFS_SERVER_FL_NEW.

 (2) In afs_calc_vnode_cb_break(), don't include cb_s_break in the
     calculation.

 (3) In afs_cb_is_broken(), don't include cb_s_break in the check.

Signed-off-by: David Howells <dhowells@redhat.com>
5 years agoafs: Unlock pages for __pagevec_release()
Marc Dionne [Sat, 13 Apr 2019 07:37:37 +0000 (08:37 +0100)] 
afs: Unlock pages for __pagevec_release()

__pagevec_release() complains loudly if any page in the vector is still
locked.  The pages need to be locked for generic_error_remove_page(), but
that function doesn't actually unlock them.

Unlock the pages afterwards.

Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Jonathan Billings <jsbillin@umich.edu>
5 years agoafs: Differentiate abort due to unmarshalling from other errors
David Howells [Sat, 13 Apr 2019 07:37:37 +0000 (08:37 +0100)] 
afs: Differentiate abort due to unmarshalling from other errors

Differentiate an abort due to an unmarshalling error from an abort due to
other errors, such as ENETUNREACH.  It doesn't make sense to set abort code
RXGEN_*_UNMARSHAL in such a case, so use RX_USER_ABORT instead.

Signed-off-by: David Howells <dhowells@redhat.com>
5 years agoafs: Avoid section confusion in CM_NAME
Andi Kleen [Sat, 13 Apr 2019 07:37:36 +0000 (08:37 +0100)] 
afs: Avoid section confusion in CM_NAME

__tracepoint_str cannot be const because the tracepoint_str
section is not read-only. Remove the stray const.

Cc: dhowells@redhat.com
Cc: viro@zeniv.linux.org.uk
Signed-off-by: Andi Kleen <ak@linux.intel.com>
5 years agoafs: avoid deprecated get_seconds()
Arnd Bergmann [Sat, 13 Apr 2019 07:37:36 +0000 (08:37 +0100)] 
afs: avoid deprecated get_seconds()

get_seconds() has a limited range on 32-bit architectures and is
deprecated because of that. While AFS uses the same limits for
its inode timestamps on the wire protocol, let's just use the
simpler current_time() as we do for other file systems.

This will still zero out the 'tv_nsec' field of the timestamps
internally.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Howells <dhowells@redhat.com>
5 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 13 Apr 2019 03:54:40 +0000 (20:54 -0700)] 
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "Fix typos in user-visible resctrl parameters, and also fix assembly
  constraint bugs that might result in miscompilation"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm: Use stricter assembly constraints in bitops
  x86/resctrl: Fix typos in the mba_sc mount option

5 years agoMerge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 13 Apr 2019 03:52:28 +0000 (20:52 -0700)] 
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Ingo Molnar:
 "Fix the alarm_timer_remaining() return value"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  alarmtimer: Return correct remaining time

5 years agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 13 Apr 2019 03:50:43 +0000 (20:50 -0700)] 
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fix from Ingo Molnar:
 "Fix a NULL pointer dereference crash in certain environments"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Do not re-read ->h_load_next during hierarchical load calculation

5 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 13 Apr 2019 03:42:30 +0000 (20:42 -0700)] 
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "Six kernel side fixes: three related to NMI handling on AMD systems, a
  race fix, a kexec initialization fix and a PEBS sampling fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix perf_event_disable_inatomic() race
  x86/perf/amd: Remove need to check "running" bit in NMI handler
  x86/perf/amd: Resolve NMI latency issues for active PMCs
  x86/perf/amd: Resolve race condition when disabling PMC
  perf/x86/intel: Initialize TFA MSR
  perf/x86/intel: Fix handling of wakeup_events for multi-entry PEBS

5 years agoMerge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 13 Apr 2019 03:31:08 +0000 (20:31 -0700)] 
Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fix from Ingo Molnar:
 "Fixes a crash when accessing /proc/lockdep"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/lockdep: Zap lock classes even with lock debugging disabled

5 years agoMerge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 13 Apr 2019 03:21:59 +0000 (20:21 -0700)] 
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Ingo Molnar:
 "Two genirq fixes, plus an irqchip driver error handling fix"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Respect IRQCHIP_SKIP_SET_WAKE in irq_chip_set_wake_parent()
  genirq: Initialize request_mutex if CONFIG_SPARSE_IRQ=n
  irqchip/irq-ls1x: Missing error code in ls1x_intc_of_init()

5 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 13 Apr 2019 03:13:13 +0000 (20:13 -0700)] 
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core fixes from Ingo Molnar:
 "Fix an objtool warning plus fix a u64_to_user_ptr() macro expansion
  bug"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Add rewind_stack_do_exit() to the noreturn list
  linux/kernel.h: Use parentheses around argument in u64_to_user_ptr()

5 years agoipv4: recompile ip options in ipv4_link_failure
Stephen Suryaputra [Fri, 12 Apr 2019 20:19:27 +0000 (16:19 -0400)] 
ipv4: recompile ip options in ipv4_link_failure

Recompile IP options since IPCB may not be valid anymore when
ipv4_link_failure is called from arp_error_report.

Refer to the commit 3da1ed7ac398 ("net: avoid use IPCB in cipso_v4_error")
and the commit before that (9ef6b42ad6fd) for a similar issue.

Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'rxrpc-fixes'
David S. Miller [Fri, 12 Apr 2019 23:57:23 +0000 (16:57 -0700)] 
Merge branch 'rxrpc-fixes'

David Howells says:

====================
rxrpc: Fixes

Here is a collection of fixes for rxrpc:

 (1) rxrpc_error_report() needs to call sock_error() to clear the error
     code from the UDP transport socket, lest it be unexpectedly revisited
     on the next kernel_sendmsg() call.  This has been causing all sorts of
     weird effects in AFS as the effects have typically been felt by the
     wrong RxRPC call.

 (2) Allow a kernel user of AF_RXRPC to easily detect if an rxrpc call has
     completed.

 (3) Allow errors incurred by attempting to transmit data through the UDP
     socket to get back up the stack to AFS.

 (4) Make AFS use (2) to abort the synchronous-mode call waiting loop if
     the rxrpc-level call completed.

 (5) Add a missing tracepoint case for tracing abort reception.

 (6) Fix detection and handling of out-of-order ACKs.

====================

Tested-by: Jonathan Billings <jsbillin@umich.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agorxrpc: Fix detection of out of order acks
Jeffrey Altman [Fri, 12 Apr 2019 15:34:16 +0000 (16:34 +0100)] 
rxrpc: Fix detection of out of order acks

The rxrpc packet serial number cannot be safely used to compute out of
order ack packets for several reasons:

 1. The allocation of serial numbers cannot be assumed to imply the order
    by which acks are populated and transmitted.  In some rxrpc
    implementations, delayed acks and ping acks are transmitted
    asynchronously to the receipt of data packets and so may be transmitted
    out of order.  As a result, they can race with idle acks.

 2. Serial numbers are allocated by the rxrpc connection and not the call
    and as such may wrap independently if multiple channels are in use.

In any case, what matters is whether the ack packet provides new
information relating to the bounds of the window (the firstPacket and
previousPacket in the ACK data).

Fix this by discarding packets that appear to wind back the window bounds
rather than on serial number procession.

Fixes: 298bc15b2079 ("rxrpc: Only take the rwind and mtu values from latest ACK")
Signed-off-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agorxrpc: Trace received connection aborts
David Howells [Fri, 12 Apr 2019 15:34:09 +0000 (16:34 +0100)] 
rxrpc: Trace received connection aborts

Trace received calls that are aborted due to a connection abort, typically
because of authentication failure.  Without this, connection aborts don't
show up in the trace log.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoafs: Check for rxrpc call completion in wait loop
Marc Dionne [Fri, 12 Apr 2019 15:34:02 +0000 (16:34 +0100)] 
afs: Check for rxrpc call completion in wait loop

Check the state of the rxrpc call backing an afs call in each iteration of
the call wait loop in case the rxrpc call has already been terminated at
the rxrpc layer.

Interrupt the wait loop and mark the afs call as complete if the rxrpc
layer call is complete.

There were cases where rxrpc errors were not passed up to afs, which could
result in this loop waiting forever for an afs call to transition to
AFS_CALL_COMPLETE while the rx call was already complete.

Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agorxrpc: Allow errors to be returned from rxrpc_queue_packet()
Marc Dionne [Fri, 12 Apr 2019 15:33:54 +0000 (16:33 +0100)] 
rxrpc: Allow errors to be returned from rxrpc_queue_packet()

Change rxrpc_queue_packet()'s signature so that it can return any error
code it may encounter when trying to send the packet.

This allows the caller to eventually do something in case of error - though
it should be noted that the packet has been queued and a resend is
scheduled.

Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agorxrpc: Make rxrpc_kernel_check_life() indicate if call completed
Marc Dionne [Fri, 12 Apr 2019 15:33:47 +0000 (16:33 +0100)] 
rxrpc: Make rxrpc_kernel_check_life() indicate if call completed

Make rxrpc_kernel_check_life() pass back the life counter through the
argument list and return true if the call has not yet completed.

Suggested-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agorxrpc: Clear socket error
Marc Dionne [Fri, 12 Apr 2019 15:33:40 +0000 (16:33 +0100)] 
rxrpc: Clear socket error

When an ICMP or ICMPV6 error is received, the error will be attached
to the socket (sk_err) and the report function will get called.
Clear any pending error here by calling sock_error().

This would cause the following attempt to use the socket to fail with
the error code stored by the ICMP error, resulting in unexpected errors
with various side effects depending on the context.

Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Jonathan Billings <jsbillin@umich.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqede: fix write to free'd pointer error and double free of ptp
Colin Ian King [Fri, 12 Apr 2019 14:13:27 +0000 (15:13 +0100)] 
qede: fix write to free'd pointer error and double free of ptp

The err2 error return path calls qede_ptp_disable that cleans up
on an error and frees ptp. After this, the free'd ptp is dereferenced
when ptp->clock is set to NULL and the code falls-through to error
path err1 that frees ptp again.

Fix this by calling qede_ptp_disable and exiting via an error
return path that does not set ptp->clock or kfree ptp.

Addresses-Coverity: ("Write to pointer after free")
Fixes: 035744975aec ("qede: Add support for PTP resource locking.")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agovxge: fix return of a free'd memblock on a failed dma mapping
Colin Ian King [Fri, 12 Apr 2019 13:45:12 +0000 (14:45 +0100)] 
vxge: fix return of a free'd memblock on a failed dma mapping

Currently if a pci dma mapping failure is detected a free'd
memblock address is returned rather than a NULL (that indicates
an error). Fix this by ensuring NULL is returned on this error case.

Addresses-Coverity: ("Use after free")
Fixes: 528f727279ae ("vxge: code cleanup and reorganization")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoclk: imx: Fix PLL_1416X not rounding rates
Leonard Crestez [Fri, 12 Apr 2019 14:10:03 +0000 (14:10 +0000)] 
clk: imx: Fix PLL_1416X not rounding rates

Code which initializes the "clk_init_data.ops" checks pll->rate_table
before that field is ever assigned to so it always picks
"clk_pll1416x_min_ops".

This breaks dynamic rate rounding for features such as cpufreq.

Fix by checking pll_clk->rate_table instead, here pll_clk refers to
the constant initialization data coming from per-soc clk driver.

Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Fixes: 8646d4dcc7fb ("clk: imx: Add PLLs driver for imx8mm soc")
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
5 years agoMerge tag 'iwlwifi-for-kalle-2019-04-03' of git://git.kernel.org/pub/scm/linux/kernel...
Kalle Valo [Fri, 12 Apr 2019 18:34:27 +0000 (21:34 +0300)] 
Merge tag 'iwlwifi-for-kalle-2019-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes

Second batch of iwlwifi fixes intended for v5.1

* fix for a potential deadlock in the TX path;
* a fix for offloaded rate-control;
* support new PCI HW IDs which use a new FW;

5 years agomt76x02: avoid status_list.lock and sta->rate_ctrl_lock dependency
Stanislaw Gruszka [Fri, 5 Apr 2019 11:42:56 +0000 (13:42 +0200)] 
mt76x02: avoid status_list.lock and sta->rate_ctrl_lock dependency

Move ieee80211_tx_status_ext() outside of status_list lock section
in order to avoid locking dependency and possible deadlock reposed by
LOCKDEP in below warning.

Also do mt76_tx_status_lock() just before it's needed.

[  440.224832] WARNING: possible circular locking dependency detected
[  440.224833] 5.1.0-rc2+ #22 Not tainted
[  440.224834] ------------------------------------------------------
[  440.224835] kworker/u16:28/2362 is trying to acquire lock:
[  440.224836] 0000000089b8cacf (&(&q->lock)->rlock#2){+.-.}, at: mt76_wake_tx_queue+0x4c/0xb0 [mt76]
[  440.224842]
               but task is already holding lock:
[  440.224842] 000000002cfedc59 (&(&sta->lock)->rlock){+.-.}, at: ieee80211_stop_tx_ba_cb+0x32/0x1f0 [mac80211]
[  440.224863]
               which lock already depends on the new lock.

[  440.224863]
               the existing dependency chain (in reverse order) is:
[  440.224864]
               -> #3 (&(&sta->lock)->rlock){+.-.}:
[  440.224869]        _raw_spin_lock_bh+0x34/0x40
[  440.224880]        ieee80211_start_tx_ba_session+0xe4/0x3d0 [mac80211]
[  440.224894]        minstrel_ht_get_rate+0x45c/0x510 [mac80211]
[  440.224906]        rate_control_get_rate+0xc1/0x140 [mac80211]
[  440.224918]        ieee80211_tx_h_rate_ctrl+0x195/0x3c0 [mac80211]
[  440.224930]        ieee80211_xmit_fast+0x26d/0xa50 [mac80211]
[  440.224942]        __ieee80211_subif_start_xmit+0xfc/0x310 [mac80211]
[  440.224954]        ieee80211_subif_start_xmit+0x38/0x390 [mac80211]
[  440.224956]        dev_hard_start_xmit+0xb8/0x300
[  440.224957]        __dev_queue_xmit+0x7d4/0xbb0
[  440.224968]        ip6_finish_output2+0x246/0x860 [ipv6]
[  440.224978]        mld_sendpack+0x1bd/0x360 [ipv6]
[  440.224987]        mld_ifc_timer_expire+0x1a4/0x2f0 [ipv6]
[  440.224989]        call_timer_fn+0x89/0x2a0
[  440.224990]        run_timer_softirq+0x1bd/0x4d0
[  440.224992]        __do_softirq+0xdb/0x47c
[  440.224994]        irq_exit+0xfa/0x100
[  440.224996]        smp_apic_timer_interrupt+0x9a/0x220
[  440.224997]        apic_timer_interrupt+0xf/0x20
[  440.224999]        cpuidle_enter_state+0xc1/0x470
[  440.225000]        do_idle+0x21a/0x260
[  440.225001]        cpu_startup_entry+0x19/0x20
[  440.225004]        start_secondary+0x135/0x170
[  440.225006]        secondary_startup_64+0xa4/0xb0
[  440.225007]
               -> #2 (&(&sta->rate_ctrl_lock)->rlock){+.-.}:
[  440.225009]        _raw_spin_lock_bh+0x34/0x40
[  440.225022]        rate_control_tx_status+0x4f/0xb0 [mac80211]
[  440.225031]        ieee80211_tx_status_ext+0x142/0x1a0 [mac80211]
[  440.225035]        mt76x02_send_tx_status+0x2e4/0x340 [mt76x02_lib]
[  440.225037]        mt76x02_tx_status_data+0x31/0x40 [mt76x02_lib]
[  440.225040]        mt76u_tx_status_data+0x51/0xa0 [mt76_usb]
[  440.225042]        process_one_work+0x237/0x5d0
[  440.225043]        worker_thread+0x3c/0x390
[  440.225045]        kthread+0x11d/0x140
[  440.225046]        ret_from_fork+0x3a/0x50
[  440.225047]
               -> #1 (&(&list->lock)->rlock#8){+.-.}:
[  440.225049]        _raw_spin_lock_bh+0x34/0x40
[  440.225052]        mt76_tx_status_skb_add+0x51/0x100 [mt76]
[  440.225054]        mt76x02u_tx_prepare_skb+0xbd/0x116 [mt76x02_usb]
[  440.225056]        mt76u_tx_queue_skb+0x5f/0x180 [mt76_usb]
[  440.225058]        mt76_tx+0x93/0x190 [mt76]
[  440.225070]        ieee80211_tx_frags+0x148/0x210 [mac80211]
[  440.225081]        __ieee80211_tx+0x75/0x1b0 [mac80211]
[  440.225092]        ieee80211_tx+0xde/0x110 [mac80211]
[  440.225105]        __ieee80211_tx_skb_tid_band+0x72/0x90 [mac80211]
[  440.225122]        ieee80211_send_auth+0x1f3/0x360 [mac80211]
[  440.225141]        ieee80211_auth.cold.40+0x6c/0x100 [mac80211]
[  440.225156]        ieee80211_mgd_auth.cold.50+0x132/0x15f [mac80211]
[  440.225171]        cfg80211_mlme_auth+0x149/0x360 [cfg80211]
[  440.225181]        nl80211_authenticate+0x273/0x2e0 [cfg80211]
[  440.225183]        genl_family_rcv_msg+0x196/0x3a0
[  440.225184]        genl_rcv_msg+0x47/0x8e
[  440.225185]        netlink_rcv_skb+0x3a/0xf0
[  440.225187]        genl_rcv+0x24/0x40
[  440.225188]        netlink_unicast+0x16d/0x210
[  440.225189]        netlink_sendmsg+0x204/0x3b0
[  440.225191]        sock_sendmsg+0x36/0x40
[  440.225193]        ___sys_sendmsg+0x259/0x2b0
[  440.225194]        __sys_sendmsg+0x47/0x80
[  440.225196]        do_syscall_64+0x60/0x1f0
[  440.225197]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  440.225198]
               -> #0 (&(&q->lock)->rlock#2){+.-.}:
[  440.225200]        lock_acquire+0xb9/0x1a0
[  440.225202]        _raw_spin_lock_bh+0x34/0x40
[  440.225204]        mt76_wake_tx_queue+0x4c/0xb0 [mt76]
[  440.225215]        ieee80211_agg_start_txq+0xe8/0x2b0 [mac80211]
[  440.225225]        ieee80211_stop_tx_ba_cb+0xb8/0x1f0 [mac80211]
[  440.225235]        ieee80211_ba_session_work+0x1c1/0x2f0 [mac80211]
[  440.225236]        process_one_work+0x237/0x5d0
[  440.225237]        worker_thread+0x3c/0x390
[  440.225239]        kthread+0x11d/0x140
[  440.225240]        ret_from_fork+0x3a/0x50
[  440.225240]
               other info that might help us debug this:

[  440.225241] Chain exists of:
                 &(&q->lock)->rlock#2 --> &(&sta->rate_ctrl_lock)->rlock --> &(&sta->lock)->rlock

[  440.225243]  Possible unsafe locking scenario:

[  440.225244]        CPU0                    CPU1
[  440.225244]        ----                    ----
[  440.225245]   lock(&(&sta->lock)->rlock);
[  440.225245]                                lock(&(&sta->rate_ctrl_lock)->rlock);
[  440.225246]                                lock(&(&sta->lock)->rlock);
[  440.225247]   lock(&(&q->lock)->rlock#2);
[  440.225248]
                *** DEADLOCK ***

[  440.225249] 5 locks held by kworker/u16:28/2362:
[  440.225250]  #0: 0000000048fcd291 ((wq_completion)phy0){+.+.}, at: process_one_work+0x1b5/0x5d0
[  440.225252]  #1: 00000000f1c6828f ((work_completion)(&sta->ampdu_mlme.work)){+.+.}, at: process_one_work+0x1b5/0x5d0
[  440.225254]  #2: 00000000433d2b2c (&sta->ampdu_mlme.mtx){+.+.}, at: ieee80211_ba_session_work+0x5c/0x2f0 [mac80211]
[  440.225265]  #3: 000000002cfedc59 (&(&sta->lock)->rlock){+.-.}, at: ieee80211_stop_tx_ba_cb+0x32/0x1f0 [mac80211]
[  440.225276]  #4: 000000009d7b9a44 (rcu_read_lock){....}, at: ieee80211_agg_start_txq+0x33/0x2b0 [mac80211]
[  440.225286]
               stack backtrace:
[  440.225288] CPU: 2 PID: 2362 Comm: kworker/u16:28 Not tainted 5.1.0-rc2+ #22
[  440.225289] Hardware name: LENOVO 20KGS23S0P/20KGS23S0P, BIOS N23ET55W (1.30 ) 08/31/2018
[  440.225300] Workqueue: phy0 ieee80211_ba_session_work [mac80211]
[  440.225301] Call Trace:
[  440.225304]  dump_stack+0x85/0xc0
[  440.225306]  print_circular_bug.isra.38.cold.58+0x15c/0x195
[  440.225307]  check_prev_add.constprop.48+0x5f0/0xc00
[  440.225309]  ? check_prev_add.constprop.48+0x39d/0xc00
[  440.225311]  ? __lock_acquire+0x41d/0x1100
[  440.225312]  __lock_acquire+0xd98/0x1100
[  440.225313]  ? __lock_acquire+0x41d/0x1100
[  440.225315]  lock_acquire+0xb9/0x1a0
[  440.225317]  ? mt76_wake_tx_queue+0x4c/0xb0 [mt76]
[  440.225319]  _raw_spin_lock_bh+0x34/0x40
[  440.225321]  ? mt76_wake_tx_queue+0x4c/0xb0 [mt76]
[  440.225323]  mt76_wake_tx_queue+0x4c/0xb0 [mt76]
[  440.225334]  ieee80211_agg_start_txq+0xe8/0x2b0 [mac80211]
[  440.225344]  ieee80211_stop_tx_ba_cb+0xb8/0x1f0 [mac80211]
[  440.225354]  ieee80211_ba_session_work+0x1c1/0x2f0 [mac80211]
[  440.225356]  process_one_work+0x237/0x5d0
[  440.225358]  worker_thread+0x3c/0x390
[  440.225359]  ? wq_calc_node_cpumask+0x70/0x70
[  440.225360]  kthread+0x11d/0x140
[  440.225362]  ? kthread_create_on_node+0x40/0x40
[  440.225363]  ret_from_fork+0x3a/0x50

Cc: stable@vger.kernel.org
Fixes: 88046b2c9f6d ("mt76: add support for reporting tx status with skb")
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agort2x00: do not increment sequence number while re-transmitting
Vijayakumar Durai [Wed, 27 Mar 2019 10:03:17 +0000 (11:03 +0100)] 
rt2x00: do not increment sequence number while re-transmitting

Currently rt2x00 devices retransmit the management frames with
incremented sequence number if hardware is assigning the sequence.

This is HW bug fixed already for non-QOS data frames, but it should
be fixed for management frames except beacon.

Without fix retransmitted frames have wrong SN:

 AlphaNet_e8:fb:36 Vivotek_52:31:51 Authentication, SN=1648, FN=0, Flags=........C Frame is not being retransmitted 1648 1
 AlphaNet_e8:fb:36 Vivotek_52:31:51 Authentication, SN=1649, FN=0, Flags=....R...C Frame is being retransmitted 1649 1
 AlphaNet_e8:fb:36 Vivotek_52:31:51 Authentication, SN=1650, FN=0, Flags=....R...C Frame is being retransmitted 1650 1

With the fix SN stays correctly the same:

 88:6a:e3:e8:f9:a2 8c:f5:a3:88:76:87 Authentication, SN=1450, FN=0, Flags=........C
 88:6a:e3:e8:f9:a2 8c:f5:a3:88:76:87 Authentication, SN=1450, FN=0, Flags=....R...C
 88:6a:e3:e8:f9:a2 8c:f5:a3:88:76:87 Authentication, SN=1450, FN=0, Flags=....R...C

Cc: stable@vger.kernel.org
Signed-off-by: Vijayakumar Durai <vijayakumar.durai1@vivint.com>
[sgruszka: simplify code, change comments and changelog]
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agomt76: mt7603: send BAR after powersave wakeup
Felix Fietkau [Tue, 26 Mar 2019 08:34:21 +0000 (09:34 +0100)] 
mt76: mt7603: send BAR after powersave wakeup

Now that the sequence number allocation is fixed, we can finally send a BAR
at powersave wakeup time to refresh the receiver side reorder window

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agomt76: mt7603: fix sequence number assignment
Felix Fietkau [Tue, 26 Mar 2019 08:34:20 +0000 (09:34 +0100)] 
mt76: mt7603: fix sequence number assignment

If the MT_TXD3_SN_VALID flag is not set in the tx descriptor, the hardware
assigns the sequence number. However, the rest of the code assumes that the
sequence number specified in the 802.11 header gets transmitted.
This was causing issues with the aggregation setup, which worked for the
initial one (where the sequence numbers were still close), but not for
further teardown/re-establishing of sessions.

Additionally, the overwrite of the TID sequence number in WTBL2 was resetting
the hardware assigned sequence numbers, causing them to drift further apart.

Fix this by using the software assigned sequence numbers

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agomt76: mt7603: add missing initialization for dev->ps_lock
Felix Fietkau [Tue, 26 Mar 2019 08:34:19 +0000 (09:34 +0100)] 
mt76: mt7603: add missing initialization for dev->ps_lock

Fixes lockdep complaint and a potential race condition

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
5 years agoudpv6: Check address length before reading address family
Tetsuo Handa [Fri, 12 Apr 2019 10:56:39 +0000 (19:56 +0900)] 
udpv6: Check address length before reading address family

KMSAN will complain if valid address length passed to udpv6_pre_connect()
is shorter than sizeof("struct sockaddr"->sa_family) bytes.

(This patch is bogus if it is guaranteed that udpv6_pre_connect() is
always called after checking "struct sockaddr"->sa_family. In that case,
we want a comment why we don't need to check valid address length here.)

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobpf: Check address length before reading address family
Tetsuo Handa [Fri, 12 Apr 2019 10:55:47 +0000 (19:55 +0900)] 
bpf: Check address length before reading address family

KMSAN will complain if valid address length passed to bpf_bind() is
shorter than sizeof("struct sockaddr"->sa_family) bytes.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agollc: Check address length before reading address field
Tetsuo Handa [Fri, 12 Apr 2019 10:55:14 +0000 (19:55 +0900)] 
llc: Check address length before reading address field

KMSAN will complain if valid address length passed to bind() is shorter
than sizeof(struct sockaddr_llc) bytes.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoBluetooth: Check address length before reading address field
Tetsuo Handa [Fri, 12 Apr 2019 10:54:33 +0000 (19:54 +0900)] 
Bluetooth: Check address length before reading address field

KMSAN will complain if valid address length passed to bind() is shorter
than sizeof(struct sockaddr_sco) bytes.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agorxrpc: Check address length before reading srx_service field
Tetsuo Handa [Fri, 12 Apr 2019 10:54:05 +0000 (19:54 +0900)] 
rxrpc: Check address length before reading srx_service field

KMSAN will complain if valid address length passed to bind() is shorter
than sizeof(struct sockaddr_rxrpc) bytes.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: netlink: Check address length before reading groups field
Tetsuo Handa [Fri, 12 Apr 2019 10:53:38 +0000 (19:53 +0900)] 
net: netlink: Check address length before reading groups field

KMSAN will complain if valid address length passed to bind() is shorter
than sizeof(struct sockaddr_nl) bytes.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosctp: Check address length before reading address family
Tetsuo Handa [Fri, 12 Apr 2019 10:53:10 +0000 (19:53 +0900)] 
sctp: Check address length before reading address family

KMSAN will complain if valid address length passed to connect() is shorter
than sizeof("struct sockaddr"->sa_family) bytes.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomISDN: Check address length before reading address family
Tetsuo Handa [Fri, 12 Apr 2019 10:52:36 +0000 (19:52 +0900)] 
mISDN: Check address length before reading address family

KMSAN will complain if valid address length passed to bind() is shorter
than sizeof("struct sockaddr_mISDN"->family) bytes.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>