Ben Hutchings [Mon, 1 Aug 2016 20:12:27 +0000 (21:12 +0100)]
USB: quirks: Fix entries on wrong list in 3.16.y
Commits ddbe1fca0bcb ("USB: Add device quirk for ASUS T100 Base
Station keyboard") and e5dff0e80463 ("USB: Add OTG PET device to TPL")
were wrongly backported to 3.16.y. The original commits added to
usb_quirk_list but the backported versions added to
usb_interface_quirk_list.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
On some platforms, the clocks might be registered by a platform
driver. When this is the case, the clock platform driver may very well
be probed after xhci-plat, in which case the first probe() invocation
of xhci-plat will receive -EPROBE_DEFER as the return value of
devm_clk_get().
The current code handles that as a normal error, and simply assumes
that this means that the system doesn't have a clock for the XHCI
controller, and continues probing without calling
clk_prepare_enable(). Unfortunately, this doesn't work on systems
where the XHCI controller does have a clock, but that clock is
provided by another platform driver. In order to fix this situation,
we handle the -EPROBE_DEFER error condition specially, and abort the
XHCI controller probe(). It will be retried later automatically, the
clock will be available, devm_clk_get() will succeed, and the probe()
will continue with the clock prepared and enabled as expected.
In practice, such issue is seen on the ARM64 Marvell 7K/8K platform,
where the clocks are registered by a platform driver.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16: correct error label here is unmap_registers] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Flushing a work that reschedules itself is not a sensible operation. It needs
to be killed. Failure to do so leads to a kernel panic in the timer code.
Signed-off-by: Oliver Neukum <ONeukum@suse.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
[ 40.467381] =============================================
[ 40.473013] [ INFO: possible recursive locking detected ]
[ 40.478651] 4.6.0-08691-g7f3db9a #37 Not tainted
[ 40.483466] ---------------------------------------------
[ 40.489098] usb/733 is trying to acquire lock:
[ 40.493734] (&(&dev->lock)->rlock){-.....}, at: [<bf129288>] ep0_complete+0x18/0xdc [gadgetfs]
[ 40.502882]
[ 40.502882] but task is already holding lock:
[ 40.508967] (&(&dev->lock)->rlock){-.....}, at: [<bf12a420>] ep0_read+0x20/0x5e0 [gadgetfs]
[ 40.517811]
[ 40.517811] other info that might help us debug this:
[ 40.524623] Possible unsafe locking scenario:
[ 40.524623]
[ 40.530798] CPU0
[ 40.533346] ----
[ 40.535894] lock(&(&dev->lock)->rlock);
[ 40.540088] lock(&(&dev->lock)->rlock);
[ 40.544284]
[ 40.544284] *** DEADLOCK ***
[ 40.544284]
[ 40.550461] May be due to missing lock nesting notation
[ 40.550461]
[ 40.557544] 2 locks held by usb/733:
[ 40.561271] #0: (&f->f_pos_lock){+.+.+.}, at: [<c02a6114>] __fdget_pos+0x40/0x48
[ 40.569219] #1: (&(&dev->lock)->rlock){-.....}, at: [<bf12a420>] ep0_read+0x20/0x5e0 [gadgetfs]
[ 40.578523]
[ 40.578523] stack backtrace:
[ 40.583075] CPU: 0 PID: 733 Comm: usb Not tainted 4.6.0-08691-g7f3db9a #37
[ 40.590246] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 40.596625] [<c010ffbc>] (unwind_backtrace) from [<c010c1bc>] (show_stack+0x10/0x14)
[ 40.604718] [<c010c1bc>] (show_stack) from [<c04207fc>] (dump_stack+0xb0/0xe4)
[ 40.612267] [<c04207fc>] (dump_stack) from [<c01886ec>] (__lock_acquire+0xf68/0x1994)
[ 40.620440] [<c01886ec>] (__lock_acquire) from [<c0189528>] (lock_acquire+0xd8/0x238)
[ 40.628621] [<c0189528>] (lock_acquire) from [<c06ad6b4>] (_raw_spin_lock_irqsave+0x38/0x4c)
[ 40.637440] [<c06ad6b4>] (_raw_spin_lock_irqsave) from [<bf129288>] (ep0_complete+0x18/0xdc [gadgetfs])
[ 40.647339] [<bf129288>] (ep0_complete [gadgetfs]) from [<bf10a728>] (musb_g_giveback+0x118/0x1b0 [musb_hdrc])
[ 40.657842] [<bf10a728>] (musb_g_giveback [musb_hdrc]) from [<bf108768>] (musb_g_ep0_queue+0x16c/0x188 [musb_hdrc])
[ 40.668772] [<bf108768>] (musb_g_ep0_queue [musb_hdrc]) from [<bf12a944>] (ep0_read+0x544/0x5e0 [gadgetfs])
[ 40.678963] [<bf12a944>] (ep0_read [gadgetfs]) from [<c0284470>] (__vfs_read+0x20/0x110)
[ 40.687414] [<c0284470>] (__vfs_read) from [<c0285324>] (vfs_read+0x88/0x114)
[ 40.694864] [<c0285324>] (vfs_read) from [<c0286150>] (SyS_read+0x44/0x9c)
[ 40.702051] [<c0286150>] (SyS_read) from [<c0107820>] (ret_fast_syscall+0x0/0x1c)
This is caused by the spinlock bug in ep0_read().
Fix the two other deadlock sources in gadgetfs_setup() too.
Signed-off-by: Bin Liu <b-liu@ti.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
[bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
This loop is supposed to set all the .num[] values to -1 but it's off by
one so it skips the first element and sets one element past the end of
the array.
I've cleaned up the loop a little as well.
Fixes: ddf8abd25994 ('USB: f_fs: the FunctionFS driver') Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
[bwh: Backported to 3.16:
- Adjust filename, context
- Add definition of i variable] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
dwc3-exynos has two problems during init if the regulators are slow
to come up (for instance if the I2C bus driver is not on the initramfs)
and return probe deferral. First, every time this happens, the driver
leaks the USB phys created; they need to be deallocated on error.
Second, since the phy devices are created before the regulators fail,
this means that there's a new device to re-trigger deferred probing,
which causes it to essentially go into a busy loop of re-probing the
device until the regulators come up.
Move the phy creation to after the regulators have succeeded, and also
fix cleanup on failure. On my ODROID XU4 system (with Debian's initramfs
which doesn't contain the I2C driver), this reduces the number of probe
attempts (for each of the two controllers) from more than 2000 to eight.
Signed-off-by: Steinar H. Gunderson <sesse@google.com> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Reviewed-by: Vivek Gautam <gautam.vivek@samsung.com> Fixes: d720f057fda4 ("usb: dwc3: exynos: add nop transceiver support") Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Temperature channels report scaled samples in Celsius although expected as
milli degree Celsius in Documentation/ABI/testing/sysfs-bus-iio.
Gains are not implemented at all for LPS001WP pressure and temperature
channels.
This patch ensures that proper offsets and scales are exposed to userpace
for both pressure and temperature channels.
Also fix a NULL pointer exception when userspace reads content of sysfs
scale attribute when gains are not defined.
Signed-off-by: Gregor Boirie <gregor.boirie@parrot.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
[bwh: Backported to 3.16:
- Adjust context
- In st_press_read_raw() use pdata instead of press_data] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Buffer wasn't of a valid size to allow the timestamp, and correct padding.
This patchset also moves the buffer off the stack, and onto the heap.
Cc: george.mccollister@gmail.com Signed-off-by: Matt Ranostay <mranostay@gmail.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Triggered buffers shouldn't return processed data, and the respective
conversion was overflowing the defined .realbits for the channel.
Cc: george.mccollister@gmail.com Signed-off-by: Matt Ranostay <mranostay@gmail.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
IIO_CHAN_INFO_RAW was returning processed data which was incorrect.
This also adds the IIO_CHAN_INFO_SCALE value to convert to a processed value.
Signed-off-by: Matt Ranostay <mranostay@gmail.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Reported-by: H. Nikolaus Schaller <hns@goldelico.com> Tested-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
caam_jr_alloc() used to return NULL if a JR device could not be
allocated for a session. In turn, every user of this function used
IS_ERR() function to verify if anything went wrong, which does NOT look
for NULL values. This made the kernel crash if the sanity check failed,
because the driver continued to think it had allocated a valid JR dev
instance to the session and at some point it tries to do a caam_jr_free()
on a NULL JR dev pointer.
This patch is a fix for this issue.
Signed-off-by: Catalin Vasile <cata.vasile@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The current code goes through a lot of indirection just to call a
known handler. Simplify it: just call the handlers directly.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
[bwh: Backported to 3.16: old code forwarded to uvc_v4l2_ioctl()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Mask the reserved bits when reading the number of newly
created XRCD.
Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Add 16 reserved bytes at the end of mlx5_modify_qp_mbox_in to
match the hardware spec definition.
Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The bcm_kona_gpio_reset() calls bcm_kona_gpio_write_lock_regs()
with what looks like the wrong parameter. The write_lock_regs
function takes a pointer to the registers, not the bcm_kona_gpio
structure.
Fix the warning, and probably bug by changing the function to
pass reg_base instead of kona_gpio, fixing the following warning:
drivers/gpio/gpio-bcm-kona.c:550:47: warning: incorrect type in argument 1
(different address spaces)
expected void [noderef] <asn:2>*reg_base
got struct bcm_kona_gpio *kona_gpio
warning: incorrect type in argument 1 (different address spaces)
expected void [noderef] <asn:2>*reg_base
got struct bcm_kona_gpio *kona_gpio
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Acked-by: Ray Jui <ray.jui@broadcom.com> Reviewed-by: Markus Mayer <mmayer@broadcom.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Ascend-to-parent logics in d_walk() depends on all encountered child
dentries not getting freed without an RCU delay. Unfortunately, in
quite a few cases it is not true, with hard-to-hit oopsable race as
the result.
Fortunately, the fix is simiple; right now the rule is "if it ever
been hashed, freeing must be delayed" and changing it to "if it
ever had a parent, freeing must be delayed" closes that hole and
covers all cases the old rule used to cover. Moreover, pipes and
sockets remain _not_ covered, so we do not introduce RCU delay in
the cases which are the reason for having that delay conditional
in the first place.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[bwh: Backported to 3.16:
- Adjust context
- Also set the flag in __d_materialise_dentry())] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When turbo is disabled, the ->set_policy() interface is broken.
For example, when turbo is disabled and cpuinfo.max = 2900000 (full
max turbo frequency), setting the limits results in frequency less
than the requested one:
Set 1000000 KHz results in 0700000 KHz
Set 1500000 KHz results in 1100000 KHz
Set 2000000 KHz results in 1500000 KHz
This is because the limits->max_perf fraction is calculated using
the max turbo frequency as the reference, but when the max P-State is
capped in intel_pstate_get_min_max(), the reference is not the max
turbo P-State. This results in reducing max P-State.
One option is to always use max turbo as reference for calculating
limits. But this will not be correct. By definition the intel_pstate
sysfs limits, shows percentage of available performance. So when
BIOS has disabled turbo, the available performance is max non turbo.
So the max_perf_pct should still show 100%.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw : Subject & changelog, rewrite in fewer lines of code ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[bwh: Backported to 3.16:
- limits is a perf_limits structure, not a pointer to it
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Because of an improper dereference, a stray 'C' character was output to
the modalias when no 'compatible' was specified. This is the case for
some old PowerMac drivers which only set the 'name' property. Fix it to
let them match again.
Reported-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Tested-by: Mathieu Malaterre <malat@debian.org> Cc: Philipp Zabel <p.zabel@pengutronix.de> Cc: Andreas Schwab <schwab@linux-m68k.org> Fixes: 6543becf26fff6 ("mod/file2alias: make modalias generation safe for cross compiling") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The v6 tcp stats scan do not provide TLP and ER timer information
correctly like the v4 version . This patch fixes that.
Fixes: 6ba8a3b19e76 ("tcp: Tail loss probe (TLP)") Fixes: eed530b6c676 ("tcp: early retransmit") Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
For newer versions of Syslinux, we need ldlinux.c32 in addition to
isolinux.bin to reside on the boot disk, so if the latter is found,
copy it, too, to the isoimage tree.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
ipoib_neigh_get unconditionally updates the "alive" variable member on
any packet send. This prevents the neighbor garbage collection from
cleaning out a dead neighbor entry if we are still queueing packets
for it. If the queue for this neighbor is full, then don't update the
alive timestamp. That way the neighbor can time out even if packets
are still being queued as long as none of them are being sent.
Fixes: b63b70d87741 ("IPoIB: Use a private hash table for path lookup in xmit path") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
FW port-change events are fired on Active <-> non Active port state
transitions only.
When the port state changes from Active to Initializing (Active ->
Down -> Initializing), a single event is fired.
The HCA transitions from Down to Initializing unless prevented from
doing so, hence the driver should also propagate events when the port
state is Initializing to consumers so they'll be aware that the port
is no longer Active and act accordingly.
Fixes: e126ba97dba9e ('mlx5: Add driver for Mellanox Connect-IB...') Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In ipoib_remove_one the driver holds the rtnl_lock and tries to do some
operation like dev_change_flags or unregister_netdev, while sysfs
callback like ipoib_vlan_delete holds sysfs mutex and tries to hold the
rtnl_lock via rtnl_trylock() and restart_syscall() if the lock is not
free, meanwhile ipoib_remove_one tries to get the sysfs lock in order to
free its sysfs directory, and we will get a->b, b->a deadlock.
MNT_LOCKED implies on a child mount implies the child is locked to the
parent. So while looping through the children the children should be
tested (not their parent).
Typically an unshare of a mount namespace locks all mounts together
making both the parent and the slave as locked but there are a few
corner cases where other things work.
Fixes: ceeb0e5d39fc ("vfs: Ignore unlocked mounts in fs_fully_visible") Reported-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In commit 8445a87f7092 "powerpc/iommu: Remove the dependency on EEH
struct in DDW mechanism", the PE address was replaced with the PCI
config address in order to remove dependency on EEH. According to PAPR
spec, firmware (pHyp or QEMU) should accept "xxBBSSxx" format PCI config
address, not "xxxxBBSS" provided by the patch. Note that "BB" is PCI bus
number and "SS" is the combination of slot and function number.
This fixes the PCI address passed to DDW RTAS calls.
Fixes: 8445a87f7092 ("powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism") Reported-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
This means the userspace program clock_adjtime called the clock_adjtime()
syscall and then crashed inside the compat_get_timex() function.
Syscalls should never crash programs, but instead return EFAULT.
The IIR register contains the executed instruction, which disassebles
into "ldw 0(sr3,r5),r9".
This load-word instruction is part of __get_user() which tried to read the word
at %r5/IOR (0xfa6f7fff). This means the unaligned handler jumped in. The
unaligned handler is able to emulate all ldw instructions, but it fails if it
fails to read the source e.g. because of page fault.
int main(void) {
/* allocate 8k */
char *ptr = mmap(NULL, 2*4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
/* free second half (upper 4k) and make it invalid. */
munmap(ptr+4096, 4096);
/* syscall where first int is unaligned and clobbers into invalid memory region */
/* syscall should return EFAULT */
return syscall(__NR_clock_adjtime, 0, ptr+4095);
}
To fix this issue we simply need to check if the faulting instruction address
is in the exception fixup table when the unaligned handler failed. If it
is, call the fixup routine instead of crashing.
While looking at the unaligned handler I found another issue as well: The
target register should not be modified if the handler was unsuccessful.
Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The kernel-doc for the of_irq_get[_byname]() is clearly inadequate in
describing the return values -- of_irq_get_byname() is documented better
than of_irq_get() but it still doesn't mention that 0 is returned iff
irq_create_of_mapping() fails (it doesn't return an error code in this
case). Document all possible return value variants, making the writing
of the word "IRQ" consistent, while at it...
Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq") Fixes: ad69674e73a1 ("of/irq: do irq resolution in platform_get_irq_byname()") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
MOV to DR6 or DR7 causes a #GP if an attempt is made to write a 1 to
any of bits 63:32. However, this is not detected at KVM_SET_DEBUGREGS
time, and the next KVM_RUN oopses:
PTRACE_SETVFPREGS fails to properly mark the VFP register set to be
reloaded, because it undoes one of the effects of vfp_flush_hwstate().
Specifically vfp_flush_hwstate() sets thread->vfpstate.hard.cpu to
an invalid CPU number, but vfp_set() overwrites this with the original
CPU number, thereby rendering the hardware state as apparently "valid",
even though the software state is more recent.
Fix this by reverting the previous change.
Fixes: 8130b9d7b9d8 ("ARM: 7308/1: vfp: flush thread hwstate before copying ptrace registers") Acked-by: Will Deacon <will.deacon@arm.com> Tested-by: Simon Marchi <simon.marchi@ericsson.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Roland Dreier reports that one of his systems cannot boot because of
the changes made by commit ac212b6980d8 (ACPI / processor: Use common
hotplug infrastructure).
The problematic part of it is the request_region() call in
acpi_processor_get_info() that used to run at module init time before
the above commit and now it runs much earlier. Unfortunately, the
region(s) reserved by it fall into a range the PCI subsystem attempts
to reserve for AHCI IO BARs. As a result, the PCI reservation fails
and AHCI doesn't work, while previously the PCI reservation would
be made before acpi_processor_get_info() and it would succeed.
That request_region() call, however, was overlooked by commit ac212b6980d8, as it is not necessary for the enumeration of the
processors. It only is needed when the ACPI processor driver
actually attempts to handle them which doesn't happen before
loading the ACPI processor driver module. Therefore that call
should have been moved from acpi_processor_get_info() into that
module.
Address the problem by moving the request_region() call in question
out of acpi_processor_get_info() and use the observation that the
region reserved by it is only needed if the FADT-based CPU
throttling method is going to be used, which means that it should
be sufficient to invoke it from acpi_processor_get_throttling_fadt().
Fixes: ac212b6980d8 (ACPI / processor: Use common hotplug infrastructure) Reported-by: Roland Dreier <roland@purestorage.com> Tested-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fix this by suppressing VPD inquiry for this device.
Signed-off-by: Ewan D. Milne <emilne@redhat.com> Reported-by: Jan Stancek <jstancek@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fix both of these by unconditionally flushing paths before destroying
the sta, and by adding a synchronize_net() after path flush to ensure
no active readers can still dereference the sta.
Reported-by: Fred Veldini <fred.veldini@gmail.com> Tested-by: Fred Veldini <fred.veldini@gmail.com> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
A wmediumd that does not send this attribute causes a NULL pointer
dereference, as the attribute is accessed even if it does not exist.
The attribute was required but never checked ever since userspace frame
forwarding has been introduced. The issue gets more problematic once we
allow wmediumd registration from user namespaces.
Fixes: 7882513bacb1 ("mac80211_hwsim driver support userspace frame tx/rx") Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
We are already using the privileged versions of MMCR0, MMCR1
and MMCRA in the kernel, so for MMCR2, we should better use
the privileged versions, too, to be consistent.
Fixes: 240686c13687 ("powerpc: Initialise PMU related regs on Power8") Suggested-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The SIAR and SDAR registers are available twice, one time as SPRs
780 / 781 (unprivileged, but read-only), and one time as the SPRs
796 / 797 (privileged, but read and write). The Linux kernel code
currently uses the unprivileged SPRs - while this is OK for reading,
writing to that register of course does not work.
Since the KVM code tries to write to this register, too (see the mtspr
in book3s_hv_rmhandlers.S), the contents of this register sometimes get
lost for the guests, e.g. during migration of a VM.
To fix this issue, simply switch to the privileged SPR numbers instead.
Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In the "ibm,configure-pe" and "ibm,configure-bridge" RTAS calls, the
spec states that values of 9900-9905 can be returned, indicating that
software should delay for 10^x (where x is the last digit, i.e. 990x)
milliseconds and attempt the call again. Currently, the kernel doesn't
know about this, and respecting it fixes some PCI failures when the
hypervisor is busy.
The delay is capped at 0.2 seconds.
Signed-off-by: Russell Currey <ruscur@russell.cc> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The HPFS filesystem used generic_show_options to produce string that is
displayed in /proc/mounts. However, there is a problem that the options
may disappear after remount. If we mount the filesystem with option1
and then remount it with option2, /proc/mounts should show both option1
and option2, however it only shows option2 because the whole option
string is replaced with replace_mount_options in hpfs_remount_fs.
To fix this bug, implement the hpfs_show_options function that prints
options that are currently selected.
Commit ce657611baf9 ("hpfs: kstrdup() out of memory handling") checks if
the kstrdup function returns NULL due to out-of-memory condition.
However, if we are remounting a filesystem with no change to
filesystem-specific options, the parameter data is NULL. In this case,
kstrdup returns NULL (because it was passed NULL parameter), although no
out of memory condition exists. The mount syscall then fails with
ENOMEM.
This patch fixes the bug. We fail with ENOMEM only if data is non-NULL.
The patch also changes the call to replace_mount_options - if we didn't
pass any filesystem-specific options, we don't call
replace_mount_options (thus we don't erase existing reported options).
Fixes: ce657611baf9 ("hpfs: kstrdup() out of memory handling") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Commit d96cc3d1ec5d "MIPS: Add microMIPS MSA support." attempted to use
the value of a macro within an inline asm statement but instead emitted
a comment leading to the cfcmsa & ctcmsa instructions being omitted. Fix
that by passing CFC_MSA_INSN & CTC_MSA_INSN as arguments to the asm
statements.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7305/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
On a Freescale i.MX53 based board we ran into "BUG: scheduling while
atomic" because input_inject_event locks interrupts, but
imx_pwm_config_v2 sleeps.
Tested on Freescale i.MX53 SoC with 4.6.0.
Signed-off-by: Manfred Schlaegl <manfred.schlaegl@gmx.at> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
After initially connecting a wired Xbox 360 controller or sending it
a command to change LEDs, a status/response packet is interpreted as
controller input. This causes the state of buttons represented in
byte 2 of the controller data packet to be incorrect until the next
valid input packet. Wireless Xbox 360 controllers are not affected.
Writing a new value to the LED device while holding the Start button
and running jstest is sufficient to reproduce this bug. An event will
come through with the Start button released.
Xboxdrv also won't attempt to read controller input from a packet
where byte 0 is non-zero. It also checks that byte 1 is 0x14, but
that value differs between wired and wireless controllers and this
code is shared by both. I think just checking byte 0 is enough to
eliminate unwanted packets.
The following are some examples of 3-byte status packets I saw:
01 03 02
02 03 00
03 03 03
08 03 00
Signed-off-by: Cameron Gutman <aicommander@gmail.com> Signed-off-by: Pavel Rojtberg <rojtberg@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
With netconsole (at least) the pr_err("... disablingn") call can
recurse back into the dma-debug code, where it'll try to grab
free_entries_lock again. Avoid the problem by doing the printk after
dropping the lock.
Otherwise, if we fail to allocate new PIO buffers, our TXQs will try to
use the old ones, which aren't there any more.
Fixes: 183233bec810 "sfc: Allocate and link PIO buffers; map them with write-combining" Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The ccp-crypto module for AES XTS support has a bug that can allow requests
greater than 4096 bytes in size to be passed to the CCP hardware. The CCP
hardware does not support request sizes larger than 4096, resulting in
incorrect output. The request should actually be handled by the fallback
mechanism instantiated by the ccp-crypto module.
Add a check to insure the request size is less than or equal to the maximum
supported size and use the fallback mechanism if it is not.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: b955150ea784 ('RDMA/cxgb3: When a user QP is marked in error, also mark the CQs in error') Signed-off-by: Honggang Li <honli@redhat.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Ezequiel reported that he's facing UBI going into read-only
mode after power cut. It turned out that this behavior happens
only when updating a static volume is interrupted and Fastmap is
used.
UBI checks static volumes for data consistency and reads the
whole volume upon first open. If the volume is found erroneous
users of UBI cannot read from it, but another volume update is
possible to fix it. The check is performed by running
ubi_eba_read_leb() on every allocated LEB of the volume.
For static volumes ubi_eba_read_leb() computes the checksum of all
data stored in a LEB. To verify the computed checksum it has to read
the LEB's volume header which stores the original checksum.
If the volume header is not found UBI treats this as fatal internal
error and switches to RO mode. If the UBI device was attached via a
full scan the assumption is correct, the volume header has to be
present as it had to be there while scanning to get known as mapped.
If the attach operation happened via Fastmap the assumption is no
longer correct. When attaching via Fastmap UBI learns the mapping
table from Fastmap's snapshot of the system state and not via a full
scan. It can happen that a LEB got unmapped after a Fastmap was
written to the flash. Then UBI can learn the LEB still as mapped and
accessing it returns only 0xFF bytes. As UBI is not a FTL it is
allowed to have mappings to empty PEBs, it assumes that the layer
above takes care of LEB accounting and referencing.
UBIFS does so using the LEB property tree (LPT).
For static volumes UBI blindly assumes that all LEBs are present and
therefore special actions have to be taken.
The described situation can happen when updating a static volume is
interrupted, either by a user or a power cut.
The volume update code first unmaps all LEBs of a volume and then
writes LEB by LEB. If the sequence of operations is interrupted UBI
detects this either by the absence of LEBs, no volume header present
at scan time, or corrupted payload, detected via checksum.
In the Fastmap case the former method won't trigger as no scan
happened and UBI automatically thinks all LEBs are present.
Only by reading data from a LEB it detects that the volume header is
missing and incorrectly treats this as fatal error.
To deal with the situation ubi_eba_read_leb() from now on checks
whether we attached via Fastmap and handles the absence of a
volume header like a data corruption error.
This way interrupted static volume updates will correctly get detected
also when Fastmap is used.
Reported-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> Tested-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> Signed-off-by: Richard Weinberger <richard@nod.at>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
commit 0e707ae79ba3 ("UBI: do propagate positive error codes up") seems
to have produced an unintended change in the control flow here.
Completely untested, but it looks obvious.
Caught by Coverity, which didn't like the indentation. CID 1271184.
Signed-off-by: Brian Norris <computersforpeace@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
UBI uses positive function return codes internally, and should not propagate
them up, except in the place this path fixes. Here is the original bug report
from Dan Carpenter:
The problem is really in ubi_eba_read_leb().
drivers/mtd/ubi/eba.c
412 err = ubi_io_read_vid_hdr(ubi, pnum, vid_hdr, 1);
413 if (err && err != UBI_IO_BITFLIPS) {
414 if (err > 0) {
415 /*
416 * The header is either absent or corrupted.
417 * The former case means there is a bug -
418 * switch to read-only mode just in case.
419 * The latter case means a real corruption - we
420 * may try to recover data. FIXME: but this is
421 * not implemented.
422 */
423 if (err == UBI_IO_BAD_HDR_EBADMSG ||
424 err == UBI_IO_BAD_HDR) {
425 ubi_warn("corrupted VID header at PEB %d, LEB %d:%d",
426 pnum, vol_id, lnum);
427 err = -EBADMSG;
428 } else
429 ubi_ro_mode(ubi);
On this path we return UBI_IO_FF and UBI_IO_FF_BITFLIPS and it
eventually gets passed to ERR_PTR(). We probably dereference the bad
pointer and oops. At that point we've gone read only so it was already
a bad situation...
Commit ff1e22e7a638 ("xen/events: Mask a moving irq") open-coded
irq_move_irq() but left out checking if the IRQ is disabled. This broke
resuming from suspend since it tries to move a (disabled) irq without
holding the IRQ's desc->lock. Fix it by adding in a check for disabled
IRQs.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
creates an unreapable zombie if /sbin/init doesn't use __WALL.
This is not a kernel bug, at least in a sense that everything works as
expected: debugger should reap a traced sub-thread before it can reap the
leader, but without __WALL/__WCLONE do_wait() ignores sub-threads.
Unfortunately, it seems that /sbin/init in most (all?) distributions
doesn't use it and we have to change the kernel to avoid the problem.
Note also that most init's use sys_waitid() which doesn't allow __WALL, so
the necessary user-space fix is not that trivial.
This patch just adds the "ptrace" check into eligible_child(). To some
degree this matches the "tsk->ptrace" in exit_notify(), ->exit_signal is
mostly ignored when the tracee reports to debugger. Or WSTOPPED, the
tracer doesn't need to set this flag to wait for the stopped tracee.
This obviously means the user-visible change: __WCLONE and __WALL no
longer have any meaning for debugger. And I can only hope that this won't
break something, but at least strace/gdb won't suffer.
We could make a more conservative change. Say, we can take __WCLONE into
account, or !thread_group_leader(). But it would be nice to not
complicate these historical/confusing checks.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> Cc: Pedro Alves <palves@redhat.com> Cc: Roland McGrath <roland@hack.frob.com> Cc: <syzkaller@googlegroups.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The length of the GSS MIC token need not be a multiple of four bytes.
It is then padded by XDR to a multiple of 4 B, but unwrap_integ_data()
would previously only trim mic.len + 4 B. The remaining up to three
bytes would then trigger a check in nfs4svc_decode_compoundargs(),
leading to a "garbage args" error and mount failure:
nfs4svc_decode_compoundargs: compound not properly padded!
nfsd: failed to decode arguments!
This would prevent older clients using the pre-RFC 4121 MIC format
(37-byte MIC including a 9-byte OID) from mounting exports from v3.9+
servers using krb5i.
The trimming was introduced by commit 4c190e2f913f ("sunrpc: trim off
trailing checksum before returning decrypted or integrity authenticated
buffer").
Fixes: 4c190e2f913f "unrpc: trim off trailing checksum..." Signed-off-by: Tomáš Trnka <ttrnka@mail.muni.cz> Acked-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
008GE0 Toshiba mmc in some Intel Baytrail tablets responds to
MMC_SEND_EXT_CSD in 450-600ms.
This patch will...
() Increase the long read time quirk timeout from 300ms to 600ms. Original
author of that quirk says 300ms was only a guess and that the number
may need to be raised in the future.
() Add this specific MMC to the quirk
Signed-off-by: Matt Gumbel <matthew.k.gumbel@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When we read out the watermark state from the hardware we're supposed to
transfer that into the active watermarks, but currently we fail to any
part of the active watermarks that isn't explicitly written. Let's clear
it all upfront.
Looks like this has been like this since the beginning, when I added the
readout. No idea why I didn't clear it up.
Cc: Matt Roper <matthew.d.roper@intel.com> Fixes: 243e6a44b9ca ("drm/i915: Init HSW watermark tracking in intel_modeset_setup_hw_state()") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1463151318-14719-2-git-send-email-ville.syrjala@linux.intel.com
(cherry picked from commit 15606534bf0a65d8a74a90fd57b8712d147dbca6) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When SCSI was written, all commands coming from the filesystem
(REQ_TYPE_FS commands) had data. This meant that our signal for needing
to complete the command was the number of bytes completed being equal to
the number of bytes in the request. Unfortunately, with the advent of
flush barriers, we can now get zero length REQ_TYPE_FS commands, which
confuse this logic because they satisfy the condition every time. This
means they never get retried even for retryable conditions, like UNIT
ATTENTION because we complete them early assuming they're done. Fix
this by special casing the early completion condition to recognise zero
length commands with errors and let them drop through to the retry code.
Reported-by: Sebastian Parschauer <s.parschauer@gmx.de> Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Tested-by: Jack Wang <jinpu.wang@profitbricks.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
We used to check dev->reg_state against NETREG_REGISTERED after each
time we are woke up. But after commit 9e641bdcfa4e ("net-tun:
restructure tun_do_read for better sleep/wakeup efficiency"), it uses
skb_recv_datagram() which does not check dev->reg_state. This will
result if we delete a tun/tap device after a process is blocked in the
reading. The device will wait for the reference count which was held
by that process for ever.
Fixes this by using RCV_SHUTDOWN which will be checked during
sk_recv_datagram() before trying to wake up the process during uninit.
Fixes: 9e641bdcfa4e ("net-tun: restructure tun_do_read for better
sleep/wakeup efficiency") Cc: Eric Dumazet <edumazet@google.com> Cc: Xi Wang <xii@google.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The PM runtime will be left disabled for the device if its
.suspend_late() callback fails and async suspend is not allowed
for this device. In this case device will not be added in
dpm_late_early_list and dpm_resume_early() will ignore this
device, as result PM runtime will be disabled for it forever
(side effect: after 8 subsequent failures for the same device
the PM runtime will be reenabled due to disable_depth overflow).
To fix this problem, add devices to dpm_late_early_list regardless
of whether or not device_suspend_late() returns errors for them.
That will ensure failures in there to be handled consistently for
all devices regardless of their async suspend/resume status.
Reported-by: Grygorii Strashko <grygorii.strashko@ti.com> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When running a 32-bit userspace on a 64-bit kernel, the UI_SET_PHYS
ioctl needs to be treated with special care, as it has the pointer
size encoded in the command.
The session key is the default keyring set for request_key operations.
This session key is revoked when the user owning the session logs out.
Any long running daemon processes started by this session ends up with
revoked session keyring which prevents these processes from using the
request_key mechanism from obtaining the krb5 keys.
The problem has been reported by a large number of autofs users. The
problem is also seen with multiuser mounts where the share may be used
by processes run by a user who has since logged out. A reproducer using
automount is available on the Red Hat bz.
The patch creates a new keyring which is used to cache cifs spnego
upcalls.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reported-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.16: keyring_alloc() doesn't take a restrict_link param] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
ehea_get_port may return NULL. Do not dereference NULL value.
Fixes: 8c4877a4128e ("ehea: Use the standard logging functions") Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Acked-by: Thadeu Lima de Souza Cascardo <cascardo@debian.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In some rare randconfig builds, we can end up with
ASYMMETRIC_PUBLIC_KEY_SUBTYPE enabled but CRYPTO_AKCIPHER disabled,
which fails to link because of the reference to crypto_alloc_akcipher:
crypto/built-in.o: In function `public_key_verify_signature':
:(.text+0x110e4): undefined reference to `crypto_alloc_akcipher'
This adds a Kconfig 'select' statement to ensure the dependency
is always there.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
We don't write back stale inodes so we should skip them in
xfs_iflush_cluster, too.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Some careless idiot(*) wrote crap code in commit 1a3e8f3 ("xfs:
convert inode cache lookups to use RCU locking") back in late 2010,
and so xfs_iflush_cluster checks the wrong inode for whether it is
still valid under RCU protection. Fix it to lock and check the
correct inode.
(*) Careless-idiot: Dave Chinner <dchinner@redhat.com>
Discovered-by: Brain Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When a failure due to an inode buffer occurs, the error handling
fails to abort the inode writeback correctly. This can result in the
inode being reclaimed whilst still in the AIL, leading to
use-after-free situations as well as filesystems that cannot be
unmounted as the inode log items left in the AIL never get removed.
Fix this by ensuring fatal errors from xfs_imap_to_bp() result in
the inode flush being aborted correctly.
Reported-by: Shyam Kaushik <shyam@zadarastorage.com> Diagnosed-by: Shyam Kaushik <shyam@zadarastorage.com> Tested-by: Shyam Kaushik <shyam@zadarastorage.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
[bwh: Backported to 3.16: as Dave pointed out, error codes are positive
here so compare with positive EAGAIN] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The function batadv_iv_ogm_orig_add_if allocates new buffers for bcast_own
and bcast_own_sum. It is expected that these buffers are unchanged in case
either bcast_own or bcast_own_sum couldn't be resized.
But the error handling of this function frees the already resized buffer
for bcast_own when the allocation of the new bcast_own_sum buffer failed.
This will lead to an invalid memory access when some code will try to
access bcast_own.
Instead the resized new bcast_own buffer has to be kept. This will not lead
to problems because the size of the buffer was only increased and therefore
no user of the buffer will try to access bytes outside of the new buffer.
Fixes: d0015fdd3d2c ("batman-adv: provide orig_node routing API") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <a@unstable.cc>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Commit 0b89e9aa2856 (cpuidle: delay enabling interrupts until all
coupled CPUs leave idle) rightfully fixed a regression by letting
the coupled idle state framework to handle local interrupt enabling
when the CPU is exiting an idle state.
The current code checks if the idle state is coupled and, if so, it
will let the coupled code to enable interrupts. This way, it can
decrement the ready-count before handling the interrupt. This
mechanism prevents the other CPUs from waiting for a CPU which is
handling interrupts.
But the check is done against the state index returned by the back
end driver's ->enter functions which could be different from the
initial index passed as parameter to the cpuidle_enter_state()
function.
if (!cpuidle_state_is_coupled(drv, entered_state))
local_irq_enable();
[ ... ]
If the 'index' is referring to a coupled idle state but the
'entered_state' is *not* coupled, then the interrupts are enabled
again. All CPUs blocked on the sync barrier may busy loop longer
if the CPU has interrupts to handle before decrementing the
ready-count. That's consuming more energy than saving.
Fixes: 0b89e9aa2856 (cpuidle: delay enabling interrupts until all coupled CPUs leave idle) Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Commit 176e21ee2ec8 ("SUNRPC: Support for RPC over AF_LOCAL
transports") added a 5-character netid, but did not bump
RPCBIND_MAXNETIDLEN from 4 to 5.
Fixes: 176e21ee2ec8 ("SUNRPC: Support for RPC over AF_LOCAL ...") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Wrong return code was being returned on SMB3 rmdir of
non-empty directory.
For SMB3 (unlike for cifs), we attempt to delete a directory by
set of delete on close flag on the open. Windows clients set
this flag via a set info (SET_FILE_DISPOSITION to set this flag)
which properly checks if the directory is empty.
With this patch on smb3 mounts we correctly return
"DIRECTORY NOT EMPTY"
on attempts to remove a non-empty directory.
Signed-off-by: Steve French <steve.french@primarydata.com> Acked-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.16:adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.16: adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.16: adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
See [MS-NLMP] 3.2.5.1.2 Server Receives an AUTHENTICATE_MESSAGE from the Client:
...
Set NullSession to FALSE
If (AUTHENTICATE_MESSAGE.UserNameLen == 0 AND
AUTHENTICATE_MESSAGE.NtChallengeResponse.Length == 0 AND
(AUTHENTICATE_MESSAGE.LmChallengeResponse == Z(1)
OR
AUTHENTICATE_MESSAGE.LmChallengeResponse.Length == 0))
-- Special case: client requested anonymous authentication
Set NullSession to TRUE
...
Only server which map unknown users to guest will allow
access using a non-null NTChallengeResponse.
For Samba it's the "map to guest = bad user" option.
During boot, MST hotplugs are generally expected (even if no physical
hotplugging occurs) and result in DRM's connector topology changing.
This means that using num_connector from the current mode configuration
can lead to the number of connectors changing under us. This can lead to
some nasty scenarios in fbcon:
- We allocate an array to the size of dev->mode_config.num_connectors.
- MST hotplug occurs, dev->mode_config.num_connectors gets incremented.
- We try to loop through each element in the array using the new value
of dev->mode_config.num_connectors, and end up going out of bounds
since dev->mode_config.num_connectors is now larger then the array we
allocated.
fb_helper->connector_count however, will always remain consistent while
we do a modeset in fb_helper.
Note: This is just polish for 4.7, Dave Airlie's drm_connector
refcounting fixed these bugs for real. But it's good enough duct-tape
for stable kernel backporting, since backporting the refcounting
changes is way too invasive.
During boot time, MST devices usually send a ton of hotplug events
irregardless of whether or not any physical hotplugs actually occurred.
Hotplugs mean connectors being created/destroyed, and the number of DRM
connectors changing under us. This isn't a problem if we use
fb_helper->connector_count since we only set it once in the code,
however if we use num_connector from struct drm_mode_config we risk it's
value changing under us. On top of that, there's even a chance that
dev->mode_config.num_connector != fb_helper->connector_count. If the
number of connectors happens to increase under us, we'll end up using
the wrong array size for memcpy and start writing beyond the actual
length of the array, occasionally resulting in kernel panics.
Note: This is just polish for 4.7, Dave Airlie's drm_connector
refcounting fixed these bugs for real. But it's good enough duct-tape
for stable kernel backporting, since backporting the refcounting
changes is way too invasive.
Irrespective of that fact that that is horrible code that should be
fixed for many reasons, it does highlight a deficiency in the generic
preempt_count manipulators. As it is never right to combine/elide
preempt_count manipulations like this.
Therefore sprinkle some volatile in the two generic accessors to
ensure the compiler is aware of the fact that the preempt_count is
observed outside of the regular program-order view and thus cannot be
optimized away like this.
x86; the only arch not using the generic code is not affected as we
do all this in asm in order to use the segment base per-cpu stuff.
Reported-by: Vikram Mulukutla <markivx@codeaurora.org> Tested-by: Vikram Mulukutla <markivx@codeaurora.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: a787870924db ("sched, arch: Create asm/preempt.h") Link: http://lkml.kernel.org/r/20160516131751.GH3205@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.16: use ACCESS_ONCE() instead of READ_ONCE()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When we free cb->skb after a dump, we do it after releasing the
lock. This means that a new dump could have started in the time
being and we'll end up freeing their skb instead of ours.
This patch saves the skb and module before we unlock so we free
the right memory.
Fixes: 16b304f3404f ("netlink: Eliminate kmalloc in netlink dump operation.") Reported-by: Baozeng Ding <sploving1@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
b84106b4e229 ("PCI: Disable IO/MEM decoding for devices with non-compliant
BARs") disabled BAR sizing for BARs 0-5 of devices that don't comply with
the PCI spec. But it didn't do anything for expansion ROM BARs, so we
still try to size them, resulting in warnings like this on Broadwell-EP:
pci 0000:ff:12.0: BAR 6: failed to assign [mem size 0x00000001 pref]
Move the non-compliant BAR check from __pci_read_base() up to
pci_read_bases() so it applies to the expansion ROM BAR as well as
to BARs 0-5.
Note that direct callers of __pci_read_base(), like sriov_init(), will now
bypass this check. We haven't had reports of devices with broken SR-IOV
BARs yet.
[bhelgaas: changelog] Fixes: b84106b4e229 ("PCI: Disable IO/MEM decoding for devices with non-compliant BARs") Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Andi Kleen <ak@linux.intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Commit b894157145e4 ("x86/PCI: Mark Broadwell-EP Home Agent & PCU as having
non-compliant BARs") marked Home Agent 0 & PCU has having non-compliant
BARs. Home Agent 1 also has non-compliant BARs.
Mark Home Agent 1 as having non-compliant BARs so the PCI core doesn't
touch them.
The problem with these devices is documented in the Xeon v4 specification
update:
BDF2 PCI BARs in the Home Agent Will Return Non-Zero Values
During Enumeration
Problem: During system initialization the Operating System may access
the standard PCI BARs (Base Address Registers). Due to
this erratum, accesses to the Home Agent BAR registers (Bus
1; Device 18; Function 0,4; Offsets (0x14-0x24) will return
non-zero values.
Implication: The operating system may issue a warning. Intel has not
observed any functional failures due to this erratum.
Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e5-v4-spec-update.html Fixes: b894157145e4 ("x86/PCI: Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs") Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Andi Kleen <ak@linux.intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Prevent using uninitialized or negative index when handling
steering entries.
Fixes: b12d93d63c32 ('mlx4: Add support for promiscuous mode in the new steering model.') Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When this_order variable in blk_mq_init_rq_map() becomes zero
the code incorrectly decrements the variable and passes the result
to order_to_size() helper causing undefined behaviour:
UBSAN: Undefined behaviour in block/blk-mq.c:1459:27
shift exponent 4294967295 is too large for 32-bit type 'unsigned int'
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc6-00072-g33656a1 #22
Fix the code by checking this_order variable for not having the zero
value first.
Reported-by: Meelis Roos <mroos@linux.ee> Fixes: 320ae51feed5 ("blk-mq: new multi-queue block IO queueing mechanism") Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Some eMMCs set the partition switch timeout too low.
Now typically eMMCs are considered a critical component (e.g. because
they store the root file system) and consequently are expected to be
reliable. Thus we can neglect the use case where eMMCs can't switch
reliably and we might want a lower timeout to facilitate speedy
recovery.
Although we could employ a quirk for the cards that are affected (if
we could identify them all), as described above, there is little
benefit to having a low timeout, so instead simply set a minimum
timeout.
The minimum is set to 300ms somewhat arbitrarily - the examples that
have been seen had a timeout of 10ms but were sometimes taking 60-70ms.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fix array overrun when going over callback table.
In declaration of callback table, the max size isn't provided and
in registration phase, it is provided.
There is potential scenario where a new operation is added
and it is not supported by current client. The acceptance of
such operation by ib_netlink will cause to array overrun.
Fixes: 809d5fc9bf65 ("infiniband: pass rdma_cm module to netlink_dump_start") Fixes: b493d91d333e ("iwcm: common code for port mapper") Fixes: 2ca546b92a02 ("IB/sa: Route SA pathrecord query through netlink") Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.16:
- Only cma.c needs to be fixed
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In case ibnl_put_msg fails in send_nlmsg_done,
the function returns with -ENOMEM without freeing.
This patch fixes this behavior.
Fixes: 30dc5e63d6a5 ("RDMA/core: Add support for iWARP Port Mapper user space service") Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently c4iw_peer_abort_intr() does not wake up the waiter if the
endpoint state indicates we're using MPAv2 and we're currently trying to
connect. This was introduced with commit 7c0a33d61187a ("RDMA/cxgb4:
Don't wakeup threads for MPAv2")
However, this original fix is flawed because it introduces a race that
can cause a deadlock of the iwarp stack. Here is the race:
->local side sets up an active offload connection.
->local side sends MPA_START request.
->peer sends MPA_START response.
->local side ingress cpl thread begins processing the MPA_START response,
but before it changes the state from MPA_REQ_SENT to FPDU_MODE:
->peer sends a RST which results in a ABORT_REQ_RSS. This triggers
peer_abort_intr() which sees the state in MPA_REQ_SENT and since mpa_rev
is 2, it will avoid waking up the endpoint with -ECONNRESET, assuming the
stack will re-attempt the connection using MPAv1.
->Meanwhile, the cpl thread moves the state to FPDU_MODE and calls
c4iw_modify_rc_qp() which calls rdma_init() which sends a RI_WR/INIT WR
to firmware. But since HW sent an abort, FW correctly drops the RI_WR/INIT
WR.
->So the cpl thread is stuck waiting for a reply and cannot process the
ABORT_REQ_RSS cpl sitting in its input queue. Thus everything comes to a
halt because no more ingress cpls are processed by the stack...
The correct fix for the issue is to always do the wake up in
c4iw_abort_intr() but reinitialize the wait object in c4iw_reconnect().
Fixes: 7c0a33d61187a ("RDMA/cxgb4: Don't wakeup threads for MPAv2") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Thus it overflows and the resulting number is less than 4080, which makes
3823 / 4080 = 0
an nr_pages is set to this. As we already checked against the minimum that
nr_pages may be, this causes the logic to fail as well, and we crash the
kernel.
There's no reason to have the two DIV_ROUND_UP() (that's just result of
historical code changes), clean up the code and fix this bug.
Fixes: 83f40318dab00 ("ring-buffer: Make removal of ring buffer pages atomic") Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The size variable to change the ring buffer in ftrace is a long. The
nr_pages used to update the ring buffer based on the size is int. On 64 bit
machines this can cause an overflow problem.
For example, the following will cause the ring buffer to crash:
WARNING: CPU: 1 PID: 318 at kernel/trace/ring_buffer.c:1527 rb_update_pages+0x22f/0x260
Which is:
RB_WARN_ON(cpu_buffer, nr_removed);
Note each ring buffer page holds 4080 bytes.
This is because:
1) 10 causes the ring buffer to have 3 pages.
(10kb requires 3 * 4080 pages to hold)
2) (2^31 / 2^10 + 1) * 4080 = 8556384240
The value written into buffer_size_kb is shifted by 10 and then passed
to ring_buffer_resize(). 8556384240 * 2^10 = 8761737461760
3) The size passed to ring_buffer_resize() is then divided by BUF_PAGE_SIZE
which is 4080. 8761737461760 / 4080 = 2147484672
4) nr_pages is subtracted from the current nr_pages (3) and we get: 2147484669. This value is saved in a signed integer nr_pages_to_update
5) 2147484669 is greater than 2^31 but smaller than 2^32, a signed int
turns into the value of -2147482627
6) As the value is a negative number, in update_pages_handler() it is
negated and passed to rb_remove_pages() and 2147482627 pages will
be removed, which is much larger than 3 and it causes the warning
because not all the pages asked to be removed were removed.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=118001 Fixes: 7a8e76a3829f1 ("tracing: unified trace buffer") Reported-by: Hao Qin <QEver.cn@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When emulating a jalr instruction with rd == $0, the code in
isBranchInstr was incorrectly writing to GPR $0 which should actually
always remain zeroed. This would lead to any further instructions
emulated which use $0 operating on a bogus value until the task is next
context switched, at which point the value of $0 in the task context
would be restored to the correct zero by a store in SAVE_SOME. Fix this
by not writing to rd if it is $0.
Fixes: 102cedc32a6e ("MIPS: microMIPS: Floating point support.") Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: Maciej W. Rozycki <macro@imgtec.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13160/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The lazy cache flushing implemented in the MIPS kernel suffers from a
race condition that is exposed by do_set_pte() in mm/memory.c.
A pre-condition is a file-system that writes to the page from the CPU
in its readpage method and then calls flush_dcache_page(). One example
is ubifs. Another pre-condition is that the dcache flush is postponed
in __flush_dcache_page().
Upon a page fault for an executable mapping not existing in the
page-cache, the following will happen:
1. Write to the page
2. flush_dcache_page
3. flush_icache_page
4. set_pte_at
5. update_mmu_cache (commits the flush of a dcache-dirty page)
Between steps 4 and 5 another thread can hit the same page and it will
encounter a valid pte. Because the data still is in the L1 dcache the CPU
will fetch stale data from L2 into the icache and execute garbage.
This fix moves the commit of the cache flush to step 3 to close the
race window. It also reduces the amount of flushes on non-executable
mappings because we never enter __flush_dcache_page() for non-aliasing
CPUs.
Regressions can occur in drivers that mistakenly relies on the
flush_dcache_page() in get_user_pages() for DMA operations.
[ralf@linux-mips.org: Folded in patch 9346 to fix highmem issue.]
Commit 39baadbf36ce ("powerpc/eeh: Remove eeh information from pci_dn")
changed the pci_dn struct by removing its EEH-related members.
As part of this clean-up, DDW mechanism was modified to read the device
configuration address from eeh_dev struct.
As a consequence, now if we disable EEH mechanism on kernel command-line
for example, the DDW mechanism will fail, generating a kernel oops by
dereferencing a NULL pointer (which turns to be the eeh_dev pointer).
This patch just changes the configuration address calculation on DDW
functions to a manual calculation based on pci_dn members instead of
using eeh_dev-based address.
No functional changes were made. This was tested on pSeries, both
in PHyp and qemu guest.
Fixes: 39baadbf36ce ("powerpc/eeh: Remove eeh information from pci_dn") Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>