]> git.ipfire.org Git - thirdparty/kernel/stable.git/log
thirdparty/kernel/stable.git
10 years agoiio: adis16400: Fix adis16448 gyroscope scale
Lars-Peter Clausen [Wed, 5 Aug 2015 13:38:13 +0000 (15:38 +0200)] 
iio: adis16400: Fix adis16448 gyroscope scale

commit 8166537283b31d7abaae9e56bd48fbbc30cdc579 upstream.

Use the correct scale for the adis16448 gyroscope output.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agodevres: fix devres_get()
Masahiro Yamada [Wed, 15 Jul 2015 01:29:00 +0000 (10:29 +0900)] 
devres: fix devres_get()

commit 64526370d11ce8868ca495723d595b61e8697fbf upstream.

Currently, devres_get() passes devres_free() the pointer to devres,
but devres_free() should be given with the pointer to resource data.

Fixes: 9ac7849e35f7 ("devres: device resource management")
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoauxdisplay: ks0108: fix refcount
Sudip Mukherjee [Mon, 20 Jul 2015 11:57:21 +0000 (17:27 +0530)] 
auxdisplay: ks0108: fix refcount

commit bab383de3b84e584b0f09227151020b2a43dc34c upstream.

parport_find_base() will implicitly do parport_get_port() which
increases the refcount. Then parport_register_device() will again
increment the refcount. But while unloading the module we are only
doing parport_unregister_device() decrementing the refcount only once.
We add an parport_put_port() to neutralize the effect of
parport_get_port().

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoKVM: MMU: fix validation of mmio page fault
Xiao Guangrong [Wed, 5 Aug 2015 04:04:19 +0000 (12:04 +0800)] 
KVM: MMU: fix validation of mmio page fault

commit 6f691251c0350ac52a007c54bf3ef62e9d8cdc5e upstream.

We got the bug that qemu complained with "KVM: unknown exit, hardware
reason 31" and KVM shown these info:
[84245.284948] EPT: Misconfiguration.
[84245.285056] EPT: GPA: 0xfeda848
[84245.285154] ept_misconfig_inspect_spte: spte 0x5eaef50107 level 4
[84245.285344] ept_misconfig_inspect_spte: spte 0x5f5fadc107 level 3
[84245.285532] ept_misconfig_inspect_spte: spte 0x5141d18107 level 2
[84245.285723] ept_misconfig_inspect_spte: spte 0x52e40dad77 level 1

This is because we got a mmio #PF and the handler see the mmio spte becomes
normal (points to the ram page)

However, this is valid after introducing fast mmio spte invalidation which
increases the generation-number instead of zapping mmio sptes, a example
is as follows:
1. QEMU drops mmio region by adding a new memslot
2. invalidate all mmio sptes
3.

        VCPU 0                        VCPU 1
    access the invalid mmio spte
                            access the region originally was MMIO before
                            set the spte to the normal ram map

    mmio #PF
    check the spte and see it becomes normal ram mapping !!!

This patch fixes the bug just by dropping the check in mmio handler, it's
good for backport. Full check will be introduced in later patches

Reported-by: Pavel Shirshov <ru.pchel@gmail.com>
Tested-by: Pavel Shirshov <ru.pchel@gmail.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoserial: 8250_pci: Add support for Pericom PI7C9X795[1248]
Adam Lee [Mon, 3 Aug 2015 05:28:13 +0000 (13:28 +0800)] 
serial: 8250_pci: Add support for Pericom PI7C9X795[1248]

commit 89c043a6cb2d4525d48a38ed78d5f0f5672338b3 upstream.

Pericom PI7C9X795[1248] are Uno/Dual/Quad/Octal UART devices, this
patch enables them, also defines PCI_VENDOR_ID_PERICOM here.

Signed-off-by: Adam Lee <adam.lee@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoDoc: ABI: testing: configfs-usb-gadget-sourcesink
Peter Chen [Fri, 31 Jul 2015 08:36:29 +0000 (16:36 +0800)] 
Doc: ABI: testing: configfs-usb-gadget-sourcesink

commit 4bc58eb16bb2352854b9c664cc36c1c68d2bfbb7 upstream.

Fix the name of attribute

Signed-off-by: Peter Chen <peter.chen@freescale.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoDoc: ABI: testing: configfs-usb-gadget-loopback
Peter Chen [Fri, 31 Jul 2015 08:36:28 +0000 (16:36 +0800)] 
Doc: ABI: testing: configfs-usb-gadget-loopback

commit 8cd50626823c00ca7472b2f61cb8c0eb9798ddc0 upstream.

Fix the name of attribute

Signed-off-by: Peter Chen <peter.chen@freescale.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agousb: dwc3: ep0: Fix mem corruption on OUT transfers of more than 512 bytes
Kishon Vijay Abraham I [Mon, 27 Jul 2015 06:55:27 +0000 (12:25 +0530)] 
usb: dwc3: ep0: Fix mem corruption on OUT transfers of more than 512 bytes

commit b2fb5b1a0f50d3ebc12342c8d8dead245e9c9d4e upstream.

DWC3 uses bounce buffer to handle non max packet aligned OUT transfers and
the size of bounce buffer is 512 bytes. However if the host initiates OUT
transfers of size more than 512 bytes (and non max packet aligned), the
driver throws a WARN dump but still programs the TRB to receive more than
512 bytes. This will cause bounce buffer to overflow and corrupt the
adjacent memory locations which can be fatal.

Fix it by programming the TRB to receive a maximum of DWC3_EP0_BOUNCE_SIZE
(512) bytes.

Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoclk: exynos4: Fix wrong clock for Exynos4x12 ADC
Krzysztof Kozlowski [Fri, 12 Jun 2015 01:53:25 +0000 (10:53 +0900)] 
clk: exynos4: Fix wrong clock for Exynos4x12 ADC

commit e323d56eb06b266b77c2b430cb5f1977ba549e03 upstream.

The TSADC gate clock was used in Exynos4x12 DTSI for exynos-adc driver.
However TSADC is present only on Exynos4210 so on Trats2 board (with
Exynos4412 SoC) the exynos-adc driver could not be probed:
   ERROR: could not get clock /adc@126C0000:adc(0)
   exynos-adc 126c0000.adc: failed getting clock, err = -2
   exynos-adc: probe of 126c0000.adc failed with error -2

Instead on Exynos4x12 SoCs the main clock used by Analog to Digital
Converter is located in different register and it is named in datasheet
as PCLK_ADC. Regardless of the name the purpose of this PCLK_ADC clock
is the same as purpose of TSADC from Exynos4210.

The patch adds gate clock for Exynos4x12 using the proper register so
backward compatibility is preserved. This fixes the probe of exynos-adc
driver on Exynos4x12 boards and allows accessing sensors connected to it
on Trats2 board (ntc,ncp15wb473 AP and battery thermistors).

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Fixes: c63c57433003 ("ARM: dts: Add ADC's dt data to read raw data for exynos4x12")
Reviewed-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Acked-by: Tomasz Figa <tomasz.figa@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agostaging: comedi: usbduxsigma: don't clobber ao_timer in command test
Ian Abbott [Wed, 16 Sep 2015 13:18:04 +0000 (14:18 +0100)] 
staging: comedi: usbduxsigma: don't clobber ao_timer in command test

commit c04a1f17803e0d3eeada586ca34a6b436959bc20 upstream.

`devpriv->ao_timer` is used while an asynchronous command is running on
the AO subdevice.  It also gets modified by the subdevice's `cmdtest`
handler for checking new asynchronous commands,
`usbduxsigma_ao_cmdtest()`, which is not correct as it's allowed to
check new commands while an old command is still running.  Fix it by
moving the code which sets up `devpriv->ao_timer` into the subdevice's
`cmd` handler, `usbduxsigma_ao_cmd()`.

** This backported patch also moves the code that sets up
`devpriv->ao_sample_count` from `usbduxsigma_ao_cmdtest()` to
`usbduxsigma_ao_cmd()` for the same reason as above.  (This was not
needed in the upstream commit.) **

Note that the removed code in `usbduxsigma_ao_cmdtest()` checked that
`devpriv->ao_timer` did not end up less that 1, but that could not
happen due because `cmd->scan_begin_arg` or `cmd->convert_arg` had
already been range-checked.

Also note that we tested the `high_speed` variable in the old code, but
that is currently always 0 and means that we always use "scan" timing
(`cmd->scan_begin_src == TRIG_TIMER` and `cmd->convert_src == TRIG_NOW`)
and never "convert" (individual sample) timing (`cmd->scan_begin_src ==
TRIG_FOLLOW` and `cmd->convert_src == TRIG_TIMER`).  The moved code
tests `cmd->convert_src` instead to decide whether "scan" or "convert"
timing is being used, although currently only "scan" timing is
supported.

Fixes: fb1ef622e7a3 ("staging: comedi: usbduxsigma: tidy up analog output command support")
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agostaging: comedi: usbduxsigma: don't clobber ai_timer in command test
Ian Abbott [Wed, 16 Sep 2015 13:17:14 +0000 (14:17 +0100)] 
staging: comedi: usbduxsigma: don't clobber ai_timer in command test

commit 423b24c37dd5794a674c74b0ed56392003a69891 upstream.

`devpriv->ai_timer` is used while an asynchronous command is running on
the AI subdevice.  It also gets modified by the subdevice's `cmdtest`
handler for checking new asynchronous commands
(`usbduxsigma_ai_cmdtest()`), which is not correct as it's allowed to
check new commands while an old command is still running.  Fix it by
moving the code which sets up `devpriv->ai_timer` and
`devpriv->ai_interval` into the subdevice's `cmd` handler,
`usbduxsigma_ai_cmd()`.

** This backported patch also moves the code that sets up
`devpriv->ai_sample_count` from `usbduxsigma_ai_cmdtest()` to
`usbduxsigma_ai_cmd()` for the same reason as above. (This was not
needed in the upstream commit.) **

Note that the removed code in `usbduxsigma_ai_cmdtest()` checked that
`devpriv->ai_timer` did not end up less than than 1, but that could not
happen because `cmd->scan_begin_arg` had already been checked to be at
least the minimum required value (at least when `cmd->scan_begin_src ==
TRIG_TIMER`, which had also been checked to be the case).

Fixes: b986be8527c7 ("staging: comedi: usbduxsigma: tidy up analog input command support)
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoPCI: Add VPD function 0 quirk for Intel Ethernet devices
Mark Rustad [Mon, 13 Jul 2015 18:40:07 +0000 (11:40 -0700)] 
PCI: Add VPD function 0 quirk for Intel Ethernet devices

commit 7aa6ca4d39edf01f997b9e02cf6d2fdeb224f351 upstream.

Set the PCI_DEV_FLAGS_VPD_REF_F0 flag on all Intel Ethernet device
functions other than function 0, so that on multi-function devices, we will
always read VPD from function 0 instead of from the other functions.

[bhelgaas: changelog]
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoPCI: Add dev_flags bit to access VPD through function 0
Mark Rustad [Mon, 13 Jul 2015 18:40:02 +0000 (11:40 -0700)] 
PCI: Add dev_flags bit to access VPD through function 0

commit 932c435caba8a2ce473a91753bad0173269ef334 upstream.

Add a dev_flags bit, PCI_DEV_FLAGS_VPD_REF_F0, to access VPD through
function 0 to provide VPD access on other functions.  This is for hardware
devices that provide copies of the same VPD capability registers in
multiple functions.  Because the kernel expects that each function has its
own registers, both the locking and the state tracking are affected by VPD
accesses to different functions.

On such devices for example, if a VPD write is performed on function 0,
*any* later attempt to read VPD from any other function of that device will
hang.  This has to do with how the kernel tracks the expected value of the
F bit per function.

Concurrent accesses to different functions of the same device can not only
hang but also corrupt both read and write VPD data.

When hangs occur, typically the error message:

  vpd r/w failed.  This is likely a firmware bug on this device.

will be seen.

Never set this bit on function 0 or there will be an infinite recursion.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agomac80211: enable assoc check for mesh interfaces
Bob Copeland [Sat, 13 Jun 2015 14:16:31 +0000 (10:16 -0400)] 
mac80211: enable assoc check for mesh interfaces

commit 3633ebebab2bbe88124388b7620442315c968e8f upstream.

We already set a station to be associated when peering completes, both
in user space and in the kernel.  Thus we should always have an
associated sta before sending data frames to that station.

Failure to check assoc state can cause crashes in the lower-level driver
due to transmitting unicast data frames before driver sta structures
(e.g. ampdu state in ath9k) are initialized.  This occurred when
forwarding in the presence of fixed mesh paths: frames were transmitted
to stations with whom we hadn't yet completed peering.

Reported-by: Alexis Green <agreen@cococorp.com>
Tested-by: Jesse Jones <jjones@cococorp.com>
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoARM: OMAP2+: DRA7: clockdomain: change l4per2_7xx_clkdm to SW_WKUP
Vignesh R [Wed, 3 Jun 2015 11:51:20 +0000 (17:21 +0530)] 
ARM: OMAP2+: DRA7: clockdomain: change l4per2_7xx_clkdm to SW_WKUP

commit b9e23f321940d2db2c9def8ff723b8464fb86343 upstream.

Legacy IPs like PWMSS, present under l4per2_7xx_clkdm, cannot support
smart-idle when its clock domain is in HW_AUTO on DRA7 SoCs. Hence,
program clock domain to SW_WKUP.

Signed-off-by: Vignesh R <vigneshr@ti.com>
Acked-by: Tero Kristo <t-kristo@ti.com>
Reviewed-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoxtensa: fix threadptr reload on return to userspace
Max Filippov [Sat, 4 Jul 2015 12:27:39 +0000 (15:27 +0300)] 
xtensa: fix threadptr reload on return to userspace

commit 4229fb12a03e5da5882b420b0aa4a02e77447b86 upstream.

Userspace return code may skip restoring THREADPTR register if there are
no registers that need to be zeroed. This leads to spurious failures in
libc NPTL tests.

Always restore THREADPTR on return to userspace.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoHID: cp2112: fix byte order in SMBUS operations
Ellen Wang [Fri, 10 Jul 2015 05:04:31 +0000 (22:04 -0700)] 
HID: cp2112: fix byte order in SMBUS operations

commit 29e2d6d1f6f61ba2b5cc9d9867e01d8c31a6c4f7 upstream.

Change all occurrences of be16 to le16 in cp2112_xfer(),
because SMBUS words are little endian, not big endian.

Signed-off-by: Ellen Wang <ellen@cumulusnetworks.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoHID: cp2112: fix I2C_SMBUS_BYTE write
Ellen Wang [Mon, 13 Jul 2015 22:23:54 +0000 (15:23 -0700)] 
HID: cp2112: fix I2C_SMBUS_BYTE write

commit 6d00d153f00097d259f86304e11858a50a1b8ad1 upstream.

When doing an I2C_SMBUS_BYTE write (one byte write, no address),
the data to be written is in "command" not "data->byte".

Signed-off-by: Ellen Wang <ellen@cumulusnetworks.com>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Reviewed-by: Antonio Borneo <borneo.antonio@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years ago[media] rc-core: fix remove uevent generation
David Härdeman [Tue, 19 May 2015 22:03:12 +0000 (19:03 -0300)] 
[media] rc-core: fix remove uevent generation

commit a66b0c41ad277ae62a3ae6ac430a71882f899557 upstream.

The input_dev is already gone when the rc device is being unregistered
so checking for its presence only means that no remove uevent will be
generated.

Signed-off-by: David Härdeman <david@hardeman.nu>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years ago[media] v4l: omap3isp: Fix sub-device power management code
Sakari Ailus [Fri, 12 Jun 2015 23:06:23 +0000 (20:06 -0300)] 
[media] v4l: omap3isp: Fix sub-device power management code

commit 9d39f05490115bf145e5ea03c0b7ec9d3d015b01 upstream.

Commit 813f5c0ac5cc ("media: Change media device link_notify behaviour")
modified the media controller link setup notification API and updated the
OMAP3 ISP driver accordingly. As a side effect it introduced a bug by
turning power on after setting the link instead of before. This results in
sub-devices not being powered down in some cases when they should be. Fix
it.

Fixes: 813f5c0ac5cc [media] media: Change media device link_notify behaviour
Signed-off-by: Sakari Ailus <sakari.ailus@iki.fi>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agovirtio-net: drop NETIF_F_FRAGLIST
Jason Wang [Wed, 5 Aug 2015 02:34:04 +0000 (10:34 +0800)] 
virtio-net: drop NETIF_F_FRAGLIST

commit 48900cb6af4282fa0fb6ff4d72a81aa3dadb5c39 upstream.

virtio declares support for NETIF_F_FRAGLIST, but assumes
that there are at most MAX_SKB_FRAGS + 2 fragments which isn't
always true with a fraglist.

A longer fraglist in the skb will make the call to skb_to_sgvec overflow
the sg array, leading to memory corruption.

Drop NETIF_F_FRAGLIST so we only get what we can handle.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Moritz Muehlenhoff <jmm@debian.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoipv6: addrconf: validate new MTU before applying it
Marcelo Leitner [Mon, 23 Feb 2015 14:17:13 +0000 (11:17 -0300)] 
ipv6: addrconf: validate new MTU before applying it

commit 77751427a1ff25b27d47a4c36b12c3c8667855ac upstream.

Currently we don't check if the new MTU is valid or not and this allows
one to configure a smaller than minimum allowed by RFCs or even bigger
than interface own MTU, which is a problem as it may lead to packet
drops.

If you have a daemon like NetworkManager running, this may be exploited
by remote attackers by forging RA packets with an invalid MTU, possibly
leading to a DoS. (NetworkManager currently only validates for values
too small, but not for too big ones.)

The fix is just to make sure the new value is valid. That is, between
IPV6_MIN_MTU and interface's MTU.

Note that similar check is already performed at
ndisc_router_discovery(), for when kernel itself parses the RA.

Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Moritz Muehlenhoff <jmm@debian.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoLinux 3.16.7-ckt17
Luis Henriques [Fri, 11 Sep 2015 09:10:46 +0000 (10:10 +0100)] 
Linux 3.16.7-ckt17

Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agosignal: fix information leak in copy_siginfo_from_user32
Amanieu d'Antras [Thu, 6 Aug 2015 22:46:26 +0000 (15:46 -0700)] 
signal: fix information leak in copy_siginfo_from_user32

commit 3c00cb5e68dc719f2fc73a33b1b230aadfcb1309 upstream.

This function can leak kernel stack data when the user siginfo_t has a
positive si_code value.  The top 16 bits of si_code descibe which fields
in the siginfo_t union are active, but they are treated inconsistently
between copy_siginfo_from_user32, copy_siginfo_to_user32 and
copy_siginfo_to_user.

copy_siginfo_from_user32 is called from rt_sigqueueinfo and
rt_tgsigqueueinfo in which the user has full control overthe top 16 bits
of si_code.

This fixes the following information leaks:
x86:   8 bytes leaked when sending a signal from a 32-bit process to
       itself. This leak grows to 16 bytes if the process uses x32.
       (si_code = __SI_CHLD)
x86:   100 bytes leaked when sending a signal from a 32-bit process to
       a 64-bit process. (si_code = -1)
sparc: 4 bytes leaked when sending a signal from a 32-bit process to a
       64-bit process. (si_code = any)

parsic and s390 have similar bugs, but they are not vulnerable because
rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code
to a different process.  These bugs are also fixed for consistency.

Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agosignal: fix information leak in copy_siginfo_to_user
Amanieu d'Antras [Thu, 6 Aug 2015 22:46:29 +0000 (15:46 -0700)] 
signal: fix information leak in copy_siginfo_to_user

commit 26135022f85105ad725cda103fa069e29e83bd16 upstream.

This function may copy the si_addr_lsb, si_lower and si_upper fields to
user mode when they haven't been initialized, which can leak kernel
stack data to user mode.

Just checking the value of si_code is insufficient because the same
si_code value is shared between multiple signals.  This is solved by
checking the value of si_signo in addition to si_code.

Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ luis: backported to 3.16:
  - dropped 2nd hunk in kernel/signal.c ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agosignalfd: fix information leak in signalfd_copyinfo
Amanieu d'Antras [Thu, 6 Aug 2015 22:46:33 +0000 (15:46 -0700)] 
signalfd: fix information leak in signalfd_copyinfo

commit 3ead7c52bdb0ab44f4bb1feed505a8323cc12ba7 upstream.

This function may copy the si_addr_lsb field to user mode when it hasn't
been initialized, which can leak kernel stack data to user mode.

Just checking the value of si_code is insufficient because the same
si_code value is shared between multiple signals.  This is solved by
checking the value of si_signo in addition to si_code.

Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoxen/gntdevt: Fix race condition in gntdev_release()
Marek Marczykowski-Górecki [Fri, 26 Jun 2015 01:28:24 +0000 (03:28 +0200)] 
xen/gntdevt: Fix race condition in gntdev_release()

commit 30b03d05e07467b8c6ec683ea96b5bffcbcd3931 upstream.

While gntdev_release() is called the MMU notifier is still registered
and can traverse priv->maps list even if no pages are mapped (which is
the case -- gntdev_release() is called after all). But
gntdev_release() will clear that list, so make sure that only one of
those things happens at the same time.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoxen/gntdev: convert priv->lock to a mutex
David Vrabel [Fri, 9 Jan 2015 18:06:12 +0000 (18:06 +0000)] 
xen/gntdev: convert priv->lock to a mutex

commit 1401c00e59ea021c575f74612fe2dbba36d6a4ee upstream.

Unmapping may require sleeping and we unmap while holding priv->lock, so
convert it to a mutex.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
[ luis: 3.16 prereq for:
  30b03d05e074 xen/gntdevt: Fix race condition in gntdev_release() ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agos390/sclp: fix compile error
Sebastian Ott [Thu, 25 Jun 2015 07:32:22 +0000 (09:32 +0200)] 
s390/sclp: fix compile error

commit a313bdc5310dd807655d3ca3eb2219cd65dfe45a upstream.

Fix this error when compiling with CONFIG_SMP=n and
CONFIG_DYNAMIC_DEBUG=y:

drivers/s390/char/sclp_early.c: In function 'sclp_read_info_early':
drivers/s390/char/sclp_early.c:87:19: error: 'EBUSY' undeclared (first use in this function)
   } while (rc == -EBUSY);
                   ^

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Stephen Powell <zlinuxman@wowway.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoip6_gre: release cached dst on tunnel removal
huaibin Wang [Tue, 25 Aug 2015 14:20:34 +0000 (16:20 +0200)] 
ip6_gre: release cached dst on tunnel removal

commit d4257295ba1b389c693b79de857a96e4b7cd8ac0 upstream.

When a tunnel is deleted, the cached dst entry should be released.

This problem may prevent the removal of a netns (seen with a x-netns IPv6
gre tunnel):
  unregister_netdevice: waiting for lo to become free. Usage count = 3

CC: Dmitry Kozlov <xeb@mail.ru>
Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Signed-off-by: huaibin Wang <huaibin.wang@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agonet: Fix RCU splat in af_key
David Ahern [Mon, 24 Aug 2015 21:17:17 +0000 (15:17 -0600)] 
net: Fix RCU splat in af_key

commit ba51b6be38c122f7dab40965b4397aaf6188a464 upstream.

Hit the following splat testing VRF change for ipsec:

[  113.475692] ===============================
[  113.476194] [ INFO: suspicious RCU usage. ]
[  113.476667] 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED Not tainted
[  113.477545] -------------------------------
[  113.478013] /work/monster-14/dsa/kernel.git/include/linux/rcupdate.h:568 Illegal context switch in RCU read-side critical section!
[  113.479288]
[  113.479288] other info that might help us debug this:
[  113.479288]
[  113.480207]
[  113.480207] rcu_scheduler_active = 1, debug_locks = 1
[  113.480931] 2 locks held by setkey/6829:
[  113.481371]  #0:  (&net->xfrm.xfrm_cfg_mutex){+.+.+.}, at: [<ffffffff814e9887>] pfkey_sendmsg+0xfb/0x213
[  113.482509]  #1:  (rcu_read_lock){......}, at: [<ffffffff814e767f>] rcu_read_lock+0x0/0x6e
[  113.483509]
[  113.483509] stack backtrace:
[  113.484041] CPU: 0 PID: 6829 Comm: setkey Not tainted 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED
[  113.485422] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
[  113.486845]  0000000000000001 ffff88001d4c7a98 ffffffff81518af2 ffffffff81086962
[  113.487732]  ffff88001d538480 ffff88001d4c7ac8 ffffffff8107ae75 ffffffff8180a154
[  113.488628]  0000000000000b30 0000000000000000 00000000000000d0 ffff88001d4c7ad8
[  113.489525] Call Trace:
[  113.489813]  [<ffffffff81518af2>] dump_stack+0x4c/0x65
[  113.490389]  [<ffffffff81086962>] ? console_unlock+0x3d6/0x405
[  113.491039]  [<ffffffff8107ae75>] lockdep_rcu_suspicious+0xfa/0x103
[  113.491735]  [<ffffffff81064032>] rcu_preempt_sleep_check+0x45/0x47
[  113.492442]  [<ffffffff8106404d>] ___might_sleep+0x19/0x1c8
[  113.493077]  [<ffffffff81064268>] __might_sleep+0x6c/0x82
[  113.493681]  [<ffffffff81133190>] cache_alloc_debugcheck_before.isra.50+0x1d/0x24
[  113.494508]  [<ffffffff81134876>] kmem_cache_alloc+0x31/0x18f
[  113.495149]  [<ffffffff814012b5>] skb_clone+0x64/0x80
[  113.495712]  [<ffffffff814e6f71>] pfkey_broadcast_one+0x3d/0xff
[  113.496380]  [<ffffffff814e7b84>] pfkey_broadcast+0xb5/0x11e
[  113.497024]  [<ffffffff814e82d1>] pfkey_register+0x191/0x1b1
[  113.497653]  [<ffffffff814e9770>] pfkey_process+0x162/0x17e
[  113.498274]  [<ffffffff814e9895>] pfkey_sendmsg+0x109/0x213

In pfkey_sendmsg the net mutex is taken and then pfkey_broadcast takes
the RCU lock.

Since pfkey_broadcast takes the RCU lock the allocation argument is
pointless since GFP_ATOMIC must be used between the rcu_read_{,un}lock.
The one call outside of rcu can be done with GFP_KERNEL.

Fixes: 7f6b9dbd5afbd ("af_key: locking change")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoperf: Fix PERF_EVENT_IOC_PERIOD migration race
Peter Zijlstra [Tue, 4 Aug 2015 17:22:49 +0000 (19:22 +0200)] 
perf: Fix PERF_EVENT_IOC_PERIOD migration race

commit c7999c6f3fed9e383d3131474588f282ae6d56b9 upstream.

I ran the perf fuzzer, which triggered some WARN()s which are due to
trying to stop/restart an event on the wrong CPU.

Use the normal IPI pattern to ensure we run the code on the correct CPU.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: bad7192b842c ("perf: Fix PERF_EVENT_IOC_PERIOD to force-reset the period")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agobatman-adv: protect tt_local_entry from concurrent delete events
Marek Lindner [Wed, 17 Jun 2015 12:01:36 +0000 (20:01 +0800)] 
batman-adv: protect tt_local_entry from concurrent delete events

commit ef72706a0543d0c3a5ab29bd6378fdfb368118d9 upstream.

The tt_local_entry deletion performed in batadv_tt_local_remove() was neither
protecting against simultaneous deletes nor checking whether the element was
still part of the list before calling hlist_del_rcu().

Replacing the hlist_del_rcu() call with batadv_hash_remove() provides adequate
protection via hash spinlocks as well as an is-element-still-in-hash check to
avoid 'blind' hash removal.

Fixes: 068ee6e204e1 ("batman-adv: roaming handling mechanism redesign")
Reported-by: alfonsname@web.de
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agobatman-adv: fix kernel crash due to missing NULL checks
Marek Lindner [Tue, 9 Jun 2015 13:24:36 +0000 (21:24 +0800)] 
batman-adv: fix kernel crash due to missing NULL checks

commit 354136bcc3c4f40a2813bba8f57ca5267d812d15 upstream.

batadv_softif_vlan_get() may return NULL which has to be verified
by the caller.

Fixes: 35df3b298fc8 ("batman-adv: fix TT VLAN inconsistency on VLAN re-add")
Reported-by: Ryan Thompson <ryan@eero.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoarm64: KVM: Fix host crash when injecting a fault into a 32bit guest
Marc Zyngier [Thu, 27 Aug 2015 15:10:01 +0000 (16:10 +0100)] 
arm64: KVM: Fix host crash when injecting a fault into a 32bit guest

commit 126c69a0bd0e441bf6766a5d9bf20de011be9f68 upstream.

When injecting a fault into a misbehaving 32bit guest, it seems
rather idiotic to also inject a 64bit fault that is only going
to corrupt the guest state. This leads to a situation where we
perform an illegal exception return at EL2 causing the host
to crash instead of killing the guest.

Just fix the stupid bug that has been there from day 1.

Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoAdd factory recertified Crucial M500s to blacklist
Guillermo A. Amaral [Wed, 26 Aug 2015 06:29:13 +0000 (23:29 -0700)] 
Add factory recertified Crucial M500s to blacklist

commit 7a7184b01aa9deb86df661c6f7cbcf69a95b728c upstream.

The Crucial M500 is known to have issues with queued TRIM commands, the
factory recertified SSDs use a different model number naming convention
which causes them to get ignored by the blacklist.

The new naming convention boils down to: s/Crucial_/FC/

Signed-off-by: Guillermo A. Amaral <g@maral.me>
Signed-off-by: Tejun Heo <tj@kernel.org>
[ luis: backported to 3.16:
  - dropped ATA_HORKAGE_ZERO_AFTER_TRIM flag
  - adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoALSA: usb-audio: Fix runtime PM unbalance
Takashi Iwai [Wed, 19 Aug 2015 05:20:14 +0000 (07:20 +0200)] 
ALSA: usb-audio: Fix runtime PM unbalance

commit 9003ebb13f61e8c78a641e0dda7775183ada0625 upstream.

The fix for deadlock in PM in commit [1ee23fe07ee8: ALSA: usb-audio:
Fix deadlocks at resuming] introduced a new check of in_pm flag.
However, the brainless patch author evaluated it in a wrong way
(logical AND instead of logical OR), thus usb_autopm_get_interface()
is wrongly called at probing, leading to unbalance of runtime PM
refcount.

This patch fixes it by correcting the logic.

Reported-by: Hans Yang <hansy@nvidia.com>
Fixes: 1ee23fe07ee8 ('ALSA: usb-audio: Fix deadlocks at resuming')
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoSCSI: Fix NULL pointer dereference in runtime PM
Alan Stern [Mon, 17 Aug 2015 15:02:42 +0000 (11:02 -0400)] 
SCSI: Fix NULL pointer dereference in runtime PM

commit 49718f0fb8c9af192b33d8af3a2826db04025371 upstream.

The routines in scsi_rpm.c assume that if a runtime-PM callback is
invoked for a SCSI device, it can only mean that the device's driver
has asked the block layer to handle the runtime power management (by
calling blk_pm_runtime_init(), which among other things sets q->dev).

However, this assumption turns out to be wrong for things like the ses
driver.  Normally ses devices are not allowed to do runtime PM, but
userspace can override this setting.  If this happens, the kernel gets
a NULL pointer dereference when blk_post_runtime_resume() tries to use
the uninitialized q->dev pointer.

This patch fixes the problem by calling the block layer's runtime-PM
routines only if the device's driver really does have a runtime-PM
callback routine.  Since ses doesn't define any such callbacks, the
crash won't occur.

This fixes Bugzilla #101371.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Stanisław Pitucha <viraptor@gmail.com>
Reported-by: Ilan Cohen <ilanco@gmail.com>
Tested-by: Ilan Cohen <ilanco@gmail.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agox86/ldt: Further fix FPU emulation
Andy Lutomirski [Fri, 14 Aug 2015 22:02:55 +0000 (15:02 -0700)] 
x86/ldt: Further fix FPU emulation

commit 12e244f4b550498bbaf654a52f93633f7dde2dc7 upstream.

The previous fix confused a selector with a segment prefix.  Fix it.

Compile-tested only.

Cc: Juergen Gross <jgross@suse.com>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: 4809146b86c3 ("x86/ldt: Correct FPU emulation access to LDT")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoMIPS: Fix seccomp syscall argument for MIPS64
Markos Chandras [Thu, 13 Aug 2015 07:47:59 +0000 (08:47 +0100)] 
MIPS: Fix seccomp syscall argument for MIPS64

commit 9f161439e4104b641a7bfb9b89581d801159fec8 upstream.

Commit 4c21b8fd8f14 ("MIPS: seccomp: Handle indirect system calls (o32)")
fixed indirect system calls on O32 but it also introduced a bug for MIPS64
where it erroneously modified the v0 (syscall) register with the assumption
that the sycall offset hasn't been taken into consideration. This breaks
seccomp on MIPS64 n64 and n32 ABIs. We fix this by replacing the addition
with a move instruction.

Fixes: 4c21b8fd8f14 ("MIPS: seccomp: Handle indirect system calls (o32)")
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10951/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoipc/sem.c: update/correct memory barriers
Manfred Spraul [Fri, 14 Aug 2015 22:35:10 +0000 (15:35 -0700)] 
ipc/sem.c: update/correct memory barriers

commit 3ed1f8a99d70ea1cd1508910eb107d0edcae5009 upstream.

sem_lock() did not properly pair memory barriers:

!spin_is_locked() and spin_unlock_wait() are both only control barriers.
The code needs an acquire barrier, otherwise the cpu might perform read
operations before the lock test.

As no primitive exists inside <include/spinlock.h> and since it seems
noone wants another primitive, the code creates a local primitive within
ipc/sem.c.

With regards to -stable:

The change of sem_wait_array() is a bugfix, the change to sem_lock() is a
nop (just a preprocessor redefinition to improve the readability).  The
bugfix is necessary for all kernels that use sem_wait_array() (i.e.:
starting from 3.10).

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Kirill Tkhai <ktkhai@parallels.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoipc/sem.c: change memory barrier in sem_lock() to smp_rmb()
Manfred Spraul [Sat, 13 Dec 2014 00:58:11 +0000 (16:58 -0800)] 
ipc/sem.c: change memory barrier in sem_lock() to smp_rmb()

commit 2e094abfd1f29a08a60523b42d4508281b8dee0e upstream.

When I fixed bugs in the sem_lock() logic, I was more conservative than
necessary.  Therefore it is safe to replace the smp_mb() with smp_rmb().
And: With smp_rmb(), semop() syscalls are up to 10% faster.

The race we must protect against is:

sem->lock is free
sma->complex_count = 0
sma->sem_perm.lock held by thread B

thread A:

A: spin_lock(&sem->lock)

B: sma->complex_count++; (now 1)
B: spin_unlock(&sma->sem_perm.lock);

A: spin_is_locked(&sma->sem_perm.lock);
A: XXXXX memory barrier
A: if (sma->complex_count == 0)

Thread A must read the increased complex_count value, i.e. the read must
not be reordered with the read of sem_perm.lock done by spin_is_locked().

Since it's about ordering of reads, smp_rmb() is sufficient.

[akpm@linux-foundation.org: update sem_lock() comment, from Davidlohr]
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ luis: 3.16 prereq for:
  3ed1f8a99d70 "ipc/sem.c: update/correct memory barriers" ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits
Herton R. Krzesinski [Fri, 14 Aug 2015 22:35:02 +0000 (15:35 -0700)] 
ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits

commit 602b8593d2b4138c10e922eeaafe306f6b51817b upstream.

The current semaphore code allows a potential use after free: in
exit_sem we may free the task's sem_undo_list while there is still
another task looping through the same semaphore set and cleaning the
sem_undo list at freeary function (the task called IPC_RMID for the same
semaphore set).

For example, with a test program [1] running which keeps forking a lot
of processes (which then do a semop call with SEM_UNDO flag), and with
the parent right after removing the semaphore set with IPC_RMID, and a
kernel built with CONFIG_SLAB, CONFIG_SLAB_DEBUG and
CONFIG_DEBUG_SPINLOCK, you can easily see something like the following
in the kernel log:

   Slab corruption (Not tainted): kmalloc-64 start=ffff88003b45c1c0, len=64
   000: 6b 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b  kkkkkkkk.kkkkkkk
   010: ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff  ....kkkk........
   Prev obj: start=ffff88003b45c180, len=64
   000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
   010: ff ff ff ff ff ff ff ff c0 fb 01 37 00 88 ff ff  ...........7....
   Next obj: start=ffff88003b45c200, len=64
   000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
   010: ff ff ff ff ff ff ff ff 68 29 a7 3c 00 88 ff ff  ........h).<....
   BUG: spinlock wrong CPU on CPU#2, test/18028
   general protection fault: 0000 [#1] SMP
   Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
   CPU: 2 PID: 18028 Comm: test Not tainted 4.2.0-rc5+ #1
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
   RIP: spin_dump+0x53/0xc0
   Call Trace:
     spin_bug+0x30/0x40
     do_raw_spin_unlock+0x71/0xa0
     _raw_spin_unlock+0xe/0x10
     freeary+0x82/0x2a0
     ? _raw_spin_lock+0xe/0x10
     semctl_down.clone.0+0xce/0x160
     ? __do_page_fault+0x19a/0x430
     ? __audit_syscall_entry+0xa8/0x100
     SyS_semctl+0x236/0x2c0
     ? syscall_trace_leave+0xde/0x130
     entry_SYSCALL_64_fastpath+0x12/0x71
   Code: 8b 80 88 03 00 00 48 8d 88 60 05 00 00 48 c7 c7 a0 2c a4 81 31 c0 65 8b 15 eb 40 f3 7e e8 08 31 68 00 4d 85 e4 44 8b 4b 08 74 5e <45> 8b 84 24 88 03 00 00 49 8d 8c 24 60 05 00 00 8b 53 04 48 89
   RIP  [<ffffffff810d6053>] spin_dump+0x53/0xc0
    RSP <ffff88003750fd68>
   ---[ end trace 783ebb76612867a0 ]---
   NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [test:18053]
   Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
   CPU: 3 PID: 18053 Comm: test Tainted: G      D         4.2.0-rc5+ #1
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
   RIP: native_read_tsc+0x0/0x20
   Call Trace:
     ? delay_tsc+0x40/0x70
     __delay+0xf/0x20
     do_raw_spin_lock+0x96/0x140
     _raw_spin_lock+0xe/0x10
     sem_lock_and_putref+0x11/0x70
     SYSC_semtimedop+0x7bf/0x960
     ? handle_mm_fault+0xbf6/0x1880
     ? dequeue_task_fair+0x79/0x4a0
     ? __do_page_fault+0x19a/0x430
     ? kfree_debugcheck+0x16/0x40
     ? __do_page_fault+0x19a/0x430
     ? __audit_syscall_entry+0xa8/0x100
     ? do_audit_syscall_entry+0x66/0x70
     ? syscall_trace_enter_phase1+0x139/0x160
     SyS_semtimedop+0xe/0x10
     SyS_semop+0x10/0x20
     entry_SYSCALL_64_fastpath+0x12/0x71
   Code: 47 10 83 e8 01 85 c0 89 47 10 75 08 65 48 89 3d 1f 74 ff 7e c9 c3 0f 1f 44 00 00 55 48 89 e5 e8 87 17 04 00 66 90 c9 c3 0f 1f 00 <55> 48 89 e5 0f 31 89 c1 48 89 d0 48 c1 e0 20 89 c9 48 09 c8 c9
   Kernel panic - not syncing: softlockup: hung tasks

I wasn't able to trigger any badness on a recent kernel without the
proper config debugs enabled, however I have softlockup reports on some
kernel versions, in the semaphore code, which are similar as above (the
scenario is seen on some servers running IBM DB2 which uses semaphore
syscalls).

The patch here fixes the race against freeary, by acquiring or waiting
on the sem_undo_list lock as necessary (exit_sem can race with freeary,
while freeary sets un->semid to -1 and removes the same sem_undo from
list_proc or when it removes the last sem_undo).

After the patch I'm unable to reproduce the problem using the test case
[1].

[1] Test case used below:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>
    #include <sys/wait.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>
    #include <errno.h>

    #define NSEM 1
    #define NSET 5

    int sid[NSET];

    void thread()
    {
            struct sembuf op;
            int s;
            uid_t pid = getuid();

            s = rand() % NSET;
            op.sem_num = pid % NSEM;
            op.sem_op = 1;
            op.sem_flg = SEM_UNDO;

            semop(sid[s], &op, 1);
            exit(EXIT_SUCCESS);
    }

    void create_set()
    {
            int i, j;
            pid_t p;
            union {
                    int val;
                    struct semid_ds *buf;
                    unsigned short int *array;
                    struct seminfo *__buf;
            } un;

            /* Create and initialize semaphore set */
            for (i = 0; i < NSET; i++) {
                    sid[i] = semget(IPC_PRIVATE , NSEM, 0644 | IPC_CREAT);
                    if (sid[i] < 0) {
                            perror("semget");
                            exit(EXIT_FAILURE);
                    }
            }
            un.val = 0;
            for (i = 0; i < NSET; i++) {
                    for (j = 0; j < NSEM; j++) {
                            if (semctl(sid[i], j, SETVAL, un) < 0)
                                    perror("semctl");
                    }
            }

            /* Launch threads that operate on semaphore set */
            for (i = 0; i < NSEM * NSET * NSET; i++) {
                    p = fork();
                    if (p < 0)
                            perror("fork");
                    if (p == 0)
                            thread();
            }

            /* Free semaphore set */
            for (i = 0; i < NSET; i++) {
                    if (semctl(sid[i], NSEM, IPC_RMID))
                            perror("IPC_RMID");
            }

            /* Wait for forked processes to exit */
            while (wait(NULL)) {
                    if (errno == ECHILD)
                            break;
            };
    }

    int main(int argc, char **argv)
    {
            pid_t p;

            srand(time(NULL));

            while (1) {
                    p = fork();
                    if (p < 0) {
                            perror("fork");
                            exit(EXIT_FAILURE);
                    }
                    if (p == 0) {
                            create_set();
                            goto end;
                    }

                    /* Wait for forked processes to exit */
                    while (wait(NULL)) {
                            if (errno == ECHILD)
                                    break;
                    };
            }
    end:
            return 0;
    }

[akpm@linux-foundation.org: use normal comment layout]
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Rafael Aquini <aquini@redhat.com>
CC: Aristeu Rozanski <aris@redhat.com>
Cc: David Jeffery <djeffery@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agomm/hwpoison: fix page refcount of unknown non LRU page
Wanpeng Li [Fri, 14 Aug 2015 22:34:56 +0000 (15:34 -0700)] 
mm/hwpoison: fix page refcount of unknown non LRU page

commit 4f32be677b124a49459e2603321c7a5605ceb9f8 upstream.

After trying to drain pages from pagevec/pageset, we try to get reference
count of the page again, however, the reference count of the page is not
reduced if the page is still not on LRU list.

Fix it by adding the put_page() to drop the page reference which is from
__get_any_page().

Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agodrm/vmwgfx: Fix execbuf locking issues
Thomas Hellstrom [Wed, 12 Aug 2015 05:31:17 +0000 (22:31 -0700)] 
drm/vmwgfx: Fix execbuf locking issues

commit 3e04e2fe6d87807d27521ad6ebb9e7919d628f25 upstream.

This addresses two issues that cause problems with viewperf maya-03 in
situation with memory pressure.

The first issue causes attempts to unreserve buffers if batched
reservation fails due to, for example, a signal pending. While previously
the ttm_eu api was resistant against this type of error, it is no longer
and the lockdep code will complain about attempting to unreserve buffers
that are not reserved. The issue is resolved by avoid calling
ttm_eu_backoff_reservation in the buffer reserve error path.

The second issue is that the binding_mutex may be held when user-space
fence objects are created and hence during memory reclaims. This may cause
recursive attempts to grab the binding mutex. The issue is resolved by not
holding the binding mutex across fence creation and submission.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agocrypto: caam - fix memory corruption in ahash_final_ctx
Horia Geant? [Tue, 11 Aug 2015 17:19:20 +0000 (20:19 +0300)] 
crypto: caam - fix memory corruption in ahash_final_ctx

commit b310c178e6d897f82abb9da3af1cd7c02b09f592 upstream.

When doing pointer operation for accessing the HW S/G table,
a value representing number of entries (and not number of bytes)
must be used.

Fixes: 045e36780f115 ("crypto: caam - ahash hmac support")
Signed-off-by: Horia Geant? <horia.geanta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoEDAC, ppc4xx: Access mci->csrows array elements properly
Michael Walle [Tue, 21 Jul 2015 09:00:53 +0000 (11:00 +0200)] 
EDAC, ppc4xx: Access mci->csrows array elements properly

commit 5c16179b550b9fd8114637a56b153c9768ea06a5 upstream.

The commit

  de3910eb79ac ("edac: change the mem allocation scheme to
 make Documentation/kobject.txt happy")

changed the memory allocation for the csrows member. But ppc4xx_edac was
forgotten in the patch. Fix it.

Signed-off-by: Michael Walle <michael@walle.cc>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Link: http://lkml.kernel.org/r/1437469253-8611-1-git-send-email-michael@walle.cc
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agolibfc: Fix fc_fcp_cleanup_each_cmd()
Bart Van Assche [Fri, 5 Jun 2015 21:20:51 +0000 (14:20 -0700)] 
libfc: Fix fc_fcp_cleanup_each_cmd()

commit 8f2777f53e3d5ad8ef2a176a4463a5c8e1a16431 upstream.

Since fc_fcp_cleanup_cmd() can sleep this function must not
be called while holding a spinlock. This patch avoids that
fc_fcp_cleanup_each_cmd() triggers the following bug:

BUG: scheduling while atomic: sg_reset/1512/0x00000202
1 lock held by sg_reset/1512:
 #0:  (&(&fsp->scsi_pkt_lock)->rlock){+.-...}, at: [<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Preemption disabled at:[<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Call Trace:
 [<ffffffff816c612c>] dump_stack+0x4f/0x7b
 [<ffffffff810828bc>] __schedule_bug+0x6c/0xd0
 [<ffffffff816c87aa>] __schedule+0x71a/0xa10
 [<ffffffff816c8ad2>] schedule+0x32/0x80
 [<ffffffffc0217eac>] fc_seq_set_resp+0xac/0x100 [libfc]
 [<ffffffffc0218b11>] fc_exch_done+0x41/0x60 [libfc]
 [<ffffffffc0225cff>] fc_fcp_cleanup_each_cmd.isra.21+0xcf/0x150 [libfc]
 [<ffffffffc0225f43>] fc_eh_device_reset+0x1c3/0x270 [libfc]
 [<ffffffff814a2cc9>] scsi_try_bus_device_reset+0x29/0x60
 [<ffffffff814a3908>] scsi_ioctl_reset+0x258/0x2d0
 [<ffffffff814a2650>] scsi_ioctl+0x150/0x440
 [<ffffffff814b3a9d>] sd_ioctl+0xad/0x120
 [<ffffffff8132f266>] blkdev_ioctl+0x1b6/0x810
 [<ffffffff811da608>] block_ioctl+0x38/0x40
 [<ffffffff811b4e08>] do_vfs_ioctl+0x2f8/0x530
 [<ffffffff811b50c1>] SyS_ioctl+0x81/0xa0
 [<ffffffff816cf8b2>] system_call_fastpath+0x16/0x7a

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agolibfc: Fix fc_exch_recv_req() error path
Bart Van Assche [Fri, 5 Jun 2015 21:20:46 +0000 (14:20 -0700)] 
libfc: Fix fc_exch_recv_req() error path

commit f6979adeaab578f8ca14fdd32b06ddee0d9d3314 upstream.

Due to patch "libfc: Do not invoke the response handler after
fc_exch_done()" (commit ID 7030fd62) the lport_recv() call
in fc_exch_recv_req() is passed a dangling pointer. Avoid this
by moving the fc_frame_free() call from fc_invoke_resp() to its
callers. This patch fixes the following crash:

general protection fault: 0000 [#3] PREEMPT SMP
RIP: fc_lport_recv_req+0x72/0x280 [libfc]
Call Trace:
 fc_exch_recv+0x642/0xde0 [libfc]
 fcoe_percpu_receive_thread+0x46a/0x5ed [fcoe]
 kthread+0x10a/0x120
 ret_from_fork+0x42/0x70

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agolibiscsi: Fix host busy blocking during connection teardown
John Soni Jose [Wed, 24 Jun 2015 01:11:58 +0000 (06:41 +0530)] 
libiscsi: Fix host busy blocking during connection teardown

commit 660d0831d1494a6837b2f810d08b5be092c1f31d upstream.

In case of hw iscsi offload, an host can have N-number of active
connections. There can be IO's running on some connections which
make host->host_busy always TRUE. Now if logout from a connection
is tried then the code gets into an infinite loop as host->host_busy
is always TRUE.

 iscsi_conn_teardown(....)
 {
   .........
    /*
     * Block until all in-progress commands for this connection
     * time out or fail.
     */
     for (;;) {
      spin_lock_irqsave(session->host->host_lock, flags);
      if (!atomic_read(&session->host->host_busy)) { /* OK for ERL == 0 */
      spin_unlock_irqrestore(session->host->host_lock, flags);
              break;
      }
     spin_unlock_irqrestore(session->host->host_lock, flags);
     msleep_interruptible(500);
     iscsi_conn_printk(KERN_INFO, conn, "iscsi conn_destroy(): "
                 "host_busy %d host_failed %d\n",
          atomic_read(&session->host->host_busy),
          session->host->host_failed);

................
...............
     }
  }

This is not an issue with software-iscsi/iser as each cxn is a separate
host.

Fix:
Acquiring eh_mutex in iscsi_conn_teardown() before setting
session->state = ISCSI_STATE_TERMINATE.

Signed-off-by: John Soni Jose <sony.john@avagotech.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agodrm/radeon: add new OLAND pci id
Alex Deucher [Mon, 10 Aug 2015 19:28:49 +0000 (15:28 -0400)] 
drm/radeon: add new OLAND pci id

commit e037239e5e7b61007763984aa35a8329596d8c88 upstream.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agodm btree: add ref counting ops for the leaves of top level btrees
Joe Thornber [Wed, 12 Aug 2015 14:12:09 +0000 (15:12 +0100)] 
dm btree: add ref counting ops for the leaves of top level btrees

commit b0dc3c8bc157c60b1d470163882be8c13e1950af upstream.

When using nested btrees, the top leaves of the top levels contain
block addresses for the root of the next tree down.  If we shadow a
shared leaf node the leaf values (sub tree roots) should be incremented
accordingly.

This is only an issue if there is metadata sharing in the top levels.
Which only occurs if metadata snapshots are being used (as is possible
with dm-thinp).  And could result in a block from the thinp metadata
snap being reused early, thus corrupting the thinp metadata snap.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
[ luis: backported to 3.16:
  - dropped changes to remove_one() as suggested by Mike Snitzer ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agodm thin metadata: delete btrees when releasing metadata snapshot
Joe Thornber [Wed, 12 Aug 2015 14:10:21 +0000 (15:10 +0100)] 
dm thin metadata: delete btrees when releasing metadata snapshot

commit 7f518ad0a212e2a6fd68630e176af1de395070a7 upstream.

The device details and mapping trees were just being decremented
before.  Now btree_del() is called to do a deep delete.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agolocalmodconfig: Use Kbuild files too
Richard Weinberger [Sun, 26 Jul 2015 22:06:55 +0000 (00:06 +0200)] 
localmodconfig: Use Kbuild files too

commit c0ddc8c745b7f89c50385fd7aa03c78dc543fa7a upstream.

In kbuild it is allowed to define objects in files named "Makefile"
and "Kbuild".
Currently localmodconfig reads objects only from "Makefile"s and misses
modules like nouveau.

Link: http://lkml.kernel.org/r/1437948415-16290-1-git-send-email-richard@nod.at
Reported-and-tested-by: Leonidas Spyropoulos <artafinde@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agox86/ldt: Correct FPU emulation access to LDT
Juergen Gross [Thu, 6 Aug 2015 17:54:34 +0000 (19:54 +0200)] 
x86/ldt: Correct FPU emulation access to LDT

commit 4809146b86c3d41ce588fdb767d021e2a80600dd upstream.

Commit 37868fe113ff ("x86/ldt: Make modify_ldt synchronous")
introduced a new struct ldt_struct anchored at mm->context.ldt.

Adapt the x86 fpu emulation code to use that new structure.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: billm@melbpc.org.au
Link: http://lkml.kernel.org/r/1438883674-1240-1-git-send-email-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agorcu: Move lockless_dereference() out of rcupdate.h
Peter Zijlstra [Wed, 27 May 2015 01:39:36 +0000 (11:09 +0930)] 
rcu: Move lockless_dereference() out of rcupdate.h

commit 0a04b0166929405cd833c1cc40f99e862b965ddc upstream.

I want to use lockless_dereference() from seqlock.h, which would mean
including rcupdate.h from it, however rcupdate.h already includes
seqlock.h.

Avoid this by moving lockless_dereference() into compiler.h. This is
somewhat tricky since it uses smp_read_barrier_depends() which isn't
available there, but its a CPP macro so we can get away with it.

The alternative would be moving it into asm/barrier.h, but that would
be updating each arch (I can do if people feel that is more
appropriate).

Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
[ luis: 3.16 prereq for:
  37868fe113ff "x86/ldt: Make modify_ldt synchronous" ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agorcu: Provide counterpart to rcu_dereference() for non-RCU situations
Paul E. McKenney [Tue, 28 Oct 2014 04:11:27 +0000 (21:11 -0700)] 
rcu: Provide counterpart to rcu_dereference() for non-RCU situations

commit 54ef6df3f3f1353d99c80c437259d317b2cd1cbd upstream.

Although rcu_dereference() and friends can be used in situations where
object lifetimes are being managed by something other than RCU, the
resulting sparse and lockdep-RCU noise can be annoying.  This commit
therefore supplies a lockless_dereference(), which provides the
protection for dereferences without the RCU-related debugging noise.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[ luis: 3.16 prereq for:
  37868fe113ff "x86/ldt: Make modify_ldt synchronous" ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agox86/ldt: Correct LDT access in single stepping logic
Juergen Gross [Thu, 6 Aug 2015 08:04:38 +0000 (10:04 +0200)] 
x86/ldt: Correct LDT access in single stepping logic

commit 136d9d83c07c5e30ac49fc83b27e8c4842f108fc upstream.

Commit 37868fe113ff ("x86/ldt: Make modify_ldt synchronous")
introduced a new struct ldt_struct anchored at mm->context.ldt.

convert_ip_to_linear() was changed to reflect this, but indexing
into the ldt has to be changed as the pointer is no longer void *.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@suse.de
Link: http://lkml.kernel.org/r/1438848278-12906-1-git-send-email-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agox86/ldt: Make modify_ldt synchronous
Andy Lutomirski [Thu, 30 Jul 2015 21:31:32 +0000 (14:31 -0700)] 
x86/ldt: Make modify_ldt synchronous

commit 37868fe113ff2ba814b3b4eb12df214df555f8dc upstream.

modify_ldt() has questionable locking and does not synchronize
threads.  Improve it: redesign the locking and synchronize all
threads' LDTs using an IPI on all modifications.

This will dramatically slow down modify_ldt in multithreaded
programs, but there shouldn't be any multithreaded programs that
care about modify_ldt's performance in the first place.

This fixes some fallout from the CVE-2015-5157 fixes.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: security@kernel.org <security@kernel.org>
Cc: xen-devel <xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/4c6978476782160600471bd865b318db34c7b628.1438291540.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[ luis: backported to 3.16:
  - dropped changes to comments in switch_mm()
  - included asm/mmu_context.h in perf_event.c
  - adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoKVM: x86: Use adjustment in guest cycles when handling MSR_IA32_TSC_ADJUST
Haozhong Zhang [Fri, 7 Aug 2015 03:24:32 +0000 (11:24 +0800)] 
KVM: x86: Use adjustment in guest cycles when handling MSR_IA32_TSC_ADJUST

commit d7add05458084a5e3d65925764a02ca9c8202c1e upstream.

When kvm_set_msr_common() handles a guest's write to
MSR_IA32_TSC_ADJUST, it will calcuate an adjustment based on the data
written by guest and then use it to adjust TSC offset by calling a
call-back adjust_tsc_offset(). The 3rd parameter of adjust_tsc_offset()
indicates whether the adjustment is in host TSC cycles or in guest TSC
cycles. If SVM TSC scaling is enabled, adjust_tsc_offset()
[i.e. svm_adjust_tsc_offset()] will first scale the adjustment;
otherwise, it will just use the unscaled one. As the MSR write here
comes from the guest, the adjustment is in guest TSC cycles. However,
the current kvm_set_msr_common() uses it as a value in host TSC
cycles (by using true as the 3rd parameter of adjust_tsc_offset()),
which can result in an incorrect adjustment of TSC offset if SVM TSC
scaling is enabled. This patch fixes this problem.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoperf: Fix fasync handling on inherited events
Peter Zijlstra [Thu, 11 Jun 2015 08:32:01 +0000 (10:32 +0200)] 
perf: Fix fasync handling on inherited events

commit fed66e2cdd4f127a43fd11b8d92a99bdd429528c upstream.

Vince reported that the fasync signal stuff doesn't work proper for
inherited events. So fix that.

Installing fasync allocates memory and sets filp->f_flags |= FASYNC,
which upon the demise of the file descriptor ensures the allocation is
freed and state is updated.

Now for perf, we can have the events stick around for a while after the
original FD is dead because of references from child events. So we
cannot copy the fasync pointer around. We can however consistently use
the parent's fasync, as that will be updated.

Reported-and-Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho deMelo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1434011521.1495.71.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agotarget: REPORT LUNS should return LUN 0 even for dynamic ACLs
Roland Dreier [Fri, 24 Jul 2015 19:11:46 +0000 (12:11 -0700)] 
target: REPORT LUNS should return LUN 0 even for dynamic ACLs

commit 9c395170a559d3b23dad100b01fc4a89d661c698 upstream.

If an initiator doesn't have any real LUNs assigned, we should report
LUN 0 and a LUN list length of 1.  Some versions of Solaris at least
go beserk if we report a LUN list length of 0.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agotarget/iscsi: Fix double free of a TUR followed by a solicited NOPOUT
Alexei Potashnik [Tue, 21 Jul 2015 22:07:56 +0000 (15:07 -0700)] 
target/iscsi: Fix double free of a TUR followed by a solicited NOPOUT

commit 9547308bda296b6f69876c840a0291fcfbeddbb8 upstream.

Make sure all non-READ SCSI commands get targ_xfer_tag initialized
to 0xffffffff, not just WRITEs.

Double-free of a TUR cmd object occurs under the following scenario:

1. TUR received (targ_xfer_tag is uninitialized and left at 0)
2. TUR status sent
3. First unsolicited NOPIN is sent to initiator (gets targ_xfer_tag of 0)
4. NOPOUT for NOPIN (with TTT=0) arrives
 - its ExpStatSN acks TUR status, TUR is queued for removal
 - LIO tries to find NOPIN with TTT=0, but finds the same TUR instead,
   TUR is queued for removal for the 2nd time

(Drop unbalanced conditional bracket usage - nab)

Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Signed-off-by: Spencer Baugh <sbaugh@catern.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
[ luis: backported to 3.16:
  - kept brackets as they are needed in 3.16 kernel ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoregmap: regcache-rbtree: Clean new present bits on present bitmap resize
Guenter Roeck [Mon, 27 Jul 2015 04:34:50 +0000 (21:34 -0700)] 
regmap: regcache-rbtree: Clean new present bits on present bitmap resize

commit 8ef9724bf9718af81cfc5132253372f79c71b7e2 upstream.

When inserting a new register into a block, the present bit map size is
increased using krealloc. krealloc does not clear the additionally
allocated memory, leaving it filled with random values. Result is that
some registers are considered cached even though this is not the case.

Fix the problem by clearing the additionally allocated memory. Also, if
the bitmap size does not increase, do not reallocate the bitmap at all
to reduce overhead.

Fixes: 3f4ff561bc88 ("regmap: rbtree: Make cache_present bitmap per node")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoxen-blkback: replace work_pending with work_busy in purge_persistent_gnt()
Bob Liu [Wed, 22 Jul 2015 06:40:10 +0000 (14:40 +0800)] 
xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()

commit 53bc7dc004fecf39e0ba70f2f8d120a1444315d3 upstream.

The BUG_ON() in purge_persistent_gnt() will be triggered when previous purge
work haven't finished.

There is a work_pending() before this BUG_ON, but it doesn't account if the work
is still currently running.

Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoxen-blkfront: don't add indirect pages to list when !feature_persistent
Bob Liu [Wed, 22 Jul 2015 06:40:09 +0000 (14:40 +0800)] 
xen-blkfront: don't add indirect pages to list when !feature_persistent

commit 7b0767502b5db11cb1f0daef2d01f6d71b1192dc upstream.

We should consider info->feature_persistent when adding indirect page to list
info->indirect_pages, else the BUG_ON() in blkif_free() would be triggered.

When we are using persistent grants the indirect_pages list
should always be empty because blkfront has pre-allocated enough
persistent pages to fill all requests on the ring.

Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agomfd: arizona: Fix initialisation of the PM runtime
Charles Keepax [Sun, 14 Jun 2015 14:41:50 +0000 (15:41 +0100)] 
mfd: arizona: Fix initialisation of the PM runtime

commit 72e43164fd472f6c2659c8313b87da962322dbcf upstream.

The PM runtime core by default assumes a chip is suspended when runtime
PM is enabled. Currently the arizona driver enables runtime PM when the
chip is fully active and then disables the DCVDD regulator at the end of
arizona_dev_init. This however has several problems, firstly the if we
reach the end of arizona_dev_init, we did not properly follow all the
proceedures for shutting down the chip, and most notably we never marked
the chip as cache only so any writes occurring between then and the next
PM runtime resume will be lost. Secondly, if we are already resumed when
we reach the end of dev_init, then at best we get unbalanced regulator
enable/disables at work we lose DCVDD whilst we need it.

Additionally, since the commit 4f0216409f7c ("mfd: arizona: Add better
support for system suspend"), the PM runtime operations may
disable/enable the IRQ, so the IRQs must now be enabled before we call
any PM operations.

This patch adds a call to pm_runtime_set_active to inform the PM core
that the device is starting up active and moves the PM enabling to
around the IRQ initialisation to avoid any PM callbacks happening until
the IRQs are initialised.

Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoima: extend "mask" policy matching support
Mimi Zohar [Wed, 5 Nov 2014 12:53:55 +0000 (07:53 -0500)] 
ima: extend "mask" policy matching support

commit 4351c294b8c1028077280f761e158d167b592974 upstream.

The current "mask" policy option matches files opened as MAY_READ,
MAY_WRITE, MAY_APPEND or MAY_EXEC.  This patch extends the "mask"
option to match files opened containing one of these modes.  For
example, "mask=^MAY_READ" would match files opened read-write.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Dr. Greg Wettstein <gw@idfusion.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoima: add support for new "euid" policy condition
Mimi Zohar [Wed, 5 Nov 2014 12:48:36 +0000 (07:48 -0500)] 
ima: add support for new "euid" policy condition

commit 139069eff7388407f19794384c42a534d618ccd7 upstream.

The new "euid" policy condition measures files with the specified
effective uid (euid).  In addition, for CAP_SETUID files it measures
files with the specified uid or suid.

Changelog:
- fixed checkpatch.pl warnings
- fixed avc denied {setuid} messages - based on Roberto's feedback

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Dr. Greg Wettstein <gw@idfusion.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoHID: usbhid: add Chicony/Pixart usb optical mouse that needs QUIRK_ALWAYS_POLL
Herton R. Krzesinski [Thu, 21 May 2015 18:04:15 +0000 (15:04 -0300)] 
HID: usbhid: add Chicony/Pixart usb optical mouse that needs QUIRK_ALWAYS_POLL

commit 7250dc3fee806eb2b7560ab7d6072302e7ae8cf8 upstream.

I received a report from an user of following mouse which needs this quirk:

usb 1-1.6: USB disconnect, device number 58
usb 1-1.6: new low speed USB device number 59 using ehci_hcd
usb 1-1.6: New USB device found, idVendor=04f2, idProduct=1053
usb 1-1.6: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 1-1.6: Product: USB Optical Mouse
usb 1-1.6: Manufacturer: PixArt
usb 1-1.6: configuration #1 chosen from 1 choice
input: PixArt USB Optical Mouse as /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.6/1-1.6:1.0/input/input5887
generic-usb 0003:04F2:1053.16FE: input,hidraw2: USB HID v1.11 Mouse [PixArt USB Optical Mouse] on usb-0000:00:1a.0-1.6/input0

The quirk was tested by the reporter and it fixed the frequent disconnections etc.

[jkosina@suse.cz: reorder the position in hid-ids.h]
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoext4: avoid deadlocks in the writeback path by using sb_getblk_gfp
Nikolay Borisov [Thu, 2 Jul 2015 05:34:07 +0000 (01:34 -0400)] 
ext4: avoid deadlocks in the writeback path by using sb_getblk_gfp

commit c45653c341f5c8a0ce19c8f0ad4678640849cb86 upstream.

Switch ext4 to using sb_getblk_gfp with GFP_NOFS added to fix possible
deadlocks in the page writeback path.

Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agobufferhead: Add _gfp version for sb_getblk()
Nikolay Borisov [Thu, 2 Jul 2015 05:32:44 +0000 (01:32 -0400)] 
bufferhead: Add _gfp version for sb_getblk()

commit bd7ade3cd9b0850264306f5c2b79024a417b6396 upstream.

sb_getblk() is used during ext4 (and possibly other FSes) writeback
paths. Sometimes such path require allocating memory and guaranteeing
that such allocation won't block. Currently, however, there is no way
to provide user flags for sb_getblk which could lead to deadlocks.

This patch implements a sb_getblk_gfp with the only difference it can
accept user-provided GFP flags.

Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agofs/buffer.c: support buffer cache allocations with gfp modifiers
Gioh Kim [Fri, 5 Sep 2014 02:04:42 +0000 (22:04 -0400)] 
fs/buffer.c: support buffer cache allocations with gfp modifiers

commit 3b5e6454aaf6b4439b19400d8365e2ec2d24e411 upstream.

A buffer cache is allocated from movable area because it is referred
for a while and released soon.  But some filesystems are taking buffer
cache for a long time and it can disturb page migration.

New APIs are introduced to allocate buffer cache with user specific
flag.  *_gfp APIs are for user want to set page allocation flag for
page cache allocation.  And *_unmovable APIs are for the user wants to
allocate page cache from non-movable area.

Signed-off-by: Gioh Kim <gioh.kim@lge.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
[ luis: 3.16 prereq for:
  bd7ade3cd9b0 "bufferhead: Add _gfp version for sb_getblk()" ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agomegaraid_sas: use raw_smp_processor_id()
Christoph Hellwig [Wed, 15 Apr 2015 16:44:37 +0000 (09:44 -0700)] 
megaraid_sas: use raw_smp_processor_id()

commit 16b8528d20607925899b1df93bfd8fbab98d267c upstream.

We only want to steer the I/O completion towards a queue, but don't
actually access any per-CPU data, so the raw_ version is fine to use
and avoids the warnings when using smp_processor_id().

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Andy Lutomirski <luto@kernel.org>
Tested-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Sumit Saxena <sumit.saxena@avagotech.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
[ luis: backported to 3.16:
  - dropped changes to megasas_build_dcdb_fusion() ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoqla2xxx: Mark port lost when we receive an RSCN for it.
Chad Dupuis [Thu, 25 Sep 2014 09:17:01 +0000 (05:17 -0400)] 
qla2xxx: Mark port lost when we receive an RSCN for it.

commit ef86cb2059a14b4024c7320999ee58e938873032 upstream.

Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoFix firmware loader uevent buffer NULL pointer dereference
Linus Torvalds [Thu, 9 Jul 2015 18:20:01 +0000 (11:20 -0700)] 
Fix firmware loader uevent buffer NULL pointer dereference

commit 6f957724b94cb19f5c1c97efd01dd4df8ced323c upstream.

The firmware class uevent function accessed the "fw_priv->buf" buffer
without the proper locking and testing for NULL.  This is an old bug
(looks like it goes back to 2012 and commit 1244691c73b2: "firmware
loader: introduce firmware_buf"), but for some reason it's triggering
only now in 4.2-rc1.

Shuah Khan is trying to bisect what it is that causes this to trigger
more easily, but in the meantime let's just fix the bug since others are
hitting it too (at least Ingo reports having seen it as well).

Reported-and-tested-by: Shuah Khan <shuahkh@osg.samsung.com>
Acked-by: Ming Lei <ming.lei@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agonet: gso: use feature flag argument in all protocol gso handlers
Florian Westphal [Mon, 20 Oct 2014 11:49:16 +0000 (13:49 +0200)] 
net: gso: use feature flag argument in all protocol gso handlers

commit 1e16aa3ddf863c6b9f37eddf52503230a62dedb3 upstream.

skb_gso_segment() has a 'features' argument representing offload features
available to the output path.

A few handlers, e.g. GRE, instead re-fetch the features of skb->dev and use
those instead of the provided ones when handing encapsulation/tunnels.

Depending on dev->hw_enc_features of the output device skb_gso_segment() can
then return NULL even when the caller has disabled all GSO feature bits,
as segmentation of inner header thinks device will take care of segmentation.

This e.g. affects the tbf scheduler, which will silently drop GRE-encap GSO skbs
that did not fit the remaining token quota as the segmentation does not work
when device supports corresponding hw offload capabilities.

Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: used davem's backport to 3.14 ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agobna: fix interrupts storm caused by erroneous packets
Ivan Vecera [Thu, 6 Aug 2015 20:48:23 +0000 (22:48 +0200)] 
bna: fix interrupts storm caused by erroneous packets

commit ade4dc3e616e33c80d7e62855fe1b6f9895bc7c3 upstream.

The commit "e29aa33 bna: Enable Multi Buffer RX" moved packets counter
increment from the beginning of the NAPI processing loop after the check
for erroneous packets so they are never accounted. This counter is used
to inform firmware about number of processed completions (packets).
As these packets are never acked the firmware fires IRQs for them again
and again.

Fixes: e29aa33 ("bna: Enable Multi Buffer RX")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoudp: fix dst races with multicast early demux
Eric Dumazet [Sat, 1 Aug 2015 10:14:33 +0000 (12:14 +0200)] 
udp: fix dst races with multicast early demux

commit 10e2eb878f3ca07ac2f05fa5ca5e6c4c9174a27a upstream.

Multicast dst are not cached. They carry DST_NOCACHE.

As mentioned in commit f8864972126899 ("ipv4: fix dst race in
sk_dst_get()"), these dst need special care before caching them
into a socket.

Caching them is allowed only if their refcnt was not 0, ie we
must use atomic_inc_not_zero()

Also, we must use READ_ONCE() to fetch sk->sk_rx_dst, as mentioned
in commit d0c294c53a771 ("tcp: prevent fetching dst twice in early demux
code")

Fixes: 421b3885bf6d ("udp: ipv4: Add udp early demux")
Tested-by: Gregory Hoggarth <Gregory.Hoggarth@alliedtelesis.co.nz>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Gregory Hoggarth <Gregory.Hoggarth@alliedtelesis.co.nz>
Reported-by: Alex Gartrell <agartrell@fb.com>
Cc: Michal Kubeček <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: used davem's backport to 3.14 ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agords: fix an integer overflow test in rds_info_getsockopt()
Dan Carpenter [Sat, 1 Aug 2015 12:33:26 +0000 (15:33 +0300)] 
rds: fix an integer overflow test in rds_info_getsockopt()

commit 468b732b6f76b138c0926eadf38ac88467dcd271 upstream.

"len" is a signed integer.  We check that len is not negative, so it
goes from zero to INT_MAX.  PAGE_SIZE is unsigned long so the comparison
is type promoted to unsigned long.  ULONG_MAX - 4095 is a higher than
INT_MAX so the condition can never be true.

I don't know if this is harmful but it seems safe to limit "len" to
INT_MAX - 4095.

Fixes: a8c879a7ee98 ('RDS: Info and stats')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agonetlink: don't hold mutex in rcu callback when releasing mmapd ring
Florian Westphal [Tue, 21 Jul 2015 14:33:50 +0000 (16:33 +0200)] 
netlink: don't hold mutex in rcu callback when releasing mmapd ring

commit 0470eb99b4721586ccac954faac3fa4472da0845 upstream.

Kirill A. Shutemov says:

This simple test-case trigers few locking asserts in kernel:

int main(int argc, char **argv)
{
        unsigned int block_size = 16 * 4096;
        struct nl_mmap_req req = {
                .nm_block_size          = block_size,
                .nm_block_nr            = 64,
                .nm_frame_size          = 16384,
                .nm_frame_nr            = 64 * block_size / 16384,
        };
        unsigned int ring_size;
int fd;

fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
        if (setsockopt(fd, SOL_NETLINK, NETLINK_RX_RING, &req, sizeof(req)) < 0)
                exit(1);
        if (setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, &req, sizeof(req)) < 0)
                exit(1);

ring_size = req.nm_block_nr * req.nm_block_size;
mmap(NULL, 2 * ring_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
return 0;
}

+++ exited with 0 +++
BUG: sleeping function called from invalid context at /home/kas/git/public/linux-mm/kernel/locking/mutex.c:616
in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init
3 locks held by init/1:
 #0:  (reboot_mutex){+.+...}, at: [<ffffffff81080959>] SyS_reboot+0xa9/0x220
 #1:  ((reboot_notifier_list).rwsem){.+.+..}, at: [<ffffffff8107f379>] __blocking_notifier_call_chain+0x39/0x70
 #2:  (rcu_callback){......}, at: [<ffffffff810d32e0>] rcu_do_batch.isra.49+0x160/0x10c0
Preemption disabled at:[<ffffffff8145365f>] __delay+0xf/0x20

CPU: 1 PID: 1 Comm: init Not tainted 4.1.0-00009-gbddf4c4818e0 #253
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Debian-1.8.2-1 04/01/2014
 ffff88017b3d8000 ffff88027bc03c38 ffffffff81929ceb 0000000000000102
 0000000000000000 ffff88027bc03c68 ffffffff81085a9d 0000000000000002
 ffffffff81ca2a20 0000000000000268 0000000000000000 ffff88027bc03c98
Call Trace:
 <IRQ>  [<ffffffff81929ceb>] dump_stack+0x4f/0x7b
 [<ffffffff81085a9d>] ___might_sleep+0x16d/0x270
 [<ffffffff81085bed>] __might_sleep+0x4d/0x90
 [<ffffffff8192e96f>] mutex_lock_nested+0x2f/0x430
 [<ffffffff81932fed>] ? _raw_spin_unlock_irqrestore+0x5d/0x80
 [<ffffffff81464143>] ? __this_cpu_preempt_check+0x13/0x20
 [<ffffffff8182fc3d>] netlink_set_ring+0x1ed/0x350
 [<ffffffff8182e000>] ? netlink_undo_bind+0x70/0x70
 [<ffffffff8182fe20>] netlink_sock_destruct+0x80/0x150
 [<ffffffff817e484d>] __sk_free+0x1d/0x160
 [<ffffffff817e49a9>] sk_free+0x19/0x20
[..]

Cong Wang says:

We can't hold mutex lock in a rcu callback, [..]

Thomas Graf says:

The socket should be dead at this point. It might be simpler to
add a netlink_release_ring() function which doesn't require
locking at all.

Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Diagnosed-by: Cong Wang <cwang@twopensource.com>
Suggested-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agobonding: correct the MAC address for "follow" fail_over_mac policy
dingtianhong [Thu, 16 Jul 2015 08:30:02 +0000 (16:30 +0800)] 
bonding: correct the MAC address for "follow" fail_over_mac policy

commit a951bc1e6ba58f11df5ed5ddc41311e10f5fd20b upstream.

The "follow" fail_over_mac policy is useful for multiport devices that
either become confused or incur a performance penalty when multiple
ports are programmed with the same MAC address, but the same MAC
address still may happened by this steps for this policy:

1) echo +eth0 > /sys/class/net/bond0/bonding/slaves
   bond0 has the same mac address with eth0, it is MAC1.

2) echo +eth1 > /sys/class/net/bond0/bonding/slaves
   eth1 is backup, eth1 has MAC2.

3) ifconfig eth0 down
   eth1 became active slave, bond will swap MAC for eth0 and eth1,
   so eth1 has MAC1, and eth0 has MAC2.

4) ifconfig eth1 down
   there is no active slave, and eth1 still has MAC1, eth2 has MAC2.

5) ifconfig eth0 up
   the eth0 became active slave again, the bond set eth0 to MAC1.

Something wrong here, then if you set eth1 up, the eth0 and eth1 will have the same
MAC address, it will break this policy for ACTIVE_BACKUP mode.

This patch will fix this problem by finding the old active slave and
swap them MAC address before change active slave.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Tested-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: used davem's backport to 3.14 ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoipv6: lock socket in ip6_datagram_connect()
Eric Dumazet [Tue, 14 Jul 2015 06:10:22 +0000 (08:10 +0200)] 
ipv6: lock socket in ip6_datagram_connect()

commit 03645a11a570d52e70631838cb786eb4253eb463 upstream.

ip6_datagram_connect() is doing a lot of socket changes without
socket being locked.

This looks wrong, at least for udp_lib_rehash() which could corrupt
lists because of concurrent udp_sk(sk)->udp_portaddr_hash accesses.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agonet: Fix skb_set_peeked use-after-free bug
Herbert Xu [Tue, 4 Aug 2015 07:42:47 +0000 (15:42 +0800)] 
net: Fix skb_set_peeked use-after-free bug

commit a0a2a6602496a45ae838a96db8b8173794b5d398 upstream.

The commit 738ac1ebb96d02e0d23bc320302a6ea94c612dec ("net: Clone
skb before setting peeked flag") introduced a use-after-free bug
in skb_recv_datagram.  This is because skb_set_peeked may create
a new skb and free the existing one.  As it stands the caller will
continue to use the old freed skb.

This patch fixes it by making skb_set_peeked return the new skb
(or the old one if unchanged).

Fixes: 738ac1ebb96d ("net: Clone skb before setting peeked flag")
Reported-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Brenden Blanco <bblanco@plumgrid.com>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agonet: Fix skb csum races when peeking
Herbert Xu [Mon, 13 Jul 2015 12:01:42 +0000 (20:01 +0800)] 
net: Fix skb csum races when peeking

commit 89c22d8c3b278212eef6a8cc66b570bc840a6f5a upstream.

When we calculate the checksum on the recv path, we store the
result in the skb as an optimisation in case we need the checksum
again down the line.

This is in fact bogus for the MSG_PEEK case as this is done without
any locking.  So multiple threads can peek and then store the result
to the same skb, potentially resulting in bogus skb states.

This patch fixes this by only storing the result if the skb is not
shared.  This preserves the optimisations for the few cases where
it can be done safely due to locking or other reasons, e.g., SIOCINQ.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agonet: Clone skb before setting peeked flag
Herbert Xu [Mon, 13 Jul 2015 08:04:13 +0000 (16:04 +0800)] 
net: Clone skb before setting peeked flag

commit 738ac1ebb96d02e0d23bc320302a6ea94c612dec upstream.

Shared skbs must not be modified and this is crucial for broadcast
and/or multicast paths where we use it as an optimisation to avoid
unnecessary cloning.

The function skb_recv_datagram breaks this rule by setting peeked
without cloning the skb first.  This causes funky races which leads
to double-free.

This patch fixes this by cloning the skb and replacing the skb
in the list when setting skb->peeked.

Fixes: a59322be07c9 ("[UDP]: Only increment counter on first peek/recv")
Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agonet: call rcu_read_lock early in process_backlog
Julian Anastasov [Thu, 9 Jul 2015 06:59:10 +0000 (09:59 +0300)] 
net: call rcu_read_lock early in process_backlog

commit 2c17d27c36dcce2b6bf689f41a46b9e909877c21 upstream.

Incoming packet should be either in backlog queue or
in RCU read-side section. Otherwise, the final sequence of
flush_backlog() and synchronize_net() may miss packets
that can run without device reference:

CPU 1                  CPU 2
                       skb->dev: no reference
                       process_backlog:__skb_dequeue
                       process_backlog:local_irq_enable

on_each_cpu for
flush_backlog =>       IPI(hardirq): flush_backlog
                       - packet not found in backlog

                       CPU delayed ...
synchronize_net
- no ongoing RCU
read-side sections

netdev_run_todo,
rcu_barrier: no
ongoing callbacks
                       __netif_receive_skb_core:rcu_read_lock
                       - too late
free dev
                       process packet for freed dev

Fixes: 6e583ce5242f ("net: eliminate refcounting in backlog queue")
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: used davem's backport to 3.18 ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agonet: pktgen: fix race between pktgen_thread_worker() and kthread_stop()
Oleg Nesterov [Wed, 8 Jul 2015 19:42:11 +0000 (21:42 +0200)] 
net: pktgen: fix race between pktgen_thread_worker() and kthread_stop()

commit fecdf8be2d91e04b0a9a4f79ff06499a36f5d14f upstream.

pktgen_thread_worker() is obviously racy, kthread_stop() can come
between the kthread_should_stop() check and set_current_state().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Jan Stancek <jstancek@redhat.com>
Reported-by: Marcelo Leitner <mleitner@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agonet/tipc: initialize security state for new connection socket
Stephen Smalley [Tue, 7 Jul 2015 13:43:45 +0000 (09:43 -0400)] 
net/tipc: initialize security state for new connection socket

commit fdd75ea8df370f206a8163786e7470c1277a5064 upstream.

Calling connect() with an AF_TIPC socket would trigger a series
of error messages from SELinux along the lines of:
SELinux: Invalid class 0
type=AVC msg=audit(1434126658.487:34500): avc:  denied  { <unprintable> }
  for pid=292 comm="kworker/u16:5" scontext=system_u:system_r:kernel_t:s0
  tcontext=system_u:object_r:unlabeled_t:s0 tclass=<unprintable>
  permissive=0

This was due to a failure to initialize the security state of the new
connection sock by the tipc code, leaving it with junk in the security
class field and an unlabeled secid.  Add a call to security_sk_clone()
to inherit the security state from the parent socket.

Reported-by: Tim Shearer <tim.shearer@overturenetworks.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agortnetlink: verify IFLA_VF_INFO attributes before passing them to driver
Daniel Borkmann [Mon, 6 Jul 2015 22:07:52 +0000 (00:07 +0200)] 
rtnetlink: verify IFLA_VF_INFO attributes before passing them to driver

commit 4f7d2cdfdde71ffe962399b7020c674050329423 upstream.

Jason Gunthorpe reported that since commit c02db8c6290b ("rtnetlink: make
SR-IOV VF interface symmetric"), we don't verify IFLA_VF_INFO attributes
anymore with respect to their policy, that is, ifla_vfinfo_policy[].

Before, they were part of ifla_policy[], but they have been nested since
placed under IFLA_VFINFO_LIST, that contains the attribute IFLA_VF_INFO,
which is another nested attribute for the actual VF attributes such as
IFLA_VF_MAC, IFLA_VF_VLAN, etc.

Despite the policy being split out from ifla_policy[] in this commit,
it's never applied anywhere. nla_for_each_nested() only does basic nla_ok()
testing for struct nlattr, but it doesn't know about the data context and
their requirements.

Fix, on top of Jason's initial work, does 1) parsing of the attributes
with the right policy, and 2) using the resulting parsed attribute table
from 1) instead of the nla_for_each_nested() loop (just like we used to
do when still part of ifla_policy[]).

Reference: http://thread.gmane.org/gmane.linux.network/368913
Fixes: c02db8c6290b ("rtnetlink: make SR-IOV VF interface symmetric")
Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Cc: Greg Rose <gregory.v.rose@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Rony Efraim <ronye@mellanox.com>
Cc: Vlad Zolotarov <vladz@cloudius-systems.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: used davem's backport to 3.18 ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agodrm/radeon: fix hotplug race at startup
Dave Airlie [Thu, 20 Aug 2015 00:13:55 +0000 (10:13 +1000)] 
drm/radeon: fix hotplug race at startup

commit 7f98ca454ad373fc1b76be804fa7138ff68c1d27 upstream.

We apparantly get a hotplug irq before we've initialised
modesetting,

[drm] Loading R100 Microcode
BUG: unable to handle kernel NULL pointer dereference at   (null)
IP: [<c125f56f>] __mutex_lock_slowpath+0x23/0x91
*pde = 00000000
Oops: 0002 [#1]
Modules linked in: radeon(+) drm_kms_helper ttm drm i2c_algo_bit backlight pcspkr psmouse evdev sr_mod input_leds led_class cdrom sg parport_pc parport floppy intel_agp intel_gtt lpc_ich acpi_cpufreq processor button mfd_core agpgart uhci_hcd ehci_hcd rng_core snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm usbcore usb_common i2c_i801 i2c_core snd_timer snd soundcore thermal_sys
CPU: 0 PID: 15 Comm: kworker/0:1 Not tainted 4.2.0-rc7-00015-gbf67402 #111
Hardware name: MicroLink                               /D850MV                         , BIOS MV85010A.86A.0067.P24.0304081124 04/08/2003
Workqueue: events radeon_hotplug_work_func [radeon]
task: f6ca5900 ti: f6d3e000 task.ti: f6d3e000
EIP: 0060:[<c125f56f>] EFLAGS: 00010282 CPU: 0
EIP is at __mutex_lock_slowpath+0x23/0x91
EAX: 00000000 EBX: f5e900fc ECX: 00000000 EDX: fffffffe
ESI: f6ca5900 EDI: f5e90100 EBP: f5e90000 ESP: f6d3ff0c
 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
CR0: 8005003b CR2: 00000000 CR3: 36f61000 CR4: 000006d0
Stack:
 f5e90100 00000000 c103c4c1 f6d2a5a0 f5e900fc f6df394c c125f162 f8b0faca
 f6d2a5a0 c138ca00 f6df394c f7395600 c1034741 00d40000 00000000 f6d2a5a0
 c138ca00 f6d2a5b8 c138ca10 c1034b58 00000001 f6d40000 f6ca5900 f6d0c940
Call Trace:
 [<c103c4c1>] ? dequeue_task_fair+0xa4/0xb7
 [<c125f162>] ? mutex_lock+0x9/0xa
 [<f8b0faca>] ? radeon_hotplug_work_func+0x17/0x57 [radeon]
 [<c1034741>] ? process_one_work+0xfc/0x194
 [<c1034b58>] ? worker_thread+0x18d/0x218
 [<c10349cb>] ? rescuer_thread+0x1d5/0x1d5
 [<c103742a>] ? kthread+0x7b/0x80
 [<c12601c0>] ? ret_from_kernel_thread+0x20/0x30
 [<c10373af>] ? init_completion+0x18/0x18
Code: 42 08 e8 8e a6 dd ff c3 57 56 53 83 ec 0c 8b 35 48 f7 37 c1 8b 10 4a 74 1a 89 c3 8d 78 04 8b 40 08 89 63

Reported-and-Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agofsnotify: fix oops in fsnotify_clear_marks_by_group_flags()
Jan Kara [Thu, 6 Aug 2015 22:46:42 +0000 (15:46 -0700)] 
fsnotify: fix oops in fsnotify_clear_marks_by_group_flags()

commit 8f2f3eb59dff4ec538de55f2e0592fec85966aab upstream.

fsnotify_clear_marks_by_group_flags() can race with
fsnotify_destroy_marks() so that when fsnotify_destroy_mark_locked()
drops mark_mutex, a mark from the list iterated by
fsnotify_clear_marks_by_group_flags() can be freed and thus the next
entry pointer we have cached may become stale and we dereference free
memory.

Fix the problem by first moving marks to free to a special private list
and then always free the first entry in the special list.  This method
is safe even when entries from the list can disappear once we drop the
lock.

Signed-off-by: Jan Kara <jack@suse.com>
Reported-by: Ashish Sangwan <a.sangwan@samsung.com>
Reviewed-by: Ashish Sangwan <a.sangwan@samsung.com>
Cc: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoocfs2: fix BUG in ocfs2_downconvert_thread_do_work()
Joseph Qi [Thu, 6 Aug 2015 22:46:23 +0000 (15:46 -0700)] 
ocfs2: fix BUG in ocfs2_downconvert_thread_do_work()

commit 209f7512d007980fd111a74a064d70a3656079cf upstream.

The "BUG_ON(list_empty(&osb->blocked_lock_list))" in
ocfs2_downconvert_thread_do_work can be triggered in the following case:

ocfs2dc has firstly saved osb->blocked_lock_count to local varibale
processed, and then processes the dentry lockres.  During the dentry
put, it calls iput and then deletes rw, inode and open lockres from
blocked list in ocfs2_mark_lockres_freeing.  And this causes the
variable `processed' to not reflect the number of blocked lockres to be
processed, which triggers the BUG.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoipc: modify message queue accounting to not take kernel data structures into account
Marcus Gelderie [Thu, 6 Aug 2015 22:46:10 +0000 (15:46 -0700)] 
ipc: modify message queue accounting to not take kernel data structures into account

commit de54b9ac253787c366bbfb28d901a31954eb3511 upstream.

A while back, the message queue implementation in the kernel was
improved to use btrees to speed up retrieval of messages, in commit
d6629859b36d ("ipc/mqueue: improve performance of send/recv").

That patch introducing the improved kernel handling of message queues
(using btrees) has, as a by-product, changed the meaning of the QSIZE
field in the pseudo-file created for the queue.  Before, this field
reflected the size of the user-data in the queue.  Since, it also takes
kernel data structures into account.  For example, if 13 bytes of user
data are in the queue, on my machine the file reports a size of 61
bytes.

There was some discussion on this topic before (for example
https://lkml.org/lkml/2014/10/1/115).  Commenting on a th lkml, Michael
Kerrisk gave the following background
(https://lkml.org/lkml/2015/6/16/74):

    The pseudofiles in the mqueue filesystem (usually mounted at
    /dev/mqueue) expose fields with metadata describing a message
    queue. One of these fields, QSIZE, as originally implemented,
    showed the total number of bytes of user data in all messages in
    the message queue, and this feature was documented from the
    beginning in the mq_overview(7) page. In 3.5, some other (useful)
    work happened to break the user-space API in a couple of places,
    including the value exposed via QSIZE, which now includes a measure
    of kernel overhead bytes for the queue, a figure that renders QSIZE
    useless for its original purpose, since there's no way to deduce
    the number of overhead bytes consumed by the implementation.
    (The other user-space breakage was subsequently fixed.)

This patch removes the accounting of kernel data structures in the
queue.  Reporting the size of these data-structures in the QSIZE field
was a breaking change (see Michael's comment above).  Without the QSIZE
field reporting the total size of user-data in the queue, there is no
way to deduce this number.

It should be noted that the resource limit RLIMIT_MSGQUEUE is counted
against the worst-case size of the queue (in both the old and the new
implementation).  Therefore, the kernel overhead accounting in QSIZE is
not necessary to help the user understand the limitations RLIMIT imposes
on the processes.

Signed-off-by: Marcus Gelderie <redmnic@gmail.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: John Duffy <jb_duffy@btinternet.com>
Cc: Arto Bendiken <arto@bendiken.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoMIPS: Make set_pte() SMP safe.
David Daney [Tue, 4 Aug 2015 00:48:43 +0000 (17:48 -0700)] 
MIPS: Make set_pte() SMP safe.

commit 46011e6ea39235e4aca656673c500eac81a07a17 upstream.

On MIPS the GLOBAL bit of the PTE must have the same value in any
aligned pair of PTEs.  These pairs of PTEs are referred to as
"buddies".  In a SMP system is is possible for two CPUs to be calling
set_pte() on adjacent PTEs at the same time.  There is a race between
setting the PTE and a different CPU setting the GLOBAL bit in its
buddy PTE.

This race can be observed when multiple CPUs are executing
vmap()/vfree() at the same time.

Make setting the buddy PTE's GLOBAL bit an atomic operation to close
the race condition.

The case of CONFIG_64BIT_PHYS_ADDR && CONFIG_CPU_MIPS32 is *not*
handled.

Signed-off-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10835/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agomm, vmscan: Do not wait for page writeback for GFP_NOFS allocations
Michal Hocko [Tue, 4 Aug 2015 21:36:58 +0000 (14:36 -0700)] 
mm, vmscan: Do not wait for page writeback for GFP_NOFS allocations

commit ecf5fc6e9654cd7a268c782a523f072b2f1959f9 upstream.

Nikolay has reported a hang when a memcg reclaim got stuck with the
following backtrace:

PID: 18308  TASK: ffff883d7c9b0a30  CPU: 1   COMMAND: "rsync"
  #0 __schedule at ffffffff815ab152
  #1 schedule at ffffffff815ab76e
  #2 schedule_timeout at ffffffff815ae5e5
  #3 io_schedule_timeout at ffffffff815aad6a
  #4 bit_wait_io at ffffffff815abfc6
  #5 __wait_on_bit at ffffffff815abda5
  #6 wait_on_page_bit at ffffffff8111fd4f
  #7 shrink_page_list at ffffffff81135445
  #8 shrink_inactive_list at ffffffff81135845
  #9 shrink_lruvec at ffffffff81135ead
 #10 shrink_zone at ffffffff811360c3
 #11 shrink_zones at ffffffff81136eff
 #12 do_try_to_free_pages at ffffffff8113712f
 #13 try_to_free_mem_cgroup_pages at ffffffff811372be
 #14 try_charge at ffffffff81189423
 #15 mem_cgroup_try_charge at ffffffff8118c6f5
 #16 __add_to_page_cache_locked at ffffffff8112137d
 #17 add_to_page_cache_lru at ffffffff81121618
 #18 pagecache_get_page at ffffffff8112170b
 #19 grow_dev_page at ffffffff811c8297
 #20 __getblk_slow at ffffffff811c91d6
 #21 __getblk_gfp at ffffffff811c92c1
 #22 ext4_ext_grow_indepth at ffffffff8124565c
 #23 ext4_ext_create_new_leaf at ffffffff81246ca8
 #24 ext4_ext_insert_extent at ffffffff81246f09
 #25 ext4_ext_map_blocks at ffffffff8124a848
 #26 ext4_map_blocks at ffffffff8121a5b7
 #27 mpage_map_one_extent at ffffffff8121b1fa
 #28 mpage_map_and_submit_extent at ffffffff8121f07b
 #29 ext4_writepages at ffffffff8121f6d5
 #30 do_writepages at ffffffff8112c490
 #31 __filemap_fdatawrite_range at ffffffff81120199
 #32 filemap_flush at ffffffff8112041c
 #33 ext4_alloc_da_blocks at ffffffff81219da1
 #34 ext4_rename at ffffffff81229b91
 #35 ext4_rename2 at ffffffff81229e32
 #36 vfs_rename at ffffffff811a08a5
 #37 SYSC_renameat2 at ffffffff811a3ffc
 #38 sys_renameat2 at ffffffff811a408e
 #39 sys_rename at ffffffff8119e51e
 #40 system_call_fastpath at ffffffff815afa89

Dave Chinner has properly pointed out that this is a deadlock in the
reclaim code because ext4 doesn't submit pages which are marked by
PG_writeback right away.

The heuristic was introduced by commit e62e384e9da8 ("memcg: prevent OOM
with too many dirty pages") and it was applied only when may_enter_fs
was specified.  The code has been changed by c3b94f44fcb0 ("memcg:
further prevent OOM with too many dirty pages") which has removed the
__GFP_FS restriction with a reasoning that we do not get into the fs
code.  But this is not sufficient apparently because the fs doesn't
necessarily submit pages marked PG_writeback for IO right away.

ext4_bio_write_page calls io_submit_add_bh but that doesn't necessarily
submit the bio.  Instead it tries to map more pages into the bio and
mpage_map_one_extent might trigger memcg charge which might end up
waiting on a page which is marked PG_writeback but hasn't been submitted
yet so we would end up waiting for something that never finishes.

Fix this issue by replacing __GFP_IO by may_enter_fs check (for case 2)
before we go to wait on the writeback.  The page fault path, which is
the only path that triggers memcg oom killer since 3.12, shouldn't
require GFP_NOFS and so we shouldn't reintroduce the premature OOM
killer issue which was originally addressed by the heuristic.

As per David Chinner the xfs is doing similar thing since 2.6.15 already
so ext4 is not the only affected filesystem.  Moreover he notes:

: For example: IO completion might require unwritten extent conversion
: which executes filesystem transactions and GFP_NOFS allocations. The
: writeback flag on the pages can not be cleared until unwritten
: extent conversion completes. Hence memory reclaim cannot wait on
: page writeback to complete in GFP_NOFS context because it is not
: safe to do so, memcg reclaim or otherwise.

[tytso@mit.edu: corrected the control flow]
Fixes: c3b94f44fcb0 ("memcg: further prevent OOM with too many dirty pages")
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ luis: backported to 3.16: used Hugh's backport for 4.1 ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoALSA: fireworks/firewire-lib: add support for recent firmware quirk
Takashi Sakamoto [Wed, 5 Aug 2015 00:21:05 +0000 (09:21 +0900)] 
ALSA: fireworks/firewire-lib: add support for recent firmware quirk

commit 18f5ed365d3f188a91149d528c853000330a4a58 upstream.

Fireworks uses TSB43CB43(IceLynx-Micro) as its IEC 61883-1/6 interface.
This chip includes ARM7 core, and loads and runs program. The firmware
is stored in on-board memory and loaded every powering-on from it.

Echo Audio ships several versions of firmwares for each model. These
firmwares have each quirk and the quirk changes a sequence of packets.

As long as I investigated, AudioFire2/AudioFire4/AudioFirePre8 have a
quirk to transfer a first packet with 0x02 in its dbc field. This causes
ALSA Fireworks driver to detect discontinuity. In this case, firmware
version 5.7.0, 5.7.3 and 5.8.0 are used.

Payload  CIP      CIP
quadlets header1  header2
02       00050002 90ffffff <-
42       0005000a 90013000
42       00050012 90014400
42       0005001a 90015800
02       0005001a 90ffffff
42       00050022 90019000
42       0005002a 9001a400
42       00050032 9001b800
02       00050032 90ffffff
42       0005003a 9001d000
42       00050042 9001e400
42       0005004a 9001f800
02       0005004a 90ffffff
(AudioFire2 with firmware version 5.7.)

$ dmesg
snd-fireworks fw1.0: Detect discontinuity of CIP: 00 02

These models, AudioFire8 (since Jul 2009 ) and Gibson Robot Interface
Pack series uses the same ARM binary as their firmware. Thus, this
quirk may be observed among them.

This commit adds a new member for AMDTP structure. This member represents
the value of dbc field in a first AMDTP packet. Drivers can set it with
a preferred value according to model's quirk.

Tested-by: Johannes Oertei <johannes.oertel@uni-due.de>
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agodrivers/usb: Delete XHCI command timer if necessary
Gavin Shan [Mon, 3 Aug 2015 13:07:49 +0000 (16:07 +0300)] 
drivers/usb: Delete XHCI command timer if necessary

commit ffe5adcb7661d94e952d6b5ed7f493cb4ef0c7bc upstream.

When xhci_mem_cleanup() is called, it's possible that the command
timer isn't initialized and scheduled. For those cases, to delete
the command timer causes soft-lockup as below stack dump shows.

The patch avoids deleting the command timer if it's not scheduled
with the help of timer_pending().

NMI watchdog: BUG: soft lockup - CPU#40 stuck for 23s! [kworker/40:1:8140]
      :
NIP [c000000000150b30] lock_timer_base.isra.34+0x90/0xa0
LR [c000000000150c24] try_to_del_timer_sync+0x34/0xa0
Call Trace:
[c000000f67c975e0] [c0000000015b84f8] mon_ops+0x0/0x8 (unreliable)
[c000000f67c97620] [c000000000150c24] try_to_del_timer_sync+0x34/0xa0
[c000000f67c97660] [c000000000150cf0] del_timer_sync+0x60/0x80
[c000000f67c97690] [c00000000070ac0c] xhci_mem_cleanup+0x5c/0x5e0
[c000000f67c97740] [c00000000070c2e8] xhci_mem_init+0x1158/0x13b0
[c000000f67c97860] [c000000000700978] xhci_init+0x88/0x110
[c000000f67c978e0] [c000000000701644] xhci_gen_setup+0x2b4/0x590
[c000000f67c97970] [c0000000006d4410] xhci_pci_setup+0x40/0x190
[c000000f67c979f0] [c0000000006b1af8] usb_add_hcd+0x418/0xba0
[c000000f67c97ab0] [c0000000006cb15c] usb_hcd_pci_probe+0x1dc/0x5c0
[c000000f67c97b50] [c0000000006d3ba4] xhci_pci_probe+0x64/0x1f0
[c000000f67c97ba0] [c0000000004fe9ac] local_pci_probe+0x6c/0x130
[c000000f67c97c30] [c0000000000e5ce8] work_for_cpu_fn+0x38/0x60
[c000000f67c97c60] [c0000000000eacb8] process_one_work+0x198/0x470
[c000000f67c97cf0] [c0000000000eb6ac] worker_thread+0x37c/0x5a0
[c000000f67c97d80] [c0000000000f2730] kthread+0x110/0x130
[c000000f67c97e30] [c000000000009660] ret_from_kernel_thread+0x5c/0x7c

Reported-by: Priya M. A <priyama2@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoxhci: fix off by one error in TRB DMA address boundary check
Mathias Nyman [Mon, 3 Aug 2015 13:07:48 +0000 (16:07 +0300)] 
xhci: fix off by one error in TRB DMA address boundary check

commit 7895086afde2a05fa24a0e410d8e6b75ca7c8fdd upstream.

We need to check that a TRB is part of the current segment
before calculating its DMA address.

Previously a ring segment didn't use a full memory page, and every
new ring segment got a new memory page, so the off by one
error in checking the upper bound was never seen.

Now that we use a full memory page, 256 TRBs (4096 bytes), the off by one
didn't catch the case when a TRB was the first element of the next segment.

This is triggered if the virtual memory pages for a ring segment are
next to each in increasing order where the ring buffer wraps around and
causes errors like:

[  106.398223] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 1
[  106.398230] xhci_hcd 0000:00:14.0: Looking for event-dma fffd3000 trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 seg-end fffd4ff0

The trb-end address is one outside the end-seg address.

Tested-by: Arkadiusz Miśkiewicz <arekm@maven.pl>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
10 years agoMIPS: Flush RPS on kernel entry with EVA
James Hogan [Fri, 31 Jul 2015 15:29:38 +0000 (16:29 +0100)] 
MIPS: Flush RPS on kernel entry with EVA

commit 3aff47c062b944a5e1f9af56a37a23f5295628fc upstream.

When EVA is enabled, flush the Return Prediction Stack (RPS) present on
some MIPS cores on entry to the kernel from user mode.

This is important specifically for interAptiv with EVA enabled,
otherwise kernel mode RPS mispredicts may trigger speculative fetches of
user return addresses, which may be sensitive in the kernel address
space due to EVA's overlapping user/kernel address spaces.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10812/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>