git.ipfire.org Git - people/ms/linux.git/log

arc: mm: Fix build failure

commit e262eb9381ad51b5de7a9e762ee773bbd25ce650 upstream.

Fix misspelled define.

Fixes: 33692f27597f ("vm: add VM_FAULT_SIGSEGV handling support")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

vm: add VM_FAULT_SIGSEGV handling support

commit 33692f27597fcab536d7cbbcc8f52905133e4aa7 upstream.

The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
"you should SIGSEGV" error, because the SIGSEGV case was generally
handled by the caller - usually the architecture fault handler.

That results in lots of duplication - all the architecture fault
handlers end up doing very similar "look up vma, check permissions, do
retries etc" - but it generally works.  However, there are cases where
the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.

In particular, when accessing the stack guard page, libsigsegv expects a
SIGSEGV.  And it usually got one, because the stack growth is handled by
that duplicated architecture fault handler.

However, when the generic VM layer started propagating the error return
from the stack expansion in commit fee7e49d4514 ("mm: propagate error
from stack expansion even for guard page"), that now exposed the
existing VM_FAULT_SIGBUS result to user space.  And user space really
expected SIGSEGV, not SIGBUS.

To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
duplicate architecture fault handlers about it.  They all already have
the code to handle SIGSEGV, so it's about just tying that new return
value to the existing code, but it's all a bit annoying.

This is the mindless minimal patch to do this.  A more extensive patch
would be to try to gather up the mostly shared fault handling logic into
one generic helper routine, and long-term we really should do that
cleanup.

Just from this patch, you can generally see that most architectures just
copied (directly or indirectly) the old x86 way of doing things, but in
the meantime that original x86 model has been improved to hold the VM
semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
"newer" things, so it would be a good idea to bring all those
improvements to the generic case and teach other architectures about
them too.

Reported-and-tested-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Jan Engelhardt <jengelh@inai.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
Cc: linux-arch@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

x86: mm: move mmap_sem unlock from mm_fault_error() to caller

commit 7fb08eca45270d0ae86e1ad9d39c40b7a55d0190 upstream.

This replaces four copies in various stages of mm_fault_error() handling
with just a single one. It will also allow for more natural placement
of the unlocking after some further cleanup.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

cdc-acm: Add support for Denso cradle CU-321

commit b20b1618b8fca858c83e52da4aa22cd6b13b0359 upstream.

In order to support an older USB cradle by Denso, I added its vendor- and product-ID to the array of usb_device_id acm_ids. In this way cdc-acm feels responsible for this cradle. The related /dev/ttyACM node is being created properly, and the data transfer works.

However, later cradle models by Denso do have proper descriptors, so the patch is not required for these. At the same time both the older and the later model have the same vendor- and product-ID, but they both work with the patched driver.

Declaration of the Denso cradles I tested:
- both models have the same IDs: vendorID 0x076d, productID 0x0006
- older model: Denso CU-321 (descriptors not properly set)
- later model: Denso CU-821 (with proper descriptors)

Signed-off-by: Bjoern Gerhart <oss@airbjorn.de>
Acked-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: musb: core: add pm_runtime_irq_safe()

commit 3e43a0725637299a14369e3ef109c25a8ec5c008 upstream.

We need a pm_runtime_get_sync() call from
within musb_gadget_pullup() to make sure
registers are accessible at that time.

The problem is that musb_gadget_pullup() is
called with IRQs disabled and, because of that,
we need to tell pm_runtime that this pm_runtime_get_sync()
is IRQ safe.

We can simply add pm_runtime_irq_safe(), however, because
we need to make our read/write accessor function pointers
have been initialized before trying to use them. This means
that all pm_runtime initialization for musb_core needs to
be moved down so that when we call pm_runtime_irq_safe(),
the pm_runtime_get_sync() that it calls on the parent, won't
cause a crash due to NULL musb_read/write accessors.

Reported-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: gadget: function: phonet: balance usb_ep_disable calls

commit 9ec36f7fe20ef919cc15171e1da1b6739222541a upstream.

f_phonet's ->set_alt() method will call usb_ep_disable()
potentially on an endpoint which is already disabled. That's
something the gadget/function driver must guarantee that it's
always balanced.

In order to balance the calls, just make sure the endpoint
was enabled before by means of checking the validity of
driver_data.

Reported-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

USB: serial: add Google simple serial SubClass support

commit 679315e5fae1e4614eed0d9aa26999ddcb6a0f77 upstream.

Add support for Google devices that export simple serial
interfaces using the vendor specific SubClass/Protocol pair
0x50/0x01.

Signed-off-by: Anton Staaf <robotboy@chromium.org>
Reviewed-by: Benson Leung <bleung@chromium.org>
[johan: move id entries and update Kconfig]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

HID: microsoft: add support for Japanese Surface Type Cover 3

commit 5e7e9e90b5867a3754159a8ce524299d930fbac8 upstream.

Based on code for the US Surface Type Cover 3
from commit be3b16341d5cd8cf2a64fcc7a604a8efe6599ff0
("HID: add support for MS Surface Pro 3 Type Cover"):

Signed-off-by: Alan Wu <alan.c.wu@gmail.com>
Tested-by: Karlis Dreizis <karlisdreizis@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

HID: add support for MS Surface Pro 3 Type Cover

commit be3b16341d5cd8cf2a64fcc7a604a8efe6599ff0 upstream.

Surface Pro 3 Type Cover that works with Ubuntu (and possibly Arch) from this thread. Both trackpad and keyboard work after compiling my own kernel.
http://ubuntuforums.org/showthread.php?t=2231207&page=2&s=44910e0c56047e4f93dfd9fea58121ef

Also includes Jarrad Whitaker's message which sources
http://winaero.com/blog/how-to-install-linux-on-surface-pro-3/
which he says is sourced from a Russian site

Signed-off-by: Alan Wu <alan.c.wu@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

HID: hid-microsoft: Add support for scrollwheel and special keypad keys

commit 3faed1aff786a007b3ea0549ac469e09f48c98f9 upstream.

The Microsoft Office keyboard has a scrollwheel as well as some special keys
above the keypad which are handled through the custom MS usage page, this
commit adds support for these.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

HID: pidff: Fix initialisation forMicrosoft Sidewinder FF Pro 2

commit afd700d933963d07391e3e3dfbfbc05e905960ef upstream.

The FF2 driver (usbhid/hid-pidff.c) sends commands to the stick during ff_init.
However, this is called inside a block where driver_input_lock is locked, so
the results of these initial commands are discarded. This behavior is the
"killer", without this nothing else works.

ff_init issues commands using "hid_hw_request". This eventually goes to
hid_input_report, which returns -EBUSY because driver_input_lock is locked. The
change is to delay the ff_init call in hid-core.c until after this lock has
been released.

Calling hid_device_io_start() releases the lock so the device can be
configured. We also need to call hid_device_io_stop() on exit for the lock to
remain locked while ending the init of the drivers.

[ benjamin.tissoires@redhat.com: imrpoved the changelog a lot ]

Signed-off-by: Jim Keir <jimkeir@oracledbadirect.com>
Reviewed-by: Benjamin.tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

HID: apple: fix battery support for the 2009 ANSI wireless keyboard

commit cbd366bea2b8513bc0fc1c9e8832cb0ab221d6d5 upstream.

Enabled quirks necessary for correct battery capacity reporting. Cleaned up
surrounding style.

Signed-off-by: Ross Skaliotis <rskaliotis@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

tty: fix up atime/mtime mess, take four

commit f0bf0bd07943bfde8f5ac39a32664810a379c7d3 upstream.

This problem was taken care of three times already in
* b0de59b5733d18b0d1974a060860a8b5c1b36a2e (TTY: do not update
  atime/mtime on read/write),
* 37b7f3c76595e23257f61bd80b223de8658617ee (TTY: fix atime/mtime
  regression), and
* b0b885657b6c8ef63a46bc9299b2a7715d19acde (tty: fix up atime/mtime
  mess, take three)

But it still misses one point. As John Paul correctly points out, we
do not care about setting date. If somebody ever changes wall
time backwards (by mistake for example), tty timestamps are never
updated until the original wall time passes.

So check the absolute difference of times and if it large than "8
seconds or so", always update the time. That means we will update
immediatelly when changing time. Ergo, CAP_SYS_TIME can foul the
check, but it was always that way.

Thanks John for serving me this so nicely debugged.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: John Paul Perry <john_paul.perry@alcatel-lucent.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ARC: Fix KSTK_ESP()

commit 13648b0118a24f4fc76c34e6c7b6ccf447e46a2a upstream.

/proc/<pid>/maps currently don't annotate stack vma with "[stack]"
This is because KSTK_ESP ie expected to return usermode SP of tsk while
currently it returns the kernel mode SP of a sleeping tsk.

While the fix is trivial, we also need to adjust the ARC kernel stack
unwinder to not use KSTK_SP and friends any more.

Reported-and-suggested-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

sunrpc: fix braino in ->poll()

commit 1711fd9addf214823b993468567cab1f8254fc51 upstream.

POLL_OUT isn't what callers of ->poll() are expecting to see; it's
actually __SI_POLL | 2 and it's a siginfo code, not a poll bitmap
bit...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Bruce Fields <bfields@fieldses.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

procfs: fix race between symlink removals and traversals

commit 7e0e953bb0cf649f93277ac8fb67ecbb7f7b04a9 upstream.

use_pde()/unuse_pde() in ->follow_link()/->put_link() resp.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

debugfs: leave freeing a symlink body until inode eviction

commit 0db59e59299f0b67450c5db21f7f316c8fb04e84 upstream.

As it is, we have debugfs_remove() racing with symlink traversals.
Supply ->evict_inode() and do freeing there - inode will remain
pinned until we are done with the symlink body.

And rip the idiocy with checking if dentry is positive right after
we'd verified debugfs_positive(), which is a stronger check...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation

commit 0a280962dc6e117e0e4baa668453f753579265d9 upstream.

X-Coverup: just ask spender
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

USB: serial: fix tty-device error handling at probe

commit ca4383a3947a83286bc9b9c598a1f55e867871d7 upstream.

Add missing error handling when registering the tty device at port
probe. This avoids trying to remove an uninitialised character device
when the port device is removed.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Greg Kroah-Hartman <greg@kroah.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

USB: serial: fix potential use-after-free after failed probe

commit 07fdfc5e9f1c966be8722e8fa927e5ea140df5ce upstream.

Fix return value in probe error path, which could end up returning
success (0) on errors. This could in turn lead to use-after-free or
double free (e.g. in port_remove) when the port device is removed.

Fixes: c706ebdfc895 ("USB: usb-serial: call port_probe and port_remove
at the right times")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Greg Kroah-Hartman <greg@kroah.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

TTY: fix tty_wait_until_sent on 64-bit machines

commit 79fbf4a550ed6a22e1ae1516113e6c7fa5d56a53 upstream.

Fix overflow bug in tty_wait_until_sent on 64-bit machines, where an
infinite timeout (0) would be passed to the underlying tty-driver's
wait_until_sent-operation as a negative timeout (-1), causing it to
return immediately.

This manifests itself for example as tcdrain() returning immediately,
drivers not honouring the drain flags when setting terminal attributes,
or even dropped data on close as a requested infinite closing-wait
timeout would be ignored.

The first symptom was reported by Asier LLANO who noted that tcdrain()
returned prematurely when using the ftdi_sio usb-serial driver.

Fix this by passing 0 rather than MAX_SCHEDULE_TIMEOUT (LONG_MAX) to the
underlying tty driver.

Note that the serial-core wait_until_sent-implementation is not affected
by this bug due to a lucky chance (comparison to an unsigned maximum
timeout), and neither is the cyclades one that had an explicit check for
negative timeouts, but all other tty drivers appear to be affected.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: ZIV-Asier Llano Palacios <asier.llano@cgglobal.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

USB: serial: fix infinite wait_until_sent timeout

commit f528bf4f57e43d1af4b2a5c97f09e43e0338c105 upstream.

Make sure to handle an infinite timeout (0).

Note that wait_until_sent is currently never called with a 0-timeout
argument due to a bug in tty_wait_until_sent.

Fixes: dcf010503966 ("USB: serial: add generic wait_until_sent
implementation")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

net: irda: fix wait_until_sent poll timeout

commit 2c3fbe3cf28fbd7001545a92a83b4f8acfd9fa36 upstream.

In case an infinite timeout (0) is requested, the irda wait_until_sent
implementation would use a zero poll timeout rather than the default
200ms.

Note that wait_until_sent is currently never called with a 0-timeout
argument due to a bug in tty_wait_until_sent.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

mac80211: Send EAPOL frames at lowest rate

commit 9c1c98a3bb7b7593b60264b9a07e001e68b46697 upstream.

The current minstrel_ht rate control behavior is somewhat optimistic in
trying to find optimum TX rate. While this is usually fine for normal
Data frames, there are cases where a more conservative set of retry
parameters would be beneficial to make the connection more robust.

EAPOL frames are critical to the authentication and especially the
EAPOL-Key message 4/4 (the last message in the 4-way handshake) is
important to get through to the AP. If that message is lost, the only
recovery mechanism in many cases is to reassociate with the AP and start
from scratch. This can often be avoided by trying to send the frame with
more conservative rate and/or with more link layer retries.

In most cases, minstrel_ht is currently using the initial EAPOL-Key
frames for probing higher rates and this results in only five link layer
transmission attempts (one at high(ish) MCS and four at MCS0). While
this works with most APs, it looks like there are some deployed APs that
may have issues with the EAPOL frames using HT MCS immediately after
association. Similarly, there may be issues in cases where the signal
strength or radio environment is not good enough to be able to get
frames through even at couple of MCS 0 tries.

The best approach for this would likely to be to reduce the TX rate for
the last rate (3rd rate parameter in the set) to a low basic rate (say,
6 Mbps on 5 GHz and 2 or 5.5 Mbps on 2.4 GHz), but doing that cleanly
requires some more effort. For now, we can start with a simple one-liner
that forces the minimum rate to be used for EAPOL frames similarly how
the TX rate is selected for the IEEE 802.11 Management frames. This does
result in a small extra latency added to the cases where the AP would be
able to receive the higher rate, but taken into account how small number
of EAPOL frames are used, this is likely to be insignificant. A future
optimization in the minstrel_ht design can also allow this patch to be
reverted to get back to the more optimized initial TX rate.

It should also be noted that many drivers that do not use minstrel as
the rate control algorithm are already doing similar workarounds by
forcing the lowest TX rate to be used for EAPOL frames.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

xhci: fix reporting of 0-sized URBs in control endpoint

commit 45ba2154d12fc43b70312198ec47085f10be801a upstream.

When a control transfer has a short data stage, the xHCI controller generates
two transfer events: a COMP_SHORT_TX event that specifies the untransferred
amount, and a COMP_SUCCESS event. But when the data stage is not short, only the
COMP_SUCCESS event occurs. Therefore, xhci-hcd must set urb->actual_length to
urb->transfer_buffer_length while processing the COMP_SUCCESS event, unless
urb->actual_length was set already by a previous COMP_SHORT_TX event.

The driver checks this by seeing whether urb->actual_length == 0, but this alone
is the wrong test, as it is entirely possible for a short transfer to have an
urb->actual_length = 0.

This patch changes the xhci driver to rely on a new td->urb_length_set flag,
which is set to true when a COMP_SHORT_TX event is received and the URB length
updated at that stage.

This fixes a bug which affected the HSO plugin, which relies on URBs with
urb->actual_length == 0 to halt re-submitting the RX URB in the control
endpoint.

Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

xhci: Allocate correct amount of scratchpad buffers

commit 6596a926b0b6c80b730a1dd2fa91908e0a539c37 upstream.

Include the high order bit fields for Max scratchpad buffers when
calculating how many scratchpad buffers are needed.

I'm suprised this hasn't caused more issues, we never allocated more than
32 buffers even if xhci needed more. Either we got lucky and xhci never
really used past that area, or then we got enough zeroed dma memory anyway.

Should be backported as far back as possible

Reported-by: Tim Chen <tim.c.chen@linux.intel.com>
Tested-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: dwc3: dwc3-omap: Fix disable IRQ

commit 96e5d31244c5542f5b2ea81d76f14ba4b8a7d440 upstream.

In the wrapper the IRQ disable should be done by writing 1's to the
IRQ*_CLR register. Existing code is broken because it instead writes
zeros to IRQ*_SET register.

Fix this by adding functions dwc3_omap_write_irqmisc_clr() and
dwc3_omap_write_irq0_clr() which do the right thing.

Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver")
Signed-off-by: George Cherian <george.cherian@ti.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: ftdi_sio: Add jtag quirk support for Cyber Cortex AV boards

commit c7d373c3f0da2b2b78c4b1ce5ae41485b3ef848c upstream.

This patch integrates Cyber Cortex AV boards with the existing
ftdi_jtag_quirk in order to use serial port 0 with JTAG which is
required by the manufacturers' software.

Steps: 2

[ftdi_sio_ids.h]
1. Defined the device PID

[ftdi_sio.c]
2. Added a macro declaration to the ids array, in order to enable the
jtag quirk for the device.

Signed-off-by: Max Mansfield <max.m.mansfield@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

USB: ftdi_sio: add PIDs for Actisense USB devices

commit f6950344d3cf4a1e231b5828b50c4ac168db3886 upstream.

These product identifiers (PID) all deal with marine NMEA format data
used on motor boats and yachts. We supply the programmed devices to
Chetco, for use inside their equipment. The PIDs are a direct copy of
our Windows device drivers (FTDI drivers with altered PIDs).

Signed-off-by: Mark Glover <mark@actisense.com>
[johan: edit commit message slightly ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

USB: usbfs: don't leak kernel data in siginfo

commit f0c2b68198589249afd2b1f2c4e8de8c03e19c16 upstream.

When a signal is delivered, the information in the siginfo structure
is copied to userspace. Good security practice dicatates that the
unused fields in this structure should be initialized to 0 so that
random kernel stack data isn't exposed to the user. This patch adds
such an initialization to the two places where usbfs raises signals.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Dave Mielke <dave@mielke.cc>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

USB: serial: cp210x: Adding Seletek device id's

commit 675af70856d7cc026be8b6ea7a8b9db10b8b38a1 upstream.

These device ID's are not associated with the cp210x module currently,
but should be. This patch allows the devices to operate upon connecting
them to the usb bus as intended.

Signed-off-by: Michiel van de Garde <mgparser@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

KVM: MIPS: Fix trace event to save PC directly

commit b3cffac04eca9af46e1e23560a8ee22b1bd36d43 upstream.

Currently the guest exit trace event saves the VCPU pointer to the
structure, and the guest PC is retrieved by dereferencing it when the
event is printed rather than directly from the trace record. This isn't
safe as the printing may occur long afterwards, after the PC has changed
and potentially after the VCPU has been freed. Usually this results in
the same (wrong) PC being printed for multiple trace events. It also
isn't portable as userland has no way to access the VCPU data structure
when interpreting the trace record itself.

Lets save the actual PC in the structure so that the correct value is
accessible later.

Fixes: 669e846e6c4e ("KVM/MIPS32: MIPS arch specific APIs for KVM")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

KVM: emulate: fix CMPXCHG8B on 32-bit hosts

commit 4ff6f8e61eb7f96d3ca535c6d240f863ccd6fb7d upstream.

This has been broken for a long time: it broke first in 2.6.35, then was
almost fixed in 2.6.36 but this one-liner slipped through the cracks.
The bug shows up as an infinite loop in Windows 7 (and newer) boot on
32-bit hosts without EPT.

Windows uses CMPXCHG8B to write to page tables, which causes a
page fault if running without EPT; the emulator is then called from
kvm_mmu_page_fault. The loop then happens if the higher 4 bytes are
not 0; the common case for this is that the NX bit (bit 63) is 1.

Fixes: 6550e1f165f384f3a46b60a1be9aba4bc3c2adad
Fixes: 16518d5ada690643453eb0aef3cc7841d3623c2d
Reported-by: Erik Rull <erik.rull@rdsoftware.de>
Tested-by: Erik Rull <erik.rull@rdsoftware.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

Btrfs:__add_inode_ref: out of bounds memory read when looking for extended ref.

commit dd9ef135e3542ffc621c4eb7f0091870ec7a1504 upstream.

Improper arithmetics when calculting the address of the extended ref could
lead to an out of bounds memory read and kernel panic.

Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

Btrfs: fix data loss in the fast fsync path

commit 3a8b36f378060d20062a0918e99fae39ff077bf0 upstream.

When using the fast file fsync code path we can miss the fact that new
writes happened since the last file fsync and therefore return without
waiting for the IO to finish and write the new extents to the fsync log.

Here's an example scenario where the fsync will miss the fact that new
file data exists that wasn't yet durably persisted:

1. fs_info->last_trans_committed == N - 1 and current transaction is
   transaction N (fs_info->generation == N);

2. do a buffered write;

3. fsync our inode, this clears our inode's full sync flag, starts
   an ordered extent and waits for it to complete - when it completes
   at btrfs_finish_ordered_io(), the inode's last_trans is set to the
   value N (via btrfs_update_inode_fallback -> btrfs_update_inode ->
   btrfs_set_inode_last_trans);

4. transaction N is committed, so fs_info->last_trans_committed is now
   set to the value N and fs_info->generation remains with the value N;

5. do another buffered write, when this happens btrfs_file_write_iter
   sets our inode's last_trans to the value N + 1 (that is
   fs_info->generation + 1 == N + 1);

6. transaction N + 1 is started and fs_info->generation now has the
   value N + 1;

7. transaction N + 1 is committed, so fs_info->last_trans_committed
   is set to the value N + 1;

8. fsync our inode - because it doesn't have the full sync flag set,
   we only start the ordered extent, we don't wait for it to complete
   (only in a later phase) therefore its last_trans field has the
   value N + 1 set previously by btrfs_file_write_iter(), and so we
   have:

       inode->last_trans <= fs_info->last_trans_committed
           (N + 1)              (N + 1)

   Which made us not log the last buffered write and exit the fsync
   handler immediately, returning success (0) to user space and resulting
   in data loss after a crash.

This can actually be triggered deterministically and the following excerpt
from a testcase I made for xfstests triggers the issue. It moves a dummy
file across directories and then fsyncs the old parent directory - this
is just to trigger a transaction commit, so moving files around isn't
directly related to the issue but it was chosen because running 'sync' for
example does more than just committing the current transaction, as it
flushes/waits for all file data to be persisted. The issue can also happen
at random periods, since the transaction kthread periodicaly commits the
current transaction (about every 30 seconds by default).
The body of the test is:

  _scratch_mkfs >> $seqres.full 2>&1
  _init_flakey
  _mount_flakey

  # Create our main test file 'foo', the one we check for data loss.
  # By doing an fsync against our file, it makes btrfs clear the 'needs_full_sync'
  # bit from its flags (btrfs inode specific flags).
  $XFS_IO_PROG -f -c "pwrite -S 0xaa 0 8K" \
                  -c "fsync" $SCRATCH_MNT/foo | _filter_xfs_io

  # Now create one other file and 2 directories. We will move this second file
  # from one directory to the other later because it forces btrfs to commit its
  # currently open transaction if we fsync the old parent directory. This is
  # necessary to trigger the data loss bug that affected btrfs.
  mkdir $SCRATCH_MNT/testdir_1
  touch $SCRATCH_MNT/testdir_1/bar
  mkdir $SCRATCH_MNT/testdir_2

  # Make sure everything is durably persisted.
  sync

  # Write more 8Kb of data to our file.
  $XFS_IO_PROG -c "pwrite -S 0xbb 8K 8K" $SCRATCH_MNT/foo | _filter_xfs_io

  # Move our 'bar' file into a new directory.
  mv $SCRATCH_MNT/testdir_1/bar $SCRATCH_MNT/testdir_2/bar

  # Fsync our first directory. Because it had a file moved into some other
  # directory, this made btrfs commit the currently open transaction. This is
  # a condition necessary to trigger the data loss bug.
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/testdir_1

  # Now fsync our main test file. If the fsync succeeds, we expect the 8Kb of
  # data we wrote previously to be persisted and available if a crash happens.
  # This did not happen with btrfs, because of the transaction commit that
  # happened when we fsynced the parent directory.
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo

  # Simulate a crash/power loss.
  _load_flakey_table $FLAKEY_DROP_WRITES
  _unmount_flakey

  _load_flakey_table $FLAKEY_ALLOW_WRITES
  _mount_flakey

  # Now check that all data we wrote before are available.
  echo "File content after log replay:"
  od -t x1 $SCRATCH_MNT/foo

  status=0
  exit

The expected golden output for the test, which is what we get with this
fix applied (or when running against ext3/4 and xfs), is:

  wrote 8192/8192 bytes at offset 0
  XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  wrote 8192/8192 bytes at offset 8192
  XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  File content after log replay:
  0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
  *
  0020000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
  *
  0040000

Without this fix applied, the output shows the test file does not have
the second 8Kb extent that we successfully fsynced:

  wrote 8192/8192 bytes at offset 0
  XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  wrote 8192/8192 bytes at offset 8192
  XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  File content after log replay:
  0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
  *
  0020000

So fix this by skipping the fsync only if we're doing a full sync and
if the inode's last_trans is <= fs_info->last_trans_committed, or if
the inode is already in the log. Also remove setting the inode's
last_trans in btrfs_file_write_iter since it's useless/unreliable.

Also because btrfs_file_write_iter no longer sets inode->last_trans to
fs_info->generation + 1, don't set last_trans to 0 if we bail out and don't
bail out if last_trans is 0, otherwise something as simple as the following
example wouldn't log the second write on the last fsync:

  1. write to file

  2. fsync file

  3. fsync file
       |--> btrfs_inode_in_log() returns true and it set last_trans to 0

  4. write to file
       |--> btrfs_file_write_iter() no longers sets last_trans, so it
            remained with a value of 0
  5. fsync
       |--> inode->last_trans == 0, so it bails out without logging the
            second write

A test case for xfstests will be sent soon.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

btrfs: fix lost return value due to variable shadowing

commit 1932b7be973b554ffe20a5bba6ffaed6fa995cdc upstream.

A block-local variable stores error code but btrfs_get_blocks_direct may
not return it in the end as there's a ret defined in the function scope.

Fixes: d187663ef24c ("Btrfs: lock extents as we map them in DIO")
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

mei: make device disabled on stop unconditionally

commit 6c15a8516b8118eb19a59fd0bd22df41b9101c32 upstream.

Set the internal device state to to disabled after hardware reset in stop flow.
This will cover cases when driver was not brought to disabled state because of
an error and in stop flow we wish not to retry the reset.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

iio: ad5686: fix optional reference voltage declaration

commit da019f59cb16570e78feaf10380ac65a3a06861e upstream.

When not using the "_optional" function, a dummy regulator is returned
and the driver fails to initialize.

Signed-off-by: Urs Fässler <urs.fassler@bytesatwork.ch>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

iio: imu: adis16400: Fix sign extension

commit 19e353f2b344ad86cea6ebbc0002e5f903480a90 upstream.

The intention is obviously to sign-extend a 12 bit quantity. But
because of C's promotion rules, the assignment is equivalent to "val16
&= 0xfff;". Use the proper API for this.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

x86/asm/entry/64: Remove a bogus 'ret_from_fork' optimization

commit 956421fbb74c3a6261903f3836c0740187cf038b upstream.

'ret_from_fork' checks TIF_IA32 to determine whether 'pt_regs' and
the related state make sense for 'ret_from_sys_call'. This is
entirely the wrong check. TS_COMPAT would make a little more
sense, but there's really no point in keeping this optimization
at all.

This fixes a return to the wrong user CS if we came from int
0x80 in a 64-bit task.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4710be56d76ef994ddf59087aad98c000fbab9a4.1424989793.git.luto@amacapital.net
[ Backported from tip:x86/asm. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

target: Check for LBA + sectors wrap-around in sbc_parse_cdb

commit aa179935edea9a64dec4b757090c8106a3907ffa upstream.

This patch adds a check to sbc_parse_cdb() in order to detect when
an LBA + sector vs. end-of-device calculation wraps when the LBA is
sufficently large enough (eg: 0xFFFFFFFFFFFFFFFF).

Cc: Martin Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

target: Add missing WRITE_SAME end-of-device sanity check

commit 8e575c50a171f2579e367a7f778f86477dfdaf49 upstream.

This patch adds a check to sbc_setup_write_same() to verify
the incoming WRITE_SAME LBA + number of blocks does not exceed
past the end-of-device.

Also check for potential LBA wrap-around as well.

Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Martin Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

target: Fix PR_APTPL_BUF_LEN buffer size limitation

commit f161d4b44d7cc1dc66b53365215227db356378b1 upstream.

This patch addresses the original PR_APTPL_BUF_LEN = 8k limitiation
for write-out of PR APTPL metadata that Martin has recently been
running into.

It changes core_scsi3_update_and_write_aptpl() to use vzalloc'ed
memory instead of kzalloc, and increases the default hardcoded
length to 256k.

It also adds logic in core_scsi3_update_and_write_aptpl() to double
the original length upon core_scsi3_update_aptpl_buf() failure, and
retries until the vzalloc'ed buffer is large enough to accommodate
the outgoing APTPL metadata.

Reported-by: Martin Svec <martin.svec@zoner.cz>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

drm/radeon: workaround for CP HW bug on CIK

commit a9c73a0e022c33954835e66fec3cd744af90ec98 upstream.

Emit the EOP twice to avoid cache flushing problems.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

drm/radeon: only enable kv/kb dpm interrupts once v3

commit 410af8d7285a0b96314845c75c39fd612b755688 upstream.

Enable at init and disable on fini. Workaround for hardware problems.

v2 (chk): extend commit message
v3: add new function

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com> (v2)
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

mm/memory.c: actually remap enough memory

commit 9cb12d7b4ccaa976f97ce0c5fd0f1b6a83bc2a75 upstream.

For whatever reason, generic_access_phys() only remaps one page, but
actually allows to access arbitrary size. It's quite easy to trigger
large reads, like printing out large structure with gdb, which leads to a
crash. Fix it by remapping correct size.

Fixes: 28b2ee20c7cb ("access_process_vm device memory infrastructure")
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

mm/compaction: fix wrong order check in compact_finished()

commit 372549c2a3778fd3df445819811c944ad54609ca upstream.

What we want to check here is whether there is highorder freepage in buddy
list of other migratetype in order to steal it without fragmentation.
But, current code just checks cc->order which means allocation request
order. So, this is wrong.

Without this fix, non-movable synchronous compaction below pageblock order
would not stopped until compaction is complete, because migratetype of
most pageblocks are movable and high order freepage made by compaction is
usually on movable type buddy list.

There is some report related to this bug. See below link.

http://www.spinics.net/lists/linux-mm/msg81666.html

Although the issued system still has load spike comes from compaction,
this makes that system completely stable and responsive according to his
report.

stress-highalloc test in mmtests with non movable order 7 allocation
doesn't show any notable difference in allocation success rate, but, it
shows more compaction success rate.

Compaction success rate (Compaction success * 100 / Compaction stalls, %)
18.47 : 28.94

Fixes: 1fb3f8ca0e92 ("mm: compaction: capture a suitable high-order page immediately when it is made available")
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

mm/nommu.c: fix arithmetic overflow in __vm_enough_memory()

commit 8138a67a5557ffea3a21dfd6f037842d4e748513 upstream.

I noticed that "allowed" can easily overflow by falling below 0, because
(total_vm / 32) can be larger than "allowed". The problem occurs in
OVERCOMMIT_NONE mode.

In this case, a huge allocation can success and overcommit the system
(despite OVERCOMMIT_NONE mode). All subsequent allocations will fall
(system-wide), so system become unusable.

The problem was masked out by commit c9b1d0981fcc
("mm: limit growth of 3% hardcoded other user reserve"),
but it's easy to reproduce it on older kernels:
1) set overcommit_memory sysctl to 2
2) mmap() large file multiple times (with VM_SHARED flag)
3) try to malloc() large amount of memory

It also can be reproduced on newer kernels, but miss-configured
sysctl_user_reserve_kbytes is required.

Fix this issue by switching to signed arithmetic here.

Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Cc: Andrew Shewmaker <agshew@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

mm/mmap.c: fix arithmetic overflow in __vm_enough_memory()

commit 5703b087dc8eaf47bfb399d6cf512d471beff405 upstream.

I noticed, that "allowed" can easily overflow by falling below 0,
because (total_vm / 32) can be larger than "allowed". The problem
occurs in OVERCOMMIT_NONE mode.

In this case, a huge allocation can success and overcommit the system
(despite OVERCOMMIT_NONE mode). All subsequent allocations will fall
(system-wide), so system become unusable.

The problem was masked out by commit c9b1d0981fcc
("mm: limit growth of 3% hardcoded other user reserve"),
but it's easy to reproduce it on older kernels:
1) set overcommit_memory sysctl to 2
2) mmap() large file multiple times (with VM_SHARED flag)
3) try to malloc() large amount of memory

It also can be reproduced on newer kernels, but miss-configured
sysctl_user_reserve_kbytes is required.

Fix this issue by switching to signed arithmetic here.

[akpm@linux-foundation.org: use min_t]
Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Cc: Andrew Shewmaker <agshew@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

mm/hugetlb: add migration entry check in __unmap_hugepage_range

commit 9fbc1f635fd0bd28cb32550211bf095753ac637a upstream.

If __unmap_hugepage_range() tries to unmap the address range over which
hugepage migration is on the way, we get the wrong page because pte_page()
doesn't work for migration entries. This patch simply clears the pte for
migration entries as we do for hwpoison entries.

Fixes: 290408d4a2 ("hugetlb: hugepage migration core")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

mm/hugetlb: add migration/hwpoisoned entry check in hugetlb_change_protection

commit a8bda28d87c38c6aa93de28ba5d30cc18e865a11 upstream.

There is a race condition between hugepage migration and
change_protection(), where hugetlb_change_protection() doesn't care about
migration entries and wrongly overwrites them. That causes unexpected
results like kernel crash. HWPoison entries also can cause the same
problem.

This patch adds is_hugetlb_entry_(migration|hwpoisoned) check in this
function to do proper actions.

Fixes: 290408d4a2 ("hugetlb: hugepage migration core")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

team: don't traverse port list using rcu in team_set_mac_address

[ Upstream commit 9215f437b85da339a7dfe3db6e288637406f88b2 ]

Currently the list is traversed using rcu variant. That is not correct
since dev_set_mac_address can be called which eventually calls
rtmsg_ifinfo_build_skb and there, skb allocation can sleep. So fix this
by remove the rcu usage here.

Fixes: 3d249d4ca7 "net: introduce ethernet teaming device"
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

net: ping: Return EAFNOSUPPORT when appropriate.

[ Upstream commit 9145736d4862145684009d6a72a6e61324a9439e ]

1. For an IPv4 ping socket, ping_check_bind_addr does not check
   the family of the socket address that's passed in. Instead,
   make it behave like inet_bind, which enforces either that the
   address family is AF_INET, or that the family is AF_UNSPEC and
   the address is 0.0.0.0.
2. For an IPv6 ping socket, ping_check_bind_addr returns EINVAL
   if the socket family is not AF_INET6. Return EAFNOSUPPORT
   instead, for consistency with inet6_bind.
3. Make ping_v4_sendmsg and ping_v6_sendmsg return EAFNOSUPPORT
   instead of EINVAL if an incorrect socket address structure is
   passed in.
4. Make IPv6 ping sockets be IPv6-only. The code does not support
   IPv4, and it cannot easily be made to support IPv4 because
   the protocol numbers for ICMP and ICMPv6 are different. This
   makes connect(::ffff:192.0.2.1) fail with EAFNOSUPPORT instead
   of making the socket unusable.

Among other things, this fixes an oops that can be triggered by:

    int s = socket(AF_INET, SOCK_DGRAM, IPPROTO_ICMP);
    struct sockaddr_in6 sin6 = {
        .sin6_family = AF_INET6,
        .sin6_addr = in6addr_any,
    };
    bind(s, (struct sockaddr *) &sin6, sizeof(sin6));

Change-Id: If06ca86d9f1e4593c0d6df174caca3487c57a241
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

udp: only allow UFO for packets from SOCK_DGRAM sockets

[ Upstream commit acf8dd0a9d0b9e4cdb597c2f74802f79c699e802 ]

If an over-MTU UDP datagram is sent through a SOCK_RAW socket to a
UFO-capable device, ip_ufo_append_data() sets skb->ip_summed to
CHECKSUM_PARTIAL unconditionally as all GSO code assumes transport layer
checksum is to be computed on segmentation. However, in this case,
skb->csum_start and skb->csum_offset are never set as raw socket
transmit path bypasses udp_send_skb() where they are usually set. As a
result, driver may access invalid memory when trying to calculate the
checksum and store the result (as observed in virtio_net driver).

Moreover, the very idea of modifying the userspace provided UDP header
is IMHO against raw socket semantics (I wasn't able to find a document
clearly stating this or the opposite, though). And while allowing
CHECKSUM_NONE in the UFO case would be more efficient, it would be a bit
too intrusive change just to handle a corner case like this. Therefore
disallowing UFO for packets from SOCK_DGRAM seems to be the best option.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: plusb: Add support for National Instruments host-to-host cable

[ Upstream commit 42c972a1f390e3bc51ca1e434b7e28764992067f ]

The National Instruments USB Host-to-Host Cable is based on the Prolific
PL-25A1 chipset. Add its VID/PID so the plusb driver will recognize it.

Signed-off-by: Ben Shelton <ben.shelton@ni.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

macvtap: make sure neighbour code can push ethernet header

[ Upstream commit 2f1d8b9e8afa5a833d96afcd23abcb8cdf8d83ab ]

Brian reported crashes using IPv6 traffic with macvtap/veth combo.

I tracked the crashes in neigh_hh_output()

-> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);

Neighbour code assumes headroom to push Ethernet header is
at least 16 bytes.

It appears macvtap has only 14 bytes available on arches
where NET_IP_ALIGN is 0 (like x86)

Effect is a corruption of 2 bytes right before skb->head,
and possible crashes if accessing non existing memory.

This fix should also increase IPv4 performance, as paranoid code
in ip_finish_output2() wont have to call skb_realloc_headroom()

Reported-by: Brian Rak <brak@vultr.com>
Tested-by: Brian Rak <brak@vultr.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

net: compat: Ignore MSG_CMSG_COMPAT in compat_sys_{send, recv}msg

[ Upstream commit d720d8cec563ce4e4fa44a613d4f2dcb1caf2998 ]

With commit a7526eb5d06b (net: Unbreak compat_sys_{send,recv}msg), the
MSG_CMSG_COMPAT flag is blocked at the compat syscall entry points,
changing the kernel compat behaviour from the one before the commit it
was trying to fix (1be374a0518a, net: Block MSG_CMSG_COMPAT in
send(m)msg and recv(m)msg).

On 32-bit kernels (!CONFIG_COMPAT), MSG_CMSG_COMPAT is 0 and the native
32-bit sys_sendmsg() allows flag 0x80000000 to be set (it is ignored by
the kernel). However, on a 64-bit kernel, the compat ABI is different
with commit a7526eb5d06b.

This patch changes the compat_sys_{send,recv}msg behaviour to the one
prior to commit 1be374a0518a.

The problem was found running 32-bit LTP (sendmsg01) binary on an arm64
kernel. Arguably, LTP should not pass 0xffffffff as flags to sendmsg()
but the general rule is not to break user ABI (even when the user
behaviour is not entirely sane).

Fixes: a7526eb5d06b (net: Unbreak compat_sys_{send,recv}msg)
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

team: fix possible null pointer dereference in team_handle_frame

[ Upstream commit 57e595631904c827cfa1a0f7bbd7cc9a49da5745 ]

Currently following race is possible in team:

CPU0                                        CPU1
                                            team_port_del
                                              team_upper_dev_unlink
                                                priv_flags &= ~IFF_TEAM_PORT
team_handle_frame
  team_port_get_rcu
    team_port_exists
      priv_flags & IFF_TEAM_PORT == 0
    return NULL (instead of port got
                 from rx_handler_data)
                                              netdev_rx_handler_unregister

The thing is that the flag is removed before rx_handler is unregistered.
If team_handle_frame is called in between, team_port_exists returns 0
and team_port_get_rcu will return NULL.
So do not check the flag here. It is guaranteed by netdev_rx_handler_unregister
that team_handle_frame will always see valid rx_handler_data pointer.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

net: reject creation of netdev names with colons

[ Upstream commit a4176a9391868bfa87705bcd2e3b49e9b9dd2996 ]

colons are used as a separator in netdev device lookup in dev_ioctl.c

Specific functions are SIOCGIFTXQLEN SIOCETHTOOL SIOCSIFNAME

Signed-off-by: Matthew Thode <mthode@mthode.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ematch: Fix auto-loading of ematch modules.

[ Upstream commit 34eea79e2664b314cab6a30fc582fdfa7a1bb1df ]

In tcf_em_validate(), after calling request_module() to load the
kind-specific module, set em->ops to NULL before returning -EAGAIN, so
that module_put() is not called again by tcf_em_tree_destroy().

Signed-off-by: Ignacy Gawędzki <ignacy.gawedzki@green-communications.fr>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

net: phy: Fix verification of EEE support in phy_init_eee

[ Upstream commit 54da5a8be3c1e924c35480eb44c6e9b275f6444e ]

phy_init_eee uses phy_find_setting(phydev->speed, phydev->duplex)
to find a valid entry in the settings array for the given speed
and duplex value. For full duplex 1000baseT, this will return
the first matching entry, which is the entry for 1000baseKX_Full.

If the phy eee does not support 1000baseKX_Full, this entry will not
match, causing phy_init_eee to fail for no good reason.

Fixes: 9a9c56cb34e6 ("net: phy: fix a bug when verify the EEE support")
Fixes: 3e7077067e80c ("phy: Expand phy speed/duplex settings array")
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ipv4: ip_check_defrag should not assume that skb_network_offset is zero

[ Upstream commit 3e32e733d1bbb3f227259dc782ef01d5706bdae0 ]

ip_check_defrag() may be used by af_packet to defragment outgoing packets.
skb_network_offset() of af_packet's outgoing packets is not zero.

Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ipv4: ip_check_defrag should correctly check return value of skb_copy_bits

[ Upstream commit fba04a9e0c869498889b6445fd06cbe7da9bb834 ]

skb_copy_bits() returns zero on success and negative value on error,
so it is needed to invert the condition in ip_check_defrag().

Fixes: 1bf3751ec90c ("ipv4: ip_check_defrag must not modify skb before unsharing")
Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

gen_stats.c: Duplicate xstats buffer for later use

[ Upstream commit 1c4cff0cf55011792125b6041bc4e9713e46240f ]

The gnet_stats_copy_app() function gets called, more often than not, with its
second argument a pointer to an automatic variable in the caller's stack.
Therefore, to avoid copying garbage afterwards when calling
gnet_stats_finish_copy(), this data is better copied to a dynamically allocated
memory that gets freed after use.

[xiyou.wangcong@gmail.com: remove a useless kfree()]

Signed-off-by: Ignacy Gawędzki <ignacy.gawedzki@green-communications.fr>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

rtnetlink: call ->dellink on failure when ->newlink exists

[ Upstream commit 7afb8886a05be68e376655539a064ec672de8a8e ]

Ignacy reported that when eth0 is down and add a vlan device
on top of it like:

ip link add link eth0 name eth0.1 up type vlan id 1

We will get a refcount leak:

unregister_netdevice: waiting for eth0.1 to become free. Usage count = 2

The problem is when rtnl_configure_link() fails in rtnl_newlink(),
we simply call unregister_device(), but for stacked device like vlan,
we almost do nothing when we unregister the upper device, more work
is done when we unregister the lower device, so call its ->dellink().

Reported-by: Ignacy Gawedzki <ignacy.gawedzki@green-communications.fr>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ipv6: fix ipv6_cow_metrics for non DST_HOST case

[ Upstream commit 3b4711757d7903ab6fa88a9e7ab8901b8227da60 ]

ipv6_cow_metrics() currently assumes only DST_HOST routes require
dynamic metrics allocation from inetpeer.  The assumption breaks
when ndisc discovered router with RTAX_MTU and RTAX_HOPLIMIT metric.
Refer to ndisc_router_discovery() in ndisc.c and note that dst_metric_set()
is called after the route is created.

This patch creates the metrics array (by calling dst_cow_metrics_generic) in
ipv6_cow_metrics().

Test:
radvd.conf:
interface qemubr0
{
AdvLinkMTU 1300;
AdvCurHopLimit 30;

prefix fd00:face:face:face::/64
{
AdvOnLink on;
AdvAutonomous on;
AdvRouterAddr off;
};
};

Before:
[root@qemu1 ~]# ip -6 r show | egrep -v unreachable
fd00:face:face:face::/64 dev eth0  proto kernel  metric 256  expires 27sec
fe80::/64 dev eth0  proto kernel  metric 256
default via fe80::74df:d0ff:fe23:8ef2 dev eth0  proto ra  metric 1024  expires 27sec

After:
[root@qemu1 ~]# ip -6 r show | egrep -v unreachable
fd00:face:face:face::/64 dev eth0  proto kernel  metric 256  expires 27sec mtu 1300
fe80::/64 dev eth0  proto kernel  metric 256  mtu 1300
default via fe80::74df:d0ff:fe23:8ef2 dev eth0  proto ra  metric 1024  expires 27sec mtu 1300 hoplimit 30

Fixes: 8e2ec639173f325 (ipv6: don't use inetpeer to store metrics for routes.)
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

rtnetlink: ifla_vf_policy: fix misuses of NLA_BINARY

[ Upstream commit 364d5716a7adb91b731a35765d369602d68d2881 ]

ifla_vf_policy[] is wrong in advertising its individual member types as
NLA_BINARY since .type = NLA_BINARY in combination with .len declares the
len member as *max* attribute length [0, len].

The issue is that when do_setvfinfo() is being called to set up a VF
through ndo handler, we could set corrupted data if the attribute length
is less than the size of the related structure itself.

The intent is exactly the opposite, namely to make sure to pass at least
data of minimum size of len.

Fixes: ebc08a6f47ee ("rtnetlink: Add VF config code to rtnetlink")
Cc: Mitch Williams <mitch.a.williams@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

pktgen: fix UDP checksum computation

[ Upstream commit 7744b5f3693cc06695cb9d6667671c790282730f ]

This patch fixes two issues in UDP checksum computation in pktgen.

First, the pseudo-header uses the source and destination IP
addresses. Currently, the ports are used for IPv4.

Second, the UDP checksum covers both header and data. So we need to
generate the data earlier (move pktgen_finalize_skb up), and compute
the checksum for UDP header + data.

Fixes: c26bf4a51308c ("pktgen: Add UDPCSUM flag to support UDP checksums")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

netfilter: xt_socket: fix a stack corruption bug

[ upstream commit 78296c97ca1fd3b104f12e1f1fbc06c46635990b ]

As soon as extract_icmp6_fields() returns, its local storage (automatic
variables) is deallocated and can be overwritten.

Lets add an additional parameter to make sure storage is valid long
enough.

While we are at it, adds some const qualifiers.

Cc: <stable@vger.kernel.org> # 3.12.x
Cc: <stable@vger.kernel.org> # 3.14.x
Cc: <stable@vger.kernel.org> # 3.18.x
Cc: <stable@vger.kernel.org> # 3.19.x
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: b64c9256a9b76 ("tproxy: added IPv6 support to the socket match")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ipvs: rerouting to local clients is not needed anymore

[ upstream commit 579eb62ac35845686a7c4286c0a820b4eb1f96aa ]

commit f5a41847acc5 ("ipvs: move ip_route_me_harder for ICMP")
from 2.6.37 introduced ip_route_me_harder() call for responses to
local clients, so that we can provide valid rt_src after SNAT.
It was used by TCP to provide valid daddr for ip_send_reply().
After commit 0a5ebb8000c5 ("ipv4: Pass explicit daddr arg to
ip_send_reply()." from 3.0 this rerouting is not needed anymore
and should be avoided, especially in LOCAL_IN.

Fixes 3.12.33 crash in xfrm reported by Florian Wiessner:
"3.12.33 - BUG xfrm_selector_match+0x25/0x2f6"

Cc: <stable@vger.kernel.org> # 3.10.x
Cc: <stable@vger.kernel.org> # 3.12.x
Cc: <stable@vger.kernel.org> # 3.14.x
Cc: <stable@vger.kernel.org> # 3.18.x
Reported-by: Smart Weblications GmbH - Florian Wiessner <f.wiessner@smart-weblications.de>
Tested-by: Smart Weblications GmbH - Florian Wiessner <f.wiessner@smart-weblications.de>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ipvs: add missing ip_vs_pe_put in sync code

[ upstream commit 528c943f3bb919aef75ab2fff4f00176f09a4019 ]

ip_vs_conn_fill_param_sync() gets in param.pe a module
reference for persistence engine from __ip_vs_pe_getbyname()
but forgets to put it. Problem occurs in backup for
sync protocol v1 (2.6.39).

Also, pe_data usually comes in sync messages for
connection templates and ip_vs_conn_new() copies
the pointer only in this case. Make sure pe_data
is not leaked if it comes unexpectedly for normal
connections. Leak can happen only if bogus messages
are sent to backup server.

Fixes: fe5e7a1efb66 ("IPVS: Backup, Adding Version 1 receive capability")
Cc: <stable@vger.kernel.org> # 3.10.x
Cc: <stable@vger.kernel.org> # 3.12.x
Cc: <stable@vger.kernel.org> # 3.14.x
Cc: <stable@vger.kernel.org> # 3.18.x
Cc: <stable@vger.kernel.org> # 3.19.x
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

MIPS: Export FP functions used by lose_fpu(1) for KVM

commit 3ce465e04bfd8de9956d515d6e9587faac3375dc upstream.

Export the _save_fp asm function used by the lose_fpu(1) macro to GPL
modules so that KVM can make use of it when it is built as a module.

This fixes the following build error when CONFIG_KVM=m due to commit
f798217dfd03 ("KVM: MIPS: Don't leak FPU/DSP to guest"):

ERROR: "_save_fp" [arch/mips/kvm/kvm.ko] undefined!

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Fixes: f798217dfd03 (KVM: MIPS: Don't leak FPU/DSP to guest)
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9260/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
[james.hogan@imgtec.com: Only export when CPU_R4K_FPU=y prior to v3.16,
so as not to break the Octeon build which excludes FPU support. KVM
depends on MIPS32r2 anyway.]
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

USB: EHCI: adjust error return code

commit c401e7b4a808d50ab53ef45cb8d0b99b238bf2c9 upstream.

The USB stack uses error code -ENOSPC to indicate that the periodic
schedule is too full, with insufficient bandwidth to accommodate a new
allocation. It uses -EFBIG to indicate that an isochronous transfer
could not be linked into the schedule because it would exceed the
number of isochronous packets the host controller driver can handle
(generally because the new transfer would extend too far into the
future).

ehci-hcd uses the wrong error code at one point. This patch fixes it,
along with a misleading comment and debugging message.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

staging: comedi: cb_pcidas64: fix incorrect AI range code handling

commit be8e89087ec2d2c8a1ad1e3db64bf4efdfc3c298 upstream.

The hardware range code values and list of valid ranges for the AI
subdevice is incorrect for several supported boards.  The hardware range
code values for all boards except PCI-DAS4020/12 is determined by
calling `ai_range_bits_6xxx()` based on the maximum voltage of the range
and whether it is bipolar or unipolar, however it only returns the
correct hardware range code for the PCI-DAS60xx boards.  For
PCI-DAS6402/16 (and /12) it returns the wrong code for the unipolar
ranges.  For PCI-DAS64/Mx/16 it returns the wrong code for all the
ranges and the comedi range table is incorrect.

Change `ai_range_bits_6xxx()` to use a look-up table pointed to by new
member `ai_range_codes` of `struct pcidas64_board` to map the comedi
range table indices to the hardware range codes.  Use a new comedi range
table for the PCI-DAS64/Mx/16 boards (and the commented out variants).

Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ath6kl: fix struct hif_scatter_req list handling

commit 31b9cc9a873dcab161999622314f98a75d838975 upstream.

Jason noticed that with Yocto GCC 4.8.1 ath6kl crashes with this iperf command:

iperf -c $TARGET_IP -i 5 -t 50 -w 1M

The crash was:

Unable to handle kernel paging request at virtual address 1a480000
pgd = 80004000
[1a480000] *pgd=00000000
Internal error: Oops: 805 [#1] SMP ARM
Modules linked in: ath6kl_sdio ath6kl_core [last unloaded: ath6kl_core]
CPU: 0 PID: 1953 Comm: kworker/u4:0 Not tainted 3.10.9-1.0.0_alpha+dbf364b #1
Workqueue: ath6kl ath6kl_sdio_write_async_work [ath6kl_sdio]
task: dcc9a680 ti: dc9ae000 task.ti: dc9ae000
PC is at v7_dma_clean_range+0x20/0x38
LR is at dma_cache_maint_page+0x50/0x54
pc : [<8001a6f8>]    lr : [<800170fc>]    psr: 20000093
sp : dc9afcf8  ip : 8001a748  fp : 00000004
r10: 00000000  r9 : 00000001  r8 : 00000000
r7 : 00000001  r6 : 00000000  r5 : 80cb7000  r4 : 03f9a480
r3 : 0000001f  r2 : 00000020  r1 : 1a480000  r0 : 1a480000
Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c53c7d  Table: 6cc5004a  DAC: 00000015
Process kworker/u4:0 (pid: 1953, stack limit = 0xdc9ae238)
Stack: (0xdc9afcf8 to 0xdc9b0000)
fce0:                                                       80c9b29c 00000000
fd00: 00000000 80017134 8001a748 dc302ac0 00000000 00000000 dc454a00 80c12ed8
fd20: dc115410 80017238 00000000 dc454a10 00000001 80017588 00000001 00000000
fd40: 00000000 dc302ac0 dc9afe38 dc9afe68 00000004 80c12ed8 00000000 dc454a00
fd60: 00000004 80436f88 00000000 00000000 00000600 0000ffff 0000000c 80c113c4
fd80: 80c9b29c 00000001 00000004 dc115470 60000013 dc302ac0 dc46e000 dc302800
fda0: dc9afe10 dc302b78 60000013 dc302ac0 dc46e000 00000035 dc46e5b0 80438c90
fdc0: dc9afe10 dc302800 dc302800 dc9afe68 dc9afe38 80424cb4 00000005 dc9afe10
fde0: dc9afe20 80424de8 dc9afe10 dc302800 dc46e910 80424e90 dc473c00 dc454f00
fe00: 000001b5 7f619d64 dcc7c830 00000000 00000000 dc9afe38 dc9afe68 00000000
fe20: 00000000 00000000 dc9afe28 dc9afe28 80424d80 00000000 00000035 9cac0034
fe40: 00000000 00000000 00000000 00000000 000001b5 00000000 00000000 00000000
fe60: dc9afe68 dc9afe10 3b9aca00 00000000 00000080 00000034 00000000 00000100
fe80: 00000000 00000000 dc9afe10 00000004 dc454a00 00000000 dc46e010 dc46e96c
fea0: dc46e000 dc46e964 00200200 00100100 dc46e910 7f619ec0 00000600 80c0e770
fec0: dc15a900 dcc7c838 00000000 dc46e954 8042d434 dcc44680 dc46e954 dc004400
fee0: dc454500 00000000 00000000 dc9ae038 dc004400 8003c450 dcc44680 dc004414
ff00: dc46e954 dc454500 00000001 dcc44680 dc004414 dcc44698 dc9ae000 dc9ae030
ff20: 00000001 dc9ae000 dc004400 8003d158 8003d020 00000000 00000000 80c53941
ff40: dc9aff64 dcb71ea0 00000000 dcc44680 8003d020 00000000 00000000 00000000
ff60: 00000000 80042480 00000000 00000000 000000f8 dcc44680 00000000 00000000
ff80: dc9aff80 dc9aff80 00000000 00000000 dc9aff90 dc9aff90 dc9affac dcb71ea0
ffa0: 800423cc 00000000 00000000 8000e018 00000000 00000000 00000000 00000000
ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[<8001a6f8>] (v7_dma_clean_range+0x20/0x38) from [<800170fc>] (dma_cache_maint_page+0x50/0x54)
[<800170fc>] (dma_cache_maint_page+0x50/0x54) from [<80017134>] (__dma_page_cpu_to_dev+0x34/0x9c)
[<80017134>] (__dma_page_cpu_to_dev+0x34/0x9c) from [<80017238>] (arm_dma_map_page+0x64/0x68)
[<80017238>] (arm_dma_map_page+0x64/0x68) from [<80017588>] (arm_dma_map_sg+0x7c/0xf4)
[<80017588>] (arm_dma_map_sg+0x7c/0xf4) from [<80436f88>] (sdhci_send_command+0x894/0xe00)
[<80436f88>] (sdhci_send_command+0x894/0xe00) from [<80438c90>] (sdhci_request+0xc0/0x1ec)
[<80438c90>] (sdhci_request+0xc0/0x1ec) from [<80424cb4>] (mmc_start_request+0xb8/0xd4)
[<80424cb4>] (mmc_start_request+0xb8/0xd4) from [<80424de8>] (__mmc_start_req+0x60/0x84)
[<80424de8>] (__mmc_start_req+0x60/0x84) from [<80424e90>] (mmc_wait_for_req+0x10/0x20)
[<80424e90>] (mmc_wait_for_req+0x10/0x20) from [<7f619d64>] (ath6kl_sdio_scat_rw.isra.10+0x1dc/0x240 [ath6kl_sdio])
[<7f619d64>] (ath6kl_sdio_scat_rw.isra.10+0x1dc/0x240 [ath6kl_sdio]) from [<7f619ec0>] (ath6kl_sdio_write_async_work+0x5c/0x104 [ath6kl_sdio])
[<7f619ec0>] (ath6kl_sdio_write_async_work+0x5c/0x104 [ath6kl_sdio]) from [<8003c450>] (process_one_work+0x10c/0x370)
[<8003c450>] (process_one_work+0x10c/0x370) from [<8003d158>] (worker_thread+0x138/0x3fc)
[<8003d158>] (worker_thread+0x138/0x3fc) from [<80042480>] (kthread+0xb4/0xb8)
[<80042480>] (kthread+0xb4/0xb8) from [<8000e018>] (ret_from_fork+0x14/0x3c)
Code: e1a02312 e2423001 e1c00003 f57ff04f (ee070f3a)
---[ end trace 0c038f0b8e0b67a3 ]---
Kernel panic - not syncing: Fatal exception

Jason's analysis:

  "The GCC 4.8.1 compiler will not do the for-loop till scat_entries, instead,
   it only run one round loop. This may be caused by that the GCC 4.8.1 thought
   that the scat_list only have one item and then no need to do full iteration,
   but this is simply wrong by looking at the assebly code. This will cause the sg
   buffer not get set when scat_entries > 1 and thus lead to kernel panic.

   Note: This issue not observed with GCC 4.7.2, only found on the GCC 4.8.1)"

Fix this by using the normal [0] style for defining unknown number of list
entries following the struct. This also fixes corruption with scat_q_depth, which
was mistankely added to the end of struct and overwritten if there were more
than item in the scat list.

Reported-by: Jason Liu <r64343@freescale.com>
Tested-by: Jason Liu <r64343@freescale.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

x86, mm/ASLR: Fix stack randomization on 64-bit systems

commit 4e7c22d447bb6d7e37bfe39ff658486ae78e8d77 upstream.

The issue is that the stack for processes is not properly randomized on
64 bit architectures due to an integer overflow.

The affected function is randomize_stack_top() in file
"fs/binfmt_elf.c":

  static unsigned long randomize_stack_top(unsigned long stack_top)
  {
           unsigned int random_variable = 0;

           if ((current->flags & PF_RANDOMIZE) &&
                   !(current->personality & ADDR_NO_RANDOMIZE)) {
                   random_variable = get_random_int() & STACK_RND_MASK;
                   random_variable <<= PAGE_SHIFT;
           }
           return PAGE_ALIGN(stack_top) + random_variable;
           return PAGE_ALIGN(stack_top) - random_variable;
  }

Note that, it declares the "random_variable" variable as "unsigned int".
Since the result of the shifting operation between STACK_RND_MASK (which
is 0x3fffff on x86_64, 22 bits) and PAGE_SHIFT (which is 12 on x86_64):

  random_variable <<= PAGE_SHIFT;

then the two leftmost bits are dropped when storing the result in the
"random_variable". This variable shall be at least 34 bits long to hold
the (22+12) result.

These two dropped bits have an impact on the entropy of process stack.
Concretely, the total stack entropy is reduced by four: from 2^28 to
2^30 (One fourth of expected entropy).

This patch restores back the entropy by correcting the types involved
in the operations in the functions randomize_stack_top() and
stack_maxrandom_size().

The successful fix can be tested with:

  $ for i in `seq 1 10`; do cat /proc/self/maps | grep stack; done
  7ffeda566000-7ffeda587000 rw-p 00000000 00:00 0                          [stack]
  7fff5a332000-7fff5a353000 rw-p 00000000 00:00 0                          [stack]
  7ffcdb7a1000-7ffcdb7c2000 rw-p 00000000 00:00 0                          [stack]
  7ffd5e2c4000-7ffd5e2e5000 rw-p 00000000 00:00 0                          [stack]
  ...

Once corrected, the leading bytes should be between 7ffc and 7fff,
rather than always being 7fff.

Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es>
Signed-off-by: Ismael Ripoll <iripoll@upv.es>
[ Rebased, fixed 80 char bugs, cleaned up commit message, added test example and CVE ]
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Fixes: CVE-2015-1593
Link: http://lkml.kernel.org/r/20150214173350.GA18393@www.outflux.net
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

blk-throttle: check stats_cpu before reading it from sysfs

commit 045c47ca306acf30c740c285a77a4b4bda6be7c5 upstream.

When reading blkio.throttle.io_serviced in a recently created blkio
cgroup, it's possible to race against the creation of a throttle policy,
which delays the allocation of stats_cpu.

Like other functions in the throttle code, just checking for a NULL
stats_cpu prevents the following oops caused by that race.

[ 1117.285199] Unable to handle kernel paging request for data at address 0x7fb4d0020
[ 1117.285252] Faulting instruction address: 0xc0000000003efa2c
[ 1137.733921] Oops: Kernel access of bad area, sig: 11 [#1]
[ 1137.733945] SMP NR_CPUS=2048 NUMA PowerNV
[ 1137.734025] Modules linked in: bridge stp llc kvm_hv kvm binfmt_misc autofs4
[ 1137.734102] CPU: 3 PID: 5302 Comm: blkcgroup Not tainted 3.19.0 #5
[ 1137.734132] task: c000000f1d188b00 ti: c000000f1d210000 task.ti: c000000f1d210000
[ 1137.734167] NIP: c0000000003efa2c LR: c0000000003ef9f0 CTR: c0000000003ef980
[ 1137.734202] REGS: c000000f1d213500 TRAP: 0300 Not tainted (3.19.0)
[ 1137.734230] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR: 42008884 XER: 20000000
[ 1137.734325] CFAR: 0000000000008458 DAR: 00000007fb4d0020 DSISR: 40000000 SOFTE: 0
GPR00: c0000000003ed3a0 c000000f1d213780 c000000000c59538 0000000000000000
GPR04: 0000000000000800 0000000000000000 0000000000000000 0000000000000000
GPR08: ffffffffffffffff 00000007fb4d0020 00000007fb4d0000 c000000000780808
GPR12: 0000000022000888 c00000000fdc0d80 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 000001003e120200 c000000f1d5b0cc0 0000000000000200 0000000000000000
GPR24: 0000000000000001 c000000000c269e0 0000000000000020 c000000f1d5b0c80
GPR28: c000000000ca3a08 c000000000ca3dec c000000f1c667e00 c000000f1d213850
[ 1137.734886] NIP [c0000000003efa2c] .tg_prfill_cpu_rwstat+0xac/0x180
[ 1137.734915] LR [c0000000003ef9f0] .tg_prfill_cpu_rwstat+0x70/0x180
[ 1137.734943] Call Trace:
[ 1137.734952] [c000000f1d213780] [d000000005560520] 0xd000000005560520 (unreliable)
[ 1137.734996] [c000000f1d2138a0] [c0000000003ed3a0] .blkcg_print_blkgs+0xe0/0x1a0
[ 1137.735039] [c000000f1d213960] [c0000000003efb50] .tg_print_cpu_rwstat+0x50/0x70
[ 1137.735082] [c000000f1d2139e0] [c000000000104b48] .cgroup_seqfile_show+0x58/0x150
[ 1137.735125] [c000000f1d213a70] [c0000000002749dc] .kernfs_seq_show+0x3c/0x50
[ 1137.735161] [c000000f1d213ae0] [c000000000218630] .seq_read+0xe0/0x510
[ 1137.735197] [c000000f1d213bd0] [c000000000275b04] .kernfs_fop_read+0x164/0x200
[ 1137.735240] [c000000f1d213c80] [c0000000001eb8e0] .__vfs_read+0x30/0x80
[ 1137.735276] [c000000f1d213cf0] [c0000000001eb9c4] .vfs_read+0x94/0x1b0
[ 1137.735312] [c000000f1d213d90] [c0000000001ebb38] .SyS_read+0x58/0x100
[ 1137.735349] [c000000f1d213e30] [c000000000009218] syscall_exit+0x0/0x98
[ 1137.735383] Instruction dump:
[ 1137.735405] 7c6307b4 7f891800 409d00b8 60000000 60420000 3d420004 392a63b0 786a1f24
[ 1137.735471] 7d49502a e93e01c8 7d495214 7d2ad214 <7cead02a> e9090008 e9490010 e9290018

And here is one code that allows to easily reproduce this, although this
has first been found by running docker.

void run(pid_t pid)
{
int n;
int status;
int fd;
char *buffer;
buffer = memalign(BUFFER_ALIGN, BUFFER_SIZE);
n = snprintf(buffer, BUFFER_SIZE, "%d\n", pid);
fd = open(CGPATH "/test/tasks", O_WRONLY);
write(fd, buffer, n);
close(fd);
if (fork() > 0) {
fd = open("/dev/sda", O_RDONLY | O_DIRECT);
read(fd, buffer, 512);
close(fd);
wait(&status);
} else {
fd = open(CGPATH "/test/blkio.throttle.io_serviced", O_RDONLY);
n = read(fd, buffer, BUFFER_SIZE);
close(fd);
}
free(buffer);
exit(0);
}

void test(void)
{
int status;
mkdir(CGPATH "/test", 0666);
if (fork() > 0)
wait(&status);
else
run(getpid());
rmdir(CGPATH "/test");
}

int main(int argc, char **argv)
{
int i;
for (i = 0; i < NR_TESTS; i++)
test();
return 0;
}

Reported-by: Ricardo Marin Matinata <rmm@br.ibm.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

btrfs: set proper message level for skinny metadata

commit 5efa0490cc94aee06cd8d282683e22a8ce0a0026 upstream.

This has been confusing people for too long, the message is really just
informative.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

jffs2: fix handling of corrupted summary length

commit 164c24063a3eadee11b46575c5482b2f1417be49 upstream.

sm->offset maybe wrong but magic maybe right, the offset do not have CRC.

Badness at c00c7580 [verbose debug info unavailable]
NIP: c00c7580 LR: c00c718c CTR: 00000014
REGS: df07bb40 TRAP: 0700   Not tainted  (2.6.34.13-WR4.3.0.0_standard)
MSR: 00029000 <EE,ME,CE>  CR: 22084f84  XER: 00000000
TASK = df84d6e0[908] 'mount' THREAD: df07a000
GPR00: 00000001 df07bbf0 df84d6e0 00000000 00000001 00000000 df07bb58 00000041
GPR08: 00000041 c0638860 00000000 00000010 22084f88 100636c8 df814ff8 00000000
GPR16: df84d6e0 dfa558cc c05adb90 00000048 c0452d30 00000000 000240d0 000040d0
GPR24: 00000014 c05ae734 c05be2e0 00000000 00000001 00000000 00000000 c05ae730
NIP [c00c7580] __alloc_pages_nodemask+0x4d0/0x638
LR [c00c718c] __alloc_pages_nodemask+0xdc/0x638
Call Trace:
[df07bbf0] [c00c718c] __alloc_pages_nodemask+0xdc/0x638 (unreliable)
[df07bc90] [c00c7708] __get_free_pages+0x20/0x48
[df07bca0] [c00f4a40] __kmalloc+0x15c/0x1ec
[df07bcd0] [c01fc880] jffs2_scan_medium+0xa58/0x14d0
[df07bd70] [c01ff38c] jffs2_do_mount_fs+0x1f4/0x6b4
[df07bdb0] [c020144c] jffs2_do_fill_super+0xa8/0x260
[df07bdd0] [c020230c] jffs2_fill_super+0x104/0x184
[df07be00] [c0335814] get_sb_mtd_aux+0x9c/0xec
[df07be20] [c033596c] get_sb_mtd+0x84/0x1e8
[df07be60] [c0201ed0] jffs2_get_sb+0x1c/0x2c
[df07be70] [c0103898] vfs_kern_mount+0x78/0x1e8
[df07bea0] [c0103a58] do_kern_mount+0x40/0x100
[df07bec0] [c011fe90] do_mount+0x240/0x890
[df07bf10] [c0120570] sys_mount+0x90/0xd8
[df07bf40] [c00110d8] ret_from_syscall+0x0/0x4

=== Exception: c01 at 0xff61a34
    LR = 0x100135f0
Instruction dump:
38800005 38600000 48010f41 4bfffe1c 4bfc2d15 4bfffe8c 72e90200 4082fc28
3d20c064 39298860 8809000d 68000001 <0f000000> 2f800000 419efc0c 38000001
mount: mounting /dev/mtdblock3 on /common failed: Input/output error

Signed-off-by: Chen Jie <chenjie6@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

EDAC, amd64_edac: Prevent OOPS with >16 memory controllers

commit 0c510cc83bdbaac8406f4f7caef34f4da0ba35ea upstream.

When DRAM errors occur on memory controllers after EDAC_MAX_MCS (16),
the kernel fatally dereferences unallocated structures, see splat below;
this occurs on at least NumaConnect systems.

Fix by checking if a memory controller info structure was found.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000320
IP: [<ffffffff819f714f>] decode_bus_error+0x2f/0x2b0
PGD 2f8b5a3067 PUD 2f8b5a2067 PMD 0
Oops: 0000 [#2] SMP
Modules linked in:
CPU: 224 PID: 11930 Comm: stream_c.exe.gn Tainted: G D 3.19.0 #1
Hardware name: Supermicro H8QGL/H8QGL, BIOS 3.5b 01/28/2015
task: ffff8807dbfb8c00 ti: ffff8807dd16c000 task.ti: ffff8807dd16c000
RIP: 0010:[<ffffffff819f714f>] [<ffffffff819f714f>] decode_bus_error+0x2f/0x2b0
RSP: 0000:ffff8907dfc03c48 EFLAGS: 00010297
RAX: 0000000000000001 RBX: 9c67400010080a13 RCX: 0000000000001dc6
RDX: 000000001dc61dc6 RSI: ffff8907dfc03df0 RDI: 000000000000001c
RBP: ffff8907dfc03ce8 R08: 0000000000000000 R09: 0000000000000022
R10: ffff891fffa30380 R11: 00000000001cfc90 R12: 0000000000000008
R13: 0000000000000000 R14: 000000000000001c R15: 00009c6740001000
FS: 00007fa97ee18700(0000) GS:ffff8907dfc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000320 CR3: 0000003f889b8000 CR4: 00000000000407e0
Stack:
0000000000000000 ffff8907dfc03df0 0000000000000008 9c67400010080a13
000000000000001c 00009c6740001000 ffff8907dfc03c88 ffffffff810e4f9a
ffff8907dfc03ce8 ffffffff81b375b9 0000000000000000 0000000000000010
Call Trace:
<IRQ>
? vprintk_default
? printk
amd_decode_mce
notifier_call_chain
atomic_notifier_call_chain
mce_log
machine_check_poll
mce_timer_fn
? mce_cpu_restart
call_timer_fn.isra.29
run_timer_softirq
__do_softirq
irq_exit
smp_apic_timer_interrupt
apic_timer_interrupt
<EOI>
? down_read_trylock
__do_page_fault
? __schedule
do_page_fault
page_fault

Signed-off-by: Daniel J Blueman <daniel@numascale.com>
Link: http://lkml.kernel.org/r/1424144078-24589-1-git-send-email-daniel@numascale.com
[ Boris: massage commit message ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz> [backport to 3.12]

md/raid1: fix read balance when a drive is write-mostly.

commit d1901ef099c38afd11add4cfb3312c02ef21ec4a upstream.

When a drive is marked write-mostly it should only be the
target of reads if there is no other option.

This behaviour was broken by

commit 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
md/raid1: read balance chooses idlest disk for SSD

which causes a write-mostly device to be *preferred* is some cases.

Restore correct behaviour by checking and setting
best_dist_disk and best_pending_disk rather than best_disk.

We only need to test one of these as they are both changed
from -1 or >=0 at the same time.

As we leave min_pending and best_dist unchanged, any non-write-mostly
device will appear better than the write-mostly device.

Reported-by: Tomáš Hodek <tomas.hodek@volny.cz>
Reported-by: Dark Penguin <darkpenguin@yandex.ru>
Signed-off-by: NeilBrown <neilb@suse.de>
Link: http://marc.info/?l=linux-raid&m=135982797322422
Fixes: 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

md/raid5: Fix livelock when array is both resyncing and degraded.

commit 26ac107378c4742978216be1005b7291b799c7b2 upstream.

Commit a7854487cd7128a30a7f4f5259de9f67d5efb95f:
md: When RAID5 is dirty, force reconstruct-write instead of read-modify-write.

Causes an RCW cycle to be forced even when the array is degraded.
A degraded array cannot support RCW as that requires reading all data
blocks, and one may be missing.

Forcing an RCW when it is not possible causes a live-lock and the code
spins, repeatedly deciding to do something that cannot succeed.

So change the condition to only force RCW on non-degraded arrays.

Reported-by: Manibalan P <pmanibalan@amiindia.co.in>
Bisected-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Tested-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Fixes: a7854487cd7128a30a7f4f5259de9f67d5efb95f
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

metag: Fix KSTK_EIP() and KSTK_ESP() macros

commit c2996cb29bfb73927a79dc96e598a718e843f01a upstream.

The KSTK_EIP() and KSTK_ESP() macros should return the user program
counter (PC) and stack pointer (A0StP) of the given task. These are used
to determine which VMA corresponds to the user stack in
/proc/<pid>/maps, and for the user PC & A0StP in /proc/<pid>/stat.

However for Meta the PC & A0StP from the task's kernel context are used,
resulting in broken output. For example in following /proc/<pid>/maps
output, the 3afff000-3b021000 VMA should be described as the stack:

  # cat /proc/self/maps
  ...
  100b0000-100b1000 rwxp 00000000 00:00 0          [heap]
  3afff000-3b021000 rwxp 00000000 00:00 0

And in the following /proc/<pid>/stat output, the PC is in kernel code
(1074234964 = 0x40078654) and the A0StP is in the kernel heap
(1335981392 = 0x4fa17550):

  # cat /proc/self/stat
  51 (cat) R ... 1335981392 1074234964 ...

Fix the definitions of KSTK_EIP() and KSTK_ESP() to use
task_pt_regs(tsk)->ctx rather than (tsk)->thread.kernel_context. This
gets the registers from the user context stored after the thread info at
the base of the kernel stack, which is from the last entry into the
kernel from userland, regardless of where in the kernel the task may
have been interrupted, which results in the following more correct
/proc/<pid>/maps output:

  # cat /proc/self/maps
  ...
  0800b000-08070000 r-xp 00000000 00:02 207        /lib/libuClibc-0.9.34-git.so
  ...
  100b0000-100b1000 rwxp 00000000 00:00 0          [heap]
  3afff000-3b021000 rwxp 00000000 00:00 0          [stack]

And /proc/<pid>/stat now correctly reports the PC in libuClibc
(134320308 = 0x80190b4) and the A0StP in the [stack] region (989864576 =
0x3b002280):

  # cat /proc/self/stat
  51 (cat) R ... 989864576 134320308 ...

Reported-by: Alexey Brodkin <Alexey.Brodkin@synopsys.com>
Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

xfs: Fix quota type in quota structures when reusing quota file

commit dfcc70a8c868fe03276fa59864149708fb41930b upstream.

For filesystems without separate project quota inode field in the
superblock we just reuse project quota file for group quotas (and vice
versa) if project quota file is allocated and we need group quota file.
When we reuse the file, quota structures on disk suddenly have wrong
type stored in d_flags though. Nobody really cares about this (although
structure type reported to userspace was wrong as well) except
that after commit 14bf61ffe6ac (quota: Switch ->get_dqblk() and
->set_dqblk() to use bytes as space units) assertion in
xfs_qm_scall_getquota() started to trigger on xfs/106 test (apparently I
was testing without XFS_DEBUG so I didn't notice when submitting the
above commit).

Fix the problem by properly resetting ddq->d_flags when running quotacheck
for a quota file.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

gpio: tps65912: fix wrong container_of arguments

commit 2f97c20e5f7c3582c7310f65a04465bfb0fd0e85 upstream.

The gpio_chip operations receive a pointer the gpio_chip struct which is
contained in the driver's private struct, yet the container_of call in those
functions point to the mfd struct defined in include/linux/mfd/tps65912.h.

Signed-off-by: Nicolas Saenz Julienne <nicolassaenzj@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

gpiolib: of: allow of_gpiochip_find_and_xlate to find more than one chip per node

commit 9cf75e9e4ddd587ac12e88e8751c358b7b27e95f upstream.

The change:

7b8792bbdffdff3abda704f89c6a45ea97afdc62
gpiolib: of: Correct error handling in of_get_named_gpiod_flags

assumed that only one gpio-chip is registred per of-node.
Some drivers register more than one chip per of-node, so
adjust the matching function of_gpiochip_find_and_xlate to
not stop looking for chips if a node-match is found and
the translation fails.

Fixes: 7b8792bbdffd ("gpiolib: of: Correct error handling in of_get_named_gpiod_flags")
Signed-off-by: Hans Holmberg <hans.holmberg@intel.com>
Acked-by: Alexandre Courbot <acourbot@nvidia.com>
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Tested-by: Tyler Hall <tylerwhall@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

arm64: compat Fix siginfo_t -> compat_siginfo_t conversion on big endian

commit 9d42d48a342aee208c1154696196497fdc556bbf upstream.

The native (64-bit) sigval_t union contains sival_int (32-bit) and
sival_ptr (64-bit). When a compat application invokes a syscall that
takes a sigval_t value (as part of a larger structure, e.g.
compat_sys_mq_notify, compat_sys_timer_create), the compat_sigval_t
union is converted to the native sigval_t with sival_int overlapping
with either the least or the most significant half of sival_ptr,
depending on endianness. When the corresponding signal is delivered to a
compat application, on big endian the current (compat_uptr_t)sival_ptr
cast always returns 0 since sival_int corresponds to the top part of
sival_ptr. This patch fixes copy_siginfo_to_user32() so that sival_int
is copied to the compat_siginfo_t structure.

Reported-by: Bamvor Jian Zhang <bamvor.zhangjian@huawei.com>
Tested-by: Bamvor Jian Zhang <bamvor.zhangjian@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

hx4700: regulator: declare full constraints

commit a52d209336f8fc7483a8c7f4a8a7d2a8e1692a6c upstream.

Since the removal of CONFIG_REGULATOR_DUMMY option, the touchscreen stopped
working. This patch enables the "replacement" for REGULATOR_DUMMY and
allows the touchscreen to work even though there is no regulator for "vcc".

Signed-off-by: Martin Vajnar <martin.vajnar@gmail.com>
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

KVM: x86: update masterclock values on TSC writes

commit 7f187922ddf6b67f2999a76dcb71663097b75497 upstream.

When the guest writes to the TSC, the masterclock TSC copy must be
updated as well along with the TSC_OFFSET update, otherwise a negative
tsc_timestamp is calculated at kvm_guest_time_update.

Once "if (!vcpus_matched && ka->use_master_clock)" is simplified to
"if (ka->use_master_clock)", the corresponding "if (!ka->use_master_clock)"
becomes redundant, so remove the do_request boolean and collapse
everything into a single condition.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

libceph: fix double __remove_osd() problem

commit 7eb71e0351fbb1b242ae70abb7bb17107fe2f792 upstream.

It turns out it's possible to get __remove_osd() called twice on the
same OSD.  That doesn't sit well with rb_erase() - depending on the
shape of the tree we can get a NULL dereference, a soft lockup or
a random crash at some point in the future as we end up touching freed
memory.  One scenario that I was able to reproduce is as follows:

            <osd3 is idle, on the osd lru list>
<con reset - osd3>
con_fault_finish()
  osd_reset()
                              <osdmap - osd3 down>
                              ceph_osdc_handle_map()
                                <takes map_sem>
                                kick_requests()
                                  <takes request_mutex>
                                  reset_changed_osds()
                                    __reset_osd()
                                      __remove_osd()
                                  <releases request_mutex>
                                <releases map_sem>
    <takes map_sem>
    <takes request_mutex>
    __kick_osd_requests()
      __reset_osd()
        __remove_osd() <-- !!!

A case can be made that osd refcounting is imperfect and reworking it
would be a proper resolution, but for now Sage and I decided to fix
this by adding a safe guard around __remove_osd().

Fixes: http://tracker.ceph.com/issues/8087
Cc: Sage Weil <sage@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

libceph: change from BUG to WARN for __remove_osd() asserts

commit cc9f1f518cec079289d11d732efa490306b1ddad upstream.

No reason to use BUG_ON for osd request list assertions.

Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

libceph: assert both regular and lingering lists in __remove_osd()

commit 7c6e6fc53e7335570ed82f77656cedce1502744e upstream.

It is important that both regular and lingering requests lists are
empty when the OSD is removed.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

Bluetooth: Add support for Acer [0489:e078]

commit 4b552bc9edfdc947862af225a0e2521edb5d37a0 upstream.

Add support for the QCA6174 chip.

    T:  Bus=06 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#=  3 Spd=12  MxCh= 0
    D:  Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
    P:  Vendor=0489 ProdID=e078 Rev=00.01
    C:  #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
    I:  If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    I:  If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb

Signed-off-by: Anantha Krishnan <ananthk@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

KVM: MIPS: Don't leak FPU/DSP to guest

[ Upstream commit f798217dfd038af981a18bbe4bc57027a08bb182 ]

The FPU and DSP are enabled via the CP0 Status CU1 and MX bits by
kvm_mips_set_c0_status() on a guest exit, presumably in case there is
active state that needs saving if pre-emption occurs. However neither of
these bits are cleared again when returning to the guest.

This effectively gives the guest access to the FPU/DSP hardware after
the first guest exit even though it is not aware of its presence,
allowing FP instructions in guest user code to intermittently actually
execute instead of trapping into the guest OS for emulation. It will
then read & manipulate the hardware FP registers which technically
belong to the user process (e.g. QEMU), or are stale from another user
process. It can also crash the guest OS by causing an FP exception, for
which a guest exception handler won't have been registered.

First lets save and disable the FPU (and MSA) state with lose_fpu(1)
before entering the guest. This simplifies the problem, especially for
when guest FPU/MSA support is added in the future, and prevents FR=1 FPU
state being live when the FR bit gets cleared for the guest, which
according to the architecture causes the contents of the FPU and vector
registers to become UNPREDICTABLE.

We can then safely remove the enabling of the FPU in
kvm_mips_set_c0_status(), since there should never be any active FPU or
MSA state to save at pre-emption, which should plug the FPU leak.

DSP state is always live rather than being lazily restored, so for that
it is simpler to just clear the MX bit again when re-entering the guest.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # v3.10+: 044f0f03eca0: MIPS: KVM: Deliver guest interrupts
Cc: <stable@vger.kernel.org> # v3.10+: 3ce465e04bfd: MIPS: Export FP functions used by lose_fpu(1) for KVM
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ARC: fix page address calculation if PAGE_OFFSET != LINUX_LINK_BASE

commit 06f34e1c28f3608b0ce5b310e41102d3fe7b65a1 upstream.

We used to calculate page address differently in 2 cases:

1. In virt_to_page(x) we do
--->8---
mem_map + (x - CONFIG_LINUX_LINK_BASE) >> PAGE_SHIFT
--->8---

2. In in pte_page(x) we do
--->8---
mem_map + (pte_val(x) - PAGE_OFFSET) >> PAGE_SHIFT
--->8---

That leads to problems in case PAGE_OFFSET != CONFIG_LINUX_LINK_BASE -
different pages will be selected depending on where and how we calculate
page address.

In particular in the STAR 9000853582 when gdb attempted to read memory
of another process it got improper page in get_user_pages() because this
is exactly one of the places where we search for a page by pte_page().

The fix is trivial - we need to calculate page address similarly in both
cases.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

kdb: fix incorrect counts in KDB summary command output

commit 146755923262037fc4c54abc28c04b1103f3cc51 upstream.

The output of KDB 'summary' command should report MemTotal, MemFree
and Buffers output in kB. Current codes report in unit of pages.

A define of K(x) as
is defined in the code, but not used.

This patch would apply the define to convert the values to kB.
Please include me on Cc on replies. I do not subscribe to linux-kernel.

Signed-off-by: Jay Lan <jlan@sgi.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ARM: pxa: add regulator_has_full_constraints to poodle board file

commit 9bc78f32c2e430aebf6def965b316aa95e37a20c upstream.

Add regulator_has_full_constraints() call to poodle board file to let
regulator core know that we do not have any additional regulators left.
This lets it substitute unprovided regulators with dummy ones.

This fixes the following warnings that can be seen on poodle if
regulators are enabled:

ads7846 spi1.0: unable to get regulator: -517
spi spi1.0: Driver ads7846 requests probe deferral
wm8731 0-001b: Failed to get supply 'AVDD': -517
wm8731 0-001b: Failed to request supplies: -517
wm8731 0-001b: ASoC: failed to probe component -517

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

ARM: pxa: add regulator_has_full_constraints to corgi board file

commit 271e80176aae4e5b481f4bb92df9768c6075bbca upstream.

Add regulator_has_full_constraints() call to corgi board file to let
regulator core know that we do not have any additional regulators left.
This lets it substitute unprovided regulators with dummy ones.

This fixes the following warnings that can be seen on corgi if
regulators are enabled:

ads7846 spi1.0: unable to get regulator: -517
spi spi1.0: Driver ads7846 requests probe deferral
wm8731 0-001b: Failed to get supply 'AVDD': -517
wm8731 0-001b: Failed to request supplies: -517
wm8731 0-001b: ASoC: failed to probe component -517
corgi-audio corgi-audio: ASoC: failed to instantiate card -517

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

vt: provide notifications on selection changes

commit 19e3ae6b4f07a87822c1c9e7ed99d31860e701af upstream.

The vcs device's poll/fasync support relies on the vt notifier to signal
changes to the screen content. Notifier invocations were missing for
changes that comes through the selection interface though. Fix that.

Tested with BRLTTY 5.2.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Cc: Dave Mielke <dave@mielke.cc>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

usb: core: buffer: smallest buffer should start at ARCH_DMA_MINALIGN

commit 5efd2ea8c9f4f12916ffc8ba636792ce052f6911 upstream.

the following error pops up during "testusb -a -t 10"
| musb-hdrc musb-hdrc.1.auto: dma_pool_free buffer-128, f134e000/be842000 (bad dma)
hcd_buffer_create() creates a few buffers, the smallest has 32 bytes of
size. ARCH_KMALLOC_MINALIGN is set to 64 bytes. This combo results in
hcd_buffer_alloc() returning memory which is 32 bytes aligned and it
might by identified by buffer_offset() as another buffer. This means the
buffer which is on a 32 byte boundary will not get freed, instead it
tries to free another buffer with the error message.

This patch fixes the issue by creating the smallest DMA buffer with the
size of ARCH_KMALLOC_MINALIGN (or 32 in case ARCH_KMALLOC_MINALIGN is
smaller). This might be 32, 64 or even 128 bytes. The next three pools
will have the size 128, 512 and 2048.
In case the smallest pool is 128 bytes then we have only three pools
instead of four (and zero the first entry in the array).
The last pool size is always 2048 bytes which is the assumed PAGE_SIZE /
2 of 4096. I doubt it makes sense to continue using PAGE_SIZE / 2 where
we would end up with 8KiB buffer in case we have 16KiB pages.
Instead I think it makes sense to have a common size(s) and extend them
if there is need to.
There is a BUILD_BUG_ON() now in case someone has a minalign of more than
128 bytes.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>