If CONFIG_BALLOON_COMPACTION=n balloon_page_insert() does not link pages
with balloon and doesn't set PagePrivate flag, as a result
balloon_page_dequeue() cannot get any pages because it thinks that all
of them are isolated. Without balloon compaction nobody can isolate
ballooned pages. It's safe to remove this check.
Fixes: d6d86c0a7f8d ("mm/balloon_compaction: redesign ballooned pages management"). Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com> Reported-by: Matt Mullins <mmullins@mmlx.us> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If __bitmap_shift_left() or __bitmap_shift_right() are asked to shift by
a multiple of BITS_PER_LONG, they will try to shift a long value by
BITS_PER_LONG bits which is undefined. Change the functions to avoid
the undefined shift.
Coverity id: 1192175
Coverity id: 1192174 Signed-off-by: Jan Kara <jack@suse.cz> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API") changed
page migration to uncharge the old page right away. The page is locked,
unmapped, truncated, and off the LRU, but it could race with writeback
ending, which then doesn't unaccount the page properly:
test_clear_page_writeback() migration
wait_on_page_writeback()
TestClearPageWriteback()
mem_cgroup_migrate()
clear PCG_USED
mem_cgroup_update_page_stat()
if (PageCgroupUsed(pc))
decrease memcg pages under writeback
release pc->mem_cgroup->move_lock
The per-page statistics interface is heavily optimized to avoid a
function call and a lookup_page_cgroup() in the file unmap fast path,
which means it doesn't verify whether a page is still charged before
clearing PageWriteback() and it has to do it in the stat update later.
Rework it so that it looks up the page's memcg once at the beginning of
the transaction and then uses it throughout. The charge will be
verified before clearing PageWriteback() and migration can't uncharge
the page as long as that is still set. The RCU lock will protect the
memcg past uncharge.
As far as losing the optimization goes, the following test results are
from a microbenchmark that maps, faults, and unmaps a 4GB sparse file
three times in a nested fashion, so that there are two negative passes
that don't account but still go through the new transaction overhead.
There is no actual difference:
old: 33.195102545 seconds time elapsed ( +- 0.01% )
new: 33.199231369 seconds time elapsed ( +- 0.03% )
The time spent in page_remove_rmap()'s callees still adds up to the
same, but the time spent in the function itself seems reduced:
Commit ff7ee93f4715 ("cgroup/kmemleak: Annotate alloc_page() for cgroup
allocations") introduces kmemleak_alloc() for alloc_page_cgroup(), but
corresponding kmemleak_free() is missing, which makes kmemleak be
wrongly disabled after memory offlining. Log is pasted at the end of
this commit message.
This patch add kmemleak_free() into free_page_cgroup(). During page
offlining, this patch removes corresponding entries in kmemleak rbtree.
After that, the freed memory can be allocated again by other subsystems
without killing kmemleak.
bash # for x in 1 2 3 4; do echo offline > /sys/devices/system/memory/memory$x/state ; sleep 1; done ; dmesg | grep leak
Compound page should be freed by put_page() or free_pages() with correct
order. Not doing so will cause tail pages leaked.
The compound order can be obtained by compound_order() or use
HPAGE_PMD_ORDER in our case. Some people would argue the latter is
faster but I prefer the former which is more general.
This bug was observed not just on our servers (the worst case we saw is
11G leaked on a 48G machine) but also on our workstations running Ubuntu
based distro.
ima_inode_setxattr() can be called with no value. Function does not
check the length so that following command can be used to produce
kernel oops: setfattr -n security.ima FOO. This patch fixes it.
Changes in v3:
* for stable reverted "allow setting hash only in fix or log mode"
It will be a separate patch.
Changes in v2:
* testing validity of xattr type
* allow setting hash only in fix or log mode (Mimi)
The PLAT_S5P Kconfig symbol was removed in commit d78c16ccde96
("ARM: SAMSUNG: Remove remaining legacy code"). There are still
some references left, fix that by replacing them with ARCH_S5PV210.
[10238.622067] scsi host3: uas_eh_bus_reset_handler start
[10240.766164] usb 3-4: reset SuperSpeed USB device number 3 using xhci_hcd
[10245.779365] usb 3-4: device descriptor read/8, error -110
[10245.883331] usb 3-4: reset SuperSpeed USB device number 3 using xhci_hcd
[10250.897603] usb 3-4: device descriptor read/8, error -110
[10251.058200] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[10251.058244] IP: [<ffffffff815ac6e1>] xhci_check_streams_endpoint+0x91/0x140
<snip>
[10251.059473] Call Trace:
[10251.059487] [<ffffffff815aca6c>] xhci_calculate_streams_and_bitmask+0xbc/0x130
[10251.059520] [<ffffffff815aeb5f>] xhci_alloc_streams+0x10f/0x5a0
[10251.059548] [<ffffffff810a4685>] ? check_preempt_curr+0x75/0xa0
[10251.059575] [<ffffffff810a46dc>] ? ttwu_do_wakeup+0x2c/0x100
[10251.059601] [<ffffffff810a49e6>] ? ttwu_do_activate.constprop.111+0x66/0x70
[10251.059635] [<ffffffff815779ab>] usb_alloc_streams+0xab/0xf0
[10251.059662] [<ffffffffc0616b48>] uas_configure_endpoints+0x128/0x150 [uas]
[10251.059694] [<ffffffffc0616bac>] uas_post_reset+0x3c/0xb0 [uas]
[10251.059722] [<ffffffff815727d9>] usb_reset_device+0x1b9/0x2a0
[10251.059749] [<ffffffffc0616f42>] uas_eh_bus_reset_handler+0xb2/0x190 [uas]
[10251.059781] [<ffffffff81514293>] scsi_try_bus_reset+0x53/0x110
[10251.059808] [<ffffffff815163b7>] scsi_eh_bus_reset+0xf7/0x270
<snip>
The problem is the following call sequence (simplified):
1) usb_reset_device
2) usb_reset_and_verify_device
2) hub_port_init
3) hub_port_finish_reset
3) xhci_discover_or_reset_device
This frees xhci->devs[slot_id]->eps[ep_index].ring for all eps but 0
4) usb_get_device_descriptor
This fails
5) hub_port_init fails
6) usb_reset_and_verify_device fails, does not restore device config
7) uas_post_reset
8) xhci_alloc_streams
NULL deref on the free-ed ring
This commit fixes this by not allowing usb_alloc_streams to continue if
the device is not configured.
Note that we do allow usb_free_streams to continue after a (logical)
disconnect, as it is necessary to explicitly free the streams at the xhci
controller level.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sometimes mass-storage devices using the Bulk-only transport will
mistakenly skip the data phase of a command. Rather than sending the
data expected by the host or sending a zero-length packet, they go
directly to the status phase and send the CSW.
This causes problems for usb-storage, for obvious reasons. The driver
will interpret the CSW as a short data transfer and will wait to
receive a CSW. The device won't have anything left to send, so the
command eventually times out.
The SCSI layer doesn't retry commands after they time out (this is a
relatively recent change). Therefore we should do our best to detect
a skipped data phase and handle it promptly.
This patch adds code to do that. If usb-storage receives a short
13-byte data transfer from the device, and if the first four bytes of
the data match the CSW signature, the driver will set the residue to
the full transfer length and interpret the data as a CSW.
This fixes Bugzilla #86611.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net> Tested-by: Paul Osmialowski <newchief@king.net.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This comes from the fact that usb-audio driver may receive the
disconnect callback multiple times, per each usb interface. When a
device has both audio and midi interfaces, it gets called twice, and
currently the driver tries to release resources at the last call.
At this point, the first parent interface has been already deleted,
thus deleting a child of the first parent hits such a warning.
For fixing this problem, we need to call snd_card_disconnect() and
cancel pending operations at the very first disconnect while the
release of the whole objects waits until the last disconnect call.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=80931 Reported-and-tested-by: Tomas Gayoso <tgayoso@gmail.com> Reported-and-tested-by: Chris J Arges <chris.j.arges@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, there's no guarantee that udc->driver
will be valid when using soft_connect sysfs
interface. In fact, we can very easily trigger
a NULL pointer dereference by trying to disconnect
when a gadget driver isn't loaded.
Enable the always-poll quirk for Elan Touchscreens found on some recent
Samsung laptops.
Without this quirk the device keeps disconnecting from the bus (and is
re-enumerated) unless opened (and kept open, should an input event
occur).
Note that while the device can be run-time suspended, the autosuspend
timeout must be high enough to allow the device to be polled at least
once before being suspended. Specifically, using autosuspend_delay_ms=0
will still cause the device to disconnect on input events.
Add quirk to make sure that a device is always polled for input events
even if it hasn't been opened.
This is needed for devices that disconnects from the bus unless the
interrupt endpoint has been polled at least once or when not responding
to an input event (e.g. after having shut down X).
Currently this quirk is enabled for the model with the device id 0x0089, it
is needed for the 0x009b model, which is found on the Fujitsu Lifebook u904
as well.
Enable device-qualifier quirk for Elan Touchscreen, which often fails to
handle requests for the device_descriptor.
Note that the device sometimes do respond properly with a Request Error
(three times as USB core retries), but usually fails to respond at all.
When this happens any further descriptor requests also fails, for
example:
[ 1528.688934] usb 2-7: new full-speed USB device number 4 using xhci_hcd
[ 1530.945588] usb 2-7: unable to read config index 0 descriptor/start: -71
[ 1530.945592] usb 2-7: can't read configurations, error -71
This has been observed repeating for over a minute before eventual
successful enumeration.
Add new quirk for devices that cannot handle requests for the
device_qualifier descriptor.
A USB-2.0 compliant device must respond to requests for the
device_qualifier descriptor (even if it's with a request error), but at
least one device is known to misbehave after such a request.
The commit '2e4c7553cd usb: gadget: f_fs: add aio support' broke the
quirk implemented to align buffer size to maxpacketsize on out endpoint.
As result, functionfs does not work on Intel platforms using dwc3 driver
(i.e. Bay Trail and Merrifield). This patch fixes the issue.
This code is based on a previous Qiuxu's patch.
Fixes: 2e4c7553cd (usb: gadget: f_fs: add aio support) Signed-off-by: David Cohen <david.a.cohen@linux.intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During FunctionFS bind, ffs_data_get() function was called twice
(in functionfs_bind() and in ffs_do_functionfs_bind()), while on unbind
ffs_data_put() was called once (in functionfs_unbind() function).
In result refcount never reached value 0, and ffs memory resources has
been never released.
Since ffs_data_get() call in ffs_do_functionfs_bind() is redundant
and not neccessary, we remove it to have equal number of gets ans puts,
and free allocated memory after refcount reach 0.
Fixes: 5920cda (usb: gadget: FunctionFS: convert to new function
interface with backward compatibility) Signed-off-by: Robert Baldyga <r.baldyga@samsung.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 468bcc2a2ca ("usb: musb: dsps: kill OTG timer on suspend") stopped
the timer in suspend path but forgot the re-enable it in the resume
path. This patch fixes the behaviour.
Fixes 468bcc2a2ca "usb: musb: dsps: kill OTG timer on suspend" Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c58d80f52 ("usb: musb: Ensure that cppi41 timer gets armed on
premature DMA TX irq") fixed hrtimer scheduling bug. There is one left
which does not trigger that often.
The following scenario is still possible:
expires:
t->function();
lock(&x->lock);
lock(&x->lock); if (!hrtimer_queued(&x->t))
hrtimer_start(&x->t);
unlock(&x->lock);
if (!list_empty(x->early_tx_list))
ret = HRTIMER_RESTART;
-> hrtimer_forward_now(...)
} else
ret = HRTIMER_NORESTART;
unlock(&x->lock);
and the timer callback returns HRTIMER_RESTART for an armed timer. This
is wrong and we run into the BUG_ON() in __run_hrtimer().
This can happens on SMP or PREEMPT-RT.
The patch fixes the problem by only starting the timer if the timer is
not yet queued.
Reported-by: Torben Hohn <torbenh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[bigeasy: collected information and created a patch + description based
on it] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If PM_RUNTIME is enabled, it is easy to trigger the following backtrace
on pxa2xx hosts:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at /home/lumag/linux/arch/arm/mach-pxa/clock.c:35 clk_disable+0xa0/0xa8()
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 3.17.0-00007-g1b3d2ee-dirty #104
[<c000de68>] (unwind_backtrace) from [<c000c078>] (show_stack+0x10/0x14)
[<c000c078>] (show_stack) from [<c001d75c>] (warn_slowpath_common+0x6c/0x8c)
[<c001d75c>] (warn_slowpath_common) from [<c001d818>] (warn_slowpath_null+0x1c/0x24)
[<c001d818>] (warn_slowpath_null) from [<c0015e80>] (clk_disable+0xa0/0xa8)
[<c0015e80>] (clk_disable) from [<c02507f8>] (pxa2xx_spi_suspend+0x2c/0x34)
[<c02507f8>] (pxa2xx_spi_suspend) from [<c0200360>] (platform_pm_suspend+0x2c/0x54)
[<c0200360>] (platform_pm_suspend) from [<c0207fec>] (dpm_run_callback.isra.14+0x2c/0x74)
[<c0207fec>] (dpm_run_callback.isra.14) from [<c0209254>] (__device_suspend+0x120/0x2f8)
[<c0209254>] (__device_suspend) from [<c0209a94>] (dpm_suspend+0x50/0x208)
[<c0209a94>] (dpm_suspend) from [<c00455ac>] (suspend_devices_and_enter+0x8c/0x3a0)
[<c00455ac>] (suspend_devices_and_enter) from [<c0045ad4>] (pm_suspend+0x214/0x2a8)
[<c0045ad4>] (pm_suspend) from [<c04b5c34>] (test_suspend+0x14c/0x1dc)
[<c04b5c34>] (test_suspend) from [<c000880c>] (do_one_initcall+0x8c/0x1fc)
[<c000880c>] (do_one_initcall) from [<c04aecfc>] (kernel_init_freeable+0xf4/0x1b4)
[<c04aecfc>] (kernel_init_freeable) from [<c0378078>] (kernel_init+0x8/0xec)
[<c0378078>] (kernel_init) from [<c0009590>] (ret_from_fork+0x14/0x24)
---[ end trace 46524156d8faa4f6 ]---
This happens because suspend function tries to disable a clock that is
already disabled by runtime_suspend callback. Add if
(!pm_runtime_suspended()) checks to suspend/resume path.
Fixes: 7d94a505858 (spi/pxa2xx: add support for runtime PM) Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Reported-by: Andrea Adami <andrea.adami@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are only 4 CTAR registers (CTAR0 - CTAR3) so we can only use the
lower 2 bits of the chip select to select a CTAR register.
SPI_PUSHR_CTAS used the lower 3 bits which would result in wrong bit values
if the chip selects 4/5 are used. For those chip selects SPI_CTAR even
calculated offsets of non-existing registers.
Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When mapped RX DMA entries are unmapped in an error condition when DMA
is firstly configured in the driver, the number of TX DMA entries was
passed in, which is incorrect
Signed-off-by: Ray Jui <rjui@broadcom.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On ISOC endpoints the last trb_pool entry used as a
LINK TRB is not getting zeroed out correctly due to
memset being called incorrectly and in the wrong place.
If pool allocated from DMA was not zero-initialized
to begin with this will result in the size and ctrl
values being random garbage. Call memset correctly after
assignment of the trb_link pointer.
Fixes: f6bafc6a1c ("usb: dwc3: convert TRBs into bitshifts") Signed-off-by: Jack Pham <jackp@codeaurora.org> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1) There's no way dwc3-omap's ->suspend() can cause any effect
on xhci's ->suspend(). Linux device driver model guarantees
that a parent's ->suspend() will only be called after all
children are suspended. dwc3-omap is the parent of the
parent of xhci.
2) When implementing Deep Sleep states where context is lost,
USBOTGSS_IRQ0 register, well, looses context so we
_must_ rewrite it otherwise core IRQs will never be
reenabled and USB will appear to be dead.
Fixes: 02dae36 (usb: dwc3: dwc3-omap: Disable/Enable only
wrapper interrupts in prepare/complete) Cc: George Cherian <george.cherian@ti.com> Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Dan Williams <dcbw@redhat.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The USB OTG port does not work since v3.16 on omap platform.
This is a regression introduced by the commit eb82a3d846fa (phy: omap-usb2: Balance pm_runtime_enable() on probe failure
and remove).
This because the call to pm_runtime_enable() function is moved after the
call to devm_phy_create() function, which has side effect since later in
the subsequent calls of devm_phy_create() there is a check with
pm_runtime_enabled() to configure few things.
Fixes: eb82a3d846fa Signed-off-by: Oussama Ghorbel <ghorbel@pivasoftware.com> Tested-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add new quirk for devices that cannot handle control-line state
requests.
Note that we currently send these requests to all devices, regardless of
whether they claim to support it, but that errors are only logged if
support is claimed.
Since commit 0943d8ead30e ("USB: cdc-acm: use tty-port dtr_rts"), which
only changed the timings for these requests slightly, this has been
reported to cause occasional firmware crashes on Simtec Electronics
Entropy Key devices after re-enumeration. Enable the quirk for this
device.
An official recent Windows driver from FTDI detects counterfeit devices
and reprograms the internal EEPROM containing the USB PID to 0, effectively
bricking the device.
Add support for this VID/PID pair to correctly bind the driver on these
devices.
Enable Silicon Labs Ember VID chips to enumerate with the cp210x usb serial
driver. EM358x devices operating with the Ember Z-Net 5.1.2 stack may now
connect to host PCs over a USB serial link.
uart_get_baud_rate() will return baud == 0 if the max rate is set
to the "magic" 38400 rate and the SPD_* flags are also specified.
On the first iteration, if the current baud rate is higher than the
max, the baud rate is clamped at the max (which in the degenerate
case is 38400). On the second iteration, the now-"magic" 38400 baud
rate selects the possibly higher alternate baud rate indicated by
the SPD_* flag. Since only two loop iterations are performed, the
loop is exited, a kernel WARNING is generated and a baud rate of
0 is returned.
Only perform the "magic" 38400 -> SPD_* baud transform on the first
loop iteration, which prevents the degenerate case from recognizing
the clamped baud rate as the "magic" 38400 value.
Reported-by: Robert Święcki <robert@swiecki.net> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Frank reports that after continuing in kgdb the RX stale event
doesn't occur until after the RX fifo is filled up with exactly
the amount of characters programmed for the RX watermark (in this
case it's 48). To read a single character from the uartdm
hardware we force a stale event so that any characters in the RX
packing buffer are flushed into the RX fifo immediately instead
of waiting for a stale timeout or for the fifo to fill. Forcing
that stale event asserts the stale interrupt but we never clear
that interrupt via UART_CR_CMD_RESET_STALE_INT in the polling
functions. So when kgdb continues the stale interrupt is left
pending in the hardware and we don't timeout with a stale event,
like we usually would if a user typed one character on the
console, until the reset stale interrupt and stale event commands
are sent. Frank could get things working again by running
handle_rx_dm(). By putting enough characters into the fifo he
could trigger a watermark interrupt, and thus cause
handle_rx_dm() to run finally resetting the stale interrupt
and enabling the stale event so that single characters would
cause timeouts again.
The fix is to just do what the interrupt routine was doing all
along and clear the stale interrupt and enable the event again.
Doing this also smooths over any differences in the fifo behavior
between v1.3 and v1.4 hardware allowing us to skip forcing the
uart into single character mode.
Reviewed-by: Frank Rowand <frank.rowand@sonymobile.com> Tested-by: Frank Rowand <frank.rowand@sonymobile.com> Fixes: f7e54d7ad743 "msm_serial: Add support for poll_{get,put}_char()" Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In older versions of the IIO framework it was possible to pass a completely
different set of channels to iio_buffer_register() as the one that is
assigned to the IIO device. Commit 959d2952d124 ("staging:iio: make
iio_sw_buffer_preenable much more general.") introduced a restriction that
requires that the set of channels that is passed to iio_buffer_register() is
a subset of the channels assigned to the IIO device as the IIO core will use
the list of channels that is assigned to the device to lookup a channel by
scan index in iio_compute_scan_bytes(). If it can not find the channel the
function will crash. This patch fixes the issue by making sure that the same
set of channels is assigned to the IIO device and passed to
iio_buffer_register().
Note that we need to remove the IIO_CHAN_INFO_RAW and IIO_CHAN_INFO_SCALE
info attributes from the channels since we don't actually want those to be
registered.
Fixes: 959d2952d124 ("staging:iio: make iio_sw_buffer_preenable much more general.") Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In older versions of the IIO framework it was possible to pass a
completely different set of channels to iio_buffer_register() as the one
that is assigned to the IIO device. Commit 959d2952d124 ("staging:iio: make
iio_sw_buffer_preenable much more general.") introduced a restriction that
requires that the set of channels that is passed to iio_buffer_register() is
a subset of the channels assigned to the IIO device as the IIO core will use
the list of channels that is assigned to the device to lookup a channel by
scan index in iio_compute_scan_bytes(). If it can not find the channel the
function will crash. This patch fixes the issue by making sure that the same
set of channels is assigned to the IIO device and passed to
iio_buffer_register().
Fixes: 959d2952d124 ("staging:iio: make iio_sw_buffer_preenable much more general.") Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use byte_for_channel as iterator to properly initialize the buffer.
Signed-off-by: Robin van der Gracht <robin@protonic.nl> Acked-by: Denis Ciocca <denis.ciocca@st.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
PM freezer relies on having all tasks frozen by the time devices are
getting frozen so that no task will touch them while they are getting
frozen. But OOM killer is allowed to kill an already frozen task in
order to handle OOM situtation. In order to protect from late wake ups
OOM killer is disabled after all tasks are frozen. This, however, still
keeps a window open when a killed task didn't manage to die by the time
freeze_processes finishes.
Reduce the race window by checking all tasks after OOM killer has been
disabled. This is still not race free completely unfortunately because
oom_killer_disable cannot stop an already ongoing OOM killer so a task
might still wake up from the fridge and get killed without
freeze_processes noticing. Full synchronization of OOM and freezer is,
however, too heavy weight for this highly unlikely case.
Introduce and check oom_kills counter which gets incremented early when
the allocator enters __alloc_pages_may_oom path and only check all the
tasks if the counter changes during the freezing attempt. The counter
is updated so early to reduce the race window since allocator checked
oom_killer_disabled which is set by PM-freezing code. A false positive
will push the PM-freezer into a slow path but that is not a big deal.
Changes since v1
- push the re-check loop out of freeze_processes into
check_frozen_processes and invert the condition to make the code more
readable as per Rafael
Fixes: f660daac474c6f (oom: thaw threads if oom killed thread is frozen before deferring) Signed-off-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It is reported that Samsung laptops that need to poll events are broken by
the following commit:
Commit 3afcf2ece453e1a8c2c6de19cdf06da3772a1b08
Subject: ACPI / EC: Add support to disallow QR_EC to be issued when SCI_EVT isn't set
The behaviors of the 2 vendor firmwares are conflict:
1. Acer: OSPM shouldn't issue QR_EC unless SCI_EVT is set, firmware
automatically sets SCI_EVT as long as there is event queued up.
2. Samsung: OSPM should issue QR_EC whatever SCI_EVT is set, firmware
returns 0 when there is no event queued up.
This patch is a quick fix to distinguish the behaviors to make Acer
behavior only effective for Acer EC firmware so that the breakages on
Samsung EC firmware can be avoided.
Fixes: 3afcf2ece453 (ACPI / EC: Add support to disallow QR_EC to be issued ...) Link: https://bugzilla.kernel.org/show_bug.cgi?id=44161 Reported-and-tested-by: Ortwin Glück <odi@odi.ch> Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw : Subject ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Function mp_register_gsi() returns blindly the GSI number for the ACPI
SCI interrupt. That causes a regression when the GSI for ACPI SCI is
shared with other devices.
The regression was caused by commit 84245af7297ced9e8fe "x86, irq, ACPI:
Change __acpi_register_gsi to return IRQ number instead of GSI" and
exposed on a SuperMicro system, which shares one GSI between ACPI SCI
and PCI device, with following failure:
When IOAPIC is disabled, acpi_gsi_to_irq() should return gsi directly
instead of calling mp_map_gsi_to_irq() to translate gsi to IRQ by IOAPIC.
It fixes https://bugzilla.kernel.org/show_bug.cgi?id=84381.
This regression was introduced with commit 6b9fb7082409 "x86, ACPI,
irq: Consolidate algorithm of mapping (ioapic, pin) to IRQ number"
Reported-and-Tested-by: Thomas Richter <thor@math.tu-berlin.de> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Thomas Richter <thor@math.tu-berlin.de> Cc: rui.zhang@intel.com Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: http://lkml.kernel.org/r/1413816327-12850-1-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
d_splice_alias() callers expect it to either stash the inode reference
into a new alias, or drop the inode reference. That makes it possible
to just return d_splice_alias() result from ->lookup() instance, without
any extra housekeeping required.
Unfortunately, that should include the failure exits. If d_splice_alias()
returns an error, it leaves the dentry it has been given negative and
thus it *must* drop the inode reference. Easily fixed, but it goes way
back and will need backporting.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since f660daac474c6f (oom: thaw threads if oom killed thread is frozen
before deferring) OOM killer relies on being able to thaw a frozen task
to handle OOM situation but a3201227f803 (freezer: make freezing() test
freeze conditions in effect instead of TIF_FREEZE) has reorganized the
code and stopped clearing freeze flag in __thaw_task. This means that
the target task only wakes up and goes into the fridge again because the
freezing condition hasn't changed for it. This reintroduces the bug
fixed by f660daac474c6f.
Fix the issue by checking for TIF_MEMDIE thread flag in
freezing_slow_path and exclude the task from freezing completely. If a
task was already frozen it would get woken by __thaw_task from OOM killer
and get out of freezer after rechecking freezing().
Changes since v1
- put TIF_MEMDIE check into freezing_slowpath rather than in __refrigerator
as per Oleg
- return __thaw_task into oom_scan_process_thread because
oom_kill_process will not wake task in the fridge because it is
sleeping uninterruptible
[mhocko@suse.cz: rewrote the changelog] Fixes: a3201227f803 (freezer: make freezing() test freeze conditions in effect instead of TIF_FREEZE) Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Using a VID value that is not high enough for the requested P state can
cause machine checks. Add a ceiling function to ensure calulated VIDs
with fractional values are set to the next highest integer VID value.
The algorythm for calculating the non-trubo VID from the BIOS writers
guide is:
vid_ratio = (vid_max - vid_min) / (max_pstate - min_pstate)
vid = ceiling(vid_min + (req_pstate - min_pstate) * vid_ratio)
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
BYT has a different conversion from P state to frequency than the core
processors. This causes the min/max and current frequency to be
misreported on some BYT SKUs. Tested on BYT N2820, Ivybridge and
Haswell processors.
Link: https://bugzilla.yoctoproject.org/show_bug.cgi?id=6663 Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit da167ad7638759 ("rtc: ia64: allow other architectures to use EFI
RTC") inadvertently introduced a regression for x86 because we've been
careful not to enable the EFI rtc driver due to the generally buggy
implementations of the time-related EFI runtime services.
In fact, since the above commit was merged we've seen reports of crashes
on 32-bit tablets,
Disable it explicitly for x86 so that we don't give users false hope
that this driver will work - it won't, and your machine is likely to
crash.
Acked-by: Mark Salter <msalter@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Intel processors which don't report cache information via cpuid(2)
or cpuid(4) need quirk code in the legacy_cache_size callback to
report this data. For Intel that callback is is intel_size_cache().
This patch enables calling of cpu_detect_cache_sizes() inside of
init_intel() and hence the calling of the legacy_cache callback in
intel_size_cache(). Adding this call will ensure that PIII Tualatin
currently in intel_size_cache() and Quark SoC X1000 being added to
intel_size_cache() in this patch will report their respective cache
sizes.
This model of calling cpu_detect_cache_sizes() is consistent with
AMD/Via/Cirix/Transmeta and Centaur.
Also added is a string to idenitfy the Quark as Quark SoC X1000
giving better and more descriptive output via /proc/cpuinfo
Adding cpu_detect_cache_sizes to init_intel() will enable calling
of intel_size_cache() on Intel processors which currently no code
can reach. Therefore this patch will also re-enable reporting
of PIII Tualatin cache size information as well as add
Quark SoC X1000 support.
Comment text and cache flow logic suggested by Thomas Gleixner
Code which changes policy to powersave changes also max_policy_pct based on
max_freq. Code which change max_perf_pct has upper limit base on value
max_policy_pct. When policy is changing from powersave back to performance
then max_policy_pct is not changed. Which means that changing max_perf_pct is
not possible to high values if max_freq was too low in powersave policy.
Some BIOSes modify the state of MSR_IA32_MISC_ENABLE_TURBO_DISABLE
based on the current power source for the system battery AC vs
battery. Reflect the correct current state and ability to modify the
no_turbo sysfs file based on current state of
MSR_IA32_MISC_ENABLE_TURBO_DISABLE.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=83151 Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently the core does not expose scaling_cur_freq for set_policy()
drivers this breaks some userspace monitoring tools.
Change the core to expose this file for all drivers and if the
set_policy() driver supports the get() callback use it to retrieve the
current frequency.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=73741 Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The page offset is 12 bits. For example if we have an
8 GB VM, we'd need 33 bits. The number of bits needed
for PD + PT is 21 (33 - 12 or log2(8) + 18), not 20
(log2(8) + 17).
Noticed by Alexey during code review.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Improper truncated integer division in the scale() function causes
actual_brightness != brightness. This (partial) work-around should be
sufficient for a majority of use-cases, but it is by no means a complete
solution.
TODO: Determine how best to scale "user" values to "hw" values, and
vice-versa, when the ranges are of different sizes. That would be a
buggy scenario even with this work-around.
The issue was introduced in the following (v3.17-rc1) commit:
6dda730 drm/i915: respect the VBT minimum backlight brightness
Note that for easier backporting this commit adds a duplicated macro.
A follow-up cleanup patch rectifies this for 3.18+
v2: (thanks to Chris Wilson) clarify commit message, use rounded division
macro
v3: -DIV_ROUND_CLOSEST() fails to build with CONFIG_X86_32=y. (Jani)
-Use DIV_ROUND_CLOSEST_ULL() instead. (Damien)
-v1 and v2 originally authored by Joe Konno.
Signed-off-by: U. Artie Eoff <ullysses.a.eoff@intel.com> Reviewed-By: Joe Konno <joe.konno@intel.com>
[danvet: Add backporting note.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
free_pi_state and exit_pi_state_list both clean up futex_pi_state's.
exit_pi_state_list takes the hb lock first, and most callers of
free_pi_state do too. requeue_pi doesn't, which means free_pi_state
can free the pi_state out from under exit_pi_state_list. For example:
task A | task B
exit_pi_state_list |
pi_state = |
curr->pi_state_list->next |
| futex_requeue(requeue_pi=1)
| // pi_state is the same as
| // the one in task A
| free_pi_state(pi_state)
| list_del_init(&pi_state->list)
| kfree(pi_state)
list_del_init(&pi_state->list) |
Move the free_pi_state calls in requeue_pi to before it drops the hb
locks which it's already holding.
[ tglx: Removed a pointless free_pi_state() call and the hb->lock held
debugging. The latter comes via a seperate patch ]
Signed-off-by: Brian Silverman <bsilver16384@gmail.com> Cc: austin.linux@gmail.com Cc: darren@dvhart.com Cc: peterz@infradead.org Link: http://lkml.kernel.org/r/1414282837-23092-1-git-send-email-bsilver16384@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
O_DIRECT flags can be toggeled via fcntl(F_SETFL). But this value checked
twice inside ext4_file_write_iter() and __generic_file_write() which
result in BUG_ON inside ext4_direct_IO.
When there are no meta block groups update_backups() will compute the
backup block in 32-bit arithmetics thus possibly overflowing the block
number and corrupting the filesystem. OTOH filesystems without meta
block groups larger than 16 TB should be rare. Fix the problem by doing
the counting in 64-bit arithmetics.
Coverity-id: 741252 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we fail to load block bitmap in __ext4_new_inode() we will
dereference NULL pointer in ext4_journal_get_write_access(). So check
for error from ext4_read_block_bitmap().
Coverity-id: 989065 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the ext4_has_group_desc_csum predicate to look for a checksum
driver instead of the metadata_csum flag and change the bg checksum
calculation function to look for GDT_CSUM before taking the crc16
path.
Without this patch, if we mount with ^uninit_bg,^metadata_csum and
later metadata_csum gets turned on by accident, the block group
checksum functions will incorrectly assume that checksumming is
enabled (metadata_csum) but that crc16 should be used
(!s_chksum_driver). This is totally wrong, so fix the predicate
and the checksum formula selection.
(Granted, if the metadata_csum feature bit gets enabled on a live FS
then something underhanded is going on, but we could at least avoid
writing garbage into the on-disk fields.)
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Besides the fact that this replacement improves code readability
it also protects from errors caused direct EXT4_S(sb)->s_es manipulation
which may result attempt to use uninitialized csum machinery.
#Testcase_BEGIN
IMG=/dev/ram0
MNT=/mnt
mkfs.ext4 $IMG
mount $IMG $MNT
#Enable feature directly on disk, on mounted fs
tune2fs -O metadata_csum $IMG
# Provoke metadata update, likey result in OOPS
touch $MNT/test
umount $MNT
#Testcase_END
Delalloc write journal reservations only reserve 1 credit,
to update the inode if necessary. However, it may happen
once in a filesystem's lifetime that a file will cross
the 2G threshold, and require the LARGE_FILE feature to
be set in the superblock as well, if it was not set already.
This overruns the transaction reservation, and can be
demonstrated simply on any ext4 filesystem without the LARGE_FILE
feature already set:
The boot loader inode (inode #5) should never be visible in the
directory hierarchy, but it's possible if the file system is corrupted
that there will be a directory entry that points at inode #5. In
order to avoid accidentally trashing it, when such a directory inode
is opened, the inode will be marked as a bad inode, so that it's not
possible to modify (or read) the inode from userspace.
Unfortunately, when we unlink this (invalid/illegal) directory entry,
we will put the bad inode on the ophan list, and then when try to
unlink the directory, we don't actually remove the bad inode from the
orphan list before freeing in-memory inode structure. This means the
in-memory orphan list is corrupted, leading to a kernel oops.
In addition, avoid truncating a bad inode in ext4_destroy_inode(),
since truncating the boot loader inode is not a smart thing to do.
Reported-by: Sami Liedes <sami.liedes@iki.fi> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If there is a corrupted file system which has directory entries that
point at reserved, metadata inodes, prohibit them from being used by
treating them the same way we treat Boot Loader inodes --- that is,
mark them to be bad inodes. This prohibits them from being opened,
deleted, or modified via chmod, chown, utimes, etc.
In particular, this prevents a corrupted file system which has a
directory entry which points at the journal inode from being deleted
and its blocks released, after which point Much Hilarity Ensues.
Use truncate_isize_extended() when hole is being created in a file so that
->page_mkwrite() will get called for the partial tail page if it is
mmaped (see the first patch in the series for details).
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The check whether quota format is set even though there are no
quota files with journalled quota is pointless and it actually
makes it impossible to turn off journalled quotas (as there's
no way to unset journalled quota format). Just remove the check.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When loading extended attributes, check each entry's value offset to
make sure it doesn't collide with the entries.
Without this check it is easy to crash the kernel by mounting a
malicious FS containing a file with an EA wherein e_value_offs = 0 and
e_value_size > 0 and then deleting the EA, which corrupts the name
list.
(See the f_ea_value_crash test's FS image in e2fsprogs for an example.)
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In commit 8393c524a25609 (MIPS: tlbex: Fix a missing statement for
HUGETLB), the TLB Refill handler was fixed so that non-OCTEON targets
would work properly with huge pages. The change was incorrect in that
it broke the OCTEON case.
In the non-octeon case there is a destructive test for the huge PTE
bit, and then at 0, $k0 is reloaded (that is what the 8393c524a25609
patch added).
In the octeon case, we modify k1 in the branch delay slot, but we
never need k0 again, so the new load is not needed, but since k1 is
modified, if we do the load, we load from a garbage location and then
get a nested TLB Refill, which is seen in userspace as either SIGBUS
or SIGSEGV (depending on the garbage).
The real fix is to only do this reloading if it is needed, and never
where it is harmful.
Code before the .fixup section needs to have the .insn directive.
This has no side effects on MIPS32/64 but it affects the way microMIPS
loads the address for the return label.
Fixes the following build problem:
mips-linux-gnu-ld: arch/mips/built-in.o: .fixup+0x4a0: Unsupported jump between
ISA modes; consider recompiling with interlinking enabled.
mips-linux-gnu-ld: final link failed: Bad value
Makefile:819: recipe for target 'vmlinux' failed
The fix is similar to 1658f914ff91c3bf ("MIPS: microMIPS:
Disable LL/SC and fix linker bug.")