On a Freescale i.MX53 based board we ran into "BUG: scheduling while
atomic" because input_inject_event locks interrupts, but
imx_pwm_config_v2 sleeps.
Tested on Freescale i.MX53 SoC with 4.6.0.
Signed-off-by: Manfred Schlaegl <manfred.schlaegl@gmx.at> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
After initially connecting a wired Xbox 360 controller or sending it
a command to change LEDs, a status/response packet is interpreted as
controller input. This causes the state of buttons represented in
byte 2 of the controller data packet to be incorrect until the next
valid input packet. Wireless Xbox 360 controllers are not affected.
Writing a new value to the LED device while holding the Start button
and running jstest is sufficient to reproduce this bug. An event will
come through with the Start button released.
Xboxdrv also won't attempt to read controller input from a packet
where byte 0 is non-zero. It also checks that byte 1 is 0x14, but
that value differs between wired and wireless controllers and this
code is shared by both. I think just checking byte 0 is enough to
eliminate unwanted packets.
The following are some examples of 3-byte status packets I saw:
01 03 02
02 03 00
03 03 03
08 03 00
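For illustration only, a minimal sketch (the function name is made up, not the driver's actual code) of the kind of early filter described above, so status/response packets are never parsed as controller input:

    /* Illustrative only: status/response packets have a non-zero first
     * byte; drop them before any button parsing happens. */
    static void process_packet(unsigned char *data, int len)
    {
            if (len < 3 || data[0] != 0x00)
                    return;         /* status packet, not controller input */

            /* ... parse button state (byte 2 onwards) as before ... */
    }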
Signed-off-by: Cameron Gutman <aicommander@gmail.com> Signed-off-by: Pavel Rojtberg <rojtberg@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
With netconsole (at least) the pr_err("... disabling\n") call can
recurse back into the dma-debug code, where it'll try to grab
free_entries_lock again. Avoid the problem by doing the printk after
dropping the lock.
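A hedged sketch of the reordering being described (simplified from the shape of the dma-debug allocation path, not the literal diff):

    spin_lock_irqsave(&free_entries_lock, flags);

    if (list_empty(&free_entries)) {
            global_disable = true;
            spin_unlock_irqrestore(&free_entries_lock, flags);
            /* printk can recurse into dma-debug via netconsole, so only
             * report the problem once free_entries_lock is dropped */
            pr_err("DMA-API: debugging out of memory - disabling\n");
            return NULL;
    }

    /* ... normal allocation continues with the lock held ... */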
Otherwise, if we fail to allocate new PIO buffers, our TXQs will try to
use the old ones, which aren't there any more.
Fixes: 183233bec810 "sfc: Allocate and link PIO buffers; map them with write-combining" Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The ccp-crypto module for AES XTS support has a bug that can allow requests
greater than 4096 bytes in size to be passed to the CCP hardware. The CCP
hardware does not support request sizes larger than 4096, resulting in
incorrect output. The request should actually be handled by the fallback
mechanism instantiated by the ccp-crypto module.
Add a check to ensure the request size is less than or equal to the maximum
supported size and use the fallback mechanism if it is not.
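An illustrative sketch of such a size check (helper names are hypothetical; the 4096-byte limit is the one stated above): oversized requests go to the software fallback instead of the CCP.

    #define CCP_XTS_AES_MAX_LEN     4096    /* hardware limit from the text */

    static int xts_aes_crypt(struct ablkcipher_request *req, bool encrypt)
    {
            /* The CCP cannot handle more than 4096 bytes per request;
             * hand anything larger to the software fallback cipher. */
            if (req->nbytes > CCP_XTS_AES_MAX_LEN)
                    return xts_aes_fallback(req, encrypt);  /* hypothetical helper */

            return ccp_submit_xts(req, encrypt);            /* hypothetical helper */
    }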
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: b955150ea784 ('RDMA/cxgb3: When a user QP is marked in error, also mark the CQs in error') Signed-off-by: Honggang Li <honli@redhat.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Ezequiel reported that he's facing UBI going into read-only
mode after a power cut. It turned out that this behavior happens
only when updating a static volume is interrupted and Fastmap is
used.
UBI checks static volumes for data consistency and reads the
whole volume upon first open. If the volume is found erroneous,
users of UBI cannot read from it, but another volume update is
possible to fix it. The check is performed by running
ubi_eba_read_leb() on every allocated LEB of the volume.
For static volumes ubi_eba_read_leb() computes the checksum of all
data stored in a LEB. To verify the computed checksum it has to read
the LEB's volume header which stores the original checksum.
If the volume header is not found UBI treats this as fatal internal
error and switches to RO mode. If the UBI device was attached via a
full scan the assumption is correct, the volume header has to be
present as it had to be there while scanning to get known as mapped.
If the attach operation happened via Fastmap the assumption is no
longer correct. When attaching via Fastmap UBI learns the mapping
table from Fastmap's snapshot of the system state and not via a full
scan. It can happen that a LEB got unmapped after a Fastmap was
written to the flash. Then UBI can learn the LEB still as mapped and
accessing it returns only 0xFF bytes. As UBI is not an FTL it is
allowed to have mappings to empty PEBs; it assumes that the layer
above takes care of LEB accounting and referencing.
UBIFS does so using the LEB property tree (LPT).
For static volumes UBI blindly assumes that all LEBs are present and
therefore special actions have to be taken.
The described situation can happen when updating a static volume is
interrupted, either by a user or a power cut.
The volume update code first unmaps all LEBs of a volume and then
writes LEB by LEB. If the sequence of operations is interrupted UBI
detects this either by the absence of LEBs, no volume header present
at scan time, or corrupted payload, detected via checksum.
In the Fastmap case the former method won't trigger as no scan
happened and UBI automatically thinks all LEBs are present.
Only by reading data from a LEB it detects that the volume header is
missing and incorrectly treats this as fatal error.
To deal with the situation ubi_eba_read_leb() from now on checks
whether we attached via Fastmap and handles the absence of a
volume header like a data corruption error.
This way interrupted static volume updates will correctly get detected
also when Fastmap is used.
Reported-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> Tested-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> Signed-off-by: Richard Weinberger <richard@nod.at>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
commit 0e707ae79ba3 ("UBI: do propagate positive error codes up") seems
to have produced an unintended change in the control flow here.
Completely untested, but it looks obvious.
Caught by Coverity, which didn't like the indentation. CID 1271184.
Signed-off-by: Brian Norris <computersforpeace@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
UBI uses positive function return codes internally, and should not propagate
them up, except in the place this patch fixes. Here is the original bug report
from Dan Carpenter:
The problem is really in ubi_eba_read_leb().
drivers/mtd/ubi/eba.c
 412         err = ubi_io_read_vid_hdr(ubi, pnum, vid_hdr, 1);
 413         if (err && err != UBI_IO_BITFLIPS) {
 414                 if (err > 0) {
 415                         /*
 416                          * The header is either absent or corrupted.
 417                          * The former case means there is a bug -
 418                          * switch to read-only mode just in case.
 419                          * The latter case means a real corruption - we
 420                          * may try to recover data. FIXME: but this is
 421                          * not implemented.
 422                          */
 423                         if (err == UBI_IO_BAD_HDR_EBADMSG ||
 424                             err == UBI_IO_BAD_HDR) {
 425                                 ubi_warn("corrupted VID header at PEB %d, LEB %d:%d",
 426                                          pnum, vol_id, lnum);
 427                                 err = -EBADMSG;
 428                         } else
 429                                 ubi_ro_mode(ubi);
On this path we return UBI_IO_FF and UBI_IO_FF_BITFLIPS and it
eventually gets passed to ERR_PTR(). We probably dereference the bad
pointer and oops. At that point we've gone read only so it was already
a bad situation...
Commit ff1e22e7a638 ("xen/events: Mask a moving irq") open-coded
irq_move_irq() but left out checking if the IRQ is disabled. This broke
resuming from suspend since it tries to move a (disabled) irq without
holding the IRQ's desc->lock. Fix it by adding in a check for disabled
IRQs.
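A hedged sketch of the condition in question, mirroring what irq_move_irq() checks in the core IRQ code (the helper name is made up; the real change open-codes this in the Xen event-channel handlers):

    /* Only migrate the irq if an affinity change is pending AND the irq is
     * not currently disabled; touching a disabled irq here would modify
     * desc->lock-protected state without holding the lock. */
    static bool evtchn_can_move_irq(struct irq_data *data)
    {
            return irqd_is_setaffinity_pending(data) &&
                   !irqd_irq_disabled(data);
    }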
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
creates an unreapable zombie if /sbin/init doesn't use __WALL.
This is not a kernel bug, at least in a sense that everything works as
expected: debugger should reap a traced sub-thread before it can reap the
leader, but without __WALL/__WCLONE do_wait() ignores sub-threads.
Unfortunately, it seems that /sbin/init in most (all?) distributions
doesn't use it and we have to change the kernel to avoid the problem.
Note also that most inits use sys_waitid(), which doesn't allow __WALL, so
the necessary user-space fix is not that trivial.
This patch just adds the "ptrace" check into eligible_child(). To some
degree this matches the "tsk->ptrace" check in exit_notify(): ->exit_signal is
mostly ignored when the tracee reports to the debugger. The same goes for
WSTOPPED: the tracer doesn't need to set this flag to wait for a stopped tracee.
This obviously means the user-visible change: __WCLONE and __WALL no
longer have any meaning for debugger. And I can only hope that this won't
break something, but at least strace/gdb won't suffer.
We could make a more conservative change. Say, we can take __WCLONE into
account, or !thread_group_leader(). But it would be nice to not
complicate these historical/confusing checks.
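A hedged sketch of where such a check sits, simplified from the eligible_child() logic in kernel/exit.c (the helper name here is made up):

    /* A traced child is always eligible, regardless of __WCLONE/__WALL. */
    static bool child_is_eligible(struct wait_opts *wo, bool ptrace,
                                  struct task_struct *p)
    {
            if (ptrace || (wo->wo_flags & __WALL))
                    return true;

            /* historical rule: __WCLONE selects children that do not
             * report SIGCHLD on exit, and only those */
            return !((p->exit_signal != SIGCHLD) ^ !!(wo->wo_flags & __WCLONE));
    }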
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> Cc: Pedro Alves <palves@redhat.com> Cc: Roland McGrath <roland@hack.frob.com> Cc: <syzkaller@googlegroups.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The length of the GSS MIC token need not be a multiple of four bytes.
It is then padded by XDR to a multiple of 4 B, but unwrap_integ_data()
would previously only trim mic.len + 4 B. The remaining up to three
bytes would then trigger a check in nfs4svc_decode_compoundargs(),
leading to a "garbage args" error and mount failure:
nfs4svc_decode_compoundargs: compound not properly padded!
nfsd: failed to decode arguments!
This would prevent older clients using the pre-RFC 4121 MIC format
(37-byte MIC including a 9-byte OID) from mounting exports from v3.9+
servers using krb5i.
The trimming was introduced by commit 4c190e2f913f ("sunrpc: trim off
trailing checksum before returning decrypted or integrity authenticated
buffer").
Fixes: 4c190e2f913f "sunrpc: trim off trailing checksum..." Signed-off-by: Tomáš Trnka <ttrnka@mail.muni.cz> Acked-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
008GE0 Toshiba mmc in some Intel Baytrail tablets responds to
MMC_SEND_EXT_CSD in 450-600ms.
This patch will:
(1) Increase the long read time quirk timeout from 300ms to 600ms. The original
author of that quirk says 300ms was only a guess and that the number
may need to be raised in the future.
(2) Add this specific MMC to the quirk list.
Signed-off-by: Matt Gumbel <matthew.k.gumbel@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When we read out the watermark state from the hardware we're supposed to
transfer that into the active watermarks, but currently we fail to clear any
part of the active watermarks that isn't explicitly written. Let's clear
it all upfront.
Looks like this has been like this since the beginning, when I added the
readout. No idea why I didn't clear it up.
Cc: Matt Roper <matthew.d.roper@intel.com> Fixes: 243e6a44b9ca ("drm/i915: Init HSW watermark tracking in intel_modeset_setup_hw_state()") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1463151318-14719-2-git-send-email-ville.syrjala@linux.intel.com
(cherry picked from commit 15606534bf0a65d8a74a90fd57b8712d147dbca6) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When SCSI was written, all commands coming from the filesystem
(REQ_TYPE_FS commands) had data. This meant that our signal for needing
to complete the command was the number of bytes completed being equal to
the number of bytes in the request. Unfortunately, with the advent of
flush barriers, we can now get zero length REQ_TYPE_FS commands, which
confuse this logic because they satisfy the condition every time. This
means they never get retried even for retryable conditions, like UNIT
ATTENTION because we complete them early assuming they're done. Fix
this by special casing the early completion condition to recognise zero
length commands with errors and let them drop through to the retry code.
Reported-by: Sebastian Parschauer <s.parschauer@gmx.de> Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Tested-by: Jack Wang <jinpu.wang@profitbricks.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
We used to check dev->reg_state against NETREG_REGISTERED after each
time we were woken up. But after commit 9e641bdcfa4e ("net-tun:
restructure tun_do_read for better sleep/wakeup efficiency"), it uses
skb_recv_datagram(), which does not check dev->reg_state. As a result,
if we delete a tun/tap device while a process is blocked in a read, the
device will wait forever for the reference count held by that process.
Fix this by using RCV_SHUTDOWN, which will be checked during
skb_recv_datagram(), before trying to wake up the process during uninit.
Fixes: 9e641bdcfa4e ("net-tun: restructure tun_do_read for better
sleep/wakeup efficiency") Cc: Eric Dumazet <edumazet@google.com> Cc: Xi Wang <xii@google.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The PM runtime will be left disabled for the device if its
.suspend_late() callback fails and async suspend is not allowed
for this device. In this case the device will not be added to
dpm_late_early_list and dpm_resume_early() will ignore this
device; as a result PM runtime will be disabled for it forever
(side effect: after 8 subsequent failures for the same device
the PM runtime will be reenabled due to disable_depth overflow).
To fix this problem, add devices to dpm_late_early_list regardless
of whether or not device_suspend_late() returns errors for them.
That will ensure failures in there are handled consistently for
all devices regardless of their async suspend/resume status.
Reported-by: Grygorii Strashko <grygorii.strashko@ti.com> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When running a 32-bit userspace on a 64-bit kernel, the UI_SET_PHYS
ioctl needs to be treated with special care, as it has the pointer
size encoded in the command.
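A hedged sketch of the usual pattern for pointer-carrying ioctls like this one (the handler name and exact placement are assumptions, not necessarily the driver's actual code): define a 32-bit variant of the command whose size field matches a 32-bit pointer, and normalise it in the compat handler.

    #ifdef CONFIG_COMPAT
    /* UI_SET_PHYS encodes sizeof(char *) in its command value, so 32-bit
     * userspace on a 64-bit kernel produces a different ioctl number. */
    #define UI_SET_PHYS_COMPAT \
            _IOW(UINPUT_IOCTL_BASE, _IOC_NR(UI_SET_PHYS), compat_uptr_t)

    static long uinput_compat_ioctl(struct file *file,
                                    unsigned int cmd, unsigned long arg)
    {
            if (cmd == UI_SET_PHYS_COMPAT)
                    cmd = UI_SET_PHYS;      /* normalise before dispatch */

            return uinput_ioctl_handler(file, cmd, arg, compat_ptr(arg));
    }
    #endif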
The session key is the default keyring set for request_key operations.
This session key is revoked when the user owning the session logs out.
Any long-running daemon processes started by this session end up with a
revoked session keyring, which prevents these processes from using the
request_key mechanism to obtain the krb5 keys.
The problem has been reported by a large number of autofs users. The
problem is also seen with multiuser mounts where the share may be used
by processes run by a user who has since logged out. A reproducer using
automount is available on the Red Hat bz.
The patch creates a new keyring which is used to cache cifs spnego
upcalls.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reported-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.16: keyring_alloc() doesn't take a restrict_link param] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
ehea_get_port may return NULL. Do not dereference NULL value.
Fixes: 8c4877a4128e ("ehea: Use the standard logging functions") Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Acked-by: Thadeu Lima de Souza Cascardo <cascardo@debian.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In some rare randconfig builds, we can end up with
ASYMMETRIC_PUBLIC_KEY_SUBTYPE enabled but CRYPTO_AKCIPHER disabled,
which fails to link because of the reference to crypto_alloc_akcipher:
crypto/built-in.o: In function `public_key_verify_signature':
:(.text+0x110e4): undefined reference to `crypto_alloc_akcipher'
This adds a Kconfig 'select' statement to ensure the dependency
is always there.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
We don't write back stale inodes so we should skip them in
xfs_iflush_cluster, too.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Some careless idiot(*) wrote crap code in commit 1a3e8f3 ("xfs:
convert inode cache lookups to use RCU locking") back in late 2010,
and so xfs_iflush_cluster checks the wrong inode for whether it is
still valid under RCU protection. Fix it to lock and check the
correct inode.
(*) Careless-idiot: Dave Chinner <dchinner@redhat.com>
Discovered-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When a failure due to an inode buffer occurs, the error handling
fails to abort the inode writeback correctly. This can result in the
inode being reclaimed whilst still in the AIL, leading to
use-after-free situations as well as filesystems that cannot be
unmounted as the inode log items left in the AIL never get removed.
Fix this by ensuring fatal errors from xfs_imap_to_bp() result in
the inode flush being aborted correctly.
Reported-by: Shyam Kaushik <shyam@zadarastorage.com> Diagnosed-by: Shyam Kaushik <shyam@zadarastorage.com> Tested-by: Shyam Kaushik <shyam@zadarastorage.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
[bwh: Backported to 3.16: as Dave pointed out, error codes are positive
here so compare with positive EAGAIN] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The function batadv_iv_ogm_orig_add_if allocates new buffers for bcast_own
and bcast_own_sum. It is expected that these buffers are unchanged in case
either bcast_own or bcast_own_sum couldn't be resized.
But the error handling of this function frees the already resized buffer
for bcast_own when the allocation of the new bcast_own_sum buffer failed.
This will lead to an invalid memory access when some code will try to
access bcast_own.
Instead the resized new bcast_own buffer has to be kept. This will not lead
to problems because the size of the buffer was only increased and therefore
no user of the buffer will try to access bytes outside of the new buffer.
Fixes: d0015fdd3d2c ("batman-adv: provide orig_node routing API") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <a@unstable.cc>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Commit 0b89e9aa2856 (cpuidle: delay enabling interrupts until all
coupled CPUs leave idle) rightfully fixed a regression by letting
the coupled idle state framework handle local interrupt enabling
when the CPU is exiting an idle state.
The current code checks if the idle state is coupled and, if so, it
will let the coupled code enable interrupts. This way, it can
decrement the ready-count before handling the interrupt. This
mechanism prevents the other CPUs from waiting for a CPU which is
handling interrupts.
But the check is done against the state index returned by the back
end driver's ->enter functions which could be different from the
initial index passed as parameter to the cpuidle_enter_state()
function.
if (!cpuidle_state_is_coupled(drv, entered_state))
local_irq_enable();
[ ... ]
If the 'index' is referring to a coupled idle state but the
'entered_state' is *not* coupled, then the interrupts are enabled
again. All CPUs blocked on the sync barrier may busy loop longer
if the CPU has interrupts to handle before decrementing the
ready-count. That's consuming more energy than saving.
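In other words, the coupled check has to be made against the index that was requested, not the one returned by ->enter(); a hedged sketch of that shape in cpuidle_enter_state():

    entered_state = target_state->enter(dev, drv, index);

    /* Decide based on the state that was *requested*: if 'index' is a
     * coupled state, the coupled framework enables interrupts itself,
     * after the ready-count has been decremented. */
    if (!cpuidle_state_is_coupled(drv, index))
            local_irq_enable();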
Fixes: 0b89e9aa2856 (cpuidle: delay enabling interrupts until all coupled CPUs leave idle) Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Commit 176e21ee2ec8 ("SUNRPC: Support for RPC over AF_LOCAL
transports") added a 5-character netid, but did not bump
RPCBIND_MAXNETIDLEN from 4 to 5.
Fixes: 176e21ee2ec8 ("SUNRPC: Support for RPC over AF_LOCAL ...") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The wrong return code was being returned on SMB3 rmdir of a
non-empty directory.
For SMB3 (unlike for cifs), we attempt to delete a directory by
setting the delete-on-close flag on the open. Windows clients set
this flag via a set info (SET_FILE_DISPOSITION to set this flag),
which properly checks if the directory is empty.
With this patch on smb3 mounts we correctly return
"DIRECTORY NOT EMPTY"
on attempts to remove a non-empty directory.
Signed-off-by: Steve French <steve.french@primarydata.com> Acked-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.16:adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.16: adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.16: adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
See [MS-NLMP] 3.2.5.1.2 Server Receives an AUTHENTICATE_MESSAGE from the Client:
...
Set NullSession to FALSE
If (AUTHENTICATE_MESSAGE.UserNameLen == 0 AND
AUTHENTICATE_MESSAGE.NtChallengeResponse.Length == 0 AND
(AUTHENTICATE_MESSAGE.LmChallengeResponse == Z(1)
OR
AUTHENTICATE_MESSAGE.LmChallengeResponse.Length == 0))
-- Special case: client requested anonymous authentication
Set NullSession to TRUE
...
Only servers which map unknown users to guest will allow
access using a non-null NTChallengeResponse.
For Samba it's the "map to guest = bad user" option.
During boot, MST hotplugs are generally expected (even if no physical
hotplugging occurs) and result in DRM's connector topology changing.
This means that using num_connector from the current mode configuration
can lead to the number of connectors changing under us. This can lead to
some nasty scenarios in fbcon:
- We allocate an array to the size of dev->mode_config.num_connectors.
- MST hotplug occurs, dev->mode_config.num_connectors gets incremented.
- We try to loop through each element in the array using the new value
of dev->mode_config.num_connectors, and end up going out of bounds
since dev->mode_config.num_connectors is now larger than the array we
allocated.
fb_helper->connector_count however, will always remain consistent while
we do a modeset in fb_helper.
Note: This is just polish for 4.7, Dave Airlie's drm_connector
refcounting fixed these bugs for real. But it's good enough duct-tape
for stable kernel backporting, since backporting the refcounting
changes is way too invasive.
During boot time, MST devices usually send a ton of hotplug events
regardless of whether or not any physical hotplugs actually occurred.
Hotplugs mean connectors being created/destroyed, and the number of DRM
connectors changing under us. This isn't a problem if we use
fb_helper->connector_count since we only set it once in the code,
however if we use num_connector from struct drm_mode_config we risk its
value changing under us. On top of that, there's even a chance that
dev->mode_config.num_connector != fb_helper->connector_count. If the
number of connectors happens to increase under us, we'll end up using
the wrong array size for memcpy and start writing beyond the actual
length of the array, occasionally resulting in kernel panics.
Note: This is just polish for 4.7, Dave Airlie's drm_connector
refcounting fixed these bugs for real. But it's good enough duct-tape
for stable kernel backporting, since backporting the refcounting
changes is way too invasive.
Irrespective of the fact that this is horrible code that should be
fixed for many reasons, it does highlight a deficiency in the generic
preempt_count manipulators, as it is never right to combine/elide
preempt_count manipulations like this.
Therefore sprinkle some volatile in the two generic accessors to
ensure the compiler is aware of the fact that the preempt_count is
observed outside of the regular program-order view and thus cannot be
optimized away like this.
x86, the only arch not using the generic code, is not affected, as we
do all this in asm in order to use the segment base per-cpu stuff.
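A hedged sketch of what "sprinkling some volatile" in the generic accessors amounts to, in asm-generic/preempt.h style (the 3.16 backport uses ACCESS_ONCE(), per the note below):

    /* Make every access to preempt_count an observable (volatile) access
     * so the compiler cannot pair up and elide increment/decrement
     * sequences across other code. */
    static __always_inline int preempt_count(void)
    {
            return ACCESS_ONCE(current_thread_info()->preempt_count);
    }

    static __always_inline volatile int *preempt_count_ptr(void)
    {
            return &current_thread_info()->preempt_count;
    }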
Reported-by: Vikram Mulukutla <markivx@codeaurora.org> Tested-by: Vikram Mulukutla <markivx@codeaurora.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: a787870924db ("sched, arch: Create asm/preempt.h") Link: http://lkml.kernel.org/r/20160516131751.GH3205@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.16: use ACCESS_ONCE() instead of READ_ONCE()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When we free cb->skb after a dump, we do it after releasing the
lock. This means that a new dump could have started in the meantime
and we'll end up freeing their skb instead of ours.
This patch saves the skb and module before we unlock so we free
the right memory.
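A hedged sketch of the pattern, simplified from the shape of netlink_dump() (locals shown for illustration): capture the skb and owning module while still holding the lock, drop the lock, then free.

    struct module *module;
    struct sk_buff *skb;

    mutex_lock(nlk->cb_mutex);
    /* ... dump is complete ... */
    nlk->cb_running = false;
    module = cb->module;    /* capture while still holding the lock ... */
    skb = cb->skb;
    mutex_unlock(nlk->cb_mutex);

    module_put(module);     /* ... so we release *our* resources, */
    consume_skb(skb);       /* not those of a dump that started meanwhile */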
Fixes: 16b304f3404f ("netlink: Eliminate kmalloc in netlink dump operation.") Reported-by: Baozeng Ding <sploving1@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
b84106b4e229 ("PCI: Disable IO/MEM decoding for devices with non-compliant
BARs") disabled BAR sizing for BARs 0-5 of devices that don't comply with
the PCI spec. But it didn't do anything for expansion ROM BARs, so we
still try to size them, resulting in warnings like this on Broadwell-EP:
pci 0000:ff:12.0: BAR 6: failed to assign [mem size 0x00000001 pref]
Move the non-compliant BAR check from __pci_read_base() up to
pci_read_bases() so it applies to the expansion ROM BAR as well as
to BARs 0-5.
Note that direct callers of __pci_read_base(), like sriov_init(), will now
bypass this check. We haven't had reports of devices with broken SR-IOV
BARs yet.
[bhelgaas: changelog] Fixes: b84106b4e229 ("PCI: Disable IO/MEM decoding for devices with non-compliant BARs") Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Andi Kleen <ak@linux.intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Commit b894157145e4 ("x86/PCI: Mark Broadwell-EP Home Agent & PCU as having
non-compliant BARs") marked Home Agent 0 & PCU has having non-compliant
BARs. Home Agent 1 also has non-compliant BARs.
Mark Home Agent 1 as having non-compliant BARs so the PCI core doesn't
touch them.
The problem with these devices is documented in the Xeon v4 specification
update:
BDF2 PCI BARs in the Home Agent Will Return Non-Zero Values
During Enumeration
Problem: During system initialization the Operating System may access
the standard PCI BARs (Base Address Registers). Due to
this erratum, accesses to the Home Agent BAR registers (Bus
1; Device 18; Function 0,4; Offsets 0x14-0x24) will return
non-zero values.
Implication: The operating system may issue a warning. Intel has not
observed any functional failures due to this erratum.
Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e5-v4-spec-update.html Fixes: b894157145e4 ("x86/PCI: Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs") Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Andi Kleen <ak@linux.intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Prevent using uninitialized or negative index when handling
steering entries.
Fixes: b12d93d63c32 ('mlx4: Add support for promiscuous mode in the new steering model.') Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When the this_order variable in blk_mq_init_rq_map() becomes zero
the code incorrectly decrements the variable and passes the result
to the order_to_size() helper, causing undefined behaviour:
UBSAN: Undefined behaviour in block/blk-mq.c:1459:27
shift exponent 4294967295 is too large for 32-bit type 'unsigned int'
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc6-00072-g33656a1 #22
Fix the code by first checking that this_order is not zero.
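A hedged sketch of the shape of such a guard (loop variables as named in the text):

    /* Find the largest order that still fits in 'left' bytes; stop at zero
     * so order_to_size(this_order - 1) is never evaluated with an
     * (unsigned)-1 shift exponent. */
    while (this_order && left < order_to_size(this_order - 1))
            this_order--;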
Reported-by: Meelis Roos <mroos@linux.ee> Fixes: 320ae51feed5 ("blk-mq: new multi-queue block IO queueing mechanism") Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Some eMMCs set the partition switch timeout too low.
Now typically eMMCs are considered a critical component (e.g. because
they store the root file system) and consequently are expected to be
reliable. Thus we can neglect the use case where eMMCs can't switch
reliably and we might want a lower timeout to facilitate speedy
recovery.
Although we could employ a quirk for the cards that are affected (if
we could identify them all), as described above, there is little
benefit to having a low timeout, so instead simply set a minimum
timeout.
The minimum is set to 300ms somewhat arbitrarily - the examples that
have been seen had a timeout of 10ms but were sometimes taking 60-70ms.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fix an array overrun when going over the callback table.
In the declaration of the callback table the max size isn't provided;
it is only provided in the registration phase.
There is a potential scenario where a new operation is added
and it is not supported by the current client. The acceptance of
such an operation by ib_netlink will cause an array overrun.
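A hedged sketch of the general idea only (field names are assumptions, not the actual patch): refuse operations the registered client never provided a callback for, instead of indexing past the end of its table.

    /* Reject operation indices beyond what this client registered. */
    if (op >= client->nops || !client->cb_table[op].dump)
            return -EINVAL;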
Fixes: 809d5fc9bf65 ("infiniband: pass rdma_cm module to netlink_dump_start") Fixes: b493d91d333e ("iwcm: common code for port mapper") Fixes: 2ca546b92a02 ("IB/sa: Route SA pathrecord query through netlink") Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.16:
- Only cma.c needs to be fixed
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In case ibnl_put_msg fails in send_nlmsg_done,
the function returns with -ENOMEM without freeing.
This patch fixes this behavior.
Fixes: 30dc5e63d6a5 ("RDMA/core: Add support for iWARP Port Mapper user space service") Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently c4iw_peer_abort_intr() does not wake up the waiter if the
endpoint state indicates we're using MPAv2 and we're currently trying to
connect. This was introduced with commit 7c0a33d61187a ("RDMA/cxgb4:
Don't wakeup threads for MPAv2")
However, this original fix is flawed because it introduces a race that
can cause a deadlock of the iwarp stack. Here is the race:
->local side sets up an active offload connection.
->local side sends MPA_START request.
->peer sends MPA_START response.
->local side ingress cpl thread begins processing the MPA_START response,
but before it changes the state from MPA_REQ_SENT to FPDU_MODE:
->peer sends a RST which results in a ABORT_REQ_RSS. This triggers
peer_abort_intr() which sees the state in MPA_REQ_SENT and since mpa_rev
is 2, it will avoid waking up the endpoint with -ECONNRESET, assuming the
stack will re-attempt the connection using MPAv1.
->Meanwhile, the cpl thread moves the state to FPDU_MODE and calls
c4iw_modify_rc_qp() which calls rdma_init() which sends a RI_WR/INIT WR
to firmware. But since HW sent an abort, FW correctly drops the RI_WR/INIT
WR.
->So the cpl thread is stuck waiting for a reply and cannot process the
ABORT_REQ_RSS cpl sitting in its input queue. Thus everything comes to a
halt because no more ingress cpls are processed by the stack...
The correct fix for the issue is to always do the wake up in
c4iw_abort_intr() but reinitialize the wait object in c4iw_reconnect().
Fixes: 7c0a33d61187a ("RDMA/cxgb4: Don't wakeup threads for MPAv2") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Thus it overflows and the resulting number is less than 4080, which makes
3823 / 4080 = 0
and nr_pages is set to this. As we already checked against the minimum that
nr_pages may be, this causes the logic to fail as well, and we crash the
kernel.
There's no reason to have the two DIV_ROUND_UP() (that's just result of
historical code changes), clean up the code and fix this bug.
Fixes: 83f40318dab00 ("ring-buffer: Make removal of ring buffer pages atomic") Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The size variable to change the ring buffer in ftrace is a long. The
nr_pages used to update the ring buffer based on the size is int. On 64 bit
machines this can cause an overflow problem.
For example, writing 10 and then 8556384240 to buffer_size_kb will cause
the ring buffer to crash with the following warning:
WARNING: CPU: 1 PID: 318 at kernel/trace/ring_buffer.c:1527 rb_update_pages+0x22f/0x260
Which is:
RB_WARN_ON(cpu_buffer, nr_removed);
Note each ring buffer page holds 4080 bytes.
This is because:
1) 10 causes the ring buffer to have 3 pages.
(10kb requires 3 * 4080 bytes to hold)
2) (2^31 / 2^10 + 1) * 4080 = 8556384240
The value written into buffer_size_kb is shifted by 10 and then passed
to ring_buffer_resize(). 8556384240 * 2^10 = 8761737461760
3) The size passed to ring_buffer_resize() is then divided by BUF_PAGE_SIZE
which is 4080. 8761737461760 / 4080 = 2147484672
4) The current nr_pages (3) is subtracted from this and we get: 2147484669. This value is saved in a signed integer nr_pages_to_update
5) 2147484669 is greater than 2^31 but smaller than 2^32, so when stored
in a signed int it turns into the value -2147482627
6) As the value is a negative number, in update_pages_handler() it is
negated and passed to rb_remove_pages() and 2147482627 pages will
be removed, which is much larger than 3 and it causes the warning
because not all the pages asked to be removed were removed.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=118001 Fixes: 7a8e76a3829f1 ("tracing: unified trace buffer") Reported-by: Hao Qin <QEver.cn@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When emulating a jalr instruction with rd == $0, the code in
isBranchInstr was incorrectly writing to GPR $0 which should actually
always remain zeroed. This would lead to any further instructions
emulated which use $0 operating on a bogus value until the task is next
context switched, at which point the value of $0 in the task context
would be restored to the correct zero by a store in SAVE_SOME. Fix this
by not writing to rd if it is $0.
Fixes: 102cedc32a6e ("MIPS: microMIPS: Floating point support.") Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: Maciej W. Rozycki <macro@imgtec.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13160/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The lazy cache flushing implemented in the MIPS kernel suffers from a
race condition that is exposed by do_set_pte() in mm/memory.c.
A pre-condition is a file-system that writes to the page from the CPU
in its readpage method and then calls flush_dcache_page(). One example
is ubifs. Another pre-condition is that the dcache flush is postponed
in __flush_dcache_page().
Upon a page fault for an executable mapping not existing in the
page-cache, the following will happen:
1. Write to the page
2. flush_dcache_page
3. flush_icache_page
4. set_pte_at
5. update_mmu_cache (commits the flush of a dcache-dirty page)
Between steps 4 and 5 another thread can hit the same page and it will
encounter a valid pte. Because the data still is in the L1 dcache the CPU
will fetch stale data from L2 into the icache and execute garbage.
This fix moves the commit of the cache flush to step 3 to close the
race window. It also reduces the amount of flushes on non-executable
mappings because we never enter __flush_dcache_page() for non-aliasing
CPUs.
Regressions can occur in drivers that mistakenly rely on the
flush_dcache_page() in get_user_pages() for DMA operations.
[ralf@linux-mips.org: Folded in patch 9346 to fix highmem issue.]
Commit 39baadbf36ce ("powerpc/eeh: Remove eeh information from pci_dn")
changed the pci_dn struct by removing its EEH-related members.
As part of this clean-up, DDW mechanism was modified to read the device
configuration address from eeh_dev struct.
As a consequence, now if we disable EEH mechanism on kernel command-line
for example, the DDW mechanism will fail, generating a kernel oops by
dereferencing a NULL pointer (which turns to be the eeh_dev pointer).
This patch just changes the configuration address calculation on DDW
functions to a manual calculation based on pci_dn members instead of
using eeh_dev-based address.
No functional changes were made. This was tested on pSeries, both
in PHyp and qemu guest.
Fixes: 39baadbf36ce ("powerpc/eeh: Remove eeh information from pci_dn") Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Systems show a minimal load average of 0.00, 0.01, 0.05 even when they
have no load at all.
Uptime and /proc/loadavg on all systems with kernels released during the
last five years up until kernel version 4.6-rc5, show a 5- and 15-minute
minimum loadavg of 0.01 and 0.05 respectively. This should be 0.00 on
idle systems, but the way the kernel calculates this value prevents it
from getting lower than the mentioned values.
Likewise but not as obviously noticeable, a fully loaded system with no
processes waiting, shows a maximum 1/5/15 loadavg of 1.00, 0.99, 0.95
(multiplied by number of cores).
Once the (old) load becomes 93 or higher, it mathematically can never
get lower than 93, even when the active (load) remains 0 forever.
This results in the strange 0.00, 0.01, 0.05 uptime values on idle
systems. Note: 93/2048 = 0.0454..., which rounds up to 0.05.
It is not correct to add a 0.5 rounding (=1024/2048) here, since the
result from this function is fed back into the next iteration again,
so the result of that +0.5 rounding value then gets multiplied by
(2048-2037), and then rounded again, so there is a virtual "ghost"
load created, next to the old and active load terms.
By changing the way the internally kept value is rounded, that internal
value equivalent now can reach 0.00 on idle, and 1.00 on full load. Upon
increasing load, the internally kept load value is rounded up, when the
load is decreasing, the load value is rounded down.
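A hedged sketch of that directional rounding in fixed-point form (FIXED_1 = 2048, as in the kernel's loadavg code; simplified, not necessarily the exact patch):

    /* Exponential decay in 11-bit fixed point. Round toward 'active': up
     * while the load is rising, down while it is falling, so an idle
     * system can decay all the way to 0 and a loaded one can reach 1.00. */
    static unsigned long calc_load(unsigned long load, unsigned long exp,
                                   unsigned long active)
    {
            unsigned long newload;

            newload = load * exp + active * (FIXED_1 - exp);
            if (active >= load)
                    newload += FIXED_1 - 1;     /* round up while increasing */

            return newload / FIXED_1;           /* round down otherwise */
    }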
The modified code was tested on nohz=off and nohz kernels. It was tested
on vanilla kernel 4.6-rc5 and on centos 7.1 kernel 3.10.0-327. It was
tested on single, dual, and octal cores system. It was tested on virtual
hosts and bare hardware. No unwanted effects have been observed, and the
problems that the patch intended to fix were indeed gone.
Tested-by: Damien Wyart <damien.wyart@free.fr> Signed-off-by: Vik Heyndrickx <vik.heyndrickx@veribox.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Doug Smythies <dsmythies@telus.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 0f004f5a696a ("sched: Cure more NO_HZ load average woes") Link: http://lkml.kernel.org/r/e8d32bff-d544-7748-72b5-3c86cc71f09f@veribox.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In commit a269913c52ad ("rtlwifi: Rework rtl_lps_leave() and
rtl_lps_enter() to use work queue"), the tests for enter/exit
power-save mode were inverted. With this change applied, the
wifi connection becomes much more stable.
Fixes: a269913c52ad ("rtlwifi: Rework rtl_lps_leave() and rtl_lps_enter() to use work queue") Signed-off-by: Wang YanQing <udknight@gmail.com> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
[bwh: Backported to 3.16:
- We only set a flag here to be used later, but it was also set the wrong way
- Adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
On some architectures (powerpc in particular), the number of registers
exceeds what can be represented in an integer bitmask. Ensure we
generate the proper bitmask on such platforms.
Fixes: 71ad0f5e4 ("perf tools: Support for DWARF CFI unwinding on post processing") Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
With Linux page size of 64K and hardware only supporting 4K HPTE, if we
use subpage protection, we always fail for the subpage 0 as shown
below (using the selftest subpage_prot test):
520175565: (4520111850): Failed at 0x3fffad4b0000 (p=13,sp=0,w=0), want=fault, got=pass !
4520890210: (4520826495): Failed at 0x3fffad5b0000 (p=29,sp=0,w=0), want=fault, got=pass !
4521574251: (4521510536): Failed at 0x3fffad6b0000 (p=45,sp=0,w=0), want=fault, got=pass !
4522258324: (4522194609): Failed at 0x3fffad7b0000 (p=61,sp=0,w=0), want=fault, got=pass !
This is because hash preload wrongly inserts the HPTE entry for subpage
0 without looking at the subpage protection information.
Fix it by teaching should_hash_preload() not to preload if we have
subpage protection configured for that range.
It appears this has been broken since it was introduced in 2008.
Fixes: fa28237cfcc5 ("[POWERPC] Provide a way to protect 4k subpages when using 64k pages") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Rework into should_hash_preload() to avoid build fails w/SLICES=n] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently we have a check in hash_preload() against the psize, which is
only included when CONFIG_PPC_MM_SLICES is enabled. We want to expand
this check in a subsequent patch, so factor it out to allow that. As a
bonus it removes the #ifdef in the C code.
Unfortunately we can't put this in the existing CONFIG_PPC_MM_SLICES
block because it would require a forward declaration.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
gcc-6 started warning by default about variables that are not
used anywhere and that are marked 'const', generating many
false positives in an allmodconfig build, e.g.:
arch/arm/mach-davinci/board-da830-evm.c:282:20: warning: 'da830_evm_emif25_pins' defined but not used [-Wunused-const-variable=]
arch/arm/plat-omap/dmtimer.c:958:34: warning: 'omap_timer_match' defined but not used [-Wunused-const-variable=]
drivers/bluetooth/hci_bcm.c:625:39: warning: 'acpi_bcm_default_gpios' defined but not used [-Wunused-const-variable=]
drivers/char/hw_random/omap-rng.c:92:18: warning: 'reg_map_omap4' defined but not used [-Wunused-const-variable=]
drivers/devfreq/exynos/exynos5_bus.c:381:32: warning: 'exynos5_busfreq_int_pm' defined but not used [-Wunused-const-variable=]
drivers/dma/mv_xor.c:1139:34: warning: 'mv_xor_dt_ids' defined but not used [-Wunused-const-variable=]
This is similar to the existing -Wunused-but-set-variable warning
that was added in an earlier release and that we disable by default
now and only enable when W=1 is set, so it makes sense to do
the same here. Once we have eliminated the majority of the
warnings for both, we can put them back into the default list.
We probably want this in backport kernels as well, to allow building
them with gcc-6 without introducing extra warnings.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Michal Marek <mmarek@suse.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The loop that browses the array compat_hwcap_str will stop when a NULL
is encountered; however, NULL is missing at the end of the array. This will
lead to an overrun until a NULL is found somewhere in the following memory.
In reality, this works out because the compat_hwcap2_str array tends to
follow immediately in memory, and that *is* terminated correctly.
Furthermore, the unsigned int compat_elf_hwcap is checked before
printing each capability, so we end up doing the right thing because
the size of the two arrays is less than 32. Still, this is an obvious
mistake and should be fixed.
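A hedged sketch of the pattern (array contents abbreviated, loop simplified from the /proc/cpuinfo printing code): the sentinel the loop relies on must actually be present.

    static const char *const compat_hwcap_str[] = {
            "swp", "half", "thumb", /* ... */ "idivt",
            NULL            /* terminator the print loop stops on */
    };

    /* c_show()-style loop relies on the terminating NULL */
    for (i = 0; compat_hwcap_str[i]; i++)
            if (compat_elf_hwcap & (1 << i))
                    seq_printf(m, "%s ", compat_hwcap_str[i]);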
Note for backporting: commit 12d11817eaafa414 ("arm64: Move
/proc/cpuinfo handling code") moved this code in v4.4. Prior to that
commit, the same change should be made in arch/arm64/kernel/setup.c.
Fixes: 44b82b7700d0 "arm64: Fix up /proc/cpuinfo" Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
[bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When an IPI is generated by a CPU, the pattern looks roughly like:
<write shared data>
smp_wmb();
<write to GIC to signal SGI>
On the receiving CPU we rely on the fact that, once we've taken the
interrupt, then the freshly written shared data must be visible to us.
Put another way, the CPU isn't going to speculate taking an interrupt.
Unfortunately, this assumption turns out to be broken.
Consider that CPUx wants to send an IPI to CPUy, which will cause CPUy
to read some shared_data. Before CPUx has done anything, a random
peripheral raises an IRQ to the GIC and the IRQ line on CPUy is raised.
CPUy then takes the IRQ and starts executing the entry code, heading
towards gic_handle_irq. Furthermore, let's assume that a bunch of the
previous interrupts handled by CPUy were SGIs, so the branch predictor
kicks in and speculates that irqnr will be <16 and we're likely to
head into handle_IPI. The prefetcher then grabs a speculative copy of
shared_data which contains a stale value.
Meanwhile, CPUx gets round to updating shared_data and asking the GIC
to send an SGI to CPUy. Internally, the GIC decides that the SGI is
more important than the peripheral interrupt (which hasn't yet been
ACKed) but doesn't need to do anything to CPUy, because the IRQ line
is already raised.
CPUy then reads the ACK register on the GIC, sees the SGI value which
confirms the branch prediction and we end up with a stale shared_data
value.
This patch fixes the problem by adding an smp_rmb() to the IPI entry
code in gic_handle_irq. As it turns out, the combination of a control
dependency and an ISB instruction from the EOI in the GICv3 driver is
enough to provide the ordering we need, so we add a comment there
justifying the absence of an explicit smp_rmb().
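A hedged sketch of where the barrier lands in the GICv2 handler (simplified from the shape of gic_handle_irq()):

    if (irqnr < 16) {
            writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
    #ifdef CONFIG_SMP
            /* Pairs with the sender's smp_wmb(): discard any speculatively
             * fetched copy of the shared data before handle_IPI() reads it. */
            smp_rmb();
            handle_IPI(irqnr, regs);
    #endif
            continue;
    }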
Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[bwh: Backported to 3.16: drop changes to irq-gic-v3] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The identity mapping is suboptimal for the last 2GB frame. The mapping
will be established with a mix of 4KB and 1MB mappings instead of a
single 2GB mapping.
This happens because of an off-by-one bug introduced with
commit 50be63450728 ("s390/mm: Convert bootmem to memblock").
This lock is already taken in ata_scsi_queuecmd() a few levels up the
call stack so attempting to take it here is an error. Moreover, it is
pointless in the first place since it only protects a single, atomic
assignment.
Enabling lock debugging gives the following output:
=============================================
[ INFO: possible recursive locking detected ]
4.4.0-rc5+ #189 Not tainted
---------------------------------------------
kworker/u2:3/37 is trying to acquire lock:
(&(&host->lock)->rlock){-.-...}, at: [<90283294>] sata_dwc_exec_command_by_tag.constprop.14+0x44/0x8c
but task is already holding lock:
(&(&host->lock)->rlock){-.-...}, at: [<902761ac>] ata_scsi_queuecmd+0x2c/0x330
other info that might help us debug this:
Possible unsafe locking scenario:
Enabling CONFIG_GCOV_PROFILE_ALL produces a lot of warnings like
lib/lz4/lz4hc_compress.c: In function 'lz4_compresshcctx':
lib/lz4/lz4hc_compress.c:514:1: warning: the frame size of 1504 bytes is larger than 1024 bytes [-Wframe-larger-than=]
After some investigation, I found that this behavior started with gcc-4.9,
and opened https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69702.
A suggested workaround for it is to use the -fno-tree-loop-im
flag that turns off one of the optimization stages in gcc, so the
code runs a little slower but does not use excessive amounts
of stack.
We could make this conditional on the gcc version, but I could not
find an easy way to do this in Kbuild and the benefit would be
fairly small, given that most of the gcc versions in production are
affected now.
I'm marking this for 'stable' backports because it addresses a bug
with code generation in gcc that exists in all kernel versions
with the affected gcc releases.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Michal Marek <mmarek@suse.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Writing CP0_Compare clears the timer interrupt pending bit
(CP0_Cause.TI), but this wasn't being done atomically. If a timer
interrupt raced with the write of the guest CP0_Compare, the timer
interrupt could end up being pending even though the new CP0_Compare is
nowhere near CP0_Count.
We were already updating the hrtimer expiry with
kvm_mips_update_hrtimer(), which used both kvm_mips_freeze_hrtimer() and
kvm_mips_resume_hrtimer(). Close the race window by expanding out
kvm_mips_update_hrtimer(), and clearing CP0_Cause.TI and setting
CP0_Compare between the freeze and resume. Since the pending timer
interrupt should not be cleared when CP0_Compare is written via the KVM
user API, an ack argument is added to distinguish the source of the
write.
Fixes: e30492bbe95a ("MIPS: KVM: Rewrite count/compare timer emulation") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.16: adjust filenames] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
There's a particularly narrow and subtle race condition when the
software emulated guest timer is frozen which can allow a guest timer
interrupt to be missed.
This happens due to the hrtimer expiry being inexact, so very
occasionally the freeze time will be after the moment when the emulated
CP0_Count transitions to the same value as CP0_Compare (so an IRQ should
be generated), but before the moment when the hrtimer is due to expire
(so no IRQ is generated). The IRQ won't be generated when the timer is
resumed either, since the resume CP0_Count will already match CP0_Compare.
With VZ guests in particular this is far more likely to happen, since
the soft timer may be frozen frequently in order to restore the timer
state to the hardware guest timer. This happens after 5-10 hours of
guest soak testing, resulting in an overflow in guest kernel timekeeping
calculations, hanging the guest. A more focussed test case to
intentionally hit the race (with the help of a new hypcall to cause the
timer state to be migrated between hardware & software) hits the condition
fairly reliably within around 30 seconds.
Instead of relying purely on the inexact hrtimer expiry to determine
whether an IRQ should be generated, read the guest CP0_Compare and
directly check whether the freeze time is before or after it. Only if
CP0_Count is on or after CP0_Compare do we check the hrtimer expiry to
determine whether the last IRQ has already been generated (which will
have pushed back the expiry by one timer period).
Fixes: e30492bbe95a ("MIPS: KVM: Rewrite count/compare timer emulation") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The interface read URB is submitted in attach, but was only unlinked by
the driver at disconnect.
In case of a late probe error (e.g. due to failed minor allocation),
disconnect is never called and we would end up with active URBs for an
unbound interface. This in turn could lead to deallocated memory being
dereferenced in the completion callback.
Fixes: f7a33e608d9a ("USB: serial: add quatech2 usb to serial driver") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The interface read and event URBs are submitted in attach, but were
never explicitly unlinked by the driver. Instead the URBs would have
been killed by usb-serial core on disconnect.
In case of a late probe error (e.g. due to failed minor allocation),
disconnect is never called and we could end up with active URBs for an
unbound interface. This in turn could lead to deallocated memory being
dereferenced in the completion callbacks.
The interface instat and indat URBs were submitted in attach, but never
unlinked in release before deallocating the corresponding transfer
buffers.
In the case of a late probe error (e.g. due to failed minor allocation),
disconnect would not have been called before release, causing the
buffers to be freed while the URBs are still in use. We'd also end up
with active URBs for an unbound interface.
Fixes: f9c99bb8b3a1 ("USB: usb-serial: replace shutdown with disconnect,
release") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
URBs and buffers allocated in attach for Epic devices would never be
deallocated in case of a later probe error (e.g. failure to allocate
minor numbers) as disconnect is then never called.
Fix by moving deallocation to release and making sure that the
URBs are first unlinked.
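A minimal sketch of that ordering (hypothetical helper, not the driver
code itself): kill each URB before freeing it and its transfer buffer, so
a completion handler can no longer run against freed memory.

#include <linux/slab.h>
#include <linux/usb.h>

/* Hypothetical cleanup helper illustrating the unlink-before-free order. */
static void example_release_urb(struct urb *urb, void *transfer_buf)
{
        usb_kill_urb(urb);      /* synchronously stop any in-flight I/O */
        usb_free_urb(urb);
        kfree(transfer_buf);
}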
Fixes: f9c99bb8b3a1 ("USB: usb-serial: replace shutdown with disconnect,
release") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Private data, URBs and buffers allocated for Epic devices during
attach were never released on errors (e.g. missing endpoints).
Fixes: 6e8cf7751f9f ("USB: add EPIC support to the io_edgeport driver") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Update the recent changes to set_pte() that were added in 46011e6ea392
to handle R10000_LLSC_WAR, and format the assembly to match other areas
of the MIPS tree using the same WAR.
This also incorporates a patch recently sent by Markos Chandras,
"Remove local LL/SC preprocessor variants", so that patch doesn't need
to be applied if this one is accepted.
Signed-off-by: Joshua Kinard <kumba@gentoo.org> Fixes: 46011e6ea392 ("MIPS: Make set_pte() SMP safe.") Cc: David Daney <david.daney@cavium.com> Cc: Linux/MIPS <linux-mips@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/11103/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
[bwh: Backported to 3.2:
- Use {LL,SC}_INSN not __{LL,SC}
- Use literal arch=r4000 instead of MIPS_ISA_ARCH_LEVEL since R6 is not
supported] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When showing backtraces in response to traps, for example crashes and
address errors (usually unaligned accesses) when they are set in debugfs
to be reported, unwind_stack will be used if the PC was in the kernel
text address range. However since EVA it is possible for user and kernel
address ranges to overlap, and even without EVA userland can still
trigger an address error by jumping to a KSeg0 address.
Adjust the check to also ensure that the trap occurred in kernel mode. I
don't believe any harm can come of this problem, since unwind_stack() is
sufficiently defensive; however, it is only meant for unwinding kernel
code, so to be correct the raw backtrace should be used instead.
Signed-off-by: James Hogan <james.hogan@imgtec.com> Reviewed-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11701/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
(cherry picked from commit d2941a975ac745c607dfb590e92bb30bc352dad9) Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When unwinding through IRQs and exceptions, the unwinding only continues
if the PC is a kernel text address, however since EVA it is possible for
user and kernel address ranges to overlap, potentially allowing
unwinding to continue to user mode if the user PC happens to be in the
kernel text address range.
Adjust the check to also ensure that the register state from before the
exception is actually running in kernel mode, i.e. !user_mode(regs).
I don't believe any harm can come of this problem, since the PC is only
output, the stack pointer is checked to ensure it resides within the
task's stack page before it is dereferenced in search of the return
address, and the return address register is similarly only output (if
the PC is in a leaf function or the beginning of a non-leaf function).
However unwind_stack() is only meant for unwinding kernel code, so to be
correct the unwind should stop there.
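For both of the backtrace fixes above, the tightened condition amounts to
something like the following (hypothetical helper; the real checks live
in the MIPS traps/unwind code):

#include <linux/kernel.h>       /* __kernel_text_address() */
#include <linux/ptrace.h>       /* struct pt_regs, user_mode() */

/*
 * Only trust the kernel unwinder when the saved context really was kernel
 * mode and the PC lies in kernel text; otherwise fall back to raw output.
 */
static bool may_use_kernel_unwinder(struct pt_regs *regs, unsigned long pc)
{
        return !user_mode(regs) && __kernel_text_address(pc);
}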
Signed-off-by: James Hogan <james.hogan@imgtec.com> Reviewed-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11700/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
BMIPS5000 has a PrID value of 0x5A00 and BMIPS5200 has a PrID value of
0x5B00 which, masked with 0x5A00, returns 0x5A00. Update all conditionals
on the PrID to cover both variants, since we are going to need this to
enable BMIPS5200 SMP. The existing check, masking with 0xFF00, would not
cover BMIPS5200 at all.
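The arithmetic can be checked in isolation (standalone illustration, not
kernel code):

#include <stdio.h>

int main(void)
{
        unsigned int bmips5000 = 0x5a00;        /* PRID_IMP_BMIPS5000 */
        unsigned int bmips5200 = 0x5b00;

        /* Old check: mask the full implementation field. */
        printf("(0x5b00 & 0xff00) == 0x5a00 ? %d\n",
               (bmips5200 & 0xff00) == bmips5000);      /* 0: BMIPS5200 missed */

        /* New check: mask with 0x5a00 so both 0x5a00 and 0x5b00 match. */
        printf("(0x5b00 & 0x5a00) == 0x5a00 ? %d\n",
               (bmips5200 & 0x5a00) == bmips5000);      /* 1: both covered */

        return 0;
}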
Commit 85efde6f4e0d ("make exported headers use strict posix types")
changed the asm-generic siginfo.h to use the __kernel_* types, and
commit 3a471cbc081b ("remove __KERNEL_STRICT_NAMES") make the internal
types accessible only to the kernel, but the MIPS implementation hasn't
been updated to match.
Switch to proper types now so that the exported asm/siginfo.h won't
produce quite so many compiler errors when included alone by a user
program.
Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Christopher Ferris <cferris@google.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/12477/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently, pmd_present() only checks for a non-zero value, returning
true even after pmd_mknotpresent() (which only clears the type bits).
This patch converts pmd_present() to using pte_present(), similar to the
other pmd_*() checks. As a side effect, it will return true for
PROT_NONE mappings, though they are not yet used by the kernel with
transparent huge pages.
For consistency, also change pmd_mknotpresent() to only clear the
PMD_SECT_VALID bit, even though the PMD_TABLE_BIT is already 0 for block
mappings (no functional change). The unused PMD_SECT_PROT_NONE
definition is removed as transparent huge pages use the pte page prot
values.
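The resulting definitions look roughly like this (paraphrased sketch of
the arm64 pgtable helpers, not a verbatim copy):

/* pmd_present() now follows the pte helper, so it honours the valid bit. */
#define pmd_present(pmd)        pte_present(pmd_pte(pmd))

/*
 * Clear only the valid bit; the type bits are preserved so the entry can
 * still be recognised as a block mapping later.
 */
static inline pmd_t pmd_mknotpresent(pmd_t pmd)
{
        return __pmd(pmd_val(pmd) & ~PMD_SECT_VALID);
}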
Fixes: 9c7e535fcc17 ("arm64: mm: Route pmd thp functions through pte equivalents") Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When the filesystem is corrupted in the right way, ext4_mark_iloc_dirty()
in ext4_orphan_add() can return an error, and we subsequently remove the
inode from the in-memory orphan list. However, this deletion is done with
list_del(&EXT4_I(inode)->i_orphan), which leaves the i_orphan list_head
with stale content. Later we can look at this content, causing list
corruption, an oops, or other issues. The reported trace looked like:
The problem with ornamental, do-nothing gotos is that they lead to
"forgot to set the error code" bugs. We should be returning -EINVAL
here but we don't. This leads to an uninitialized variable in
counter_show():
drivers/acpi/sysfs.c:603 counter_show()
error: uninitialized symbol 'status'.
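A stripped-down illustration of the pattern (hypothetical standalone code,
not the drivers/acpi/sysfs.c function):

#include <errno.h>

#define NUM_COUNTERS 4                          /* hypothetical bound */

static int read_counter(unsigned int index, unsigned long long *out)
{
        *out = index;                           /* stand-in for the real read */
        return 0;
}

static int example_counter_show(unsigned int index, unsigned long long *out)
{
        int status;                             /* never set on the early-exit path */

        if (index >= NUM_COUNTERS)
                goto end;                       /* should be: status = -EINVAL; goto end; */

        status = read_counter(index, out);
end:
        return status;                          /* may be uninitialised here */
}

Building this with -Wall reproduces the same "may be used uninitialized"
complaint that the static checker reported.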
Fixes: 1c8fce27e275 (ACPI: introduce drivers/acpi/sysfs.c) Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently, the userspace governor only updates frequency on GOV_LIMITS
if policy->cur falls outside policy->{min/max}. However, it is also
necessary to update current frequency on GOV_LIMITS to match the user
requested value if it can be achieved within the new policy->{max/min}.
This was previously the behaviour in the governor until commit d1922f0
("cpufreq: Simplify userspace governor") which incorrectly assumed that
policy->cur == user requested frequency via scaling_setspeed. This won't
be true if the user requested frequency falls outside policy->{min/max}.
For example, a temporary thermal cap may have throttled the user
requested frequency.
Fix this by storing the user requested frequency in a separate variable.
The governor will then try to achieve this request on every GOV_LIMITS
change.
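A minimal sketch of the idea, with hypothetical names (the real change is
in the userspace governor): remember what the user asked for and re-apply
it, clamped to the new limits, on every GOV_LIMITS event.

#include <linux/cpufreq.h>

static unsigned int user_requested_khz;         /* set from scaling_setspeed */

static void userspace_limits_changed(struct cpufreq_policy *policy)
{
        unsigned int target = user_requested_khz;

        if (target < policy->min)
                target = policy->min;
        if (target > policy->max)
                target = policy->max;

        __cpufreq_driver_target(policy, target, CPUFREQ_RELATION_L);
}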
Fixes: d1922f02562f (cpufreq: Simplify userspace governor) Signed-off-by: Sai Gurrappadi <sgurrappadi@nvidia.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: lei liu <liu.lei78@zte.com.cn>
[johan: rebase and replace commit message ] Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: lei liu <liu.lei78@zte.com.cn>
[properly sort them - gregkh] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Johan Hovold <johan@kernel.org>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When csw->con_startup() fails in do_register_con_driver, we return no
error (i.e. 0). This was changed back in 2006 by commit 3e795de763.
Before that we used to return -ENODEV.
So fix the return value to be -ENODEV in that case again.
Fixes: 3e795de763 ("VT binding: Add binding/unbinding support for the VT console") Signed-off-by: Jiri Slaby <jslaby@suse.cz> Reported-by: "Dan Carpenter" <dan.carpenter@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The bar number is found in reg2 within the gdd. Therefore
we need to change the assignment from reg1 to reg2, which
is the correct location.
Signed-off-by: Andreas Werner <andreas.werner@men.de> Fixes: '3764e82e5' drivers: Introduce MEN Chameleon Bus Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Added support for Gemalto's Cinterion PH8 and AHxx products
with 2 RmNet Interfaces and products with 1 RmNet + 1 USB Audio interface.
In addition some minor renaming and formatting.
Signed-off-by: Hans-Christoph Schemmel <hans-christoph.schemmel@gemalto.com>
[johan: sort current entries and trim trailing whitespace ] Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
dev_dbg_ratelimited() is a macro that ignores its first argument when DEBUG is
not set, which can lead to unused variable warnings:
ethernet/mellanox/mlxsw/pci.c: In function 'mlxsw_pci_cqe_sdq_handle':
ethernet/mellanox/mlxsw/pci.c:646:18: warning: unused variable 'pdev' [-Wunused-variable]
ethernet/mellanox/mlxsw/pci.c: In function 'mlxsw_pci_cqe_rdq_handle':
ethernet/mellanox/mlxsw/pci.c:671:18: warning: unused variable 'pdev' [-Wunused-variable]
The macro already ensures that all its other arguments are silently
ignored by the compiler without triggering a warning, through the
use of the no_printk() macro, but the dev argument is not passed into
that.
This changes the definition to use the same trick as no_printk(): the
call is wrapped in if (0), so the compiler never evaluates the arguments'
side effects, yet it still sees 'dev' and the other arguments as used.
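The !DEBUG definition ends up with roughly this shape (paraphrased):

#define dev_dbg_ratelimited(dev, fmt, ...)                              \
do {                                                                    \
        if (0)                                                          \
                dev_printk(KERN_DEBUG, dev, fmt, ##__VA_ARGS__);        \
} while (0)

The dead branch is optimised away, but 'dev' and the format arguments are
still referenced, so -Wunused-variable stays quiet.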
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Andrew Lunn <andrew@lunn.ch> Fixes: 6f586e663e3b ("driver-core: Shut up dev_dbg_reatelimited() without DEBUG") Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
According to full-history-linux commit d3794f4fa7c3edc3 ("[PATCH] M68k
update (part 25)"), port operations are allowed on m68k if CONFIG_ISA is
defined.
However, commit 153dcc54df826d2f ("[PATCH] mem driver: fix conditional
on isa i/o support") accidentally changed an "||" into an "&&",
disabling it completely on m68k. This logic was retained when
introducing the DEVPORT symbol in commit 4f911d64e04a44c4 ("Make
/dev/port conditional on config symbol").
Drop the bogus dependency on !M68K to fix this.
Fixes: 153dcc54df826d2f ("[PATCH] mem driver: fix conditional on isa i/o support") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Al Stone <ahs3@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
OpenSSH expects the (non-blocking) read() of pty master to return
EAGAIN only if it has received all of the slave-side output after
it has received SIGCHLD. This used to work on pre-3.12 kernels.
This fix effectively forces non-blocking read() and poll() to
block for parallel i/o to complete for all ttys. It also unwinds
these changes:
Inspired by analysis and patch from Marc Aurele La France <tsi@tuyoix.net>
Reported-by: Volth <openssh@volth.com> Reported-by: Marc Aurele La France <tsi@tuyoix.net> BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=52 BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=2492 Signed-off-by: Brian Bloniarz <brian.bloniarz@gmail.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16:
- No need to unwind commits 2 and 3
- Keep using tty_flush_to_ldisc() rather than adding tty_buffer_flush_work()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
This makes the ath79 bootconsole behave the same way as the generic 8250
bootconsole.
Also waiting for TEMT (transmit buffer is empty) instead of just THRE
(transmit buffer is not full) ensures that all characters have been
transmitted before the real serial driver starts reconfiguring the serial
controller (which would sometimes result in garbage being transmitted).
This change does not cause a visible performance loss.
In addition, this seems to fix a hang observed in certain configurations on
many AR7xxx/AR9xxx SoCs during autoconfig of the real serial driver.
A more complete follow-up patch will disable 8250 autoconfig for ath79
altogether (the serial controller is detected as a 16550A, which is not
fully compatible with the ath79 serial, and the autoconfig may lead to
undefined behavior on ath79).
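The wait described above boils down to polling for both LSR bits
(simplified sketch of the early-printk helper; register access details
differ on real hardware):

#include <linux/io.h>
#include <linux/serial_reg.h>   /* UART_LSR_TEMT, UART_LSR_THRE */

#define BOTH_EMPTY      (UART_LSR_TEMT | UART_LSR_THRE)

/* Spin until both the FIFO (THRE) and the shift register (TEMT) are empty. */
static void wait_tx_idle(void __iomem *lsr_reg)
{
        while ((__raw_readl(lsr_reg) & BOTH_EMPTY) != BOTH_EMPTY)
                ;
}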
Instead of just printing warning messages, declare the file system
corrupted when the orphan list is found to be corrupted. If there are any
reserved inodes in the orphaned inode list, declare the file system
corrupted and stop right away to avoid doing more potential damage to the
file system.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.16: leave error code as EIO] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
If the orphaned inode list contains inode #5, ext4_iget() returns a
bad inode (since the bootloader inode should never be referenced
directly). Because of the bad inode, we end up processing the inode
repeatedly and this hangs the machine.
(But don't try to reproduce this on an unpatched kernel if you care
about the system staying functional. :-)
This bug was found by the port of American Fuzzy Lop into the kernel
to find file system problems[1]. (Since it *only* happens if inode #5
shows up on the orphan list --- 3, 7, 8, etc. won't do it --- it's not
surprising that AFL needed two hours before it found it.)
Typically under error conditions, it is possible for aac_command_thread()
to miss the wakeup from kthread_stop() and go back to sleep, causing it
to hang aac_shutdown.
In the observed scenario, the adapter is not functioning correctly and so
aac_fib_send() never completes (or times out, depending on how it was
called). Shortly after aac_command_thread() starts, it performs
aac_fib_send(SendHostTime), which hangs. When the send issued by
aac_probe_one()/aac_get_adapter_info() times out, kthread_stop() is
called, which breaks the command thread out of its hang.
However, the code still goes back to sleep in schedule_timeout() without
checking kthread_should_stop(), so aac_probe_one() hangs until the
schedule_timeout() expires, which takes 30 minutes.
Fixed by: Adding another kthread_should_stop() before schedule_timeout() Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
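The shape of the fix is the usual kthread idiom (generic sketch, not the
aacraid code verbatim): re-check kthread_should_stop() right before
sleeping so a stop request that arrived while the thread was busy is not
slept through.

#include <linux/kthread.h>
#include <linux/sched.h>

static int example_command_thread(void *data)
{
        while (!kthread_should_stop()) {
                /* ... issue/handle work; this can block for a long time ... */

                if (kthread_should_stop())
                        break;          /* don't sleep through a pending stop */

                schedule_timeout_interruptible(HZ);
        }
        return 0;
}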
aac_fib_send() has a special case for initial commands issued during
driver initialization with wait < 0 (pseudo sync mode). In this case the
command does not sleep but rather spins while checking for a timeout. The
loop calls cpu_relax() in an attempt to allow other processes/threads to
use the CPU, but cpu_relax() does not relinquish the CPU, so the command
hogs the processor. This was observed in a KDUMP "crashkernel", where it
prevented the command thread (which is responsible for completing the
command so that it is not timed out) from starting, because that thread
could not get the CPU.
Fixed by replacing "cpu_relax()" call with "schedule()" Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
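A hypothetical version of such a pseudo-sync wait loop, showing the
substitution (names are illustrative only):

#include <linux/errno.h>
#include <linux/jiffies.h>
#include <linux/sched.h>

/* Spin until done() reports completion or the deadline passes. */
static int wait_pseudo_sync(bool (*done)(void *), void *ctx,
                            unsigned long deadline)
{
        while (!done(ctx)) {
                if (time_after(jiffies, deadline))
                        return -ETIMEDOUT;
                schedule();     /* was cpu_relax(), which never yields the CPU */
        }
        return 0;
}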
The ARM architecture mandates that when changing a page table entry
from a valid entry to another valid entry, an invalid entry is first
written, TLB invalidated, and only then the new entry being written.
The current code doesn't respect this, directly writing the new
entry and only then invalidating TLBs. Let's fix it up.
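Paraphrased sketch of a stage-2 update with break-before-make enforced
(the helper name here is made up; kvm_tlb_flush_vmid_ipa() and
kvm_set_pmd() are the existing KVM/ARM MMU helpers):

#include <linux/kvm_host.h>
#include <asm/kvm_mmu.h>
#include <asm/pgtable.h>

static void stage2_set_pmd_bbm(struct kvm *kvm, pmd_t *pmd, pmd_t new_pmd,
                               phys_addr_t addr)
{
        pmd_t old_pmd = *pmd;

        if (pmd_present(old_pmd)) {
                pmd_clear(pmd);                         /* break: write invalid entry */
                kvm_tlb_flush_vmid_ipa(kvm, addr);      /* flush the stale mapping    */
        }
        kvm_set_pmd(pmd, new_pmd);                      /* make: write the new entry  */
}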
Reported-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
32-bit ioctls use the FS_IOC32_* numbers rather than the regular FS_IOC_*
versions. They can be handled in btrfs using the same code. Without this,
32-bit {ch,ls}attr fail.
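The mapping is essentially a translation table in the compat ioctl
handler (paraphrased sketch; example_native_ioctl() stands in for the
filesystem's native handler):

#include <linux/compat.h>
#include <linux/fs.h>

static long example_compat_ioctl(struct file *file, unsigned int cmd,
                                 unsigned long arg)
{
        switch (cmd) {
        case FS_IOC32_GETFLAGS:
                cmd = FS_IOC_GETFLAGS;
                break;
        case FS_IOC32_SETFLAGS:
                cmd = FS_IOC_SETFLAGS;
                break;
        case FS_IOC32_GETVERSION:
                cmd = FS_IOC_GETVERSION;
                break;
        default:
                return -ENOIOCTLCMD;
        }
        /* example_native_ioctl() is a hypothetical stand-in. */
        return example_native_ioctl(file, cmd, (unsigned long)compat_ptr(arg));
}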
Signed-off-by: Luke Dashjr <luke-jr+git@utopios.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The sg_dma_len() macro can be used only on scatterlists which are mapped,
so all calls to it before dma_map_sg() are invalid. Replace them with
direct reads of the sg segment length.
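The rule of thumb, as a hypothetical check run before dma_map_sg():

#include <linux/kernel.h>
#include <linux/scatterlist.h>
#include <crypto/aes.h>

/* Before dma_map_sg() only sg->length is valid; sg_dma_len() is not. */
static bool sg_block_aligned_premap(struct scatterlist *sg)
{
        return IS_ALIGNED(sg->length, AES_BLOCK_SIZE);
}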
Fixes: a49e490c7a8a ("crypto: s5p-sss - add S5PV210 advanced crypto engine support") Fixes: 9e4a1100a445 ("crypto: s5p-sss - Handle unaligned buffers") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Vladimir Zapolskiy <vz@mleia.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
[bwh: Backported to 3.16: unaligned DMA is unsupported so there is a different
set of calls to replace] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The alpha pci_mmap_resource() is used for both IORESOURCE_MEM and
IORESOURCE_IO resources, but iomem_is_exclusive() is only applicable for
IORESOURCE_MEM.
Call iomem_is_exclusive() only for IORESOURCE_MEM resources, and do it
earlier to match the generic version of pci_mmap_resource().
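The adjusted guard amounts to the following (hypothetical helper; the
real check sits in the alpha pci_mmap_resource() path):

#include <linux/errno.h>
#include <linux/ioport.h>

static int check_mmap_allowed(const struct resource *res)
{
        /* Exclusivity only makes sense for memory resources, not I/O ports. */
        if ((res->flags & IORESOURCE_MEM) && iomem_is_exclusive(res->start))
                return -EINVAL;
        return 0;
}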
Fixes: 10a0ef39fbd1 ("PCI/alpha: pci sysfs resources") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>