Wrong return code was being returned on SMB3 rmdir of
non-empty directory.
For SMB3 (unlike for cifs), we attempt to delete a directory by
set of delete on close flag on the open. Windows clients set
this flag via a set info (SET_FILE_DISPOSITION to set this flag)
which properly checks if the directory is empty.
With this patch on smb3 mounts we correctly return
"DIRECTORY NOT EMPTY"
on attempts to remove a non-empty directory.
Signed-off-by: Steve French <steve.french@primarydata.com> Acked-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.16:adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.16: adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.16: adjust context, indentation] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
See [MS-NLMP] 3.2.5.1.2 Server Receives an AUTHENTICATE_MESSAGE from the Client:
...
Set NullSession to FALSE
If (AUTHENTICATE_MESSAGE.UserNameLen == 0 AND
AUTHENTICATE_MESSAGE.NtChallengeResponse.Length == 0 AND
(AUTHENTICATE_MESSAGE.LmChallengeResponse == Z(1)
OR
AUTHENTICATE_MESSAGE.LmChallengeResponse.Length == 0))
-- Special case: client requested anonymous authentication
Set NullSession to TRUE
...
Only server which map unknown users to guest will allow
access using a non-null NTChallengeResponse.
For Samba it's the "map to guest = bad user" option.
During boot, MST hotplugs are generally expected (even if no physical
hotplugging occurs) and result in DRM's connector topology changing.
This means that using num_connector from the current mode configuration
can lead to the number of connectors changing under us. This can lead to
some nasty scenarios in fbcon:
- We allocate an array to the size of dev->mode_config.num_connectors.
- MST hotplug occurs, dev->mode_config.num_connectors gets incremented.
- We try to loop through each element in the array using the new value
of dev->mode_config.num_connectors, and end up going out of bounds
since dev->mode_config.num_connectors is now larger then the array we
allocated.
fb_helper->connector_count however, will always remain consistent while
we do a modeset in fb_helper.
Note: This is just polish for 4.7, Dave Airlie's drm_connector
refcounting fixed these bugs for real. But it's good enough duct-tape
for stable kernel backporting, since backporting the refcounting
changes is way too invasive.
During boot time, MST devices usually send a ton of hotplug events
irregardless of whether or not any physical hotplugs actually occurred.
Hotplugs mean connectors being created/destroyed, and the number of DRM
connectors changing under us. This isn't a problem if we use
fb_helper->connector_count since we only set it once in the code,
however if we use num_connector from struct drm_mode_config we risk it's
value changing under us. On top of that, there's even a chance that
dev->mode_config.num_connector != fb_helper->connector_count. If the
number of connectors happens to increase under us, we'll end up using
the wrong array size for memcpy and start writing beyond the actual
length of the array, occasionally resulting in kernel panics.
Note: This is just polish for 4.7, Dave Airlie's drm_connector
refcounting fixed these bugs for real. But it's good enough duct-tape
for stable kernel backporting, since backporting the refcounting
changes is way too invasive.
Irrespective of that fact that that is horrible code that should be
fixed for many reasons, it does highlight a deficiency in the generic
preempt_count manipulators. As it is never right to combine/elide
preempt_count manipulations like this.
Therefore sprinkle some volatile in the two generic accessors to
ensure the compiler is aware of the fact that the preempt_count is
observed outside of the regular program-order view and thus cannot be
optimized away like this.
x86; the only arch not using the generic code is not affected as we
do all this in asm in order to use the segment base per-cpu stuff.
Reported-by: Vikram Mulukutla <markivx@codeaurora.org> Tested-by: Vikram Mulukutla <markivx@codeaurora.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: a787870924db ("sched, arch: Create asm/preempt.h") Link: http://lkml.kernel.org/r/20160516131751.GH3205@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.16: use ACCESS_ONCE() instead of READ_ONCE()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When we free cb->skb after a dump, we do it after releasing the
lock. This means that a new dump could have started in the time
being and we'll end up freeing their skb instead of ours.
This patch saves the skb and module before we unlock so we free
the right memory.
Fixes: 16b304f3404f ("netlink: Eliminate kmalloc in netlink dump operation.") Reported-by: Baozeng Ding <sploving1@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
b84106b4e229 ("PCI: Disable IO/MEM decoding for devices with non-compliant
BARs") disabled BAR sizing for BARs 0-5 of devices that don't comply with
the PCI spec. But it didn't do anything for expansion ROM BARs, so we
still try to size them, resulting in warnings like this on Broadwell-EP:
pci 0000:ff:12.0: BAR 6: failed to assign [mem size 0x00000001 pref]
Move the non-compliant BAR check from __pci_read_base() up to
pci_read_bases() so it applies to the expansion ROM BAR as well as
to BARs 0-5.
Note that direct callers of __pci_read_base(), like sriov_init(), will now
bypass this check. We haven't had reports of devices with broken SR-IOV
BARs yet.
[bhelgaas: changelog] Fixes: b84106b4e229 ("PCI: Disable IO/MEM decoding for devices with non-compliant BARs") Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Andi Kleen <ak@linux.intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Commit b894157145e4 ("x86/PCI: Mark Broadwell-EP Home Agent & PCU as having
non-compliant BARs") marked Home Agent 0 & PCU has having non-compliant
BARs. Home Agent 1 also has non-compliant BARs.
Mark Home Agent 1 as having non-compliant BARs so the PCI core doesn't
touch them.
The problem with these devices is documented in the Xeon v4 specification
update:
BDF2 PCI BARs in the Home Agent Will Return Non-Zero Values
During Enumeration
Problem: During system initialization the Operating System may access
the standard PCI BARs (Base Address Registers). Due to
this erratum, accesses to the Home Agent BAR registers (Bus
1; Device 18; Function 0,4; Offsets (0x14-0x24) will return
non-zero values.
Implication: The operating system may issue a warning. Intel has not
observed any functional failures due to this erratum.
Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e5-v4-spec-update.html Fixes: b894157145e4 ("x86/PCI: Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs") Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Andi Kleen <ak@linux.intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Prevent using uninitialized or negative index when handling
steering entries.
Fixes: b12d93d63c32 ('mlx4: Add support for promiscuous mode in the new steering model.') Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When this_order variable in blk_mq_init_rq_map() becomes zero
the code incorrectly decrements the variable and passes the result
to order_to_size() helper causing undefined behaviour:
UBSAN: Undefined behaviour in block/blk-mq.c:1459:27
shift exponent 4294967295 is too large for 32-bit type 'unsigned int'
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc6-00072-g33656a1 #22
Fix the code by checking this_order variable for not having the zero
value first.
Reported-by: Meelis Roos <mroos@linux.ee> Fixes: 320ae51feed5 ("blk-mq: new multi-queue block IO queueing mechanism") Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Some eMMCs set the partition switch timeout too low.
Now typically eMMCs are considered a critical component (e.g. because
they store the root file system) and consequently are expected to be
reliable. Thus we can neglect the use case where eMMCs can't switch
reliably and we might want a lower timeout to facilitate speedy
recovery.
Although we could employ a quirk for the cards that are affected (if
we could identify them all), as described above, there is little
benefit to having a low timeout, so instead simply set a minimum
timeout.
The minimum is set to 300ms somewhat arbitrarily - the examples that
have been seen had a timeout of 10ms but were sometimes taking 60-70ms.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fix array overrun when going over callback table.
In declaration of callback table, the max size isn't provided and
in registration phase, it is provided.
There is potential scenario where a new operation is added
and it is not supported by current client. The acceptance of
such operation by ib_netlink will cause to array overrun.
Fixes: 809d5fc9bf65 ("infiniband: pass rdma_cm module to netlink_dump_start") Fixes: b493d91d333e ("iwcm: common code for port mapper") Fixes: 2ca546b92a02 ("IB/sa: Route SA pathrecord query through netlink") Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.16:
- Only cma.c needs to be fixed
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In case ibnl_put_msg fails in send_nlmsg_done,
the function returns with -ENOMEM without freeing.
This patch fixes this behavior.
Fixes: 30dc5e63d6a5 ("RDMA/core: Add support for iWARP Port Mapper user space service") Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently c4iw_peer_abort_intr() does not wake up the waiter if the
endpoint state indicates we're using MPAv2 and we're currently trying to
connect. This was introduced with commit 7c0a33d61187a ("RDMA/cxgb4:
Don't wakeup threads for MPAv2")
However, this original fix is flawed because it introduces a race that
can cause a deadlock of the iwarp stack. Here is the race:
->local side sets up an active offload connection.
->local side sends MPA_START request.
->peer sends MPA_START response.
->local side ingress cpl thread begins processing the MPA_START response,
but before it changes the state from MPA_REQ_SENT to FPDU_MODE:
->peer sends a RST which results in a ABORT_REQ_RSS. This triggers
peer_abort_intr() which sees the state in MPA_REQ_SENT and since mpa_rev
is 2, it will avoid waking up the endpoint with -ECONNRESET, assuming the
stack will re-attempt the connection using MPAv1.
->Meanwhile, the cpl thread moves the state to FPDU_MODE and calls
c4iw_modify_rc_qp() which calls rdma_init() which sends a RI_WR/INIT WR
to firmware. But since HW sent an abort, FW correctly drops the RI_WR/INIT
WR.
->So the cpl thread is stuck waiting for a reply and cannot process the
ABORT_REQ_RSS cpl sitting in its input queue. Thus everything comes to a
halt because no more ingress cpls are processed by the stack...
The correct fix for the issue is to always do the wake up in
c4iw_abort_intr() but reinitialize the wait object in c4iw_reconnect().
Fixes: 7c0a33d61187a ("RDMA/cxgb4: Don't wakeup threads for MPAv2") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Thus it overflows and the resulting number is less than 4080, which makes
3823 / 4080 = 0
an nr_pages is set to this. As we already checked against the minimum that
nr_pages may be, this causes the logic to fail as well, and we crash the
kernel.
There's no reason to have the two DIV_ROUND_UP() (that's just result of
historical code changes), clean up the code and fix this bug.
Fixes: 83f40318dab00 ("ring-buffer: Make removal of ring buffer pages atomic") Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The size variable to change the ring buffer in ftrace is a long. The
nr_pages used to update the ring buffer based on the size is int. On 64 bit
machines this can cause an overflow problem.
For example, the following will cause the ring buffer to crash:
WARNING: CPU: 1 PID: 318 at kernel/trace/ring_buffer.c:1527 rb_update_pages+0x22f/0x260
Which is:
RB_WARN_ON(cpu_buffer, nr_removed);
Note each ring buffer page holds 4080 bytes.
This is because:
1) 10 causes the ring buffer to have 3 pages.
(10kb requires 3 * 4080 pages to hold)
2) (2^31 / 2^10 + 1) * 4080 = 8556384240
The value written into buffer_size_kb is shifted by 10 and then passed
to ring_buffer_resize(). 8556384240 * 2^10 = 8761737461760
3) The size passed to ring_buffer_resize() is then divided by BUF_PAGE_SIZE
which is 4080. 8761737461760 / 4080 = 2147484672
4) nr_pages is subtracted from the current nr_pages (3) and we get: 2147484669. This value is saved in a signed integer nr_pages_to_update
5) 2147484669 is greater than 2^31 but smaller than 2^32, a signed int
turns into the value of -2147482627
6) As the value is a negative number, in update_pages_handler() it is
negated and passed to rb_remove_pages() and 2147482627 pages will
be removed, which is much larger than 3 and it causes the warning
because not all the pages asked to be removed were removed.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=118001 Fixes: 7a8e76a3829f1 ("tracing: unified trace buffer") Reported-by: Hao Qin <QEver.cn@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When emulating a jalr instruction with rd == $0, the code in
isBranchInstr was incorrectly writing to GPR $0 which should actually
always remain zeroed. This would lead to any further instructions
emulated which use $0 operating on a bogus value until the task is next
context switched, at which point the value of $0 in the task context
would be restored to the correct zero by a store in SAVE_SOME. Fix this
by not writing to rd if it is $0.
Fixes: 102cedc32a6e ("MIPS: microMIPS: Floating point support.") Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: Maciej W. Rozycki <macro@imgtec.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13160/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The lazy cache flushing implemented in the MIPS kernel suffers from a
race condition that is exposed by do_set_pte() in mm/memory.c.
A pre-condition is a file-system that writes to the page from the CPU
in its readpage method and then calls flush_dcache_page(). One example
is ubifs. Another pre-condition is that the dcache flush is postponed
in __flush_dcache_page().
Upon a page fault for an executable mapping not existing in the
page-cache, the following will happen:
1. Write to the page
2. flush_dcache_page
3. flush_icache_page
4. set_pte_at
5. update_mmu_cache (commits the flush of a dcache-dirty page)
Between steps 4 and 5 another thread can hit the same page and it will
encounter a valid pte. Because the data still is in the L1 dcache the CPU
will fetch stale data from L2 into the icache and execute garbage.
This fix moves the commit of the cache flush to step 3 to close the
race window. It also reduces the amount of flushes on non-executable
mappings because we never enter __flush_dcache_page() for non-aliasing
CPUs.
Regressions can occur in drivers that mistakenly relies on the
flush_dcache_page() in get_user_pages() for DMA operations.
[ralf@linux-mips.org: Folded in patch 9346 to fix highmem issue.]
Commit 39baadbf36ce ("powerpc/eeh: Remove eeh information from pci_dn")
changed the pci_dn struct by removing its EEH-related members.
As part of this clean-up, DDW mechanism was modified to read the device
configuration address from eeh_dev struct.
As a consequence, now if we disable EEH mechanism on kernel command-line
for example, the DDW mechanism will fail, generating a kernel oops by
dereferencing a NULL pointer (which turns to be the eeh_dev pointer).
This patch just changes the configuration address calculation on DDW
functions to a manual calculation based on pci_dn members instead of
using eeh_dev-based address.
No functional changes were made. This was tested on pSeries, both
in PHyp and qemu guest.
Fixes: 39baadbf36ce ("powerpc/eeh: Remove eeh information from pci_dn") Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Systems show a minimal load average of 0.00, 0.01, 0.05 even when they
have no load at all.
Uptime and /proc/loadavg on all systems with kernels released during the
last five years up until kernel version 4.6-rc5, show a 5- and 15-minute
minimum loadavg of 0.01 and 0.05 respectively. This should be 0.00 on
idle systems, but the way the kernel calculates this value prevents it
from getting lower than the mentioned values.
Likewise but not as obviously noticeable, a fully loaded system with no
processes waiting, shows a maximum 1/5/15 loadavg of 1.00, 0.99, 0.95
(multiplied by number of cores).
Once the (old) load becomes 93 or higher, it mathematically can never
get lower than 93, even when the active (load) remains 0 forever.
This results in the strange 0.00, 0.01, 0.05 uptime values on idle
systems. Note: 93/2048 = 0.0454..., which rounds up to 0.05.
It is not correct to add a 0.5 rounding (=1024/2048) here, since the
result from this function is fed back into the next iteration again,
so the result of that +0.5 rounding value then gets multiplied by
(2048-2037), and then rounded again, so there is a virtual "ghost"
load created, next to the old and active load terms.
By changing the way the internally kept value is rounded, that internal
value equivalent now can reach 0.00 on idle, and 1.00 on full load. Upon
increasing load, the internally kept load value is rounded up, when the
load is decreasing, the load value is rounded down.
The modified code was tested on nohz=off and nohz kernels. It was tested
on vanilla kernel 4.6-rc5 and on centos 7.1 kernel 3.10.0-327. It was
tested on single, dual, and octal cores system. It was tested on virtual
hosts and bare hardware. No unwanted effects have been observed, and the
problems that the patch intended to fix were indeed gone.
Tested-by: Damien Wyart <damien.wyart@free.fr> Signed-off-by: Vik Heyndrickx <vik.heyndrickx@veribox.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Doug Smythies <dsmythies@telus.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 0f004f5a696a ("sched: Cure more NO_HZ load average woes") Link: http://lkml.kernel.org/r/e8d32bff-d544-7748-72b5-3c86cc71f09f@veribox.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In commit a269913c52ad ("rtlwifi: Rework rtl_lps_leave() and
rtl_lps_enter() to use work queue"), the tests for enter/exit
power-save mode were inverted. With this change applied, the
wifi connection becomes much more stable.
Fixes: a269913c52ad ("rtlwifi: Rework rtl_lps_leave() and rtl_lps_enter() to use work queue") Signed-off-by: Wang YanQing <udknight@gmail.com> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
[bwh: Backported to 3.16:
- We only set a flag here to be used later, but it was also set the wrong way
- Adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
On some architectures (powerpc in particular), the number of registers
exceeds what can be represented in an integer bitmask. Ensure we
generate the proper bitmask on such platforms.
Fixes: 71ad0f5e4 ("perf tools: Support for DWARF CFI unwinding on post processing") Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
With Linux page size of 64K and hardware only supporting 4K HPTE, if we
use subpage protection, we always fail for the subpage 0 as shown
below (using the selftest subpage_prot test):
520175565: (4520111850): Failed at 0x3fffad4b0000 (p=13,sp=0,w=0), want=fault, got=pass ! 4520890210: (4520826495): Failed at 0x3fffad5b0000 (p=29,sp=0,w=0), want=fault, got=pass ! 4521574251: (4521510536): Failed at 0x3fffad6b0000 (p=45,sp=0,w=0), want=fault, got=pass ! 4522258324: (4522194609): Failed at 0x3fffad7b0000 (p=61,sp=0,w=0), want=fault, got=pass !
This is because hash preload wrongly inserts the HPTE entry for subpage
0 without looking at the subpage protection information.
Fix it by teaching should_hash_preload() not to preload if we have
subpage protection configured for that range.
It appears this has been broken since it was introduced in 2008.
Fixes: fa28237cfcc5 ("[POWERPC] Provide a way to protect 4k subpages when using 64k pages") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Rework into should_hash_preload() to avoid build fails w/SLICES=n] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently we have a check in hash_preload() against the psize, which is
only included when CONFIG_PPC_MM_SLICES is enabled. We want to expand
this check in a subsequent patch, so factor it out to allow that. As a
bonus it removes the #ifdef in the C code.
Unfortunately we can't put this in the existing CONFIG_PPC_MM_SLICES
block because it would require a forward declaration.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
gcc-6 started warning by default about variables that are not
used anywhere and that are marked 'const', generating many
false positives in an allmodconfig build, e.g.:
arch/arm/mach-davinci/board-da830-evm.c:282:20: warning: 'da830_evm_emif25_pins' defined but not used [-Wunused-const-variable=]
arch/arm/plat-omap/dmtimer.c:958:34: warning: 'omap_timer_match' defined but not used [-Wunused-const-variable=]
drivers/bluetooth/hci_bcm.c:625:39: warning: 'acpi_bcm_default_gpios' defined but not used [-Wunused-const-variable=]
drivers/char/hw_random/omap-rng.c:92:18: warning: 'reg_map_omap4' defined but not used [-Wunused-const-variable=]
drivers/devfreq/exynos/exynos5_bus.c:381:32: warning: 'exynos5_busfreq_int_pm' defined but not used [-Wunused-const-variable=]
drivers/dma/mv_xor.c:1139:34: warning: 'mv_xor_dt_ids' defined but not used [-Wunused-const-variable=]
This is similar to the existing -Wunused-but-set-variable warning
that was added in an earlier release and that we disable by default
now and only enable when W=1 is set, so it makes sense to do
the same here. Once we have eliminated the majority of the
warnings for both, we can put them back into the default list.
We probably want this in backport kernels as well, to allow building
them with gcc-6 without introducing extra warnings.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Michal Marek <mmarek@suse.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The loop that browses the array compat_hwcap_str will stop when a NULL
is encountered, however NULL is missing at the end of array. This will
lead to overrun until a NULL is found somewhere in the following memory.
In reality, this works out because the compat_hwcap2_str array tends to
follow immediately in memory, and that *is* terminated correctly.
Furthermore, the unsigned int compat_elf_hwcap is checked before
printing each capability, so we end up doing the right thing because
the size of the two arrays is less than 32. Still, this is an obvious
mistake and should be fixed.
Note for backporting: commit 12d11817eaafa414 ("arm64: Move
/proc/cpuinfo handling code") moved this code in v4.4. Prior to that
commit, the same change should be made in arch/arm64/kernel/setup.c.
Fixes: 44b82b7700d0 "arm64: Fix up /proc/cpuinfo" Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
[bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When an IPI is generated by a CPU, the pattern looks roughly like:
<write shared data>
smp_wmb();
<write to GIC to signal SGI>
On the receiving CPU we rely on the fact that, once we've taken the
interrupt, then the freshly written shared data must be visible to us.
Put another way, the CPU isn't going to speculate taking an interrupt.
Unfortunately, this assumption turns out to be broken.
Consider that CPUx wants to send an IPI to CPUy, which will cause CPUy
to read some shared_data. Before CPUx has done anything, a random
peripheral raises an IRQ to the GIC and the IRQ line on CPUy is raised.
CPUy then takes the IRQ and starts executing the entry code, heading
towards gic_handle_irq. Furthermore, let's assume that a bunch of the
previous interrupts handled by CPUy were SGIs, so the branch predictor
kicks in and speculates that irqnr will be <16 and we're likely to
head into handle_IPI. The prefetcher then grabs a speculative copy of
shared_data which contains a stale value.
Meanwhile, CPUx gets round to updating shared_data and asking the GIC
to send an SGI to CPUy. Internally, the GIC decides that the SGI is
more important than the peripheral interrupt (which hasn't yet been
ACKed) but doesn't need to do anything to CPUy, because the IRQ line
is already raised.
CPUy then reads the ACK register on the GIC, sees the SGI value which
confirms the branch prediction and we end up with a stale shared_data
value.
This patch fixes the problem by adding an smp_rmb() to the IPI entry
code in gic_handle_irq. As it turns out, the combination of a control
dependency and an ISB instruction from the EOI in the GICv3 driver is
enough to provide the ordering we need, so we add a comment there
justifying the absence of an explicit smp_rmb().
Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[bwh: Backported to 3.16: drop changes to irq-gic-v3] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The identity mapping is suboptimal for the last 2GB frame. The mapping
will be established with a mix of 4KB and 1MB mappings instead of a
single 2GB mapping.
This happens because of a off-by-one bug introduced with
commit 50be63450728 ("s390/mm: Convert bootmem to memblock").
This lock is already taken in ata_scsi_queuecmd() a few levels up the
call stack so attempting to take it here is an error. Moreover, it is
pointless in the first place since it only protects a single, atomic
assignment.
Enabling lock debugging gives the following output:
=============================================
[ INFO: possible recursive locking detected ]
4.4.0-rc5+ #189 Not tainted
---------------------------------------------
kworker/u2:3/37 is trying to acquire lock:
(&(&host->lock)->rlock){-.-...}, at: [<90283294>] sata_dwc_exec_command_by_tag.constprop.14+0x44/0x8c
but task is already holding lock:
(&(&host->lock)->rlock){-.-...}, at: [<902761ac>] ata_scsi_queuecmd+0x2c/0x330
other info that might help us debug this:
Possible unsafe locking scenario:
Enabling CONFIG_GCOV_PROFILE_ALL produces us a lot of warnings like
lib/lz4/lz4hc_compress.c: In function 'lz4_compresshcctx':
lib/lz4/lz4hc_compress.c:514:1: warning: the frame size of 1504 bytes is larger than 1024 bytes [-Wframe-larger-than=]
After some investigation, I found that this behavior started with gcc-4.9,
and opened https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69702.
A suggested workaround for it is to use the -fno-tree-loop-im
flag that turns off one of the optimization stages in gcc, so the
code runs a little slower but does not use excessive amounts
of stack.
We could make this conditional on the gcc version, but I could not
find an easy way to do this in Kbuild and the benefit would be
fairly small, given that most of the gcc version in production are
affected now.
I'm marking this for 'stable' backports because it addresses a bug
with code generation in gcc that exists in all kernel versions
with the affected gcc releases.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Michal Marek <mmarek@suse.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Writing CP0_Compare clears the timer interrupt pending bit
(CP0_Cause.TI), but this wasn't being done atomically. If a timer
interrupt raced with the write of the guest CP0_Compare, the timer
interrupt could end up being pending even though the new CP0_Compare is
nowhere near CP0_Count.
We were already updating the hrtimer expiry with
kvm_mips_update_hrtimer(), which used both kvm_mips_freeze_hrtimer() and
kvm_mips_resume_hrtimer(). Close the race window by expanding out
kvm_mips_update_hrtimer(), and clearing CP0_Cause.TI and setting
CP0_Compare between the freeze and resume. Since the pending timer
interrupt should not be cleared when CP0_Compare is written via the KVM
user API, an ack argument is added to distinguish the source of the
write.
Fixes: e30492bbe95a ("MIPS: KVM: Rewrite count/compare timer emulation") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim KrÄ\8dmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.16: adjust filenames] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
There's a particularly narrow and subtle race condition when the
software emulated guest timer is frozen which can allow a guest timer
interrupt to be missed.
This happens due to the hrtimer expiry being inexact, so very
occasionally the freeze time will be after the moment when the emulated
CP0_Count transitions to the same value as CP0_Compare (so an IRQ should
be generated), but before the moment when the hrtimer is due to expire
(so no IRQ is generated). The IRQ won't be generated when the timer is
resumed either, since the resume CP0_Count will already match CP0_Compare.
With VZ guests in particular this is far more likely to happen, since
the soft timer may be frozen frequently in order to restore the timer
state to the hardware guest timer. This happens after 5-10 hours of
guest soak testing, resulting in an overflow in guest kernel timekeeping
calculations, hanging the guest. A more focussed test case to
intentionally hit the race (with the help of a new hypcall to cause the
timer state to migrated between hardware & software) hits the condition
fairly reliably within around 30 seconds.
Instead of relying purely on the inexact hrtimer expiry to determine
whether an IRQ should be generated, read the guest CP0_Compare and
directly check whether the freeze time is before or after it. Only if
CP0_Count is on or after CP0_Compare do we check the hrtimer expiry to
determine whether the last IRQ has already been generated (which will
have pushed back the expiry by one timer period).
Fixes: e30492bbe95a ("MIPS: KVM: Rewrite count/compare timer emulation") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim KrÄ\8dmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The interface read URB is submitted in attach, but was only unlinked by
the driver at disconnect.
In case of a late probe error (e.g. due to failed minor allocation),
disconnect is never called and we would end up with active URBs for an
unbound interface. This in turn could lead to deallocated memory being
dereferenced in the completion callback.
Fixes: f7a33e608d9a ("USB: serial: add quatech2 usb to serial driver") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The interface read and event URBs are submitted in attach, but were
never explicitly unlinked by the driver. Instead the URBs would have
been killed by usb-serial core on disconnect.
In case of a late probe error (e.g. due to failed minor allocation),
disconnect is never called and we could end up with active URBs for an
unbound interface. This in turn could lead to deallocated memory being
dereferenced in the completion callbacks.
The interface instat and indat URBs were submitted in attach, but never
unlinked in release before deallocating the corresponding transfer
buffers.
In the case of a late probe error (e.g. due to failed minor allocation),
disconnect would not have been called before release, causing the
buffers to be freed while the URBs are still in use. We'd also end up
with active URBs for an unbound interface.
Fixes: f9c99bb8b3a1 ("USB: usb-serial: replace shutdown with disconnect,
release") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
URBs and buffers allocated in attach for Epic devices would never be
deallocated in case of a later probe error (e.g. failure to allocate
minor numbers) as disconnect is then never called.
Fix by moving deallocation to release and making sure that the
URBs are first unlinked.
Fixes: f9c99bb8b3a1 ("USB: usb-serial: replace shutdown with disconnect,
release") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Private data, URBs and buffers allocated for Epic devices during
attach were never released on errors (e.g. missing endpoints).
Fixes: 6e8cf7751f9f ("USB: add EPIC support to the io_edgeport driver") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Update the recent changes to set_pte() that were added in 46011e6ea392
to handle R10000_LLSC_WAR, and format the assembly to match other areas
of the MIPS tree using the same WAR.
This also incorporates a patch recently sent in my Markos Chandras,
"Remove local LL/SC preprocessor variants", so that patch doesn't need
to be applied if this one is accepted.
Signed-off-by: Joshua Kinard <kumba@gentoo.org> Fixes: 46011e6ea392 ("MIPS: Make set_pte() SMP safe.) Cc: David Daney <david.daney@cavium.com> Cc: Linux/MIPS <linux-mips@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/11103/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
[bwh: Backported to 3.2:
- Use {LL,SC}_INSN not __{LL,SC}
- Use literal arch=r4000 instead of MIPS_ISA_ARCH_LEVEL since R6 is not
supported] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When showing backtraces in response to traps, for example crashes and
address errors (usually unaligned accesses) when they are set in debugfs
to be reported, unwind_stack will be used if the PC was in the kernel
text address range. However since EVA it is possible for user and kernel
address ranges to overlap, and even without EVA userland can still
trigger an address error by jumping to a KSeg0 address.
Adjust the check to also ensure that it was running in kernel mode. I
don't believe any harm can come of this problem, since unwind_stack() is
sufficiently defensive, however it is only meant for unwinding kernel
code, so to be correct it should use the raw backtracing instead.
Signed-off-by: James Hogan <james.hogan@imgtec.com> Reviewed-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11701/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
(cherry picked from commit d2941a975ac745c607dfb590e92bb30bc352dad9) Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When unwinding through IRQs and exceptions, the unwinding only continues
if the PC is a kernel text address, however since EVA it is possible for
user and kernel address ranges to overlap, potentially allowing
unwinding to continue to user mode if the user PC happens to be in the
kernel text address range.
Adjust the check to also ensure that the register state from before the
exception is actually running in kernel mode, i.e. !user_mode(regs).
I don't believe any harm can come of this problem, since the PC is only
output, the stack pointer is checked to ensure it resides within the
task's stack page before it is dereferenced in search of the return
address, and the return address register is similarly only output (if
the PC is in a leaf function or the beginning of a non-leaf function).
However unwind_stack() is only meant for unwinding kernel code, so to be
correct the unwind should stop there.
Signed-off-by: James Hogan <james.hogan@imgtec.com> Reviewed-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11700/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
BMIPS5000 have a PrID value of 0x5A00 and BMIPS5200 have a PrID value of
0x5B00, which, masked with 0x5A00, returns 0x5A00. Update all conditionals on
the PrID to cover both variants since we are going to need this to enable
BMIPS5200 SMP. The existing check, masking with 0xFF00 would not cover
BMIPS5200 at all.
Commit 85efde6f4e0d ("make exported headers use strict posix types")
changed the asm-generic siginfo.h to use the __kernel_* types, and
commit 3a471cbc081b ("remove __KERNEL_STRICT_NAMES") make the internal
types accessible only to the kernel, but the MIPS implementation hasn't
been updated to match.
Switch to proper types now so that the exported asm/siginfo.h won't
produce quite so many compiler errors when included alone by a user
program.
Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Christopher Ferris <cferris@google.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/12477/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently, pmd_present() only checks for a non-zero value, returning
true even after pmd_mknotpresent() (which only clears the type bits).
This patch converts pmd_present() to using pte_present(), similar to the
other pmd_*() checks. As a side effect, it will return true for
PROT_NONE mappings, though they are not yet used by the kernel with
transparent huge pages.
For consistency, also change pmd_mknotpresent() to only clear the
PMD_SECT_VALID bit, even though the PMD_TABLE_BIT is already 0 for block
mappings (no functional change). The unused PMD_SECT_PROT_NONE
definition is removed as transparent huge pages use the pte page prot
values.
Fixes: 9c7e535fcc17 ("arm64: mm: Route pmd thp functions through pte equivalents") Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When filesystem is corrupted in the right way, it can happen
ext4_mark_iloc_dirty() in ext4_orphan_add() returns error and we
subsequently remove inode from the in-memory orphan list. However this
deletion is done with list_del(&EXT4_I(inode)->i_orphan) and thus we
leave i_orphan list_head with a stale content. Later we can look at this
content causing list corruption, oops, or other issues. The reported
trace looked like:
The problem with ornamental, do-nothing gotos is that they lead to
"forgot to set the error code" bugs. We should be returning -EINVAL
here but we don't. It leads to an uninitalized variable in
counter_show():
drivers/acpi/sysfs.c:603 counter_show()
error: uninitialized symbol 'status'.
Fixes: 1c8fce27e275 (ACPI: introduce drivers/acpi/sysfs.c) Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently, the userspace governor only updates frequency on GOV_LIMITS
if policy->cur falls outside policy->{min/max}. However, it is also
necessary to update current frequency on GOV_LIMITS to match the user
requested value if it can be achieved within the new policy->{max/min}.
This was previously the behaviour in the governor until commit d1922f0
("cpufreq: Simplify userspace governor") which incorrectly assumed that
policy->cur == user requested frequency via scaling_setspeed. This won't
be true if the user requested frequency falls outside policy->{min/max}.
Ex: a temporary thermal cap throttled the user requested frequency.
Fix this by storing the user requested frequency in a seperate variable.
The governor will then try to achieve this request on every GOV_LIMITS
change.
Fixes: d1922f02562f (cpufreq: Simplify userspace governor) Signed-off-by: Sai Gurrappadi <sgurrappadi@nvidia.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: lei liu <liu.lei78@zte.com.cn>
[johan: rebase and replace commit message ] Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: lei liu <liu.lei78@zte.com.cn>
[properly sort them - gregkh] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Johan Hovold <johan@kernel.org>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When csw->con_startup() fails in do_register_con_driver, we return no
error (i.e. 0). This was changed back in 2006 by commit 3e795de763.
Before that we used to return -ENODEV.
So fix the return value to be -ENODEV in that case again.
Fixes: 3e795de763 ("VT binding: Add binding/unbinding support for the VT console") Signed-off-by: Jiri Slaby <jslaby@suse.cz> Reported-by: "Dan Carpenter" <dan.carpenter@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The bar number is found in reg2 within the gdd. Therefore
we need to change the assigment from reg1 to reg2 which
is the correct location.
Signed-off-by: Andreas Werner <andreas.werner@men.de> Fixes: '3764e82e5' drivers: Introduce MEN Chameleon Bus Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Added support for Gemalto's Cinterion PH8 and AHxx products
with 2 RmNet Interfaces and products with 1 RmNet + 1 USB Audio interface.
In addition some minor renaming and formatting.
Signed-off-by: Hans-Christoph Schemmel <hans-christoph.schemmel@gemalto.com>
[johan: sort current entries and trim trailing whitespace ] Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
dev_dbg_ratelimited() is a macro that ignores its first argument when DEBUG is
not set, which can lead to unused variable warnings:
ethernet/mellanox/mlxsw/pci.c: In function 'mlxsw_pci_cqe_sdq_handle':
ethernet/mellanox/mlxsw/pci.c:646:18: warning: unused variable 'pdev' [-Wunused-variable]
ethernet/mellanox/mlxsw/pci.c: In function 'mlxsw_pci_cqe_rdq_handle':
ethernet/mellanox/mlxsw/pci.c:671:18: warning: unused variable 'pdev' [-Wunused-variable]
The macro already ensures that all its other arguments are silently
ignored by the compiler without triggering a warning, through the
use of the no_printk() macro, but the dev argument is not passed into
that.
This changes the definition to use the same trick as no_printk() with
an if(0) that leads the compiler to not evaluate the side-effects but
still see that 'dev' might not be unused.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Andrew Lunn <andrew@lunn.ch> Fixes: 6f586e663e3b ("driver-core: Shut up dev_dbg_reatelimited() without DEBUG") Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
According to full-history-linux commit d3794f4fa7c3edc3 ("[PATCH] M68k
update (part 25)"), port operations are allowed on m68k if CONFIG_ISA is
defined.
However, commit 153dcc54df826d2f ("[PATCH] mem driver: fix conditional
on isa i/o support") accidentally changed an "||" into an "&&",
disabling it completely on m68k. This logic was retained when
introducing the DEVPORT symbol in commit 4f911d64e04a44c4 ("Make
/dev/port conditional on config symbol").
Drop the bogus dependency on !M68K to fix this.
Fixes: 153dcc54df826d2f ("[PATCH] mem driver: fix conditional on isa i/o support") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Al Stone <ahs3@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
OpenSSH expects the (non-blocking) read() of pty master to return
EAGAIN only if it has received all of the slave-side output after
it has received SIGCHLD. This used to work on pre-3.12 kernels.
This fix effectively forces non-blocking read() and poll() to
block for parallel i/o to complete for all ttys. It also unwinds
these changes:
Inspired by analysis and patch from Marc Aurele La France <tsi@tuyoix.net>
Reported-by: Volth <openssh@volth.com> Reported-by: Marc Aurele La France <tsi@tuyoix.net> BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=52 BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=2492 Signed-off-by: Brian Bloniarz <brian.bloniarz@gmail.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16:
- No need to unwind commits 2 and 3
- Keep using tty_flush_to_ldisc() rather than adding tty_buffer_flush_work()]] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
This makes the ath79 bootconsole behave the same way as the generic 8250
bootconsole.
Also waiting for TEMT (transmit buffer is empty) instead of just THRE
(transmit buffer is not full) ensures that all characters have been
transmitted before the real serial driver starts reconfiguring the serial
controller (which would sometimes result in garbage being transmitted.)
This change does not cause a visible performance loss.
In addition, this seems to fix a hang observed in certain configurations on
many AR7xxx/AR9xxx SoCs during autoconfig of the real serial driver.
A more complete follow-up patch will disable 8250 autoconfig for ath79
altogether (the serial controller is detected as a 16550A, which is not
fully compatible with the ath79 serial, and the autoconfig may lead to
undefined behavior on ath79.)
Instead of just printing warning messages, if the orphan list is
corrupted, declare the file system is corrupted. If there are any
reserved inodes in the orphaned inode list, declare the file system
corrupted and stop right away to avoid doing more potential damage to
the file system.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.16: leave error code as EIO] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
If the orphaned inode list contains inode #5, ext4_iget() returns a
bad inode (since the bootloader inode should never be referenced
directly). Because of the bad inode, we end up processing the inode
repeatedly and this hangs the machine.
(But don't do this if you are using an unpatched kernel if you care
about the system staying functional. :-)
This bug was found by the port of American Fuzzy Lop into the kernel
to find file system problems[1]. (Since it *only* happens if inode #5
shows up on the orphan list --- 3, 7, 8, etc. won't do it, it's not
surprising that AFL needed two hours before it found it.)
Typically under error conditions, it is possible for aac_command_thread()
to miss the wakeup from kthread_stop() and go back to sleep, causing it
to hang aac_shutdown.
In the observed scenario, the adapter is not functioning correctly and so
aac_fib_send() never completes (or time-outs depending on how it was
called). Shortly after aac_command_thread() starts it performs
aac_fib_send(SendHostTime) which hangs. When aac_probe_one
/aac_get_adapter_info send time outs, kthread_stop is called which breaks
the command thread out of it's hang.
The code will still go back to sleep in schedule_timeout() without
checking kthread_should_stop() so it causes aac_probe_one to hang until
the schedule_timeout() which is 30 minutes.
Fixed by: Adding another kthread_should_stop() before schedule_timeout() Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
aac_fib_send has a special function case for initial commands during
driver initialization using wait < 0(pseudo sync mode). In this case,
the command does not sleep but rather spins checking for timeout.This
loop is calls cpu_relax() in an attempt to allow other processes/threads
to use the CPU, but this function does not relinquish the CPU and so the
command will hog the processor. This was observed in a KDUMP
"crashkernel" and that prevented the "command thread" (which is
responsible for completing the command from being timed out) from
starting because it could not get the CPU.
Fixed by replacing "cpu_relax()" call with "schedule()" Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The ARM architecture mandates that when changing a page table entry
from a valid entry to another valid entry, an invalid entry is first
written, TLB invalidated, and only then the new entry being written.
The current code doesn't respect this, directly writing the new
entry and only then invalidating TLBs. Let's fix it up.
Reported-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
32-bit ioctl uses these rather than the regular FS_IOC_* versions. They can
be handled in btrfs using the same code. Without this, 32-bit {ch,ls}attr
fail.
Signed-off-by: Luke Dashjr <luke-jr+git@utopios.org> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
sg_dma_len() macro can be used only on scattelists which are mapped, so
all calls to it before dma_map_sg() are invalid. Replace them by proper
check for direct sg segment length read.
Fixes: a49e490c7a8a ("crypto: s5p-sss - add S5PV210 advanced crypto engine support") Fixes: 9e4a1100a445 ("crypto: s5p-sss - Handle unaligned buffers") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Vladimir Zapolskiy <vz@mleia.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
[bwh: Backported to 3.16: unaligned DMA is unsupported so there is a different
set of calls to replace] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The alpha pci_mmap_resource() is used for both IORESOURCE_MEM and
IORESOURCE_IO resources, but iomem_is_exclusive() is only applicable for
IORESOURCE_MEM.
Call iomem_is_exclusive() only for IORESOURCE_MEM resources, and do it
earlier to match the generic version of pci_mmap_resource().
Fixes: 10a0ef39fbd1 ("PCI/alpha: pci sysfs resources") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
iomem_is_exclusive() requires a CPU physical address, but on some arches we
supplied a PCI bus address instead.
On most arches, pci_resource_to_user(res) returns "res->start", which is a
CPU physical address. But on microblaze, mips, powerpc, and sparc, it
returns the PCI bus address corresponding to "res->start".
The result is that pci_mmap_resource() may fail when it shouldn't (if the
bus address happens to match an existing resource), or it may succeed when
it should fail (if the resource is exclusive but the bus address doesn't
match it).
Call iomem_is_exclusive() with "res->start", which is always a CPU physical
address, not the result of pci_resource_to_user().
Fixes: e8de1481fd71 ("resource: allow MMIO exclusivity for device drivers") Suggested-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The tcrypt testing module on Exynos5422-based Odroid XU3/4 board failed on
testing 8 kB size blocks:
$ sudo modprobe tcrypt sec=1 mode=500
testing speed of async ecb(aes) (ecb-aes-s5p) encryption
test 0 (128 bit key, 16 byte blocks): 21971 operations in 1 seconds (351536 bytes)
test 1 (128 bit key, 64 byte blocks): 21731 operations in 1 seconds (1390784 bytes)
test 2 (128 bit key, 256 byte blocks): 21932 operations in 1 seconds (5614592 bytes)
test 3 (128 bit key, 1024 byte blocks): 21685 operations in 1 seconds (22205440 bytes)
test 4 (128 bit key, 8192 byte blocks):
This was caused by a race issue of missed BRDMA_DONE ("Block cipher
Receiving DMA") interrupt. Device starts processing the data in DMA mode
immediately after setting length of DMA block: receiving (FCBRDMAL) or
transmitting (FCBTDMAL). The driver sets these lengths from interrupt
handler through s5p_set_dma_indata() function (or xxx_setdata()).
However the interrupt handler was first dealing with receive buffer
(dma-unmap old, dma-map new, set receive block length which starts the
operation), then with transmit buffer and finally was clearing pending
interrupts (FCINTPEND). Because of the time window between setting
receive buffer length and clearing pending interrupts, the operation on
receive buffer could end already and driver would miss new interrupt.
User manual for Exynos5422 confirms in example code that setting DMA
block lengths should be the last operation.
The tcrypt hang could be also observed in following blocked-task dmesg:
INFO: task modprobe:258 blocked for more than 120 seconds.
Not tainted 4.6.0-rc4-next-20160419-00005-g9eac8b7b7753-dirty #42
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
modprobe D c06b09d8 0 258 256 0x00000000
[<c06b09d8>] (__schedule) from [<c06b0f24>] (schedule+0x40/0xac)
[<c06b0f24>] (schedule) from [<c06b49f8>] (schedule_timeout+0x124/0x178)
[<c06b49f8>] (schedule_timeout) from [<c06b17fc>] (wait_for_common+0xb8/0x144)
[<c06b17fc>] (wait_for_common) from [<bf0013b8>] (test_acipher_speed+0x49c/0x740 [tcrypt])
[<bf0013b8>] (test_acipher_speed [tcrypt]) from [<bf003e8c>] (do_test+0x2240/0x30ec [tcrypt])
[<bf003e8c>] (do_test [tcrypt]) from [<bf008048>] (tcrypt_mod_init+0x48/0xa4 [tcrypt])
[<bf008048>] (tcrypt_mod_init [tcrypt]) from [<c010177c>] (do_one_initcall+0x3c/0x16c)
[<c010177c>] (do_one_initcall) from [<c0191ff0>] (do_init_module+0x5c/0x1ac)
[<c0191ff0>] (do_init_module) from [<c0185610>] (load_module+0x1a30/0x1d08)
[<c0185610>] (load_module) from [<c0185ab0>] (SyS_finit_module+0x8c/0x98)
[<c0185ab0>] (SyS_finit_module) from [<c01078c0>] (ret_fast_syscall+0x0/0x3c)
Fixes: a49e490c7a8a ("crypto: s5p-sss - add S5PV210 advanced crypto engine support") Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Huang has reported that in his powerfail testing he is seeing stale
block contents in some of recently allocated blocks although he mounts
ext4 in data=ordered mode. After some investigation I have found out
that indeed when delayed allocation is used, we don't add inode to
transaction's list of inodes needing flushing before commit. Originally
we were doing that but commit f3b59291a69d removed the logic with a
flawed argument that it is not needed.
The problem is that although for delayed allocated blocks we write their
contents immediately after allocating them, there is no guarantee that
the IO scheduler or device doesn't reorder things and thus transaction
allocating blocks and attaching them to inode can reach stable storage
before actual block contents. Actually whenever we attach freshly
allocated blocks to inode using a written extent, we should add inode to
transaction's ordered inode list to make sure we properly wait for block
contents to be written before committing the transaction. So that is
what we do in this patch. This also handles other cases where stale data
exposure was possible - like filling hole via mmap in
data=ordered,nodelalloc mode.
The only exception to the above rule are extending direct IO writes where
blkdev_direct_IO() waits for IO to complete before increasing i_size and
thus stale data exposure is not possible. For now we don't complicate
the code with optimizing this special case since the overhead is pretty
low. In case this is observed to be a performance problem we can always
handle it using a special flag to ext4_map_blocks().
Fixes: f3b59291a69d0b734be1fc8be489fef2dd846d3d Reported-by: "HUANG Weller (CM/ESW12-CN)" <Weller.Huang@cn.bosch.com> Tested-by: "HUANG Weller (CM/ESW12-CN)" <Weller.Huang@cn.bosch.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.16:
- Drop check for EXT4_GET_BLOCKS_ZERO flag
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
As pm_runtime_set_active() may fail because the device's parent isn't
active, we can end up executing the ->runtime_resume() callback for the
device when it isn't allowed.
Fix this by invoking pm_runtime_set_active() before running the callback
and let's also deal with the error code.
Fixes: 37f204164dfb (PM: Add pm_runtime_suspend|resume_force functions) Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Some of the interrupt vectors on 64-bit POWER server processors are only
32 bytes long (8 instructions), which is not enough for the full
first-level interrupt handler. For these we need to branch to an
out-of-line (OOL) handler. But when we are running a relocatable kernel,
interrupt vectors till __end_interrupts marker are copied down to real
address 0x100. So, branching to labels (ie. OOL handlers) outside this
section must be handled differently (see LOAD_HANDLER()), considering
relocatable kernel, which would need at least 4 instructions.
However, branching from interrupt vector means that we corrupt the
CFAR (come-from address register) on POWER7 and later processors as
mentioned in commit 1707dd16. So, EXCEPTION_PROLOG_0 (6 instructions)
that contains the part up to the point where the CFAR is saved in the
PACA should be part of the short interrupt vectors before we branch out
to OOL handlers.
But as mentioned already, there are interrupt vectors on 64-bit POWER
server processors that are only 32 bytes long (like vectors 0x4f00,
0x4f20, etc.), which cannot accomodate the above two cases at the same
time owing to space constraint. Currently, in these interrupt vectors,
we simply branch out to OOL handlers, without using LOAD_HANDLER(),
which leaves us vulnerable when running a relocatable kernel (eg. kdump
case). While this has been the case for sometime now and kdump is used
widely, we were fortunate not to see any problems so far, for three
reasons:
1. In almost all cases, production kernel (relocatable) is used for
kdump as well, which would mean that crashed kernel's OOL handler
would be at the same place where we end up branching to, from short
interrupt vector of kdump kernel.
2. Also, OOL handler was unlikely the reason for crash in almost all
the kdump scenarios, which meant we had a sane OOL handler from
crashed kernel that we branched to.
3. On most 64-bit POWER server processors, page size is large enough
that marking interrupt vector code as executable (see commit 429d2e83) leads to marking OOL handler code from crashed kernel,
that sits right below interrupt vector code from kdump kernel, as
executable as well.
Let us fix this by moving the __end_interrupts marker down past OOL
handlers to make sure that we also copy OOL handlers to real address
0x100 when running a relocatable kernel.
This fix has been tested successfully in kdump scenario, on an LPAR with
4K page size by using different default/production kernel and kdump
kernel.
Also tested by manually corrupting the OOL handlers in the first kernel
and then kdump'ing, and then causing the OOL handlers to fire - mpe.
Fixes: c1fb6816fb1b ("powerpc: Add relocation on exception vector handlers") Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
hci_vhci driver creates a hci device object dynamically upon each
HCI_VENDOR_PKT write. Although it checks the already created object
and returns an error, it's still racy and may build multiple hci_dev
objects concurrently when parallel writes are performed, as the device
tracks only a single hci_dev object.
This patch introduces a mutex to protect against the concurrent device
creations.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Commit 834392a7d92677ff ("serial: doc: Un-document non-existing
uart_write_console()") removed a paragraph about a helper function that
seemed to never exist.
Peter Hurley pointed out that the function does exist, but is called
differently. Re-add the paragraph, with the function name corrected.
Fixes: 834392a7d92677ff ("serial: doc: Un-document non-existing uart_write_console()") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Now that we've done a more comprehensive fix with the intermediate
target state we can remove the previous hack introduced with commit 90a88d6ef88e ("scsi: fix soft lockup in scsi_remove_target() on module
removal").
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Add intermediate STARGET_REMOVE state to scsi_target_state to avoid
running into the BUG_ON() in scsi_target_reap(). The STARGET_REMOVE
state is only valid in the path from scsi_remove_target() to
scsi_target_destroy() indicating this target is going to be removed.
This re-fixes the problem introduced in commits bc3f02a795d3 ("[SCSI]
scsi_remove_target: fix softlockup regression on hot remove") and 40998193560d ("scsi: restart list search after unlock in
scsi_remove_target") in a more comprehensive way.
[mkp: Included James' fix for scsi_target_destroy()]
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: 40998193560dab6c3ce8d25f4fa58a23e252ef38 Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: James Bottomley <jejb@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Two concurrent writes into the same register cacheline has the chance of
killing the machine on Ivybridge and other gen7. This includes LRI
emitted from the command parser. The MI_SET_CONTEXT itself serves as
serialising barrier and prevents the pair of register writes in the first
packet from triggering the fault. However, if a second switch-context
immediately occurs then we may have two adjacent blocks of LRI to the
same registers which may then trigger the hang. To counteract this we
need to insert a delay after the second register write using SRM.
This is easiest to reproduce with something like
igt/gem_ctx_switch/interruptible that triggers back-to-back context
switches (with no operations in between them in the command stream,
which requires the execbuf operation to be interrupted after the
MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
interruptible igt. No reports from the wild though, so it must be of low
enough frequency that no one has correlated the random machine freezes
with i915.ko
The issue was introduced with
commit 2c550183476dfa25641309ae9a28d30feed14379 [v3.19]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Dec 16 10:02:27 2014 +0000
drm/i915: Disable PSMI sleep messages on all rings around context switches
Testcase: igt/gem_ctx_switch/render-interruptible #ivb Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Daniel Vetter <daniel@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-11-git-send-email-chris@chris-wilson.co.uk
[bwh: Backported to 3.16:
- Pass ring, not engine, to intel_ring_emit()
- Register type is u32 not i915_reg_t
- MI_STORE_REGISTER_MEM is a function-macro] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The original tokenized iid support implemented via f53adae4eae5 ("net: ipv6:
add tokenized interface identifier support") didn't allow for clearing a
device token as it was intended that this addressing mode was the only one
active for globally scoped IPv6 addresses. Later we relaxed that restriction
via 617fe29d45bd ("net: ipv6: only invalidate previously tokenized addresses"),
and we should also allow for clearing tokens as there's no good reason why
it shouldn't be allowed.
Fixes: 617fe29d45bd ("net: ipv6: only invalidate previously tokenized addresses") Reported-by: Robin H. Johnson <robbat2@gentoo.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The "handled" variable could be uninitialized if the
interrupt_service_routine() call back hasn't been implimented or if it
has been implemented but doesn't initialize "handled" to zero at the
start. For example, adv76xx_isr() only sets "handled" to true.
Fixes: 44b153ca639f ('[media] m5mols: Add ISO sensitivity controls') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Instead to being true/false, the "handled" is true/uninitialized.
Presumably this doesn't cause that many problems in real life because
normally we handle the IRQ.
Fixes: eea6b7cc53aa ('mfd: Add lp8788 mfd driver') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Milo Kim <milo.kim@ti.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The omap_hwmod _enable() function can return success without setting
the hwmod state to _HWMOD_STATE_ENABLED for IPs with reset lines when
all of the reset lines are asserted. The omap_hwmod _idle() function
also performs a similar check, but after checking for the hwmod state
first. This triggers the WARN when pm_runtime_get and pm_runtime_put
are invoked on IPs with all reset lines asserted. Reverse the checks
for hwmod state and reset lines status to fix this.
Issue found during a unbind operation on a device with reset lines
still asserted, example backtrace below
------------[ cut here ]------------
WARNING: CPU: 1 PID: 879 at arch/arm/mach-omap2/omap_hwmod.c:2207 _idle+0x1e4/0x240()
omap_hwmod: mmu_dsp: idle state can only be entered from enabled state
Modules linked in:
CPU: 1 PID: 879 Comm: sh Not tainted 4.4.0-00008-ga989d951331a #3
Hardware name: Generic OMAP5 (Flattened Device Tree)
[<c0018e60>] (unwind_backtrace) from [<c0014dc4>] (show_stack+0x10/0x14)
[<c0014dc4>] (show_stack) from [<c037ac28>] (dump_stack+0x90/0xc0)
[<c037ac28>] (dump_stack) from [<c003f420>] (warn_slowpath_common+0x78/0xb4)
[<c003f420>] (warn_slowpath_common) from [<c003f48c>] (warn_slowpath_fmt+0x30/0x40)
[<c003f48c>] (warn_slowpath_fmt) from [<c0028c20>] (_idle+0x1e4/0x240)
[<c0028c20>] (_idle) from [<c0029080>] (omap_hwmod_idle+0x28/0x48)
[<c0029080>] (omap_hwmod_idle) from [<c002a5a4>] (omap_device_idle+0x3c/0x90)
[<c002a5a4>] (omap_device_idle) from [<c0427a90>] (__rpm_callback+0x2c/0x60)
[<c0427a90>] (__rpm_callback) from [<c0427ae4>] (rpm_callback+0x20/0x80)
[<c0427ae4>] (rpm_callback) from [<c0427f84>] (rpm_suspend+0x138/0x74c)
[<c0427f84>] (rpm_suspend) from [<c0428b78>] (__pm_runtime_idle+0x78/0xa8)
[<c0428b78>] (__pm_runtime_idle) from [<c041f514>] (__device_release_driver+0x64/0x100)
[<c041f514>] (__device_release_driver) from [<c041f5d0>] (device_release_driver+0x20/0x2c)
[<c041f5d0>] (device_release_driver) from [<c041d85c>] (unbind_store+0x78/0xf8)
[<c041d85c>] (unbind_store) from [<c0206df8>] (kernfs_fop_write+0xc0/0x1c4)
[<c0206df8>] (kernfs_fop_write) from [<c018a120>] (__vfs_write+0x20/0xdc)
[<c018a120>] (__vfs_write) from [<c018a9cc>] (vfs_write+0x90/0x164)
[<c018a9cc>] (vfs_write) from [<c018b1f0>] (SyS_write+0x44/0x9c)
[<c018b1f0>] (SyS_write) from [<c0010420>] (ret_fast_syscall+0x0/0x1c)
---[ end trace a4182013c75a9f50 ]---
While at this, fix the sequence in _shutdown() as well, though there
is no easy reproducible scenario.
Fixes: 747834ab8347 ("ARM: OMAP2+: hwmod: revise hardreset behavior") Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently the 'registered' member of the cpuidle_device struct is set
to 1 during cpuidle_register_device. In this same function there are
checks to see if the device is already registered to prevent duplicate
calls to register the device, but this value is never set to 0 even on
unregister of the device. Because of this, any attempt to call
cpuidle_register_device after a call to cpuidle_unregister_device will
fail which shouldn't be the case.
To prevent this, set registered to 0 when the device is unregistered.
Fixes: c878a52d3c7c (cpuidle: Check if device is already registered) Signed-off-by: Dave Gerlach <d-gerlach@ti.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The write handler allocates skbs and queues them into data->readq.
Read side should read them, if there is any. If there is none, skbs
should be dropped by hdev->flush. But this happens only if the device
is HCI_UP, i.e. hdev->power_on work was triggered already. When it was
not, skbs stay allocated in the queue when /dev/vhci is closed. So
purge the queue in ->release.
Program to reproduce:
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
Both vhci_get_user and vhci_release race with open_timeout work. They
both contain cancel_delayed_work_sync, but do not test whether the
work actually created hdev or not. Since the work can be in progress
and _sync will wait for finishing it, we can have data->hdev allocated
when cancel_delayed_work_sync returns. But the call sites do 'if
(data->hdev)' *before* cancel_delayed_work_sync.
As a result:
* vhci_get_user allocates a second hdev and puts it into
data->hdev. The former is leaked.
* vhci_release does not release data->hdev properly as it thinks there
is none.
Fix both cases by moving the actual test *after* the call to
cancel_delayed_work_sync.
This can be hit by this program:
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
int main(int argc, char **argv)
{
int fd;
srand(time(NULL));
while (1) {
const int delta = (rand() % 200 - 100) * 100;
fd = open("/dev/vhci", O_RDWR);
if (fd < 0)
err(1, "open");
And the result is:
BUG: KASAN: use-after-free in skb_queue_tail+0x13e/0x150 at addr ffff88006b0c1228
Read of size 8 by task kworker/u13:1/32068
=============================================================================
BUG kmalloc-192 (Tainted: G E ): kasan: bad access detected
-----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: Allocated in vhci_open+0x50/0x330 [hci_vhci] age=260 cpu=3 pid=32040
...
kmem_cache_alloc_trace+0x150/0x190
vhci_open+0x50/0x330 [hci_vhci]
misc_open+0x35b/0x4e0
chrdev_open+0x23b/0x510
...
INFO: Freed in vhci_release+0xa4/0xd0 [hci_vhci] age=9 cpu=2 pid=32040
...
__slab_free+0x204/0x310
vhci_release+0xa4/0xd0 [hci_vhci]
...
INFO: Slab 0xffffea0001ac3000 objects=16 used=13 fp=0xffff88006b0c1e00 flags=0x5fffff80004080
INFO: Object 0xffff88006b0c1200 @offset=4608 fp=0xffff88006b0c0600
Bytes b4 ffff88006b0c11f0: 09 df 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff88006b0c1200: 00 06 0c 6b 00 88 ff ff 00 00 00 00 00 00 00 00 ...k............
Object ffff88006b0c1210: 10 12 0c 6b 00 88 ff ff 10 12 0c 6b 00 88 ff ff ...k.......k....
Object ffff88006b0c1220: c0 46 c2 6b 00 88 ff ff c0 46 c2 6b 00 88 ff ff .F.k.....F.k....
Object ffff88006b0c1230: 01 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00 ................
Object ffff88006b0c1240: 40 12 0c 6b 00 88 ff ff 40 12 0c 6b 00 88 ff ff @..k....@..k....
Object ffff88006b0c1250: 50 0d 6e a0 ff ff ff ff 00 02 00 00 00 00 ad de P.n.............
Object ffff88006b0c1260: 00 00 00 00 00 00 00 00 ab 62 02 00 01 00 00 00 .........b......
Object ffff88006b0c1270: 90 b9 19 81 ff ff ff ff 38 12 0c 6b 00 88 ff ff ........8..k....
Object ffff88006b0c1280: 03 00 20 00 ff ff ff ff ff ff ff ff 00 00 00 00 .. .............
Object ffff88006b0c1290: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff88006b0c12a0: 00 00 00 00 00 00 00 00 00 80 cd 3d 00 88 ff ff ...........=....
Object ffff88006b0c12b0: 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 . ..............
Redzone ffff88006b0c12c0: bb bb bb bb bb bb bb bb ........
Padding ffff88006b0c13f8: 00 00 00 00 00 00 00 00 ........
CPU: 3 PID: 32068 Comm: kworker/u13:1 Tainted: G B E 4.4.6-0-default #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20151112_172657-sheep25 04/01/2014
Workqueue: hci0 hci_cmd_work [bluetooth] 00000000ffffffffffffffff81926cfaffff88006be37c68ffff88006bc27180 ffff88006b0c1200ffff88006b0c1234ffffffff81577993ffffffff82489320 ffff88006bc242400000000000000046ffff88006a100000000000026e51eb80
Call Trace:
...
[<ffffffff81ec8ebe>] ? skb_queue_tail+0x13e/0x150
[<ffffffffa06e027c>] ? vhci_send_frame+0xac/0x100 [hci_vhci]
[<ffffffffa0c61268>] ? hci_send_frame+0x188/0x320 [bluetooth]
[<ffffffffa0c61515>] ? hci_cmd_work+0x115/0x310 [bluetooth]
[<ffffffff811a1375>] ? process_one_work+0x815/0x1340
[<ffffffff811a1f85>] ? worker_thread+0xe5/0x11f0
[<ffffffff811a1ea0>] ? process_one_work+0x1340/0x1340
[<ffffffff811b3c68>] ? kthread+0x1c8/0x230
...
Memory state around the buggy address: ffff88006b0c1100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88006b0c1180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88006b0c1200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff88006b0c1280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff88006b0c1300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Fix possible out of bounds read, by adding missing comma.
The code may read pass the end of the dsi_errors array
when the most significant bit (bit #31) in the intr_stat register
is set.
This bug has been detected using CppCheck (static analysis tool).
Today, a kernel which refuses to mount a filesystem read-write
due to unknown ro-compat features can still transition to read-write
via the remount path. The old kernel is most likely none the wiser,
because it's unaware of the new feature, and isn't using it. However,
writing to the filesystem may well corrupt metadata related to that
new feature, and moving to a newer kernel which understand the feature
will have problems.
Right now the only ro-compat feature we have is the free inode btree,
which showed up in v3.16. It would be good to push this back to
all the active stable kernels, I think, so that if anyone is using
newer mkfs (which enables the finobt feature) with older kernel
releases, they'll be protected.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
BugLink: http://bugs.launchpad.net/bugs/972604
Commit 09c9bae26b0d3c9472cb6ae45010460a2cee8b8d ("ath5k: add led pin
configuration for compaq c700 laptop") added a pin configuration for the Compaq
c700 laptop. However, the polarity of the led pin is reversed. It should be
red for wifi off and blue for wifi on, but it is the opposite. This bug was
reported in the following bug report:
http://pad.lv/972604
Fixes: 09c9bae26b0d3c9472cb6ae45010460a2cee8b8d ("ath5k: add led pin configuration for compaq c700 laptop") Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Commit 2d0a7addbd10 ("ARM: Kirkwood: Add support for many Synology
NAS devices") created the new file kirkwood-ds112.dts but did not
add it to the Makefile.
Fixes: 2d0a7addbd10 ("ARM: Kirkwood: Add support for many Synology NAS devices") Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Setting the flag 'cache_bypass' will bypass the cache not the hardware.
Fix this comment here.
Fixes: 0eef6b0415f5 ("regmap: Fix doc comment") Signed-off-by: Andrew F. Davis <afd@ti.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In order to be able to use pull_dl_task() from a callback, we need to
do away with the return value.
Since the return value indicates if we should reschedule, do this
inside the function. Since not all callers currently do this, this can
increase the number of reschedules due rt balancing.
Too many reschedules is not a correctness issues, too few are.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: ktkhai@parallels.com Cc: rostedt@goodmis.org Cc: juri.lelli@gmail.com Cc: pang.xunlei@linaro.org Cc: oleg@redhat.com Cc: wanpeng.li@linux.intel.com Cc: umgwanakikbuti@gmail.com Link: http://lkml.kernel.org/r/20150611124742.859398977@infradead.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[Conflicts: kernel/sched/deadline.c] Signed-off-by: Byungchul Park <byungchul.park@lge.com>
[bwh: Backported to 3.16: use resched_task() instead of resched_curr()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>