Vladimir Murzin [Mon, 26 Jun 2017 09:18:57 +0000 (10:18 +0100)]
drivers: dma-coherent: Account dma_pfn_offset when used with device tree
dma_declare_coherent_memory() and friends are designed to account
difference in CPU and device addresses. However, when it is used with
reserved memory regions there is assumption that CPU and device have
the same view on address space. This assumption gets invalid when
reserved memory for coherent DMA allocations is referenced by device
with non-empty "dma-range" property.
Simply feeding device address as rmem->base + dev->dma_pfn_offset
would not work due to reserved memory region can be shared, so this
patch turns device address to be expressed with help of CPU address
and device's dma_pfn_offset in case memory reservation has been done
via device tree; non device tree users continue to use the old scheme.
Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Roger Quadros <rogerq@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org> Tested-by: Andras Szemzo <sza@esh.hu> Tested-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Vladimir Murzin [Mon, 26 Jun 2017 09:18:55 +0000 (10:18 +0100)]
dma: Take into account dma_pfn_offset
Even though dma-noop-ops assumes 1:1 memory mapping DMA memory range
can be different to RAM. For example, ARM STM32F4 MCU offers the
possibility to remap SDRAM from 0xc000_0000 to 0x0 to get CPU
performance boost, but DMA continue to see SDRAM at 0xc000_0000. This
difference in mapping is handled via device-tree "dma-range" property
which leads to dev->dma_pfn_offset is set nonzero. To handle such
cases take dma_pfn_offset into account.
Cc: Joerg Roedel <jroedel@suse.de> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Reported-by: Benjamin Gaignard <benjamin.gaignard@linaro.org> Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org> Tested-by: Andras Szemzo <sza@esh.hu> Tested-by: Alexandre TORGUE <alexandre.torgue@st.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
dma-mapping: replace dmam_alloc_noncoherent with dmam_alloc_attrs
dmam_alloc_noncoherent is a trivial wrapper around dmam_alloc_attrs,
that hardcodes one particular flag. Make the devres code more
flexible by allowing the callers to pass arbitrary flags.
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org>
Arnd Bergmann [Thu, 22 Jun 2017 12:35:46 +0000 (14:35 +0200)]
crypto: qat - avoid an uninitialized variable warning
After commit 9e442aa6a753 ("x86: remove DMA_ERROR_CODE"), the inlining
decisions in the qat driver changed slightly, introducing a new false-positive
warning:
drivers/crypto/qat/qat_common/qat_algs.c: In function 'qat_alg_sgl_to_bufl.isra.6':
include/linux/dma-mapping.h:228:2: error: 'sz_out' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/crypto/qat/qat_common/qat_algs.c:676:9: note: 'sz_out' was declared here
The patch that introduced this is correct, so let's just avoid the
warning in this driver by rearranging the unwinding after an error
to make it more obvious to the compiler what is going on.
The problem here is the 'if (unlikely(dma_mapping_error(dev, blp)))'
check, in which the 'unlikely' causes gcc to forget what it knew about
the state of the variables. Cleaning up the dma state in the reverse
order it was created means we can simplify the logic so it doesn't have
to know about that state, and also makes it easier to understand.
Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
au1100fb: remove a bogus dma_free_nonconsistent call
au1100fb is using managed dma allocations, so it doesn't need to
explicitly free the dma memory in the error path (and if it did
it would have to use the managed version).
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
This code has been spread between getting in through arch trees, the iommu
tree, -mm and the drivers tree. There will be a lot of work in this area,
including consolidating various arch implementations into more common
code, so ensure we have a proper git tree that facilitates cooperation with
the architecture maintainers.
powerpc/cell: clean up fixed mapping dma_ops initialization
By the time cell_pci_dma_dev_setup calls cell_dma_dev_setup no device can
have the fixed map_ops set yet as it's only set by the set_dma_mask
method. So move the setup for the fixed case to be only called in that
place instead of indirecting through cell_dma_dev_setup.
sparc: remove arch specific dma_supported implementations
Usually dma_supported decisions are done by the dma_map_ops instance.
Switch sparc to that model by providing a ->dma_supported instance for
sbus that always returns false, and implementations tailored to the sun4u
and sun4v cases for sparc64, and leave it unimplemented for PCI on
sparc32, which means always supported.
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: David S. Miller <davem@davemloft.net>
DMA_ERROR_CODE is going to go away, so don't rely on it. Instead
define a ->mapping_error method for all IOMMU based dma operation
instances. The direct ops don't ever return an error and don't
need a ->mapping_error method.
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au>
s390 can also use noop_dma_ops, and while that currently does not return
errors it will so in the future. Implementing the mapping_error method
is the proper way to have per-ops error conditions.
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
The dma alloc interface returns an error by return NULL, and the
mapping interfaces rely on the mapping_error method, which the dummy
ops already implement correctly.
Thus remove the DMA_ERROR_CODE define.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
ARM and x86 had duplicated versions of the dma_ops structure, the
only difference is that x86 hasn't wired up the set_dma_mask,
mmap, and get_sgtable ops yet. On x86 all of them are identical
to the generic version, so they aren't needed but harmless.
All the symbols used only for xen_swiotlb_dma_ops can now be marked
static as well.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
DMA_ERROR_CODE is not a public API and will go away soon. dma dma-iommu
driver already implements a proper ->mapping_error method, so it's only
using the value internally. Add a new local define using the value
that arm64 which is the only current user of dma-iommu.
DMA_ERROR_CODE already isn't a valid API to user for drivers and will
go away soon. exynos_drm_fb_dma_addr uses it a an error return when
the passed in index is invalid, but the callers never check for it
but instead pass the address straight to the hardware.
Hugh Dickins [Mon, 19 Jun 2017 11:03:24 +0000 (04:03 -0700)]
mm: larger stack guard gap, between vmas
Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.
This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.
Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.
One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications. For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).
Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.
Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.
Olof Johansson [Mon, 19 Jun 2017 03:42:21 +0000 (20:42 -0700)]
Merge tag 'sunxi-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes
Allwinner fixes for 4.12
A few fixes around the PRCM support that got in 4.12 with a wrong
compatible, and a missing clock in the binding.
* tag 'sunxi-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU
ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU
arm64: allwinner: h5: Remove syslink to shared DTSI
ARM: sunxi: h3/h5: fix the compatible of R_CCU
Linus Torvalds [Mon, 19 Jun 2017 00:25:05 +0000 (09:25 +0900)]
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio bugfix from Michael Tsirkin:
"It turns out balloon does not handle IOMMUs correctly. We should fix
that at some point, for now let's just disable this configuration"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_balloon: disable VIOMMU support
Linus Torvalds [Mon, 19 Jun 2017 00:20:25 +0000 (09:20 +0900)]
Merge branch 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Two driver bugfixes"
* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: ismt: fix wrong device address when unmap the data buffer
i2c: rcar: use correct length when unmapping DMA
Linus Torvalds [Mon, 19 Jun 2017 00:01:01 +0000 (09:01 +0900)]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
- Three highmem fixes:
+ Fixed mapping initialization
+ Adjust the pkmap location
+ Ensure we use at most one page for PTEs
- Fix makefile dependencies for .its targets to depend on vmlinux
- Fix reversed condition in BNEZC and JIALC software branch emulation
- Only flush initialized flush_insn_slot to avoid NULL pointer
dereference
- perf: Remove incorrect odd/even counter handling for I6400
- ftrace: Fix init functions tracing
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: .its targets depend on vmlinux
MIPS: Fix bnezc/jialc return address calculation
MIPS: kprobes: flush_insn_slot should flush only if probe initialised
MIPS: ftrace: fix init functions tracing
MIPS: mm: adjust PKMAP location
MIPS: highmem: ensure that we don't use more than one page for PTEs
MIPS: mm: fixed mappings: correct initialisation
MIPS: perf: Remove incorrect odd/even counter handling for I6400
virtio balloon bypasses the DMA API entirely so does not support the
VIOMMU right now. It's not clear we need that support, for now let's
just make sure we don't pretend to support it.
Cc: stable@vger.kernel.org Cc: Wei Wang <wei.w.wang@intel.com> Fixes: 1a937693993f ("virtio: new feature to detect IOMMU device quirk") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
Linus Torvalds [Sun, 18 Jun 2017 09:49:12 +0000 (18:49 +0900)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"Two fixlets for x86:
- Handle WARN_ONs proper with the new UD based WARN implementation
- Disable 1G mappings when 2M mappings are disabled by kmemleak or
debug_pagealloc. Otherwise 1G mappings might still be used,
confusing the debug mechanisms"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Disable 1GB direct mappings when disabling 2MB mappings
x86/debug: Handle early WARN_ONs proper
Linus Torvalds [Sun, 18 Jun 2017 09:46:51 +0000 (18:46 +0900)]
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"Three fixlets for timers:
- Two hot-fixes for the alarmtimer based posix timers, which prevent
a nasty DOS by self rescheduling timers. The proper cleanup of that
mess is queued for 4.13
- Make a function static"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/broadcast: Make tick_broadcast_setup_oneshot() static
alarmtimer: Rate limit periodic intervals
alarmtimer: Prevent overflow of relative timers
Linus Torvalds [Sun, 18 Jun 2017 09:45:17 +0000 (18:45 +0900)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
"Two small fixes for the schedulre core:
- Use the proper switch_mm() variant in idle_task_exit() because that
code is not called with interrupts disabled.
- Fix a confusing typo in a printk"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Idle_task_exit() shouldn't use switch_mm_irqs_off()
sched/fair: Fix typo in printk message
Linus Torvalds [Sat, 17 Jun 2017 23:51:35 +0000 (08:51 +0900)]
Merge tag 'led_fixes_for_4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds
Pull LED fixes from Jacek Anaszewski:
"Two LED fixes:
- fix signal source assignment for leds-bcm6328
- revert patch that intended to fix LED behavior on suspend but it
had a side effect preventing suspend at all due to uevent being
sent on trigger removal"
* tag 'led_fixes_for_4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
Revert "leds: handle suspend/resume in heartbeat trigger"
leds: bcm6328: fix signal source assignment for leds 4 to 7
Linus Torvalds [Sat, 17 Jun 2017 23:39:54 +0000 (08:39 +0900)]
Merge tag 'usb-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small gadget and xhci USB fixes for 4.12-rc6.
Nothing major, but one of the gadget patches does fix a reported oops,
and the xhci ones resolve reported problems. All have been in
linux-next with no reported issues"
* tag 'usb-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks
usb: xhci: ASMedia ASM1042A chipset need shorts TX quirk
usb: xhci: Fix USB 3.1 supported protocol parsing
USB: gadget: fix GPF in gadgetfs
usb: gadget: composite: make sure to reactivate function on unbind
Linus Torvalds [Sat, 17 Jun 2017 23:23:02 +0000 (08:23 +0900)]
Merge tag 'ceph-for-4.12-rc6' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"A fix for an old ceph ->fh_to_* bug from Luis and two timestamp fixups
from Zheng, prompted by the ongoing y2038 work"
* tag 'ceph-for-4.12-rc6' of git://github.com/ceph/ceph-client:
ceph: unify inode i_ctime update
ceph: use current_kernel_time() to get request time stamp
ceph: check i_nlink while converting a file handle to dentry
Linus Torvalds [Sat, 17 Jun 2017 08:30:07 +0000 (17:30 +0900)]
Merge branch 'ufs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull ufs fixes from Al Viro:
"Fix assorted ufs bugs: a couple of deadlocks, fs corruption in
truncate(), oopsen on tail unpacking and truncate when racing with
vmscan, mild fs corruption (free blocks stats summary buggered, *BSD
fsck would complain and fix), several instances of broken logics
around reserved blocks (starting with "check almost never triggers
when it should" and then there are issues with sufficiently large
UFS2)"
[ Note: ufs hasn't gotten any loving in a long time, because nobody
really seems to use it. These ufs fixes are triggered by people
actually caring now, not some sudden influx of new bugs. - Linus ]
* 'ufs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ufs_truncate_blocks(): fix the case when size is in the last direct block
ufs: more deadlock prevention on tail unpacking
ufs: avoid grabbing ->truncate_mutex if possible
ufs_get_locked_page(): make sure we have buffer_heads
ufs: fix s_size/s_dsize users
ufs: fix reserved blocks check
ufs: make ufs_freespace() return signed
ufs: fix logics in "ufs: make fsck -f happy"
Linus Torvalds [Sat, 17 Jun 2017 08:26:53 +0000 (17:26 +0900)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"A couple of fixes; a leak in mntns_install() caught by Andrei (this
cycle regression) + d_invalidate() softlockup fix - that had been
reported by a bunch of people lately, but the problem is pretty old"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: don't forget to put old mntns in mntns_install
Hang/soft lockup in d_invalidate with simultaneous calls
Linus Torvalds [Fri, 16 Jun 2017 21:53:20 +0000 (06:53 +0900)]
Merge tag 'pci-v4.12-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- fix another PCI_ENDPOINT build error (merged for v4.12)
- fix error codes added to config accessors for v4.12
* tag 'pci-v4.12-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: endpoint: Select CRC32 to fix test build error
PCI: Make error code types consistent in pci_{read,write}_config_*
Linus Torvalds [Fri, 16 Jun 2017 21:49:34 +0000 (06:49 +0900)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"5 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: correct the comment when reclaimed pages exceed the scanned pages
userfaultfd: shmem: handle coredumping in handle_userfault()
mm: numa: avoid waiting on freed migrated pages
swap: cond_resched in swap_cgroup_prepare()
mm/memory-failure.c: use compound_head() flags for huge pages
zhongjiang [Fri, 16 Jun 2017 21:02:40 +0000 (14:02 -0700)]
mm: correct the comment when reclaimed pages exceed the scanned pages
Commit e1587a494540 ("mm: vmpressure: fix sending wrong events on
underflow") declared that reclaimed pages exceed the scanned pages due
to the thp reclaim.
That is incorrect because THP will be spilt to normal page and loop
again, which will result in the scanned pages increment.
Andrea Arcangeli [Fri, 16 Jun 2017 21:02:37 +0000 (14:02 -0700)]
userfaultfd: shmem: handle coredumping in handle_userfault()
Anon and hugetlbfs handle FOLL_DUMP set by get_dump_page() internally to
__get_user_pages().
shmem as opposed has no special FOLL_DUMP handling there so
handle_mm_fault() is invoked without mmap_sem and ends up calling
handle_userfault() that isn't expecting to be invoked without mmap_sem
held.
This makes handle_userfault() fail immediately if invoked through
shmem_vm_ops->fault during coredumping and solves the problem.
The side effect is a BUG_ON with no lock held triggered by the
coredumping process which exits. Only 4.11 is affected, pre-4.11 anon
memory holes are skipped in __get_user_pages by checking FOLL_DUMP
explicitly against empty pagetables (mm/gup.c:no_page_table()).
It's zero cost as we already had a check for current->flags to prevent
futex to trigger userfaults during exit (PF_EXITING).
Link: http://lkml.kernel.org/r/20170615214838.27429-1-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: <stable@vger.kernel.org> [4.11+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mark Rutland [Fri, 16 Jun 2017 21:02:34 +0000 (14:02 -0700)]
mm: numa: avoid waiting on freed migrated pages
In do_huge_pmd_numa_page(), we attempt to handle a migrating thp pmd by
waiting until the pmd is unlocked before we return and retry. However,
we can race with migrate_misplaced_transhuge_page():
// do_huge_pmd_numa_page // migrate_misplaced_transhuge_page()
// Holds 0 refs on page // Holds 2 refs on page
vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
/* ... */
if (pmd_trans_migrating(*vmf->pmd)) {
page = pmd_page(*vmf->pmd);
spin_unlock(vmf->ptl);
ptl = pmd_lock(mm, pmd);
if (page_count(page) != 2)) {
/* roll back */
}
/* ... */
mlock_migrate_page(new_page, page);
/* ... */
spin_unlock(ptl);
put_page(page);
put_page(page); // page freed here
wait_on_page_locked(page);
goto out;
}
This can result in the freed page having its waiters flag set
unexpectedly, which trips the PAGE_FLAGS_CHECK_AT_PREP checks in the
page alloc/free functions. This has been observed on arm64 KVM guests.
We can avoid this by having do_huge_pmd_numa_page() take a reference on
the page before dropping the pmd lock, mirroring what we do in
__migration_entry_wait().
When we hit the race, migrate_misplaced_transhuge_page() will see the
reference and abort the migration, as it may do today in other cases.
Fixes: b8916634b77bffb2 ("mm: Prevent parallel splits during THP migration") Link: http://lkml.kernel.org/r/1497349722-6731-2-git-send-email-will.deacon@arm.com Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
James Morse [Fri, 16 Jun 2017 21:02:29 +0000 (14:02 -0700)]
mm/memory-failure.c: use compound_head() flags for huge pages
memory_failure() chooses a recovery action function based on the page
flags. For huge pages it uses the tail page flags which don't have
anything interesting set, resulting in:
> Memory failure: 0x9be3b4: Unknown page state
> Memory failure: 0x9be3b4: recovery action for unknown page: Failed
Instead, save a copy of the head page's flags if this is a huge page,
this means if there are no relevant flags for this tail page, we use the
head pages flags instead. This results in the me_huge_page() recovery
action being called:
> Memory failure: 0x9b7969: recovery action for huge page: Delayed
For hugepages that have not yet been allocated, this allows the hugepage
to be dequeued.
Fixes: 524fca1e7356 ("HWPOISON: fix misjudgement of page_action() for errors on mlocked pages") Link: http://lkml.kernel.org/r/20170524130204.21845-1-james.morse@arm.com Signed-off-by: James Morse <james.morse@arm.com> Tested-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 16 Jun 2017 20:57:54 +0000 (05:57 +0900)]
Merge tag 'powerpc-4.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Three small fixes for recently merged code:
- remove a spurious WARN_ON when a PCI device has no of_node, it's
allowed in some circumstances for there to be no of_node.
- fix the offset for store EOI MMIOs in the XIVE interrupt
controller.
- fix non-const WARN_ONs which were becoming BUGs due to them losing
BUGFLAG_WARNING in a recent cleanup patch.
Thanks to: Alexey Kardashevskiy, Alistair Popple, Benjamin
Herrenschmidt"
* tag 'powerpc-4.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/debug: Add missing warn flag to WARN_ON's non-builtin path
powerpc/xive: Fix offset for store EOI MMIOs
powerpc/npu-dma: Remove spurious WARN_ON when a PCI device has no of_node
Ingo Molnar [Fri, 16 Jun 2017 19:33:48 +0000 (21:33 +0200)]
Merge tag 'perf-urgent-for-mingo-4.12-20170616' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix probing of precise_ip level for default cycles event, that
got broken recently on x86_64 when its arch code started
considering invalid requesting precise samples when not sampling
(i.e. when attr.sample_period == 0).
This also fixes another problem in s/390 where the precision
probing with sample_period == 0 returned precise_ip > 0, that
then, when setting up the real cycles event (not probing) would
return EOPNOTSUPP for precise_ip > 0 (as determined previously
by probing) and sample_period > 0.
These problems resulted in attr_precise not being set to the
highest precision available on x86.64 when no event was specified,
i.e. the canonical:
perf record ./workload
would end up using attr.precise_ip = 0. As a workaround this would
need to be done:
perf record -e cycles:P ./workload
And on s/390 it would plain not work, requiring using:
perf record -e cycles ./workload
as a workaround. (Arnaldo Carvalho de Melo)
- Fix perf build with ARCH=x86_64, when ARCH should be transformed
into ARCH=x86, just like with the main kernel Makefile and
tools/objtool's, i.e. use SRCARCH. (Jiada Wang)
- Avoid accessing uninitialized data structures when unwinding with
elfutils's libdw, making it more closely mimic libunwind's unwinder.
(Milian Wolff)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
Milian Wolff [Fri, 2 Jun 2017 14:37:53 +0000 (16:37 +0200)]
perf unwind: Report module before querying isactivation in dwfl unwind
The PC returned by dwfl_frame_pc() may map into a not-yet-reported
module. We have to report it before we continue unwinding. But when we
query for the isactivation flag in dwfl_frame_pc, libdw will actually do
one more unwinding step internally which can then break and lead to
missed frames or broken stacks.
Linus Torvalds [Fri, 16 Jun 2017 09:45:47 +0000 (18:45 +0900)]
Merge tag 'configfs-for-4.12' of git://git.infradead.org/users/hch/configfs
Pull configfs updates from Christoph Hellwig:
"A fix from Nic for a race seen in production (including a stable tag).
And while I'm sending you this I'm also sneaking in a trivial new
helper from Bart so that we don't need inter-tree dependencies for the
next merge window"
* tag 'configfs-for-4.12' of git://git.infradead.org/users/hch/configfs:
configfs: Introduce config_item_get_unless_zero()
configfs: Fix race between create_link and configfs_rmdir
Linus Torvalds [Fri, 16 Jun 2017 08:46:47 +0000 (17:46 +0900)]
Merge tag 'drm-fixes-for-v4.12-rc6' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"This is the main fixes pull for 4.12-rc6, all pretty normal for this
stage, nothing really stands out. The mxsfb one is probably the
largest and it's for a black screen boot problem.
AMD, i915, mgag200, msxfb, tegra fixes"
* tag 'drm-fixes-for-v4.12-rc6' of git://people.freedesktop.org/~airlied/linux:
drm: mxsfb_crtc: Reset the eLCDIF controller
drm/mgag200: Fix to always set HiPri for G200e4 V2
drm/tegra: Correct idr_alloc() minimum id
drm/tegra: Fix lockup on a use of staging API
gpu: host1x: Fix error handling
drm/radeon: Fix overflow of watermark calcs at > 4k resolutions.
drm/amdgpu: Fix overflow of watermark calcs at > 4k resolutions.
drm/radeon: fix "force the UVD DPB into VRAM as well"
drm/i915: Fix GVT-g PVINFO version compatibility check
drm/i915: Fix SKL+ watermarks for 90/270 rotation
drm/i915: Fix scaling check for 90/270 degree plane rotation
drm: dw-hdmi: Fix compilation breakage by selecting REGMAP_MMIO
Linus Torvalds [Fri, 16 Jun 2017 08:38:23 +0000 (17:38 +0900)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"I had thought at the time of the last pull request that there wouldn't
be much more to go, but several things just kept trickling in over the
last week.
Instead of just the six patches to bnxt_re that I had anticipated,
there are another five IPoIB patches, two qedr patches, and a few
other miscellaneous patches.
The bnxt_re patches are more lines of diff than I like to submit this
late in the game. That's mostly because of the first two patches in
the series of six. I almost dropped them just because of the lines of
churn, but on a close review, a lot of the churn came from removing
duplicated code sections and consolidating them into callable
routines. I felt like this made the number of lines of change more
acceptable, and they address problems, so I left them. The remainder
of the patches are all small, well contained, and well understood.
These have passed 0day testing, but have not been submitted to
linux-next (but a local merge test with your current master was
without any conflicts).
Summary:
- A fix for fix eea40b8f624 ("infiniband: call ipv6 route lookup via
the stub interface")
- Six patches against bnxt_re...the first two are considerably larger
than I would like, but as they address real issues I went ahead and
submitted them (it also helped that a good deal of the churn was
removing code repeated in multiple places and consolidating it to
one common function)
- Two fixes against qedr that just came in
- One fix against rxe that took a few revisions to get right plus
time to get the proper reviews
- Five late breaking IPoIB fixes
- One late cxgb4 fix"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
rdma/cxgb4: Fix memory leaks during module exit
IB/ipoib: Fix memory leak in create child syscall
IB/ipoib: Fix access to un-initialized napi struct
IB/ipoib: Delete napi in device uninit default
IB/ipoib: Limit call to free rdma_netdev for capable devices
IB/ipoib: Fix memory leaks for child interfaces priv
rxe: Fix a sleep-in-atomic bug in post_one_send
RDMA/qedr: Add 64KB PAGE_SIZE support to user-space queues
RDMA/qedr: Initialize byte_len in WC of READ and SEND commands
RDMA/bnxt_re: Remove FMR support
RDMA/bnxt_re: Fix RQE posting logic
RDMA/bnxt_re: Add HW workaround for avoiding stall for UD QPs
RDMA/bnxt_re: Dereg MR in FW before freeing the fast_reg_page_list
RDMA/bnxt_re: HW workarounds for handling specific conditions
RDMA/bnxt_re: Fixing the Control path command and response handling
IB/addr: Fix setting source address in addr6_resolve()
Linus Torvalds [Fri, 16 Jun 2017 08:30:44 +0000 (17:30 +0900)]
Merge tag 'platform-drivers-x86-v4.12-2' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform driver fix from Darren Hart:
"Just a single patch to fix an oops in the intel_telemetry_debugfs
module load/unload"
* tag 'platform-drivers-x86-v4.12-2' of git://git.infradead.org/linux-platform-drivers-x86:
platform/x86: intel_telemetry_debugfs: fix oops when load/unload module
Linus Torvalds [Fri, 16 Jun 2017 08:13:06 +0000 (17:13 +0900)]
Merge branch 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
Pull dmi fixes from Jean Delvare.
* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
firmware: dmi_scan: Check DMI structure length
firmware: dmi: Fix permissions of product_family
firmware: dmi_scan: Make dmi_walk and dmi_walk_early return real error codes
firmware: dmi_scan: Look for SMBIOS 3 entry point first
powerpc/debug: Add missing warn flag to WARN_ON's non-builtin path
When trapped on WARN_ON(), report_bug() is expected to return
BUG_TRAP_TYPE_WARN so the caller will increment NIP by 4 and continue.
The __builtin_constant_p() path of the PPC's WARN_ON()
calls (indirectly) __WARN_FLAGS() which has BUGFLAG_WARNING set,
however the other branch does not which makes report_bug() report a
bug rather than a warning.
Dave Airlie [Fri, 16 Jun 2017 00:01:04 +0000 (10:01 +1000)]
Merge tag 'drm-misc-fixes-2017-06-15' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes
Driver Changes:
- dw-hdmi: Fix compilation error if REGMAP_MMIO not selected (Laurent)
- host1x: Fix incorrect return value (Christophe)
- tegra: Shore up idr API usage in tegra staging code (Dmitry)
- mgag200: Always use HiPri mode for G200e4v2 and limit max bandwidth (Mathieu)
- mxsfb: Ensure display can be lit up without bootloader initialization (Fabio)
Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Dmitry Osipenko <digetx@gmail.com> Cc: Mathieu Larouche <mathieu.larouche@matrox.com> Cc: Fabio Estevam <fabio.estevam@nxp.com>
* tag 'drm-misc-fixes-2017-06-15' of git://anongit.freedesktop.org/git/drm-misc:
drm: mxsfb_crtc: Reset the eLCDIF controller
drm/mgag200: Fix to always set HiPri for G200e4 V2
drm/tegra: Correct idr_alloc() minimum id
drm/tegra: Fix lockup on a use of staging API
gpu: host1x: Fix error handling
drm: dw-hdmi: Fix compilation breakage by selecting REGMAP_MMIO
Dave Airlie [Fri, 16 Jun 2017 00:00:11 +0000 (10:00 +1000)]
Merge branch 'drm-fixes-4.12' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
A few fixes for 4.12:
- fix a UVD regression on SI
- fix overflow in watermark calcs on large modes
* 'drm-fixes-4.12' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: Fix overflow of watermark calcs at > 4k resolutions.
drm/amdgpu: Fix overflow of watermark calcs at > 4k resolutions.
drm/radeon: fix "force the UVD DPB into VRAM as well"
Alan Stern [Tue, 13 Jun 2017 19:23:42 +0000 (15:23 -0400)]
USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks
Using the syzkaller kernel fuzzer, Andrey Konovalov generated the
following error in gadgetfs:
> BUG: KASAN: use-after-free in __lock_acquire+0x3069/0x3690
> kernel/locking/lockdep.c:3246
> Read of size 8 at addr ffff88003a2bdaf8 by task kworker/3:1/903
>
> CPU: 3 PID: 903 Comm: kworker/3:1 Not tainted 4.12.0-rc4+ #35
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: usb_hub_wq hub_event
> Call Trace:
> __dump_stack lib/dump_stack.c:16 [inline]
> dump_stack+0x292/0x395 lib/dump_stack.c:52
> print_address_description+0x78/0x280 mm/kasan/report.c:252
> kasan_report_error mm/kasan/report.c:351 [inline]
> kasan_report+0x230/0x340 mm/kasan/report.c:408
> __asan_report_load8_noabort+0x19/0x20 mm/kasan/report.c:429
> __lock_acquire+0x3069/0x3690 kernel/locking/lockdep.c:3246
> lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
> _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
> spin_lock include/linux/spinlock.h:299 [inline]
> gadgetfs_suspend+0x89/0x130 drivers/usb/gadget/legacy/inode.c:1682
> set_link_state+0x88e/0xae0 drivers/usb/gadget/udc/dummy_hcd.c:455
> dummy_hub_control+0xd7e/0x1fb0 drivers/usb/gadget/udc/dummy_hcd.c:2074
> rh_call_control drivers/usb/core/hcd.c:689 [inline]
> rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
> usb_hcd_submit_urb+0x92f/0x20b0 drivers/usb/core/hcd.c:1650
> usb_submit_urb+0x8b2/0x12c0 drivers/usb/core/urb.c:542
> usb_start_wait_urb+0x148/0x5b0 drivers/usb/core/message.c:56
> usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
> usb_control_msg+0x341/0x4d0 drivers/usb/core/message.c:151
> usb_clear_port_feature+0x74/0xa0 drivers/usb/core/hub.c:412
> hub_port_disable+0x123/0x510 drivers/usb/core/hub.c:4177
> hub_port_init+0x1ed/0x2940 drivers/usb/core/hub.c:4648
> hub_port_connect drivers/usb/core/hub.c:4826 [inline]
> hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
> port_event drivers/usb/core/hub.c:5105 [inline]
> hub_event+0x1ae1/0x3d40 drivers/usb/core/hub.c:5185
> process_one_work+0xc08/0x1bd0 kernel/workqueue.c:2097
> process_scheduled_works kernel/workqueue.c:2157 [inline]
> worker_thread+0xb2b/0x1860 kernel/workqueue.c:2233
> kthread+0x363/0x440 kernel/kthread.c:231
> ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:424
>
> Allocated by task 9958:
> save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
> save_stack+0x43/0xd0 mm/kasan/kasan.c:513
> set_track mm/kasan/kasan.c:525 [inline]
> kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:617
> kmem_cache_alloc_trace+0x87/0x280 mm/slub.c:2745
> kmalloc include/linux/slab.h:492 [inline]
> kzalloc include/linux/slab.h:665 [inline]
> dev_new drivers/usb/gadget/legacy/inode.c:170 [inline]
> gadgetfs_fill_super+0x24f/0x540 drivers/usb/gadget/legacy/inode.c:1993
> mount_single+0xf6/0x160 fs/super.c:1192
> gadgetfs_mount+0x31/0x40 drivers/usb/gadget/legacy/inode.c:2019
> mount_fs+0x9c/0x2d0 fs/super.c:1223
> vfs_kern_mount.part.25+0xcb/0x490 fs/namespace.c:976
> vfs_kern_mount fs/namespace.c:2509 [inline]
> do_new_mount fs/namespace.c:2512 [inline]
> do_mount+0x41b/0x2d90 fs/namespace.c:2834
> SYSC_mount fs/namespace.c:3050 [inline]
> SyS_mount+0xb0/0x120 fs/namespace.c:3027
> entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> Freed by task 9960:
> save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
> save_stack+0x43/0xd0 mm/kasan/kasan.c:513
> set_track mm/kasan/kasan.c:525 [inline]
> kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590
> slab_free_hook mm/slub.c:1357 [inline]
> slab_free_freelist_hook mm/slub.c:1379 [inline]
> slab_free mm/slub.c:2961 [inline]
> kfree+0xed/0x2b0 mm/slub.c:3882
> put_dev+0x124/0x160 drivers/usb/gadget/legacy/inode.c:163
> gadgetfs_kill_sb+0x33/0x60 drivers/usb/gadget/legacy/inode.c:2027
> deactivate_locked_super+0x8d/0xd0 fs/super.c:309
> deactivate_super+0x21e/0x310 fs/super.c:340
> cleanup_mnt+0xb7/0x150 fs/namespace.c:1112
> __cleanup_mnt+0x1b/0x20 fs/namespace.c:1119
> task_work_run+0x1a0/0x280 kernel/task_work.c:116
> exit_task_work include/linux/task_work.h:21 [inline]
> do_exit+0x18a8/0x2820 kernel/exit.c:878
> do_group_exit+0x14e/0x420 kernel/exit.c:982
> get_signal+0x784/0x1780 kernel/signal.c:2318
> do_signal+0xd7/0x2130 arch/x86/kernel/signal.c:808
> exit_to_usermode_loop+0x1ac/0x240 arch/x86/entry/common.c:157
> prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
> syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
> entry_SYSCALL_64_fastpath+0xbc/0xbe
>
> The buggy address belongs to the object at ffff88003a2bdae0
> which belongs to the cache kmalloc-1024 of size 1024
> The buggy address is located 24 bytes inside of
> 1024-byte region [ffff88003a2bdae0, ffff88003a2bdee0)
> The buggy address belongs to the page:
> page:ffffea0000e8ae00 count:1 mapcount:0 mapping: (null)
> index:0x0 compound_mapcount: 0
> flags: 0x100000000008100(slab|head)
> raw: 0100000000008100000000000000000000000000000000000000000100170017
> raw: ffffea0000ed3020ffffea0000f5f820ffff88003e80efc00000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
> ffff88003a2bd980: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff88003a2bda00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >ffff88003a2bda80: fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb fb
> ^
> ffff88003a2bdb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff88003a2bdb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================
What this means is that the gadgetfs_suspend() routine was trying to
access dev->lock after it had been deallocated. The root cause is a
race in the dummy_hcd driver; the dummy_udc_stop() routine can race
with the rest of the driver because it contains no locking. And even
when proper locking is added, it can still race with the
set_link_state() function because that function incorrectly drops the
private spinlock before invoking any gadget driver callbacks.
The result of this race, as seen above, is that set_link_state() can
invoke a callback in gadgetfs even after gadgetfs has been unbound
from dummy_hcd's UDC and its private data structures have been
deallocated.
include/linux/usb/gadget.h documents that the ->reset, ->disconnect,
->suspend, and ->resume callbacks may be invoked in interrupt context.
In general this is necessary, to prevent races with gadget driver
removal. This patch fixes dummy_hcd to retain the spinlock across
these calls, and it adds a spinlock acquisition to dummy_udc_stop() to
prevent the race.
The net2280 driver makes the same mistake of dropping the private
spinlock for its ->disconnect and ->reset callback invocations. The
patch fixes it too.
Lastly, since gadgetfs_suspend() may be invoked in interrupt context,
it cannot assume that interrupts are enabled when it runs. It must
use spin_lock_irqsave() instead of spin_lock_irq(). The patch fixes
that bug as well.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: Andrey Konovalov <andreyknvl@google.com> CC: <stable@vger.kernel.org> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fabio Estevam [Fri, 5 May 2017 18:01:41 +0000 (15:01 -0300)]
drm: mxsfb_crtc: Reset the eLCDIF controller
According to the eLCDIF initialization steps listed in the MX6SX
Reference Manual the eLCDIF block reset is mandatory.
Without performing the eLCDIF reset the display shows garbage content
when the kernel boots.
In earlier tests this issue has not been observed because the bootloader
was previously showing a splash screen and the bootloader display driver
does properly implement the eLCDIF reset.
Add the eLCDIF reset to the driver, so that it can operate correctly
independently of the bootloader.