RFC 2734 defines the datagram_size field in fragment encapsulation
headers thus:
    datagram_size: The encoded size of the entire IP datagram.  The
    value of datagram_size [...] SHALL be one less than the value of
    Total Length in the datagram's IP header (see STD 5, RFC 791).
Accordingly, the eth1394 driver of Linux 2.6.36 and older set and got
this field with a -/+1 offset.
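A minimal sketch of that convention (the helper names are made up for
illustration, not the actual eth1394 or firewire-net code):

    /* RFC 2734: the on-the-wire field is Total Length minus one. */
    static u16 encode_dg_size(u16 total_len)
    {
            return total_len - 1;   /* store datagram_size = len - 1 */
    }

    static u16 decode_dg_size(u16 dg_size)
    {
            return dg_size + 1;     /* recover len = datagram_size + 1 */
    }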
Likewise, I observe OS X 10.4 and Windows XP Pro SP3 to transmit 1500
byte sized datagrams in fragments with datagram_size=1499 if link
fragmentation is required.
Only firewire-net sets and gets datagram_size without this offset. As a
result, firewire-net fails to interoperate with OS X, Windows XP, and
presumably Linux's eth1394. (I did not test with the latter.)
For example, FTP data transfers to a Linux firewire-net box with max_rec
smaller than the 1500 bytes MTU
- from OS X fail entirely,
- from Win XP start out with a bunch of fragmented datagrams which
time out, then continue with unfragmented datagrams because Win XP
temporarily reduces the MTU to 576 bytes.
So let's fix firewire-net's datagram_size accessors.
Note that firewire-net thereby loses interoperability with unpatched
firewire-net, but only if link fragmentation is employed. (This happens
with large broadcast datagrams, and with large datagrams on several
FireWire CardBus cards with smaller max_rec than equivalent PCI cards,
and it can be worked around by setting a small enough MTU.)
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The IP-over-1394 driver firewire-net lacked input validation when
handling incoming fragmented datagrams. A maliciously formed fragment
with a correspondingly large datagram_offset would cause a memcpy past the
datagram buffer.
So, drop any packets carrying a fragment with offset + length larger
than datagram_size.
In addition, ensure that
- GASP header, unfragmented encapsulation header, or fragment
encapsulation header actually exists before we access it,
- the encapsulated datagram or fragment is of nonzero size.
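A minimal sketch of such a check (names are illustrative, not the actual
firewire-net code):

    /* Reject empty fragments and fragments reaching past the datagram. */
    static bool fragment_fits(unsigned int fg_off, unsigned int len,
                              unsigned int dg_size)
    {
            return len != 0 && fg_off + len <= dg_size;
    }

The packet would be dropped whenever fragment_fits() returns false, before
any memcpy() into the reassembly buffer.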
The Schenker XMG C504 is a rebranded Gigabyte P35 v2 laptop.
Therefore it also needs a keyboard reset to detect the Elantech touchpad.
Otherwise the touchpad appears to be dead.
With this patch the touchpad is detected:
$ dmesg | grep -E "(i8042|Elantech|elantech)"
[ 2.675399] i8042: PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
[ 2.680372] i8042: Attempting to reset device connected to KBD port
[ 2.789037] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 2.791586] serio: i8042 AUX port at 0x60,0x64 irq 12
[ 2.813840] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input4
[ 3.811431] psmouse serio1: elantech: assuming hardware version 4 (with firmware version 0x361f0e)
[ 3.825424] psmouse serio1: elantech: Synaptics capabilities query result 0x00, 0x15, 0x0f.
[ 3.839424] psmouse serio1: elantech: Elan sample query result 03, 58, 74
[ 3.911349] input: ETPS/2 Elantech Touchpad as /devices/platform/i8042/serio1/input/input6
A device running without RX packet aggregation could return more data
in the USB packet than the actual network packet. In this case we
would clone the skb but then determine that there was no packet to
handle and exit without freeing the cloned skb first.
This has so far only been observed with 8188eu devices, but could
affect others.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
dm-raid 1.9.0 fails to activate existing RAID4/10 devices that have the
old superblock format (which does not have takeover/reshaping support
that was added via commit 33e53f06850f).
Fix validation path for old superblocks by reverting to the old raid4
layout and basing checks on mddev->new_{level,layout,...} members in
super_init_validation().
In ecbfb9f118bce4 ("dm raid: add raid level takeover support") a new
compatible feature flag was added. Validation for these compat_features
was added but this only passes for new raid mappings with this feature
flag. This causes previously created raid mappings to be failed at
import.
Check compat_features for the only valid combination.
cleanup_mapped_device() calls kthread_stop() if kworker_task is
non-NULL. Currently the assigned value could be a valid task struct or
an error code (e.g. -ENOMEM). Reset md->kworker_task to NULL if
kthread_run() returned an error.
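Roughly the shape of the fix (a sketch based on the commit text; the error
label is illustrative):

    md->kworker_task = kthread_run(kthread_worker_fn, &md->kworker,
                                   "kdmwork-%s", dm_device_name(md));
    if (IS_ERR(md->kworker_task)) {
            r = PTR_ERR(md->kworker_task);
            md->kworker_task = NULL; /* no ERR_PTR left for kthread_stop() */
            goto bad;
    }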
dm_get_target_type() was previously called so any error returned from
dm_table_add_target() must first call dm_put_target_type(). Otherwise
the DM target module's reference count will leak and the associated
kernel module will be unable to be removed.
Also, leverage the fact that r is already -EINVAL and remove an extra
newline.
If a default leg has failed, any read will cause a new operational
default leg to be selected and the read to be resubmitted. But until now
the read would still return failure even though it succeeded after
resubmission, because bio->bi_error was not being cleared before
resubmitting the bio.
Fix by clearing bio->bi_error before resubmission.
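A minimal sketch of the idea (an illustrative helper, not the actual
dm-raid1 resubmission path):

    static void resubmit_read(struct bio *bio)
    {
            bio->bi_error = 0;              /* drop the stale error */
            generic_make_request(bio);      /* re-issue the read */
    }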
Fixes: 4246a0b63bd8 ("block: add a bi_error field to struct bio") Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit c6017e793b93 ("virtio: console: add locks around buffer removal
in port unplug path") added locking around the freeing of buffers in the
vq. However, when free_buf() is called with can_sleep = true and rproc
is enabled, it calls dma_free_coherent() directly, requiring interrupts
to be enabled. Currently a WARNING is triggered due to the spin locking
around free_buf().
Fix this by restructuring the loops to allow the locks to only be taken
where it is necessary to protect the vqs, and release it while the
buffer is being freed.
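The resulting pattern looks roughly like this (a simplified sketch):

    /* Hold the lock only while detaching a buffer from the vq. */
    for (;;) {
            spin_lock_irq(&port->inbuf_lock);
            buf = virtqueue_detach_unused_buf(vq);
            spin_unlock_irq(&port->inbuf_lock);
            if (!buf)
                    break;
            free_buf(buf, true);    /* may call dma_free_coherent() */
    }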
Fixes: c6017e793b93 ("virtio: console: add locks around buffer removal in port unplug path") Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Legacy virtio defines the virtqueue base using a 32-bit PFN field, with
a read-only register indicating a fixed page size of 4k.
This can cause problems for DMA allocators that allocate top down from
the DMA mask, which is set to 64 bits. In this case, the addresses are
silently truncated to 44-bit, leading to IOMMU faults, failure to read
from the queue or data corruption.
This patch restricts the coherent DMA mask for legacy PCI virtio devices
to 44 bits, which matches the specification.
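A sketch of the probe-time restriction (error handling trimmed):

    /* Legacy virtio: ring base = 32-bit PFN of 4k pages => 44-bit limit. */
    err = dma_set_coherent_mask(&pci_dev->dev, DMA_BIT_MASK(44));
    if (err)
            err = dma_set_coherent_mask(&pci_dev->dev, DMA_BIT_MASK(32));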
Cc: Andy Lutomirski <luto@kernel.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Benjamin Serebrin <serebrin@google.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to the spec, if the VIRTIO_RING_F_EVENT_IDX feature bit is
negotiated the driver MUST set flags to 0. Not dirtying the available
ring in virtqueue_disable_cb also has a minor positive performance
impact, improving L1 dcache load misses by ~0.5% in vring_bench.
Writes to the used event field (vring_used_event) are still unconditional.
Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We have one critical section in the syscall entry path in which we switch from
the userspace stack to kernel stack. In the event of an external interrupt, the
interrupt code distinguishes between those two states by analyzing the value of
sr7. If sr7 is zero, it uses the kernel stack. Therefore it's important that
the value of sr7 is in sync with the currently enabled stack.
This patch now disables interrupts while executing the critical section. This
prevents the interrupt handler from seeing an inconsistent state, which in
the worst case can lead to crashes.
Interestingly, in the syscall exit path interrupts were already disabled in the
critical section which switches back to the userspace stack.
Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make sure the copied up file hits the disk before renaming to the final
destination. If this is not done then the copy-up may corrupt the data in
the file in case of a crash.
If platform code returns a NULL pointer to the FDT, initial_boot_params
will not get set to a valid pointer and attempting to find the /chosen
node in it will cause a NULL pointer dereference and the kernel to crash
immediately on startup - with no output to the console.
Fix this by checking that initial_boot_params is valid before using it.
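A minimal sketch of the added guard (the exact call site is elided):

    /* Bail out instead of dereferencing a missing FDT. */
    if (!initial_boot_params)
            return;         /* no device tree: skip /chosen parsing */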
Jeff Layton says:
> Hm...now that I look though, this is a little suspicious:
>
> struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
>
> I wonder if it's possible for the openstateid to have already been
> destroyed at this point.
>
> We might be better off doing something like this to get the client pointer:
>
> stp->st_stid.sc_client;
>
> ...which should be more direct and less dependent on other stateids
> staying valid.
With the suggested change, I am no longer able to reproduce the above oops.
v2: Fix unhash_lock_stateid() as well
Fix-suggested-by: Jeff Layton <jlayton@redhat.com> Fixes: 42691398be08 ('nfsd: Fix race between FREE_STATEID and LOCK') Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fixes a very annoying regression on the Snowball SD card
that has been around for a while. It turns out that the device
tree does not configure the direction pins properly, nor sets
up the pins for the voltage converter properly at boot. Unless
all things are correctly set up, the feedback clock will not
work, and makes the driver spew messages in the console (but
it works, very slowly):
root@Ux500:/ mount /dev/mmcblk0p2 /mnt/
[ 9.953460] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
[ 9.960296] mmcblk0: error -110 sending status command, retrying
[ 9.966461] mmcblk0: error -110 sending status command, retrying
[ 9.972534] mmcblk0: error -110 sending status command, aborting
Fix this by rectifying the device tree to correspond to that of
the Ux500 HREF boards plus the DAT31DIR setting that is unique for
the Snowball, and things start working smoothly. Add in the SDR12
and SDR25 modes which this host can do without any problems.
I don't know if this has ever been correct, sadly. It works after
this patch.
Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since commit bd3677ff31a3 ("clk: mvebu: Remove corediv clock from
Armada XP"), the corediv clk is no longer selected for Armada XP, even
though this clock is used on Armada XP via the
armada-370-corediv-clock compatible.
And while commit 1594d568c6e3 ("clk: mvebu: Move corediv config to
mvebu config") brought corediv support back for Armada 38x and Armada
375, it missed not only Armada XP but also Armada 39x.
Actually, all the SoCs selecting the MVEBU_V7 config need this clock:
git grep "\-corediv-clock" arch/arm/boot/dts
arch/arm/boot/dts/armada-370-xp.dtsi: compatible = "marvell,armada-370-corediv-clock";
arch/arm/boot/dts/armada-375.dtsi: compatible = "marvell,armada-375-corediv-clock";
arch/arm/boot/dts/armada-38x.dtsi: compatible = "marvell,armada-380-corediv-clock";
arch/arm/boot/dts/armada-39x.dtsi: compatible = "marvell,armada-390-corediv-clock"
This commit now fixes this behavior by letting MVEBU_V7 select
MVEBU_CLK_COREDIV.
The advancing of the PC when completing an MMIO load is done before
re-entering the guest, i.e. before restoring the guest ASID. However if
the load is in a branch delay slot it may need to access guest code to
read the prior branch instruction. This isn't safe in TLB mapped code at
the moment, nor in the future when we'll access unmapped guest segments
using direct user accessors too, as it could read the branch from host
user memory instead.
Therefore calculate the resume PC in advance while we're still in the
right context and save it in the new vcpu->arch.io_pc (replacing the no
longer needed vcpu->arch.pending_load_cause), and restore it on MMIO
completion.
The ERET instruction to return from exception is used for returning from
exception level (Status.EXL) and error level (Status.ERL). If both bits
are set however we should be returning from ERL first, as ERL can
interrupt EXL, for example when an NMI is taken. KVM however checks EXL
first.
Fix the order of the checks to match the pseudocode in the instruction
set manual.
Diag224 requires a page-aligned 4k buffer to store the name table
into. kmalloc does not guarantee page alignment, hence we replace it
with __get_free_page for the buffer allocation.
Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
vcpu->arch.wbinvd_dirty_mask may still be used after freeing it,
corrupting memory. For example, the following call trace may set a bit
in an already freed cpu mask:
kvm_arch_vcpu_load
vcpu_load
vmx_free_vcpu_nested
vmx_free_vcpu
kvm_arch_vcpu_free
Fix this by deferring freeing of wbinvd_dirty_mask.
dm_old_request_fn() has paths that access md->io_barrier. The party
destroying io_barrier should ensure that no future execution of
dm_old_request_fn() is possible. Move io_barrier destruction to below
blk_cleanup_queue() to ensure this and avoid a NULL pointer crash during
request-based DM device shutdown.
Commit 2518ac59eb27 ("staging: wilc1000: Replace kthread with workqueue
for host interface") adds an unconditional destroy_workqueue() on the
wilc's "hif_workqueue" soon after its creation thereby rendering
it unusable. It then further attempts to queue work onto this
non-existing hif_worqueue and results in:
This fix removes the unnecessary call to destroy_workqueue() while opening
the device to avoid the above kernel panic. The deinit routine already
does a good job of terminating the workqueue when no longer needed.
Reported-by: Nicolas Ferre <Nicolas.Ferre@microchip.com> Fixes: 2518ac59eb27 ("staging: wilc1000: Replace kthread with workqueue for host interface") Signed-off-by: Aditya Shankar <Aditya.Shankar@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
I2C and SPI interfaces share common clock trees within the CP110 HW block.
It turned out that the SPI0 interface had a wrong clock assignment in the
device tree, which is fixed in this commit to the proper value.
Size of kmalloc() in vc_do_resize() is controlled by user.
Too large kmalloc() size triggers WARNING message on console.
Put a reasonable upper bound on terminal size to prevent WARNINGs.
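A sketch of such a bound; the limit values here are illustrative, not the
ones actually chosen:

    /* Reject absurd sizes before attempting the allocation. */
    if (new_cols > 32767 || new_rows > 32767)
            return -EINVAL;
    /* ...only then size and kmalloc() the new screen buffer. */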
If a device is unplugged and replugged during Sx system suspend
some Intel xHC hosts will overwrite the CAS (Cold attach status) flag
and no device connection is noticed in resume.
A device in this state can be identified in resume if its link state
is in polling or compliance mode, and the current connect status is 0.
A device in this state needs to be warm reset.
Intel 100/c230 series PCH specification update Doc #332692-006 Errata #8
Observed on Cherryview and Apollolake as they go into compliance mode
if LFPS times out during polling, and re-plugged devices are not
discovered at resume.
The host keeps sending heartbeat packets independent of the guest
responding to them. Even though we respond to the heartbeat messages at
interrupt level, we can have situations where there may be multiple
heartbeat messages pending that have not been responded to. For instance
this occurs when the VM is paused and the host continues to send the
heartbeat messages.
Address this issue by draining and responding to all the heartbeat
messages that may be pending.
Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The recent changes, which forced the registration of the boot cpu on UP
systems, which do not have ACPI tables, have been fixed for systems w/o
local APIC, but left a wreckage for systems which have neither ACPI nor
mptables, but the CPU has an APIC, e.g. virtualbox.
The boot process crashes in prefill_possible_map() as it wants to register
the boot cpu, which needs to access the local apic, but the local APIC is
not yet mapped.
There is no reason why init_apic_mapping() can't be invoked before
prefill_possible_map(). So instead of playing another silly early mapping
game, as the ACPI/mptables code does, we just move init_apic_mapping()
before the call to prefill_possible_map().
In hindsight, I should have noticed that combination earlier.
When interrupting an application which was allocating DMAable
memory, it was possible, that the DMA memory was deallocated
twice, leading to the error symptoms below.
Thanks to Gerald, who analyzed the problem and provided this
patch.
I agree with his analysis of the problem: ddcb_cmd_fixups() ->
genwqe_alloc_sync_sgl() (fails in f/lpage, but sgl->sgl != NULL
and f/lpage maybe also != NULL) -> ddcb_cmd_cleanup() ->
genwqe_free_sync_sgl() (double free, because sgl->sgl != NULL and
f/lpage maybe also != NULL)
In this scenario we would have exactly the kind of double free that
would explain the WARNING / Bad page state, and as expected it is
caused by broken error handling (cleanup).
Using the Ubuntu git source, tag Ubuntu-4.4.0-33.52, he was able to reproduce
the "Bad page state" issue, and with the patch on top he could not reproduce
it any more.
Increase the ohci watchdog delay to 275 ms. The previous delay was 250 ms
with 20 ms of slack; after removing the slack time, some ohci controllers
don't respond in time. Logs from systems with affected controllers would
show "HcDoneHead not written back; disabled".
Since the controller on R-Car Gen3 doesn't have any status registers
to detect initialization (LPSTS.SUSPM = 1) and the initialization needs
up to 45 usec, this patch adds wait after the initialization. Otherwise,
writing other registers (e.g. INTENB0) will fail.
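Roughly what the fix amounts to (register and helper names are taken from
the commit text and should be treated as assumptions):

    usbhs_bset(priv, LPSTS, SUSPM, SUSPM); /* start controller init */
    usleep_range(45, 90);                  /* HW needs up to 45 usec before
                                            * other registers are writable */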
This adds support to ftdi_sio for the Infineon TriBoard TC2X7
engineering board for first-generation Aurix SoCs with Tricore CPUs.
Mere addition of the device IDs does the job.
Signed-off-by: Stefan Tauner <stefan.tauner@technikum-wien.at> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current tiocmget implementation would fail to report errors up the
stack and instead leaked a few bits from the stack as a mask of
modem-status flags.
Fixes: 39a66b8d22a3 ("[PATCH] USB: CP2101 Add support for flow control") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make sure we have at least one port before attempting to register a
console.
Currently, at least one driver binds to a "dummy" interface and requests
zero ports for it. Should such an interface also lack endpoints, we get
a NULL-deref during probe.
Fixes: e5b1e2062e05 ("USB: serial: make minor allocation dynamic") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we don't guarantee that we will always get an
interrupt at least when we're queueing our very last
request, we could fall into a situation where we queue
every request with 'no_interrupt' set. This will
cause the link to get stuck.
The behavior above has been triggered with g_ether
and dwc3.
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
SEC registers are not accessible when the TXE device is in low power
state, hence the SEC interrupt cannot be processed if device is not
awake.
In some rare cases entrance to low power state (aliveness off) and input
ready bits can be signaled at the same time, resulting in a communication
stall as input ready won't be signaled again after waking up. To resolve
this, the IPC_HHIER_SEC bit in HHISR_REG should not be cleared if the
interrupt is not processed.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If UBIFS is facing an error while walking a directory, it reports this
error and ubifs_readdir() returns the error code. But the VFS readdir
logic does not make the getdents system call fail in all cases. When the
readdir cursor indicates that more entries are present, the system call
will just return and the libc wrapper will try again since it also
knows that more entries are present.
This causes the libc wrapper to busy loop for ever when a directory is
corrupted on UBIFS.
A common approach to deal with corrupted directory entries is
skipping them by setting the cursor to the next entry. On UBIFS this
approach is not possible since we cannot compute the next directory
entry cursor position without reading the current entry. So all we can
do is set the cursor to the "no more entries" position and make
getdents exit.
Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus stumbled over the unlocked modification of the timer expiry value in
mod_timer() which is an optimization for timers which stay in the same
bucket - due to the bucket granularity - despite their expiry time getting
updated.
The optimization itself still makes sense even if we take the lock, because
in case that the bucket stays the same, we avoid the pointless
queue/enqueue dance.
Make the check and the modification of timer->expires protected by the base
lock and shuffle the remaining code around so we can keep the lock held
when we actually have to requeue the timer to a different bucket.
Fixes: f00c0afdfa62 ("timers: Implement optimization for same expiry time in mod_timer()") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus noticed that lock_timer_base() lacks a READ_ONCE() for accessing the
timer flags. As a consequence the compiler is allowed to reload the flags
between the initial check for TIMER_MIGRATION and the following timer base
computation and the spin lock of the base.
While this has not been observed (yet), we need to make sure that it never
happens.
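The resulting pattern is roughly this (a simplified sketch of
lock_timer_base()):

    for (;;) {
            u32 tf = READ_ONCE(timer->flags);       /* one stable snapshot */

            if (!(tf & TIMER_MIGRATING)) {
                    struct timer_base *base = get_timer_base(tf);

                    spin_lock_irqsave(&base->lock, *flags);
                    if (timer->flags == tf)         /* still the same base */
                            return base;
                    spin_unlock_irqrestore(&base->lock, *flags);
            }
            cpu_relax();
    }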
Fixes: 0eeda71bc30d ("timer: Replace timer base by a cpu index") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a timer is enqueued we try to forward the timer base clock. This
mechanism has two issues:
1) Forwarding a remote base unlocked
The forwarding function is called from get_target_base() with the current
timer base lock held. But if the new target base is a different base than
the current base (can happen with NOHZ, sigh!) then the forwarding is done
on an unlocked base. This can lead to corruption of base->clk.
Solution is simple: Invoke the forwarding after the target base is locked.
2) Possible corruption due to jiffies advancing
This is similar to the issue in get_next_timer_interrupt() which was fixed
in the previous patch. jiffies can advance between the check and the
assignment and therefore advance base->clk beyond the next expiry value.
So we need to read jiffies into a local variable once and do the checks and
assignment with the local copy.
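Sketched, the forwarding then looks something like this (simplified):

    unsigned long jnow = READ_ONCE(jiffies);        /* read jiffies once */

    if (time_before_eq(jnow, base->clk))
            return;
    /* Never advance base->clk past the next pending expiry. */
    if (time_after(base->next_expiry, jnow))
            base->clk = jnow;
    else
            base->clk = base->next_expiry;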
Fixes: a683f390b93f("timers: Forward the wheel clock whenever possible") Reported-by: Ashton Holmes <scoopta@gmail.com> Reported-by: Michael Thayer <michael.thayer@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Michal Necasek <michal.necasek@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: knut.osmundsen@oracle.com Cc: stern@rowland.harvard.edu Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20161022110552.253640125@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jiffies has not advanced since run_timers(), so this assignment effectively
decrements base->clk by one.
base->clk is the index into the timer wheel arrays. So let's assume the
following state after the base->clk increment in run_timers():
jiffies = 0
base->clk = 1
A timer gets enqueued with an expiry delta of 63 ticks (which is the case
with the USB timeout and HZ=250) so the resulting bucket index is:
base->clk + delta = 1 + 63 = 64
The timer goes into the first wheel level. The array size is 64 so it ends
up in bucket 0, which is correct as it takes 63 ticks to advance base->clk
to index into bucket 0 again.
If the cpu goes idle before jiffies advance, then the bug in the forwarding
mechanism sets base->clk back to 0, so the next invocation of run_timers()
at the next tick will index into bucket 0 and therefore expire the timer 62
ticks too early.
Instead of blindly setting base->clk to jiffies we must make the forwarding
conditional on jiffies > base->clk, but we cannot use jiffies for this as
we might run into the following issue:
    if (time_after(jiffies, base->clk)) {
            if (time_after(nextevt, base->clk))
                    base->clk = jiffies;
jiffies can increment between the check and the assignment far enough to
advance beyond nextevt. So we need to use a stable value for checking.
get_next_timer_interrupt() has the basej argument which is the jiffies
value snapshot taken in the calling code. So we can just use that.
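Sketched with basej (simplified):

    if (time_after(basej, base->clk)) {
            if (time_after(nextevt, basej))
                    base->clk = basej;      /* safe: basej cannot move */
            else if (time_after(nextevt, base->clk))
                    base->clk = nextevt;
    }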
Thanks to Ashton for bisecting and providing trace data!
Fixes: a683f390b93f ("timers: Forward the wheel clock whenever possible") Reported-by: Ashton Holmes <scoopta@gmail.com> Reported-by: Michael Thayer <michael.thayer@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Michal Necasek <michal.necasek@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: knut.osmundsen@oracle.com Cc: stern@rowland.harvard.edu Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20161022110552.175308322@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We needed the physical address of the container in order to compute the
offset within the relocated ramdisk. And we did this by doing __pa() on
the virtual address.
However, __pa() does checks whether the physical address is within
PAGE_OFFSET and __START_KERNEL_map - see __phys_addr() - which fail
if we have CONFIG_RANDOMIZE_MEMORY enabled: we feed a virtual address
which *doesn't* have the randomization offset into a function which uses
PAGE_OFFSET which *does* have that offset.
This fixes a race condition where one thread that is entering or
leaving a power-saving state can inadvertently ignore the lock bit
that was set by another thread, and potentially also clear it.
The core_idle_lock_held function is called when the lock bit is
seen to be set. It polls the lock bit until it is clear, then
does a lwarx to load the word containing the lock bit and thread
idle bits so it can be updated. However, it is possible that the
value loaded with the lwarx has the lock bit set, even though an
immediately preceding lwz loaded a value with the lock bit clear.
If this happens then we go ahead and update the word despite the
lock bit being set, and when called from pnv_enter_arch207_idle_mode,
we will subsequently clear the lock bit.
No identifiable misbehaviour has been attributed to this race.
This fixes it by checking the lock bit in the value loaded by the
lwarx. If it is set then we just go back and keep on polling.
Fixes: b32aadc1a8ed ("powerpc/powernv: Fix race in updating core_idle_state") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 8117ac6a6c2f ("powerpc/powernv: Switch off MMU before entering
nap/sleep/rvwinkle mode", 2014-12-10) fixed a race condition where one
thread entering a KVM guest could switch the MMU context to the guest
while another thread was still in host kernel context with the MMU on.
That commit moved the point where a thread entering a power-saving
mode set its kvm_hstate.hwthread_state field in its PACA to
KVM_HWTHREAD_IN_IDLE from a point where the MMU was on to after the
MMU had been switched off. That commit also added a comment
explaining that we have to switch to real mode before setting
hwthread_state to avoid this race.
Nevertheless, commit 4eae2c9ae54a ("powerpc/powernv: Make
pnv_powersave_common more generic", 2016-07-08) subsequently moved
the setting of hwthread_state back to a point where the MMU is on,
thus reintroducing the race, despite the comment saying that this
should not be done being included in full in the context lines of
the patch that did it.
This fixes the race again and adds a bigger and shoutier comment
explaining the potential race condition.
Fixes: 4eae2c9ae54a ("powerpc/powernv: Make pnv_powersave_common more generic") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Shreyas B. Prabhu <shreyasbp@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Before this patch, we used tlbiel if we had only ever run on this core.
That was mostly derived from the nohash usage of the same. But it is
incorrect: the ISA 3.0 clarifies tlbiel such that:
"All TLB entries that have all of the following properties are made
invalid on the thread executing the tlbiel instruction"
ie. tlbiel only invalidates TLB entries on the current thread. So if the
mm has been used on any other thread (aka. cpu) then we must broadcast
the invalidate.
This bug could lead to invalid TLB entries if a program runs on multiple
threads of a core.
Hence use tlbiel only if we have only ever run on the current cpu.
PowerPC's "cmp" instruction has four operands. Normally people write
"cmpw" or "cmpd" for the second cmp operand 0 or 1. But, frequently
people forget, and write "cmp" with just three operands.
With older binutils this is silently accepted as if this was "cmpw",
while often "cmpd" is wanted. With newer binutils GAS will complain
about this for 64-bit code. For 32-bit code it still silently assumes
"cmpw" is what is meant.
In this instance the code comes directly from ISA v2.07, including the
cmp, but cmpd is correct. Backport to stable so that new toolchains can
build old kernels.
Fixes: 948cf67c4726 ("powerpc: Add NAP mode support on Power7 in HV mode") Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
btrfs_remove_all_log_ctxs takes a shortcut where it avoids walking the
list because it knows all of the waiters are patiently waiting for the
commit to finish.
But, there's a small race where btrfs_sync_log can remove itself from
the list if it finds a log commit is already done. Also, it uses
list_del_init() to remove itself from the list, but there's no way to
know if btrfs_remove_all_log_ctxs has already run, so we don't know for
sure if it is safe to call list_del_init().
This gets rid of all the shortcuts for btrfs_remove_all_log_ctxs(), and
just calls it with the proper locking.
This is part two of the corruption fixed by cbd60aa7cd1. I should have
done this in the first place, but convinced myself the optimizations were
safe. A 12 hour run of dbench 2048 will eventually trigger a list debug
WARN_ON for the list_del_init() in btrfs_sync_log().
Fixes: d1433debe7f4346cf9fc0dafc71c3137d2a97bc4 Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In some error paths in functions cxl_start_context and
afu_ioctl_start_work pid references to the current & group-leader tasks
can leak after they are taken. This patch fixes these error paths to
release these pid references before exiting the error path.
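A sketch of the error-path cleanup (the context field names are
assumptions):

    put_pid(ctx->glpid);        /* drop group-leader reference */
    put_pid(ctx->pid);          /* drop task reference */
    ctx->glpid = ctx->pid = NULL;
    goto out;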
Fixes: 7b8ad495d592 ("cxl: Fix DSI misses when the context owning task exits") Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reported-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
These machines use the ALC255 codec and have a pin cfg definition
different from the ones in the existing pin quirk table. Now add them
to the table to fix the problem.
ASRock B150M Pro4/D3 mobo with ALC892 codec doesn't seem to provide
proper pins for the surround outputs, hence we need to specify the
pincfgs manually with a couple of other corrections.
We have a new Dell laptop model which uses ALC295; the pin definition
is different from the existing ones in the pin quirk table. To fix the
headset mic detection and mic mute LED problems, we need to add the
new pin definition to the pin quirk table.
Commit 49d9e77e72cf ("ALSA: hda - Fix system panic when DMA > 40 bits
for Nvidia audio controllers") simply disabled any DMA exceeding 32
bits for NVidia devices, even though they are capable of performing
DMA up to 40 bits. On some architectures (such as arm64), system memory
is not guaranteed to be 32-bit addressable by PCI devices, and so this
change prevents NVidia devices from working on platforms such as AMD
Seattle.
Since the original commit already mentioned that up to 40 bits of DMA
is supported, and given that the code has been updated in the meantime
to support a 40 bit DMA mask on other devices, revert commit 49d9e77e72cf
and explicitly set the DMA mask to 40 bits for NVidia devices.
The recent rewrite of the sequencer time accounting using timespec64
in the commit [3915bf294652: ALSA: seq_timer: use monotonic times
internally] introduced a bad regression. Namely, the time reported
back doesn't increase but goes back and forth.
The culprit was obvious: the delta is stored to the result (cur_time =
delta), instead of adding the delta (cur_time += delta)!
The stk1160 chip needs QUIRK_AUDIO_ALIGN_TRANSFER. This patch resolves
the issue reported on the mailing list
(http://marc.info/?l=linux-sound&m=139223599126215&w=2) and also fixes
bug 180071 (https://bugzilla.kernel.org/show_bug.cgi?id=180071).
We need to wait until the percpu_ref is released before exit. Otherwise,
we sometimes lose the race and trigger this new warning that was added
in v4.9 (commit a67823c1ed10 "percpu-refcount: init ->confirm_switch
member properly"):
Fixes: ab68f2622136 ("/dev/dax, pmem: direct access to persistent memory") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since BIG_KEYS can't be compiled as a module, it requires one of the "stdrng"
providers to be compiled into the kernel. Otherwise big_key_crypto_init() fails
on crypto_alloc_rng step and next dereference of big_key_skcipher (e.g. in
big_key_preparse()) results in a NULL pointer dereference.
Fixes: 13100a72f40f5748a04017e0ab3df4cf27c809ef ('Security: Keys: Big keys stored encrypted') Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com>
cc: Stephan Mueller <smueller@chronox.de>
cc: Kirill Marinushkin <k.marinushkin@gmail.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
big_key has two separate initialisation functions, one that registers the
key type and one that registers the crypto. If the key type fails to
register, there's no problem if the crypto registers successfully because
there's no way to reach the crypto except through the key type.
However, if the key type registers successfully but the crypto does not,
big_key_rng and big_key_blkcipher may end up set to NULL - but the code
neither checks for this nor unregisters the big key key type.
Furthermore, since the key type is registered before the crypto, it is
theoretically possible for the kernel to try adding a big_key before the
crypto is set up, leading to the same effect.
Fix this by merging big_key_crypto_init() and big_key_init() and calling
the resulting function late. If they're going to be encrypted, we
shouldn't be creating big_keys before we have the facilities to do the
encryption available. The key type registration is also moved after the
crypto initialisation.
The fix also includes message printing on failure.
If the big_key type isn't correctly set up, simply doing:
dd if=/dev/zero bs=4096 count=1 | keyctl padd big_key a @s
ought to cause an oops.
Fixes: 13100a72f40f5748a04017e0ab3df4cf27c809ef ('Security: Keys: Big keys stored encrypted') Signed-off-by: David Howells <dhowells@redhat.com>
cc: Peter Hlavaty <zer0mem@yahoo.com>
cc: Kirill Marinushkin <k.marinushkin@gmail.com>
cc: Artem Savkov <asavkov@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix a short sprintf buffer in proc_keys_show(). If the gcc stack protector
is turned on, this can cause a panic due to stack corruption.
The problem is that xbuf[] is not big enough to hold a 64-bit timeout
rendered as weeks:
(gdb) p 0xffffffffffffffffULL/(60*60*24*7)
$2 = 30500568904943
That's 14 chars plus NUL, not 11 chars plus NUL.
Expand the buffer to 16 chars.
I think the unpatched code apparently works if the stack-protector is not
enabled because on a 32-bit machine the buffer won't be overflowed and on a
64-bit machine there's a 64-bit aligned pointer at one side and an int that
isn't checked again on the other side.
Reported-by: Ondrej Kozina <okozina@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Ondrej Kozina <okozina@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Initial logic for checking CPU match resulted in OR of CPU features
rather than the intended AND.
Updated to use boot_cpu_has macro rather than x86_match_cpu.
In addition, MWAIT is the only required CPU feature for idle
injection to work. Drop other feature requirements since they are
only needed for optimal efficiency.
Signed-off-by: Eric Ernst <eric.ernst@linux.intel.com> Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On 4.0, we saw a stack corruption from a page fault entering direct
memory cgroup reclaim, calling into btrfs_releasepage(), which then
tried to allocate an extent and recursed back into a kmem charge ad
nauseam.
On later kernels, kmem charging is opt-in rather than opt-out, and that
particular kmem allocation in btrfs_releasepage() is no longer being
charged and won't recurse and overrun the stack anymore.
But it's not impossible for an accounted allocation to happen from the
memcg direct reclaim context, and we needed to reproduce this crash many
times before we even got a useful stack trace out of it.
Like other direct reclaimers, mark tasks in memcg reclaim PF_MEMALLOC to
avoid recursing into any other form of direct reclaim. Then let
recursive charges from PF_MEMALLOC contexts bypass the cgroup limit.
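Roughly, the memcg direct reclaim entry point then follows the usual
direct-reclaim pattern (function and variable names assumed):

    /* Mark the task so nested allocations bypass further direct reclaim. */
    current->flags |= PF_MEMALLOC;
    nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
    current->flags &= ~PF_MEMALLOC;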
Link: http://lkml.kernel.org/r/20161025141050.GA13019@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This issue is caused by the kmemcg feature, which tries to create a new
set of kmem_caches for each memcg. Recently, kmem_cache creation has been
slowed by synchronize_sched(), and further kmem_cache creation is also
delayed since kmem_cache creation is serialized by the global slab_mutex
lock. So, the number of kworkers that try to create kmem_caches increases
quietly.
synchronize_sched() is for lockless access to node's shared array but
it's not needed when a new kmem_cache is created. So, this patch rules
out that case.
Fixes: 801faf0db894 ("mm/slab: lockless decision to grow cache") Link: http://lkml.kernel.org/r/1475734855-4837-1-git-send-email-iamjoonsoo.kim@lge.com Reported-by: Doug Smythies <dsmythies@telus.net> Tested-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As described in https://bugzilla.kernel.org/show_bug.cgi?id=177821:
After some analysis it seems to be that the problem is in alloc_super().
In case list_lru_init_memcg() fails it goes into destroy_super(), which
calls list_lru_destroy().
And in list_lru_init() we see that in case memcg_init_list_lru() fails,
lru->node is freed but not set to NULL, which then leads list_lru_destroy()
to believe it is initialized and call memcg_destroy_list_lru().
memcg_destroy_list_lru() in turn can access lru->node[i].memcg_lrus,
which is NULL.
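Sketch of the fix in list_lru_init() (the error label is illustrative):

    err = memcg_init_list_lru(lru, memcg_aware);
    if (err) {
            kfree(lru->node);
            lru->node = NULL; /* so list_lru_destroy() knows it never lived */
            goto out;
    }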
[akpm@linux-foundation.org: add comment] Signed-off-by: Alexander Polakov <apolyakov@beget.ru> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function xfs_calc_dquots_per_chunk takes a parameter in units
of basic blocks. The kernel seems to get the units wrong, but
userspace got 'fixed' by commenting out the unnecessary conversion.
Fix both.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When allocating a new line handle or event, a file is allocated and
associated with it. The file is attached to a file descriptor of the current
process and the file descriptor is returned to userspace using
copy_to_user(). If this copy operation fails the line handle or event
allocation is aborted, all acquired resources are freed and an error is
returned.
But the file struct is not freed and stays attached to the process, and
even though the file descriptor number was not copied it is trivial to
guess. If a userspace application performs an IOCTL on such a left-over
file descriptor it will trigger a use-after-free, and if the file
descriptor is closed (at the latest when the application exits) a
double-free is triggered.
anon_inode_getfd() performs 3 tasks: allocating a file struct, allocating a
file descriptor for the current process, and installing the file struct in
the file descriptor. As soon as the file struct is installed in the file
descriptor it is accessible by userspace (even if the IOCTL itself hasn't
completed yet). This means uninstalling the fd on the error path is not an
option, since userspace might already have obtained a reference to the file.
Instead anon_inode_getfd() needs to be broken into its individual steps.
The allocation of the file struct and file descriptor is done first, then
the copy_to_user() is executed, and only if it succeeds is the file
installed.
Since the file struct is reference counted it can not be just freed, but
its reference needs to be dropped, which will also call the release()
callback, which will free the state attached to the file. So in this case
the normal error cleanup path should not be taken.
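A sketch of the resulting sequence (the fops/state/request names are
placeholders, and unwinding of earlier allocations is elided):

    fd = get_unused_fd_flags(O_RDONLY | O_CLOEXEC);
    if (fd < 0)
            return fd;

    file = anon_inode_getfile("gpio-linehandle", &fileops, priv,
                              O_RDONLY | O_CLOEXEC);
    if (IS_ERR(file)) {
            put_unused_fd(fd);
            return PTR_ERR(file);
    }

    handlereq.fd = fd;
    if (copy_to_user(ip, &handlereq, sizeof(handlereq))) {
            fput(file);             /* release() frees the attached state */
            put_unused_fd(fd);
            return -EFAULT;
    }
    fd_install(fd, file);           /* only now visible to userspace */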
The GPIOHANDLE_GET_LINE_VALUES_IOCTL handler allocates a gpiohandle_data
struct on the stack and then passes it to copy_to_user(). But only the
first element of the values array in the struct is set, which leaves the
struct partially initialized.
This exposes the previous, potentially sensitive, stack content to the
issuing userspace application. To avoid this make sure that the struct is
fully initialized.
Cc: stable@vger.kernel.org Fixes: 61f922db7221 ("gpio: userspace ABI for reading GPIO line events") Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The GPIO_GET_LINEEVENT_IOCTL currently ignores unknown or undefined
linehandle and lineevent flags. From a backwards and forwards compatibility
viewpoint it is highly desirable to reject unknown flags though.
On one hand an application that is using newer flags and is running on
an older kernel has no way to detect if the new flags were handled
correctly if they are silently discarded.
On the other hand an application that (accidentally) passes undefined flags
will run fine on an older kernel, but may break on a newer kernel when
these flags get defined.
Ensure that requests that have undefined flags set are rejected with an
error, rather than silently discarding the undefined flags.
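A sketch of the check (the VALID_FLAGS mask names are illustrative):

    /* Reject requests that set flags this kernel does not know about. */
    if (lflags & ~GPIOHANDLE_REQUEST_VALID_FLAGS ||
        eflags & ~GPIOEVENT_REQUEST_VALID_FLAGS)
            return -EINVAL;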
Fixes: 61f922db7221 ("gpio: userspace ABI for reading GPIO line events") Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The GPIO_GET_LINEHANDLE_IOCTL currently ignores unknown or undefined
linehandle flags. From a backwards and forwards compatibility viewpoint it
is highly desirable to reject unknown flags though.
On one hand an application that is using newer flags and is running on
an older kernel has no way to detect if the new flags were handled
correctly if they are silently discarded.
On the other hand an application that (accidentally) passes undefined flags
will run fine on an older kernel, but may break on a newer kernel when
these flags get defined.
Ensure that requests that have undefined flags set are rejected with an
error, rather than silently discarding the undefined flags.
The line offset that is used as an index into the descs array is provided
by userspace and might go beyond the bounds of the array. If that happens
undefined behavior will occur.
Make sure that the offset is within the bounds of the desc array and reject
any requests that specify a value outside of it.
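A sketch of the bounds check (gdev->ngpio is the number of lines the chip
exposes; treat the field names as assumptions):

    if (offset >= gdev->ngpio)
            return -EINVAL;
    desc = &gdev->descs[offset];    /* now a safe index */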
Fixes: 61f922db7221 ("gpio: userspace ABI for reading GPIO line events") Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The GPIOHANDLE_GET_LINE_VALUES_IOCTL handler allocates a gpiohandle_data
struct on the stack and then passes it to copy_to_user(). But depending on
the number of requested line handles the struct is only partially
initialized.
This exposes the previous, potentially sensitive, stack content to the
issuing userspace application. To avoid this make sure that the struct is
fully initialized.
The line offset that is used as an index into the descs array is provided
by userspace and might go beyond the bounds of the array. If that happens
undefined behavior will occur.
Make sure that the offset is within the bounds of the desc array and reject
any requests that specify a value outside of it.
The GPIO_GET_CHIPINFO_IOCTL handler allocates a gpiochip_info struct on the
stack and then passes it to copy_to_user(). But depending on the length of
the GPIO chip name and label the struct is only partially initialized.
This exposes the previous, potentially sensitive, stack content to the
issuing userspace application. To avoid this make sure that the struct is
fully initialized.
Fixes: 521a2ad6f862 ("gpio: add userspace ABI for GPIO line information") Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current line offset validation is off by one. Depending on the data
stored behind the descs array this can either cause undefined behavior or
disclose arbitrary, potentially sensitive, memory to the issuing userspace
application.
Make sure that offset is within the bounds of the desc array.
Fixes: 521a2ad6f862 ("gpio: add userspace ABI for GPIO line information") Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Back in commit f56141e3e2d9 ("all arches, signal: move restart_block to
struct task_struct"), all architectures and core code were changed to
use task_struct::restart_block. However, when h8300 support was
subsequently restored in v4.2, it was not updated to account for this,
and maintains thread_info::restart_block, which is not kept in sync.
This patch drops the redundant restart_block from thread_info, and moves
h8300 to the common one in task_struct, ensuring that syscall restarting
always works as expected.
Fixes: f56141e3e2d9 ("all arches, signal: move restart_block to struct task_struct") Link: http://lkml.kernel.org/r/1476714934-11635-1-git-send-email-mark.rutland@arm.com Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Instantiated SPI device nodes are marked with OF_POPULATE. This was
introduced in bd6c164. On unloading, loaded device nodes will of course
be unmarked. The problem is nodes that fail during initialisation: if a
node fails, it won't be unloaded and hence never be unmarked.
If a SPI driver module is unloaded and reloaded, it will skip nodes that
failed before.
Skip device nodes that are already populated and mark them only in case
of success.
Note that the same issue exists for I2C.
Fixes: bd6c164 ("spi: Mark instantiated device nodes with OF_POPULATE") Signed-off-by: Ralf Ramsauer <ralf@ramses-pyramidenbau.de> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we get a spurious interrupt in fsl_espi_irq, we end up
processing four uninitialized bytes of data, as shown in this
warning message:
drivers/spi/spi-fsl-espi.c: In function 'fsl_espi_irq':
drivers/spi/spi-fsl-espi.c:462:4: warning: 'rx_data' may be used uninitialized in this function [-Wmaybe-uninitialized]
This adds another check so we skip the data in this case.
Fixes: 6319a68011b8 ("spi/fsl-espi: avoid infinite loops on fsl_espi_cpu_irq()") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The i2c adapter is only relevant for some peer device types, so
let's clear the pdt if it's still the same as the old_pdt when we
tear down the i2c adapter.
I don't really like this design pattern of updating port->whatever
before doing the accompanying changes and passing around old_whatever
to figure stuff out. Would make much more sense to me to the pass the
new value around and only update the port->whatever when things are
consistent. But let's try to work with what we have right now.
Quoting a follow-up from Ville:
"And naturally I forgot to amend the commit message w.r.t. this guy
[the change in drm_dp_destroy_port]. We don't really need to do this
here, but I figured I'd try to be a bit more consistent by having it,
just to avoid accidental mistakes if/when someone changes this stuff
again later."
v2: Clear port->pdt in the caller, if needed (Daniel)
Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Carlos Santa <carlos.santa@intel.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Tested-by: Carlos Santa <carlos.santa@intel.com> Tested-by: Kirill A. Shutemov <kirill@shutemov.name>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97666 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1477488633-16544-1-git-send-email-ville.syrjala@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Race condition between registering an I2C device driver and
deregistering an I2C adapter device which is assumed to manage that
I2C device may lead to a NULL pointer dereference due to the
uninitialized list head of driver clients.
The root cause of the issue is that the I2C bus may know about the
registered device driver and thus it is matched by bus_for_each_drv(),
but the list of clients is not initialized and commonly it is NULL,
because I2C device drivers define struct i2c_driver as static and the
clients field is expected to be initialized by the I2C core.
To solve the problem it is sufficient to do clients list head
initialization before calling driver_register().
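A sketch of the resulting ordering in the registration path:

    INIT_LIST_HEAD(&driver->clients);       /* valid before matching starts */

    res = driver_register(&driver->driver); /* driver becomes visible here */
    if (res)
            return res;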
The problem was found while using an I2C device driver with a sluggish
registration routine on a bus provided by a physically detachable I2C
master controller, but practically the oops may be reproduced under
the race between arbitrary I2C device driver registration and I2C bus
device removal, e.g. by unbinding the latter over sysfs:
% echo 21a4000.i2c > /sys/bus/platform/drivers/imx-i2c/unbind
Unable to handle kernel NULL pointer dereference at virtual address 00000000
Internal error: Oops: 17 [#1] SMP ARM
CPU: 2 PID: 533 Comm: sh Not tainted 4.9.0-rc3+ #61
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
task: e5ada400 task.stack: e4936000
PC is at i2c_do_del_adapter+0x20/0xcc
LR is at __process_removed_adapter+0x14/0x1c
Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c5387d Table: 35bd004a DAC: 00000051
Process sh (pid: 533, stack limit = 0xe4936210)
Stack: (0xe4937d28 to 0xe4938000)
Backtrace:
[<c0667be0>] (i2c_do_del_adapter) from [<c0667cc0>] (__process_removed_adapter+0x14/0x1c)
[<c0667cac>] (__process_removed_adapter) from [<c0516998>] (bus_for_each_drv+0x6c/0xa0)
[<c051692c>] (bus_for_each_drv) from [<c06685ec>] (i2c_del_adapter+0xbc/0x284)
[<c0668530>] (i2c_del_adapter) from [<bf0110ec>] (i2c_imx_remove+0x44/0x164 [i2c_imx])
[<bf0110a8>] (i2c_imx_remove [i2c_imx]) from [<c051a838>] (platform_drv_remove+0x2c/0x44)
[<c051a80c>] (platform_drv_remove) from [<c05183d8>] (__device_release_driver+0x90/0x12c)
[<c0518348>] (__device_release_driver) from [<c051849c>] (device_release_driver+0x28/0x34)
[<c0518474>] (device_release_driver) from [<c0517150>] (unbind_store+0x80/0x104)
[<c05170d0>] (unbind_store) from [<c0516520>] (drv_attr_store+0x28/0x34)
[<c05164f8>] (drv_attr_store) from [<c0298acc>] (sysfs_kf_write+0x50/0x54)
[<c0298a7c>] (sysfs_kf_write) from [<c029801c>] (kernfs_fop_write+0x100/0x214)
[<c0297f1c>] (kernfs_fop_write) from [<c0220130>] (__vfs_write+0x34/0x120)
[<c02200fc>] (__vfs_write) from [<c0221088>] (vfs_write+0xa8/0x170)
[<c0220fe0>] (vfs_write) from [<c0221e74>] (SyS_write+0x4c/0xa8)
[<c0221e28>] (SyS_write) from [<c0108a20>] (ret_fast_syscall+0x0/0x1c)
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We found a bug where i2c transfers sometimes failed on a 3066a board with
stable-4.8: the con register would be updated with an uninitialized tuning
value, which made the i2c transfer fail.
So set the tuning value to zero during rk3x_i2c_v0_calc_timings.
Signed-off-by: David Wu <david.wu@rock-chips.com> Tested-by: Andy Yan <andy.yan@rock-chips.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
nvdimm_clear_poison cleared the user-visible badblocks, and sent
commands to the NVDIMM to clear the areas marked as 'poison', but it
neglected to clear the same areas from the internal poison_list which is
used to marshal ARS results before sorting them by namespace. As a
result, once on-demand ARS functionality was added:
37b137f nfit, libnvdimm: allow an ARS scrub to be triggered on demand
A scrub triggered from either sysfs or an MCE was found to be adding
stale entries that had been cleared from gendisk->badblocks, but were
still present in nvdimm_bus->poison_list. Additionally, the stale entries
could be triggered into producing stale disk->badblocks by simply disabling
and re-enabling the namespace or region.
This adds the missing step of clearing poison_list entries when clearing
poison, so that it is always in sync with badblocks.
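A minimal sketch of that missing step, assuming a hypothetical helper
(its name and the partial-overlap handling are illustrative, not
necessarily what the actual patch uses) that is called from
nvdimm_clear_poison() with the range the DIMM reported as cleared:

  /* Illustrative only: drop poison_list entries that lie entirely
   * inside the range that was just cleared on the DIMM.  The caller
   * is assumed to hold the lock protecting poison_list.
   */
  static void forget_poison(struct nvdimm_bus *nvdimm_bus,
                            phys_addr_t start, unsigned int len)
  {
      struct nd_poison *pl, *next;
      u64 clr_end = start + len - 1;

      list_for_each_entry_safe(pl, next, &nvdimm_bus->poison_list, list) {
          u64 pl_end = pl->start + pl->length - 1;

          /* no overlap with the cleared range */
          if (pl_end < start || pl->start > clr_end)
              continue;

          /* entry fully covered by the cleared range: remove it */
          if (pl->start >= start && pl_end <= clr_end) {
              list_del(&pl->list);
              kfree(pl);
              continue;
          }

          /* a complete implementation would trim or split partial
           * overlaps so the remaining extents stay on the list */
      }
  }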
Fixes: 37b137f ("nfit, libnvdimm: allow an ARS scrub to be triggered on demand") Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On ARM/ARM64 architectures, PCI IO ports are emulated through memory mapped
IO, by reserving a chunk of virtual address space starting at PCI_IOBASE
and by mapping the PCI host bridge's memory address space driving PCI IO
cycles to it.
PCI host bridge drivers that enable downstream PCI IO cycles map the host
bridge memory address responding to PCI IO cycles to the fixed virtual
address space through the pci_remap_iospace() API.
This means that if the pci_remap_iospace() function fails, the
corresponding host bridge PCI IO resource must be considered invalid, in
that there is no way for the kernel to actually drive PCI IO transactions
if the memory addresses responding to PCI IO cycles cannot be mapped into
the CPU virtual address space.
The PCI tegra host bridge driver adds the PCI IO resource retrieved from
firmware to the host bridge resource windows even if the
pci_remap_iospace() call fails; this is an actual bug in that the PCI host
bridge would consider the PCI IO resource valid (and possibly assign it to
downstream devices) even if the kernel was not able to map the PCI host
bridge memory address driving IO cycles to the CPU virtual address space (i.e.
pci_remap_iospace() failures).
Add a pci_remap_iospace() failure path to the PCI host bridge driver
and do not add the corresponding PCI host bridge PCI IO resources
retrieved through firmware when the pci_remap_iospace() call fails,
fixing the issue.
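A sketch of the corrected setup path, assuming the tegra driver's
pcie->pio / pcie->io resources and the ARM pci_sys_data plumbing
(illustrative, not a verbatim copy of drivers/pci/host/pci-tegra.c):

  static int tegra_pcie_setup(int nr, struct pci_sys_data *sys)
  {
      struct tegra_pcie *pcie = sys_to_pcie(sys);
      int err;

      /*
       * Only advertise the IO window to the PCI core if the memory
       * range driving IO cycles could actually be mapped at PCI_IOBASE;
       * otherwise the resource must not reach downstream devices.
       */
      err = pci_remap_iospace(&pcie->pio, pcie->io.start);
      if (!err)
          pci_add_resource_offset(&sys->resources, &pcie->pio,
                                  sys->io_offset);

      /* ... memory, prefetch and bus resources are added as before ... */

      return 1;
  }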
Fixes: e6e9f471f5fe ("PCI: tegra: Use generic pci_remap_iospace() rather than ARM32-specific one") Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Thierry Reding <treding@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On ARM/ARM64 architectures, PCI IO ports are emulated through memory mapped
IO, by reserving a chunk of virtual address space starting at PCI_IOBASE
and by mapping the PCI host bridge's memory address space driving PCI IO
cycles to it.
PCI host bridge drivers that enable downstream PCI IO cycles map the host
bridge memory address responding to PCI IO cycles to the fixed virtual
address space through the pci_remap_iospace() API.
This means that if the pci_remap_iospace() function fails, the
corresponding host bridge PCI IO resource must be considered invalid, in
that there is no way for the kernel to actually drive PCI IO transactions
if the memory addresses responding to PCI IO cycles cannot be mapped into
the CPU virtual address space.
The PCI designware host bridge driver does not remove the PCI IO resource
from the host bridge resource windows if the pci_remap_iospace() call
fails; this is an actual bug in that the PCI host bridge would consider the
PCI IO resource valid (and possibly assign it to downstream devices) even
if the kernel was not able to map the PCI host bridge memory address
driving IO cycles to the CPU virtual address space (i.e. pci_remap_iospace()
failures).
Fix the PCI host bridge driver's pci_remap_iospace() failure path by
destroying the PCI host bridge PCI IO resources retrieved through
firmware when the pci_remap_iospace() call fails, thereby preventing
the kernel from adding the respective PCI IO resource to the list of
valid PCI host bridge resources and fixing the issue.
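A sketch of the shared fix pattern, as applied in dw_pcie_host_init()
and, with the obvious renaming, in the versatile, generic and aardvark
fixes that follow (win, tmp, res and pp->io_base are assumed to match
the designware driver's existing local variables):

  resource_list_for_each_entry_safe(win, tmp, &res) {
      switch (resource_type(win->res)) {
      case IORESOURCE_IO:
          ret = pci_remap_iospace(win->res, pp->io_base);
          if (ret) {
              dev_warn(pp->dev,
                       "error %d: failed to map resource %pR\n",
                       ret, win->res);
              /* drop the window so it never reaches the PCI core */
              resource_list_destroy_entry(win);
          }
          break;
      default:
          /* other window types are handled as before */
          break;
      }
  }

The _safe iterator is what allows the IO entry to be destroyed while
the resource list is being walked.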
Fixes: cbce7900598c ("PCI: designware: Make driver arch-agnostic") Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Jingoo Han <jingoohan1@gmail.com> CC: Pratyush Anand <pratyush.anand@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On ARM/ARM64 architectures, PCI IO ports are emulated through memory mapped
IO, by reserving a chunk of virtual address space starting at PCI_IOBASE
and by mapping the PCI host bridge's memory address space driving PCI IO
cycles to it.
PCI host bridge drivers that enable downstream PCI IO cycles map the host
bridge memory address responding to PCI IO cycles to the fixed virtual
address space through the pci_remap_iospace() API.
This means that if the pci_remap_iospace() function fails, the
corresponding host bridge PCI IO resource must be considered invalid, in
that there is no way for the kernel to actually drive PCI IO transactions
if the memory addresses responding to PCI IO cycles cannot be mapped into
the CPU virtual address space.
The PCI versatile host bridge driver does not remove the PCI IO resource
from the host bridge resource windows if the pci_remap_iospace() call
fails; this is an actual bug in that the PCI host bridge would consider the
PCI IO resource valid (and possibly assign it to downstream devices) even
if the kernel was not able to map the PCI host bridge memory address
driving IO cycles to the CPU virtual address space (i.e. pci_remap_iospace()
failures).
Fix the PCI host bridge driver's pci_remap_iospace() failure path by
destroying the PCI host bridge PCI IO resources retrieved through
firmware when the pci_remap_iospace() call fails, thereby preventing
the kernel from adding the respective PCI IO resource to the list of
valid PCI host bridge resources and fixing the issue.
Fixes: b7e78170efd4 ("PCI: versatile: Add DT-based ARM Versatile PB PCIe host driver") Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On ARM/ARM64 architectures, PCI IO ports are emulated through memory mapped
IO, by reserving a chunk of virtual address space starting at PCI_IOBASE
and by mapping the PCI host bridge's memory address space driving PCI IO
cycles to it.
PCI host bridge drivers that enable downstream PCI IO cycles map the host
bridge memory address responding to PCI IO cycles to the fixed virtual
address space through the pci_remap_iospace() API.
This means that if the pci_remap_iospace() function fails, the
corresponding host bridge PCI IO resource must be considered invalid, in
that there is no way for the kernel to actually drive PCI IO transactions
if the memory addresses responding to PCI IO cycles cannot be mapped into
the CPU virtual address space.
The PCI common host bridge driver does not remove the PCI IO resource from
the host bridge resource windows if the pci_remap_iospace() call fails;
this is an actual bug in that the PCI host bridge would consider the PCI IO
resource valid (and possibly assign it to downstream devices) even if the
kernel was not able to map the PCI host bridge memory address driving IO
cycles to the CPU virtual address space (i.e. pci_remap_iospace() failures).
Fix the PCI host bridge driver's pci_remap_iospace() failure path by
destroying the PCI host bridge PCI IO resources retrieved through
firmware when the pci_remap_iospace() call fails, thereby preventing
the kernel from adding the respective PCI IO resource to the list of
valid PCI host bridge resources and fixing the issue.
Fixes: 4e64dbe226e7 ("PCI: generic: Expose pci_host_common_probe() for use by other drivers") Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On ARM/ARM64 architectures, PCI IO ports are emulated through memory mapped
IO, by reserving a chunk of virtual address space starting at PCI_IOBASE
and by mapping the PCI host bridge's memory address space driving PCI IO
cycles to it.
PCI host bridge drivers that enable downstream PCI IO cycles map the host
bridge memory address responding to PCI IO cycles to the fixed virtual
address space through the pci_remap_iospace() API.
This means that if the pci_remap_iospace() function fails, the
corresponding host bridge PCI IO resource must be considered invalid, in
that there is no way for the kernel to actually drive PCI IO transactions
if the memory addresses responding to PCI IO cycles cannot be mapped into
the CPU virtual address space.
The PCI aardvark host bridge driver does not remove the PCI IO resource
from the host bridge resource windows if the pci_remap_iospace() call
fails; this is an actual bug in that the PCI host bridge would consider the
PCI IO resource valid (and possibly assign it to downstream devices) even
if the kernel was not able to map the PCI host bridge memory address
driving IO cycles to the CPU virtual address space (i.e. pci_remap_iospace()
failures).
Fix the PCI host bridge driver's pci_remap_iospace() failure path by
destroying the PCI host bridge PCI IO resources retrieved through
firmware when the pci_remap_iospace() call fails, thereby preventing
the kernel from adding the respective PCI IO resource to the list of
valid PCI host bridge resources and fixing the issue.