If some error happens in NCP_IOC_SETROOT ioctl, the appropriate error
return value is then (in most cases) just overwritten before we return.
This can result in reporting success to userspace although error happened.
This bug was introduced by commit 2e54eb96e2c8 ("BKL: Remove BKL from
ncpfs"). Propagate the errors correctly.
Fixes: 2e54eb96e2c80 ("BKL: Remove BKL from ncpfs") Signed-off-by: Jan Kara <jack@suse.cz> Cc: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This is indeed because of an uninitialized kobject for blk_mq_ctx.
The blk_mq_ctx kobjects are initialized in blk_mq_sysfs_init(), but it
goes loop over hctx_for_each_ctx(), i.e. it initializes only for
online CPUs. Thus, when a CPU is hotplugged, the ctx for the newly
onlined CPU is registered without initialization.
This patch fixes the issue by initializing the all ctx kobjects
belonging to each queue.
- Disables the F00F bug workaround warning. There is no F00F bug
workaround any more because Linux's standard IDT handling already
works around the F00F bug, but the warning still exists. This
is only cosmetic, and, in any event, there is no such thing as
KVM on a CPU with the F00F bug.
- Disables 32-bit APM BIOS detection. On a KVM paravirt system,
there should be no APM BIOS anyway.
- Disables tboot. I think that the tboot code should check the
CPUID hypervisor bit directly if it matters.
- paravirt_enabled disables espfix32. espfix32 should *not* be
disabled under KVM paravirt.
The last point is the purpose of this patch. It fixes a leak of the
high 16 bits of the kernel stack address on 32-bit KVM paravirt
guests. Fixes CVE-2014-8134.
Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Apparently stuff works that way on those machines.
I agree with Chris' concern that this is a bit risky but imo worth a
shot in -next just for fun. Afaics all these machines have the pci
resources allocated like that by the BIOS, so I suspect that it's all
ok.
Generalize id_map_mutex so it can be used for more state of a user namespace.
Reviewed-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
If you did not create the user namespace and are allowed
to write to uid_map or gid_map you should already have the necessary
privilege in the parent user namespace to establish any mapping
you want so this will not affect userspace in practice.
Limiting unprivileged uid mapping establishment to the creator of the
user namespace makes it easier to verify all credentials obtained with
the uid mapping can be obtained without the uid mapping without
privilege.
Limiting unprivileged gid mapping establishment (which is temporarily
absent) to the creator of the user namespace also ensures that the
combination of uid and gid can already be obtained without privilege.
This is part of the fix for CVE-2014-8989.
Reviewed-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
setresuid allows the euid to be set to any of uid, euid, suid, and
fsuid. Therefor it is safe to allow an unprivileged user to map
their euid and use CAP_SETUID privileged with exactly that uid,
as no new credentials can be obtained.
I can not find a combination of existing system calls that allows setting
uid, euid, suid, and fsuid from the fsuid making the previous use
of fsuid for allowing unprivileged mappings a bug.
This is part of a fix for CVE-2014-8989.
Reviewed-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
As any gid mapping will allow and must allow for backwards
compatibility dropping groups don't allow any gid mappings to be
established without CAP_SETGID in the parent user namespace.
For a small class of applications this change breaks userspace
and removes useful functionality. This small class of applications
includes tools/testing/selftests/mount/unprivilged-remount-test.c
Most of the removed functionality will be added back with the addition
of a one way knob to disable setgroups. Once setgroups is disabled
setting the gid_map becomes as safe as setting the uid_map.
For more common applications that set the uid_map and the gid_map
with privilege this change will have no affect.
This is part of a fix for CVE-2014-8989.
Reviewed-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
setgroups is unique in not needing a valid mapping before it can be called,
in the case of setgroups(0, NULL) which drops all supplemental groups.
The design of the user namespace assumes that CAP_SETGID can not actually
be used until a gid mapping is established. Therefore add a helper function
to see if the user namespace gid mapping has been established and call
that function in the setgroups permission check.
This is part of the fix for CVE-2014-8989, being able to drop groups
without privilege using user namespaces.
Reviewed-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Fix a bug where nfsd4_encode_components_esc() incorrectly calculates the
length of server array in fs_location4--note that it is a count of the
number of array elements, not a length in bytes.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: 082d4bd72a45 (nfsd4: "backfill" using write_bytes_to_xdr_buf) Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Fix a bug where nfsd4_encode_components_esc() includes the esc_end char as
an additional string encoding.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: e7a0444aef4a "nfsd: add IPv6 addr escaping to fs_location hosts" Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Bugs similar to the one in acbbe6fbb240 (kcmp: fix standard comparison
bug) are in rich supply.
In this variant, the problem is that struct xdr_netobj::len has type
unsigned int, so the expression o1->len - o2->len _also_ has type
unsigned int; it has completely well-defined semantics, and the result
is some non-negative integer, which is always representable in a long
long. But this means that if the conditional triggers, we are
guaranteed to return a positive value from compare_blob.
In this case it could be fixed by
- res = o1->len - o2->len;
+ res = (long long)o1->len - (long long)o2->len;
but I'd rather eliminate the usually broken 'return a - b;' idiom.
Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
What we need is the following two guarantees:
* Any thread that observes the effect of the test_and_set_bit() by
__bt_get_word() also observes the preceding addition of 'current'
to the appropriate wait list. This is guaranteed by the semantics
of the spin_unlock() operation performed by prepare_and_wait().
Hence the conversion of test_and_set_bit_lock() into
test_and_set_bit().
* The wait lists are examined by bt_clear() after the tag bit has
been cleared. clear_bit_unlock() guarantees that any thread that
observes that the bit has been cleared also observes the store
operations preceding clear_bit_unlock(). However,
clear_bit_unlock() does not prevent that the wait lists are examined
before that the tag bit is cleared. Hence the addition of a memory
barrier between clear_bit() and the wait list examination.
Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Robert Elliott <elliott@hp.com> Cc: Ming Lei <ming.lei@canonical.com> Cc: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
If __bt_get_word() is called with last_tag != 0, if the first
find_next_zero_bit() fails, if after wrap-around the
test_and_set_bit() call fails and find_next_zero_bit() succeeds,
if the next test_and_set_bit() call fails and subsequently
find_next_zero_bit() does not find a zero bit, then another
wrap-around will occur. Avoid this by introducing an additional
local variable.
Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Robert Elliott <elliott@hp.com> Cc: Ming Lei <ming.lei@canonical.com> Cc: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
blk-mq users are allowed to free the memory request_queue.tag_set
points at after blk_cleanup_queue() has finished but before
blk_release_queue() has started. This can happen e.g. in the SCSI
core. The SCSI core namely embeds the tag_set structure in a SCSI
host structure. The SCSI host structure is freed by
scsi_host_dev_release(). This function is called after
blk_cleanup_queue() finished but can be called before
blk_release_queue().
This means that it is not safe to access request_queue.tag_set from
inside blk_release_queue(). Hence remove the blk_sync_queue() call
from blk_release_queue(). This call is not necessary - outstanding
requests must have finished before blk_release_queue() is
called. Additionally, move the blk_mq_free_queue() call from
blk_release_queue() to blk_cleanup_queue() to avoid that struct
request_queue.tag_set gets accessed after it has been freed.
This patch avoids that the following kernel oops can be triggered
when deleting a SCSI host for which scsi-mq was enabled:
Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Robert Elliott <elliott@hp.com> Cc: Ming Lei <ming.lei@canonical.com> Cc: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
On MST systems the monitors don't appear when we set the fb up,
but plymouth opens the drm device and holds it open while they
come up, when plymouth finishes and lastclose gets called we
don't do the delayed fb probe, so the monitor never appears on the
console.
Fix this by moving the delayed checking into the mode restore.
v2: Daniel suggested that ->delayed_hotplug is set under
the mode_config mutex, so we should check it under that as
well, while we are in the area.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
At least on two MST devices I've tested with, when
they are link training downstream, they are totally
unable to handle aux ch msgs, so they defer like nuts.
I tried 16, it wasn't enough, 32 seems better.
This fixes one Dell 4k monitor and one of the
MST hubs.
v1.1: fixup comment (Tom).
Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In (6468276 i2c: designware: make SCL and SDA falling time
configurable) new device tree properties were added for setting the
falling time of SDA and SCL. The device tree bindings doc had a typo
in it: it forgot the "-ns" suffix for both properies in the prose of
the bindings.
I assume this is a typo because:
* The source code includes the "-ns"
* The example in the bindings includes the "-ns".
Fix the typo.
Signed-off-by: Doug Anderson <dianders@chromium.org> Fixes: 6468276b2206 ("i2c: designware: make SCL and SDA falling time configurable") Acked-by: Romain Baeriswyl <romain.baeriswyl@alitech.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When loading encrypted-keys module, if the last check of
aes_get_sizes() in init_encrypted() fails, the driver just returns an
error without unregistering its key type. This results in the stale
entry in the list. In addition to memory leaks, this leads to a kernel
crash when registering a new key type later.
This patch fixes the problem by swapping the calls of aes_get_sizes()
and register_key_type(), and releasing resources properly at the error
paths.
In snd_usbmidi_error_timer(), the driver tries to resubmit MIDI input
URBs to reactivate the MIDI stream, but this causes the error when
some of URBs are still pending like:
This patch sets the correct reverse sequence order to the instructions
set to run, when any failure occurs during the initialization steps.
It also adds the missing unregistration call of the can device if the
failure appears after having been registered.
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patchs fixes a misplaced call to memset() that fills the request
buffer with 0. The problem was with sending PCAN_USBPRO_REQ_FCT
requests, the content set by the caller was thus lost.
With this patch, the memory area is zeroed only when requesting info
from the device.
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The rule is simple. Don't allow anything that wouldn't be allowed
without unprivileged mappings.
It was previously overlooked that establishing gid mappings would
allow dropping groups and potentially gaining permission to files and
directories that had lesser permissions for a specific group than for
all other users.
This is the rule needed to fix CVE-2014-8989 and prevent any other
security issues with new_idmap_permitted.
The reason for this rule is that the unix permission model is old and
there are programs out there somewhere that take advantage of every
little corner of it. So allowing a uid or gid mapping to be
established without privielge that would allow anything that would not
be allowed without that mapping will result in expectations from some
code somewhere being violated. Violated expectations about the
behavior of the OS is a long way to say a security issue.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Today there are 3 instances of setgroups and due to an oversight their
permission checking has diverged. Add a common function so that
they may all share the same permission checking code.
This corrects the current oversight in the current permission checks
and adds a helper to avoid this in the future.
A user namespace security fix will update this new helper, shortly.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This is a bug fix for using physical arch timers when
the arch_timer_use_virtual boolean is false. It restores the
arch_counter_get_cntpct() function after removal in
0d651e4e "clocksource: arch_timer: use virtual counters"
We need this on certain ARMv7 systems which are architected like this:
* The firmware doesn't know and doesn't care about hypervisor mode and
we don't want to add the complexity of hypervisor there.
* The firmware isn't involved in SMP bringup or resume.
* The ARCH timer come up with an uninitialized offset between the
virtual and physical counters. Each core gets a different random
offset.
* The device boots in "Secure SVC" mode.
* Nothing has touched the reset value of CNTHCTL.PL1PCEN or
CNTHCTL.PL1PCTEN (both default to 1 at reset)
One example of such as system is RK3288 where it is much simpler to
use the physical counter since there's nobody managing the offset and
each time a core goes down and comes back up it will get reinitialized
to some other random value.
Fixes: 0d651e4e65e9 ("clocksource: arch_timer: use virtual counters") Signed-off-by: Sonny Rao <sonnyrao@chromium.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The arm and arm64 VDSOs need CP15 access to the architected counter.
If this is unavailable (which is allowed by ARM v7), indicate this by
changing the clocksource name to "arch_mem_counter" before registering
the clocksource.
Suggested by Stephen Boyd.
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The existing MCE code calls flush_tlb hook with IS=0 (single page) resulting
in partial invalidation of TLBs which is not right. This patch fixes
that by passing IS=0xc00 to invalidate whole TLB for successful recovery
from TLB and ERAT errors.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The end timer is used for switching back from repeat code timings when
no repeat codes have been received for a certain amount of time. When
the protocol is changed, the end timer is deleted synchronously with
del_timer_sync(), however this takes place while holding the main spin
lock, and the timer handler also needs to acquire the spin lock.
This opens the possibility of a deadlock on an SMP system if the
protocol is changed just as the repeat timer is expiring. One CPU could
end up in img_ir_set_decoder() holding the lock and waiting for the end
timer to complete, while the other CPU is stuck in the timer handler
spinning on the lock held by the first CPU.
Lockdep also spots a possible lock inversion in the same code, since
img_ir_set_decoder() acquires the img-ir lock before the timer lock, but
the timer handler will try and acquire them the other way around:
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
3.18.0-rc5+ #957 Not tainted
---------------------------------------------------------
swapper/0/0 just changed the state of lock:
(((&hw->end_timer))){+.-...}, at: [<4006ae5c>] _call_timer_fn+0x0/0xfc
but this lock was taken by another, HARDIRQ-safe lock in the past:
(&(&priv->lock)->rlock#2){-.....}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
This is fixed by releasing the main spin lock while performing the
del_timer_sync() call. The timer is prevented from restarting before the
lock is reacquired by a new "stopping" flag which img_ir_handle_data()
checks before updating the timer.
---------------------------------------------------------
swapper/0/0 just changed the state of lock:
(((&hw->end_timer))){+.-...}, at: [<4006ae5c>] _call_timer_fn+0x0/0xfc
but this lock was taken by another, HARDIRQ-safe lock in the past:
(&(&priv->lock)->rlock#2){-.....}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(((&hw->end_timer)));
local_irq_disable();
lock(&(&priv->lock)->rlock#2);
lock(((&hw->end_timer)));
<Interrupt>
lock(&(&priv->lock)->rlock#2);
*** DEADLOCK ***
This is fixed by releasing the main spin lock while performing the
del_timer_sync() call. The timer is prevented from restarting before the
lock is reacquired by a new "stopping" flag which img_ir_handle_data()
checks before updating the timer.
Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Sifan Naeem <sifan.naeem@imgtec.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
A problem was found on Polaris where if the unit it booted via the power
button on the infrared remote then the next button press on the remote
would return the key code used to power on the unit.
The sequence is:
- The polaris powered off but with the powerdown controller (PDC) block
still powered.
- Press power key on remote, IR block receives the key.
- Kernel starts, IR code is in IMG_IR_DATA_x but neither IMG_IR_RXDVAL
or IMG_IR_RXDVALD2 are set.
- Wait any amount of time.
- Press any key.
- IMG_IR_RXDVAL or IMG_IR_RXDVALD2 is set but IMG_IR_DATA_x is
unchanged since the powerup key data was never read.
This is worked around by always reading the IMG_IR_DATA_x in
img_ir_set_decoder(), rather than only when the IMG_IR_RXDVAL or
IMG_IR_RXDVALD2 bit is set.
Signed-off-by: Dylan Rajaratnam <dylan.rajaratnam@imgtec.com> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
On x86 truncation cannot occur because config XEN depends on X86_64 ||
(X86_32 && X86_PAE).
On ARM truncation can occur without CONFIG_ARM_LPAE, when the dma
operation involves foreign grants. However in that case the physical
address returned by xen_bus_to_phys is actually invalid (there is no mfn
to pfn tracking for foreign grants on ARM) and it is not used.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Check the that ring we are using for copies is functional
rather than the GFX ring. On newer asics we use the DMA
ring for bo moves.
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The commit "vmwgfx: Rework fence event action" introduced a number of bugs
that are fixed with this commit:
a) A forgotten return stateemnt.
b) An if statement with identical branches.
Reported-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The function vmw_master_check() might return -ERESTARTSYS if there is a
signal pending, indicating that the IOCTL should be rerun, potentially from
user-space. At that point we shouldn't print out an error message since that
is not an error condition. In short, avoid bloating the kernel log when a
process refuses to die on SIGTERM.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In all likelihood we will do a few hundred errnoneous register
operations if we do a single invalid register access whilst the device
is suspended. As each instance causes a WARN, this floods the system
logs and can make the system unresponsive.
The warning was first introduced in
commit b2ec142cb0101f298f8e091c7d75b1ec5b809b65
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date: Fri Feb 21 13:52:25 2014 -0300
drm/i915: call assert_device_not_suspended at gen6_force_wake_work
and despite the claims the WARN is still encountered in the wild today.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
It is critical that fetch_block() and handle_stripe_dirtying()
are consistent in their analysis of what needs to be loaded.
Otherwise raid5 can wait forever for a block that won't be loaded.
Currently when writing to a RAID5 that is resyncing, to a location
beyond the resync offset, handle_stripe_dirtying chooses a
reconstruct-write cycle, but fetch_block() assumes a
read-modify-write, and a lockup can happen.
So treat that case just like RAID6, just as we do in
handle_stripe_dirtying. RAID6 always does reconstruct-write.
This bug was introduced when the behaviour of handle_stripe_dirtying
was changed in 3.7, so the patch is suitable for any kernel since,
though it will need careful merging for some versions.
Fixes: a7854487cd7128a30a7f4f5259de9f67d5efb95f Reported-by: Henry Cai <henryplusplus@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Forced unmount affects not just the mount namespace but the underlying
superblock as well. Restrict forced unmount to the global root user
for now. Otherwise it becomes possible a user in a less privileged
mount namespace to force the shutdown of a superblock of a filesystem
in a more privileged mount namespace, allowing a DOS attack on root.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Use memzero_explicit to cleanup sensitive data allocated on stack
to prevent the compiler from optimizing and removing memset() calls.
Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
There's an off-by-one bug in function __domain_mapping(), which may
trigger the BUG_ON(nr_pages < lvl_pages) when
(nr_pages + 1) & superpage_mask == 0
The issue was introduced by commit 9051aa0268dc "intel-iommu: Combine
domain_pfn_mapping() and domain_sg_mapping()", which sets sg_res to
"nr_pages + 1" to avoid some of the 'sg_res==0' code paths.
It's safe to remove extra "+1" because sg_res is only used to calculate
page size now.
Reported-And-Tested-by: Sudeep Dutt <sudeep.dutt@intel.com> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Acked-By: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Since the commit below, iwldvm sends the FLUSH command to
the firmware. All the devices that use iwldvm have a
firmware that expects the _v3 version of this command,
besides 5150.
5150's latest available firmware still expects a _v2 version
of the FLUSH command.
This means that since the commit below, we had a mismatch for
this specific device only.
This mismatch led to the NMI below:
This was reported here:
https://bugzilla.kernel.org/show_bug.cgi?id=88961
Fixes: a0855054e59b ("iwlwifi: dvm: drop non VO frames when flushing") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Like with ath9k, ath5k queues also need to be ordered by priority.
queue_info->tqi_subtype already contains the correct index, so use it
instead of relying on the order of ath5k_hw_setup_tx_queue calls.
Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Hardware queues are ordered by priority. Use queue index 0 for BK, which
has lower priority than BE.
Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The driver passes the desired hardware queue index for a WMM data queue
in qinfo->tqi_subtype. This was ignored in ath9k_hw_setuptxqueue, which
instead relied on the order in which the function is called.
Reported-by: Hubert Feurstein <h.feurstein@gmail.com> Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When dm-bufio sets out to use the bio built into a struct dm_buffer to
issue an IO, it needs to call bio_reset after it's done with the bio
so that we can free things attached to the bio such as the integrity
payload. Therefore, inject our own endio callback to take care of
the bio_reset after calling submit_io's end_io callback.
Test case:
1. modprobe scsi_debug delay=0 dif=1 dix=199 ato=1 dev_size_mb=300
2. Set up a dm-bufio client, e.g. dm-verity, on the scsi_debug device
3. Repeatedly read metadata and watch kmalloc-192 leak!
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
If the incoming bio is a WRITE and completely covers a block then we
don't bother to do any copying for a promotion operation. Once this is
done the cache block and origin block will be different, so we need to
set it to 'dirty'.
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Overwrite causes the cache block and origin blocks to diverge, which
is only allowed in writeback mode.
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit 19e2ad6a09f0c06dbca19c98e5f4584269d913dd ("n_tty: Remove overflow
tests from receive_buf() path") moved the increment of read_head into
the arguments list of read_buf_addr(). Function calls represent a
sequence point in C. Therefore read_head is incremented before the
character c is placed in the buffer. Since the circular read buffer is
a lock-less design since commit 6d76bd2618535c581f1673047b8341fd291abc67
("n_tty: Make N_TTY ldisc receive path lockless"), this creates a race
condition that leads to communication errors.
This patch modifies the code to increment read_head _after_ the data
is placed in the buffer and thus fixes the race for non-SMP machines.
To fix the problem for SMP machines, memory barriers must be added in
a separate patch.
Signed-off-by: Christian Riesch <christian.riesch@omicron.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit f1d2736c8156 (mmc: dw_mmc: control card read threshold) added
dw_mci_ctrl_rd_thld() with an unconditional write to the CDTHRCTL
register at offset 0x100. However before version 240a, the FIFO region
started at 0x100, so the write messes with the FIFO and completely
breaks the driver.
If the version id < 240A, return early from dw_mci_ctl_rd_thld() so as
not to hit this problem.
Fixes: f1d2736c8156 (mmc: dw_mmc: control card read threshold) Signed-off-by: James Hogan <james.hogan@imgtec.com> Acked-by: Jaehoon Chung <jh80.chung@samsung.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch adds waiting until transmit buffer and shifter will be empty
before clock disabling.
Without this fix it's possible to have clock disabled while data was
not transmited yet, which causes unproper state of TX line and problems
in following data transfers.
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The mcb_device_id table is supposed to be zero-terminated.
Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Some boards with TC6393XB chip require full state restore during system
resume thanks to chip's VCC being cut off during suspend (Sharp SL-6000
tosa is one of them). Failing to do so would result in ohci Oops on
resume due to internal memory contentes being changed. Fail ohci suspend
on tc6393xb is full state restore is required.
Recommended workaround is to unbind tmio-ohci driver before suspend and
rebind it after resume.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit e7cd1d1eb16f ("mfd: twl4030-power: Add generic reset
configuration") accidentally removed the compatible flag for
"ti,twl4030-power" that should be there as documented in the
binding.
If "ti,twl4030-power" only the poweroff configuration is done
by the driver.
Fixes: e7cd1d1eb16f ("mfd: twl4030-power: Add generic reset configuration") Reported-by: "Dr. H. Nikolaus Schaller" <hns@goldelico.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Current driver uses a common buffer for reading reports either
synchronously in i2c_hid_get_raw_report() and asynchronously in
the interrupt handler.
There is race condition if an interrupt arrives immediately after
the report is received in i2c_hid_get_raw_report(); the common
buffer is modified by the interrupt handler with the new report
and then i2c_hid_get_raw_report() proceed using wrong data.
Fix it by using a separate buffers for synchronous reports.
Signed-off-by: Jean-Baptiste Maneyrol <jmaneyrol@invensense.com>
[Antonio Borneo: cleanup, rebase to v3.17, submit mainline] Signed-off-by: Antonio Borneo <borneo.antonio@gmail.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
We've got a bug report at disconnecting a Webcam, where the kernel
spews warnings like below:
WARNING: CPU: 0 PID: 8385 at ../fs/sysfs/group.c:219 sysfs_remove_group+0x87/0x90()
sysfs group c0b2350c not found for kobject 'event3'
CPU: 0 PID: 8385 Comm: queue2:src Not tainted 3.16.2-1.gdcee397-default #1
Hardware name: ASUSTeK Computer INC. A7N8X-E/A7N8X-E, BIOS ASUS A7N8X-E Deluxe ACPI BIOS Rev 1013 11/12/2004 c08d0705ddc75cbcc0718c5bddc75cccc024b654c08c6d44ddc75ce8000020c1 c08d0705000000dbc03d1ec7c03d1ec70000000900000000c0b2350cd62c9064 ddc75cd4c024b6a300000009ddc75cccc08c6d44ddc75ce8ddc75cfcc03d1ec7
Call Trace:
[<c0205ba6>] try_stack_unwind+0x156/0x170
[<c02046f3>] dump_trace+0x53/0x180
[<c0205c06>] show_trace_log_lvl+0x46/0x50
[<c0204871>] show_stack_log_lvl+0x51/0xe0
[<c0205c67>] show_stack+0x27/0x50
[<c0718c5b>] dump_stack+0x3e/0x4e
[<c024b654>] warn_slowpath_common+0x84/0xa0
[<c024b6a3>] warn_slowpath_fmt+0x33/0x40
[<c03d1ec7>] sysfs_remove_group+0x87/0x90
[<c05a2c54>] device_del+0x34/0x180
[<c05e3989>] evdev_disconnect+0x19/0x50
[<c05e06fa>] __input_unregister_device+0x9a/0x140
[<c05e0845>] input_unregister_device+0x45/0x80
[<f854b1d6>] uvc_delete+0x26/0x110 [uvcvideo]
[<f84d66f8>] v4l2_device_release+0x98/0xc0 [videodev]
[<c05a25bb>] device_release+0x2b/0x90
[<c04ad8bf>] kobject_cleanup+0x6f/0x1a0
[<f84d5453>] v4l2_release+0x43/0x70 [videodev]
[<c0372f31>] __fput+0xb1/0x1b0
[<c02650c1>] task_work_run+0x91/0xb0
[<c024d845>] do_exit+0x265/0x910
[<c024df64>] do_group_exit+0x34/0xa0
[<c025a76f>] get_signal_to_deliver+0x17f/0x590
[<c0201b6a>] do_signal+0x3a/0x960
[<c02024f7>] do_notify_resume+0x67/0x90
[<c071ebb5>] work_notifysig+0x30/0x3b
[<b7739e60>] 0xb7739e5f
---[ end trace b1e56095a485b631 ]---
The cause is that uvc_status_cleanup() is called after usb_put_*() in
uvc_delete(). usb_put_*() removes the sysfs parent and eventually
removes the children recursively, so the later device_del() can't find
its sysfs. The fix is simply rearrange the call orders in
uvc_delete() so that the child is removed before the parent.
nfs4_layoutget_release() drops layout hdr refcnt. Grab the refcnt
early so that it is safe to call .release in case nfs4_alloc_pages
fails.
Signed-off-by: Peng Tao <tao.peng@primarydata.com> Fixes: a47970ff78147 ("NFSv4.1: Hold reference to layout hdr in layoutget") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
We currently use num_possible_cpus(), but that breaks on sparc64 where
the CPU ID space is discontig. Use nr_cpu_ids as the highest CPU ID
instead, so we don't end up reading from invalid memory.
Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit 5fe5b767dc6f ("ASoC: dapm: Do not pretend to support controls for non
mixer/mux widgets") revealed ill-defined control in a route between
"STENL Mux" and DACs in max98090.c:
max98090 i2c-193C9890:00: Control not supported for path STENL Mux -> [NULL] -> DACL
max98090 i2c-193C9890:00: ASoC: no dapm match for STENL Mux --> NULL --> DACL
max98090 i2c-193C9890:00: ASoC: Failed to add route STENL Mux -> NULL -> DACL
max98090 i2c-193C9890:00: Control not supported for path STENL Mux -> [NULL] -> DACR
max98090 i2c-193C9890:00: ASoC: no dapm match for STENL Mux --> NULL --> DACR
max98090 i2c-193C9890:00: ASoC: Failed to add route STENL Mux -> NULL -> DACR
Since there is no control between "STENL Mux" and DACs the control name must
be NULL not "NULL".
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Enabling the hardware I/O coherency on Armada 370, Armada 375, Armada
38x and Armada XP requires a certain number of conditions:
- On Armada 370, the cache policy must be set to write-allocate.
- On Armada 375, 38x and XP, the cache policy must be set to
write-allocate, the pages must be mapped with the shareable
attribute, and the SMP bit must be set
Currently, on Armada XP, when CONFIG_SMP is enabled, those conditions
are met. However, when Armada XP is used in a !CONFIG_SMP kernel, none
of these conditions are met. With Armada 370, the situation is worse:
since the processor is single core, regardless of whether CONFIG_SMP
or !CONFIG_SMP is used, the cache policy will be set to write-back by
the kernel and not write-allocate.
Since solving this problem turns out to be quite complicated, and we
don't want to let users with a mainline kernel known to have
infrequent but existing data corruptions, this commit proposes to
simply disable hardware I/O coherency in situations where it is known
not to work.
And basically, the is_smp() function of the kernel tells us whether it
is OK to enable hardware I/O coherency or not, so this commit slightly
refactors the coherency_type() function to return
COHERENCY_FABRIC_TYPE_NONE when is_smp() is false, or the appropriate
type of the coherency fabric in the other case.
Thanks to this, the I/O coherency fabric will no longer be used at all
in !CONFIG_SMP configurations. It will continue to be used in
CONFIG_SMP configurations on Armada XP, Armada 375 and Armada 38x
(which are multiple cores processors), but will no longer be used on
Armada 370 (which is a single core processor).
In the process, it simplifies the implementation of the
coherency_type() function, and adds a missing call to of_node_put().
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Fixes: e60304f8cb7bb545e79fe62d9b9762460c254ec2 ("arm: mvebu: Add hardware I/O Coherency support") Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Link: https://lkml.kernel.org/r/1415871540-20302-3-git-send-email-thomas.petazzoni@free-electrons.com Signed-off-by: Jason Cooper <jason@lakedaemon.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
We use the modified list to keep track of which extents have been modified so we
know which ones are candidates for logging at fsync() time. Newly modified
extents are added to the list at modification time, around the same time the
ordered extent is created. We do this so that we don't have to wait for ordered
extents to complete before we know what we need to log. The problem is when
something like this happens
log extent 0-4k on inode 1
copy csum for 0-4k from ordered extent into log
sync log
commit transaction
log some other extent on inode 1
ordered extent for 0-4k completes and adds itself onto modified list again
log changed extents
see ordered extent for 0-4k has already been logged
at this point we assume the csum has been copied
sync log
crash
On replay we will see the extent 0-4k in the log, drop the original 0-4k extent
which is the same one that we are replaying which also drops the csum, and then
we won't find the csum in the log for that bytenr. This of course causes us to
have errors about not having csums for certain ranges of our inode. So remove
the modified list manipulation in unpin_extent_cache, any modified extents
should have been added well before now, and we don't want them re-logged. This
fixes my test that I could reliably reproduce this problem with. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Liu Bo pointed out that my previous fix would lose the generation update in the
scenario I described. It is actually much worse than that, we could lose the
entire extent if we lose power right after the transaction commits. Consider
the following
write extent 0-4k
log extent in log tree
commit transaction
< power fail happens here
ordered extent completes
We would lose the 0-4k extent because it hasn't updated the actual fs tree, and
the transaction commit will reset the log so it isn't replayed. If we lose
power before the transaction commit we are save, otherwise we are not.
Fix this by keeping track of all extents we logged in this transaction. Then
when we go to commit the transaction make sure we wait for all of those ordered
extents to complete before proceeding. This will make sure that if we lose
power after the transaction commit we still have our data. This also fixes the
problem of the improperly updated extent generation. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
If we have two fsync()'s race on different subvols one will do all of its work
to get into the log_tree, wait on it's outstanding IO, and then allow the
log_tree to finish it's commit. The problem is we were just free'ing that
subvols logged extents instead of waiting on them, so whoever lost the race
wouldn't really have their data on disk. Fix this by waiting properly instead
of freeing the logged extents. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit 7628083227b6bc4a7e33d7c381d7a4e558424b6b (usb: gadget: at91_udc:
prepare clk before calling enable) added clock preparation in interrupt
context. This is not allowed as it might sleep. Also setting the clock
rate is unsafe to call from there for the same reason. Move clock
preparation and setting clock rate into process context (at91udc_probe).
Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com> Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Felipe Balbi <balbi@ti.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Make sure to check the version field of the firmware header to make sure to
not accidentally try to parse a firmware file with a different layout.
Trying to do so can result in loading invalid firmware code to the device.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Andrew Morton wrote:
> On Wed, 12 Nov 2014 13:08:55 +0900 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> > Andrew Morton wrote:
> > > Poor ttm guys - this is a bit of a trap we set for them.
> >
> > Commit a91576d7916f6cce ("drm/ttm: Pass GFP flags in order to avoid deadlock.")
> > changed to use sc->gfp_mask rather than GFP_KERNEL.
> >
> > - pages_to_free = kmalloc(npages_to_free * sizeof(struct page *),
> > - GFP_KERNEL);
> > + pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp);
> >
> > But this bug is caused by sc->gfp_mask containing some flags which are not
> > in GFP_KERNEL, right? Then, I think
> >
> > - pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp);
> > + pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp & GFP_KERNEL);
> >
> > would hide this bug.
> >
> > But I think we should use GFP_ATOMIC (or drop __GFP_WAIT flag)
>
> Well no - ttm_page_pool_free() should stop calling kmalloc altogether.
> Just do
>
> struct page *pages_to_free[16];
>
> and rework the code to free 16 pages at a time. Easy.
Well, ttm code wants to process 512 pages at a time for performance.
Memory footprint increased by 512 * sizeof(struct page *) buffer is
only 4096 bytes. What about using static buffer like below?
----------
>From d3cb5393c9c8099d6b37e769f78c31af1541fe8c Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Thu, 13 Nov 2014 22:21:54 +0900
Subject: [PATCH] drm/ttm: Avoid memory allocation from shrinker functions.
Commit a91576d7916f6cce ("drm/ttm: Pass GFP flags in order to avoid
deadlock.") caused BUG_ON() due to sc->gfp_mask containing flags
which are not in GFP_KERNEL.
https://bugzilla.kernel.org/show_bug.cgi?id=87891
Changing from sc->gfp_mask to (sc->gfp_mask & GFP_KERNEL) would
avoid the BUG_ON(), but avoiding memory allocation from shrinker
function is better and reliable fix.
Shrinker function is already serialized by global lock, and
clean up function is called after shrinker function is unregistered.
Thus, we can use static buffer when called from shrinker function
and clean up function.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When CONFIG_FRAME_POINTERS are enabled, it is required that the
ftrace_caller and ftrace_regs_caller trampolines set up frame pointers
otherwise a stack trace from a function call wont print the functions
that called the trampoline. This is due to a check in
__save_stack_address():
#ifdef CONFIG_FRAME_POINTER
if (!reliable)
return;
#endif
The "reliable" variable is only set if the function address is equal to
contents of the address before the address the frame pointer register
points to. If the frame pointer is not set up for the ftrace caller
then this will fail the reliable test. It will miss the function that
called the trampoline. Worse yet, if fentry is used (gcc 4.6 and
beyond), it will also miss the parent, as the fentry is called before
the stack frame is set up. That means the bp frame pointer points
to the stack of just before the parent function was called.
Link: http://lkml.kernel.org/r/20141119034829.355440340@goodmis.org Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
We can get here from blkdev_ioctl() -> blkpg_ioctl() -> add_partition()
with a user passed in partno value. If we pass in 0x7fffffff, the
new target in disk_expand_part_tbl() overflows the 'int' and we
access beyond the end of ptbl->part[] and even write to it when we
do the rcu_assign_pointer() to assign the new partition.
Reported-by: David Ramos <daramos@stanford.edu> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch fixes kmemcheck warning in switch_names. The function
switch_names swaps inline names of two dentries. It swaps full arrays
d_iname, no matter how many bytes are really used by the strings. Reading
data beyond string ends results in kmemcheck warning.
We fix the bug by marking both arrays as fully initialized.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Added new device layout "DEVICE_HWI" and also added the USB VID/PID for the
HP lt4112 LTE/HSPA+ Gobi 4G Modem (Huawei me906e)
Signed-off-by: Martin Hauke <mardnh@gmx.de> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit a469abd0f868 (ARM: elf: add new hwcap for identifying atomic
ldrd/strd instructions) introduces HWCAP_ELF for 32-bit ARM
applications. As LPAE is always present on arm64, report the
corresponding compat HWCAP to user space.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit d127e9c ("ARM: tegra: make tegra_resume can work with current and later
chips") removed tegra_get_soc_id macro leaving used cpu register corrupted after
branching to v7_invalidate_l1() and as result causing execution of unintended
code on tegra20. Possibly it was expected that r6 would be SoC id func argument
since common cpu reset handler is setting r6 before branching to tegra_resume(),
but neither tegra20_lp1_reset() nor tegra30_lp1_reset() aren't setting r6
register before jumping to resume function. Fix it by re-adding macro.
Fixes: d127e9c (ARM: tegra: make tegra_resume can work with current and later chips) Reviewed-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The .eh_abort_handler needs to return SUCCESS, FAILED, or
FAST_IO_FAIL. So fixup all callers to adhere to this requirement.
Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit 6ac665c63dca ("PCI: rewrite PCI BAR reading code") masked off
low-order bits from 'l', but not from 'sz'. Both are passed to pci_size(),
which compares 'base == maxbase' to check for read-only BARs. The masking
of 'l' means that comparison will never be 'true', so the check for
read-only BARs no longer works.
Resolve this by also masking off the low-order bits of 'sz' before passing
it into pci_size() as 'maxbase'. With this change, pci_size() will once
again catch the problems that have been encountered to date:
- AGP aperture BAR of AMD-7xx host bridges: if the AGP window is
disabled, this BAR is read-only and read as 0x00000008 [1]
- BARs 0-4 of ALi IDE controllers can be non-zero and read-only [1]
bus_find_device_by_name() acquires a device reference which is never
released. This results in an object leak, which on older kernels
results in failure to release all resources of PCI devices. libvirt
uses drivers_probe to re-attach devices to the host after assignment
and is therefore a common trigger for this leak.
`genwqe_user_vmap()` calls `get_user_pages_fast()` and if the return
value is less than the number of pages requested, it frees the pages and
returns an error (`-EFAULT`). However, it fails to consider a negative
error return value from `get_user_pages_fast()`. In that case, the test
`if (rc < m->nr_pages)` will be false (due to promotion of `rc` to a
large `unsigned int`) and the code will continue on to call
`genwqe_map_pages()` with an invalid list of page pointers. Fix it by
bailing out if `get_user_pages_fast()` returns a negative error value.
Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This happens due to the following race: between 'if (channel->device_obj)' check
in vmbus_process_rescind_offer() and pr_debug() in vmbus_device_unregister() the
device can disappear. Fix the issue by taking an additional reference to the
device before proceeding to vmbus_device_unregister().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
An attempt to fix fcopy on i586 (bc5a5b0 Drivers: hv: util: Properly pack the data
for file copy functionality) led to a regression on x86_64 (and actually didn't fix
i586 breakage). Fcopy messages from Hyper-V host come in the following format:
On x86_64 struct hv_do_fcopy matched this format without ' __attribute__((packed))'
and on i586 adding ' __attribute__((packed))' to it doesn't change anything. Keep
the structure packed and add padding to match re reality. Tested both i586 and x86_64
on Hyper-V Server 2012 R2.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The logic of vfree()'ing vol->upd_buf is tied to vol->updating.
In ubi_start_update() vol->updating is set long before vmalloc()'ing
vol->upd_buf. If we encounter a write failure in ubi_start_update()
before vmalloc() the UBI device release function will try to vfree()
vol->upd_buf because vol->updating is set.
Fix this by allocating vol->upd_buf directly after setting vol->updating.
If the erase worker is unable to erase a PEB it will
free the ubi_wl_entry itself.
The failing ubi_wl_entry must not free()'d again after
do_sync_erase() returns.
Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
some control register changes will flush some aspects of the CPU, e.g.
POP explicitely mentions that for CR9-CR11 "TLBs may be cleared".
Instead of trying to be clever and only flush on specific CRs, let
play safe and flush on all lctl(g) as future machines might define
new bits in CRs. Load control intercept should not happen that often.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
ipte_unlock_siif uses cmpxchg to replace the in-memory data of the ipte
lock together with ACCESS_ONCE for the intial read.
union ipte_control {
unsigned long val;
struct {
unsigned long k : 1;
unsigned long kh : 31;
unsigned long kg : 32;
};
};
[...]
static void ipte_unlock_siif(struct kvm_vcpu *vcpu)
{
union ipte_control old, new, *ic;
ic = &vcpu->kvm->arch.sca->ipte_control;
do {
new = old = ACCESS_ONCE(*ic);
new.kh--;
if (!new.kh)
new.k = 0;
} while (cmpxchg(&ic->val, old.val, new.val) != old.val);
if (!new.kh)
wake_up(&vcpu->kvm->arch.ipte_wq);
}
The new value, is loaded twice from memory with gcc 4.7.2 of
fedora 18, despite the ACCESS_ONCE:
--->
l %r4,0(%r3) <--- load first 32 bit of lock (k and kh) in r4
alfi %r4,2147483647 <--- add -1 to r4
llgtr %r4,%r4 <--- zero out the sign bit of r4
lg %r1,0(%r3) <--- load all 64 bit of lock into new
lgr %r2,%r1 <--- load the same into old
risbg %r1,%r4,1,31,32 <--- shift and insert r4 into the bits 1-31 of
new
llihf %r4,2147483647
ngrk %r4,%r1,%r4
jne aa0 <ipte_unlock+0xf8>
nihh %r1,32767
lgr %r4,%r2
csg %r4,%r1,0(%r3)
cgr %r2,%r4
jne a70 <ipte_unlock+0xc8>
If the memory value changes between the first load (l) and the second
load (lg) we are broken. If that happens VCPU threads will hang
(unkillable) in handle_ipte_interlock.
Andreas Krebbel analyzed this and tracked it down to a compiler bug in
that version:
"while it is not that obvious the C99 standard basically forbids
duplicating the memory access also in that case. For an argumentation of
a similiar case please see:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=22278#c43
For the implementation-defined cases regarding volatile there are some
GCC-specific clarifications which can be found here:
https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html#Volatiles
I've tracked down the problem with a reduced testcase. The problem was
that during a tree level optimization (SRA - scalar replacement of
aggregates) the volatile marker is lost. And an RTL level optimizer (CSE
- common subexpression elimination) then propagated the memory read into
its second use introducing another access to the memory location. So
indeed Christian's suspicion that the union access has something to do
with it is correct (since it triggered the SRA optimization).
This issue has been reported and fixed in the GCC 4.8 development cycle:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145"
This patch replaces the ACCESS_ONCE scheme with a barrier() based scheme
that should work for all supported compilers.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch fixes an issue that the NULL pointer dereference happens
when we uses g_audio driver. Since the g_audio driver will call
usb_ep_disable() in afunc_set_alt() before it calls usb_ep_enable(),
the uep->pipe of renesas usbhs driver will be NULL. So, this patch
adds a condition to avoid the oops.
Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com> Signed-off-by: Takeshi Kihara <takeshi.kihara.df@renesas.com> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Fixes: 2f98382dc (usb: renesas_usbhs: Add Renesas USBHS Gadget) Signed-off-by: Felipe Balbi <balbi@ti.com>
[ luis: backported to 3.16: replaced pipe by uep->pipe ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>