Get rid of the WANT_COMPAT_REG_H test and instead define both the 32-
and 64-bit register offset definitions at the same time with
MIPS{32,64}_ prefixes, then define the existing EF_* names to the
correct definitions for the kernel's bitness.
This patch is a prerequisite of the following bug fix patch.
Signed-off-by: Alex Smith <alex@alex-smith.me.uk> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7451/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Looks like MUSB cable removal can cause wake-up interrupts to
stop working for device tree based booting at least for UART3
even as nothing is dynamically remuxed. This can be fixed by
calling reconfigure_io_chain() for device tree based booting
in hwmod code. Note that we already do that for legacy booting
if the legacy mux is configured.
My guess is that this is related to UART3 and MUSB ULPI
hsusb0_data0 and hsusb0_data1 support for Carkit mode that
somehow affect the configured IO chain for UART3 and require
rearming the wake-up interrupts.
In general, for device tree based booting, pinctrl-single
calls the rearm hook that in turn calls reconfigure_io_chain
so calling reconfigure_io_chain should not be needed from the
hwmod code for other events.
So let's limit the hwmod rearming of iochain only to
HWMOD_FORCE_MSTANDBY where MUSB is currently the only user
of it. If we see other devices needing similar changes we can
add more checks for it.
Cc: Paul Walmsley <paul@pwsan.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Stable_kernel_rules should point submitters of network stable patches to the
netdev_FAQ.txt as requests for stable network patches should go to netdev
first.
The assumption that sizeof(long) >= sizeof(resource_size_t) can lead to
truncation of the PCI resource address, meaning this driver didn't work
on 32-bit systems with 64-bit PCI adressing ranges.
Signed-off-by: Ben Collins <ben.c@servergy.com> Acked-by: Sumit Saxena <sumit.saxena@lsi.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
In non-leading connection login, iscsi_login_non_zero_tsih_s1() calls
iscsi_change_param_value() with the buffer it uses to hold the login
PDU, not a temporary buffer. This leads to the login header getting
corrupted and login failing for non-leading connections in MC/S.
Fix this by adding a wrapper iscsi_change_param_sprintf() that handles
the temporary buffer itself to avoid confusion. Also handle sending a
reject in case of failure in the wrapper, which lets the calling code
get quite a bit smaller and easier to read.
Finally, bump the size of the temporary buffer from 32 to 64 bytes to be
safe, since "MaxRecvDataSegmentLength=" by itself is 25 bytes; with a
trailing NUL, a value >= 1M will lead to a buffer overrun. (This isn't
the default but we don't need to run right at the ragged edge here)
Reported-by: Santosh Kulkarni <santosh.kulkarni@calsoftinc.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
On OMAP4 panda board, there have been several bug reports about boot
hang and lock-ups with CPU_IDLE enabled. The root cause of the issue
is missing interrupts while in idle state. Commit cb7094e8 {cpuidle / omap4 :
use CPUIDLE_FLAG_TIMER_STOP flag} moved the broadcast notifiers to common
code for right reasons but on OMAP4 which suffers from a nasty ROM code
bug with GIC, commit ff999b8a {ARM: OMAP4460: Workaround for ROM bug ..},
we loose interrupts which leads to issues like lock-up, hangs etc.
Patch reverts commit cb7094 {cpuidle / omap4 : use CPUIDLE_FLAG_TIMER_STOP
flag} and 54769d6 {cpuidle: OMAP4: remove timer broadcast initialization} to
avoid the issue. With this change, OMAP4 panda boards, the mentioned
issues are getting fixed. We no longer loose interrupts which was the cause
of the regression.
Fixes: cb7094e8 (cpuidle / omap4 : use CPUIDLE_FLAG_TIMER_STOP flag) Fixes: ff999b8a (cpuidle: OMAP4: remove timer broadcast initialization) Cc: Roger Quadros <rogerq@ti.com> Cc: Kevin Hilman <khilman@linaro.org> Cc: Tony Lindgren <tony@atomide.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Reported-tested-by: Roger Quadros <rogerq@ti.com> Reported-tested-by: Kevin Hilman <khilman@linaro.org> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
An additional testcase found an issue with the last
series of patches applied: the fallback solution may
not save the iv value after operation. This very small
fix just makes sure the iv is copied back to the
walk/desc struct.
Cc: <stable@vger.kernel.org> # 3.14+ Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
When persistent grants were added they were always used, even if the
backend doesn't have this feature (there's no harm in always using the
same set of pages). This restores the old data path when the backend
doesn't have persistent grants, removing the burden of doing a memcpy
when it is not actually needed.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reported-by: Felipe Franciosi <felipe.franciosi@citrix.com> Cc: Felipe Franciosi <felipe.franciosi@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v2: Fix up whitespace issues]
There's no need to keep the foreign access in a grant if it is not
persistently mapped by the backend. This allows us to free grants that
are not mapped by the backend, thus preventing blkfront from hoarding
all grants.
The main effect of this is that blkfront will only persistently map
the same grants as the backend, and it will always try to use grants
that are already mapped by the backend. Also the number of persistent
grants in blkfront is the same as in blkback (and is controlled by the
value in blkback).
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Matt Wilson <msw@amazon.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
As sysctl_hung_task_timeout_sec is unsigned long, when this value is
larger then LONG_MAX/HZ, the function schedule_timeout_interruptible in
watchdog will return immediately without sleep and with print :
Additional validation of adjtimex freq values to avoid
potential multiplication overflows were added in commit 5e5aeb4367b (time: adjtimex: Validate the ADJ_FREQUENCY values)
Unfortunately the patch used LONG_MAX/MIN instead of
LLONG_MAX/MIN, which was fine on 64-bit systems, but being
much smaller on 32-bit systems caused false positives
resulting in most direct frequency adjustments to fail w/
EINVAL.
ntpd only does direct frequency adjustments at startup, so
the issue was not as easily observed there, but other time
sync applications like ptpd and chrony were more effected by
the bug.
This patch changes the checks to use LLONG_MAX for
clarity, and additionally the checks are disabled
on 32-bit systems since LLONG_MAX/PPM_SCALE is always
larger then the 32-bit long freq value, so multiplication
overflows aren't possible there.
Reported-by: Josh Boyer <jwboyer@fedoraproject.org> Reported-by: George Joseph <george.joseph@fairview5.com> Tested-by: George Joseph <george.joseph@fairview5.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Sasha Levin <sasha.levin@oracle.com> Link: http://lkml.kernel.org/r/1423553436-29747-1-git-send-email-john.stultz@linaro.org
[ Prettified the changelog and the comments a bit. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Christian Riesch <christian.riesch@omicron.at> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Ensure that other operations which raced with our setattr RPC call
cannot revert the file attribute changes that were made on the server.
To do so, we artificially bump the attribute generation counter on
the inode so that all calls to nfs_fattr_init() that precede ours
will be dropped.
The motivation for the patch came from Chuck Lever's reports of readaheads
racing with truncate operations and causing the file size to be reverted.
Reported-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
tcrypt/testmgr uses wait_for_completion_interruptible() everywhere when
it waits for a request to be completed. If it's interrupted, then the
test is aborted and the request is freed.
However, if any of these calls actually do get interrupted, the result
will likely be a kernel crash, when the driver handles the now-freed
request. Use wait_for_completion() instead.
Signed-off-by: Rabin Vincent <rabin.vincent@axis.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Like SHA1, use get_unaligned_be*() on the raw input data.
Reported-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
that the shmdt uses inode->i_size outside of i_mutex being held.
There is one more case in shm.c in shm_destroy(). This converts
both users over to use i_size_read().
This is a highly-contrived scenario. But, a single shmdt() call can be
induced in to unmapping memory from mulitple shm segments. Example code
is here:
http://www.sr71.net/~dave/intel/shmfun.c
The fix is pretty simple: Record the 'struct file' for the first VMA we
encounter and then stick to it. Decline to unmap anything not from the
same file and thus the same segment.
I found this by inspection and the odds of anyone hitting this in practice
are pretty darn small.
ipc_addid() makes a new ipc identifier visible to everyone. New objects
start as locked, so that the caller can complete the initialization
after the call. Within struct sem_array, at least sma->sem_base and
sma->sem_nsems are accessed without any locks, therefore this approach
doesn't work.
Thus: Move the ipc_addid() to the end of the initialization.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Reported-by: Rik van Riel <riel@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Acked-by: Rafael Aquini <aquini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
If a request_key() call to allocate and fill out a key attempts to insert the
key structure into a revoked keyring, the key will leak, using memory and part
of the user's key quota until the system reboots. This is from a failure of
construct_alloc_key() to decrement the key's reference count after the attempt
to insert into the requested keyring is rejected.
key_put() needs to be called in the link_prealloc_failed callpath to ensure
the unused key is released.
Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
This was caused by that 'sd->groups == NULL' after building groups, which
was caused by the empty 'sd->span'.
The cpu's domain contained nothing because the cpu was assigned to a wrong
node, due to the following unfortunate sequence of events:
1. The hypervisor sent a topology update to the guest OS, to notify changes
to the cpu-node mapping. However, the update was actually redundant - i.e.,
the "new" mapping was exactly the same as the old one.
2. Due to this, the 'updated_cpus' mask turned out to be empty after exiting
the 'for-loop' in arch_update_cpu_topology().
3. So we ended up calling stop-machine() with an empty cpumask list, which made
stop-machine internally elect cpumask_first(cpu_online_mask), i.e., CPU0 as
the cpu to run the payload (the update_cpu_topology() function).
4. This causes update_cpu_topology() to be run by CPU0. And since 'updates'
is kzalloc()'ed inside arch_update_cpu_topology(), update_cpu_topology()
finds update->cpu as well as update->new_nid to be 0. In other words, we
end up assigning CPU0 (and eventually its siblings) to node 0, incorrectly.
Along with the following wrong updating, it causes the sched-domain rebuild
code to break and crash the system.
Fix this by skipping the topology update in cases where we find that
the topology has not actually changed in reality (ie., spurious updates).
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> CC: Nathan Fontenot <nfont@linux.vnet.ibm.com> CC: Stephen Rothwell <sfr@canb.auug.org.au> CC: Andrew Morton <akpm@linux-foundation.org> CC: Robert Jennings <rcj@linux.vnet.ibm.com> CC: Jesse Larrew <jlarrew@linux.vnet.ibm.com> CC: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> CC: Alistair Popple <alistair@popple.id.au> Suggested-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
This patch changes core_scsi3_pro_release() logic to allow an
existing AllRegistrants type reservation to be re-reserved by
any registered I_T nexus.
This addresses a issue where AllRegistrants type RESERVE was
receiving RESERVATION_CONFLICT status if dev_pr_res_holder did
not match the same I_T nexus, instead of just returning GOOD
status following spc4r34 Section 5.9.9:
"If the device server receives a PERSISTENT RESERVE OUT command
with RESERVE service action where the TYPE field and the SCOPE
field contain the same values as the existing type and scope
from a persistent reservation holder, it shall not make any
change to the existing persistent reservation and shall complete
the command with GOOD status."
Reported-by: Ilias Tsitsimpis <i.tsitsimpis@gmail.com> Cc: Ilias Tsitsimpis <i.tsitsimpis@gmail.com> Cc: Lee Duncan <lduncan@suse.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
This patch fixes an issue with AllRegistrants reservations where
an unregister operation by the I_T nexus reservation holder would
incorrectly drop the reservation, instead of waiting until the
last active I_T nexus is unregistered as per SPC-4.
This includes updating __core_scsi3_complete_pro_release() to reset
dev->dev_pr_res_holder with another pr_reg for this special case,
as well as a new 'unreg' parameter to determine when the release
is occuring from an implicit unregister, vs. explicit RELEASE.
It also adds special handling in core_scsi3_free_pr_reg_from_nacl()
to release the left-over pr_res_holder, now that pr_reg is deleted
from pr_reg_list within __core_scsi3_complete_pro_release().
Reported-by: Ilias Tsitsimpis <i.tsitsimpis@gmail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
This patch fixes the usage of R_HOLDER bit for an All Registrants
reservation in READ_FULL_STATUS, where only the registration who
issued RESERVE was being reported as having an active reservation.
It changes core_scsi3_pri_read_full_status() to check ahead of the
list walk of active registrations to see if All Registrants is active,
and if so set R_HOLDER bit and scope/type fields for all active
registrations.
Reported-by: Ilias Tsitsimpis <i.tsitsimpis@gmail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
This patch fixes a NULL pointer dereference OOPs with pSCSI backends
within target_core_stat.c code. The bug is caused by a configfs attr
read if no pscsi_dev_virt->pdv_sd has been configured.
Reported-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
This patch fixes a iser specific logout bug where early complete()
of conn->conn_logout_comp in iscsit_close_connection() was causing
isert_wait4logout() to complete too soon, triggering a use after
free NULL pointer dereference of iscsi_conn memory.
The complete() was originally added for traditional iscsi-target
when a ISCSI_LOGOUT_OP failed in iscsi_target_rx_opcode(), but given
iser-target does not wait in logout failure, this special case needs
to be avoided.
This patch fixes a NULL pointer dereference triggered by a late
target_configure_device() -> alloc_workqueue() failure that results
in target_free_device() being called with DF_CONFIGURED already set,
which subsequently OOPses in destroy_workqueue() code.
Currently this only happens at modprobe target_core_mod time when
core_dev_setup_virtual_lun0() -> target_configure_device() fails,
and the explicit target_free_device() gets called.
To address this bug originally introduced by commit 0fd97ccf45, go
ahead and move DF_CONFIGURED to end of target_configure_device()
code to handle this special failure case.
Reported-by: Claudio Fleiner <cmf@daterainc.com> Cc: Claudio Fleiner <cmf@daterainc.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
at91rm9200 standby and suspend to ram has been broken since 00482a4078f4. It is wrongly using AT91_BASE_SYS which is a physical address
and actually doesn't correspond to any register on at91rm9200.
Use the correct at91_ramc_base[0] instead.
Fixes: 00482a4078f4 (ARM: at91: implement the standby function for pm/cpuidle) Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Because prior GCC versions always emitted NOPs on ALIGN directives, but
gcc5 started omitting them.
.LSTARTFDEDLSI1 says:
/* HACK: The dwarf2 unwind routines will subtract 1 from the
return address to get an address in the middle of the
presumed call instruction. Since we didn't get here via
a call, we need to include the nop before the real start
to make up for it. */
.long .LSTART_sigreturn-1-. /* PC-relative start address */
But commit 69d0627a7f6e ("x86 vDSO: reorder vdso32 code") from 2.6.25
replaced .org __kernel_vsyscall+32,0x90 by ALIGN right before
__kernel_sigreturn.
Of course, ALIGN need not generate any NOP in there. Esp. gcc5 collapses
vclock_gettime.o and int80.o together with no generated NOPs as "ALIGN".
So fix this by adding to that point at least a single NOP and make the
function ALIGN possibly with more NOPs then.
Kudos for reporting and diagnosing should go to Richard.
Reported-by: Richard Biener <rguenther@suse.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1425543211-12542-1-git-send-email-jslaby@suse.cz Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
drop_fpu() does clear_used_math() and usually this is correct
because tsk == current.
However switch_fpu_finish()->restore_fpu_checking() is called before
__switch_to() updates the "current_task" variable. If it fails,
we will wrongly clear the PF_USED_MATH flag of the previous task.
So use clear_stopped_child_used_math() instead.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150309171041.GB11388@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
math_state_restore() assumes it is called with irqs disabled,
but this is not true if the caller is __restore_xstate_sig().
This means that if ia32_fxstate == T and __copy_from_user()
fails, __restore_xstate_sig() returns with irqs disabled too.
This triggers:
BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:41
dump_stack
___might_sleep
? _raw_spin_unlock_irqrestore
__might_sleep
down_read
? _raw_spin_unlock_irqrestore
print_vma_addr
signal_fault
sys32_rt_sigreturn
Change __restore_xstate_sig() to call set_used_math()
unconditionally. This avoids enabling and disabling interrupts
in math_state_restore(). If copy_from_user() fails, we can
simply do fpu_finit() by hand.
[ Note: this is only the first step. math_state_restore() should
not check used_math(), it should set this flag. While
init_fpu() should simply die. ]
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150307153844.GB25954@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
The kernel crypto API logic requires the caller to provide the
length of (ciphertext || authentication tag) as cryptlen for the
AEAD decryption operation. Thus, the cipher implementation must
calculate the size of the plaintext output itself and cannot simply use
cryptlen.
The RFC4106 GCM decryption operation tries to overwrite cryptlen memory
in req->dst. As the destination buffer for decryption only needs to hold
the plaintext memory but cryptlen references the input buffer holding
(ciphertext || authentication tag), the assumption of the destination
buffer length in RFC4106 GCM operation leads to a too large size. This
patch simply uses the already calculated plaintext size.
In addition, this patch fixes the offset calculation of the AAD buffer
pointer: as mentioned before, cryptlen already includes the size of the
tag. Thus, the tag does not need to be added. With the addition, the AAD
will be written beyond the already allocated buffer.
Note, this fixes a kernel crash that can be triggered from user space
via AF_ALG(aead) -- simply use the libkcapi test application
from [1] and update it to use rfc4106-gcm-aes.
Using [1], the changes were tested using CAVS vectors to demonstrate
that the crypto operation still delivers the right results.
[1] http://www.chronox.de/libkcapi.html
CC: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
As pointed by recent post[1] on exploiting DRAM physical imperfection,
/proc/PID/pagemap exposes sensitive information which can be used to do
attacks.
This disallows anybody without CAP_SYS_ADMIN to read the pagemap.
It's directly caused by 89d3cf6 [SCSI] libsas: add mutex for SMP task
execution, but shows a deeper cause: expander functions expect to be able to
cast to and treat domain devices as expanders. The correct fix is to only do
expander discover when we know we've got an expander device to avoid wrongly
casting a non-expander device.
Otherwise the guest can abuse that control to cause e.g. PCIe
Unsupported Request responses by disabling memory and/or I/O decoding
and subsequently causing (CPU side) accesses to the respective address
ranges, which (depending on system configuration) may be fatal to the
host.
Note that to alter any of the bits collected together as
PCI_COMMAND_GUEST permissive mode is now required to be enabled
globally or on the specific device.
This is CVE-2015-2150 / XSA-120.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
To take down the MOB and GMR memory types, the driver may have to issue
fence objects and thus make sure that the fence manager is taken down
after those memory types.
Reorder device init accordingly.
According to a report from Yuxuan Shui, nilfs2 in kernel 3.19 got stuck
during recovery at mount time. The code path that caused the deadlock was
as follows:
nilfs_fill_super()
load_nilfs()
nilfs_salvage_orphan_logs()
* Do roll-forwarding, attach segment constructor for recovery,
and kick it.
nilfs_segctor_thread()
nilfs_segctor_thread_construct()
* A lock is held with nilfs_transaction_lock()
nilfs_segctor_do_construct()
nilfs_segctor_drop_written_files()
iput()
iput_final()
write_inode_now()
writeback_single_inode()
__writeback_single_inode()
do_writepages()
nilfs_writepage()
nilfs_construct_dsync_segment()
nilfs_transaction_lock() --> deadlock
This can happen if commit 7ef3ff2fea8b ("nilfs2: fix deadlock of segment
constructor over I_SYNC flag") is applied and roll-forward recovery was
performed at mount time. The roll-forward recovery can happen if datasync
write is done and the file system crashes immediately after that. For
instance, we can reproduce the issue with the following steps:
< nilfs2 is mounted on /nilfs (device: /dev/sdb1) >
# dd if=/dev/zero of=/nilfs/test bs=4k count=1 && sync
# dd if=/dev/zero of=/nilfs/test conv=notrunc oflag=dsync bs=4k
count=1 && reboot -nfh
< the system will immediately reboot >
# mount -t nilfs2 /dev/sdb1 /nilfs
The deadlock occurs because iput() can run segment constructor through
writeback_single_inode() if MS_ACTIVE flag is not set on sb->s_flags. The
above commit changed segment constructor so that it calls iput()
asynchronously for inodes with i_nlink == 0, but that change was
imperfect.
This fixes the another deadlock by deferring iput() in segment constructor
even for the case that mount is not finished, that is, for the case that
MS_ACTIVE flag is not set.
Normally _regulator_do_enable() isn't called on an already-enabled
rdev. That's because the main caller, _regulator_enable() always
calls _regulator_is_enabled() and only calls _regulator_do_enable() if
the rdev was not already enabled.
However, there is one caller of _regulator_do_enable() that doesn't
check: regulator_suspend_finish(). While we might want to make
regulator_suspend_finish() behave more like _regulator_enable(), it's
probably also a good idea to make _regulator_do_enable() robust if it
is called on an already enabled rdev.
At the moment, _regulator_do_enable() is _not_ robust for already
enabled rdevs if we're using an ena_pin. Each time
_regulator_do_enable() is called for an rdev using an ena_pin the
reference count of the ena_pin is incremented even if the rdev was
already enabled. This is not as intended because the ena_pin is for
something else: for keeping track of how many active rdevs there are
sharing the same ena_pin.
Here's how the reference counting works here:
* Each time _regulator_enable() is called we increment
rdev->use_count, so _regulator_enable() calls need to be balanced
with _regulator_disable() calls.
* There is no explicit reference counting in _regulator_do_enable()
which is normally just a warapper around rdev->desc->ops->enable()
with code for supporting delays. It's not expected that the
"ops->enable()" call do reference counting.
* Since regulator_ena_gpio_ctrl() does have reference counting
(handling the sharing of the pin amongst multiple rdevs), we
shouldn't call it if the current rdev is already enabled.
Note that as part of this we cleanup (remove) the initting of
ena_gpio_state in regulator_register(). In _regulator_do_enable(),
_regulator_do_disable() and _regulator_is_enabled() is is clear that
ena_gpio_state should be the state of whether this particular rdev has
requested the GPIO be enabled. regulator_register() was initting it
as the actual state of the pin.
Fixes: 967cfb18c0e3 ("regulator: core: manage enable GPIO list") Signed-off-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
The _regulator_do_enable() call ought to be a no-op when called on an
already-enabled regulator. However, as an optimization
_regulator_enable() doesn't call _regulator_do_enable() on an already
enabled regulator. That means we never test the case of calling
_regulator_do_enable() during normal usage and there may be hidden
bugs or warnings. We have seen warnings issued by the tps65090 driver
and bugs when using the GPIO enable pin.
Let's match the same optimization that _regulator_enable() in
regulator_suspend_finish(). That may speed up suspend/resume and also
avoids exposing hidden bugs.
[Use much clearer commit message from Doug Anderson]
Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
EEH recovery for bnx2x based adapters is not reliable on all Power
systems using the default hot reset, which can result in an
unrecoverable EEH error. Forcing the use of fundamental reset
during EEH recovery fixes this.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
The commit [ef403edb7558: ALSA: hda - Don't access stereo amps for
mono channel widgets] fixed the handling of mono widgets in general,
but it still misses an exceptional case: namely, a mono mixer widget
taking a single stereo input. In this case, it has stereo volumes
although it's a mono widget, and thus we have to take care of both
left and right input channels, as stated in HD-audio spec ("7.1.3
Widget Interconnection Rules").
This patch covers this missing piece by adding proper checks of stereo
amps in both the generic parser and the proc output codes.
MacBook Air 5,2 has the same problem as MacBook Pro 8,1 where the
built-in mic records only the right channel. Apply the same
workaround as MBP8,1 to spread the mono channel via a Cirrus codec
vendor-specific COEF setup.
CS420x codecs seem to deal only the single amps of ADC nodes even
though the nodes receive multiple inputs. This leads to the
inconsistent amp value after S3/S4 resume, for example.
The fix is just to set codec->single_adc_amp flag. Then the driver
handles these ADC amps as if single connections.
The current HDA generic parser initializes / modifies the amp values
always in stereo, but this seems causing the problem on ALC3229 codec
that has a few mono channel widgets: namely, these mono widgets react
to actions for both channels equally.
In the driver code, we do care the mono channel and create a control
only for the left channel (as defined in HD-audio spec) for such a
node. When the control is updated, only the left channel value is
changed. However, in the resume, the right channel value is also
restored from the initial value we took as stereo, and this overwrites
the left channel value. This ends up being the silent output as the
right channel has been never touched and remains muted.
This patch covers the places where unconditional stereo amp accesses
are done and converts to the conditional accesses.
Compaq Presario CQ60 laptop with CX20561 gives a wrong pin for the
built-in mic NID 0x17 instead of NID 0x1d, and it results in the
non-working mic. This patch just remaps the pin correctly via fixup.
There was no check about the id string of user control elements, so we
accepted even a control element with an empty string, which is
obviously bogus. This patch adds more sanity checks of id strings.
The device complies to the UAC1 standard but hides that fact with
proprietary descriptors. The autodetect quirk for Roland devices
catches the audio interface but misses the MIDI part, so a specific
quirk is needed.
Commit fd316941c ("spi/pl022: disable port when unused") introduced a race,
which leads to possible driver lock up (easily reproducible on SMP).
The problem happens in giveback() function where the completion of the transfer
is signalled to SPI subsystem and then the HW SPI controller is disabled. Another
transfer might be setup in between, which brings driver in locked-up state.
Lockup! SPI controller is disabled and the data will never be received. Whole
SPI subsystem is waiting for transfer ACK and blocked.
So, only signal transfer completion after disabling the controller.
Fixes: fd316941c (spi/pl022: disable port when unused) Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Problem: When IMA and VTPM are both enabled in kernel config,
kernel hangs during bootup on LE OS.
Why?: IMA calls tpm_pcr_read() which results in tpm_ibmvtpm_send
and tpm_ibmtpm_recv getting called. A trace showed that
tpm_ibmtpm_recv was hanging.
Resolution: tpm_ibmtpm_recv was hanging because tpm_ibmvtpm_send
was sending CRQ message that probably did not make much sense
to phype because of Endianness. The fix below sends correctly
converted CRQ for LE. This was not caught before because it
seems IMA is not enabled by default in kernel config and
IMA exercises this particular code path in vtpm.
Tested with IMA and VTPM enabled in kernel config and VTPM
enabled on both a BE OS and a LE OS ppc64 lpar. This exercised
CRQ and TPM command code paths in vtpm.
Patch is against Peter's tpmdd tree on github which included
Vicky's previous vtpm le patches.
Signed-off-by: Joy Latten <jmlatten@linux.vnet.ibm.com> Reviewed-by: Ashley Lai <ashley@ahsleylai.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
The cpuset.sched_relax_domain_level can control how far we do
immediate load balancing on a system. However, it was found on recent
kernels that echo'ing a value into cpuset.sched_relax_domain_level
did not reduce any immediate load balancing.
The reason this occurred was because the update_domain_attr_tree() traversal
did not update for the "top_cpuset". This resulted in nothing being changed
when modifying the sched_relax_domain_level parameter.
This patch is able to address that problem by having update_domain_attr_tree()
allow updates for the root in the cpuset traversal.
Fixes: fc560a26acce ("cpuset: replace cpuset->stack_list with cpuset_for_each_descendant_pre()") Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Zefan Li <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> Tested-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
cancel[_delayed]_work_sync() are implemented using
__cancel_work_timer() which grabs the PENDING bit using
try_to_grab_pending() and then flushes the work item with PENDING set
to prevent the on-going execution of the work item from requeueing
itself.
try_to_grab_pending() can always grab PENDING bit without blocking
except when someone else is doing the above flushing during
cancelation. In that case, try_to_grab_pending() returns -ENOENT. In
this case, __cancel_work_timer() currently invokes flush_work(). The
assumption is that the completion of the work item is what the other
canceling task would be waiting for too and thus waiting for the same
condition and retrying should allow forward progress without excessive
busy looping
Unfortunately, this doesn't work if preemption is disabled or the
latter task has real time priority. Let's say task A just got woken
up from flush_work() by the completion of the target work item. If,
before task A starts executing, task B gets scheduled and invokes
__cancel_work_timer() on the same work item, its try_to_grab_pending()
will return -ENOENT as the work item is still being canceled by task A
and flush_work() will also immediately return false as the work item
is no longer executing. This puts task B in a busy loop possibly
preventing task A from executing and clearing the canceling state on
the work item leading to a hang.
task A task B worker
executing work
__cancel_work_timer()
try_to_grab_pending()
set work CANCELING
flush_work()
block for work completion
completion, wakes up A
__cancel_work_timer()
while (forever) {
try_to_grab_pending()
-ENOENT as work is being canceled
flush_work()
false as work is no longer executing
}
This patch removes the possible hang by updating __cancel_work_timer()
to explicitly wait for clearing of CANCELING rather than invoking
flush_work() after try_to_grab_pending() fails with -ENOENT.
Link: http://lkml.kernel.org/g/20150206171156.GA8942@axis.com
v3: bit_waitqueue() can't be used for work items defined in vmalloc
area. Switched to custom wake function which matches the target
work item and exclusive wait and wakeup.
v2: v1 used wake_up() on bit_waitqueue() which leads to NULL deref if
the target bit waitqueue has wait_bit_queue's on it. Use
DEFINE_WAIT_BIT() and __wake_up_bit() instead. Reported by Tomeu
Vizoso.
when multiport is off, virtio console invokes config access from irq
context, config access is blocking on s390.
Fix this up by scheduling work from config irq - similar to what we do
for multiport configs.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
when multiport is off, we don't initialize config work,
but we then cancel uninitialized control_work on freeze.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
commit 6ae9200f2cab7 ("enlarge console.name") increased the storage
for the console name to 16 bytes, but not the corresponding
struct console_cmdline::name storage. Console names longer than
8 bytes cause read beyond end-of-string and failure to match
console; I'm not sure if there are other unexpected consequences.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
If the part of the compression data are corrupted, or the compression
data is totally fake, the memory access over the limit is possible.
This is the log from my system usning lz4 decompression.
[6502]data abort, halting
[6503]r0 0x00000000 r1 0x00000000 r2 0xdcea0ffc r3 0xdcea0ffc
[6509]r4 0xb9ab0bfd r5 0xdcea0ffc r6 0xdcea0ff8 r7 0xdce80000
[6515]r8 0x00000000 r9 0x00000000 r10 0x00000000 r11 0xb9a98000
[6522]r12 0xdcea1000 usp 0x00000000 ulr 0x00000000 pc 0x820149bc
[6528]spsr 0x400001f3
and the memory addresses of some variables at the moment are
ref:0xdcea0ffc, op:0xdcea0ffc, oend:0xdcea1000
As you can see, COPYLENGH is 8bytes, so @ref and @op can access the momory
over @oend.
Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tommi Rantala <tt.rantala@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
and a non-hugetlbfs page has no page_hstate(). This works 99% of the
time because page_hstate() determines the hstate from the page order
alone. Since the page order of a THP page matches the default hugetlbfs
page order, it works.
But, if you change the default huge page size on the boot command-line
(say default_hugepagesz=1G), then we might not even *have* a 2MB hstate
so page_hstate() returns null and copy_huge_page() oopses pretty fast
since copy_huge_page() dereferences the hstate:
Mel noticed that the migration code is really the only user of these
functions. This moves all the copy code over to migrate.c and makes
copy_huge_page() work for THP by checking for it explicitly.
I believe the bug was introduced in commit b32967ff101a ("mm: numa: Add
THP migration for the NUMA working set scanning fault case")
[akpm@linux-foundation.org: fix coding-style and comment text, per Naoya Horiguchi] Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
This is a single hunk introduced later in the upstream commit cb900f41215447433cbc456d1c4294e858a84d7c (mm, hugetlb: convert
hugetlbfs to use split pmd lock). We need page_hstate even for
!HUGETLB_PAGE case for the next patch (mm: thp: give transparent
hugepage code a separate copy_page).
mm/migrate.c: In function `do_move_page_to_node_array':
include/linux/hugetlb.h:140:33: warning: statement with no effect [-Wunused-value]
#define isolate_huge_page(p, l) false
^
mm/migrate.c:1170:4: note: in expansion of macro `isolate_huge_page'
isolate_huge_page(page, &pagelist);
Commit 746c9e9f92dd "of/base: Fix PowerPC address parsing hack" limited
the applicability of the workaround whereby a missing ranges is treated
as an empty ranges. This workaround was hiding a bug in the etsec2
device tree nodes, which have children with reg, but did not have
ranges.
Signed-off-by: Scott Wood <scottwood@freescale.com> Reported-by: Alexander Graf <agraf@suse.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Firstly, handle zero length calls properly. Believe it or not there
are a few of these happening during early boot.
Next, we can't just drop to a memcpy() call in the forward copy case
where dst <= src. The reason is that the cache initializing stores
used in the Niagara memcpy() implementations can end up clearing out
cache lines before we've sourced their original contents completely.
For example, considering NG4memcpy, the main unrolled loop begins like
this:
Assume dst is 64 byte aligned and let's say that dst is src - 8 for
this memcpy() call. That store at the end there is the one to the
first line in the cache line, thus clearing the whole line, which thus
clobbers "src + 0x28" before it even gets loaded.
To avoid this, just fall through to a simple copy only mildly
optimized for the case where src and dst are 8 byte aligned and the
length is a multiple of 8 as well. We could get fancy and call
GENmemcpy() but this is good enough for how this thing is actually
used.
Reported-by: David Ahern <david.ahern@oracle.com> Reported-by: Bob Picco <bpicco@meloft.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
With the increase in number of CPUs calls to functions that dump
output to console (e.g., arch_trigger_all_cpu_backtrace) can take
a long time to complete. If IRQs are disabled eventually the NMI
watchdog kicks in and creates more havoc. Avoid by telling the NMI
watchdog everything is ok.
Signed-off-by: David Ahern <david.ahern@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
The reason is that state is never reset (stays with PERF_HES_UPTODATE set).
Add a call to sparc_pmu_enable_event during the added_event handling.
Clean up the encoding since pmu_start calls sparc_pmu_enable_event which
does the same. Passing PERF_EF_RELOAD to sparc_pmu_start means the call
to sparc_perf_event_set_period can be removed as well.
With this patch:
$ perf stat ls
...
Performance counter stats for 'ls':
Signed-off-by: David Ahern <david.ahern@oracle.com> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
perf_pmu_disable is called by core perf code before pmu->del and the
enable function is called by core perf code afterwards. No need to
call again within sparc_pmu_del.
Ditto for pmu->add and sparc_pmu_add.
Signed-off-by: David Ahern <david.ahern@oracle.com> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
A bug was reported that the semtimedop() system call was always
failing eith ENOSYS.
Since SEMCTL is defined as 3, and SEMTIMEDOP is defined as 4,
the comparison "call <= SEMCTL" will always prevent SEMTIMEDOP
from getting through to the semaphore ops switch statement.
This is corrected by changing the comparison to "call <= SEMTIMEDOP".
Load balancing can be triggered in the critical sections protected by
srmmu_context_spinlock in destroy_context() and switch_mm() and can hang
the cpu waiting for the rq lock of another cpu that in turn has called
switch_mm hangning on srmmu_context_spinlock leading to deadlock.
So, disable interrupt while taking srmmu_context_spinlock in
destroy_context() and switch_mm() so we don't deadlock.
See also commit 77b838fa1ef0 ("[SPARC64]: destroy_context() needs to disable
interrupts.")
Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
While working on sk_forward_alloc problems reported by Denys
Fedoryshchenko, we found that tcp connect() (and fastopen) do not call
sk_wmem_schedule() for SYN packet (and/or SYN/DATA packet), so
sk_forward_alloc is negative while connect is in progress.
We can fix this by calling regular sk_stream_alloc_skb() both for the
SYN packet (in tcp_connect()) and the syn_data packet in
tcp_send_syn_data()
Then, tcp_send_syn_data() can avoid copying syn_data as we simply
can manipulate syn_data->cb[] to remove SYN flag (and increment seq)
Instead of open coding memcpy_fromiovecend(), simply use this helper.
This leaves in socket write queue clean fast clone skbs.
This was tested against our fastopen packetdrill tests.
Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Commit db31c55a6fb2 (net: clamp ->msg_namelen instead of returning an
error) introduced the clamping of msg_namelen when the unsigned value
was larger than sizeof(struct sockaddr_storage). This caused a
msg_namelen of -1 to be valid. The native code was subsequently fixed by
commit dbb490b96584 (net: socket: error on a negative msg_namelen).
In addition, the native code sets msg_namelen to 0 when msg_name is
NULL. This was done in commit (6a2a2b3ae075 net:socket: set msg_namelen
to 0 if msg_name is passed as NULL in msghdr struct from userland) and
subsequently updated by 08adb7dabd48 (fold verify_iovec() into
copy_msghdr_from_user()).
This patch brings the get_compat_msghdr() in line with
copy_msghdr_from_user().
Fixes: db31c55a6fb2 (net: clamp ->msg_namelen instead of returning an error) Cc: David S. Miller <davem@davemloft.net> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
tcp_send_fin() does not account for the memory it allocates properly, so
sk_forward_alloc can be negative in cases where we've sent a FIN:
ss example output (ss -amn | grep -B1 f4294):
tcp FIN-WAIT-1 0 1 192.168.0.1:45520 192.0.2.1:8080
skmem:(r0,rb87380,t0,tb87380,f4294966016,w1280,o0,bl0) Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
for throw routes to trigger evaluation of other policy rules
EAGAIN needs to be propagated up to fib_rules_lookup
similar to how its done for IPv4
A simple testcase for verification is:
ip -6 rule add lookup 33333 priority 33333
ip -6 route add throw 2001:db8::1
ip -6 route add 2001:db8::1 via fe80::1 dev wlan0 table 33333
ip route get 2001:db8::1
Signed-off-by: Steven Barth <cyrus@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
The custom USB_DEVICE_CLASS macro matches
bDeviceClass, bDeviceSubClass and bDeviceProtocol
but the common USB_DEVICE_AND_INTERFACE_INFO matches
bInterfaceClass, bInterfaceSubClass and bInterfaceProtocol instead, which are
not specified.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
[I would really like an ACK on that one from dhowells; it appears to be
quite straightforward, but...]
MSG_PEEK isn't passed to ->recvmsg() via msg->msg_flags; as the matter of
fact, neither the kernel users of rxrpc, nor the syscalls ever set that bit
in there. It gets passed via flags; in fact, another such check in the same
function is done correctly - as flags & MSG_PEEK.
It had been that way (effectively disabled) for 8 years, though, so the patch
needs beating up - that case had never been tested. If it is correct, it's
-stable fodder.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
It should be checking flags, not msg->msg_flags. It's ->sendmsg()
instances that need to look for that in ->msg_flags, ->recvmsg() ones
(including the other ->recvmsg() instance in that file, as well as
unix_dgram_recvmsg() this one claims to be imitating) check in flags.
Braino had been introduced in commit dcda13 ("caif: Bugfix - use MSG_TRUNC
in receive") back in 2010, so it goes quite a while back.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
The rds_iw_update_cm_id function stores a large 'struct rds_sock' object
on the stack in order to pass a pair of addresses. This happens to just
fit withint the 1024 byte stack size warning limit on x86, but just
exceed that limit on ARM, which gives us this warning:
net/rds/iw_rdma.c:200:1: warning: the frame size of 1056 bytes is larger than 1024 bytes [-Wframe-larger-than=]
As the use of this large variable is basically bogus, we can rearrange
the code to not do that. Instead of passing an rds socket into
rds_iw_get_device, we now just pass the two addresses that we have
available in rds_iw_update_cm_id, and we change rds_iw_get_mr accordingly,
to create two address structures on the stack there.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
sysctl has sysctl.net.core.rmem_*/wmem_* parameters which can be
set to incorrect values. Given that 'struct sk_buff' allocates from
rcvbuf, incorrectly set buffer length could result to memory
allocation failures. For example, set them as follows:
Moreover, the possible minimum is 1, so we can get another kernel panic:
...
BUG: unable to handle kernel paging request at ffff88013caee5c0
IP: [<ffffffff815604cf>] __alloc_skb+0x12f/0x1f0
...
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
The commit [63e51fd708f5: ALSA: hda - Don't take unresponsive D3
transition too serious] introduced a conditional fallback behavior to
the HD-audio controller depending on the flag set. However, it
introduced a silly bug, too, that the flag was evaluated in a reverse
way. This resulted in a regression of HD-audio controller driver
where it can't go to the fallback mode at communication errors.
Unfortunately (or fortunately?) this didn't come up until recently
because the affected code path is an error handling that happens only
on an unstable hardware chip. Most of recent chips work stably, thus
they didn't hit this problem. Now, we've got a regression report with
a VIA chip, and this seems indeed requiring the fallback to the
polling mode, and finally the bug was revealed.
The fix is a oneliner to remove the wrong logical NOT in the check.
(Lesson learned - be careful about double negation.)
The bug should be backported to stable, but the patch won't be
applicable to 3.13 or earlier because of the code splits. The stable
fix patches for earlier kernels will be posted later manually.
When running the test which causes the race as shown in the previous patch,
we can hit the BUG "get_page() on refcount 0 page" in hugetlb_fault().
This race happens when pte turns into migration entry just after the first
check of is_hugetlb_entry_migration() in hugetlb_fault() passed with false.
To fix this, we need to check pte_present() again after huge_ptep_get().
This patch also reorders taking ptl and doing pte_page(), because
pte_page() should be done in ptl. Due to this reordering, we need use
trylock_page() in page != pagecache_page case to respect locking order.
Fixes: 66aebce747ea ("hugetlb: fix race condition in hugetlb_fault()") Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Hugh Dickins <hughd@google.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: <stable@vger.kernel.org> [3.2+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz> [backport to 3.12]
In case CLK_GATE_HIWORD_MASK flag is passed to clk_register_gate(), the bit #
should be no higher than 15, however the corresponding check is obviously off-
by-one.
Fixes: 045779942c04 ("clk: gate: add CLK_GATE_HIWORD_MASK") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Michael Turquette <mturquette@linaro.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Sometimes while CPU have some load and ath5k doing the wireless
interface reset the whole WiSoC completely freezes. Set of tests shows
that using atomic delay function while we wait interface reset helps to
avoid such freezes.
The easiest way to reproduce this issue: create a station interface,
start continous scan with wpa_supplicant and load CPU by something. Or
just create multiple station interfaces and put them all in continous
scan.
This patch partially reverts the commit 1846ac3dbec0 ("ath5k: Use
usleep_range where possible"), which replaces initial udelay()
by usleep_range().
I do not know actual source of this issue, but all looks like that HW
freeze is caused by transaction on internal SoC bus, while wireless
block is in reset state.
Also I should note that I do not know how many chips are affected, but I
did not see this issue with chips, other than AR5312.
CC: Jiri Slaby <jirislaby@gmail.com> CC: Nick Kossifidis <mickflemm@gmail.com> CC: Luis R. Rodriguez <mcgrof@do-not-panic.com> Fixes: 1846ac3dbec0 ("ath5k: Use usleep_range where possible") Reported-by: Christophe Prevotaux <c.prevotaux@rural-networks.com> Tested-by: Christophe Prevotaux <c.prevotaux@rural-networks.com> Tested-by: Eric Bree <ebree@nltinc.com> Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
i915.ko depends upon the acpi/video.ko module and so refuses to load if
ACPI is disabled at runtime if for example the BIOS is broken beyond
repair. acpi/video provides an optional service for i915.ko and so we
should just allow the modules to load, but do no nothing in order to let
the machines boot correctly.
Reported-by: Bill Augur <bill-auger@programmer.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jani Nikula <jani.nikula@intel.com> Acked-by: Aaron Lu <aaron.lu@intel.com>
[ rjw: Fixed up the new comment in acpi_video_init() ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz>