When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.
Fix the problem by moving posix_acl_update_mode() out of
__btrfs_set_acl() into btrfs_set_acl(). That way the function will not be
called when inheriting ACLs which is what we want as it prevents SGID
bit clearing and the mode has been properly set by posix_acl_create()
anyway.
Fixes: 073931017b49d9458aa351605b43a7e34598caef CC: linux-btrfs@vger.kernel.org CC: David Sterba <dsterba@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While punching a hole in a range that is not aligned with the sector size
(currently the same as the page size) we can end up leaving an extent map
in memory with a length that is smaller then the sector size or with a
start offset that is not aligned to the sector size. Both cases are not
expected and can lead to problems. This issue is easily detected
after the patch from commit a7e3b975a0f9 ("Btrfs: fix reported number of
inode blocks"), introduced in kernel 4.12-rc1, in a scenario like the
following for example:
What happens in the above example is the following:
1) When punching the hole, at btrfs_punch_hole(), the variable tail_len
is set to 2048 (as tail_start is 148Kb + 1 and offset + len is 150Kb).
This results in the creation of an extent map with a length of 2Kb
starting at file offset 148Kb, through find_first_non_hole() ->
btrfs_get_extent().
2) The second write (first write after the hole punch operation), sets
the range [50Kb, 152Kb[ to delalloc.
3) The third write, at btrfs_find_new_delalloc_bytes(), sees the extent
map covering the range [148Kb, 150Kb[ and ends up calling
set_extent_bit() for the same range, which results in splitting an
existing extent state record, covering the range [148Kb, 152Kb[ into
two 2Kb extent state records, covering the ranges [148Kb, 150Kb[ and
[150Kb, 152Kb[.
4) Finally at lock_and_cleanup_extent_if_need(), immediately after calling
btrfs_find_new_delalloc_bytes() we clear the delalloc bit from the
range [100Kb, 152Kb[ which results in the btrfs_clear_bit_hook()
callback being invoked against the two 2Kb extent state records that
cover the ranges [148Kb, 150Kb[ and [150Kb, 152Kb[. When called against
the first 2Kb extent state, it calls btrfs_delalloc_release_metadata()
with a length argument of 2048 bytes. That function rounds up the length
to a sector size aligned length, so it ends up considering a length of
4096 bytes, and then calls calc_csum_metadata_size() which results in
decrementing the inode's csum_bytes counter by 4096 bytes, so after
it stays a value of 0 bytes. Then the same happens when
btrfs_clear_bit_hook() is called against the second extent state that
has a length of 2Kb, covering the range [150Kb, 152Kb[, the length is
rounded up to 4096 and calc_csum_metadata_size() ends up being called
to decrement 4096 bytes from the inode's csum_bytes counter, which
at that time has a value of 0, leading to an underflow, which is
exactly what triggers the first warning, at btrfs_destroy_inode().
All the other warnings relate to several space accounting counters
that underflow as well due to similar reasons.
A similar case but where the hole punching operation creates an extent map
with a start offset not aligned to the sector size is the following:
During the unmount operation we get similar traces for the same reasons as
in the first example.
So fix the hole punching operation to make sure it never creates extent
maps with a length that is not aligned to the sector size nor with a start
offset that is not aligned to the sector size, as this breaks all
assumptions and it's a land mine.
Fixes: d77815461f04 ("btrfs: Avoid trucating page or punching hole in a already existed hole.") Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we fail to add an interface in mwifiex_add_virtual_intf(), we might
hit a BUG_ON() in the networking code, because we didn't tear things
down properly. Among the problems:
(a) when failing to allocate workqueues, we fail to unregister the
netdev before calling free_netdev()
(b) even if we do try to unregister the netdev, we're still holding the
rtnl lock, so the device never properly unregistered; we'll be at
state NETREG_UNREGISTERING, and then hit free_netdev()'s:
BUG_ON(dev->reg_state != NETREG_UNREGISTERED);
(c) we're allocating some dependent resources (e.g., DFS workqueues)
after we've registered the interface; this may or may not cause
problems, but it's good practice to allocate these before registering
(d) we're not even trying to unwind anything when mwifiex_send_cmd() or
mwifiex_sta_init_cmd() fail
To fix these issues, let's:
* add a stacked set of error handling labels, to keep error handling
consistent and properly ordered (resolving (a) and (d))
* move the workqueue allocations before the registration (to resolve
(c); also resolves (b) by avoiding error cases where we have to
unregister)
[Incidentally, it's pretty easy to interrupt the alloc_workqueue() in,
e.g., the following:
iw phy phy0 interface add mlan0 type station
by sending it SIGTERM.]
This bugfix covers commits like commit 7d652034d1a0 ("mwifiex: channel
switch support for mwifiex"), but parts of this bug exist all the way
back to the introduction of dynamic interface handling in commit 93a1df48d224 ("mwifiex: add cfg80211 handlers add/del_virtual_intf").
Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9abdcccc3d5f ("pstore: Extract common arguments into structure")
moved record decompression to function. decompress_record() gets
called without checking type and compressed flag. Warning will be
reported if data is uncompressed. Pstore type PSTORE_TYPE_PPC_OPAL,
PSTORE_TYPE_PPC_COMMON doesn't contain compressed data and warning get
printed part of dmesg.
Partial dmesg log:
[ 35.848914] pstore: ignored compressed record type 6
[ 35.848927] pstore: ignored compressed record type 8
Above warning should not get printed as it is known that data won't be
compressed for above type and it is valid condition.
This patch returns if data is not compressed and print warning only if
data is compressed and type is not PSTORE_TYPE_DMESG.
Reported-by: Anton Blanchard <anton@au1.ibm.com> Signed-off-by: Ankit Kumar <ankit@linux.vnet.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Fixes: 9abdcccc3d5f ("pstore: Extract common arguments into structure") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the stable linux-3.16 branch, I ran into a warning in the
wlcore driver:
drivers/net/wireless/ti/wlcore/spi.c: In function 'wl12xx_spi_raw_write':
drivers/net/wireless/ti/wlcore/spi.c:315:1: error: the frame size of 12848 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
Newer kernels no longer show the warning, but the bug is still there,
as the allocation is based on the CPU page size rather than the
actual capabilities of the hardware.
This replaces the PAGE_SIZE macro with the SZ_4K macro, i.e. 4096 bytes
per buffer.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This file is filled with complex cryptography. Thus, the comparisons of
MACs and secret keys and curve points and so forth should not add timing
attacks, which could either result in a direct forgery, or, given the
complexity, some other type of attack.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sometimes a FUP packet is associated with a TSX transaction and a flag is
set to indicate that. Ensure that flag is cleared on any error condition
because at that point the decoder can no longer assume it is correct.
The decoder will try to use branch packets to find an IP to start decoding
or to recover from errors. Currently the FUP packet is used only in the
case of an overflow, however there is no reason for that to be a special
case. So just use FUP always when scanning for an IP.
Intel PT uses IP compression based on the last IP. For decoding purposes,
'last IP' is not updated when a branch target has been suppressed, which is
indicated by IPBytes == 0. IPBytes is stored in the packet 'count', so
ensure never to set 'last_ip' when packet 'count' is zero.
Intel PT uses IP compression based on the last IP. For decoding
purposes, 'last IP' is considered to be reset to zero whenever there is
a synchronization packet (PSB). The decoder wasn't doing that, and was
treating the zero value to mean that there was no last IP, whereas
compression can be done against the zero value. Fix by setting last_ip
to zero when a PSB is received and keep track of have_last_ip.
The decoder uses its current timestamp in samples. Usually that is a
timestamp that has already passed, but in some cases it is a timestamp
for a branch that the decoder is walking towards, and consequently
hasn't reached. Improve that situation by using the pkt_state to
determine when to use the current or previous timestamp.
Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in bind() handlers of the
AF_NFC socket. Since the syscall doesn't enforce a minimum size of the
corresponding memory region, very short sockaddrs (zero or one byte long)
result in operating on uninitialized memory while referencing .sa_family.
Fix the sockaddr length verification in the connect() handler of NFC/LLCP
sockets, to compare against the size of the actual structure expected on
input (sockaddr_nfc_llcp) instead of its shorter version (sockaddr_nfc).
Both structures are defined in include/uapi/linux/nfc.h. The fields
specific to the _llcp extended struct are as follows:
276 __u8 dsap; /* Destination SAP, if known */
277 __u8 ssap; /* Source SAP to be bound to */
278 char service_name[NFC_LLCP_MAX_SERVICE_NAME]; /* Service name URI */;
279 size_t service_name_len;
If the caller doesn't provide a sufficiently long sockaddr buffer, these
fields remain uninitialized (and they currently originate from the stack
frame of the top-level sys_connect handler). They are then copied by
llcp_sock_connect() into internal storage (nfc_llcp_sock structure), and
could be subsequently read back through the user-mode getsockname()
function (handled by llcp_sock_getname()). This would result in the
disclosure of up to ~70 uninitialized bytes from the kernel stack to
user-mode clients capable of creating AFC_NFC sockets.
Check that the NFC_ATTR_TARGET_INDEX and NFC_ATTR_PROTOCOLS attributes (in
addition to NFC_ATTR_DEVICE_INDEX) are provided by the netlink client
prior to accessing them. This prevents potential unhandled NULL pointer
dereference exceptions which can be triggered by malicious user-mode
programs, if they omit one or both of these attributes.
The nci-device was never deregistered in the event that
fw-initialisation failed.
Fix this by moving the firmware initialisation before device
registration since the firmware work queue should be available before
registering.
Note that this depends on a recent fix that moved device-name
initialisation back to to nci_allocate_device() as the
firmware-workqueue name is now derived from the nfc-device name.
Fixes: 3194c6870158 ("NFC: nfcmrvl: add firmware download support") Cc: Vincent Cuissard <cuissard@marvell.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This specifically fixes resource leaks in the registration error paths.
Device-managed resources is a bad fit for this driver as devices can be
registered from the n_nci line discipline. Firstly, a tty may not even
have a corresponding device (should it be part of a Unix98 pty)
something which would lead to a NULL-pointer dereference when
registering resources.
Secondly, if the tty has a class device, its lifetime exceeds that of
the line discipline, which means that resources would leak every time
the line discipline is closed (or if registration fails).
Currently, the devres interface was only being used to request a reset
gpio despite the fact that it was already explicitly freed in
nfcmrvl_nci_unregister_dev() (along with the private data), something
which also prevented the resource leak at close.
Note that the driver treats gpio number 0 as invalid despite it being
perfectly valid. This will be addressed in a follow-up patch.
Make sure to check the tty-device pointer before trying to access the
parent device to avoid dereferencing a NULL-pointer when the tty is one
end of a Unix98 pty.
Fixes: e097dc624f78 ("NFC: nfcmrvl: add UART driver") Cc: Vincent Cuissard <cuissard@marvell.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 7eda8b8e9677 ("NFC: Use IDR library to assing NFC devices IDs")
moved device-id allocation and struct-device initialisation from
nfc_allocate_device() to nfc_register_device().
This broke just about every nfc-device-registration error path, which
continue to call nfc_free_device() that tries to put the device
reference of the now uninitialised (but zeroed) struct device:
kobject: '(null)' (ce316420): is not initialized, yet kobject_put() is being called.
The late struct-device initialisation also meant that various work
queues whose names are derived from the nfc device name were also
misnamed:
Move the id-allocation and struct-device initialisation back to
nfc_allocate_device() and fix up the single call site which did not use
nfc_free_device() in its error path.
Fixes: 7eda8b8e9677 ("NFC: Use IDR library to assing NFC devices IDs") Cc: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In BSS mode in the disconnection flow, mac80211 removes
the AP station before the vif is set to unassociated.
Our firmware wants it the other way around: first set
the vif as unassociated, and then remove the AP station.
In order to bridge between those two different behaviors,
iwlmvm doesn't remove the station from the firmware when
mac80211 removes it, but only after the vif is set to
unassociated. The implementation is in
iwl_mvm_bss_info_changed_station:
if (assoc state was modified && mvmvif->ap_sta_id is VALID
&& assoc state is now UNASSC)
remove_the_station_from_the_firmware()
During the recovery flow, mac80211 re-adds the AP station
and then reconfigures the vif. Since the vif is not
associated, and then, we enter the if above (which was
intended to be taken in the disconnection flow only) and
remove the station we just added. This defeats the
recovery flow.
Fix this by not removing the AP station in this flow if
we are in recovery flow.
The bug was triggered when do suspend/resuming continuously
on Dell XPS L322X/0PJHXN version 9333 (2013) with kernel
4.12.0-041200rc4-generic. But can't reproduce on DELL
E5440 + AR9300 PCIE chips.
The warning is caused by accessing invalid pointer sc->rng_task.
sc->rng_task is not be cleared after kthread_stop(sc->rng_task)
be called in ath9k_rng_stop(). Because the kthread is stopped
before ath9k_rng_kthread() be scheduled.
So set sc->rng_task to null after kthread_stop(sc->rng_task) to
resolve this issue.
The hard coded register 0x9864 and 0x9924 are invalid
for ar9300 chips.
Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
One scenario that could lead to UAF is two threads writing
simultaneously to the "tx99" debug file. One of them would
set the "start" value to true and follow to ath9k_tx99_init().
Inside the function it would set the sc->tx99_state to true
after allocating sc->tx99skb. Then, the other thread would
execute write_file_tx99() and call ath9k_tx99_deinit().
sc->tx99_state would be freed. After that, the first thread
would continue inside ath9k_tx99_init() and call
r = ath9k_tx99_send(sc, sc->tx99_skb, &txctl);
that would make use of the freed sc->tx99_skb memory.
Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The thermal child device reuses the parent MFD-device device-tree node
when registering a thermal zone, but did not take a reference to the
node.
This leads to a reference imbalance, and potential use-after-free, when
the node reference is dropped by the platform-bus device destructor
(once for the child and later again for the parent).
Fix this by dropping any reference already held to a device-tree node
and getting a reference to the parent's node which will be balanced on
reprobe or on platform-device release, whichever comes first.
Note that simply clearing the of_node pointer on probe errors and on
driver unbind would not allow the use of device-managed resources as
specifically thermal_zone_of_sensor_unregister() claims that a valid
device-tree node pointer is needed during deregistration (even if it
currently does not seem to use it).
drivers/media/platform/s5p-jpeg/jpeg-core.c: In function 's5p_jpeg_parse_hdr.isra.9':
drivers/media/platform/s5p-jpeg/jpeg-core.c:1207:12: warning: 'width' may be used uninitialized in this function [-Wmaybe-uninitialized]
result->w = width;
~~~~~~~~~~^~~~~~~
drivers/media/platform/s5p-jpeg/jpeg-core.c:1208:12: warning: 'height' may be used uninitialized in this function [-Wmaybe-uninitialized]
result->h = height;
~~~~~~~~~~^~~~~~~~
Indeed the code would allow it to return a random value (although
it shouldn't happen, in practice). So, explicitly set both to zero,
just in case.
gcc-7 suggests that an expression using a bitwise not and a bitmask
on a 'bool' variable is better written using boolean logic:
drivers/media/rc/imon.c: In function 'imon_incoming_scancode':
drivers/media/rc/imon.c:1725:22: error: '~' on a boolean expression [-Werror=bool-operation]
ictx->pad_mouse = ~(ictx->pad_mouse) & 0x1;
^
drivers/media/rc/imon.c:1725:22: note: did you mean to use logical not?
I made the mistake of upgrading my desktop to the new Fedora 26 that
comes with gcc-7.1.1.
There's nothing wrong per se that I've noticed, but I now have 1500
lines of warnings, mostly from the new format-truncation warning
triggering all over the tree.
We use 'snprintf()' and friends in a lot of places, and often know that
the numbers are fairly small (ie a controller index or similar), but gcc
doesn't know that, and sees an 'int', and thinks that it could be some
huge number. And then complains when our buffers are not able to fit
the name for the ten millionth controller.
These warnings aren't necessarily bad per se, and we probably want to
look through them subsystem by subsystem, but at least during the merge
window they just mean that I can't even see if somebody is introducing
any *real* problems when I pull.
The MSR permission bitmaps are shared by all VMs. However, some VMs
may not be configured to support MPX, even when the host does. If the
host supports VMX and the guest does not, we should intercept accesses
to the BNDCFGS MSR, so that we can synthesize a #GP
fault. Furthermore, if the host does not support MPX and the
"ignore_msrs" kvm kernel parameter is set, then we should intercept
accesses to the BNDCFGS MSR, so that we can skip over the rdmsr/wrmsr
without raising a #GP fault.
In the current code, if the user accidentally writes a bogus command to
this sysfs file, then we set the latency tolerance to an uninitialized
variable.
Fixes: 2d984ad132a8 (PM / QoS: Introcuce latency tolerance device PM QoS type) Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On this Lenovo machine, there are two front mics, and both of them are
assigned the same name "Mic", but pulseaudio can't support two mics
with the same name, as a workaround, we change the location for one of
them, then the driver will assign "Front Mic" and "Mic" for them.
Clear the notify function pointer in the platform data before we tear
down the driver. Otherwise i915 would end up calling a stale function
pointer and possibly explode.
When the "if (record->size <= 0)" test is true in
pstore_get_backend_records() it's pretty clear that nobody holds a
reference to the allocated pstore_record, yet we don't free it.
Let's free it.
Fixes: 2a2b0acf768c ("pstore: Allocate records on heap instead of stack") Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The wakeirq infrastructure uses RCU to protect the list of wakeirqs. That
breaks the irq bus locking infrastructure, which is allows sleeping
functions to be called so interrupt controllers behind slow busses,
e.g. i2c, can be handled.
The wakeirq functions hold rcu_read_lock and call into irq functions, which
in case of interrupts using the irq bus locking will trigger a
might_sleep() splat.
Convert the wakeirq infrastructure to Sleepable RCU and unbreak it.
Fixes: 4990d4fe327b (PM / Wakeirq: Add automated device wake IRQ handling) Reported-by: Brian Norris <briannorris@chromium.org> Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This causes sched_balance_cpu() to compute the wrong CPU and
consequently should_we_balance() will terminate early resulting in
missed load-balance opportunities.
(note: this relies on OVERLAP domains to always have children, this is
true because the regular topology domains are still here -- this is
before degenerate trimming)
Debugged-by: Lauro Ramos Venancio <lvenanci@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Fixes: e3589f6c81e4 ("sched: Allow for overlapping sched_domain spans") Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The group mask is always used in intersection with the group CPUs. So,
when building the group mask, we don't have to care about CPUs that are
not part of the group.
Signed-off-by: Lauro Ramos Venancio <lvenanci@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: lwang@redhat.com Cc: riel@redhat.com Link: http://lkml.kernel.org/r/1492717903-5195-2-git-send-email-lvenanci@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Syscalls must validate that their reserved arguments are zero and return
EINVAL otherwise. Otherwise, it will be impossible to actually use them
for anything in the future because existing programs may be passing
garbage in. This is standard practice when adding new APIs.
Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Driver does not properly handle the case when signals interrupt
wait_for_completion_interruptible():
-it does not check for return value
-completion structure is allocated on stack; in case a signal interrupts
the sleep, it will go out of scope, causing the worker thread
(caam_jr_dequeue) to fail when it accesses it
wait_for_completion_interruptible() is replaced with uninterruptable
wait_for_completion().
We choose to block all signals while waiting for I/O (device executing
the split key generation job descriptor) since the alternative - in
order to have a deterministic device state - would be to flush the job
ring (aborting *all* in-progress jobs).
Certain cipher modes like CTS expect the IV (req->info) of
ablkcipher_request (or equivalently req->iv of skcipher_request) to
contain the last ciphertext block when the {en,de}crypt operation is done.
This is currently not the case for the CAAM driver which in turn breaks
e.g. cts(cbc(aes)) when the CAAM driver is enabled.
This patch fixes the CAAM driver to properly set the IV after the
{en,de}crypt operation of ablkcipher finishes.
This issue was revealed by the changes in the SW CTS mode in commit 0605c41cc53ca ("crypto: cts - Convert to skcipher")
Signed-off-by: David Gstir <david@sigma-star.at> Reviewed-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
An updated patch that also handles the additional key length requirements
for the AEAD algorithms.
The max keysize is not 96. For SHA384/512 it's 128, and for the AEAD
algorithms it's longer still. Extend the max keysize for the
AEAD size for AES256 + HMAC(SHA512).
Fixes: 357fb60502ede ("crypto: talitos - add sha224, sha384 and sha512 to existing AEAD algorithms") Signed-off-by: Martin Hicks <mort@bork.org> Acked-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jörn Engel noticed that the expand_upwards() function might not return
-ENOMEM in case the requested address is (unsigned long)-PAGE_SIZE and
if the architecture didn't defined TASK_SIZE as multiple of PAGE_SIZE.
Affected architectures are arm, frv, m68k, blackfin, h8300 and xtensa
which all define TASK_SIZE as 0xffffffff, but since none of those have
an upwards-growing stack we currently have no actual issue.
Nevertheless let's fix this just in case any of the architectures with
an upward-growing stack (currently parisc, metag and partly ia64) define
TASK_SIZE similar.
Link: http://lkml.kernel.org/r/20170702192452.GA11868@p100.box Fixes: bd726c90b6b8 ("Allow stack to grow up to address space limit") Signed-off-by: Helge Deller <deller@gmx.de> Reported-by: Jörn Engel <joern@purestorage.com> Cc: Hugh Dickins <hughd@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
test_execve does rather odd mount manipulations to safely create
temporary setuid and setgid executables that aren't visible to the
rest of the system. Those executables end up in the test's cwd, but
that cwd is MNT_DETACHed.
The core namespace code considers MNT_DETACHed trees to belong to no
mount namespace at all and, in general, MNT_DETACHed trees are only
barely function. This interacted with commit 380cf5ba6b0a ("fs:
Treat foreign mounts as nosuid") to cause all MNT_DETACHed trees to
act as though they're nosuid, breaking the test.
Fix it by just not detaching the tree. It's still in a private
mount namespace and is therefore still invisible to the rest of the
system (except via /proc, and the same nosuid logic will protect all
other programs on the system from believing in test_execve's setuid
bits).
While we're at it, fix some blatant whitespace problems.
Andrei Vagin pointed out that time to executue propagate_umount can go
non-linear (and take a ludicrious amount of time) when the mount
propogation trees of the mounts to be unmunted by a lazy unmount
overlap.
Make the walk of the mount propagation trees nearly linear by
remembering which mounts have already been visited, allowing
subsequent walks to detect when walking a mount propgation tree or a
subtree of a mount propgation tree would be duplicate work and to skip
them entirely.
Walk the list of mounts whose propgatation trees need to be traversed
from the mount highest in the mount tree to mounts lower in the mount
tree so that odds are higher that the code will walk the largest trees
first, allowing later tree walks to be skipped entirely.
Add cleanup_umount_visitation to remover the code's memory of which
mounts have been visited.
Add the functions last_slave and skip_propagation_subtree to allow
skipping appropriate parts of the mount propagation tree without
needing to change the logic of the rest of the code.
A script to generate overlapping mount propagation trees:
$ cat runs.h
set -e
mount -t tmpfs zdtm /mnt
mkdir -p /mnt/1 /mnt/2
mount -t tmpfs zdtm /mnt/1
mount --make-shared /mnt/1
mkdir /mnt/1/1
iteration=10
if [ -n "$1" ] ; then
iteration=$1
fi
for i in $(seq $iteration); do
mount --bind /mnt/1/1 /mnt/1/1
done
While investigating some poor umount performance I realized that in
the case of overlapping mount trees where some of the mounts are locked
the code has been failing to unmount all of the mounts it should
have been unmounting.
This failure to unmount all of the necessary
mounts can be reproduced with:
$ cat locked_mounts_test.sh
mount -t tmpfs test-base /mnt
mount --make-shared /mnt
mkdir -p /mnt/b
mount -t tmpfs test1 /mnt/b
mount --make-shared /mnt/b
mkdir -p /mnt/b/10
mount -t tmpfs test2 /mnt/b/10
mount --make-shared /mnt/b/10
mkdir -p /mnt/b/10/20
This failure is corrected by removing the prepass that marks mounts
that may be umounted.
A first pass is added that umounts mounts if possible and if not sets
mount mark if they could be unmounted if they weren't locked and adds
them to a list to umount possibilities. This first pass reconsiders
the mounts parent if it is on the list of umount possibilities, ensuring
that information of umoutability will pass from child to mount parent.
A second pass then walks through all mounts that are umounted and processes
their children unmounting them or marking them for reparenting.
A last pass cleans up the state on the mounts that could not be umounted
and if applicable reparents them to their first parent that remained
mounted.
While a bit longer than the old code this code is much more robust
as it allows information to flow up from the leaves and down
from the trunk making the order in which mounts are encountered
in the umount propgation tree irrelevant.
Fixes: 0c56fe31420c ("mnt: Don't propagate unmounts to locked mounts") Reviewed-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It was observed that in some pathlogical cases that the current code
does not unmount everything it should. After investigation it
was determined that the issue is that mnt_change_mntpoint can
can change which mounts are available to be unmounted during mount
propagation which is wrong.
The trivial reproducer is:
$ cat ./pathological.sh
mount -t tmpfs test-base /mnt
cd /mnt
mkdir 1 2 1/1
mount --bind 1 1
mount --make-shared 1
mount --bind 1 2
mount --bind 1/1 1/1
mount --bind 1/1 1/1
echo
grep test-base /proc/self/mountinfo
umount 1/1
echo
grep test-base /proc/self/mountinfo
That last mount in the output was in the propgation tree to be unmounted but
was missed because the mnt_change_mountpoint changed it's parent before the walk
through the mount propagation tree observed it.
Fixes: 1064f874abc0 ("mnt: Tuck mounts under others instead of creating shadow/side mounts.") Acked-by: Andrei Vagin <avagin@virtuozzo.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Extend the disabling of preemption to include the hypercall so that
another thread can't get the CPU and corrupt the per-cpu page used
for hypercall arguments.
Signed-off-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wait/wakeup operations do not guarantee ordering on their own. Instead,
either locking or memory barriers are required. This commit therefore
adds memory barriers to wake_nocb_leader() and nocb_leader_wait().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Krister Johansen <kjlx@templeofstupid.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Actually, at the moment this is not an issue, as every in-tree arch does
the same manual checks for VERIFY_READ vs VERIFY_WRITE, relying on the MMU
to tell them apart, but this wasn't the case in the past and may happen
again on some odd arch in the future.
If anyone cares about 3.7 and earlier, this is a security hole (untested)
on real 80386 CPUs.
Signed-off-by: Adam Borowski <kilobyte@angband.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Like arch/arm/, we inherit the READ_IMPLIES_EXEC personality flag across
fork(). This is undesirable for a number of reasons:
* ELF files that don't require executable stack can end up with it
anyway
* We end up performing un-necessary I-cache maintenance when mapping
what should be non-executable pages
* Restricting what is executable is generally desirable when defending
against overflow attacks
This patch clears the personality flag when setting up the personality for
newly spwaned native tasks. Given that semi-recent AArch64 toolchains emit
a non-executable PT_GNU_STACK header, userspace applications can already
not rely on READ_IMPLIES_EXEC so shouldn't be adversely affected by this
change.
Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Dong Bo <dongbo4@huawei.com>
[will: added comment to compat code, rewrote commit message] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Contrary to popular belief, PPIs connected to a GICv3 to not have
an affinity field similar to that of GICv2. That is consistent
with the fact that GICv3 is designed to accomodate thousands of
CPUs, and fitting them as a bitmap in a byte is... difficult.
Fixes: adbc3695d9e4 ("arm64: dts: add the Marvell Armada 3700 family and a development board") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes a crash seen while doing a kexec from radix mode to
hash mode. Key 0 is special in hash and used in the RPN by default, we
set the key values to 0 today. In radix mode key 0 is used to control
supervisor<->user access. In hash key 0 is used by default, so the
first instruction after the switch causes a crash on kexec.
Commit 3b10d0095a1e ("powerpc/mm/radix: Prevent kernel execution of
user space") introduced the setting of IAMR and AMOR values to prevent
execution of user mode instructions from supervisor mode. We need to
clean up these SPR's on kexec.
Fixes: 3b10d0095a1e ("powerpc/mm/radix: Prevent kernel execution of user space") Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit
address space for 32-bit pointers. On 32-bit use 4MB, which is the
traditional x86 minimum load location, likely to avoid historically
requiring a 4MB page table entry when only a portion of the first 4MB
would be used (since the NULL address is avoided). For s390 the
position could be 0x10000, but that is needlessly close to the NULL
address.
Link: http://lkml.kernel.org/r/1498154792-49952-5-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit
address space for 32-bit pointers. On 32-bit use 4MB, which is the
traditional x86 minimum load location, likely to avoid historically
requiring a 4MB page table entry when only a portion of the first 4MB
would be used (since the NULL address is avoided).
Link: http://lkml.kernel.org/r/1498154792-49952-4-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit
address space for 32-bit pointers. On 32-bit use 4MB, to match ARM.
This could be 0x8000, the standard ET_EXEC load address, but that is
needlessly close to the NULL address, and anyone running arm compat PIE
will have an MMU, so the tight mapping is not needed.
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
4MB is chosen here mainly to have parity with x86, where this is the
traditional minimum load location, likely to avoid historically
requiring a 4MB page table entry when only a portion of the first 4MB
would be used (since the NULL address is avoided).
For ARM the position could be 0x8000, the standard ET_EXEC load address,
but that is needlessly close to the NULL address, and anyone running PIE
on 32-bit ARM will have an MMU, so the tight mapping is not needed.
Link: http://lkml.kernel.org/r/1498154792-49952-2-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Qualys Security Advisory <qsa@qualys.com> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The ELF_ET_DYN_BASE position was originally intended to keep loaders
away from ET_EXEC binaries. (For example, running "/lib/ld-linux.so.2
/bin/cat" might cause the subsequent load of /bin/cat into where the
loader had been loaded.)
With the advent of PIE (ET_DYN binaries with an INTERP Program Header),
ELF_ET_DYN_BASE continued to be used since the kernel was only looking
at ET_DYN. However, since ELF_ET_DYN_BASE is traditionally set at the
top 1/3rd of the TASK_SIZE, a substantial portion of the address space
is unused.
For 32-bit tasks when RLIMIT_STACK is set to RLIM_INFINITY, programs are
loaded above the mmap region. This means they can be made to collide
(CVE-2017-1000370) or nearly collide (CVE-2017-1000371) with
pathological stack regions.
Lowering ELF_ET_DYN_BASE solves both by moving programs below the mmap
region in all cases, and will now additionally avoid programs falling
back to the mmap region by enforcing MAP_FIXED for program loads (i.e.
if it would have collided with the stack, now it will fail to load
instead of falling back to the mmap region).
To allow for a lower ELF_ET_DYN_BASE, loaders (ET_DYN without INTERP)
are loaded into the mmap region, leaving space available for either an
ET_EXEC binary with a fixed location or PIE being loaded into mmap by
the loader. Only PIE programs are loaded offset from ELF_ET_DYN_BASE,
which means architectures can now safely lower their values without risk
of loaders colliding with their subsequently loaded programs.
For 64-bit, ELF_ET_DYN_BASE is best set to 4GB to allow runtimes to use
the entire 32-bit address space for 32-bit pointers.
Thanks to PaX Team, Daniel Micay, and Rik van Riel for inspiration and
suggestions on how to implement this solution.
Fixes: d1fd836dcf00 ("mm: split ET_DYN ASLR from mmap ASLR") Link: http://lkml.kernel.org/r/20170621173201.GA114489@beast Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Rik van Riel <riel@redhat.com> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Qualys Security Advisory <qsa@qualys.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Pratyush Anand <panand@redhat.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As of perl 5, version 26, subversion 0 (v5.26.0) some new warnings have
occurred when running checkpatch.
Unescaped left brace in regex is deprecated here (and will be fatal in
Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.\s*){
<-- HERE \s*/ at scripts/checkpatch.pl line 3544.
Unescaped left brace in regex is deprecated here (and will be fatal in
Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.\s*){
<-- HERE \s*/ at scripts/checkpatch.pl line 3885.
Unescaped left brace in regex is deprecated here (and will be fatal in
Perl 5.30), passed through in regex; marked by <-- HERE in
m/^(\+.*(?:do|\))){ <-- HERE / at scripts/checkpatch.pl line 4374.
It seems perfectly reasonable to do as the warning suggests and simply
escape the left brace in these three locations.
__list_lru_walk_one() acquires nlru spin lock (nlru->lock) for longer
duration if there are more number of items in the lru list. As per the
current code, it can hold the spin lock for upto maximum UINT_MAX
entries at a time. So if there are more number of items in the lru
list, then "BUG: spinlock lockup suspected" is observed in the below
path:
Fix this lockup by reducing the number of entries to be shrinked from
the lru list to 1024 at once. Also, add cond_resched() before
processing the lru list again.
Link: http://marc.info/?t=149722864900001&r=1&w=2 Link: http://lkml.kernel.org/r/1498707575-2472-1-git-send-email-stummala@codeaurora.org Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Suggested-by: Jan Kara <jack@suse.cz> Suggested-by: Vladimir Davydov <vdavydov.dev@gmail.com> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Alexander Polakov <apolyakov@beget.ru> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
list_lru_count_node() iterates over all memcgs to get the total number of
entries on the node but it can race with memcg_drain_all_list_lrus(),
which migrates the entries from a dead cgroup to another. This can return
incorrect number of entries from list_lru_count_node().
Fix this by keeping track of entries per node and simply return it in
list_lru_count_node().
Link: http://lkml.kernel.org/r/1498707555-30525-1-git-send-email-stummala@codeaurora.org Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Jan Kara <jack@suse.cz> Cc: Alexander Polakov <apolyakov@beget.ru> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
core_kernel_text is used by MIPS in its function graph trace processing,
so having this method traced leads to an infinite set of recursive calls
such as:
The motivation for commit abb2ea7dfd82 ("compiler, clang: suppress
warning for unused static inline functions") was to suppress clang's
warnings about unused static inline functions.
For configs without CONFIG_OPTIMIZE_INLINING enabled, such as any non-x86
architecture, `inline' in the kernel implies that
__attribute__((always_inline)) is used.
Some code depends on that behavior, see
https://lkml.org/lkml/2017/6/13/918:
net/built-in.o: In function `__xchg_mb':
arch/arm64/include/asm/cmpxchg.h:99: undefined reference to `__compiletime_assert_99'
arch/arm64/include/asm/cmpxchg.h:99: undefined reference to `__compiletime_assert_99
The full fix would be to identify these breakages and annotate the
functions with __always_inline instead of `inline'. But since we are
late in the 4.12-rc cycle, simply carry forward the forced inlining
behavior and work toward moving arm64, and other architectures, toward
CONFIG_OPTIMIZE_INLINING behavior.
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1706261552200.1075@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Sodagudi Prasad <psodagud@codeaurora.org> Tested-by: Sodagudi Prasad <psodagud@codeaurora.org> Tested-by: Matthias Kaehlcke <mka@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
liblockdep has been broken since commit 75dd602a5198 ("lockdep: Fix
lock_chain::base size"), as that adds a check that MAX_LOCK_DEPTH is
within the range of lock_chain::depth and in liblockdep it is much
too large.
That should have resulted in a compiler error, but didn't because:
- the check uses ARRAY_SIZE(), which isn't yet defined in liblockdep
so is assumed to be an (undeclared) function
- putting a function call inside a BUILD_BUG_ON() expression quietly
turns it into some nonsense involving a variable-length array
It did produce a compiler warning, but I didn't notice because
liblockdep already produces too many warnings if -Wall is enabled
(which I'll fix shortly).
Even before that commit, which reduced lock_chain::depth from 8 bits
to 6, MAX_LOCK_DEPTH was too large.
This is because of commit f98db6013c55 ("sched/core: Add switch_mm_irqs_off()
and use it in the scheduler") in which switch_mm_irqs_off() is called by the
scheduler, vs switch_mm() which is used by use_mm().
This patch lets the parisc code mirror the x86 and powerpc code, ie. it
disables interrupts in switch_mm(), and optimises the scheduler case by
defining switch_mm_irqs_off().
Cause is a call to dma_coerce_mask_and_coherenet in parport_pc_probe_port,
which PARISC DMA API doesn't handle very nicely. This commit gives back
DMA_ERROR_CODE for DMA API calls, if device isn't capable of DMA
transaction.
When a process runs out of stack the parisc kernel wrongly faults with SIGBUS
instead of the expected SIGSEGV signal.
This example shows how the kernel faults:
do_page_fault() command='a.out' type=15 address=0xfaac2000 in libc-2.24.so[f8308000+16c000]
trap #15: Data TLB miss fault, vm_start = 0xfa2c2000, vm_end = 0xfaac2000
The vma->vm_end value is the first address which does not belong to the vma, so
adjust the check to include vma->vm_end to the range for which to send the
SIGSEGV signal.
This patch unbreaks building the debian libsigsegv package.
The GICv3 driver doesn't check if the target CPU for gic_set_affinity
is valid before going ahead and making the changes. This triggers the
following splat with KASAN:
Unset-KVM and decrement-assignment only when we find the group in our
list. Otherwise we can get out of sync if the user triggers this for
groups that aren't currently on our list.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fixes a typo where the wrong loop index was used to index
the kvmppc_xive_vcpu.queues[] array in xive_pre_save_scan().
The variable i contains the vcpu number; we need to index queues[]
using j, which iterates from 0 to KVMPPC_XIVE_Q_COUNT-1.
The effect of this bug is that things that save the interrupt
controller state, such as "virsh dump", on a VM with more than
8 vCPUs, result in xive_pre_save_queue() getting called on a
bogus queue structure, usually resulting in a crash like this:
When reading the cntpct_el0 in guest with VHE (Virtual Host Extension)
enabled in host, the "Unsupported guest sys_reg access" error reported.
The reason is cnthctl_el2.EL1PCTEN is not enabled, which is expected
to be done in kvm_timer_init_vhe(). The problem is kvm_timer_init_vhe
is called by cpu_init_hyp_mode, and which is called when VHE is disabled.
This patch remove the incorrect call to kvm_timer_init_vhe() from
cpu_init_hyp_mode(), and calls kvm_timer_init_vhe() to enable
cnthctl_el2.EL1PCTEN in cpu_hyp_reinit().
Fixes: 488f94d7212b ("KVM: arm64: Access CNTHCTL_EL2 bit fields correctly on VHE systems") Signed-off-by: Hu Huajun <huhuajun@huawei.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
nla policy checks for only maximum length of the attribute data when the
attribute type is NLA_BINARY. If userspace sends less data than
specified, cfg80211 may access illegal memory. When type is NLA_UNSPEC,
nla policy check ensures that userspace sends minimum specified length
number of bytes.
Remove type assignment to NLA_BINARY from nla_policy of
NL80211_NAN_FUNC_SERVICE_ID to make these NLA_UNSPEC and to make sure
minimum NL80211_NAN_FUNC_SERVICE_ID_LEN bytes are received from
userspace with NL80211_NAN_FUNC_SERVICE_ID.
nla policy checks for only maximum length of the attribute data
when the attribute type is NLA_BINARY. If userspace sends less
data than specified, the wireless drivers may access illegal
memory. When type is NLA_UNSPEC, nla policy check ensures that
userspace sends minimum specified length number of bytes.
Remove type assignment to NLA_BINARY from nla_policy of
NL80211_ATTR_PMKID to make this NLA_UNSPEC and to make sure minimum
WLAN_PMKID_LEN bytes are received from userspace with
NL80211_ATTR_PMKID.
validate_scan_freqs() retrieves frequencies from attributes
nested in the attribute NL80211_ATTR_SCAN_FREQUENCIES with
nla_get_u32(), which reads 4 bytes from each attribute
without validating the size of data received. Attributes
nested in NL80211_ATTR_SCAN_FREQUENCIES don't have an nla policy.
Validate size of each attribute before parsing to avoid potential buffer
overread.
Fixes: 2a519311926 ("cfg80211/nl80211: scanning (and mac80211 update to use it)") Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Buffer overread may happen as nl80211_set_station() reads 4 bytes
from the attribute NL80211_ATTR_LOCAL_MESH_POWER_MODE without
validating the size of data received when userspace sends less
than 4 bytes of data with NL80211_ATTR_LOCAL_MESH_POWER_MODE.
Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODE to avoid
the buffer overread.
Fixes: 3b1c5a5307f ("{cfg,nl}80211: mesh power mode primitives and userspace access") Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
An earlier change to this function (3bdae810721b) fixed a leak in the
case of an unsuccessful call to brcmf_sdiod_buffrw(). However, the
glom_skb buffer, used for emulating a scattering read, is never used
or referenced after its contents are copied into the destination
buffers, and therefore always needs to be freed by the end of the
function.
Fixes: 3bdae810721b ("brcmfmac: Fix glob_skb leak in brcmf_sdiod_recv_chain") Fixes: a413e39a38573 ("brcmfmac: fix brcmf_sdcard_recv_chain() for host without sg support") Signed-off-by: Peter S. Housel <housel@acm.org> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function assumes that each PMD points to head of a
huge page. This is not correct as a PMD can point to
start of any 8M region with a, say 256M, hugepage. The
fix ensures that it points to the correct head of any PMD
huge page.
Cc: Julian Calaby <julian.calaby@gmail.com> Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The following regex in Makefile.build matches only one ___EXPORT_SYMBOL per line.
sed
's/.*___EXPORT_SYMBOL[[:space:]]*\([a-zA-Z0-9_]*\)[[:space:]]*,.*/EXPORT_SYMBOL(\1);/'
ATOMIC_OPS macro in atomic_64.S expands multiple symbols in same line hence
version generation is done only for the last matched symbol. This patch adds
new line between the symbol expansions.
Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> Reviewed-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds the prototypes of assembly defined functions to asm-prototypes.h.
Some prototypes are directly added as they are not present in any existing header
files.
Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> Reviewed-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we have more than 32 unicast MAC addresses assigned to an interface
we will read beyond the end of the address table in the driver when
adding filters. The next 256 entries store multicast addresses, so we
will end up attempting to insert duplicate filters, which is mostly
harmless. If we add more than 288 unicast addresses we will then read
past the multicast address table, which is likely to be more exciting.
Fixes: 12fb0da45c9a ("sfc: clean fallbacks between promisc/normal in efx_ef10_filter_sync_rx_mode") Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The lower level nl80211 code in cfg80211 ensures that "len" is between
25 and NL80211_ATTR_FRAME (2304). We subtract DOT11_MGMT_HDR_LEN (24) from
"len" so thats's max of 2280. However, the action_frame->data[] buffer is
only BRCMF_FIL_ACTION_FRAME_SIZE (1800) bytes long so this memcpy() can
overflow.
We are not allowed to block on the RCU reader side, so can't
just hold the mutex as before. As a quick fix, convert it to
a spinlock.
Fixes: d9f1f61c0801 ("tap: Extending tap device create/destroy APIs") Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>