The reason is that the trace event filter treats the user space pointer
defined by "filename" as a normal pointer and compares it against the "cpu"
string. The following bug happened:
The above happened because the kernel tried to access user space directly
and triggered a "supervisor read access in kernel mode" fault. Worse yet,
the memory might not even be loaded yet, and a SEGFAULT could happen as
well. This could be true for kernel space accesses as well.
To be even more robust, test both kernel and user space strings. If the
string fails to read, then simply have the filter fail.
Note, TASK_SIZE is used to determine if the pointer is user or kernel space,
and the appropriate strncpy_from_kernel/user_nofault() function is used to
copy the memory. For some architectures, the compare against TASK_SIZE may
always pick user space or kernel space. If it picks wrong, the only
consequence is that the filter will fail to match. In the future, this needs
to be fixed by having the event denote which address space should be used.
But failing a filter is much better than panicking the machine, and that can
be solved later.
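A minimal sketch of the approach described above; the helper name and its
surroundings are assumptions, while the *_nofault() copy functions are the
ones named in the text:

  /*
   * Sketch only, not the exact filter code: pick the fault-safe copy
   * helper based on where the pointer appears to live.  If TASK_SIZE
   * picks the wrong side, the copy fails and the filter does not match.
   */
  static long filter_copy_string(char *dst, const void *src, long len)
  {
  	if ((unsigned long)src < TASK_SIZE)
  		return strncpy_from_user_nofault(dst, (const void __user *)src, len);

  	return strncpy_from_kernel_nofault(dst, src, len);
  }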
Link: https://lore.kernel.org/all/20220107044951.22080-1-kernelfans@gmail.com/ Link: https://lkml.kernel.org/r/20220110115532.536088fd@gandalf.local.home Cc: stable@vger.kernel.org Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Tom Zanussi <zanussi@kernel.org> Reported-by: Pingfan Liu <kernelfans@gmail.com> Tested-by: Pingfan Liu <kernelfans@gmail.com> Fixes: 87a342f5db69d ("tracing/filters: Support filtering for char * strings") Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In exfat_truncate(), the computation of inode->i_blocks is wrong if
the file is larger than 4 GiB because a 32-bit variable is used as a
mask. This is fixed and simplified by using round_up().
Also fix the same buggy computation in exfat_read_root() and another
(correct) one in exfat_fill_inode(). The latter was fixed another way
last month but can be simplified by using round_up() as well. See:
commit 0c336d6e33f4 ("exfat: fix incorrect loading of i_blocks for
large files")
pm_runtime_get_() increments the runtime PM usage counter even
when it returns an error code, thus a matching decrement is needed on
the error handling path to keep the counter balanced.
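A minimal, generic sketch of the balancing pattern described, assuming the
truncated name refers to one of the pm_runtime_get*() variants, e.g.
pm_runtime_get_sync():

  ret = pm_runtime_get_sync(dev);
  if (ret < 0) {
  	pm_runtime_put_noidle(dev);    /* drop the usage count taken even on failure */
  	return ret;
  }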
Choose some other visible function for the test. The function does not
have to be actually executed during the test, because it is only testing
the filter API interface.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
seccomp_bpf failed on tests 47 global.user_notification_filter_empty
and 48 global.user_notification_filter_empty_threaded when tested on an
updated kernel but with old kernel headers, because the old headers lack
the definition of the __NR_clone3 macro required by these two tests.
Since under selftests/ we can install headers once for all tests (the
default INSTALL_HDR_PATH is usr/include), fix it by adding usr/include
to the list of directories to be searched. Use "-isystem" to indicate it
is a system directory, as the real kernel header directories are.
Signed-off-by: Sherry Yang <sherry.yang@oracle.com> Tested-by: Sherry Yang <sherry.yang@oracle.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When we create a file with modefromsids we set an ACL that
has one ACE for the magic modefromsid as well as a second ACE that
grants full access to all authenticated users.
When we later change the mode on the file, we strip away this and other
ACEs for authenticated users in set_chmod_dacl() and then just add
back/update the modefromsid ACE.
This leaves the file with a single ACE that is for the mode and no ACE
to grant any user any rights to access the file.
Fix this by also always adding back the ACE for authenticated users so
that we do not drop the rights to access the file.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
On newer AMD platforms with SFH, it is observed that random interrupts
get generated on the SFH hardware, and until these are cleared the
firmware sensor processing is stalled, resulting in no data being
received on the driver side.
Add routines to handle these interrupts, so that firmware operations are
not stalled.
Newer AMD platforms with SFH may generate unwarranted interrupts on some
events. Until these are cleared, the actual MP2 data processing may be
stalled in some cases.
Add a mechanism to clear the pending interrupts (if any) during the
driver initialization and sensor command operations.
Since in the current amd_sfh design the sensor data is obtained
periodically by polling, scheduling delayed work across the
suspend/resume cycle adds no value.
So, cancel the work on suspend and restart it on resume.
When cifs_get_root() fails during cifs_smb3_do_mount() we call
deactivate_locked_super() which eventually will call delayed_free() which
will free the context.
In this situation we should not proceed to enter the out: section in
cifs_smb3_do_mount() and free the same resources a second time.
[Thu Feb 10 12:59:06 2022] BUG: KASAN: use-after-free in rcu_cblist_dequeue+0x32/0x60
[Thu Feb 10 12:59:06 2022] Read of size 8 at addr ffff888364f4d110 by task swapper/1/0
[Thu Feb 10 12:59:06 2022] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G OE 5.17.0-rc3+ #4
[Thu Feb 10 12:59:06 2022] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.0 12/17/2019
[Thu Feb 10 12:59:06 2022] Call Trace:
[Thu Feb 10 12:59:06 2022] <IRQ>
[Thu Feb 10 12:59:06 2022] dump_stack_lvl+0x5d/0x78
[Thu Feb 10 12:59:06 2022] print_address_description.constprop.0+0x24/0x150
[Thu Feb 10 12:59:06 2022] ? rcu_cblist_dequeue+0x32/0x60
[Thu Feb 10 12:59:06 2022] kasan_report.cold+0x7d/0x117
[Thu Feb 10 12:59:06 2022] ? rcu_cblist_dequeue+0x32/0x60
[Thu Feb 10 12:59:06 2022] __asan_load8+0x86/0xa0
[Thu Feb 10 12:59:06 2022] rcu_cblist_dequeue+0x32/0x60
[Thu Feb 10 12:59:06 2022] rcu_core+0x547/0xca0
[Thu Feb 10 12:59:06 2022] ? call_rcu+0x3c0/0x3c0
[Thu Feb 10 12:59:06 2022] ? __this_cpu_preempt_check+0x13/0x20
[Thu Feb 10 12:59:06 2022] ? lock_is_held_type+0xea/0x140
[Thu Feb 10 12:59:06 2022] rcu_core_si+0xe/0x10
[Thu Feb 10 12:59:06 2022] __do_softirq+0x1d4/0x67b
[Thu Feb 10 12:59:06 2022] __irq_exit_rcu+0x100/0x150
[Thu Feb 10 12:59:06 2022] irq_exit_rcu+0xe/0x30
[Thu Feb 10 12:59:06 2022] sysvec_hyperv_stimer0+0x9d/0xc0
...
[Thu Feb 10 12:59:07 2022] Freed by task 58179:
[Thu Feb 10 12:59:07 2022] kasan_save_stack+0x26/0x50
[Thu Feb 10 12:59:07 2022] kasan_set_track+0x25/0x30
[Thu Feb 10 12:59:07 2022] kasan_set_free_info+0x24/0x40
[Thu Feb 10 12:59:07 2022] ____kasan_slab_free+0x137/0x170
[Thu Feb 10 12:59:07 2022] __kasan_slab_free+0x12/0x20
[Thu Feb 10 12:59:07 2022] slab_free_freelist_hook+0xb3/0x1d0
[Thu Feb 10 12:59:07 2022] kfree+0xcd/0x520
[Thu Feb 10 12:59:07 2022] cifs_smb3_do_mount+0x149/0xbe0 [cifs]
[Thu Feb 10 12:59:07 2022] smb3_get_tree+0x1a0/0x2e0 [cifs]
[Thu Feb 10 12:59:07 2022] vfs_get_tree+0x52/0x140
[Thu Feb 10 12:59:07 2022] path_mount+0x635/0x10c0
[Thu Feb 10 12:59:07 2022] __x64_sys_mount+0x1bf/0x210
[Thu Feb 10 12:59:07 2022] do_syscall_64+0x5c/0xc0
[Thu Feb 10 12:59:07 2022] entry_SYSCALL_64_after_hwframe+0x44/0xae
[Thu Feb 10 12:59:07 2022] Last potentially related work creation:
[Thu Feb 10 12:59:07 2022] kasan_save_stack+0x26/0x50
[Thu Feb 10 12:59:07 2022] __kasan_record_aux_stack+0xb6/0xc0
[Thu Feb 10 12:59:07 2022] kasan_record_aux_stack_noalloc+0xb/0x10
[Thu Feb 10 12:59:07 2022] call_rcu+0x76/0x3c0
[Thu Feb 10 12:59:07 2022] cifs_umount+0xce/0xe0 [cifs]
[Thu Feb 10 12:59:07 2022] cifs_kill_sb+0xc8/0xe0 [cifs]
[Thu Feb 10 12:59:07 2022] deactivate_locked_super+0x5d/0xd0
[Thu Feb 10 12:59:07 2022] cifs_smb3_do_mount+0xab9/0xbe0 [cifs]
[Thu Feb 10 12:59:07 2022] smb3_get_tree+0x1a0/0x2e0 [cifs]
[Thu Feb 10 12:59:07 2022] vfs_get_tree+0x52/0x140
[Thu Feb 10 12:59:07 2022] path_mount+0x635/0x10c0
[Thu Feb 10 12:59:07 2022] __x64_sys_mount+0x1bf/0x210
[Thu Feb 10 12:59:07 2022] do_syscall_64+0x5c/0xc0
[Thu Feb 10 12:59:07 2022] entry_SYSCALL_64_after_hwframe+0x44/0xae
Reported-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When idsfromsid is used we create a special SID for owner/group.
This structure must be initialized or else the first 5 bytes
of the Authority field of the SID will contain uninitialized data
and thus not be a valid SID.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If backing file's filesystem has implemented ->fallocate(), we think the
loop device can support discard, then pass sb->s_blocksize as
discard_granularity. However, some underlying FS, such as overlayfs,
doesn't set sb->s_blocksize, and causes discard_granularity to be set as
zero, then the warning in __blkdev_issue_discard() is triggered.
Christoph suggested passing kstatfs.f_bsize as the discard granularity,
and this way is fine because kstatfs.f_bsize means 'optimal transfer
block size', which still matches the definition of discard granularity.
So fix the issue by setting discard_granularity to kstatfs.f_bsize if it
is available, otherwise claim discard isn't supported.
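A hedged sketch of the described logic; the queue variable and the exact
surrounding loop-driver code are assumptions:

  struct kstatfs sbuf;
  u32 granularity = 0;

  if (file->f_op->fallocate && !vfs_statfs(&file->f_path, &sbuf))
  	granularity = sbuf.f_bsize;            /* "optimal transfer block size" */

  if (granularity) {
  	q->limits.discard_granularity = granularity;
  	blk_queue_max_discard_sectors(q, UINT_MAX >> 9);
  } else {
  	q->limits.discard_granularity = 0;
  	blk_queue_max_discard_sectors(q, 0);   /* claim no discard support */
  }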
Cc: Christoph Hellwig <hch@lst.de> Cc: Vivek Goyal <vgoyal@redhat.com> Reported-by: Pei Zhang <pezhang@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220126035830.296465-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
It appears that a read access to GIC[DR]_I[CS]PENDRn doesn't always
result in the pending interrupts being accurately reported if they are
mapped to a HW interrupt. This is particularly visible when acking
the timer interrupt and reading the GICR_ISPENDR1 register immediately
after, for example (the interrupt appears as not-pending while it really
is...).
This is because a HW interrupt has its 'active and pending state' kept
in the *physical* distributor, and not in the virtual one, as mandated
by the spec (this is what allows the direct deactivation). The virtual
distributor only carries the pending and active *states* (note the
plural, as these are two independent and non-overlapping states).
Fix it by reading the HW state back, either from the timer itself or
from the distributor if necessary.
This is because we started using writeback_inodes_sb() to flush delalloc
when committing a transaction (when using -o flushoncommit), in order to
avoid deadlocks with filesystem freeze operations. This change was made
by commit ce8ea7cc6eb313 ("btrfs: don't call btrfs_start_delalloc_roots
in flushoncommit"). After that change we started producing that warning,
and every now and then a user reports this since the warning happens too
often, it spams dmesg/syslog, and a user is unsure if this reflects any
problem that might compromise the filesystem's reliability.
We can not just lock the sb->s_umount semaphore before calling
writeback_inodes_sb(), because that would at least deadlock with
filesystem freezing, since at fs/super.c:freeze_super() sync_filesystem()
is called while we are holding that semaphore in write mode, and that can
trigger a transaction commit, resulting in a deadlock. It would also
trigger the same type of deadlock in the unmount path. Possibly, it could
also introduce some other locking dependencies that lockdep would report.
To fix this call try_to_writeback_inodes_sb() instead of
writeback_inodes_sb(), because that will try to read lock sb->s_umount
and then will only call writeback_inodes_sb() if it was able to lock it.
This is fine because the cases where it can't read lock sb->s_umount
are during a filesystem unmount or during a filesystem freeze - in those
cases sb->s_umount is write locked and sync_filesystem() is called, which
calls writeback_inodes_sb(). In other words, in all cases where we can't
take a read lock on sb->s_umount, writeback is already being triggered
elsewhere.
An alternative would be to call btrfs_start_delalloc_roots() with a
number of pages different from LONG_MAX, for example matching the number
of delalloc bytes we currently have, in which case we would end up
starting all delalloc with filemap_fdatawrite_wbc() and not with an
async flush via filemap_flush() - that is only possible after the rather
recent commit e076ab2a2ca70a ("btrfs: shrink delalloc pages instead of
full inodes"). However that creates a whole new can of worms due to new
lock dependencies, which lockdep complains, like for example:
[ 8948.247280] ======================================================
[ 8948.247823] WARNING: possible circular locking dependency detected
[ 8948.248353] 5.17.0-rc1-btrfs-next-111 #1 Not tainted
[ 8948.248786] ------------------------------------------------------
[ 8948.249320] kworker/u16:18/933570 is trying to acquire lock:
[ 8948.249812] ffff9b3de1591690 (sb_internal#2){.+.+}-{0:0}, at: find_free_extent+0x141e/0x1590 [btrfs]
[ 8948.250638]
but task is already holding lock:
[ 8948.251140] ffff9b3e09c717d8 (&root->delalloc_mutex){+.+.}-{3:3}, at: start_delalloc_inodes+0x78/0x400 [btrfs]
[ 8948.252018]
which lock already depends on the new lock.
It all comes from the fact that btrfs_start_delalloc_roots() takes the
delalloc_root_mutex, in the transaction commit path we are holding a
read lock on one of the superblock's freeze semaphores (via
sb_start_intwrite()), the async reclaim task can also do a call to
btrfs_start_delalloc_roots(), which ends up triggering writeback with
calls to filemap_fdatawrite_wbc(), resulting in extent allocation which
in turn can call btrfs_start_transaction(), which will result in taking
the freeze semaphore via sb_start_intwrite(), forming a nasty dependency
on all those locks which can be taken in different orders by different
code paths.
So just adopt the simple approach of calling try_to_writeback_inodes_sb()
at btrfs_start_delalloc_flush().
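A hedged sketch of what the described change could look like in
btrfs_start_delalloc_flush(); the surrounding code and the wb_reason
value used are assumptions:

  static inline void btrfs_start_delalloc_flush(struct btrfs_fs_info *fs_info)
  {
  	/*
  	 * try_to_writeback_inodes_sb() only proceeds if it can read-lock
  	 * sb->s_umount; during unmount/freeze that lock is held for
  	 * writing and writeback is already triggered elsewhere.
  	 */
  	if (btrfs_test_opt(fs_info, FLUSHONCOMMIT))
  		try_to_writeback_inodes_sb(fs_info->sb, WB_REASON_SYNC);
  }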
Buttonpads are expected to map the INPUT_PROP_BUTTONPAD property bit
and the BTN_LEFT key bit.
As explained in the specification, where a device has a button type
value of 0 (click-pad) or 1 (pressure-pad) there should not be
discrete buttons:
https://docs.microsoft.com/en-us/windows-hardware/design/component-guidelines/touchpad-windows-precision-touchpad-collection#device-capabilities-feature-report
However, some drivers map the BTN_RIGHT and/or BTN_MIDDLE key bits even
though the device is a buttonpad and therefore does not have those
buttons.
This behavior has forced userspace applications like libinput to
implement different workarounds and quirks to detect buttonpads and
offer to the user the right set of features and configuration options.
For more information:
https://gitlab.freedesktop.org/libinput/libinput/-/merge_requests/726
In order to avoid this issue clear the BTN_RIGHT and BTN_MIDDLE key
bits when the input device is registered if the INPUT_PROP_BUTTONPAD
property bit is set.
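A hedged sketch of the described behavior at input-device registration
time (the exact placement in the HID input code is an assumption):

  /* A buttonpad has no discrete right/middle buttons, so drop those bits. */
  if (test_bit(INPUT_PROP_BUTTONPAD, input->propbit)) {
  	__clear_bit(BTN_RIGHT, input->keybit);
  	__clear_bit(BTN_MIDDLE, input->keybit);
  }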
Notice that this change will not affect udev because it does not check
for buttons. See systemd/src/udev/udev-builtin-input_id.c.
List of known affected hardware:
- Chuwi AeroBook Plus
- Chuwi Gemibook
- Framework Laptop
- GPD Win Max
- Huawei MateBook 2020
- Prestigio Smartbook 141 C2
- Purism Librem 14v1
- StarLite Mk II - AMI firmware
- StarLite Mk II - Coreboot firmware
- StarLite Mk III - AMI firmware
- StarLite Mk III - Coreboot firmware
- StarLabTop Mk IV - AMI firmware
- StarLabTop Mk IV - Coreboot firmware
- StarBook Mk V
Acked-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: José Expósito <jose.exposito89@gmail.com> Link: https://lore.kernel.org/r/20220208174806.17183-1-jose.exposito89@gmail.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The check done by regulator_late_cleanup() to detect whether a regulator
is on was inconsistent with the check done by _regulator_is_enabled().
While _regulator_is_enabled() takes the enable GPIO into account,
regulator_late_cleanup() was not doing that.
This resulted in a false positive, e.g. when a GPIO-controlled fixed
regulator was used, which was not enabled at boot time, e.g.
Such a regulator doesn't have an is_enabled() operation. Nevertheless
its state can be determined based on the enable GPIO. The check in
regulator_late_cleanup() wrongly assumed that the regulator is on and
tried to disable it.
The current rt5682_jack_detect_handler() assumes the component
and card will always show up and implements an infinite usleep
loop waiting for them to show up.
This does not hold true if a codec interrupt (or other
event) occurs when the card is unbound. The codec driver's
remove or shutdown functions cannot cancel the workqueue due
to the wait loop. As a result, code can either end up blocking
the workqueue, or hit a kernel oops when the card is freed.
Fix the issue by rescheduling the jack detect handler in
case the card is not ready. In case the card never shows up,
the shutdown/remove/suspend calls can now cancel the detect
task.
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Shuming Fan <shumingf@realtek.com> Link: https://lore.kernel.org/r/20220207153000.3452802-3-kai.vehmanen@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The current rt5668_jack_detect_handler() assumes the component
and card will always show up and implements an infinite usleep
loop waiting for them to show up.
This does not hold true if a codec interrupt (or other
event) occurs when the card is unbound. The codec driver's
remove or shutdown functions cannot cancel the workqueue due
to the wait loop. As a result, code can either end up blocking
the workqueue, or hit a kernel oops when the card is freed.
Fix the issue by rescheduling the jack detect handler in
case the card is not ready. In case the card never shows up,
the shutdown/remove/suspend calls can now cancel the detect
task.
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Shuming Fan <shumingf@realtek.com> Link: https://lore.kernel.org/r/20220207153000.3452802-2-kai.vehmanen@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The current rt5682s_jack_detect_handler() assumes the component
and card will always show up and implements an infinite usleep
loop waiting for them to show up.
This does not hold true if a codec interrupt (or other
event) occurs when the card is unbound. The codec driver's
remove or shutdown functions cannot cancel the workqueue due
to the wait loop. As a result, code can either end up blocking
the workqueue, or hit a kernel oops when the card is freed.
Fix the issue by rescheduling the jack detect handler in
case the card is not ready. In case the card never shows up,
the shutdown/remove/suspend calls can now cancel the detect
task.
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Shuming Fan <shumingf@realtek.com> Link: https://lore.kernel.org/r/20220207153000.3452802-1-kai.vehmanen@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
At power-on the CLKT register contains 0x40, which at our typical 100 kHz
bus rate means 0.64 ms. But there is no specified limit to how long devices
should be able to stretch the clocks, so just disable the timeout. We
still have a timeout wrapping the entire transfer.
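A hedged sketch of the change this implies; the register-write helper
name follows the driver's usual pattern but is an assumption here:

  /* Disable the clock-stretch timeout; 0 in CLKT means "no timeout". */
  bcm2835_i2c_writel(i2c_dev, BCM2835_I2C_CLKT, 0);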
Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> BugLink: https://github.com/raspberrypi/linux/issues/3064 Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In mac80211_hwsim, the probe_req frame is created and sent while
scanning. It is sent with ieee80211_tx_info which is not initialized.
Uninitialized ieee80211_tx_info can cause problems when using
mac80211_hwsim with wmediumd. wmediumd checks the tx_rates field of
ieee80211_tx_info and doesn't relay the probe_req frame to other clients
even if it is a broadcast message.
Call ieee80211_tx_prepare_skb() to initialize ieee80211_tx_info for
the probe_req that is created by hw_scan_work in mac80211_hwsim.
memblock.{reserved,memory}.regions may be allocated using kmalloc() in
memblock_double_array(). Use kfree() to release these kmalloced regions
indicated by memblock_{reserved,memory}_in_slab.
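A hedged sketch of the described release path, modeled on memblock's
existing late-free pattern (the exact placement is an assumption):

  phys_addr_t size = PAGE_ALIGN(sizeof(struct memblock_region) *
  			      memblock.reserved.max);

  if (memblock_reserved_in_slab)
  	kfree(memblock.reserved.regions);   /* kmalloc'ed in memblock_double_array() */
  else
  	memblock_free_late(__pa(memblock.reserved.regions), size);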
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Fixes: 3010f876500f ("mm: discard memblock data later") Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The tegra186 GPIO driver makes the assumption that the pointer
returned by irq_data_get_irq_chip_data() is a pointer to a
tegra_gpio structure. Unfortunately, it is actually a pointer
to the inner gpio_chip structure, as mandated by the gpiolib
infrastructure. Nice try.
The saving grace is that the gpio_chip is the first member of
tegra_gpio, so the bug has gone undetected since... forever.
Fix it by performing a container_of() on the pointer. This results
in no additional code, and makes it possible to understand how
the whole thing works.
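A hedged sketch of the container_of() fix described; the name of the
embedded gpio_chip member is an assumption:

  struct gpio_chip *chip = irq_data_get_irq_chip_data(data);   /* gpiolib-provided pointer */
  struct tegra_gpio *gpio = container_of(chip, struct tegra_gpio, gpio);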
In the current implementation the user may open a virtual tty which then
could fail to establish the underlying DLCI. The function gsmtty_open()
gets stuck in tty_port_block_til_ready() while waiting for a carrier rise.
This happens if the remote side fails to acknowledge the link establishment
request in time or completely. At some point gsm_dlci_close() is called
to abort the link establishment attempt. The function tries to inform the
associated virtual tty by performing a hangup. But the blocking loop within
tty_port_block_til_ready() is not informed about this event.
The patch proposed here fixes this by resetting the initialization state of
the virtual tty to ensure the loop exits and triggering it to make
tty_port_block_til_ready() return.
The function gsm_process_modem() exists to handle modem status bits of
incoming frames. This includes incoming MSC (modem status command) frames
and convergence layer type 2 data frames. The function, however, was only
designed to handle MSC frames as it expects the command length. Within
gsm_dlci_data() it is wrongly assumed that this is the same as the data
frame length. This is only true if the data frame contains only 1 byte of
payload.
This patch names the length parameter of gsm_process_modem() in a generic
manner to reflect its association. It also corrects all calls to the
function to handle the variable number of modem status octets correctly in
both cases.
Fixes: 7263287af93d ("tty: n_gsm: Fixed logic to decode break signal from modem status") Cc: stable@vger.kernel.org Signed-off-by: Daniel Starke <daniel.starke@siemens.com> Link: https://lore.kernel.org/r/20220218073123.2121-6-daniel.starke@siemens.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tty flow control is handled via gsmtty_throttle() and gsmtty_unthrottle().
Both functions propagate the outgoing hardware flow control state to the
remote side via MSC (modem status command) frames. The local state is taken
from the RTS (ready to send) flag of the tty. However, RTS gets mapped to
DTR (data terminal ready), which is wrong.
This patch corrects this by mapping RTS to RTS.
The commit fixed here made the tty hangup asynchronous to avoid a circular
locking warning. I could not reproduce this warning. Furthermore, due to
the asynchronous hangup the function call now gets queued up while the
underlying tty is being freed. Depending on the timing, this results in a
NULL pointer access in the global work queue scheduler, to be precise in
process_one_work(). Therefore, the previous commit made the issue it tried
to fix worse.
This patch fixes this by falling back to the old behavior which uses a
blocking tty hangup call before freeing up the associated tty.
Fixes: 7030082a7415 ("tty: n_gsm: avoid recursive locking with async port hangup") Cc: stable@vger.kernel.org Signed-off-by: Daniel Starke <daniel.starke@siemens.com> Link: https://lore.kernel.org/r/20220218073123.2121-4-daniel.starke@siemens.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Trying to open a DLCI by sending a SABM frame may fail with a timeout.
The link is closed on the initiator side without informing the responder
about this event. The responder assumes the link is open after sending a
UA frame to answer the SABM frame. The link gets stuck in a half open
state.
This patch fixes this by initiating the proper link termination procedure
after link setup timeout instead of silently closing it down.
n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010.
See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516
The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to
the newer 27.010 here. Chapter 5.2.1.2 describes the encoding of the
C/R (command/response) bit. Table 1 shows that the actual encoding of the
C/R bit is inverted if the associated frame is sent by the responder.
The referenced commit fixed here further broke the internal meaning of this
bit in the outgoing path by always setting the C/R bit regardless of the
frame type.
This patch fixes both by setting the C/R bit always consistently for
command (1) and response (0) frames and inverting it later for the
responder where necessary. The meaning of this bit in the debug output
is being preserved and shows the bit as if it was encoded by the initiator.
This reflects only the frame type rather than the encoded combination of
communication side and frame type.
Fixes: cc0f42122a7e ("tty: n_gsm: Modify CR,PF bit when config requester") Cc: stable@vger.kernel.org Signed-off-by: Daniel Starke <daniel.starke@siemens.com> Link: https://lore.kernel.org/r/20220218073123.2121-2-daniel.starke@siemens.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010.
See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516
The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to
the newer 27.010 here. Chapter 5.4.6.3.7 describes the encoding of the
control signal octet used by the MSC (modem status command). The same
encoding is also used in convergence layer type 2 as described in chapter
5.5.2. Table 7 and 24 both require the DV (data valid) bit to be set 1 for
outgoing control signal octets sent by the DTE (data terminal equipment),
i.e. for the initiator side.
Currently, the DV bit is only set if CD (carrier detect) is on, regardless
of the side.
This patch fixes this behavior by setting the DV bit on the initiator side
unconditionally.
kernel BUG at include/linux/mm.h:2373!
cpu 0x5d: Vector: 700 (Program Check) at [c00000003c6e76e0]
pc: c000000000581a54: pmd_to_page+0x54/0x80
lr: c00000000058d184: move_hugetlb_page_tables+0x4e4/0x5b0
sp: c00000003c6e7980
msr: 9000000000029033
current = 0xc00000003bd8d980
paca = 0xc000200fff610100 irqmask: 0x03 irq_happened: 0x01
pid = 9349, comm = hugepage-mremap
kernel BUG at include/linux/mm.h:2373!
move_hugetlb_page_tables+0x4e4/0x5b0 (link register)
move_hugetlb_page_tables+0x22c/0x5b0 (unreliable)
move_page_tables+0xdbc/0x1010
move_vma+0x254/0x5f0
sys_mremap+0x7c0/0x900
system_call_exception+0x160/0x2c0
The kernel can't use huge_pte_offset() before it sets the pte entry,
because a page table lookup checks for the huge PTE bit in the page table
to differentiate between a huge pte entry and a pointer to a pte page.
huge_pte_alloc() won't mark the page table entry huge, and hence the
kernel should not use huge_pte_offset() after a huge_pte_alloc().
Link: https://lkml.kernel.org/r/20220211063221.99293-1-aneesh.kumar@linux.ibm.com Fixes: 550a7d60bd5e ("mm, hugepages: add mremap() support for hugepage backed vma") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The trace_hardirqs_{on,off}() functions require the caller to set up the
frame pointer properly. This is because these two functions use the macro
'CALLER_ADDR1' (aka. __builtin_return_address(1)) to acquire caller info.
If the $fp is used for other purposes, the code generated by this macro
(as below) could trigger a memory access fault.
The patch below has two "linkcontrol" names causing the duplication.
Fix by using the correct "diag_counters" name on the second instance.
Fixes: 4a7aaf88c89f ("RDMA/qib: Use attributes for the port sysfs") Link: https://lore.kernel.org/r/1645106372-23004-1-git-send-email-mike.marciniszyn@cornelisnetworks.com Cc: <stable@vger.kernel.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The interrupt mask is enabled before any potential failure points in
the driver, which can leave a failure path where we exit with
interrupts enabled but the device not live. This causes an infinite
stream of interrupts on an Apple M1 Pro laptop on USB-C.
Add a failure label that's used post enabling interrupts, where we
mask them again before returning an error.
Suggested-by: Sven Peter <sven@svenpeter.dev> Cc: stable <stable@vger.kernel.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/e6b80669-20f3-06e7-9ed5-8951a9c6db6f@kernel.dk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In rare cases the display is flipped or mirrored. This was observed more
often in a low temperature environment. A clean reset on init_display()
should help to get registers in a sane state.
Fixes: ef8f317795da (staging: fbtft: use init function instead of init sequence) Cc: stable@vger.kernel.org Signed-off-by: Oliver Graute <oliver.graute@kococonnector.com> Link: https://lore.kernel.org/r/20220210085322.15676-1-oliver.graute@kococonnector.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the state is not idle then resolve_prepare_src() should immediately
fail and no change to global state should happen. However, it
unconditionally overwrites the src_addr trying to build a temporary any
address.
For instance if the state is already RDMA_CM_LISTEN then this will corrupt
the src_addr and would cause the test in cma_cancel_operation():
if (cma_any_addr(cma_src_addr(id_priv)) && !id_priv->cma_dev)
Which would manifest as this trace from syzkaller:
BUG: KASAN: use-after-free in __list_add_valid+0x93/0xa0 lib/list_debug.c:26
Read of size 8 at addr ffff8881546491e0 by task syz-executor.1/32204
This is indicating that an rdma_id_private was destroyed without doing
cma_cancel_listens().
Instead of trying to re-use the src_addr memory to indirectly create an
any address derived from the dst build one explicitly on the stack and
bind to that as any other normal flow would do. rdma_bind_addr() will copy
it over the src_addr once it knows the state is valid.
This is similar to commit bc0bdc5afaa7 ("RDMA/cma: Do not change
route.addr.src_addr.ss_family")
Link: https://lore.kernel.org/r/0-v2-e975c8fd9ef2+11e-syz_cma_srcaddr_jgg@nvidia.com Cc: stable@vger.kernel.org Fixes: 732d41c545bb ("RDMA/cma: Make the locking for automatic state transition more clear") Reported-by: syzbot+c94a3675a626f6333d74@syzkaller.appspotmail.com Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is a big gap between inode_should_defrag() and autodefrag extent
size threshold. For inode_should_defrag() it has a flexible
@small_write value. For compressed extents it is 16K, and for
non-compressed extents it's 64K.
However for autodefrag extent size threshold, it's always fixed to the
default value (256K).
This means, the following write sequence will trigger autodefrag to
defrag ranges which didn't trigger autodefrag:
pwrite 0 8k
sync
pwrite 8k 128K
sync
The latter 128K write will also be considered a defrag target (if
other conditions are met), while only the 8K write really triggered
autodefrag.
Such behavior can cause extra IO for autodefrag.
Close the gap by copying the @small_write value into inode_defrag, so
that later autodefrag can use the same @small_write value which
triggered it.
Combined with the existing transid value, this allows autodefrag to
really scan only the ranges which triggered autodefrag.
Although this behavior change is mostly reducing the extent_thresh value
for autodefrag, I believe in the future we should allow users to specify
the autodefrag extent threshold through mount options, but that's
another problem to consider in the future.
In the rework of btrfs_defrag_file(), we always call
defrag_one_cluster() and increase the offset by cluster size, which is
only 256K.
But there are cases where we have a large extent (e.g. 128M) which
doesn't need to be defragged at all.
Before the refactor, we could directly skip the range, but now we have to
scan that extent map again and again until the cluster moves after the
non-target extent.
Fix the problem by allowing defrag_one_cluster() to increase
btrfs_defrag_ctrl::last_scanned to the end of an extent, if and only if
the last extent of the cluster is not a target.
The test script looks like this:
mkfs.btrfs -f $dev > /dev/null
mount $dev $mnt
# As btrfs ioctl uses 32M as extent_threshold
xfs_io -f -c "pwrite 0 64M" $mnt/file1
sync
# Some fragmented range to defrag
xfs_io -s -c "pwrite 65548k 4k" \
-c "pwrite 65544k 4k" \
-c "pwrite 65540k 4k" \
-c "pwrite 65536k 4k" \
$mnt/file1
sync
echo "=== before ==="
xfs_io -c "fiemap -v" $mnt/file1
echo "=== after ==="
btrfs fi defrag $mnt/file1
sync
xfs_io -c "fiemap -v" $mnt/file1
umount $mnt
With extra ftrace output added to defrag_one_cluster(), before the patch
it would result in tons of loops:
(As defrag_one_cluster() is inlined, the function name is its caller)
The compressed length can be corrupted to be a lot larger than the memory
we have allocated for the buffer.
This will cause the memcpy in copy_compressed_segment() to write outside
of the allocated memory.
This mostly results in a stuck read syscall, but sometimes, when using
btrfs send, it can result in a #GP.
From the very beginning of btrfs defrag, there is a check to reject
extents which meet both conditions:
- Physically adjacent
We may want to defrag physically adjacent extents to reduce the number
of extents or the size of subvolume tree.
- Larger than 128K
This may be there for compressed extents, but unfortunately 128K is
exactly the max capacity for compressed extents.
And the check is > 128K, thus it never rejects compressed extents.
Furthermore, since the compressed extent capacity bug is fixed by the
previous patch, there is no reason for that check anymore.
The original check has a very small range to reject (the target extent
size is > 128K, and the default extent threshold is 256K), and for
compressed extents it doesn't work at all.
So it's better just to remove the rejection, and allow us to defrag
physically adjacent extents.
CC: stable@vger.kernel.org # 5.16 Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[BUG]
For compressed extents, the defrag ioctl will always try to defrag them,
wasting not only IO but also CPU time to compress/decompress:
The fiemap output below shows that the 2 128K extents just get COWed for
no extra benefit, with extra IO/CPU spent:
=== before ===
/mnt/btrfs/file1:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..255]: 26624..26879 256 0x8
1: [256..511]: 26632..26887 256 0x9
=== after ===
/mnt/btrfs/file1:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..255]: 26640..26895 256 0x8
1: [256..511]: 26648..26903 256 0x9
This affects not only v5.16 (after the defrag rework), but also v5.15
(before the defrag rework).
[CAUSE]
From the very beginning, btrfs defrag never checks if one extent is
already at its max capacity (128K for compressed extents, 128M
otherwise).
And the default extent size threshold is 256K, which is already beyond
the compressed extent max size.
This means, by default the btrfs defrag ioctl will mark all compressed
extents which are not adjacent to a hole/preallocated range for defrag.
[FIX]
Introduce a helper to grab the maximum extent size, and then in
defrag_collect_targets() and defrag_check_next_extent(), reject extents
which are already at their max capacity.
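A hedged sketch of what such a helper could look like; the helper name
and the exact flag test are assumptions:

  static u32 get_extent_max_capacity(const struct extent_map *em)
  {
  	if (test_bit(EXTENT_FLAG_COMPRESSED, &em->flags))
  		return BTRFS_MAX_COMPRESSED;   /* 128K */
  	return BTRFS_MAX_EXTENT_SIZE;          /* 128M */
  }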
Reported-by: Filipe Manana <fdmanana@suse.com> CC: stable@vger.kernel.org # 5.16 Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[BUG]
With older kernels (before v5.16), btrfs will defrag preallocated extents.
While with newer kernels (v5.16 and newer) btrfs will not defrag
preallocated extents, it will defrag the extent just before the
preallocated extent, even if it's just a single sector.
This can be exposed by the following small script:
This defrags the single sector along with the preallocated extent, and
replaces them with a regular extent in a new location (caused by data
COW).
This wastes most of the data IO just for the preallocated range.
The preallocated range is not defragged, but the sector before it still
gets defragged, for which there is no need.
[CAUSE]
One of the functions reused by the old and new behavior is
defrag_check_next_extent(), which determines if we should defrag the
current extent by checking the next one.
It only checks if the next extent is a hole or inlined, but it doesn't
check if it's preallocated.
On the other hand, outside of that function, both old and new kernels
will reject preallocated extents.
This inconsistency causes the behavior above.
[FIX]
- Also check if next extent is preallocated
If so, don't defrag current extent.
- Add comments to each branch explaining why we reject the extent
This will reduce the IO caused by defrag ioctl and autodefrag.
CC: stable@vger.kernel.org # 5.16 Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When unbinding/binding a driver with DMA mapped memory, the DMA map is
not freed before the driver is reloaded. This leads to a memory leak
when the DMA map is overwritten when reprobing the driver.
This can be reproduced with a platform driver having a dma-range:
Wp-gpios property can be used on NVMEM nodes and the same property can
be also used on MTD NAND nodes. In case of the wp-gpios property is
defined at NAND level node, the GPIO management is done at NAND driver
level. Write protect is disabled when the driver is probed or resumed
and is enabled when the driver is released or suspended.
When no partitions are defined in the NAND DT node, then the NAND DT node
will be passed to NVMEM framework. If wp-gpios property is defined in
this node, the GPIO resource is taken twice and the NAND controller
driver fails to probe.
A new Boolean flag named ignore_wp has been added in nvmem_config.
In case ignore_wp is set, it means that the GPIO is handled by the
provider. Lets set this flag in MTD layer to avoid the conflict on
wp_gpios property.
The wp-gpios property can be used on NVMEM nodes and the same property
can also be used on MTD NAND nodes. In case the wp-gpios property is
defined at the NAND node level, the GPIO management is done at the NAND
driver level. Write protect is disabled when the driver is probed or
resumed and is enabled when the driver is released or suspended.
When no partitions are defined in the NAND DT node, the NAND DT node will
be passed to the NVMEM framework. If the wp-gpios property is defined in
this node, the GPIO resource is taken twice and the NAND controller
driver fails to probe.
It would be possible to set config->wp_gpio at the MTD level before
calling the nvmem_register function, but the NVMEM framework would then
toggle this GPIO on each write, whereas this GPIO should only be
controlled at the NAND driver level to ensure that Write Protect has not
been enabled.
A way to fix this conflict is to add a new boolean flag in nvmem_config
named ignore_wp. In case ignore_wp is set, the GPIO resource will be
managed by the provider.
Fixes: 2a127da461a9 ("nvmem: add support for the write-protect pin") Cc: stable@vger.kernel.org Signed-off-by: Christophe Kerello <christophe.kerello@foss.st.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20220220151432.16605-2-srinivas.kandagatla@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The -ENODEV return value from xhci_check_args() is incorrectly changed
to -EINVAL in a couple of places before being propagated further.
xhci_check_args() returns 4 types of value: -ENODEV, -EINVAL, 1 and 0.
xhci_urb_enqueue() and xhci_check_streams_endpoint() return -EINVAL if
the return value of xhci_check_args() is <= 0.
This causes problems, for example for r8152_submit_rx(), which calls
usb_submit_urb() in drivers/net/usb/r8152.c.
r8152_submit_rx() will never get -ENODEV after submitting an urb when the
xHC is halted, because xhci_urb_enqueue() returns -EINVAL at the very
beginning.
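A hedged sketch of the propagation fix this implies at such a call site
(the exact arguments to xhci_check_args() are an assumption):

  ret = xhci_check_args(hcd, urb->dev, urb->ep, true, true, __func__);
  if (ret <= 0)
  	return ret ? ret : -EINVAL;   /* keep -ENODEV instead of collapsing it to -EINVAL */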
When HCE (Host Controller Error) is set, it means an internal
error condition has been detected. Software needs to re-initialize
the HC, so add this check in xhci resume.
The interrupt service routine registered for the gadget is a primary
handler which masks the interrupt source and a threaded handler which
handles the source of the interrupt. Since the threaded handler is
voluntarily threaded, the IRQ core does not disable bottom halves before
invoking the handler, like it does for a force-threaded handler.
Due to changes in networking it became visible that a network gadget's
completion handler may schedule a softirq which remains unprocessed.
The gadget's completion handler is usually invoked either in hard-IRQ or
soft-IRQ context. In that context it is enough to just raise the softirq,
because the softirq itself will be handled once that context is left.
In the case of the voluntarily threaded handler, there is nothing that
will process pending softirqs, which means they remain queued until
another random interrupt (on this CPU) fires and handles them on its exit
path, or another thread locks and unlocks a lock with the bh suffix.
The worst case is that the CPU goes idle and NOHZ complains about
unhandled softirqs.
Disable bottom halves before acquiring the lock (and disabling
interrupts) and enable them after dropping the lock. This ensures that
any pending softirqs are handled right away.
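A hedged sketch of the described pattern in a voluntarily threaded gadget
IRQ handler; the device struct and lock names are assumptions:

  static irqreturn_t gadget_thread_interrupt(int irq, void *_dev)
  {
  	struct my_gadget *dev = _dev;          /* hypothetical device structure */
  	unsigned long flags;

  	local_bh_disable();                    /* so softirqs raised below run on enable */
  	spin_lock_irqsave(&dev->lock, flags);
  	/* ... process events; network completion handlers may raise softirqs ... */
  	spin_unlock_irqrestore(&dev->lock, flags);
  	local_bh_enable();                     /* handles any pending softirqs right away */

  	return IRQ_HANDLED;
  }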
When the Bay Trail phy GPIO mappings were added, cs and reset were
swapped. This did not cause any issues so far, because so far they were
always driven high/low at the same time.
Note the new mapping has been verified both in /sys/kernel/debug/gpio
output on Android factory images on multiple devices, as well as in
the schematics for some devices.
Fixes: 5741022cbdf3 ("usb: dwc3: pci: Add GPIO lookup table on platforms without ACPI GPIO resources") Cc: stable <stable@vger.kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20220213130524.18748-3-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit e0082698b689 ("usb: dwc3: ulpi: conditionally resume ULPI PHY")
fixed an issue where ULPI transfers would time out if any requests were
sent to the phy some time after init, giving it enough time to
auto-suspend.
Commit e5f4ca3fce90 ("usb: dwc3: ulpi: Fix USB2.0 HS/FS/LS PHY suspend
regression") changed the behavior to instead of clearing the
DWC3_GUSB2PHYCFG_SUSPHY bit, add an extra sleep when it is set.
But on Bay Trail devices, when phy_set_mode() gets called during init,
this leads to errors like these:
[ 28.451522] tusb1210 dwc3.ulpi: error -110 writing val 0x01 to reg 0x0a
[ 28.464089] tusb1210 dwc3.ulpi: error -110 writing val 0x01 to reg 0x0a
Add "snps,dis_u2_susphy_quirk" to the settings for Bay Trail devices to
fix this. This restores the old behavior for Bay Trail devices, since
previously the DWC3_GUSB2PHYCFG_SUSPHY bit would get cleared on the first
ulpi_read/_write() and then was never set again.
When the gadget driver hasn't been (yet) configured, and the cable is
connected to a HOST, the SFTDISCON gets cleared unconditionally, so the
HOST tries to enumerate it.
At the host side, this can result in a stuck USB port or worse. When
lucky, some dmesg output can be observed at the host side:
new high-speed USB device number ...
device descriptor read/64, error -110
Fix it in drd, by checking the enabled flag before calling
dwc2_hsotg_core_connect(). It will be called later, once configured,
by the normal flow:
- udc_bind_to_driver
- usb_gadget_connect
- dwc2_hsotg_pullup
- dwc2_hsotg_core_connect
Dell DW5829e is the same as DW5821e except for the CAT level.
DW5821e supports CAT16 while DW5829e supports CAT9.
There are 2 product types of DW5829e: normal and eSIM.
So we will add 2 PIDs for DW5829e, and each PID supports MBIM or RMNET.
Let's see the test evidence below:
BTW, interface 0x6 of MBIM mode is a GNSS port, which is not the same as
the NMEA port, so it's banned from the serial option driver.
The remaining interfaces 0x2-0x5 are: MODEM, MODEM, NMEA, DIAG.
Signed-off-by: Slark Xiao <slark_xiao@163.com> Link: https://lore.kernel.org/r/20220214021401.6264-1-slark_xiao@163.com
[ johan: drop unnecessary reservation of interface 1 ] Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Al Viro brought it to my attention that the dentries may not be filled
when parse_options() is called, causing the call to set_gid() to
possibly crash. It should only be called if parse_options() succeeds
totally anyway.
He suggested the logical place to do the update is in apply_options().
There's no lock for the rndis response list. This could cause list
corruption if there are two different list_add() calls at the same time,
like below.
It's better to add locking in rndis_add_response() /
rndis_free_response() / rndis_get_next_response() to prevent any race
condition on the response list.
CH341 has Product ID 0x5512 in EPP/MEM mode which is used for
I2C/SPI/GPIO interfaces. In asynchronous serial interface mode
CH341 has PID 0x5523 which is already in the table.
The HPT371 chip physically has only one channel, the secondary one,
however the primary channel registers do exist! Thus we have to
manually disable the non-existing channel if the BIOS hasn't done this
already. Similarly to the pata_hpt3x2n driver, always disable the
primary channel.
Fixes: 669a5db411d8 ("[libata] Add a bunch of PATA drivers.") Cc: stable@vger.kernel.org Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
UART drivers are meant to use the port spinlock within certain
methods, to protect against reentrancy. The sc16is7xx driver does
very little locking, presumably because when added it triggers
"scheduling while atomic" errors. This is due to the use of mutexes
within the regmap abstraction layer, and the mutex implementation's
habit of sleeping the current thread while waiting for access.
Unfortunately this lack of interlocking can lead to corruption of
outbound data, which occurs when the buffer used for I2C transmission
is used simultaneously by two threads - a work queue thread running
sc16is7xx_tx_proc, and an IRQ thread in sc16is7xx_port_irq, both
of which can call sc16is7xx_handle_tx.
An earlier patch added efr_lock, a mutex that controls access to the
EFR register. This mutex is already claimed in the IRQ handler, and
all that is required is to claim the same mutex in sc16is7xx_tx_proc.
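A hedged three-line sketch of the change described; how the port and
driver data are looked up inside sc16is7xx_tx_proc() is elided:

  mutex_lock(&s->efr_lock);     /* same mutex the IRQ handler already claims */
  sc16is7xx_handle_tx(port);
  mutex_unlock(&s->efr_lock);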
pm_runtime_enable() will decrement the power disable depth. If the probe
fails, we should use pm_runtime_disable() to balance pm_runtime_enable().
In the PM Runtime docs:
Drivers in ->remove() callback should undo the runtime PM changes done
in ->probe(). Usually this means calling pm_runtime_disable(),
pm_runtime_dont_use_autosuspend() etc.
We should do this in error handling.
Fix this problem for the following drivers: bmc150, bmg160, kmx61,
kxcj-1013, mma9551, mma9553.
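A generic hedged sketch of the error-handling pattern described; the
later setup step is hypothetical:

  pm_runtime_enable(dev);

  ret = setup_device(dev);           /* hypothetical step that may fail */
  if (ret) {
  	pm_runtime_disable(dev);   /* balance pm_runtime_enable() on the error path */
  	return ret;
  }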
Fixes: 7d0ead5c3f00 ("iio: Reconcile operation order between iio_register/unregister and pm functions") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20220106112309.16879-1-linmq006@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On one side, indio_dev->num_channels includes all physical channels plus
the timestamp channel. On the other side, we have an array allocated only
for the physical channels. So, fix the memory corruption by using
ARRAY_SIZE() instead of the num_channels variable.
Note the first case is a cleanup rather than a fix, as the software
timestamp channel bit in active_scanmask is never set by the IIO core.
Fixes: 9374e8f5a38d ("iio: adc: add ADC driver for the TI TSC2046 controller") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20220107081401.2816357-1-o.rempel@pengutronix.de Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The trigger handler defined in the driver assumes that burst mode is
being used. Hence, for devices that do not support it, we have to use
the adis library default trigger implementation.
Tested-by: Julia Pineda <julia.pineda@analog.com> Fixes: 941f130881fa9 ("iio: adis16480: support burst read function") Signed-off-by: Nuno Sá <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20220114132608.241-1-nuno.sa@analog.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a trigger is set on an event to disable or enable tracing within an
instance, then tracing should be disabled or enabled in the instance and
not at the top level, which is confusing to users.
Link: https://lkml.kernel.org/r/20220223223837.14f94ec3@rorschach.local.home Cc: stable@vger.kernel.org Fixes: ae63b31e4d0e2 ("tracing: Separate out trace events from global variables") Tested-by: Daniel Bristot de Oliveira <bristot@kernel.org> Reviewed-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The stacktrace event trigger is not dumping the stacktrace to the instance
where it was enabled, but to the global "instance."
Use the private_data, pointing to the trigger file, to figure out the
corresponding trace instance, and use it in the trigger action, like
snapshot_trigger does.
Link: https://lkml.kernel.org/r/afbb0b4f18ba92c276865bc97204d438473f4ebc.1645396236.git.bristot@kernel.org Cc: stable@vger.kernel.org Fixes: ae63b31e4d0e2 ("tracing: Separate out trace events from global variables") Reviewed-by: Tom Zanussi <zanussi@kernel.org> Tested-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When commit e6ac2450d6de ("bpf: Support bpf program calling kernel function") added
kfunc support, it defined reg2btf_ids as a cheap way to translate the verifier
reg type to the appropriate btf_vmlinux BTF ID, however
commit c25b2ae13603 ("bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL")
moved the __BPF_REG_TYPE_MAX from the last member of bpf_reg_type enum to after
the base register types, and defined other variants using type flag
composition. However, now, the direct usage of reg->type to index into
reg2btf_ids may no longer fall into __BPF_REG_TYPE_MAX range, and hence lead to
out of bounds access and kernel crash on dereference of bad pointer.
Allow passing PTR_TO_CTX, if the kfunc expects a matching struct type,
and punt to PTR_TO_MEM block if reg->type does not fall in one of
PTR_TO_BTF_ID or PTR_TO_SOCK* types. This will be used by future commits
to get access to XDP and TC PTR_TO_CTX, and pass various data (flags,
l4proto, netns_id, etc.) encoded in opts struct passed as pointer to
kfunc.
For PTR_TO_MEM support, arguments are currently limited to pointer to
scalar, or pointer to struct composed of scalars. This is done so that
unsafe scenarios (like passing PTR_TO_MEM where PTR_TO_BTF_ID of
in-kernel valid structure is expected, which may have pointers) are
avoided. Since the argument checking happens based on the argument register
type, it is not easy to ascertain what the expected type is. In the
future, support for PTR_TO_MEM for kfunc can be extended to serve other
usecases. The struct type whose pointer is passed in may have maximum
nesting depth of 4, all recursively composed of scalars or struct with
scalars.
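A hedged illustration of an argument type that satisfies the stated
restriction (all names are made up): only scalars or structs of scalars,
nested at most 4 levels deep.

  struct opts_inner  { __u32 l4proto; __u32 flags; };           /* scalars only */
  struct opts_middle { struct opts_inner in; __u64 netns_id; }; /* nesting depth 2 */
  struct opts        { struct opts_middle mid; __s32 error; };  /* nesting depth 3 (<= 4) */

  /* A kfunc could then accept "struct opts *o" as a PTR_TO_MEM argument. */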
Future commits will add negative tests that check whether these
restrictions imposed for kfunc arguments are duly rejected by BPF
verifier or not.
Remove the flush_workqueue(system_long_wq) call since flushing
system_long_wq is deadlock-prone and since that call is redundant with a
preceding cancel_work_sync().
Link: https://lore.kernel.org/r/20220215210511.28303-3-bvanassche@acm.org Fixes: ef6c49d87c34 ("IB/srp: Eliminate state SRP_TARGET_DEAD") Reported-by: syzbot+831661966588c802aae9@syzkaller.appspotmail.com Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When configfs_register_subsystem() or configfs_unregister_subsystem()
is executing link_group() or unlink_group(),
it is possible that two processes add to or delete from the list
concurrently. Some unfortunate interleavings of them can cause a kernel
panic.
One of the cases is:
A --> B --> C --> D
A <-- B <-- C <-- D
Fix this by adding a mutex when calling link_group() or unlink_group().
But the parent configfs_subsystem is NULL when the config_item is the
root, so I create a global mutex, configfs_subsystem_mutex.
When polling for the firmware message response, we first poll for the
response message header. Once the valid length is detected in the
header, we poll for the valid bit at the end of the message which
signals DMA completion. Normally, this poll time for DMA completion
is extremely short (0 to a few usec). But on some devices under some
rare conditions, it can be up to about 20 msec.
Increase this delay to 50 msec, using udelay() for the first 10 usec to cover
the common case and usleep_range() beyond that.
Also, change the error message to include the above delay time when
printing the timeout value.
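The polling behaviour described above looks roughly like the following sketch
(hypothetical names and helper, not the bnxt_en code itself):

	#include <linux/delay.h>
	#include <linux/errno.h>
	#include <linux/types.h>

	#define SHORT_POLL_US	10		/* busy-wait budget, common case */
	#define TOTAL_POLL_US	(50 * 1000)	/* 50 msec worst case */

	/* Returns 0 once done(ctx) reports completion, -ETIMEDOUT otherwise. */
	static int poll_for_completion(bool (*done)(void *ctx), void *ctx)
	{
		unsigned int waited_us = 0;

		while (waited_us < TOTAL_POLL_US) {
			if (done(ctx))
				return 0;
			if (waited_us < SHORT_POLL_US) {
				udelay(1);		/* DMA usually completes in usecs */
				waited_us += 1;
			} else {
				usleep_range(100, 200);	/* rare slow path: sleep instead */
				waited_us += 100;
			}
		}
		return -ETIMEDOUT;
	}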
Fixes: 3c8c20db769c ("bnxt_en: move HWRM API implementation into separate file") Reviewed-by: Vladimir Olovyannikov <vladimir.olovyannikov@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The error path of rtrs_clt_open() calls free_clt(), where free_permits() is
called. This is wrong, since the error path of rtrs_clt_open() does not need
to call free_permits().
Also, moving the free_permits() call to rtrs_clt_close() makes it better
aligned with the call to alloc_permit() in rtrs_clt_open().
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20220217030929.323849-2-haris.iqbal@ionos.com Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The release callback rtrs_clt_dev_release(), invoked via put_device(), calls
kfree(clt) to free the memory. We must not call kfree(clt) again, and we must
not use clt after it has been freed either.
Replace device_register() with device_initialize() and device_add() so that
dev_set_name() can be used appropriately.
Move mutex_destroy() to the release function so that it can also be called
from the alloc_clt() error path.
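The driver-model pattern involved looks roughly like this (simplified sketch,
not the rtrs code; struct and function names are illustrative):

	#include <linux/device.h>
	#include <linux/slab.h>

	struct clt { struct device dev; };

	static void clt_dev_release(struct device *dev)
	{
		/* The release() callback is the only place that frees the object. */
		kfree(container_of(dev, struct clt, dev));
	}

	static struct clt *clt_alloc(struct device *parent, const char *name)
	{
		struct clt *c = kzalloc(sizeof(*c), GFP_KERNEL);

		if (!c)
			return NULL;

		device_initialize(&c->dev);	/* refcount is live from here on */
		c->dev.parent = parent;
		c->dev.release = clt_dev_release;

		if (dev_set_name(&c->dev, "%s", name) || device_add(&c->dev)) {
			put_device(&c->dev);	/* release() frees c; no kfree() here */
			return NULL;
		}
		return c;
	}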
Fixes: eab098246625 ("RDMA/rtrs-clt: Refactor the failure cases in alloc_clt") Link: https://lore.kernel.org/r/20220217030929.323849-1-haris.iqbal@ionos.com Reported-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
UDP sendmsg() can run locklessly, and this is causing all kinds of data races.
This patch converts sk->sk_tskey to an atomic_t to remove one of these races.
BUG: KCSAN: data-race in __ip_append_data / __ip_append_data
read to 0xffff8881035d4b6c of 4 bytes by task 8877 on cpu 1:
__ip_append_data+0x1c1/0x1de0 net/ipv4/ip_output.c:994
ip_make_skb+0x13f/0x2d0 net/ipv4/ip_output.c:1636
udp_sendmsg+0x12bd/0x14c0 net/ipv4/udp.c:1249
inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819
sock_sendmsg_nosec net/socket.c:705 [inline]
sock_sendmsg net/socket.c:725 [inline]
____sys_sendmsg+0x39a/0x510 net/socket.c:2413
___sys_sendmsg net/socket.c:2467 [inline]
__sys_sendmmsg+0x267/0x4c0 net/socket.c:2553
__do_sys_sendmmsg net/socket.c:2582 [inline]
__se_sys_sendmmsg net/socket.c:2579 [inline]
__x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
write to 0xffff8881035d4b6c of 4 bytes by task 8880 on cpu 0:
__ip_append_data+0x1d8/0x1de0 net/ipv4/ip_output.c:994
ip_make_skb+0x13f/0x2d0 net/ipv4/ip_output.c:1636
udp_sendmsg+0x12bd/0x14c0 net/ipv4/udp.c:1249
inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819
sock_sendmsg_nosec net/socket.c:705 [inline]
sock_sendmsg net/socket.c:725 [inline]
____sys_sendmsg+0x39a/0x510 net/socket.c:2413
___sys_sendmsg net/socket.c:2467 [inline]
__sys_sendmmsg+0x267/0x4c0 net/socket.c:2553
__do_sys_sendmmsg net/socket.c:2582 [inline]
__se_sys_sendmmsg net/socket.c:2579 [inline]
__x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0x0000054d -> 0x0000054e
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 8880 Comm: syz-executor.5 Not tainted 5.17.0-rc2-syzkaller-00167-gdcb85f85fa6f-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
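A race like the one reported above on a plain counter field is typically
removed as in the following minimal sketch (illustrative only; the type and
helper names are made up):

	#include <linux/atomic.h>
	#include <linux/types.h>

	struct example_sock {
		atomic_t tskey;		/* was a plain u32, racy under lockless sendmsg */
	};

	static u32 example_next_tskey(struct example_sock *esk)
	{
		/* atomic_inc_return() hands each sender a unique key without a lock. */
		return atomic_inc_return(&esk->tskey) - 1;
	}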
Fixes: 09c2d251b707 ("net-timestamp: add key to disambiguate concurrent datagrams") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
With the existing logic, where clear_ack is true (the HW does not support
auto-clear for the ICR), the reset of the interrupt clear register is not
handled properly. Because of this, only the first interrupts get processed
properly, and further interrupts are blocked because the interrupt clear
register is never reset.
Example of the problem case, where ack_invert is false and clear_ack is true:
Say the default is ISR=0x00 & ICR=0x00, and the ISR is triggered with 2
interrupts, making ISR = 0x11.
Step 1: Say ISR is set to 0x11 (store status_buf = ISR). ISR needs to
be cleared with the help of ICR once the interrupt is processed.
Step 2: Write ICR = 0x11 (status_buf); this clears the ISR to 0x00.
Step 3: Issue - In the existing code, ICR is written with ICR =
~(status_buf), i.e. ICR = 0xEE, which blocks all interrupts from being raised
except for interrupts 0 and 4. The expectation here is to reset ICR, which
will unblock all the interrupts. The existing code is:
	if (chip->clear_ack) {
		if (chip->ack_invert && !ret)
			........
		else if (!ret)
			ret = regmap_write(map, reg,
					   ~data->status_buf[i]);
So writing 0 (or 0xff when ack_invert is true) should have no effect other
than clearing the ACKs that were just set.
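In other words, after the ack write the clear register needs a second write
that resets it, roughly along these lines (a sketch of the intended behaviour,
not the exact patch):

	/* After acking, reset the ICR so that later interrupts are not masked.
	 * With ack_invert the "no effect" value is all ones; otherwise it is 0.
	 */
	if (chip->clear_ack && !ret) {
		if (chip->ack_invert)
			ret = regmap_write(map, reg, UINT_MAX);
		else
			ret = regmap_write(map, reg, 0);
	}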
Fixes: 3a6f0fb7b8eb ("regmap: irq: Add support to clear ack registers") Signed-off-by: Prasad Kumpatla <quic_pkumpatl@quicinc.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20220217085007.30218-1-quic_pkumpatl@quicinc.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
With v2 hardware, an IRQ can be configured to trigger on both edges via
a bit in the int_bothedge register. Currently, the driver sets this bit
when changing the trigger type to IRQ_TYPE_EDGE_BOTH, but fails to reset
this bit if the trigger type is later changed to something else. This
causes spurious IRQs, and when using gpio-keys with wakeup-event-action
set to EV_ACT_(DE)ASSERTED, those IRQs translate into spurious wakeups.
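The shape of the fix is roughly the following (hypothetical register name and
accessors; the real driver uses its own helpers):

	/* Write the both-edge bit for *every* trigger type, so that switching
	 * away from IRQ_TYPE_EDGE_BOTH clears the stale bit instead of leaving
	 * it set.
	 */
	u32 both = example_readl(bank, INT_BOTHEDGE);

	if (type == IRQ_TYPE_EDGE_BOTH)
		both |= BIT(d->hwirq);
	else
		both &= ~BIT(d->hwirq);
	example_writel(bank, INT_BOTHEDGE, both);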
Fixes: 3bcbd1a85b68 ("gpio/rockchip: support next version gpio controller") Reported-by: Guillaume Savaton <guillaume@baierouge.fr> Tested-by: Guillaume Savaton <guillaume@baierouge.fr> Signed-off-by: Samuel Holland <samuel@sholland.org> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Sasha Levin <sashal@kernel.org>
Jan reported that on Turris Omnia (Armada 385), no PCIe devices were
detected after upgrading from v5.16.1 to v5.16.3 and identified the cause
as the backport of 91a8d79fc797 ("PCI: mvebu: Fix configuring secondary bus
of PCIe Root Port via emulated bridge"), which appeared in v5.17-rc1.
Commit 91a8d79fc797 was applied incorrectly from the mailing list patch [1] to
the Linux git repository [2], probably due to incorrectly resolved merge
conflicts. Fix it now.
In zynq_qspi_exec_mem_op(), the buffer returned by kzalloc() is used directly
in memset(), which could lead to a NULL pointer dereference if kzalloc()
fails.
Fix this bug by adding a NULL check of tmpbuf.
This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.
Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.
Builds with CONFIG_SPI_ZYNQ_QSPI=m show no new warnings,
and our static analyzer no longer warns about this code.
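The fix amounts to checking the allocation before using it, along these lines
(a simplified sketch, not the exact driver code):

	tmpbuf = kzalloc(len, GFP_KERNEL);
	if (!tmpbuf)
		return -ENOMEM;	/* bail out instead of handing NULL to memset() */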
Fixes: 67dca5e580f1 ("spi: spi-mem: Add support for Zynq QSPI controller") Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Link: https://lore.kernel.org/r/20211130172253.203700-1-zhou1615@umn.edu Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add the mistakenly missing increment of the count variable when looping over
the output buffer in mlx5e_self_test().
This resolves the garbage values that were reported when querying the self
test results via ethtool.
before:
$ ethtool -t eth2
The test result is PASS
The test extra info:
Link Test 0
Speed Test 1768697188
Health Test 758528120
Loopback Test 3288687
after:
$ ethtool -t eth2
The test result is PASS
The test extra info:
Link Test 0
Speed Test 0
Health Test 0
Loopback Test 0
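The bug class is a loop that fills the output buffer without advancing its
index for every result it reports; a minimal sketch of the corrected shape
(hypothetical helpers, not the actual mlx5e code):

	#include <linux/types.h>

	/* Hypothetical helpers, for illustration only. */
	extern bool test_is_supported(int i);
	extern u64 run_test(int i);

	static int run_self_tests(u64 *buf, int num_tests)
	{
		int i, count = 0;

		for (i = 0; i < num_tests; i++) {
			if (!test_is_supported(i))
				continue;
			buf[count] = run_test(i);	/* 0 on pass */
			count++;			/* the increment that was missing */
		}
		return count;	/* number of results actually written to buf[] */
	}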
Fixes: 7990b1b5e8bd ("net/mlx5e: loopback test is not supported in switchdev mode") Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>