Model 88E6191X only supports >1G speeds on port 10. Port 0 and 9 are
only 1G.
Fixes: de776d0d316f ("net: dsa: mv88e6xxx: add support for mv88e6393x family") Signed-off-by: Marek Behún <kabel@kernel.org> Cc: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20211104171747.10509-1-kabel@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
DEFINE_PROC_SHOW_ATTRIBUTE() is supposed to be used to define a series
of functions and variables to register proc file easily. And the users
can use proc_create_data() to pass their own private data and get it
via seq->private in the callback. Unfortunately, the proc file system
use PDE_DATA() to get private data instead of inode->i_private. So fix
it. Fortunately, there only one user of it which does not pass any
private data, so this bug does not break any in-tree codes.
Link: https://lkml.kernel.org/r/20211029032638.84884-1-songmuchun@bytedance.com Fixes: 97a32539b956 ("proc: convert everything to "struct proc_ops"") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Florent Revest <revest@chromium.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
For NV12 FBs with odd main surface tile-row height the CCS surface
height was incorrectly calculated 1 less than the actual value. Fix this
by rounding up the result of divison. For consistency do the same for
the CCS surface width calculation.
The priv->ntfy_blocks[] has "priv->num_ntfy_blks" elements so this >
needs to be >= to prevent an off by one bug. The priv->ntfy_blocks[]
array is allocated in gve_alloc_notify_blocks().
Fixes: 87a7f321bb6a ("gve: Recover from queue stall due to missed IRQ") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Using the % operator on a 64-bit variable is expensive and can
cause a link failure:
arm-linux-gnueabi-ld: drivers/dma/stm32-dma.o: in function `stm32_dma_get_max_width':
stm32-dma.c:(.text+0x170): undefined reference to `__aeabi_uldivmod'
arm-linux-gnueabi-ld: drivers/dma/stm32-dma.o: in function `stm32_dma_set_xfer_param':
stm32-dma.c:(.text+0x1cd4): undefined reference to `__aeabi_uldivmod'
As we know that we just want to check the alignment in
stm32_dma_get_max_width(), there is no need for a full division, and
using a simple mask is a faster replacement.
Same in stm32_dma_set_xfer_param(), change this to only allow burst
transfers if the address is a multiple of the length.
stm32_dma_get_best_burst just after will take buf_len into account to fix
burst in case of misalignment.
Theorically, address pointers used by STM32 DMA must be chosen so as to
ensure that all transfers within a burst block are aligned on the address
boundary equal to the size of the transfer.
If this is always the case for peripheral addresses on STM32, it is not for
memory addresses if the user doesn't respect this alignment constraint.
To avoid a weird behavior of the DMA controller in this case (no error
triggered but data are not transferred as expected), force no burst.
But similar to the case in 84f44df664e9 ("bpf: sock_ops sk access may stomp
registers when dst_reg = src_reg"), the code will read an incorrect skb->len
when src == dst. In this case we end up generating this xlated code:
... where line 560 is the reading 4B of (skb->data + 112) instead of the
intended skb->len Here the skb pointer in r1 gets set to skb->data and the
later deref for skb->len ends up following skb->data instead of skb.
This fixes the issue similarly to the patch mentioned above by creating an
additional temporary variable and using to store the register when dst_reg =
src_reg. We name the variable bpf_temp_reg and place it in the cb context for
sk_skb. Then we restore from the temp to ensure nothing is lost.
Fixes: 16137b09a66f2 ("bpf: Compute data_end dynamically with JIT code") Signed-off-by: Jussi Maki <joamaki@gmail.com> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20211103204736.248403-6-john.fastabend@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Strparser is reusing the qdisc_skb_cb struct to stash the skb message handling
progress, e.g. offset and length of the skb. First this is poorly named and
inherits a struct from qdisc that doesn't reflect the actual usage of cb[] at
this layer.
But, more importantly strparser is using the following to access its metadata.
So we use 12 bytes of ->data[] in struct. However in BPF code running parser
and verdict the user has read capabilities into the data[] array as well. Its
not too problematic, but we should not be exposing internal state to BPF
program. If its really needed then we can use the probe_read() APIs which allow
reading kernel memory. And I don't believe cb[] layer poses any API breakage by
moving this around because programs can't depend on cb[] across layers.
In order to fix another issue with a ctx rewrite we need to stash a temp
variable somewhere. To make this work cleanly this patch builds a cb struct
for sk_skb types called sk_skb_cb struct. Then we can use this consistently
in the strparser, sockmap space. Additionally we can start allowing ->cb[]
write access after this.
Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jussi Maki <joamaki@gmail.com> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20211103204736.248403-5-john.fastabend@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
A socket in a sockmap may have different combinations of programs attached
depending on configuration. There can be no programs in which case the socket
acts as a sink only. There can be a TX program in this case a BPF program is
attached to sending side, but no RX program is attached. There can be an RX
program only where sends have no BPF program attached, but receives are hooked
with BPF. And finally, both TX and RX programs may be attached. Giving us the
permutations:
None, Tx, Rx, and TxRx
To date most of our use cases have been TX case being used as a fast datapath
to directly copy between local application and a userspace proxy. Or Rx cases
and TxRX applications that are operating an in kernel based proxy. The traffic
in the first case where we hook applications into a userspace application looks
like this:
In this case all traffic from AppA (after 3whs) is copied into the AppB
ingress queue and no traffic is ever on the TCP recieive_queue.
In the second case the application never receives, except in some rare error
cases, traffic on the actual user space socket. Instead the send happens in
the kernel.
AppProxy socket pool
sk0 ------------->{sk1,sk2, skn}
^ |
| |
| v
ingress lb egress
TCP TCP
Here because traffic is never read off the socket with userspace recv() APIs
there is only ever one reader on the sk receive_queue. Namely the BPF programs.
However, we've started to introduce a third configuration where the BPF program
on receive should process the data, but then the normal case is to push the
data into the receive queue of AppB.
This is different from the App{A,B} redirect because traffic is first received
on the sk->receive_queue.
Now for the issue. The tcp_bpf_recvmsg() handler first checks the ingress_msg
queue for any data handled by the BPF rx program and returned with PASS code
so that it was enqueued on the ingress msg queue. Then if no data exists on
that queue it checks the socket receive queue. Unfortunately, this is the same
receive_queue the BPF program is reading data off of. So we get a race. Its
possible for the recvmsg() hook to pull data off the receive_queue before the
BPF hook has a chance to read it. It typically happens when an application is
banging on recv() and getting EAGAINs. Until they manage to race with the RX
BPF program.
To fix this we note that before this patch at attach time when the socket is
loaded into the map we check if it needs a TX program or just the base set of
proto bpf hooks. Then it uses the above general RX hook regardless of if we
have a BPF program attached at rx or not. This patch now extends this check to
handle all cases enumerated above, TX, RX, TXRX, and none. And to fix above
race when an RX program is attached we use a new hook that is nearly identical
to the old one except now we do not let the recv() call skip the RX BPF program.
Now only the BPF program pulls data from sk->receive_queue and recv() only
pulls data from the ingress msgQ post BPF program handling.
With this resolved our AppB from above has been up and running for many hours
without detecting any errors. We do this by correlating counters in RX BPF
events and the AppB to ensure data is never skipping the BPF program. Selftests,
was not able to detect this because we only run them for a short period of time
on well ordered send/recvs so we don't get any of the noise we see in real
application environments.
Fixes: 51199405f9672 ("bpf: skb_verdict, support SK_PASS on RX BPF path") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jussi Maki <joamaki@gmail.com> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20211103204736.248403-4-john.fastabend@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
We do not need to handle unhash from BPF side we can simply wait for the
close to happen. The original concern was a socket could transition from
ESTABLISHED state to a new state while the BPF hook was still attached.
But, we convinced ourself this is no longer possible and we also improved
BPF sockmap to handle listen sockets so this is no longer a problem.
More importantly though there are cases where unhash is called when data is
in the receive queue. The BPF unhash logic will flush this data which is
wrong. To be correct it should keep the data in the receive queue and allow
a receiving application to continue reading the data. This may happen when
tcp_abort() is received for example. Instead of complicating the logic in
unhash simply moving all this to tcp_close() hook solves this.
Fixes: 51199405f9672 ("bpf: skb_verdict, support SK_PASS on RX BPF path") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jussi Maki <joamaki@gmail.com> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20211103204736.248403-3-john.fastabend@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
gcc warns about undefined behavior the vmalloc code when building
with CONFIG_ARM64_PA_BITS_52, when the 'idx++' in the argument to
__phys_to_pte_val() is evaluated twice:
mm/vmalloc.c: In function 'vmap_pfn_apply':
mm/vmalloc.c:2800:58: error: operation on 'data->idx' may be undefined [-Werror=sequence-point]
2800 | *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
| ~~~~~~~~~^~
arch/arm64/include/asm/pgtable-types.h:25:37: note: in definition of macro '__pte'
25 | #define __pte(x) ((pte_t) { (x) } )
| ^
arch/arm64/include/asm/pgtable.h:80:15: note: in expansion of macro '__phys_to_pte_val'
80 | __pte(__phys_to_pte_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
| ^~~~~~~~~~~~~~~~~
mm/vmalloc.c:2800:30: note: in expansion of macro 'pfn_pte'
2800 | *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
| ^~~~~~~
I have no idea why this never showed up earlier, but the safest
workaround appears to be changing those macros into inline functions
so the arguments get evaluated only once.
Cc: Matthew Wilcox <willy@infradead.org> Fixes: 75387b92635e ("arm64: handle 52-bit physical addresses in page table entries") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20211105075414.2553155-1-arnd@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The id argument of ARM64_FTR_REG_OVERRIDE() is used for two purposes:
one as the system register encoding (used for the sys_id field of
__ftr_reg_entry), and the other as the register name (stringified
and used for the name field of arm64_ftr_reg), which is debug
information. The id argument is supposed to be a macro that
indicates an encoding of the register (eg. SYS_ID_AA64PFR0_EL1, etc).
ARM64_FTR_REG(), which also has the same id argument,
uses ARM64_FTR_REG_OVERRIDE() and passes the id to the macro.
Since the id argument is completely macro-expanded before it is
substituted into a macro body of ARM64_FTR_REG_OVERRIDE(),
the stringified id in the body of ARM64_FTR_REG_OVERRIDE is not
a human-readable register name, but a string of numeric bitwise
operations.
Fix this so that human-readable register names are available as
debug information.
Fixes: 8f266a5d878a ("arm64: cpufeature: Add global feature override facility") Signed-off-by: Reiji Watanabe <reijiw@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211101045421.2215822-1-reijiw@google.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
skb is already freed by dev_kfree_skb in pn533_fill_fragment_skbs,
but follow error handler branch when pn533_fill_fragment_skbs()
fails, skb is freed again, results in double free issue. Fix this
by not free skb in error path of pn533_fill_fragment_skbs.
Fixes: 963a82e07d4e ("NFC: pn533: Split large Tx frames in chunks") Fixes: 93ad42020c2d ("NFC: pn533: Target mode Tx fragmentation support") Signed-off-by: Chengfeng Ye <cyeaa@connect.ust.hk> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Both ifindex and LLC_SK_DEV_HASH_ENTRIES are signed.
This means that (ifindex % LLC_SK_DEV_HASH_ENTRIES) is negative
if @ifindex is negative.
We could simply make LLC_SK_DEV_HASH_ENTRIES unsigned.
In this patch I chose to use hash_32() to get more entropy
from @ifindex, like llc_sk_laddr_hashfn().
UBSAN: array-index-out-of-bounds in ./include/net/llc.h:75:26
index -43 is out of range for type 'hlist_head [64]'
CPU: 1 PID: 20999 Comm: syz-executor.3 Not tainted 5.15.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
ubsan_epilogue+0xb/0x5a lib/ubsan.c:151
__ubsan_handle_out_of_bounds.cold+0x62/0x6c lib/ubsan.c:291
llc_sk_dev_hash include/net/llc.h:75 [inline]
llc_sap_add_socket+0x49c/0x520 net/llc/llc_conn.c:697
llc_ui_bind+0x680/0xd70 net/llc/af_llc.c:404
__sys_bind+0x1e9/0x250 net/socket.c:1693
__do_sys_bind net/socket.c:1704 [inline]
__se_sys_bind net/socket.c:1702 [inline]
__x64_sys_bind+0x6f/0xb0 net/socket.c:1702
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa503407ae9
Fixes: 6d2e3ea28446 ("llc: use a device based hash table to speed up multicast delivery") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
If btf__new() is called then there needs to be a corresponding btf__free().
Fixes: f8dfeae009effc0b ("perf bpf: Show more BPF program info in print_bpf_prog_info()") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/20211106053733.3580931-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
snprintf() returns the number of bytes it would have printed if there
were space. But it does not count the NUL terminator. So that means
that if "count == copied" then this has already overflowed by one
character.
There is one possible race window between zs_pool_dec_isolated() and
zs_unregister_migration() because wait_for_isolated_drain() checks the
isolated count without holding class->lock and there is no order inside
zs_pool_dec_isolated(). Thus the below race window could be possible:
In es58x_rx_err_msg(), if can->do_set_mode() fails, the function
directly returns without calling netif_rx(skb). This means that the
skb previously allocated by alloc_can_err_skb() is not freed. In other
terms, this is a memory leak.
This patch simply removes the return statement in the error branch and
let the function continue.
Issue was found with GCC -fanalyzer, please follow the link below for
details.
Fixes: 8537257874e9 ("can: etas_es58x: add core support for ETAS ES58X CAN USB interfaces") Link: https://lore.kernel.org/all/20211026180740.1953265-1-mailhol.vincent@wanadoo.fr Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
My previous patch correctly addressed the possible link failure, but as
Jani points out, the dependency is now stricter than it needs to be.
Change it again, to allow DRM_FBDEV_EMULATION to be used when
DRM_KMS_HELPER and FB are both loadable modules and DRM is linked into
the kernel.
As a side-effect, the option is now only visible when at least one DRM
driver makes use of DRM_KMS_HELPER. This is better, because the option
has no effect otherwise.
As I want to test both DEVMAP and DEVMAP_HASH in XDP multicast redirect, I
limited DEVMAP max entries to a small value for performace. When the test
runs after amount of interface creating/deleting tests. The interface index
will exceed the map max entries and xdp_redirect_multi will error out with
"Get interfacesInterface index to large".
Fix this issue by limit the tests in netns and specify the ifindex when
creating interfaces.
The arp request number triggered by ping none exist address is not accurate,
which may lead the test false negative/positive. Change to use arping to
accurate the arp number. Also do not use grep pattern match for dot.
Fix a device wakeup power reference counting error introduced by
commit a2d7b2e004af ("ACPI: PM: Fix sharing of wakeup power
resources") because of a coding mistake.
Fixes: a2d7b2e004af ("ACPI: PM: Fix sharing of wakeup power resources") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The resource_size defines that:
res->end - res->start + 1;
The origin original code is:
sysmgr_config.max_register = res->end - res->start - 3;
So, the correct fix is that:
sysmgr_config.max_register = resource_size(res) - 4;
Fixes: d12edf9661a4 ("mfd: altera-sysmgr: Use resource_size function on resource object") Signed-off-by: Kai Song <songkai01@inspur.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20211006141926.6120-1-songkai01@inspur.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding a SPI device ID table.
Fixes: 96c8395e2166 ("spi: Revert modalias changes") Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Baolin Wang <baolin.wang7@gmail.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20210924143347.14721-4-broonie@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding a SPI device ID table.
Fixes: 96c8395e2166 ("spi: Revert modalias changes") Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20210924143347.14721-3-broonie@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
Early exits from for_each_child_of_node() should decrement the
node reference counter. Reported by Coccinelle:
drivers/mfd/mfd-core.c:197:2-24: WARNING:
Function "for_each_child_of_node" should have of_node_put() before goto around lines 209.
Fixes: c94bb233a9fe ("mfd: Make MFD core code Device Tree and IRQ domain aware") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20210528115126.18370-1-krzysztof.kozlowski@canonical.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Although we've covered all calls with NULL dma buffer pointer, so far,
there may be still some else in the wild. For catching such a case
more easily, add a WARN_ON_ONCE() in snd_dma_get_ops().
When I do fuzz test for bonding device interface, I got the following
use-after-free Calltrace:
==================================================================
BUG: KASAN: use-after-free in bond_enslave+0x1521/0x24f0
Read of size 8 at addr ffff88825bc11c00 by task ifenslave/7365
Last potentially related work creation:
kasan_save_stack+0x23/0x50
kasan_record_aux_stack+0xb7/0xd0
insert_work+0x43/0x190
__queue_work+0x2e3/0x970
delayed_work_timer_fn+0x3e/0x50
call_timer_fn+0x148/0x470
run_timer_softirq+0x8a8/0xc50
__do_softirq+0x107/0x55f
Second to last potentially related work creation:
kasan_save_stack+0x23/0x50
kasan_record_aux_stack+0xb7/0xd0
insert_work+0x43/0x190
__queue_work+0x2e3/0x970
__queue_delayed_work+0x130/0x180
queue_delayed_work_on+0xa7/0xb0
bond_enslave+0xe25/0x24f0
bond_do_ioctl+0x3e0/0x450
dev_ifsioc+0x2ba/0x970
dev_ioctl+0x112/0x710
sock_do_ioctl+0x118/0x1b0
sock_ioctl+0x2e0/0x490
__x64_sys_ioctl+0x118/0x150
do_syscall_64+0x35/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Memory state around the buggy address: ffff88825bc11b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88825bc11b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88825bc11c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff88825bc11c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88825bc11d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Put new_slave in bond_sysfs_slave_add() will cause use-after-free problems
when new_slave is accessed in the subsequent error handling process. Since
new_slave will be put in the subsequent error handling process, remove the
unnecessary put to fix it.
In addition, when sysfs_create_file() fails, if some files have been crea-
ted successfully, we need to call sysfs_remove_file() to remove them.
Since there are sysfs_create_files() & sysfs_remove_files() can be used,
use these two functions instead.
Fixes: 7afcaec49696 (bonding: use kobject_put instead of _del after kobject_add) Signed-off-by: Huang Guobin <huangguobin4@huawei.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The huge page functionality in TTM does not work safely because PUD and
PMD entries do not have a special bit.
get_user_pages_fast() considers any page that passed pmd_huge() as
usable:
if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
pmd_devmap(pmd))) {
And vmf_insert_pfn_pmd_prot() unconditionally sets
entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
eg on x86 the page will be _PAGE_PRESENT | PAGE_PSE.
As such gup_huge_pmd() will try to deref a struct page:
head = try_grab_compound_head(pmd_page(orig), refs, flags);
and thus crash.
Thomas further notices that the drivers are not expecting the struct page
to be used by anything - in particular the refcount incr above will cause
them to malfunction.
Thus everything about this is not able to fully work correctly considering
GUP_fast. Delete it entirely. It can return someday along with a proper
PMD/PUD_SPECIAL bit in the page table itself to gate GUP_fast.
Fixes: 314b6580adc5 ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Thomas Hellström <thomas.helllstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com>
[danvet: Update subject per Thomas' &Christian's review] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/0-v2-a44694790652+4ac-ttm_pmd_jgg@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
As reported by Zhang there's a small issue if in forced mode the duplex
mode changes with the link staying up [0]. In this case the MAC isn't
notified about the change.
The proposed patch relies on the phylib state machine and ignores the
fact that there are drivers that uses phylib but not the phylib state
machine. So let's don't change the behavior for such drivers and fix
it w/o re-adding state PHY_FORCING for the case that phylib state
machine is used.
Commit a365ab6b9dfb ("cpufreq: intel_pstate: Implement the
->adjust_perf() callback") caused intel_pstate to use nonzero HWP
desired values in certain usage scenarios, but it did not prevent
them from being leaked into the confugirations in which HWP desired
is expected to be 0.
The failing scenarios are switching the driver from the passive
mode to the active mode and starting a new kernel via kexec() while
intel_pstate is running in the passive mode.
To address this issue, ensure that HWP desired will be cleared on
offline and suspend/shutdown.
Fixes: a365ab6b9dfb ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback") Reported-by: Julia Lawall <julia.lawall@inria.fr> Tested-by: Julia Lawall <julia.lawall@inria.fr> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Host crashes when pci_enable_atomic_ops_to_root() is called for VFs with
virtual buses. The virtual buses added to SR-IOV have bus->self set to NULL
and host crashes due to this.
Commit bf9c0538e485b591 ("ataflop: use a separate gendisk for each media
format") introduced ataflop_probe_lock mutex, but forgot to unlock the
mutex when atari_floppy_init() (i.e. module loading) succeeded. This will
result in double lock deadlock if ataflop_probe() is called. Also,
unregister_blkdev() must not be called from atari_floppy_init() with
ataflop_probe_lock held when atari_floppy_init() failed, for
ataflop_probe() waits for ataflop_probe_lock with major_names_lock held
(i.e. AB-BA deadlock).
__register_blkdev() needs to be called last in order to avoid calling
ataflop_probe() when atari_floppy_init() is about to fail, for memory for
completing already-started ataflop_probe() safely will be released as soon
as atari_floppy_init() released ataflop_probe_lock mutex.
As with commit 8b52d8be86d72308 ("loop: reorder loop_exit"),
unregister_blkdev() needs to be called first in order to avoid calling
ataflop_alloc_disk() from ataflop_probe() after del_gendisk() from
atari_floppy_exit().
By relocating __register_blkdev() / unregister_blkdev() as explained above,
we can remove ataflop_probe_lock mutex, for probe function and __exit
function are serialized by major_names_lock mutex.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: bf9c0538e485b591 ("ataflop: use a separate gendisk for each media format") Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Link: https://lore.kernel.org/r/20211103230437.1639990-11-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
Instead of using two separate code paths for cleaning up an atari disk,
use one. We take the more careful approach to check for *all* disk
types, as is done on exit. The init path didn't have that check as
the alternative disk types are only probed for later, they are not
initialized by default.
The ataflop assumes del_gendisk() is safe to call, this is only
true because add_disk() does not return a failure, but that will
change soon. And so, before we get to adding error handling for
that case, let's make sure we keep track of which disks actually
get registered. Then we use this to only call del_gendisk for them.
The reference counting issue happens in one exception handling path of
nouveau_svmm_bind(). When cli->svm.svmm is null, the function forgets
to decrease the refcount of mm increased by get_task_mm(), causing a
refcount leak.
Fix this issue by using mmput() to decrease the refcount in the
exception handling path.
Also, the function forgets to do check against null when get mm
by get_task_mm().
Fix this issue by adding null check after get mm by get_task_mm().
Signed-off-by: Chenyuan Mi <cymi20@fudan.edu.cn> Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Fixes: 822cab6150d3 ("drm/nouveau/svm: check for SVM initialized before migrating") Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210907122633.16665-1-cymi20@fudan.edu.cn Link: https://gitlab.freedesktop.org/drm/nouveau/-/merge_requests/14 Signed-off-by: Sasha Levin <sashal@kernel.org>
The handling of PMIC register reads through writing 0 to address 4
of the OpRegion is wrong. Instead of returning the read value
through the value64, which is a no-op for function == ACPI_WRITE calls,
store the value and then on a subsequent function == ACPI_READ with
address == 3 (the address for the value field of the OpRegion)
return the stored value.
This has been tested on a Xiaomi Mi Pad 2 and makes the ACPI battery dev
there mostly functional (unfortunately there are still other issues).
Here are the SET() / GET() functions of the PMIC ACPI device,
which use this OpRegion, which clearly show the new behavior to
be correct:
Method (GET, 3, Serialized)
{
If ((AVBE == One))
{
CLNT = Arg0
SA = Arg1
OFF = Arg2
RWM = Zero
If ((AVBG == One))
{
GPRW = Zero
}
}
Return (VAL) /* \_SB_.PCI0.I2C7.PMI5.VAL_ */
}
Method (SET, 4, Serialized)
{
If ((AVBE == One))
{
CLNT = Arg0
SA = Arg1
OFF = Arg2
VAL = Arg3
RWM = One
If ((AVBG == One))
{
GPRW = One
}
}
}
Fixes: 0afa877a5650 ("ACPI / PMIC: intel: add REGS operation region support") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When a VF is removed and/or reset its Tx queues need to be
stopped from the PF. This is done by calling the ice_dis_vf_qs()
function, which calls ice_vsi_stop_lan_tx_rings(). Currently
ice_dis_vf_qs() is protected by the VF state bit ICE_VF_STATE_QS_ENA.
Unfortunately, this is causing the Tx queues to not be disabled in some
cases and when the VF tries to re-enable/reconfigure its Tx queues over
virtchnl the op is failing. This is because a VF can be reset and/or
removed before the ICE_VF_STATE_QS_ENA bit is set, but the Tx queues
were already configured via ice_vsi_cfg_single_txq() in the
VIRTCHNL_OP_CONFIG_VSI_QUEUES op. However, the ICE_VF_STATE_QS_ENA bit
is set on a successful VIRTCHNL_OP_ENABLE_QUEUES, which will always
happen after the VIRTCHNL_OP_CONFIG_VSI_QUEUES op.
This was causing the following error message when loading the ice
driver, creating VFs, and modifying VF trust in an endless loop:
[35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM
[35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5
[35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
Fix this by always calling ice_dis_vf_qs() and silencing the error
message in ice_vsi_stop_tx_ring() since the calling code ignores the
return anyway. Also, all other places that call ice_vsi_stop_tx_ring()
catch the error, so this doesn't affect those flows since there was no
change to the values the function returns.
Other solutions were considered (i.e. tracking which VF queues had been
"started/configured" in VIRTCHNL_OP_CONFIG_VSI_QUEUES, but it seemed
more complicated than it was worth. This solution also brings in the
chance for other unexpected conditions due to invalid state bit checks.
So, the proposed solution seemed like the best option since there is no
harm in failing to stop Tx queues that were never started.
This issue can be seen using the following commands:
VF was not able to change its hardware MAC address in case
the new address was already present in the MAC filter list.
Change the handling of VF add mac request to not return
if requested MAC address is already present on the list
and check if its hardware MAC needs to be updated in this case.
Fixes: ed4c068d46f6 ("ice: Enable ip link show on the PF to display VF unicast MAC(s)") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Normally it is expected that the dsa_device_ops :: rcv() method finishes
parsing the DSA tag and consumes it, then never looks at it again.
But commit c0bcf537667c ("net: dsa: ocelot: add hardware timestamping
support for Felix") added support for RX timestamping in a very
unconventional way. On this switch, a partial timestamp is available in
the DSA header, but the driver got away with not parsing that timestamp
right away, but instead delayed that parsing for a little longer:
dsa_switch_rcv():
nskb = cpu_dp->rcv(skb, dev); <------------- not here
-> ocelot_rcv()
...
if (dsa_skb_defer_rx_timestamp(p, skb)) <--- but here
-> felix_rxtstamp()
return 0;
When in felix_rxtstamp(), this driver accounted for the fact that
eth_type_trans() happened in the meanwhile, so it got a hold of the
extraction header again by subtracting (ETH_HLEN + OCELOT_TAG_LEN) bytes
from the current skb->data.
This worked for quite some time but was quite fragile from the very
beginning. Not to mention that having DSA tag parsing split in two
different files, under different folders (net/dsa/tag_ocelot.c vs
drivers/net/dsa/ocelot/felix.c) made it quite non-obvious for patches to
come that they might break this.
Finally, the blamed commit does the following: at the end of
ocelot_rcv(), it checks whether the skb payload contains a VLAN header.
If it does, and this port is under a VLAN-aware bridge, that VLAN ID
might not be correct in the sense that the packet might have suffered
VLAN rewriting due to TCAM rules (VCAP IS1). So we consume the VLAN ID
from the skb payload using __skb_vlan_pop(), and take the classified
VLAN ID from the DSA tag, and construct a hwaccel VLAN tag with the
classified VLAN, and the skb payload is VLAN-untagged.
aka it moves the Ethernet header 4 bytes to the right, and pulls 4 bytes
from the skb headroom (effectively also moving skb->data, by definition).
So for felix_rxtstamp()'s fragile logic, all bets are off now.
Instead of having the "extraction" pointer point to the DSA header,
it actually points to 4 bytes _inside_ the extraction header.
Corollary, the last 4 bytes of the "extraction" header are in fact 4
stale bytes of the destination MAC address from the Ethernet header,
from prior to the __skb_vlan_pop() movement.
So of course, RX timestamps are completely bogus when the system is
configured in this way.
The fix is actually very simple: just don't structure the code like that.
For better or worse, the DSA PTP timestamping API does not offer a
straightforward way for drivers to present their RX timestamps, but
other drivers (sja1105) have established a simple mechanism to carry
their RX timestamp from dsa_device_ops :: rcv() all the way to
dsa_switch_ops :: port_rxtstamp() and even later. That mechanism is to
simply save the partial timestamp to the skb->cb, and complete it later.
Question: why don't we simply populate the skb's struct
skb_shared_hwtstamps from ocelot_rcv(), and bother with this
complication of propagating the timestamp to felix_rxtstamp()?
Answer: dsa_switch_ops :: port_rxtstamp() answers the question whether
PTP packets need sleepable context to retrieve the full RX timestamp.
Currently felix_rxtstamp() answers "no, thanks" to that question, and
calls ocelot_ptp_gettime64() from softirq atomic context. This is
understandable, since Felix VSC9959 is a PCIe memory-mapped switch, so
hardware access does not require sleeping. But the felix driver is
preparing for the introduction of other switches where hardware access
is over a slow bus like SPI or MDIO:
https://lore.kernel.org/lkml/20210814025003.2449143-1-colin.foster@in-advantage.com/
So I would like to keep this code structure, so the rework needed when
that driver will need PTP support will be minimal (answer "yes, I need
deferred context for this skb's RX timestamp", then the partial
timestamp will still be found in the skb->cb.
Fixes: ea440cd2d9b2 ("net: dsa: tag_ocelot: use VLAN information from tagging header when available") Reported-by: Po Liu <po.liu@nxp.com> Cc: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
As explained here:
https://lore.kernel.org/netdev/20210908220834.d7gmtnwrorhharna@skbuf/
DSA tagging protocol drivers cannot depend on symbols exported by switch
drivers, because this creates a circular dependency that breaks module
autoloading.
The tag_ocelot.c file depends on the ocelot_ptp_rew_op() function
exported by the common ocelot switch lib. This function looks at
OCELOT_SKB_CB(skb) and computes how to populate the REW_OP field of the
DSA tag, for PTP timestamping (the command: one-step/two-step, and the
TX timestamp identifier).
None of that requires deep insight into the driver, it is quite
stateless, as it only depends upon the skb->cb. So let's make it a
static inline function and put it in include/linux/dsa/ocelot.h, a
file that despite its name is used by the ocelot switch driver for
populating the injection header too - since commit 40d3f295b5fe ("net:
mscc: ocelot: use common tag parsing code with DSA").
With that function declared as static inline, its body is expanded
inside each call site, so the dependency is broken and the DSA tagger
can be built without the switch library, upon which the felix driver
depends.
Fixes: 39e5308b3250 ("net: mscc: ocelot: support PTP Sync one-step timestamping") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The real_dev of a vlan net_device may be freed after
unregister_vlan_dev(). Access the real_dev continually by
vlan_dev_real_dev() will trigger the UAF problem for the
real_dev like following:
Move the put_device(real_dev) to vlan_dev_free(). Ensure
real_dev not be freed before vlan_dev unregistered.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+e4df4e1389e28972e955@syzkaller.appspotmail.com Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Throughout the OpenRISC kernel port VMA is passed as NULL when flushing
kernel tlb entries. Somehow this was missed when I was testing c28b27416da9 ("openrisc: Implement proper SMP tlb flushing") and now the
SMP kernel fails to completely boot.
In OpenRISC VMA is used only to determine which cores need to have their
TLB entries flushed.
This patch updates the logic to flush tlbs on all cores when the VMA is
passed as NULL. Also, we update places VMA is passed as NULL to use
flush_tlb_kernel_range instead. Now, the only place VMA is passed as
NULL is in the implementation of flush_tlb_kernel_range.
ETHTOOL_A_PAUSE_STAT_MAX is the MAX attribute id,
so we need to subtract non-stats and add one to
get a count (IOW -2+1 == -1).
Otherwise we'll see:
ethnl cmd 21: calculated reply length 40, but consumed 52
Fixes: 9a27a33027f2 ("ethtool: add standard pause stats") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When generating the selftests to another folder, the icmp.sh test will
miss as it is not in Makefile, e.g.
make -C tools/testing/selftests/ install \
TARGETS="net" INSTALL_PATH=/tmp/kselftests
Fixes: 7e9838b7915e ("selftests/net: Add icmp.sh for testing ICMP dummy address responses") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch allows to use 0 for `coal->rx_coalesce_usecs` param to
disable rx irq coalescing.
Previously we could enable rx irq coalescing via ethtool
(For ex: `ethtool -C eth0 rx-usecs 2000`) but we couldn't disable
it because this part rejects 0 value:
If 'part_shift' is not zero, then 'index << part_shift' might
overflow to a value that is not greater than '0xfffff', then sysfs
might complains about duplicate creation.
Fixes: b0d9111a2d53 ("nbd: use an idr to keep track of nbd devices") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20211102015237.2309763-3-yebin10@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
commit b1a811633f73 ("block: nbd: add sanity check for first_minor")
checks that 'first_minor' should not be greater than 0xff, which is
wrong. Whitout the commit, the details that when user pass 0x100000,
it ends up create sysfs dir "/sys/block/43:0" are as follows:
By the way, with the wrong fix, when part_shift is the default value,
only 8 ndb devices can be created since 8 << 5 is greater than 0xff.
Since the max bits for 'first_minor' should be the same as what
MKDEV() does, which is 20. Change the upper bound of 'first_minor'
from 0xff to 0xfffff.
Return NULL instead of passing to ERR_PTR while err is zero,
this fix smatch warnings:
drivers/xen/xen-pciback/conf_space_capability.c:163
pm_ctrl_init() warn: passing zero to 'ERR_PTR'
Fixes: a92336a1176b ("xen/pciback: Drop two backends, squash and cleanup some code.") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20211008074417.8260-1-yuehaibing@huawei.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The irqchip uses one domain for all GPIO lines, so the line offset
should be determined w.r.t. the first line of the first port, not the
first line of the triggered port.
A successful 'clk_prepare()' call should be balanced by a corresponding
'clk_unprepare()' call in the error handling path of the probe, as already
done in the remove function.
More specifically, 'clk_prepare_enable()' is used, but 'clk_disable()' is
also already called. So just the unprepare step has still to be done.
Update the error handling path accordingly.
Fixes: 75d31c2372e4 ("i2c: xlr: add support for Sigma Designs controller variant") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If we already hold open state on the client, yet the server gives us a
completely different stateid to the one we already hold, then we
currently treat it as if it were an out-of-sequence update, and wait for
5 seconds for other updates to come in.
This commit fixes the behaviour so that we immediately start processing
of the new stateid, and then leave it to the call to
nfs4_test_and_free_stateid() to decide what to do with the old stateid.
Fixes: b4868b44c562 ("NFSv4: Wait for stateid updates after CLOSE/OPEN_DOWNGRADE") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When user uses issue_lip to do link bounce, driver sends additional target
reset to remote device before resetting the link. The target reset would
affect other paths with active I/Os. This patch will remove the unnecessary
target reset.
Link: https://lore.kernel.org/r/20211026115412.27691-4-njavali@marvell.com Fixes: 5854771e314e ("[SCSI] qla2xxx: Add ISPFX00 specific bus reset routine") Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently TMF commands are removed from de_device.dev_tmf_list at the very
end of se_cmd lifecycle. However, se_lun unlinks from se_cmd upon a command
status (response) being queued in transport layer. This means that LUN and
backend device can be deleted in the meantime and a panic will occur:
target_tmr_work()
cmd->se_tfo->queue_tm_rsp(cmd); // send abort_rsp to a wire
transport_lun_remove_cmd(cmd) // unlink se_cmd from se_lun
- // - // - // -
<<<--- lun remove
<<<--- core backend device remove
- // - // - // -
qlt_handle_abts_completion()
tfo->free_mcmd()
transport_generic_free_cmd()
target_put_sess_cmd()
core_tmr_release_req() {
if (dev) { // backend device, can not be null
spin_lock_irqsave(&dev->se_tmr_lock, flags); //<<<--- CRASH
- Send next iSCSI command to acknowledge the Abort_Response
- Target panics
There is no need to keep the command in tmr_list until response completion,
so move the removal from tmr_list from the response completion to the
response queueing when the LUN is unlinked. Move the removal from state
list too as it is a subject to the same race condition.
Link: https://lore.kernel.org/r/20211018135753.15297-1-d.bogdanov@yadro.com Fixes: c66ac9db8d4a ("[SCSI] target: Add LIO target core v4.0.0-rc6") Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Avoid error like:
In file included from drivers/watchdog/ar7_wdt.c:29:
./arch/mips/include/asm/mach-ar7/ar7.h: In function ‘ar7_is_titan’:
./arch/mips/include/asm/mach-ar7/ar7.h:111:24: error: implicit declaration of function ‘KSEG1ADDR’; did you mean ‘CKSEG1ADDR’? [-Werror=implicit-function-declaration]
111 | return (readl((void *)KSEG1ADDR(AR7_REGS_GPIO + 0x24)) & 0xffff) ==
| ^~~~~~~~~
| CKSEG1ADDR
Fixes: da2a68b3eb47 ("watchdog: Enable COMPILE_TEST where possible") Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20210907024904.4127611-1-liu.yun@linux.dev Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The fintek watchdog timer can configure timeouts of second granularity
only up to 255 seconds. Beyond that, the timeout needs to be configured
with minute granularity. WDIOC_GETTIMEOUT should report the actual
timeout configured, not just echo back the timeout configured by the
user. Do so.
Fixes: 96cb4eb019ce ("watchdog: f71808e_wdt: new watchdog driver for Fintek F71808E and F71882FG") Suggested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Link: https://lore.kernel.org/r/5e17960fe8cc0e3cb2ba53de4730b75d9a0f33d5.1628525954.git-series.a.fatoum@pengutronix.de Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
'make randconfig' can produce a .config file with
"CONFIG_MEMORY_RESERVE=" (no value) since it has no default.
When a subsequent 'make all' is done, kconfig restarts the config
and prompts for a value for MEMORY_RESERVE. This breaks
scripting/automation where there is no interactive user input.
Add a default value for MEMORY_RESERVE. (Any integer value will
work here for kconfig.)
Fixes a kconfig warning:
.config:214:warning: symbol value '' invalid for MEMORY_RESERVE
* Restart config...
Memory reservation (MiB) (MEMORY_RESERVE) [] (NEW)
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") # from beginning of git history Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: linux-m68k@lists.linux-m68k.org Signed-off-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Today the sh code allocates memory the first time a process uses
the fpu. If that memory allocation fails, kill the affected task
with force_sig(SIGKILL) rather than do_group_exit(SIGKILL).
Calling do_group_exit from an exception handler can potentially lead
to dead locks as do_group_exit is not designed to be called from
interrupt context. Instead use force_sig(SIGKILL) to kill the
userspace process. Sending signals in general and force_sig in
particular has been tested from interrupt context so there should be
no problems.
Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: linux-sh@vger.kernel.org Fixes: 0ea820cf9bf5 ("sh: Move over to dynamically allocated FPU context.") Link: https://lkml.kernel.org/r/20211020174406.17889-6-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Before the `callback_result` callback was introduced drivers coded their
invocation to the callback in a similar way to:
if (cb->callback) {
spin_unlock(&dma->lock);
cb->callback(cb->callback_param);
spin_lock(&dma->lock);
}
With the introduction of `callback_result` two helpers where introduced to
transparently handle both types of callbacks. And drivers where updated to
look like this:
if (dmaengine_desc_callback_valid(cb)) {
spin_unlock(&dma->lock);
dmaengine_desc_callback_invoke(cb, ...);
spin_lock(&dma->lock);
}
dmaengine_desc_callback_invoke() correctly handles both `callback_result`
and `callback`. But we forgot to update the dmaengine_desc_callback_valid()
function to check for `callback_result`. As a result DMA descriptors that
use the `callback_result` rather than `callback` don't have their callback
invoked by drivers that follow the pattern above.
Fix this by checking for both `callback` and `callback_result` in
dmaengine_desc_callback_valid().
Fixes: f067025bc676 ("dmaengine: add support to provide error result from a DMA transation") Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Acked-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20211023134101.28042-1-lars@metafoo.de Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Free the kbuf buffer before returning from the dpaa2_console_read()
function. The variable no longer goes out of scope, leaking the storage
it points to.
Fix this by providing a handler that always returns zero, to make sure
blank events will be sent to the actual device handling the backlight.
Reported-by: Robin van der Gracht <robin@protonic.nl> Suggested-by: Robin van der Gracht <robin@protonic.nl> Fixes: 8992da44c6805d53 ("auxdisplay: ht16k33: Driver for LED controller") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently /sys/class/graphics/fb0/bl_curve is not accessible (-ENODEV),
as the driver does not connect the backlight to the frame buffer device.
Fix this moving backlight initialization up, and filling in
fb_info.bl_dev.
Fixes: 8992da44c6805d53 ("auxdisplay: ht16k33: Driver for LED controller") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Robin van der Gracht <robin@protonic.nl> Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
While writing an empty string to a device attribute is a no-op, and thus
does not need explicit safeguards, the user can still write a single
newline to an attribute file:
echo > .../message
If that happens, img_ascii_lcd_display() trims the newline, yielding an
empty string, and causing an infinite loop in img_ascii_lcd_scroll().
Fix this by adding a check for empty strings. Clear the display in case
one is encountered.
buf_addr parameter of stm32_dma_set_xfer_param function is a dma_addr_t.
We only need to check the remainder of buf_addr/max_width, so, no need to
use do_div and extra u64 addr. Use '%' instead.
Fixes: e0ebdbdcb42a ("dmaengine: stm32-dma: take address into account when computing max width") Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com> Link: https://lore.kernel.org/r/20211011094259.315023-3-amelie.delaunay@foss.st.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
AT_XDMAC_CC_PERID() should be used to setup bits 24..30 of XDMAC_CC
register. Using it without parenthesis around 0x7f & (i) will lead to
setting all the time zero for bits 24..30 of XDMAC_CC as the << operator
has higher precedence over bitwise &. Thus, add paranthesis around
0x7f & (i).
at_xdmac could be used on SoCs which supports backup mode (where most
of the SoC power, including power to DMA controller, is closed at suspend
time). Thus, on resume, the settings which were previously done need to be
restored. Do the same for axi configuration.
If 'copy_dma_range_map() fails, the memory allocated for 'rvdev' will leak.
Move the 'copy_dma_range_map()' call after the device registration so
that 'rproc_rvdev_release()' can be called to free some resources.
Also, branch to the error handling path if 'copy_dma_range_map()' instead
of a direct return to avoid some other leaks.
Previously, if del_mtd_device() failed with -EBUSY due to a non-zero
usecount, a subsequent call to attempt the deletion again would try to
remove a debugfs directory that had already been removed and panic.
With this change the second call can instead proceed safely.
Under the following conditions:
* after rounding up by 4 the number of bytes to transfer (this is
related to the controller's internal constraints),
* if this (rounded) amount of data is situated beyond the end of the
device,
* and only in NV-DDR mode,
the Arasan NAND controller timeouts.
This currently can happen in a particular helper used when picking
software ECC algorithms. Let's prevent this situation by refusing to use
the NV-DDR interface with software engines.
The condition register PCI_RCV_INTX is used in irq_mask() and irq_unmask()
callbacks. Accesses to register can occur at the same time without a lock.
Add a lock into each callback to prevent the issue.
And INTX mask and unmask fields in PCL_RCV_INTX register should only be
set/reset for each bit. Clearing by PCL_RCV_INTX_ALL_MASK should be
removed.
INTX status fields in PCL_RCV_INTX register only indicates each INTX
interrupt status, so the handler can't clear by writing 1 to the field.
The status is expected to be cleared by the interrupt origin.
The ack function has no meaning, so should remove it.
hisi_spi_nor_probe() invokes clk_disable_unprepare() on all paths after
successful call of clk_prepare_enable(). Besides, the clock is enabled by
hispi_spi_nor_prep() and disabled by hispi_spi_nor_unprep(). So at remove
time it is not possible to have the clock enabled. The patch removes
excessive clk_disable_unprepare() from hisi_spi_nor_remove().
Found by Linux Driver Verification project (linuxtesting.org).
I have a ST1633 touch controller which fails to probe due to a timeout
waiting for the controller to become ready. Increasing the minimum
delay to 100ms ensures that the probe sequence completes successfully.
The ST1633 datasheet says nothing about the maximum delay here and the
ST1232 I2C protocol document says "wait until" with no notion of a
timeout.
Since this only runs once during probe, being generous with the timout
seems reasonable and most likely the device will become ready
eventually.
(It may be worth noting that I saw this issue with a PREEMPT_RT patched
kernel which probably has tighter wakeups from usleep_range() than other
preemption models.)
Fixes: f605be6a57b4 ("Input: st1232 - wait until device is ready before reading resolution") Signed-off-by: John Keeping <john@metanate.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20210929152609.2421483-1-john@metanate.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When op_alloc() returns NULL to new_op, no error return code of
orangefs_revalidate_lookup() is assigned.
To fix this bug, ret is assigned with -ENOMEM in this case.
Fixes: 8bb8aefd5afb ("OrangeFS: Change almost all instances of the string PVFS2 to OrangeFS.") Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Partially revert commit 2ce209c42c01 ("NFS: Wait for requests that are
locked on the commit list"), since it can lead to deadlocks between
commit requests and nfs_join_page_group().
For now we should assume that any locked requests on the commit list are
either about to be removed and committed by another task, or the writes
they describe are about to be retransmitted. In either case, we should
not need to worry.
Fixes: 2ce209c42c01 ("NFS: Wait for requests that are locked on the commit list") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 43f5c77bcbd2 ("PCI: aardvark: Fix reporting CRS value") started
using CRSSVE flag for handling CRS responses.
PCI_EXP_RTCTL_CRSSVE flag is stored only in emulated config space buffer
and there is handler for PCI_EXP_RTCTL register. So every read operation
from config space automatically clears CRSSVE flag as it is not defined in
PCI_EXP_RTCTL read handler.
Fix this by reading current CRSSVE bit flag from emulated space buffer and
appending it to PCI_EXP_RTCTL read response.
Link: https://lore.kernel.org/r/20211005180952.6812-5-kabel@kernel.org Fixes: 43f5c77bcbd2 ("PCI: aardvark: Fix reporting CRS value") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Marek Behún <kabel@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
drivers/gpu/drm/drm_plane_helper.c: In function 'drm_primary_helper_update':
drivers/gpu/drm/drm_plane_helper.c:113:32: error: 'visible' is used uninitialized [-Werror=uninitialized]
113 | struct drm_plane_state plane_state = {
| ^~~~~~~~~~~
drivers/gpu/drm/drm_plane_helper.c:178:14: note: 'visible' was declared here
178 | bool visible;
| ^~~~~~~
cc1: all warnings being treated as errors
visible is an output, not an input. in practice this use might turn out
OK but it's still UB.
Fixes: df86af9133b4 ("drm/plane-helper: Add drm_plane_helper_check_state()") Reviewed-by: Simon Ser <contact@emersion.fr> Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca> Signed-off-by: Simon Ser <contact@emersion.fr> Link: https://patchwork.freedesktop.org/patch/msgid/20211007063706.305984-1-alex_y_xu@yahoo.ca Signed-off-by: Sasha Levin <sashal@kernel.org>
_nfs4_pnfs_v3/v4_ds_connect do
some work
smp_wmb
ds->ds_clp = clp;
And nfs4_ff_layout_prepare_ds currently does
smp_rmb
if(ds->ds_clp)
...
This patch places the smp_rmb after the if. This ensures that following
reads only happen once nfs4_ff_layout_prepare_ds has checked that data
has been properly initialized.
If the directory changed while we were revalidating the dentry, then
don't update the dentry verifier. There is no value in setting the
verifier to an older value, and we could end up overwriting a more up to
date verifier from a parallel revalidation.
Fixes: efeda80da38d ("NFSv4: Fix revalidation of dentries with delegations") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Both NFSv3 and NFSv2 generate their change attribute from the ctime
value that was supplied by the server. However the problem is that there
are plenty of servers out there with ctime resolutions of 1ms or worse.
In a modern performance system, this is insufficient when trying to
decide which is the most recent set of attributes when, for instance, a
READ or GETATTR call races with a WRITE or SETATTR.
For this reason, let's revert to labelling the NFSv2/v3 change
attributes as NFS4_CHANGE_TYPE_IS_UNDEFINED. This will ensure we protect
against such races.
Fixes: 7b24dacf0840 ("NFS: Another inode revalidation improvement") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding a SPI device ID table.
Fixes: 96c8395e2166 ("spi: Revert modalias changes") Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210927134104.38648-1-broonie@kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>