MIPS is the ONLY arch just defining __HAVE_ARCH_PMD_ALLOC_ONE alone.
Since commit b2b29d6d011944 (mm: account PMD tables like PTE tables),
"pmd_free" in asm-generic with PMD table accounting and "pmd_alloc_one"
in MIPS without PMD table accounting causes PageTable accounting number
negative, which read by global_zone_page_state(), always returns 0.
Based on 2001:3319 and 2357:0109 which I used to test the fix and
0bda:818b and 2357:0108 for which I found efuse dumps online.
== 2357:0109 ==
=== Before ===
Vendor: Realtek
Product: \x03802.11n NI
Serial:
=== After ===
Vendor: Realtek
Product: 802.11n NIC
Serial not available.
== 2001:3319 ==
=== Before ===
Vendor: Realtek
Product: Wireless N
Serial: no USB Adap
=== After ===
Vendor: Realtek
Product: Wireless N Nano USB Adapter
Serial not available.
The iv from RXD is only for TKIP_RSC/CCMP_PN/GCMP_PN, and it needs a
check for CCMP header insertion. Move mt76_cipher_type to mt76.h to
reduce duplicated code.
Signed-off-by: Xing Song <xing.song@mediatek.com> Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Sasha Levin <sashal@kernel.org>
When nla_put_u32() fails, 'ret' could be 0, it should
return error code in tcf_del_walker().
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
This node pointer is returned by of_parse_phandle() with refcount
incremented in this function. of_node_put() on it before exiting
this function.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The entry for PTYPE 2 in the ice_ptype_lkup table incorrectly states
that this is an L2 packet with no payload. According to the datasheet,
this PTYPE is actually unused and reserved.
Fix the lookup entry to indicate this is an unused entry that is
reserved.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The entry for PTYPE 90 indicates that the payload is layer 3. This does
not match the specification in the datasheet which indicates the packet
is a MAC, IPv6, UDP packet, with a payload in layer 4.
Fix the lookup table to match the data sheet.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
syzbot reported a shift-out-of-bounds that KUBSAN observed in the
interpreter:
[...]
UBSAN: shift-out-of-bounds in kernel/bpf/core.c:1420:2
shift exponent 255 is too large for 64-bit type 'long long unsigned int'
CPU: 1 PID: 11097 Comm: syz-executor.4 Not tainted 5.12.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x141/0x1d7 lib/dump_stack.c:120
ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
__ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:327
___bpf_prog_run.cold+0x19/0x56c kernel/bpf/core.c:1420
__bpf_prog_run32+0x8f/0xd0 kernel/bpf/core.c:1735
bpf_dispatcher_nop_func include/linux/bpf.h:644 [inline]
bpf_prog_run_pin_on_cpu include/linux/filter.h:624 [inline]
bpf_prog_run_clear_cb include/linux/filter.h:755 [inline]
run_filter+0x1a1/0x470 net/packet/af_packet.c:2031
packet_rcv+0x313/0x13e0 net/packet/af_packet.c:2104
dev_queue_xmit_nit+0x7c2/0xa90 net/core/dev.c:2387
xmit_one net/core/dev.c:3588 [inline]
dev_hard_start_xmit+0xad/0x920 net/core/dev.c:3609
__dev_queue_xmit+0x2121/0x2e00 net/core/dev.c:4182
__bpf_tx_skb net/core/filter.c:2116 [inline]
__bpf_redirect_no_mac net/core/filter.c:2141 [inline]
__bpf_redirect+0x548/0xc80 net/core/filter.c:2164
____bpf_clone_redirect net/core/filter.c:2448 [inline]
bpf_clone_redirect+0x2ae/0x420 net/core/filter.c:2420
___bpf_prog_run+0x34e1/0x77d0 kernel/bpf/core.c:1523
__bpf_prog_run512+0x99/0xe0 kernel/bpf/core.c:1737
bpf_dispatcher_nop_func include/linux/bpf.h:644 [inline]
bpf_test_run+0x3ed/0xc50 net/bpf/test_run.c:50
bpf_prog_test_run_skb+0xabc/0x1c50 net/bpf/test_run.c:582
bpf_prog_test_run kernel/bpf/syscall.c:3127 [inline]
__do_sys_bpf+0x1ea9/0x4f00 kernel/bpf/syscall.c:4406
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae
[...]
Generally speaking, KUBSAN reports from the kernel should be fixed.
However, in case of BPF, this particular report caused concerns since
the large shift is not wrong from BPF point of view, just undefined.
In the verifier, K-based shifts that are >= {64,32} (depending on the
bitwidth of the instruction) are already rejected. The register-based
cases were not given their content might not be known at verification
time. Ideas such as verifier instruction rewrite with an additional
AND instruction for the source register were brought up, but regularly
rejected due to the additional runtime overhead they incur.
As Edward Cree rightly put it:
Shifts by more than insn bitness are legal in the BPF ISA; they are
implementation-defined behaviour [of the underlying architecture],
rather than UB, and have been made legal for performance reasons.
Each of the JIT backends compiles the BPF shift operations to machine
instructions which produce implementation-defined results in such a
case; the resulting contents of the register may be arbitrary but
program behaviour as a whole remains defined.
Guard checks in the fast path (i.e. affecting JITted code) will thus
not be accepted.
The case of division by zero is not truly analogous here, as division
instructions on many of the JIT-targeted architectures will raise a
machine exception / fault on division by zero, whereas (to the best
of my knowledge) none will do so on an out-of-bounds shift.
Given the KUBSAN report only affects the BPF interpreter, but not JITs,
one solution is to add the ANDs with 63 or 31 into ___bpf_prog_run().
That would make the shifts defined, and thus shuts up KUBSAN, and the
compiler would optimize out the AND on any CPU that interprets the shift
amounts modulo the width anyway (e.g., confirmed from disassembly that
on x86-64 and arm64 the generated interpreter code is the same before
and after this fix).
The BPF interpreter is slow path, and most likely compiled out anyway
as distros select BPF_JIT_ALWAYS_ON to avoid speculative execution of
BPF instructions by the interpreter. Given the main argument was to
avoid sacrificing performance, the fact that the AND is optimized away
from compiler for mainstream archs helps as well as a solution moving
forward. Also add a comment on LSH/RSH/ARSH translation for JIT authors
to provide guidance when they see the ___bpf_prog_run() interpreter
code and use it as a model for a new JIT backend.
Reported-by: syzbot+bed360704c521841c85d@syzkaller.appspotmail.com Reported-by: Kurt Manucredo <fuzzybritches0@gmail.com> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Co-developed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: syzbot+bed360704c521841c85d@syzkaller.appspotmail.com Cc: Edward Cree <ecree.xilinx@gmail.com> Link: https://lore.kernel.org/bpf/0000000000008f912605bd30d5d7@google.com Link: https://lore.kernel.org/bpf/bac16d8d-c174-bdc4-91bd-bfa62b410190@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Don't check the sequence number when deciding when to update time_in in
the node table if tag removal is offloaded since the sequence number is
part of the tag. This fixes a problem where the times in the node table
wouldn't update when 0 appeared to be before or equal to seq_out when
tag removal was offloaded.
Signed-off-by: George McCollister <george.mccollister@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Calling free_mqd inside of destroy_queue_nocpsch_locked can cause a
circular lock. destroy_queue_nocpsch_locked is called under a DQM lock,
which is taken in MMU notifiers, potentially in FS reclaim context.
Taking another lock, which is BO reservation lock from free_mqd, while
causing an FS reclaim inside the DQM lock creates a problematic circular
lock dependency. Therefore move free_mqd out of
destroy_queue_nocpsch_locked and call it after unlocking DQM.
Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
get_wave_state acquires the mmap_lock on copy_to_user but so do
mmu_notifiers. mmu_notifiers allows dqm locking so do get_wave_state
outside the dqm_lock to prevent circular locking.
v2: squash in unused variable removal.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch adds missing MODULE_DEVICE_TABLE definition which generates
correct modalias for automatic loading of this driver when it is built
as an external module.
At least on wl12xx, reading the MAC after boot can fail with a warning
at drivers/net/wireless/ti/wlcore/sdio.c:78 wl12xx_sdio_raw_read.
The failed call comes from wl12xx_get_mac() that wlcore_nvs_cb() calls
after request_firmware_work_func().
After the error, no wireless interface is created. Reloading the wl12xx
module makes the interface work.
Turns out the wlan controller can be in a low-power ELP state after the
boot from the bootloader or kexec, and needs to be woken up first.
Let's wake the hardware and add a sleep after that similar to
wl12xx_pre_boot() is already doing.
Note that a similar issue could exist for wl18xx, but I have not seen it
so far. And a search for wl18xx_get_mac and wl12xx_sdio_raw_read did not
produce similar errors.
When memory allocation for XFRMA_ENCAP or XFRMA_COADDR fails,
the error will not be reported because the -ENOMEM assignment
to the err variable is overwritten before. Fix this by moving
these two in front of the function so that memory allocation
failures will be reported.
Fix the following kernel build warnings:
drivers/net/ethernet/ibm/ibmvnic.c:1516: warning: Function parameter or member 'skb' not described in 'build_hdr_descs_arr'
drivers/net/ethernet/ibm/ibmvnic.c:1516: warning: Function parameter or member 'indir_arr' not described in 'build_hdr_descs_arr'
drivers/net/ethernet/ibm/ibmvnic.c:1516: warning: Excess function parameter 'txbuff' description in 'build_hdr_descs_arr'
Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
It has been reported that on RTL8106e the link-up interrupt may be
significantly delayed if the user enables ASPM L1. Per default ASPM
is disabled. The change leaves L1 enabled on the PCIe link (thus still
allowing to reach higher package power saving states), but the
NIC won't actively trigger it.
Reported-by: Koba Ko <koba.ko@canonical.com> Tested-by: Koba Ko <koba.ko@canonical.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the field, we have seen lots of allocation failure from the call
path below.
06-03 13:29:12.999 1010315 31557 31557 W Binder : 31542_2: page allocation failure: order:0, mode:0x800(GFP_NOWAIT), nodemask=(null),cpuset=background,mems_allowed=0
...
...
06-03 13:29:12.999 1010315 31557 31557 W Call trace:
06-03 13:29:12.999 1010315 31557 31557 W : dump_backtrace.cfi_jt+0x0/0x8
06-03 13:29:12.999 1010315 31557 31557 W : dump_stack+0xc8/0x14c
06-03 13:29:12.999 1010315 31557 31557 W : warn_alloc+0x158/0x1c8
06-03 13:29:12.999 1010315 31557 31557 W : __alloc_pages_slowpath+0x9d8/0xb80
06-03 13:29:12.999 1010315 31557 31557 W : __alloc_pages_nodemask+0x1c4/0x430
06-03 13:29:12.999 1010315 31557 31557 W : allocate_slab+0xb4/0x390
06-03 13:29:12.999 1010315 31557 31557 W : ___slab_alloc+0x12c/0x3a4
06-03 13:29:12.999 1010315 31557 31557 W : kmem_cache_alloc+0x358/0x5e4
06-03 13:29:12.999 1010315 31557 31557 W : avc_alloc_node+0x30/0x184
06-03 13:29:12.999 1010315 31557 31557 W : avc_update_node+0x54/0x4f0
06-03 13:29:12.999 1010315 31557 31557 W : avc_has_extended_perms+0x1a4/0x460
06-03 13:29:12.999 1010315 31557 31557 W : selinux_file_ioctl+0x320/0x3d0
06-03 13:29:12.999 1010315 31557 31557 W : __arm64_sys_ioctl+0xec/0x1fc
06-03 13:29:12.999 1010315 31557 31557 W : el0_svc_common+0xc0/0x24c
06-03 13:29:12.999 1010315 31557 31557 W : el0_svc+0x28/0x88
06-03 13:29:12.999 1010315 31557 31557 W : el0_sync_handler+0x8c/0xf0
06-03 13:29:12.999 1010315 31557 31557 W : el0_sync+0x1a4/0x1c0
..
..
06-03 13:29:12.999 1010315 31557 31557 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT|__GFP_ZERO)
06-03 13:29:12.999 1010315 31557 31557 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0
06-03 13:29:12.999 1010315 31557 31557 W node 0 : slabs: 57, objs: 2907, free: 0
06-03 13:29:12.999 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT|__GFP_ZERO)
06-03 13:29:12.999 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0
06-03 13:29:12.999 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0
06-03 13:29:12.999 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT|__GFP_ZERO)
06-03 13:29:12.999 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0
06-03 13:29:12.999 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0
06-03 13:29:12.999 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT|__GFP_ZERO)
06-03 13:29:12.999 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0
06-03 13:29:12.999 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0
06-03 13:29:13.000 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT|__GFP_ZERO)
06-03 13:29:13.000 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0
06-03 13:29:13.000 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0
06-03 13:29:13.000 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT|__GFP_ZERO)
06-03 13:29:13.000 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0
06-03 13:29:13.000 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0
06-03 13:29:13.000 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT|__GFP_ZERO)
06-03 13:29:13.000 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0
06-03 13:29:13.000 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0
06-03 13:29:13.000 10230 30892 30892 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT|__GFP_ZERO)
06-03 13:29:13.000 10230 30892 30892 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0
06-03 13:29:13.000 10230 30892 30892 W node 0 : slabs: 57, objs: 2907, free: 0
06-03 13:29:13.000 10230 30892 30892 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT|__GFP_ZERO)
06-03 13:29:13.000 10230 30892 30892 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0
Based on [1], selinux is tolerate for failure of memory allocation.
Then, use __GFP_NOWARN together.
[1] 476accbe2f6e ("selinux: use GFP_NOWAIT in the AVC kmem_caches")
Signed-off-by: Minchan Kim <minchan@kernel.org>
[PM: subj fix, line wraps, normalized commit refs] Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Several tests do not set some ports down as part of their cleanup(),
resulting in IPv6 link-local addresses and associated routes not being
deleted.
These leaks were found using a BPF tool that monitors ASIC resources.
Solve this by setting the ports down at the end of the tests.
Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
PHY will delay about 11.5ms to generate RXC clock when switching from
power down to normal operation. Read/write registers would also cause RXC
become unstable and stop for a while during this process. Realtek engineer
suggests 15ms or more delay can workaround this issue.
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
[why]
When OS overrides training link training parameters
for MST device to SST mode, MST resources are not
released and leak of the resource may result crash and
incorrect MST discovery during following hot plugs.
[how]
Retaining sink object to be reused by SST link and
releasing MST resources.
Signed-off-by: Vladimir Stempen <vladimir.stempen@amd.com> Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[Why]
We update scaling settings when scaling mode has been changed.
However when changing mode from native resolution the scaling mode previously
set gets ignored.
[How]
Perform scaling settings update on modeset.
Signed-off-by: Roman Li <roman.li@amd.com> Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[why]
DSCCLK validation is not necessary because DSCCLK is derrived from
DISPCLK, therefore if DISPCLK validation passes, DSCCLK is valid, too.
Doing DSCLK validation in addition to DISPCLK leads to modes being
wrongly rejected when DSCCLK was incorrectly set outside of DML.
[how]
Remove DSCCLK validation because it's implicitly validated under DISPCLK
Signed-off-by: Nikola Cornij <nikola.cornij@amd.com> Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Use devm_platform_get_and_ioremap_resource() to simplify
code and avoid a null-ptr-deref by checking 'res' in it.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
We should not directly BUG() when there is hdr error, it is
better to output a print when such error happens. Currently,
the caller of xmit_skb() already did it.
Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
clang generates deadcode.DeadStores warnings when a variable
is used to read a value, but then that value isn't used later
in the code. Fix this warning.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
It may need hold Global Config Lock a longer time when download DDP
package file, extend the timeout value to 5000ms to ensure that
download can be finished before other AQ command got time to run,
this will fix the issue below when probe the device, 5000ms is a test
value that work with both Backplane and BreakoutCable NVM image:
ice 0000:f4:00.0: VSI 12 failed lan queue config, error ICE_ERR_CFG
ice 0000:f4:00.0: Failed to delete VSI 12 in FW - error: ICE_ERR_AQ_TIMEOUT
ice 0000:f4:00.0: probe failed due to setup PF switch: -12
ice: probe of 0000:f4:00.0 failed with error -12
Signed-off-by: Liwei Song <liwei.song@windriver.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Checking value of MCP_INTF in mcp23s08_irq suggests that the handler may be
called even when there is no interrupt pending.
But the actual interrupt could happened between reading MCP_INTF and MCP_GPIO.
In this situation we got nothing from MCP_INTF, but the event gets acknowledged
on the expander by reading MCP_GPIO. This leads to losing events.
Fix the problem by not reading any register until we see something in MCP_INTF.
The error was reproduced and fix tested on MCP23017.
According to the standard IEC 62439-2, the number of transitions needs
to be counted for each transition 'between' ring state open and ring
state closed and not from open state to closed state.
Therefore fix this for both ring and interconnect ring.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Introduce the BIO flag BIO_ZONE_WRITE_LOCKED to indicate that a BIO owns
the write lock of the zone it is targeting. This is the counterpart of
the struct request flag RQF_ZONE_WRITE_LOCKED.
This new BIO flag is reserved for now for zone write locking control
for device mapper targets exposing a zoned block device. Since in this
case, the lock flag must not be propagated to the struct request that
will be used to process the BIO, a BIO private flag is used rather than
changing the RQF_ZONE_WRITE_LOCKED request flag into a common REQ_XXX
flag that could be used for both BIO and request. This avoids conflicts
down the stack with the block IO scheduler zone write locking
(in mq-deadline).
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix dm_accept_partial_bio() to actually check that zone management
commands are not passed as explained in the function documentation
comment. Also, since a zone append operation cannot be split, add
REQ_OP_ZONE_APPEND as a forbidden command.
White lines are added around the group of BUG_ON() calls to make the
code more legible.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If dm-writecache overwrites existing cached data, it splits the
incoming bio into many block-sized bios. The I/O scheduler does merge
these bios into one large request but this needless splitting and
merging causes performance degradation.
Fix this by avoiding bio splitting if the cache target area that is
being overwritten is contiguous.
Current commit code resets the place where the search for free blocks
will begin back to the start of the metadata device. There are a couple
of repercussions to this:
- The first allocation after the commit is likely to take longer than
normal as it searches for a free block in an area that is likely to
have very few free blocks (if any).
- Any free blocks it finds will have been recently freed. Reusing them
means we have fewer old copies of the metadata to aid recovery from
hardware error.
Fix these issues by leaving the cursor alone, only resetting when the
search hits the end of the metadata device.
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch aims to improve the situation when reordering and loss are
ocurring in the same flight of packets.
Previously the reordering would first induce a spurious recovery, then
the subsequent ACK may undo the cwnd (based on the timestamps e.g.).
However the current loss recovery does not proceed to invoke
RACK to install a reordering timer. If some packets are also lost, this
may lead to a long RTO-based recovery. An example is
https://groups.google.com/g/bbr-dev/c/OFHADvJbTEI
The solution is to after reverting the recovery, always invoke RACK
to either mount the RACK timer to fast retransmit after the reordering
window, or restarts the recovery if new loss is identified. Hence
it is possible the sender may go from Recovery to Disorder/Open to
Recovery again in one ACK.
Reported-by: mingkun bian <bianmingkun@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Take the situation with gfxoff, the optimization may cause
corrupt CE ram contents. In addition emit_cntxcntl callback
has similar optimization which firmware can handle properly
even for power feature.
Instead of reading the desired 5 bytes of the actual target field,
the code was reading 8. This could result in a corrupted value if the
trailing 3 bytes were non-zero, so instead use an appropriately sized
and zero-initialized bounce buffer, and read only 5 bytes before casting
to u64.
Re-adjust the function return order to avoid empty sdma version in the
sriov environment. (read amdgpu_firmware_info)
Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The clock driving the XBurst CPUs in Ingenic SoCs is integer divided
from the main PLL. As such, it is possible to control the frequency of
the CPU, either by changing the divider, or by changing the rate of the
main PLL.
The XBurst CPUs also lack the CP0 timer; the TCU, a separate piece of
hardware in the SoC, provides this functionality.
Signed-off-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Ingenic JZ4760 and JZ4760B do have a FPU, but the config registers don't
report it. Force the FPU detection in case the processor ID match the
JZ4760(B) one.
Signed-off-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
This is a complement to commit aa6dd211e4b1 ("inet: use bigger hash
table for IP ID generation"), but focusing on some specific aspects
of IPv6.
Contary to IPv4, IPv6 only uses packet IDs with fragments, and with a
minimum MTU of 1280, it's much less easy to force a remote peer to
produce many fragments to explore its ID sequence. In addition packet
IDs are 32-bit in IPv6, which further complicates their analysis. On
the other hand, it is often easier to choose among plenty of possible
source addresses and partially work around the bigger hash table the
commit above permits, which leaves IPv6 partially exposed to some
possibilities of remote analysis at the risk of weakening some
protocols like DNS if some IDs can be predicted with a good enough
probability.
Given the wide range of permitted IDs, the risk of collision is extremely
low so there's no need to rely on the positive increment algorithm that
is shared with the IPv4 code via ip_idents_reserve(). We have a fast
PRNG, so let's simply call prandom_u32() and be done with it.
Performance measurements at 10 Gbps couldn't show any difference with
the previous code, even when using a single core, because due to the
large fragments, we're limited to only ~930 kpps at 10 Gbps and the cost
of the random generation is completely offset by other operations and by
the network transfer time. In addition, this change removes the need to
update a shared entry in the idents table so it may even end up being
slightly faster on large scale systems where this matters.
The risk of at least one collision here is about 1/80 million among
10 IDs, 1/850k among 100 IDs, and still only 1/8.5k among 1000 IDs,
which remains very low compared to IPv4 where all IDs are reused
every 4 to 80ms on a 10 Gbps flow depending on packet sizes.
Reported-by: Amit Klein <aksecurity@gmail.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20210529110746.6796-1-w@1wt.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
pm_runtime_get_sync will increment pm usage counter even it failed.
Forgetting to putting operation will result in reference leak here.
Fix it by replacing it with pm_runtime_resume_and_get to keep usage
counter balanced.
The PLLU (USB) consists of the PLL configuration itself and configuration
of the PLLU outputs. The PLLU programming is inconsistent on T30 vs T114,
where T114 immediately bails out if PLLU is enabled and T30 re-enables
a potentially already enabled PLL (left after bootloader) and then fully
reprograms it, which could be unsafe to do. The correct way should be to
skip enabling of the PLL if it's already enabled and then apply
configuration to the outputs. This patch doesn't fix any known problems,
it's a minor improvement.
The refcounting of the gate clocks has a bug causing the enable_refcnt
to underflow when unused clocks are disabled. This happens because clk
provider erroneously bumps the refcount if clock is enabled at a boot
time, which it shouldn't be doing, and it does this only for the gate
clocks, while peripheral clocks are using the same gate ops and the
peripheral clocks are missing the initial bump. Hence the refcount of
the peripheral clocks is 0 when unused clocks are disabled and then the
counter is decremented further by the gate ops, causing the integer
underflow.
Fix this problem by removing the erroneous bump and by implementing the
disable_unused() callback, which disables the unused gates properly.
The visible effect of the bug is such that the unused clocks are never
gated if a loaded kernel module grabs the unused clocks and starts to use
them. In practice this shouldn't cause any real problems for the drivers
and boards supported by the kernel today.
Max IB immediate data size is 2^28 (MAX_IMM_PAYL_BITS)
and the minimum chunk size is 4096 (2^12).
Therefore the maximum sess_queue_depth is 65536 (2^16).
Link: https://lore.kernel.org/r/20210528113018.52290-6-jinpu.wang@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
stmmac_mdio_register() has logic to search for PHYs on the MDIO bus and
assign them IRQ lines, as well as to set priv->plat->phy_addr.
If no PHY is found, the "found" variable remains set to 0 and the
function errors out.
After the introduction of commit f213bbe8a9d6 ("net: stmmac: Integrate
it with DesignWare XPCS"), the "found" variable was immediately reused
for searching for a PCS on the same MDIO bus.
This can result in 2 types of potential problems (none of them seems to
be seen on the only Intel system that sets has_xpcs = true, otherwise it
would have been reported):
1. If a PCS is found but a PHY is not, then the code happily exits with
no error. One might say "yes, but this is not possible, because
of_mdiobus_register will probe a PHY for all MDIO addresses,
including for the XPCS, so if an XPCS exists, then a PHY certainly
exists too". Well, that is not true, see intel_mgbe_common_data():
Add dither_up, dsp_lut_en and data_blank registers to enable their
respective functionality for RK3066's VOP.
While at that also fix .rb_swap and .format registers for all windows,
which have to be set though RK3066_SYS_CTRL1 register.
Also remove .scl from win1: Scaling is only supported on the primary
plane.
Add dither_up, dsp_lut_en and data_blank registers to enable their
respective functionality for RK3188's VOP.
While at that also fix .dsp_blank register which is (only) set with
BIT24 (same as RK3066)
rep_tc copy REG_C1 to REG_B. IPsec crypto utilizes the whole REG_B
register with BIT31 as IPsec marker. rep_tc_update_skb drops
IPsec because it thought REG_B contains bad value.
In previous patch, BIT 31 of REG_C1 is reserved for IPsec.
Skip the rep_tc_update_skb if BIT31 of REG_B is set.
There are two issues with scaling calculations, odm recout
calculation and matching viewport to actual recout.
This change fixes both issues. Odm recout calculation via
special casing and viewport matching issue by reworking
the viewport calcualtion to use scaling ratios and recout
to derrive the required offset and size.
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
R-Car D3 ZA2 clock is from PLL0D3 or S0,
and it can be controlled by ZA2CKCR.
It is needed for R-Car Sound, but is not used so far.
Using default settings is very enough at this point.
This patch adds it by DEF_FIXED().
pm_runtime_get_sync will increment pm usage counter even it failed.
Forgetting to putting operation will result in reference leak here.
Fix it by replacing it with pm_runtime_resume_and_get to keep usage
counter balanced.
The igb driver was trying hard to be sparse correct, but somehow
ended up converting a variable into little endian order and then
tries to OR something with it.
A much plainer way of doing things is to leave all variables and
OR operations in CPU (non-endian) mode, and then convert to
little endian only once, which is what this change does.
This probably fixes a bug that might have been seen only on
big endian systems.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The sparse build (C=2) finds some issues with how the driver
dealt with the (very difficult) hardware that in some generations
uses little-endian, and in others uses big endian, for the VLAN
field. The code as written picks __le16 as a type and for some
hardware revisions we override it to __be16 as done in this
patch. This impacted the VF driver as well so fix it there too.
Also change the vlan_tci assignment to override the sparse
warning without changing functionality.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
pm_runtime_get_sync will increment pm usage counter even it failed.
Forgetting to putting operation will result in reference leak here.
Fix it by replacing it with pm_runtime_resume_and_get to keep usage
counter balanced.
In function udf_symlink, epos.bh is assigned with the value returned
by udf_tgetblk. The function udf_tgetblk is defined in udf/misc.c
and returns the value of sb_getblk function that could be NULL.
Then, epos.bh is used without any check, causing a possible
NULL pointer dereference when sb_getblk fails.
This fix adds a check to validate the value of epos.bh.
Problem: If scheduler is already stopped by the time sched_entity
is released and entity's job_queue not empty I encountred
a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle
never becomes false.
Fix: In drm_sched_fini detach all sched_entities from the
scheduler's run queues. This will satisfy drm_sched_entity_is_idle.
Also wakeup all those processes stuck in sched_entity flushing
as the scheduler main thread which wakes them up is stopped by now.
v2:
Reverse order of drm_sched_rq_remove_entity and marking
s_entity as stopped to prevent reinserion back to rq due
to race.
v3:
Drop drm_sched_rq_remove_entity, only modify entity->stopped
and check for it in drm_sched_entity_is_idle
This patch adds missing MODULE_DEVICE_TABLE definition which generates
correct modalias for automatic loading of this driver when it is built
as an external module.
modern userspace applications, like OVN, can configure the TC datapath to
"recirculate" packets several times. If more than 4 "recirculation" rules
are configured, packets can be dropped by __tcf_classify().
Changing the maximum number of reclassifications (from 4 to 16) should be
sufficient to prevent drops in most use cases, and guard against loops at
the same time.
Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Similar to the way in which of_mdiobus_register() has a fallback to the
non-DT based mdiobus_register() when CONFIG_OF is not set, we can create
a shim for the device-managed devm_of_mdiobus_register() which calls
devm_mdiobus_register() and discards the struct device_node *.
In particular, this solves a build issue with the qca8k DSA driver which
uses devm_of_mdiobus_register and can be compiled without CONFIG_OF.
Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The virtio_gpu_init() will free vgdev and vgdev->vbufs on failure.
But such failure will be caught by virtio_gpu_probe() and then
virtio_gpu_release() will be called to do some cleanup which
will free vgdev and vgdev->vbufs again. So let's set dev->dev_private
to NULL to avoid double free.
This patch adds missing MODULE_DEVICE_TABLE definition which generates
correct modalias for automatic loading of this driver when it is built
as an external module.
mdio drivers should not use REGCHACHE. Also disable locking since it's
handled by the mdio users and regmap is always accessed atomically.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
pm_runtime_get_sync will increment pm usage counter even it failed.
Forgetting to putting operation will result in reference leak here.
Fix it by replacing it with pm_runtime_resume_and_get to keep usage
counter balanced.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Li <wangli74@huawei.com> Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
__napi_schedule_irqoff() is an optimized version of __napi_schedule()
which can be used where it is known that interrupts are disabled,
e.g. in interrupt-handlers, spin_lock_irq() sections or hrtimer
callbacks.
On PREEMPT_RT enabled kernels this assumptions is not true. Force-
threaded interrupt handlers and spinlocks are not disabling interrupts
and the NAPI hrtimer callback is forced into softirq context which runs
with interrupts enabled as well.
Chasing all usage sites of __napi_schedule_irqoff() is a whack-a-mole
game so make __napi_schedule_irqoff() invoke __napi_schedule() for
PREEMPT_RT kernels.
The callers of ____napi_schedule() in the networking core have been
audited and are correct on PREEMPT_RT kernels as well.
Reported-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
This module's remove path calls del_timer(). However, that function
does not wait until the timer handler finishes. This means that the
timer handler may still be running after the driver's remove function
has finished, which would result in a use-after-free.
Fix by calling del_timer_sync(), which makes sure the timer handler
has finished, and unable to re-schedule itself.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
This module's remove path calls del_timer(). However, that function
does not wait until the timer handler finishes. This means that the
timer handler may still be running after the driver's remove function
has finished, which would result in a use-after-free.
Fix by calling del_timer_sync(), which makes sure the timer handler
has finished, and unable to re-schedule itself.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
This module's remove path calls del_timer(). However, that function
does not wait until the timer handler finishes. This means that the
timer handler may still be running after the driver's remove function
has finished, which would result in a use-after-free.
Fix by calling del_timer_sync(), which makes sure the timer handler
has finished, and unable to re-schedule itself.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
If multiple threads are accessing the same huge page at the same
time, hugetlb_cow will be called if one thread write the COW huge
page. And function huge_ptep_clear_flush is called to notify other
threads to clear the huge pte tlb entry. The other threads clear
the huge pte tlb entry and reload it from page table, the reload
huge pte entry may be old.
This patch fixes this issue on mips platform, and it clears huge
pte entry before notifying other threads to flush current huge
page entry, it is similar with other architectures.
Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Change to 60s. This matches what we already do in virtualization.
Infinite timeout can lead to deadlocks in the kernel.
Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>