The dev_warn to notify about a spurious interrupt was introduced with
the reasoning that these are unexpected. However spurious interrupts
tend to trigger continously and the error message on the serial console
prevents that the core's detection of spurious interrupts kicks in
(which disables the irq) and just floods the console.
Fixes: c64e7efe46b7 ("spi: stm32: make spurious and overrun interrupts visible") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://msgid.link/r/20240521105241.62400-2-u.kleine-koenig@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently, comparisons to 'm' or 'n' result in incorrect output.
[Test Code]
config MODULES
def_bool y
modules
config A
def_tristate m
config B
def_bool A > n
CONFIG_B is unset, while CONFIG_B=y is expected.
The reason for the issue is because Kconfig compares the tristate values
as strings.
Currently, the .type fields in the constant symbol definitions,
symbol_{yes,mod,no} are unspecified, i.e., S_UNKNOWN.
When expr_calc_value() evaluates 'A > n', it checks the types of 'A' and
'n' to determine how to compare them.
The left-hand side, 'A', is a tristate symbol with a value of 'm', which
corresponds to a numeric value of 1. (Internally, 'y', 'm', and 'n' are
represented as 2, 1, and 0, respectively.)
The right-hand side, 'n', has an unknown type, so it is treated as the
string "n" during the comparison.
expr_calc_value() compares two values numerically only when both can
have numeric values. Otherwise, they are compared as strings.
symbol numeric value ASCII code
-------------------------------------
y 2 0x79
m 1 0x6d
n 0 0x6e
'm' is greater than 'n' if compared numerically (since 1 is greater
than 0), but smaller than 'n' if compared as strings (since the ASCII
code 0x6d is smaller than 0x6e).
Specifying .type=S_TRISTATE for symbol_{yes,mod,no} fixes the above
test code.
Doing so, however, would cause a regression to the following test code.
[Test Code 2]
config MODULES
def_bool n
modules
config A
def_tristate n
config B
def_bool A = m
You would get CONFIG_B=y, while CONFIG_B should not be set.
The reason is because sym_get_string_value() turns 'm' into 'n' when the
module feature is disabled. Consequently, expr_calc_value() evaluates
'A = n' instead of 'A = m'. This oddity has been hidden because the type
of 'm' was previously S_UNKNOWN instead of S_TRISTATE.
sym_get_string_value() should not tweak the string because the tristate
value has already been correctly calculated. There is no reason to
return the string "n" where its tristate value is mod.
Fixes: 31847b67bec0 ("kconfig: allow use of relations other than (in)equality") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
syzbot reports:
general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
[..]
RIP: 0010:nf_tproxy_laddr4+0xb7/0x340 net/ipv4/netfilter/nf_tproxy_ipv4.c:62
Call Trace:
nft_tproxy_eval_v4 net/netfilter/nft_tproxy.c:56 [inline]
nft_tproxy_eval+0xa9a/0x1a00 net/netfilter/nft_tproxy.c:168
__in_dev_get_rcu() can return NULL, so check for this.
Reported-and-tested-by: syzbot+b94a6818504ea90d7661@syzkaller.appspotmail.com Fixes: cc6eb4338569 ("tproxy: use the interface primary IP address as a default value for --on-ip") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Userspace assumes vlan header is present at a given offset, but vlan
offload allows to store this in metadata fields of the skbuff. Hence
mangling vlan results in a garbled packet. Handle this transparently by
adding a parser to the kernel.
If vlan metadata is present and payload offset is over 12 bytes (source
and destination mac address fields), then subtract vlan header present
in vlan metadata, otherwise mangle vlan metadata based on offset and
length, extracting data from the source register.
This is similar to:
8cfd23e67401 ("netfilter: nft_payload: work around vlan header stripping")
When fec_probe() fails or fec_drv_remove() needs to release the
fec queue and remove a NAPI context, therefore add a function
corresponding to fec_enet_init() and call fec_enet_deinit() which
does the opposite to release memory and remove a NAPI context.
Fixes: 59d0f7465644 ("net: fec: init multi queue date structure") Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240524050528.4115581-1-xiaolei.wang@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
We have seen an influx of syzkaller reports where a BPF program attached to
a tracepoint triggers a locking rule violation by performing a map_delete
on a sockmap/sockhash.
We don't intend to support this artificial use scenario. Extend the
existing verifier allowed-program-type check for updating sockmap/sockhash
to also cover deleting from a map.
From now on only BPF programs which were previously allowed to update
sockmap/sockhash can delete from these map types.
Fixes: ff9105993240 ("bpf, sockmap: Prevent lock inversion deadlock in map delete elem") Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Reported-by: syzbot+ec941d6e24f633a59172@syzkaller.appspotmail.com Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: syzbot+ec941d6e24f633a59172@syzkaller.appspotmail.com Acked-by: John Fastabend <john.fastabend@gmail.com> Closes: https://syzkaller.appspot.com/bug?extid=ec941d6e24f633a59172 Link: https://lore.kernel.org/bpf/20240527-sockmap-verify-deletes-v1-1-944b372f2101@cloudflare.com Signed-off-by: Sasha Levin <sashal@kernel.org>
LED Select (LED_SEL) bit in the LED General Purpose IO Configuration
register is used to determine the functionality of external LED pins
(Speed Indicator, Link and Activity Indicator, Full Duplex Link
Indicator). The default value for this bit is 0 when no EEPROM is
present. If a EEPROM is present, the default value is the value of the
LED Select bit in the Configuration Flags of the EEPROM. A USB Reset or
Lite Reset (LRST) will cause this bit to be restored to the image value
last loaded from EEPROM, or to be set to 0 if no EEPROM is present.
While configuring the dual purpose GPIO/LED pins to LED outputs in the
LED General Purpose IO Configuration register, the LED_SEL bit is changed
as 0 and resulting the configured value from the EEPROM is cleared. The
issue is fixed by using read-modify-write approach.
Fixes: f293501c61c5 ("smsc95xx: configure LED outputs") Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Woojung Huh <woojung.huh@microchip.com> Link: https://lore.kernel.org/r/20240523085314.167650-1-Parthiban.Veerasooran@microchip.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
enic_set_vf_port assumes that the nl attribute IFLA_PORT_PROFILE
is of length PORT_PROFILE_MAX and that the nl attributes
IFLA_PORT_INSTANCE_UUID, IFLA_PORT_HOST_UUID are of length PORT_UUID_MAX.
These attributes are validated (in the function do_setlink in rtnetlink.c)
using the nla_policy ifla_port_policy. The policy defines IFLA_PORT_PROFILE
as NLA_STRING, IFLA_PORT_INSTANCE_UUID as NLA_BINARY and
IFLA_PORT_HOST_UUID as NLA_STRING. That means that the length validation
using the policy is for the max size of the attributes and not on exact
size so the length of these attributes might be less than the sizes that
enic_set_vf_port expects. This might cause an out of bands
read access in the memcpys of the data of these
attributes in enic_set_vf_port.
Fixes: f8bd909183ac ("net: Add ndo_{set|get}_vf_port support for enic dynamic vnics") Signed-off-by: Roded Zats <rzats@paloaltonetworks.com> Link: https://lore.kernel.org/r/20240522073044.33519-1-rzats@paloaltonetworks.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
err is a 32-bit integer, but elf_update returns an off_t, which is 64-bit
at least on 64-bit platforms. If symbols_patch is called on a binary between
2-4GB in size, the result will be negative when cast to a 32-bit integer,
which the code assumes means an error occurred. This can wrongly trigger
build failures when building very large kernel images.
Fixes: fbbb68de80a4 ("bpf: Add resolve_btfids tool to resolve BTF IDs in ELF object") Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240514070931.199694-1-friedrich.vock@gmx.de Signed-off-by: Sasha Levin <sashal@kernel.org>
Since commit a6aa8fca4d79 ("dma-buf/sw-sync: Reduce irqsave/irqrestore from
known context") by error replaced spin_unlock_irqrestore() with
spin_unlock_irq() for both sync_debugfs_show() and sync_print_obj() despite
sync_print_obj() is called from sync_debugfs_show(), lockdep complains
inconsistent lock state warning.
Use plain spin_{lock,unlock}() for sync_print_obj(), for
sync_debugfs_show() is already using spin_{lock,unlock}_irq().
Reported-by: syzbot <syzbot+a225ee3df7e7f9372dbe@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=a225ee3df7e7f9372dbe Fixes: a6aa8fca4d79 ("dma-buf/sw-sync: Reduce irqsave/irqrestore from known context") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/c2e46020-aaa6-4e06-bf73-f05823f913f0@I-love.SAKURA.ne.jp Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Previously, the driver incorrectly used rx_dropped to report device
buffer exhaustion.
According to the documentation, rx_dropped should not be used to count
packets dropped due to buffer exhaustion, which is the purpose of
rx_missed_errors.
Use rx_missed_errors as intended for counting packets dropped due to
buffer exhaustion.
Fixes: 269e6b3af3bf ("net/mlx5e: Report additional error statistics in get stats ndo") Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Remove faulty check disabling checksum offload and GSO for offload of
simple IPsec tunnel L4 traffic. Comment previously describing the deleted
code incorrectly claimed the check prevented double tunnel (or three layers
of ip headers).
Fixes: f1267798c980 ("net/mlx5: Fix checksum issue of VXLAN and IPsec crypto offload") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Following a similar reinstate for the KSZ8081 and KSZ9031.
Older kernels would use the genphy_soft_reset if the PHY did not implement
a .soft_reset.
The KSZ8061 errata described here:
https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ8061-Errata-DS80000688B.pdf
and worked around with 232ba3a51c ("net: phy: Micrel KSZ8061: link failure after cable connect")
is back again without this soft reset.
Fixes: 6e2d85ec0559 ("net: phy: Stop with excessive soft reset") Tested-by: Karim Ben Houcine <karim.benhoucine@landisgyr.com> Signed-off-by: Mathieu Othacehe <othacehe@gnu.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When disabling an nvmet namespace, there is a period where the
subsys->lock is released, as the ns disable waits for backend IO to
complete, and the ns percpu ref to be properly killed. The original
intent was to avoid taking the subsystem lock for a prolong period as
other processes may need to acquire it (for example new incoming
connections).
However, it opens up a window where another process may come in and
enable the ns, (re)intiailizing the ns percpu_ref, causing the disable
sequence to hang.
Solve this by taking the global nvmet_config_sem over the entire configfs
enable/disable sequence.
Fixes: a07b4970f464 ("nvmet: add a generic NVMe target") Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
While validating node ids in map_benchmark_ioctl(), node_possible() may
be provided with invalid argument outside of [0,MAX_NUMNODES-1] range
leading to:
BUG: KASAN: wild-memory-access in map_benchmark_ioctl (kernel/dma/map_benchmark.c:214)
Read of size 8 at addr 1fffffff8ccb6398 by task dma_map_benchma/971
CPU: 7 PID: 971 Comm: dma_map_benchma Not tainted 6.9.0-rc6 #37
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:117)
kasan_report (mm/kasan/report.c:603)
kasan_check_range (mm/kasan/generic.c:189)
variable_test_bit (arch/x86/include/asm/bitops.h:227) [inline]
arch_test_bit (arch/x86/include/asm/bitops.h:239) [inline]
_test_bit at (include/asm-generic/bitops/instrumented-non-atomic.h:142) [inline]
node_state (include/linux/nodemask.h:423) [inline]
map_benchmark_ioctl (kernel/dma/map_benchmark.c:214)
full_proxy_unlocked_ioctl (fs/debugfs/file.c:333)
__x64_sys_ioctl (fs/ioctl.c:890)
do_syscall_64 (arch/x86/entry/common.c:83)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
Compare node ids with sane bounds first. NUMA_NO_NODE is considered a
special valid case meaning that benchmarking kthreads won't be bound to a
cpuset of a given node.
Found by Linux Verification Center (linuxtesting.org).
Fixes: 65789daa8087 ("dma-mapping: add benchmark support for streaming DMA APIs") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
There is no need to set the DMA mapped flag of the message if it has
no mapped transfers. Moreover, it may give the code a chance to take
the wrong paths, i.e. to exercise DMA related APIs on unmapped data.
Make __spi_map_msg() to bail earlier on the above mentioned cases.
Fixes: 99adef310f68 ("spi: Provide core support for DMA mapping transfers") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://msgid.link/r/20240522171018.3362521-2-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
f41f72d09ee1 ("netfilter: nft_payload: simplify vlan header handling")
already allows to match on inner vlan tags by subtract the vlan header
size to the payload offset which has been popped and stored in skbuff
metadata fields.
A bug occurs because a safety check guarding AF_XDP-related queues in
ethnl_set_channels(), does not trigger. This happens, because kernel and
ice driver interpret the ethtool command differently.
How the bug occurs:
1. ethtool -l <IFNAME> -> combined: 40
2. Attach AF_XDP to queue 30
3. ethtool -L <IFNAME> rx 15 tx 15
combined number is not specified, so command becomes {rx_count = 15,
tx_count = 15, combined_count = 40}.
4. ethnl_set_channels checks, if there are any AF_XDP of queues from the
new (combined_count + rx_count) to the old one, so from 55 to 40, check
does not trigger.
5. ice interprets `rx 15 tx 15` as 15 combined channels and deletes the
queue that AF_XDP is attached to.
Interpret the command in a way that is more consistent with ethtool
manual [0] (--show-channels and --set-channels).
Considering that in the ice driver only the difference between RX and TX
queues forms dedicated channels, change the correct way to set number of
channels to:
When nci_rx_work() receives a zero-length payload packet, it should not
discard the packet and exit the loop. Instead, it should continue
processing subsequent packets.
Fixes: d24b03535e5e ("nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet") Signed-off-by: Ryosuke Yasuoka <ryasuoka@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240521153444.535399-1-ryasuoka@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 7e8cdc97148c ("nfc: Add KCOV annotations") added
kcov_remote_start_common()/kcov_remote_stop() pair into nci_rx_work(),
with an assumption that kcov_remote_stop() is called upon continue of
the for loop. But commit d24b03535e5e ("nfc: nci: Fix uninit-value in
nci_dev_up and nci_ntf_packet") forgot to call kcov_remote_stop() before
break of the for loop.
Reported-by: syzbot <syzbot+0438378d6f157baae1a2@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=0438378d6f157baae1a2 Fixes: d24b03535e5e ("nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet") Suggested-by: Andrey Konovalov <andreyknvl@gmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/6d10f829-5a0c-405a-b39a-d7266f3a1a0b@I-love.SAKURA.ne.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 6671e352497c ("nfc: nci: Fix handling of zero-length payload packets in nci_rx_work()") Signed-off-by: Sasha Levin <sashal@kernel.org>
In tls_init(), a write memory barrier is missing, and store-store
reordering may cause NULL dereference in tls_{setsockopt,getsockopt}.
CPU0 CPU1
----- -----
// In tls_init()
// In tls_ctx_create()
ctx = kzalloc()
ctx->sk_proto = READ_ONCE(sk->sk_prot) -(1)
// In update_sk_prot()
WRITE_ONCE(sk->sk_prot, tls_prots) -(2)
// In sock_common_setsockopt()
READ_ONCE(sk->sk_prot)->setsockopt()
// In tls_{setsockopt,getsockopt}()
ctx->sk_proto->setsockopt() -(3)
In the above scenario, when (1) and (2) are reordered, (3) can observe
the NULL value of ctx->sk_proto, causing NULL dereference.
To fix it, we rely on rcu_assign_pointer() which implies the release
barrier semantic. By moving rcu_assign_pointer() after ctx->sk_proto is
initialized, we can ensure that ctx->sk_proto are visible when
changing sk->sk_prot.
The assignment of pps_enable is protected by tmreg_lock, but the read
operation of pps_enable is not. So the Coverity tool reports a lock
evasion warning which may cause data race to occur when running in a
multithread environment. Although this issue is almost impossible to
occur, we'd better fix it, at least it seems more logically reasonable,
and it also prevents Coverity from continuing to issue warnings.
Fixes: 278d24047891 ("net: fec: ptp: Enable PPS output based on ptp clock") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://lore.kernel.org/r/20240521023800.17102-1-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
and in the code check is performed:
if (regs && (regs->epc == pc) && (frame->fp & 0x7))
I see no reason to check frame->fp value at all, because it is can be
uninitialized value on the stack. A better way is to check frame->ra to
be an address on the stack. After the stacktrace shows as expect:
Changes since v2 [2]:
- Add accidentally forgotten curly brace
Changes since v1 [1]:
- Instead of just dropping frame->fp check, replace it with validation of
frame->ra, which should be a stack address.
- Move frame pointer validation into the separate function.
Fixes: f766f77a74f5 ("riscv/stacktrace: Fix stack output without ra on the stack top") Signed-off-by: Matthew Bystrin <dev.mbstr@gmail.com> Reviewed-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20240521191727.62012-1-dev.mbstr@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Here is the error injection test code for the above output:
drivers/irqchip/irq-riscv-intc.c:
static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
{
unsigned long cause = regs->cause & ~CAUSE_IRQ_FLAG;
+ u32 tmp; __get_user(tmp, (u32 *)0);
When request_irq() fails, error path calls vp_del_vqs(). There, as vq is
present in the list, free_irq() is called for the same vector. That
causes following splat:
When CONFIG_DEBUG_BUGVERBOSE=n, we fail to add necessary padding bytes
to bug_table entries, and as a result the last entry in a bug table will
be ignored, potentially leading to an unexpected panic(). All prior
entries in the table will be handled correctly.
The arm64 ABI requires that struct fields of up to 8 bytes are
naturally-aligned, with padding added within a struct such that struct
are suitably aligned within arrays.
When CONFIG_DEBUG_BUGVERPOSE=y, the layout of a bug_entry is:
struct bug_entry {
signed int bug_addr_disp; // 4 bytes
signed int file_disp; // 4 bytes
unsigned short line; // 2 bytes
unsigned short flags; // 2 bytes
}
... with 12 bytes total, requiring 4-byte alignment.
When CONFIG_DEBUG_BUGVERBOSE=n, the layout of a bug_entry is:
struct bug_entry {
signed int bug_addr_disp; // 4 bytes
unsigned short flags; // 2 bytes
< implicit padding > // 2 bytes
}
... with 8 bytes total, with 6 bytes of data and 2 bytes of trailing
padding, requiring 4-byte alginment.
When we create a bug_entry in assembly, we align the start of the entry
to 4 bytes, which implicitly handles padding for any prior entries.
However, we do not align the end of the entry, and so when
CONFIG_DEBUG_BUGVERBOSE=n, the final entry lacks the trailing padding
bytes.
For the main kernel image this is not a problem as find_bug() doesn't
depend on the trailing padding bytes when searching for entries:
for (bug = __start___bug_table; bug < __stop___bug_table; ++bug)
if (bugaddr == bug_addr(bug))
return bug;
However for modules, module_bug_finalize() depends on the trailing
bytes when calculating the number of entries:
Open vSwitch is originally intended to switch at layer 2, only dealing with
Ethernet frames. With the introduction of l3 tunnels support, it crossed
into the realm of needing to care a bit about some routing details when
making forwarding decisions. If an oversized packet would need to be
fragmented during this forwarding decision, there is a chance for pmtu
to get involved and generate a routing exception. This is gated by the
skbuff->pkt_type field.
When a flow is already loaded into the openvswitch module this field is
set up and transitioned properly as a packet moves from one port to
another. In the case that a packet execute is invoked after a flow is
newly installed this field is not properly initialized. This causes the
pmtud mechanism to omit sending the required exception messages across
the tunnel boundary and a second attempt needs to be made to make sure
that the routing exception is properly setup. To fix this, we set the
outgoing packet's pkt_type to PACKET_OUTGOING, since it can only get
to the openvswitch module via a port device or packet command.
Even for bridge ports as users, the pkt_type needs to be reset when
doing the transmit as the packet is truly outgoing and routing needs
to get involved post packet transformations, in the case of
VXLAN/GENEVE/udp-tunnel packets. In general, the pkt_type on output
gets ignored, since we go straight to the driver, but in the case of
tunnel ports they go through IP routing layer.
This issue is periodically encountered in complex setups, such as large
openshift deployments, where multiple sets of tunnel traversal occurs.
A way to recreate this is with the ovn-heater project that can setup
a networking environment which mimics such large deployments. We need
larger environments for this because we need to ensure that flow
misses occur. In these environment, without this patch, we can see:
./ovn_cluster.sh start
podman exec ovn-chassis-1 ip r a 170.168.0.5/32 dev eth1 mtu 1200
podman exec ovn-chassis-1 ip netns exec sw01p1 ip r flush cache
podman exec ovn-chassis-1 ip netns exec sw01p1 \
ping 21.0.0.3 -M do -s 1300 -c2
PING 21.0.0.3 (21.0.0.3) 1300(1328) bytes of data.
From 21.0.0.3 icmp_seq=2 Frag needed and DF set (mtu = 1142)
--- 21.0.0.3 ping statistics ---
...
Using tcpdump, we can also see the expected ICMP FRAG_NEEDED message is not
sent into the server.
With this patch, setting the pkt_type, we see the following:
podman exec ovn-chassis-1 ip netns exec sw01p1 \
ping 21.0.0.3 -M do -s 1300 -c2
PING 21.0.0.3 (21.0.0.3) 1300(1328) bytes of data.
From 21.0.0.3 icmp_seq=1 Frag needed and DF set (mtu = 1222)
ping: local error: message too long, mtu=1222
--- 21.0.0.3 ping statistics ---
...
In this case, the first ping request receives the FRAG_NEEDED message and
a local routing exception is created.
Billy Jheng Bing-Jhong reported a race between __unix_gc() and
queue_oob().
__unix_gc() tries to garbage-collect close()d inflight sockets,
and then if the socket has MSG_OOB in unix_sk(sk)->oob_skb, GC
will drop the reference and set NULL to it locklessly.
However, the peer socket still can send MSG_OOB message and
queue_oob() can update unix_sk(sk)->oob_skb concurrently, leading
NULL pointer dereference. [0]
To fix the issue, let's update unix_sk(sk)->oob_skb under the
sk_receive_queue's lock and take it everywhere we touch oob_skb.
Note that we defer kfree_skb() in manage_oob() to silence lockdep
false-positive (See [1]).
Under the scenario of IB device bonding, when bringing down one of the
ports, or all ports, we saw xprtrdma entering a non-recoverable state
where it is not even possible to complete the disconnect and shut it
down the mount, requiring a reboot. Following debug, we saw that
transport connect never ended after receiving the
RDMA_CM_EVENT_DEVICE_REMOVAL callback.
The DEVICE_REMOVAL callback is irrespective of whether the CM_ID is
connected, and ESTABLISHED may not have happened. So need to work with
each of these states accordingly.
Fixes: 2acc5cae2923 ('xprtrdma: Prevent dereferencing r_xprt->rx_ep after it is freed') Cc: Sagi Grimberg <sagi.grimberg@vastdata.com> Signed-off-by: Dan Aloni <dan.aloni@vastdata.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
It used to be quite awhile ago since 1b63a75180c6 ('SUNRPC: Refactor
rpc_clone_client()'), in 2012, that `cl_timeout` was copied in so that
all mount parameters propagate to NFSACL clients. However since that
change, if mount options as follows are given:
These values lead to NFSACL operations not being retried under the
condition of transient network outages with soft mount. Instead, getacl
call fails after 60 seconds with EIO.
The simple fix is to pass the existing client's `cl_timeout` as the new
client timeout.
Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Benjamin Coddington <bcodding@redhat.com> Link: https://lore.kernel.org/all/20231105154857.ryakhmgaptq3hb6b@gmail.com/T/ Fixes: 1b63a75180c6 ('SUNRPC: Refactor rpc_clone_client()') Signed-off-by: Dan Aloni <dan.aloni@vastdata.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Dan Carpenter reports smatch warning for nfs4_try_migration() when a memory
allocation failure results in a zero return value. In this case, a
transient allocation failure error will likely be retried the next time the
server responds with NFS4ERR_MOVED.
We can fixup the smatch warning with a small refactor: attempt all three
allocations before testing and returning on a failure.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Fixes: c3ed222745d9 ("NFSv4: Fix free of uninitialized nfs4_label on referral lookup.") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
syzbot reported the following uninit-value access issue [1]
nci_rx_work() parses received packet from ndev->rx_q. It should be
validated header size, payload size and total packet size before
processing the packet. If an invalid packet is detected, it should be
silently discarded.
Fixes: d24b03535e5e ("nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet") Reported-and-tested-by: syzbot+d7b4dc6cd50410152534@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d7b4dc6cd50410152534 [1] Signed-off-by: Ryosuke Yasuoka <ryasuoka@redhat.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The seg6_input() function is responsible for adding the SRH into a
packet, delegating the operation to the seg6_input_core(). This function
uses the skb_cow_head() to ensure that there is sufficient headroom in
the sk_buff for accommodating the link-layer header.
In the event that the skb_cow_header() function fails, the
seg6_input_core() catches the error but it does not release the sk_buff,
which will result in a memory leak.
This issue was introduced in commit af3b5158b89d ("ipv6: sr: fix BUG due
to headroom too small after SRH push") and persists even after commit 7a3f5b0de364 ("netfilter: add netfilter hooks to SRv6 data plane"),
where the entire seg6_input() code was refactored to deal with netfilter
hooks.
The proposed patch addresses the identified memory leak by requiring the
seg6_input_core() function to release the sk_buff in the event that
skb_cow_head() fails.
Fixes: af3b5158b89d ("ipv6: sr: fix BUG due to headroom too small after SRH push") Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Stephen reported that he was unable to get the dsa_loop driver to get
probed, and the reason ended up being because he had CONFIG_FIXED_PHY=y
in his kernel configuration. As Masahiro explained it:
"obj-m += dsa/" means everything under dsa/ must be modular.
If there is a built-in object under dsa/ with CONFIG_NET_DSA=m,
you cannot do "obj-$(CONFIG_NET_DSA) += dsa/".
You need to change it back to "obj-y += dsa/".
This was the case here whereby CONFIG_NET_DSA=m, and so the
obj-$(CONFIG_FIXED_PHY) += dsa_loop_bdinfo.o rule is not executed and
the DSA loop mdio_board info structure is not registered with the
kernel, and eventually the device is simply not found.
To preserve the intention of the original commit of limiting the amount
of folder descending, conditionally descend into drivers/net/dsa when
CONFIG_NET_DSA is enabled.
Fixes: 227d72063fcc ("dsa: simplify Kconfig symbols and dependencies") Reported-by: Stephen Langstaff <stephenlangstaff1@gmail.com> Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
It took me some time to understand the purpose of the tricky code at
the end of arch/x86/Kconfig.debug.
Without it, the following would be shown:
WARNING: unmet direct dependencies detected for FRAME_POINTER
because
81d387190039 ("x86/kconfig: Consolidate unwinders into multiple choice selection")
removed 'select ARCH_WANT_FRAME_POINTERS'.
The correct and more straightforward approach should have been to move
it where 'select FRAME_POINTER' is located.
Several architectures properly handle the conditional selection of
ARCH_WANT_FRAME_POINTERS. For example, 'config UNWINDER_FRAME_POINTER'
in arch/arm/Kconfig.debug.
Some of the regulators on the BD71828 have common voltage setting for
RUN/SUSPEND/IDLE/LPSR states. The enable control can be set for each
state though.
The driver allows setting the voltage values for these states via
device-tree. As a side effect, setting the voltages for
SUSPEND/IDLE/LPSR will also change the RUN level voltage which is not
desired and can break the system.
The comment in code reflects this behaviour, but it is likely to not
make people any happier. The right thing to do is to allow setting the
enable/disable state at SUSPEND/IDLE/LPSR via device-tree, but to
disallow setting state specific voltages for those regulators.
BUCK1 is a bit different. It only shares the SUSPEND and LPSR state
voltages. The former behaviour of allowing to silently overwrite the
SUSPEND state voltage by LPSR state voltage is also changed here so that
the SUSPEND voltage is prioritized over LPSR voltage.
Prevent setting PMIC state specific voltages for regulators which do not
support it.
If, when waiting for a transmit to finish, the wait is interrupted,
then you might get a "transmit timed out" message, even though the
transmit was interrupted and did not actually time out.
Set transmit_in_progress_aborted to true if the
wait_for_completion_killable() call was interrupted and ensure
that the transmit is properly marked as ABORTED.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Reported-by: Yang, Chenyuan <cy54@illinois.edu> Closes: https://lore.kernel.org/linux-media/PH7PR11MB57688E64ADE4FE82E658D86DA09EA@PH7PR11MB5768.namprd11.prod.outlook.com/ Fixes: 590a8e564c6e ("media: cec: abort if the current transmit was canceled") Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Use call_(void_)op consistently in the CEC core framework. Ditto
for the cec pin ops. And check if !adap->devnode.unregistered before
calling each op. This avoids calls to ops when the device has been
unregistered and the underlying hardware may be gone.
The results of non-blocking transmits were not correctly communicated
to userspace.
Specifically:
1) if a non-blocking transmit was canceled, then rx_status wasn't set to 0
as it should.
2) if the non-blocking transmit succeeded, but the corresponding reply
never arrived (aborted or timed out), then tx_status wasn't set to 0
as it should, and rx_status was hardcoded to ABORTED instead of the
actual reason, such as TIMEOUT. In addition, adap->ops->received() was
never called, so drivers that want to do message processing themselves
would not be informed of the failed reply.
If a transmit-in-progress was canceled, then, once the transmit
is done, mark it as aborted and refrain from retrying the transmit.
To signal this situation the new transmit_in_progress_aborted field is
set to true.
The old implementation would just set adap->transmitting to NULL and
set adap->transmit_in_progress to false, but on the hardware level
the transmit was still ongoing. However, the framework would think
the transmit was aborted, and if a new transmit was issued, then
it could overwrite the HW buffer containing the old transmit with the
new transmit, leading to garbled data on the CEC bus.
Don't enable/disable the adapter if the first fh is opened or the
last fh is closed, instead do this when the adapter is configured
or unconfigured, and also when we enter Monitor All or Monitor Pin
mode for the first time or we exit the Monitor All/Pin mode for the
last time.
However, if needs_hpd is true, then do this when the physical
address is set or cleared: in that case the adapter typically is
powered by the HPD, so it really is disabled when the HPD is low.
This case (needs_hpd is true) was already handled in this way, so
this wasn't changed.
The problem with the old behavior was that if the HPD goes low when
no fh is open, and a transmit was in progress, then the adapter would
be disabled, typically stopping the transmit immediately which
leaves a partial message on the bus, which isn't nice and can confuse
some adapters.
It makes much more sense to disable it only when the adapter is
unconfigured and we're not monitoring the bus, since then you really
won't be using it anymore.
To keep track of this store a CEC activation count and call adap_enable
only when it goes from 0 to 1 or back to 0.
Commit d725d20e81c2 ("media: flexcop-usb: sanity checking of endpoint type
") adds a sanity check for endpoint[1], but fails to modify the sanity
check of bNumEndpoints.
Fix this by modifying the sanity check of bNumEndpoints to 2.
Grab input->mutex during suspend/resume functions like it is done in
other input drivers. This fixes the following warning during system
suspend/resume cycle on Samsung Exynos5250-based Snow Chromebook:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 1680 at drivers/input/input.c:2291 input_device_enabled+0x68/0x6c
Modules linked in: ...
CPU: 1 PID: 1680 Comm: kworker/u4:12 Tainted: G W 6.6.0-rc5-next-20231009 #14109
Hardware name: Samsung Exynos (Flattened Device Tree)
Workqueue: events_unbound async_run_entry_fn
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x58/0x70
dump_stack_lvl from __warn+0x1a8/0x1cc
__warn from warn_slowpath_fmt+0x18c/0x1b4
warn_slowpath_fmt from input_device_enabled+0x68/0x6c
input_device_enabled from cyapa_gen3_set_power_mode+0x13c/0x1dc
cyapa_gen3_set_power_mode from cyapa_reinitialize+0x10c/0x15c
cyapa_reinitialize from cyapa_resume+0x48/0x98
cyapa_resume from dpm_run_callback+0x90/0x298
dpm_run_callback from device_resume+0xb4/0x258
device_resume from async_resume+0x20/0x64
async_resume from async_run_entry_fn+0x40/0x15c
async_run_entry_fn from process_scheduled_works+0xbc/0x6a8
process_scheduled_works from worker_thread+0x188/0x454
worker_thread from kthread+0x108/0x140
kthread from ret_from_fork+0x14/0x28
Exception stack(0xf1625fb0 to 0xf1625ff8)
...
---[ end trace 0000000000000000 ]---
...
------------[ cut here ]------------
WARNING: CPU: 1 PID: 1680 at drivers/input/input.c:2291 input_device_enabled+0x68/0x6c
Modules linked in: ...
CPU: 1 PID: 1680 Comm: kworker/u4:12 Tainted: G W 6.6.0-rc5-next-20231009 #14109
Hardware name: Samsung Exynos (Flattened Device Tree)
Workqueue: events_unbound async_run_entry_fn
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x58/0x70
dump_stack_lvl from __warn+0x1a8/0x1cc
__warn from warn_slowpath_fmt+0x18c/0x1b4
warn_slowpath_fmt from input_device_enabled+0x68/0x6c
input_device_enabled from cyapa_gen3_set_power_mode+0x13c/0x1dc
cyapa_gen3_set_power_mode from cyapa_reinitialize+0x10c/0x15c
cyapa_reinitialize from cyapa_resume+0x48/0x98
cyapa_resume from dpm_run_callback+0x90/0x298
dpm_run_callback from device_resume+0xb4/0x258
device_resume from async_resume+0x20/0x64
async_resume from async_run_entry_fn+0x40/0x15c
async_run_entry_fn from process_scheduled_works+0xbc/0x6a8
process_scheduled_works from worker_thread+0x188/0x454
worker_thread from kthread+0x108/0x140
kthread from ret_from_fork+0x14/0x28
Exception stack(0xf1625fb0 to 0xf1625ff8)
...
---[ end trace 0000000000000000 ]---
Fixes: d69f0a43c677 ("Input: use input_device_enabled()") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Link: https://lore.kernel.org/r/20231009121018.1075318-1-m.szyprowski@samsung.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
strlcpy() reads the entire source buffer first. This read may exceed the
destination size limit. This is both inefficient and can lead to linear
read overflows if a source string is not NUL-terminated [1]. In an effort
to remove strlcpy() completely [2], replace strlcpy() here with strscpy().
No return values were used, so direct replacement is safe.
The subtract in this condition is reversed. The ->length is the length
of the buffer. The ->bytesused is how many bytes we have copied thus
far. When the condition is reversed that means the result of the
subtraction is always negative but since it's unsigned then the result
is a very high positive value. That means the overflow check is never
true.
Additionally, the ->bytesused doesn't actually work for this purpose
because we're not writing to "buf->mem + buf->bytesused". Instead, the
math to calculate the destination where we are writing is a bit
involved. You calculate the number of full lines already written,
multiply by two, skip a line if necessary so that we start on an odd
numbered line, and add the offset into the line.
To fix this buffer overflow, just take the actual destination where we
are writing, if the offset is already out of bounds print an error and
return. Otherwise, write up to buf->length bytes.
Fixes: 9cb2173e6ea8 ("[media] media: Add stk1160 new driver (easycap replacement)") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Ricardo Ribalda <ribalda@chromium.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Sasha Levin <sashal@kernel.org>
The bridge always uses 24bpp internally. Therefore, for jeida-18
mapping we need to discard the lowest two bits for each channel and thus
starting with LV_[RGB]2. jeida-24 has the same mapping but uses four
lanes instead of three, with the forth pair transmitting the lowest two
bits of each channel. Thus, the mapping between jeida-18 and jeida-24
is actually the same, except that one channel is turned off (by
selecting the RGB666 format in VPCTRL).
While at it, remove the bogus comment about the hardware default because
the default is overwritten in any case.
Tested with a jeida-18 display (Evervision VGG644804).
Fixes: b26975593b17 ("display/drm/bridge: TC358775 DSI/LVDS driver") Signed-off-by: Michael Walle <mwalle@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Robert Foss <rfoss@kernel.org> Signed-off-by: Robert Foss <rfoss@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240225062008.33191-5-tony@atomide.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Registering a winch IRQ is racy, an interrupt may occur before the winch is
added to the winch_handlers list.
If that happens, register_winch_irq() adds to that list a winch that is
scheduled to be (or has already been) freed, causing a panic later in
winch_cleanup().
Avoid the race by adding the winch to the winch_handlers list before
registering the IRQ, and rolling back if um_request_irq() fails.
Fixes: 42a359e31a0e ("uml: SIGIO support cleanup") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>
As we can clearly see in a downstream kernel [1], flushing the slave INTF
is skipped /only if/ the PPSPLIT topology is active.
However, when DPU was originally submitted to mainline PPSPLIT was no
longer part of it (seems to have been ripped out before submission), but
this clause was incorrectly ported from the original SDE driver. Given
that there is no support for PPSPLIT (currently), flushing the slave
INTF should /never/ be skipped (as the `if (ppsplit && !master) goto
skip;` clause downstream never becomes true).
When dual-DSI (bonded DSI) was added in commit ed9976a09b48
("drm/msm/dsi: adjust dsi timing for dual dsi mode") some DBG() prints
were not updated, leading to print the original mode->clock rather
than the adjusted (typically the mode clock divided by two, though more
recently also adjusted for DSC compression) msm_host->pixel_clk_rate
which is passed to clk_set_rate() just below. Fix that by printing the
actual pixel_clk_rate that is being set.
While STRB is currently used for DATA and CRC responses, the CMD
responses from the device to the host still require ITAPDLY for
HS400 timing.
Currently what is stored for HS400 is the ITAPDLY from High Speed
mode which is incorrect. The ITAPDLY for HS400 speed mode should
be the same as ITAPDLY as HS200 timing after tuning is executed.
Add the functionality to save ITAPDLY from HS200 tuning and save
as HS400 ITAPDLY.
Fixes: a161c45f2979 ("mmc: sdhci_am654: Enable DLL only for some speed modes") Signed-off-by: Judith Mendez <jm@ti.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20240320223837.959900-8-jm@ti.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add ITAPDLYSEL to sdhci_j721e_4bit_set_clock function.
This allows to set the correct ITAPDLY for timings that
do not carry out tuning.
Fixes: 1accbced1c32 ("mmc: sdhci_am654: Add Support for 4 bit IP on J721E") Signed-off-by: Judith Mendez <jm@ti.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20240320223837.959900-7-jm@ti.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently the OTAP/ITAP delay enable functionality is incorrect in
the am654_set_clock function. The OTAP delay is not enabled when
timing < SDR25 bus speed mode. The ITAP delay is not enabled for
timings that do not carry out tuning.
Add this OTAP/ITAP delay functionality according to the datasheet
[1] OTAPDLYENA and ITAPDLYENA for MMC0.
ti,otap-del-sel has been deprecated since v5.7 and there are no users of
this property and no documentation in the DT bindings either.
Drop the fallback code looking for this property, this makes
sdhci_am654_get_otap_delay() much easier to read as all the TAP values
can be handled via a single iterator loop.
For DDR52 timing, DLL is enabled but tuning is not carried
out, therefore the ITAPDLY value in PHY CTRL 4 register is
not correct. Fix this by writing ITAPDLY after enabling DLL.
Fixes: a161c45f2979 ("mmc: sdhci_am654: Enable DLL only for some speed modes") Signed-off-by: Judith Mendez <jm@ti.com> Reviewed-by: Andrew Davis <afd@ti.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20240320223837.959900-3-jm@ti.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently the sdhci_am654 driver only supports one tuning
algorithm which should be used only when DLL is enabled. The
ITAPDLY is selected from the largest passing window and the
buffer is viewed as a circular buffer.
The new algorithm should be used when the delay chain
is enabled. The ITAPDLY is selected from the largest passing
window and the buffer is not viewed as a circular buffer.
This implementation is based off of the following paper: [1].
Also add support for multiple failing windows.
[1] https://www.ti.com/lit/an/spract9/spract9.pdf
Fixes: 13ebeae68ac9 ("mmc: sdhci_am654: Add support for software tuning") Signed-off-by: Judith Mendez <jm@ti.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20240320223837.959900-2-jm@ti.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new() which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
clang warns about a string overflow in this driver
drivers/input/misc/ims-pcu.c:1802:2: error: 'snprintf' will always be truncated; specified size is 10, but format string expands to at least 12 [-Werror,-Wformat-truncation]
drivers/input/misc/ims-pcu.c:1814:2: error: 'snprintf' will always be truncated; specified size is 10, but format string expands to at least 12 [-Werror,-Wformat-truncation]
Make the buffer a little longer to ensure it always fits.
Initialize the correct fields of the nvme dump block.
This bug had not been detected before because first, the fcp and nvme fields
of struct ipl_parameter_block are part of the same union and, therefore,
overlap in memory and second, they are identical in structure and size.
Use correct symbolic constants IPL_BP_NVME_LEN and IPL_BP0_NVME_LEN
to initialize nvme reipl block when 'scp_data' sysfs attribute is
being updated. This bug had not been detected before because
the corresponding fcp and nvme symbolic constants are equal.
By default user space is compiled with standard stack frame layout and not
with the packed stack layout. The vdso code however inherited the
-mpacked-stack compiler option from the kernel. Remove this option to make
sure the vdso is compiled with standard stack frame layout.
This makes sure that the stack frame backchain location for vdso generated
stack frames is the same like for calling code (if compiled with default
options). This allows to manually walk stack frames without DWARF
information, like the kernel is doing it e.g. with arch_stack_walk_user().
Gcc recently implemented an optimization [1] for loading symbols without
explicit alignment, aligning with the IBM Z ELF ABI. This ABI mandates
symbols to reside on a 2-byte boundary, enabling the use of the larl
instruction. However, kernel linker scripts may still generate unaligned
symbols. To address this, a new -munaligned-symbols option has been
introduced [2] in recent gcc versions.
However, when -munaligned-symbols is used in vdso code, it leads to the
following compilation error:
`.data.rel.ro.local' referenced in section `.text' of
arch/s390/kernel/vdso64/vdso64_generic.o: defined in discarded section
`.data.rel.ro.local' of arch/s390/kernel/vdso64/vdso64_generic.o
vdso linker script discards .data section to make it lightweight.
However, -munaligned-symbols in vdso object files references literal
pool and accesses _vdso_data. Hence, compile vdso code without
-munaligned-symbols. This means in the future, vdso code should deal
with alignment of newly introduced unaligned linker symbols.
Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Link: https://lore.kernel.org/r/20240219132734.22881-2-sumanthk@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Stable-dep-of: 10f705253651 ("s390/vdso: Generate unwind information for C modules") Signed-off-by: Sasha Levin <sashal@kernel.org>
cmd_vdso_check checks if there are any dynamic relocations in
vdso64.so.dbg. When kernel is compiled with
-mno-pic-data-is-text-relative, R_390_RELATIVE relocs are generated and
this results in kernel build error.
kpatch uses -mno-pic-data-is-text-relative option when building the
kernel to prevent relative addressing between code and data. The flag
avoids relocation error when klp text and data are too far apart
kpatch does not patch vdso code and hence the
mno-pic-data-is-text-relative flag is not essential.
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Stable-dep-of: 10f705253651 ("s390/vdso: Generate unwind information for C modules") Signed-off-by: Sasha Levin <sashal@kernel.org>
The to-be-fixed commit removed locking when invalidating the DMA RX
descriptors on shutdown. It overlooked that there is still a rx_timer
running which may still access the protected data. So, re-add the
locking.
Reported-by: Dirk Behme <dirk.behme@de.bosch.com> Closes: https://lore.kernel.org/r/ee6c9e16-9f29-450e-81da-4a8dceaa8fc7@de.bosch.com Fixes: 2c4ee23530ff ("serial: sh-sci: Postpone DMA release when falling back to PIO") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20240506114016.30498-7-wsa+renesas@sang-engineering.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ASSERT] (fsck_chk_inode_blk:1256) --> ino: 0x5 has i_blocks: 0x00000002, but has 0x3 blocks
[FSCK] valid_block_count matching with CP [Fail] [0x4, 0x5]
[FSCK] other corrupted bugs [Fail]
The reason is: partial truncation assume compressed inode has reserved
blocks, after partial truncation, valid block count may change w/o
.i_blocks and .total_valid_block_count update, result in corruption.
This patch only allow cluster size aligned truncation on released
compress inode for fixing.
Fixes: c61404153eb6 ("f2fs: introduce FI_COMPRESS_RELEASED instead of using IMMUTABLE bit") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
It needs to cover {reserve,release}_compress_blocks() w/ cp_rwsem lock
to avoid racing with checkpoint, otherwise, filesystem metadata including
blkaddr in dnode, inode fields and .total_valid_block_count may be
corrupted after SPO case.
The "Downstream Port Containment related Enhancements" ECN of Jan 28, 2019
(document 12888 below), defined the EDR_PORT_LOCATE_DSM function with
Revision ID 5 with a return value encoding (Bits 2:0 = Function, Bits 7:3 =
Device, Bits 15:8 = Bus). When the ECN was integrated into PCI Firmware
r3.3, sec 4.6.13, Bit 31 was added to indicate success or failure.
The "Downstream Port Containment related Enhancements" ECN of Jan 28, 2019
(document 12888 below), defined the EDR_PORT_DPC_ENABLE_DSM function with
Revision ID 5 with Arg3 being an integer. But when the ECN was integrated
into PCI Firmware r3.3, sec 4.6.12, it was defined as Revision ID 6 with
Arg3 being a package containing an integer.
The implementation in acpi_enable_dpc() supplies a package as Arg3 (arg4 in
the code), but it previously specified Revision ID 5. Align this with PCI
Firmware r3.3 by using Revision ID 6.
If firmware implemented per the ECN, its Revision 5 function would receive
a package as Arg3 when it expects an integer, so acpi_enable_dpc() would
likely fail. If such firmware exists and lacks a Revision 6 function that
expects a package, we may have to add support for Revision 5.
IRQ_DOMAIN is a hidden (not user visible) symbol. Users cannot set
it directly thru "make *config", so drivers should select it instead
of depending on it if they need it.
Relying on it being set for a dependency is risky.
Consistently using "select" or "depends on" can also help reduce
Kconfig circular dependency issues.
Therefore, change EXTCON_MAX8997's use of "depends on" for
IRQ_DOMAIN to "select".
Link: https://lore.kernel.org/lkml/20240213060028.9744-1-rdunlap@infradead.org/ Fixes: dca1a71e4108 ("extcon: Add support irq domain for MAX8997 muic") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In register_device, the return value of ida_simple_get is unchecked,
in witch ida_simple_get will use an invalid index value.
To address this issue, index should be checked after ida_simple_get. When
the index value is abnormal, a warning message should be printed, the port
should be dropped, and the value should be recorded.
This prevents use of a stale pointer if functions are called after
g_cleanup that shouldn't be. This doesn't fix any races, but converts
a possibly silent kernel memory corruption into an obvious NULL pointer
dereference report.
The always-running (from linux,wdt-gpio.yaml) is abused by the BD9576
watchdog driver. It's defined meaning is "the watchdog is always running
and can not be stopped". The BD9576 watchdog driver has implemented it
as "start watchdog when loading the module and prevent it from being
stopped".
Furthermore, the implementation does not set the WDOG_HW_RUNNING when
enabling the watchdog due to the "always-running" at module loading.
This will end up resulting a watchdog timeout if the device is not
opened.
The culprit was pointed out by Guenter, discussion can be found from
https://lore.kernel.org/lkml/4fa3a64b-60fb-4e5e-8785-0f14da37eea2@roeck-us.net/
Drop the invalid "always-running" handling.
Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: b237bcac557a ("wdt: Support wdt on ROHM BD9576MUF and BD9573MUF") Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/ZhPAt76yaJMersXf@fedora Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>