En-Wei reported that traffic breaks if cable is unplugged for more
than 3s and then re-plugged. This was supposed to be fixed by 621735f59064 ("r8169: fix rare issue with broken rx after link-down on
RTL8125"). But apparently this didn't fix the issue for everybody.
The 3s threshold rang a bell, as this is the delay after which ALDPS
kicks in. And indeed disabling ALDPS fixes the issue for this user.
Maybe this fixes the issue in general. In a follow-up step we could
remove the first fix attempt and see whether anybody complains.
Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") Tested-by: En-Wei WU <en-wei.wu@canonical.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/778b9d86-05c4-4856-be59-cde4487b9e52@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
disable_irq() after request_irq() still has a time gap in which
interrupts can come. request_irq() with IRQF_NO_AUTOEN flag will
disable IRQ auto-enable when request IRQ.
Fixes: bbb96dc7fa1a ("enetc: Factor out the traffic start/stop procedures") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20240911094445.1922476-3-ruanjinjie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Both bareudp_xmit_skb() and bareudp6_xmit_skb() read their skb's inner
IP header to get its ECN value (with ip_tunnel_ecn_encap()). Therefore
we need to ensure that the inner IP header is part of the skb's linear
data.
Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.") Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/267328222f0a11519c6de04c640a4f87a38ea9ed.1726046181.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When innerprotoinherit is set, the tunneled packets do not have an inner
Ethernet header.
Change 'maclen' to not always assume the header length is ETH_HLEN, as
there might not be a MAC header.
This resolves issues with drivers (e.g. mlx5, in
mlx5e_tx_tunnel_accel()) who rely on the skb inner network header offset
to be correct, and use it for TX offloads.
Fixes: d8a6213d70ac ("geneve: fix header validation in geneve[6]_xmit_skb") Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: c471236b2359 ("bareudp: Pull inner IP header on xmit.") Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch adds support for encapsulating IPv4/IPv6 within GENEVE.
In order to use this, a new IFLA_GENEVE_INNER_PROTO_INHERIT flag needs
to be provided at device creation. This property cannot be changed for
the time being.
In case IP traffic is received on a non-tun device the drop count is
increased.
Bareudp reads the inner IP header to get the ECN value. Therefore, it
needs to ensure that it's part of the skb's linear data.
This is similar to the vxlan and geneve fixes for that same problem:
* commit f7789419137b ("vxlan: Pull inner IP header in vxlan_rcv().")
* commit 1ca1ba465e55 ("geneve: make sure to pull inner header in
geneve_rx()")
Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.") Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/5205940067c40218a70fbb888080466b2fc288db.1726046181.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Even though bareudp transports L3 data (typically IP or MPLS), it needs
to reset the mac_header pointer, so that other parts of the stack don't
mistakenly access the outer header after the packet has been
decapsulated.
This allows to push an Ethernet header to bareudp packets and redirect
them to an Ethernet device:
Without this patch, push_eth refuses to add an ethernet header because
the skb appears to already have a MAC header.
Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 45fa29c85117 ("bareudp: Pull inner IP header in bareudp_udp_encap_recv().") Signed-off-by: Sasha Levin <sashal@kernel.org>
Requesting transfers of the exact same size of wMaxPacketSize may result
in ZPL/short-transfer since the USB stack cannot handle it as we are
limiting the buffer size to be the same as wMaxPacketSize.
Also, in terms of throughput this change has the same effect to
interrupt endpoint as 290ba200815f "Bluetooth: Improve USB driver throughput
by increasing the frame size" had for the bulk endpoint, so users of the
advertisement bearer (e.g. BT Mesh) may benefit from this change.
Fixes: 5e23b923da03 ("[Bluetooth] Add generic driver for Bluetooth USB devices") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Tested-by: Kiran K <kiran.k@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
After calling m_can_stop() an interrupt may be pending or NAPI might
still be executed. This means the driver might still touch registers
of the IP core after the clocks have been disabled. This is not good
practice and might lead to aborts depending on the SoC integration.
To avoid these potential problems, make m_can_close() symmetric to
m_can_open(), i.e. stop the clocks at the end, right before shutting
down the transceiver.
The blamed change fixed another warning that is triggered when
connect() is issued again for a socket whose connect()ed device has
been unregistered.
However, if the socket is just close()d without the 2nd connect(), the
remaining bo->bcm_proc_read triggers unnecessary remove_proc_entry()
in bcm_release().
Let's clear bo->bcm_proc_read after remove_proc_entry() in bcm_notify().
Several syzbot soft lockup reports all have in common sock_hash_free()
If a map with a large number of buckets is destroyed, we need to yield
the cpu when needed.
Fixes: 75e68e5bf2c7 ("bpf, sockhash: Synchronize delete from bucket list on map free") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20240906154449.3742932-1-edumazet@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
In the `wilc_parse_join_bss_param` function, the TSF field of the `ies`
structure is accessed after the RCU read-side critical section is
unlocked. According to RCU usage rules, this is illegal. Reusing this
pointer can lead to unpredictable behavior, including accessing memory
that has been updated or causing use-after-free issues.
This possible bug was identified using a static analysis tool developed
by myself, specifically designed to detect RCU-related issues.
To address this, the TSF value is now stored in a local variable
`ies_tsf` before the RCU lock is released. The `param->tsf_lo` field is
then assigned using this local variable, ensuring that the TSF value is
safely accessed.
issues the warning (as reported by syzbot reproducer):
WARNING: CPU: 2 PID: 5128 at kernel/softirq.c:362 __local_bh_enable_ip+0xc3/0x120
Fix this by implementing a two-phase skb reclamation in
'ieee80211_do_stop()', where actual work is performed
outside of a section with interrupts disabled.
Fixes: 5061b0c2b906 ("mac80211: cooperate more with network namespaces") Reported-by: syzbot+1a3986bbd3169c307819@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=1a3986bbd3169c307819 Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Link: https://patch.msgid.link/20240906123151.351647-1-dmantipov@yandex.ru Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Although not reproduced in practice, these two cases may be
considered by UBSAN as off-by-one errors. So fix them in the
same way as in commit a26a5107bc52 ("wifi: cfg80211: fix UBSAN
noise in cfg80211_wext_siwscan()").
Fixes: 807f8a8c3004 ("cfg80211/nl80211: add support for scheduled scans") Fixes: 5ba63533bbf6 ("cfg80211: fix alignment problem in scan request") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Link: https://patch.msgid.link/20240909090806.1091956-1-dmantipov@yandex.ru Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Looking at https://syzkaller.appspot.com/bug?extid=1a3986bbd3169c307819
and running reproducer with CONFIG_UBSAN_BOUNDS, I've noticed the
following:
[ T4985] UBSAN: array-index-out-of-bounds in net/wireless/scan.c:3479:25
[ T4985] index 164 is out of range for type 'struct ieee80211_channel *[]'
<...skipped...>
[ T4985] Call Trace:
[ T4985] <TASK>
[ T4985] dump_stack_lvl+0x1c2/0x2a0
[ T4985] ? __pfx_dump_stack_lvl+0x10/0x10
[ T4985] ? __pfx__printk+0x10/0x10
[ T4985] __ubsan_handle_out_of_bounds+0x127/0x150
[ T4985] cfg80211_wext_siwscan+0x11a4/0x1260
<...the rest is not too useful...>
Even if we do 'creq->n_channels = n_channels' before 'creq->ssids =
(void *)&creq->channels[n_channels]', UBSAN treats the latter as
off-by-one error. Fix this by using pointer arithmetic rather than
an expression with explicit array indexing and use convenient
'struct_size()' to simplify the math here and in 'kzalloc()' above.
Fixes: 5ba63533bbf6 ("cfg80211: fix alignment problem in scan request") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reviewed-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20240905150400.126386-1-dmantipov@yandex.ru
[fix coding style for multi-line calculation] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit b4bc9f9e27ed ("cpufreq: ti-cpufreq: add support for omap34xx
and omap36xx") introduced special handling for OMAP3 class devices
where syscon node may not be present. However, this also creates a bug
where the syscon node is present, however the offset used to read
is beyond the syscon defined range.
Fix this by providing a quirk option that is populated when such
special handling is required. This allows proper failure for all other
platforms when the syscon node and efuse offsets are mismatched.
Fixes: b4bc9f9e27ed ("cpufreq: ti-cpufreq: add support for omap34xx and omap36xx") Signed-off-by: Nishanth Menon <nm@ti.com> Tested-by: Dhruva Gole <d-gole@ti.com> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If element timeout is unset and set provides no default timeout, the
element expiration is silently ignored, reject this instead to let user
know this is unsupported.
Also prepare for supporting timeout that never expire, where zero
timeout and expiration must be also rejected.
Fixes: 8e1102d5a159 ("netfilter: nf_tables: support timeouts larger than 23 days") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Element timeout that is below CONFIG_HZ never expires because the
timeout extension is not allocated given that nf_msecs_to_jiffies64()
returns 0. Set timeout to the minimum value to honor timeout.
Fixes: 8e1102d5a159 ("netfilter: nf_tables: support timeouts larger than 23 days") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the case where we are forcing the ps.chunk_size to be at least 1,
we are ignoring the caller's alignment.
Move the forcing of ps.chunk_size to be at least 1 before rounding it
up to caller's alignment, so that caller's alignment is honored.
While at it, use max() to force the ps.chunk_size to be at least 1 to
improve readability.
Fixes: 6d45e1c948a8 ("padata: Fix possible divide-by-0 panic in padata_mt_helper()") Signed-off-by: Kamlesh Gurudasani <kamlesh@ti.com>
Acked-by:Â Waiman Long <longman@redhat.com> Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit f8b92ba67c5d ("mount: Add mount warning for impending timestamp
expiry") introduced a mount warning regarding filesystem timestamp
limits, that is printed upon each writable mount or remount.
This can result in a lot of unnecessary messages in the kernel log in
setups where filesystems are being frequently remounted (or mounted
multiple times).
Avoid this by setting a superblock flag which indicates that the warning
has been emitted at least once for any particular mount, as suggested in
[1].
Link: https://lore.kernel.org/CAHk-=wim6VGnxQmjfK_tDg6fbHYKL4EFkmnTjVr9QnRqjDBAeA@mail.gmail.com/ Link: https://lkml.kernel.org/r/20220119202934.26495-1-ailiop@suse.com Signed-off-by: Anthony Iliopoulos <ailiop@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stable-dep-of: 4bcda1eaf184 ("mount: handle OOM on mnt_warn_timestamp_expiry") Signed-off-by: Sasha Levin <sashal@kernel.org>
In 'rtw_wait_firmware_completion()', always wait for both (regular and
wowlan) firmware loading attempts. Otherwise if 'rtw_usb_intf_init()'
has failed in 'rtw_usb_probe()', 'rtw_usb_disconnect()' may issue
'ieee80211_free_hw()' when one of 'rtw_load_firmware_cb()' (usually
the wowlan one) is still in progress, causing UAF detected by KASAN.
Commit d23b5c577715 ("cgroup: Make operations on the cgroup root_list RCU
safe") adds a new rcu_head to the cgroup_root structure and kvfree_rcu()
for freeing the cgroup_root.
The current implementation of kvfree_rcu(), however, has the limitation
that the offset of the rcu_head structure within the larger data
structure must be less than 4096 or the compilation will fail. See the
macro definition of __is_kvfree_rcu_offset() in include/linux/rcupdate.h
for more information.
By putting rcu_head below the large cgroup structure, any change to the
cgroup structure that makes it larger run the risk of causing build
failure under certain configurations. Commit 77070eeb8821 ("cgroup:
Avoid false cacheline sharing of read mostly rstat_cpu") happens to be
the last straw that breaks it. Fix this problem by moving the rcu_head
structure up before the cgroup structure.
Fixes: d23b5c577715 ("cgroup: Make operations on the cgroup root_list RCU safe") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/lkml/20231207143806.114e0a74@canb.auug.org.au/ Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Yafang Shao <laoar.shao@gmail.com> Reviewed-by: Yosry Ahmed <yosryahmed@google.com> Reviewed-by: Michal KoutnĂ½ <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
linereq_set_config() behaves badly when direction is not set.
The configuration validation is borrowed from linereq_create(), where,
to verify the intent of the user, the direction must be set to in order to
effect a change to the electrical configuration of a line. But, when
applied to reconfiguration, that validation does not allow for the unset
direction case, making it possible to clear flags set previously without
specifying the line direction.
Adding to the inconsistency, those changes are not immediately applied by
linereq_set_config(), but will take effect when the line value is next get
or set.
For example, by requesting a configuration with no flags set, an output
line with GPIO_V2_LINE_FLAG_ACTIVE_LOW and GPIO_V2_LINE_FLAG_OPEN_DRAIN
set could have those flags cleared, inverting the sense of the line and
changing the line drive to push-pull on the next line value set.
Skip the reconfiguration of lines for which the direction is not set, and
only reconfigure the lines for which direction is set.
Fixes: a54756cb24ea ("gpiolib: cdev: support GPIO_V2_LINE_SET_CONFIG_IOCTL") Signed-off-by: Kent Gibson <warthog618@gmail.com> Link: https://lore.kernel.org/r/20240626052925.174272-3-warthog618@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
BUG: KASAN: use-after-free in ftrace_location+0x90/0x120
Read of size 8 at addr ffff888141d40010 by task insmod/424
CPU: 8 PID: 424 Comm: insmod Tainted: G W 6.9.0-rc2+
[...]
Call Trace:
<TASK>
dump_stack_lvl+0x68/0xa0
print_report+0xcf/0x610
kasan_report+0xb5/0xe0
ftrace_location+0x90/0x120
register_kprobe+0x14b/0xa40
kprobe_init+0x2d/0xff0 [kprobe_example]
do_one_initcall+0x8f/0x2d0
do_init_module+0x13a/0x3c0
load_module+0x3082/0x33d0
init_module_from_file+0xd2/0x130
__x64_sys_finit_module+0x306/0x440
do_syscall_64+0x68/0x140
entry_SYSCALL_64_after_hwframe+0x71/0x79
The root cause is that, in lookup_rec(), ftrace record of some address
is being searched in ftrace pages of some module, but those ftrace pages
at the same time is being freed in ftrace_release_mod() as the
corresponding module is being deleted:
To fix this issue:
1. Hold rcu lock as accessing ftrace pages in ftrace_location_range();
2. Use ftrace_location_range() instead of lookup_rec() in
ftrace_location();
3. Call synchronize_rcu() before freeing any ftrace pages both in
ftrace_process_locs()/ftrace_release_mod()/ftrace_free_mem().
Link: https://lore.kernel.org/linux-trace-kernel/20240509192859.1273558-1-zhengyejian1@huawei.com Cc: stable@vger.kernel.org Cc: <mhiramat@kernel.org> Cc: <mark.rutland@arm.com> Cc: <mathieu.desnoyers@efficios.com> Fixes: ae6aa16fdc16 ("kprobes: introduce ftrace based optimization") Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
[Shivani: Modified to apply on v5.10.y] Signed-off-by: Shivani Agarwal <shivani.agarwal@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently a lot of ftrace code assumes __fentry__ is at sym+0. However
with Intel IBT enabled the first instruction of a function will most
likely be ENDBR.
Change ftrace_location() to not only return the __fentry__ location
when called for the __fentry__ location, but also when called for the
sym+0 location.
Then audit/update all callsites of this function to consistently use
these new semantics.
Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154318.227581603@infradead.org
Stable-dep-of: e60b613df8b6 ("ftrace: Fix possible use-after-free issue in ftrace_location()")
[Shivani: Modified to apply on v5.10.y] Signed-off-by: Shivani Agarwal <shivani.agarwal@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ip_local_out() and other functions can pass skb->sk as function argument.
If the skb is a fragment and reassembly happens before such function call
returns, the sk must not be released.
This affects skb fragments reassembled via netfilter or similar
modules, e.g. openvswitch or ct_act.c, when run as part of tx pipeline.
Eric Dumazet made an initial analysis of this bug. Quoting Eric:
Calling ip_defrag() in output path is also implying skb_orphan(),
which is buggy because output path relies on sk not disappearing.
A relevant old patch about the issue was : 8282f27449bf ("inet: frag: Always orphan skbs inside ip_defrag()")
[..]
net/ipv4/ip_output.c depends on skb->sk being set, and probably to an
inet socket, not an arbitrary one.
If we orphan the packet in ipvlan, then downstream things like FQ
packet scheduler will not work properly.
We need to change ip_defrag() to only use skb_orphan() when really
needed, ie whenever frag_list is going to be used.
Eric suggested to stash sk in fragment queue and made an initial patch.
However there is a problem with this:
If skb is refragmented again right after, ip_do_fragment() will copy
head->sk to the new fragments, and sets up destructor to sock_wfree.
IOW, we have no choice but to fix up sk_wmem accouting to reflect the
fully reassembled skb, else wmem will underflow.
This change moves the orphan down into the core, to last possible moment.
As ip_defrag_offset is aliased with sk_buff->sk member, we must move the
offset into the FRAG_CB, else skb->sk gets clobbered.
This allows to delay the orphaning long enough to learn if the skb has
to be queued or if the skb is completing the reasm queue.
In the former case, things work as before, skb is orphaned. This is
safe because skb gets queued/stolen and won't continue past reasm engine.
In the latter case, we will steal the skb->sk reference, reattach it to
the head skb, and fix up wmem accouting when inet_frag inflates truesize.
Fixes: 7026b1ddb6b8 ("netfilter: Pass socket pointer down through okfn().") Diagnosed-by: Eric Dumazet <edumazet@google.com> Reported-by: xingwei lee <xrivendell7@gmail.com> Reported-by: yue sun <samsun1006219@gmail.com> Reported-by: syzbot+e5167d7144a62715044c@syzkaller.appspotmail.com Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240326101845.30836-1-fw@strlen.de Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In remove_anno_list_by_saddr(running on CPU2), after leaving the critical
zone protected by "pm.lock", the entry will be released, which leads to the
occurrence of uaf in the mptcp_pm_del_add_timer(running on CPU1).
Keeping a reference to add_timer inside the lock, and calling
sk_stop_timer_sync() with this reference, instead of "entry->add_timer".
Move list_del(&entry->list) to mptcp_pm_del_add_timer and inside the pm lock,
do not directly access any members of the entry outside the pm lock, which
can avoid similar "entry->x" uaf.
Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout") Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+f3a31fb909db9b2a5c4d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f3a31fb909db9b2a5c4d Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Edward Adam Davis <eadavis@qq.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/tencent_7142963A37944B4A74EF76CD66EA3C253609@qq.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit b4cd80b0338945a94972ac3ed54f8338d2da2076) Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
when Linux receives an echo-ed ADD_ADDR, it checks the IP address against
the list of "announced" addresses. In case of a positive match, the timer
that handles retransmissions is stopped regardless of the 'Address Id' in
the received packet: this behaviour does not comply with RFC8684 3.4.1.
Fix it by validating the 'Address Id' in received echo-ed ADD_ADDRs.
Tested using packetdrill, with the following captured output:
unpatched kernel:
Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0xfd2e62517888fe29,mptcp dss ack 3007449509], length 0
In <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 1 1.2.3.4,mptcp dss ack 3013740213], length 0
Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0xfd2e62517888fe29,mptcp dss ack 3007449509], length 0
In <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 90 198.51.100.2,mptcp dss ack 3013740213], length 0
^^^ retransmission is stopped here, but 'Address Id' is 90
patched kernel:
Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0x1cf372d59e05f4b8,mptcp dss ack 3007449509], length 0
In <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 1 1.2.3.4,mptcp dss ack 1672384568], length 0
Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0x1cf372d59e05f4b8,mptcp dss ack 3007449509], length 0
In <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 90 198.51.100.2,mptcp dss ack 1672384568], length 0
Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0x1cf372d59e05f4b8,mptcp dss ack 3007449509], length 0
In <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 1 198.51.100.2,mptcp dss ack 1672384568], length 0
^^^ retransmission is stopped here, only when both 'Address Id' and 'IP Address' match
Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: b4cd80b03389 ("mptcp: pm: Fix uaf in __timer_delete_sync")
[ Conflicts in options.c, because some features are missing in this
version, e.g. commit 557963c383e8 ("mptcp: move to next addr when
subflow creation fail") and commit f7dafee18538 ("mptcp: use
mptcp_addr_info in mptcp_options_received"). ] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Userspace may trigger a speculative read of an address outside the gpio
descriptor array.
Users can do that by calling gpio_ioctl() with an offset out of range.
Offset is copied from user and then used as an array index to get
the gpio descriptor without sanitization in gpio_device_get_desc().
This change ensures that the offset is sanitized by using
array_index_nospec() to mitigate any possibility of speculative
information leaks.
This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc.
Add missing decorator type to lookup expression and tighten WARN_ON_ONCE
check in pipapo to spot earlier that this is unset.
Fixes: 29b359cf6d95 ("netfilter: nft_set_pipapo: walk over current view on netlink dump") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The generation mask can be updated while netlink dump is in progress.
The pipapo set backend walk iterator cannot rely on it to infer what
view of the datastructure is to be used. Add notation to specify if user
wants to read/update the set.
Based on patch from Florian Westphal.
Fixes: 2b84e215f874 ("netfilter: nft_set_pipapo: .walk does not deal with generations") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
At present, when we perform operations on the cgroup root_list, we must
hold the cgroup_mutex, which is a relatively heavyweight lock. In reality,
we can make operations on this list RCU-safe, eliminating the need to hold
the cgroup_mutex during traversal. Modifications to the list only occur in
the cgroup root setup and destroy paths, which should be infrequent in a
production environment. In contrast, traversal may occur frequently.
Therefore, making it RCU-safe would be beneficial.
xattr in ocfs2 maybe 'non-indexed', which saved with additional space
requested. It's better to check if the memory is out of bound before
memcmp, although this possibility mainly comes from crafted poisonous
images.
Link: https://lkml.kernel.org/r/20240520024024.1976129-2-joseph.qi@linux.alibaba.com Signed-off-by: Ferry Meng <mengferry@linux.alibaba.com> Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: lei lu <llfamsec@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add a paranoia check to make sure it doesn't stray beyond valid memory
region containing ocfs2 xattr entries when scanning for a match. It will
prevent out-of-bound access in case of crafted images.
Link: https://lkml.kernel.org/r/20240520024024.1976129-1-joseph.qi@linux.alibaba.com Signed-off-by: Ferry Meng <mengferry@linux.alibaba.com> Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: lei lu <llfamsec@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: af77c4fc1871 ("ocfs2: strict bound check before memcmp in ocfs2_xattr_find_entry()") Signed-off-by: Sasha Levin <sashal@kernel.org>
A Linux guest on Hyper-V gets the TSC frequency from a synthetic MSR, if
available. In this case, set X86_FEATURE_TSC_KNOWN_FREQ so that Linux
doesn't unnecessarily do refined TSC calibration when setting up the TSC
clocksource.
With this change, a message such as this is no longer output during boot
when the TSC is used as the clocksource:
Furthermore, the guest and host will have exactly the same view of the
TSC frequency, which is important for features such as the TSC deadline
timer that are emulated by the Hyper-V host.
Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Roman Kisel <romank@linux.microsoft.com> Link: https://lore.kernel.org/r/20240606025559.1631-1-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20240606025559.1631-1-mhklinux@outlook.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
We use komeda_crtc_normalize_zpos to normalize zpos of affected planes
to their blending zorder in CU. If there's only one slave plane in
affected planes and its layer_split property is enabled, order++ for
its split layer, so that when calculating the normalized_zpos
of master planes, the split layer of the slave plane is included, but
the max_slave_zorder does not include the split layer and keep zero
because there's only one slave plane in affacted planes, although we
actually use two slave layers in this commit.
In most cases, this bug does not result in a commit failure, but assume
the following situation:
slave_layer 0: zpos = 0, layer split enabled, normalized_zpos =
0;(use slave_layer 2 as its split layer)
master_layer 0: zpos = 2, layer_split enabled, normalized_zpos =
2;(use master_layer 2 as its split layer)
master_layer 1: zpos = 4, normalized_zpos = 4;
master_layer 3: zpos = 5, normalized_zpos = 5;
kcrtc_st->max_slave_zorder = 0;
When we use master_layer 3 as a input of CU in function
komeda_compiz_set_input and check it with function
komeda_component_check_input, the parameter idx is equal to
normailzed_zpos minus max_slave_zorder, the value of idx is 5
and is euqal to CU's max_active_inputs, so that
komeda_component_check_input returns a -EINVAL value.
To fix the bug described above, when calculating the max_slave_zorder
with the layer_split enabled, count the split layer in this calculation
directly.
There is a WARNING in iwl_trans_wait_tx_queues_empty() (that was
recently converted from just a message), that can be hit if we
wait for TX queues to become empty after firmware died. Clearly,
we can't expect anything from the firmware after it's declared dead.
Don't call iwl_trans_wait_tx_queues_empty() in this case. While it could
be a good idea to stop the flow earlier, the flush functions do some
maintenance work that is not related to the firmware, so keep that part
of the code running even when the firmware is not running.
An invalid buffer destination is not a problem for the driver and it
does not make sense to report it with the KERN_ERR message level. As
such, change the message to use IWL_DEBUG_FW.
Reported-by: Len Brown <lenb@kernel.org> Closes: https://lore.kernel.org/r/CAJvTdKkcxJss=DM2sxgv_MR5BeZ4_OC-3ad6tA40TYH2yqHCWw@mail.gmail.com Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20240825191257.20abf78f05bc.Ifbcecc2ae9fb40b9698302507dcba8b922c8d856@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The driver must ensure TX descriptor updates are visible
before updating TX pointer and TX clear pointer.
This resolves TX hangs observed on AST2600 when running
iperf3.
Signed-off-by: Jacky Chou <jacky_chou@aspeedtech.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Before commit 721f4a6526da ("mm/memblock: remove empty dummy entry") the
check for non-zero of memblock.reserved.cnt in mmu_init() would always
be true either because memblock.reserved.cnt is initialized to 1 or
because there were memory reservations earlier.
The removal of dummy empty entry in memblock caused this check to fail
because now memblock.reserved.cnt is initialized to 0.
Remove the check for non-zero of memblock.reserved.cnt because it's
perfectly fine to have an empty memblock.reserved array that early in
boot.
pinctrl-at91 currently does not support the gpio-groups devicetree
property and has no pin-range.
Because of this at91 gpios stopped working since patch
commit 2ab73c6d8323fa1e ("gpio: Support GPIO controllers without pin-ranges")
This was discussed in the patches
commit fc328a7d1fcce263 ("gpio: Revert regression in sysfs-gpio (gpiolib.c)")
commit 56e337f2cf132632 ("Revert "gpio: Revert regression in sysfs-gpio (gpiolib.c)"")
As a workaround manually set pin-range via gpiochip_add_pin_range() until
a) pinctrl-at91 is reworked to support devicetree gpio-groups
b) another solution as mentioned in
commit 56e337f2cf132632 ("Revert "gpio: Revert regression in sysfs-gpio (gpiolib.c)"")
is found
Dell platform with ALC215 ALC285 ALC289 ALC225 ALC295 ALC299, plug
headphone or headset.
It had a chance to get no sound from headphone.
Replace depop procedure will solve this issue.
Until VM_DONTEXPAND was added in commit 1c1914d6e8c6 ("dma-buf: heaps:
Don't track CMA dma-buf pages under RssFile") it was possible to obtain
a mapping larger than the buffer size via mremap and bypass the overflow
check in dma_buf_mmap_internal. When using such a mapping to attempt to
fault past the end of the buffer, the CMA heap fault handler also checks
the fault offset against the buffer size, but gets the boundary wrong by
1. Fix the boundary check so that we don't read off the end of the pages
array and insert an arbitrary page in the mapping.
Reported-by: Xingyu Jin <xingyuj@google.com> Fixes: a5d2d29e24be ("dma-buf: heaps: Move heap-helper logic into the cma_heap implementation") Cc: stable@vger.kernel.org # Applicable >= 5.10. Needs adjustments only for 5.10. Signed-off-by: T.J. Mercier <tjmercier@google.com> Acked-by: John Stultz <jstultz@google.com> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240830192627.2546033-1-tjmercier@google.com
[ TJ: Backport to 5.10. On this kernel the bug is located in
dma_heap_vm_fault which is used by both the CMA and system heaps. ] Signed-off-by: T.J. Mercier <tjmercier@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Buffer 'card->dai_link' is reallocated in 'meson_card_reallocate_links()',
so move 'pad' pointer initialization after this function when memory is
already reallocated.
Kasan bug report:
==================================================================
BUG: KASAN: slab-use-after-free in axg_card_add_link+0x76c/0x9bc
Read of size 8 at addr ffff000000e8b260 by task modprobe/356
This reverts commit ab8d66d132bc8f1992d3eb6cab8d32dda6733c84 because it
breaks codecs using non-continuous masks in source and sink ports. The
commit missed the point that port numbers are not used as indices for
iterating over prop.sink_ports or prop.source_ports.
Soundwire core and existing codecs expect that the array passed as
prop.sink_ports and prop.source_ports is continuous. The port mask still
might be non-continuous, but that's unrelated.
Reported-by: Bard Liao <yung-chuan.liao@linux.intel.com> Closes: https://lore.kernel.org/all/b6c75eee-761d-44c8-8413-2a5b34ee2f98@linux.intel.com/ Fixes: ab8d66d132bc ("soundwire: stream: fix programming slave ports for non-continous port maps") Acked-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Cc: stable@vger.kernel.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Tested-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://lore.kernel.org/r/20240909164746.136629-1-krzysztof.kozlowski@linaro.org Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Change the memcpy length to fix the out-of-bounds issue when writing the
data that is not 4 byte aligned to TX FIFO.
To reproduce the issue, write 3 bytes data to NOR chip.
dd if=3b of=/dev/mtd0
[ 36.926103] ==================================================================
[ 36.933409] BUG: KASAN: slab-out-of-bounds in nxp_fspi_exec_op+0x26ec/0x2838
[ 36.940514] Read of size 4 at addr ffff00081037c2a0 by task dd/455
[ 36.946721]
[ 36.948235] CPU: 3 UID: 0 PID: 455 Comm: dd Not tainted 6.11.0-rc5-gc7b0e37c8434 #1070
[ 36.956185] Hardware name: Freescale i.MX8QM MEK (DT)
[ 36.961260] Call trace:
[ 36.963723] dump_backtrace+0x90/0xe8
[ 36.967414] show_stack+0x18/0x24
[ 36.970749] dump_stack_lvl+0x78/0x90
[ 36.974451] print_report+0x114/0x5cc
[ 36.978151] kasan_report+0xa4/0xf0
[ 36.981670] __asan_report_load_n_noabort+0x1c/0x28
[ 36.986587] nxp_fspi_exec_op+0x26ec/0x2838
[ 36.990800] spi_mem_exec_op+0x8ec/0xd30
[ 36.994762] spi_mem_no_dirmap_read+0x190/0x1e0
[ 36.999323] spi_mem_dirmap_write+0x238/0x32c
[ 37.003710] spi_nor_write_data+0x220/0x374
[ 37.007932] spi_nor_write+0x110/0x2e8
[ 37.011711] mtd_write_oob_std+0x154/0x1f0
[ 37.015838] mtd_write_oob+0x104/0x1d0
[ 37.019617] mtd_write+0xb8/0x12c
[ 37.022953] mtdchar_write+0x224/0x47c
[ 37.026732] vfs_write+0x1e4/0x8c8
[ 37.030163] ksys_write+0xec/0x1d0
[ 37.033586] __arm64_sys_write+0x6c/0x9c
[ 37.037539] invoke_syscall+0x6c/0x258
[ 37.041327] el0_svc_common.constprop.0+0x160/0x22c
[ 37.046244] do_el0_svc+0x44/0x5c
[ 37.049589] el0_svc+0x38/0x78
[ 37.052681] el0t_64_sync_handler+0x13c/0x158
[ 37.057077] el0t_64_sync+0x190/0x194
[ 37.060775]
[ 37.062274] Allocated by task 455:
[ 37.065701] kasan_save_stack+0x2c/0x54
[ 37.069570] kasan_save_track+0x20/0x3c
[ 37.073438] kasan_save_alloc_info+0x40/0x54
[ 37.077736] __kasan_kmalloc+0xa0/0xb8
[ 37.081515] __kmalloc_noprof+0x158/0x2f8
[ 37.085563] mtd_kmalloc_up_to+0x120/0x154
[ 37.089690] mtdchar_write+0x130/0x47c
[ 37.093469] vfs_write+0x1e4/0x8c8
[ 37.096901] ksys_write+0xec/0x1d0
[ 37.100332] __arm64_sys_write+0x6c/0x9c
[ 37.104287] invoke_syscall+0x6c/0x258
[ 37.108064] el0_svc_common.constprop.0+0x160/0x22c
[ 37.112972] do_el0_svc+0x44/0x5c
[ 37.116319] el0_svc+0x38/0x78
[ 37.119401] el0t_64_sync_handler+0x13c/0x158
[ 37.123788] el0t_64_sync+0x190/0x194
[ 37.127474]
[ 37.128977] The buggy address belongs to the object at ffff00081037c2a0
[ 37.128977] which belongs to the cache kmalloc-8 of size 8
[ 37.141177] The buggy address is located 0 bytes inside of
[ 37.141177] allocated 3-byte region [ffff00081037c2a0, ffff00081037c2a3)
[ 37.153465]
[ 37.154971] The buggy address belongs to the physical page:
[ 37.160559] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x89037c
[ 37.168596] flags: 0xbfffe0000000000(node=0|zone=2|lastcpupid=0x1ffff)
[ 37.175149] page_type: 0xfdffffff(slab)
[ 37.179021] raw: 0bfffe0000000000ffff000800002500dead0000000001220000000000000000
[ 37.186788] raw: 0000000000000000000000008080008000000001fdffffff0000000000000000
[ 37.194553] page dumped because: kasan: bad access detected
[ 37.200144]
[ 37.201647] Memory state around the buggy address:
[ 37.206460] ffff00081037c180: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc
[ 37.213701] ffff00081037c200: fa fc fc fc 05 fc fc fc 03 fc fc fc 02 fc fc fc
[ 37.220946] >ffff00081037c280: 06 fc fc fc 03 fc fc fc fc fc fc fc fc fc fc fc
[ 37.228186] ^
[ 37.232473] ffff00081037c300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 37.239718] ffff00081037c380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 37.246962] ==================================================================
[ 37.254394] Disabling lock debugging due to kernel taint
0+1 records in
0+1 records out
3 bytes copied, 0.335911 s, 0.0 kB/s
Fixes: a5356aef6a90 ("spi: spi-mem: Add driver for NXP FlexSPI controller") Cc: stable@kernel.org Signed-off-by: Han Xu <han.xu@nxp.com> Link: https://patch.msgid.link/20240911211146.3337068-1-han.xu@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When sending packets under 60 bytes, up to three bytes of the buffer
following the data may be leaked. Avoid this by extending all packets to
ETH_ZLEN, ensuring nothing is leaked in the padding. This bug can be
reproduced by running
$ ping -s 11 destination
Fixes: 9ad1a3749333 ("dpaa_eth: add support for DPAA Ethernet") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20240910143144.1439910-1-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently, the driver only enables RX interrupt to handle RX
packets and TX resources. Sometimes there is not RX traffic,
so the TX resource needs to wait for RX interrupt to free.
This situation will toggle the TX timeout watchdog when the MAC
TX ring has no more resources to transmit packets.
Therefore, enable TX interrupt to release TX resources at any time.
When I am verifying iperf3 over UDP, the network hangs.
Like the log below.
The network topology is FTGMAC connects directly to a PC.
UDP does not need to wait for ACK, unlike TCP.
Therefore, FTGMAC needs to enable TX interrupt to release TX resources instead
of waiting for the RX interrupt.
When adding a switch filter (such as a MAC or VLAN filter), it is expected
that the driver will detect the case where the filter already exists, and
return -EEXIST. This is used by calling code such as ice_vc_add_mac_addr,
and ice_vsi_add_vlan to avoid incrementing the accounting fields such as
vsi->num_vlan or vf->num_mac.
This logic works correctly for the case where only a single VSI has added a
given switch filter.
When a second VSI adds the same switch filter, the driver converts the
existing filter from an ICE_FWD_TO_VSI filter into an ICE_FWD_TO_VSI_LIST
filter. This saves switch resources, by ensuring that multiple VSIs can
re-use the same filter.
The ice_add_update_vsi_list() function is responsible for doing this
conversion. When first converting a filter from the FWD_TO_VSI into
FWD_TO_VSI_LIST, it checks if the VSI being added is the same as the
existing rule's VSI. In such a case it returns -EEXIST.
However, when the switch rule has already been converted to a
FWD_TO_VSI_LIST, the logic is different. Adding a new VSI in this case just
requires extending the VSI list entry. The logic for checking if the rule
already exists in this case returns 0 instead of -EEXIST.
This breaks the accounting logic mentioned above, so the counters for how
many MAC and VLAN filters exist for a given VF or VSI no longer accurately
reflect the actual count. This breaks other code which relies on these
counts.
In typical usage this primarily affects such filters generally shared by
multiple VSIs such as VLAN 0, or broadcast and multicast MAC addresses.
Fix this by correctly reporting -EEXIST in the case of adding the same VSI
to a switch rule already converted to ICE_FWD_TO_VSI_LIST.
Fixes: 9daf8208dd4d ("ice: Add support for switch filter programming") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The current implementation of pmbus_show_boolean assumes that all devices
support write-back operation of status register to clear pending warnings
or faults. Since clearing individual bits in the status registers was only
introduced in PMBus specification 1.2, this operation may not be supported
by some older devices. This can result in an error while reading boolean
attributes such as temp1_max_alarm.
Fetch PMBus revision supported by the device and modify pmbus_show_boolean
so that it only tries to clear individual status bits if the device is
compliant with PMBus specs >= 1.2. Otherwise clear all fault indicators
on the current page after a fault status was reported.
Fixes: 35f165f08950a ("hwmon: (pmbus) Clear pmbus fault/warning bits after read") Signed-off-by: Patryk Biel <pbiel7@gmail.com>
Message-ID: <20240909-pmbus-status-reg-clearing-v1-1-f1c0d68c6408@gmail.com>
[groeck:
Rewrote description
Moved revision detection code ahead of clear faults command
Assigned revision if return value from PMBUS_REVISION command is 0
Improved return value check from calling _pmbus_write_byte_data()] Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Some of the pmbus core functions uses pmbus_write_byte_data, which does
not support driver callbacks for chip specific write operations. This
could potentially influence some specific regulator chips that for
example need a time delay before each data access.
Lets add support for driver callback with _pmbus_write_byte_data.
Avoid unnecessary nested min()/max() which results in egregious macro
expansion.
Use clamp_t() as this introduces the least possible expansion, and turn
the {s,u}DIGIT_FITTING() macros into inline functions to avoid the
nested expansion.
This resolves an issue with slackware 15.0 32-bit compilation as
reported by Richard Narron.
Presumably the min/max fixups would be difficult to backport, this patch
should be easier and fix's Richard's problem in 5.15.
Reported-by: Richard Narron <richard@aaazen.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Closes: https://lore.kernel.org/all/4a5321bd-b1f-1832-f0c-cea8694dc5aa@aaazen.com/ Fixes: 867046cc7027 ("minmax: relax check to allow comparison between unsigned arguments and signed constants") Cc: stable@vger.kernel.org Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Qseven BIOS_DISABLE signal on the RK3399-Q7 keeps the on-module eMMC
and SPI flash powered-down initially (in fact it keeps the reset signal
asserted). BIOS_DISABLE_OVERRIDE pin allows to override that signal so
that eMMC and SPI can be used regardless of the state of the signal.
Let's make this GPIO a hog so that it's reserved and locked in the
proper state.
At the same time, make sure the pin is reserved for the hog and cannot
be requested by another node.
-ENODEV is used to signify that there is no zap shader for the platform,
and the CPU can directly take the GPU out of secure mode. We want to
use this return code when there is no zap-shader node. But not when
there is, but without a firmware-name property. This case we want to
treat as-if the needed fw is not found.
Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/604564/ Signed-off-by: Sasha Levin <sashal@kernel.org>
When merging files without trailing newlines at the end of the file, two
config fragments end up at the same row if file1.config doens't have a
trailing newline at the end of the file.
Unlink changes the link count on the target inode. POSIX mandates that
the ctime must also change when this occurs.
According to https://pubs.opengroup.org/onlinepubs/9699919799/functions/unlink.html:
"Upon successful completion, unlink() shall mark for update the last data
modification and last file status change timestamps of the parent
directory. Also, if the file's link count is not 0, the last file status
change timestamp of the file shall be marked for update."
Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com>
[ add link to the opengroup docs ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This is due to virt_addr_valid() using high_memory before it is set.
high_memory is set in mem_init() using max_low_pfn, but max_low_pfn
is available long before, it is set in mem_topology_setup(). So just
like commit daa9ada2093e ("powerpc/mm: Fix boot crash with FLATMEM")
moved the setting of max_mapnr immediately after the call to
mem_topology_setup(), the same can be done for high_memory.
Apart from the standard "configurations", "interfaces" and "alternate
interface settings" in USB, iOS devices also have a notion of
"modes". In different modes, the device exposes a different set of
available configurations.
Depending on the iOS version, and depending on the current mode, the
length and contents of the carrier state control message differs:
* 1 byte (seen on iOS 4.2.1, 8.4):
* 03: carrier off (mode 0)
* 04: carrier on (mode 0)
* 3 bytes (seen on iOS 10.3.4, 15.7.6):
* 03 03 03: carrier off (mode 0)
* 04 04 03: carrier on (mode 0)
* 4 bytes (seen on iOS 16.5, 17.6):
* 03 03 03 00: carrier off (mode 0)
* 04 03 03 00: carrier off (mode 1)
* 06 03 03 00: carrier off (mode 4)
* 04 04 03 04: carrier on (mode 0 and 1)
* 06 04 03 04: carrier on (mode 4)
Before this change, the driver always used the first byte of the
response to determine carrier state.
From this larger sample, the first byte seems to indicate the number of
available USB configurations in the current mode (with the exception of
the default mode 0), and in some cases (namely mode 1 and 4) does not
correlate with the carrier state.
Previous logic erroneously counted `04 03 03 00` as "carrier on" and
`06 04 03 04` as "carrier off" on iOS versions that support mode 1 and
mode 4 respectively.
Only modes 0, 1 and 4 expose the USB Ethernet interfaces necessary for
the ipheth driver.
Check the second byte of the control message where possible, and fall
back to checking the first byte on older iOS versions.
Signed-off-by: Foster Snowhill <forst@pen.gy> Tested-by: Georgi Valkov <gvalkov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
This fix addresses STAR 9001285599, which only affects DWC_usb3 version
3.20a. The timer value for PM_LC_TIMER in DWC_usb3 3.20a for the Link
ECN changes is incorrect. If the PM TIMER ECN is enabled via GUCTL2[19],
the link compliance test (TD7.21) may fail. If the ECN is not enabled
(GUCTL2[19] = 0), the controller will use the old timer value (5us),
which is still acceptable for the link compliance test. Therefore, clear
GUCTL2[19] to pass the USB link compliance test: TD 7.21.
When configured in HOST mode, after issuing U3/L2 exit controller fails
to send proper CRC checksum in CRC5 field. Because of this behavior
Transaction Error is generated, resulting in reset and re-enumeration of
usb device attached. Enabling chicken bit 10 of GUCTL1 will correct this
problem.
When this bit is set to '1', the UTMI/ULPI opmode will be changed to
"normal" along with HS terminations, term, and xcvr signals after EOR.
This option is to support certain legacy UTMI/ULPI PHYs.
Added "snps,resume-hs-terminations" quirk to resolved the above issue.
On DWC_usb3 revisions 3.00a and newer (including DWC_usb31 and
DWC_usb32) the GUCTL1 register gained the DEV_DECOUPLE_L1L2_EVT
field (bit 31) which when enabled allows the controller in device
mode to treat USB 2.0 L1 LPM & L2 events separately.
After commit d1d90dd27254 ("usb: dwc3: gadget: Enable suspend
events") the controller will now receive events (and therefore
interrupts) for every state change when entering/exiting either
L1 or L2 states. Since L1 is handled entirely by the hardware
and requires no software intervention, there is no need to even
enable these events and unnecessarily notify the gadget driver.
Enable the aforementioned bit to help reduce the overall interrupt
count for these L1 events that don't need to be handled while
retaining the events for full L2 suspend/wakeup.
Tested-by: Jun Li <jun.li@nxp.com> Tested-by: Amit Pundir <amit.pundir@linaro.org> # for RB5 (sm8250) Tested-by: John Stultz <john.stultz@linaro.org> # for HiKey960 & db845c Reviewed-by: Jun Li <jun.li@nxp.com> Acked-by: Felipe Balbi <balbi@kernel.org> Signed-off-by: Jack Pham <jackp@codeaurora.org> Link: https://lore.kernel.org/r/20210812082635.12924-1-jackp@codeaurora.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stable-dep-of: 9149c9b0c7e0 ("usb: dwc3: core: update LC timer as per USB Spec V3.2") Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after
many small jobs") decoupled the memcg IDs from the CSS ID space to fix the
cgroup creation failures. It introduced IDR to maintain the memcg ID
space. The IDR depends on external synchronization mechanisms for
modifications. For the mem_cgroup_idr, the idr_alloc() and idr_replace()
happen within css callback and thus are protected through cgroup_mutex
from concurrent modifications. However idr_remove() for mem_cgroup_idr
was not protected against concurrency and can be run concurrently for
different memcgs when they hit their refcnt to zero. Fix that.
We have been seeing list_lru based kernel crashes at a low frequency in
our fleet for a long time. These crashes were in different part of
list_lru code including list_lru_add(), list_lru_del() and reparenting
code. Upon further inspection, it looked like for a given object (dentry
and inode), the super_block's list_lru didn't have list_lru_one for the
memcg of that object. The initial suspicions were either the object is
not allocated through kmem_cache_alloc_lru() or somehow
memcg_list_lru_alloc() failed to allocate list_lru_one() for a memcg but
returned success. No evidence were found for these cases.
Looking more deeply, we started seeing situations where valid memcg's id
is not present in mem_cgroup_idr and in some cases multiple valid memcgs
have same id and mem_cgroup_idr is pointing to one of them. So, the most
reasonable explanation is that these situations can happen due to race
between multiple idr_remove() calls or race between
idr_alloc()/idr_replace() and idr_remove(). These races are causing
multiple memcgs to acquire the same ID and then offlining of one of them
would cleanup list_lrus on the system for all of them. Later access from
other memcgs to the list_lru cause crashes due to missing list_lru_one.
Link: https://lkml.kernel.org/r/20240802235822.1830976-1-shakeel.butt@linux.dev Fixes: 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs") Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Muchun Song <muchun.song@linux.dev> Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Adapted due to commit be740503ed03 ("mm: memcontrol: fix cannot alloc the
maximum memcg ID") and 6f0df8e16eb5 ("memcontrol: ensure memcg acquired by id
is properly set up") not in the tree ] Signed-off-by: Tomas Krcka <krckatom@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When using a BPF program on kernel_connect(), the call can return -EPERM. This
causes xs_tcp_setup_socket() to loop forever, filling up the syslog and causing
the kernel to potentially freeze up.
Neil suggested:
This will propagate -EPERM up into other layers which might not be ready
to handle it. It might be safer to map EPERM to an error we would be more
likely to expect from the network system - such as ECONNREFUSED or ENETDOWN.
ECONNREFUSED as error seems reasonable. For programs setting a different error
can be out of reach (see handling in 4fbac77d2d09) in particular on kernels
which do not have f10d05966196 ("bpf: Make BPF_PROG_RUN_ARRAY return -err
instead of allow boolean"), thus given that it is better to simply remap for
consistent behavior. UDP does handle EPERM in xs_udp_send_request().
So it turns out that we have to do two passes of
pti_clone_entry_text(), once before initcalls, such that device and
late initcalls can use user-mode-helper / modprobe and once after
free_initmem() / mark_readonly().
Now obviously mark_readonly() can cause PMD splits, and
pti_clone_pgtable() doesn't like that much.
Allow the late clone to split PMDs so that pagetables stay in sync.
[peterz: Changelog and comments] Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lkml.kernel.org/r/20240806184843.GX37996@noisy.programming.kicks-ass.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
rt_mutex_handle_deadlock() is called with rt_mutex::wait_lock held. In the
good case it returns with the lock held and in the deadlock case it emits a
warning and goes into an endless scheduling loop with the lock held, which
triggers the 'scheduling in atomic' warning.
Unlock rt_mutex::wait_lock in the dead lock case before issuing the warning
and dropping into the schedule for ever loop.
[ tglx: Moved unlock before the WARN(), removed the pointless comment,
massaged changelog, added Fixes tag ]
In a review discussion of the changes to support vCPU hotplug where
a check was added on the GICC being enabled if was online, it was
noted that there is need to map back to the cpu and use that to index
into a cpumask. As such, a valid ID is needed.
If an MPIDR check fails in acpi_map_gic_cpu_interface() it is possible
for the entry in cpu_madt_gicc[cpu] == NULL. This function would
then cause a NULL pointer dereference. Whilst a path to trigger
this has not been established, harden this caller against the
possibility.
If acpi_processor_get_info() returned an error, pr and the associated
pr->throttling.shared_cpu_map were leaked.
The unwind code was in the wrong order wrt to setup, relying on
some unwind actions having no affect (clearing variables that were
never set etc). That makes it harder to reason about so reorder
and add appropriate labels to only undo what was actually set up
in the first place.
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240529133446.28446-6-Jonathan.Cameron@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Rafael observed [1] that returning 0 from processor_add() will result in
acpi_default_enumeration() being called which will attempt to create a
platform device, but that makes little sense when the processor is known
to be not available. So just return the error code from acpi_processor_get_info()
instead.