Rohan G Thomas [Tue, 28 Oct 2025 03:18:45 +0000 (11:18 +0800)]
net: stmmac: est: Fix GCL bounds checks
Fix the bounds checks for the hw supported maximum GCL entry
count and gate interval time.
Fixes: b60189e0392f ("net: stmmac: Integrate EST with TAPRIO scheduler API")
Signed-off-by: Rohan G Thomas <rohan.g.thomas@altera.com>
Reviewed-by: Matthew Gerlach <matthew.gerlach@altera.com>
Link: https://patch.msgid.link/20251028-qbv-fixes-v4-3-26481c7634e3@altera.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rohan G Thomas [Tue, 28 Oct 2025 03:18:44 +0000 (11:18 +0800)]
net: stmmac: Consider Tx VLAN offload tag length for maxSDU
The queue maxSDU requirement of the 802.1 Qbv standard requires the MAC
to drop packets that exceed the maxSDU length. maxSDU doesn't include
the preamble, destination and source address, or FCS, but does include
the ethernet type and VLAN header.
On hardware with Tx VLAN offload enabled, VLAN header length is not
included in the skb->len, when Tx VLAN offload is requested. This
leads to incorrect length checks and allows transmission of
oversized packets. Add the VLAN_HLEN to the skb->len before checking
the Qbv maxSDU if Tx VLAN offload is requested for the packet.
Fixes: c5c3e1bfc9e0 ("net: stmmac: Offload queueMaxSDU from tc-taprio")
Signed-off-by: Rohan G Thomas <rohan.g.thomas@altera.com>
Reviewed-by: Matthew Gerlach <matthew.gerlach@altera.com>
Link: https://patch.msgid.link/20251028-qbv-fixes-v4-2-26481c7634e3@altera.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rohan G Thomas [Tue, 28 Oct 2025 03:18:43 +0000 (11:18 +0800)]
net: stmmac: vlan: Disable 802.1AD tag insertion offload
The DWMAC IP's VLAN tag insertion offload does not support inserting
STAG (802.1AD) and CTAG (802.1Q) types in bytes 13 and 14 using the
same MAC_VLAN_Incl and MAC_VLAN_Inner_Incl register configurations.
Currently, MAC_VLAN_Incl is configured to offload only STAG type
insertion. However, the DWMAC IP inserts a CTAG type when the inner
VLAN ID field of the descriptor is not configured, and a STAG type
when it is configured. This behavior is not documented and leads to
inconsistent double VLAN tagging.
Additionally, an unexpected CTAG with VLAN ID 0 is inserted, resulting
in frames like:
Frame 1: 110 bytes on wire (880 bits), 110 bytes captured (880 bits)
Ethernet II, Src: <src> (<src>), Dst: <dst> (<dst>)
IEEE 802.1ad, ID: 100
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 0 (unexpected)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 200
Internet Protocol Version 4, Src: 192.168.4.10, Dst: 192.168.4.11
Internet Control Message Protocol
To avoid this undocumented and incorrect behavior, disable 802.1AD tag
insertion offload. Also, don't set CSVL bit. As per the data book,
when this bit is set, S-VLAN type (0x88A8) is inserted in the 13th and
14th bytes of transmitted packets and when this bit is reset, C-VLAN
type (0x8100) is inserted in the 13th and 14th bytes of transmitted
packets.
Fixes: 30d932279dc2 ("net: stmmac: Add support for VLAN Insertion Offload")
Fixes: e94e3f3b51ce ("net: stmmac: Add support for VLAN Insertion Offload in GMAC4+")
Fixes: 1d2c7a5fee31 ("net: stmmac: Refactor VLAN implementation")
Signed-off-by: Rohan G Thomas <rohan.g.thomas@altera.com>
Reviewed-by: Boon Khai Ng <boon.khai.ng@altera.com>
Link: https://patch.msgid.link/20251028-qbv-fixes-v4-1-26481c7634e3@altera.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
tls: Introduce and use RX async resync request cancel function
This series by Shahar introduces RX async resync request cancel function
in tls module, and uses it in mlx5e driver.
For a device-offloaded TLS RX connection, the TLS module increments
rcd_delta each time a new TLS record is received, tracking the distance
from the original resync request. In the meanwhile, the device is
queried and is expected to respond, asynchronously.
However, if the device response is delayed or fails (e.g. due to unstable
connection and device getting out of tracking, hardware errors, resource
exhaustion etc.), the TLS module keeps logging and incrementing
rcd_delta, which can lead to a WARN() when rcd_delta exceeds the
threshold.
This series improves this code area by canceling the resync request when
spotting an issue with the device response.
====================
Shahar Shitrit [Sun, 26 Oct 2025 20:03:03 +0000 (22:03 +0200)]
net/mlx5e: kTLS, Cancel RX async resync request in error flows
When device loses track of TLS records, it attempts to resync by
monitoring records and requests an asynchronous resynchronization
from software for this TLS connection.
The TLS module handles such device RX resync requests by logging record
headers and comparing them with the record tcp_sn when provided by the
device. It also increments rcd_delta to track how far the current
record tcp_sn is from the tcp_sn of the original resync request.
If the device later responds with a matching tcp_sn, the TLS module
approves the tcp_sn for resync.
However, the device response may be delayed or never arrive,
particularly due to traffic-related issues such as packet drops or
reordering. In such cases, the TLS module remains unaware that resync
will not complete, and continues performing unnecessary work by logging
headers and incrementing rcd_delta, which can eventually exceed the
threshold and trigger a WARN(). For example, this was observed when the
device got out of tracking, causing
mlx5e_ktls_handle_get_psv_completion() to fail and ultimately leading
to the rcd_delta warning.
To address this, call tls_offload_rx_resync_async_request_cancel()
to cancel the resync request and stop resync tracking in such error
cases. Also, increment the tls_resync_req_skip counter to track these
cancellations.
Shahar Shitrit [Sun, 26 Oct 2025 20:03:02 +0000 (22:03 +0200)]
net: tls: Cancel RX async resync request on rcd_delta overflow
When a netdev issues a RX async resync request for a TLS connection,
the TLS module handles it by logging record headers and attempting to
match them to the tcp_sn provided by the device. If a match is found,
the TLS module approves the tcp_sn for resynchronization.
While waiting for a device response, the TLS module also increments
rcd_delta each time a new TLS record is received, tracking the distance
from the original resync request.
However, if the device response is delayed or fails (e.g. due to
unstable connection and device getting out of tracking, hardware
errors, resource exhaustion etc.), the TLS module keeps logging and
incrementing, which can lead to a WARN() when rcd_delta exceeds the
threshold.
To address this, introduce tls_offload_rx_resync_async_request_cancel()
to explicitly cancel resync requests when a device response failure is
detected. Call this helper also as a final safeguard when rcd_delta
crosses its threshold, as reaching this point implies that earlier
cancellation did not occur.
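The interaction between rcd_delta, the threshold and the cancel helper can be modelled roughly as follows. This is a user-space sketch, not the TLS module's code: the class AsyncResyncModel, its method names and the small threshold used in the usage note are hypothetical and chosen only for illustration.

```python
class AsyncResyncModel:
    """Toy model of one RX async resync request and its record counter."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical limit for rcd_delta
        self.rcd_delta = 0
        self.active = False

    def request_start(self):
        self.rcd_delta = 0
        self.active = True

    def request_cancel(self):
        # Stop tracking: later records are no longer logged or counted.
        self.active = False

    def on_record(self):
        """Handle one received TLS record; return True if it was tracked."""
        if not self.active:
            return False
        if self.rcd_delta >= self.threshold:
            # Final safeguard: nobody cancelled earlier, so cancel now
            # instead of continuing to count.
            self.request_cancel()
            return False
        self.rcd_delta += 1
        return True
```

With a threshold of 3, the first three records after request_start() are tracked, the fourth triggers the safeguard cancel, and later records are ignored.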
Shahar Shitrit [Sun, 26 Oct 2025 20:03:01 +0000 (22:03 +0200)]
net: tls: Change async resync helpers argument
Update tls_offload_rx_resync_async_request_start() and
tls_offload_rx_resync_async_request_end() to get a struct
tls_offload_resync_async parameter directly, rather than
extracting it from struct sock.
This change aligns the function signatures with the upcoming
tls_offload_rx_resync_async_request_cancel() helper, which
will be introduced in a subsequent patch.
Jakub Kicinski [Thu, 30 Oct 2025 01:25:12 +0000 (18:25 -0700)]
Merge tag 'nf-25-10-29' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says:
====================
netfilter: updates for net
1) It's not possible to attach conntrack labels via ctnetlink
unless one creates a dummy 'ct labels set' rule in nftables.
This is an oversight, the 'ruleset tests presence, userspace
(netlink) sets' use-case is valid and should 'just work'.
Always broken since this got added in Linux 4.7.
2) nft_connlimit reads count value without holding the relevant
lock, add a READ_ONCE annotation. From Fernando Fernandez Mancera.
3) There is a long-standing bug (since 4.12) in nftables helper infra
when NAT is in use: if the helper gets assigned after the nat binding
was set up, we fail to initialise the 'seqadj' extension, which is
needed in case NAT payload rewrites need to add (or remove) from the
packet payload. Fix from Andrii Melnychenko.
* tag 'nf-25-10-29' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_ct: add seqadj extension for natted connections
netfilter: nft_connlimit: fix possible data race on connection count
netfilter: nft_ct: enable labels for get case too
====================
Thanh Quan [Mon, 27 Oct 2025 14:02:43 +0000 (15:02 +0100)]
net: phy: dp83869: fix STRAP_OPMODE bitmask
According to the TI DP83869HM datasheet Revision D (June 2025), section
7.6.1.41 STRAP_STS Register, the STRAP_OPMODE bitmask is bit [11:9].
Fix this.
In case the PHY is auto-detected via PHY ID registers, or not described
in DT, or, in case the PHY is described in DT but the optional DT property
"ti,op-mode" is not present, then the driver reads out the PHY functional
mode (RGMII, SGMII, ...) from hardware straps.
Currently, all upstream users of this PHY specify both DT compatible string
"ethernet-phy-id2000.a0f1" and ti,op-mode = <DP83869_RGMII_COPPER_ETHERNET>
property, therefore it seems no upstream users are affected by this bug.
The driver currently interprets bits [2:0] of STRAP_STS register as PHY
functional mode. Those bits are controlled by ANEG_DIS, ANEGSEL_0 straps
and an always-zero reserved bit. Systems that use RGMII-to-Copper functional
mode are unlikely to disable auto-negotiation via ANEG_DIS strap, or change
auto-negotiation behavior via ANEGSEL_0 strap. Therefore, even with this bug
in place, the STRAP_STS register content is likely going to be interpreted
by the driver as RGMII-to-Copper mode.
However, for a system with PHY functional mode strapping set to other mode
than RGMII-to-Copper, the driver is likely to misinterpret the strapping
as RGMII-to-Copper and misconfigure the PHY.
For example, on a system with SGMII-to-Copper strapping, the STRAP_STS
register reads as 0x0c20, but the PHY ends up being configured for
incompatible RGMII-to-Copper mode.
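The effect of the two bitmasks on the STRAP_STS value quoted above can be checked with a short Python sketch. The helpers genmask() and field_get() are stand-ins written for this example, and reading opmode 0 as RGMII-to-Copper and 6 as SGMII-to-Copper is an assumption taken from the description rather than verified against the datasheet.

```python
def genmask(high, low):
    """Build a mask with bits high..low set (inclusive)."""
    return ((1 << (high - low + 1)) - 1) << low


def field_get(mask, value):
    """Extract the field selected by mask, shifted down to bit 0."""
    shift = (mask & -mask).bit_length() - 1  # index of the lowest set bit
    return (value & mask) >> shift


STRAP_OPMODE_OLD = genmask(2, 0)     # what the driver used before the fix
STRAP_OPMODE_FIXED = genmask(11, 9)  # bits [11:9] per the datasheet

STRAP_STS = 0x0C20  # SGMII-to-Copper strapping example from the text

old_opmode = field_get(STRAP_OPMODE_OLD, STRAP_STS)
new_opmode = field_get(STRAP_OPMODE_FIXED, STRAP_STS)
print(old_opmode, new_opmode)
```

The old mask yields 0 for this register value, which matches the misdetection as RGMII-to-Copper described above, while bits [11:9] yield 6.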
Fixes: 0eaf8ccf2047 ("net: phy: dp83869: Set opmode from straps")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Thanh Quan <thanh.quan.xn@renesas.com>
Signed-off-by: Hai Pham <hai.pham.ud@renesas.com>
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> # Port from U-Boot to Linux
Link: https://patch.msgid.link/20251027140320.8996-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Po-Hsu Lin [Mon, 27 Oct 2025 09:57:10 +0000 (17:57 +0800)]
selftests: net: use BASH for bareudp testing
bareudp.sh uses /bin/sh, and it sources the lib.sh BASH script at the
very beginning.
But on some operating systems like Ubuntu, /bin/sh is actually pointed to
DASH, thus it will try to run BASH commands with DASH and consequently
leads to syntax issues:
# ./bareudp.sh: 4: ./lib.sh: Bad substitution
# ./bareudp.sh: 5: ./lib.sh: source: not found
# ./bareudp.sh: 24: ./lib.sh: Syntax error: "(" unexpected
Fix this by explicitly using BASH for bareudp.sh. This fixes test
execution failures on systems where /bin/sh is not BASH.
Jinliang Wang [Mon, 27 Oct 2025 06:55:30 +0000 (23:55 -0700)]
net: mctp: Fix tx queue stall
The tx queue can become permanently stuck in a stopped state due to a
race condition between the URB submission path and its completion
callback.
The URB completion callback can run immediately after usb_submit_urb()
returns, before the submitting function calls netif_stop_queue(). If
this occurs, the queue state management becomes desynchronized, leading
to a stall where the queue is never woken.
Fix this by moving the netif_stop_queue() call to before submitting the
URB. This closes the race window by ensuring the network stack is aware
the queue is stopped before the URB completion can possibly run.
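The ordering problem can be reproduced in a small Python model where the completion callback runs before the submit call returns. All names here (QueueModel, xmit_stop_after_submit, xmit_stop_before_submit, submit_completing_immediately) are hypothetical; the model only captures the stop/wake ordering, not the USB or netdev APIs.

```python
class QueueModel:
    """Minimal stand-in for a tx queue with stop/wake state."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True

    def wake(self):
        self.stopped = False


def submit_completing_immediately(on_complete):
    # Worst case for the race: the completion runs before submit returns.
    on_complete()


def xmit_stop_after_submit(queue, submit):
    """Old ordering: the wake from the completion can be lost."""
    submit(on_complete=queue.wake)
    queue.stop()


def xmit_stop_before_submit(queue, submit):
    """Fixed ordering: the queue is already stopped when the URB completes."""
    queue.stop()
    submit(on_complete=queue.wake)
```

With the old ordering the queue ends up stopped with no completion left to wake it; with the fixed ordering the completion wakes it as intended.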
Fixes: 0791c0327a6e ("net: mctp: Add MCTP USB transport driver")
Signed-off-by: Jinliang Wang <jinliangw@google.com>
Acked-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://patch.msgid.link/20251027065530.2045724-1-jinliangw@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cosmin Ratiu [Sun, 26 Oct 2025 20:20:19 +0000 (22:20 +0200)]
net/mlx5: Don't zero user_count when destroying FDB tables
esw->user_count tracks how many TC rules are added on an esw via
mlx5e_configure_flower -> mlx5_esw_get -> atomic64_inc(&esw->user_count)
esw.user_count was unconditionally set to 0 in
esw_destroy_legacy_fdb_table and esw_destroy_offloads_fdb_tables.
These two together can lead to the following sequence of events:
1. echo 1 > /sys/class/net/eth2/device/sriov_numvfs
- mlx5_core_sriov_configure -...-> esw_create_legacy_table ->
atomic64_set(&esw->user_count, 0)
2. tc qdisc add dev eth2 ingress && \
tc filter replace dev eth2 pref 1 protocol ip chain 0 ingress \
handle 1 flower action ct nat zone 64000 pipe
- mlx5e_configure_flower -> mlx5_esw_get ->
atomic64_inc(&esw->user_count)
3. echo 0 > /sys/class/net/eth2/device/sriov_numvfs
- mlx5_core_sriov_configure -..-> esw_destroy_legacy_fdb_table
-> atomic64_set(&esw->user_count, 0)
4. devlink dev eswitch set pci/0000:08:00.0 mode switchdev
- mlx5_devlink_eswitch_mode_set -> mlx5_esw_try_lock ->
atomic64_read(&esw->user_count) == 0
- then proceed to a WARN_ON in:
esw_offloads_start -> mlx5_eswitch_enable_locked -> esw_offloads_enable
-> mlx5_esw_offloads_rep_load -> mlx5e_vport_rep_load ->
mlx5e_netdev_change_profile -> mlx5e_detach_netdev ->
mlx5e_cleanup_nic_rx -> mlx5e_tc_nic_cleanup ->
mlx5e_mod_hdr_tbl_destroy
Fix this by not clearing out the user_count when destroying FDB tables,
so that the check in mlx5_esw_try_lock can prevent the mode change when
there are TC rules configured, as originally intended.
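The sequence above boils down to a counter that is reset behind the back of its users. A minimal Python model, with hypothetical names (EswModel, add_tc_rule, destroy_fdb_table, try_lock) that only mirror the roles of the kernel functions mentioned in the text:

```python
class EswModel:
    """Toy model of esw->user_count and the mode-change check."""

    def __init__(self):
        self.user_count = 0

    def add_tc_rule(self):
        # Mirrors mlx5_esw_get(): one more user of the eswitch.
        self.user_count += 1

    def destroy_fdb_table(self, zero_user_count):
        # Old behaviour cleared the counter here; the fix leaves it alone.
        if zero_user_count:
            self.user_count = 0

    def try_lock(self):
        # Mode change is allowed only when no TC rules hold a reference.
        return self.user_count == 0
```

With zero_user_count=True the mode change is wrongly allowed while a TC rule still exists; with zero_user_count=False the check refuses it.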
Miaoqian Lin [Sun, 26 Oct 2025 16:43:16 +0000 (00:43 +0800)]
net: usb: asix_devices: Check return value of usbnet_get_endpoints
The code did not check the return value of usbnet_get_endpoints().
Add checks and propagate the error to the caller on failure.
Found via static analysis; this is similar to
commit 07161b2416f7 ("sr9800: Add check for usbnet_get_endpoints").
Fixes: 933a27d39e0e ("USB: asix - Add AX88178 support and many other changes")
Fixes: 2e55cc7210fe ("[PATCH] USB: usbnet (3/9) module for ASIX Ethernet adapters")
Cc: stable@vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://patch.msgid.link/20251026164318.57624-1-linmq006@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 30 Oct 2025 00:44:30 +0000 (17:44 -0700)]
Merge branch 'mptcp-various-rare-sending-issues'
Matthieu Baerts says:
====================
mptcp: various rare sending issues
Here are various fixes from Paolo, addressing very occasional issues on
the sending side:
- Patch 1: drop an optimisation that could lead to timeout in case of
race conditions. A fix for up to v5.11.
- Patch 2: fix stream corruption under very specific conditions.
A fix for up to v5.13.
- Patch 3: restore MPTCP-level zero window probe after a recent fix.
A fix for up to v5.16.
- Patch 4: new MIB counter to track MPTCP-level zero window probes, to
help catch issues similar to the one fixed by the previous patch.
====================
Paolo Abeni [Tue, 28 Oct 2025 08:16:54 +0000 (09:16 +0100)]
mptcp: restore window probe
Since commit 72377ab2d671 ("mptcp: more conservative check for zero
probes") the MPTCP-level zero window probe check is always disabled, as
the TCP-level write queue always contains at least the newly allocated
skb.
Refine the relevant check, taking into account the above condition
and that such an skb can have zero length.
Fixes: 72377ab2d671 ("mptcp: more conservative check for zero probes")
Cc: stable@vger.kernel.org
Reported-by: Geliang Tang <geliang@kernel.org>
Closes: https://lore.kernel.org/d0a814c364e744ca6b836ccd5b6e9146882e8d42.camel@kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251028-net-mptcp-send-timeout-v1-3-38ffff5a9ec8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Tue, 28 Oct 2025 08:16:53 +0000 (09:16 +0100)]
mptcp: fix MSG_PEEK stream corruption
If a MSG_PEEK | MSG_WAITALL read operation consumes all the bytes in the
receive queue and recvmsg() needs to wait for more data - i.e. it's a
blocking one - upon arrival of the next packet the MPTCP protocol will
start again copying the oldest data present in the receive queue,
corrupting the data stream.
Address the issue explicitly tracking the peeked sequence number,
restarting from the last peeked byte.
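The corruption and its fix can be illustrated with a toy receive queue in Python. The functions peek_waitall_restarting() and peek_waitall_tracking() are hypothetical models of a blocking MSG_PEEK | MSG_WAITALL read across several packet arrivals; they do not reflect the actual MPTCP data structures.

```python
def peek_waitall_restarting(arrivals, want):
    """Buggy model: after each wait, copying restarts at the oldest byte."""
    queue = b""
    out = b""
    for chunk in arrivals:
        queue += chunk  # peeking never removes data from the queue
        out += queue[: want - len(out)]
        if len(out) >= want:
            break
    return out


def peek_waitall_tracking(arrivals, want):
    """Fixed model: remember how far we peeked and resume from there."""
    queue = b""
    out = b""
    peeked = 0  # offset of the first byte not yet copied
    for chunk in arrivals:
        queue += chunk
        taken = queue[peeked : peeked + want - len(out)]
        out += taken
        peeked += len(taken)
        if len(out) >= want:
            break
    return out
```

For arrivals b"abc" then b"def" and a 6-byte read, the restarting variant returns b"abcabc", while the tracking variant returns b"abcdef".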
Paolo Abeni [Tue, 28 Oct 2025 08:16:52 +0000 (09:16 +0100)]
mptcp: drop bogus optimization in __mptcp_check_push()
Accessing the transmit queue without owning the msk socket lock is
inherently racy, hence __mptcp_check_push() could actually quit early
even when there is pending data.
That in turn could cause unexpected tx lock and timeout.
Dropping the early check avoids the race, implicitly relying on later
tests under the relevant lock. With such change, all the other
mptcp_send_head() call sites are now under the msk socket lock and we
can additionally drop the now unneeded annotation on the transmit head
pointer accesses.
Fixes: 6e628cd3a8f7 ("mptcp: use mptcp release_cb for delayed tasks")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Tested-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251028-net-mptcp-send-timeout-v1-1-38ffff5a9ec8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
netconsole: Fix race condition in between reader and writer of userdata
The update_userdata() function constructs the complete userdata string
in nt->extradata_complete and updates nt->userdata_length. This data
is then read by write_msg() and write_ext_msg() when sending netconsole
messages. However, update_userdata() was not holding target_list_lock
during this process, allowing concurrent message transmission to read
partially updated userdata.
This race condition could result in netconsole messages containing
incomplete or inconsistent userdata - for example, reading the old
userdata_length with new extradata_complete content, or vice versa,
leading to truncated or corrupted output.
Fix this by acquiring target_list_lock with spin_lock_irqsave() before
updating extradata_complete and userdata_length, and releasing it after
both fields are fully updated. This ensures that readers see a
consistent view of the userdata, preventing corruption during concurrent
access.
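The locking rule can be sketched in Python, with threading.Lock standing in for target_list_lock taken via spin_lock_irqsave(). The class TargetModel, the read_userdata() helper and the " key=value" line layout are assumptions made for this example; only the field names extradata_complete and userdata_length come from the description.

```python
import threading


class TargetModel:
    """Toy netconsole target whose two userdata fields change together."""

    def __init__(self):
        self.lock = threading.Lock()  # stand-in for target_list_lock
        self.extradata_complete = ""
        self.userdata_length = 0

    def update_userdata(self, entries):
        # Writer: both fields are updated inside one critical section,
        # so a reader never sees a new string with an old length.
        with self.lock:
            self.extradata_complete = "".join(
                f" {key}={value}\n" for key, value in entries.items()
            )
            self.userdata_length = len(self.extradata_complete)

    def read_userdata(self):
        # Reader (models the message transmission path): same lock.
        with self.lock:
            return self.extradata_complete[: self.userdata_length]
```

Because writer and reader share one lock, read_userdata() always returns a string whose length matches userdata_length.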
The fix aligns with the existing locking pattern used throughout the
netconsole code, where target_list_lock protects access to target
fields including buf[] and msgcounter that are accessed during message
transmission.
Also get rid of the unnecessary variable complete_idx, which makes it
easier to bail out of update_userdata().
Bagas Sanjaya [Tue, 28 Oct 2025 13:20:27 +0000 (20:20 +0700)]
Documentation: netconsole: Remove obsolete contact people
Breno Leitao has been listed in MAINTAINERS as netconsole maintainer
since 7c938e438c56db ("MAINTAINERS: make Breno the netconsole
maintainer"), but the documentation still says that bug reports
should be sent to the original netconsole authors.
Remove the obsolete contact info.
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20251028132027.48102-1-bagasdotme@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Abdun Nihaal [Tue, 28 Oct 2025 16:08:41 +0000 (21:38 +0530)]
nfp: xsk: fix memory leak in nfp_net_alloc()
In nfp_net_alloc(), the memory allocated for xsk_pools is not freed in
the subsequent error paths, leading to a memory leak. Fix that by
freeing it in the error path.
Jakub Kicinski [Thu, 30 Oct 2025 00:30:45 +0000 (17:30 -0700)]
Merge branch 'tcp-fix-receive-autotune-again'
Matthieu Baerts says:
====================
tcp: fix receive autotune again
Neal Cardwell found that recent kernels were having RWIN limited
issues, even when net.ipv4.tcp_rmem[2] was set to a very big value like
512MB.
He suspected that tcp_stream default buffer size (64KB) was triggering
heuristic added in ea33537d8292 ("tcp: add receive queue awareness
in tcp_rcv_space_adjust()").
After more testing, it turns out the bug was added earlier
with commit 65c5287892e9 ("tcp: fix sk_rcvbuf overshoot").
I forgot once again that DRS has one RTT latency.
MPTCP also got the same issue.
This series:
- Prevents calling tcp_rcvbuf_grow() on some MPTCP subflows.
- adds rcv_ssthresh, window_clamp and rcv_wnd to trace_tcp_rcvbuf_grow().
- Refactors code in a patch with no functional changes.
- Fixes the issue in the final patch.
====================
Eric Dumazet [Tue, 28 Oct 2025 11:58:01 +0000 (12:58 +0100)]
tcp: add newval parameter to tcp_rcvbuf_grow()
This patch has no functional change, and prepares the following one.
tcp_rcvbuf_grow() will need to have access to tp->rcvq_space.space
old and new values.
Change mptcp_rcvbuf_grow() in a similar way.
Signed-off-by: Eric Dumazet <edumazet@google.com>
[ Moved 'oldval' declaration to the next patch to avoid warnings at
  build time. ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20251028-net-tcp-recv-autotune-v3-3-74b43ba4c84c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Tue, 28 Oct 2025 11:57:59 +0000 (12:57 +0100)]
mptcp: fix subflow rcvbuf adjust
The mptcp PM can add subflow to the conn_list before tcp_init_transfer().
Calling tcp_rcvbuf_grow() on such subflow is not correct as later
init will overwrite the update.
Fix the issue by calling tcp_rcvbuf_grow() only after the initial
buffer initialization.
For ice, Grzegorz fixes setting of PHY lane number and logical PF ID for
E82x devices. He also corrects access of CGU (Clock Generation Unit) on
dual complex devices.
Kohei Enju resolves issues with error path cleanup for probe when in
recovery mode on ixgbe and ensures PHY is powered on for link testing
on igc. Lastly, he converts incorrect use of -ENOTSUPP to -EOPNOTSUPP
on igb, igc, and ixgbe.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ixgbe: use EOPNOTSUPP instead of ENOTSUPP in ixgbe_ptp_feature_enable()
igc: use EOPNOTSUPP instead of ENOTSUPP in igc_ethtool_get_sset_count()
igb: use EOPNOTSUPP instead of ENOTSUPP in igb_get_sset_count()
igc: power up the PHY before the link test
ixgbe: fix memory leak and use-after-free in ixgbe_recovery_probe()
ice: fix usage of logical PF id
ice: fix destination CGU for dual complex E825
ice: fix lane number calculation
====================
netfilter: nft_ct: add seqadj extension for natted connections
Sequence adjustment may be required for FTP traffic with PASV/EPSV modes,
due to the need to rewrite packet payload (IP, port) on the ftp control
connection. This can require changes to the TCP length and expected
seq / ack_seq.
The easiest way to reproduce this issue is with PASV mode.
Example ruleset:
table inet ftp_nat {
ct helper ftp_helper {
type "ftp" protocol tcp
l3proto inet
}
chain prerouting {
type filter hook prerouting priority 0; policy accept;
tcp dport 21 ct state new ct helper set "ftp_helper"
}
}
table ip nat {
chain prerouting {
type nat hook prerouting priority -100; policy accept;
tcp dport 21 dnat ip prefix to ip daddr map {
192.168.100.1 : 192.168.13.2/32 }
}
chain postrouting {
type nat hook postrouting priority 100 ; policy accept;
tcp sport 21 snat ip prefix to ip saddr map {
192.168.13.2 : 192.168.100.1/32 }
}
}
Note that the ftp helper gets assigned *after* the dnat setup.
The inverse (nat after helper assign) is handled by an existing
check in nf_nat_setup_info() and will not show the problem.
ftp nat changes do not work as expected in this case:
Connected to 192.168.100.1.
[..]
ftp> epsv
EPSV/EPRT on IPv4 off.
ftp> ls
227 Entering passive mode (192,168,100,1,209,129).
421 Service not available, remote server has closed connection.
netfilter: nft_connlimit: fix possible data race on connection count
nft_connlimit_eval() reads priv->list->count to check if the connection
limit has been exceeded. This value is being read without a lock and can
be modified by a different process. Use READ_ONCE() for correctness.
Fixes: df4a90250976 ("netfilter: nf_conncount: merge lookup and add functions")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Florian Westphal [Wed, 22 Oct 2025 15:18:10 +0000 (17:18 +0200)]
netfilter: nft_ct: enable labels for get case too
conntrack labels can only be set when the conntrack has been created
with the "ctlabel" extension.
For older iptables (connlabel match), adding an "-m connlabel" rule
turns on the ctlabel extension allocation for all future conntrack
entries.
For nftables, it's only enabled for 'ct label set foo', but not for
'ct label foo' (i.e. check).
But users could have a ruleset that only checks for presence, and rely
on userspace to set a label bit via ctnetlink infrastructure.
This doesn't work without adding a dummy 'ct label set' rule.
We could also enable extension infra for the first (failing) ctnetlink
request, but unlike ruleset we would not be able to disable the
extension again.
Therefore turn on ctlabel extension allocation if an nftables ruleset
checks for a connlabel too.
Fixes: 1ad8f48df6f6 ("netfilter: nftables: add connlabel set support")
Reported-by: Antonio Ojea <aojea@google.com>
Closes: https://lore.kernel.org/netfilter-devel/aPi_VdZpVjWujZ29@strlen.de/
Signed-off-by: Florian Westphal <fw@strlen.de>
====================
bug fixes for the hibmcge ethernet driver
This patch set is intended to fix several issues for hibmcge driver:
1. Patch1 fixes the issue where buf avl irq is disabled after irq_handle.
2. Patch2 eliminates the error logs in scenarios without phy.
3. Patch3 fixes the issue where the network port becomes unusable
after a PCIe RAS event.
====================
Jijie Shao [Sat, 25 Oct 2025 01:46:42 +0000 (09:46 +0800)]
net: hibmcge: fix the inappropriate netif_device_detach()
Currently, the driver calls netif_device_detach() in
pci_error_handlers.error_detected() and does the reset in
pci_error_handlers.slot_reset().
However, if pci_error_handlers.slot_reset() is not called
after pci_error_handlers.error_detected(),
the device stays detached and is unable to recover.
drivers/pci/pcie/err.c/report_error_detected() says:
If any device in the subtree does not have an error_detected
callback, PCI_ERS_RESULT_NO_AER_DRIVER prevents subsequent
error callbacks of any device in the subtree, and will
exit in the disconnected error state.
Therefore, when the hibmcge device and other devices that do not
support the error_detected callback are under the same subtree,
hibmcge will be unable to do slot_reset even for non-fatal errors.
This patch moves netif_device_detach() from error_detected() to slot_reset(),
ensuring that detach and reset are always executed together.
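The difference between the two placements of the detach can be shown with a tiny state model in Python. NetdevModel and its flag are hypothetical, and the re-attach at the end of slot_reset() is an assumption about the recovery path rather than something stated above.

```python
class NetdevModel:
    """Toy model of the netdev attach state across PCI error callbacks."""

    def __init__(self, detach_in_error_detected):
        self.detach_in_error_detected = detach_in_error_detected
        self.attached = True

    def error_detected(self):
        # Old placement: detach here, relying on a later slot_reset().
        if self.detach_in_error_detected:
            self.attached = False

    def slot_reset(self):
        # New placement: detach and reset happen in the same callback.
        self.attached = False
        # ... the device reset would happen here ...
        self.attached = True  # assumption: re-attach after a good reset
```

If only error_detected() runs, the old placement leaves the device detached for good, while the new placement leaves it usable.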
Jijie Shao [Sat, 25 Oct 2025 01:46:41 +0000 (09:46 +0800)]
net: hibmcge: remove unnecessary check for np_link_fail in scenarios without phy.
hibmcge driver uses fixed_phy to configure scenarios without PHY,
where the driver is always in a linked state. However,
there might be no link in hardware, so the np_link error
is detected in hbg_hw_adjust_link(), which can cause abnormal logs.
Therefore, in scenarios without a PHY, the driver no longer
checks the np_link status.
Jijie Shao [Sat, 25 Oct 2025 01:46:40 +0000 (09:46 +0800)]
net: hibmcge: fix rx buf avl irq is not re-enabled in irq_handle issue
irq initialized with the macro HBG_ERR_IRQ_I will automatically
be re-enabled, whereas those initialized with the macro HBG_IRQ_I
will not be re-enabled.
Since the rx buf avl irq is initialized using the macro HBG_IRQ_I,
it needs to be actively re-enabled;
otherwise priv->stats.rx_fifo_less_empty_thrsld_cnt cannot be
correctly incremented.
Fixes: fd394a334b1c ("net: hibmcge: Add support for abnormal irq handling feature")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251025014642.265259-2-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ivan Vecera [Mon, 27 Oct 2025 14:09:12 +0000 (15:09 +0100)]
dpll: zl3073x: Fix output pin registration
Currently, the signal format of an associated output is not considered
during output pin registration. As a result, the driver registers output
pins that are disabled by the signal format configuration.
Fix this by calling zl3073x_output_pin_is_enabled() to check whether
a given output pin should be registered or not.
Pavel Zhigulin [Fri, 24 Oct 2025 16:13:02 +0000 (19:13 +0300)]
net: cxgb4/ch_ipsec: fix potential use-after-free in ch_ipsec_xfrm_add_state() callback
In ch_ipsec_xfrm_add_state() there is no check of the try_module_get()
return value. It is very unlikely, but try_module_get() could return
false, which could cause a use-after-free error.
Conditions: The module count must be zero, and a module unload in
progress. The thread doing the unload is blocked somewhere.
Another thread makes a callback into the module for some request
that (for instance) would need to create a kernel thread.
It tries to get a reference for the thread.
So try_module_get(THIS_MODULE) is the right call - and will fail here.
This fix adds a check of the try_module_get() result.
Petr Oros [Fri, 24 Oct 2025 19:07:33 +0000 (21:07 +0200)]
dpll: fix device-id-get and pin-id-get to return errors properly
The device-id-get and pin-id-get handlers were ignoring errors from
the find functions and sending empty replies instead of returning
error codes to userspace.
When dpll_device_find_from_nlattr() or dpll_pin_find_from_nlattr()
returned an error (e.g., -EINVAL for "multiple matches" or -ENODEV
for "not found"), the handlers checked `if (!IS_ERR(ptr))` and
skipped adding the device/pin handle to the message, but then still
sent the empty message as a successful reply.
This caused userspace tools to receive empty responses with id=0
instead of proper netlink errors with extack messages like
"multiple matches".
The bug is visible via strace, which shows the kernel sending TWO
netlink messages in response to a single request:
1. Empty reply (20 bytes, just header, no attributes):
recvfrom(3, [{nlmsg_len=20, nlmsg_type=dpll, nlmsg_flags=0, ...},
{cmd=0x7, version=1}], ...)
2. NLMSG_ERROR ACK with extack (because of NLM_F_ACK flag):
recvfrom(3, [{nlmsg_len=60, nlmsg_type=NLMSG_ERROR,
nlmsg_flags=NLM_F_CAPPED|NLM_F_ACK_TLVS, ...},
[{error=0, msg={...}},
[{nla_type=NLMSGERR_ATTR_MSG}, "multiple matches"]]], ...)
The C YNL library parses the first message, sees an empty response,
and creates a result object with calloc() which zero-initializes all
fields, resulting in id=0.
The Python YNL library parses both messages and displays the extack
from the second NLMSG_ERROR message.
Fix by checking `if (IS_ERR(ptr))` first and returning the error
code immediately, so that netlink properly sends only NLMSG_ERROR with
the extack message to userspace. After this fix, both C and Python
YNL tools receive only the NLMSG_ERROR and behave consistently.
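The control-flow change can be modelled in Python, using negative integers in place of ERR_PTR values. The functions find_pin(), id_get_ignoring_errors() and id_get_returning_errors() and the dict-based pin records are hypothetical; they only mirror the "check IS_ERR first" logic described above.

```python
import errno


def find_pin(pins, module_name):
    """Return the single matching pin, or a negative errno on failure."""
    matches = [pin for pin in pins if pin["module"] == module_name]
    if not matches:
        return -errno.ENODEV  # "not found"
    if len(matches) > 1:
        return -errno.EINVAL  # "multiple matches"
    return matches[0]


def id_get_ignoring_errors(pins, module_name):
    """Old flow: on error the id is skipped, but a reply is still sent."""
    reply = {}
    found = find_pin(pins, module_name)
    if not isinstance(found, int):
        reply["id"] = found["id"]
    return 0, reply


def id_get_returning_errors(pins, module_name):
    """Fixed flow: return the error first, so no empty reply is built."""
    found = find_pin(pins, module_name)
    if isinstance(found, int):
        return found, None
    return 0, {"id": found["id"]}
```

For two pins sharing the same module name, the old flow reports success with an empty reply, while the fixed flow returns -EINVAL and no reply.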
This affects:
- DPLL_CMD_DEVICE_ID_GET: now properly returns error when multiple
devices match the criteria (e.g., same module-name + clock-id)
- DPLL_CMD_PIN_ID_GET: now properly returns error when multiple pins
match the criteria (e.g., same module-name)
Before fix:
$ dpll pin id-get module-name ice
0 (wrong - should be error, there are 17 pins with module-name "ice")
After fix:
$ dpll pin id-get module-name ice
Error: multiple matches
(correct - kernel reports the ambiguity via extack)
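The control flow of the fix can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code: ERR_PTR(), PTR_ERR() and IS_ERR() are re-implemented locally to mimic the kernel helpers, and find_pin(), pin_id_get_buggy() and pin_id_get_fixed() are hypothetical stand-ins for the find function and the handler.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local re-implementations that mimic the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct pin { unsigned int id; };

/* Hypothetical lookup: exactly one match is required. */
struct pin *find_pin(struct pin *pins, size_t matches)
{
	if (matches == 0)
		return ERR_PTR(-ENODEV);	/* "not found" */
	if (matches > 1)
		return ERR_PTR(-EINVAL);	/* "multiple matches" */
	return &pins[0];
}

/* Old pattern: the error is swallowed and an empty reply (id 0) goes out. */
int pin_id_get_buggy(struct pin *pins, size_t matches, unsigned int *id)
{
	struct pin *pin = find_pin(pins, matches);

	*id = 0;
	if (!IS_ERR(pin))
		*id = pin->id;
	return 0;	/* reported as success even when the lookup failed */
}

/* Fixed pattern: return the error before building any reply. */
int pin_id_get_fixed(struct pin *pins, size_t matches, unsigned int *id)
{
	struct pin *pin = find_pin(pins, matches);

	if (IS_ERR(pin))
		return (int)PTR_ERR(pin);
	*id = pin->id;
	return 0;
}
```

With two matching pins the buggy variant returns 0 with id=0, while the fixed variant returns -EINVAL and leaves the caller's id untouched.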
Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Link: https://patch.msgid.link/20251024190733.364101-1-poros@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kohei Enju [Mon, 6 Oct 2025 12:35:22 +0000 (21:35 +0900)]
igc: use EOPNOTSUPP instead of ENOTSUPP in igc_ethtool_get_sset_count()
igc_ethtool_get_sset_count() returns -ENOTSUPP when a given stringset is
not supported, causing userland programs to get "Unknown error 524".
Since EOPNOTSUPP should be used when an error is propagated to userland,
return -EOPNOTSUPP instead of -ENOTSUPP.
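A minimal user-space sketch of the pattern, assuming a hypothetical sketch_get_sset_count() with made-up stringset names (the real driver's stringsets and counts are not reproduced here). The value 524 is taken from the "Unknown error 524" message above; user-space libc has no name for it, which is why the error string is unhelpful.

```c
#include <errno.h>

/* Kernel-internal error value, per the "Unknown error 524" message. */
#define SKETCH_ENOTSUPP 524

/* Hypothetical stringset identifiers, for illustration only. */
enum sketch_stringset { SKETCH_SS_TEST, SKETCH_SS_OTHER };

/* Return the number of strings in a set, or a negative errno value. */
int sketch_get_sset_count(enum sketch_stringset sset)
{
	switch (sset) {
	case SKETCH_SS_TEST:
		return 5;		/* illustrative count */
	default:
		/* Visible to userland, so use the errno libc knows about. */
		return -EOPNOTSUPP;
	}
}
```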
Fixes: 36b9fea60961 ("igc: Add support for statistics") Signed-off-by: Kohei Enju <enjuk@amazon.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The current implementation of the igc driver doesn't power up the PHY
before the link test in igc_ethtool_diag_test(), causing the link test
to always report FAIL when admin state is down and the PHY is
consequently powered down.
To test the link state regardless of admin state, power up the PHY
before the link test in the offline test path. After the link test, the
original PHY state is restored by igc_reset(), so additional code which
explicitly restores the original state is not necessary.
Note that this change is applied only for the offline test path. This is
because in the online path we shouldn't interrupt normal networking
operation and powering up the PHY and restoring the original state would
interrupt that.
This implementation also uses igc_power_up_phy_copper() without checking
the media type, since igc devices are currently only copper devices and
the function is called in other places without checking the media type.
Furthermore, the powering up is on a best-effort basis, that is, we
don't handle failures of powering up (e.g. bus error) and just let the
test report FAIL.
Tested on Intel Corporation Ethernet Controller I226-V (rev 04) with
cable connected and link available.
Set device down and do ethtool test.
# ip link set dev enp0s5 down
Without patch:
# ethtool --test enp0s5
The test result is FAIL
The test extra info:
Register test (offline) 0
Eeprom test (offline) 0
Interrupt test (offline) 0
Loopback test (offline) 0
Link test (on/offline) 1
With patch:
# ethtool --test enp0s5
The test result is PASS
The test extra info:
Register test (offline) 0
Eeprom test (offline) 0
Interrupt test (offline) 0
Loopback test (offline) 0
Link test (on/offline) 0
Fixes: f026d8ca2904 ("igc: add support to eeprom, registers and link self-tests") Signed-off-by: Kohei Enju <enjuk@amazon.com> Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Kohei Enju [Sun, 31 Aug 2025 20:33:11 +0000 (05:33 +0900)]
ixgbe: fix memory leak and use-after-free in ixgbe_recovery_probe()
The error path of ixgbe_recovery_probe() has two memory bugs.
For non-E610 adapters, the function jumps to clean_up_probe without
calling devlink_free(), leaking the devlink instance and its embedded
adapter structure.
For E610 adapters, devlink_free() is called at shutdown_aci, but
clean_up_probe then accesses adapter->state, sometimes triggering
use-after-free because adapter is embedded in devlink. This UAF is
similar to the one recently reported in ixgbe_remove(). (Link)
Fix both issues by moving devlink_free() after adapter->state access,
aligning with the cleanup order in ixgbe_probe().
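The ordering constraint can be sketched in user-space C. This is a simplified model, not the driver code: struct sketch_devlink and struct sketch_adapter are hypothetical, free() stands in for devlink_free(), and the only point shown is that the embedded adapter must be read before its container is released.

```c
#include <stdlib.h>

/* Hypothetical layout: the adapter lives inside the devlink allocation. */
struct sketch_adapter { unsigned long state; };
struct sketch_devlink {
	int instance;
	struct sketch_adapter priv;	/* embedded, freed with the container */
};

struct sketch_devlink *sketch_devlink_alloc(unsigned long state)
{
	struct sketch_devlink *dl = calloc(1, sizeof(*dl));

	if (dl)
		dl->priv.state = state;
	return dl;
}

/* Fixed cleanup order: last adapter access first, then free the container. */
unsigned long sketch_cleanup(struct sketch_devlink *dl)
{
	struct sketch_adapter *adapter = &dl->priv;
	unsigned long state = adapter->state;	/* still valid here */

	free(dl);	/* stands in for devlink_free(); adapter is now gone */
	return state;
}
```

Freeing first and reading adapter->state afterwards would touch released memory, which is the use-after-free described above.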
Link: https://lore.kernel.org/intel-wired-lan/20250828020558.1450422-1-den@valinux.co.jp/ Fixes: 29cb3b8d95c7 ("ixgbe: add E610 implementation of FW recovery mode") Signed-off-by: Kohei Enju <enjuk@amazon.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
When distributing RSS and FDIR masks, which are global resources across
the active devices, it is required to have a contiguous PF id, which can
be described as a logical PF id. In the case above, function 0 would
have a logical PF id of 0, function 1 would have a logical PF id
of 1, and functions 4 and 5 would have logical PF ids 2 and 3
respectively.
Using logical PF id can properly describe which slice of resources can
be used by a particular PF.
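The mapping can be illustrated with a small sketch. Everything here is an assumption made for illustration: logical_pf_id() and mask_slice_start() are hypothetical helpers, the pool is assumed to be split evenly between active PFs, and the pool size of 32 used in the usage note is invented.

```c
#include <stddef.h>

/*
 * Return the logical id of pf_id, i.e. its position among the active
 * function ids (assumed sorted ascending), or -1 if pf_id is not active.
 */
int logical_pf_id(const unsigned int *active, size_t n, unsigned int pf_id)
{
	for (size_t i = 0; i < n; i++)
		if (active[i] == pf_id)
			return (int)i;
	return -1;
}

/*
 * First index of the slice owned by a PF, assuming the global pool is
 * split evenly between n_active PFs (n_active must be non-zero).
 */
unsigned int mask_slice_start(unsigned int total, size_t n_active,
			      unsigned int id)
{
	unsigned int per_pf = total / (unsigned int)n_active;

	return per_pf * id;
}
```

With active functions 0, 1, 4 and 5 and a pool of 32 entries, function 5 maps to logical id 3 and a slice starting at 24. Indexing with the raw function id 5 instead gives 40, which is outside the pool.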
The 'function id' to 'logical id' mapping has been introduced with the
commit 015307754a19 ("ice: Support VF queue rate limit and quanta size
configuration"). However, the usage of 'logical_pf_id' field was
unintentionally skipped for profile mask configuration.
Fix it by using 'logical_pf_id' instead of 'pf_id' value when configuring
masks.
Without that patch, wrong indexes, i.e. out of range for given PF, can
be used while configuring resources masks, which might lead to memory
corruption and undefined driver behavior.
The call trace below is one of the examples of such error:
Fixes: 015307754a19 ("ice: Support VF queue rate limit and quanta size configuration") Suggested-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Grzegorz Nitka [Mon, 29 Sep 2025 15:29:05 +0000 (17:29 +0200)]
ice: fix destination CGU for dual complex E825
On dual complex E825, only complex 0 has functional CGU (Clock
Generation Unit), powering all the PHYs.
SBQ (Side Band Queue) destination device 'cgu' in current implementation
points to CGU on current complex and, in order to access primary CGU
from the secondary complex, the driver should use 'cgu_peer' as
a destination device in read/write CGU registers operations.
Define new 'cgu_peer' (15) as RDA (Remote Device Access) client over
SB-IOSF interface and use it as device target when accessing CGU from
secondary complex.
This problem has been identified when working on recovery clock
enablement [1]. In existing implementation for E825 devices, only PF0,
which is clock owner, is involved in CGU configuration, thus the
problem was not exposed to the user.
Fixes: e2193f9f9ec9 ("ice: enable timesync operation on 2xNAC E825 devices") Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Arkadiusz Kubalewski <Arkadiusz.kubalewski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Grzegorz Nitka [Fri, 21 Feb 2025 09:39:49 +0000 (10:39 +0100)]
ice: fix lane number calculation
E82X adapters do not have sequential IDs; the lane number is the PF ID.
Add check for ICE_MAC_GENERIC and skip checking port options.
Also, adjust logical port number for specific E825 device with external
PHY support (PCI device id 0x579F). For this particular device,
with 2x25G (PHY0) and 2x10G (PHY1) port configuration, modification of
pf_id -> lane_number mapping is required. PF IDs on the 2nd PHY start
from 4 in such scenario. Otherwise, the lane number cannot be
determined correctly, leading to PTP init errors during PF initialization.
Fixes: 258f5f9058159 ("ice: Add correct PHY lane assignment") Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Milena Olech <milena.olech@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
dt-bindings: net: sparx5: Narrow properly LAN969x register space windows
Commit 267bca002c50 ("dt-bindings: net: sparx5: correct LAN969x register
space windows") said that LAN969x has exactly two address spaces ("reg"
property) but implemented it as 2 or more. Narrow the constraint to
properly express that only two items are allowed, which also matches
Linux driver.
Fixes: 267bca002c50 ("dt-bindings: net: sparx5: correct LAN969x register space windows") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20251026101741.20507-2-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Petr Oros [Fri, 24 Oct 2025 18:55:12 +0000 (20:55 +0200)]
dpll: spec: add missing module-name and clock-id to pin-get reply
The dpll.yaml spec incorrectly omitted module-name and clock-id from the
pin-get operation reply specification, even though the kernel DPLL
implementation has always included these attributes in pin-get responses
since the initial implementation.
This spec inconsistency caused issues with the C YNL code generator.
The generated dpll_pin_get_rsp structure was missing these fields.
Fix the spec by adding module-name and clock-id to the pin-attrs reply
specification to match the actual kernel behavior.
Fixes: 3badff3a25d8 ("dpll: spec: Add Netlink spec in YAML") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Link: https://patch.msgid.link/20251024185512.363376-1-poros@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hangbin Liu [Fri, 24 Oct 2025 12:58:53 +0000 (12:58 +0000)]
tools: ynl: avoid print_field when there is no reply
When requesting an unsupported device operation, there will be no reply.
In this case, the len(desc) check will always be true, causing print_field
to enter an infinite loop and crash the program. Example reproducer:
# ethtool.py -c veth0
To fix this, return immediately if there is no reply.
Jakub Kicinski [Tue, 28 Oct 2025 01:00:53 +0000 (18:00 -0700)]
Merge tag 'batadv-net-pullrequest-20251024' of https://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here is a batman-adv bugfix:
- release references to inactive interfaces, by Sven Eckelmann
* tag 'batadv-net-pullrequest-20251024' of https://git.open-mesh.org/linux-merge:
batman-adv: Release references to inactive interfaces
====================
Abdun Nihaal [Thu, 23 Oct 2025 14:18:42 +0000 (19:48 +0530)]
sfc: fix potential memory leak in efx_mae_process_mport()
In efx_mae_enumerate_mports(), memory allocated for mae_mport_desc is
passed as an argument to efx_mae_process_mport(), but when the error path
in efx_mae_process_mport() gets executed, the memory allocated for desc
gets leaked.
Fix that by freeing the memory allocation before returning error.
Fixes: a6a15aca4207 ("sfc: enumerate mports in ef100") Acked-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: Abdun Nihaal <nihaal@cse.iitm.ac.in> Link: https://patch.msgid.link/20251023141844.25847-1-nihaal@cse.iitm.ac.in Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Bagas Sanjaya [Thu, 23 Oct 2025 09:24:06 +0000 (16:24 +0700)]
MAINTAINERS: mark ISDN subsystem as orphan
We have not heard of any activity from Karsten in years:
- Last review tag was nine years ago in commit a921e9bd4e22a7
("isdn: i4l: move active-isdn drivers to staging")
- Last message on lore was in October 2020 [1].
Furthermore, messages to isdn mailing list bounce.
Jakub Kicinski [Tue, 28 Oct 2025 00:44:35 +0000 (17:44 -0700)]
Merge tag 'for-net-2025-10-24' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- fix corruption in h4_recv_buf() after cleanup
- hci_sync: fix race in hci_cmd_sync_dequeue_once
- btmtksdio: Add pmctrl handling for BT closed state during reset
- Revert "Bluetooth: L2CAP: convert timeouts to secs_to_jiffies()"
- rfcomm: fix modem control handling
- btintel_pcie: Fix event packet loss issue
- ISO: Fix BIS connection dst_type handling
- HCI: Fix tracking of advertisement set/instance 0x00
- ISO: Fix another instance of dst_type handling
- hci_conn: Fix connection cleanup with BIG with 2 or more BIS
- hci_core: Fix tracking of periodic advertisement
- MGMT: fix crash in set_mesh_sync and set_mesh_complete
* tag 'for-net-2025-10-24' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: rfcomm: fix modem control handling
Bluetooth: hci_core: Fix tracking of periodic advertisement
Bluetooth: hci_conn: Fix connection cleanup with BIG with 2 or more BIS
Bluetooth: fix corruption in h4_recv_buf() after cleanup
Bluetooth: btintel_pcie: Fix event packet loss issue
Bluetooth: ISO: Fix another instance of dst_type handling
Revert "Bluetooth: L2CAP: convert timeouts to secs_to_jiffies()"
Bluetooth: MGMT: fix crash in set_mesh_sync and set_mesh_complete
Bluetooth: HCI: Fix tracking of advertisement set/instance 0x00
Bluetooth: btmtksdio: Add pmctrl handling for BT closed state during reset
Bluetooth: ISO: Fix BIS connection dst_type handling
Bluetooth: hci_sync: fix race in hci_cmd_sync_dequeue_once
====================
Petr Oros [Fri, 24 Oct 2025 13:24:38 +0000 (15:24 +0200)]
tools: ynl: fix string attribute length to include null terminator
The ynl_attr_put_str() function was not including the null terminator
in the attribute length calculation. This caused the kernel to reject
CTRL_CMD_GETFAMILY requests with EINVAL:
"Attribute failed policy validation".
For a 4-character family name like "dpll":
- Sent: nla_len=8 (4 byte header + 4 byte string without null)
- Expected: nla_len=9 (4 byte header + 5 byte string with null)
The bug was introduced in commit 15d2540e0d62 ("tools: ynl: check for
overflow of constructed messages") when refactoring from stpcpy() to
strlen(). The original code correctly included the null terminator:
Since stpcpy() returns a pointer past the null terminator, the length
included it. The refactored version using strlen() omitted the +1.
The fix also removes NLA_ALIGN() from nla_len calculation, since
nla_len should contain actual attribute length, not aligned length.
Alignment is only for calculating next attribute position. This makes
the code consistent with ynl_attr_put().
CTRL_ATTR_FAMILY_NAME uses the NLA_NUL_STRING policy, which requires a
null terminator. The kernel validates with memchr() and rejects the
attribute if none is found.
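The length arithmetic can be checked with a short sketch. The macros are local copies that mirror the netlink attribute definitions (4-byte header, 4-byte alignment); the three helper functions are hypothetical and are not the ynl code itself.

```c
#include <string.h>

/* Local copies mirroring the netlink attribute layout constants. */
#define NLA_ALIGNTO	4
#define NLA_ALIGN(len)	(((len) + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1))
#define NLA_HDRLEN	4	/* u16 nla_len + u16 nla_type */

/* Buggy length: header plus string bytes, terminator left out. */
size_t str_attr_len_buggy(const char *s)
{
	return NLA_HDRLEN + strlen(s);
}

/* Correct nla_len: header plus string bytes plus the null terminator. */
size_t str_attr_len(const char *s)
{
	return NLA_HDRLEN + strlen(s) + 1;
}

/* Space consumed in the message: alignment only moves the next attribute. */
size_t str_attr_space(const char *s)
{
	return NLA_ALIGN(str_attr_len(s));
}
```

For "dpll" this gives nla_len 8 for the buggy variant, 9 for the correct one, and 12 bytes of message space once padding to the next attribute is included.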
Emanuele Ghidoli [Thu, 23 Oct 2025 14:48:53 +0000 (16:48 +0200)]
net: phy: dp83867: Disable EEE support as not implemented
While the DP83867 PHYs report EEE capability through their feature
registers, the actual hardware does not support EEE (see Links).
When the connected MAC enables EEE, it causes link instability and
communication failures.
The issue is reproducible with an iMX8MP and the relevant stmmac ethernet port.
Since the introduction of phylink-managed EEE support in the stmmac driver,
EEE is now enabled by default, leading to issues on systems using the
DP83867 PHY.
Call phy_disable_eee during phy initialization to prevent EEE from being
enabled on DP83867 PHYs.
Johan Hovold [Thu, 23 Oct 2025 12:05:30 +0000 (14:05 +0200)]
Bluetooth: rfcomm: fix modem control handling
The RFCOMM driver confuses the local and remote modem control signals,
which specifically means that the reported DTR and RTS state will
instead reflect the remote end (i.e. DSR and CTS).
This issue dates back to the original driver (and a follow-on update)
merged in 2002, which resulted in a non-standard implementation of
TIOCMSET that allowed controlling also the TS07.10 IC and DV signals by
mapping them to the RI and DCD input flags, while TIOCMGET failed to
return the actual state of DTR and RTS.
Note that the bogus control of input signals in tiocmset() is just
dead code as those flags will have been masked out by the tty layer
since 2003.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: hci_core: Fix tracking of periodic advertisement
The periodic advertising enabled flag cannot be tracked by the
advertising enabled flag, since advertising and periodic advertising can
each be enabled/disabled separately from one another. Sharing the flag
makes the states inconsistent: for example, when an advertising set is
disabled its enabled flag is set to false, and that value is then used
for periodic advertising, which has not been disabled.
Fixes: eca0ae4aea66 ("Bluetooth: Add initial implementation of BIS connections") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: hci_conn: Fix connection cleanup with BIG with 2 or more BIS
This fixes bis_cleanup not considering connections in BT_OPEN state
before attempting to remove the BIG causing the following error:
btproxy[20110]: < HCI Command: LE Terminate Broadcast Isochronous Group (0x08|0x006a) plen 2
BIG Handle: 0x01
Reason: Connection Terminated By Local Host (0x16)
> HCI Event: Command Status (0x0f) plen 4
LE Terminate Broadcast Isochronous Group (0x08|0x006a) ncmd 1
Status: Unknown Advertising Identifier (0x42)
Fixes: fa224d0c094a ("Bluetooth: ISO: Reassociate a socket with an active BIS") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Calvin Owens [Thu, 23 Oct 2025 18:47:19 +0000 (11:47 -0700)]
Bluetooth: fix corruption in h4_recv_buf() after cleanup
A different structure is stored in drvdata for the drivers which used
that duplicate function, but h4_recv_buf() assumes drvdata is always an
hci_uart structure.
Consequently, alignment and padding are now randomly corrupted for
btmtkuart, btnxpuart, and bpa10x in h4_recv_buf(), causing erratic
breakage.
Fix this by making the hci_uart structure the explicit argument to
h4_recv_buf(). Every caller already has a reference to hci_uart, and
already obtains the hci_hdev reference through it, so this actually
eliminates a redundant pointer indirection for all existing callers.
Fixes: 93f06f8f0daf ("Bluetooth: remove duplicate h4_recv_buf() in header") Reported-by: Francesco Valla <francesco@valla.it> Closes: https://lore.kernel.org/lkml/6837167.ZASKD2KPVS@fedora.fritz.box/ Signed-off-by: Calvin Owens <calvin@wbinvd.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Kiran K [Thu, 16 Oct 2025 04:30:43 +0000 (10:00 +0530)]
Bluetooth: btintel_pcie: Fix event packet loss issue
In the current btintel_pcie driver implementation, when an interrupt is
received, the driver checks for the alive cause before the TX/RX cause.
Handling the alive cause involves resetting the TX/RX queue indices.
This flow works correctly when the causes are mutually exclusive.
However, if both cause bits are set simultaneously, the alive cause
resets the queue indices, resulting in an event packet drop and a
command timeout. To fix this issue, the driver is modified to handle all
other causes before checking for the alive cause.
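The effect of the ordering can be modelled in a few lines. This is a simplified sketch with hypothetical names (the cause bits, struct sketch_queue and both handlers are invented for illustration); it only shows why resetting the indices first loses pending events.

```c
/* Hypothetical interrupt cause bits. */
#define SKETCH_CAUSE_ALIVE	(1u << 0)
#define SKETCH_CAUSE_RX		(1u << 1)

/* tail - head is the number of events not yet delivered. */
struct sketch_queue { unsigned int head, tail; };

static unsigned int sketch_drain_rx(struct sketch_queue *q)
{
	unsigned int delivered = q->tail - q->head;

	q->head = q->tail;
	return delivered;
}

static void sketch_reset_indices(struct sketch_queue *q)
{
	q->head = 0;
	q->tail = 0;
}

/* Old order: the alive cause resets the indices before RX is drained. */
unsigned int sketch_irq_alive_first(struct sketch_queue *q, unsigned int causes)
{
	unsigned int delivered = 0;

	if (causes & SKETCH_CAUSE_ALIVE)
		sketch_reset_indices(q);
	if (causes & SKETCH_CAUSE_RX)
		delivered = sketch_drain_rx(q);
	return delivered;
}

/* New order: all other causes are handled before the alive cause. */
unsigned int sketch_irq_alive_last(struct sketch_queue *q, unsigned int causes)
{
	unsigned int delivered = 0;

	if (causes & SKETCH_CAUSE_RX)
		delivered = sketch_drain_rx(q);
	if (causes & SKETCH_CAUSE_ALIVE)
		sketch_reset_indices(q);
	return delivered;
}
```

With two pending events and both cause bits set, the first handler delivers nothing and the second delivers both events.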
Test case:
Issue is seen with stress reboot scenario - 50x run
Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Sai Teja Aluvala <aluvala.sai.teja@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Fixes: c2b636b3f788 ("Bluetooth: btintel_pcie: Add support for PCIe transport") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Frédéric Danis [Mon, 6 Oct 2025 08:35:44 +0000 (10:35 +0200)]
Revert "Bluetooth: L2CAP: convert timeouts to secs_to_jiffies()"
This reverts commit c9d84da18d1e0d28a7e16ca6df8e6d47570501d4. That
commit replaced calls to msecs_to_jiffies() in L2CAP with
secs_to_jiffies() and updated the constants accordingly. But the
constants are also used in L2CAP Configure Request and L2CAP Configure
Response, which expect values in milliseconds.
This may prevent correct usage of L2CAP channel.
To fix it, keep those constants in milliseconds and so revert this
change.
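The problem with changing the unit of a shared constant can be sketched as follows. All names are hypothetical and the tick rate is an assumption; the conversion helpers are simplified stand-ins for msecs_to_jiffies() and secs_to_jiffies(), not the kernel implementations.

```c
#define SKETCH_HZ		250u	/* assumed tick rate, illustration only */
#define SKETCH_TIMEOUT_MS	2000u	/* hypothetical timeout constant */

/* Simplified stand-in for msecs_to_jiffies(): round up to whole ticks. */
unsigned long sketch_msecs_to_ticks(unsigned int ms)
{
	return ((unsigned long)ms * SKETCH_HZ + 999u) / 1000u;
}

/* Simplified stand-in for secs_to_jiffies(). */
unsigned long sketch_secs_to_ticks(unsigned int secs)
{
	return (unsigned long)secs * SKETCH_HZ;
}

/* The configure option carries the constant as-is; the peer reads it as ms. */
unsigned int sketch_wire_timeout_ms(unsigned int constant)
{
	return constant;
}
```

The local timer is the same either way (2000 ms and 2 s both give 500 ticks here), but a constant rewritten from 2000 to 2 would be read by the peer as 2 ms.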
Fixes: c9d84da18d1e ("Bluetooth: L2CAP: convert timeouts to secs_to_jiffies()") Signed-off-by: Frédéric Danis <frederic.danis@collabora.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Pauli Virtanen [Fri, 3 Oct 2025 19:07:32 +0000 (22:07 +0300)]
Bluetooth: MGMT: fix crash in set_mesh_sync and set_mesh_complete
There is a BUG: KASAN: stack-out-of-bounds in set_mesh_sync due to
memcpy from badly declared on-stack flexible array.
Another crash is in set_mesh_complete() due to double list_del via
mgmt_pending_valid + mgmt_pending_remove.
Use DEFINE_FLEX to declare the flexible array right, and don't memcpy
outside bounds.
As mgmt_pending_valid removes the cmd from list, use mgmt_pending_free,
and also report status on error.
Fixes: 302a1f674c00d ("Bluetooth: MGMT: Fix possible UAFs") Signed-off-by: Pauli Virtanen <pav@iki.fi> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: HCI: Fix tracking of advertisement set/instance 0x00
This fixes the state tracking of advertisement set/instance 0x00 which
is considered a legacy instance and is not tracked individually by
adv_instances list. Previously it was assumed that hci_dev itself would
track it via HCI_LE_ADV, but that is a global state not specific to
instance 0x00, so to fix it a new flag is introduced that only tracks the
state of instance 0x00.
Fixes: 1488af7b8b5f ("Bluetooth: hci_sync: Fix hci_resume_advertising_sync") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Chris Lu [Tue, 30 Sep 2025 05:39:33 +0000 (13:39 +0800)]
Bluetooth: btmtksdio: Add pmctrl handling for BT closed state during reset
This patch adds logic to handle power management control when the
Bluetooth function is closed during the SDIO reset sequence.
Specifically, if BT is closed before reset, the driver enables the
SDIO function and sets driver pmctrl. After reset, if BT remains
closed, the driver sets firmware pmctrl and disables the SDIO function.
These changes ensure proper power management and device state consistency
across the reset flow.
Fixes: 8fafe702253d ("Bluetooth: mt7921s: support bluetooth reset mechanism") Signed-off-by: Chris Lu <chris.lu@mediatek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: hci_sync: fix race in hci_cmd_sync_dequeue_once
hci_cmd_sync_dequeue_once() does lookup and then cancel
the entry under two separate lock sections. Meanwhile,
hci_cmd_sync_work() can also delete the same entry,
leading to double list_del() and "UAF".
Fix this by holding cmd_sync_work_lock across both
lookup and cancel, so that the entry cannot be removed
concurrently.
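The locking pattern can be sketched in user-space C with a pthread mutex. This is a generic illustration, not the hci_sync code: the list, queue_lock, queue_push() and dequeue_once() are hypothetical, and a singly linked list replaces the kernel list helpers.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct entry {
	struct entry *next;
	int id;
};

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
struct entry *queue_head;

void queue_push(struct entry *e)
{
	pthread_mutex_lock(&queue_lock);
	e->next = queue_head;
	queue_head = e;
	pthread_mutex_unlock(&queue_lock);
}

/*
 * Look up and unlink inside ONE critical section. Dropping the lock between
 * the lookup and the unlink would let another thread remove the same entry
 * in the gap, so it would be unlinked twice.
 */
bool dequeue_once(int id)
{
	bool found = false;

	pthread_mutex_lock(&queue_lock);
	for (struct entry **pp = &queue_head; *pp; pp = &(*pp)->next) {
		if ((*pp)->id == id) {
			*pp = (*pp)->next;	/* unlink while still locked */
			found = true;
			break;
		}
	}
	pthread_mutex_unlock(&queue_lock);
	return found;
}
```

A second dequeue_once() call for the same id returns false instead of unlinking the entry again.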
Fixes: 505ea2b29592 ("Bluetooth: hci_sync: Add helper functions to manipulate cmd_sync queue") Reported-by: Cen Zhang <zzzccc427@163.com> Signed-off-by: Cen Zhang <zzzccc427@163.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
virtio-net: drop the multi-buffer XDP packet in zerocopy
In virtio-net, multi-buffer XDP packets are not yet supported in
zerocopy mode when an XDP program is bound. However, in that case, when
receiving a multi-buffer XDP packet, we skip the XDP program and return
XDP_PASS. As a result, the packet is passed to the normal network stack,
which is incorrect behavior (e.g. an XDP program for packet counting is
installed, a multi-buffer XDP packet arrives and does not go through the
XDP program; as a result, the packet count does not increase but the
packet is still received by the network stack). This commit instead
returns XDP_ABORTED in that case.
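The decision described above can be written as a small pure function. This is a sketch with hypothetical names: the enum is a local stand-in rather than the UAPI XDP action enum, and sketch_zc_rx_verdict() is not the driver's function.

```c
#include <stdbool.h>

/* Local stand-in for the XDP verdicts relevant here. */
enum sketch_verdict {
	SKETCH_ABORTED,
	SKETCH_DROP,
	SKETCH_PASS,
};

/*
 * Verdict for a packet received in zerocopy mode.
 * prog_result is what the bound program would return for a single-buffer
 * packet; it is ignored when no program is bound.
 */
enum sketch_verdict sketch_zc_rx_verdict(bool prog_bound, unsigned int num_buf,
					 enum sketch_verdict prog_result)
{
	if (!prog_bound)
		return SKETCH_PASS;	/* nothing to run, normal stack path */
	if (num_buf > 1)
		return SKETCH_ABORTED;	/* program cannot run: do not pass */
	return prog_result;
}
```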
Fixes: 99c861b44eb1 ("virtio_net: xsk: rx: support recv merge mode") Cc: stable@vger.kernel.org Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com> Link: https://patch.msgid.link/20251022155630.49272-1-minhquangbui99@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lizhi Xu [Wed, 22 Oct 2025 02:40:07 +0000 (10:40 +0800)]
usbnet: Prevents free active kevent
The root causes of this issue are:
1. When probing the usbnet device, executing usbnet_link_change(dev, 0, 0);
puts the kevent work on the global workqueue. However, the kevent has not yet
been scheduled when the usbnet device is unregistered. Therefore, executing
free_netdev() results in the "free active object (kevent)" error reported
here.
2. Another factor is that when calling usbnet_disconnect()->unregister_netdev(),
if the usbnet device is up, ndo_stop() is executed to cancel the kevent.
However, because the device is not up, ndo_stop() is not executed.
The solution to this problem is to cancel the kevent before executing
free_netdev().
Fixes: a69e617e533e ("usbnet: Fix linkwatch use-after-free on disconnect") Reported-by: Sam Sun <samsun1006219@gmail.com> Closes: https://syzkaller.appspot.com/bug?extid=8bfd7bcc98f7300afb84 Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> Link: https://patch.msgid.link/20251022024007.1831898-1-lizhi.xu@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 24 Oct 2025 00:15:47 +0000 (17:15 -0700)]
Merge tag 'wireless-2025-10-23' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
First set of fixes:
- brcmfmac: long-standing crash when used w/o P2P
- iwlwifi: fix for a use-after-free bug
- mac80211: key tailroom accounting bug could leave
allocation overhead and cause a warning
- ath11k: add a missing platform,
fix key flag operations
- bcma: skip devices disabled in OF/DT
- various (potential) memory leaks
* tag 'wireless-2025-10-23' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: nl80211: call kfree without a NULL check
wifi: mac80211: fix key tailroom accounting leak
wifi: brcmfmac: fix crash while sending Action Frames in standalone AP Mode
MAINTAINERS: wcn36xx: Add linux-wireless list
bcma: don't register devices disabled in OF
wifi: mac80211: reset FILS discovery and unsol probe resp intervals
wifi: iwlwifi: fix potential use after free in iwl_mld_remove_link()
wifi: ath11k: avoid bit operation on key flags
wifi: ath12k: free skb during idr cleanup callback
wifi: ath11k: Add missing platform IDs for quirk table
wifi: ath10k: Fix memory leak on unsupported WMI command
====================
Linus Torvalds [Thu, 23 Oct 2025 17:03:18 +0000 (07:03 -1000)]
Merge tag 'net-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from can. Slim pickings, I'm guessing people haven't
really started testing.
Current release - new code bugs:
- eth: mlx5e:
- psp: avoid 'accel' NULL pointer dereference
- skip PPHCR register query for FEC histogram if not supported
Previous releases - regressions:
- bonding: update the slave array for broadcast mode
- rtnetlink: re-allow deleting FDB entries in user namespace
- eth: dpaa2: fix the pointer passed to PTR_ALIGN on Tx path
Previous releases - always broken:
- can: drop skb on xmit if device is in listen-only mode
- gro: clear skb_shinfo(skb)->hwtstamps in napi_reuse_skb()
- eth: mlx5e
- RX, fix generating skb from non-linear xdp_buff if program
trims frags
- make devcom init failures non-fatal, fix races with IPSec
Misc:
- some documentation formatting 'fixes'"
* tag 'net-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
net/mlx5: Fix IPsec cleanup over MPV device
net/mlx5: Refactor devcom to return NULL on failure
net/mlx5e: Skip PPHCR register query if not supported by the device
net/mlx5: Add PPHCR to PCAM supported registers mask
virtio-net: zero unused hash fields
net: phy: micrel: always set shared->phydev for LAN8814
vsock: fix lock inversion in vsock_assign_transport()
ovpn: use datagram_poll_queue for socket readiness in TCP
espintcp: use datagram_poll_queue for socket readiness
net: datagram: introduce datagram_poll_queue for custom receive queues
net: bonding: fix possible peer notify event loss or dup issue
net: hsr: prevent creation of HSR device with slaves from another netns
sctp: avoid NULL dereference when chunk data buffer is missing
ptp: ocp: Fix typo using index 1 instead of i in SMA initialization loop
net: ravb: Ensure memory write completes before ringing TX doorbell
net: ravb: Enforce descriptor type ordering
net: hibmcge: select FIXED_PHY
net: dlink: use dev_kfree_skb_any instead of dev_kfree_skb
Documentation: networking: ax25: update the mailing list info.
net: gro_cells: fix lock imbalance in gro_cells_receive()
...
Linus Torvalds [Thu, 23 Oct 2025 16:53:12 +0000 (06:53 -1000)]
Merge tag 'acpi-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix a fallout of a recent ACPI properties management update and
work around a compiler bug in ACPICA:
- Fix a recent coding mistake causing __acpi_node_get_property_reference()
arguments to be put in an incorrect order (Sunil V L)
- Work around bogus -Wstringop-overread warning on LoongArch since
GCC 11 in ACPICA (Xi Ruoyao)"
* tag 'acpi-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPICA: Work around bogus -Wstringop-overread warning since GCC 11
ACPI: property: Fix argument order in __acpi_node_get_property_reference()
Linus Torvalds [Thu, 23 Oct 2025 16:48:32 +0000 (06:48 -1000)]
Merge tag 'pm-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These revert a cpuidle menu governor commit leading to a performance
regression, fix an amd-pstate driver regression introduced recently,
and fix new conditional guard definitions for runtime PM.
- Add missing _RET == 0 condition to recently introduced conditional
guard definitions for runtime PM (Rafael Wysocki)
- Revert a cpuidle menu governor change that introduced a serious
performance regression on Chromebooks with Intel Jasper Lake
processors (Rafael Wysocki)
- Fix an amd-pstate driver regression leading to EPP=0 after
hibernation (Mario Limonciello)"
* tag 'pm-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: runtime: Fix conditional guard definitions
Revert "cpuidle: menu: Avoid discarding useful information"
cpufreq/amd-pstate: Fix a regression leading to EPP 0 after hibernate
Linus Torvalds [Thu, 23 Oct 2025 16:44:43 +0000 (06:44 -1000)]
Merge tag 'for-6.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- in send, fix duplicated rmdir operations when using extrefs
(hardlinks), receive can fail with ENOENT
- fixup of error check when reading extent root in ref-verify and
damaged roots are allowed by mount option (found by smatch)
- fix freeing partially initialized fs info (found by syzkaller)
- fix use-after-free when printing ref_tracking status of delayed
inodes
* tag 'for-6.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: ref-verify: fix IS_ERR() vs NULL check in btrfs_build_ref_tree()
btrfs: fix delayed_node ref_tracker use after free
btrfs: send: fix duplicated rmdir operations when using extrefs
btrfs: directly free partially initialized fs_info in btrfs_check_leaked_roots()
When we do mlx5e_detach_netdev() we eventually disable blocking events
notifier, among those events are IPsec MPV events from IB to core.
So before disabling those blocking events, make sure to also unregister
the devcom device and mark all of this device's operations as complete,
in order to prevent the other device from using invalid netdev
during future devcom events which could cause the trace below.
net/mlx5: Refactor devcom to return NULL on failure
Devcom device and component registration isn't always critical to the
functionality of the caller, hence the registration can fail and we can
continue working with an ERR_PTR value saved inside a variable.
To avoid that, make sure all devcom failures return NULL.
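The hazard described above can be illustrated with a minimal userspace sketch. ERR_PTR() and IS_ERR() here are simplified stand-ins for the kernel helpers, and struct demo_devcom and the demo_* functions are hypothetical names, not the mlx5 API. A caller that only tests for NULL treats an error pointer as a valid object, while a registration helper that returns NULL on failure is safe under the same check.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR()/IS_ERR(). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(intptr_t error)
{
	return (void *)error;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

/* Hypothetical object; not the real devcom structure. */
struct demo_devcom {
	int id;
};

/* Old style: a failure is reported as an error pointer. */
static struct demo_devcom *demo_register_errptr(int fail)
{
	static struct demo_devcom dev = { .id = 1 };

	return fail ? ERR_PTR(-ENOMEM) : &dev;
}

/* New style: every failure is collapsed to NULL. */
static struct demo_devcom *demo_register_null(int fail)
{
	struct demo_devcom *dev = demo_register_errptr(fail);

	return IS_ERR(dev) ? NULL : dev;
}

/* A caller that only performs a NULL check before using the object. */
static int demo_is_usable(const struct demo_devcom *dev)
{
	return dev != NULL;
}
```

With the error-pointer variant, demo_is_usable() accepts a failed registration; with the NULL variant, the same check rejects it.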
Alexei Lazar [Wed, 22 Oct 2025 12:29:39 +0000 (15:29 +0300)]
net/mlx5: Add PPHCR to PCAM supported registers mask
Add the PPHCR bit to the port_access_reg_cap_mask field of PCAM
register to indicate that the device supports the PPHCR register
and the RS-FEC histogram feature.
Jason Wang [Wed, 22 Oct 2025 03:44:21 +0000 (11:44 +0800)]
virtio-net: zero unused hash fields
When GSO tunnel is negotiated, virtio_net_hdr_tnl_from_skb() tries to
initialize the tunnel metadata but forgets to zero unused rxhash
fields. This may leak information to the other side. Fix this by
zeroing the unused hash fields.
Acked-by: Michael S. Tsirkin <mst@redhat.com> Fixes: a2fb4bc4e2a6a ("net: implement virtio helpers to handle UDP GSO tunneling") Cc: <stable@vger.kernel.org> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://patch.msgid.link/20251022034421.70244-1-jasowang@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
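As a rough illustration of the class of bug fixed here, the sketch below fills a header that is later handed to the other side. struct demo_hdr and demo_fill_hdr() are hypothetical and do not reproduce the real virtio-net header layout; the point is only that every field the producer does not use must be written explicitly, otherwise stale memory contents travel with the header.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical header: two tunnel offsets plus hash fields that are
 * unused when tunnel metadata is carried. Not the real virtio layout. */
struct demo_hdr {
	uint16_t outer_th_offset;
	uint16_t inner_nh_offset;
	uint32_t hash_value;
	uint16_t hash_report;
	uint16_t padding;
};

/* Fill the fields we own and explicitly zero the ones we do not use,
 * so no stale bytes from the caller's buffer are exposed. */
static void demo_fill_hdr(struct demo_hdr *hdr, uint16_t outer, uint16_t inner)
{
	hdr->outer_th_offset = outer;
	hdr->inner_nh_offset = inner;

	hdr->hash_value = 0;
	hdr->hash_report = 0;
	hdr->padding = 0;
}
```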
Robert Marko [Tue, 21 Oct 2025 13:20:26 +0000 (15:20 +0200)]
net: phy: micrel: always set shared->phydev for LAN8814
Currently, during the LAN8814 PTP probe, shared->phydev is only set if the
PTP clock actually gets set; otherwise the function returns before setting
it.
This is an issue because shared->phydev is used unconditionally when an IRQ
is handled, especially in lan8814_gpio_process_cap, and since it was
not set it will cause a NULL pointer exception and crash the kernel.
So, simply always set shared->phydev to avoid the NULL pointer exception.
Fixes: b3f1a08fcf0d ("net: phy: micrel: Add support for PTP_PF_EXTTS for lan8814") Signed-off-by: Robert Marko <robert.marko@sartura.hr> Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://patch.msgid.link/20251021132034.983936-1-robert.marko@sartura.hr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
vsock: fix lock inversion in vsock_assign_transport()
Syzbot reported a potential lock inversion deadlock between
vsock_register_mutex and sk_lock-AF_VSOCK when vsock_linger() is called.
The issue was introduced by commit 687aa0c5581b ("vsock: Fix
transport_* TOCTOU") which added vsock_register_mutex locking in
vsock_assign_transport() around the transport->release() call, which can
call vsock_linger(). vsock_assign_transport() can be called with sk_lock
held. vsock_linger() calls sk_wait_event(), which temporarily releases and
re-acquires sk_lock. During this window, if another thread holds
vsock_register_mutex while trying to acquire sk_lock, a circular
dependency is created.
Fix this by releasing vsock_register_mutex before calling
transport->release() and vsock_deassign_transport(). This is safe
because we don't need to hold vsock_register_mutex while releasing the
old transport, and we ensure the new transport won't disappear by
obtaining a module reference first via try_module_get().
====================
fix poll behaviour for TCP-based tunnel protocols
This patch series introduces a polling function for datagram-style
sockets that operates on custom skb queues, and updates ovpn (the
OpenVPN data-channel offload module) and espintcp (the TCP Encapsulation
of IKE and IPsec Packets implementation) to use it accordingly.
Protocols like the aforementioned ones decapsulate packets received over
TCP and deliver userspace-bound data through a separate skb queue, not
the standard sk_receive_queue. Previously, both relied on
datagram_poll(), which would signal readiness based on non-userspace
packets, leading to misleading poll results and unnecessary recv
attempts in userspace.
Patch 1 introduces datagram_poll_queue(), a variant of datagram_poll()
that accepts an explicit receive queue. This builds on the approach
introduced in commit b50b058, which extended other skb-related functions
to support custom queues. Patches 2 and 3 update espintcp_poll() and
ovpn_tcp_poll() respectively to use this helper, ensuring readiness is
only signaled when userspace data is available.
Each patch is self-contained and the ovpn one includes rationale and
lifecycle enforcement where appropriate.
====================
Ralf Lici [Tue, 21 Oct 2025 10:09:42 +0000 (12:09 +0200)]
ovpn: use datagram_poll_queue for socket readiness in TCP
openvpn TCP encapsulation uses a custom queue to deliver packets to
userspace. Currently it relies on datagram_poll, which checks
sk_receive_queue, leading to false readiness signals when that queue
contains non-userspace packets.
Switch ovpn_tcp_poll to use datagram_poll_queue with the peer's
user_queue, ensuring poll only signals readiness when userspace data is
actually available. Also refactor ovpn_tcp_poll in order to enforce the
assumption we can make on the lifetime of ovpn_sock and peer.
Ralf Lici [Tue, 21 Oct 2025 10:09:41 +0000 (12:09 +0200)]
espintcp: use datagram_poll_queue for socket readiness
espintcp uses a custom queue (ike_queue) to deliver packets to
userspace. The polling logic relies on datagram_poll, which checks
sk_receive_queue, which can lead to false readiness signals when that
queue contains non-userspace packets.
Switch espintcp_poll to use datagram_poll_queue with ike_queue, ensuring
poll only signals readiness when userspace data is actually available.
Ralf Lici [Tue, 21 Oct 2025 10:09:40 +0000 (12:09 +0200)]
net: datagram: introduce datagram_poll_queue for custom receive queues
Some protocols using TCP encapsulation (e.g., espintcp, openvpn) deliver
userspace-bound packets through a custom skb queue rather than the
standard sk_receive_queue.
Introduce datagram_poll_queue that accepts an explicit receive queue,
and convert datagram_poll into a wrapper around datagram_poll_queue.
This allows protocols with custom skb queues to reuse the core polling
logic without relying on sk_receive_queue.
Cc: Sabrina Dubroca <sd@queasysnail.net> Cc: Antonio Quartulli <antonio@openvpn.net> Signed-off-by: Ralf Lici <ralf@mandelbit.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Antonio Quartulli <antonio@openvpn.net> Link: https://patch.msgid.link/20251021100942.195010-2-ralf@mandelbit.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
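The wrapper arrangement described in this commit can be sketched in userspace as follows. All demo_* names are hypothetical; in particular, the sketch only models the readable bit, whereas a real poll helper also has to report other socket state. It shows how a queue-taking variant lets a protocol signal readiness from its own queue while the default entry point keeps using the standard receive queue.

```c
#include <stddef.h>

#define DEMO_POLLIN 0x1u

/* Hypothetical queue: only its length matters for this sketch. */
struct demo_queue {
	size_t len;
};

/* Hypothetical socket with a standard queue and a custom user queue. */
struct demo_sock {
	struct demo_queue receive_queue; /* stands in for sk_receive_queue */
	struct demo_queue user_queue;    /* userspace-bound packets only */
};

/* Core logic: readiness is derived from the queue passed by the caller. */
static unsigned int demo_poll_queue(const struct demo_sock *sk,
				    const struct demo_queue *rcv_queue)
{
	unsigned int mask = 0;

	(void)sk; /* a real helper would also inspect socket state here */

	if (rcv_queue->len > 0)
		mask |= DEMO_POLLIN;

	return mask;
}

/* Default entry point: a thin wrapper using the standard receive queue. */
static unsigned int demo_poll(const struct demo_sock *sk)
{
	return demo_poll_queue(sk, &sk->receive_queue);
}
```

With packets pending only on receive_queue, demo_poll() reports readable, while a protocol that passes its user_queue to demo_poll_queue() does not, which is the false-readiness case the series removes.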
Merge an ACPI device properties handling change fixing the order of
__acpi_node_get_property_reference() arguments broken by a recent
update (Sunil V L)
* 'acpi-property':
ACPI: property: Fix argument order in __acpi_node_get_property_reference()
- Revert a cpuidle menu governor change that introduced a serious
performance regression on Chromebooks with Intel Jasper Lake
processors (Rafael Wysocki)
- Fix an amd-pstate driver regression leading to EPP=0 after
hibernation (Mario Limonciello)
Tonghao Zhang [Tue, 21 Oct 2025 05:09:33 +0000 (13:09 +0800)]
net: bonding: fix possible peer notify event loss or dup issue
If the send_peer_notif counter and the peer event notify are not synchronized,
it may cause problems such as the loss or duplication of peer notify events.
Before this patch:
- If should_notify_peers is true and the lock for send_peer_notif-- fails, peer
event may be sent again in next mii_monitor loop, because should_notify_peers
is still true.
- If should_notify_peers is true and the lock for send_peer_notif-- succeeds,
but the lock for peer event fails, the peer event will be lost.
This patch locks the RTNL for send_peer_notif, events, and commit simultaneously.
Fixes: 07a4ddec3ce9 ("bonding: add an option to specify a delay between peer notifications") Cc: Jay Vosburgh <jv@jvosburgh.net> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Hangbin Liu <liuhangbin@gmail.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Vincent Bernat <vincent@bernat.ch> Cc: <stable@vger.kernel.org> Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Link: https://patch.msgid.link/20251021050933.46412-1-tonghao@bamaicloud.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
net: hsr: prevent creation of HSR device with slaves from another netns
The HSR/PRP driver does not correctly handle slaves/interlink devices
in a different net namespace. Currently, it is possible to create an HSR
link in a different net namespace than the slaves/interlink with the
following command:
ip link add hsr0 netns hsr-ns type hsr slave1 eth1 slave2 eth2
As there is no use case for supporting this scenario, enforce that the HSR
device link matches the netns defined by IFLA_LINK_NETNSID.
The iproute2 command mentioned above will throw the following error:
Error: hsr: HSR slaves/interlink must be on the same net namespace than HSR link.
Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Link: https://patch.msgid.link/20251020135533.9373-1-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexey Simakov [Tue, 21 Oct 2025 13:00:36 +0000 (16:00 +0300)]
sctp: avoid NULL dereference when chunk data buffer is missing
chunk->skb pointer is dereferenced in the if-block where it's supposed
to be NULL only.
chunk->skb can only be NULL if chunk->head_skb is not. Check for frag_list
instead and do it just before replacing chunk->skb. We're sure that
otherwise chunk->skb is non-NULL because of outer if() condition.
Jiasheng Jiang [Tue, 21 Oct 2025 18:24:56 +0000 (18:24 +0000)]
ptp: ocp: Fix typo using index 1 instead of i in SMA initialization loop
In ptp_ocp_sma_fb_init(), the code mistakenly used bp->sma[1]
instead of bp->sma[i] inside a for-loop, which caused only SMA[1]
to have its DIRECTION_CAN_CHANGE capability cleared. This led to
inconsistent capability flags across SMA pins.
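The typo is easy to reproduce in isolation. The following is a minimal sketch with hypothetical names (struct demo_sma, DEMO_NUM_SMA, DEMO_DIRECTION_CAN_CHANGE) rather than the driver's real types; it contrasts a loop that always indexes element 1 with one that uses the loop variable.

```c
#include <stddef.h>

#define DEMO_NUM_SMA 4
#define DEMO_DIRECTION_CAN_CHANGE 0x1u

/* Hypothetical per-pin state; only a capability bitmask is modeled. */
struct demo_sma {
	unsigned int caps;
};

/* Buggy variant: the constant index means only sma[1] is ever updated. */
static void demo_sma_init_buggy(struct demo_sma *sma)
{
	for (size_t i = 0; i < DEMO_NUM_SMA; i++)
		sma[1].caps &= ~DEMO_DIRECTION_CAN_CHANGE;
}

/* Fixed variant: each pin has the capability cleared in turn. */
static void demo_sma_init_fixed(struct demo_sma *sma)
{
	for (size_t i = 0; i < DEMO_NUM_SMA; i++)
		sma[i].caps &= ~DEMO_DIRECTION_CAN_CHANGE;
}
```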
This series addresses several issues in the Renesas Ethernet AVB (ravb)
driver related to descriptor ordering.
A potential ordering hazard in descriptor setup could cause
the DMA engine to start prematurely, leading to TX stalls on some
platforms.
The series includes the following changes:
Enforce descriptor type ordering to prevent early DMA start
Ensure proper write ordering of TX descriptor type fields to prevent the
DMA engine from observing an incomplete descriptor chain. This fixes
observed TX stalls on RZ/G2L platforms running RT kernels.
Tested on R/G1x Gen2, RZ/G2x Gen3 and RZ/G2L family hardware.
====================
Lad Prabhakar [Fri, 17 Oct 2025 15:18:30 +0000 (16:18 +0100)]
net: ravb: Ensure memory write completes before ringing TX doorbell
Add a final dma_wmb() barrier before triggering the transmit request
(TCCR_TSRQ) to ensure all descriptor and buffer writes are visible to
the DMA engine.
According to the hardware manual, a read-back operation is required
before writing to the doorbell register to guarantee completion of
previous writes. Instead of performing a dummy read, a dma_wmb() is
used to both enforce the same ordering semantics on the CPU side and
also to ensure completion of writes.
Lad Prabhakar [Fri, 17 Oct 2025 15:18:29 +0000 (16:18 +0100)]
net: ravb: Enforce descriptor type ordering
Ensure the TX descriptor type fields are published in a safe order so the
DMA engine never begins processing a descriptor chain before all descriptor
fields are fully initialised.
For multi-descriptor transmits the driver writes DT_FEND into the last
descriptor and DT_FSTART into the first. The DMA engine begins processing
when it observes DT_FSTART. Move the dma_wmb() barrier so it executes
immediately after DT_FEND and immediately before writing DT_FSTART
(and before DT_FSINGLE in the single-descriptor case). This guarantees
that all prior CPU writes to the descriptor memory are visible to the
device before DT_FSTART is seen.
This avoids a situation where compiler/CPU reordering could publish
DT_FSTART ahead of DT_FEND or other descriptor fields, allowing the DMA to
start on a partially initialised chain and causing corrupted transmissions
or TX timeouts. Such a failure was observed on RZ/G2L with an RT kernel as
transmit queue timeouts and device resets.
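The publication order described above can be sketched in portable C11. This is a userspace analogy under stated assumptions, not the driver code: struct demo_desc and the DEMO_DT_* values are hypothetical, and atomic_thread_fence(memory_order_release) merely stands in for the kernel's dma_wmb(), which orders CPU writes as seen by a device rather than by another thread. The shape is the same, though: fill every field, mark the last descriptor, issue the barrier, and only then write the value that lets the consumer start.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical descriptor types; values are illustrative only. */
enum demo_dt {
	DEMO_DT_EMPTY = 0,
	DEMO_DT_FSTART,
	DEMO_DT_FEND,
};

/* Hypothetical two-part TX descriptor layout. */
struct demo_desc {
	uint32_t addr;
	uint16_t len;
	_Atomic uint8_t die_dt; /* the consumer starts when it sees FSTART */
};

/* Publish a two-descriptor frame so that FSTART becomes visible last. */
static void demo_publish_two(struct demo_desc *first, struct demo_desc *last,
			     uint32_t addr, uint16_t len1, uint16_t len2)
{
	/* 1. Fill in all payload fields of both descriptors. */
	first->addr = addr;
	first->len = len1;
	last->addr = addr + len1;
	last->len = len2;

	/* 2. Mark the end of the chain. */
	atomic_store_explicit(&last->die_dt, DEMO_DT_FEND,
			      memory_order_relaxed);

	/* 3. Barrier: everything above is ordered before the store below.
	 *    Stand-in for dma_wmb(); an assumption of this sketch. */
	atomic_thread_fence(memory_order_release);

	/* 4. Only now expose the start of the chain. */
	atomic_store_explicit(&first->die_dt, DEMO_DT_FSTART,
			      memory_order_relaxed);
}
```

Placing the barrier between steps 2 and 4 is the point of the fix: if it sat before step 2, nothing would prevent the start marker from being observed ahead of the end marker.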
Linus Torvalds [Thu, 23 Oct 2025 01:00:34 +0000 (15:00 -1000)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"All driver fixes. The big change is the storvsc one to rejig the
hyper-v channel handling to be more efficient for SMP virtual
machines"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: phy: dt-bindings: Add QMP UFS PHY compatible for Kaanapali
scsi: ufs: qcom: dt-bindings: Document the Kaanapali UFS controller
scsi: libfc: Prevent integer overflow in fc_fcp_recv_data()
scsi: qla4xxx: Fix typos in comments
scsi: storvsc: Prefer returning channel with the same CPU as on the I/O issuing CPU
Linus Torvalds [Thu, 23 Oct 2025 00:57:35 +0000 (14:57 -1000)]
Merge tag 'mm-hotfixes-stable-2025-10-22-12-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
"17 hotfixes. 12 are cc:stable and 14 are for MM.
There's a two-patch DAMON series from SeongJae Park which addresses a
missed check and possible memory leak. Apart from that it's all
singletons - please see the changelogs for details"
* tag 'mm-hotfixes-stable-2025-10-22-12-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
csky: abiv2: adapt to new folio flags field
mm/damon/core: use damos_commit_quota_goal() for new goal commit
mm/damon/core: fix potential memory leak by cleaning ops_filter in damon_destroy_scheme
hugetlbfs: move lock assertions after early returns in huge_pmd_unshare()
vmw_balloon: indicate success when effectively deflating during migration
mm/damon/core: fix list_add_tail() call on damon_call()
mm/mremap: correctly account old mapping after MREMAP_DONTUNMAP remap
mm: prevent poison consumption when splitting THP
ocfs2: clear extent cache after moving/defragmenting extents
mm: don't spin in add_stack_record when gfp flags don't allow
dma-debug: don't report false positives with DMA_BOUNCE_UNALIGNED_KMALLOC
mm/damon/sysfs: dealloc commit test ctx always
mm/damon/sysfs: catch commit test ctx alloc failure
hung_task: fix warnings caused by unaligned lock pointers
Linus Torvalds [Wed, 22 Oct 2025 15:17:32 +0000 (05:17 -1000)]
Merge tag 'platform-drivers-x86-v6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
- alienware-wmi-wmax:
- Fix NULL pointer dereference in sleep handlers
- Add AWCC support to Dell G15 5530
- mellanox: mlxbf-pmc: add sysfs_attr_init() to count_clock init
* tag 'platform-drivers-x86-v6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: alienware-wmi-wmax: Add AWCC support to Dell G15 5530
MAINTAINERS: add Denis Benato as maintainer for asus notebooks
platform/mellanox: mlxbf-pmc: add sysfs_attr_init() to count_clock init
platform/x86: alienware-wmi-wmax: Fix NULL pointer dereference in sleep handlers