Daniel Borkmann [Mon, 13 Apr 2026 22:08:04 +0000 (00:08 +0200)]
tools/ynl: Make YnlFamily closeable as a context manager
YnlFamily opens an AF_NETLINK socket in __init__ but has no way
to release it other than leaving it to the GC. YnlFamily holds a
self reference cycle through SpecFamily's self.family = self
in its super().__init__() call, so refcount GC cannot reclaim
it and the socket stays open until the cyclic GC runs.
If a test creates a guest netns, instantiates a YnlFamily inside
it via NetNSEnter(), performs some test case work via Ynl, and
then deletes the netns, then the 'ip netns del' only drops the
mount binding and cleanup_net in the kernel never runs, so any
subsequent test case assertions that objects got cleaned up would
fail given this only gets triggered later via cyclic GC run.
Add an explicit close() that closes the netlink socket and wire
up the __enter__/__exit__ so callers can scope the instance
deterministically via 'with YnlFamily(...) as ynl: ...'.
Lorenzo Bianconi [Sun, 12 Apr 2026 09:56:25 +0000 (11:56 +0200)]
net: airoha: Remove PCE_MC_EN_MASK bit in REG_FE_PCE_CFG configuration
PCE_MC_EN_MASK bit in REG_FE_PCE_CFG configuration performed in
airoha_fe_init() is used to duplicate multicast packets and send a copy
to the CPU when the traffic is offloaded. This is necessary just if
it is requested by the user. Disable multicast packets duplication by
default.
====================
net: airoha: Preliminary series to support multiple net_devices connected to the same GDM port
EN7581 or AN7583 SoCs support connecting multiple external SerDes (e.g.
Ethernet or USB SerDes) to GDM3 or GDM4 ports via a hw arbiter that
manages the traffic in a TDM manner.
This series introduces some preliminary changes necessary to introduce
support for multiple net_devices connected to the same Frame Engine (FE)
GDM port (GDM3 or GDM4).
====================
Lorenzo Bianconi [Sun, 12 Apr 2026 17:13:14 +0000 (19:13 +0200)]
net: airoha: Rely on net_device pointer in ETS callbacks
Remove airoha_gdm_port dependency in ETS tc callback signatures and rely
on net_device pointer instead. Please note this patch does not introduce
any logical change and it is a preliminary patch in order to support
multiple net_devices connected to the same GDM3 or GDM4 port via an
external hw arbiter.
Lorenzo Bianconi [Sun, 12 Apr 2026 17:13:13 +0000 (19:13 +0200)]
net: airoha: Rely on net_device pointer in HTB callbacks
Remove airoha_gdm_port dependency in HTB tc callback signatures and rely
on net_device pointer instead. Please note this patch does not introduce
any logical change and it is a preliminary patch in order to support
multiple net_devices connected to the same GDM3 or GDM4 port via an
external hw arbiter.
Lorenzo Bianconi [Sun, 12 Apr 2026 17:13:12 +0000 (19:13 +0200)]
net: airoha: Rely on net_device pointer in airoha_dev_setup_tc_block signature
Remove airoha_gdm_port dependency in airoha_dev_setup_tc_block routine
signature and rely on net_device pointer instead. Please note this patch
does not introduce any logical change and it is a preliminary patch to
support multiple net_devices connected to the GDM3 or GDM4 ports via an
external hw arbiter.
====================
net: dsa: mxl862xx: add statistics support
Add per-port RMON statistics support for the MxL862xx DSA driver,
covering hardware-specific ethtool -S counters, standard IEEE 802.3
MAC/ctrl/pause statistics, and rtnl_link_stats64 via polled 64-bit
accumulation.
====================
Daniel Golle [Sun, 12 Apr 2026 00:02:05 +0000 (01:02 +0100)]
net: dsa: mxl862xx: implement .get_stats64
Poll free-running firmware RMON counters every 2 seconds and accumulate
deltas into 64-bit per-port statistics. 32-bit packet counters wrap
in ~220s at 10 Gbps line rate with minimum-size frames; the 2s polling
interval provides a comfortable margin. The .get_stats64 callback
forces a fresh poll so that counters are always up to date when queried.
Daniel Golle [Sun, 12 Apr 2026 00:01:57 +0000 (01:01 +0100)]
net: dsa: mxl862xx: add ethtool statistics support
The MxL862xx firmware exposes per-port RMON counters through the
RMON_PORT_GET command, covering standard IEEE 802.3 MAC statistics
(unicast/multicast/broadcast packet and byte counts, collision
counters, pause frames) as well as hardware-specific counters such
as extended VLAN discard and MTU exceed events.
Add the RMON counter firmware API structures and command definitions.
Implement .get_strings, .get_sset_count, and .get_ethtool_stats for
legacy ethtool -S support. Implement .get_eth_mac_stats,
.get_eth_ctrl_stats, and .get_pause_stats for the standardized
IEEE 802.3 statistics interface.
Back in 2024 I reported a 7-12% regression on an iperf3 UDP loopback
thoughput test that we traced to the extra overhead of calling
compute_score on two places, introduced by commit f0ea27e7bfe1 ("udp:
re-score reuseport groups when connected sockets are present"). At the
time, I pointed out the overhead was caused by the multiple calls,
associated with cpu-specific mitigations, and merged commit 50aee97d1511 ("udp: Avoid call to compute_score on multiple sites") to
jump back explicitly, to force the rescore call in a single place.
Recently though, we got another regression report against a newer distro
version, which a team colleague traced back to the same root-cause.
Turns out that once we updated to gcc-13, the compiler got smart enough
to unroll the loop, undoing my previous mitigation. Let's bite the
bullet and __always_inline compute_score on both ipv4 and ipv6 to
prevent gcc from de-optimizing it again in the future. These functions
are only called in two places each, udpX_lib_lookup1 and
udpX_lib_lookup2, so the extra size shouldn't be a problem and it is hot
enough to be very visible in profilings. In fact, with gcc13, forcing
the inline will prevent gcc from unrolling the fix from commit 50aee97d1511, so we don't end up increasing udpX_lib_lookup2 at all.
I haven't recollected the results myself, as I don't have access to the
machine at the moment. But the same colleague reported 4.67%
inprovement with this patch in the loopback benchmark, solving the
regression report within noise margins.
Eric Dumazet reported no size change to vmlinux when built with clang.
I report the same also with gcc-13:
Fixes: 50aee97d1511 ("udp: Avoid call to compute_score on multiple sites") Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de> Link: https://patch.msgid.link/20260410155936.654915-1-krisman@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
net: move .getsockopt away from __user buffers
Currently, the .getsockopt callback requires __user pointers:
int (*getsockopt)(struct socket *sock, int level,
int optname, char __user *optval, int __user *optlen);
This prevents kernel callers (io_uring, BPF) from using getsockopt on
levels other than SOL_SOCKET, since they pass kernel pointers.
Following Linus' suggestion [0], this series introduces sockopt_t, a
type-safe wrapper around iov_iter, and a getsockopt_iter callback that
works with both user and kernel buffers. AF_PACKET and CAN raw are
converted as initial users, with selftests covering the trickiest
conversion patterns.
Convert CAN raw socket's getsockopt implementation to use the new
getsockopt_iter callback with sockopt_t.
Key changes:
- Replace (char __user *optval, int __user *optlen) with sockopt_t *opt
- Use opt->optlen for buffer length (input) and returned size (output)
- Use copy_to_iter() instead of copy_to_user()
- For CAN_RAW_FILTER and CAN_RAW_XL_VCID_OPTS: on -ERANGE, set
opt->optlen to the required buffer size. The wrapper writes this
back to userspace even on error, preserving the existing API that
lets userspace discover the needed allocation size.
Convert AF_PACKET's getsockopt implementation to use the new
getsockopt_iter callback with sockopt_t.
Key changes:
- Replace (char __user *optval, int __user *optlen) with sockopt_t *opt
- Use opt->optlen for buffer length (input) and returned size (output)
- Use copy_to_iter() instead of put_user()/copy_to_user()
- For PACKET_HDRLEN which reads from optval: use opt->iter_in with
copy_from_iter() for the input read, then the common opt->iter_out
copy_to_iter() epilogue handles the output
Update do_sock_getsockopt() to use the new getsockopt_iter callback
when available. Add do_sock_getsockopt_iter() helper that:
1. Reads optlen from user/kernel space
2. Initializes a sockopt_t with the appropriate iov_iter (kvec for
kernel, ubuf for user buffers) and sets opt.optlen
3. Calls the protocol's getsockopt_iter callback
4. Writes opt.optlen back to user/kernel space
The optlen is always written back, even on failure. Some protocols
(e.g. CAN raw) return -ERANGE and set optlen to the required buffer
size so userspace knows how much to allocate.
The callback is responsible for setting opt.optlen to indicate the
returned data size.
Important to say that iov_out does not need to be copied back in
do_sock_getsockopt().
When optval is not kernel (the userspace path), sockptr_to_sockopt()
sets up opt->iter_out as a ITER_DEST ubuf iterator pointing directly at
the userspace buffer (optval.user). So when getsockopt_iter
implementations call copy_to_iter(..., &opt->iter_out), the data is
written directly to userspace — no intermediate kernel buffer is
involved.
When optval.is_kernel is true (the in-kernel path, e.g. from io_uring),
the kvec points at the already-provided kernel buffer (optval.kernel),
so the data lands in the caller's buffer directly via the kvec-backed
iterator.
In both cases the iterator writes to the final destination in-place at
protocol callback. There's nothing to copy back — only optlen needs to
be written back.
Add a new getsockopt_iter callback to struct proto_ops that uses
sockopt_t, a type-safe wrapper around iov_iter. This provides a clean
interface for socket option operations that works with both user and
kernel buffers.
The sockopt_t type encapsulates an iov_iter and an optlen field.
The optlen field, although not suggested by Linus, serves as both input
(buffer size) and output (returned data size), allowing callbacks to
return random values independent of the bytes written via
copy_to_iter(), so, keep it separated from iov_iter.count.
This is preparatory work for removing the SOL_SOCKET level restriction
from io_uring getsockopt operations.
Keep in mind that both iter_out and iter_in always point to the same
data at all times, and we just have two of them to make the callback
implementation sane.
Jakub Kicinski [Mon, 13 Apr 2026 21:26:02 +0000 (14:26 -0700)]
Merge tag 'for-net-next-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Luiz Augusto von Dentz says:
====================
bluetooth-next pull request for net-next:
core:
- hci_core: Rate limit the logging of invalid ISO handle
- hci_sync: make hci_cmd_sync_run_once return -EEXIST if exists
- hci_event: fix locking in hci_conn_request_evt() with HCI_PROTO_DEFER
- hci_event: fix potential UAF in SSP passkey handlers
- HCI: Avoid a couple -Wflex-array-member-not-at-end warnings
- L2CAP: CoC: Disconnect if received packet size exceeds MPS
- L2CAP: Add missing chan lock in l2cap_ecred_reconf_rsp
- L2CAP: Fix printing wrong information if SDU length exceeds MTU
- SCO: check for codecs->num_codecs == 1 before assigning to sco_pi(sk)->codec
drivers:
- btusb: MT7922: Add VID/PID 0489/e174
- btusb: Add Lite-On 04ca:3807 for MediaTek MT7921
- btusb: Add MT7927 IDs ASUS ROG Crosshair X870E Hero, Lenovo Legion Pro 7
16ARX9, Gigabyte Z790 AORUS MASTER X, MSI X870E Ace Max, TP-Link
Archer TBE550E, ASUS X870E / ProArt X870E-Creator.
- btusb: Add MT7902 IDs 13d3/3579, 13d3/3580, 13d3/3594, 13d3/3596, 0e8d/1ede
- btusb: Add MT7902 IDs 13d3/3579, 13d3/3580, 13d3/3594, 13d3/3596, 0e8d/1ede
- btusb: MediaTek MT7922: Add VID 0489 & PID e11d
- btintel: Add support for Scorpious Peak2 support
- btintel: Add support for Scorpious Peak2F support
- btintel_pcie: Add device id of Scorpius Peak2, Nova Lake-PCD-H
- btintel_pcie: Add device id of Scorpious2, Nova Lake-PCD-S
- btmtk: Add reset mechanism if downloading firmware failed
- btmtk: Add MT6639 (MT7927) Bluetooth support
- btmtk: fix ISO interface setup for single alt setting
- btmtk: add MT7902 SDIO support
- Bluetooth: btmtk: add MT7902 MCU support
- btbcm: Add entry for BCM4343A2 UART Bluetooth
- qca: enable pwrseq support for wcn39xx devices
- hci_qca: Fix BT not getting powered-off on rmmod
- hci_qca: disable power control for WCN7850 when bt_en is not defined
- hci_qca: Fix missing wakeup during SSR memdump handling
- hci_ldisc: Clear HCI_UART_PROTO_INIT on error
- mmc: sdio: add MediaTek MT7902 SDIO device ID
- hci_ll: Enable BROKEN_ENHANCED_SETUP_SYNC_CONN for WL183x
* tag 'for-net-next-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (59 commits)
Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling
Bluetooth: btintel_pcie: use strscpy to copy plain strings
Bluetooth: hci_event: fix potential UAF in SSP passkey handlers
Bluetooth: hci.h: Avoid a couple -Wflex-array-member-not-at-end warnings
Bluetooth: SCO: check for codecs->num_codecs == 1 before assigning to sco_pi(sk)->codec
Bluetooth: btintel_pcie: Align shared DMA memory to 128 bytes
Bluetooth: l2cap: Add missing chan lock in l2cap_ecred_reconf_rsp
Bluetooth: hci_ll: Enable BROKEN_ENHANCED_SETUP_SYNC_CONN for WL183x
Bluetooth: btusb: MediaTek MT7922: Add VID 0489 & PID e11d
Bluetooth: btmtk: hide unused btmtk_mt6639_devs[] array
Bluetooth: btusb: Add MT7927 ID for ASUS X870E / ProArt X870E-Creator
Bluetooth: btusb: Add MT7927 ID for TP-Link Archer TBE550E
Bluetooth: btusb: Add MT7927 ID for MSI X870E Ace Max
Bluetooth: btusb: Add MT7927 ID for Gigabyte Z790 AORUS MASTER X
Bluetooth: btusb: Add MT7927 ID for Lenovo Legion Pro 7 16ARX9
Bluetooth: btusb: Add MT7927 ID for ASUS ROG Crosshair X870E Hero
Bluetooth: btmtk: fix ISO interface setup for single alt setting
Bluetooth: btmtk: Add MT6639 (MT7927) Bluetooth support
Bluetooth: fix locking in hci_conn_request_evt() with HCI_PROTO_DEFER
Bluetooth: btmtk: refactor endpoint lookup
...
====================
mlx4: correct error reporting in mlx4_master_process_vhcr()
mlx4_master_process_vhcr() logs vhcr->errno on failures, but this field
is never populated by the PF path. As a result, all failures are reported
with errno 0 and err print in status case which is misleading.
Use the actual return value (err) instead, translate it to FW status
before logging, and report both values.
Jakub Kicinski [Wed, 8 Apr 2026 00:14:38 +0000 (17:14 -0700)]
tcp: update window_clamp when SO_RCVBUF is set
Commit under Fixes moved recomputing the window clamp to
tcp_measure_rcv_mss() (when scaling_ratio changes).
I suspect it missed the fact that we don't recompute the clamp
when rcvbuf is set. Until scaling_ratio changes we are
stuck with the old window clamp which may be based on
the small initial buffer. scaling_ratio may never change.
Inspired by Eric's recent commit d1361840f8c5 ("tcp: fix
SO_RCVLOWAT and RCVBUF autotuning") plumb the user action
thru to TCP and have it update the clamp.
A smaller fix would be to just have tcp_rcvbuf_grow()
adjust the clamp even if SOCK_RCVBUF_LOCK is set.
But IIUC this is what we were trying to get away from
in the first place.
Fixes: a2cbb1603943 ("tcp: Update window clamping condition") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumaze@google.com> Link: https://patch.msgid.link/20260408001438.129165-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling
When a Bluetooth controller encounters a coredump, it triggers the
Subsystem Restart (SSR) mechanism. The controller first reports the
coredump data and, once the upload is complete, sends a hw_error
event. The host relies on this event to proceed with subsequent
recovery actions.
If the host has not finished processing the coredump data when the
hw_error event is received, it waits until either the processing is
complete or the 8-second timeout expires before handling the event.
The current implementation clears QCA_MEMDUMP_COLLECTION using
clear_bit(), which does not wake up waiters sleeping in
wait_on_bit_timeout(). As a result, the waiting thread may remain
blocked until the timeout expires even if the coredump collection
has already completed.
Fix this by clearing QCA_MEMDUMP_COLLECTION with
clear_and_wake_up_bit(), which also wakes up the waiting thread and
allows the hw_error handling to proceed immediately.
Test case:
- Trigger a controller coredump using:
hcitool cmd 0x3f 0c 26
- Tested on QCA6390.
- Capture HCI logs using btmon.
- Verify that the delay between receiving the hw_error event and
initiating the power-off sequence is reduced compared to the
timeout-based behavior.
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Shuai Zhang <shuai.zhang@oss.qualcomm.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: hci_event: fix potential UAF in SSP passkey handlers
hci_conn lookup and field access must be covered by hdev lock in
hci_user_passkey_notify_evt() and hci_keypress_notify_evt(), otherwise
the connection can be freed concurrently.
Extend the hci_dev_lock critical section to cover all conn usage in both
handlers.
Keep the existing keypress notification behavior unchanged by routing
the early exits through a common unlock path.
Fixes: 92a25256f142 ("Bluetooth: mgmt: Implement support for passkey notification") Cc: stable@vger.kernel.org Signed-off-by: Shuvam Pandey <shuvampandey1@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: hci.h: Avoid a couple -Wflex-array-member-not-at-end warnings
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.
struct hci_std_codecs and struct hci_std_codecs_v2 are flexible
structures, this is structures that contain a flexible-array member
(__u8 codec[]; and struct hci_std_codec_v2 codec[];, correspondingly.)
Since struct hci_rp_read_local_supported_codecs and struct
hci_rp_read_local_supported_codecs_v2 are defined by hardware, we
create the new struct hci_std_codecs_hdr and struct hci_std_codecs_v2_hdr
types, and use them to replace the object types causing trouble in
struct hci_rp_read_local_supported_codecs and struct
hci_rp_read_local_supported_codecs_v2, namely struct hci_std_codecs
std_codecs; and struct hci_std_codecs_v2_hdr std_codecs;.
Also, once -fms-extensions is enabled, we can use transparent struct
members in both struct hci_std_codecs and struct hci_std_codecs_v2_hdr.
Notice that the newly created types does not contain the flex-array
member `codec`, which is the object causing the -Wfamnae warnings.
After these changes, the size of struct hci_rp_read_local_supported_codecs
and struct hci_rp_read_local_supported_codecs_v2, along with their
member's offsets remain the same, hence the memory layouts don't
change:
include/net/bluetooth/hci.h:1490:31: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
include/net/bluetooth/hci.h:1525:34: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: SCO: check for codecs->num_codecs == 1 before assigning to sco_pi(sk)->codec
copy_struct_from_sockptr() fill 'buffer' in
sco_sock_setsockopt() with zeros, so there's no
real problem.
But it actually looks strange to do this,
without checking all of codecs->codecs[0]
really comes from userspace:
sco_pi(sk)->codec = codecs->codecs[0];
As only optlen < sizeof(struct bt_codecs) is checked
and codecs->num_codecs is not checked against != 1,
but only <= 1, and the space for the additional struct bt_codec
is not checked.
Note I don't understand bluetooth and I didn't do any runtime
tests with this! I just found it when debugging a problem
in copy_struct_from_sockptr().
I just added this to check the size is as expected:
make CF=-D__CHECK_ENDIAN__ W=1ce C=1 net/bluetooth/sco.o
Fixes: 3e643e4efa1e ("Bluetooth: Improve setsockopt() handling of malformed user input") Cc: Michal Luczaj <mhal@rbox.co> Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: David Wei <dw@davidwei.uk> Cc: linux-bluetooth@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Kiran K [Tue, 7 Apr 2026 10:02:40 +0000 (15:32 +0530)]
Bluetooth: btintel_pcie: Align shared DMA memory to 128 bytes
Align each descriptor/index/context region to 128 bytes before
calculating the total DMA pool size. This ensures the memory layout
shared with firmware meets the 128-byte alignment requirement.
The DMA pool alignment is also set to 128 bytes to match the firmware
expectation for all shared structures.
Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Dudu Lu [Sun, 5 Apr 2026 15:47:41 +0000 (23:47 +0800)]
Bluetooth: l2cap: Add missing chan lock in l2cap_ecred_reconf_rsp
l2cap_ecred_reconf_rsp() calls l2cap_chan_del() without holding
l2cap_chan_lock(). Every other l2cap_chan_del() caller in the file
acquires the lock first. A remote BLE device can send a crafted
L2CAP ECRED reconfiguration response to corrupt the channel list
while another thread is iterating it.
Add l2cap_chan_hold() and l2cap_chan_lock() before l2cap_chan_del(),
and l2cap_chan_unlock() and l2cap_chan_put() after, matching the
pattern used in l2cap_ecred_conn_rsp() and l2cap_conn_del().
Fixes: 15f02b910562 ("Bluetooth: L2CAP: Add initial code for Enhanced Credit Based Mode") Signed-off-by: Dudu Lu <phx0fer@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: hci_ll: Enable BROKEN_ENHANCED_SETUP_SYNC_CONN for WL183x
TI WL183x controllers advertise support for the HCI Enhanced Setup
Synchronous Connection command, but SCO setup fails when the enhanced
path is used. The only working configuration is to fall back to the
legacy HCI Setup Synchronous Connection (0x0028).
This matches the scenario described in commit 05abad857277
("Bluetooth: HCI: Add HCI_QUIRK_BROKEN_ENHANCED_SETUP_SYNC_CONN quirk").
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Kamiyama Chiaki <nercone@nercone.dev> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
When USB support is disabled, the array is not referenced anywhere,
causing a warning:
drivers/bluetooth/btmtk.c:35:3: error: 'btmtk_mt6639_devs' defined but not used [-Werror=unused-const-variable=]
35 | } btmtk_mt6639_devs[] = {
| ^~~~~~~~~~~~~~~~~
Move it into the #ifdef block.
Fixes: 28b7c5a6db74 ("Bluetooth: btmtk: Add MT6639 (MT7927) Bluetooth support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Javier Tia [Mon, 30 Mar 2026 20:39:30 +0000 (14:39 -0600)]
Bluetooth: btusb: Add MT7927 ID for ASUS X870E / ProArt X870E-Creator
Add USB device ID 13d3:3588 (IMC Networks/Azurewave) for the MediaTek
MT7927 (Filogic 380) Bluetooth interface found on the ASUS ROG STRIX
X870E-E GAMING WIFI and ASUS ProArt X870E-Creator WiFi motherboards.
The information in /sys/kernel/debug/usb/devices about the Bluetooth
device is listed as the below.
Javier Tia [Mon, 30 Mar 2026 20:39:29 +0000 (14:39 -0600)]
Bluetooth: btusb: Add MT7927 ID for TP-Link Archer TBE550E
Add USB device ID 0489:e116 (Foxconn/Hon Hai) for the MediaTek MT7927
(Filogic 380) Bluetooth interface found on the TP-Link Archer TBE550E
PCIe adapter.
The information in /sys/kernel/debug/usb/devices about the Bluetooth
device is listed as the below.
Javier Tia [Mon, 30 Mar 2026 20:39:27 +0000 (14:39 -0600)]
Bluetooth: btusb: Add MT7927 ID for Gigabyte Z790 AORUS MASTER X
Add USB device ID 0489:e10f (Foxconn/Hon Hai) for the MediaTek MT7927
(Filogic 380) Bluetooth interface found on the Gigabyte Z790 AORUS
MASTER X motherboard.
The information in /sys/kernel/debug/usb/devices about the Bluetooth
device is listed as the below.
Javier Tia [Mon, 30 Mar 2026 20:39:26 +0000 (14:39 -0600)]
Bluetooth: btusb: Add MT7927 ID for Lenovo Legion Pro 7 16ARX9
Add USB device ID 0489:e0fa (Foxconn/Hon Hai) for the MediaTek MT7927
(Filogic 380) Bluetooth interface found on the Lenovo Legion Pro 7
16ARX9 laptop.
The information in /sys/kernel/debug/usb/devices about the Bluetooth
device is listed as the below.
Javier Tia [Mon, 30 Mar 2026 20:39:25 +0000 (14:39 -0600)]
Bluetooth: btusb: Add MT7927 ID for ASUS ROG Crosshair X870E Hero
Add USB device ID 0489:e13a (Foxconn/Hon Hai) for the MediaTek MT7927
(Filogic 380) Bluetooth interface found on the ASUS ROG Crosshair X870E
Hero WiFi motherboard.
The information in /sys/kernel/debug/usb/devices about the Bluetooth
device is listed as the below.
Javier Tia [Mon, 30 Mar 2026 20:39:24 +0000 (14:39 -0600)]
Bluetooth: btmtk: fix ISO interface setup for single alt setting
Some MT6639 Bluetooth USB interfaces (e.g. IMC Networks 13d3:3588 on
ASUS ROG STRIX X870E-E and ProArt X870E-Creator boards) expose only a
single alternate setting (alt 0) on the ISO interface. The driver
unconditionally requests alt setting 1, which fails with EINVAL on
these devices, causing a ~20 second initialization delay and no LE
audio support.
Check the number of available alternate settings before selecting one.
If only alt 0 exists, use it; otherwise request alt 1 as before.
Closes: https://github.com/jetm/mediatek-mt7927-dkms/pull/39 Signed-off-by: Javier Tia <floss@jetm.me> Reported-by: Ryan Gilbert <xelnaga@gmail.com> Tested-by: Ryan Gilbert <xelnaga@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Javier Tia [Mon, 30 Mar 2026 20:39:23 +0000 (14:39 -0600)]
Bluetooth: btmtk: Add MT6639 (MT7927) Bluetooth support
The MediaTek MT7927 (Filogic 380) combo WiFi 7 + BT 5.4 module uses
hardware variant 0x6639 for its Bluetooth subsystem. Without this patch,
the chip fails with "Unsupported hardware variant (00006639)" or hangs
during firmware download.
Three changes are needed to support MT6639:
1. CHIPID workaround: On some boards the BT USB MMIO register reads
0x0000 for dev_id, causing the driver to skip the 0x6639 init path.
Force dev_id to 0x6639 only when the USB VID/PID matches a known
MT6639 device, avoiding misdetection if a future chip also reads
zero. This follows the WiFi-side pattern that uses PCI device IDs
to scope the same workaround.
2. Firmware naming: MT6639 uses firmware version prefix "2_1" instead of
"1_1" used by MT7925 and other variants. The firmware path is
mediatek/mt7927/BT_RAM_CODE_MT6639_2_1_hdr.bin, using the mt7927
directory to match the WiFi firmware convention. The filename will
likely change to use MT7927 once MediaTek submits a dedicated
Linux firmware binary.
3. Section filtering: The MT6639 firmware binary contains 9 sections, but
only sections with (dlmodecrctype & 0xff) == 0x01 are Bluetooth-related.
Sending the remaining WiFi/other sections causes an irreversible BT
subsystem hang requiring a full power cycle. This matches the Windows
driver behavior observed via USB captures.
Also add 0x6639 to the reset register (CONNV3) and firmware setup switch
cases alongside the existing 0x7925 handling.
Pauli Virtanen [Sun, 29 Mar 2026 13:42:59 +0000 (16:42 +0300)]
Bluetooth: fix locking in hci_conn_request_evt() with HCI_PROTO_DEFER
When protocol sets HCI_PROTO_DEFER, hci_conn_request_evt() calls
hci_connect_cfm(conn) without hdev->lock. Generally hci_connect_cfm()
assumes it is held, and if conn is deleted concurrently -> UAF.
Only SCO and ISO set HCI_PROTO_DEFER and only for defer setup listen,
and HCI_EV_CONN_REQUEST is not generated for ISO. In the non-deferred
listening socket code paths, hci_connect_cfm(conn) is called with
hdev->lock held.
Fix by holding the lock.
Fixes: 70c464256310 ("Bluetooth: Refactor connection request handling") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: hci_ldisc: Clear HCI_UART_PROTO_INIT on error
When hci_register_dev() fails in hci_uart_register_dev()
HCI_UART_PROTO_INIT is not cleared before calling hu->proto->close(hu)
and setting hu->hdev to NULL. This means incoming UART data will reach
the protocol-specific recv handler in hci_uart_tty_receive() after
resources are freed.
Clear HCI_UART_PROTO_INIT with a write lock before calling
hu->proto->close() and setting hu->hdev to NULL. The write lock ensures
all active readers have completed and no new reader can enter the
protocol recv path before resources are freed.
This allows the protocol-specific recv functions to remove the
"HCI_UART_REGISTERED" guard without risking a null pointer dereference
if hci_register_dev() fails.
Fixes: 5df5dafc171b ("Bluetooth: hci_uart: Fix another race during initialization") Signed-off-by: Jonathan Rissanen <jonathan.rissanen@axis.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Pauli Virtanen [Wed, 25 Mar 2026 19:07:45 +0000 (21:07 +0200)]
Bluetooth: hci_sync: make hci_cmd_sync_run_once return -EEXIST if exists
hci_cmd_sync_run_once() needs to indicate whether a queue item was
added, so caller can know if callbacks are called, so it can avoid
leaking resources.
Change the function to return -EEXIST if queue item already exists.
Modify all callsites vs. the changes. The only callsite is
hci_abort_conn().
Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Shuai Zhang [Tue, 24 Mar 2026 02:30:16 +0000 (10:30 +0800)]
Bluetooth: hci_qca: disable power control for WCN7850 when bt_en is not defined
On platforms using an M.2 slot with both UART and USB support, bt_en is
pulled high by hardware. In this case, software-based power control
should be disabled. The current platforms are Lemans-EVK and Monaco-EVK.
Add QCA_WCN7850 to the existing condition so that power_ctrl_enabled is
cleared when bt_en is not software-controlled (or absent), aligning its
behavior with WCN6750 and WCN6855
Signed-off-by: Shuai Zhang <shuai.zhang@oss.qualcomm.com> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
====================
Decouple receive and transmit enablement in team driver
Allow independent control over receive and transmit enablement states
for aggregated ports in the team driver.
The motivation is that IEE 802.3ad LACP "independent control" can't
be implemented for the team driver currently. This was added to the
bonding driver in commit 240fd405528b ("bonding: Add independent
control state machine").
This series also has a few patches that add tests to show that the old
coupled enablement still works and that the new decoupled enablement
works as intended (4, 5, and 10).
There are three patches with small fixes as well, with the goal of
making the final decoupling patch clearer (1, 2, and 3).
Signed-off-by: Marc Harvey <marcharvey@google.com>
====================
Marc Harvey [Thu, 9 Apr 2026 02:59:31 +0000 (02:59 +0000)]
net: team: Add new tx_enabled team port option
This option allows independent control over tx enablement without
affecting rx enablement. Like the rx_enabled option, this also
implicitly affects the enabled option.
If this option is not used, then the enabled option will continue to
behave as it did before.
Marc Harvey [Thu, 9 Apr 2026 02:59:30 +0000 (02:59 +0000)]
net: team: Add new rx_enabled team port option
Allow independent control over rx enablement via the rx_enabled option
without affecting tx enablement. This affects the normal enabled
option since a port is only considered enabled if both tx and rx are
enabled.
If this option is not used, then the enabled option will continue to
behave exactly as it did before.
Marc Harvey [Thu, 9 Apr 2026 02:59:29 +0000 (02:59 +0000)]
net: team: Track rx enablement separately from tx enablement
Separate the rx and tx enablement/disablement into different
functions so that it is easier to interact with them independently
later.
Although this patch changes receive and transmit paths, the actual
behavior of the teaming driver should remain unchanged, since there
is no option introduced yet to change rx or tx enablement
independently. Those options will be added in follow-up patches.
Marc Harvey [Thu, 9 Apr 2026 02:59:28 +0000 (02:59 +0000)]
net: team: Rename enablement functions and struct members to tx
Add no functional changes, but rename enablement functions, variables
etc. that are used in teaming driver transmit decisions.
Since rx and tx enablement are still coupled, some of the variables
renamed in this patch are still used for the rx path, but that will
change in a follow-up patch.
Marc Harvey [Thu, 9 Apr 2026 02:59:27 +0000 (02:59 +0000)]
selftests: net: Add test for enablement of ports with teamd
There are no tests that verify enablement and disablement of team driver
ports with teamd. This should work even with changes to the enablement
option, so it is important to test.
This test sets up an active-backup network configuration across two
network namespaces, and tries to send traffic while changing which
link is the active one.
Also increase the team test timeout to 300 seconds, because gracefully
killing teamd can take 30 seconds for each instance.
Marc Harvey [Thu, 9 Apr 2026 02:59:26 +0000 (02:59 +0000)]
selftests: net: Add tests for failover of team-aggregated ports
There are currently no kernel tests that verify the effect of setting
the enabled team driver option. In a followup patch, there will be
changes to this option, so it will be important to make sure it still
behaves as it does now.
The test verifies that tcp continues to work across two different team
devices in separate network namespaces, even when member links are
manually disabled.
====================
net: fix skb_ext BUILD_BUG_ON failures with GCOV
This mini-series fixes build failures in net/core/skbuff.c when the
kernel is built with CONFIG_GCOV_PROFILE_ALL=y.
This is part of a larger effort to add -fprofile-update=atomic to
global CFLAGS_GCOV (posted earlier as a combined series):
https://lore.kernel.org/lkml/20260401142020.1434243-1-khorenko@virtuozzo.com/T/#t
That combined series was split per subsystem as requested by Jakub.
The companion patches are:
- iommu: use __always_inline for amdv1pt_install_leaf_entry()
(sent to iommu maintainers)
- gcov: add -fprofile-update=atomic globally (sent to gcov/kbuild
maintainers, depends on this series and the iommu patch)
Patch 1/2 fixes a pre-existing build failure with CONFIG_GCOV_PROFILE_ALL:
GCOV counters prevent GCC from constant-folding the skb_ext_total_length()
loop. It also removes the CONFIG_KCOV_INSTRUMENT_ALL preprocessor guard
from d6e5794b06c0: that guard was a precaution in case KCOV instrumentation
also prevented constant folding, but KCOV's -fsanitize-coverage=trace-pc
does not interfere with GCC's constant folding (verified experimentally
with GCC 14.2 and GCC 16.0.1), so the guard is unnecessary.
Patch 2/2 is an additional fix needed when -fprofile-update=atomic is
added to CFLAGS_GCOV: __no_profile on the __always_inline function alone
is insufficient because after inlining, the code resides in the caller's
profiled body. The caller (skb_extensions_init) needs __no_profile and
noinline to prevent re-exposure to GCOV instrumentation.
====================
net: add noinline __init __no_profile to skb_extensions_init() for GCOV compatibility
With -fprofile-update=atomic in global CFLAGS_GCOV, GCC still cannot
constant-fold the skb_ext_total_length() loop when it is inlined into a
profiled caller. The existing __no_profile on skb_ext_total_length()
itself is insufficient because after __always_inline expansion the code
resides in the caller's body, which still carries GCOV instrumentation.
Mark skb_extensions_init() with __no_profile so the BUILD_BUG_ON checks
can be evaluated at compile time. Also mark it noinline to prevent the
compiler from inlining it into skb_init() (which lacks __no_profile),
which would re-expose the function body to GCOV instrumentation.
Add __init since skb_extensions_init() is only called from __init
skb_init(). Previously it was implicitly inlined into the .init.text
section; with noinline it would otherwise remain in permanent .text,
wasting memory after boot.
Build-tested with both CONFIG_GCOV_PROFILE_ALL=y and
CONFIG_KCOV_INSTRUMENT_ALL=y.
net: fix skb_ext_total_length() BUILD_BUG_ON with CONFIG_GCOV_PROFILE_ALL
When CONFIG_GCOV_PROFILE_ALL=y is enabled, the kernel fails to build:
In file included from <command-line>:
In function 'skb_extensions_init',
inlined from 'skb_init' at net/core/skbuff.c:5214:2:
././include/linux/compiler_types.h:706:45: error: call to
'__compiletime_assert_1490' declared with attribute error:
BUILD_BUG_ON failed: skb_ext_total_length() > 255
CONFIG_GCOV_PROFILE_ALL adds -fprofile-arcs -ftest-coverage
-fno-tree-loop-im to CFLAGS globally. GCC inserts branch profiling
counters into the skb_ext_total_length() loop and, combined with
-fno-tree-loop-im (which disables loop invariant motion), cannot
constant-fold the result.
BUILD_BUG_ON requires a compile-time constant and fails.
The issue manifests in kernels with 5+ SKB extension types enabled
(e.g., after addition of SKB_EXT_CAN, SKB_EXT_PSP). With 4 extensions
GCC can still unroll and fold the loop despite GCOV instrumentation;
with 5+ it gives up.
Mark skb_ext_total_length() with __no_profile to prevent GCOV from
inserting counters into this function. Without counters the loop is
"clean" and GCC can constant-fold it even with -fno-tree-loop-im active.
This allows BUILD_BUG_ON to work correctly while keeping GCOV profiling
for the rest of the kernel.
This also removes the CONFIG_KCOV_INSTRUMENT_ALL preprocessor guard
introduced by d6e5794b06c0. That guard was added as a precaution because
KCOV instrumentation was also suspected of inhibiting constant folding.
However, KCOV uses -fsanitize-coverage=trace-pc, which inserts
lightweight trace callbacks that do not interfere with GCC's constant
folding or loop optimization passes. Only GCOV's -fprofile-arcs combined
with -fno-tree-loop-im actually prevents the compiler from evaluating
the loop at compile time. The guard is therefore unnecessary and can be
safely removed.
Fixes: 96ea3a1e2d31 ("can: add CAN skb extension infrastructure") Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com> Reviewed-by: Thomas Weissschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20260410162150.3105738-2-khorenko@virtuozzo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Qingfang Deng [Fri, 10 Apr 2026 05:49:49 +0000 (13:49 +0800)]
pppox: remove sk_pppox() helper
The sk member can be directly accessed from struct pppox_sock without
relying on type casting. Remove the sk_pppox() helper and update all
call sites to use po->sk directly.
Jakub Kicinski [Sun, 12 Apr 2026 21:34:27 +0000 (14:34 -0700)]
Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Tariq Toukan says:
====================
mlx5-next updates 2026-04-09
* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
net/mlx5: Add icm_mng_function_id_mode cap bit
net/mlx5: Rename MLX5_PF page counter type to MLX5_SELF
net/mlx5: Add vhca_id_type bit to alias context
mlx5: Remove redundant iseg base
====================
====================
Add support for PIC64-HPSC/HX MDIO controller
This series adds a driver for the two MDIO controllers of PIC64-HPSC/HX.
The hardware supports C22 and C45 but only C22 is implemented for now.
This MDIO hardware is based on a Microsemi design supported in Linux by
mdio-mscc-miim.c. However, The register interface is completely different
with pic64hpsc, hence the need for a separate driver.
The documentation recommends an input clock of 156.25MHz and a prescaler of
39, which yields an MDIO clock of 1.95MHz.
This was tested on Microchip HB1301 evalkit which has a VSC8574 and a
VSC8541. I've tested with bus frequencies of 0.6, 1.95 and 2.5 MHz.
This series also adds a PHY write barrier when disabling PHY interrupts as
discussed in: https://lore.kernel.org/acvUqDgepCIScs8M@shell.armlinux.org.uk
====================
Charles Perry [Wed, 8 Apr 2026 13:18:16 +0000 (06:18 -0700)]
net: phy: add a PHY write barrier when disabling interrupts
MDIO bus controllers are not required to wait for write transactions to
complete before returning as synchronization is often achieved by polling
status bits.
This can cause issues when disabling interrupts since an interrupt could
fire before the interrupt handler is unregistered and there's no status
bit to poll.
Add a phy_write_barrier() function and use it in phy_disable_interrupts()
to fix this issue. The write barrier just reads an MII register and
discards the value, which is enough to guarantee that previous writes have
completed.
Charles Perry [Wed, 8 Apr 2026 13:18:15 +0000 (06:18 -0700)]
net: mdio: add a driver for PIC64-HPSC/HX MDIO controller
This adds an MDIO driver for PIC64-HPSC/HX. The hardware supports C22
and C45 but only C22 is implemented in this commit.
This MDIO hardware is based on a Microsemi design supported in Linux by
mdio-mscc-miim.c. However, The register interface is completely
different with pic64hpsc, hence the need for a separate driver.
The documentation recommends an input clock of 156.25MHz and a prescaler
of 39, which yields an MDIO clock of 1.95MHz.
The hardware supports an interrupt pin or a "TRIGGER" bit that can be
polled to signal transaction completion. This commit uses polling.
This was tested on Microchip HB1301 evalkit with a VSC8574 and a
VSC8541.
This MDIO hardware is based on a Microsemi design supported in Linux by
mdio-mscc-miim.c. However, The register interface is completely different
with pic64hpsc, hence the need for separate documentation.
The hardware supports C22 and C45.
The documentation recommends an input clock of 156.25MHz and a prescaler
of 39, which yields an MDIO clock of 1.95MHz.
The hardware supports an interrupt pin to signal transaction completion
which is not strictly needed as the software can also poll a "TRIGGER"
bit for this.
====================
net: enetc: improve statistics for v1 and add statistics for v4
For ENETC v1, some standardized statistics were redundantly included in
the unstructured statistics, so remove these duplicated entries.
Previously, the unstructured statistics only contained eMAC data and
did not include pMAC data; add pMAC statistics to ensure completeness.
For ENETC v4, the driver previously reported MAC statistics only for the
internal ENETC (Pseudo MAC). Extend the implementation to provide
additional statistics for both the internal ENETC and the standalone
ENETC.
====================
net: enetc: add unstructured pMAC counters for ENETC v1
The ENETC v1 has two MACs (eMAC and pMAC) to support preemption. The
existing unstructured counters include the eMAC counters, but not the
pMAC counters. So add pMAC counters to improve statistical coverage.
net: enetc: remove standardized counters from enetc_pm_counters
The standardized counters are already exposed via the get_pause_stats(),
get_rmon_stats(), get_eth_ctrl_stats() and get_eth_mac_stats()
interfaces. Keeping the same counters in enetc_pm_counters results in
redundant output.
Remove these standardized counters from enetc_pm_counters and rely on
the existing statistics interfaces to report them.
net: enetc: show RX drop counters only for assigned RX rings
For ENETC v1, each SI provides 16 RBDCR registers for RX ring drop
counters, but this does not imply that an SI actually owns 16 RX rings.
The ENETC hardware supports a total of 16 RX rings, which are assigned
to 3 SIs (1 PSI and 2 VSIs), so each SI is assigned fewer than 16 RX
rings.
The current implementation always reports 16 RX drop counters per SI,
leading to redundant output for SIs with fewer RX rings. Update the
logic to display drop counters only for the RX rings that are actually
assigned to the SI.
net: enetc: add support for the standardized counters
ENETC v4 provides 64-bit counters for IEEE 802.3 basic and mandatory
managed objects, the IETF Management Information Database (MIB) package
(RFC2665), and Remote Network Monitoring (RMON) statistics. In addition,
some ENETCs support preemption, so these ENETCs have two MACs: MAC 0 is
the express MAC (eMAC), MAC 1 is the preemptible MAC (pMAC). Both MACs
support these statistics.
Gal Pressman [Thu, 9 Apr 2026 09:09:45 +0000 (12:09 +0300)]
gre: Count GRE packet drops
GRE is silently dropping packets without updating statistics.
In case of drop, increment rx_dropped counter to provide visibility into
packet loss. For the case where no GRE protocol handler is registered,
use rx_nohandler.
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20260409090945.1542440-1-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
net: phy: add support for disabling autonomous EEE
Some PHYs implement autonomous EEE where the PHY manages EEE
independently, preventing the MAC from controlling LPI signaling.
This conflicts with MACs that implement their own LPI control.
This series adds a .disable_autonomous_eee callback to struct phy_driver
and calls it from phy_support_eee(). When a MAC indicates it supports
EEE, the PHY's autonomous EEE is automatically disabled. The setting is
persisted across suspend/resume by re-applying it in phy_init_hw() after
soft reset, following the same pattern suggested by Russell King for PHY
tunables [1].
Patch 1 adds the phylib infrastructure.
Patch 2 implements it for Broadcom BCM54xx (AutogrEEEn).
Patch 3 converts the Realtek RTL8211F, which previously unconditionally
disabled PHY-mode EEE in config_init.
This came up while adding EEE support to the Cadence macb driver (used
on Raspberry Pi 5 with a BCM54210PE PHY). The PHY's AutogrEEEn mode
prevented the MAC from tracking LPI state. The Realtek RTL8211F has
the same pattern, unconditionally disabling PHY-mode EEE with the
comment "Disable PHY-mode EEE so LPI is passed to the MAC".
Other BCM54xx PHYs likely have the same AutogrEEEn register layout,
but I only have access to the BCM54210PE/BCM54213PE datasheets. It
would be appreciated if Florian or others could confirm which other
BCM54xx variants share this register so we can wire them up too.
Tested on Raspberry Pi CM4 (bcmgenet + BCM54210PE),
Raspberry Pi CM5 (Cadence GEM + BCM54210PE) and
Raspberry Pi 5 (Cadence GEM + BCM54213PE).
net: phy: realtek: convert RTL8211F to .disable_autonomous_eee
The RTL8211F previously unconditionally disabled PHY-mode EEE in
config_init. Convert this to use the new .disable_autonomous_eee
callback so it is only disabled when the MAC indicates EEE support
via phy_support_eee().
This preserves PHY-autonomous EEE for MACs that do not support EEE,
while still disabling it when the MAC manages LPI.
net: phy: add support for disabling PHY-autonomous EEE
Some PHYs (e.g. Broadcom BCM54xx, Realtek RTL8211F) implement
autonomous EEE where the PHY manages LPI signaling without forwarding
it to the MAC. This conflicts with MAC drivers that implement their own
LPI control.
Add a .disable_autonomous_eee callback to struct phy_driver and call it
from phy_support_eee(). When a MAC driver indicates it supports EEE via
phy_support_eee(), the PHY's autonomous EEE is automatically disabled so
the MAC can manage LPI entry/exit.
====================
ynl/ethtool/netlink: fix nla_len overflow for large string sets
This series addresses a silent data corruption issue triggered when ynl
retrieves string sets from NICs with a large number of statistics entries
(e.g. mlx5_core with thousands of ETH_SS_STATS strings).
The root cause is that struct nlattr.nla_len is a __u16 (max 65535
bytes). When a NIC exports enough statistics strings, the
ETHTOOL_A_STRINGSET_STRINGS nest built by strset_fill_set() exceeds
this limit. nla_nest_end() silently truncates the length on assignment,
producing a corrupted netlink message.
Patch 1 moves ethtool.py to selftest.
Patch 2 improves the ethtool tool: rename the doit/dumpit helpers
to do_set/do_get and convert do_get to use ynl.do() with an
explicit device header instead of a full dump with client-side filtering.
Patch 3 adds a --dbg-small-recv option to the YNL ethtool tool,
matching the same option already present in cli.py, to help debug netlink
message size issues
Patch 4 adds a new helper nla_nest_end_safe() to check whether the nla_len
is overflow and return -EMSGSIZE early if so.
Patch 5 uses the new helper in ethtool to make sure the ethtool doesn't
reply a corrupted netlink message.
====================
Hangbin Liu [Wed, 8 Apr 2026 07:08:53 +0000 (15:08 +0800)]
ethtool: strset: check nla_len overflow
The netlink attribute length field nla_len is a __u16, which can only
represent values up to 65535 bytes. NICs with a large number of
statistics strings (e.g. mlx5_core with thousands of ETH_SS_STATS
entries) can produce a ETHTOOL_A_STRINGSET_STRINGS nest that exceeds
this limit.
When nla_nest_end() writes the actual nest size back to nla_len, the
value is silently truncated. This results in a corrupted netlink message
being sent to userspace: the parser reads a wrong (truncated) attribute
length and misaligns all subsequent attribute boundaries, causing decode
errors.
Fix this by using the new helper nla_nest_end_safe and error out if
the size exceeds U16_MAX.
Hangbin Liu [Wed, 8 Apr 2026 07:08:52 +0000 (15:08 +0800)]
netlink: add a nla_nest_end_safe() helper
The nla_len field in struct nlattr is a __u16, which can only hold
values up to 65535. If a nested attribute grows beyond this limit,
nla_nest_end() silently truncates the length, producing a corrupted
netlink message with no indication of the problem.
Since nla_nest_end() is used everywhere and this issue rarely happens,
let's add a new helper to check the length.
Hangbin Liu [Wed, 8 Apr 2026 07:08:51 +0000 (15:08 +0800)]
tools: ynl: ethtool: add --dbg-small-recv option
Add a --dbg-small-recv debug option to control the recv() buffer size
used by YNL, matching the same option already present in cli.py. This
is useful if user need to get large netlink message.
Hangbin Liu [Wed, 8 Apr 2026 07:08:50 +0000 (15:08 +0800)]
tools: ynl: ethtool: use doit instead of dumpit for per-device GET
Rename the local helper doit() to do_set() and dumpit() to do_get() to
better reflect their purpose.
Convert do_get() to use ynl.do() with an explicit device header instead
of ynl.dump() followed by client-side filtering. This is more efficient
as the kernel only processes and returns data for the requested device,
rather than dumping all devices across the netns.
====================
bng_en: add link management and statistics support
This series enhances the bng_en driver by adding:
1. Link/PHY support
a. Link query
b. Async Link events
c. Ethtool link set/get functionality
2. Hardware statistics reporting via ethtool -S
This version incorporates feedback received prior to splitting the
original series into two parts.
====================
Implement the legacy ethtool statistics interface (get_sset_count,
get_strings, get_ethtool_stats) to expose hardware counters not
available through standard kernel stats APIs.
Implement netdev_stat_ops to provide standardized per-queue
statistics via the Netlink API.
Below is the description of the hardware drop counters:
rx-hw-drop-overruns: Packets dropped by HW due to resource limitations
(e.g., no BDs available in the host ring).
rx-hw-drops: Total packets dropped by HW (sum of overruns and error
drops).
tx-hw-drop-errors: Packets dropped by HW because they were invalid or
malformed.
tx-hw-drops: Total packets dropped by HW (sum of resource limitations
and error drops).
The implementation was verified using the ynl tool:
Implement the ndo_get_stats64 callback to report aggregate network
statistics. The driver gathers these by accumulating the per-ring
counters into the provided rtnl_link_stats64 structure.