Vladimir Oltean [Tue, 20 Jan 2026 21:10:39 +0000 (23:10 +0200)]
net: dsa: fix off-by-one in maximum bridge ID determination
Prior to the blamed commit, the bridge_num range was from
0 to ds->max_num_bridges - 1. After the commit, it is from
1 to ds->max_num_bridges.
So this check:
if (bridge_num >= max)
return 0;
must be updated to:
if (bridge_num > max)
return 0;
in order to allow the last bridge_num value (==max) to be used.
This is easiest visible when a driver sets ds->max_num_bridges=1.
The observed behaviour is that even the first created bridge triggers
the netlink extack "Range of offloadable bridges exceeded" warning, and
is handled in software rather than being offloaded.
Eric Dumazet [Tue, 20 Jan 2026 16:17:44 +0000 (16:17 +0000)]
bonding: provide a net pointer to __skb_flow_dissect()
After 3cbf4ffba5ee ("net: plumb network namespace into __skb_flow_dissect")
we have to provide a net pointer to __skb_flow_dissect(),
either via skb->dev, skb->sk, or a user provided pointer.
In the following case, syzbot was able to cook a bare skb.
Taehee Yoo [Tue, 20 Jan 2026 13:39:30 +0000 (13:39 +0000)]
selftests: net: amt: wait longer for connection before sending packets
Both send_mcast4() and send_mcast6() use sleep 2 to wait for the tunnel
connection between the gateway and the relay, and for the listener
socket to be created in the LISTENER namespace.
However, tests sometimes fail because packets are sent before the
connection is fully established.
Increase the waiting time to make the tests more reliable, and use
wait_local_port_listen() to explicitly wait for the listener socket.
Andrey Vatoropin [Tue, 20 Jan 2026 11:37:47 +0000 (11:37 +0000)]
be2net: Fix NULL pointer dereference in be_cmd_get_mac_from_list
When the parameter pmac_id_valid argument of be_cmd_get_mac_from_list() is
set to false, the driver may request the PMAC_ID from the firmware of the
network card, and this function will store that PMAC_ID at the provided
address pmac_id. This is the contract of this function.
However, there is a location within the driver where both
pmac_id_valid == false and pmac_id == NULL are being passed. This could
result in dereferencing a NULL pointer.
To resolve this issue, it is necessary to pass the address of a stub
variable to the function.
This change lead to MHI WWAN device can't connect to internet.
I found a netwrok issue with kernel 6.19-rc4, but network works
well with kernel 6.18-rc1. After checking, this commit is the
root cause.
Before appliing this serial changes on MHI WWAN network, we shall
revert this change in case of v6.19 being impacted.
Jeongjun Park [Mon, 19 Jan 2026 06:33:59 +0000 (15:33 +0900)]
netrom: fix double-free in nr_route_frame()
In nr_route_frame(), old_skb is immediately freed without checking if
nr_neigh->ax25 pointer is NULL. Therefore, if nr_neigh->ax25 is NULL,
the caller function will free old_skb again, causing a double-free bug.
Therefore, to prevent this, we need to modify it to check whether
nr_neigh->ax25 is NULL before freeing old_skb.
Cc: <stable@vger.kernel.org> Reported-by: syzbot+999115c3bf275797dc27@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/69694d6f.050a0220.58bed.0029.GAE@google.com/ Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Link: https://patch.msgid.link/20260119063359.10604-1-aha310510@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Laurent Vivier [Mon, 19 Jan 2026 07:55:18 +0000 (08:55 +0100)]
usbnet: limit max_mtu based on device's hard_mtu
The usbnet driver initializes net->max_mtu to ETH_MAX_MTU before calling
the device's bind() callback. When the bind() callback sets
dev->hard_mtu based the device's actual capability (from CDC Ethernet's
wMaxSegmentSize descriptor), max_mtu is never updated to reflect this
hardware limitation).
This allows userspace (DHCP or IPv6 RA) to configure MTU larger than the
device can handle, leading to silent packet drops when the backend sends
packet exceeding the device's buffer size.
Fix this by limiting net->max_mtu to the device's hard_mtu after the
bind callback returns.
See https://gitlab.com/qemu-project/qemu/-/issues/3268 and
https://bugs.passt.top/attachment.cgi?bugid=189
Jiawen Wu [Mon, 19 Jan 2026 06:59:35 +0000 (14:59 +0800)]
net: txgbe: remove the redundant data return in SW-FW mailbox
For these two firmware mailbox commands, in txgbe_test_hostif() and
txgbe_set_phy_link_hostif(), there is no need to read data from the
buffer.
Under the current setting, OEM firmware will cause the driver to fail to
probe. Because OEM firmware returns more link information, with a larger
OEM structure txgbe_hic_ephy_getlink. However, the current driver does
not support the OEM function. So just fix it in the way that does not
involve reading the returned data.
====================
fix some bugs in the flow director of HNS3 driver
This patchset fixes two bugs in the flow director:
1. Incorrect definition of HCLGE_FD_AD_COUNTER_NUM_M
2. Incorrect assignment of HCLGE_FD_AD_NXT_KEY
====================
Tao Wang reports that sometimes, after resume, stmmac can watchdog:
NETDEV WATCHDOG: CPU: x: transmit queue x timed out xx ms
When this occurs, the DMA transmit descriptors contain:
eth0: 221 [0x0000000876d10dd0]: 0x73660cbe 0x8 0x42 0xb04416a0
eth0: 222 [0x0000000876d10de0]: 0x77731d40 0x8 0x16a0 0x90000000
where descriptor 221 is the TSO header and 222 is the TSO payload.
tdes3 for descriptor 221 (0xb04416a0) has both bit 29 (first
descriptor) and bit 28 (last descriptor) set, which is incorrect.
The following packet also has bit 28 set, but isn't marked as a
first descriptor, and this causes the transmit DMA to stall.
This occurs because stmmac_tso_allocator() populates the first
descriptor, but does not set .last_segment correctly. There are two
places where this matters: one is later in stmmac_tso_xmit() where
we use it to update the TSO header descriptor. The other is in the
ring/chain mode clean_desc3() which is a performance optimisation.
Rather than using tx_q->tx_skbuff_dma[].last_segment to determine
whether the first descriptor entry is the only segment, calculate the
number of descriptor entries used. If there is only one descriptor,
then the first is also the last, so mark it as such.
Further work will be necessary to either eliminate .last_segment
entirely or set it correctly. Code analysis also indicates that a
similar issue exists with .is_jumbo. These will be the subject of
a future patch.
Reported-by: Tao Wang <tao03.wang@horizon.auto> Fixes: c2837423cb54 ("net: stmmac: Rework TX Coalesce logic") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vhq8O-00000005N5s-0Ke5@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Yang [Mon, 19 Jan 2026 15:34:36 +0000 (23:34 +0800)]
be2net: fix data race in be_get_new_eqd
In be_get_new_eqd(), statistics of pkts, protected by u64_stats_sync, are
read and accumulated in ignorance of possible u64_stats_fetch_retry()
events. Before the commit in question, these statistics were retrieved
one by one directly from queues. Fix this by reading them into temporary
variables first.
Fixes: 209477704187 ("be2net: set interrupt moderation for Skyhawk-R using EQ-DB") Signed-off-by: David Yang <mmyangfl@gmail.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20260119153440.1440578-1-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Yang [Mon, 19 Jan 2026 16:27:16 +0000 (00:27 +0800)]
idpf: Fix data race in idpf_net_dim
In idpf_net_dim(), some statistics protected by u64_stats_sync, are read
and accumulated in ignorance of possible u64_stats_fetch_retry() events.
The correct way to copy statistics is already illustrated by
idpf_add_queue_stats(). Fix this by reading them into temporary variables
first.
Fixes: c2d548cad150 ("idpf: add TX splitq napi poll support") Fixes: 3a8845af66ed ("idpf: add RX splitq napi poll support") Signed-off-by: David Yang <mmyangfl@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260119162720.1463859-1-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Yang [Mon, 19 Jan 2026 16:07:37 +0000 (00:07 +0800)]
net: hns3: fix data race in hns3_fetch_stats
In hns3_fetch_stats(), ring statistics, protected by u64_stats_sync, are
read and accumulated in ignorance of possible u64_stats_fetch_retry()
events. These statistics are already accumulated by
hns3_ring_stats_update(). Fix this by reading them into a temporary
buffer first.
nfc: MAINTAINERS: Orphan the NFC and look for new maintainers
NFC stack in Linux is in poor shape, with several bugs being discovered
last years via fuzzing, not much new development happening and limited
review and testing. It requires some more effort than drive-by reviews
I have been offering last one or two years.
I don't have much time nor business interests to keep looking at NFC,
so let's drop me from the maintainers to clearly indicate that more
hands are needed.
Yun Lu [Fri, 16 Jan 2026 09:53:08 +0000 (17:53 +0800)]
netdevsim: fix a race issue related to the operation on bpf_bound_progs list
The netdevsim driver lacks a protection mechanism for operations on the
bpf_bound_progs list. When the nsim_bpf_create_prog() performs
list_add_tail, it is possible that nsim_bpf_destroy_prog() is
simultaneously performs list_del. Concurrent operations on the list may
lead to list corruption and trigger a kernel crash as follows:
Michal Luczaj [Fri, 16 Jan 2026 08:52:36 +0000 (09:52 +0100)]
vsock/test: Do not filter kallsyms by symbol type
Blamed commit implemented logic to discover available vsock transports by
grepping /proc/kallsyms for known symbols. It incorrectly filtered entries
by type 'd'.
GangMin Kim <km.kim1503@gmail.com> managed to create a UAF on qfq by inserting
teql as a child qdisc and exploiting a qlen sync issue.
teql is not intended to be used as a child qdisc. Lets enforce that rule in
patch #1. Although patch #1 fixes the issue, we prevent another potential qlen
exploit in qfq in patch #2 by enforcing the child's active status is not
determined by inspecting the qlen. In patch #3 we add a tdc test case.
====================
Jamal Hadi Salim [Wed, 14 Jan 2026 16:02:42 +0000 (11:02 -0500)]
net/sched: qfq: Use cl_is_active to determine whether class is active in qfq_rm_from_ag
This is more of a preventive patch to make the code more consistent and
to prevent possible exploits that employ child qlen manipulations on qfq.
use cl_is_active instead of relying on the child qdisc's qlen to determine
class activation.
Jamal Hadi Salim [Wed, 14 Jan 2026 16:02:41 +0000 (11:02 -0500)]
net/sched: Enforce that teql can only be used as root qdisc
Design intent of teql is that it is only supposed to be used as root qdisc.
We need to check for that constraint.
Although not important, I will describe the scenario that unearthed this
issue for the curious.
GangMin Kim <km.kim1503@gmail.com> managed to concot a scenario as follows:
ROOT qdisc 1:0 (QFQ)
├── class 1:1 (weight=15, lmax=16384) netem with delay 6.4s
└── class 1:2 (weight=1, lmax=1514) teql
GangMin sends a packet which is enqueued to 1:1 (netem).
Any invocation of dequeue by QFQ from this class will not return a packet
until after 6.4s. In the meantime, a second packet is sent and it lands on
1:2. teql's enqueue will return success and this will activate class 1:2.
Main issue is that teql only updates the parent visible qlen (sch->q.qlen)
at dequeue. Since QFQ will only call dequeue if peek succeeds (and teql's
peek always returns NULL), dequeue will never be called and thus the qlen
will remain as 0. With that in mind, when GangMin updates 1:2's lmax value,
the qfq_change_class calls qfq_deact_rm_from_agg. Since the child qdisc's
qlen was not incremented, qfq fails to deactivate the class, but still
frees its pointers from the aggregate. So when the first packet is
rescheduled after 6.4 seconds (netem's delay), a dangling pointer is
accessed causing GangMin's causing a UAF.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: GangMin Kim <km.kim1503@gmail.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20260114160243.913069-2-jhs@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 19 Jan 2026 18:19:32 +0000 (10:19 -0800)]
Merge tag 'linux-can-fixes-for-6.19-20260116' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2026-01-16
The first patch is by me and sets the missing CAN device default
capabilities in the CAN device layer.
The next patch is by me, target the gs_usb driver and adds the missing
unanchor URB on usb_submit_urb() error.
The last 5 patches are also from me and fix the same USB-URB leak (as
in the gs_usb driver) in the affected CAN-USB driver: ems_usb,
esd_usb, kvaser_usb, mcba_usb and usb_8dev.
The RX flowid programming initializes the TCAM mask to all ones, but
then overwrites it when clearing the MAC DA mask bits. This results
in losing the intended initialization and may affect other match fields.
Update the code to clear the MAC DA bits using an AND operation, making
the handling of mask[0] consistent with mask[1], where the field-specific
bits are cleared after initializing the mask to ~0ULL.
Fixes: 57d00d4364f3 ("octeontx2-pf: mcs: Match macsec ethertype along with DMAC") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://patch.msgid.link/20260116164724.2733511-1-alok.a.tiwari@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Howells [Wed, 14 Jan 2026 22:03:23 +0000 (22:03 +0000)]
rxrpc: Fix recvmsg() unconditional requeue
If rxrpc_recvmsg() fails because MSG_DONTWAIT was specified but the call at
the front of the recvmsg queue already has its mutex locked, it requeues
the call - whether or not the call is already queued. The call may be on
the queue because MSG_PEEK was also passed and so the call was not dequeued
or because the I/O thread requeued it.
The unconditional requeue may then corrupt the recvmsg queue, leading to
things like UAFs or refcount underruns.
Fix this by only requeuing the call if it isn't already on the queue - and
moving it to the front if it is already queued. If we don't queue it, we
have to put the ref we obtained by dequeuing it.
Also, MSG_PEEK doesn't dequeue the call so shouldn't call
rxrpc_notify_socket() for the call if we didn't use up all the data on the
queue, so fix that also.
Fixes: 540b1c48c37a ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg") Reported-by: Faith <faith@zellic.io> Reported-by: Pumpkin Chang <pumpkin@devco.re> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Marc Dionne <marc.dionne@auristor.com>
cc: Nir Ohfeld <niro@wiz.io>
cc: Willy Tarreau <w@1wt.eu>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: stable@kernel.org Link: https://patch.msgid.link/95163.1768428203@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 19 Jan 2026 18:03:33 +0000 (10:03 -0800)]
Merge branch 'ipvlan-addrs_lock-made-per-port'
Dmitry Skorodumov says:
====================
ipvlan: addrs_lock made per port
First patch fixes a rather minor issues that sometimes
ipvlan-addrs are modified without lock (because
for IPv6 addr can be sometimes added without RTNL)
====================
Make the addrs_lock be per port, not per ipvlan dev.
Initial code seems to be written in the assumption,
that any address change must occur under RTNL.
But it is not so for the case of IPv6. So
1) Introduce per-port addrs_lock.
2) It was needed to fix places where it was forgotten
to take lock (ipvlan_open/ipvlan_close)
This appears to be a very minor problem though.
Since it's highly unlikely that ipvlan_add_addr() will
be called on 2 CPU simultaneously. But nevertheless,
this could cause:
1) False-negative of ipvlan_addr_busy(): one interface
iterated through all port->ipvlans + ipvlan->addrs
under some ipvlan spinlock, and another added IP
under its own lock. Though this is only possible
for IPv6, since looks like only ipvlan_addr6_event() can be
called without rtnl_lock.
2) Race since ipvlan_ht_addr_add(port) is called under
different ipvlan->addrs_lock locks
This should not affect performance, since add/remove IP
is a rare situation and spinlock is not taken on fast
paths.
Fixes: 8230819494b3 ("ipvlan: use per device spinlock to protect addrs list updates") Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com> Reviewed-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/20260112142417.4039566-2-skorodumov.dmitry@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
NFC packets may have NUL-bytes. Checking for string length is not a correct
assumption here. As long as there is a check for the length copied from
copy_from_user, all should be fine.
The fix only prevented the syzbot reproducer from triggering the bug
because the packet is not enqueued anymore and the code that triggers the
bug is not exercised.
The fix even broke
testing/selftests/nci/nci_dev, making all tests there fail. After the
revert, 6 out of 8 tests pass.
Fixes: 068648aab72c ("nfc/nci: Add the inconsistency check between the input data length and count") Cc: stable@vger.kernel.org Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Link: https://patch.msgid.link/20260113202458.449455-1-cascardo@igalia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hamza Mahfooz [Tue, 13 Jan 2026 23:29:57 +0000 (18:29 -0500)]
net: sfp: add potron quirk to the H-COM SPP425H-GAB4 SFP+ Stick
This is another one of those XGSPON ONU sticks that's using the
X-ONU-SFPP internally, thus it also requires the potron quirk to avoid tx
faults. So, add an entry for it in sfp_quirks[].
David Yang [Wed, 14 Jan 2026 12:24:45 +0000 (20:24 +0800)]
veth: fix data race in veth_get_ethtool_stats
In veth_get_ethtool_stats(), some statistics protected by
u64_stats_sync, are read and accumulated in ignorance of possible
u64_stats_fetch_retry() events. These statistics, peer_tq_xdp_xmit and
peer_tq_xdp_xmit_err, are already accumulated by veth_xdp_xmit(). Fix
this by reading them into a temporary buffer first.
The repro generated a GUE packet with its inner protocol 0.
gue_udp_recv() returns -guehdr->proto_ctype for "resubmit"
in ip_protocol_deliver_rcu(), but this only works with
non-zero protocol number.
Let's drop such packets.
Note that 0 is a valid number (IPv6 Hop-by-Hop Option).
I think it is not practical to encap HOPOPT in GUE, so once
someone starts to complain, we could pass down a resubmit
flag pointer to distinguish two zeros from the upper layer:
Simon Horman [Thu, 15 Jan 2026 13:54:00 +0000 (13:54 +0000)]
docs: netdev: refine 15-patch limit
The 15 patch limit is intended by the maintainers to cover
all outstanding patches on the mailing list on a per-tree basis.
Not just those in a single patchset. Document this practice accordingly.
Raju Rangoju [Wed, 14 Jan 2026 16:30:37 +0000 (22:00 +0530)]
amd-xgbe: avoid misleading per-packet error log
On the receive path, packet can be damaged because of buffer
overflow in Rx FIFO. Avoid misleading per-packet error log when
packet->errors is set, this can flood the log. Instead, rely on the
standard rtnl_link_stats64 stats.
0 is a valid DMA address [1] so using it as the error value can lead to
errors. The error value of dma_map_XXX() functions is DMA_MAPPING_ERROR
which is ~0. The callers of otx2_dma_map_page() use dma_mapping_error()
to test the return value of otx2_dma_map_page(). This means that they
would not detect an error in otx2_dma_map_page().
Make otx2_dma_map_page() return the raw value of dma_map_page_attrs().
The issue is triggered when sctp_auth_asoc_init_active_key() fails in
sctp_sf_do_5_1C_ack() while processing an INIT_ACK. In this case, the
command sequence is currently:
If SCTP_CMD_ASSOC_SHKEY fails, asoc->shkey remains NULL, while
asoc->peer.auth_capable and asoc->peer.peer_chunks have already been set by
SCTP_CMD_PEER_INIT. This allows a DATA chunk with auth = 1 and shkey = NULL
to be queued by sctp_datamsg_from_user().
Since command interpretation stops on failure, no COOKIE_ECHO should been
sent via SCTP_CMD_GEN_COOKIE_ECHO. However, the T1_COOKIE timer has already
been started, and it may enqueue a COOKIE_ECHO into the outqueue later. As
a result, the DATA chunk can be transmitted together with the COOKIE_ECHO
in sctp_outq_flush_data(), leading to the observed issue.
Similar to the other places where it calls sctp_auth_asoc_init_active_key()
right after sctp_process_init(), this patch moves the SCTP_CMD_ASSOC_SHKEY
immediately after SCTP_CMD_PEER_INIT, before stopping T1_INIT and starting
T1_COOKIE. This ensures that if shared key generation fails, authenticated
DATA cannot be sent. It also allows the T1_INIT timer to retransmit INIT,
giving the client another chance to process INIT_ACK and retry key setup.
Merge patch series "can: usb: fix URB memory leaks"
Marc Kleine-Budde <mkl@pengutronix.de> says:
An URB memory leak [1][2] was recently fixed in the gs_usb driver. The
driver did not take into account that completed URBs are no longer
anchored, causing them to be lost during ifdown. The memory leak was fixed
by re-anchoring the URBs in the URB completion callback.
Several USB CAN drivers are affected by the same error. Fix them
accordingly.
Fix similar memory leak as in commit 7352e1d5932a ("can: gs_usb:
gs_usb_receive_bulk_callback(): fix URB memory leak").
In usb_8dev_open() -> usb_8dev_start(), the URBs for USB-in transfers are
allocated, added to the priv->rx_submitted anchor and submitted. In the
complete callback usb_8dev_read_bulk_callback(), the URBs are processed and
resubmitted. In usb_8dev_close() -> unlink_all_urbs() the URBs are freed by
calling usb_kill_anchored_urbs(&priv->rx_submitted).
However, this does not take into account that the USB framework unanchors
the URB before the complete function is called. This means that once an
in-URB has been completed, it is no longer anchored and is ultimately not
released in usb_kill_anchored_urbs().
Fix the memory leak by anchoring the URB in the
usb_8dev_read_bulk_callback() to the priv->rx_submitted anchor.
Fix similar memory leak as in commit 7352e1d5932a ("can: gs_usb:
gs_usb_receive_bulk_callback(): fix URB memory leak").
In mcba_usb_probe() -> mcba_usb_start(), the URBs for USB-in transfers are
allocated, added to the priv->rx_submitted anchor and submitted. In the
complete callback mcba_usb_read_bulk_callback(), the URBs are processed and
resubmitted. In mcba_usb_close() -> mcba_urb_unlink() the URBs are freed by
calling usb_kill_anchored_urbs(&priv->rx_submitted).
However, this does not take into account that the USB framework unanchors
the URB before the complete function is called. This means that once an
in-URB has been completed, it is no longer anchored and is ultimately not
released in usb_kill_anchored_urbs().
Fix the memory leak by anchoring the URB in the
mcba_usb_read_bulk_callback()to the priv->rx_submitted anchor.
Fix similar memory leak as in commit 7352e1d5932a ("can: gs_usb:
gs_usb_receive_bulk_callback(): fix URB memory leak").
In kvaser_usb_set_{,data_}bittiming() -> kvaser_usb_setup_rx_urbs(), the
URBs for USB-in transfers are allocated, added to the dev->rx_submitted
anchor and submitted. In the complete callback
kvaser_usb_read_bulk_callback(), the URBs are processed and resubmitted. In
kvaser_usb_remove_interfaces() the URBs are freed by calling
usb_kill_anchored_urbs(&dev->rx_submitted).
However, this does not take into account that the USB framework unanchors
the URB before the complete function is called. This means that once an
in-URB has been completed, it is no longer anchored and is ultimately not
released in usb_kill_anchored_urbs().
Fix the memory leak by anchoring the URB in the
kvaser_usb_read_bulk_callback() to the dev->rx_submitted anchor.
Fix similar memory leak as in commit 7352e1d5932a ("can: gs_usb:
gs_usb_receive_bulk_callback(): fix URB memory leak").
In esd_usb_open(), the URBs for USB-in transfers are allocated, added to
the dev->rx_submitted anchor and submitted. In the complete callback
esd_usb_read_bulk_callback(), the URBs are processed and resubmitted. In
esd_usb_close() the URBs are freed by calling
usb_kill_anchored_urbs(&dev->rx_submitted).
However, this does not take into account that the USB framework unanchors
the URB before the complete function is called. This means that once an
in-URB has been completed, it is no longer anchored and is ultimately not
released in esd_usb_close().
Fix the memory leak by anchoring the URB in the
esd_usb_read_bulk_callback() to the dev->rx_submitted anchor.
Fix similar memory leak as in commit 7352e1d5932a ("can: gs_usb:
gs_usb_receive_bulk_callback(): fix URB memory leak").
In ems_usb_open(), the URBs for USB-in transfers are allocated, added to
the dev->rx_submitted anchor and submitted. In the complete callback
ems_usb_read_bulk_callback(), the URBs are processed and resubmitted. In
ems_usb_close() the URBs are freed by calling
usb_kill_anchored_urbs(&dev->rx_submitted).
However, this does not take into account that the USB framework unanchors
the URB before the complete function is called. This means that once an
in-URB has been completed, it is no longer anchored and is ultimately not
released in ems_usb_close().
Fix the memory leak by anchoring the URB in the
ems_usb_read_bulk_callback() to the dev->rx_submitted anchor.
can: gs_usb: gs_usb_receive_bulk_callback(): unanchor URL on usb_submit_urb() error
In commit 7352e1d5932a ("can: gs_usb: gs_usb_receive_bulk_callback(): fix
URB memory leak"), the URB was re-anchored before usb_submit_urb() in
gs_usb_receive_bulk_callback() to prevent a leak of this URB during
cleanup.
However, this patch did not take into account that usb_submit_urb() could
fail. The URB remains anchored and
usb_kill_anchored_urbs(&parent->rx_submitted) in gs_can_close() loops
infinitely since the anchor list never becomes empty.
To fix the bug, unanchor the URB when an usb_submit_urb() error occurs,
also print an info message.
can: dev: alloc_candev_mqs(): add missing default CAN capabilities
The idea behind series 6c1f5146b214 ("Merge patch series "can: raw: better
approach to instantly reject unsupported CAN frames"") is to set the
capabilities of a CAN device (CAN-CC, CAN-FD, CAN-XL, and listen only) [1]
and, based on these capabilities, reject unsupported CAN frames in the
CAN-RAW protocol [2].
This works perfectly for CAN devices configured in CAN-FD or CAN-XL mode.
CAN devices with static CAN control modes define their capabilities via
can_set_static_ctrlmode() -> can_set_cap_info(). CAN devices configured by
the user space for CAN-FD or CAN-XL set their capabilities via
can_changelink() -> can_ctrlmode_changelink() -> can_set_cap_info().
However, in commit 166e87329ce6 ("can: propagate CAN device capabilities
via ml_priv"), the capabilities of CAN devices are not initialized.
This results in CAN-RAW rejecting all CAN frames on devices directly
after ifup if the user space has not changed the CAN control mode.
Fix this problem by setting the default capabilities to CAN-CC in
alloc_candev_mqs() as soon as the CAN specific ml_priv is allocated.
[1] commit 166e87329ce6 ("can: propagate CAN device capabilities via ml_priv")
[2] commit faba5860fcf9 ("can: raw: instantly reject disabled CAN frames")
net: freescale: ucc_geth: Return early when TBI PHY can't be found
In ucc_geth's .mac_config(), we configure the TBI Serdes block represented by a
struct phy_device that we get from firmware.
While porting to phylink, a check was missed to make sure we don't try
to access the TBI PHY if we can't get it. Let's add it and return early
in case of error
Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202601130843.rFGNXA5a-lkp@intel.com/ Fixes: 53036aa8d031 ("net: freescale: ucc_geth: phylink conversion") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20260114080247.366252-1-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 16 Jan 2026 03:59:40 +0000 (19:59 -0800)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2026-01-13 (ice, igc)
For ice:
Jake adds missing initialization calls to u64_stats_init().
Dave stops deletion of VLAN 0 from prune list when device is primary
LAG interface.
Ding Hui adds a missed unit conversion function for proper timeout
value.
For igc:
Kurt Kanzenbach adds a call to re-set default Qbv schedule when number
of channels changes.
Chwee-Lin Choong reworks Tx timestamp detection logic to resolve a race
condition and reverts changes to TSN packet buffer size causing Tx
hangs under heavy load.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
igc: Reduce TSN TX packet buffer from 7KB to 5KB per queue
igc: fix race condition in TX timestamp read for register 0
igc: Restore default Qbv schedule when changing channels
ice: Fix incorrect timeout ice_release_res()
ice: Avoid detrimental cleanup for bond during interface stop
ice: initialize ring_stats->syncp
====================
selftests: net: fib-onlink-tests: Convert to use namespaces by default
Currently, the test breaks if the SUT already has a default route
configured for IPv6. Fix by avoiding the use of the default namespace.
Fixes: 4ed591c8ab44 ("net/ipv6: Allow onlink routes to have a device mismatch if it is the default route") Suggested-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Ricardo B. Marlière <rbm@suse.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de> Link: https://patch.msgid.link/20260113-selftests-net-fib-onlink-v2-1-89de2b931389@suse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Tue, 13 Jan 2026 19:12:01 +0000 (19:12 +0000)]
bonding: limit BOND_MODE_8023AD to Ethernet devices
BOND_MODE_8023AD makes sense for ARPHRD_ETHER only.
syzbot reported:
BUG: KASAN: global-out-of-bounds in __hw_addr_create net/core/dev_addr_lists.c:63 [inline]
BUG: KASAN: global-out-of-bounds in __hw_addr_add_ex+0x25d/0x760 net/core/dev_addr_lists.c:118
Read of size 16 at addr ffffffff8bf94040 by task syz.1.3580/19497
The SR9700 chip sends more than one packet in a USB transaction,
like the DM962x chips can optionally do, but the dm9601 driver does not
support this mode, and the hardware does not have the DM962x
MODE_CTL register to disable it, so this driver drops packets on SR9700
devices. The sr9700 driver correctly handles receiving more than one
packet per transaction.
While the dm9601 driver could be improved to handle this, the easiest
way to fix this issue in the short term is to remove the SR9700 device
ID from the dm9601 driver so the sr9700 driver is always used. This
device ID should not have been in more than one driver to begin with.
The "Fixes" commit was chosen so that the patch is automatically
included in all kernels that have the sr9700 driver, even though the
issue affects dm9601.
Fixes: c9b37458e956 ("USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver Support") Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Acked-by: Peter Korsgaard <peter@korsgaard.com> Link: https://patch.msgid.link/20260113063924.74464-1-enelsonmoore@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
vsock/virtio: Fix data loss/disclosure due to joining of non-linear skb
Loopback transport coalesces some skbs too eagerly. Handling a zerocopy
(non-linear) skb as a linear one leads to skb data loss and kernel memory
disclosure.
Plug the loss/leak by allowing only linear skb join. Provide a test.
====================
Michal Luczaj [Tue, 13 Jan 2026 15:08:19 +0000 (16:08 +0100)]
vsock/test: Add test for a linear and non-linear skb getting coalesced
Loopback transport can mangle data in rx queue when a linear skb is
followed by a small MSG_ZEROCOPY packet.
To exercise the logic, send out two packets: a weirdly sized one (to ensure
some spare tail room in the skb) and a zerocopy one that's small enough to
fit in the spare room of its predecessor. Then, wait for both to land in
the rx queue, and check the data received. Faulty packets merger manifests
itself by corrupting payload of the later packet.
Michal Luczaj [Tue, 13 Jan 2026 15:08:18 +0000 (16:08 +0100)]
vsock/virtio: Coalesce only linear skb
vsock/virtio common tries to coalesce buffers in rx queue: if a linear skb
(with a spare tail room) is followed by a small skb (length limited by
GOOD_COPY_LEN = 128), an attempt is made to join them.
Since the introduction of MSG_ZEROCOPY support, assumption that a small skb
will always be linear is incorrect. In the zerocopy case, data is lost and
the linear skb is appended with uninitialized kernel memory.
Of all 3 supported virtio-based transports, only loopback-transport is
affected. G2H virtio-transport rx queue operates on explicitly linear skbs;
see virtio_vsock_alloc_linear_skb() in virtio_vsock_rx_fill(). H2G
vhost-transport may allocate non-linear skbs, but only for sizes that are
not considered for coalescence; see PAGE_ALLOC_COSTLY_ORDER in
virtio_vsock_alloc_skb().
Ensure only linear skbs are coalesced. Note that skb_tailroom(last_skb) > 0
guarantees last_skb is linear.
Simon Schippers [Tue, 13 Jan 2026 07:51:38 +0000 (08:51 +0100)]
usbnet: fix crash due to missing BQL accounting after resume
In commit 7ff14c52049e ("usbnet: Add support for Byte Queue Limits
(BQL)"), it was missed that usbnet_resume() may enqueue SKBs using
__skb_queue_tail() without reporting them to BQL. As a result, the next
call to netdev_completed_queue() triggers a BUG_ON() in dql_completed(),
since the SKBs queued during resume were never accounted for.
This patch fixes the issue by adding a corresponding netdev_sent_queue()
call in usbnet_resume() when SKBs are queued after suspend. Because
dev->txq.lock is held at this point, no concurrent calls to
netdev_sent_queue() from usbnet_start_xmit() can occur.
The crash can be reproduced by generating network traffic
(e.g. iperf3 -c ... -t 0), suspending the system, and then waking it up
(e.g. rtcwake -m mem -s 5).
When testing USB2 Android tethering (cdc_ncm), the system crashed within
three suspend/resume cycles without this patch. With the patch applied,
no crashes were observed after 90 cycles. Testing with an AX88179 USB
Ethernet adapter also showed no crashes.
Fixes: 7ff14c52049e ("usbnet: Add support for Byte Queue Limits (BQL)") Reported-by: Bard Liao <yung-chuan.liao@linux.intel.com> Tested-by: Bard Liao <yung-chuan.liao@linux.intel.com> Tested-by: Simon Schippers <simon.schippers@tu-dortmund.de> Signed-off-by: Simon Schippers <simon.schippers@tu-dortmund.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260113075139.6735-1-simon.schippers@tu-dortmund.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- ip6_tunnel: use skb_vlan_inet_prepare() in __ip6_tnl_rcv()
- bluetooth: hci_sync: enable PA sync lost event
- eth: virtio-net:
- fix the deadlock when disabling rx NAPI
- fix misalignment bug in struct virtnet_info
Previous releases - always broken:
- ipv4: ip_gre: make ipgre_header() robust
- can: fix SSP_SRC in cases when bit-rate is higher than 1 MBit.
- eth:
- mlx5e: profile change fix
- octeon_ep_vf: fix free_irq dev_id mismatch in IRQ rollback
- macvlan: fix possible UAF in macvlan_forward_source()"
* tag 'net-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits)
virtio_net: Fix misalignment bug in struct virtnet_info
net: can: j1939: j1939_xtp_rx_rts_session_active(): deactivate session upon receiving the second rts
can: raw: instantly reject disabled CAN frames
can: propagate CAN device capabilities via ml_priv
Revert "can: raw: instantly reject unsupported CAN frames"
net/sched: sch_qfq: do not free existing class in qfq_change_class()
selftests: drv-net: fix RPS mask handling for high CPU numbers
selftests: drv-net: fix RPS mask handling in toeplitz test
ipv6: Fix use-after-free in inet6_addr_del().
dst: fix races in rt6_uncached_list_del() and rt_del_uncached_list()
net: hv_netvsc: reject RSS hash key programming without RX indirection table
tools: ynl: render event op docs correctly
net: add net.core.qdisc_max_burst
net: airoha: Fix typo in airoha_ppe_setup_tc_block_cb definition
net: phy: motorcomm: fix duplex setting error for phy leds
net: octeon_ep_vf: fix free_irq dev_id mismatch in IRQ rollback
net/mlx5e: Restore destroying state bit after profile cleanup
net/mlx5e: Pass netdev to mlx5e_destroy_netdev instead of priv
net/mlx5e: Don't store mlx5e_priv in mlx5e_dev devlink priv
net/mlx5e: Fix crash on profile change rollback failure
...
Paolo Abeni [Thu, 15 Jan 2026 12:13:01 +0000 (13:13 +0100)]
Merge tag 'linux-can-fixes-for-6.19-20260115' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2026-01-15
this is a pull request of 4 patches for net/main, it super-seeds the
"can 2026-01-14" pull request. The dev refcount leak in patch #3 is
fixed.
The first 3 patches are by Oliver Hartkopp and revert the approach to
instantly reject unsupported CAN frames introduced in
net-next-for-v6.19 and replace it by placing the needed data into the
CAN specific ml_priv.
The last patch is by Tetsuo Handa and fixes a J1939 refcount leak for
j1939_session in session deactivation upon receiving the second RTS.
* tag 'linux-can-fixes-for-6.19-20260115' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
net: can: j1939: j1939_xtp_rx_rts_session_active(): deactivate session upon receiving the second rts
can: raw: instantly reject disabled CAN frames
can: propagate CAN device capabilities via ml_priv
Revert "can: raw: instantly reject unsupported CAN frames"
====================
1) Fix inner mode lookup in tunnel mode GSO segmentation.
The protocol was taken from the wrong field.
2) Set ipv4 no_pmtu_disc flag only on output SAs. The
insertation of input SAs can fail if no_pmtu_disc
is set.
Please pull or let me know if there are problems.
ipsec-2026-01-14
* tag 'ipsec-2026-01-14' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
xfrm: set ipv4 no_pmtu_disc flag only on output sa when direction is set
xfrm: Fix inner mode lookup in tunnel mode GSO segmentation
====================
virtio_net: Fix misalignment bug in struct virtnet_info
Use the new TRAILING_OVERLAP() helper to fix a misalignment bug
along with the following warning:
drivers/net/virtio_net.c:429:46: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
This helper creates a union between a flexible-array member (FAM)
and a set of members that would otherwise follow it (in this case
`u8 rss_hash_key_data[VIRTIO_NET_RSS_MAX_KEY_SIZE];`). This
overlays the trailing members (rss_hash_key_data) onto the FAM
(hash_key_data) while keeping the FAM and the start of MEMBERS aligned.
The static_assert() ensures this alignment remains.
Notice that due to tail padding in flexible `struct
virtio_net_rss_config_trailer`, `rss_trailer.hash_key_data`
(at offset 83 in struct virtnet_info) and `rss_hash_key_data` (at
offset 84 in struct virtnet_info) are misaligned by one byte. See
below:
As a result, the RSS key passed to the device is shifted by 1
byte: the last byte is cut off, and instead a (possibly
uninitialized) byte is added at the beginning.
As a last note `struct virtio_net_rss_config_hdr *rss_hdr;` is also
moved to the end, since it seems those three members should stick
around together. :)
Cc: stable@vger.kernel.org Fixes: ed3100e90d0d ("virtio_net: Use new RSS config structs") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/aWIItWq5dV9XTTCJ@kspp Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tetsuo Handa [Tue, 13 Jan 2026 15:28:47 +0000 (00:28 +0900)]
net: can: j1939: j1939_xtp_rx_rts_session_active(): deactivate session upon receiving the second rts
Since j1939_session_deactivate_activate_next() in j1939_tp_rxtimer() is
called only when the timer is enabled, we need to call
j1939_session_deactivate_activate_next() if we cancelled the timer.
Otherwise, refcount for j1939_session leaks, which will later appear as
| unregister_netdevice: waiting for vcan0 to become free. Usage count = 2.
The reverted patch was accessing CAN device internal data structures
from the network layer because it needs to know about the CAN protocol
capabilities of the CAN devices.
This data access caused build problems between the CAN network and the
CAN driver layer which introduced unwanted Kconfig dependencies and fixes.
The patches 2 & 3 implement a better approach which makes use of the
CAN specific ml_priv data which is accessible from both sides.
With this change the CAN network layer can check the required features
and the decoupling of the driver layer and network layer is restored.
Oliver Hartkopp [Fri, 9 Jan 2026 14:41:35 +0000 (15:41 +0100)]
can: raw: instantly reject disabled CAN frames
For real CAN interfaces the CAN_CTRLMODE_FD and CAN_CTRLMODE_XL control
modes indicate whether an interface can handle those CAN FD/XL frames.
In the case a CAN XL interface is configured in CANXL-only mode with
disabled error-signalling neither CAN CC nor CAN FD frames can be sent.
The checks are now performed on CAN_RAW sockets to give an instant feedback
to the user when writing unsupported CAN frames to the interface or when
the CAN interface is in read-only mode.
Oliver Hartkopp [Fri, 9 Jan 2026 14:41:34 +0000 (15:41 +0100)]
can: propagate CAN device capabilities via ml_priv
Commit 1a620a723853 ("can: raw: instantly reject unsupported CAN frames")
caused a sequence of dependency and linker fixes.
Instead of accessing CAN device internal data structures which caused the
dependency problems this patch introduces capability information into the
CAN specific ml_priv data which is accessible from both sides.
With this change the CAN network layer can check the required features and
the decoupling of the driver layer and network layer is restored.
Fixes: 1a620a723853 ("can: raw: instantly reject unsupported CAN frames") Cc: Marc Kleine-Budde <mkl@pengutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Vincent Mailhol <mailhol@kernel.org> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://patch.msgid.link/20260109144135.8495-3-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The entire problem was caused by the requirement that a new network layer
feature needed to know about the protocol capabilities of the CAN devices.
Instead of accessing CAN device internal data structures which caused the
dependency problems a better approach has been developed which makes use of
CAN specific ml_priv data which is accessible from both sides.
Cc: Marc Kleine-Budde <mkl@pengutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Vincent Mailhol <mailhol@kernel.org> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://patch.msgid.link/20260109144135.8495-2-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Linus Torvalds [Wed, 14 Jan 2026 19:24:38 +0000 (11:24 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Only one core change (and one in doc only) the rest are drivers.
The one core fix is for some inline encrypting drives that can't
handle encryption requests on non-data commands (like error handling
ones); it saves the request level encryption parameters in the eh_save
structure so they can be cleared for error handling and restored after
it is completed"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: host: mediatek: Make read-only array scale_us static const
scsi: bfa: Update outdated comment
scsi: mpt3sas: Update maintainer list
scsi: ufs: core: Configure MCQ after link startup
scsi: core: Fix error handler encryption support
scsi: core: Correct documentation for scsi_test_unit_ready()
scsi: ufs: dt-bindings: Fix several grammar errors
Linus Torvalds [Wed, 14 Jan 2026 16:18:01 +0000 (08:18 -0800)]
Merge tag 'media/v6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- ov02c10: some fixes related to preserving bayer pattern and
horizontal control
- ipu-bridge: Add quirks for some Dell XPS laptops with inverted
sensors
- mali-c55: Fix version identifier logic
- rzg2l-cru: csi-2: fix RZ/V2H input sizes on some variants
* tag 'media/v6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: ov02c10: Remove unnecessary hflip and vflip pointers
media: ipu-bridge: Add DMI quirk for Dell XPS laptops with upside down sensors
media: ov02c10: Fix the horizontal flip control
media: ov02c10: Adjust x-win/y-win when changing flipping to preserve bayer-pattern
media: ov02c10: Fix bayer-pattern change after default vflip change
media: rzg2l-cru: csi-2: Support RZ/V2H input sizes
media: uapi: mali-c55-config: Remove version identifier
media: mali-c55: Remove duplicated version check
media: Documentation: mali-c55: Use v4l2-isp version identifier
Linus Torvalds [Wed, 14 Jan 2026 05:21:13 +0000 (21:21 -0800)]
Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:
- Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK in riscv JIT (Menglong
Dong)
- Fix reference count leak in bpf_prog_test_run_xdp() (Tetsuo Handa)
- Fix metadata size check in bpf_test_run() (Toke Høiland-Jørgensen)
- Check that BPF insn array is not allowed as a map for const strings
(Deepanshu Kartikey)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Fix reference count leak in bpf_prog_test_run_xdp()
bpf: Reject BPF_MAP_TYPE_INSN_ARRAY in check_reg_const_str()
selftests/bpf: Update xdp_context_test_run test to check maximum metadata size
bpf, test_run: Subtract size of xdp_frame from allowed metadata size
riscv, bpf: Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK
Alice Ryhl [Mon, 5 Jan 2026 10:44:06 +0000 (10:44 +0000)]
rust: bitops: fix missing _find_* functions on 32-bit ARM
On 32-bit ARM, you may encounter linker errors such as this one:
ld.lld: error: undefined symbol: _find_next_zero_bit
>>> referenced by rust_binder_main.43196037ba7bcee1-cgu.0
>>> drivers/android/binder/rust_binder_main.o:(<rust_binder_main::process::Process>::insert_or_update_handle) in archive vmlinux.a
>>> referenced by rust_binder_main.43196037ba7bcee1-cgu.0
>>> drivers/android/binder/rust_binder_main.o:(<rust_binder_main::process::Process>::insert_or_update_handle) in archive vmlinux.a
This error occurs because even though the functions are declared by
include/linux/find.h, the definition is #ifdef'd out on 32-bit ARM. This
is because arch/arm/include/asm/bitops.h contains:
And the underscore-prefixed function is conditional on #ifndef of the
non-underscore-prefixed name, but the declaration in find.h is *not*
conditional on that #ifndef.
To fix the linker error, we ensure that the symbols in question exist
when compiling Rust code. We do this by defining them in rust/helpers/
whenever the normal definition is #ifndef'd out.
Note that these helpers are somewhat unusual in that they do not have
the rust_helper_ prefix that most helpers have. Adding the rust_helper_
prefix does not compile, as 'bindings::_find_next_zero_bit()' will
result in a call to a symbol called _find_next_zero_bit as defined by
include/linux/find.h rather than a symbol with the rust_helper_ prefix.
This is because when a symbol is present in both include/ and
rust/helpers/, the one from include/ wins under the assumption that the
current configuration is one where that helper is unnecessary. This
heuristic fails for _find_next_zero_bit() because the header file always
declares it even if the symbol does not exist.
The functions still use the __rust_helper annotation. This lets the
wrapper function be inlined into Rust code even if full kernel LTO is
not used once the patch series for that feature lands.
Yury: arches are free to implement they own find_bit() functions. Most
rely on generic implementation, but arm32 and m86k - not; so they require
custom handling. Alice confirmed it fixes the build for both.
Cc: stable@vger.kernel.org Fixes: 6cf93a9ed39e ("rust: add bindings for bitops.h") Reported-by: Andreas Hindborg <a.hindborg@kernel.org> Closes: https://rust-for-linux.zulipchat.com/#narrow/channel/x/topic/x/near/561677301 Tested-by: Andreas Hindborg <a.hindborg@kernel.org> Reviewed-by: Dirk Behme <dirk.behme@de.bosch.com> Signed-off-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Gal Pressman [Mon, 12 Jan 2026 17:37:14 +0000 (19:37 +0200)]
selftests: drv-net: fix RPS mask handling in toeplitz test
The toeplitz.py test passed the hex mask without "0x" prefix (e.g.,
"300" for CPUs 8,9). The toeplitz.c strtoul() call wrongly parsed this
as decimal 300 (0x12c) instead of hex 0x300.
Pass the prefixed mask to toeplitz.c, and the unprefixed one to sysfs.
Fixes: 9cf9aa77a1f6 ("selftests: drv-net: hw: convert the Toeplitz test to Python") Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260112173715.384843-2-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
syzbot reported use-after-free of inet6_ifaddr in
inet6_addr_del(). [0]
The cited commit accidentally moved ipv6_del_addr() for
mngtmpaddr before reading its ifp->flags for temporary
addresses in inet6_addr_del().
Let's move ipv6_del_addr() down to fix the UAF.
[0]:
BUG: KASAN: slab-use-after-free in inet6_addr_del.constprop.0+0x67a/0x6b0 net/ipv6/addrconf.c:3117
Read of size 4 at addr ffff88807b89c86c by task syz.3.1618/9593
Eric Dumazet [Mon, 12 Jan 2026 10:38:25 +0000 (10:38 +0000)]
dst: fix races in rt6_uncached_list_del() and rt_del_uncached_list()
syzbot was able to crash the kernel in rt6_uncached_list_flush_dev()
in an interesting way [1]
Crash happens in list_del_init()/INIT_LIST_HEAD() while writing
list->prev, while the prior write on list->next went well.
static inline void INIT_LIST_HEAD(struct list_head *list)
{
WRITE_ONCE(list->next, list); // This went well
WRITE_ONCE(list->prev, list); // Crash, @list has been freed.
}
Issue here is that rt6_uncached_list_del() did not attempt to lock
ul->lock, as list_empty(&rt->dst.rt_uncached) returned
true because the WRITE_ONCE(list->next, list) happened on the other CPU.
We might use list_del_init_careful() and list_empty_careful(),
or make sure rt6_uncached_list_del() always grabs the spinlock
whenever rt->dst.rt_uncached_list has been set.
A similar fix is neeed for IPv4.
[1]
BUG: KASAN: slab-use-after-free in INIT_LIST_HEAD include/linux/list.h:46 [inline]
BUG: KASAN: slab-use-after-free in list_del_init include/linux/list.h:296 [inline]
BUG: KASAN: slab-use-after-free in rt6_uncached_list_flush_dev net/ipv6/route.c:191 [inline]
BUG: KASAN: slab-use-after-free in rt6_disable_ip+0x633/0x730 net/ipv6/route.c:5020
Write of size 8 at addr ffff8880294cfa78 by task kworker/u8:14/3450
The buggy address belongs to the object at ffff8880294cfa00
which belongs to the cache ip6_dst_cache of size 232
The buggy address is located 120 bytes inside of
freed 232-byte region [ffff8880294cfa00, ffff8880294cfae8)
Fixes: 8d0b94afdca8 ("ipv6: Keep track of DST_NOCACHE routes in case of iface down/unregister") Fixes: 78df76a065ae ("ipv4: take rt_uncached_lock only if needed") Reported-by: syzbot+179fc225724092b8b2b2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6964cdf2.050a0220.eaf7.009d.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20260112103825.3810713-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
RSS configuration requires a valid RX indirection table. When the device
reports a single receive queue, rndis_filter_device_add() does not
allocate an indirection table, accepting RSS hash key updates in this
state leads to a hang.
Fix this by gating netvsc_set_rxfh() on ndc->rx_table_sz and return
-EOPNOTSUPP when the table is absent. This aligns set_rxfh with the device
capabilities and prevents incorrect behavior.
Fixes: 962f3fee83a4 ("netvsc: add ethtool ops to get/set RSS key") Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com> Reviewed-by: Dipayaan Roy <dipayanroy@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://patch.msgid.link/1768212093-1594-1-git-send-email-gargaditya@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
igc: Reduce TSN TX packet buffer from 7KB to 5KB per queue
The previous 7 KB per queue caused TX unit hangs under heavy
timestamping load. Reducing to 5 KB avoids these hangs and matches
the TSN recommendation in I225/I226 SW User Manual Section 7.5.4.
The 8 KB "freed" by this change is currently unused. This reduction
is not expected to impact throughput, as the i226 is PCIe-limited
for small TSN packets rather than TX-buffer-limited.
Fixes: 0d58cdc902da ("igc: optimize TX packet buffer utilization for TSN mode") Reported-by: Zdenek Bouska <zdenek.bouska@siemens.com> Closes: https://lore.kernel.org/netdev/AS1PR10MB5675DBFE7CE5F2A9336ABFA4EBEAA@AS1PR10MB5675.EURPRD10.PROD.OUTLOOK.COM/ Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Chwee-Lin Choong <chwee.lin.choong@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Chwee-Lin Choong [Fri, 28 Nov 2025 10:53:04 +0000 (18:53 +0800)]
igc: fix race condition in TX timestamp read for register 0
The current HW bug workaround checks the TXTT_0 ready bit first,
then reads TXSTMPL_0 twice (before and after reading TXSTMPH_0)
to detect whether a new timestamp was captured by timestamp
register 0 during the workaround.
This sequence has a race: if a new timestamp is captured after
checking the TXTT_0 bit but before the first TXSTMPL_0 read, the
detection fails because both the "old" and "new" values come from
the same timestamp.
Fix by reading TXSTMPL_0 first to establish a baseline, then
checking the TXTT_0 bit. This ensures any timestamp captured
during the race window will be detected.
Old sequence:
1. Check TXTT_0 ready bit
2. Read TXSTMPL_0 (baseline)
3. Read TXSTMPH_0 (interrupt workaround)
4. Read TXSTMPL_0 (detect changes vs baseline)
New sequence:
1. Read TXSTMPL_0 (baseline)
2. Check TXTT_0 ready bit
3. Read TXSTMPH_0 (interrupt workaround)
4. Read TXSTMPL_0 (detect changes vs baseline)
Fixes: c789ad7cbebc ("igc: Work around HW bug causing missing timestamps") Suggested-by: Avi Shalev <avi.shalev@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Co-developed-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Chwee-Lin Choong <chwee.lin.choong@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Kurt Kanzenbach [Thu, 20 Nov 2025 08:18:29 +0000 (09:18 +0100)]
igc: Restore default Qbv schedule when changing channels
The Multi-queue Priority (MQPRIO) and Earliest TxTime First (ETF) offloads
utilize the Time Sensitive Networking (TSN) Tx mode. This mode is always
coupled to IEEE 802.1Qbv time aware shaper (Qbv). Therefore, the driver
sets a default Qbv schedule of all gates opened and a cycle time of
1s. This schedule is set during probe.
However, the following sequence of events lead to Tx issues:
- Boot a dual core system
igc_probe():
igc_tsn_clear_schedule():
-> Default Schedule is set
Note: At this point the driver has allocated two Tx/Rx queues, because
there are only two CPUs.
- ethtool -L enp3s0 combined 4
igc_ethtool_set_channels():
igc_reinit_queues()
-> Default schedule is gone, per Tx ring start and end time are zero
Therefore, restore the default Qbv schedule after changing the number of
channels.
Furthermore, add a restriction to not allow queue reconfiguration when
TSN/Qbv is enabled, because it may lead to inconsistent states.
Fixes: c814a2d2d48f ("igc: Use default cycle 'start' and 'end' values for queues") Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Ding Hui [Sat, 6 Dec 2025 13:46:09 +0000 (21:46 +0800)]
ice: Fix incorrect timeout ice_release_res()
The commit 5f6df173f92e ("ice: implement and use rd32_poll_timeout for
ice_sq_done timeout") converted ICE_CTL_Q_SQ_CMD_TIMEOUT from jiffies
to microseconds.
But the ice_release_res() function was missed, and its logic still
treats ICE_CTL_Q_SQ_CMD_TIMEOUT as a jiffies value.
So correct the issue by usecs_to_jiffies().
Found by inspection of the DDP downloading process.
Compile and modprobe tested only.
Fixes: 5f6df173f92e ("ice: implement and use rd32_poll_timeout for ice_sq_done timeout") Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Dave Ertman [Thu, 20 Nov 2025 17:58:26 +0000 (09:58 -0800)]
ice: Avoid detrimental cleanup for bond during interface stop
When the user issues an administrative down to an interface that is the
primary for an aggregate bond, the prune lists are being purged. This
breaks communication to the secondary interface, which shares a prune
list on the main switch block while bonded together.
For the primary interface of an aggregate, avoid deleting these prune
lists during stop, and since they are hardcoded to specific values for
the default vlan and QinQ vlans, the attempt to re-add them during the
up phase will quietly fail without any additional problem.
Fixes: 1e0f9881ef79 ("ice: Flesh out implementation of support for SRIOV on bonded interface") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jacob Keller [Thu, 20 Nov 2025 20:20:41 +0000 (12:20 -0800)]
ice: initialize ring_stats->syncp
The u64_stats_sync structure is empty on 64-bit systems. However, on 32-bit
systems it contains a seqcount_t which needs to be initialized. While the
memory is zero-initialized, a lack of u64_stats_init means that lockdep
won't get initialized properly. Fix this by adding u64_stats_init() calls
to the rings just after allocation.
Fixes: 2b245cb29421 ("ice: Implement transmit and NAPI support") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Linus Torvalds [Tue, 13 Jan 2026 18:04:34 +0000 (10:04 -0800)]
Merge tag 'gfs2-for-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 revert from Andreas Gruenbacher:
"Revert bad commit "gfs2: Fix use of bio_chain"
I was originally assuming that there must be a bug in gfs2
because gfs2 chains bios in the opposite direction of what
bio_chain_and_submit() expects.
It turns out that the bio chains are set up in "reverse direction"
intentionally so that the first bio's bi_end_io callback is invoked
rather than the last bio's callback.
We want the first bio's callback invoked for the following reason: The
initial bio starts page aligned and covers one or more pages. When it
terminates at a non-page-aligned offset, subsequent bios are added to
handle the remaining portion of the final page.
Upon completion of the bio chain, all affected pages need to be be
marked as read, and only the first bio references all of these pages"
* tag 'gfs2-for-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
Revert "gfs2: Fix use of bio_chain"
Linus Torvalds [Tue, 13 Jan 2026 17:50:07 +0000 (09:50 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull x86 kvm fixes from Paolo Bonzini:
- Avoid freeing stack-allocated node in kvm_async_pf_queue_task
- Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
selftests: kvm: Verify TILELOADD actually #NM faults when XFD[18]=1
selftests: kvm: try getting XFD and XSAVE state out of sync
selftests: kvm: replace numbered sync points with actions
x86/fpu: Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1
x86/kvm: Avoid freeing stack-allocated node in kvm_async_pf_queue_task
Linus Torvalds [Tue, 13 Jan 2026 17:37:49 +0000 (09:37 -0800)]
Merge tag 'hyperv-fixes-signed-20260112' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
- Minor fixes and cleanups for the MSHV driver
* tag 'hyperv-fixes-signed-20260112' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
mshv: release mutex on region invalidation failure
hyperv: Avoid -Wflex-array-member-not-at-end warning
mshv: hide x86-specific functions on arm64
mshv: Initialize local variables early upon region invalidation
mshv: Use PMD_ORDER instead of HPAGE_PMD_ORDER when processing regions
Eric Dumazet [Wed, 7 Jan 2026 10:41:59 +0000 (10:41 +0000)]
net: add net.core.qdisc_max_burst
In blamed commit, I added a check against the temporary queue
built in __dev_xmit_skb(). Idea was to drop packets early,
before any spinlock was acquired.
if (unlikely(defer_count > READ_ONCE(q->limit))) {
kfree_skb_reason(skb, SKB_DROP_REASON_QDISC_DROP);
return NET_XMIT_DROP;
}
It turned out that HTB Qdisc has a zero q->limit.
HTB limits packets on a per-class basis.
Some of our tests became flaky.
Add a new sysctl : net.core.qdisc_max_burst to control
how many packets can be stored in the temporary lockless queue.
Also add a new QDISC_BURST_DROP drop reason to better diagnose
future issues.
Tetsuo Handa [Thu, 8 Jan 2026 12:36:48 +0000 (21:36 +0900)]
bpf: Fix reference count leak in bpf_prog_test_run_xdp()
syzbot is reporting
unregister_netdevice: waiting for sit0 to become free. Usage count = 2
problem. A debug printk() patch found that a refcount is obtained at
xdp_convert_md_to_buff() from bpf_prog_test_run_xdp().
According to commit ec94670fcb3b ("bpf: Support specifying ingress via
xdp_md context in BPF_PROG_TEST_RUN"), the refcount obtained by
xdp_convert_md_to_buff() will be released by xdp_convert_buff_to_md().
Therefore, we can consider that the error handling path introduced by
commit 1c1949982524 ("bpf: introduce frags support to
bpf_prog_test_run_xdp()") forgot to call xdp_convert_buff_to_md().