The comment for calculating the prg_rclk_dly value is incorrect as it
omits the brackets around the divisor. Add the brackets to allow the
reader to correctly evaluate the value. Validated with the values given
in the driver.
net: stmmac: qcom-ethqos: move loopback decision next to reg update
Move the loopback decision next to the register update, and make the
local variable unsigned. As a result, there is now no need for the
comment referring to the programming being later.
Rather than coding the entire register update twice with different
values, use a local variable to specify the value and have one
register update statement that uses this local variable. This results
in neater code.
net: stmmac: qcom-ethqos: finally eliminate the switch
Move the RCLK delay configuration out of the switch, which just leaves
the RGMII_CONFIG_LOOPBACK_EN setting in all three paths. This makes it
trivial to eliminate the switch.
Move RGMII_CONFIG2_RX_PROG_SWAP out of the switch. 1G speed always
sets this field. 100M and 10M sets it for has_emac_ge_3 devices,
otherwise it is cleared.
Move the speed programming for 100M and 10M out of the switch. There
is no programming done for 1G speed.
It looks like there are two fields, 7:6 which are programemd to '1'
to select a /2 divisor for 100M, and bits 16:8 which are programmed
to '19' to select a /20 divisor.
net: stmmac: qcom-ethqos: move two more RGMII_IO_MACRO_CONFIG2 out
RGMII_CONFIG2_DATA_DIVIDE_CLK_SEL is always cleared, and
RGMII_CONFIG2_TX_CLK_PHASE_SHIFT_EN is always updated with the phase
shift in each path through the switch, so these are independent of
the speed. Move them out.
net: stmmac: qcom-ethqos: move 1G vs 100M/10M RGMII settings
Move RGMII_CONFIG_BYPASS_TX_ID_EN, RGMII_CONFIG_POS_NEG_DATA_SEL and
RGMII_CONFIG_PROG_SWAP. There are two states for these: one group for
1G, and the logical inversion for 100M and 10M. Move this out of the
switch into an if-else clause.
net: stmmac: qcom-ethqos: move detection of invalid RGMII speed
Move detection of invalid RGMII speeds (which will never be triggered)
before the switch() to allow register modifications that are common to
all speeds to be moved out of the switch.
Since ethqos_fix_mac_speed() is called via a function pointer, and only
indirects via the configure_func function pointer, eliminate this
unnecessary indirection.
net: stmmac: qcom-ethqos: pass ethqos to ethqos_pcs_set_inband()
Rather than getting the stmmac_priv pointer in
ethqos_configure_sgmii(), move it into ethqos_pcs_set_inband() and pass
the struct qcom_ethqos pointer instead.
ethqos_configure() does nothing more than indirect via
ethqos->configure_func, and is only called from ethqos_fix_mac_speed()
just below. Move the indirect call into ethqos_fix_mac_speed().
Jan Hoffmann [Sun, 29 Mar 2026 19:11:11 +0000 (21:11 +0200)]
net: sfp: add quirk for ZOERAX SFP-2.5G-T
This is a 2.5G copper module which appears to be based on a Motorcomm
YT8821 PHY. There doesn't seem to be a usable way to to access the PHY
(I2C address 0x56 provides only read-only C22 access, and Rollball is
also not working).
The module does not report the correct extended compliance code for
2.5GBase-T, and instead claims to support SONET OC-48 and Fibre Channel:
Despite this, the kernel still enables the correct 2500Base-X interface
mode. However, for the module to actually work, it is also necessary to
disable inband auto-negotiation.
Enable the existing "sfp_quirk_oem_2_5g" for this module, which handles
that and also sets the bit for 2500Base-T link mode.
====================
net: hsr: subsystem cleanups and modernization
This series contains two focused HSR cleanups with practical benefit.
It constifies protocol operation tables and replaces a hardcoded
function name with __func__ to keep diagnostics correct across
refactoring.
====================
Luka Gejak [Thu, 26 Mar 2026 17:46:00 +0000 (18:46 +0100)]
net: hsr: use __func__ instead of hardcoded function name
Replace the hardcoded string "hsr_get_untagged_frame" with the
standard __func__ macro in netdev_warn_once() call to make the code
more robust to refactoring.
Luka Gejak [Thu, 26 Mar 2026 17:45:59 +0000 (18:45 +0100)]
net: hsr: constify hsr_ops and prp_ops protocol operation structures
The hsr_ops and prp_ops structures are assigned to hsr->proto_ops during
device initialization and are never modified at runtime. Declaring them
as const allows the compiler to place these structures in read-only
memory, which improves security by preventing accidental or malicious
modification of the function pointers they contain.
The proto_ops field in struct hsr_priv is also updated to a const
pointer to maintain type consistency.
Jakub Kicinski [Sun, 29 Mar 2026 21:35:29 +0000 (14:35 -0700)]
Merge branch 'macb-usrio-tsu-patches'
Conor Dooley says:
====================
macb usrio/tsu patches
At the very least, it'd be good of the soc vendor folks could check
their platforms and see if their usrio stuff actually lines up with what
the driver currently calls "macb_default_usrio". Ours didn't and it was
a nasty surprise.
Ryan and I figured out that the sama7g5 stuff is not actually using the
same usrio bits as earlier devices, so there's now more patches in this
series to split them apart. I've not tested the split or the new
property due to lack of hardware, but Ryan has.
Marking this stuff net-next, because although they're fixes I don't see
any particular urgency, and it avoids creating some dependencies between
cleanup items and the fixes.
====================
Théo Lebrun [Wed, 25 Mar 2026 16:28:18 +0000 (16:28 +0000)]
net: macb: drop usrio pointer on EyeQ5 config
USRIO is disabled on this platform, drop its inherited usrio config.
We will end up with MACB_CAPS_USRIO_DISABLED on this platform:
- We have no config->usrio so macb_configure_caps() deduces that the
feature is disabled.
- Anecdotally, we would also land in the runtime detection codepath
that reads DCFG1.
Théo Lebrun [Wed, 25 Mar 2026 16:28:17 +0000 (16:28 +0000)]
net: macb: set MACB_CAPS_USRIO_DISABLED if no usrio config is provided
bp->usrio is copied directly from dt_conf->usrio in macb_probe().
If dt_conf->usrio is NULL, we do not want to land in USRIO write
codepaths which dereference bp->usrio. Inherit automatically
MACB_CAPS_USRIO_DISABLED to avoid those.
This means a macb_config that wants to disable usrio can simply drop
its .usrio field, rather than add the disabled capability explicitly.
Nit: drop the dt_conf NULL check because the pointer is always valid.
DCFG1 (design config 1 register) carries a bit indicating whether User
I/O feature has been enabled or not. The MACB/GEM driver has a cap flag
indicating that HW has the feature disabled (default is enabled). Add
the missing connection between DCFG1 bit and MACB_CAPS_USRIO_DISABLED.
Indirect impact: avoid useless writel() on USERIO register; this is not
an important fix because USERIO is anyway read-only when feature is
disabled.
If for some reason a compatible sets USRIO_DISABLED but DCFG1 indicates
it is enabled, we still keep the disabled capability flag. This ensures
we don't break "cdns,np4-macb" that sets the flag from compatible match
data.
Conor Dooley [Wed, 25 Mar 2026 16:28:15 +0000 (16:28 +0000)]
net: macb: timer adjust mode is not supported
The ptp portion of this driver controls the tsu's timer using the
controls for "increment mode", which is not compatible with the hardware
trying to control it via the gem_tsu_inc_ctrl and gem_tsu_ms inputs in
"timer adjust mode". Abort probe if the property signalling that the
relevant signals have been wired up is present.
The GEM IP has two methods for modifying the ptp timer. The first of
these, named "increment mode", relies on software controlling the timer
by setting tsu_timer_incr and tsu_timer_incr_sub_nsec and performing
once-off adjustments via the tsu_timer_adjust register. This is what the
macb driver uses. The second mechanism, "timer adjust mode" uses the
gem_tsu_inc_ctrl and gem_tsu_ms signals to control the timer. These
modes are not intended to be used in parallel, but both can be possible
on the same device and which mode is used cannot be determined from the
compatible on all devices, because some users of the GEM IP are SoC
FPGAs that permit configuring how the IP is wired up.
Add a property to indicate that gem_tsu_inc_ctrl and gem_tsu_ms are wired
up for timer adjust mode.
Conor Dooley [Wed, 25 Mar 2026 16:28:13 +0000 (16:28 +0000)]
net: macb: clean up tsu clk rate acquisition
tsu_clk is grabbed during probe, so doesn't need to be re-grabbed here.
pclk is mandatory, probe will fail if it is err/NULL, so there's no need
to check it here or have a !pclk 3rd arm. Simplify gem_get_tsu_rate() to
account for these facts.
Conor Dooley [Wed, 25 Mar 2026 16:28:12 +0000 (16:28 +0000)]
net: macb: warn on pclk use as a tsu_clk fallback
The Candence GEM IP has a configuration parameter which determines the
source of the clock used for the timestamp unit (if it is enabled),
switching it between using the pclk and a dedicated input.
When ptp support was added to the macb driver, a new tsu_clk was added
to represent the dedicated input. While this is understandable, I think
it is bug prone and that the tsu_clk should represent whatever clock is
used for the timestamper and not just that specific input.
>From a driver point of view, the benefit of taking the conceptual
approach is avoiding misconfiguring the driver when the hardware
supports ptp (and it is set as a capability in the relevant per-device
structure) but no tsu_clk is provided in devicetree. At the moment, the
timestamper will be registered and programmed with an increment that
reflects the pclk in these cases, but will malfunction if the pclk and
tsu_clk frequencies do not match. Obviously, this means the devicetree
incorrectly represents the hardware, but this change in approach would
make the driver more resilient without meaningfully impacting correctly
described users.
Out of the devices that claim MACB_CAPS_GEM_HAS_PTP the fu540, mpfs,
sama5d2 and sama7g5-emac (but not sama7g5-gem) are at risk of having
this problem with the in-kernel devicetrees. mpfs and sama7g5-emac
have been confirmed to be incorrect, and sama5d2 is correct. It may be
that the other platforms actually do use the pclk for the timestamper
(either by supplying pclk to the tsu_clk input of the IP, or by having
the IP block configured to use pclk instead of the tsu_clk input), but
at least two are wrong, as they do not use pclk for the tsu_clk, so the
driver is registering the ptp clock incorrectly.
Add a warning if no tsu_clk is provided on a platform that uses the
timerstamper, to encourage people to specifically provide a tsu_clk and
avoid silently registering the timerstamper with the wrong clock. If the
pclk is actually used, it can be provided as a tsu_clk for improved
clarity in devicetrees.
While this changes the meaning of the devicetree property, it is
backwards compatible as there's no functional change for platforms that
didn't provide a tsu_clk and the changed meaning of providing a tsu_clk
in the devicetree does not impact platforms that already provided one as
the decision about the tsu clock source is at IP instantiation time
rather than at runtime, so there's no driver behaviour that needs to
change based on the input to the IP used for the timestamping unit.
Conor Dooley [Wed, 25 Mar 2026 16:28:11 +0000 (16:28 +0000)]
net: macb: add mpfs specific usrio configuration
On mpfs the driver needs to make sure the tsu clock source is not the
fabric, as this requires that the hardware is in Timer Adjust mode,
which is not compatible with the linux driver trying to control the
hardware. It is unlikely that this will be set, as the peripheral is
reset during probe, but if the resets are not provided in devicetree
it's probable that this bit is set incorrectly, as U-Boot's macb driver
has the same issue with using usrio settings for at91 platforms as the
default.
Conor Dooley [Wed, 25 Mar 2026 16:28:09 +0000 (16:28 +0000)]
net: macb: rework usrio refclk selection code
The USRIO based refclk selection code abuses a capability flag to set
the refclk to an external source based on match data/compatible on
sama7g5-emac and use an internal source for the gmac.
Ryan previously added a property in an attempt to decouple the refclk
source from the compatible, because this is not fixed by compatible
and there's variance based on the choices made by board designers.
Originally when Ryan added it, he removed the capability flag entirely
from match data, but this changed the default for the sama7g5-emac and
the removal had to be reverted for these devices. Because these devices
default to an external refclk, and the current property is only capable
of communicating external refclks, there's no way to make the
sama7g5-emac use an internal refclk.
Additionally, this property has no limiting based on compatible, and
if used on a platform with an external refclk that is not controlled
by USRIO the capability would be erroneously set. Because of the reuse
of the at91_default_usrio struct by non-at91 devices, this could cause
the refclk bit to be set in error, on a system where the refclk is
externally provided without usrio settings being required.
Change the new capability flag so that it actually represents the
hardware being capable of controlling the refclk source via USRIO,
and move the selection of default behaviour into the macb_usrio_config
struct provided as part of match data.
Modify the devicetree code to support a new property,
"cdns,refclk-source" which will support devices with either default,
retaining support for "cdns,refclk-external" for compatibility reasons.
Conor Dooley [Wed, 25 Mar 2026 16:28:08 +0000 (16:28 +0000)]
dt-bindings: net: cdns,macb: replace cdns,refclk-ext with cdns,refclk-source
Ryan added cdns,refclk-ext with the intent of decoupling the source of
the reference clock on sama7g5 (and related platforms) from the
compatible. Unfortunately, the default for sama7g5-emac is an external
reference clock, so this property had no effect there, so that
compatibility with older devicetrees is preserved.
Replace cdns,refclk-ext with one that supports both default states and
therefore is usable for sama7g5-emac.
For now, limit it to only the platforms that have USRIO controlled
reference clock selection, but this could be generalised in the future.
The existing property only works on devices that are compatible with
sama7g5-gem, so mark it deprecated, and limit its use to that specific
scenario.
Conor Dooley [Wed, 25 Mar 2026 16:28:07 +0000 (16:28 +0000)]
net: macb: split USRIO_HAS_CLKEN capability in two
While trying to rework the internal/external refclk selection on
sama7g5, Ryan and I noticed that the sama7g5 was "overloading" the
meaning of MACB_CAPS_USRIO_HAS_CLKEN, using it differently to how it was
originally intended.
Originally, on the macb hardware on sam9620 et al,
MACB_CAPS_USRIO_HAS_CLKEN represented the hardware having a bit that
needed to be set to turn on the input clock to the transceivers. The
sama7g5 doesn't have this bit, so for some reason the decision was made
to reuse this capability flag to control selection of internal/external
references.
Split the caps in two, so that capabilities do what they say on the tin,
and allow reworking the refclk selection handling without impacting the
older devices that use MACB_CAPS_USRIO_CLKEN for its original purpose.
Conor Dooley [Wed, 25 Mar 2026 16:28:06 +0000 (16:28 +0000)]
net: macb: rename macb_default_usrio to at91_default_usrio as not all platforms have mii mode control in usrio
Calling this structure macb_default_usrio is misleading, I believe, as
it implies that it should be used if your platform has nothing special
to do in usrio. Since usrio is platform dependent, the default here is
probably for each usrio to do nothing, with the macb documentation I
have access to prescribing no standard behaviour here. We noticed that
this was problematic because on mpfs, a bit that macb_default_usrio
sets to deal with the MII mode actually changes the source for the
tsu_clk to something with how the majority of mpfs devices are actually
configured!
Rename it to at91_default_usrio, since that's where the values actually
come from for these. I have no idea if any of the other platforms that
use the default actually copied at91's usrio configuration or if they
have usrio configurations where what the driver does has no impact.
Gate touching these bits behind a capability, like the clken refclock
usrio knob, so that platforms without the MII mode stuff can avoid
running this code.
Conor Dooley [Wed, 25 Mar 2026 16:28:05 +0000 (16:28 +0000)]
Revert "net: macb: Clean up the .usrio settings in macb_config instances"
Commit 0ae998c4efd69 ("net: macb: Clean up the .usrio settings in
macb_config instances") was a misguided attempt to clean up the driver
that actually just propagated problematic code. The default for usrio is
actually no usrio, and already there are issues with people using the
problematically named "macb_default_usrio" on platforms where the usrio
does not have this so-called default behaviour. usrio is platform
specific and using the default at91 usrio settings should be opt-in
only. Revert the "cleanup" patch.
====================
bnxt_en: Add XDP RSS hash metadata support
This series adds XDP RSS hash metadata extraction support for the bnxt_en
driver and includes selftests to validate the functionality. I was able
to test this on a BCM57414 NIC.
====================
This test loads xdp_metadata.bpf which calls bpf_xdp_metadata_rx_hash() on
incoming packets. The metadata from that packet is then sent to a BPF
map for validation. It borrows structure from xdp.py, reusing common
functions.
The test checks the device's xdp-rx-metadata-features via netlink
before running and skips on devices that do not advertise hash support.
This can be run on veth devices as well as real hardware.
The test is fairly simple and just verifies that a TCP or UDP packet can be
identified as an L4 flow. This minimal test also passes if run on a veth
device.
Chris J Arges [Wed, 25 Mar 2026 20:09:51 +0000 (15:09 -0500)]
selftests: net: move common xdp.py functions into lib
This moves a few functions which can be useful to other python programs
that manipulate XDP programs. This also refactors xdp.py to use the
refactored functions.
Chris J Arges [Wed, 25 Mar 2026 20:09:49 +0000 (15:09 -0500)]
bnxt_en: Move bnxt_rss_ext_op into header
This allows bnxt_rss_ext_op to be used by other functions. In addition this
modifies the rxcmp argument to be const since the function only reads from
this structure.
Add support for extracting RSS hash values and hash types from hardware
completion descriptors in XDP programs for bnxt_en.
Add IP_TYPE definition for determining if completion is ipv4 or ipv6. In
addition add ITYPE_ICMP flag for identifying ICMP completions.
Signed-off-by: Chris J Arges <carges@cloudflare.com> Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Chris J Arges [Wed, 25 Mar 2026 20:09:47 +0000 (15:09 -0500)]
bnxt_en: use bnxt_xdp_buff for xdp context
This adds bnxt_xdp_buff which embeds the xdp_buff struct and stores
pointers to hardware RX completion descriptors (rx_cmp and rx_cmp_ext)
along with the completion type.
Linus Walleij [Fri, 27 Mar 2026 12:23:45 +0000 (13:23 +0100)]
net: dsa: qca8k: Use the right GPIO header
The driver header for qca8k includes the legacy GPIO header
<linux/gpio.h> but does not use any symbols from it and actually
wants <linux/gpio/consumer.h> so fix this up.
Eric Dumazet [Fri, 27 Mar 2026 04:06:46 +0000 (04:06 +0000)]
tcp: use __jhash_final() in inet6_ehashfn()
I misread jhash2() implementation.
Last round should use __jhash_final() instead of __jhash_mix().
Using __jhash_mix() here leaves entropy distributed across a, b, and c,
which might lead to incomplete diffusion of the faddr and fport bits
into the bucket index. Replacing this last __jhash_mix() with
__jhash_final() provides the correct avalanche properties
for the returned value in c.
$ scripts/bloat-o-meter -t vmlinux.0 vmlinux
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-4 (-4)
Function old new delta
inet6_ehashfn 306 302 -4
Total: Before=25155089, After=25155085, chg -0.00%
Fixes: 854587e69ef3 ("tcp: improve inet6_ehashfn() entropy") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260327040646.3849503-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
Convert CONFIG_IPV6 to built-in and remove stubs
Historically, the Linux kernel has supported compiling the IPv6 stack as
a loadable module. While this made sense in the early days of IPv6
adoption, modern deployments and distributions overwhelmingly either
build IPv6 directly into the kernel (CONFIG_IPV6=y) or disable it
entirely (CONFIG_IPV6=n). The modular IPv6 use-case offers image size
and memory savings for specific setups, this benefit is outweighed by
the architectural burden it imposes on the subsystems on implementation
and maintenance.
In addition, most of the distributions are already using CONFIG_IPV6=y
by default [1], including openWRT [2] and Android gki_defconfig [3]. So
this won't have an impact on them. The most impacted architecture would
probably be arm64 as their default config is still using CONFIG_IPV6=m.
To allow core networking, BPF, Netfilter, and various device drivers to
safely interact with a potentially unloaded IPv6 module, the kernel
relies on indirect call structures like ipv6_stub, ipv6_bpf_stub, and
nf_ipv6_ops, along with dynamic RCU registrations for things like ICMPv6
senders.
This patch series addresses this by changing CONFIG_IPV6 from a tristate
to a boolean, enforcing that IPv6 is either built-in or disabled. This
allows us to completely rip out the stub infrastructures and safely
replace them with direct function calls.
The bloat-o-meter report the following results for m68k, arm64, x86_64
defconfig.
m68k (keep on mind that CONFIG_IPV6 is disabled now):
add/remove: 65/938 grow/shrink: 36/254 up/down: 3022/-49692 (-46670)
Considering that each new kernel release increases sizes by 30-40KiB on
average, this size increase isn't a huge jump for the distributions that
are still using CONFIG_IPV6=m. For the ones that are already using
CONFIG_IPV6=y, the size is reduced actually.
All the patches has been independently build tested. With allmodconfig
and allmodconfig + CONFIG_IPV6=n. In addition, net selftest has been run
against them on virtme-ng.
The series applied as a whole as been tested with allyesconfig and also
allyesconfig + CONFIG_IPV6=n but not all patches has been independently
tested this way.
netfilter: remove nf_ipv6_ops and use direct function calls
As IPv6 is built-in only, nf_ipv6_ops can be removed completely as it is
not longer necessary.
Convert all nf_ipv6_ops usage to direct function calls instead. In
addition, remove the ipv6_netfilter_init/fini() functions as they are
not necessary any longer.
bpf: remove ipv6_bpf_stub completely and use direct function calls
As IPv6 is built-in only, the ipv6_bpf_stub can be removed completely.
Convert all ipv6_bpf_stub usage to direct function calls instead. The
fallback functions introduced previously will prevent linkage errors
when CONFIG_IPV6 is disabled.
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Tested-by: Ricardo B. Marlière <rbm@suse.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260325120928.15848-10-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: convert remaining ipv6_stub users to direct function calls
As IPv6 is built-in only, the ipv6_stub infrastructure is no longer
necessary.
Convert remaining ipv6_stub users to make direct function calls. The
fallback functions introduced previously will prevent linkage errors
when CONFIG_IPV6 is disabled.
ipv4: drop ipv6_stub usage and use direct function calls
As IPv6 is built-in only, the ipv6_stub infrastructure is no longer
necessary.
The IPv4 stack interacts with IPv6 mainly to support IPv4 routes with
IPv6 next-hops (RFC 8950). Convert all these cross-family calls from
ipv6_stub to direct function calls. The fallback functions introduced
previously will prevent linkage errors when CONFIG_IPV6 is disabled.
drivers: net: drop ipv6_stub usage and use direct function calls
As IPv6 is built-in only, the ipv6_stub infrastructure is no longer
necessary.
Convert all drivers currently utilizing ipv6_stub to make direct
function calls. The fallback functions introduced previously will
prevent linkage errors when CONFIG_IPV6 is disabled.
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Tested-by: Ricardo B. Marlière <rbm@suse.com> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Antonio Quartulli <antonio@openvpn.net> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20260325120928.15848-7-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In preparation for dropping ipv6_stub and converting its users to direct
function calls, introduce static inline dummy functions and fallback
macros in the IPv6 networking headers. In addition, introduce checks on
fib6_nh_init(), ip6_dst_lookup_flow() and ip6_fragment() to avoid a
crash due to ipv6.disable=1 set during booting. The other functions are
safe as they cannot be called with ipv6.disable=1 set.
These fallbacks ensure that when CONFIG_IPV6 is completely disabled,
there are no compiling or linking errors due to code paths not guarded
by preprocessor macro IS_ENABLED(CONFIG_IPV6).
In addition, export ndisc_send_na(), ip6_route_input() and
ip6_fragment().
As IPv6 is built-in only, there is no need to maintain the sender
registration infrastructure used to allow built-in subsystems to send
ICMPv6 messages when IPv6 was compiled as a module.
Drop the registration mechanism and the __icmpv6_send() sender
implementation. While icmpv6_send() users could be converted to
icmp6_send() that doesn't seems necessary as none of them are using the
force_saddr parameter.
ipv6: replace IS_BUILTIN(CONFIG_IPV6) with IS_ENABLED(CONFIG_IPV6)
As IPv6 is built-in only, it does not make sense to continue using
IS_BUILTIN(CONFIG_IPV6). Therefore, replace it with IS_ENABLED() when
necessary and drop it if it isn't valid anymore.
Notice that there is still one instance related to ICMPv6, as it
requires more changes it will be handle separately.
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Tested-by: Ricardo B. Marlière <rbm@suse.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260325120928.15848-4-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ipv6: convert CONFIG_IPV6 to built-in only and clean up Kconfigs
Maintaining a modular IPv6 stack offers image size savings for specific
setups, this benefit is outweighed by the architectural burden it
imposes on the subsystems on implementation and maintenance. Therefore,
drop it.
Change CONFIG_IPV6 from tristate to bool. Remove all Kconfig
dependencies across the tree that explicitly checked for IPV6=m. In
addition, remove MODULE_DESCRIPTION(), MODULE_ALIAS(), MODULE_AUTHOR()
and MODULE_LICENSE().
This is also replacing module_init() by device_initcall(). It is not
possible to use fs_initcall() as IPv4 does because that creates a race
condition on IPv6 addrconf.
Finally, modify the default configs from CONFIG_IPV6=m to CONFIG_IPV6=y
except for m68k as according to the bloat-o-meter the image is
increasing by 330KB~ and that isn't acceptable. Instead, disable IPv6 on
this architecture by default. This is aligned with m68k RAM requirements
and recommendations [1].
[1] http://www.linux-m68k.org/faq/ram.html
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Tested-by: Ricardo B. Marlière <rbm@suse.com> Acked-by: Krzysztof Kozlowski <krzk@kernel.org> # arm64 Link: https://patch.msgid.link/20260325120928.15848-2-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Thu, 26 Mar 2026 20:32:33 +0000 (22:32 +0200)]
vrf: Remove unnecessary RCU protection around dst entries
During initialization of a VRF device, the VRF driver creates two dst
entries (for IPv4 and IPv6). They are attached to locally generated
packets that are transmitted out of the VRF ports (via the
l3mdev_l3_out() hook). Their purpose is to redirect packets towards the
VRF device instead of having the packets egress directly out of the VRF
ports. This is useful, for example, when a queuing discipline is
configured on the VRF device.
In order to avoid a NULL pointer dereference, commit b0e95ccdd775 ("net:
vrf: protect changes to private data with rcu") made the pointers to the
dst entries RCU protected. As far as I can tell, this was needed because
back then the dst entries were released (and the pointers reset to NULL)
before removing the VRF ports.
Later on, commit f630c38ef0d7 ("vrf: fix bug_on triggered by rx when
destroying a vrf") moved the removal of the VRF ports to the VRF
device's dellink() callback. As such, the tear down sequence of a VRF
device looks as follows:
1. VRF ports are removed.
2. VRF device is unregistered.
a. Device is closed.
b. An RCU grace period passes.
c. ndo_uninit() is called.
i. dst entries are released.
Given the above, the Tx path will always see the same fully initialized
dst entries and will never race with the ndo_uninit() callback.
Therefore, there is no need to make the pointers to the dst entries RCU
protected. Remove it as well as the unnecessary NULL checks in the Tx
path.
Ido Schimmel [Thu, 26 Mar 2026 20:32:32 +0000 (22:32 +0200)]
vrf: Use dst_dev_put() instead of using loopback device
Use dst_dev_put() to clean up the device referenced by the dst entry
instead of partially open coding it. Internally, the helper uses the
blackhole device instead of the loopback device.
Jakub Kicinski [Sat, 28 Mar 2026 03:57:40 +0000 (20:57 -0700)]
Merge branch 'net-stmmac-disable-eee-on-i-mx'
Laurent Pinchart says:
====================
net: stmmac: Disable EEE on i.MX
This small patch series fixes a long-standing interrupt storm issue with
stmmac on NXP i.MX platforms.
The initial attempt to fix^Wwork around the problem in DT ([1]) was
painfully but rightfully rejected by Russell, who helped me investigate
the issue in depth. It turned out that the root cause is a mistake in
how interrupts are wired in the SoC, a hardware bug that has been
replicated in all i.MX SoCs that integrate an stmmac. The only viable
solution is to disable EEE on those devices.
Individual patches explain the issue in more details. Patch 1/2,
authored by Russell, adds a new STMMAC_FLAG to disable EEE, and patch
2/2 sets the flag for i.MX platforms.
Laurent Pinchart [Wed, 25 Mar 2026 21:00:03 +0000 (23:00 +0200)]
net: stmmac: imx: Disable EEE
The i.MX8MP suffers from an interrupt storm related to the stmmac and
EEE. A long and tedious analysis ([1]) concluded that the SoC wires the
stmmac lpi_intr_o signal to an OR gate along with the main dwmac
interrupts, which causes an interrupt storm for two reasons.
First, there's a race condition due to the interrupt deassertion being
synchronous to the RX clock domain:
- When the PHY exits LPI mode, it restarts generating the RX clock
(clk_rx_i input signal to the GMAC).
- The MAC detects exit from LPI, and asserts lpi_intr_o. This triggers
the ENET_EQOS interrupt.
- Before the CPU has time to process the interrupt, the PHY enters LPI
mode again, and stops generating the RX clock.
- The CPU processes the interrupt and reads the GMAC4_LPI_CTRL_STATUS
registers. This does not clear lpi_intr_o as there's no clk_rx_i.
An attempt was made to fixing the issue by not stopping RX_CLK in Rx LPI
state ([2]). This alleviates the symptoms but doesn't fix the issue.
Since lpi_intr_o takes four RX_CLK cycles to clear, an interrupt storm
can still occur during that window. In 1000T mode this is harder to
notice, but slower receive clocks cause hundreds to thousands of
spurious interrupts.
Fix the issue by disabling EEE completely on i.MX8MP.
Some platforms have problems when EEE is enabled, and thus need a way
to disable stmmac EEE support. Add a flag before the other LPI related
flags which tells stmmac to avoid populating the phylink LPI
capabilities, which causes phylink to call phy_disable_eee() for any
PHY that is attached to the affected phylink instance.
iMX8MP is an example - the lpi_intr_o signal is wired to an OR gate
along with the main dwmac interrupts. Since lpi_intr_o is synchronous
to the receive clock domain, and takes four clock cycles to clear, this
leads to interrupt storms as the interrupt remains asserted for some
time after the LPI control and status register is read.
This problem becomes worse when the receive clock from the PHY stops
when the receive path enters LPI state - which means that lpi_intr_o
can not deassert until the clock restarts. Since the LPI state of the
receive path depends on the link partner, this is out of our control.
We could disable RX clock stop at the PHY, but that doesn't get around
the slow-to-deassert lpi_intr_o mentioned in the above paragraph.
Previously, iMX8MP worked around this by disabling gigabit EEE, but
this is insufficient - the problem is also visible at 100M speeds,
where the receive clock is slower.
There is extensive discussion and investigation in the thread linked
below, the result of which is summarised in this commit message.
net: mana: Use at least SZ_4K in doorbell ID range check
mana_gd_ring_doorbell() accesses offsets up to DOORBELL_OFFSET_EQ
(0xFF8) + 8 bytes = 4KB within each doorbell page. A db_page_size
smaller than SZ_4K is fundamentally incompatible with the driver:
doorbell pages would overlap and the device cannot function correctly.
Validate db_page_size at the source and fail the
probe early if the value is below SZ_4K. This ensures the doorbell ID
range check in mana_gd_register_device() can rely on db_page_size
being valid.
Fixes: 89fe91c65992 ("net: mana: hardening: Validate doorbell ID from GDMA_REGISTER_DEVICE response") Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260325180423.1923060-1-ernis@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 25 Mar 2026 15:28:01 +0000 (16:28 +0100)]
mlx5: shd: Gracefully avoid shared devlink creation when no usable SN is found
On some HW, not even fall-back "SERIALNO" is found in VPD data. Unify
the behavior with the case there are no VPD data at all and avoid
creation of shared devlink instance.
Fixes: 2a8c8a03f306 ("net/mlx5: Add a shared devlink instance for PFs on same chip") Reported-by: Adam Young <admiyo@amperemail.onmicrosoft.com> Closes: https://lore.kernel.org/all/bab5b6bc-aa42-4af1-80d1-e56bcef06bc2@amperemail.onmicrosoft.com/ Reported-by: Ben Copeland <ben.copeland@linaro.org> Closes: https://lore.kernel.org/all/20260324151014.860376-1-ben.copeland@linaro.org/ Signed-off-by: Jiri Pirko <jiri@nvidia.com> Tested-by: Ben Copeland <ben.copeland@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260325152801.236343-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
value changed: 0xffff888104a93a00 -> 0x0000000000000000
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 22632 Comm: syz.0.4135 Tainted: G W syzkaller #0
PREEMPT(full)
Tainted: [W]=WARN
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/24/2026
There is no race on ring accesses: reading/writing a partial pointer
would be fine, because the reading is done by the producer
which merely cares about NULL/non NULL.
Document and disable the warnings using data_race().
net: qrtr: fix endian handling of confirm_rx field
Convert confirm_rx to little endian when enqueueing and convert it back on
receive. This fixes control flow on big endian hosts, little endian is
unaffected.
On transmit, store confirm_rx as __le32 using cpu_to_le32(). On receive,
apply le32_to_cpu() before using the value. !! ensures the value is 0 or 1
in native endianness, so the conversion isn’t strictly required here, but
it is kept for consistency and clarity.
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Alexander Wilhelm <alexander.wilhelm@westermo.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
net: stmmac: remove unused and unimplemented AXI properties
commit afea03656add ("stmmac: rework DMA bus setting and introduce new
platform AXI structure") added support for parsing all the stmmac AXI
attributes, and added code to set most of the appropriate register bits
with three exceptions:
snps,kbbe
snps,mb
snps,rb
These were parsed by the driver, but the result of parsing was never
used by any of the cores.
Moreover, no DTS in the kernel makes use of these properties.
Thus, it doesn't make sense for the driver to parse these, so let's
remove them. Also remove them from the DT binding document.
====================
dt-bindings: remove unimplemented AXI snps,kbbe snps,mb and snps,rb
Remove the AXI snps,kbbe snps,mb and snps,rb properties as they have
not been used, and although the driver parses these, the code hasn't
ever used the parsed result. This parsing has now been removed.
These were introduced by commit afea03656add ("stmmac: rework DMA bus
setting and introduce new platform AXI structure").
Di Zhu [Mon, 23 Mar 2026 04:17:30 +0000 (12:17 +0800)]
virtio-net: enable NETIF_F_GRO_HW only if GRO-related offloads are supported
Negotiating VIRTIO_NET_F_CTRL_GUEST_OFFLOADS indicates the device
allows control over offload support, but the offloads that can be
controlled may have nothing to do with GRO (e.g., if neither GUEST_TSO4
nor GUEST_TSO6 is supported).
In such a setup, reporting NETIF_F_GRO_HW as available for the device
is too optimistic and misleading to the user.
Improve the situation by masking off NETIF_F_GRO_HW unless the device
possesses actual GRO-related offload capabilities. Out of an abundance
of caution, this does not change the current behaviour for hardware with
just v6 or just v4 GRO: current interfaces do not allow distinguishing
between v6/v4 GRO, so we can't expose them to userspace precisely.
virtio_net: sync RX buffer before reading the header
receive_buf() reads the virtio header through buf before
page_pool_dma_sync_for_cpu() runs in receive_small() or
receive_mergeable(). The header buffer is thus unsynchronized at the
point where flags and, for mergeable buffers, num_buffers are consumed.
Omar Elghoul reported that on s390x Secure Execution this showed up as
greatly reduced virtio-net performance together with "bad gso" and
"bad csum" messages in dmesg. This is because with SE sync actually
copies data, so the header is uninitialized.
Move the sync into receive_buf() so the
header is synchronized before any access through buf.
Tool use: Cursor with GPT-5.4 drafted the initial code move from prompt:
"in drivers/net/virtio_net.c, move page_pool_dma_sync_for_cpu on receive
path to before memory is accessed through buf".
The result and the commit log were reviewed and edited manually.
Yohei Kojima [Tue, 24 Mar 2026 17:20:28 +0000 (02:20 +0900)]
selftests: net: Remove unnecessary backslashes in fq_band_pktlimit.sh
Address "grep: warning: stray \ before white space" warning from GNU
grep 3.12. This warns the misplaced backslashes before whitespaces
(e.g. \\' ' or '\ ') which leads to unspecified behavior [1].
We can just remove the backslashes before whitespaces as POSIX says:
Enclosing characters in single-quotes ('') shall preserve the literal
value of each character within the single-quotes.
Jeremy Kerr [Tue, 24 Mar 2026 07:19:56 +0000 (15:19 +0800)]
net: mctp: avoid copy in fragmentation loop for near-MTU messages
Currently, we incorrectly send messages that are within 4 bytes (a
struct mctp_hdr) smaller than the MTU through mctp_do_fragment_route().
This has no effect on the actual fragmentation, as we will still send as
one packet, but unnecessarily copies the original skb into a new
single-fragment skb.
Instead of having the MTU comparisons in both mctp_local_output() and
mctp_do_fragment_route(), feed all local messages through the latter,
and add the single-packet optimisation there.
This means we can coalesce the routing path of mctp_local_output, so our
out_release path is now solely for errors, so rename the label
accordingly.
Include a check in the route tests for the single-packet case too.
Reported-by: yuanzhaoming <yuanzm2@lenovo.com> Closes: https://github.com/openbmc/linux/commit/269936db5eb3962fe290b1dc4dbf1859cd5a04dd#r175836230 Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260324-dev-mtu-copy-v1-1-7af6bd7027d3@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Iurman [Tue, 24 Mar 2026 09:14:34 +0000 (10:14 +0100)]
selftests: add check for seg6 tunsrc
Extend srv6_hencap_red_l3vpn_test.sh to include checks for the new
"tunsrc" feature. If there is no support for tunsrc, it silently
falls back to the encap config without tunsrc.
Justin Iurman [Tue, 24 Mar 2026 09:14:33 +0000 (10:14 +0100)]
seg6: add per-route tunnel source address
Add SEG6_IPTUNNEL_SRC in the uapi for users to configure a specific
tunnel source address. Make seg6_iptunnel handle the new attribute
correctly. It has priority over the configured per-netns tunnel
source address, if any.
net: phylink: use phylink_expects_phy() in phylink_fwnode_phy_connect()
The tests in phylink_expects_phy() and phylink_fwnode_phy_connect() are
identical (by intention). Use phylink_expects_phy() to decide whether
to ignore a call to phylink_fwnode_phy_connect().
This small series does a bit of cleanup in the dwmad-socfpga glue
driver, especially around the .fix_mac_speed() operation.
It's mostly about re-using existing helpers from the glue driver, as
well as reorganizing the code to make the local private structures a
little bit smaller.
No functionnal changes are intended, this was tested on cyclone V.
====================
net: stmmac: dwmac-sofcpga: Drop the struct device reference
We keep a reference to our the struct device in the socfpga_dwmac priv
structure, but now it's only ever used to produce logs in the
.set_phy_mode() ops, that are specific to this driver.
When we call that ops, we always have a ref to the struct device around,
so let's pass it to .set_phy_mode(). We can now discard that reference
from struct socfpga_dwmac.
Jakub Kicinski [Fri, 27 Mar 2026 01:17:14 +0000 (18:17 -0700)]
Merge tag 'wireless-next-2026-03-26' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
A fairly big set of changes all over, notably with:
- cfg80211: new APIs for NAN (Neighbor Aware Networking,
aka Wi-Fi Aware) so less work must be in firmware
- mt76:
- mt7996/mt7925 MLO fixes/improvements
- mt7996 NPU support (HW eth/wifi traffic offload)
- iwlwifi: UNII-9 and continuing UHR work
* tag 'wireless-next-2026-03-26' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (230 commits)
wifi: mac80211: ignore reserved bits in reconfiguration status
wifi: cfg80211: allow protected action frame TX for NAN
wifi: ieee80211: Add some missing NAN definitions
wifi: nl80211: Add a notification to notify NAN channel evacuation
wifi: nl80211: add NL80211_CMD_NAN_ULW_UPDATE notification
wifi: nl80211: allow reporting spurious NAN Data frames
wifi: cfg80211: allow ToDS=0/FromDS=0 data frames on NAN data interfaces
wifi: nl80211: define an API for configuring the NAN peer's schedule
wifi: nl80211: add support for NAN stations
wifi: cfg80211: separately store HT, VHT and HE capabilities for NAN
wifi: cfg80211: add support for NAN data interface
wifi: cfg80211: make sure NAN chandefs are valid
wifi: cfg80211: Add an API to configure local NAN schedule
wifi: mac80211: cleanup error path of ieee80211_do_open
wifi: mac80211: extract channel logic from link logic
wifi: iwlwifi: mld: set RX_FLAG_RADIOTAP_TLV_AT_END generically
wifi: iwlwifi: reduce the number of prints upon firmware crash
wifi: iwlwifi: fix the description of SESSION_PROTECTION_CMD
wifi: iwlwifi: mld: introduce iwl_mld_vif_fw_id_valid
wifi: iwlwifi: mld: block EMLSR during TDLS connections
...
====================
====================
Add support for Nuvoton MA35D1 GMAC
This patch series is submitted to add GMAC support for Nuvoton MA35D1
SoC platform. This work involves implementing a GMAC driver glue layer
based on Synopsys DWMAC driver framework to leverage MA35D1's dual GMAC
interface capabilities.
Overview:
1. Added a GMAC driver glue layer for MA35D1 SoC, providing support for
the platform's two GMAC interfaces.
2. Added device tree settings, with specific configurations for our
development boards:
a. SOM board: Configured for two RGMII interfaces.
b. IoT board: Configured with one RGMII and one RMII interface.
3. Added dt-bindings for the GMAC interfaces.
====================
- eth: iavf: fix out-of-bounds writes in iavf_get_ethtool_stats()
Previous releases - always broken:
- bluetooth: fix null-ptr-deref on l2cap_sock_ready_cb
- udp: fix wildcard bind conflict check when using hash2
- netfilter: fix use of uninitialized rtp_addr in process_sdp
- tls: Purge async_hold in tls_decrypt_async_wait()
- xfrm:
- prevent policy_hthresh.work from racing with netns teardown
- fix skb leak with espintcp and async crypto
- smc: fix double-free of smc_spd_priv when tee() duplicates splice pipe buffer
- can:
- add missing error handling to call can_ctrlmode_changelink()
- fix OOB heap access in cgw_csum_crc8_rel()
- eth:
- mana: fix use-after-free in add_adev() error path
- virtio-net: fix for VIRTIO_NET_F_GUEST_HDRLEN
- bcmasp: fix double free of WoL irq"
* tag 'net-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (90 commits)
net: macb: use the current queue number for stats
netfilter: ctnetlink: use netlink policy range checks
netfilter: nf_conntrack_sip: fix use of uninitialized rtp_addr in process_sdp
netfilter: nf_conntrack_expect: skip expectations in other netns via proc
netfilter: nf_conntrack_expect: store netns and zone in expectation
netfilter: ctnetlink: ensure safe access to master conntrack
netfilter: nf_conntrack_expect: use expect->helper
netfilter: nf_conntrack_expect: honor expectation helper field
netfilter: nft_set_rbtree: revisit array resize logic
netfilter: ip6t_rt: reject oversized addrnr in rt_mt6_check()
netfilter: nfnetlink_log: fix uninitialized padding leak in NFULA_PAYLOAD
tls: Purge async_hold in tls_decrypt_async_wait()
selftests: netfilter: nft_concat_range.sh: add check for flush+reload bug
netfilter: nft_set_pipapo_avx2: don't return non-matching entry on expiry
Bluetooth: btusb: clamp SCO altsetting table indices
Bluetooth: L2CAP: Fix ERTM re-init and zero pdu_len infinite loop
Bluetooth: L2CAP: Fix deadlock in l2cap_conn_del()
Bluetooth: btintel: serialize btintel_hw_error() with hci_req_sync_lock
Bluetooth: L2CAP: Fix send LE flow credits in ACL link
net: mana: fix use-after-free in add_adev() error path
...
Linus Torvalds [Thu, 26 Mar 2026 15:22:07 +0000 (08:22 -0700)]
Merge tag 'dma-mapping-7.0-2026-03-25' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fixes from Marek Szyprowski:
"A set of fixes for DMA-mapping subsystem, which resolve false-
positive warnings from KMSAN and DMA-API debug (Shigeru Yoshida
and Leon Romanovsky) as well as a simple build fix (Miguel Ojeda)"
* tag 'dma-mapping-7.0-2026-03-25' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
dma-mapping: add missing `inline` for `dma_free_attrs`
mm/hmm: Indicate that HMM requires DMA coherency
RDMA/umem: Tell DMA mapping that UMEM requires coherency
iommu/dma: add support for DMA_ATTR_REQUIRE_COHERENT attribute
dma-direct: prevent SWIOTLB path when DMA_ATTR_REQUIRE_COHERENT is set
dma-mapping: Introduce DMA require coherency attribute
dma-mapping: Clarify valid conditions for CPU cache line overlap
dma-mapping: handle DMA_ATTR_CPU_CACHE_CLEAN in trace output
dma-debug: Allow multiple invocations of overlapping entries
dma: swiotlb: add KMSAN annotations to swiotlb_bounce()
Paolo Abeni [Thu, 26 Mar 2026 14:38:14 +0000 (15:38 +0100)]
Merge tag 'nf-26-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter for net
This is v3, I kept back an ipset fix and another to tigthen the xtables
interface to reject invalid combinations with the NFPROTO_ARP family.
They need a bit more discussion. I fixed the issues reported by AI on
patch 9 (add #ifdef to access ct zone, update nf_conntrack_broadcast
and patch 10 (use better Fixes: tag). Thanks!
The following patchset contains Netfilter fixes for *net*.
Note that most bugs fixed here stem from 2.6 days, the large PR is not
due to an increase in regressions.
1) Fix incorrect reject of set updates with nf_tables pipapo set
avx2 backend. This comes with a regression test in patch 2.
From Florian Westphal.
2) nfnetlink_log needs to zero padding to prevent infoleak to userspace,
from Weiming Shi.
3) xtables ip6t_rt module never validated that addrnr length is within the
allowed array boundary. Reject bogus values. From Ren Wei.
4) Fix high memory usage in rbtree set backend that was unwanted side-effect
of the recently added binary search blob. From Pablo Neira Ayuso.
5) Patches 5 to 10, also from Pablo, address long-standing RCU safety bugs
in conntracks handling of expectations: We can never safely defer
a conntrack extension area without holding a reference. Yet expectation
handling does so in multiple places. Fix this by avoiding the need to
look into the master conntrack to begin with and by extending locked
sections in a few places.
11) Fix use of uninitialized rtp_addr in the sip conntrack helper,
also from Weiming Shi.
12) Add stricter netlink policy checks in ctnetlink, from David Carlier.
This avoids undefined behaviour when userspace provides huge wscale
value.
netfilter pull request 26-03-26
* tag 'nf-26-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: ctnetlink: use netlink policy range checks
netfilter: nf_conntrack_sip: fix use of uninitialized rtp_addr in process_sdp
netfilter: nf_conntrack_expect: skip expectations in other netns via proc
netfilter: nf_conntrack_expect: store netns and zone in expectation
netfilter: ctnetlink: ensure safe access to master conntrack
netfilter: nf_conntrack_expect: use expect->helper
netfilter: nf_conntrack_expect: honor expectation helper field
netfilter: nft_set_rbtree: revisit array resize logic
netfilter: ip6t_rt: reject oversized addrnr in rt_mt6_check()
netfilter: nfnetlink_log: fix uninitialized padding leak in NFULA_PAYLOAD
selftests: netfilter: nft_concat_range.sh: add check for flush+reload bug
netfilter: nft_set_pipapo_avx2: don't return non-matching entry on expiry
====================
Paolo Abeni [Thu, 26 Mar 2026 14:14:51 +0000 (15:14 +0100)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
For ice:
Michal corrects call to alloc_etherdev_mqs() to provide maximum number
of queues supported rather than currently allocated number of queues.
Petr Oros fixes issues related to some ethtool operations in switchdev
mode.
For iavf:
Kohei Enju corrects number of reported queues for ethtool statistics to
absolute max as using current number could race and cause out-of-bounds
issues.
For idpf:
Josh NULLs cdev_info pointer after freeing to prevent possible subsequent
improper access. He also defers setting of refillqs value until after
allocation to prevent possible NULL pointer dereference.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
idpf: only assign num refillqs if allocation was successful
idpf: clear stale cdev_info ptr
iavf: fix out-of-bounds writes in iavf_get_ethtool_stats()
ice: use ice_update_eth_stats() for representor stats
ice: fix inverted ready check for VF representors
ice: set max queues in alloc_etherdev_mqs()
====================