These are the remaining patches from the "ice: Separate TSPLL from PTP
and cleanup" series [1] with control flow macros removed. What remains
are cleanups and some minor improvements.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: default to TIME_REF instead of TXCO on E825-C
ice: move TSPLL init calls to ice_ptp.c
ice: fall back to TCXO on TSPLL lock fail
ice: wait before enabling TSPLL
ice: add multiple TSPLL helpers
ice: use bitfields instead of unions for CGU regs
ice: read TSPLL registers again before reporting status
ice: clear time_sync_en field for E825-C during reprogramming
====================
Jakub Kicinski [Thu, 26 Jun 2025 16:54:41 +0000 (09:54 -0700)]
eth: bnxt: take page size into account for page pool recycling rings
The Rx rings are filled with Rx buffers. Which are supposed to fit
packet headers (or MTU if HW-GRO is disabled). The aggregation buffers
are filled with "device pages". Adjust the sizes of the page pool
recycling ring appropriately, based on ratio of the size of the
buffer on given ring vs system page size. Otherwise on a system
with 64kB pages we end up with >700MB of memory sitting in every
single page pool cache.
Correct the size calculation for the head_pool. Since the buffers
there are always small I'm pretty sure I meant to cap the size
at 1k, rather than make it the lowest possible size. With 64k pages
1k cache with a 1k ring is 64x larger than we need.
xin.guo [Thu, 26 Jun 2025 12:34:19 +0000 (12:34 +0000)]
tcp: fix tcp_ofo_queue() to avoid including too much DUP SACK range
If the new coming segment covers more than one skbs in the ofo queue,
and which seq is equal to rcv_nxt, then the sequence range
that is duplicated will be sent as DUP SACK, the detail as below,
in step6, the {501,2001} range is clearly including too much
DUP SACK range, in violation of RFC 2883 rules.
Jakub Kicinski [Fri, 27 Jun 2025 22:14:58 +0000 (15:14 -0700)]
Merge branch 'net-dsa-ks8995-fix-up-bindings'
Linus Walleij says:
====================
net: dsa: ks8995: Fix up bindings
After looking at the datasheets for KS8995 I realized this is
a DSA switch and need to have DT bindings as such and be implemented
as such.
This series just fixes up the bindings and the offending device tree.
The existing kernel driver which is in drivers/net/phy/spi_ks8995.c
does not implement DSA. It can be forgiven for this because it was
merged in 2011 and the DSA framework was not widely established
back then. It continues to probe fine but needs to be rewritten
to use the special DSA tag and moved to drivers/net/dsa as time
permits. (I hope I can do this.)
It's fine for the networking tree to merge both patches, I maintain
ixp4xx as well. But I can also carry the second patch through the
SoC tree if so desired.
Linus Walleij [Wed, 25 Jun 2025 06:51:25 +0000 (08:51 +0200)]
ARM: dts: Fix up wrv54g device tree
Fix up the KS8995 switch and PHYs the way that is most likely:
- Phy 1-4 is certainly the PHYs of the KS8995 (mask 0x1e in
the outoftree code masks PHYs 1,2,3,4).
- Phy 5 is the MII-P5 separate WAN phy of the KS8995 directly
connected to EthC.
- The EthB MII is probably connected as CPU interface to the
KS8995.
Properly integrate the KS8995 switch using the new bindings.
Linus Walleij [Wed, 25 Jun 2025 06:51:24 +0000 (08:51 +0200)]
dt-bindings: dsa: Rewrite Micrel KS8995 in schema
After studying the datasheets for some of the KS8995 variants
it becomes pretty obvious that this is a straight-forward
and simple MII DSA switch with one port in (CPU) and four outgoing
ports, and it even supports custom tags by setting a bit in
a special register, and elaborate VLAN handling as all DSA
switches do.
What is a bit odd with KS8995 is that it uses an extra MII-P5
port to access one of the PHYs separately, on the side of the
switch fabric, such as when using a WAN port separately from
a LAN switch in a home router.
Rewrite the terse bindings to YAML, and move to the proper
subdirectory. Include a verbose example to make things clear.
The Allwinner A100/A133 has an Ethernet MAC (EMAC) controller that is
compatible with the A64 one. It features the same syscon registers for
control of the top-level integration of the unit.
====================
NFC: trf7970a: Add option to reduce antenna gain
The TRF7970a device is sensitive to RF disturbances, which can make it
hard to pass some EMC immunity tests. By reducing the RX antenna gain,
the device becomes less sensitive to EMC disturbances, as a trade-off
against antenna performance.
====================
Paul Geurts [Thu, 26 Jun 2025 14:12:42 +0000 (16:12 +0200)]
NFC: trf7970a: Create device-tree parameter for RX gain reduction
The TRF7970a device is sensitive to RF disturbances, which can make it
hard to pass some EMC immunity tests. By reducing the RX antenna gain,
the device becomes less sensitive to EMC disturbances, as a trade-off
against antenna performance.
Add a device tree option to select RX gain reduction to improve EMC
performance.
Selecting a communication standard in the ISO control register resets
the RX antenna gain settings. Therefore set the RX gain reduction
everytime the ISO control register changes, when the option is used.
Jeff Layton [Thu, 26 Jun 2025 12:52:14 +0000 (08:52 -0400)]
ref_tracker: do xarray and workqueue job initializations earlier
The kernel test robot reported an oops that occurred when attempting to
deregister a dentry from the xarray during subsys_initcall().
The ref_tracker xarrays and workqueue job are being initialized in
late_initcall() which is too late. Move those to postcore_initcall()
instead.
Fixes: 65b584f53611 ("ref_tracker: automatically register a file in debugfs for a ref_tracker_dir") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202506251406.c28f2adb-lkp@intel.com Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250626-reftrack-dbgfs-v1-1-812102e2a394@kernel.org
Simon Horman [Wed, 25 Jun 2025 12:52:10 +0000 (13:52 +0100)]
tg3: spelling corrections
Correct spelling as flagged by codespell.
Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: mdio: Add MDIO bus controller for Airoha AN7583
Airoha AN7583 SoC have 2 dedicated MDIO bus controller in the SCU
register map. To driver register an MDIO controller based on the DT
reg property and access the register by accessing the parent syscon.
The MDIO bus logic is similar to the MT7530 internal MDIO bus but
deviates of some setting and some HW bug.
On Airoha AN7583 the MDIO clock is set to 25MHz by default and needs to
be correctly setup to 2.5MHz to correctly work (by setting the divisor
to 10x).
There seems to be Hardware bug where AN7583_MII_RWDATA
is not wiped in the context of unconnected PHY and the
previous read value is returned.
Example: (only one PHY on the BUS at 0x1f)
- read at 0x1f report at 0x2 0x7500
- read at 0x0 report 0x7500 on every address
To workaround this, we reset the Mdio BUS at every read
to have consistent values on read operation.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
dt-bindings: net: Document support for Airoha AN7583 MDIO Controller
Airoha AN7583 SoC have 3 different MDIO Controller. One comes from
the intergated Switch based on MT7530. The other 2 live under the SCU
register and expose 2 dedicated MDIO controller.
Document the schema for the 2 dedicated MDIO controller.
Each MDIO controller can be independently reset with the SoC reset line.
Each MDIO controller have a dedicated clock configured to 2.5MHz by
default to follow MDIO bus IEEE 802.3 standard.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
ptp: Belated spring cleaning of the chardev driver
When looking into supporting auxiliary clocks in the PTP ioctl, the
inpenetrable ptp_ioctl() letter soup bothered me enough to clean it up.
The code (~400 lines!) is really hard to follow due to a gazillion of
local variables, which are only used in certain case scopes, and a
mixture of gotos, breaks and direct error return paths.
Clean it up by splitting out the IOCTL functionality into seperate
functions, which contain only the required local variables and are trivial
to follow. Complete the cleanup by converting the code to lock guards and
get rid of all gotos.
That reduces the code size by 48 lines and also the binary text size is
80 bytes smaller than the current maze.
The series is split up into one patch per IOCTL command group for easy
review.
Thomas Gleixner [Wed, 25 Jun 2025 11:52:39 +0000 (13:52 +0200)]
ptp: Simplify ptp_read()
The mixture of gotos and direct return codes is inconsistent and just makes
the code harder to read. Let it consistently return error codes directly and
tidy the code flow up accordingly.
Thomas Gleixner [Wed, 25 Jun 2025 11:52:38 +0000 (13:52 +0200)]
ptp: Convert chardev code to lock guards
Convert the various spin_lock_irqsave() protected critical regions to
scoped guards. Use spinlock_irq instead of spinlock_irqsave as all the
functions are invoked in thread context with interrupts enabled.
Thomas Gleixner [Wed, 25 Jun 2025 11:52:34 +0000 (13:52 +0200)]
ptp: Split out PTP_PIN_SETFUNC ioctl code
Continue the ptp_ioctl() cleanup by splitting out the PTP_PIN_SETFUNC ioctl
code into a helper function. Convert to lock guard while at it and remove
the pointless memset of the pd::rsv because nothing uses it.
Thomas Gleixner [Wed, 25 Jun 2025 11:52:33 +0000 (13:52 +0200)]
ptp: Split out PTP_PIN_GETFUNC ioctl code
Continue the ptp_ioctl() cleanup by splitting out the PTP_PIN_GETFUNC ioctl
code into a helper function. Convert to lock guard while at it and remove
the pointless memset of the pd::rsv because nothing uses it.
Thomas Gleixner [Wed, 25 Jun 2025 11:52:24 +0000 (13:52 +0200)]
ptp: Split out PTP_CLOCK_GETCAPS ioctl code
ptp_ioctl() is an inpenetrable letter soup with a gazillion of case (scope)
specific variables defined at the top of the function and pointless breaks
and gotos.
Start cleaning it up by splitting out the PTP_CLOCK_GETCAPS ioctl code into
a helper function. Use a argument pointer with a single sparse compliant
type cast instead of proliferating the type cast all over the place.
Petr Machata [Wed, 25 Jun 2025 10:41:23 +0000 (12:41 +0200)]
selftests: forwarding: lib: Split setup_wait()
setup_wait() takes an optional argument and then is called from the top
level of the test script. That confuses shellcheck, which thinks that maybe
the intention is to pass $1 of the script to the function, which is never
the case. To avoid having to annotate every single new test with a SC
disable, split the function in two: one that takes a mandatory argument,
and one that takes no argument at all.
Convert the two existing users of that optional argument, both in Spectrum
resource selftest, to use the new form. Clean up vxlan_bridge_1q_mc_ul.sh
to not pass a now-unused argument.
We've added 6 non-merge commits during the last 8 day(s) which contain
a total of 6 files changed, 120 insertions(+), 20 deletions(-).
The main changes are:
1) Fix RCU usage in task_cls_state() for BPF programs using helpers like
bpf_get_cgroup_classid_curr() outside of networking, from Charalampos
Mitrodimas.
2) Fix a sockmap race between map_update and a pending workqueue from
an earlier map_delete freeing the old psock where both pointed to the
same psock->sk, from Jiayuan Chen.
3) Fix a data corruption issue when using bpf_msg_pop_data() in kTLS which
failed to recalculate the ciphertext length, also from Jiayuan Chen.
4) Remove xdp_redirect_map{,_err} trace events since they are unused and
also hide XDP trace events under CONFIG_BPF_SYSCALL, from Steven Rostedt.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
xdp: tracing: Hide some xdp events under CONFIG_BPF_SYSCALL
xdp: Remove unused events xdp_redirect_map and xdp_redirect_map_err
net, bpf: Fix RCU usage in task_cls_state() for BPF programs
selftests/bpf: Add test to cover ktls with bpf_msg_pop_data
bpf, ktls: Fix data corruption when using bpf_msg_pop_data() in ktls
bpf, sockmap: Fix psock incorrectly pointing to sk
====================
Lorenzo Bianconi [Wed, 25 Jun 2025 14:43:15 +0000 (16:43 +0200)]
net: airoha: Get rid of dma_sync_single_for_device() in airoha_qdma_fill_rx_queue()
Since the page_pool for airoha_eth driver is created with
PP_FLAG_DMA_SYNC_DEV flag, we do not need to sync_for_device each page
received from the pool since it is already done by the page_pool codebase.
The DT bindings file "renesas,r9a09g057-gbeth.yaml" applies to a whole
family of SoCs, and uses "renesas,rzv2h-gbeth" as a fallback compatible
value. Hence rename it to the more generic "renesas,rzv2h-gbeth.yaml".
Cross-merge networking fixes after downstream PR (net-6.16-rc4).
Conflicts:
Documentation/netlink/specs/mptcp_pm.yaml 9e6dd4c256d0 ("netlink: specs: mptcp: replace underscores with dashes in names") ec362192aa9e ("netlink: specs: fix up indentation errors")
https://lore.kernel.org/20250626122205.389c2cd4@canb.auug.org.au
Adjacent changes:
Documentation/netlink/specs/fou.yaml 791a9ed0a40d ("netlink: specs: fou: replace underscores with dashes in names") 880d43ca9aa4 ("netlink: specs: clean up spaces in brackets")
Jacob Keller [Tue, 24 Jun 2025 00:30:04 +0000 (17:30 -0700)]
ice: default to TIME_REF instead of TXCO on E825-C
The driver currently defaults to the internal oscillator as the clock
source for E825-C hardware. While this clock source is labeled TCXO,
indicating a temperature compensated oscillator, this is only true for some
board designs. Many board designs have a less capable oscillator. The
E825-C hardware may also have its clock source set to the TIME_REF pin.
This pin is connected to the DPLL and is often a more stable clock source.
The choice of the internal oscillator is not suitable for all systems,
especially those which want to enable SyncE support.
There is currently no interface available for users to configure the clock
source. Other variants of the E82x board have the clock source configured
in the NVM, but E825-C lacks this capability, so different board designs
cannot select a different default clock via firmware.
In most setups, the TIME_REF is a suitable default clock source.
Additionally, we now fall back to the internal oscillator automatically if
the TIME_REF clock source cannot be locked.
Change the default clock source for E825-C to TIME_REF. Note that the
driver logs a dev_dbg message upon configuring the TSPLL which includes the
clock source and frequency. This can be enabled to confirm which clock
source is in use.
Longterm a proper interface to dynamically introspect and change the clock
source will be designed (perhaps some extension of the DPLL subsystem?)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Karol Kolacinski [Tue, 24 Jun 2025 00:30:03 +0000 (17:30 -0700)]
ice: move TSPLL init calls to ice_ptp.c
Initialize TSPLL after initializing PHC in ice_ptp.c instead of calling
for each product in PHC init in ice_ptp_hw.c.
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Karol Kolacinski [Tue, 24 Jun 2025 00:30:02 +0000 (17:30 -0700)]
ice: fall back to TCXO on TSPLL lock fail
TSPLL can fail when trying to lock to TIME_REF as a clock source, e.g.
when the external clock source is not stable or connected to the board.
To continue operation after failure, try to lock again to internal TCXO
and inform user about this.
Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Karol Kolacinski [Tue, 24 Jun 2025 00:29:59 +0000 (17:29 -0700)]
ice: use bitfields instead of unions for CGU regs
Switch from unions with bitfield structs to definitions with bitfield
masks. This is necessary, because some registers have different
field definitions or even use a different register for the same fields
based on HW type.
Remove unused register fields.
Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jacob Keller [Tue, 24 Jun 2025 00:29:58 +0000 (17:29 -0700)]
ice: read TSPLL registers again before reporting status
After programming the TSPLL, re-read the registers before reporting status.
This ensures the debug log message will show what was actually programmed,
rather than relying on a cached value.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jacob Keller [Tue, 24 Jun 2025 00:29:57 +0000 (17:29 -0700)]
ice: clear time_sync_en field for E825-C during reprogramming
When programming the Clock Generation Unit for E285-C hardware, we need
to clear the time_sync_en bit of the DWORD 9 before we set the
frequency.
Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jakub Kicinski [Tue, 24 Jun 2025 14:28:34 +0000 (07:28 -0700)]
eth: fbnic: rename fbnic_fw_clear_cmpl to fbnic_mbx_clear_cmpl
fbnic_fw_clear_cmpl() does the inverse of fbnic_mbx_set_cmpl().
It removes the completion from the mailbox table.
It also calls fbnic_mbx_set_cmpl_slot() internally.
It should have fbnic_mbx prefix, not fbnic_fw.
I'm not very clear on what the distinction is between the two
prefixes but the matching "set" and "clear" functions should
use the same prefix.
While at it move the "clear" function closer to the "set".
Jakub Kicinski [Tue, 24 Jun 2025 14:28:32 +0000 (07:28 -0700)]
eth: fbnic: realign whitespace
Relign various whitespace things. Some of it is spaces which should
be tabs and some is making sure the values are actually correctly
aligned to "columns" with 8 space tabs. Whitespace changes only.
====================
Follow-up to RGMII mode clarification: am65-cpsw fix + checkpatch
Following previous discussion [1] and the documentation update by
Andrew [2]:
Fix up the mode to account for the fixed TX delay on the AM65 CPSW
Ethernet controllers, similar to the way the icssg-prueth does it. For
backwards compatibility, the "impossible" modes that claim to have a
delay on the PCB are still accepted, but trigger a warning message.
As Andrew suggested, I have also added a checkpatch check that requires
a comment for any RGMII mode that is not "rgmii-id".
No Device Trees are updated to avoid the warning for now, to give other
projects syncing the Linux Device Trees some time to fix their drivers
as well. I intend to submit an equivalent change for U-Boot's
am65-cpsw-nuss driver as soon as the changes are accepted for Linux.
[1] https://lore.kernel.org/lkml/d25b1447-c28b-4998-b238-92672434dc28@lunn.ch/
[2] https://lore.kernel.org/all/20250430-v6-15-rc3-net-rgmii-delays-v2-1-099ae651d5e5@lunn.ch/
commit c360eb0c3ccb ("dt-bindings: net: ethernet-controller: Add informative text about RGMII delays")
v1: https://lore.kernel.org/all/cover.1744710099.git.matthias.schiffer@ew.tq-group.com/
====================
checkpatch: check for comment explaining rgmii(|-rxid|-txid) PHY modes
Historically, the RGMII PHY modes specified in Device Trees have been
used inconsistently, often referring to the usage of delays on the PHY
side rather than describing the board; many drivers still implement this
incorrectly.
Require a comment in Devices Trees using these modes (usually mentioning
that the delay is realized on the PCB), so we can avoid adding more
incorrect uses (or will at least notice which drivers still need to be
fixed).
dt-bindings: net: ti: k3-am654-cpsw-nuss: update phy-mode in example
k3-am65-cpsw-nuss controllers have a fixed internal TX delay, so RXID
mode is not actually possible and will result in a warning from the
driver going forward.
Jiawen Wu [Wed, 25 Jun 2025 02:39:24 +0000 (10:39 +0800)]
net: libwx: fix the creation of page_pool
'rx_ring->size' means the count of ring descriptors multiplied by the
size of one descriptor. When increasing the count of ring descriptors,
it may exceed the limit of pool size.
[ 864.209610] page_pool_create_percpu() gave up with errno -7
[ 864.209613] txgbe 0000:11:00.0: Page pool creation failed: -7
Fix to set the pool_size to the count of ring descriptors.
Fixes: 850b971110b2 ("net: libwx: Allocate Rx and Tx resources") Cc: stable@vger.kernel.org Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/434C72BFB40E350A+20250625023924.21821-1-jiawenwu@trustnetic.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 24 Jun 2025 18:32:58 +0000 (11:32 -0700)]
net: selftests: fix TCP packet checksum
The length in the pseudo header should be the length of the L3 payload
AKA the L4 header+payload. The selftest code builds the packet from
the lower layers up, so all the headers are pushed already when it
constructs L4. We need to subtract the lower layer headers from skb->len.
Linus Torvalds [Thu, 26 Jun 2025 04:09:02 +0000 (21:09 -0700)]
Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:
- Fix use-after-free in libbpf when map is resized (Adin Scannell)
- Fix verifier assumptions about 2nd argument of bpf_sysctl_get_name
(Jerome Marchand)
- Fix verifier assumption of nullness of d_inode in dentry (Song Liu)
- Fix global starvation of LRU map (Willem de Bruijn)
- Fix potential NULL dereference in btf_dump__free (Yuan Chen)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: adapt one more case in test_lru_map to the new target_free
libbpf: Fix possible use-after-free for externs
selftests/bpf: Convert test_sysctl to prog_tests
bpf: Specify access type of bpf_sysctl_get_name args
libbpf: Fix null pointer dereference in btf_dump__free on allocation failure
bpf: Adjust free target to avoid global starvation of LRU map
bpf: Mark dentry->d_inode as trusted_or_null
Linus Torvalds [Thu, 26 Jun 2025 03:48:48 +0000 (20:48 -0700)]
Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull mount fixes from Al Viro:
"Several mount-related fixes"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
userns and mnt_idmap leak in open_tree_attr(2)
attach_recursive_mnt(): do not lock the covering tree when sliding something under it
replace collect_mounts()/drop_collected_mounts() with a safer variant
net: phy: realtek: add error handling to rtl8211f_get_wol
We should check if the WOL settings was successfully read from the PHY.
In case this fails we cannot just use the error code and proceed.
Signed-off-by: Daniel Braunwarth <daniel.braunwarth@kuka.com> Reported-by: Jon Hunter <jonathanh@nvidia.com> Closes: https://lore.kernel.org/baaa083b-9a69-460f-ab35-2a7cb3246ffd@nvidia.com Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250624-realtek_fixes-v1-1-02a0b7c369bc@kuka.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
atm: Release atm_dev_mutex after removing procfs in atm_dev_deregister().
syzbot reported a warning below during atm_dev_register(). [0]
Before creating a new device and procfs/sysfs for it, atm_dev_register()
looks up a duplicated device by __atm_dev_lookup(). These operations are
done under atm_dev_mutex.
However, when removing a device in atm_dev_deregister(), it releases the
mutex just after removing the device from the list that __atm_dev_lookup()
iterates over.
So, there will be a small race window where the device does not exist on
the device list but procfs/sysfs are still not removed, triggering the
splat.
Let's hold the mutex until procfs/sysfs are removed in
atm_dev_deregister().
====================
netlink: specs: enforce strict naming of properties
I got annoyed once again by the name properties in the ethtool spec
which use underscore instead of dash. I previously assumed that there
is a lot of such properties in the specs so fixing them now would
be near impossible. On a closer look, however, I only found 22
(rough grep suggests we have ~4.8k names in the specs, so bad ones
are just 0.46%).
Add a regex to the JSON schema to enforce the naming, fix the few
bad names. I was hoping we could start enforcing this from newer
families, but there's no correlation between the protocol and the
number of errors. If anything classic netlink has more recently
added specs so it has fewer errors.
The regex is just for name properties which will end up visible
to the user (in Python or YNL CLI). I left the c-name properties
alone, those don't matter as much. C codegen rewrites them, anyway.
I'm not updating the spec for genetlink-c. Looks like it has no
users, new families use genetlink, all old ones need genetlink-legacy.
If these patches are merged I will remove genetlink-c completely
in net-next.
====================
Jakub Kicinski [Tue, 24 Jun 2025 21:10:02 +0000 (14:10 -0700)]
netlink: specs: enforce strict naming of properties
Add a regexp to make sure all names which may end up being visible
to the user consist of lower case characters, numbers and dashes.
Underscores keep sneaking into the specs, which is not visible
in the C code but makes the Python and alike inconsistent.
Note that starting with a number is okay, as in C the full
name will include the family name.
For legacy families we can't enforce the naming in the family
name or the multicast group names, as these are part of the
binary uAPI of the kernel.
For classic netlink we need to allow capital letters in names
of struct members. TC has some structs with capitalized members.
Jakub Kicinski [Tue, 24 Jun 2025 21:10:01 +0000 (14:10 -0700)]
netlink: specs: tc: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Jakub Kicinski [Tue, 24 Jun 2025 21:10:00 +0000 (14:10 -0700)]
netlink: specs: rt-link: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Fixes: b2f63d904e72 ("doc/netlink: Add spec for rt link messages") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250624211002.3475021-9-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:09:59 +0000 (14:09 -0700)]
netlink: specs: mptcp: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Fixes: bc8aeb2045e2 ("Documentation: netlink: add a YAML spec for mptcp") Reviewed-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250624211002.3475021-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:09:58 +0000 (14:09 -0700)]
netlink: specs: ovs_flow: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Fixes: 93b230b549bc ("netlink: specs: add ynl spec for ovs_flow") Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Eelco Chaudron <echaudro@redhat.com> Link: https://patch.msgid.link/20250624211002.3475021-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:09:57 +0000 (14:09 -0700)]
netlink: specs: devlink: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Fixes: 429ac6211494 ("devlink: define enum for attr types of dynamic attributes") Fixes: f2f9dd164db0 ("netlink: specs: devlink: add the remaining command to generate complete split_ops") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250624211002.3475021-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:09:56 +0000 (14:09 -0700)]
netlink: specs: dpll: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Fixes: 3badff3a25d8 ("dpll: spec: Add Netlink spec in YAML") Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250624211002.3475021-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:09:55 +0000 (14:09 -0700)]
netlink: specs: ethtool: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen replaces special chars in names)
but gives more uniform naming in Python.
Fixes: 13e59344fb9d ("net: ethtool: add support for symmetric-xor RSS hash") Fixes: 46fb3ba95b93 ("ethtool: Add an interface for flashing transceiver modules' firmware") Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250624211002.3475021-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:09:54 +0000 (14:09 -0700)]
netlink: specs: fou: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Jakub Kicinski [Tue, 24 Jun 2025 21:09:53 +0000 (14:09 -0700)]
netlink: specs: nfsd: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Fixes: 13727f85b49b ("NFSD: introduce netlink stubs") Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20250624211002.3475021-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
RubenKelevra [Tue, 24 Jun 2025 16:57:11 +0000 (18:57 +0200)]
uapi: net_dropmon: drop unused is_drop_point_hw macro
Commit 4ea7e38696c7 ("dropmon: add ability to detect when hardware
drops rx packets") introduced is_drop_point_hw, but the symbol was
never referenced anywhere in the kernel tree and is currently not used
by dropwatch. I could not find, to the best of my abilities, a current
out-of-tree user of this macro.
The definition also contains a syntax error in its for-loop, so any
project that tried to compile against it would fail. Removing the
macro therefore eliminates dead code without breaking existing
users.
Simon Horman [Tue, 24 Jun 2025 16:35:12 +0000 (17:35 +0100)]
net: enetc: Correct endianness handling in _enetc_rd_reg64
enetc_hw.h provides two versions of _enetc_rd_reg64.
One which simply calls ioread64() when available.
And another that composes the 64-bit result from ioread32() calls.
In the second case the code appears to assume that each ioread32() call
returns a little-endian value. However both the shift and logical or
used to compose the return value would not work correctly on big endian
systems if this were the case. Moreover, this is inconsistent with the
first case where the return value of ioread64() is assumed to be in host
byte order.
It appears that the correct approach is for both versions to treat the
return value of ioread*() functions as being in host byte order. And
this patch corrects the ioread32()-based version to do so.
This is a bug but would only manifest on big endian systems
that make use of the ioread32-based implementation of _enetc_rd_reg64.
While all in-tree users of this driver are little endian and
make use of the ioread64-based implementation of _enetc_rd_reg64.
Thus, no in-tree user of this driver is affected by this bug.
Nathan Lynch [Tue, 24 Jun 2025 13:50:44 +0000 (08:50 -0500)]
lib: packing: Include necessary headers
packing.h uses ARRAY_SIZE(), BUILD_BUG_ON_MSG(), min(), max(), and
sizeof_field() without including the headers where they are defined,
potentially causing build failures.
Next step on the path to moving RSS config to Netlink. With the
refactoring of the driver-facing API for ETHTOOL_GRXFH/ETHTOOL_SRXFH
out of the way we can move on to more interesting work.
Add Netlink notifications for changes in RSS configuration.
As a reminder (part) of rss-get was introduced in previous releases
when input-xfrm (symmetric hashing) was added. rss-set isn't
implemented, yet, but we can implement rss-ntf and hook it into
the changes done via the IOCTL path (same as other ethtool-nl
notifications do).
Most of the series is concerned with passing arguments to notifications.
So far none of the notifications needed to be parametrized, but RSS can
have multiple contexts per device, and since GET operates on a single
context at a time, the notification needs to also be scoped to a context.
Patches 2-5 add support for passing arguments to notifications thru
ethtool-nl generic infra.
The notification handling itself is pretty trivial, it's mostly
hooking in the right entries into the ethool-nl op tables.
Jakub Kicinski [Mon, 23 Jun 2025 23:17:20 +0000 (16:17 -0700)]
selftests: drv-net: test RSS Netlink notifications
Test that changing the RSS config generates Netlink notifications.
# ./tools/testing/selftests/drivers/net/hw/rss_api.py
TAP version 13
1..2
ok 1 rss_api.test_rxfh_indir_ntf
ok 2 rss_api.test_rxfh_indir_ctx_ntf
# Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0
Jakub Kicinski [Mon, 23 Jun 2025 23:17:19 +0000 (16:17 -0700)]
doc: ethtool: mark ETHTOOL_GRXFHINDIR as reimplemented
The ETHTOOL_GRXFHINDIR reimplementation has been completed around
a year ago. We have been tweaking it so a bit hard to point
to a single commit that completed it, but all the fields available
in IOCTL are reported via Netlink.
Jakub Kicinski [Mon, 23 Jun 2025 23:17:18 +0000 (16:17 -0700)]
net: ethtool: rss: add notifications
In preparation for RSS_SET handling in ethnl introduce Netlink
notifications for RSS. Only cover modifications, not creation
and not removal of a context, because the latter may deserve
a different notification type. We should cross that bridge
when we add the support for context add / remove via Netlink.
Jakub Kicinski [Mon, 23 Jun 2025 23:17:17 +0000 (16:17 -0700)]
net: ethtool: copy req_info from SET to NTF
Copy information parsed for SET with .req_parse to NTF handling
and therefore the GET-equivalent that it ends up executing.
This way if the SET was on a sub-object (like RSS context)
the notification will also be appropriately scoped.
Also copy the phy_index, Maxime suggests this will help PLCA
commands generate accurate notifications as well.
Jakub Kicinski [Mon, 23 Jun 2025 23:17:16 +0000 (16:17 -0700)]
net: ethtool: remove the data argument from ethtool_notify()
ethtool_notify() takes a const void *data argument, which presumably
was intended to pass information from the call site to the subcommand
handler. This argument currently has no users.
Expecting the data to be subcommand-specific has two complications.
Complication #1 is that its not plumbed thru any of the standardized
callbacks. It gets propagated to ethnl_default_notify() where it
remains unused. Coming from the ethnl_default_set_doit() side we pass
in NULL, because how could we have a command specific attribute in
a generic handler.
Complication #2 is that we expect the ethtool_notify() callers to
know what attribute type to pass in. Again, the data pointer is
untyped.
RSS will need to pass the context ID to the notifications.
I think it's a better design if the "subcommand" exports its own
typed interface and constructs the appropriate argument struct
(which will be req_info). Remove the unused data argument from
ethtool_notify() but retain it in a new internal helper which
subcommands can use to build a typed interface.
Jakub Kicinski [Mon, 23 Jun 2025 23:17:15 +0000 (16:17 -0700)]
net: ethtool: call .parse_request for SET handlers
In preparation for using req_info to carry parameters between SET
and NTF - call .parse_request during ethnl_default_set_doit().
The main question here is whether .parse_request is intended to be
GET-specific. Originally the SET handling was delegated to each subcommand
directly - ethnl_default_set_doit() and .set callbacks in ethnl_request_ops
did not exist. Looking at existing users does not shed much light, all
of the following subcommands use .parse_request but have no SET handler
(and no NTF):
where .parse_request handling is used to select which statistics to query.
Not relevant for SET but also harmless.
Going back to RSS (which doesn't have SET today) .parse_request parses
the rss_context ID. Using the req_info struct to pass the context ID
from SET to NTF will be very useful.
Switch to ethnl_default_parse(), effectively adding the .parse_request
for SET handlers.
Jakub Kicinski [Mon, 23 Jun 2025 23:17:14 +0000 (16:17 -0700)]
net: ethtool: dynamically allocate full req size req
In preparation for using req_info to carry parameters between
SET and NTF allocate a full request info struct. Since the size
depends on the subcommand we need to allocate it on the heap.
syszbot reports various ordering issues for lower instance locks and
team lock. Switch to using rtnl lock for protecting team device,
similar to bonding. Based on the patch by Tetsuo Handa.
Cc: Jiri Pirko <jiri@resnulli.us> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04 Reported-by: syzbot+71fd22ae4b81631e22fd@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=71fd22ae4b81631e22fd Fixes: 6b1d3c5f675c ("team: grab team lock during team_change_rx_flags") Link: https://lkml.kernel.org/r/ZoZ2RH9BcahEB9Sb@nanopsycho.orion Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250623153147.3413631-1-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Adin Scannell [Wed, 25 Jun 2025 05:02:15 +0000 (22:02 -0700)]
libbpf: Fix possible use-after-free for externs
The `name` field in `obj->externs` points into the BTF data at initial
open time. However, some functions may invalidate this after opening and
before loading (e.g. `bpf_map__set_value_size`), which results in
pointers into freed memory and undefined behavior.
The simplest solution is to simply `strdup` these strings, similar to
the `essent_name`, and free them at the same time.
In order to test this path, the `global_map_resize` BPF selftest is
modified slightly to ensure the presence of an extern, which causes this
test to fail prior to the fix. Given there isn't an obvious API or error
to test against, I opted to add this to the existing test as an aspect
of the resizing feature rather than duplicate the test.