Jakub Kicinski [Wed, 11 Jun 2025 14:59:44 +0000 (07:59 -0700)]
net: ethtool: add dedicated callbacks for getting and setting rxfh fields
We mux multiple calls to the drivers via the .get_nfc and .set_nfc
callbacks. This is slightly inconvenient to the drivers as they
have to de-mux them back. It will also be awkward for netlink code
to construct struct ethtool_rxnfc when it wants to get info about
RX Flow Hash, from the RSS module.
Add dedicated driver callbacks. Create struct ethtool_rxfh_fields
which contains only data relevant to RXFH. Maintain the names of
the fields to avoid having to heavily modify the drivers.
For now support both callbacks, once all drivers are converted
ethtool_*et_rxfh_fields() will stop using the rxnfc callbacks.
Jakub Kicinski [Wed, 11 Jun 2025 14:59:43 +0000 (07:59 -0700)]
net: ethtool: require drivers to opt into the per-RSS ctx RXFH
RX Flow Hashing supports using different configuration for different
RSS contexts. Only two drivers seem to support it. Make sure we
uniformly error out for drivers which don't.
Jakub Kicinski [Wed, 11 Jun 2025 14:59:41 +0000 (07:59 -0700)]
net: ethtool: copy the rxfh flow handling
RX Flow Hash configuration uses the same argument structure
as flow filters. This is probably why ethtool IOCTL handles
them together. The more checks we add the more convoluted
this code is getting (as some of the checks apply only
to flow filters and others only to the hashing).
Copy the code to separate the handling. This is an exact
copy, the next change will remove unnecessary handling.
- wifi: ath11k: avoid burning CPU in ath11k_debugfs_fw_stats_request()
for 3 sec if fw_stats_done is not set"
* tag 'net-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (70 commits)
selftests: drv-net: rss_ctx: Add test for ntuple rules targeting default RSS context
net: ethtool: Don't check if RSS context exists in case of context 0
af_unix: Allow passing cred for embryo without SO_PASSCRED/SO_PASSPIDFD.
ipv6: Move fib6_config_validate() to ip6_route_add().
net: drv: netdevsim: don't napi_complete() from netpoll
net/mlx5: HWS, Add error checking to hws_bwc_rule_complex_hash_node_get()
veth: prevent NULL pointer dereference in veth_xdp_rcv
net_sched: remove qdisc_tree_flush_backlog()
net_sched: ets: fix a race in ets_qdisc_change()
net_sched: tbf: fix a race in tbf_change()
net_sched: red: fix a race in __red_change()
net_sched: prio: fix a race in prio_tune()
net_sched: sch_sfq: reject invalid perturb period
net: phy: phy_caps: Don't skip better duplex macth on non-exact match
MAINTAINERS: Update Kuniyuki Iwashima's email address.
selftests: net: add test case for NAT46 looping back dst
net: clear the dst when changing skb protocol
net/mlx5e: Fix number of lanes to UNKNOWN when using data_rate_oper
net/mlx5e: Fix leak of Geneve TLV option object
net/mlx5: HWS, make sure the uplink is the last destination
...
Linus Torvalds [Thu, 12 Jun 2025 15:21:13 +0000 (08:21 -0700)]
Merge tag 'pinctrl-v6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- Add some missing pins on the Qualcomm QCM2290, along with a managed
resources patch that make it clean and nice
- Drop an unused function in the ST Micro driver
- Drop bouncing MAINTAINER entry
- Drop of_match_ptr() macro to rid compile warnings in the TB10x
driver
- Fix up calculation of pin numbers from base in the Sunxi driver
* tag 'pinctrl-v6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: sunxi: dt: Consider pin base when calculating bank number from pin
pinctrl: tb10x: Drop of_match_ptr for ID table
pinctrl: MAINTAINERS: Drop bouncing Jianlong Huang
pinctrl: st: Drop unused st_gpio_bank() function
pinctrl: qcom: pinctrl-qcm2290: Add missing pins
pinctrl: qcom: switch to devm_gpiochip_add_data()
Linus Torvalds [Thu, 12 Jun 2025 15:17:56 +0000 (08:17 -0700)]
Merge tag 'arc-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
- arch_atomic64_cmpxchg relaxed variant [Jason]
- use of inbuilt swap in stack unwinder [Yu-Chun Lin]
- use of __ASSEMBLER__ in kernel headers [Thomas Huth]
* tag 'arc-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: Replace __ASSEMBLY__ with __ASSEMBLER__ in the non-uapi headers
ARC: Replace __ASSEMBLY__ with __ASSEMBLER__ in uapi headers
ARC: unwind: Use built-in sort swap to reduce code size and improve performance
ARC: atomics: Implement arch_atomic64_cmpxchg using _relaxed
Jakub Kicinski [Thu, 12 Jun 2025 15:16:47 +0000 (08:16 -0700)]
Merge tag 'wireless-2025-06-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
Another quick round of updates:
- revert mwifiex HT40 that was causing issues
- many ath10k/ath11k/ath12k fixes
- re-add some iwlwifi code I lost in a merge
- use kfree_sensitive() on an error path in cfg80211
* tag 'wireless-2025-06-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: cfg80211: use kfree_sensitive() for connkeys cleanup
wifi: iwlwifi: fix merge damage related to iwl_pci_resume
Revert "wifi: mwifiex: Fix HT40 bandwidth issue."
wifi: ath12k: fix uaf in ath12k_core_init()
wifi: ath12k: Fix hal_reo_cmd_status kernel-doc
wifi: ath12k: fix GCC_GCC_PCIE_HOT_RST definition for WCN7850
wifi: ath11k: validate ath11k_crypto_mode on top of ath11k_core_qmi_firmware_ready
wifi: ath11k: consistently use ath11k_mac_get_fw_stats()
wifi: ath11k: move locking outside of ath11k_mac_get_fw_stats()
wifi: ath11k: adjust unlock sequence in ath11k_update_stats_event()
wifi: ath11k: move some firmware stats related functions outside of debugfs
wifi: ath11k: don't wait when there is no vdev started
wifi: ath11k: don't use static variables in ath11k_debugfs_fw_stats_process()
wifi: ath11k: avoid burning CPU in ath11k_debugfs_fw_stats_request()
wil6210: fix support for sparrow chipsets
wifi: ath10k: Avoid vdev delete timeout when firmware is already down
ath10k: snoc: fix unbalanced IRQ enable in crash recovery
====================
This series addresses a regression in ethtool flow steering where rules
targeting the default RSS context (context 0) were incorrectly rejected.
The default RSS context always exists but is not stored in the rss_ctx
xarray like additional contexts. The current validation logic was
checking for the existence of context 0 in this array, causing valid
flow steering rules to be rejected.
This prevented configurations such as:
- High priority rules directing specific traffic to the default context
- Low priority catch-all rules directing remaining traffic to additional
contexts
Patch 1 fixes the validation logic to skip the existence check for
context 0.
Patch 2 adds a selftest that verifies this behavior.
Gal Pressman [Thu, 12 Jun 2025 07:19:58 +0000 (10:19 +0300)]
selftests: drv-net: rss_ctx: Add test for ntuple rules targeting default RSS context
Add test_rss_default_context_rule() to verify that ntuple rules can
correctly direct traffic to the default RSS context (context 0).
The test creates two ntuple rules with explicit location priorities:
- A high-priority rule (loc 0) directing specific port traffic to
context 0.
- A low-priority rule (loc 1) directing all other TCP traffic to context
1.
This validates that:
1. Rules targeting the default context function properly.
2. Traffic steering works as expected when mixing default and
additional RSS contexts.
The test was written by AI, and reviewed by humans.
Gal Pressman [Thu, 12 Jun 2025 07:19:57 +0000 (10:19 +0300)]
net: ethtool: Don't check if RSS context exists in case of context 0
Context 0 (default context) always exists, there is no need to check
whether it exists or not when adding a flow steering rule.
The existing check fails when creating a flow steering rule for context
0 as it is not stored in the rss_ctx xarray.
For example:
$ ethtool --config-ntuple eth2 flow-type tcp4 dst-ip 194.237.147.23 dst-port 19983 context 0 loc 618
rmgr: Cannot insert RX class rule: Invalid argument
Cannot insert classification rule
An example usecase for this could be:
- A high-priority rule (loc 0) directing specific port traffic to
context 0.
- A low-priority rule (loc 1) directing all other TCP traffic to context
1.
This is a user-visible regression that was caught in our testing
environment, it was not reported by a user yet.
Fixes: de7f7582dff2 ("net: ethtool: prevent flow steering to RSS contexts which don't exist") Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20250612071958.1696361-2-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 12 Jun 2025 15:13:47 +0000 (08:13 -0700)]
Merge tag 'for-net-2025-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- eir: Fix NULL pointer deference on eir_get_service_data
- eir: Fix possible crashes on eir_create_adv_data
- hci_sync: Fix broadcast/PA when using an existing instance
- ISO: Fix using BT_SK_PA_SYNC to detect BIS sockets
- ISO: Fix not using bc_sid as advertisement SID
- MGMT: Fix sparse errors
* tag 'for-net-2025-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: MGMT: Fix sparse errors
Bluetooth: ISO: Fix not using bc_sid as advertisement SID
Bluetooth: ISO: Fix using BT_SK_PA_SYNC to detect BIS sockets
Bluetooth: eir: Fix possible crashes on eir_create_adv_data
Bluetooth: hci_sync: Fix broadcast/PA when using an existing instance
Bluetooth: Fix NULL pointer deference on eir_get_service_data
====================
af_unix: Allow passing cred for embryo without SO_PASSCRED/SO_PASSPIDFD.
Before the cited commit, the kernel unconditionally embedded SCM
credentials to skb for embryo sockets even when both the sender
and listener disabled SO_PASSCRED and SO_PASSPIDFD.
Now, the credentials are added to skb only when configured by the
sender or the listener.
However, as reported in the link below, it caused a regression for
some programs that assume credentials are included in every skb,
but sometimes not now.
The only problematic scenario would be that a socket starts listening
before setting the option. Then, there will be 2 types of non-small
race window, where a client can send skb without credentials, which
the peer receives as an "invalid" message (and aborts the connection
it seems ?):
Client Server
------ ------
s1.listen() <-- No SO_PASS{CRED,PIDFD}
s2.connect()
s2.send() <-- w/o cred
s1.setsockopt(SO_PASS{CRED,PIDFD})
s2.send() <-- w/ cred
or
Client Server
------ ------
s1.listen() <-- No SO_PASS{CRED,PIDFD}
s2.connect()
s2.send() <-- w/o cred
s3, _ = s1.accept() <-- Inherit cred options
s2.send() <-- w/o cred but not set yet
Jakub Kicinski [Wed, 11 Jun 2025 17:46:43 +0000 (10:46 -0700)]
net: drv: netdevsim: don't napi_complete() from netpoll
netdevsim supports netpoll. Make sure we don't call napi_complete()
from it, since it may not be scheduled. Breno reports hitting a
warning in napi_complete_done():
WARNING: CPU: 14 PID: 104 at net/core/dev.c:6592 napi_complete_done+0x2cc/0x560
__napi_poll+0x2d8/0x3a0
handle_softirqs+0x1fe/0x710
This is presumably after netpoll stole the SCHED bit prematurely.
Reported-by: Breno Leitao <leitao@debian.org> Fixes: 3762ec05a9fb ("netdevsim: add NAPI support") Tested-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250611174643.2769263-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
veth: prevent NULL pointer dereference in veth_xdp_rcv
The veth peer device is RCU protected, but when the peer device gets
deleted (veth_dellink) then the pointer is assigned NULL (via
RCU_INIT_POINTER).
This patch adds a necessary NULL check in veth_xdp_rcv when accessing
the veth peer net_device.
This fixes a bug introduced in commit dc82a33297fc ("veth: apply qdisc
backpressure on full ptr_ring to reduce TX drops"). The bug is a race
and only triggers when having inflight packets on a veth that is being
deleted.
Reported-by: Ihor Solodrai <ihor.solodrai@linux.dev> Closes: https://lore.kernel.org/all/fecfcad0-7a16-42b8-bff2-66ee83a6e5c4@linux.dev/ Reported-by: syzbot+c4c7bf27f6b0c4bd97fe@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/683da55e.a00a0220.d8eae.0052.GAE@google.com/ Fixes: dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops") Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://patch.msgid.link/174964557873.519608.10855046105237280978.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog()
should fix the race, because all packets will be purged from the qdisc
before releasing the lock.
Fixes: b05972f01e7d ("net: sched: tbf: don't call qdisc_put() while holding tree lock") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250611111515.1983366-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog()
should fix the race, because all packets will be purged from the qdisc
before releasing the lock.
Fixes: b05972f01e7d ("net: sched: tbf: don't call qdisc_put() while holding tree lock") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://patch.msgid.link/20250611111515.1983366-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog()
should fix the race, because all packets will be purged from the qdisc
before releasing the lock.
Fixes: 0c8d13ac9607 ("net: sched: red: delay destroying child qdisc on replace") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250611111515.1983366-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog()
should fix the race, because all packets will be purged from the qdisc
before releasing the lock.
Fixes: 7b8e0b6e6599 ("net: sched: prio: delay destroying child qdiscs on change") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250611111515.1983366-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: phy: phy_caps: Don't skip better duplex macth on non-exact match
When performing a non-exact phy_caps lookup, we are looking for a
supported mode that matches as closely as possible the passed speed/duplex.
Blamed patch broke that logic by returning a match too early in case
the caller asks for half-duplex, as a full-duplex linkmode may match
first, and returned as a non-exact match without even trying to mach on
half-duplex modes.
Reported-by: Jijie Shao <shaojijie@huawei.com> Closes: https://lore.kernel.org/netdev/20250603102500.4ec743cf@fedora/T/#m22ed60ca635c67dc7d9cbb47e8995b2beb5c1576 Tested-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Fixes: fc81e257d19f ("net: phy: phy_caps: Allow looking-up link caps based on speed and duplex") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250606094321.483602-1-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
net: bcmgenet: add support for GRO software interrupt coalescing
Enable support for software IRQ coalescing and GRO aggregation
and apply conservative defaults which can help improve system
and network performance by reducing the number of hardware
interrupts and improving GRO aggregation ratio.
Zak Kemble [Tue, 10 Jun 2025 22:04:02 +0000 (23:04 +0100)]
net: bcmgenet: use napi_complete_done return value
Make use of the return value from napi_complete_done(). This allows users to
use the gro_flush_timeout and napi_defer_hard_irqs sysfs attributes for
configuring software interrupt coalescing.
Abin Joseph [Tue, 10 Jun 2025 11:41:11 +0000 (17:11 +0530)]
net: macb: Add shutdown operation support
Implement the shutdown hook to ensure clean and complete deactivation of
MACB controller. The shutdown sequence is protected with 'rtnl_lock()'
to serialize access and prevent race conditions while detaching and
closing the network device. This ensure a safe transition when the Kexec
utility calls the shutdown hook, facilitating seamless loading and
booting of a new kernel from the currently running one.
====================
net: phy: micrel: add extended PHY support for KSZ9477-class devices
This patch series extends the PHY driver support for the Microchip
KSZ9477-class switch-integrated PHYs. These changes enhance ethtool
functionality and diagnostic capabilities by implementing the following
features:
- MDI/MDI-X configuration support
All crossover modes (auto, MDI, MDI-X) are now configurable.
- RX error counter reporting
The RXER counter (reg 0x15) is now accumulated and exported via
ethtool stats.
- Cable test support
Reuses the KSZ9131 implementation to enable open/short fault
detection and approximate fault length reporting.
====================
Oleksij Rempel [Tue, 10 Jun 2025 09:13:54 +0000 (11:13 +0200)]
net: phy: micrel: add cable test support for KSZ9477-class PHYs
Enable cable test support for KSZ9477-class PHYs by reusing the
existing KSZ9131 implementation.
This also adds support for 100Mbit-only PHYs like KSZ8563, which are
identified as KSZ9477. For these PHYs, only two wire pairs (A and B)
are active, so the cable test logic limits the pair_mask accordingly.
Support for KSZ8563 is untested but added based on its register
compatibility and PHY ID match.
Tested on KSZ9893 (Gigabit): open and short conditions were correctly
detected on all four pairs. Fault length reporting is functional and
varies by pair. For example:
- 2m cable: open faults reported ~1.2m (pairs B–D), 0.0m (pair A)
- No cable: all pairs report 0.0m fault length
Oleksij Rempel [Tue, 10 Jun 2025 09:13:53 +0000 (11:13 +0200)]
net: phy: micrel: Add RX error counter support for KSZ9477 switch-integrated PHYs
Add support for tracking receive error statistics from PHYs integrated
into the KSZ9477 family of Ethernet switches.
The integrated PHYs expose a receive error (RXER) counter in register 0x15.
This counter increments when the PHY detects one or more symbol errors
on a received frame. The register is cleared upon reading.
Changes include:
- `kszphy_update_stats()` to accumulate the RX error count.
- `kszphy_get_phy_stats()` to expose this count via ethtool PHY stats.
- Addition of a private `rx_err_pkt_cnt` field in the driver.
- Registration of `.update_stats` and `.get_phy_stats` callbacks in the
KSZ9477 PHY driver structure.
The functionality of this counter was confirmed by physically disturbing
the signal lines - specifically by wiggling exposed twisted pair wires and
intentionally shorting between pairs. These actions triggered RXER
increments, validating the counter's behavior.
This RXER counter is confirmed for KSZ9477 and likely applicable to
other related PHYs like those in KSZ9313.
Oleksij Rempel [Tue, 10 Jun 2025 09:13:52 +0000 (11:13 +0200)]
net: phy: micrel: add MDI/MDI-X control support for KSZ9477 switch-integrated PHYs
Add MDI/MDI-X configuration support for PHYs integrated in the KSZ9477
family of Ethernet switches.
All MDI/MDI-X configuration modes are supported:
- Automatic MDI/MDI-X (ETH_TP_MDI_AUTO)
- Forced MDI (ETH_TP_MDI)
- Forced MDI-X (ETH_TP_MDI_X)
However, when operating in automatic mode, the PHY does not expose the
resolved crossover status (i.e., whether MDI or MDI-X is active).
Therefore, in auto mode, the driver reports ETH_TP_MDI_INVALID as
the current status.
The output interface has a 4->6 program attached at ingress.
We try to loop the multicast skb back to the sending socket.
Ingress BPF runs as part of netif_rx(), pushes a valid v6 hdr
and changes skb->protocol to v6. We enter ip6_rcv_core which
tries to use skb_dst(). But the dst is still an IPv4 one left
after IPv4 mcast output.
Clear the dst in all BPF helpers which change the protocol.
Try to preserve metadata dsts, those may carry non-routing
metadata.
Cc: stable@vger.kernel.org Reviewed-by: Maciej Żenczykowski <maze@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Fixes: d219df60a70e ("bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()") Fixes: 1b00e0dfe7d0 ("bpf: update skb->protocol in bpf_skb_net_grow") Fixes: 6578171a7ff0 ("bpf: add bpf_skb_change_proto helper") Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250610001245.1981782-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 11 Jun 2025 23:41:54 +0000 (16:41 -0700)]
Merge branch 'fbnic-expand-mac-stats-coverage'
Mohsin Bashir says:
====================
fbnic: Expand mac stats coverage
This patch series expand the coverage of mac stats for fbnic. The first
patch increment the ETHTOOL_RMON_HIST_MAX by 1 to provide necessary
support for all the ranges of rmon histogram supported by fbnic.
The second patch add support for rmon and eth_ctrl stats.
====================
Mohsin Bashir [Tue, 10 Jun 2025 17:11:08 +0000 (10:11 -0700)]
eth: Update rmon hist range
The fbnic driver reports up-to 11 ranges resulting in the drop of the
last range. This patch increment the value of ETHTOOL_RMON_HIST_MAX to
address this limitation.
Magnus Lindholm [Tue, 18 Feb 2025 17:55:14 +0000 (18:55 +0100)]
mm: pgtable: fix pte_swp_exclusive
Make pte_swp_exclusive return bool instead of int. This will better
reflect how pte_swp_exclusive is actually used in the code.
This fixes swap/swapoff problems on Alpha due pte_swp_exclusive not
returning correct values when _PAGE_SWP_EXCLUSIVE bit resides in upper
32-bits of PTE (like on alpha).
Shahar Shitrit [Tue, 10 Jun 2025 15:15:14 +0000 (18:15 +0300)]
net/mlx5e: Fix number of lanes to UNKNOWN when using data_rate_oper
When the link is up, either eth_proto_oper or ext_eth_proto_oper
typically reports the active link protocol, from which both speed
and number of lanes can be retrieved. However, in certain cases,
such as when a NIC is connected via a non-standard cable, the
firmware may not report the protocol.
In such scenarios, the speed can still be obtained from the
data_rate_oper field in PTYS register. Since data_rate_oper
provides only speed information and lacks lane details, it is
incorrect to derive the number of lanes from it.
This patch corrects the behavior by setting the number of lanes to
UNKNOWN instead of incorrectly using MAX_LANES when relying on
data_rate_oper.
Fixes: 7e959797f021 ("net/mlx5e: Enable lanes configuration when auto-negotiation is off") Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250610151514.1094735-10-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jianbo Liu [Tue, 10 Jun 2025 15:15:13 +0000 (18:15 +0300)]
net/mlx5e: Fix leak of Geneve TLV option object
Previously, a unique tunnel id was added for the matching on TC
non-zero chains, to support inner header rewrite with goto action.
Later, it was used to support VF tunnel offload for vxlan, then for
Geneve and GRE. To support VF tunnel, a temporary mlx5_flow_spec is
used to parse tunnel options. For Geneve, if there is TLV option, a
object is created, or refcnt is added if already exists. But the
temporary mlx5_flow_spec is directly freed after parsing, which causes
the leak because no information regarding the object is saved in
flow's mlx5_flow_spec, which is used to free the object when deleting
the flow.
To fix the leak, call mlx5_geneve_tlv_option_del() before free the
temporary spec if it has TLV object.
Fixes: 521933cdc4aa ("net/mlx5e: Support Geneve and GRE with VF tunnel offload") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Alex Lazar <alazar@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250610151514.1094735-9-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vlad Dogaru [Tue, 10 Jun 2025 15:15:11 +0000 (18:15 +0300)]
net/mlx5: HWS, make sure the uplink is the last destination
When there are more than one destinations, we create a FW flow
table and provide it with all the destinations. FW requires to
have wire as the last destination in the list (if it exists),
otherwise the operation fails with FW syndrome.
This patch fixes the destination array action creation: if it
contains a wire destination, it is moved to the end.
net/mlx5: Fix return value when searching for existing flow group
When attempting to add a rule to an existing flow group, if a matching
flow group exists but is not active, the error code returned should be
EAGAIN, so that the rule can be added to the matching flow group once
it is active, rather than ENOENT, which indicates that no matching
flow group was found.
Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel") Signed-off-by: Gavi Teitz <gavi@nvidia.com> Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250610151514.1094735-4-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Amir Tzin [Tue, 10 Jun 2025 15:15:07 +0000 (18:15 +0300)]
net/mlx5: Fix ECVF vports unload on shutdown flow
Fix shutdown flow UAF when a virtual function is created on the embedded
chip (ECVF) of a BlueField device. In such case the vport acl ingress
table is not properly destroyed.
ECVF functionality is independent of ecpf_vport_exists capability and
thus functions mlx5_eswitch_(enable|disable)_pf_vf_vports() should not
test it when enabling/disabling ECVF vports.
Moshe Shemesh [Tue, 10 Jun 2025 15:15:06 +0000 (18:15 +0300)]
net/mlx5: Ensure fw pages are always allocated on same NUMA
When firmware asks the driver to allocate more pages, using event of
give_pages, the driver should always allocate it from same NUMA, the
original device NUMA. Current code uses dev_to_node() which can result
in different NUMA as it is changed by other driver flows, such as
mlx5_dma_zalloc_coherent_node(). Instead, use saved numa node for
allocating firmware pages.
Fixes: 311c7c71c9bb ("net/mlx5e: Allocate DMA coherent memory on reader NUMA node") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250610151514.1094735-2-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For i40e:
Robert Malz improves reset handling for situations where multiple reset
requests could cause some to be missed.
For iavf:
Ahmed adds detection, and handling, of reset that could occur early in
the initialization process to stop long wait/hangs.
For ice:
Anton, properly, sets missed use_nsecs value.
For e1000:
Joe Damato moves cancel_work_sync() call to avoid deadlock.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
e1000: Move cancel_work_sync to avoid deadlock
ice/ptp: fix crosstimestamp reporting
iavf: fix reset_task for early reset event
i40e: retry VFLR handling if there is ongoing VF reset
i40e: return false from i40e_reset_vf if reset is in progress
====================
Alexander Stein [Tue, 10 Jun 2025 11:40:56 +0000 (13:40 +0200)]
net: fman_memac: Don't use of_property_read_bool on non-boolean property managed
'managed' is a non-boolean property specified in ethernet-controller.yaml.
Since commit c141ecc3cecd7 ("of: Warn when of_property_read_bool() is
used on non-boolean properties") this raises a warning. Use the
replacement of_property_present() instead.
Bluetooth: ISO: Fix using BT_SK_PA_SYNC to detect BIS sockets
BT_SK_PA_SYNC is only valid for Broadcast Sinks which means socket used
for Broadcast Sources wouldn't be able to use the likes of getpeername
to read out the sockaddr_iso_bc fields which may have been update (e.g.
bc_sid).
Fixes: 0a766a0affb5 ("Bluetooth: ISO: Fix getpeername not returning sockaddr_iso_bc fields") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: hci_sync: Fix broadcast/PA when using an existing instance
When using and existing adv_info instance for broadcast source it
needs to be updated to periodic first before it can be reused, also in
case the existing instance already have data hci_set_adv_instance_data
cannot be used directly since it would overwrite the existing data so
this reappend the original data after the Broadcast ID, if one was
generated.
Example:
bluetoothctl># Add PBP to EA so it can be later referenced as the BIS ID
bluetoothctl> advertise.service 0x1856 0x00 0x00
bluetoothctl> advertise on
...
< HCI Command: LE Set Extended Advertising Data (0x08|0x0037) plen 13
Handle: 0x01
Operation: Complete extended advertising data (0x03)
Fragment preference: Minimize fragmentation (0x01)
Data length: 0x09
Service Data: Public Broadcast Announcement (0x1856)
Data[2]: 0000
Flags: 0x06
LE General Discoverable Mode
BR/EDR Not Supported
...
bluetoothctl># Attempt to acquire Broadcast Source transport
bluetoothctl>transport.acquire /org/bluez/hci0/pac_bcast0/fd0
...
< HCI Command: LE Set Extended Advertising Data (0x08|0x0037) plen 255
Handle: 0x01
Operation: Complete extended advertising data (0x03)
Fragment preference: Minimize fragmentation (0x01)
Data length: 0x0e
Service Data: Broadcast Audio Announcement (0x1852)
Broadcast ID: 11371620 (0xad8464)
Service Data: Public Broadcast Announcement (0x1856)
Data[2]: 0000
Flags: 0x06
LE General Discoverable Mode
BR/EDR Not Supported
Link: https://github.com/bluez/bluez/issues/1117 Fixes: eca0ae4aea66 ("Bluetooth: Add initial implementation of BIS connections") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bobby Eshleman [Mon, 9 Jun 2025 16:39:24 +0000 (09:39 -0700)]
selftests/vsock: add initial vmtest.sh for vsock
This commit introduces a new vmtest.sh runner for vsock.
It uses virtme-ng/qemu to run tests in a VM. The tests validate G2H,
H2G, and loopback. The testing tools from tools/testing/vsock/ are
reused. Currently, only vsock_test is used.
VMCI and hyperv support is included in the config file to be built with
the -b option, though not used in the tests.
Only tested on x86.
To run:
$ make -C tools/testing/selftests TARGETS=vsock
$ tools/testing/selftests/vsock/vmtest.sh
or
$ make -C tools/testing/selftests TARGETS=vsock run_tests
Example runs (after make -C tools/testing/selftests TARGETS=vsock):
$ ./tools/testing/selftests/vsock/vmtest.sh
1..3
ok 0 vm_server_host_client
ok 1 vm_client_host_server
ok 2 vm_loopback
SUMMARY: PASS=3 SKIP=0 FAIL=0
Log: /tmp/vsock_vmtest_m7DI.log
$ mkdir -p ~/scratch
$ make -C tools/testing/selftests install TARGETS=vsock INSTALL_PATH=~/scratch
[... omitted ...]
$ cd ~/scratch
$ ./run_kselftest.sh
TAP version 13
1..1
# timeout set to 300
# selftests: vsock: vmtest.sh
# 1..3
# ok 0 vm_server_host_client
# ok 1 vm_client_host_server
# ok 2 vm_loopback
# SUMMARY: PASS=3 SKIP=0 FAIL=0
# Log: /tmp/vsock_vmtest_svEl.log
ok 1 selftests: vsock: vmtest.sh
Future work can include vsock_diag_test.
Because vsock requires a VM to test anything other than loopback, this
patch adds vmtest.sh as a kselftest itself. This is different than other
systems that have a "vmtest.sh", where it is used as a utility script to
spin up a VM to run the selftests as a guest (but isn't hooked into
kselftest).
When using publicly available tools like 'mdio-tools' to read/write data
from/to network interface and its PHY via C45 (clause 45) mdiobus,
there is no verification of parameters passed to the ioctl and
it accepts any mdio address.
Currently there is support for 32 addresses in kernel via PHY_MAX_ADDR define,
but it is possible to pass higher value than that via ioctl.
While read/write operation should generally fail in this case,
mdiobus provides stats array, where wrong address may allow out-of-bounds
read/write.
Fix that by adding address verification before C45 read/write operation.
While this excludes this access from any statistics, it improves security of
read/write operation.
Fixes: 4e4aafcddbbf ("net: mdio: Add dedicated C45 API to MDIO bus drivers") Signed-off-by: Jakub Raczynski <j.raczynski@samsung.com> Reported-by: Wenjing Shan <wenjing.shan@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When using publicly available tools like 'mdio-tools' to read/write data
from/to network interface and its PHY via mdiobus, there is no verification of
parameters passed to the ioctl and it accepts any mdio address.
Currently there is support for 32 addresses in kernel via PHY_MAX_ADDR define,
but it is possible to pass higher value than that via ioctl.
While read/write operation should generally fail in this case,
mdiobus provides stats array, where wrong address may allow out-of-bounds
read/write.
Fix that by adding address verification before read/write operation.
While this excludes this access from any statistics, it improves security of
read/write operation.
Fixes: 080bb352fad00 ("net: phy: Maintain MDIO device and bus statistics") Signed-off-by: Jakub Raczynski <j.raczynski@samsung.com> Reported-by: Wenjing Shan <wenjing.shan@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Zilin Guan [Fri, 23 May 2025 11:01:56 +0000 (11:01 +0000)]
wifi: cfg80211: use kfree_sensitive() for connkeys cleanup
The nl80211_parse_connkeys() function currently uses kfree() to release
the 'result' structure in error handling paths. However, if an error
occurs due to result->def being less than 0, the 'result' structure may
contain sensitive information.
To prevent potential leakage of sensitive data, replace kfree() with
kfree_sensitive() when freeing 'result'. This change aligns with the
approach used in its caller, nl80211_join_ibss(), enhancing the overall
security of the wireless subsystem.
wifi: iwlwifi: fix merge damage related to iwl_pci_resume
The changes I made in
wifi: iwlwifi: don't warn if the NIC is gone in resume
conflicted with a major rework done in this area.
The merge de-facto removed my changes.
Re-add them.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Link: https://patch.msgid.link/20250603091754.70182-1-emmanuel.grumbach@intel.com Fixes: 06c4b2036818 ("Merge tag 'iwlwifi-next-2025-05-15' of https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This reverts commit 4fcfcbe45734 ("wifi: mwifiex: Fix HT40 bandwidth
issue.")
That commit introduces a regression, when HT40 mode is enabled,
received packets are lost, this was experience with W8997 with both
SDIO-UART and SDIO-SDIO variants. From an initial investigation the
issue solves on its own after some time, but it's not clear what is
the reason. Given that this was just a performance optimization, let's
revert it till we have a better understanding of the issue and a proper
fix.
Cc: Jeff Chen <jeff.chen_1@nxp.com> Cc: stable@vger.kernel.org Fixes: 4fcfcbe45734 ("wifi: mwifiex: Fix HT40 bandwidth issue.") Closes: https://lore.kernel.org/all/20250603203337.GA109929@francesco-nb/ Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Link: https://patch.msgid.link/20250605130302.55555-1-francesco@dolcini.it
[fix commit reference format] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
According to 802.1AE standard, when ES and SC flags in TCI are zero,
used SCI should be the current active SC_RX. Current code uses the
header MAC address. Without this patch, when ES flag is 0 (using a
bridge or switch), header MAC will not fit the SCI and MACSec frames
will be discarted.
In order to test this issue, MACsec link should be stablished between
two interfaces, setting SC and ES flags to zero and a port identifier
different than one. For example, using ip macsec tools:
ip link add link $ETH0 macsec0 type macsec port 11 send_sci off
end_station off
ip macsec add macsec0 tx sa 0 pn 2 on key 01 $ETH1_KEY
ip macsec add macsec0 rx port 11 address $ETH1_MAC
ip macsec add macsec0 rx port 11 address $ETH1_MAC sa 0 pn 2 on key 02
ip link set dev macsec0 up
ip link add link $ETH1 macsec1 type macsec port 11 send_sci off
end_station off
ip macsec add macsec1 tx sa 0 pn 2 on key 01 $ETH0_KEY
ip macsec add macsec1 rx port 11 address $ETH0_MAC
ip macsec add macsec1 rx port 11 address $ETH0_MAC sa 0 pn 2 on key 02
ip link set dev macsec1 up
Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Co-developed-by: Andreu Montiel <Andreu.Montiel@technica-engineering.de> Signed-off-by: Andreu Montiel <Andreu.Montiel@technica-engineering.de> Signed-off-by: Carlos Fernandez <carlos.fernandez@technica-engineering.de> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: stop napi kthreads when THREADED napi is disabled
Once the THREADED napi is disabled, the napi kthread should also be
stopped. Keeping the kthread intact after disabling THREADED napi makes
the PID of this kthread show up in the output of netlink 'napi-get' and
ps -ef output.
The is discussed in the patch below:
https://lore.kernel.org/all/20250502191548.559cc416@kernel.org
NAPI kthread should stop only if,
- There are no pending napi poll scheduled for this thread.
- There are no new napi poll scheduled for this thread while it has
stopped.
- The ____napi_schedule can correctly fallback to the softirq for napi
polling.
Since napi_schedule_prep provides mutual exclusion over STATE_SCHED bit,
it is safe to unset the STATE_THREADED when SCHED_THREADED is set or the
SCHED bit is not set. SCHED_THREADED being set means that SCHED is
already set and the kthread owns this napi.
To disable threaded napi, unset STATE_THREADED bit safely if
SCHED_THREADED is set or SCHED is unset. Once STATE_THREADED is unset
safely then wait for the kthread to unset the SCHED_THREADED bit so it
safe to stop the kthread.
Add a new test in nl_netdev to verify this behaviour.
Tested:
./tools/testing/selftests/net/nl_netdev.py
TAP version 13
1..6
ok 1 nl_netdev.empty_check
ok 2 nl_netdev.lo_check
ok 3 nl_netdev.page_pool_check
ok 4 nl_netdev.napi_list_check
ok 5 nl_netdev.dev_set_threaded
ok 6 nl_netdev.nsim_rxq_reset_down
# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0
Ran neper for 300 seconds and did enable/disable of thread napi in a
loop continuously.
Moon Yeounsu [Tue, 10 Jun 2025 00:01:30 +0000 (09:01 +0900)]
net: dlink: enable RMON MMIO access on supported devices
Enable memory-mapped I/O access to RMON statistics registers for devices
known to work correctly. Currently, only the D-Link DGE-550T (`0x4000`)
with PCI revision A3 (`0x0c`) is allowed.
To avoid issues on other hardware, a runtime check was added to restrict
MMIO usage. The `MEM_MAPPING` macro was removed in favor of runtime
detection.
To access RMON registers, the code `dw32(RmonStatMask, 0x0007ffff);`
must also be skipped, so this patch conditionally disables it as well.
netconsole: fix appending sysdata when sysdata_fields == SYSDATA_RELEASE
Before appending sysdata, prepare_extradata() checks if any feature is
enabled in sysdata_fields (and exits early if none is enabled).
When SYSDATA_RELEASE was introduced, we missed adding it to the list of
features being checked against sysdata_fields in prepare_extradata().
The result was that, if only SYSDATA_RELEASE is enabled in
sysdata_fields, we incorreclty exit early and fail to append the
release.
Instead of checking specific bits in sysdata_fields, check if
sysdata_fields has ALL bit zeroed and exit early if true. This fixes
case when only SYSDATA_RELEASE enabled and makes the code more general /
less error prone in future feature implementation.
The first 4 patches are by Vincent Mailhol and prepare the CAN netlink
interface for the introduction of CAN XL configuration.
Geert Uytterhoeven's patch updates the CAN networking documentation.
The last 2 patched are by Davide Caratti and introduce skb drop
reasons in the receive path of several CAN protocols.
* tag 'linux-can-next-for-6.17-20250610' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
can: add drop reasons in CAN protocols receive path
can: add drop reasons in the receive path of AF_CAN
documentation: networking: can: Document alloc_candev_mqs()
can: netlink: can_changelink(): rename tdc_mask into fd_tdc_flag_provided
can: bittiming: rename can_tdc_is_enabled() into can_fd_tdc_is_enabled()
can: bittiming: rename CAN_CTRLMODE_TDC_MASK into CAN_CTRLMODE_FD_TDC_MASK
can: netlink: replace tabulation by space in assignment
====================
Michal Luczaj [Mon, 9 Jun 2025 17:08:03 +0000 (19:08 +0200)]
net: Fix TOCTOU issue in sk_is_readable()
sk->sk_prot->sock_is_readable is a valid function pointer when sk resides
in a sockmap. After the last sk_psock_put() (which usually happens when
socket is removed from sockmap), sk->sk_prot gets restored and
sk->sk_prot->sock_is_readable becomes NULL.
This makes sk_is_readable() racy, if the value of sk->sk_prot is reloaded
after the initial check. Which in turn may lead to a null pointer
dereference.
Ensure the function pointer does not turn NULL after the check.
Fixes: 8934ce2fd081 ("bpf: sockmap redirect ingress support") Suggested-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250609-skisreadable-toctou-v1-1-d0dfb2d62c37@rbox.co Signed-off-by: Jakub Kicinski <kuba@kernel.org>
octeontx2-pf: Avoid typecasts by simplifying otx2_atomic64_add macro
Just because otx2_atomic64_add is using u64 pointer as argument
all callers has to typecast __iomem void pointers which inturn
causing sparse warnings. Fix those by changing otx2_atomic64_add
argument to void pointer.
This patch removes unnecessary typecasts by marking the
mbox_regions array as __iomem since it is used to store
pointers to memory-mapped I/O (MMIO) regions. Also simplified
the call to readq() in PF driver by removing redundant type casts.
Add a new function, netif_subqueue_sent, which is a wrapper for
netdev_tx_sent_queue.
Drivers that use the subqueue variant macros, netif_subqueue_xxx,
identify queue by index and are not required to obtain
struct netdev_queue explicitly.
Such drivers still need to call netdev_tx_sent_queue which is a
counterpart of netif_subqueue_completed_wake. Allowing drivers to use a
subqueue variant for this purpose improves their code consistency by
always referring to queue by its index.
Jake moves from individual virtchnl RSS configuration values, for ice,
i40e, and iavf, to a common libie location and values.
Martyna and Dawid add counters for link_down_events to ice, i40e, and
ixgbe drivers. The counter increments only on actual physical link-down
events visible to the PHY. It does not increment when the user performs
a software-only interface down/up (e.g. ip link set dev down).
The counter does increment in cases where the interface is reinitialized
in a way that causes a real link drop - such as eg. when attaching
an XDP program, reconfiguring channels, or toggling certain priv-flags.
For ice:
Arkadiusz and Karol separate PTP and DPLL functionality to their
respective APIs.
Michal adds a separate handler for Flow Director command processing.
For iavf:
Ahmed converts driver to utilize core's IRQ affinity API.
For ixgbe:
Alok Tiwari fixes issues with some comments; typos, copy/paste errors,
etc.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ixgbe: Fix typos and clarify comments in X550 driver code
iavf: convert to NAPI IRQ affinity API
ice: add a separate Rx handler for flow director commands
ice: add ice driver PTP pin documentation
ice: change SMA pins to SDP in PTP API
ice: redesign dpll sma/u.fl pins control
ixgbe: add link_down_events statistic
i40e: add link_down_events statistic
ice: add link_down_events statistic
net: intel: move RSS packet classifier types to libie
net: intel: rename 'hena' to 'hashcfg' for clarity
====================
Jakub Kicinski [Mon, 9 Jun 2025 14:39:33 +0000 (07:39 -0700)]
uapi: in6: restore visibility of most IPv6 socket options
A decade ago commit 6d08acd2d32e ("in6: fix conflict with glibc")
hid the definitions of IPV6 options, because GCC was complaining
about duplicates. The commit did not list the warnings seen, but
trying to recreate them now I think they are (building iproute2):
In file included from ./include/uapi/rdma/rdma_user_cm.h:39,
from rdma.h:16,
from res.h:9,
from res-ctx.c:7:
../include/uapi/linux/in6.h:171:9: warning: ‘IPV6_ADD_MEMBERSHIP’ redefined
171 | #define IPV6_ADD_MEMBERSHIP 20
| ^~~~~~~~~~~~~~~~~~~
In file included from /usr/include/netinet/in.h:37,
from rdma.h:13:
/usr/include/bits/in.h:233:10: note: this is the location of the previous definition
233 | # define IPV6_ADD_MEMBERSHIP IPV6_JOIN_GROUP
| ^~~~~~~~~~~~~~~~~~~
../include/uapi/linux/in6.h:172:9: warning: ‘IPV6_DROP_MEMBERSHIP’ redefined
172 | #define IPV6_DROP_MEMBERSHIP 21
| ^~~~~~~~~~~~~~~~~~~~
/usr/include/bits/in.h:234:10: note: this is the location of the previous definition
234 | # define IPV6_DROP_MEMBERSHIP IPV6_LEAVE_GROUP
| ^~~~~~~~~~~~~~~~~~~~
Compilers don't complain about redefinition if the defines
are identical, but here we have the kernel using the literal
value, and glibc using an indirection (defining to a name
of another define, with the same numerical value).
Problem is, the commit in question hid all the IPV6 socket
options, and glibc has a pretty sparse list. For instance
it lacks Flow Label related options. Willem called this out
in commit 3fb321fde22d ("selftests/net: ipv6 flowlabel"):
/* uapi/glibc weirdness may leave this undefined */
#ifndef IPV6_FLOWINFO
#define IPV6_FLOWINFO 11
#endif
More interestingly some applications (socat) use
a #ifdef IPV6_FLOWINFO to gate compilation of thier
rudimentary flow label support. (For added confusion
socat misspells it as IPV4_FLOWINFO in some places.)
Hide only the two defines we know glibc has a problem
with. If we discover more warnings we can hide more
but we should avoid covering the entire block of
defines for "IPV6 socket options".