git.ipfire.org Git - thirdparty/linux.git/log

wifi: mac80211: Check 802.11 encaps offloading in ieee80211_tx_h_select_key()

With 802.11 encapsulation offloading, ieee80211_tx_h_select_key() is
called on 802.3 frames. In that case do not try to use skb data as
valid 802.11 headers.

Reported-by: Bert Karwatzki <spasswolf@web.de>
Closes: https://lore.kernel.org/linux-wireless/20250410215527.3001-1-spasswolf@web.de
Fixes: bb42f2d13ffc ("mac80211: Move reorder-sensitive TX handlers to after TXQ dequeue")
Signed-off-by: Remi Pommarel <repk@triplefau.lt>
Link: https://patch.msgid.link/1af4b5b903a5fca5ebe67333d5854f93b2be5abe.1752765971.git.repk@triplefau.lt
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: Don't call fq_flow_idx() for management frames

skb_get_hash() can only be used when the skb is linked to a netdev
device.

Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
Fixes: 73bc9e0af594 ("mac80211: don't apply flow control on management frames")
Link: https://patch.msgid.link/20250717162547.94582-3-Alexander@wetzel-home.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: Do not schedule stopped TXQs

Ignore TXQs with the flag IEEE80211_TXQ_STOP when scheduling a queue.

The flag is only set after all fragments have been dequeued and won't
allow dequeueing other frames as long as the flag is set.

For drivers using ieee80211_txq_schedule_start() this prevents an
loop trying to push the queued frames while IEEE80211_TXQ_STOP is set:

After setting IEEE80211_TXQ_STOP the driver will call
ieee80211_return_txq(). Which calls __ieee80211_schedule_txq(), detects
that there sill are frames in the queue and immediately restarts the
stopped TXQ. Which can't dequeue any frame and thus starts over the loop.

Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
Fixes: ba8c3d6f16a1 ("mac80211: add an intermediate software queue implementation")
Link: https://patch.msgid.link/20250717162547.94582-2-Alexander@wetzel-home.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: cfg80211: Add missing lock in cfg80211_check_and_end_cac()

Callers of wdev_chandef() must hold the wiphy mutex.

But the worker cfg80211_propagate_cac_done_wk() never takes the lock.
Which triggers the warning below with the mesh_peer_connected_dfs
test from hostapd and not (yet) released mac80211 code changes:

WARNING: CPU: 0 PID: 495 at net/wireless/chan.c:1552 wdev_chandef+0x60/0x165
Modules linked in:
CPU: 0 UID: 0 PID: 495 Comm: kworker/u4:2 Not tainted 6.14.0-rc5-wt-g03960e6f9d47 #33 13c287eeabfe1efea01c0bcc863723ab082e17cf
Workqueue: cfg80211 cfg80211_propagate_cac_done_wk
Stack:
00000000 00000001 ffffff00 6093267c
00000000 6002ec30 6d577c50 60037608
00000000 67e8d108 6063717b 00000000
Call Trace:
[<6002ec30>] ? _printk+0x0/0x98
[<6003c2b3>] show_stack+0x10e/0x11a
[<6002ec30>] ? _printk+0x0/0x98
[<60037608>] dump_stack_lvl+0x71/0xb8
[<6063717b>] ? wdev_chandef+0x60/0x165
[<6003766d>] dump_stack+0x1e/0x20
[<6005d1b7>] __warn+0x101/0x20f
[<6005d3a8>] warn_slowpath_fmt+0xe3/0x15d
[<600b0c5c>] ? mark_lock.part.0+0x0/0x4ec
[<60751191>] ? __this_cpu_preempt_check+0x0/0x16
[<600b11a2>] ? mark_held_locks+0x5a/0x6e
[<6005d2c5>] ? warn_slowpath_fmt+0x0/0x15d
[<60052e53>] ? unblock_signals+0x3a/0xe7
[<60052f2d>] ? um_set_signals+0x2d/0x43
[<60751191>] ? __this_cpu_preempt_check+0x0/0x16
[<607508b2>] ? lock_is_held_type+0x207/0x21f
[<6063717b>] wdev_chandef+0x60/0x165
[<605f89b4>] regulatory_propagate_dfs_state+0x247/0x43f
[<60052f00>] ? um_set_signals+0x0/0x43
[<605e6bfd>] cfg80211_propagate_cac_done_wk+0x3a/0x4a
[<6007e460>] process_scheduled_works+0x3bc/0x60e
[<6007d0ec>] ? move_linked_works+0x4d/0x81
[<6007d120>] ? assign_work+0x0/0xaa
[<6007f81f>] worker_thread+0x220/0x2dc
[<600786ef>] ? set_pf_worker+0x0/0x57
[<60087c96>] ? to_kthread+0x0/0x43
[<6008ab3c>] kthread+0x2d3/0x2e2
[<6007f5ff>] ? worker_thread+0x0/0x2dc
[<6006c05b>] ? calculate_sigpending+0x0/0x56
[<6003b37d>] new_thread_handler+0x4a/0x64
irq event stamp: 614611
hardirqs last enabled at (614621): [<00000000600bc96b>] __up_console_sem+0x82/0xaf
hardirqs last disabled at (614630): [<00000000600bc92c>] __up_console_sem+0x43/0xaf
softirqs last enabled at (614268): [<00000000606c55c6>] __ieee80211_wake_queue+0x933/0x985
softirqs last disabled at (614266): [<00000000606c52d6>] __ieee80211_wake_queue+0x643/0x985

Fixes: 26ec17a1dc5e ("cfg80211: Fix radar event during another phy CAC")
Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
Link: https://patch.msgid.link/20250717162547.94582-1-Alexander@wetzel-home.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: plfxlc: Fix error handling in usb driver probe

If probe fails before ieee80211_register_hw() is successfully done,
ieee80211_unregister_hw() will be called anyway. This may lead to various
bugs as the implementation of ieee80211_unregister_hw() assumes that
ieee80211_register_hw() has been called.

Divide error handling section into relevant subsections, so that
ieee80211_unregister_hw() is called only when it is appropriate. Correct
the order of the calls: ieee80211_unregister_hw() should go before
plfxlc_mac_release(). Also move ieee80211_free_hw() to plfxlc_mac_release()
as it supposed to be the opposite to plfxlc_mac_alloc_hw() that calls
ieee80211_alloc_hw().

Found by Linux Verification Center (linuxtesting.org) with Syzkaller.

Fixes: 68d57a07bfe5 ("wireless: add plfxlc driver for pureLiFi X, XL, XC devices")
Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru>
Link: https://patch.msgid.link/20250321185226.71-3-m.masimov@mt-integration.ru
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: support returning the S1G short beacon skb

When short beaconing is enabled, check the value of the sb_count
to determine whether we are to send a long beacon or short beacon.
sb_count represents the number of short beacons until the next
long beacon, where if its value is 0 we are to send a long beacon.
The value is then reset to the long beacon period, which represents
the number of beacon intervals between each long beacon. The decrement
process follows the same cadence as the decrement of the DTIM count value.

Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com>
Link: https://patch.msgid.link/20250717074205.312577-5-lachlan.hodges@morsemicro.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: support initialising current S1G short beacon index

Introduce the sb_count variable which tracks the number of
beacon intervals until the next long beacon. To initialise this
value, we find the current short beacon index into this period
which represents the number of short beacons left to send before
the next long beacon. We use the same TSF value used to initialise
the DTIM count to ensure the short beacon count and DTIM count
are in sync as its common for the long beacon period and DTIM period
to be equivalent.

Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com>
Link: https://patch.msgid.link/20250717074205.312577-4-lachlan.hodges@morsemicro.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: support initialising an S1G short beaconing BSS

Introduce the ability to parse the short beacon data and long
beacon period. The long beacon period represents the number of beacon
intervals between each long beacon transmission. Additionally,
as a BSS cannot change its configuration such that short beaconing
is dynamically disabled/enabled without tearing down the interface
- we ensure we have an existing short beacon before performing
the update.

Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com>
Link: https://patch.msgid.link/20250717074205.312577-3-lachlan.hodges@morsemicro.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: cfg80211: support configuring an S1G short beaconing BSS

S1G short beacons are an optional frame type used in an S1G BSS
that contain a limited set of elements. While they are optional,
they are a fundamental part of S1G that enables significant
power saving.

Expose 2 additional netlink attributes,
NL80211_ATTR_S1G_LONG_BEACON_PERIOD which denotes the number of beacon
intervals between each long beacon and NL80211_ATTR_S1G_SHORT_BEACON
which is a nested attribute containing the short beacon tail and
head. We split them as the long beacon period cannot be updated,
and is only used when initialisng the interface, whereas the short
beacon data can be used to both initialise and update the templates.
This follows how things such as the beacon interval and DTIM period
currently operate.

During the initialisation path, we ensure we have the long beacon
period if the short beacon data is being passed down, whereas
the update path will simply update the template if its sent down.

The short beacon data is validated using the same routines for regular
beacons as they support correctly parsing the short beacon format
while ensuring the frame is well-formed.

Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com>
Link: https://patch.msgid.link/20250717074205.312577-2-lachlan.hodges@morsemicro.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: brcmfmac: Add support for the SDIO 43751 device

Add the SDIO ID and firmware matching for the 43751 device.

Based on the previous work from Marc Gonzalez <mgonzalez@freebox.fr>.

Tested on an i.MX6DL board connected to an AP6398SV chip with the
brcmfmac43752-sdio.bin firmware taken from:

https://source.puri.sm/Librem5/firmware-brcm43752-nonfree

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>>
Link: https://patch.msgid.link/20250712215307.1310802-1-festevam@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: wilc1000: Use min() to improve code

Use min() to reduce the code and improve its readability.

Reviewed-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Link: https://patch.msgid.link/20250715121721.266713-7-rongqianfeng@vivo.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mwifiex: Use max_t() to improve code

Use max_t() to reduce the code and improve its readability.

Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Reviewed-by: Jeff Chen <jeff.chen_1@nxp.con>
Link: https://patch.msgid.link/20250715121721.266713-6-rongqianfeng@vivo.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: brcm80211: Use min() to improve code

Use min() to reduce the code and improve its readability.

Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>>
Link: https://patch.msgid.link/20250715121721.266713-5-rongqianfeng@vivo.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: brcmfmac: Fix typo "notifer"

There is a spelling mistake of 'notifer' in the comment which
should be 'notifier'.

Signed-off-by: WangYuli <wangyuli@uniontech.com>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>>
Link: https://patch.msgid.link/F92035B0A9123150+20250715134407.540483-5-wangyuli@uniontech.com
[remove prior link]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: reject TDLS operations when station is not associated

syzbot triggered a WARN in ieee80211_tdls_oper() by sending
NL80211_TDLS_ENABLE_LINK immediately after NL80211_CMD_CONNECT,
before association completed and without prior TDLS setup.

This left internal state like sdata->u.mgd.tdls_peer uninitialized,
leading to a WARN_ON() in code paths that assumed it was valid.

Reject the operation early if not in station mode or not associated.

Reported-by: syzbot+f73f203f8c9b19037380@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f73f203f8c9b19037380
Fixes: 81dd2b882241 ("mac80211: move TDLS data to mgd private part")
Tested-by: syzbot+f73f203f8c9b19037380@syzkaller.appspotmail.com
Signed-off-by: Moon Hee Lee <moonhee.lee.ca@gmail.com>
Link: https://patch.msgid.link/20250715230904.661092-2-moonhee.lee.ca@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: brcmsmac: Remove const from tbl_ptr parameter in wlc_lcnphy_common_read_table()

A new warning in clang [1] complains that diq_start in
wlc_lcnphy_tx_iqlo_cal() is passed uninitialized as a const pointer to
wlc_lcnphy_common_read_table():

  drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_lcn.c:2728:13: error: variable 'diq_start' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
   2728 |                                                      &diq_start, 1, 16, 69);
        |                                                       ^~~~~~~~~

The table pointer passed to wlc_lcnphy_common_read_table() should not be
considered constant, as wlc_phy_read_table() is ultimately going to
update it. Remove the const qualifier from the tbl_ptr to clear up the
warning.

Cc: stable@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2108
Fixes: 5b435de0d786 ("net: wireless: add brcm80211 drivers")
Link: https://github.com/llvm/llvm-project/commit/00dacf8c22f065cb52efb14cd091d441f19b319e
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>>
Link: https://patch.msgid.link/20250715-brcmsmac-fix-uninit-const-pointer-v1-1-16e6a51a8ef4@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: extend connection monitoring for MLO

Currently, reset connection monitor (ieee80211_sta_reset_conn_monitor())
timer is handled only for non-AP non-MLD STA and do not support non-AP MLD
STA. The current implementation checks for the CSA active and update the
monitor timer with the timeout value of deflink and reset the timer based
on the deflink's timeout value else schedule the connection loss work when
the deflink is timed out and it won't work for the non-AP MLD STA.

Handle the reset connection monitor timer for non-AP MLD STA by updating
the monitor timer with the timeout value which is determined based on the
link that will expire last among all the links in MLO. If at least one link
has not timed out, the timer is updated accordingly with the latest timeout
value else schedule the connection loss work when all links have timed out.

Remove the MLO-related WARN_ON() checks in the beacon and connection
monitoring logic code paths as they support MLO now.

Signed-off-by: Maharaja Kennadyrajan <maharaja.kennadyrajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250718060837.59371-5-maharaja.kennadyrajan@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: extend beacon monitoring for MLO

Currently, reset beacon monitor (ieee80211_sta_reset_beacon_monitor())
timer is handled only for non-AP non-MLD STA and do not support non-AP MLD
STA. When the beacon loss occurs in non-AP MLD STA with the current
implementation, it is treated as a single link and the timer will reset
based on the timeout of the deflink, without checking all the links.

Check the CSA flags for all the links in the MLO and decide whether to
schedule the work queue for beacon loss. If any of the links has CSA
active, then beacon loss work is not scheduled.

Also, call the functions ieee80211_sta_reset_beacon_monitor() and
ieee80211_sta_reset_conn_monitor() from ieee80211_csa_switch_work() only
when all the links are CSA active.

Signed-off-by: Maharaja Kennadyrajan <maharaja.kennadyrajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250718060837.59371-4-maharaja.kennadyrajan@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: Add link iteration macro for link data with rcu_dereference

Currently, the existing macro for_each_link_data() uses sdata_dereference()
which requires the wiphy lock. This lock cannot be used in atomic or RCU
read-side contexts, such as in the RX path.

Introduce a new macro, for_each_link_data_rcu(), that iterates over link of
sdata using rcu_dereference(), making it safe to use in RCU contexts. This
allows callers to access link data without requiring the wiphy lock.

The macro takes into account the vif.valid_links bitmap and ensures only
valid links are accessed safely. Callers are responsible for ensuring that
rcu_read_lock() is held when using this macro.

Signed-off-by: Maharaja Kennadyrajan <maharaja.kennadyrajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250718060837.59371-3-maharaja.kennadyrajan@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: fix macro scoping in for_each_link_data

The for_each_link_data() macro currently declares a local variable
__sdata directly, which could lead to compiler warnings or errors when
reused in the same function or within switch-case blocks due to variable
redefinition or invalid scoping.

To address this, restructure the macro to use an outer for-loop that runs
only once, allowing safe declaration of __sdata without polluting the outer
scope. This ensures compatibility with static analyzers.

No functional changes; this is purely a cleanup to improve macro hygiene.

Signed-off-by: Aditya Kumar Singh <aditya.kumar.singh@oss.qualcomm.com>
Signed-off-by: Maharaja Kennadyrajan <maharaja.kennadyrajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250718060837.59371-2-maharaja.kennadyrajan@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: cfg80211/mac80211: remove wrong scan request n_channels

This (partially) reverts commits
- 838c7b8f1f27 ("wifi: nl80211: Avoid address calculations via out of bounds array indexing")
- f1d3334d604c ("wifi: cfg80211: sme: init n_channels before channels[] access")
- 82bbe02b2500 ("wifi: mac80211: Set n_channels after allocating struct cfg80211_scan_request")

These commits all set the structure to be in an inconsistent
state, setting n_channels to some value before them actually
being filled in. That's fine for what the code does now, but
with the removal of __counted_by() in 444020f4bf06 ("wifi:
cfg80211: remove scan request n_channels counted_by") it's no
longer needed and it does leave a bit of a landmine there
since breaking out of some code to send the scan or something
would leave it wrong.

Remove the now superfluous n_channels settings.

Link: https://patch.msgid.link/20250718103237.59510b2384c5.Ied5ba9c5c49efc008f4491c8ca7a45858a83f064@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

Merge tag 'rtw-next-2025-07-18' of https://github.com/pkshih/rtw

Ping-Ke Shih says:
==================
rtw-next patches for v6.17

Some minor fixes and refinements. Major changes are listed:

rtw89:

- STA+P2P concurrency feature gets implemented.

- add USB architecture and support RTL8851BU and RTL8852BU.
==================

Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: rtlwifi: Use min()/max() to improve code

Use min()/max() to reduce the code and improve its readability.

Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20250715121721.266713-8-rongqianfeng@vivo.com

wifi: rtw89: wow: Add Basic Rate IE to probe request in scheduled scan mode

In scheduled scan mode, the current probe request only includes the SSID
IE, but omits the Basic Rate IE. Some APs do not respond to such
incomplete probe requests, causing net-detect failures. To improve
interoperability and ensure APs respond correctly, add the Basic Rate IE
to the probe request in driver.

Signed-off-by: Chin-Yen Lee <timlee@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20250716122926.6709-1-pkshih@realtek.com

wifi: rtw89: Lower the timeout in rtw89_fwdl_check_path_ready_ax() for USB

When the chip is not powered on correctly (like during driver
development) rtw89_fwdl_check_path_ready_ax() can fail.
read_poll_timeout_atomic() with a delay of 1 µs and a timeout of
400000 µs can take 50 seconds with USB because of the time it takes to
send a USB control message. The firmware upload is tried 5 times, so
in total it takes 250 seconds.

Lower the timeout to 3200 for USB in order to reduce the time
rtw89_fwdl_check_path_ready_ax() takes from 50 seconds to less than 1
second.

Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/af0b25d0-ea67-455e-91f2-8e4c18ae4328@gmail.com

wifi: rtw89: Lower the timeout in rtw89_fw_read_c2h_reg() for USB

This read_poll_timeout_atomic() with a delay of 1 µs and a timeout of
1000000 µs can take ~250 seconds in the worst case because sending a
USB control message takes ~250 µs.

Lower the timeout to 4000 for USB in order to reduce the maximum polling
time to ~1 second.

This problem was observed with RTL8851BU while suspending to RAM with
WOWLAN enabled. The computer sat for 4 minutes with a black screen
before suspending.

Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/09313da6-c865-4e91-b758-4cb38a878796@gmail.com

wifi: rtw89: check path range before using in rtw89_fw_h2c_rf_ps_info()

The variable 'path' from rtw89_phy_get_syn_sel() as index of array could
be 3, but array size is 2. Fortunately, current chip->rf_path_num is
smaller or equal to 2, so it is safe. To prevent mistakes in the future,
add a checking and avoid Coverity warnings.

Addresses-Coverity-ID: linux-next: 1644716 ("Out-of-bounds write")
Addresses-Coverity-ID: linux-next: 1644717 ("Out-of-bounds write")

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20250715035259.45061-6-pkshih@realtek.com

wifi: rtw89: purge obsoleted scan events with software sequence number

The queued and obsoleted scan events can be wrongly treated as events of
new scan request, causing unexpected scan result. Attach a software
sequence number to scan request and its corresponding events. When a new
scan request is acknowledged by firmware, purge the scan events if its
sequence number is not belong to current request.

Normal case:

   mac80211                   event work        event BH
   -------------              ----------        --------
   scan req #1 ---->o
                    |
               <----o  <...........................o
                                  o
                                  |
       <--------------------------+
         ieee80211_scan_completed()

Abnormal case (late event work):

   mac80211                   event work        event BH
   -------------              ----------        --------
   scan req #1 ---->o
                    |
               <----o  <...........................o
                                  o #1

   scan cancel #2 ->o
                    |
               <----o  <...........................o
                                  o #2
                                  | (patch to avoid this)
   scan req #3 ---->o             |
                    |             |
               <----o  <..........|................o
                                  | o #3
       <--------------------------+
         ieee80211_scan_completed()

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20250715035259.45061-5-pkshih@realtek.com

wifi: rtw89: dynamically update EHT preamble puncturing

When the 'Disabled Subchannel Bitmap' within the EHT Operation
element is changed, mac80211 parse and pass it to the driver.
The driver is then updated with this puncturing bitmap to
optimize bandwidth usage and prevent interference from
degrading performance across the entire channel.

Signed-off-by: Kuan-Chung Chen <damon.chen@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20250715035259.45061-4-pkshih@realtek.com

wifi: rtw89: mac: reduce PPDU status length for WiFi 6 chips

Since the RX counter in the PPDU status is not used,
it is disabled to reduce the waste of DLE quota.

Signed-off-by: Chia-Yuan Li <leo.li@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20250715035259.45061-3-pkshih@realtek.com

wifi: rtw89: trigger TX stuck if FIFO full

In order for the situation where the dispatcher blocking
causes HAXIDMA to be unable to TX to be reported as
a TX stuck, so that subsequent recovery can be handled.

Signed-off-by: Chia-Yuan Li <leo.li@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20250715035259.45061-2-pkshih@realtek.com

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR (net-6.16-rc7).

Conflicts:

Documentation/netlink/specs/ovpn.yaml
  880d43ca9aa4 ("netlink: specs: clean up spaces in brackets")
  af52020fc599 ("ovpn: reject unexpected netlink attributes")

drivers/net/phy/phy_device.c
  a44312d58e78 ("net: phy: Don't register LEDs for genphy")
  f0f2b992d818 ("net: phy: Don't register LEDs for genphy")
https://lore.kernel.org/20250710114926.7ec3a64f@kernel.org

drivers/net/wireless/intel/iwlwifi/fw/regulatory.c
drivers/net/wireless/intel/iwlwifi/mld/regulatory.c
  5fde0fcbd760 ("wifi: iwlwifi: mask reserved bits in chan_state_active_bitmap")
  ea045a0de3b9 ("wifi: iwlwifi: add support for accepting raw DSM tables by firmware")

net/ipv6/mcast.c
  ae3264a25a46 ("ipv6: mcast: Delay put pmc->idev in mld_del_delrec()")
  a8594c956cc9 ("ipv6: mcast: Avoid a duplicate pointer check in mld_del_delrec()")
https://lore.kernel.org/8cc52891-3653-4b03-a45e-05464fe495cf@kernel.org

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'net-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Including fixes from Bluetooth, CAN, WiFi and Netfilter.

  More code here than I would have liked. That said, better now than
  next week. Nothing particularly scary stands out. The improvement to
  the OpenVPN input validation is a bit large but better get them in
  before the code makes it to a final release. Some of the changes we
  got from sub-trees could have been split better between the fix and
  -next refactoring, IMHO, that has been communicated.

  We have one known regression in a TI AM65 board not getting link. The
  investigation is going a bit slow, a number of people are on vacation.
  We'll try to wrap it up, but don't think it should hold up the
  release.

  Current release - fix to a fix:

   - Bluetooth: L2CAP: fix attempting to adjust outgoing MTU, it broke
     some headphones and speakers

  Current release - regressions:

   - wifi: ath12k: fix packets received in WBM error ring with REO LUT
     enabled, fix Rx performance regression

   - wifi: iwlwifi:
       - fix crash due to a botched indexing conversion
       - mask reserved bits in chan_state_active_bitmap, avoid FW assert()

  Current release - new code bugs:

   - nf_conntrack: fix crash due to removal of uninitialised entry

   - eth: airoha: fix potential UaF in airoha_npu_get()

  Previous releases - regressions:

   - net: fix segmentation after TCP/UDP fraglist GRO

   - af_packet: fix the SO_SNDTIMEO constraint not taking effect and a
     potential soft lockup waiting for a completion

   - rpl: fix UaF in rpl_do_srh_inline() for sneaky skb geometry

   - virtio-net: fix recursive rtnl_lock() during probe()

   - eth: stmmac: populate entire system_counterval_t in get_time_fn()

   - eth: libwx: fix a number of crashes in the driver Rx path

   - hv_netvsc: prevent IPv6 addrconf after IFF_SLAVE lost that meaning

  Previous releases - always broken:

   - mptcp: fix races in handling connection fallback to pure TCP

   - rxrpc: assorted error handling and race fixes

   - sched: another batch of "security" fixes for qdiscs (QFQ, HTB)

   - tls: always refresh the queue when reading sock, avoid UaF

   - phy: don't register LEDs for genphy, avoid deadlock

   - Bluetooth: btintel: check if controller is ISO capable on
     btintel_classify_pkt_type(), work around FW returning incorrect
     capabilities

  Misc:

   - make OpenVPN Netlink input checking more strict before it makes it
     to a final release

   - wifi: cfg80211: remove scan request n_channels __counted_by, it's
     only yielding false positives"

* tag 'net-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
  rxrpc: Fix to use conn aborts for conn-wide failures
  rxrpc: Fix transmission of an abort in response to an abort
  rxrpc: Fix notification vs call-release vs recvmsg
  rxrpc: Fix recv-recv race of completed call
  rxrpc: Fix irq-disabled in local_bh_enable()
  selftests/tc-testing: Test htb_dequeue_tree with deactivation and row emptying
  net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree
  net: bridge: Do not offload IGMP/MLD messages
  selftests: Add test cases for vlan_filter modification during runtime
  net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime
  tls: always refresh the queue when reading sock
  virtio-net: fix recursived rtnl_lock() during probe()
  net/mlx5: Update the list of the PCI supported devices
  hv_netvsc: Set VF priv_flags to IFF_NO_ADDRCONF before open to prevent IPv6 addrconf
  phonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept()
  Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU
  netfilter: nf_conntrack: fix crash due to removal of uninitialised entry
  net: fix segmentation after TCP/UDP fraglist GRO
  ipv6: mcast: Delay put pmc->idev in mld_del_delrec()
  net: airoha: fix potential use-after-free in airoha_npu_get()
  ...

Merge tag 'pm-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"These address three issues introduced during the current development
  cycle and related to system suspend and hibernation, one triggering
  when asynchronous suspend of devices fails, one possibly affecting
  memory management in the core suspend code error path, and one due to
  duplicate filesystems freezing during system suspend:

   - Fix a deadlock that may occur on asynchronous device suspend
     failures due to missing completion updates in error paths (Rafael
     Wysocki)

   - Drop a misplaced pm_restore_gfp_mask() call, which may cause swap
     to be accessed too early if system suspend fails, from
     suspend_devices_and_enter() (Rafael Wysocki)

   - Remove duplicate filesystems_freeze/thaw() calls, which sometimes
     cause systems to be unable to resume, from enter_state() (Zihuan
     Zhang)"

* tag 'pm-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: sleep: Update power.completion for all devices on errors
  PM: suspend: clean up redundant filesystems_freeze/thaw() handling
  PM: suspend: Drop a misplaced pm_restore_gfp_mask() call

Merge tag 'for-net-2025-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

- hci_sync: fix connectable extended advertising when using static random address
- hci_core: fix typos in macros
- hci_core: add missing braces when using macro parameters
- hci_core: replace 'quirks' integer by 'quirk_flags' bitmap
- SMP: If an unallowed command is received consider it a failure
- SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout
- L2CAP: Fix null-ptr-deref in l2cap_sock_resume_cb()
- L2CAP: Fix attempting to adjust outgoing MTU
- btintel: Check if controller is ISO capable on btintel_classify_pkt_type
- btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID

* tag 'for-net-2025-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU
  Bluetooth: btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID
  Bluetooth: hci_dev: replace 'quirks' integer by 'quirk_flags' bitmap
  Bluetooth: hci_core: add missing braces when using macro parameters
  Bluetooth: hci_core: fix typos in macros
  Bluetooth: SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout
  Bluetooth: SMP: If an unallowed command is received consider it a failure
  Bluetooth: btintel: Check if controller is ISO capable on btintel_classify_pkt_type
  Bluetooth: hci_sync: fix connectable extended advertising when using static random address
  Bluetooth: Fix null-ptr-deref in l2cap_sock_resume_cb()
====================

Link: https://patch.msgid.link/20250717142849.537425-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'rxrpc-miscellaneous-fixes'

David Howells says:

====================
rxrpc: Miscellaneous fixes

Here are some fixes for rxrpc:

(1) Fix the calling of IP routing code with IRQs disabled.

(2) Fix a recvmsg/recvmsg race when the first completes a call.

(3) Fix a race between notification, recvmsg and sendmsg releasing a call.

(4) Fix abort of abort.

(5) Fix call-level aborts that should be connection-level aborts.
====================

Link: https://patch.msgid.link/20250717074350.3767366-1-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

rxrpc: Fix to use conn aborts for conn-wide failures

Fix rxrpc to use connection-level aborts for things that affect the whole
connection, such as the service ID not matching a local service.

Fixes: 57af281e5389 ("rxrpc: Tidy up abort generation infrastructure")
Reported-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/20250717074350.3767366-6-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

rxrpc: Fix transmission of an abort in response to an abort

Under some circumstances, such as when a server socket is closing, ABORT
packets will be generated in response to incoming packets. Unfortunately,
this also may include generating aborts in response to incoming aborts -
which may cause a cycle. It appears this may be made possible by giving
the client a multicast address.

Fix this such that rxrpc_reject_packet() will refuse to generate aborts in
response to aborts.

Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com>
cc: LePremierHomme <kwqcheii@proton.me>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/20250717074350.3767366-5-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

rxrpc: Fix notification vs call-release vs recvmsg

When a call is released, rxrpc takes the spinlock and removes it from
->recvmsg_q in an effort to prevent racing recvmsg() invocations from
seeing the same call.  Now, rxrpc_recvmsg() only takes the spinlock when
actually removing a call from the queue; it doesn't, however, take it in
the lead up to that when it checks to see if the queue is empty.  It *does*
hold the socket lock, which prevents a recvmsg/recvmsg race - but this
doesn't prevent sendmsg from ending the call because sendmsg() drops the
socket lock and relies on the call->user_mutex.

Fix this by firstly removing the bit in rxrpc_release_call() that dequeues
the released call and, instead, rely on recvmsg() to simply discard
released calls (done in a preceding fix).

Secondly, rxrpc_notify_socket() is abandoned if the call is already marked
as released rather than trying to be clever by setting both pointers in
call->recvmsg_link to NULL to trick list_empty().  This isn't perfect and
can still race, resulting in a released call on the queue, but recvmsg()
will now clean that up.

Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com>
cc: LePremierHomme <kwqcheii@proton.me>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/20250717074350.3767366-4-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

rxrpc: Fix recv-recv race of completed call

If a call receives an event (such as incoming data), the call gets placed
on the socket's queue and a thread in recvmsg can be awakened to go and
process it.  Once the thread has picked up the call off of the queue,
further events will cause it to be requeued, and once the socket lock is
dropped (recvmsg uses call->user_mutex to allow the socket to be used in
parallel), a second thread can come in and its recvmsg can pop the call off
the socket queue again.

In such a case, the first thread will be receiving stuff from the call and
the second thread will be blocked on call->user_mutex.  The first thread
can, at this point, process both the event that it picked call for and the
event that the second thread picked the call for and may see the call
terminate - in which case the call will be "released", decoupling the call
from the user call ID assigned to it (RXRPC_USER_CALL_ID in the control
message).

The first thread will return okay, but then the second thread will wake up
holding the user_mutex and, if it sees that the call has been released by
the first thread, it will BUG thusly:

kernel BUG at net/rxrpc/recvmsg.c:474!

Fix this by just dequeuing the call and ignoring it if it is seen to be
already released.  We can't tell userspace about it anyway as the user call
ID has become stale.

Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: LePremierHomme <kwqcheii@proton.me>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/20250717074350.3767366-3-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

rxrpc: Fix irq-disabled in local_bh_enable()

The rxrpc_assess_MTU_size() function calls down into the IP layer to find
out the MTU size for a route.  When accepting an incoming call, this is
called from rxrpc_new_incoming_call() which holds interrupts disabled
across the code that calls down to it.  Unfortunately, the IP layer uses
local_bh_enable() which, config dependent, throws a warning if IRQs are
enabled:

WARNING: CPU: 1 PID: 5544 at kernel/softirq.c:387 __local_bh_enable_ip+0x43/0xd0
...
RIP: 0010:__local_bh_enable_ip+0x43/0xd0
...
Call Trace:
<TASK>
rt_cache_route+0x7e/0xa0
rt_set_nexthop.isra.0+0x3b3/0x3f0
__mkroute_output+0x43a/0x460
ip_route_output_key_hash+0xf7/0x140
ip_route_output_flow+0x1b/0x90
rxrpc_assess_MTU_size.isra.0+0x2a0/0x590
rxrpc_new_incoming_peer+0x46/0x120
rxrpc_alloc_incoming_call+0x1b1/0x400
rxrpc_new_incoming_call+0x1da/0x5e0
rxrpc_input_packet+0x827/0x900
rxrpc_io_thread+0x403/0xb60
kthread+0x2f7/0x310
ret_from_fork+0x2a/0x230
ret_from_fork_asm+0x1a/0x30
...
hardirqs last  enabled at (23): _raw_spin_unlock_irq+0x24/0x50
hardirqs last disabled at (24): _raw_read_lock_irq+0x17/0x70
softirqs last  enabled at (0): copy_process+0xc61/0x2730
softirqs last disabled at (25): rt_add_uncached_list+0x3c/0x90

Fix this by moving the call to rxrpc_assess_MTU_size() out of
rxrpc_init_peer() and further up the stack where it can be done without
interrupts disabled.

It shouldn't be a problem for rxrpc_new_incoming_call() to do it after the
locks are dropped as pmtud is going to be performed by the I/O thread - and
we're in the I/O thread at this point.

Fixes: a2ea9a907260 ("rxrpc: Use irq-disabling spinlocks between app and I/O thread")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com>
cc: LePremierHomme <kwqcheii@proton.me>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/20250717074350.3767366-2-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests/tc-testing: Test htb_dequeue_tree with deactivation and row emptying

Ensure that any deactivation and row emptying that occurs
during htb_dequeue_tree does not cause a kernel panic.
This scenario originally triggered a kernel BUG_ON, and
we are checking for a graceful fail now.

Signed-off-by: William Liu <will@willsroot.io>
Signed-off-by: Savino Dicanosa <savy@syst3mfailure.io>
Link: https://patch.msgid.link/20250717022912.221426-1-will@willsroot.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree

htb_lookup_leaf has a BUG_ON that can trigger with the following:

tc qdisc del dev lo root
tc qdisc add dev lo root handle 1: htb default 1
tc class add dev lo parent 1: classid 1:1 htb rate 64bit
tc qdisc add dev lo parent 1:1 handle 2: netem
tc qdisc add dev lo parent 2:1 handle 3: blackhole
ping -I lo -c1 -W0.001 127.0.0.1

The root cause is the following:

1. htb_dequeue calls htb_dequeue_tree which calls the dequeue handler on
   the selected leaf qdisc
2. netem_dequeue calls enqueue on the child qdisc
3. blackhole_enqueue drops the packet and returns a value that is not
   just NET_XMIT_SUCCESS
4. Because of this, netem_dequeue calls qdisc_tree_reduce_backlog, and
   since qlen is now 0, it calls htb_qlen_notify -> htb_deactivate ->
   htb_deactiviate_prios -> htb_remove_class_from_row -> htb_safe_rb_erase
5. As this is the only class in the selected hprio rbtree,
   __rb_change_child in __rb_erase_augmented sets the rb_root pointer to
   NULL
6. Because blackhole_dequeue returns NULL, netem_dequeue returns NULL,
   which causes htb_dequeue_tree to call htb_lookup_leaf with the same
   hprio rbtree, and fail the BUG_ON

The function graph for this scenario is shown here:
0)               |  htb_enqueue() {
0) + 13.635 us   |    netem_enqueue();
0)   4.719 us    |    htb_activate_prios();
0) # 2249.199 us |  }
0)               |  htb_dequeue() {
0)   2.355 us    |    htb_lookup_leaf();
0)               |    netem_dequeue() {
0) + 11.061 us   |      blackhole_enqueue();
0)               |      qdisc_tree_reduce_backlog() {
0)               |        qdisc_lookup_rcu() {
0)   1.873 us    |          qdisc_match_from_root();
0)   6.292 us    |        }
0)   1.894 us    |        htb_search();
0)               |        htb_qlen_notify() {
0)   2.655 us    |          htb_deactivate_prios();
0)   6.933 us    |        }
0) + 25.227 us   |      }
0)   1.983 us    |      blackhole_dequeue();
0) + 86.553 us   |    }
0) # 2932.761 us |    qdisc_warn_nonwc();
0)               |    htb_lookup_leaf() {
0)               |      BUG_ON();
------------------------------------------

The full original bug report can be seen here [1].

We can fix this just by returning NULL instead of the BUG_ON,
as htb_dequeue_tree returns NULL when htb_lookup_leaf returns
NULL.

[1] https://lore.kernel.org/netdev/pF5XOOIim0IuEfhI-SOxTgRvNoDwuux7UHKnE_Y5-zVd4wmGvNk2ceHjKb8ORnzw0cGwfmVu42g9dL7XyJLf1NEzaztboTWcm0Ogxuojoeo=@willsroot.io/

Fixes: 512bb43eb542 ("pkt_sched: sch_htb: Optimize WARN_ONs in htb_dequeue_tree() etc.")
Signed-off-by: William Liu <will@willsroot.io>
Signed-off-by: Savino Dicanosa <savy@syst3mfailure.io>
Link: https://patch.msgid.link/20250717022816.221364-1-will@willsroot.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: Do not offload IGMP/MLD messages

Do not offload IGMP/MLD messages as it could lead to IGMP/MLD Reports
being unintentionally flooded to Hosts. Instead, let the bridge decide
where to send these IGMP/MLD messages.

Consider the case where the local host is sending out reports in response
to a remote querier like the following:

       mcast-listener-process (IP_ADD_MEMBERSHIP)
          \
          br0
         /   \
      swp1   swp2
        |     |
  QUERIER     SOME-OTHER-HOST

In the above setup, br0 will want to br_forward() reports for
mcast-listener-process's group(s) via swp1 to QUERIER; but since the
source hwdom is 0, the report is eligible for tx offloading, and is
flooded by hardware to both swp1 and swp2, reaching SOME-OTHER-HOST as
well. (Example and illustration provided by Tobias.)

Fixes: 472111920f1c ("net: bridge: switchdev: allow the TX data plane forwarding to be offloaded")
Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250716153551.1830255-1-Joseph.Huang@garmin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-vlan-fix-vlan-0-refcount-imbalance-of-toggling-filtering-during-runtime'

Dong Chenchen says:

====================
net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime

Fix VLAN 0 refcount imbalance of toggling filtering during runtime.
====================

Link: https://patch.msgid.link/20250716034504.2285203-1-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: Add test cases for vlan_filter modification during runtime

Add test cases for vlan_filter modification during runtime, which
may triger null-ptr-ref or memory leak of vlan0.

Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Link: https://patch.msgid.link/20250716034504.2285203-3-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime

Assuming the "rx-vlan-filter" feature is enabled on a net device, the
8021q module will automatically add or remove VLAN 0 when the net device
is put administratively up or down, respectively. There are a couple of
problems with the above scheme.

The first problem is a memory leak that can happen if the "rx-vlan-filter"
feature is disabled while the device is running:

# ip link add bond1 up type bond mode 0
# ethtool -K bond1 rx-vlan-filter off
# ip link del dev bond1

When the device is put administratively down the "rx-vlan-filter"
feature is disabled, so the 8021q module will not remove VLAN 0 and the
memory will be leaked [1].

Another problem that can happen is that the kernel can automatically
delete VLAN 0 when the device is put administratively down despite not
adding it when the device was put administratively up since during that
time the "rx-vlan-filter" feature was disabled. null-ptr-unref or
bug_on[2] will be triggered by unregister_vlan_dev() for refcount
imbalance if toggling filtering during runtime:

$ ip link add bond0 type bond mode 0
$ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q
$ ethtool -K bond0 rx-vlan-filter off
$ ifconfig bond0 up
$ ethtool -K bond0 rx-vlan-filter on
$ ifconfig bond0 down
$ ip link del vlan0

Root cause is as below:
step1: add vlan0 for real_dev, such as bond, team.
register_vlan_dev
    vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1
step2: disable vlan filter feature and enable real_dev
step3: change filter from 0 to 1
vlan_device_event
    vlan_filter_push_vids
        ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0
step4: real_dev down
vlan_device_event
    vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0
        vlan_info_rcu_free //free vlan0
step5: delete vlan0
unregister_vlan_dev
    BUG_ON(!vlan_info); //vlan_info is null

Fix both problems by noting in the VLAN info whether VLAN 0 was
automatically added upon NETDEV_UP and based on that decide whether it
should be deleted upon NETDEV_DOWN, regardless of the state of the
"rx-vlan-filter" feature.

[1]
unreferenced object 0xffff8880068e3100 (size 256):
  comm "ip", pid 384, jiffies 4296130254
  hex dump (first 32 bytes):
    00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00  . 0.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace (crc 81ce31fa):
    __kmalloc_cache_noprof+0x2b5/0x340
    vlan_vid_add+0x434/0x940
    vlan_device_event.cold+0x75/0xa8
    notifier_call_chain+0xca/0x150
    __dev_notify_flags+0xe3/0x250
    rtnl_configure_link+0x193/0x260
    rtnl_newlink_create+0x383/0x8e0
    __rtnl_newlink+0x22c/0xa40
    rtnl_newlink+0x627/0xb00
    rtnetlink_rcv_msg+0x6fb/0xb70
    netlink_rcv_skb+0x11f/0x350
    netlink_unicast+0x426/0x710
    netlink_sendmsg+0x75a/0xc20
    __sock_sendmsg+0xc1/0x150
    ____sys_sendmsg+0x5aa/0x7b0
    ___sys_sendmsg+0xfc/0x180

[2]
kernel BUG at net/8021q/vlan.c:99!
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 #61 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1))
RSP: 0018:ffff88810badf310 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80
R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000
R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e
FS:  00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0
Call Trace:
<TASK>
rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6945)
netlink_rcv_skb (net/netlink/af_netlink.c:2535)
netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339)
netlink_sendmsg (net/netlink/af_netlink.c:1883)
____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566)
___sys_sendmsg (net/socket.c:2622)
__sys_sendmsg (net/socket.c:2652)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)

Fixes: ad1afb003939 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250716034504.2285203-2-dongchenchen2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'ovpn-net-20250716' of https://github.com/OpenVPN/ovpn-net-next

Antonio Quartulli says:

====================
This bugfix batch includes the following changes:
* properly propagate sk mark to skb->mark field
* reject unexpected incoming netlink attributes
* reset GSO state when moving skb from transport to tunnel layer

* tag 'ovpn-net-20250716' of https://github.com/OpenVPN/ovpn-net-next:
  ovpn: reset GSO metadata after decapsulation
  ovpn: reject unexpected netlink attributes
  ovpn: propagate socket mark to skb in UDP
====================

Link: https://patch.msgid.link/20250716115443.16763-1-antonio@openvpn.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tls: always refresh the queue when reading sock

After recent changes in net-next TCP compacts skbs much more
aggressively. This unearthed a bug in TLS where we may try
to operate on an old skb when checking if all skbs in the
queue have matching decrypt state and geometry.

    BUG: KASAN: slab-use-after-free in tls_strp_check_rcv+0x898/0x9a0 [tls]
    (net/tls/tls_strp.c:436 net/tls/tls_strp.c:530 net/tls/tls_strp.c:544)
    Read of size 4 at addr ffff888013085750 by task tls/13529

    CPU: 2 UID: 0 PID: 13529 Comm: tls Not tainted 6.16.0-rc5-virtme
    Call Trace:
     kasan_report+0xca/0x100
     tls_strp_check_rcv+0x898/0x9a0 [tls]
     tls_rx_rec_wait+0x2c9/0x8d0 [tls]
     tls_sw_recvmsg+0x40f/0x1aa0 [tls]
     inet_recvmsg+0x1c3/0x1f0

Always reload the queue, fast path is to have the record in the queue
when we wake, anyway (IOW the path going down "if !strp->stm.full_len").

Fixes: 0d87bbd39d7f ("tls: strp: make sure the TCP skbs do not have overlapping data")
Link: https://patch.msgid.link/20250716143850.1520292-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

virtio-net: fix recursived rtnl_lock() during probe()

The deadlock appears in a stack trace like:

  virtnet_probe()
    rtnl_lock()
    virtio_config_changed_work()
      netdev_notify_peers()
        rtnl_lock()

It happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the
virtio-net driver is still probing.

The config_work in probe() will get scheduled until virtnet_open() enables
the config change notification via virtio_config_driver_enable().

Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down")
Signed-off-by: Zigit Zo <zuozhijie@bytedance.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20250716115717.1472430-1-zuozhijie@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Update the list of the PCI supported devices

Add the upcoming ConnectX-10 device ID to the table of supported
PCI device IDs.

Cc: stable@vger.kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1752650969-148501-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

hv_netvsc: Set VF priv_flags to IFF_NO_ADDRCONF before open to prevent IPv6 addrconf

Set an additional flag IFF_NO_ADDRCONF to prevent ipv6 addrconf.

Commit under Fixes added a new flag change that was not made
to hv_netvsc resulting in the VF being assinged an IPv6.

Fixes: 8a321cf7becc ("net: add IFF_NO_ADDRCONF and use it in bonding to prevent ipv6 addrconf")
Suggested-by: Cathy Avery <cavery@redhat.com>
Signed-off-by: Li Tian <litian@redhat.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20250716002607.4927-1-litian@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

phonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept()

A new warning in clang [1] points out a place in pep_sock_accept() where
dst is uninitialized then passed as a const pointer to pep_find_pipe():

  net/phonet/pep.c:829:37: error: variable 'dst' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
    829 |         newsk = pep_find_pipe(&pn->hlist, &dst, pipe_handle);
        |                                            ^~~:

Move the call to pn_skb_get_dst_sockaddr(), which initializes dst, to
before the call to pep_find_pipe(), so that dst is consistently used
initialized throughout the function.

Cc: stable@vger.kernel.org
Fixes: f7ae8d59f661 ("Phonet: allocate sock from accept syscall rather than soft IRQ")
Link: https://github.com/llvm/llvm-project/commit/00dacf8c22f065cb52efb14cd091d441f19b319e
Closes: https://github.com/ClangBuiltLinux/linux/issues/2101
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20250715-net-phonet-fix-uninit-const-pointer-v1-1-8efd1bd188b3@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU

Configuration request only configure the incoming direction of the peer
initiating the request, so using the MTU is the other direction shall
not be used, that said the spec allows the peer responding to adjust:

Bluetooth Core 6.1, Vol 3, Part A, Section 4.5

'Each configuration parameter value (if any is present) in an
L2CAP_CONFIGURATION_RSP packet reflects an ‘adjustment’ to a
configuration parameter value that has been sent (or, in case of
default values, implied) in the corresponding
L2CAP_CONFIGURATION_REQ packet.'

That said adjusting the MTU in the response shall be limited to ERTM
channels only as for older modes the remote stack may not be able to
detect the adjustment causing it to silently drop packets.

Link: https://github.com/bluez/bluez/issues/1422
Link: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/149
Link: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4793
Fixes: 042bb9603c44 ("Bluetooth: L2CAP: Fix L2CAP MTU negotiation")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>

Merge branch 'ppp-replace-per-cpu-recursion-counter-with-lock-owner-field'

Sebastian Andrzej Siewior says:

====================
ppp: Replace per-CPU recursion counter with lock-owner field

This is another approach to avoid relying on local_bh_disable() for
locking of per-CPU in ppp.

I redid it with the per-CPU lock and local_lock_nested_bh() as discussed
in v1. The xmit_recursion counter has been removed since it served the
same purpose as the owner field. Both were updated and checked.

The xmit_recursion looks like a counter in ppp_channel_push() but at
this point, the counter should always be 0 so it always serves as a
boolean. Therefore I removed it.

I do admit that this looks easier to review.

v2 https://lore.kernel.org/all/20250710162403.402739-1-bigeasy@linutronix.de/
v1 https://lore.kernel.org/all/20250627105013.Qtv54bEk@linutronix.de/
====================

Link: https://patch.msgid.link/20250715150806.700536-1-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

ppp: Replace per-CPU recursion counter with lock-owner field

The per-CPU variable ppp::xmit_recursion is protecting against recursion
due to wrong configuration of the ppp unit. The per-CPU variable
relies on disabled BH for its locking. Without per-CPU locking in
local_bh_disable() on PREEMPT_RT this data structure requires explicit
locking.

The ppp::xmit_recursion is used as a per-CPU boolean. The counter is
checked early in the send routing and the transmit path is only entered
if the counter is zero. Then the counter is incremented to avoid
recursion. It used to detect recursion on channel::downl and
ppp::wlock.

Create a struct ppp_xmit_recursion and move the counter into it.
Add local_lock_t to the struct and use local_lock_nested_bh() for
locking. Due to possible nesting, the lock cannot be acquired
unconditionally but it requires an owner field to identify recursion
before attempting to acquire the lock.

The counter is incremented and checked only after the lock is acquired.
Since it functions as a boolean rather than a count, and its role is now
superseded by the owner field, it can be safely removed.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250715150806.700536-2-bigeasy@linutronix.de
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'dpll-zl3073x-add-misc-features'

Ivan Vecera says:

====================
dpll: zl3073x: Add misc features

Add several new features missing in initial submission:

* Embedded sync for both pin types
* Phase offset reporting for connected input pin
* Selectable phase offset monitoring (aka all inputs phase monitor)
* Phase adjustments for both pin types
* Fractional frequency offset reporting for input pins

Everything was tested on Microchip EVB-LAN9668 EDS2 development board.
====================

Link: https://patch.msgid.link/20250715144633.149156-1-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

dpll: zl3073x: Add support to get fractional frequency offset

Adds support to get fractional frequency offset for input pins. Implement
the appropriate callback and function that periodicaly performs reference
frequency measurement and notifies DPLL core about changes.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Prathosh Satish <prathosh.satish@microchip.com>
Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250715144633.149156-6-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

dpll: zl3073x: Add support to adjust phase

Add support to get/set phase adjustment for both input and output pins.
The phase adjustment is implemented using reference and output phase
offset compensation registers. For input pins the adjustment value can
be arbitrary number but for outputs the value has to be a multiple
of half synthesizer clock cycles.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Prathosh Satish <prathosh.satish@microchip.com>
Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250715144633.149156-5-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

dpll: zl3073x: Implement phase offset monitor feature

Implement phase offset monitor feature to allow a user to monitor
phase offsets across all available inputs.

The device firmware periodically performs phase offsets measurements for
all available DPLL channels and input references. The driver can ask
the firmware to fill appropriate latch registers with measured values.

There are 2 sets of latch registers for phase offsets reporting:

1) DPLL-to-connected-ref: up to 5 registers that contain values for
   phase offset between particular DPLL channel and its connected input
   reference.
2) selected-DPLL-to-ref: 10  registers that contain values for phase
   offsets between selected DPLL channel and all available input
   references.

Both are filled with single read request so the driver can read
DPLL-to-connected-ref phase offset for all DPLL channels at once.
This was implemented in the previous patch.

To read selected-DPLL-to-ref registers for all DPLLs a separate read
request has to be sent to device firmware for each DPLL channel.

To implement phase offset monitor feature:
* Extend zl3073x_ref_phase_offsets_update() to select given DPLL channel
  in phase offset read request. The caller can set channel==-1 if it
  will not read Type2 registers.
* Use this extended function to update phase offset latch registers
  during zl3073x_dpll_changes_check() call if phase monitor is enabled
* Extend zl3073x_dpll_pin_phase_offset_check() to check phase offset
  changes for all available input references
* Extend zl3073x_dpll_input_pin_phase_offset_get() to report phase
  offset values for all available input references
* Implement phase offset monitor callbacks to enable/disable this
  feature

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Prathosh Satish <prathosh.satish@microchip.com>
Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250715144633.149156-4-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

dpll: zl3073x: Add support to get phase offset on connected input pin

Add support to get phase offset for the connected input pin. Implement
the appropriate callback and function that performs DPLL to connected
reference phase error measurement and notifies DPLL core about changes.

The measurement is performed internally by device on background 40 times
per second but the measured value is read each second and compared with
previous value.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Prathosh Satish <prathosh.satish@microchip.com>
Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250715144633.149156-3-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

dpll: zl3073x: Add support to get/set esync on pins

Add support to get/set embedded sync for both input and output pins.
The DPLL is able to lock on input reference when the embedded sync
frequency is 1 PPS and pulse width 25%. The esync on outputs are more
versatille and theoretically supports any esync frequency that divides
current output frequency but for now support the same that supported on
input pins (1 PPS & 25% pulse).

Note that for the output pins the esync divisor shares the same register
used for N-divided signal formats. Due to this the esync cannot be
enabled on outputs configured with one of the N-divided signal formats.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Prathosh Satish <prathosh.satish@microchip.com>
Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250715144633.149156-2-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'wireless-next-2025-07-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Johannes Berg says:

====================
Another set of changes, notably:
- cfg80211: fix double-free introduced earlier
- mac80211: fix RCU iteration in CSA
- iwlwifi: many cleanups (unused FW APIs, PCIe code, WoWLAN)
- mac80211: some work around how FIPS affects wifi, which was
             wrong (RC4 is used by TKIP, not only WEP)
- cfg/mac80211: improvements for unsolicated probe response
                 handling

* tag 'wireless-next-2025-07-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (64 commits)
  wifi: cfg80211: fix double free for link_sinfo in nl80211_station_dump()
  wifi: cfg80211: fix off channel operation allowed check for MLO
  wifi: mac80211: use RCU-safe iteration in ieee80211_csa_finish
  wifi: mac80211_hwsim: Update comments in header
  wifi: mac80211: parse unsolicited broadcast probe response data
  wifi: cfg80211: parse attribute to update unsolicited probe response template
  wifi: mac80211: don't use TPE data from assoc response
  wifi: mac80211: handle WLAN_HT_ACTION_NOTIFY_CHANWIDTH async
  wifi: mac80211: simplify __ieee80211_rx_h_amsdu() loop
  wifi: mac80211: don't mark keys for inactive links as uploaded
  wifi: mac80211: only assign chanctx in reconfig
  wifi: mac80211_hwsim: Declare support for AP scanning
  wifi: mac80211: clean up cipher suite handling
  wifi: mac80211: don't send keys to driver when fips_enabled
  wifi: cfg80211: Fix interface type validation
  wifi: mac80211: remove ieee80211_link_unreserve_chanctx() return value
  wifi: mac80211: don't unreserve never reserved chanctx
  mwl8k: Add missing check after DMA map
  wifi: mac80211: make VHT opmode NSS ignore a debug message
  wifi: iwlwifi: remove support of several iwl_ppag_table_cmd versions
  ...
====================

Link: https://patch.msgid.link/20250717094610.20106-47-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'wireless-2025-07-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Johannes Berg says:

====================
Couple of fixes:
- ath12k performance regression from -rc1
- cfg80211 counted_by() removal for scan request
   as it doesn't match usage and keeps complaining
- iwlwifi crash with certain older devices
- iwlwifi missing an error path unlock
- iwlwifi compatibility with certain BIOS updates

* tag 'wireless-2025-07-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: iwlwifi: Fix botched indexing conversion
  wifi: cfg80211: remove scan request n_channels counted_by
  wifi: ath12k: Fix packets received in WBM error ring with REO LUT enabled
  wifi: iwlwifi: mask reserved bits in chan_state_active_bitmap
  wifi: iwlwifi: pcie: fix locking on invalid TOP reset
====================

Link: https://patch.msgid.link/20250717091831.18787-5-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'nf-25-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following batch contains Netfilter fixes for net:

1) Three patches to enhance conntrack selftests for resize and clash
   resolution, from Florian Westphal.

2) Expand nft_concat_range.sh selftest to improve coverage from error
   path, from Florian Westphal.

3) Hide clash bit to userspace from netlink dumps until there is a
   good reason to expose, from Florian Westphal.

4) Revert notification for device registration/unregistration for
   nftables basechains and flowtables, we decided to go for a better
   way to handle this through the nfnetlink_hook infrastructure which
   will come via nf-next, patch from Phil Sutter.

5) Fix crash in conntrack due to race related to SLAB_TYPESAFE_BY_RCU
   that results in removing a recycled object that is not yet in the
   hashes. Move IPS_CONFIRM setting after the object is in the hashes.
   From Florian Westphal.

netfilter pull request 25-07-17

* tag 'nf-25-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_conntrack: fix crash due to removal of uninitialised entry
  Revert "netfilter: nf_tables: Add notifications for hook changes"
  netfilter: nf_tables: hide clash bit from userspace
  selftests: netfilter: nft_concat_range.sh: send packets to empty set
  selftests: netfilter: conntrack_resize.sh: also use udpclash tool
  selftests: netfilter: add conntrack clash resolution test case
  selftests: netfilter: conntrack_resize.sh: extend resize test
====================

Link: https://patch.msgid.link/20250717095808.41725-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: pcs: xpcs: Use devm_clk_get_optional

Synopsys DesignWare XPCS CSR clock is optional,
so it is better to use devm_clk_get_optional
instead of devm_clk_get.

Signed-off-by: Jack Ping CHNG <jchng@maxlinear.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250715021956.3335631-1-jchng@maxlinear.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux

Tony Nguyen says:

====================
Add RDMA support for Intel IPU E2000 in idpf

Tatyana Nikolova says:

This idpf patch series is the second part of the staged submission for
introducing RDMA RoCEv2 support for the IPU E2000 line of products,
referred to as GEN3.

To support RDMA GEN3 devices, the idpf driver uses common definitions
of the IIDC interface and implements specific device functionality in
iidc_rdma_idpf.h.

The IPU model can host one or more logical network endpoints called
vPorts per PCI function that are flexibly associated with a physical
port or an internal communication port.

Other features as it pertains to GEN3 devices include:
* MMIO learning
* RDMA capability negotiation
* RDMA vectors discovery between idpf and control plane

These patches are split from the submission "Add RDMA support for Intel
IPU E2000 (GEN3)" [1]. The patches have been tested on a range of hosts
and platforms with a variety of general RDMA applications which include
standalone verbs (rping, perftest, etc.), storage and HPC applications.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
[1] https://lore.kernel.org/all/20240724233917.704-1-tatyana.e.nikolova@intel.com/
This idpf patch series is the second part of the staged submission for
introducing RDMA RoCEv2 support for the IPU E2000 line of products,
referred to as GEN3.

To support RDMA GEN3 devices, the idpf driver uses common definitions
of the IIDC interface and implements specific device functionality in
iidc_rdma_idpf.h.

The IPU model can host one or more logical network endpoints called
vPorts per PCI function that are flexibly associated with a physical
port or an internal communication port.

Other features as it pertains to GEN3 devices include:
* MMIO learning
* RDMA capability negotiation
* RDMA vectors discovery between idpf and control plane

These patches are split from the submission "Add RDMA support for Intel
IPU E2000 (GEN3)" [1]. The patches have been tested on a range of hosts
and platforms with a variety of general RDMA applications which include
standalone verbs (rping, perftest, etc.), storage and HPC applications.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
[1] https://lore.kernel.org/all/20240724233917.704-1-tatyana.e.nikolova@intel.com/

IWL reviews:
v3: https://lore.kernel.org/all/20250708210554.1662-1-tatyana.e.nikolova@intel.com/
v2: https://lore.kernel.org/all/20250612220002.1120-1-tatyana.e.nikolova@intel.com/
v1 (split from previous series):
    https://lore.kernel.org/all/20250523170435.668-1-tatyana.e.nikolova@intel.com/

v3: https://lore.kernel.org/all/20250207194931.1569-1-tatyana.e.nikolova@intel.com/
RFC v2: https://lore.kernel.org/all/20240824031924.421-1-tatyana.e.nikolova@intel.com/
RFC: https://lore.kernel.org/all/20240724233917.704-1-tatyana.e.nikolova@intel.com/

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux:
  idpf: implement get LAN MMIO memory regions
  idpf: implement IDC vport aux driver MTU change handler
  idpf: implement remaining IDC RDMA core callbacks and handlers
  idpf: implement RDMA vport auxiliary dev create, init, and destroy
  idpf: implement core RDMA auxiliary dev create, init, and destroy
  idpf: use reserved RDMA vectors from control plane
====================

Link: https://patch.msgid.link/20250714181002.2865694-1-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

netfilter: nf_conntrack: fix crash due to removal of uninitialised entry

A crash in conntrack was reported while trying to unlink the conntrack
entry from the hash bucket list:
    [exception RIP: __nf_ct_delete_from_lists+172]
    [..]
#7 [ff539b5a2b043aa0] nf_ct_delete at ffffffffc124d421 [nf_conntrack]
#8 [ff539b5a2b043ad0] nf_ct_gc_expired at ffffffffc124d999 [nf_conntrack]
#9 [ff539b5a2b043ae0] __nf_conntrack_find_get at ffffffffc124efbc [nf_conntrack]
    [..]

The nf_conn struct is marked as allocated from slab but appears to be in
a partially initialised state:

ct hlist pointer is garbage; looks like the ct hash value
(hence crash).
ct->status is equal to IPS_CONFIRMED|IPS_DYING, which is expected
ct->timeout is 30000 (=30s), which is unexpected.

Everything else looks like normal udp conntrack entry.  If we ignore
ct->status and pretend its 0, the entry matches those that are newly
allocated but not yet inserted into the hash:
  - ct hlist pointers are overloaded and store/cache the raw tuple hash
  - ct->timeout matches the relative time expected for a new udp flow
    rather than the absolute 'jiffies' value.

If it were not for the presence of IPS_CONFIRMED,
__nf_conntrack_find_get() would have skipped the entry.

Theory is that we did hit following race:

cpu x cpu y cpu z
found entry E found entry E
E is expired <preemption>
nf_ct_delete()
return E to rcu slab
init_conntrack
E is re-inited,
ct->status set to 0
reply tuplehash hnnode.pprev
stores hash value.

cpu y found E right before it was deleted on cpu x.
E is now re-inited on cpu z.  cpu y was preempted before
checking for expiry and/or confirm bit.

->refcnt set to 1
E now owned by skb
->timeout set to 30000

If cpu y were to resume now, it would observe E as
expired but would skip E due to missing CONFIRMED bit.

nf_conntrack_confirm gets called
sets: ct->status |= CONFIRMED
This is wrong: E is not yet added
to hashtable.

cpu y resumes, it observes E as expired but CONFIRMED:
<resumes>
nf_ct_expired()
-> yes (ct->timeout is 30s)
confirmed bit set.

cpu y will try to delete E from the hashtable:
nf_ct_delete() -> set DYING bit
__nf_ct_delete_from_lists

Even this scenario doesn't guarantee a crash:
cpu z still holds the table bucket lock(s) so y blocks:

wait for spinlock held by z

CONFIRMED is set but there is no
guarantee ct will be added to hash:
"chaintoolong" or "clash resolution"
logic both skip the insert step.
reply hnnode.pprev still stores the
hash value.

unlocks spinlock
return NF_DROP
<unblocks, then
crashes on hlist_nulls_del_rcu pprev>

In case CPU z does insert the entry into the hashtable, cpu y will unlink
E again right away but no crash occurs.

Without 'cpu y' race, 'garbage' hlist is of no consequence:
ct refcnt remains at 1, eventually skb will be free'd and E gets
destroyed via: nf_conntrack_put -> nf_conntrack_destroy -> nf_ct_destroy.

To resolve this, move the IPS_CONFIRMED assignment after the table
insertion but before the unlock.

Pablo points out that the confirm-bit-store could be reordered to happen
before hlist add resp. the timeout fixup, so switch to set_bit and
before_atomic memory barrier to prevent this.

It doesn't matter if other CPUs can observe a newly inserted entry right
before the CONFIRMED bit was set:

Such event cannot be distinguished from above "E is the old incarnation"
case: the entry will be skipped.

Also change nf_ct_should_gc() to first check the confirmed bit.

The gc sequence is:
1. Check if entry has expired, if not skip to next entry
2. Obtain a reference to the expired entry.
3. Call nf_ct_should_gc() to double-check step 1.

nf_ct_should_gc() is thus called only for entries that already failed an
expiry check. After this patch, once the confirmed bit check passes
ct->timeout has been altered to reflect the absolute 'best before' date
instead of a relative time.  Step 3 will therefore not remove the entry.

Without this change to nf_ct_should_gc() we could still get this sequence:

1. Check if entry has expired.
2. Obtain a reference.
3. Call nf_ct_should_gc() to double-check step 1:
    4 - entry is still observed as expired
    5 - meanwhile, ct->timeout is corrected to absolute value on other CPU
      and confirm bit gets set
    6 - confirm bit is seen
    7 - valid entry is removed again

First do check 6), then 4) so the gc expiry check always picks up either
confirmed bit unset (entry gets skipped) or expiry re-check failure for
re-inited conntrack objects.

This change cannot be backported to releases before 5.19. Without
commit 8a75a2c17410 ("netfilter: conntrack: remove unconfirmed list")
|= IPS_CONFIRMED line cannot be moved without further changes.

Cc: Razvan Cojocaru <rzvncj@gmail.com>
Link: https://lore.kernel.org/netfilter-devel/20250627142758.25664-1-fw@strlen.de/
Link: https://lore.kernel.org/netfilter-devel/4239da15-83ff-4ca4-939d-faef283471bb@gmail.com/
Fixes: 1397af5bfd7d ("netfilter: conntrack: remove the percpu dying list")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

net: fix segmentation after TCP/UDP fraglist GRO

Since "net: gro: use cb instead of skb->network_header", the skb network
header is no longer set in the GRO path.
This breaks fraglist segmentation, which relies on ip_hdr()/tcp_hdr()
to check for address/port changes.
Fix this regression by selectively setting the network header for merged
segment skbs.

Fixes: 186b1ea73ad8 ("net: gro: use cb instead of skb->network_header")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250705150622.10699-1-nbd@nbd.name
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

ipv6: mcast: Delay put pmc->idev in mld_del_delrec()

pmc->idev is still used in ip6_mc_clear_src(), so as mld_clear_delrec()
does, the reference should be put after ip6_mc_clear_src() return.

Fixes: 63ed8de4be81 ("mld: add mc_lock for protecting per-interface mld data")
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Link: https://patch.msgid.link/20250714141957.3301871-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-mlx5e-add-support-for-pcie-congestion-events'

Tariq Toukan says:

====================
net/mlx5e: Add support for PCIe congestion events

Dragos says:

PCIe congestion events are events generated by the firmware when the
device side has sustained PCIe inbound or outbound traffic above
certain thresholds. The high and low threshold are hysteresis thresholds
to prevent flapping: once the high threshold has been reached, a low
threshold event will be triggered only after the bandwidth usage went
below the low threshold.

This series adds support for receiving and exposing such events as
ethtool counters.

2 new pairs of counters are exposed: pci_bw_in/outbound_high/low. These
should help the user understand if the device PCI is under pressure.

Planned followup patches:
- Allow configuration of thresholds through devlink.
- Add ethtool counter for wakeups which did not result in any state
change.
====================

Link: https://patch.msgid.link/1752589821-145787-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Add device PCIe congestion ethtool stats

Implement the PCIe Congestion Event notifier which triggers a work item
to query the PCIe Congestion Event object. The result of the congestion
state is reflected in the new ethtool stats:

* pci_bw_inbound_high: the device has crossed the high threshold for
  inbound PCIe traffic.
* pci_bw_inbound_low: the device has crossed the low threshold for
  inbound PCIe traffic
* pci_bw_outbound_high: the device has crossed the high threshold for
  outbound PCIe traffic.
* pci_bw_outbound_low: the device has crossed the low threshold for
  outbound PCIe traffic

The high and low thresholds are currently configured at 90% and 75%.
These are hysteresis thresholds which help to check if the
PCI bus on the device side is in a congested state.

If low + 1 = high then the device is in a congested state. If low == high
then the device is not in a congested state.

The counters are also documented.

A follow-up patch will make the thresholds configurable.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1752589821-145787-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Create/destroy PCIe Congestion Event object

Add initial infrastructure to create and destroy the PCIe Congestion
Event object if the object is supported.

The verb for the object creation function is "set" instead of
"create" because the function will accommodate the modify operation
as well in a subsequent patch.

The next patches will hook it up to the event handler and will add
actual functionality.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1752589821-145787-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

s390/net: Remove NETIUCV device driver

The netiucv driver creates TCP/IP interfaces over IUCV between Linux
guests on z/VM and other z/VM entities.

Rationale for removal:
- NETIUCV connections are only supported for compatibility with
  earlier versions and not to be used for new network setups,
  since at least Linux kernel 4.0.
- No known active users, use cases, or product dependencies
- The driver is no longer relevant for z/VM networking;
  preferred methods include:
* Device pass-through (e.g., OSA, RoCE)
* z/VM Virtual Switch (VSWITCH)

The IUCV mechanism itself remains supported and is actively used
via AF_IUCV, hvc_iucv, and smsg_iucv.

Signed-off-by: Nagamani PV <nagamani@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250715074210.3999296-1-wintera@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'expose-refclk-for-rmii-and-enable-rmii'

Ryan Wanner says:

====================
Expose REFCLK for RMII and enable RMII

This set allows the REFCLK property to be exposed as a dt-property to
properly reflect the correct RMII layout. RMII can take an external or
internal provided REFCLK, since this is not SoC dependent but board
dependent this must be exposed as a DT property for the macb driver.

This set also enables RMII mode for the SAMA7 SoCs gigabit mac.

v1: https://lore.kernel.org/cover.1750346271.git.Ryan.Wanner@microchip.com
====================

Link: https://patch.msgid.link/cover.1752510727.git.Ryan.Wanner@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: cadence: macb: sama7g5_emac: Remove USARIO CLKEN flag

Remove USARIO_CLKEN flag since this is now a device tree argument and
not fixed to the SoC.

This will instead be selected by the "cdns,refclk-ext"
device tree property.

Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com>
Link: https://patch.msgid.link/1e7a8c324526f631f279925aa8a6aa937d55c796.1752510727.git.Ryan.Wanner@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: cadence: macb: Enable RMII for SAMA7 gem

This macro enables the RMII mode bit in the USRIO register when RMII
mode is requested.

Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com>
Link: https://patch.msgid.link/6698836e4ee7df5f6bee181f0d2e38d4b8e4cec2.1752510727.git.Ryan.Wanner@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: cadence: macb: Expose REFCLK as a device tree property

The RMII and RGMII can both support internal or external provided
REFCLKs 50MHz and 125MHz respectively. Since this is dependent on
the board that the SoC is on this needs to be set via the device tree.

This property flag is checked in the MACB DT node so the REFCLK cap is
configured the correct way for the RMII or RGMII is configured on the
board.

Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com>
Link: https://patch.msgid.link/7f9b65896d6b7b48275bc527b72a16347f8ce10a.1752510727.git.Ryan.Wanner@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

dt-bindings: net: cdns,macb: Add external REFCLK property

REFCLK can be provided by an external source so this should be exposed
by a DT property. The REFCLK is used for RMII and in some SoCs that use
this driver the RGMII 125MHz clk can also be provided by an external
source.

Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/d558467c4d5b27fb3135ffdead800b14cd9c6c0a.1752510727.git.Ryan.Wanner@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'selftest-net-add-selftest-for-netpoll'

Breno Leitao says:

====================
selftest: net: Add selftest for netpoll

I am submitting a new selftest for the netpoll subsystem specifically
targeting the case where the RX is polling in the TX path, which is
a case that we don't have any test in the tree today. This is done when
netpoll_poll_dev() called, and this test creates a scenario when that is
probably.

The test does the following:

1) Configuring a single RX/TX queue to increase contention on the
    interface.
2) Generating background traffic to saturate the network, mimicking
    real-world congestion.
3) Sending netconsole messages to trigger netpoll polling and monitor
    its behavior.
4) Using dynamic netconsole targets via configfs, with the ability to
    delete and recreate targets during the test.
5) Running bpftrace in parallel to verify that netpoll_poll_dev() is
    called when expected. If it is called, then the test passes,
    otherwise the test is marked as skipped.

In order to achieve it, I stole Jakub's bpftrace helper from [1], and
did some small changes that I found useful to use the helper.

So, this patchset basically contains:

1) The code stolen from Jakub
2) Improvements on bpftrace() helper
3) The selftest itself

Link: https://lore.kernel.org/all/20250421222827.283737-22-kuba@kernel.org/
====================

Link: https://patch.msgid.link/20250714-netpoll_test-v7-0-c0220cfaa63e@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: net: add netpoll basic functionality test

Add a basic selftest for the netpoll polling mechanism, specifically
targeting the netpoll poll() side.

The test creates a scenario where network transmission is running at
maximum speed, and netpoll needs to poll the NIC. This is achieved by:

  1. Configuring a single RX/TX queue to create contention
  2. Generating background traffic to saturate the interface
  3. Sending netconsole messages to trigger netpoll polling
  4. Using dynamic netconsole targets via configfs
  5. Delete and create new netconsole targets after some messages
  6. Start a bpftrace in parallel to make sure netpoll_poll_dev() is
     called
  7. If bpftrace exists and netpoll_poll_dev() was called, stop.

The test validates a critical netpoll code path by monitoring traffic
flow and ensuring netpoll_poll_dev() is called when the normal TX path
is blocked.

This addresses a gap in netpoll test coverage for a path that is
tricky for the network stack.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250714-netpoll_test-v7-3-c0220cfaa63e@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: drv-net: Strip '@' prefix from bpftrace map keys

The '@' prefix in bpftrace map keys is specific to bpftrace and can be
safely removed when processing results. This patch modifies the bpftrace
utility to strip the '@' from map keys before storing them in the result
dictionary, making the keys more consistent with Python conventions.

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20250714-netpoll_test-v7-2-c0220cfaa63e@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: drv-net: add helper/wrapper for bpftrace

bpftrace is very useful for low level driver testing. perf or trace-cmd
would also do for collecting data from tracepoints, but they require
much more post-processing.

Add a wrapper for running bpftrace and sanitizing its output.
bpftrace has JSON output, which is great, but it prints loose objects
and in a slightly inconvenient format. We have to read the objects
line by line, and while at it return them indexed by the map name.

Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20250714-netpoll_test-v7-1-c0220cfaa63e@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2025-07-15 (ixgbe, fm10k, i40e, ice)

Arnd Bergmann resolves compile issues with large NR_CPUS for ixgbe, fm10k,
and i40e.

For ice:
Dave adds a NULL check for LAG netdev.

Michal corrects a pointer check in debugfs initialization.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: check correct pointer in fwlog debugfs
  ice: add NULL check in eswitch lag check
  ethernet: intel: fix building with large NR_CPUS
====================

Link: https://patch.msgid.link/20250715202948.3841437-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: airoha: fix potential use-after-free in airoha_npu_get()

np->name was being used after calling of_node_put(np), which
releases the node and can lead to a use-after-free bug.
Previously, of_node_put(np) was called unconditionally after
of_find_device_by_node(np), which could result in a use-after-free if
pdev is NULL.

This patch moves of_node_put(np) after the error check to ensure
the node is only released after both the error and success cases
are handled appropriately, preventing potential resource issues.

Fixes: 23290c7bc190 ("net: airoha: Introduce Airoha NPU support")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250715143102.3458286-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv6: mcast: Simplify mld_clear_{report|query}()

Use __skb_queue_purge() instead of re-implementing it. Note that it uses
kfree_skb_reason() instead of kfree_skb() internally, and pass
SKB_DROP_REASON_QUEUE_PURGE drop reason to the kfree_skb tracepoint.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20250715120709.3941510-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

vsock/test: fix vsock_ioctl_int() check for unsupported ioctl

`vsock_do_ioctl` returns -ENOIOCTLCMD if an ioctl support is not
implemented, like for SIOCINQ before commit f7c722659275 ("vsock: Add
support for SIOCINQ ioctl"). In net/socket.c, -ENOIOCTLCMD is re-mapped
to -ENOTTY for the user space. So, our test suite, without that commit
applied, is failing in this way:

34 - SOCK_STREAM ioctl(SIOCINQ) functionality...ioctl(21531): Inappropriate ioctl for device

Return false in vsock_ioctl_int() to skip the test in this case as well,
instead of failing.

Fixes: 53548d6bffac ("test/vsock: Add retry mechanism to ioctl wrapper")
Cc: niuxuewei.nxw@antgroup.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
Link: https://patch.msgid.link/20250715093233.94108-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tcp: fix UaF in tcp_prune_ofo_queue()

The CI reported a UaF in tcp_prune_ofo_queue():

BUG: KASAN: slab-use-after-free in tcp_prune_ofo_queue+0x55d/0x660
Read of size 4 at addr ffff8880134729d8 by task socat/20348

CPU: 0 UID: 0 PID: 20348 Comm: socat Not tainted 6.16.0-rc5-virtme #1 PREEMPT(full)
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
<TASK>
dump_stack_lvl+0x82/0xd0
print_address_description.constprop.0+0x2c/0x400
print_report+0xb4/0x270
kasan_report+0xca/0x100
tcp_prune_ofo_queue+0x55d/0x660
tcp_try_rmem_schedule+0x855/0x12e0
tcp_data_queue+0x4dd/0x2260
tcp_rcv_established+0x5e8/0x2370
tcp_v4_do_rcv+0x4ba/0x8c0
__release_sock+0x27a/0x390
release_sock+0x53/0x1d0
tcp_sendmsg+0x37/0x50
sock_write_iter+0x3c1/0x520
vfs_write+0xc09/0x1210
ksys_write+0x183/0x1d0
do_syscall_64+0xc1/0x380
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fcf73ef2337
Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
RSP: 002b:00007ffd4f924708 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcf73ef2337
RDX: 0000000000002000 RSI: 0000555f11d1a000 RDI: 0000000000000008
RBP: 0000555f11d1a000 R08: 0000000000002000 R09: 0000000000000000
R10: 0000000000000040 R11: 0000000000000246 R12: 0000000000000008
R13: 0000000000002000 R14: 0000555ee1a44570 R15: 0000000000002000
</TASK>

Allocated by task 20348:
kasan_save_stack+0x24/0x50
kasan_save_track+0x14/0x30
__kasan_slab_alloc+0x59/0x70
kmem_cache_alloc_node_noprof+0x110/0x340
__alloc_skb+0x213/0x2e0
tcp_collapse+0x43f/0xff0
tcp_try_rmem_schedule+0x6b9/0x12e0
tcp_data_queue+0x4dd/0x2260
tcp_rcv_established+0x5e8/0x2370
tcp_v4_do_rcv+0x4ba/0x8c0
__release_sock+0x27a/0x390
release_sock+0x53/0x1d0
tcp_sendmsg+0x37/0x50
sock_write_iter+0x3c1/0x520
vfs_write+0xc09/0x1210
ksys_write+0x183/0x1d0
do_syscall_64+0xc1/0x380
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 20348:
kasan_save_stack+0x24/0x50
kasan_save_track+0x14/0x30
kasan_save_free_info+0x3b/0x60
__kasan_slab_free+0x38/0x50
kmem_cache_free+0x149/0x330
tcp_prune_ofo_queue+0x211/0x660
tcp_try_rmem_schedule+0x855/0x12e0
tcp_data_queue+0x4dd/0x2260
tcp_rcv_established+0x5e8/0x2370
tcp_v4_do_rcv+0x4ba/0x8c0
__release_sock+0x27a/0x390
release_sock+0x53/0x1d0
tcp_sendmsg+0x37/0x50
sock_write_iter+0x3c1/0x520
vfs_write+0xc09/0x1210
ksys_write+0x183/0x1d0
do_syscall_64+0xc1/0x380
entry_SYSCALL_64_after_hwframe+0x77/0x7f

The buggy address belongs to the object at ffff888013472900
which belongs to the cache skbuff_head_cache of size 232
The buggy address is located 216 bytes inside of
freed 232-byte region [ffff888013472900, ffff8880134729e8)

The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x13472
head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x80000000000040(head|node=0|zone=1)
page_type: f5(slab)
raw: 0080000000000040 ffff88800198fb40 ffffea0000347b10 ffffea00004f5290
raw: 0000000000000000 0000000000120012 00000000f5000000 0000000000000000
head: 0080000000000040 ffff88800198fb40 ffffea0000347b10 ffffea00004f5290
head: 0000000000000000 0000000000120012 00000000f5000000 0000000000000000
head: 0080000000000001 ffffea00004d1c81 00000000ffffffff 00000000ffffffff
head: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff888013472880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888013472900: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888013472980: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
^
ffff888013472a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888013472a80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb

Indeed tcp_prune_ofo_queue() is reusing the skb dropped a few lines
above. The caller wants to enqueue 'in_skb', lets check space vs the
latter.

Fixes: 1d2fbaad7cd8 ("tcp: stronger sk_rcvbuf checks")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: syzbot+865aca08c0533171bf6a@syzkaller.appspotmail.com
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/b78d2d9bdccca29021eed9a0e7097dd8dc00f485.1752567053.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Correctly set gso_size when LRO is used

gso_size is expected by the networking stack to be the size of the
payload (thus, not including ethernet/IP/TCP-headers). However, cqe_bcnt
is the full sized frame (including the headers). Dividing cqe_bcnt by
lro_num_seg will then give incorrect results.

For example, running a bpftrace higher up in the TCP-stack
(tcp_event_data_recv), we commonly have gso_size set to 1450 or 1451 even
though in reality the payload was only 1448 bytes.

This can have unintended consequences:
- In tcp_measure_rcv_mss() len will be for example 1450, but. rcv_mss
will be 1448 (because tp->advmss is 1448). Thus, we will always
recompute scaling_ratio each time an LRO-packet is received.
- In tcp_gro_receive(), it will interfere with the decision whether or
not to flush and thus potentially result in less gro'ed packets.

So, we need to discount the protocol headers from cqe_bcnt so we can
actually divide the payload by lro_num_seg to get the real gso_size.

v2:
- Use "(unsigned char *)tcp + tcp->doff * 4 - skb->data)" to compute header-len
(Tariq Toukan <tariqt@nvidia.com>)
- Improve commit-message (Gal Pressman <gal@nvidia.com>)

Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Christoph Paasch <cpaasch@openai.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250715-cpaasch-pf-925-investigate-incorrect-gso_size-on-cx-7-nic-v2-1-e06c3475f3ac@openai.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: packetdrill: correct the expected timing in tcp_rcv_big_endseq

Commit f5fda1a86884 ("selftests/net: packetdrill: add tcp_rcv_big_endseq.pkt")
added this test recently, but it's failing with:

  # tcp_rcv_big_endseq.pkt:41: error handling packet: timing error: expected outbound packet at 1.230105 sec but happened at 1.190101 sec; tolerance 0.005046 sec
  # script packet:  1.230105 . 1:1(0) ack 54001 win 0
  # actual packet:  1.190101 . 1:1(0) ack 54001 win 0

It's unclear why the test expects the ack to be delayed.
Correct it.

Link: https://patch.msgid.link/20250715142849.959444-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ethtool: Don't check for RXFH fields conflict when no input_xfrm is requested

The requirement of ->get_rxfh_fields() in ethtool_set_rxfh() is there to
verify that we have no conflict of input_xfrm with the RSS fields
options, there is no point in doing it if input_xfrm is not
supported/requested.

This is under the assumption that a driver that supports input_xfrm will
also support ->get_rxfh_fields(), so add a WARN_ON() to
ethtool_check_ops() to verify it, and remove the op NULL check.

This fixes the following error in mlx4_en, which doesn't support
getting/setting RXFH fields.
$ ethtool --set-rxfh-indir eth2 hfunc xor
Cannot set RX flow hash configuration: Operation not supported

Fixes: 72792461c8e8 ("net: ethtool: don't mux RXFH via rxnfc callbacks")
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250715140754.489677-1-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: rtnetlink: fix addrlft test flakiness on power-saving systems

Jakub reported that the rtnetlink test for the preferred lifetime of an
address has become quite flaky. The issue started appearing around the 6.16
merge window in May, and the test fails with:

FAIL: preferred_lft addresses remaining

The flakiness might be related to power-saving behavior, as address
expiration is handled by a "power-efficient" workqueue.

To address this, use slowwait to check more frequently whether the address
still exists. This reduces the likelihood of the system entering a low-power
state during the test, improving reliability.

Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20250715043459.110523-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'probes-fixes-v6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fix from Masami Hiramatsu:

- fprobe-event: The @params variable was being used in an error path
without being initialized. The fix to return an error code.

* tag 'probes-fixes-v6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/probes: Avoid using params uninitialized in parse_btf_arg()

Bluetooth: btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID

For GF variant of WCN6855 without board ID programmed
btusb_generate_qca_nvm_name() will chose wrong NVM
'qca/nvm_usb_00130201.bin' to download.

Fix by choosing right NVM 'qca/nvm_usb_00130201_gf.bin'.
Also simplify NVM choice logic of btusb_generate_qca_nvm_name().

Fixes: d6cba4e6d0e2 ("Bluetooth: btusb: Add support using different nvm for variant WCN6855 controller")
Signed-off-by: Zijun Hu <zijun.hu@oss.qualcomm.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>

Bluetooth: hci_dev: replace 'quirks' integer by 'quirk_flags' bitmap

The 'quirks' member already ran out of bits on some platforms some time
ago. Replace the integer member by a bitmap in order to have enough bits
in future. Replace raw bit operations by accessor macros.

Fixes: ff26b2dd6568 ("Bluetooth: Add quirk for broken READ_VOICE_SETTING")
Fixes: 127881334eaa ("Bluetooth: Add quirk for broken READ_PAGE_SCAN_TYPE")
Suggested-by: Pauli Virtanen <pav@iki.fi>
Tested-by: Ivan Pravdin <ipravdin.official@gmail.com>
Signed-off-by: Kiran K <kiran.k@intel.com>
Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>

Bluetooth: hci_core: add missing braces when using macro parameters

Macro parameters should always be put into braces when accessing it.

Fixes: 4fc9857ab8c6 ("Bluetooth: hci_sync: Add check simultaneous roles support")
Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>

Bluetooth: hci_core: fix typos in macros

The provided macro parameter is named 'dev' (rather than 'hdev', which
may be a variable on the stack where the macro is used).

Fixes: a9a830a676a9 ("Bluetooth: hci_event: Fix sending HCI_OP_READ_ENC_KEY_SIZE")
Fixes: 6126ffabba6b ("Bluetooth: Introduce HCI_CONN_FLAG_DEVICE_PRIVACY device flag")
Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>

Bluetooth: SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout

This replaces the usage of HCI_ERROR_REMOTE_USER_TERM, which as the name
suggest is to indicate a regular disconnection initiated by an user,
with HCI_ERROR_AUTH_FAILURE to indicate the session has timeout thus any
pairing shall be considered as failed.

Fixes: 1e91c29eb60c ("Bluetooth: Use hci_disconnect for immediate disconnection from SMP")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>

Bluetooth: SMP: If an unallowed command is received consider it a failure

If a command is received while a bonding is ongoing consider it a
pairing failure so the session is cleanup properly and the device is
disconnected immediately instead of continuing with other commands that
may result in the session to get stuck without ever completing such as
the case bellow:

> ACL Data RX: Handle 2048 flags 0x02 dlen 21
      SMP: Identity Information (0x08) len 16
        Identity resolving key[16]: d7e08edef97d3e62cd2331f82d8073b0
> ACL Data RX: Handle 2048 flags 0x02 dlen 21
      SMP: Signing Information (0x0a) len 16
        Signature key[16]: 1716c536f94e843a9aea8b13ffde477d
Bluetooth: hci0: unexpected SMP command 0x0a from XX:XX:XX:XX:XX:XX
> ACL Data RX: Handle 2048 flags 0x02 dlen 12
      SMP: Identity Address Information (0x09) len 7
        Address: XX:XX:XX:XX:XX:XX (Intel Corporate)

While accourding to core spec 6.1 the expected order is always BD_ADDR
first first then CSRK:

When using LE legacy pairing, the keys shall be distributed in the
following order:

    LTK by the Peripheral

    EDIV and Rand by the Peripheral

    IRK by the Peripheral

    BD_ADDR by the Peripheral

    CSRK by the Peripheral

    LTK by the Central

    EDIV and Rand by the Central

    IRK by the Central

    BD_ADDR by the Central

    CSRK by the Central

When using LE Secure Connections, the keys shall be distributed in the
following order:

    IRK by the Peripheral

    BD_ADDR by the Peripheral

    CSRK by the Peripheral

    IRK by the Central

    BD_ADDR by the Central

    CSRK by the Central

According to the Core 6.1 for commands used for key distribution "Key
Rejected" can be used:

  '3.6.1. Key distribution and generation

  A device may reject a distributed key by sending the Pairing Failed command
  with the reason set to "Key Rejected".

Fixes: b28b4943660f ("Bluetooth: Add strict checks for allowed SMP PDUs")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>

Bluetooth: btintel: Check if controller is ISO capable on btintel_classify_pkt_type

Due to what seem to be a bug with variant version returned by some
firmwares the code may set hdev->classify_pkt_type with
btintel_classify_pkt_type when in fact the controller doesn't even
support ISO channels feature but may use the handle range expected from
a controllers that does causing the packets to be reclassified as ISO
causing several bugs.

To fix the above btintel_classify_pkt_type will attempt to check if the
controller really supports ISO channels and in case it doesn't don't
reclassify even if the handle range is considered to be ISO, this is
considered safer than trying to fix the specific controller/firmware
version as that could change over time and causing similar problems in
the future.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=219553
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2100565
Link: https://github.com/StarLabsLtd/firmware/issues/180
Fixes: f25b7fd36cc3 ("Bluetooth: Add vendor-specific packet classification for ISO data")
Cc: stable@vger.kernel.org
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Sean Rhodes <sean@starlabs.systems>