====================
net: atlantic: add PTP support for AQC113 (Antigua)
This series adds IEEE 1588 PTP support for the AQC113 (Antigua) network
controller. AQC113 is the successor to the existing AQC107 (Atlantic)
chip already supported by the atlantic driver.
AQC113 uses a substantially different hardware architecture for PTP
compared to AQC107:
- Dual on-chip TSG clocks with direct register access instead of
PHY-based timestamping via firmware
- TX timestamps via descriptor writeback instead of firmware mailbox
- Hardware L3/L4 RX filters for PTP multicast steering with both
IPv4 and IPv6 support
- Reference-counted shared filter slots managed through an Action
Resolver Table (ART), allowing multiple rules to share L3/L4
hardware filters when their match criteria are identical
The series is structured in four parts:
Patches 1-3 prepare the existing L3/L4 filter path:
Patch 1 corrects flow_type masking and IPv6 address handling in
aq_set_data_fl3l4(). Patch 2 moves the active_ipv4/ipv6 bitmap
updates to after the hardware write succeeds. Patch 3 decouples
the function from driver-internal structures so it can be called
directly by the AQC113 PTP filter setup code.
Patches 4-5 add the AQC113 hardware infrastructure:
Patch 4 adds the low-level register definitions and accessor
functions. Patch 5 adds filter data structures and firmware
capability query.
Patches 6-7 implement the AQC113 L2/L3/L4 RX filter management:
Patch 6 fixes the AQC113 HW init path: ART section selection,
L2 filter slot assignment, and MAC address programming. Patch 7
implements the complete L3/L4 RX filter ops including reference-
counted ART sharing and IPv4/IPv6 steering.
Patches 8-12 add the AQC113 PTP feature:
Patch 8 reserves the dedicated PTP traffic class buffer and
configures the TX path. Patch 9 extends the hw_ops interface
with PTP-specific function pointers and updates AQC107 to the
new signatures. Patches 10-12 implement the full PTP subsystem:
Patch 10 adds the hw_atl2 register-level PTP clock ops, Patch 11
adds TX timestamp polling and PTP TX traffic classification, and
Patch 12 integrates PTP into aq_ptp and the driver core.
The existing AQC107 PTP implementation is not functionally changed
by this series; AQC113-specific code paths are gated on chip
detection throughout.
Tested on AQC113 at 1G, 2.5G, 5G, and 10G link speeds using
ptp4l/phc2sys with hardware timestamping in both L2 and L4
(IPv4/IPv6) modes.
====================
Sukhdeep Singh [Wed, 10 Jun 2026 11:54:48 +0000 (17:24 +0530)]
net: atlantic: add AQC113 PTP support in aq_ptp and driver core
aq_ptp.c / aq_ptp.h:
- Add aq_ptp_state enum (AQ_PTP_FIRST_INIT, AQ_PTP_LINK_UP,
AQ_PTP_NO_LINK) to distinguish first init from link-change events;
on AQC113 only reset the TSG clock on first init to avoid disrupting
ongoing synchronization.
- Add aq_ptp_dpath_enable() for comprehensive L3/L4 PTP filter
setup/teardown, replacing the previous single-filter approach with
an array of 4 slots for IPv4 and IPv6 PTP multicast addresses
(224.0.1.129, 224.0.0.107, ff0e::181, ff02::6b).
- Add aq_ptp_parse_rx_filters() to map hwtstamp_rx_filters to L2/L4
enable flags and call aq_ptp_dpath_enable().
- Re-apply RX filters on link change (hardware state lost after reset).
- Extend PTP ring alloc/init/start/stop to handle AQC113 PTP ring ops.
- Add per-instance PTP offset table for AQC113 with empirically measured
values at 100M/1G/2.5G/5G/10G link speeds.
- Export aq_ptp_dpath_enable() and updated ring helpers in aq_ptp.h.
aq_hw.h:
- Include hw_atl2/hw_atl2.h for AQC113 PTP type definitions.
aq_nic.c:
- Account for PTP IRQ vector (AQ_HW_PTP_IRQS) in vector count math.
- Call hw_atl2 PTP re-enable hook after hardware reset in
aq_nic_update_link_status().
aq_pci_func.c:
- Pass PTP IRQ index to aq_ptp_irq_alloc() in probe path.
aq_ring.h / aq_ring.c:
- Add ptp_ts_deadline field to aq_ring_s to track TX timestamp timeout.
- In aq_ring_tx_clean(): when hw_ring_tx_ptp_get_ts() returns 0 (HW not
yet written back the timestamp), clear buff->is_mapped and buff->pa
before breaking to prevent double dma_unmap on retry. When
ptp_ts_deadline expires, dequeue and drop the head of skb_ring to keep
it in lockstep with buff_ring, then clear request_ts and free the skb
via dev_kfree_skb_any() to unblock the ring.
aq_main.c:
- Add IPv6 PTP packet detection in aq_ndev_start_xmit() using
ipv6_hdr()->nexthdr for ETH_P_IPV6 frames, steering them through
aq_ptp_xmit() alongside the existing IPv4 path.
- Use PTP_EV_PORT/PTP_GEN_PORT constants instead of magic numbers 319/320.
- Remove duplicate aq_reapply_rxnfc_all_rules() and
aq_filters_vlans_update()
calls from aq_ndev_open() - now covered by aq_nic_start(), which also
ensures filters are restored correctly after PM resume.
aq_nic.c:
- Move aq_reapply_rxnfc_all_rules() and aq_filters_vlans_update() into
aq_nic_start() after hardware init, replacing the duplicate calls that
were removed from aq_ndev_open().
Sukhdeep Singh [Wed, 10 Jun 2026 11:54:46 +0000 (17:24 +0530)]
net: atlantic: add AQC113 PTP hardware ops in hw_atl2
Add the hardware-layer PTP implementation for AQC113 (Antigua):
- hw_atl2.h/hw_atl2_utils.h/hw_atl2_internal.h: add PTP offset
constants, RX timestamp size (HW_ATL2_RX_TS_SIZE=8), and reduced
HW_ATL2_RXBUF_MAX=172 (AQC113 on-chip RX packet buffer hardware
limit for data TCs).
- hw_atl2.c: implement hw_atl2_enable_ptp() to reset and enable TSG
clocks and set PTP TC scheduling priority after hardware reset.
- hw_atl2.c: implement hw_atl2_adj_sys_clock(), hw_atl2_adj_clock_freq(),
and aq_get_ptp_ts() for TSG clock read/adjust/increment operations.
- hw_atl2.c: implement hw_atl2_gpio_pulse() for PPS output generation
via TSG pulse generator.
- hw_atl2.c: implement hw_atl2_hw_tx_ptp_ring_init() and
hw_atl2_hw_rx_ptp_ring_init() for PTP ring setup.
- hw_atl2.c: implement hw_atl2_hw_ring_tx_ptp_get_ts() to read TX
timestamp from descriptor writeback, and hw_atl2_hw_rx_extract_ts()
to extract RX timestamp from the 8-byte packet trailer.
Sukhdeep Singh [Wed, 10 Jun 2026 11:54:45 +0000 (17:24 +0530)]
net: atlantic: extend hw_ops and TX descriptor for AQC113 PTP
Extend the aq_hw_ops interface with new function pointers required for
PTP support on AQC113:
- enable_ptp: enable/disable PTP counter with clock selection
- hw_ring_tx_ptp_get_ts: read TX timestamp from descriptor writeback
- hw_tx_ptp_ring_init/hw_rx_ptp_ring_init: per-ring PTP initialization
- hw_get_clk_sel: query active TSG clock selection
Update existing hw_ops signatures to support AQC113 dual-clock
architecture:
- hw_gpio_pulse: add clk_sel and hightime parameters
- hw_extts_gpio_enable: add channel parameter
Add PTP-related hardware defines:
- AQ_HW_TXD_CTL_TS_EN/TS_TSG0 for TX descriptor timestamp control
- AQ2_HW_PTP_COUNTER_HZ for AQC113 TSG clock frequency
- AQ_HW_PTP_IRQS for PTP interrupt vector accounting
- PTP enable flags (L2/L4) and TSG clock selection constants
Add request_ts and clk_sel bitfields to aq_ring_buff_s for per-packet
TX timestamp request tracking.
Update hw_atl_b0.c (AQC107) implementations:
- Adapt gpio_pulse and extts_gpio_enable to new signatures
- Add TX descriptor timestamp bits for AQC113 when ANTIGUA chip
feature is detected
Sukhdeep Singh [Wed, 10 Jun 2026 11:54:44 +0000 (17:24 +0530)]
net: atlantic: add AQC113 PTP traffic class and TX path setup
Add PTP traffic class (TC) buffer reservation and TX path
improvements for AQC113:
- Reserve dedicated TX and RX buffer space for PTP TC when PTP is
enabled, reducing user TC buffers accordingly (TX: 8KB, RX: 16KB).
- Configure PTP TC with no flow control and highest priority
scheduling to ensure timely PTP packet transmission.
TX path improvements:
- Increase TX data and descriptor read-request limits when firmware
has already enabled extended PCIe tag mode.
Also simplify RSS queue calculation in hw_atl2_hw_rss_set() by
extracting to a local variable and use unsigned types for loop
variables to match their usage.
Implement complete RX filter management for AQC113 hardware:
- Add tag-based ethertype filter policy (hw_atl2_filter_tag_get/put)
that allocates and releases ART tags for L2 ethertype filters.
- Add L3/L4 filter sharing via serialized usage counters in
hw_atl2_l3_filter/hw_atl2_l4_filter, managed through
hw_atl2_rxf_l3_get/put and hw_atl2_rxf_l4_get/put.
- Implement L3 (IPv4/IPv6 source/destination address and protocol)
filter find, get (program HW and increment refcount), and put
(decrement refcount and clear HW when last user releases).
- Implement L4 (TCP/UDP/SCTP source/destination port) filter management
with the same find/get/put pattern.
- Add combined L3L4 filter configuration (hw_atl2_new_fl3l4_configure)
that translates legacy aq_rx_filter_l3l4 commands into AQC113 separate
L3+L4 filter programming with Action Resolver Table (ART) entries.
- Add L2 ethertype filter set/clear (hw_atl2_hw_fl2_set/clear) with
tag-based ART integration.
- Wire .hw_filter_l2_set, .hw_filter_l2_clear, .hw_filter_l3l4_set
into hw_atl2_ops.
Fix initialization issues in hw_atl2 to correctly support AQC113:
- hw_atl2_hw_reset: replace unconditional priv memset with selective
field clears so that l3l4_filters[].l3_index and l4_index can be
initialized to -1 (not allocated) rather than 0; 0 is a valid filter
index and would incorrectly appear as an occupied slot after a reset.
- hw_atl2_hw_init_new_rx_filters: use firmware-reported ART section
base and count (clamped to 16) instead of hardcoded 0xFFFF mask;
enable simultaneous IPv4/IPv6 L3 filter mode (rpf_l3_v6_v4_select);
tag the UC MAC slot using firmware-supplied l2_filters_base_index
instead of hardcoded HW_ATL2_MAC_UC.
- hw_atl2_hw_init_rx_path: enable only the firmware-assigned MAC slot
(priv->l2_filters_base_index) instead of always slot 0.
- Add hw_atl2_hw_mac_addr_set() that programs the MAC address into
the firmware-assigned L2 filter slot. Wire into hw_atl2_ops
replacing the A1 hw_atl_b0_hw_mac_addr_set; call it from hw_init.
Sukhdeep Singh [Wed, 10 Jun 2026 11:54:41 +0000 (17:24 +0530)]
net: atlantic: add AQC113 filter data structures, firmware query and register dump
Add filter infrastructure for AQC113 hardware:
- Define L3 (IPv4/IPv6), L4 (TCP/UDP/SCTP), and combined L3L4 filter
structures with serialized usage counter for filter sharing.
- Define tag policy structure for ethertype filter management.
- Add RPF L3/L4 command bit definitions for filter programming.
- Add filter count constants for L3L4, L3V4, L4, VLAN, and ethertype.
- Extend hw_atl2_priv with filter arrays, base indices, and counts
discovered from firmware.
Query filter capabilities from firmware shared memory at init time
to discover available L2/L3/L4/VLAN/ethertype filter resources and
ART (Action Resolver Table) configuration.
Add hardware register dump utility for AQC113 debug support.
Sukhdeep Singh [Wed, 10 Jun 2026 11:54:39 +0000 (17:24 +0530)]
net: atlantic: decouple aq_set_data_fl3l4() from driver internals
Refactor aq_set_data_fl3l4() to take an ethtool_rx_flow_spec pointer and
an explicit HW register location instead of driver-internal structures
(aq_nic_s, aq_rx_filter). This makes the function reusable for PTP
filter setup which constructs flow specs independently.
Key changes:
- Add aq_is_ipv6_flow_type() helper to derive IPv6 status from the
flow_type field, replacing the dependency on rx_fltrs->fl3l4.is_ipv6
shared state.
- Change aq_set_data_fl3l4() signature to accept (fsp, data, location,
add) and export it via aq_filters.h.
- Update aq_add_del_fl3l4() to compute the HW register location and
pass it explicitly.
Sukhdeep Singh [Wed, 10 Jun 2026 11:54:38 +0000 (17:24 +0530)]
net: atlantic: move active_ipv4/ipv6 bitmap updates after HW write
Move active_ipv4/active_ipv6 bitmap updates from aq_set_data_fl3l4()
into aq_add_del_fl3l4() after the hardware write succeeds. The bitmaps
track which filter slots are actively programmed in hardware and must
only be updated once the HW write is confirmed.
The bitmap updates in aq_nic_reserve_filter() and aq_nic_release_filter()
are intentionally retained: they guard the aq_check_approve_fl3l4()
IPv4/IPv6 mixing validation for callers such as the AQC113 PTP path that
program filters directly via hw_atl2_new_fl3l4_configure() without going
through aq_add_del_fl3l4().
Sukhdeep Singh [Wed, 10 Jun 2026 11:54:37 +0000 (17:24 +0530)]
net: atlantic: correct L3L4 filter flow_type masking and IPv6 handling
Correct three issues in aq_set_data_fl3l4() required for the AQC113
PTP filter path introduced later in this series:
1. Mask FLOW_EXT from flow_type before the protocol switch statement.
Flow types with FLOW_EXT set (e.g. TCP_V4_FLOW | FLOW_EXT) fall
through to the default case and skip protocol comparison flags.
2. Extend the L3 address comparison check to cover all four IPv6
words. The original code only checked ip_src[0]/ip_dst[0] and
required !is_ipv6, so CMP_SRC_ADDR_L3/CMP_DEST_ADDR_L3 were never
set for IPv6 filters.
3. Use explicit flow type checks for port extraction instead of
negating IP_USER_FLOW/IPV6_USER_FLOW. The old check did not mask
FLOW_EXT, so IP_USER_FLOW | FLOW_EXT would incorrectly attempt
port extraction. Use the actual flow type to pick the correct
union member directly.
====================
net: dsa: netc: add bridge mode support
This series adds bridge mode support to the NETC DSA switch driver,
covering both VLAN-aware and VLAN-unaware operation.
The NETC switch manages forwarding through a set of hardware tables
accessed via NTMP: the FDB table (FDBT), VLAN filter table (VFT), egress
treatment table (ETT), and egress count table (ECT). The series extends
the NTMP layer with the operations required for bridging, then builds the
DSA bridge callbacks on top.
Since all switch ports share the VFT, so only one VLAN-aware bridge is
supported.
FDB aging is managed in software. A periodic delayed work sweeps the
table using the hardware activity element mechanism, with a default aging
time of 300 seconds matching the IEEE 802.1Q standard. Per-port entries
are also flushed immediately on bridge leave and link-down events.
====================
The NETC switch does not age out dynamic FDB entries automatically.
Without software management, stale entries persist after topology
changes and cause incorrect forwarding.
Add a delayed work that periodically removes entries that have not been
refreshed within the specified cycles. The effective ageing time is:
ageing_time = fdbt_ageing_delay * 100
Default values are 3s interval and 100 cycles (300s total), matching
the IEEE 802.1Q default ageing time. The work starts when the first
port joins a bridge (tracked via br_cnt) and is cancelled when the
last port leaves. All FDB operations are serialized under fdbt_lock.
Implement .set_ageing_time() to allow the bridge layer to reconfigure
ageing parameters on demand.
Wei Fang [Thu, 11 Jun 2026 02:14:57 +0000 (10:14 +0800)]
net: dsa: netc: add bridge mode support
Wire up the port_bridge_join, port_bridge_leave and port_vlan_filtering
DSA callbacks to support both VLAN-unaware and VLAN-aware bridge modes.
For VLAN-unaware bridges, each bridge instance is assigned a dedicated
internal PVID via NETC_VLAN_UNAWARE_PVID(bridge.num), counting down
from VID 4095. A VFT entry is created for this PVID with hardware MAC
learning and flood-on-miss forwarding enabled. The CPU port is included
as a VFT member so frames can reach the host. The reserved VID range is
blocked in port_vlan_add to prevent user-space conflicts.
Only one VLAN-aware bridge is supported at a time; this constraint is
enforced in port_bridge_join and port_vlan_filtering. The per-port PVID
is tracked in software and written to the BPDVR register whenever VLAN
filtering is active.
When a port leaves the bridge, its dynamic FDB entries are flushed right
away in port_bridge_leave(), without waiting for the ageing cycle. When
a link down event occurs on a port, netc_mac_link_down() will also clear
the port's dynamic FDB entries via netc_port_remove_dynamic_entries().
Non-bridge ports have no dynamic FDB entries, so this call is always
safe. Additionally, .port_fast_age() callback is added to flush the
dynamic FDB entries associated to a port.
Host flood rules are removed from the ingress port filter table when a
port joins a bridge to avoid bypassing FDB lookup and MAC learning.
Implement the DSA .port_vlan_add and .port_vlan_del operations to enable
VLAN-aware bridge offloading on the NETC switch.
VLAN membership is maintained in the VLAN Filter Table (VFT). Adding the
first port to a VLAN creates a new VFT entry with hardware MAC learning
and flood-on-miss forwarding; subsequent ports update the existing
entry's membership bitmap. Removing the last port deletes the entry.
Egress tagging is handled through the Egress Treatment Table (ETT). Each
VLAN is allocated a group of ETT entries, one per available port. Ports
are assigned a sequential ett_offset during initialisation, used to
address each port's entry within the group. Untagged ports configure the
ETT to strip the outer VLAN tag; tagged ports pass frames through
unmodified. Each ETT group is optionally paired with an Egress Counter
Table (ECT) group for per-port frame counting, allocated on a best-effort
basis. When the egress rule of an ETT entry changes, the counter of the
corresponding ECT entry will be recounted to track the number of frames
that match the new egress rule.
A software shadow list serialised by vft_lock tracks active VLAN state
across both port membership and egress tagging. VID 0 is used for single
port mode and is ignored by both callbacks.
Wei Fang [Thu, 11 Jun 2026 02:14:55 +0000 (10:14 +0800)]
net: enetc: add helpers to set/clear table bitmap
NTMP index tables require software to allocate and manage entry IDs.
Add two bitmap helper functions to facilitate this management:
ntmp_lookup_free_eid(): finds the first zero bit in the given bitmap,
sets it to mark the entry as in-use, and returns the corresponding entry
ID. Returns NTMP_NULL_ENTRY_ID if no free entry is available.
ntmp_clear_eid_bitmap(): clears the bit associated with the given entry
ID in the bitmap to mark the entry as free. It is a no-op if the entry
ID is NTMP_NULL_ENTRY_ID.
Both functions are exported for use by other modules, such as the NETC
switch driver which needs to manage group index bitmaps for the Egress
Treatment Table (ETT) and Egress Count Table (ECT).
Wei Fang [Thu, 11 Jun 2026 02:14:54 +0000 (10:14 +0800)]
net: dsa: netc: initialize the group bitmap of ETT and ECT
The Egress Treatment Table (ETT) and Egress Count Table (ECT) are both
index tables whose entry IDs are allocated by software. Every num_ports
entries form a group, where each entry in the group corresponds to one
port. To facilitate group allocation and management, initialize the group
index bitmaps for both tables based on hardware capabilities reported by
ETTCAPR and ECTCAPR registers.
The bitmap size per table is calculated as the total number of hardware
entries divided by the number of available ports, which gives the number
of groups available for software allocation. A set bit in the bitmap
represents a group index that has been allocated.
These bitmaps will be used by subsequent patches that add VLAN support.
Wei Fang [Thu, 11 Jun 2026 02:14:53 +0000 (10:14 +0800)]
net: enetc: add "Update" operation to the egress count table
The egress count table is a static bounded index table, egress related
statistics are maintained in this table. The table is implemented as a
linear array of entries accessed using an index (0, 1, 2, ..., n) that
uniquely identifies an entry within the array. Egress Counter Entry ID
(EC_EID) is used as an index to an entry in this table. The EC_EID is
specified in the egress treatment table.
Egress count table entries are always present and enabled. The table
only supports access via entry ID, which is assigned by the software.
And it supports Update, Query and Query followed by Update operations.
Currently, only Update operation is supported.
Wei Fang [Thu, 11 Jun 2026 02:14:52 +0000 (10:14 +0800)]
net: enetc: add interfaces to manage egress treatment table
Each entry in the egress treatment table contains the egress packet
processing actions to be applied to a grouping or scope of packets
exiting on a particular egress port of the switch. A scope of packets,
for example, could be the packets exiting a particular VLAN, matching
a particular 802.1Q bridge forwarding entry or belonging to a stream
identified at ingress. The egress treatment table is implemented as a
linear array of entries accessed using an index (0,1, 2, ..., n) that
uniquely identifies an entry within the array.
The egress treatment table only supports access vid entry ID, which is
assigned by the software. It supports Add, Update, Delete and Query
operations. Note that only Query operation is not supported yet.
Wei Fang [Thu, 11 Jun 2026 02:14:51 +0000 (10:14 +0800)]
net: enetc: add "Update" and "Delete" operations to VLAN filter table
Add two interfaces to manage entries in the VLAN filter table:
ntmp_vft_update_entry(): Update the configuration element data of the
specified VLAN filter entry based on the given VLAN ID. It uses the
exact key access method to locate the entry.
ntmp_vft_delete_entry(): Delete the VLAN filter entry corresponding to
the specified VLAN ID. It also uses the exact key access method to
identify the target entry.
In addition, introduce struct vft_req_qd to describe the request data
buffer format for Query and Delete actions of the VLAN filter table,
which contains a common request data header and a VLAN access key.
Wei Fang [Thu, 11 Jun 2026 02:14:50 +0000 (10:14 +0800)]
net: enetc: add interfaces to manage dynamic FDB entries
Add three interfaces to manage dynamic entries in the FDB table:
ntmp_fdbt_update_activity_element(): Update the activity element of all
dynamic FDB entries. For each entry, if its activity flag is not set,
which means no packet has matched this entry since the last update, the
activity counter is incremented. Otherwise, both the activity flag and
activity counter are reset. The activity counter is used to track how
long an FDB entry has been inactive, which is useful for implementing
an ageing mechanism.
ntmp_fdbt_delete_ageing_entries(): Delete all dynamic FDB entries whose
activity flag is not set and whose activity counter is greater than or
equal to the specified threshold. This is used to remove stale entries
that have been inactive for too long.
ntmp_fdbt_delete_port_dynamic_entries(): Delete all dynamic FDB entries
associated with the specified switch port. This is typically called when
a port goes down or is removed from a bridge.
Minxi Hou [Fri, 12 Jun 2026 13:05:03 +0000 (21:05 +0800)]
selftests/net/openvswitch: add SET action test
Add test_action_set exercising OVS_ACTION_ATTR_SET with an ipv4 dst
rewrite. The test verifies the SET action in three steps: first
confirm normal forwarding, then apply set(ipv4(dst=10.0.0.99)) to
rewrite the destination to an address nobody owns and verify ping
fails, then restore normal forwarding and verify connectivity
recovers.
Jakub Kicinski [Mon, 15 Jun 2026 21:14:50 +0000 (14:14 -0700)]
Merge branch 'net-sfp-extend-smbus-support'
Jonas Jelonek says:
====================
net: sfp: extend SMBus support
Today, the SFP driver only drives I2C adapters that advertise full
I2C_FUNC_I2C, or SMBus-only adapters via single-byte transfers (with
hwmon disabled). Several SoCs ship I2C/SMBus-only controllers that
support more than just byte access -- e.g. word and I2C block -- and
have SFP cages wired to them. Today, those adapters either work
poorly or not at all.
This series teaches the SFP driver to use the larger SMBus access
modes when the adapter advertises them, and along the way starts
honoring i2c_adapter quirks on read/write length so adapters that
cap below the SFP block size are handled correctly. Patch 1 is a
small prep doing only the quirks handling; patch 2 extends the
SMBus path itself.
Capability matrix supported by patch 2:
- BYTE only: single-byte access (unchanged).
- BYTE + WORD: word for >=2-byte chunks, byte tail.
- I2C_BLOCK present: block as the universal transport.
- WORD only (no BYTE/BLOCK): accepted with WARN_ONCE; works for
even-length transfers, odd-length
transfers will error at xfer time.
Adapters with asymmetric R/W capabilities (e.g. only READ_I2C_BLOCK
without WRITE_I2C_BLOCK) remain functionally correct but use the
worse-supported direction's max for both directions, since
i2c_max_block_size is a single field. No mainline I2C driver was
seen advertising such asymmetry; per-direction sizes can be added
later if needed.
====================
Jonas Jelonek [Sun, 14 Jun 2026 13:34:18 +0000 (13:34 +0000)]
net: sfp: extend SMBus support
Commit 7662abf4db94 ("net: phy: sfp: Add support for SMBus module access")
added SMBus access for SFP modules, but limited it to single-byte
transfers. As a side effect, hwmon is disabled (16-bit reads cannot be
guaranteed atomic) and a warning is printed.
Many SMBus-only I2C controllers in the wild support more than just
byte access, and SFP cages are often wired to such controllers
rather than to a full-featured I2C controller -- e.g. the SMBus
controllers in the Realtek longan and mango SoCs, which advertise
word access and I2C block reads. Today, they cannot drive an SFP at
all without falling back to the byte-only path.
Extend sfp_smbus_read()/sfp_smbus_write() so that, in addition to
the existing byte access, they also use SMBus word access and SMBus
I2C block access whenever the adapter advertises them. Both
directions are handled in a single read and a single write helper
that pick the largest supported transfer per chunk and fall back as
needed.
I2C-block is preferred unconditionally when available: the protocol
carries any length 1..32, so it can serve every chunk -- including
the 1- and 2-byte tails -- without help from word or byte access.
Note that this requires I2C_FUNC_SMBUS_I2C_BLOCK, which reads a
caller-specified number of bytes. This deviates from the official
SMBus Block Read (length is supplied by the slave) but is widely
supported by Linux I2C controllers/drivers.
Capability matrix this implementation supports:
- BYTE only: works (unchanged behaviour); 1-byte
xfers, hwmon disabled.
- BYTE + WORD: word for >=2-byte chunks, byte for
trailing odd byte.
- I2C_BLOCK present (with or
without BYTE/WORD): block as the universal transport for
every chunk.
- WORD only (no BYTE/BLOCK): accepted with WARN_ONCE. Even-length
transfers work; odd-length transfers
(e.g. the 3-byte cotsworks fixup
write) hit the BYTE branch which the
adapter does not implement, so the
xfer returns an error and the
operation is aborted. No mainline
I2C driver was found to advertise
WORD without BYTE; the warning lets
us learn about it if it ever shows
up.
Adapters with asymmetric R/W capabilities (e.g. only READ_I2C_BLOCK
but not WRITE_I2C_BLOCK) remain functionally correct -- the
per-iteration fallback uses the direction-specific bits -- but the
shared i2c_max_block_size is sized by the all-bits-set check, so a
transfer in the better-supported direction is not upgraded. None of
the mainline I2C bus drivers surveyed during review advertise such
asymmetry; promoting i2c_max_block_size to per-direction sizes can
be revisited if needed.
Jonas Jelonek [Sun, 14 Jun 2026 13:34:17 +0000 (13:34 +0000)]
net: sfp: apply I2C adapter quirks to limit block size
The SFP driver assumes all I2C adapters support reading and writing the
pre-defined block size SFP_EEPROM_BLOCK_SIZE of 16 bytes. This constant
was probably chosen based on good guesses and known limitations of a
range of I2C adapters and SFP modules.
However, I2C adapters may even support less and usually need to specify
this via I2C quirks. Theoretically, such an adapter may provide full
functionality but only support a read and write length of e.g. 8 bytes.
Currently, the SFP driver doesn't account for that.
Add handling for I2C quirks in SFP I2C configuration taking the fields
max_read_len and max_write_len in struct i2c_adapter_quirks into account
to further limit the maximum block size if needed.
Jakub Kicinski [Mon, 15 Jun 2026 21:09:56 +0000 (14:09 -0700)]
Merge tag 'nf-next-26-06-14' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter/IPVS updates for net-next.
More specifically, this contains conncount rework to address AI related
reports, assorted Netfiter updates and two small incremental updates on
IPVS:
1) Replace old obsolete workqueues (system_wq, system_unbound_wq)
in IPVS, from Marco Crivellari.
2) Replace WARN_ON{_ONCE} by DEBUG_NET_WARN_ON_ONCE in nf_tables.
In the recent years, reporters say that the use of WARN_ON{_ONCE}
in conjunction with panic_on_warn=1 results in DoS. Let's replace
it by DEBUG_NET_WARN_ON_ONCE so this is only exercised by test
infrastructure and fuzzers, while also providing context to AI
agents. From Fernando F. Mancera.
Five patches from Florian Westphal to address AI reports in the conncount
infrastructures:
3) Fix missing rcu read lock section when calling
__ovs_ct_limit_get_zone_limit().
4) Add a dedicate lock per rbtree tree, this increases memory
usage but it should improve scalability.
5) Add a helper function to find the rbtree node, no functional
changes are intented.
6) Add sequence counter to detect concurrent tree modifications
and retry lookups.
7) Add locks to GC conncount walk and address other nitpicks.
Then, several assorted updates:
8) Defensive Tree-wide addition of NULL checks for ct extensions.
9) Bail out if flowtable bypass cannot be fully set up from the
flow offload expression, instead of lazy building a likely
incomplete one.
10) Fix documentation for the new conn_max sysctl toggle in IPVS.
11) Add nf_dev_xmit_recursion*() helpers and use them, to address
recent AI reports.
* tag 'nf-next-26-06-14' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nf_dup_netdev: add nf_dev_xmit_recursion*() helpers and use them
ipvs: fix doc syntax for conn_max sysctl
netfilter: flowtable: bail out if forward path cannot be discovered
netfilter: conntrack: check NULL when retrieving ct extension
netfilter: nf_conncount: gc and rcu fixes
netfilter: nf_conncount: add sequence counter to detect tree modifications
netfilter: nf_conncount: split count_tree_node rbtree walk into helper
netfilter: nf_conncount: use per nf_conncount_data spinlocks
netfilter: nf_conncount: callers must hold rcu read lock
netfilter: nf_tables: use DEBUG_NET_WARN_ON_ONCE in packet and control paths
ipvs: Replace use of system_unbound_wq with system_dfl_long_wq
====================
====================
netdev: expose page pool order via netlink
This small series exposes io_uring's high order page configuration
via the page_pool netlink interface and updates the appropriate
selftest to check this value.
====================
====================
selftests/vsock: improve vng version and quirk handling
As vng has continued updating, there have been two things in our
selftests that have been affected. One is that newer versions always
emit the vng version warning, and two is that we have a workaround that
is not needed in newer versions.
This series just updates the version handling to allow all newer
versions without warning and version-gates the workaround to only those
versions that don't have the commit that fixed the root cause.
Additionally, we add function for comparing major.minor versions which
is used in both patches.
-===================
Bobby Eshleman [Fri, 12 Jun 2026 19:08:42 +0000 (12:08 -0700)]
selftests/vsock: skip vng setsid workaround on >= 1.41
virtme-ng 1.41 ships the upstream fix for the SIGTTOU hang
(https://github.com/arighi/virtme-ng/pull/453), so the setsid wrapper in
vng_dry_run() is no longer needed there. Gate the workaround on the vng
version: setsid is used for vng < 1.41, and vng is invoked directly on
>= 1.41.
Bobby Eshleman [Fri, 12 Jun 2026 19:08:41 +0000 (12:08 -0700)]
selftests/vsock: accept vng 1.33 or >= 1.36
The current vng version check uses a discrete allowlist of "1.33",
"1.36", and "1.37", which forces a script update on every new release
even though all post-1.36 releases work.
Replace the discrete list with: "1.33", or any version >= 1.36. 1.34
and 1.35 are skipped because they were not tested. Add a version_lt()
helper that compares MAJOR.MINOR numerically, so the check reads as a
straightforward version comparison.
Eric Dumazet [Fri, 12 Jun 2026 16:25:17 +0000 (16:25 +0000)]
tcp: ipv6: clamp default adverting MSS to avoid GSO_BY_FRAGS (0xFFFF)
When MTU is large, ip6_default_advmss() can return IPV6_MAXPLEN (65535).
This is interpreted by TCP as mss_clamp, allowing the MSS to reach 65535.
However, 0xFFFF is also used as a magic value GSO_BY_FRAGS in the kernel.
If a TCP packet with gso_size=0xFFFF is passed to skb_segment(), it will
be mistakenly treated as GSO_BY_FRAGS, leading to a NULL pointer
dereference because local TCP packets do not use frag_list.
Fix this by returning min(IPV6_MAXPLEN, GSO_BY_FRAGS - 1) (65534) from
ip6_default_advmss() when MTU is large.
Also update the stale comment in ip6_default_advmss() which suggested
that IPV6_MAXPLEN is returned to mean "any MSS".
Fixes: 3953c46c3ac7 ("sk_buff: allow segmenting based on frag sizes") Reported-by: syzbot+ebdb22d461c904fc3cb2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6a2c3193.8812e0fc.3c3fa4.0001.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260612162517.83394-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Fri, 12 Jun 2026 13:59:49 +0000 (13:59 +0000)]
tipc: fix UAF in tipc_l2_send_msg()
Syzbot reported a slab-use-after-free in ipvlan_hard_header() when
called from tipc_l2_send_msg().
The root cause is that tipc_disable_l2_media() calls synchronize_net()
while b->media_ptr is still valid. This allows concurrent RCU readers
to obtain the device pointer after synchronize_net() has finished.
The pointer is cleared later in bearer_disable(), but without any
subsequent synchronization, allowing the device to be freed while
still in use by readers.
Fix this by clearing b->media_ptr in tipc_disable_l2_media() before
calling synchronize_net().
This is safe to do now because the call order in bearer_disable()
was reversed in 0d051bf93c06 ("tipc: make bearer packet filtering generic")
to call tipc_node_delete_links() (which needs the pointer) before
disable_media().
Fixes: 282b3a056225 ("tipc: send out RESET immediately when link goes down")
https: //lore.kernel.org/netdev/6a2c1007.428ffe26.258b27.015d.GAE@google.com/T/#u Reported-by: syzbot+64ec81389cbad56a8c35@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jon Maloy <jmaloy@redhat.com> Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech> Link: https://patch.msgid.link/20260612135949.4010482-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
octeontx2: quiesce stale mailbox IRQ state before request_irq()
Both OTX2 mailbox registration paths currently install their IRQ
handlers before clearing stale local mailbox interrupt state, even
though the code comments already say that the clear is needed first to
avoid spurious interrupts.
This issue was found by our static analysis tool and manually audited on
Linux v6.18.21. Directed QEMU no-device validation further showed that
the real PF and VF mailbox handlers are already reachable in that
pre-clear window and can touch the same mailbox and workqueue carrier
before local quiesce has completed.
This series keeps the change minimal:
- clear stale mailbox interrupt state before request_irq()
- keep interrupt enabling after the handler is installed
That closes the early-IRQ window without introducing a new
enable-before-handler window.
Patch 1 fixes the PF mailbox registration path.
Patch 2 fixes the VF mailbox registration path.
Build-tested by compiling otx2_pf.o and otx2_vf.o.
No OTX2 hardware was available for end-to-end runtime testing.
====================
Runyu Xiao [Thu, 11 Jun 2026 16:00:14 +0000 (00:00 +0800)]
octeontx2-vf: clear stale mailbox IRQ state before request_irq()
otx2vf_register_mbox_intr() currently installs the VF mailbox IRQ
handler before clearing stale mailbox interrupt state. The code then says
that local interrupt bits should be cleared first to avoid spurious
interrupts, but that clear still happens only after request_irq() has
already made the handler reachable.
A running system can reach this during VF mailbox interrupt registration
while stale or latched RVU_VF_INT state is still present. If delivery
happens in the request_irq()-to-clear window,
otx2vf_vfaf_mbox_intr_handler() can run before local quiesce and touch
the same vf->mbox and vf->mbox_wq carrier that probe and teardown later
reuse or destroy.
Move the stale mailbox interrupt clear ahead of request_irq(), but keep
interrupt enabling after the handler is installed. This closes the
pre-clear early-IRQ window without creating a new enable-before-handler
window.
Fixes: 3184fb5ba96e ("octeontx2-vf: Virtual function driver support") Cc: stable@vger.kernel.org Signed-off-by: Runyu Xiao <runyu.xiao@seu.edu.cn> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260611160014.3202224-3-runyu.xiao@seu.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Runyu Xiao [Thu, 11 Jun 2026 16:00:13 +0000 (00:00 +0800)]
octeontx2-pf: clear stale mailbox IRQ state before request_irq()
otx2_register_mbox_intr() currently installs the PF mailbox IRQ handler
before clearing stale mailbox interrupt state. The function itself then
comments that the local interrupt bits must be cleared first to avoid
spurious interrupts, but that clear happens only after request_irq() has
already exposed the handler to irq delivery.
A running system can reach this during PF mailbox interrupt registration
while stale or latched RVU_PF_INT state is still present. If delivery
happens in the request_irq()-to-clear window,
otx2_pfaf_mbox_intr_handler() can run before local quiesce and touch
the same pf->mbox and pf->mbox_wq carrier that probe and teardown later
reuse or destroy.
Move the stale mailbox interrupt clear ahead of request_irq(), but keep
interrupt enabling after the handler is installed. This closes the
pre-clear early-IRQ window without creating a new enable-before-handler
window.
Fixes: 5a6d7c9daef3 ("octeontx2-pf: Mailbox communication with AF") Cc: stable@vger.kernel.org Signed-off-by: Runyu Xiao <runyu.xiao@seu.edu.cn> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Ratheesh Kannoth <rkannoth@marvell.com> Link: https://patch.msgid.link/20260611160014.3202224-2-runyu.xiao@seu.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Greg Patrick [Thu, 11 Jun 2026 17:53:41 +0000 (17:53 +0000)]
net: phy: sfp: detect presence via I2C when no MOD_DEF0 GPIO
An SFP cage (compatible "sff,sfp") whose MOD_DEF0 signal is not wired to a
GPIO currently falls back to sff_gpio_get_state(), which unconditionally
reports the module as present. An empty cage therefore fails its probe and
is parked in SFP_MOD_ERROR forever; because SFP_F_PRESENT never deasserts
there is no REMOVE event to recover the state machine, so a module inserted
after boot is never detected, and empty cages spam -EIO at boot.
This affects boards that route none of the cage presence signal to a
software-readable input. On the NicGiga S100-0800S-M (RTL9303, 8x SFP+) the
cage I2C bus is the switch's SMBus master; TX_DISABLE is driven via a
PCA9534 I/O expander, but no MOD_ABS/MOD_DEF0 line reaches a readable GPIO
(the RTL9303 gpio0 lines read stuck-low, the single PCA9534 is fully
consumed by TX_DISABLE, and there is no RTL8231). The Horaco ZX-SW82TS-L2P
(RTL9302D, 2x SFP+) is independently affected in the same way.
For such an SFP cage, derive presence from a throttled single-byte I2C read
of the module EEPROM instead: a successful read asserts SFP_F_PRESENT,
R_PROBE_ABSENT consecutive failures clear it (to ride out a transient error
on a live module). The existing poll then emits SFP_E_INSERT / SFP_E_REMOVE
normally, giving working hot-plug and silencing the boot-time -EIO spam on
empty cages. Presence is re-probed every T_PROBE_PRESENT, so insertion is
detected within that interval and removal within
T_PROBE_PRESENT * R_PROBE_ABSENT.
A soldered-down module (compatible "sff,sff") has no presence signal and is
genuinely always present, so it continues to use sff_gpio_get_state(); the
new path is gated on the cage type advertising SFP_F_PRESENT.
Signed-off-by: Greg Patrick <gregspatrick@hotmail.com> Tested-by: Manuel Stocker <mensi@mensi.ch> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20260611175341.2223184-1-gregspatrick@hotmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Yang [Thu, 11 Jun 2026 07:08:51 +0000 (15:08 +0800)]
devlink: Warn on resource ID collision with PARENT_TOP
ID 0 serves as the sentinel DEVLINK_RESOURCE_ID_PARENT_TOP to mark
top-level resources. While it is technically possible to use 0 as a real
resource ID, a user might be tempted to write:
register(..., MY_RESOURCE_ID_C, DEVLINK_RESOURCE_ID_PARENT_TOP, ...);
register(..., MY_RESOURCE_ID_D, MY_RESOURCE_ID_C, ...);
/* D is a child of C */
register(..., MY_RESOURCE_ID_A, DEVLINK_RESOURCE_ID_PARENT_TOP, ...);
register(..., MY_RESOURCE_ID_B, MY_RESOURCE_ID_A, ...);
/* Is B intentionally top-level, or is it actually a child of A? */
Add a WARN_ON() to catch this and prevent confusion.
David Yang [Thu, 11 Jun 2026 07:08:50 +0000 (15:08 +0800)]
net: dsa: mv88e6xxx: Avoid devlink resource IDs collision with PARENT_TOP
The devlink resource ID for ATU collides with the sentinel
DEVLINK_RESOURCE_ID_PARENT_TOP (0). As a result, ATU_bin_* are
registered as in fact registered as top-level siblings, not as children
of ATU.
Whether intentional or unintentional, clarify it by keeping the real
resource IDs starting at 1. Unfortunately ATU_bin_* are already
registered at top-level, so keep their parent to PARENT_TOP.
David Yang [Thu, 11 Jun 2026 07:08:49 +0000 (15:08 +0800)]
net: dsa: hellcreek: avoid devlink resource IDs collision with PARENT_TOP
This might not cause real problems, but the hellcreek devlink resource
ID collides with the sentinel DEVLINK_RESOURCE_ID_PARENT_TOP (0). Avoid
it by keeping the real resource IDs starting at 1.
David Yang [Thu, 11 Jun 2026 07:08:48 +0000 (15:08 +0800)]
net: dsa: b53: avoid devlink resource IDs collision with PARENT_TOP
This might not cause real problems, but the b53 devlink resource ID
collides with the sentinel DEVLINK_RESOURCE_ID_PARENT_TOP (0). Avoid it
by keeping the real resource IDs starting at 1.
David Yang [Thu, 11 Jun 2026 07:08:47 +0000 (15:08 +0800)]
net: dsa: dsa_loop: avoid devlink resource IDs collision with PARENT_TOP
This might not cause real problems, but the dsa_loop devlink resource ID
collides with the sentinel DEVLINK_RESOURCE_ID_PARENT_TOP (0). Avoid it
by keeping the real resource IDs starting at 1.
Johan Hovold [Thu, 11 Jun 2026 12:03:45 +0000 (14:03 +0200)]
pmdomain: core: fix unused variable warning with !PM_GENERIC_DOMAINS_OF
The genpd provider bus is really only used when
CONFIG_PM_GENERIC_DOMAINS_OF is enabled, and since the recent deferred
initialisation of domain parent devices, the root device pointer is
otherwise unused.
Fix the unused variable warning by moving the definition of the root device
pointer inside the corresponding ifdef.
Fixes: 92b69eff8012 ("pmdomain: core: fix early domain registration") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202606111746.kAxaAbwg-lkp@intel.com/ Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Ulf Hansson <ulfh@kernel.org>
Bhargav Joshi [Thu, 11 Jun 2026 21:12:29 +0000 (02:42 +0530)]
dt-bindings: interrupt-controller: ti,irq-crossbar: Convert to DT schema
Convert TI irq-crossbar binding from text format to DT schema.
As part of conversion following changes are made:
- Add '#interrupt-cells' as a required property which was missing in
text binding
- As irq-crossbar is interrupt-controller. Move binding from
bindings/arm/omap to bindings/interrupt-controller
====================
net/mlx5: Add switchdev mode support for Socket Direct single netdev, part 2/2
This is part 2. Find part 1 here:
https://lore.kernel.org/all/20260531113954.395443-1-tariqt@nvidia.com/
This series enables Socket Direct single netdev to operate in switchdev
mode with shared FDB. SD single netdev combines multiple PCI functions
behind a single netdev interface. To support switchdev offloads, these
functions must participate in virtual LAG (shared FDB).
Design
Rather than introducing a separate LAG instance for SD, this series
integrates SD secondary devices into the existing LAG structure
(priv.lag) created at probe time. Each lag_func entry carries a
group_id field that identifies its SD group membership (0 means not
part of any SD group). An xarray mark (XA_MARK_PORT) distinguishes
physical port entries from SD secondaries, enabling a single unified
iterator that filters by group:
- MLX5_LAG_FILTER_PORTS: iterate port-level entries only (existing
behavior, used by bonding, FW LAG commands, v2p_map)
- MLX5_LAG_FILTER_ALL: iterate all devices including SD secondaries
(used by MPESW shared FDB across all devices)
- specific group_id: iterate only devices in that SD group (used by
per-group SD shared FDB operations)
Existing callers use mlx5_ldev_for_each() which maps to
MLX5_LAG_FILTER_PORTS, preserving current behavior for non-SD
configurations.
Lifecycle and ownership
The SD LAG lifecycle is tied to the SD group, not to bonding events:
1. At PCI probe, mlx5_lag_add_mdev() creates the LAG structure
(priv.lag) for each LAG-capable PF. e.g.: SD primary devices
2. During mlx5_sd_init(), after the SD group is fully formed (primary
and secondaries paired), sd_lag_init() registers the secondary
devices into the primary's existing priv.lag by calling
mlx5_ldev_add_mdev() with the SD group_id. The primary's lag_func
also gets its group_id set. No separate LAG instance is created.
3. After all the devices in SD group transition to switchdev,
mlx5_lag_shared_fdb_create() is invoked with the group_id to create
a software-only shared FDB scoped to that SD group. This sets
sd_fdb_active on all lag_func entries in the group. No FW LAG
commands are issued since SD devices share the same physical port.
4. If MPESW (multi-port eswitch) is enabled on top of SD groups, the
per-group SD shared FDB is torn down first, then MPESW shared FDB is
created spanning all devices (ports + SD secondaries) using
MLX5_LAG_FILTER_ALL. On MPESW disable, per-group SD shared FDB is
restored.
5. On SD teardown (mlx5_sd_cleanup or device unbind), sd_lag_cleanup()
removes secondaries from priv.lag and clears the primary's group_id.
The LAG structure itself is not destroyed.
The sd_fdb_active flag is set on all lag_func entries in a group (not
just the primary), so any device can detect the SD shared FDB state
during lag_disable_change teardown without needing to look up peer
entries.
SD shared FDB is a pure software construct -- unlike regular LAG modes
(ROCE, SRIOV, MPESW), it does not issue FW create_lag/destroy_lag
commands. The software vport LAG for SD is implemented via eswitch
egress ACL bounce rules, managed by the IB layer through
mlx5_eth_lag_init(). And the software LAG demux is implemented via
steering rules that utilize new destination, VHCA_RX.
Devcom support (patches 2-3):
- Expose locked variant of send_event
- Add DEVCOM_CANT_FAIL for non-rollback events
SD core hardening (patches 4-6):
- Make primary/secondary role determination more robust
- Add L2 table silent mode query support
- Expand vport metadata for SD secondary devices
SD switchdev transition (patches 7-8):
- Support switchdev mode transition with shared FDB
- Notify SD on eswitch disable
LAG integration (patches 9-12):
- Store demux resources per master lag_func
- Disable both regular and SD LAG on lag_disable_change
- Introduce software vport LAG implementation
- Add MPESW over SD LAG support
Deferred init (patches 13-14):
- Tie rep load/unload to SD LAG state
- Defer vport metadata init until SD is ready
Enablement (patch 15):
- Enable SD over ECPF and allow switchdev transition
Shay Drory [Fri, 12 Jun 2026 11:39:04 +0000 (14:39 +0300)]
net/mlx5: SD, enable SD over ECPF and allow switchdev transition
Remove the restriction blocking SD on embedded CPU PFs (ECPF), enabling
SD functionality on BlueField DPUs. Remove the blocker preventing SD
devices from transitioning to switchdev mode.
The infrastructure added in earlier patches properly handles this case.
Shay Drory [Fri, 12 Jun 2026 11:39:03 +0000 (14:39 +0300)]
net/mlx5: SD, defer vport metadata init until SD is ready
Allow SD devices to transition to switchdev before the SD group is
fully up. Metadata allocation requires the SD group to be ready, so
defer it from esw_offloads_enable() until SD shared-FDB activation.
Add mlx5_esw_offloads_init_deferred_metadata() which allocates per-vport
metadata and refreshes the ingress ACLs that were previously programmed
with metadata=0. The helper is idempotent and can be called multiple
times.
Shay Drory [Fri, 12 Jun 2026 11:39:02 +0000 (14:39 +0300)]
net/mlx5: E-Switch, Tie rep load/unload to SD LAG state
On an SD device, vport representors are not functional until the SD
group is combined and shared FDB is active. Skip the initial load and
the reload paths in that window; reps are loaded as part of the SD LAG
activation flow once it becomes active.
In addition, explicitly unload representors when SD LAG is destroyed.
Shay Drory [Fri, 12 Jun 2026 11:39:01 +0000 (14:39 +0300)]
net/mlx5: LAG, add MPESW over SD LAG support
Enable MPESW LAG creation over SD LAG members, forming a composite LAG
hierarchy. This allows bonding multiple SD groups together under a
single MPESW configuration with shared FDB.
When enabling composite MPESW, the individual SD LAG shared FDB
configurations are temporarily torn down and recreated when the
composite LAG is disabled.
Shay Drory [Fri, 12 Jun 2026 11:39:00 +0000 (14:39 +0300)]
net/mlx5: LAG, introduce software vport LAG implementation
SD LAG is a virtual LAG without hardware LAG support, so it cannot use
the firmware vport LAG commands. Implement a software-based vport LAG
using egress ACL bounce rules.
Add esw_set_slave_egress_rule() to create an egress ACL rule on the
slave's manager vport that bounces traffic to the master's manager
vport. This achieves the same traffic steering as hardware vport LAG.
Redirect mlx5_cmd_create_vport_lag() and mlx5_cmd_destroy_vport_lag()
to the software implementation when operating in SD LAG mode.
In addition, adjust lag_demux creation to check SD LAG mode as well.
Shay Drory [Fri, 12 Jun 2026 11:38:59 +0000 (14:38 +0300)]
net/mlx5: LAG, disable both regular and SD LAG on lag_disable_change
Extend mlx5_lag_disable_change() to properly disable both regular LAG
and SD LAG when requested. Each LAG type uses its own devcom component
for locking.
Use mlx5_sd_get_devcom() helper to retrieve the SD devcom component,
needed for proper locking when disabling SD LAG.
Shay Drory [Fri, 12 Jun 2026 11:38:58 +0000 (14:38 +0300)]
net/mlx5: LAG, store demux resources per master lag_func
The lag demux resources (flow table, flow group, and rules xarray)
are stored on the shared ldev. With Socket Direct, multiple SD groups
each create their own demux FT/FG during their master's IB device
initialization. Since they all write to the same ldev fields, the
second group's init overwrites the first group's pointers, leaking
the first group's FT/FG.
During teardown, the cleanup uses the overwritten pointers, destroying
the wrong group's resources and leaving leaked flow tables in the LAG
namespace. These leaked tables can interfere with subsequently created
demux tables.
Move the demux resources from the shared ldev to per-master lag_func
instances. Each master device now owns its own independent demux
state. The rule_add and rule_del helpers look up the appropriate
master's lag_func via the existing filter/group infrastructure.
Shay Drory [Fri, 12 Jun 2026 11:38:57 +0000 (14:38 +0300)]
net/mlx5: E-Switch, notify SD on eswitch disable
When eswitch is disabled, notify the SD layer so it can clean up
SD-specific resources such as the TX flow table root configuration
on secondary devices.
Shay Drory [Fri, 12 Jun 2026 11:38:56 +0000 (14:38 +0300)]
net/mlx5: SD, support switchdev mode transition with shared FDB
When the eswitch transitions, propagate the change to SD: secondaries
get their TX flow table root reconfigured for the new mode, and when
all group devices move to switchdev, the per-group shared FDB is
activated.
Shared FDB activation is best-effort - failure does not block the
eswitch transition; the next transition retries.
Note: the existing mlx5_get_sd() guard that blocks switchdev for SD
devices is intentionally retained. It will be removed once all
supporting patches are in place.
Shay Drory [Fri, 12 Jun 2026 11:38:55 +0000 (14:38 +0300)]
net/mlx5: SD, expend vport metadata for SD secondary devices
In Socket Direct configurations the primary and secondary PFs share the
same native_port_num. The eswitch vport metadata encodes pf_num in its
upper bits to distinguish vports across PFs. Without SD-awareness, both
PFs generate identical metadata, causing FDB rules to steer traffic to
the wrong representor.
Add mlx5_sd_pf_num_get() which remaps the pf_num for SD devices.
Use it so each PF in an SD group produces unique vport metadata.
Shay Drory [Fri, 12 Jun 2026 11:38:54 +0000 (14:38 +0300)]
net/mlx5: SD, add L2 table silent mode query support
Add mlx5_fs_cmd_query_l2table_silent() to query the current silent mode
state from firmware. This allows detecting if firmware has already put
secondary devices into silent mode.
During SD group registration, query the silent mode of each device. If
a device is already in silent mode (set by firmware), record this in
the fw_silents_secondaries flag and use it to help determine the
primary/secondary roles.
When fw_silents_secondaries is set, skip the driver-initiated silent
mode set/unset operations since firmware manages this state. This
handles configurations where firmware persistently silences secondary
devices.
Shay Drory [Fri, 12 Jun 2026 11:38:53 +0000 (14:38 +0300)]
net/mlx5: SD, make primary/secondary role determination more robust
Refactor SD group registration to use devcom event-driven role
determination to ensure SD is marked as ready only after roles are fully
assigned and the group state is consistent, making outside accessors,
which will be added in downstream patches, safe to use without races.
The devcom events:
- SD_PRIMARY_SET event: each device compares bus numbers with peers
to determine which should be primary
- SD_SECONDARIES_SET event: secondaries register themselves with the
elected primary device
Shay Drory [Fri, 12 Jun 2026 11:38:52 +0000 (14:38 +0300)]
net/mlx5: devcom, add DEVCOM_CANT_FAIL for non-rollback events
Some devcom events are not expected to fail. Rather than attempting
a rollback that may not be meaningful, allow callers to pass
DEVCOM_CANT_FAIL as the rollback_event to indicate that the event
handler should not fail. If it does, emit a warning and stop
propagating to further peers, but skip the rollback path.
Shay Drory [Fri, 12 Jun 2026 11:38:51 +0000 (14:38 +0300)]
net/mlx5: devcom, expose locked variant of send_event
Factor mlx5_devcom_send_event() into two functions:
- mlx5_devcom_locked_send_event(): performs the dispatch (and
rollback) with comp->sem already held by the caller.
- mlx5_devcom_send_event(): unchanged wrapper that takes comp->sem,
calls the locked variant, and releases it.
This lets callers bracket multiple event broadcasts under a single
held write lock, eliminating the gap between consecutive dispatches
where peer state could change.
SD secondary devices share the primary's uplink and do not have
their own uplink representor. When reloading IB reps on secondary
devices, skip the uplink and only load VF/SF vport IB reps.
Takashi Iwai [Mon, 15 Jun 2026 18:19:22 +0000 (20:19 +0200)]
Merge tag 'asoc-v7.2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Updates for v7.2
There's been quite a lot of framework improvements this time around,
though mainly cleanups and robustness rather than user visible features.
The same pattern is seen with a lot of the driver work that's going on,
there are new features but a huge proportion of this is bug fixing and
cleanup work. We also have a good selectio of new device support.
- Improvements to SDCA jack handling from Charles Keepax.
- Use of device links to make suspend handling more robust from Richard
Fitzgerald.
- Use of a new helper to factor out a common pattern in SoundWire
enmeration from Charles Keepax.
- Slimming down of the component from Kuninori Morimoto.
- Simplification of format auto selection from Kuninori Morimoto.
- Lots of conversions to guard() from Bui Duc Phuc.
- Addition of a simple-amplifier driver supporting more featureful GPIO
controller amplifiers than the previous basic driver from Herve
Codina.
- Support for AMD ACP 7.x, Cirrus Logic CS42448/CS42888, Everest Semi
ES9356, Mediatek MT2701 and MT8196, Renesas RZ/G3E, Spacemit K3,
Texas Instruments TAC5xx2 and TAS67524.
Lianqin Hu [Mon, 15 Jun 2026 12:05:01 +0000 (12:05 +0000)]
ALSA: usb-audio: Add iface reset and delay quirk for XIBERIA K03S
Setting up the interface when suspended/resumeing fail on this card.
Adding a reset and delay quirk will eliminate this problem.
usb 1-1: New USB device found, idVendor=36f9, idProduct=c009
usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 1-1: Product: XIBERIA K03S
usb 1-1: Manufacturer: Actions
usb 1-1: usb_probe_device
Paolo Bonzini [Mon, 15 Jun 2026 13:38:14 +0000 (15:38 +0200)]
Merge tag 'kvm-riscv-7.2-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv changes for 7.2
- Batch G-stage TLB flushes for GPA range based page table updates
- Convert HGEI line management to fully per-HART
- Fix missing CSR dirty marking when FWFT state updated via ONE_REG
- Fix stale FWFT feature exposure to Guest/VM
- Speed up dirty logging write faults using MMU rwlock and atomic
PTE updates using cmpxchg() for permission-only changes
- Use flexible array for APLIC IRQ state
- Use kvm_slot_dirty_track_enabled() for logging enable check on
a memslot
- Avoid skipping valid pages in kvm_riscv_gstage_wp_range()
- Avoid skipping valid pages in kvm_riscv_gstage_unmap_range()
- Use endian-specific __lelong for NACL shared memory
Paolo Bonzini [Mon, 15 Jun 2026 13:36:43 +0000 (15:36 +0200)]
Merge tag 'kvm-s390-next-7.2-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: New features for 7.2
New features for 7.2 for KVM/s390:
* KVM_PRE_FAULT_MEMORY support
* Support for 2G hugepages
* Support for the ASTFLEIE 2 facility
* kvm_arch_set_irq_inatomic Fast Inject
* Fix potential leak of uninitialized bytes
Jiri Kosina [Fri, 12 Jun 2026 15:48:22 +0000 (17:48 +0200)]
HID: hidpp: fix potential UAF in hidpp_connect_event()
If input_register_device() fails, we call input_free_device(), but keep
stale pointer to the old device in hidpp->input, which could potentially
lead to UAF. Fix that by resetting it to NULL before returning from
hidpp_connect_event().
Zhenghang Xiao [Mon, 15 Jun 2026 10:25:56 +0000 (12:25 +0200)]
fuse-uring: clear ent->fuse_req in commit_fetch error path
fuse_uring_commit_fetch() error path called fuse_request_end(req) without
clearing ent->fuse_req when fuse_ring_ent_set_commit() fails. The
still-pending fuse_uring_send_in_task() task-work later dereferences the
dangling pointer through fuse_uring_prepare_send(), causing a
use-after-free.
End the request with fuse_uring_req_end(), which handles all conditions
already.
Annotation/edition by Bernd: The UAF should be fixed by other means already
and actually has to be avoided that way.
Just checking for ent->fuse_req == NULL in fuse_uring_send_in_task()
would be prone to race conditions, because if malicious userspace
would commit requests that have passed the NULL check, but are
in doing args copy, it would still trigger a use-after-free.
Setting ent->fuse_req = NULL in fuse_uring_commit_fetch() still
makes sense, though.
KVM: s390: Introducing kvm_arch_set_irq_inatomic fast inject
s390 needs a fast path for irq injection, and along those lines we
introduce kvm_arch_set_irq_inatomic. Instead of placing all interrupts on
the global work queue as it does today, this patch provides a fast path for
irq injection.
The inatomic fast path cannot lose control since it is running with
interrupts disabled. This meant making the following changes that exist on
the slow path today. First, the adapter_indicators page needs to be mapped
since it is accessed with interrupts disabled, so we added map/unmap
functions. Second, access to shared resources between the fast and slow
paths needed to be changed from mutex and semaphores to spin_lock's.
Finally, the memory allocation on the slow path utilizes GFP_KERNEL_ACCOUNT
but we had to implement the fast path with GFP_ATOMIC allocation. Each of
these enhancements were required to prevent blocking on the fast inject
path.
Fencing of Fast Inject in Secure Execution environments is enabled in the
patch series by not mapping adapter indicator pages. In Secure Execution
environments the path of execution available before this patch is followed.
Statistical counters have been added to enable analysis of irq injection on
the fast path and slow path including io_390_inatomic, io_flic_inject_airq,
io_set_adapter_int and io_390_inatomic_no_inject. The no inject counter
captures adapter masked, coalesced and suppressed interrupts.
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Douglas Freimuth <freimuth@linux.ibm.com> Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-ID: <20260604192755.203143-4-freimuth@linux.ibm.com>
KVM: s390: Enable adapter_indicators_set to use mapped pages
The s390 adapter_indicators_set function can now be optimized to use
long-term mapped pages when available so that work can be
processed on a fast path when interrupts are disabled.
If adapter indicator pages are not mapped then local mapping is
done on a slow path as it is prior to this patch. For example, Secure
Execution environments will take the local mapping path as it does prior to
this patch.
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Douglas Freimuth <freimuth@linux.ibm.com> Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-ID: <20260604192755.203143-3-freimuth@linux.ibm.com>
KVM: s390: Add map/unmap ioctl and clean mappings post-guest
s390 needs map/unmap ioctls, which map the adapter set
indicator pages, so the pages can be accessed when interrupts are
disabled. The mappings are cleaned up when the guest is removed.
pin_user_pages_remote is used for both the ioctl as well
as the pin-on-demand logic in adapter_indicators_set().
Map/Unmap ioctls are fenced in order to avoid the longterm pinning
in Secure Execution environments. In Secure Execution
environments the path of execution available before this patch is followed.
Statistical counters to count map/unmap functions for adapter indicator
pages are added. The counters can be used to analyze
map/unmap functions in non-Secure Execution environments and similarly
can be used to analyze Secure Execution environments where the counters
will not be incremented as the adapter indicator pages are not mapped.
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Douglas Freimuth <freimuth@linux.ibm.com> Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-ID: <20260604192755.203143-2-freimuth@linux.ibm.com>
Joanne Koong [Fri, 12 Jun 2026 21:05:09 +0000 (14:05 -0700)]
fuse-uring: use named constants for io-uring iovec indices
Replace magic indices 0 and 1 for the iovec array with named constants
FUSE_URING_IOV_HEADERS and FUSE_URING_IOV_PAYLOAD. This makes the usages
self-documenting and prepares for buffer ring support which will also
reference these iovec slots by index.
Reviewed-by: Bernd Schubert <bernd@bsbernd.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Baokun Li <libaokun@linux.alibaba.com> Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Joanne Koong [Fri, 12 Jun 2026 21:05:07 +0000 (14:05 -0700)]
fuse-uring: use enum types for header copying
Use enum types to identify which part of the header needs to be copied.
This improves the interface and will simplify both kernel-space and
user-space header addresses copying when buffer rings are added.
Reviewed-by: Bernd Schubert <bschubert@ddn.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Baokun Li <libaokun@linux.alibaba.com> Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Joanne Koong [Fri, 12 Jun 2026 21:05:06 +0000 (14:05 -0700)]
fuse-uring: refactor io-uring header copying from ring
Move header copying from ring logic into a new copy_header_from_ring()
function. This makes the copy_from_user() logic more clear and
centralizes error handling / rate-limited logging.
Reviewed-by: Bernd Schubert <bschubert@ddn.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Baokun Li <libaokun@linux.alibaba.com> Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Joanne Koong [Fri, 12 Jun 2026 21:05:05 +0000 (14:05 -0700)]
fuse-uring: refactor io-uring header copying to ring
Move header copying to ring logic into a new copy_header_to_ring()
function. This makes the copy_to_user() logic more clear and centralizes
error handling / rate-limited logging.
Reviewed-by: Bernd Schubert <bschubert@ddn.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Baokun Li <libaokun@linux.alibaba.com> Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Joanne Koong [Fri, 12 Jun 2026 21:05:04 +0000 (14:05 -0700)]
fuse-uring: separate next request fetching from sending logic
Simplify the logic for fetching + sending off the next request.
This gets rid of fuse_uring_send_next_to_ring() which contained
duplicated logic from fuse_uring_send(). This decouples request fetching
from the send operation, which makes the control flow clearer and
reduces unnecessary parameter passing.
Reviewed-by: Bernd Schubert <bschubert@ddn.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Baokun Li <libaokun@linux.alibaba.com> Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Jun Wu [Fri, 15 May 2026 00:14:12 +0000 (17:14 -0700)]
fuse: invalidate readdir cache on epoch bump
FUSE_NOTIFY_INC_EPOCH invalidates dentries, but does not invalidate cached
readdir results. A process with cwd inside a FUSE mount can therefore
observe stale readdir(".") output after an epoch bump.
Fix this by recording epoch in the readdir cache and checking it on reuse.
Minimal reproducer:
- mount a tiny FUSE fs with an empty root directory
- on opendir, enable fi->cache_readdir and fi->keep_cache
- chdir into the mount and call readdir(".") to populate readdir cache
- make the FUSE server report one file in the root directory
- send only FUSE_NOTIFY_INC_EPOCH
- call readdir(".") again; before this change it stays stale, after this
change it sees the new file
Fixes: 2396356a945b ("fuse: add more control over cache invalidation behaviour") Signed-off-by: Jun Wu <quark@meta.com> Reviewed-by: Joanne Koong <joannelkoong@gmail.com> Reviewed-by: Luis Henriques <luis@igalia.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
virtio-fs: avoid double-free on failed queue setup
virtio_fs_setup_vqs() allocates fs->vqs and fs->mq_map before calling
virtio_find_vqs(). If virtio_find_vqs() fails, the error path frees both
pointers and returns an error to virtio_fs_probe().
virtio_fs_probe() then drops the last kobject reference, and
virtio_fs_ktype_release() frees fs->vqs and fs->mq_map again. This leaves
dangling pointers in struct virtio_fs and can trigger a double-free during
probe failure cleanup.
Set fs->vqs and fs->mq_map to NULL immediately after kfree() in the
virtio_fs_setup_vqs() error path so that the later kobject release sees an
uninitialized state and kfree(NULL) becomes harmless.
This can be reproduced when a broken virtio-fs device advertises more
request queues than the transport actually provides. In that case
virtio_find_vqs() fails while setting up the extra queue, and the probe
path reaches the double-free cleanup sequence.
fuse: invalidate page cache after DIO and async DIO writes
This fixe does page cache invalidation after DIO and async DIO writes for
both O_DIRECT and FOPEN_DIRECT_IO cases.
Commit b359af8275a9 ("fuse: Invalidate the page cache after FOPEN_DIRECT_IO
write") fixed xfstests generic/209 for DIO writes in the FOPEN_DIRECT_IO
path. DIO writes without FOPEN_DIRECT_IO are already handled by
generic_file_direct_write().
However, async DIO writes (xfstests generic/451) remain unhandled.
After this fix:
- Async write with FUSE_ASYNC_DIO:
invalidate in fuse_aio_invalidate_worker()
- Otherwise (Sync or async write without FUSE_ASYNC_DIO):
- With FOPEN_DIRECT_IO:
invalidate in fuse_direct_write_iter()
- Without FOPEN_DIRECT_IO:
invalidate in generic_file_direct_write()
Workqueue is required for async write invalidation to prevent deadlock:
calling it directly in the I/O end routine (which is in fuse worker thread
context) can block on a folio lock held by a buffered I/O thread waiting
for the same fuse worker thread.
Li Wang [Fri, 22 May 2026 08:38:05 +0000 (16:38 +0800)]
fuse: use READ_ONCE in fuse_chan_num_background()
fuse_chan_num_background() is called without holding fch->bg_lock (for
example from fuse_writepages() to compare against fc->congestion_threshold),
while fch->num_background is updated under bg_lock in dev.c and dev_uring.c.
This is the same locked-write/lockless-read pattern already used for
max_background in fuse_chan_max_background().
Use READ_ONCE() on the read side so that:
- The compiler does not cache or coalesce loads of a value that may change
concurrently on another CPU.
- Prevent KCSAN from reporting an unexpected race.
Signed-off-by: Li Wang <liwang@kylinos.cn> Fixes: 670d21c6e17f ("fuse: remove reliance on bdi congestion") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
fuse: dax: Move long delayed work on system_dfl_long_wq
Currently the code enqueue work items using {queue|mod}_delayed_work(),
using system_long_wq. This workqueue should be used when long works are
expected and it is a per-cpu workqueue.
The function(s) end up calling __queue_delayed_work(), which set a global
timer that could fire anywhere, enqueuing the work where the timer fired.
Unbound works could benefit from scheduler task placement, to optimize
performance and power consumption. Long work shouldn't stick to a single
CPU.
Recently, a new unbound workqueue specific for long running work has
been added:
c116737e972e ("workqueue: Add system_dfl_long_wq for long unbound works")
Since the workqueue work doesn't rely on per-cpu variables, there is no
obvious reason that justify the use of a per-cpu workqueue. So change
system_long_wq with system_dfl_long_wq so that the work may benefit from
scheduler task placement.
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Amir Goldstein [Fri, 5 Jun 2026 09:26:47 +0000 (11:26 +0200)]
fuse: add fuse_request_sent tracepoint
This new tracepoint complements fuse_request_send (enqueue) and
fuse_request_end (completion). It fires after the request has been
successfully copied to the daemon's buffer, just before the daemon
can start to process it.
fuse_request_sent does not fire if the copy of the request fails.
It also does not fire for NOTIFY_REPLY, which fires the _end tracepoint
at the end of copy.
This is needed for tools tracking the in-flight state of user initiated
fuse requests.
Tim Bird [Thu, 4 Jun 2026 19:55:08 +0000 (13:55 -0600)]
fuse: Add SPDX ID lines to some files
Some fuse source files are missing SPDX-License-Identifier
lines. Add appropriate IDs to these files, and remove old
license references from the headers.
Signed-off-by: Tim Bird <tim.bird@sony.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>