Stephan Gerhold [Wed, 9 Jul 2025 10:08:55 +0000 (12:08 +0200)]
clk: qcom: videocc-sm8550: Add separate frequency tables for X1E80100
X1E80100 videocc is identical to the one in SM8550, aside from slightly
different recommended PLL frequencies. Add the separate frequency tables
for that and apply them if the qcom,x1e80100-videocc compatible is used.
Stephan Gerhold [Wed, 9 Jul 2025 10:08:54 +0000 (12:08 +0200)]
clk: qcom: videocc-sm8550: Allow building without SM8550/SM8560 GCC
From the build perspective, the videocc-sm8550 driver doesn't depend on
having one of the GCC drivers enabled. It builds just fine without the GCC
driver. In practice, it doesn't make much sense to have it enabled without
the GCC driver, but currently this extra dependency is inconsistent with
most of the other VIDEOCC entries in Kconfig. This can easily cause
confusion when you see the VIDEOCC options for some of the SoCs but not for
all of them.
Let's just drop the depends line to allow building the videocc driver
independent of the GCC selection. Compile testing with randconfig will also
benefit from keeping the dependencies minimal.
X1E80100 videocc is largely identical to SM8550, but needs slightly
different PLL frequencies. Add a separate qcom,x1e80100-videocc compatible
to the existing schema used for SM8550.
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Link: https://lore.kernel.org/r/20250709-x1e-videocc-v2-1-ad1acf5674b4@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
clk: qcom: tcsrcc-sm8650: Add support for Milos SoC
The Milos SoC has a very similar tcsrcc block, only TCSR_UFS_CLKREF_EN
uses different regs, and both TCSR_USB2_CLKREF_EN and
TCSR_USB3_CLKREF_EN are not present.
Modify these resources at probe if we're probing for Milos.
Brian Masney [Thu, 3 Jul 2025 23:22:30 +0000 (19:22 -0400)]
clk: qcom: spmi-pmic-div: convert from round_rate() to determine_rate()
The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.
Brian Masney [Thu, 3 Jul 2025 23:22:29 +0000 (19:22 -0400)]
clk: qcom: smd-rpm: convert from round_rate() to determine_rate()
The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.
Brian Masney [Thu, 3 Jul 2025 23:22:28 +0000 (19:22 -0400)]
clk: qcom: rpmh: convert from round_rate() to determine_rate()
The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.
Brian Masney [Thu, 3 Jul 2025 23:22:27 +0000 (19:22 -0400)]
clk: qcom: rpm: convert from round_rate() to determine_rate()
The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.
Brian Masney [Thu, 3 Jul 2025 23:22:26 +0000 (19:22 -0400)]
clk: qcom: gcc-ipq4019: convert from round_rate() to determine_rate()
The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.
Taniya Das [Wed, 2 Jul 2025 09:04:21 +0000 (14:34 +0530)]
clk: qcom: clk-alpha-pll: Add support for dynamic update for slewing PLLs
The alpha PLLs which slew to a new frequency at runtime would require
the PLL to calibrate at the mid point of the VCO. Add the new PLL ops
which can support the slewing of the PLL to a new frequency.
George Moussalem [Mon, 30 Jun 2025 12:35:00 +0000 (16:35 +0400)]
clk: qcom: gcc-ipq5018: fix GE PHY reset
The MISC reset is supposed to trigger resets across the MDC, DSP, and
RX & TX clocks of the IPQ5018 internal GE PHY. So let's set the bitmask
of the reset definition accordingly in the GCC as per the downstream
driver.
Loic Poulain [Fri, 13 Jun 2025 10:22:45 +0000 (12:22 +0200)]
clk: qcom: gcc-qcm2290: Set HW_CTRL_TRIGGER for video GDSC
The venus video driver uses the dev_pm_genpd_set_hwmode() API to switch
the video GDSC between HW and SW control modes at runtime. This requires
the domain to have the HW_CTRL_TRIGGER flag.
George Moussalem [Fri, 16 May 2025 12:36:09 +0000 (16:36 +0400)]
dt-bindings: clock: qcom: Add CMN PLL support for IPQ5018 SoC
The CMN PLL block in the IPQ5018 SoC takes 96 MHz as the reference
input clock. Its output clocks are the XO (24 MHz), sleep (32 kHz), and
ethernet (50 MHz) clocks.
firmware: qcom: scm: request the waitqueue irq *after* initializing SCM
There's a subtle race in the SCM driver: we assign the __scm pointer
before requesting the waitqueue interrupt. Assigning __scm marks the SCM
API as ready to accept calls. It's possible that a user makes a call
right after we set __scm and the firmware raises an interrupt before the
driver's ready to service it. Move the __scm assignment after we request
the interrupt.
This has the added benefit of allowing us to drop the goto label.
firmware: qcom: scm: initialize tzmem before marking SCM as available
Now that qcom_scm_shm_bridge_enable() uses the struct device passed to
it as argument to make the QCOM_SCM_MP_SHM_BRIDGE_ENABLE SCM call, we
can move the TZMem initialization before the assignment of the __scm
pointer in the SCM driver (which marks SCM as ready to users) thus
fixing the potential race between consumer calls and the memory pool
initialization.
firmware: qcom: scm: take struct device as argument in SHM bridge enable
qcom_scm_shm_bridge_enable() is used early in the SCM initialization
routine. It makes an SCM call and so expects the internal __scm pointer
in the SCM driver to be assigned. For this reason the tzmem memory pool
is allocated *after* this pointer is assigned. However, this can lead to
a crash if another consumer of the SCM API makes a call using the memory
pool between the assignment of the __scm pointer and the initialization
of the tzmem memory pool.
As qcom_scm_shm_bridge_enable() is a special case, not meant to be
called by ordinary users, pull it into the local SCM header. Make it
take struct device as argument. This is the device that will be used to
make the SCM call as opposed to the global __scm pointer. This will
allow us to move the tzmem initialization *before* the __scm assignment
in the core SCM driver.
firmware: qcom: scm: remove unused arguments from SHM bridge routines
qcom_scm_shm_bridge_create() and qcom_scm_shm_bridge_delete() take
struct device as argument but don't use it. Remove it from these
functions' prototypes.
====================
A tool to verify the BPF memory model
I am building a tool called blitmus[1] that converts memory model litmus
tests written in C into BPF programs that run in parallel to verify that the
JITs are enforcing the memory model correctly.
With this tool I was able to find a bug in the implementation of the smp_mb()
in the selftests.
Using the following litmus test:
C SB+fencembonceonces
(*
* Result: Never
*
* This litmus test demonstrates that full memory barriers suffice to
* order the store-buffering pattern, where each process writes to the
* variable that the preceding process reads. (Locking and RCU can also
* suffice, but not much else.)
*)
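The store-buffering pattern described in the comment above can be explored with a small Python sketch. It is not blitmus, herd7 or LKMM: it assumes that a full barrier simply forbids reordering the store and the load within a process, and that without the barrier the two may swap. The names P0, P1, interleavings and sb_outcomes are made up for this illustration.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


# Each process stores 1 to one variable, then loads the other one.
P0 = [("store", "x", None), ("load", "y", "r0")]
P1 = [("store", "y", None), ("load", "x", "r1")]


def run(schedule):
    """Execute one global order of operations against shared memory."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for op, var, reg in schedule:
        if op == "store":
            mem[var] = 1
        else:
            regs[reg] = mem[var]
    return regs["r0"], regs["r1"]


def sb_outcomes(fenced):
    """Collect all (r0, r1) results.

    Assumption of this sketch: with a full barrier the program order is kept;
    without it, the store and the load of a process may also execute swapped.
    """
    orders0 = [P0] if fenced else [P0, P0[::-1]]
    orders1 = [P1] if fenced else [P1, P1[::-1]]
    results = set()
    for p0 in orders0:
        for p1 in orders1:
            for schedule in interleavings(p0, p1):
                results.add(run(schedule))
    return results


print("fenced:  ", sorted(sb_outcomes(fenced=True)))
print("unfenced:", sorted(sb_outcomes(fenced=False)))
```

In this model the outcome r0 == 0 and r1 == 0 only shows up in the unfenced run, which is the behaviour a broken smp_mb() would allow.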
As BPF doesn't include any barrier instructions, smp_mb() is implemented
by doing a dummy value-returning atomic operation. Such an operation
acts as a full barrier as enforced by LKMM and also by the work-in-progress
BPF memory model.
If the returned value is not used, clang[1] can optimize the
value-returning atomic instruction into a normal atomic instruction, which
provides no ordering guarantees.
Mark the variable as volatile so the above optimization is never
performed and smp_mb() works as expected.
Tao Chen [Wed, 16 Jul 2025 13:46:53 +0000 (21:46 +0800)]
bpf: Add struct bpf_token_info
Commit 35f96de04127 ("bpf: Introduce BPF token object") added the
BPF token as a new kind of BPF kernel object. BPF_OBJ_GET_INFO_BY_FD is
already used to get BPF object info, so we can also get token info with
this cmd.
One usage scenario: when a program fails to run with a token because of
a permission failure, we can use this API to report what the BPF token
allows, for debugging.
selftests/bpf: Stress test attaching a BPF prog to another BPF prog
Add a test that invokes a BPF prog in a loop, while concurrently
attaching and detaching another BPF prog to and from it. This helps
identify race conditions in bpf_arch_text_poke().
s390/bpf: Fix bpf_arch_text_poke() with new_addr == NULL again
Commit 7ded842b356d ("s390/bpf: Fix bpf_plt pointer arithmetic") has
accidentally removed the critical piece of commit c730fce7c70c
("s390/bpf: Fix bpf_arch_text_poke() with new_addr == NULL"), causing
intermittent kernel panics in e.g. perf's on_switch() prog to reappear.
The last iterators update (commit 515ee52b2224 ("bpf: make preloaded
map iterators to display map elements count")) missed the big-endian
skeleton. Update it by running "make big" with Debian clang version
21.0.0 (++20250706105601+01c97b4953e8-1~exp1~20250706225612.1558).
====================
This series follows up on the one introducing 9+ args for tracing
programs [1]. It has been observed with this series that there are cases
for which we can not identify accurately the location of the target
function arguments to prepare correctly the corresponding BPF
trampoline. This is the case for example if:
- the function consumes a struct variable _by value_
- it is passed on the stack (no more register available for it)
- it has some __packed__ or __aligned(X)__ attribute
As a consequence, a small restrictive check has been added to the ARM64
side, highlighting that other architectures supporting 9+ args in BPF
trampolines already suffer from the same issue. After some discussion
and a few attempts, the chosen solution is, rather than applying the same
constraint to all JIT compilers, to prevent such functions from being
encoded at all in BTF info ([2]). As the pahole side is close to being
integrated, we can now remove the restrictive check from the kernel side.
selftests/bpf: enable tracing_struct tests for arm64
Now that the constraint preventing attachment to functions consuming
struct on stack has been removed from the kernel (and moved to pahole,
with a slightly smarter detection, to prevent only those that are
packed), re-enable the tracing_struct tests for arm64.
While introducing support for 9+ arguments for tracing programs on
ARM64, commit 9014cf56f13d ("bpf, arm64: Support up to 12 function
arguments") has also introduced a constraint preventing BPF trampolines
from being generated if the target function consumes a struct argument
passed on stack, because of uncertainties around the exact struct
location: if the struct has been marked as packed or with a custom
alignment, this info is not reflected in BTF data, and so generated
tracing trampolines could read the target function arguments at wrong
offsets.
This issue is not specific to ARM64: there has been an attempt (see [1])
to bring the same constraint to other architectures JIT compilers. But
discussions following this attempt led to the move of this constraint
out of the kernel (see [2]): instead of preventing the kernel from
generating trampolines for those functions consuming structs on stack,
it is simpler to just make sure that those functions with uncertain
struct arguments location are not encoded in BTF information, and so
that one can not even attempt to attach a tracing program to such
function. The task is then deferred to pahole (see [3]).
Now that the constraint is handled by pahole, remove it from the arm64
JIT compiler to keep it simple.
sched/ext: Prevent update_locked_rq() calls with NULL rq
Avoid invoking update_locked_rq() when the runqueue (rq) pointer is NULL
in the SCX_CALL_OP and SCX_CALL_OP_RET macros.
Previously, calling update_locked_rq(NULL) with preemption enabled could
trigger the following warning:
BUG: using __this_cpu_write() in preemptible [00000000]
This happens because __this_cpu_write() is unsafe to use in preemptible
context.
rq is NULL when an op is invoked from an unlocked context. In such cases, we
don't need to store any rq, since the value should already be NULL
(unlocked). Ensure that update_locked_rq() is only called when rq is
non-NULL, preventing calls to __this_cpu_write() in preemptible context.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Fixes: 18853ba782bef ("sched_ext: Track currently locked rq")
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # v6.15
====================
net/mlx5e: Add support for PCIe congestion events
Dragos says:
PCIe congestion events are events generated by the firmware when the
device side has sustained PCIe inbound or outbound traffic above
certain thresholds. The high and low threshold are hysteresis thresholds
to prevent flapping: once the high threshold has been reached, a low
threshold event will be triggered only after the bandwidth usage went
below the low threshold.
This series adds support for receiving and exposing such events as
ethtool counters.
2 new pairs of counters are exposed: pci_bw_in/outbound_high/low. These
should help the user understand if the device PCI is under pressure.
Planned followup patches:
- Allow configuration of thresholds through devlink.
- Add ethtool counter for wakeups which did not result in any state
change.
====================
Implement the PCIe Congestion Event notifier which triggers a work item
to query the PCIe Congestion Event object. The result of the congestion
state is reflected in the new ethtool stats:
* pci_bw_inbound_high: the device has crossed the high threshold for
inbound PCIe traffic.
* pci_bw_inbound_low: the device has crossed the low threshold for
inbound PCIe traffic
* pci_bw_outbound_high: the device has crossed the high threshold for
outbound PCIe traffic.
* pci_bw_outbound_low: the device has crossed the low threshold for
outbound PCIe traffic
The high and low thresholds are currently configured at 90% and 75%.
These are hysteresis thresholds which help to check if the
PCI bus on the device side is in a congested state.
If low + 1 = high then the device is in a congested state. If low == high
then the device is not in a congested state.
The counters are also documented.
A follow-up patch will make the thresholds configurable.
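The relationship between the counters and the congestion state can be sketched in Python for a single direction. This is only an illustration of the hysteresis described above, not the driver or firmware logic: the class name, the sample() method and the exact comparisons (">=" for the high threshold, "<" for the low one) are assumptions of this sketch, while the 90% and 75% values come from the text.

```python
HIGH_PCT = 90  # high threshold, as stated in the commit message
LOW_PCT = 75   # low threshold, as stated in the commit message


class CongestionCounters:
    """Toy model of one direction (inbound or outbound) of the counters."""

    def __init__(self):
        self.high = 0  # models pci_bw_*_high
        self.low = 0   # models pci_bw_*_low

    def is_congested(self):
        # low + 1 == high means congested, low == high means not congested
        return self.high == self.low + 1

    def sample(self, usage_pct):
        """Feed one bandwidth usage sample (in percent of the link)."""
        congested = self.is_congested()
        if not congested and usage_pct >= HIGH_PCT:
            self.high += 1  # high threshold event
        elif congested and usage_pct < LOW_PCT:
            self.low += 1   # low threshold event, only after a high one


counters = CongestionCounters()
for usage in [50, 92, 80, 95, 70, 60, 91]:
    counters.sample(usage)
    print(usage, counters.high, counters.low, counters.is_congested())
```

Note how the samples at 80% and 95% change nothing while the model is already congested: that is the flapping the two thresholds are meant to prevent.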
Add initial infrastructure to create and destroy the PCIe Congestion
Event object if the object is supported.
The verb for the object creation function is "set" instead of
"create" because the function will accommodate the modify operation
as well in a subsequent patch.
The next patches will hook it up to the event handler and will add
actual functionality.
The netiucv driver creates TCP/IP interfaces over IUCV between Linux
guests on z/VM and other z/VM entities.
Rationale for removal:
- NETIUCV connections are only supported for compatibility with
earlier versions and not to be used for new network setups,
since at least Linux kernel 4.0.
- No known active users, use cases, or product dependencies
- The driver is no longer relevant for z/VM networking;
preferred methods include:
* Device pass-through (e.g., OSA, RoCE)
* z/VM Virtual Switch (VSWITCH)
The IUCV mechanism itself remains supported and is actively used
via AF_IUCV, hvc_iucv, and smsg_iucv.
Signed-off-by: Nagamani PV <nagamani@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250715074210.3999296-1-wintera@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
Expose REFCLK for RMII and enable RMII
This set allows the REFCLK property to be exposed as a DT property to
properly reflect the correct RMII layout. RMII can take an externally or
internally provided REFCLK. Since this is board dependent rather than SoC
dependent, it must be exposed as a DT property for the macb driver.
This set also enables RMII mode for the SAMA7 SoCs gigabit mac.
Ryan Wanner [Mon, 14 Jul 2025 16:37:00 +0000 (09:37 -0700)]
net: cadence: macb: Expose REFCLK as a device tree property
RMII and RGMII can both use an internally or externally provided REFCLK,
at 50MHz and 125MHz respectively. Since this depends on the board that
the SoC is on, it needs to be set via the device tree.
This property flag is checked in the MACB DT node so that the REFCLK cap
is configured correctly for the way RMII or RGMII is set up on the
board.
REFCLK can be provided by an external source so this should be exposed
by a DT property. The REFCLK is used for RMII and in some SoCs that use
this driver the RGMII 125MHz clk can also be provided by an external
source.
====================
selftest: net: Add selftest for netpoll
I am submitting a new selftest for the netpoll subsystem, specifically
targeting the case where RX is polled in the TX path, which is
a case that we don't have any test for in the tree today. This happens when
netpoll_poll_dev() is called, and this test creates a scenario where that is
probable.
The test does the following:
1) Configuring a single RX/TX queue to increase contention on the
interface.
2) Generating background traffic to saturate the network, mimicking
real-world congestion.
3) Sending netconsole messages to trigger netpoll polling and monitor
its behavior.
4) Using dynamic netconsole targets via configfs, with the ability to
delete and recreate targets during the test.
5) Running bpftrace in parallel to verify that netpoll_poll_dev() is
called when expected. If it is called, then the test passes,
otherwise the test is marked as skipped.
In order to achieve it, I stole Jakub's bpftrace helper from [1], and
did some small changes that I found useful to use the helper.
So, this patchset basically contains:
1) The code stolen from Jakub
2) Improvements on bpftrace() helper
3) The selftest itself
selftests: net: add netpoll basic functionality test
Add a basic selftest for the netpoll polling mechanism, specifically
targeting the netpoll poll() side.
The test creates a scenario where network transmission is running at
maximum speed, and netpoll needs to poll the NIC. This is achieved by:
1. Configuring a single RX/TX queue to create contention
2. Generating background traffic to saturate the interface
3. Sending netconsole messages to trigger netpoll polling
4. Using dynamic netconsole targets via configfs
5. Delete and create new netconsole targets after some messages
6. Start a bpftrace in parallel to make sure netpoll_poll_dev() is
called
7. If bpftrace exists and netpoll_poll_dev() was called, stop.
The test validates a critical netpoll code path by monitoring traffic
flow and ensuring netpoll_poll_dev() is called when the normal TX path
is blocked.
This addresses a gap in netpoll test coverage for a path that is
tricky for the network stack.
selftests: drv-net: Strip '@' prefix from bpftrace map keys
The '@' prefix in bpftrace map keys is specific to bpftrace and can be
safely removed when processing results. This patch modifies the bpftrace
utility to strip the '@' from map keys before storing them in the result
dictionary, making the keys more consistent with Python conventions.
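A minimal Python sketch of the key normalization described above. The function name strip_map_prefix is invented for this illustration and is not necessarily how the selftest helper spells it.

```python
def strip_map_prefix(maps):
    """Return a copy of a bpftrace result dict with the leading '@' removed from keys."""
    return {
        (key[1:] if key.startswith("@") else key): value
        for key, value in maps.items()
    }


print(strip_map_prefix({"@hits": 3, "@bytes": 1500}))
```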
Jakub Kicinski [Mon, 14 Jul 2025 09:56:48 +0000 (02:56 -0700)]
selftests: drv-net: add helper/wrapper for bpftrace
bpftrace is very useful for low level driver testing. perf or trace-cmd
would also do for collecting data from tracepoints, but they require
much more post-processing.
Add a wrapper for running bpftrace and sanitizing its output.
bpftrace has JSON output, which is great, but it prints loose objects
and in a slightly inconvenient format. We have to read the objects
line by line, and while at it return them indexed by the map name.
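The line-by-line handling described above could look roughly like the following Python sketch. The SAMPLE_OUTPUT text is an assumption about what `bpftrace -f json` prints (one JSON object per line, with map contents in objects of type "map"), and collect_maps is a name made up for this example, not the helper added by the patch.

```python
import json

# Assumed shape of `bpftrace -f json` output: loose objects, one per line.
SAMPLE_OUTPUT = """\
{"type": "attached_probes", "data": {"probes": 1}}
{"type": "map", "data": {"@hits": 3}}
{"type": "map", "data": {"@bytes": {"eth0": 1500}}}
"""


def collect_maps(output):
    """Parse each line as its own JSON object and index map data by map name."""
    maps = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        if obj.get("type") != "map":
            continue  # skip non-map objects such as the probe count
        maps.update(obj["data"])
    return maps


print(collect_maps(SAMPLE_OUTPUT))
```

Parsing the whole output with a single json.loads() call would fail here, because the output is a sequence of objects rather than one JSON document.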
doc: xdp: Clarify driver implementation for XDP Rx metadata
Clarify that drivers must remove device-reserved metadata from the
data_meta area before passing frames to XDP programs.
Additionally, expand the explanation of how userspace and BPF programs
should coordinate the use of METADATA_SIZE, and add a detailed diagram
to illustrate pointer adjustments and metadata layout.
Also describe the requirements and constraints enforced by
bpf_xdp_adjust_meta().
Arnd Bergmann resolves compile issues with large NR_CPUS for ixgbe, fm10k,
and i40e.
For ice:
Dave adds a NULL check for LAG netdev.
Michal corrects a pointer check in debugfs initialization.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: check correct pointer in fwlog debugfs
ice: add NULL check in eswitch lag check
ethernet: intel: fix building with large NR_CPUS
====================
net: airoha: fix potential use-after-free in airoha_npu_get()
np->name was being used after calling of_node_put(np), which
releases the node and can lead to a use-after-free bug.
Previously, of_node_put(np) was called unconditionally after
of_find_device_by_node(np), which could result in a use-after-free if
pdev is NULL.
This patch moves of_node_put(np) after the error check to ensure
the node is only released after both the error and success cases
are handled appropriately, preventing potential resource issues.
Use __skb_queue_purge() instead of re-implementing it. Note that it uses
kfree_skb_reason() instead of kfree_skb() internally, and passes the
SKB_DROP_REASON_QUEUE_PURGE drop reason to the kfree_skb tracepoint.
vsock/test: fix vsock_ioctl_int() check for unsupported ioctl
`vsock_do_ioctl` returns -ENOIOCTLCMD if support for an ioctl is not
implemented, like for SIOCINQ before commit f7c722659275 ("vsock: Add
support for SIOCINQ ioctl"). In net/socket.c, -ENOIOCTLCMD is re-mapped
to -ENOTTY for the user space. So, our test suite, without that commit
applied, is failing in this way:
34 - SOCK_STREAM ioctl(SIOCINQ) functionality...ioctl(21531): Inappropriate ioctl for device
Return false in vsock_ioctl_int() to skip the test in this case as well,
instead of failing.
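The decision described above can be sketched in Python. The real test helper is written in C; the function name here is hypothetical, and treating EOPNOTSUPP as the error that was already skipped is an assumption based on the "as well" in the message.

```python
import errno


def ioctl_is_unsupported(err):
    """Return True if a failed ioctl() should make the test skip instead of fail.

    ENOTTY is what user space sees when the kernel returns -ENOIOCTLCMD.
    EOPNOTSUPP is assumed (not confirmed by the commit message) to be the
    error that was already treated as "unsupported".
    """
    return err in (errno.EOPNOTSUPP, errno.ENOTTY)


for err in (errno.ENOTTY, errno.EOPNOTSUPP, errno.EINVAL):
    action = "skip" if ioctl_is_unsupported(err) else "fail"
    print(errno.errorcode[err], action)
```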
The buggy address belongs to the object at ffff888013472900
which belongs to the cache skbuff_head_cache of size 232
The buggy address is located 216 bytes inside of
freed 232-byte region [ffff888013472900, ffff8880134729e8)
Memory state around the buggy address:
 ffff888013472880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888013472900: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888013472980: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
                                                    ^
 ffff888013472a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888013472a80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
Indeed, tcp_prune_ofo_queue() is reusing the skb dropped a few lines
above. The caller wants to enqueue 'in_skb', so let's check space against
the latter.
Lyude Paul [Thu, 10 Jul 2025 22:51:13 +0000 (18:51 -0400)]
rust: time: Pass correct timer mode ID to hrtimer_start_range_ns
While rebasing rvkms I noticed that timers I was setting seemed to have
pretty random timer values that amounted to slightly over 2x the time value I
set each time. After a lot of debugging, I finally managed to figure out
why: it seems that since we moved to Instant and Delta, we mistakenly
began passing the clocksource ID to hrtimer_start_range_ns, when we should
be passing the timer mode instead. Presumably, this works fine for simple
relative timers - but immediately breaks on other types of timers.
So, fix this by passing the ID for the timer mode instead.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Acked-by: Andreas Hindborg <a.hindborg@kernel.org>
Reviewed-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
Fixes: e0c0ab04f678 ("rust: time: Make HasHrTimer generic over HrTimerMode")
Link: https://lore.kernel.org/r/20250710225129.670051-1-lyude@redhat.com
[ Removed cast, applied `rustfmt`, fixed `Fixes:` tag. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Pavel Begunkov [Wed, 16 Jul 2025 21:04:09 +0000 (22:04 +0100)]
io_uring/zcrx: account area memory
zcrx areas can be quite large and need to be accounted and checked
against RLIMIT_MEMLOCK. In practice it shouldn't be a big issue as
the interface already requires cap_net_admin.
Christoph Paasch [Tue, 15 Jul 2025 20:20:53 +0000 (13:20 -0700)]
net/mlx5: Correctly set gso_size when LRO is used
gso_size is expected by the networking stack to be the size of the
payload (thus, not including ethernet/IP/TCP-headers). However, cqe_bcnt
is the full sized frame (including the headers). Dividing cqe_bcnt by
lro_num_seg will then give incorrect results.
For example, running a bpftrace higher up in the TCP-stack
(tcp_event_data_recv), we commonly have gso_size set to 1450 or 1451 even
though in reality the payload was only 1448 bytes.
This can have unintended consequences:
- In tcp_measure_rcv_mss() len will be for example 1450, but rcv_mss
will be 1448 (because tp->advmss is 1448). Thus, we will always
recompute scaling_ratio each time an LRO-packet is received.
- In tcp_gro_receive(), it will interfere with the decision whether or
not to flush and thus potentially result in fewer gro'ed packets.
So, we need to discount the protocol headers from cqe_bcnt so we can
actually divide the payload by lro_num_seg to get the real gso_size.
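The arithmetic can be checked with a short Python sketch. The header sizes (14-byte Ethernet, 20-byte IPv4, 32-byte TCP header including a timestamp option), the 30-segment aggregate and the round-up division are assumptions chosen for illustration. Only the 1448-byte payload and the 1450/1451 symptom come from the text above.

```python
ETH_HLEN = 14    # assumed: untagged Ethernet header
IPV4_HLEN = 20   # assumed: IPv4 header without options
TCP_HLEN = 32    # assumed: 20-byte TCP header plus 12 bytes of timestamp option
HDR_LEN = ETH_HLEN + IPV4_HLEN + TCP_HLEN
MSS = 1448       # payload size per segment, from the commit message


def div_round_up(n, d):
    """Integer division rounding up (the rounding mode is an assumption here)."""
    return -(-n // d)


def gso_size_from_frame(cqe_bcnt, lro_num_seg):
    """Old behaviour: divide the full frame length, headers included."""
    return div_round_up(cqe_bcnt, lro_num_seg)


def gso_size_from_payload(cqe_bcnt, hdr_len, lro_num_seg):
    """Fixed behaviour: discount the protocol headers before dividing."""
    return div_round_up(cqe_bcnt - hdr_len, lro_num_seg)


lro_num_seg = 30
cqe_bcnt = HDR_LEN + lro_num_seg * MSS  # full LRO frame as seen in the CQE

print("from frame:  ", gso_size_from_frame(cqe_bcnt, lro_num_seg))
print("from payload:", gso_size_from_payload(cqe_bcnt, HDR_LEN, lro_num_seg))
```

With these assumed sizes the header bytes are spread over the segments, so the frame-based result lands a few bytes above the real payload size, which matches the symptom described above.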
Gal Pressman [Tue, 15 Jul 2025 14:07:54 +0000 (17:07 +0300)]
ethtool: Don't check for RXFH fields conflict when no input_xfrm is requested
The requirement of ->get_rxfh_fields() in ethtool_set_rxfh() is there to
verify that we have no conflict of input_xfrm with the RSS fields
options, there is no point in doing it if input_xfrm is not
supported/requested.
This is under the assumption that a driver that supports input_xfrm will
also support ->get_rxfh_fields(), so add a WARN_ON() to
ethtool_check_ops() to verify it, and remove the op NULL check.
This fixes the following error in mlx4_en, which doesn't support
getting/setting RXFH fields.
$ ethtool --set-rxfh-indir eth2 hfunc xor
Cannot set RX flow hash configuration: Operation not supported
Fixes: 72792461c8e8 ("net: ethtool: don't mux RXFH via rxnfc callbacks")
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250715140754.489677-1-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hangbin Liu [Tue, 15 Jul 2025 04:34:59 +0000 (04:34 +0000)]
selftests: rtnetlink: fix addrlft test flakiness on power-saving systems
Jakub reported that the rtnetlink test for the preferred lifetime of an
address has become quite flaky. The issue started appearing around the 6.16
merge window in May, and the test fails with:
FAIL: preferred_lft addresses remaining
The flakiness might be related to power-saving behavior, as address
expiration is handled by a "power-efficient" workqueue.
To address this, use slowwait to check more frequently whether the address
still exists. This reduces the likelihood of the system entering a low-power
state during the test, improving reliability.
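The polling idea behind slowwait can be sketched in Python. The selftest itself is a shell script, so this is only an illustration: the function signature, the 0.1 second default interval and the fake address check are assumptions made for this example.

```python
import time


def slowwait(timeout, condition, interval=0.1):
    """Poll condition() until it returns True or timeout seconds have passed.

    Returns True as soon as the condition holds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for "has the address expired yet?": succeeds on the third poll.
polls = {"count": 0}


def address_is_gone():
    polls["count"] += 1
    return polls["count"] >= 3


print(slowwait(1.0, address_is_gone, interval=0.01))
```

Compared with a single long sleep followed by one check, the loop returns as soon as the address is gone and keeps the test active while it waits.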
Miguel Ojeda [Wed, 16 Jul 2025 21:45:08 +0000 (23:45 +0200)]
Merge tag 'rust-timekeeping-for-v6.17' of https://github.com/Rust-for-Linux/linux into rust-next
Pull timekeeping updates from Andreas Hindborg:
- Make 'Instant' generic over clock source. This allows the compiler to
assert that arithmetic expressions involving the 'Instant' use
'Instants' based on the same clock source.
- Make 'HrTimer' generic over the timer mode. 'HrTimer' timers take a
'Duration' or an 'Instant' when setting the expiry time, depending on
the timer mode. With this change, the compiler can check the type
matches the timer mode.
- Add an abstraction for 'fsleep'. 'fsleep' is a flexible sleep
function that will select an appropriate sleep method depending on
the requested sleep time.
- Avoid 64-bit divisions on 32-bit hardware when calculating
timestamps.
- Seal the 'HrTimerMode' trait. This prevents users of the
'HrTimerMode' from implementing the trait on their own types.
* tag 'rust-timekeeping-for-v6.17' of https://github.com/Rust-for-Linux/linux:
rust: time: Add wrapper for fsleep() function
rust: time: Seal the HrTimerMode trait
rust: time: Remove Ktime in hrtimer
rust: time: Make HasHrTimer generic over HrTimerMode
rust: time: Add HrTimerExpires trait
rust: time: Replace HrTimerMode enum with trait-based mode types
rust: time: Add ktime_get() to ClockSource trait
rust: time: Make Instant generic over ClockSource
rust: time: Replace ClockId enum with ClockSource trait
rust: time: Avoid 64-bit integer division on 32-bit architectures
rust: device_id: split out index support into a separate trait
Introduce a new trait `RawDeviceIdIndex`, which extends `RawDeviceId`
to provide support for device ID types that include an index or
context field (e.g., `driver_data`). This separates the concerns of
layout compatibility and index-based data embedding, and allows
`RawDeviceId` to be implemented for types that do not contain a
`driver_data` field. Several such structures are defined in
include/linux/mod_devicetable.h.
Refactor `IdArray::new()` into a generic `build()` function, which
takes an optional offset. Based on the presence of `RawDeviceIdIndex`,
index writing is conditionally enabled. A new `new_without_index()`
constructor is also provided for use cases where no index should be
written.
This refactoring is a preparation for enabling the PHY abstractions to
use the RawDeviceId trait.
The changes to acpi.rs and driver.rs were made by Danilo.
Alice Ryhl [Fri, 11 Jul 2025 08:04:37 +0000 (08:04 +0000)]
device: rust: rename Device::as_ref() to Device::from_raw()
The prefix as_* should not be used for a constructor. Constructors
usually use the prefix from_* instead.
Some prior art in the stdlib: Box::from_raw, CString::from_raw,
Rc::from_raw, Arc::from_raw, Waker::from_raw, File::from_raw_fd.
There is also prior art in the kernel crate: cpufreq::Policy::from_raw,
fs::File::from_raw_file, Kuid::from_raw, ARef::from_raw,
SeqFile::from_raw, VmaNew::from_raw, Io::from_raw.
Kent Overstreet [Sun, 13 Jul 2025 17:31:33 +0000 (13:31 -0400)]
bcachefs: Don't build aux search tree when still repairing node
bch2_btree_node_drop_keys_outside_node() will (re)build aux search
trees, because it's also called by topology repair.
bch2_btree_node_read_done() was calling it before validating individual
keys; invalid ones have to be dropped.
If we call drop_keys_outside_node() first, then
bch2_bset_build_aux_tree() doesn't run because the node already has an
aux search tree - which was invalidated by the repair.
Reported-by: syzbot+c5e7a66b3b23ae65d44f@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 12 Jul 2025 23:33:12 +0000 (19:33 -0400)]
bcachefs: Tweak threshold for allocator triggering discards
The allocator path has an "if we're really low on free buckets, check if
we should issue discards" check - tweak this to also trigger discards if
more than 1/128th of the device is in need_discard state.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
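The tweaked condition can be written down as a small Python sketch. This is not the bcachefs code: the function and parameter names are hypothetical, and the existing "really low on free buckets" test is reduced to a boolean supplied by the caller.

```python
def should_check_discards(really_low_on_free, need_discard, nbuckets):
    """Decide whether the allocator should look at issuing discards.

    really_low_on_free: result of the pre-existing low-free-buckets check
    need_discard:       number of buckets in need_discard state
    nbuckets:           total number of buckets on the device
    """
    # "more than 1/128th of the device", written without integer division
    many_pending = need_discard * 128 > nbuckets
    return really_low_on_free or many_pending


print(should_check_discards(False, 8, 1024))  # exactly 1/128th: not enough
print(should_check_discards(False, 9, 1024))  # just over 1/128th
```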