hv_netvsc: use kmap_local_page in netvsc_copy_to_send_buf

netvsc_copy_to_send_buf() copies page buffer entries into the VMBus
send buffer using phys_to_virt() on the entry PFN. Entries for the
RNDIS header and the skb linear data come from kmalloc'd memory and
are always in the kernel direct map, but entries for skb fragments
reference page cache or user pages, which on 32-bit x86 with
CONFIG_HIGHMEM=y can live above the LOWMEM boundary. For such a page
phys_to_virt() returns an address outside the direct map and the
subsequent memcpy() faults on the transmit softirq path, which is
fatal.

Map the pages with kmap_local_page() instead, handling two properties
of the page buffer entries:

- pb[i].pfn is a Hyper-V PFN at HV_HYP_PAGE_SIZE (4K) granularity,
   not a native PFN. Reconstruct the physical address first and derive
   the native page from it, so the mapping stays correct where
   PAGE_SIZE > HV_HYP_PAGE_SIZE (e.g. arm64 with 64K pages).

- Since commit 41a6328b2c55 ("hv_netvsc: Preserve contiguous PFN
   grouping in the page buffer array"), an entry describes a full
   physically contiguous fragment and pb[i].len can exceed PAGE_SIZE,
   while kmap_local_page() maps a single page. Copy page by page,
   splitting at native page boundaries.

The copy path only handles packets smaller than the send section size
(6144 bytes by default); larger packets take the cp_partial path where
only the RNDIS header is copied. So entries here are bounded by the
section size and a copy is split at most once on 4K-page systems. On
!CONFIG_HIGHMEM configs kmap_local_page() folds to page_address() and
no mapping work is added.
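
The page-by-page split described above can be sketched in userspace as
follows. This only illustrates the address arithmetic and is not the
driver code: `struct chunk` and `split_entry()` are hypothetical names,
and the native page shift is passed as a parameter so that both 4K and
64K page sizes can be tried.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HV_HYP_PAGE_SHIFT 12 /* Hyper-V PFNs are 4K-granular */

/* Hypothetical descriptor for one per-page copy; not a kernel type. */
struct chunk {
	uint64_t native_pfn; /* native page frame that would be mapped */
	size_t offset;       /* offset of the data inside that page */
	size_t len;          /* bytes to copy from that page */
};

/*
 * Split one page buffer entry (Hyper-V PFN, offset, length) into chunks
 * that never cross a native page boundary. Returns the number of chunks
 * written to out[], which is at most max_out.
 */
size_t split_entry(uint64_t hv_pfn, size_t offset, size_t len,
		   unsigned int page_shift, struct chunk *out, size_t max_out)
{
	/* Rebuild the physical address from the 4K-granular PFN first. */
	uint64_t phys = (hv_pfn << HV_HYP_PAGE_SHIFT) + offset;
	uint64_t page_size = (uint64_t)1 << page_shift;
	size_t n = 0;

	assert(page_shift >= HV_HYP_PAGE_SHIFT);

	while (len > 0 && n < max_out) {
		size_t off = (size_t)(phys & (page_size - 1));
		size_t this_len = (size_t)(page_size - off);

		if (this_len > len)
			this_len = len;

		/* Derive the native page from the physical address. */
		out[n].native_pfn = phys >> page_shift;
		out[n].offset = off;
		out[n].len = this_len;
		n++;

		phys += this_len;
		len -= this_len;
	}
	return n;
}
```

With 4K native pages, Hyper-V PFN 5, offset 0x800 and length 6000, the
sketch yields two chunks (2048 bytes from page 5, then 3952 bytes from
page 6). With 64K native pages the same entry fits in a single chunk of
native page 0.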

Fixes: c25aaf814a63 ("hyperv: Enable sendbuf mechanism on the send path")
Cc: stable@vger.kernel.org
Signed-off-by: Anton Leontev <leontyevantony@gmail.com>
Link: https://patch.msgid.link/20260604165938.32033-1-leontyevantony@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

octeontx2-af: fix memory leak in rvu_setup_hw_resources()

If rvu_npc_exact_init() fails in rvu_setup_hw_resources(), the function
returns directly instead of jumping to the error handling path. This
causes a resource leak for the previously initialized CGX, NPC, fwdata,
and MSI-X states.

Fix this by replacing the direct return with goto cgx_err to ensure
proper cleanup.
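
The pattern behind this fix can be sketched in plain C. This is a
generic illustration of goto-based unwinding, not the driver code:
`init_step()`, `exit_step()`, `setup_resources()` and the
`live_resources` counter are all hypothetical.

```c
#include <assert.h>

int live_resources; /* counts resources currently held, for illustration */

/* Hypothetical init step: fails with -1 when asked to, else takes a resource. */
static int init_step(int should_fail)
{
	if (should_fail)
		return -1;
	live_resources++;
	return 0;
}

/* Hypothetical teardown step: releases one resource. */
static void exit_step(void)
{
	assert(live_resources > 0);
	live_resources--;
}

/* fail_at selects which step (1..3) fails; 0 means every step succeeds. */
int setup_resources(int fail_at)
{
	int err;

	err = init_step(fail_at == 1);
	if (err)
		return err; /* nothing acquired yet, a direct return is fine */

	err = init_step(fail_at == 2);
	if (err)
		goto err_first;

	err = init_step(fail_at == 3);
	if (err)
		goto err_second; /* a bare "return err" here would leak two resources */

	return 0;

err_second:
	exit_step();
err_first:
	exit_step();
	return err;
}
```

A failure at any step leaves `live_resources` at zero, which is the
property the direct return violated.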

The bug was first flagged by an experimental analysis tool we are
developing for kernel memory-management bugs while analyzing
v6.13-rc1. The tool is still under development and is not yet publicly
available. Manual inspection confirms that the bug is still present in
v7.1-rc6.

An x86_64 allyesconfig build showed no new warnings. As we do not have
access to Marvell OcteonTX2 RVU AF hardware, no runtime testing could
be performed.

Fixes: 3571fe07a090 ("octeontx2-af: Drop rules for NPC MCAM")
Cc: stable@vger.kernel.org
Signed-off-by: Dawei Feng <dawei.feng@seu.edu.cn>
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Link: https://patch.msgid.link/20260604143756.1524482-1-dawei.feng@seu.edu.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

drm/virtio: Fix driver removal with disabled KMS

DRM atomic and modesetting aren't initialized if the virtio-gpu driver is
built with KMS disabled, leading to access of uninitialized data on driver
removal/unbinding and crashing the kernel. Fix it by skipping the shutdown
of the atomic core when KMS is unavailable.

Fixes: 72122c69d717 ("drm/virtio: Add option to disable KMS support")
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Tested-by: Ryosuke Yasuoka <ryasuoka@redhat.com>
Reviewed-by: Ryosuke Yasuoka <ryasuoka@redhat.com>
Link: https://patch.msgid.link/20260604122743.13383-1-dmitry.osipenko@collabora.com

rxrpc: Fix the ACK parser to extract the SACK table for parsing

Fix modification of the received skbuff in rxrpc_input_soft_acks() and a
potential incorrect access of the buffer in a fragmented UDP packet (the
packet would probably have to be deliberately pre-generated as fragmented)
when AF_RXRPC tries to extract the contents of the SACK table by copying
out the contents of the SACK table into a buffer before attempting to parse
it.

AF_RXRPC assumes that it can just call skb_condense() and then validly
access the SACK table from skb->data and that it will be a flat buffer -
but skb_condense() can silently fail to do anything under some
circumstances.
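
The idea of copying the table out before parsing can be illustrated in
userspace as below. This is an analogy only: `struct frag` and
`copy_bits()` are hypothetical stand-ins for a fragmented packet and for
a helper in the spirit of skb_copy_bits(), and the sketch makes no claim
about how the actual patch is written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fragment of a packet; the payload is not contiguous. */
struct frag {
	const unsigned char *data;
	size_t len;
};

/*
 * Copy len bytes starting at logical offset off from the fragment list
 * into the flat buffer to. Returns 0 on success, or -1 if the fragments
 * hold fewer bytes than requested. The source data is never modified.
 */
int copy_bits(const struct frag *frags, size_t nfrags, size_t off,
	      void *to, size_t len)
{
	unsigned char *dst = to;

	for (size_t i = 0; i < nfrags && len > 0; i++) {
		size_t n;

		if (off >= frags[i].len) {
			/* The requested range starts after this fragment. */
			off -= frags[i].len;
			continue;
		}

		n = frags[i].len - off;
		if (n > len)
			n = len;

		memcpy(dst, frags[i].data + off, n);
		dst += n;
		len -= n;
		off = 0;
	}
	return len == 0 ? 0 : -1;
}
```

A parser that works on the flat copy no longer depends on the packet
being linear, and it cannot alter the received data.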

Note that whilst rxrpc_input_soft_acks() should be able to parse extended
ACKs, the rest of AF_RXRPC doesn't currently support that.

Further, there's then no need to call skb_condense() in rxrpc_input_ack(),
so don't.

Fixes: d57a3a151660 ("rxrpc: Save last ACK's SACK table rather than marking txbufs")
Reported-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://lore.kernel.org/r/20260513180907.2061972-1-michael.bommarito@gmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jeffrey Altman <jaltman@auristor.com>
cc: Eric Dumazet <edumazet@google.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: netdev@vger.kernel.org
cc: stable@kernel.org
Link: https://patch.msgid.link/105362.1780573560@warthog.procyon.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

r8152: handle the return value of usb_reset_device()

If usb_reset_device() returns a negative error code, stop the
process of probing.

Fixes: 10c3271712f5 ("r8152: disable the ECM mode")
Signed-off-by: Chih Kai Hsu <hsu.chih.kai@realtek.com>
Reviewed-by: Hayes Wang <hayeswang@realtek.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260604092247.27158-450-nic_swsd@realtek.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

sched/deadline: Use task_on_rq_migrating() helper

Replace the open-coded "p->on_rq == TASK_ON_RQ_MIGRATING" comparisons
in enqueue_task_dl() and dequeue_task_dl() with the existing
task_on_rq_migrating() helper, consistent with the rest of the
scheduler code.

No functional change.

Signed-off-by: Liang Luo <luoliang@kylinos.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://patch.msgid.link/20260608075500.387271-1-luoliang@kylinos.cn

sched/core: Combine separate 'else' and 'if' statements

The kernel coding style recommends using 'else if' instead of
placing 'if' on a separate line after 'else'. This change makes
the code consistent with the rest of the kernel codebase.

Signed-off-by: Liang Luo <luoliang@kylinos.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260608071842.325159-1-luoliang@kylinos.cn

sched/fair: Fix cpu_util runnable_avg arithmetic

If we take runnable_avg in max(runnable_avg, util_avg) in cpu_util(), we
should then add or subtract task runnable_avg, but the arithmetic below
is still with task util_avg. This mixes runnable_avg with util_avg which
is incorrect.

Fix by always doing arithmetic with runnable_avg and only take
max(runnable_avg, util_avg) at the last step.
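
A simplified model of the arithmetic, covering only the case where a
task's contribution is removed from a CPU. It is not the kernel
function: `struct avg`, `cpu_util_mixed()` and `cpu_util_consistent()`
are hypothetical names, and treating each signal with its own task
contribution is an assumption about how the fix is structured.

```c
#include <assert.h>

/* Hypothetical pair of load-tracking signals. */
struct avg {
	unsigned long util_avg;
	unsigned long runnable_avg;
};

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Subtract without wrapping below zero. */
static unsigned long sub_positive(unsigned long a, unsigned long b)
{
	return a > b ? a - b : 0;
}

/* Mixed: take the max first, then subtract the task's util_avg. */
unsigned long cpu_util_mixed(struct avg cpu, struct avg task)
{
	unsigned long util = max_ul(cpu.util_avg, cpu.runnable_avg);

	return sub_positive(util, task.util_avg);
}

/* Consistent: adjust each signal with its own kind, take the max last. */
unsigned long cpu_util_consistent(struct avg cpu, struct avg task)
{
	unsigned long util = sub_positive(cpu.util_avg, task.util_avg);
	unsigned long runnable = sub_positive(cpu.runnable_avg,
					      task.runnable_avg);

	return max_ul(util, runnable);
}
```

For a CPU with util_avg 300 and runnable_avg 500, and a task with
util_avg 100 and runnable_avg 300, the mixed form gives 400 while the
consistent form gives 200.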

Fixes: 7d0583cf9ec7 ("sched/fair, cpufreq: Introduce 'runnable boosting'")
Signed-off-by: Hongyan Xia <hongyan.xia@transsion.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://patch.msgid.link/20260605094318.37931-1-hongyan.xia@transsion.com

rust: sync: completion: Mark inline complete_all and wait_for_completion

When building the kernel using the llvm-22.1.0-rust-1.93.1-x86_64
toolchain provided by kernel.org with ARCH=x86_64, the following symbols
are generated:

$ nm vmlinux | grep ' _R'.*Completion | rustfilt
ffffffff81827930 T <kernel::sync::completion::Completion>::complete_all
ffffffff81827950 T <kernel::sync::completion::Completion>::wait_for_completion

These Rust methods are thin wrappers around the C completion helpers
`complete_all` and `wait_for_completion`. Mark them `#[inline]` to keep
the wrapper pattern consistent with other small Rust helper methods.

After applying this patch, the above command will produce no output.

Suggested-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Fabricio Parra <a@alice0.com>
Signed-off-by: Boqun Feng <boqun@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://github.com/Rust-for-Linux/linux/issues/1145
Link: https://patch.msgid.link/20260316151056.287-1-a@alice0.com
Link: https://patch.msgid.link/20260605052331.1628-4-boqun@kernel.org

MAINTAINERS: Add RUST [SYNC] entry

We have two pull requests on Rust synchronization primitives with 10+
patches in a row for recent cycles, so it makes sense to start the
effort of handling this area as a group.

Luckily for me, Gary Guo and Alice Ryhl agreed to help as
co-maintainers, and we also have a talented group of reviewers:

Lyude Paul started the SpinLockIrq work [1] and did an amazing job at
improving the design and implementation.

Daniel Almeida resolved the Lock<T: !Unpin> issue [2] and he did a fair
amount of reviews in areas related to synchronization primitives
already.

Onur Özkan started the ww_mutex work [3] and did an amazing job at
consolidating various design requirements and decisions.

Of course, this only reflects my own knowledge, and I believe they did
way more outside what I'm aware of ;-)

Note that having this MAINTAINERS entry is meant to bring more people
to help on the synchronization primitives in Rust, which means for patch
submissions and design discussion, please still involve the
corresponding maintainers (e.g. LOCKING and ATOMIC);
scripts/get_maintainer.pl should have this covered.

Signed-off-by: Boqun Feng <boqun@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Daniel Almeida <daniel.almeida@collabora.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Acked-by: Onur Özkan <work@onurozkan.dev>
Acked-by: Gary Guo <gary@garyguo.net>
Acked-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/rust-for-linux/20260302232154.861916-1-lyude@redhat.com/
Link: https://lore.kernel.org/all/20250828-lock-t-when-t-is-pinned-v2-0-b067c4b93fd6@collabora.com/
Link: https://lore.kernel.org/rust-for-linux/20260103073554.34855-1-work@onurozkan.dev/
Link: https://patch.msgid.link/20260415232830.8128-1-boqun@kernel.org
Link: https://patch.msgid.link/20260605052331.1628-2-boqun@kernel.org

i2c: imx-lpi2c: fix resource leaks switching to devm_dma_request_chan()

The LPI2C driver requests DMA channels using dma_request_chan(), but
never releases them in lpi2c_imx_remove(), resulting in DMA channel
leaks every time the driver is unloaded.

Additionally, when lpi2c_dma_init() successfully requests the TX DMA
channel but fails to request the RX DMA channel, the probe falls back
to PIO mode and completes successfully. Since probe succeeds, the devres
framework will not trigger any cleanup, leaving the TX DMA channel and
the memory allocated for the dma structure held for the lifetime of the
device even though DMA is never used.

Switch to devm_dma_request_chan() to let the device core manage DMA
channel lifetime automatically. Wrap all allocations within a devres
group so that devres_release_group() can release all partially acquired
resources when DMA init fails and probe continues in PIO mode.

Fixes: a09c8b3f9047 ("i2c: imx-lpi2c: add eDMA mode support for LPI2C")
Signed-off-by: Carlos Song <carlos.song@nxp.com>
Cc: <stable@vger.kernel.org> # v6.14+
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260520093323.2882070-1-carlos.song@oss.nxp.com

Merge tag 'v7.1-rockchip-arm32fixe' of https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes

A change in 7.1-rc improved the general handling for reset controllers
using SRCU, but at the same time broke really early users that run
before workqueues are available.

So adapt the SMP bringup to keep the core-resets around, instead
of acquiring/releasing them on every SMP action.

* tag 'v7.1-rockchip-arm32fixe' of https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
ARM: rockchip: keep reset control around

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Merge tag 'riscv-soc-fixes-for-v7.1-rc7' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into arm/fixes

RISC-V soc fixes for v7.1-rc7

Microchip:
Fix a resource leak in an unlikely probe failure case.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
* tag 'riscv-soc-fixes-for-v7.1-rc7' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
soc: microchip: mpfs-sys-controller: fix resource leak on probe error

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

drm/i915/edp: Check supported link rates DPCD read

intel_edp_set_sink_rates() reads DP_SUPPORTED_LINK_RATES into a local
stack array and then parses the array unconditionally. If the read
fails, the array contents are not valid and may result in bogus sink
link rates being used.

Use drm_dp_dpcd_read_data() and clear the sink rate array on failure,
so the existing parser falls back to the default sink rate handling.
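
The shape of this fix can be sketched in userspace as below. Everything
here is hypothetical (`MAX_RATES`, the `read_fn` callback type, the two
sample readers and `read_sink_rates()`), and the assumption that the
parser stops at the first zero entry is part of the sketch, not a
statement about the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_RATES 8 /* hypothetical table size for this sketch */

/* Hypothetical reader: returns 0 on success, negative on failure. */
typedef int (*read_fn)(uint16_t *buf, size_t count);

/* Sample reader that fails after scribbling over part of the buffer. */
int failing_read(uint16_t *buf, size_t count)
{
	if (count > 0)
		buf[0] = 0xffff;
	return -1;
}

/* Sample reader that reports two rates followed by zero entries. */
int good_read(uint16_t *buf, size_t count)
{
	memset(buf, 0, count * sizeof(buf[0]));
	if (count >= 2) {
		buf[0] = 8100;
		buf[1] = 13500;
	}
	return 0;
}

/* Returns the number of rates parsed; zero means "use the defaults". */
size_t read_sink_rates(read_fn read, uint16_t rates[MAX_RATES])
{
	size_t n;

	/* On failure, clear the table so stale contents are never parsed. */
	if (read(rates, MAX_RATES) < 0)
		memset(rates, 0, MAX_RATES * sizeof(rates[0]));

	for (n = 0; n < MAX_RATES; n++)
		if (rates[n] == 0)
			break;
	return n;
}
```

Without the clearing step, the failing reader's leftover 0xffff entry
would be parsed as if it were a real rate.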

Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.

Fixes: 68f357cb7347 ("drm/i915/dp: generate and cache sink rate array for all DP, not just eDP 1.4")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patch.msgid.link/20260529145759.1640646-1-n.zhandarovich@fintech.ru
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit bd61c7756b34157e093028225a69383b4b1203cc)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>

accel/ivpu: Fix signed integer truncation in IPC receive

Fix potential buffer overflow where firmware-supplied data_size is cast
to signed int before being used in min_t(). Large unsigned values
(>= 0x80000000) become negative, causing unsigned wraparound and
oversized memcpy operations that can overflow the stack buffer.

Change min_t(int, ...) to min() as both values are unsigned and can be
handled by min() without explicit cast.
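
The truncation can be reproduced in userspace with the sketch below.
The helper names are hypothetical and `min_as_int()` only mimics what
min_t(int, ...) does. Converting an out-of-range unsigned value to int
is implementation-defined in C; the sketch assumes the usual
two's-complement wraparound that GCC and Clang provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mimics min_t(int, a, b): both operands are converted to int first. */
static int min_as_int(uint32_t a, uint32_t b)
{
	int x = (int)a;
	int y = (int)b;

	return x < y ? x : y;
}

/* Plain unsigned comparison, as min() does for two unsigned operands. */
static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Length that a memcpy() would receive with the signed comparison. */
size_t copy_len_signed(uint32_t data_size, uint32_t buf_size)
{
	/* A negative minimum turns into a huge size_t here. */
	return (size_t)min_as_int(data_size, buf_size);
}

/* Length that a memcpy() would receive with the unsigned comparison. */
size_t copy_len_unsigned(uint32_t data_size, uint32_t buf_size)
{
	return min_u32(data_size, buf_size);
}
```

With data_size 0x80000000 and a 64-byte buffer, the signed variant
returns a length far larger than 64, while the unsigned variant returns
64.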

Fixes: 3b434a3445ff ("accel/ivpu: Use threaded IRQ to handle JOB done messages")
Cc: stable@vger.kernel.org # v6.12+
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20260601161643.229342-1-andrzej.kacprowski@linux.intel.com

net: openvswitch: fix possible kfree_skb of ERR_PTR

After the patch in the "Fixes" tag, the allocation of the "reply" skb
can happen either before or after locking the ovs_mutex.

However, error cleanups still follow the classical reversed order,
assuming "reply" is allocated before locking: it is freed after unlocking.

If "reply" allocation happens after locking the mutex and it fails,
"reply" is left with an ERR_PTR, and execution jumps to the corresponding
cleanup stage, which will try to free an invalid pointer.

Fix this by setting the pointer to NULL after having saved its error
value.
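
A userspace sketch of the pattern follows. The ERR_PTR helpers are
minimal stand-ins for the kernel ones, `alloc_reply()` and
`handle_request()` are hypothetical, and free() plays the role of
kfree_skb(); both accept NULL as a no-op.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins for the kernel's ERR_PTR helpers. */
static void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocator: returns a buffer or an encoded error. */
static void *alloc_reply(int fail)
{
	void *p;

	if (fail)
		return ERR_PTR(-ENOMEM);
	p = malloc(16);
	return p ? p : ERR_PTR(-ENOMEM);
}

int handle_request(int fail_alloc)
{
	void *reply;
	int err = 0;

	/* The lock would be taken here. */
	reply = alloc_reply(fail_alloc);
	if (IS_ERR(reply)) {
		err = (int)PTR_ERR(reply);
		reply = NULL; /* without this, free() below gets an ERR_PTR */
		goto out_unlock;
	}

	/* ... the reply would be filled in and sent here ... */

out_unlock:
	/* The lock would be released here. */
	free(reply); /* free(NULL) is a no-op */
	return err;
}
```

Saving the error code first and then clearing the pointer lets the
shared cleanup label stay unconditional.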

Fixes: 893f139b9a6c ("openvswitch: Minimize ovs_flow_cmd_new|set critical sections.")
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://patch.msgid.link/20260604121946.942164-1-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge patch series "`zerocopy` support"

Introduce support for `zerocopy` [1][2]:

    Fast, safe, compile error. Pick two.

    Zerocopy makes zero-cost memory manipulation effortless. We write
    `unsafe` so you don't have to.

It essentially provides derivable traits (e.g. `FromBytes`) and macros
(e.g. `transmute!`) for safely converting between byte sequences and
other types. Having such support allows us to remove some `unsafe` code.

It is among the most downloaded Rust crates (top #50 recent, top #100
all-time downloads; according to crates.io), and it is also used by the
Rust compiler itself.

The series starts with a few preparation commits, then the `zerocopy`
and `zerocopy-derive` crates are added. Finally, an example patch using
it is on top, removing one `unsafe impl`.

I had to adapt the crates slightly (just +2/-3 lines), but both patches
could potentially be provided upstream eventually. Please see the
commits for details.

In total, about 39k lines are added, or about 32k without counting
`benches/`, which are just for documentation purposes.

See the cover letter for `syn` for some more details about depending on
third-party crates in commit 54e3eae85562 ("Merge patch series "`syn`
support"").

The codegen of an isolated example function similar to the patch on top
is essentially identical. It also turns out that (for that particular
case) `zerocopy`'s version, even under `debug-assertions` enabled, has
no remaining panics, unlike a few in the current code (because the
compiler can prove the remaining `ub_checks` statically).

So their "fast, safe" does indeed check out -- at least in that case.

P.S. This version of `zerocopy` already has the unstable `Ptr{,Inner}`
types -- to play with them, please use:

    make ... KRUSTFLAGS=--cfg=zerocopy_unstable_ptr

Link: https://github.com/google/zerocopy
Link: https://docs.rs/zerocopy
Link: https://patch.msgid.link/20260608141439.182634-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

gpu: nova-core: firmware: parse `FalconUCodeDescV2` via `zerocopy`

Now that we have `zerocopy` support, we can avoid some `unsafe` code.

For instance, for `FalconUCodeDescV2`, we can replace the `unsafe impl
FromBytes` by safely deriving `zerocopy`'s `FromBytes` and then calling
`read_from_prefix`.

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260608141439.182634-20-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: prelude: add `zerocopy{,_derive}::FromBytes`

In order to easily use `FromBytes`, add it to the prelude.

This adds both the trait (`zerocopy::FromBytes`) as well as the derive
macro (`zerocopy_derive::FromBytes`).

We will be adding more as we need them.

Link: https://patch.msgid.link/20260608141439.182634-19-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: zerocopy-derive: enable support in kbuild

With all the new files in place and ready from the new crate, enable
the support for it in the build system.

In addition, skip formatting for this vendored crate.

Link: https://patch.msgid.link/20260608141439.182634-18-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: zerocopy-derive: add `README.md`

Originally, when the Rust upstream `alloc` standard library crate was
vendored in commit 057b8d257107 ("rust: adapt `alloc` crate to the
kernel"), a `README.md` file was added to explain the provenance and
licensing of the source files.

Thus do the same for the `zerocopy-derive` crate.

Cc: Joshua Liebow-Feeser <joshlf@google.com>
Cc: Jack Wrenn <jswrenn@google.com>
Link: https://patch.msgid.link/20260608141439.182634-17-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: zerocopy-derive: avoid generating non-ASCII identifiers

Linux is built with `-Dnon_ascii_idents`. However, `zerocopy-derive`
uses a non-ASCII character (`ẕ`) internally, which in turn triggers
the lint when attempting to use derives like `FromBytes`:

    error: identifier contains non-ASCII characters
       --> rust/kernel/lib.rs:153:9
        |
    153 |         a: u32,
        |         ^
        |
        = note: requested on the command line with `-D non-ascii-idents`

This was already noticed by another project using
`#![deny(non_ascii_idents)]` [1]. `zerocopy` added an
`#[allow(non_ascii_idents)]` [2], but it does not work since, at the
moment, the `non_ascii_idents` lint is a `crate_level_only` one, and thus
`allow`s only work at the crate root level.

Due to this, an issue about relaxing this restriction was created in
upstream Rust [3] some months ago.

Thus work around it here by using another prefix. The likelihood of a
collision is very small for us, since we control the callers, and this
will hopefully be fixed soon at either the `zerocopy` or the Rust level.

I filed an issue [4] about it with upstream `zerocopy` as requested
and we discussed this with upstream Rust and `zerocopy`: the Rust issue
got nominated and a PR [5] to relax the restriction was submitted by
Joshua. Upstream `zerocopy` prefers that approach, so if Rust merges it,
then it means we will be able to remove the workaround when we bump the
MSRV, thus likely late 2027, since we follow Debian Stable.

Cc: Joshua Liebow-Feeser <joshlf@google.com>
Cc: Jack Wrenn <jswrenn@google.com>
Link: https://github.com/google/zerocopy/issues/2880
Link: https://github.com/google/zerocopy/pull/2882
Link: https://github.com/rust-lang/rust/issues/151025
Link: https://github.com/google/zerocopy/issues/3427
Link: https://github.com/rust-lang/rust/pull/157497
Link: https://patch.msgid.link/20260608141439.182634-16-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: zerocopy-derive: add SPDX License Identifiers

Originally, when the Rust upstream `alloc` standard library crate was
vendored, the SPDX License Identifiers were added to every file so that
the license on those was clear. The same happened with the vendoring of
`proc_macro2`, `quote` and `syn`. Please see:

  commit 057b8d257107 ("rust: adapt `alloc` crate to the kernel")
  commit 69942c0a8965 ("rust: syn: add SPDX License Identifiers")
  commit ddfa1b279d08 ("rust: quote: add SPDX License Identifiers")
  commit a9acfceb9614 ("rust: proc-macro2: add SPDX License Identifiers")

Thus do the same for the `zerocopy-derive` crate.

This makes `scripts/spdxcheck.py` pass: use parentheses like commit
06e9bfc1e57d ("ionic: make spdxcheck.py happy") did since we have two
`OR` operators in the expression (three licenses).

Finally, as requested, I filed an issue [1] with upstream about it.

Cc: Joshua Liebow-Feeser <joshlf@google.com>
Cc: Jack Wrenn <jswrenn@google.com>
Link: https://github.com/google/zerocopy/issues/3428
Link: https://patch.msgid.link/20260608141439.182634-15-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: zerocopy-derive: import crate

This is a subset of the Rust `zerocopy-derive` crate, version v0.8.50
(released 2026-05-31), licensed under "BSD-2-Clause OR Apache-2.0 OR
MIT", from:

    https://github.com/google/zerocopy/tree/v0.8.50/zerocopy-derive/src

The files are copied as-is, with no modifications whatsoever (not even
adding the SPDX identifiers).

For copyright details, please see:

    https://github.com/google/zerocopy/blob/v0.8.50/README.md?plain=1
    https://github.com/google/zerocopy/blob/v0.8.50/LICENSE-BSD
    https://github.com/google/zerocopy/blob/v0.8.50/LICENSE-APACHE
    https://github.com/google/zerocopy/blob/v0.8.50/LICENSE-MIT

The next two patches modify these files as needed for use within the
kernel. This patch split allows reviewers to double-check the import
and to clearly see the differences introduced.

The following script may be used to verify the contents:

    for path in $(cd rust/zerocopy-derive/ && find . -type f); do
        curl --silent --show-error --location \
            https://github.com/google/zerocopy/raw/v0.8.50/zerocopy-derive/src/$path \
            | diff --unified rust/zerocopy-derive/$path - && echo $path: OK
    done

Cc: Joshua Liebow-Feeser <joshlf@google.com>
Cc: Jack Wrenn <jswrenn@google.com>
Link: https://patch.msgid.link/20260608141439.182634-14-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: zerocopy: enable support in kbuild

With all the new files in place and ready from the new crate, enable
the support for it in the build system.

In addition, skip formatting for this vendored crate.

Finally, there are no generated symbols expected from `zerocopy`, thus
skip adding the `exports` generation.

Link: https://patch.msgid.link/20260608141439.182634-13-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: zerocopy: add `README.md`

Originally, when the Rust upstream `alloc` standard library crate was
vendored in commit 057b8d257107 ("rust: adapt `alloc` crate to the
kernel"), a `README.md` file was added to explain the provenance and
licensing of the source files.

Thus do the same for the `zerocopy` crate.

Cc: Joshua Liebow-Feeser <joshlf@google.com>
Cc: Jack Wrenn <jswrenn@google.com>
Link: https://patch.msgid.link/20260608141439.182634-12-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: zerocopy: remove float `Display` support

The kernel builds `core` with the `no_fp_fmt_parse` `--cfg`, which means
we do not have support for formatting floating point primitives. However,
`zerocopy` expects those implementations to exist:

    error[E0277]: `f32` doesn't implement `core::fmt::Display`
       --> rust/zerocopy/src/byteorder.rs:172:29
        |
    172 |                   $trait::fmt(&self.get(), f)
        |                   ----------- ^^^^^^^^^^^ the trait `core::fmt::Display` is not implemented for `f32`
        |                   |
        |                   required by a bound introduced by this call
    ...
    907 | / define_type!(
    908 | |     An,
    909 | |     "A 32-bit floating point number",
    910 | |     F32,
    ...   |
    922 | |     []
    923 | | );
        | |_- in this macro invocation
        |
        = help: the following other types implement trait `core::fmt::Display`:
                  i128
                  i16
                  i32
                  i64
                  i8
                  isize
                  u128
                  u16
                and 4 others
        = note: this error originates in the macro `impl_fmt_trait` which comes from the expansion of the macro `define_type` (in Nightly builds, run with -Z macro-backtrace for more info)

Thus work around it by skipping those implementations in `zerocopy`.

Ideally, `zerocopy` would have the equivalent of `no_fp_fmt_parse`;
and, indeed, upstream just added it [1] after I filed an issue [2]
about it as requested. We can try it in a future update of our
vendored copy.

Cc: Joshua Liebow-Feeser <joshlf@google.com>
Cc: Jack Wrenn <jswrenn@google.com>
Link: https://github.com/google/zerocopy/pull/3429
Link: https://github.com/google/zerocopy/issues/3426
Link: https://patch.msgid.link/20260608141439.182634-11-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: zerocopy: add SPDX License Identifiers

Originally, when the Rust upstream `alloc` standard library crate was
vendored, the SPDX License Identifiers were added to every file so that
the license on those was clear. The same happened with the vendoring of
`proc_macro2`, `quote` and `syn`. Please see:

  commit 057b8d257107 ("rust: adapt `alloc` crate to the kernel")
  commit 69942c0a8965 ("rust: syn: add SPDX License Identifiers")
  commit ddfa1b279d08 ("rust: quote: add SPDX License Identifiers")
  commit a9acfceb9614 ("rust: proc-macro2: add SPDX License Identifiers")

Thus do the same for the `zerocopy` crate.

This makes `scripts/spdxcheck.py` pass: use parentheses like commit
06e9bfc1e57d ("ionic: make spdxcheck.py happy") did since we have two
`OR` operators in the expression (three licenses).

SPDX identifiers are not added to the `benches` files because they are
included in rendered documentation. Nevertheless, the `README.md` to be
added by a later commit mentions the license.

Finally, as requested, I filed an issue [1] with upstream about it.

Cc: Joshua Liebow-Feeser <joshlf@google.com>
Cc: Jack Wrenn <jswrenn@google.com>
Link: https://github.com/google/zerocopy/issues/3428
Link: https://patch.msgid.link/20260608141439.182634-10-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: zerocopy: import crate

This is a subset of the Rust `zerocopy` crate, version v0.8.50 (released
2026-05-31), licensed under "BSD-2-Clause OR Apache-2.0 OR MIT", from:

    https://github.com/google/zerocopy/tree/v0.8.50

The files are copied as-is, with no modifications whatsoever (not even
adding the SPDX identifiers).

The `benches` folder is added (i.e. not just `src` like in other cases)
since the files there are included in the rendered documentation,
as well as the `rustdoc` CSS style file that is needed to make those
visually more understandable.

For copyright details, please see:

    https://github.com/google/zerocopy/blob/v0.8.50/README.md?plain=1
    https://github.com/google/zerocopy/blob/v0.8.50/LICENSE-BSD
    https://github.com/google/zerocopy/blob/v0.8.50/LICENSE-APACHE
    https://github.com/google/zerocopy/blob/v0.8.50/LICENSE-MIT

The next two patches modify these files as needed for use within the
kernel. This patch split allows reviewers to double-check the import
and to clearly see the differences introduced.

The following script may be used to verify the contents:

    for path in $(cd rust/zerocopy/ && find . -type f); do
        curl --silent --show-error --location \
            https://github.com/google/zerocopy/raw/v0.8.50/$path \
            | diff --unified rust/zerocopy/$path - && echo $path: OK
    done

Cc: Joshua Liebow-Feeser <joshlf@google.com>
Cc: Jack Wrenn <jswrenn@google.com>
Link: https://patch.msgid.link/20260608141439.182634-9-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: kbuild: support `skip_clippy` for `rustc_procmacro`

Certain vendored crates, like the upcoming `zerocopy-derive`, do not
need to be built with Clippy since we `--cap-lints=allow` them anyway.

Thus add support to skip Clippy for proc macro crates.

Acked-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20260608141439.182634-8-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: kbuild: support per-target environment variables

Certain vendored crates, like the upcoming `zerocopy`, use extra
environment variables (e.g. via `env!`).

Thus add support to easily specify those.

Acked-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20260608141439.182634-7-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: kbuild: define `procmacro-extension` variable

Since we are adding one more proc macro crate (`zerocopy-derive`),
we are refactoring their handling.

Thus, instead of using `libmacros_extension` as the common variable to
hold the extension for all of them, use a dedicated variable with a more
generic name (including for its implementation).

Link: https://patch.msgid.link/20260608141439.182634-6-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: kbuild: define `procmacro-name` function

Since we are adding one more proc macro crate (`zerocopy-derive`),
we are refactoring their handling.

Thus define a `procmacro-name` function and use it to fill the existing
variables' values.

Reviewed-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20260608141439.182634-5-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: sync: add `UniqueArc::as_ptr`

Add an associated function to `UniqueArc` for getting a raw pointer. The
implementation defers to the `Arc` implementation.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260605-unique-arc-as-ptr-v2-1-425476d2abdb@kernel.org
[ Relaxed bound moving it to new `T: ?Sized` impl block. Reworded since
it is not a method anymore. Added intra-doc link. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: kbuild: remove unused variable

Since we are adding one more proc macro crate (`zerocopy-derive`),
we are refactoring their handling.

`libpin_init_internal_extension` was added to mimic the setup for
`macros`, but it is not used, since the extension is expected to be
the same.

Thus remove it.

Reviewed-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20260608141439.182634-4-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: inline some init methods

These methods should be inlined for optimization reasons. Failure to do
so can also produce symbol names larger than what `modpost` or `objtool`
can handle.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260605-nova-exports-v4-1-e948c287407c@nvidia.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: kbuild: show the right `quiet_cmd_rustc_procmacrolibrary`

When Clippy is skipped, `RUSTC` should be shown in `quiet` instead of
`CLIPPY` to be accurate and to avoid confusion.

Thus do so, matching what we do in `quiet_cmd_rustc_library`.

Fixes: 7dbe46c0b11d ("rust: kbuild: add proc macro library support")
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20260608141439.182634-3-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

scripts: generate_rust_analyzer: support passing env vars

A future commit adding `zerocopy` support will need to pass an environment
variable during its build.

Thus add support for an `--envs` parameter, similar to `--cfgs`, that
allows passing a map of variables to set for a given crate.

This allows us to keep a single source of truth for those values.

No change intended in the generated `rust-project.json`.

Acked-by: Tamir Duberstein <tamird@kernel.org>
Link: https://patch.msgid.link/20260608141439.182634-2-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: io: use the `bitfield!` macro in `register!`

Replace the local bitfield rules by the equivalent invocation of the
`bitfield!` macro.

No functional change should be introduced as the `bitfield!` macro has
been extracted from the rules of `register!`.

Acked-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Yury Norov <ynorov@nvidia.com>
Link: https://patch.msgid.link/20260606-bitfield-v5-3-b92188820914@nvidia.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: bitfield: Add KUnit tests for bitfield

Add KUnit tests to make sure the macro is working correctly. The unit
tests are put behind the new `RUST_BITFIELD_KUNIT_TEST` Kconfig option.

Acked-by: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
[acourbot:
- Use a consistent test axis where each test focuses on a single thing.
- Rename members to generic name including range for readability.
- Add test exercising `try_with`.
- Add test checking that unallocated bits are left untouched.
]
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Yury Norov <ynorov@nvidia.com>
Link: https://patch.msgid.link/20260606-bitfield-v5-2-b92188820914@nvidia.com
[ Prefixed test suite name with `rust_` as mentioned. Markdown-formatted
a few comments. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

rust: extract `bitfield!` macro from `register!`

Extract the bitfield-defining part of the `register!` macro into an
independent macro used to define bitfield types with bounds-checked
accessors.

Each field is represented as a `Bounded` of the appropriate bit width,
ensuring field values are never silently truncated.

Fields can optionally be converted to/from custom types, either fallibly
or infallibly.

Appropriate documentation is also added, and a MAINTAINERS entry created
for the new module.

Two minor fixups are also applied: the private accessors are inlined,
and a couple of missing fully qualified types in the macro are fixed.

Acked-by: Yury Norov <ynorov@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Yury Norov <ynorov@nvidia.com>
Link: https://patch.msgid.link/20260606-bitfield-v5-1-b92188820914@nvidia.com
[ Added some more intra-doc links. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

ipv6: sit: reload inner IPv6 header after GSO offloads

ipip6_tunnel_xmit() caches the inner IPv6 header pointer at function
entry and continues using it after iptunnel_handle_offloads().

For GSO skbs, iptunnel_handle_offloads() calls skb_header_unclone().
When the skb header is cloned, skb_header_unclone() can call
pskb_expand_head(), which may move the skb head. The pskb_expand_head()
contract requires pointers into the skb header to be reloaded after the
call.

If the later skb_realloc_headroom() branch is not taken, SIT uses the
stale iph6 pointer to read the inner hop limit and DS field. That can
read from a freed skb head after the old head's remaining clone is
released.

Reload iph6 after the offload helper succeeds and before subsequent
reads from the inner IPv6 header. Keep the existing reload after
skb_realloc_headroom(), since that branch can also replace the skb.
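
The reload pattern can be illustrated with a small userspace model. This
is only a sketch under stated assumptions: `struct toy_buf`, `toy_hdr()`,
`toy_expand()` and `toy_hop_limit_after_expand()` are hypothetical names,
and `toy_expand()` always moves the buffer, whereas pskb_expand_head()
only may do so. The hop limit is read at byte offset 7 of the IPv6 header.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an skb head: one allocation plus a header offset. */
struct toy_buf {
	unsigned char *head;
	size_t size;
	size_t hdr_off;
};

/* Derive the header pointer from the current head, never from a cache. */
static const unsigned char *toy_hdr(const struct toy_buf *b)
{
	return b->head + b->hdr_off;
}

/* Grow the buffer into a fresh allocation and free the old one. */
static int toy_expand(struct toy_buf *b, size_t extra)
{
	unsigned char *n = malloc(b->size + extra);

	if (!n)
		return -1;
	memcpy(n, b->head, b->size);
	free(b->head);		/* any pointer into the old head is now stale */
	b->head = n;
	b->size += extra;
	return 0;
}

/* Returns the hop limit, or -1 if the expansion failed. */
static int toy_hop_limit_after_expand(struct toy_buf *b)
{
	const unsigned char *hdr = toy_hdr(b);	/* cached at entry */
	int before = hdr[7];

	if (toy_expand(b, 64))
		return -1;

	hdr = toy_hdr(b);	/* reload: the cached pointer is stale */
	assert(hdr[7] == before);
	return hdr[7];
}
```

Storing an offset and recomputing the pointer after every call that can
reallocate keeps the read valid regardless of whether the head moved.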

Fixes: 14909664e4e1 ("sit: Setup and TX path for sit/UDP foo-over-udp encapsulation")
Signed-off-by: Kyle Zeng <kylebot@openai.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+6eb9ca986d80f6f88cf9@syzkaller.appspotmail.com
Link: https://patch.msgid.link/20260605073448.6524-1-kylebot@openai.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Use effective affinity mask for IRQ selection

When an SF is created after a CPU has been taken offline, the IRQ pool may
contain IRQs with affinity masks that include the offline CPU. Since only
online CPUs are considered for IRQ placement, the cpumask_subset() check
fails because the iter_mask contains offline CPUs that are not present
in req_mask, causing SF creation to fail.

This is an example:
  1. When mlx5 driver loads, it initializes the IRQ pools.
     For sf_ctrl_pool with ≤64 sf:
     - xa_num_irqs = {N, N} (There is only one slot)
  2. When the first SF is created:
     - The ctrl IRQ is allocated with mask=cpu_online_mask={0-191}
  3. We take CPU 20 offline
  4. Existing ctrl IRQ still has mask={0-191}
  5. Create a new SF:
     - req_mask={0-19,21-191}
     - iter_mask={0-191}
     - {0-191} is NOT a subset of {0-19,21-191}
     - least_loaded_irq=NULL
  6. Try to allocate a new irq via irq_pool_request_irq()
  7. xa_alloc() fails because the pool is full (There is only one slot)
  8. sf creation fails with error

Use irq_get_effective_affinity_mask() instead, which returns the IRQ's
actual effective affinity that already excludes offline CPUs.
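
The subset check from the example can be modelled in userspace, scaled
down to 8 CPUs. This is a minimal sketch, not the driver code: the
`cpu_mask_t` type and the helpers `mask_subset()`, `effective_mask()` and
`demo_offline_cpu()` are hypothetical, and treating the effective mask as
"configured mask minus offline CPUs" is an assumption made for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy 8-CPU mask; bit n set means CPU n is in the mask. */
typedef uint8_t cpu_mask_t;

/* True if every CPU in a is also in b (models cpumask_subset(a, b)). */
static bool mask_subset(cpu_mask_t a, cpu_mask_t b)
{
	return (a & ~b) == 0;
}

/*
 * Hypothetical stand-in for the effective affinity: the configured
 * mask with offline CPUs removed.
 */
static cpu_mask_t effective_mask(cpu_mask_t configured, cpu_mask_t online)
{
	return configured & online;
}

/* The scenario above with CPU 2 playing the role of CPU 20. */
void demo_offline_cpu(void)
{
	cpu_mask_t online = 0xFB;	/* CPUs 0-1 and 3-7 are online */
	cpu_mask_t irq_mask = 0xFF;	/* set while all CPUs were online */

	/* The configured mask still lists CPU 2, so the check fails. */
	assert(!mask_subset(irq_mask, online));

	/* With offline CPUs filtered out, the existing IRQ is eligible. */
	assert(mask_subset(effective_mask(irq_mask, online), online));
}
```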

Fixes: 061f5b23588a ("net/mlx5: SF, Use all available cpu for setting cpu affinity")
Suggested-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260605102112.91772-1-fushuai.wang@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: xsk: Fix DMA and xdp_frame leak on XDP_TX xmit failure

In the XSK branch of mlx5e_xmit_xdp_buff(), when sq->xmit_xdp_frame()
returns false (e.g. XDPSQ is full), the function returns without
unmapping the DMA address or freeing the xdp_frame allocated by
xdp_convert_zc_to_xdp_frame(). The xdpi_fifo push only happens on
success, so the completion path cannot recover these entries.

With CONFIG_DMA_API_DEBUG=y, the leak surfaces on driver unbind:

  DMA-API: pci 0000:08:00.0: device driver has pending DMA
  allocations while released from device [count=1116]
  One of leaked entries details: [device address=0x000000010ffd7028]
  [size=1534 bytes] [mapped with DMA_TO_DEVICE] [mapped as phy]
  WARNING: kernel/dma/debug.c:881 at dma_debug_device_change+0x127/0x180
  ...
  DMA-API: Mapped at:
   debug_dma_map_phys+0x4b/0xd0
   dma_map_phys+0xfd/0x2d0
   mlx5e_xdp_handle+0x5ae/0xac0 [mlx5_core]
   mlx5e_xsk_skb_from_cqe_mpwrq_linear+0xc4/0x170 [mlx5_core]
   mlx5e_handle_rx_cqe_mpwrq+0xc1/0x290 [mlx5_core]

Add the missing unmap + xdp_return_frame, matching the cleanup already
done in mlx5e_xdp_xmit(). has_frags is rejected earlier in this branch,
so no per-frag unmap is needed.

Fixes: 84a0a2310d6d ("net/mlx5e: XDP_TX from UMEM support")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260604135446.456119-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Fix slab-out-of-bounds in mlx5_query_nic_vport_mac_list

mlx5_query_nic_vport_mac_list() sizes its firmware command buffer using
the PF's log_max_current_uc/mc_list capabilities. When querying a VF
vport with a larger configured max (via devlink), the firmware response
can overflow this buffer:

BUG: KASAN: slab-out-of-bounds in mlx5_query_nic_vport_mac_list+0x453/0x4c0 [mlx5_core]
Read of size 4 at addr ff1100013ffc8a12 by task kworker/u96:2/385

CPU: 12 UID: 0 PID: 385 Comm: kworker/u96:2 Not tainted 7.0.0-rc6+ #1 PREEMPT
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)
Workqueue: mlx5_esw_wq esw_vport_change_handler [mlx5_core]
Call Trace:
  <TASK>
  dump_stack_lvl+0x69/0xa0
  print_report+0x176/0x4e4
  kasan_report+0xc8/0x100
  mlx5_query_nic_vport_mac_list+0x453/0x4c0 [mlx5_core]
  esw_update_vport_addr_list+0x2e3/0xda0 [mlx5_core]
  esw_vport_change_handle_locked+0xa1f/0x1060 [mlx5_core]
  esw_vport_change_handler+0x6a/0x90 [mlx5_core]
  process_one_work+0x87f/0x15e0
  worker_thread+0x62b/0x1020
  kthread+0x375/0x490
  ret_from_fork+0x4dc/0x810
  ret_from_fork_asm+0x11/0x20
  </TASK>

Fix by querying the vport's own HCA caps to size the buffer correctly.
Refactor the function to allocate and return the MAC list internally,
removing the caller's dependency on knowing the correct max.

Fixes: e16aea2744ab ("net/mlx5: Introduce access functions to modify/query vport mac lists")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260604135849.458060-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: qrtr: fix refcount saturation and potential UAF in qrtr_port_remove

In qrtr_port_remove(), the socket reference count is decremented via
__sock_put() before the port is removed from the qrtr_ports XArray and
before the RCU grace period elapses.

This breaks the fundamental RCU update paradigm. It exposes a race
window where a concurrent RCU reader (such as qrtr_reset_ports() or
qrtr_port_lookup()) can obtain a pointer to the socket from the XArray,
and attempt to call sock_hold() on a socket whose reference count has
already dropped to zero.

This exact race condition was hit during syzkaller fuzzing, leading to
the following refcount saturation warning and a potential Use-After-Free:

  refcount_t: saturated; leaking memory.
  WARNING: CPU: 3 PID: 1273 at lib/refcount.c:22 refcount_warn_saturate+0xae/0x1d0
  Modules linked in: qrtr(+) bochs drm_shmem_helper ...
  Call Trace:
   <TASK>
   qrtr_reset_ports net/qrtr/af_qrtr.c:768 [inline] [qrtr]
   __qrtr_bind.isra.0+0x48b/0x570 net/qrtr/af_qrtr.c:805 [qrtr]
   qrtr_bind+0x17d/0x210 net/qrtr/af_qrtr.c:901 [qrtr]
   kernel_bind+0xe4/0x120 net/socket.c:3592
   qrtr_ns_init+0x1a6/0x380 net/qrtr/ns.c:715 [qrtr]
   qrtr_proto_init+0x3b/0xff0 net/qrtr/af_qrtr.c:169 [qrtr]
   do_one_initcall+0xf5/0x5e0 init/main.c:1283
   ...
   </TASK>

Fix this by deferring the reference count decrement until after the
xa_erase() and the synchronize_rcu() complete.

(Note: The v1 of this patch incorrectly replaced __sock_put() with
sock_put(). As Simon Horman pointed out, the callers of qrtr_port_remove()
still hold a reference to the socket, so freeing the socket memory here
would lead to a subsequent UAF in the caller. Thus, the __sock_put() is
kept, but only repositioned to close the RCU race.)

Fixes: bdabad3e363d ("net: Add Qualcomm IPC router")
Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260604064801.1180388-1-w15303746062@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-phy-some-cleanups-following-phy_port-sfp'

Maxime Chevallier says:

====================
net: phy: some cleanups following phy_port SFP

While posting the v11 of phy_port netlink, sashiko found some
pre-existing issues, and following the tentative fix, Nicolai found
some more :)

This is V3, with a re-ordering of the port/sfp cleanup, as well as a new
patch (patch 3) that also reorders the phy_remove() path.
====================

Link: https://patch.msgid.link/20260604092819.723505-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: don't try to setup PHY-driven SFP cages when using genphy

We don't have support for PHY-driven SFP cages with the genphy code.

On top of that, it was found by sashiko that running
sfp_bus_add_upstream() for genphy deadlocks, as for genphy the PHY
probing runs under RTNL, which isn't the case for non-genphy drivers.

This problem was reproduced, and does lead to a deadlock on RTNL.

Before the blamed commit, the phy_sfp_probe() call was made by
individual PHY drivers, so there was no way to get to the SFP probing
path when using genphy.

Let's therefore only run phy_sfp_probe when not using genphy.

Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Fixes: bad869b5e41a ("net: phy: Only rely on phy_port for PHY-driven SFP")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260604092819.723505-5-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: Clean the phy_ports after unregistering the downstream SFP bus

As reported by sashiko when looking at other patches, we need to ensure
that the downstream SFP bus gets unregistered prior to destroying the
phy_ports attached to a phy_device, as the SFP code may reference these
ports. Let's make sure we follow that ordering in phy_remove().

Fixes: 589e934d2735 ("net: phy: Introduce PHY ports representation")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Link: https://patch.msgid.link/20260604092819.723505-4-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: remove phy ports upon probe failure

When phy_probe fails, let's clean the phy_ports that were successfully
added already.

Suggested-by: Nicolai Buchwitz <nb@tipi-net.de>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Fixes: 589e934d2735 ("net: phy: Introduce PHY ports representation")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260604092819.723505-3-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: clean the sfp upstream if phy probing fails

Sashiko reported that we don't call sfp_bus_del_upstream() in the probe
failure path, so let's add it, otherwise the sfp-bus is left with a
dangling 'upstream' field, that may be used later on during SFP events.

This issue existed before the generic phylib sfp support, back when
drivers were calling phy_sfp_probe themselves.

Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Fixes: 298e54fa810e ("net: phy: add core phylib sfp support")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260604092819.723505-2-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

netdev: fix double-free in netdev_nl_bind_rx_doit()

Sashiko flags that genlmsg_reply() always consumes the skb.
The error path calls nlmsg_free(rsp) so we can't jump directly
to it. Let's not unbind, just propagate the error to the user.
This is the typical way of handling genlmsg_reply() failures.
They shouldn't happen unless user does something silly like
calling the kernel with an already-full rcvbuf.

Reported-by: Sashiko <sashiko-bot@kernel.org>
Fixes: 170aafe35cb9 ("netdev: support binding dma-buf to netdevice")
Reviewed-by: Bobby Eshleman <bobbyeshleman@meta.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phonet: free phonet_device after RCU grace period

phonet_device_destroy() removes a phonet_device from the per-net device
list with list_del_rcu(), but frees it immediately. RCU readers walking
the same list can still hold a pointer to the object after it has been
removed, leading to a slab-use-after-free.

Use kfree_rcu(), matching the lifetime rule already used by
phonet_address_del() for the same object type.

Fixes: eeb74a9d45f7 ("Phonet: convert devices list to RCU")
Cc: stable@vger.kernel.org
Signed-off-by: Santosh Kalluri <santosh.kalluri129@gmail.com>
Acked-by: Rémi Denis-Courmont <remi@remlab.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ibm: emac: Fix use-after-free during device removal

The driver was using devm_register_netdev() which causes unregister_netdev()
to be deferred until the devres cleanup phase, which runs after emac_remove()
returns. This creates a use-after-free window where:

1. emac_remove() is called, which tears down hardware (cancels work, detaches
   modules, unregisters from MAL)
2. emac_remove() returns
3. devres cleanup runs and finally calls unregister_netdev()

During step 3, the network stack might still process packets, triggering
emac_irq(), emac_poll(), or other handlers that access now-freed hardware
resources (dev->emacp, dev->mal, etc.).

Fix this by replacing devm_register_netdev() with manual register_netdev()
and calling unregister_netdev() at the beginning of emac_remove(), before
any hardware teardown. This ensures the network device is fully stopped and
unregistered before hardware resources are released.

The change is safe because:
- dev->ndev is assigned very early in probe (before any error paths that
  could bypass emac_remove)
- platform_set_drvdata() is only called after successful registration, so
  emac_remove() only runs for fully registered devices
- unregister_netdev() is idempotent and safe to call on any registered device

Fixes: a4dd8535a527 ("net: ibm: emac: use devm for register_netdev")
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx4: avoid GCC 10 __bad_copy_from() false positive

mlx4_init_user_cqes() fills a scratch buffer with the CQE
initialization pattern and then copies from that buffer to userspace.

In the single-copy path, the copy length is array_size(entries,
cqe_size), but the scratch buffer is allocated with PAGE_SIZE. GCC 10
does not carry the branch invariant strongly enough through the object
size checks and falsely triggers __bad_copy_from().

Size the scratch buffer to the actual copy length for the active path,
keep array_size() for the single-copy case, and retain a WARN_ON_ONCE()
guard for the PAGE_SIZE invariant before allocating the buffer.

Fixes: f69bf5dee7ef ("net/mlx4: Use array_size() helper in copy_to_user()")
Signed-off-by: Yao Sang <sangyao@kylinos.cn>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: add pskb_may_pull() to skb_gro_receive_list()

skb_gro_receive_list() calls skb_pull(skb, skb_gro_offset(skb)) without
first ensuring the data is in the linear area via pskb_may_pull(). When
the skb arrives via napi_gro_frags(), skb_headlen can be 0 (all data in
page fragments) while skb_gro_offset is non-zero (after IP+TCP header
parsing). The skb_pull() then decrements skb->len by skb_gro_offset
but skb->data_len stays unchanged, hitting BUG_ON(skb->len < skb->data_len)
in __skb_pull().

The UDP fraglist GRO path already contains this guard at
udp_offload.c:749. Adding it to skb_gro_receive_list() itself provides
centralized protection for all callers (TCP, UDP, and any future
protocols), and ensures the precondition of skb_pull() is satisfied
before it is called.

On pskb_may_pull() failure, set NAPI_GRO_CB(skb)->flush = 1 so the
skb is not held as a new GRO head and is instead delivered through the
normal receive path, matching the UDP handling.
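
The length arithmetic behind the BUG_ON() can be checked with a toy
model. This is a sketch under stated assumptions: `struct toy_skb` and the
`toy_*()` helpers are hypothetical, and `toy_may_pull()` only reports
whether the bytes are already linear, while the real pskb_may_pull() can
also move bytes from fragments into the linear area.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two length fields of an skb. */
struct toy_skb {
	unsigned int len;	/* total bytes: linear + paged */
	unsigned int data_len;	/* bytes held in page fragments */
};

/* Bytes available in the linear area. */
static unsigned int toy_headlen(const struct toy_skb *skb)
{
	return skb->len - skb->data_len;
}

/* Simplified guard: are n bytes already in the linear area? */
static bool toy_may_pull(const struct toy_skb *skb, unsigned int n)
{
	return n <= toy_headlen(skb);
}

/*
 * Returns true if the pull was done, false if the caller should
 * flush the skb instead (the flush = 1 case described above).
 */
static bool toy_gro_pull(struct toy_skb *skb, unsigned int gro_offset)
{
	if (!toy_may_pull(skb, gro_offset))
		return false;

	skb->len -= gro_offset;
	/* The invariant that __skb_pull() enforces with BUG_ON(). */
	assert(skb->len >= skb->data_len);
	return true;
}
```

With len = data_len = 1500 (headlen 0) and a 40-byte offset, an unguarded
pull would leave len at 1460 while data_len stays 1500, which is exactly
the len < data_len condition.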

Fixes: 8d95dc474f85 ("net: add code for TCP fraglist GRO")
Reported-by: HanQuan <eilaimemedsnaimel@gmail.com>
Reported-by: MingXuan <bwnie0730@outlook.com>
Signed-off-by: HanQuan <eilaimemedsnaimel@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tcp: restrict SO_ATTACH_FILTER to priv users

This patch restricts the use of SO_ATTACH_FILTER (cBPF) on TCP sockets
to users with CAP_NET_ADMIN capability.

This blocks a potential side-channel attack where an unprivileged application
attaches a filter to leak TCP sequence/acknowledgment numbers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Tamir Shahar <tamirthesis@gmail.com>
Reported-by: Amit Klein <aksecurity@gmail.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

driver core: platform: set mod_name in driver registration

Pass KBUILD_MODNAME through the driver registration macro so that the
driver core can create the module symlink in sysfs for built-in drivers,
and fixup all callers.

The Rust platform adapter is updated to pass the module name through to
the new parameter.

Tested on qemu with:
- x86 defconfig + CONFIG_RUST
- arm64 defconfig + CONFIG_RUST + CONFIG_CORESIGHT stuff

Examples after this patch:

    /sys/bus/platform/drivers/...
        coresight-itnoc/module -> coresight_tnoc
        coresight-static-tpdm/module -> coresight_tpdm
        coresight-catu-platform/module -> coresight_catu
        serial8250/module -> 8250
        acpi-ged/module -> acpi
        vmclock/module -> ptp_vmclock

Co-developed-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Link: https://patch.msgid.link/20260518-acpi_mod_name-v5-4-705ccc430885@sony.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

coresight: pass THIS_MODULE implicitly through a macro

Rename coresight_init_driver() to coresight_init_driver_with_owner() and
replace it with a macro wrapper that passes THIS_MODULE implicitly. This
is in line with what other buses do.

Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Co-developed-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Link: https://patch.msgid.link/20260518-acpi_mod_name-v5-3-705ccc430885@sony.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

kernel: param: initialize module_kset in a pure_initcall

Commit "driver core: platform: set mod_name in driver registration" will
set struct device_driver's mod_name member for platform driver
registration. For a driver to be registered with its mod_name set,
module_kset needs to be initialized, which currently happens in a
subsys_initcall in param_sysfs_init(). The tegra cbb drivers register
themselves before module_kset init, in a core_initcall. This works
currently because lookup_or_create_module_kobject(), which dereferences
module_kset via kset_find_obj(), is not called if mod_name is not set,
which is the case now.

So in preparation for the commit "driver core: platform: set mod_name in
driver registration", move module_kset init to pure_initcall level,
ensuring it happens before tegra cbb driver registration.

Suggested-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Gary Guo <gary@garyguo.net>
Co-developed-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Link: https://patch.msgid.link/20260601101942.4002661-1-shashank.mahadasyam@sony.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

soc/tegra: cbb: Move driver registration from pure_initcall to core_initcall

Commit "driver core: platform: set mod_name in driver registration" will
set struct device_driver's mod_name member for platform driver
registration. For a driver to be registered with its mod_name set,
module_kset needs to be initialized, which currently happens in a
subsys_initcall in param_sysfs_init(). The tegra cbb drivers register
themselves before module_kset init, in a pure_initcall. This works
currently because lookup_or_create_module_kobject(), which dereferences
module_kset via kset_find_obj(), is not called if mod_name is not set,
which is the case now.

So in preparation for the commit "driver core: platform: set mod_name in
driver registration", move tegra cbb driver registration to
core_initcall level, and commit "kernel: param: initialize module_kset
in a pure_initcall" will move module_kset init to pure_initcall level,
ensuring module_kset init happens before tegra cbb driver registration.
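
The ordering argument can be written down as a small model. This is an
illustrative sketch only: the enum and the helpers `runs_before()` and
`demo_module_kset_ordering()` are hypothetical, while the numeric values
follow the order in which the kernel runs initcall levels (pure first,
then core, postcore, arch, subsys).

```c
#include <assert.h>
#include <stdbool.h>

/* Initcall levels in the order they run at boot. */
enum initcall_level {
	LEVEL_PURE = 0,
	LEVEL_CORE = 1,
	LEVEL_POSTCORE = 2,
	LEVEL_ARCH = 3,
	LEVEL_SUBSYS = 4,
};

/*
 * True if work at level dep is guaranteed to finish before work at
 * level user starts. Within one level the order depends on link
 * order, so equal levels give no guarantee.
 */
static bool runs_before(enum initcall_level dep, enum initcall_level user)
{
	return dep < user;
}

void demo_module_kset_ordering(void)
{
	/* Before: module_kset init at subsys, tegra cbb at pure. */
	assert(!runs_before(LEVEL_SUBSYS, LEVEL_PURE));

	/* After both moves: module_kset at pure, tegra cbb at core. */
	assert(runs_before(LEVEL_PURE, LEVEL_CORE));
}
```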

Suggested-by: Gary Guo <gary@garyguo.net>
Acked-by: Sumit Gupta <sumitg@nvidia.com>
Co-developed-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://patch.msgid.link/20260518-acpi_mod_name-v5-1-705ccc430885@sony.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

MAINTAINERS: i2c: designware: Remove inactive reviewer

Emails to Jan Dabros bounce with a permanent failure due to an
inactive account. Remove him from the list of reviewers.

Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

i2c: tegra: Fix NOIRQ suspend/resume

The Tegra I2C driver relies on runtime PM to wake up the controller before
each transfer. However, runtime PM is disabled between the system suspend
and NOIRQ suspend. If an I2C device initiates a transfer during this
window, the I2C controller fails to wake up and the transfer fails. To
handle this, the controller must be kept available for this period to
allow transfers.

Rework the I2C controller's system PM callbacks such that the controller
is resumed from runtime suspend during system suspend and it stays
RPM_ACTIVE throughout the suspend-resume cycle until it is runtime
suspended back in the system resume. The clocks are disabled in NOIRQ
suspend and enabled back in NOIRQ resume by calling the controller's
runtime PM functions directly.

Fixes: 8ebf15e9c869 ("i2c: tegra: Move suspend handling to NOIRQ phase")
Assisted-by: Cursor:claude-4.6-opus
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260518114013.62065-5-akhilrajeev@nvidia.com

i2c: tegra: Update Tegra410 I2C timing parameters

Update Tegra410 I2C timing parameters based on hardware characterization
results. This adjusts the fast mode and HS mode settings to be compliant
with the I2C specification.

Fixes: 59717f260183 ("i2c: tegra: Add support for Tegra410")
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260518114013.62065-4-akhilrajeev@nvidia.com

create_default_group(): pass parent's dentry instead of config_group

the only way parent_group is used there...

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

configfs_attach_group(): drop the unused parent_item argument

This one *was* used - for passing it to configfs_attach_item(), which
didn't use the value passed to it.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

configfs_attach_item(): drop unused parent_item argument

That argument has been unused since the initial merge in 2005.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

configfs_create(): lift parent timestamp updates into callers

... and do *not* do it in ->lookup() case. stat foo/bar
should not update mtime of foo, TYVM...

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

kill configfs_drop_dentry()

Fold into the only remaining user, don't bother with the timestamps
of parent - we are going to rmdir it shortly anyway, which will
override those.

Fix the locking of inode, while we are at it - updating the link
count and timestamps ought to be done with the inode locked.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

configfs: mark pinned dentries persistent

on the removal side we can (finally) get rid of __simple_unlink()
and __simple_rmdir() kludges now that dentries in question are
properly marked persistent - simple_unlink() and simple_rmdir()
will do the right thing for those.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

configfs: dentry refcount needs to be pinned only once

currently we have a weird situation where
* symlinks and roots of subtrees created by mkdir are pinned once
* subdirectories of subtrees created by mkdir are pinned twice
* roots of subtrees created by register_{group,subsystem} are pinned
twice.

It makes things harder to follow for no good reason.  The goal is to
encapsulate the unbalanced dget/dput into d_{make,discard}_persistent()
and, preferably, allow a use of simple_recursive_removal() or analogue
thereof.  So let's regularize that and pin things only once.

create_default_group() and configfs_register_subsystem() don't need to
keep their reference around on success - configfs_create_dir() has pinned
the sucker already.  So we can drop the reference passed to
configfs_create_dir() (via configfs_attach_group(), etc.) both on success
and on failure.  On the removal side we no longer have the double references,
so we need an explicit dget() to compensate.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

switch configfs_detach_{group,item}() to passing dentry

... and there's no need to grab/drop it, or check for NULL - none
of the callers would even get there with NULL dentry and all of
them have the sucker pinned

Note that if sd is a directory configfs_dirent, we have sd->s_element
pointing to config_item with item->ci_dentry equal to sd->s_dentry.
Which is the only reason why detach_groups() gets away with using
the latter for locking the inode and the former for removal.

Aren't redundant data structures wonderful, for obfuscation if nothing
else?

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

configfs_remove_dir(), detach_attrs(): switch to passing dentry

... and deal with grabbing/dropping it in the sole caller.
After that configfs_remove_dir() becomes an unconditional call of remove_dir(),
so we can fold them together.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

populate_attrs(): move cleanup to the sole caller

... where it folds with configfs_remove_dir() into a call of
configfs_detach_item(). Note that at the early failure exit
(before we'd added any children) we were not calling detach_attrs()
only because there it would've been a no-op - nothing added,
nothing there to be removed.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

populate_group(): move cleanup on failure to the sole caller

... where it folds with configfs_detach_item() into a call of
configfs_detach_group().

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

configfs_detach_rollback(): pass configfs_dirent instead of dentry

same story as with configfs_detach_prep() this function is undoing.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

configfs_do_depend_item(): pass configfs_dirent instead of dentry

Again, the only thing it uses the argument for is its ->d_fsdata
and callers already have that - as a matter of fact, they are
passing ->s_dentry of that configfs_dirent, so that the function
could get it back as ->d_fsdata of that. With nothing else in
dentry even looked at...

configfs_dirent in question is a directory one - in this case those
are subdirectories of root (aka roots of "subsystem" trees).

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

configfs_depend_prep(): pass configfs_dirent instead of dentry

Again, the only thing it uses dentry for is dentry->d_fsdata; for the
recursive call the situation is the same as with configfs_detach_prep()
and the same observation about ->s_dentry->d_fsdata applies.

Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

configfs_detach_prep(): pass configfs_dirent instead of dentry

The only thing it uses the argument for is its ->d_fsdata and
all callers have that already available.

Note that in the recursive call we are dealing with a (sub)directory
configfs_dirent, and for those ->s_dentry->d_fsdata points back
to configfs_dirent itself.

Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

configfs_mkdir(): use take_dentry_name_snapshot()

Note that neither ->make_group() nor ->make_item() are allowed to modify
the string passed to them - the argument is const char *.

Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

configfs: fix lockless traversals of ->s_children

Having the parent directory locked protects entries from removal
by another thread, but it does *not* protect cursors from being
moved around by lseek() - or freed, for that matter.

Fixes: 6f6107640625 ("configfs: Introduce configfs_dirent_lock")
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

drm/virtio: fix dma_fence refcount leak on error in virtio_gpu_dma_fence_wait()

dma_fence_unwrap_for_each() internally calls dma_fence_unwrap_first()
which does cursor->chain = dma_fence_get(head), taking an extra
reference. On normal loop completion, dma_fence_unwrap_next()
releases this via dma_fence_chain_walk() -> dma_fence_put().

When virtio_gpu_do_fence_wait() fails and the function returns early
from inside the loop, the cursor->chain reference is never released.
This is the only caller in the entire kernel that does an early return
inside dma_fence_unwrap_for_each.

Add dma_fence_put(itr.chain) before the early return.
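
The leak pattern can be modelled in user space. This is only a sketch:
struct fence, struct cursor, fence_get(), fence_put() and wait_all() are
hypothetical stand-ins, not the dma-fence API, and the real unwrap
iterator does more than hold a single reference.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a refcounted fence and the unwrap cursor. */
struct fence { int refcount; int fails; };
struct cursor { struct fence *chain; };

static struct fence *fence_get(struct fence *f)
{
	if (f)
		f->refcount++;
	return f;
}

static void fence_put(struct fence *f)
{
	if (f)
		f->refcount--;
}

/* Wait on n fences; the cursor pins 'head' for the duration of the walk. */
static int wait_all(struct fence *head, struct fence *fences, size_t n)
{
	struct cursor itr = { .chain = fence_get(head) };
	size_t i;

	for (i = 0; i < n; i++) {
		if (fences[i].fails) {
			/* The fix: drop the cursor reference before bailing out. */
			fence_put(itr.chain);
			return -1;
		}
	}
	/* Normal completion releases the cursor reference. */
	fence_put(itr.chain);
	return 0;
}
```

Without the fence_put() in the error branch, head's count in this model
would stay elevated after every failed wait.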

Cc: stable@vger.kernel.org
Fixes: eba57fb5498f ("drm/virtio: Wait for each dma-fence of in-fence array individually")
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patch.msgid.link/20260607090303.92423-1-vulab@iscas.ac.cn

i2c: qcom-cci: Fix NULL pointer dereference in cci_remove()

On all modern platforms the Qualcomm CCI controller provides two I2C masters,
but on some boards only one of them may be initialized. In such cases device
unbinding or driver removal causes a NULL pointer dereference, because
cci_halt() is called for both I2C masters while a completion is initialized
only for the single enabled master:

    % rmmod i2c-qcom-cci
    Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
    <snip>
    Call trace:
    __wait_for_common+0x194/0x1a8 (P)
    wait_for_completion_timeout+0x20/0x2c
    cci_remove+0xc4/0x138 [i2c_qcom_cci]
    platform_remove+0x20/0x30
    device_remove+0x4c/0x80
    device_release_driver_internal+0x1c8/0x224
    driver_detach+0x50/0x98
    bus_remove_driver+0x6c/0xbc
    driver_unregister+0x30/0x60
    platform_driver_unregister+0x14/0x20
    qcom_cci_driver_exit+0x18/0x1008 [i2c_qcom_cci]
    ....

Fixes: e517526195de ("i2c: Add Qualcomm CCI I2C driver")
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260515234121.1607425-2-vladimir.zapolskiy@linaro.org

firmware_loader: Fix recursive lock in device_cache_fw_images()

A recursive locking deadlock can occur in the firmware loader's power
management notification handler.

During system suspend or hibernation preparation, fw_pm_notify() calls
device_cache_fw_images(). This function acquires fw_lock to set the
firmware cache state to FW_LOADER_START_CACHE and then iterates over all
devices using dpm_for_each_dev() while still holding the lock.

For each device, dev_cache_fw_image() schedules asynchronous work to cache
the firmware. If memory allocation for the async work entry fails (e.g., in
out-of-memory conditions), async_schedule_node_domain() falls back to
executing the work function synchronously in the current thread.

The synchronous execution path (__async_dev_cache_fw_image() ->
cache_firmware() -> request_firmware() -> assign_fw()) attempts to acquire
fw_lock again. Since the current thread already holds fw_lock, this results
in a recursive locking deadlock.

Fix this by releasing fw_lock immediately after updating the cache state
and before calling dpm_for_each_dev(). The lock is only needed to protect
the state update. Concurrent firmware requests will correctly see the
FW_LOADER_START_CACHE state and use the piggyback mechanism, which is
independently protected by its own fwc->name_lock.
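
A user-space sketch of the locking order, under stated assumptions: the
flag below models a non-recursive mutex, and lock(), unlock(),
cache_one_image() and cache_all_images() are hypothetical names, not the
firmware loader's functions. A failed lock() stands for the point where a
real mutex would self-deadlock.

```c
#include <stdbool.h>

static bool fw_lock_held;   /* models a non-recursive mutex */
static int cache_state;     /* 1 models the "start caching" state */

/* Returns false where a real mutex would deadlock on itself. */
static bool lock(void)
{
	if (fw_lock_held)
		return false;
	fw_lock_held = true;
	return true;
}

static void unlock(void)
{
	fw_lock_held = false;
}

/* Per-device work that may run synchronously and needs the lock itself. */
static bool cache_one_image(void)
{
	if (!lock())
		return false;
	unlock();
	return true;
}

static bool cache_all_images(int ndevs, bool release_before_walk)
{
	bool ok = true;

	if (!lock())
		return false;
	cache_state = 1;            /* the only update the lock protects */
	if (release_before_walk)
		unlock();           /* fixed ordering */
	for (int i = 0; i < ndevs; i++)
		ok = cache_one_image() && ok;
	if (!release_before_walk)
		unlock();           /* old ordering: lock held across the walk */
	return ok;
}
```

In this model cache_all_images(3, true) succeeds, while
cache_all_images(3, false) reports the recursive acquisition.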

Fixes: ac39b3ea73aa ("firmware loader: let caching firmware piggyback on loading firmware")
Assisted-by: Gemini:gemini-3.1-pro-preview Gemini:gemini-3-flash-preview syzbot
Reported-by: syzbot+e70e4c6f6eee43357ba7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e70e4c6f6eee43357ba7
Link: https://syzkaller.appspot.com/ai_job?id=8b4af9fd-24af-423f-8acb-1159fd34c1a5
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/48b092a5-f49d-48a4-95f4-f65bebfc6bc3@mail.kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

i2c: stm32f7: fix timing computation ignoring i2c-analog-filter

stm32f7_i2c_compute_timing() uses i2c_dev->analog_filter to pick
the analog filter delay, but i2c_dev->analog_filter is parsed from
the "i2c-analog-filter" DT property only after the compute_timing
loop in stm32f7_i2c_setup_timing(), so in practice the timing
calculations always ignore the analog filter. On an STM32MP1 board
with clock-frequency = <400000> and i2c-analog-filter set, measured
SCL frequency was ~382 kHz.

This also affects (widens) the computed SDADEL range. At high bus
clock speeds, this can select an SDADEL value that violates tVD;DAT
(data valid time).

Fix by parsing "i2c-analog-filter" before the compute_timing loop.

Fixes: 83c3408f7b9c ("i2c: stm32f7: support DT binding i2c-analog-filter")
Signed-off-by: Guillermo Rodríguez <guille.rodriguez@gmail.com>
Cc: <stable@vger.kernel.org> # v5.13+
Acked-by: Alain Volmat <alain.volmat@foss.st.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260526091210.20383-1-guille.rodriguez@gmail.com

ASoC: wm_adsp: Fix NULL dereference when removing firmware controls

In wm_adsp_control_remove() check that the priv pointer is not NULL
before attempting to clean up what it points to.

When cs_dsp creates a control it calls wm_adsp_control_add_cb() so that
wm_adsp can create its own private control data. There are two cases
where private data is not created:

1. The control is a SYSTEM control, so an ALSA control is not created.

2. The codec driver has registered a control_add() callback that
hides the control, so wm_adsp_control_add() is not called.

When cs_dsp_remove destroys its control list it calls
wm_adsp_control_remove() for each control. But wm_adsp_control_remove()
was attempting to clean up the private data pointed to by cs_ctl->priv
without checking the pointer for NULL.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 0700bc2fb94c ("ASoC: wm_adsp: Separate generic cs_dsp_coeff_ctl handling")
Link: https://patch.msgid.link/20260604101244.1402862-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>

i2c: imx: fix clock and pinctrl state inconsistency in runtime PM

In i2c_imx_runtime_suspend(), the clock is disabled before switching
the pinctrl state to sleep. If pinctrl_pm_select_sleep_state() fails,
the runtime suspend is aborted but the clock remains disabled, causing
a system crash when the hardware is subsequently accessed.

Fix this by switching the pinctrl state before disabling the clock so
that a pinctrl failure leaves the clock enabled and the hardware
accessible.

In i2c_imx_runtime_resume(), restore the pinctrl state back to sleep
if clk_enable() fails, to keep the clock and pinctrl state consistent.
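
The ordering can be illustrated with a small user-space model. Everything
here is hypothetical (struct i2c_dev_model, select_pins(), the model_*
functions and the failure knobs); it only mirrors the order of operations
described above, not the driver's code.

```c
/* Hypothetical model of the device state; not the driver's structures. */
struct i2c_dev_model {
	int clk_enabled;
	int pins_sleep;
	int pinctrl_fails;   /* force the pinctrl switch to fail */
	int clk_fails;       /* force the clock enable to fail */
};

static int select_pins(struct i2c_dev_model *d, int sleep)
{
	if (d->pinctrl_fails)
		return -1;
	d->pins_sleep = sleep;
	return 0;
}

static int model_runtime_suspend(struct i2c_dev_model *d)
{
	int ret = select_pins(d, 1);   /* switch pins first */

	if (ret)
		return ret;            /* failure leaves the clock enabled */
	d->clk_enabled = 0;
	return 0;
}

static int model_runtime_resume(struct i2c_dev_model *d)
{
	int ret = select_pins(d, 0);

	if (ret)
		return ret;
	if (d->clk_fails) {
		select_pins(d, 1);     /* roll the pins back to sleep */
		return -1;
	}
	d->clk_enabled = 1;
	return 0;
}
```

After any failed call in this model, the clock and pin state still
describe the same power state.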

Fixes: 576eba03c994 ("i2c: imx: switch different pinctrl state in different system power status")
Signed-off-by: Carlos Song <carlos.song@nxp.com>
Cc: <stable@vger.kernel.org> # v6.14+
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260521065038.2954998-1-carlos.song@oss.nxp.com

s390: Remove GENERIC_LOCKBREAK Kconfig option

s390 selects GENERIC_LOCKBREAK if PREEMPT is enabled. The reason is an
18-year-old commit [1] which fixed a compile error for PREEMPT-enabled
kernels. Back then only PREEMPT_NONE and PREEMPT_VOLUNTARY kernels were
considered to be important for s390. PREEMPT should "just work".

However, since recently PREEMPT is always enabled [2], which also causes
GENERIC_LOCKBREAK to be always enabled. For some workloads this leads to
massive performance degradation; e.g. a simple kernel compile on machines
with many CPUs may take up to four times longer.

To fix this just remove the GENERIC_LOCKBREAK from s390's Kconfig, since
the compile error from 18 years ago does not exist anymore.

[1] commit b6b40c532a36 ("[S390] Define GENERIC_LOCKBREAK.")
[2] commit 7dadeaa6e851 ("sched: Further restrict the preemption modes")

Cc: stable@vger.kernel.org
Reported-by: Massimiliano Pellizzer <massimiliano.pellizzer@canonical.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>

RDMA/srp: bound SRP_RSP sense copy by the received length

srp_process_rsp() copies sense data from rsp->data + resp_data_len,
where resp_data_len is the full 32-bit value supplied by the SRP target
and is never checked against the number of bytes actually received
(wc->byte_len). The copy length is bounded to SCSI_SENSE_BUFFERSIZE, so
at most 96 bytes are copied, but the source offset is not bounded.

A malicious or compromised SRP target on the InfiniBand/RoCE fabric that
the initiator has logged into can return an SRP_RSP with
SRP_RSP_FLAG_SNSVALID set and a large resp_data_len. The receive buffer
is allocated at the target-chosen max_ti_iu_len, so the source of the
sense copy lands past the bytes actually received; with resp_data_len
near 0xFFFFFFFF it is gigabytes past the buffer and the read faults.

Copy the sense data only if it has not been truncated, that is, only if
the response header, the response data, and the sense region fit within
the bytes actually received; otherwise drop the sense and log. The
in-tree iSER and NVMe-RDMA receive paths already bound their parse by
wc->byte_len; this brings ib_srp into line with them.
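
A minimal sketch of such a bound, assuming all three lengths are 32-bit
values; sense_fits() and its parameter names are hypothetical and the
actual patch may compute the sense region differently.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True only if the header, the response data and the sense data all lie
 * within the bytes actually received.
 */
static bool sense_fits(uint32_t hdr_len, uint32_t resp_data_len,
		       uint32_t sense_data_len, uint32_t byte_len)
{
	/* Sum in 64 bits so three 32-bit operands cannot wrap around. */
	uint64_t end = (uint64_t)hdr_len + resp_data_len + sense_data_len;

	return end <= byte_len;
}
```

Doing the addition in 64 bits matters: with a 32-bit sum, a
resp_data_len near 0xFFFFFFFF could wrap to a small value and pass the
check.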

Fixes: aef9ec39c47f ("IB: Add SCSI RDMA Protocol (SRP) initiator")
Link: https://patch.msgid.link/r/20260602220457.2542840-1-michael.bommarito@gmail.com
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

IB/isert: Reject login PDUs shorter than ISER_HEADERS_LEN

In drivers/infiniband/ulp/isert/ib_isert.c, isert_login_recv_done()
computes the login request payload length as wc->byte_len minus
ISER_HEADERS_LEN with no lower bound, and login_req_len is a signed int.
A remote iSER initiator can post a login Send work request carrying
fewer than ISER_HEADERS_LEN (76) bytes, so the subtraction underflows
and login_req_len becomes negative.

isert_rx_login_req() then reads that negative length back into a signed
int, takes size = min(rx_buflen, MAX_KEY_VALUE_PAIRS), and because the
min() is signed it keeps the negative value; the value is then passed as
the memcpy() length and sign-extended to a multi-gigabyte size_t. The
copy into the 8192-byte login->req_buf runs far out of bounds and
faults, crashing the target node. The login phase precedes iSCSI
authentication, so no credentials are required to reach this path.

Reject any login PDU shorter than ISER_HEADERS_LEN before the
subtraction, mirroring the existing early return on a failed work
completion, so login_req_len can never go negative. The upper bound was
already safe: a posted login buffer cannot deliver more than
ISER_RX_PAYLOAD_SIZE, so the difference stays at or below
MAX_KEY_VALUE_PAIRS and the existing min() clamps it; only the missing
lower bound needs to be added.
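
The missing lower bound can be sketched as follows. HEADERS_LEN uses the
value 76 quoted above; login_payload_len() is a hypothetical helper, not
the driver's function.

```c
#include <stdint.h>

#define HEADERS_LEN 76u   /* ISER_HEADERS_LEN as quoted above */

/*
 * Returns the login payload length, or -1 for a PDU too short to hold
 * the headers. Assumes byte_len is already capped by the size of the
 * posted receive buffer, so the result fits in an int.
 */
static int login_payload_len(uint32_t byte_len)
{
	if (byte_len < HEADERS_LEN)
		return -1;          /* reject before subtracting */
	return (int)(byte_len - HEADERS_LEN);
}
```

With the check in place the subtraction only runs when byte_len is at
least HEADERS_LEN, so the result is never negative.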

Fixes: b8d26b3be8b3 ("iser-target: Add iSCSI Extensions for RDMA (iSER) target driver")
Link: https://patch.msgid.link/r/20260602194642.2273217-1-michael.bommarito@gmail.com
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

cpufreq: Use policy->min/max init as QoS request

Modify cpufreq_policy_init_qos() introduced previously to use
policy->min/max set in the driver .init() callback as the initial
values for the policy min/max frequency QoS requests, respectively,
so long as they are different from 0 (which means that they have
been updated by the driver). Update the documentation in accordance
with that code change.

This only affects the following drivers:

- gx-suspmod (min)
- cppc-cpufreq (min)
- longrun (min/max)
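
The selection rule described above ("use the driver value unless it is
0") can be sketched like this. struct policy_model, the MODEL_* defaults
and both helpers are hypothetical; the default values are assumptions
chosen for illustration.

```c
#include <limits.h>

/* Assumed defaults for the sketch; not taken from the kernel headers. */
#define MODEL_QOS_MIN_DEFAULT 0
#define MODEL_QOS_MAX_DEFAULT INT_MAX

struct policy_model {
	unsigned int min;   /* 0 means "driver .init() did not set it" */
	unsigned int max;
};

static int initial_min_request(const struct policy_model *p)
{
	return p->min ? (int)p->min : MODEL_QOS_MIN_DEFAULT;
}

static int initial_max_request(const struct policy_model *p)
{
	return p->max ? (int)p->max : MODEL_QOS_MAX_DEFAULT;
}
```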

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
[ rjw: Changelog rewrite ]
Link: https://patch.msgid.link/20260528090913.2759118-5-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cpufreq: Remove driver default policy->min/max init

Prior to commit 521223d8b3ec ("cpufreq: Fix initialization of min and
max frequency QoS requests"), drivers were setting policy->min/max and
these values were used as initial policy QoS constraints.

After the above commit, these values are only used temporarily, as
cpufreq_set_policy() ultimately overrides them through:

cpufreq_policy_online()
\-cpufreq_init_policy()
\-cpufreq_set_policy()
\-/* Set policy->min/max */

A subsequent change will restore the previous behavior allowing
drivers to request special min/max QoS frequencies instead of
FREQ_QOS_MIN_DEFAULT_VALUE and FREQ_QOS_MAX_DEFAULT_VALUE, respectively,
if desired. For instance, the CPPC driver wants to advertise the lowest
non-linear frequency that should be used as the initial minimum
frequency QoS request.

However, for this purpose, all drivers setting policy->min/max to
policy->cpuinfo.min/max_freq, respectively, need to be updated so
their initial policy->min/max settings don't limit the frequency
scaling unnecessarily going forward (which would defeat the purpose
of commit 521223d8b3ec), so do that.

This does not actually alter the observed behavior of all of
the drivers in question because setting policy->min/max to
policy->cpuinfo.min/max_freq, respectively, is not necessary or
even useful any more after a previous change ("cpufreq: Set default
policy->min/max values for all drivers").

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Jie Zhan <zhanjie9@hisilicon.com>
[ rjw: Changelog rewrite ]
Link: https://patch.msgid.link/20260528090913.2759118-4-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cpufreq: Set default policy->min/max values for all drivers

Some drivers set policy->min/max in their .init() callback, but
cpufreq_set_policy() will ultimately override them through:

cpufreq_policy_online()
\-cpufreq_init_policy()
\-cpufreq_set_policy()
\-/* Set policy->min/max */

Thus the policy min/max values set by the drivers are only temporary.

There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and
cpufreq_policy_online() calls __cpufreq_driver_target() which invokes
cpufreq_driver->target().

To prepare for a subsequent change that will remove all initialization
of policy->min/max in driver .init() callbacks if the min/max value is
equal to the corresponding cpuinfo.min/max_freq, set default
policy->min/max values in the core for all drivers.

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
[ rjw: Edits of the new comment and changelog ]
Link: https://patch.msgid.link/20260528090913.2759118-3-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cpufreq: Extract cpufreq_policy_init_qos() function

Extract the QoS-related logic from cpufreq_policy_online()
to make that function shorter/simpler.

The logic is placed in cpufreq_policy_init_qos() and is
now executed right after the following calls:

- cpufreq_driver->init()
- cpufreq_table_validate_and_sort()

This facilitates subsequent changes that will, in
cpufreq_policy_init_qos():

- Set a default policy->min/max value for all policies.
- Use the policy->min/max values set by drivers as initial request
values for policy frequency QoS requests.

No functional change.

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20260528090913.2759118-2-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

RDMA: During rereg_mr ensure that REREG_ACCESS is compatible

If IB_MR_REREG_ACCESS changes from RO to RW then the umem has to be
re-evaluated to ensure it is properly pinned as RW. Since the umem is
hidden inside each driver's mr struct add an ib_umem_check_rereg() function
that each driver has to call before processing IB_MR_REREG_ACCESS.

mlx4 has to retain its duplicate ib_access_writable check because it
implements IB_MR_REREG_ACCESS | IB_MR_REREG_TRANS by changing both items
in place sequentially while the MR is live, so it will continue to not
support this combination.

Cc: stable@vger.kernel.org
Fixes: b40656aa7d55 ("RDMA/umem: remove FOLL_FORCE usage")
Link: https://patch.msgid.link/r/0-v1-06fb1a2d6cf5+107-rereg_access_jgg@nvidia.com
Reported-by: Philip Tsukerman <philiptsukerman@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

i2c: riic: fix refcount leak in riic_i2c_resume_noirq()

When riic_i2c_resume_noirq() is called, it deasserts the reset
using reset_control_deassert(), which for shared resets increments
a reference count. If pm_runtime_force_resume() then fails, the
function returns without calling reset_control_assert() to
decrement the count. This leaves the reset deasserted and the
reference count unbalanced, which can prevent other users of the
shared reset from properly asserting it later.

Fix the leak by calling reset_control_assert() on the error
handling path for a failed pm_runtime_force_resume().
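
A small model of the balanced error path, assuming a shared reset is
represented by a plain counter; struct reset_model and
model_resume_noirq() are hypothetical names.

```c
/* Hypothetical model of a shared reset line's deassert count. */
struct reset_model {
	int deassert_count;
};

/* force_resume_ret models the result of the forced runtime resume. */
static int model_resume_noirq(struct reset_model *r, int force_resume_ret)
{
	r->deassert_count++;            /* deassert takes a count */
	if (force_resume_ret) {
		r->deassert_count--;    /* the fix: assert again on failure */
		return force_resume_ret;
	}
	return 0;
}
```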

Fixes: e383f0961422 ("i2c: riic: Move suspend handling to NOIRQ phase")
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Cc: <stable@vger.kernel.org> # v6.19+
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260608071123.128964-1-vulab@iscas.ac.cn

Merge tag 'v7.1-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:

- Fix random config build failure on s390.

* tag 'v7.1-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: s390 - add select CRYPTO_AEAD for aes

gpio: mvebu: fix NULL pointer dereference in suspend/resume

mvebu_pwm_suspend() and mvebu_pwm_resume() are called for all GPIO
banks during suspend/resume, but not all banks have PWM functionality.
GPIO banks without PWM have mvchip->mvpwm set to NULL.

Calling mvebu_pwm_suspend() with mvpwm == NULL causes a NULL pointer
dereference when it tries to access mvpwm->blink_select.

  Unable to handle kernel NULL pointer dereference at virtual address 00000020 when write
  [00000020] *pgd=00000000
  Internal error: Oops: 815 [#1] PREEMPT ARM
  Modules linked in:
  CPU: 0 UID: 0 PID: 406 Comm: sh Not tainted 6.12.74-rt12-yocto-standard-g4e96f98fb7db-dirty #353
  Hardware name: Marvell Armada 370/XP (Device Tree)
  PC is at regmap_mmio_read+0x38/0x54
  LR is at regmap_mmio_read+0x38/0x54
  pc : [<c05fd2ac>]    lr : [<c05fd2ac>]    psr: 200f0013
  sp : f0c11d10  ip : 00000000  fp : c100d2f0
  r10: c14fb854  r9 : 00000000  r8 : 00000000
  r7 : c1799c00  r6 : 00000020  r5 : 00000020  r4 : c179c7c0
  r3 : f0a231a0  r2 : 00000020  r1 : 00000020  r0 : 00000000
  Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
  Control: 10c5387d  Table: 135ec059  DAC: 00000051
  Call trace:
   regmap_mmio_read from _regmap_bus_reg_read+0x78/0xac
   _regmap_bus_reg_read from _regmap_read+0x60/0x154
   _regmap_read from regmap_read+0x3c/0x60
   regmap_read from mvebu_gpio_suspend+0xa4/0x14c
   mvebu_gpio_suspend from dpm_run_callback+0x54/0x180
   dpm_run_callback from device_suspend+0x124/0x630
   device_suspend from dpm_suspend+0x124/0x270
   dpm_suspend from dpm_suspend_start+0x64/0x6c
   dpm_suspend_start from suspend_devices_and_enter+0x140/0x8e8
   suspend_devices_and_enter from pm_suspend+0x2fc/0x308
   pm_suspend from state_store+0x6c/0xc8
   state_store from kernfs_fop_write_iter+0x10c/0x1f8
   kernfs_fop_write_iter from vfs_write+0x270/0x468
   vfs_write from ksys_write+0x70/0xf0
   ksys_write from ret_fast_syscall+0x0/0x54

Add a NULL check for mvchip->mvpwm before calling the PWM
suspend/resume functions.
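
The guard amounts to the following, shown as a user-space sketch with
hypothetical types (struct mvpwm_model, struct chip_model) and a
hypothetical model_suspend(); the driver's real suspend path saves more
state than this.

```c
#include <stddef.h>

/* Hypothetical per-bank PWM state; NULL means the bank has no PWM. */
struct mvpwm_model {
	int saved;
};

struct chip_model {
	struct mvpwm_model *mvpwm;
};

static int model_suspend(struct chip_model *c)
{
	/* Only touch PWM state on banks that actually have it. */
	if (c->mvpwm)
		c->mvpwm->saved = 1;
	return 0;
}
```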

Fixes: 757642f9a584 ("gpio: mvebu: Add limited PWM support")
Signed-off-by: Yun Zhou <yun.zhou@windriver.com>
Link: https://patch.msgid.link/20260608084334.2960803-1-yun.zhou@windriver.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

Merge tag 'hyperv-fixes-signed-20260607' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv fixes from Wei Liu:

- MSHV driver fixes from various people (Anirudh Rayabharam, Can Peng,
   Dexuan Cui, Michael Kelley, Jork Loeser, Wei Liu)

- Hyper-V user space tools fixes (Thorsten Blum)

- Allow VMBus to be unloaded after frame buffer is flushed (Michael
   Kelley)

* tag 'hyperv-fixes-signed-20260607' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  mshv: support 1G hugepages by passing them as 2M-aligned chunks
  Drivers: hv: vmbus: Improve the logic of reserving fb_mmio on Gen2 VMs
  mshv: use kmalloc_array in mshv_root_scheduler_init
  mshv: Add conditional VMBus dependency
  hyperv: Clean up and fix the guest ID comment in hvgdk.h
  drm/hyperv: During panic do VMBus unload after frame buffer is flushed
  Drivers: hv: vmbus: Provide option to skip VMBus unload on panic
  mshv: unmap debugfs stats pages on kexec
  mshv: clean up SynIC state on kexec for L1VH
  mshv: limit SynIC management to MSHV-owned resources
  hv: utils: replace deprecated strcpy with strscpy in kvp_register
  hv: utils: handle and propagate errors in kvp_register
  mshv: add a missing padding field

thermal: sysfs: Replace sscanf() with kstrtoul()

Replace sscanf() with kstrtoul() in cur_state_store(), as kstrto<type>
is preferred over single-variable sscanf().
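
One practical difference is strictness: sscanf() with "%lu" accepts
"42abc" and stores 42, while a kstrtoul()-style parse rejects it. The
user-space sketch below approximates that strictness with strtoul();
parse_ulong() is a hypothetical helper, not the kernel function, and it
is limited to base 10.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Strict decimal parse: the whole string must be a number, optionally
 * preceded by '+' and followed by a single newline.
 */
static int parse_ulong(const char *s, unsigned long *out)
{
	char *end;
	unsigned long v;

	/* strtoul() would skip whitespace and accept '-'; refuse both. */
	if (!isdigit((unsigned char)*s) && *s != '+')
		return -EINVAL;

	errno = 0;
	v = strtoul(s, &end, 10);
	if (errno == ERANGE)
		return -ERANGE;
	if (end == s)
		return -EINVAL;     /* no digits were consumed */
	if (*end == '\n')
		end++;              /* tolerate one trailing newline */
	if (*end != '\0')
		return -EINVAL;     /* trailing garbage */

	*out = v;
	return 0;
}
```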

Signed-off-by: Ovidiu Panait <ovidiu.panait.oss@gmail.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20260606210420.2311145-3-ovidiu.panait.oss@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>