git.ipfire.org Git - thirdparty/linux.git/log

pinctrl: qcom: Fix wakeirq map by removing disconnected irqs for sm8150

PDC interrupts 122-125 were meant for ibi_i3c wakeup but sm8150 do not
support i3c. GPIOs 39,51,88 and 144 are also connected to different PDC
pin and already reflected in the wake irq map.

Remove the unsupported wakeup interrupts from the map.

Fixes: 90337380c809 ("pinctrl: qcom: sm8150: Specify PDC map")
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Maulik Shah <maulik.shah@oss.qualcomm.com>
Signed-off-by: Navya Malempati <navya.malempati@oss.qualcomm.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>

pinctrl: sunxi: fix regulator leak in sunxi_pmx_request() error path

In the error path of sunxi_pmx_request(), the code calls
regulator_put(s_reg->regulator) to release the regulator. However,
s_reg->regulator is only assigned after a successful regulator_enable().
This causes a memory leak: the regulator obtained via regulator_get()
is never properly released when regulator_enable() fails.

Fixes: dc1445584177 ("pinctrl: sunxi: Fix and simplify pin bank regulator handling")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>

drm/tve200: Fix probe cleanup after register failure

tve200_modeset_init() creates a panel bridge and initializes the DRM
mode config before tve200_probe() registers the DRM device. If
drm_dev_register() fails, probe returns an error and the driver's remove
callback is not called, so those modeset resources are left behind.

Unwind the panel bridge and mode config on that failure path before
disabling the clock and dropping the DRM device reference.

Signed-off-by: Myeonghun Pak <mhun512@gmail.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260424124118.38649-1-mhun512@gmail.com

auxdisplay: max6959: use regmap_assign_bits() in max6959_enable()

Replace the ternary with a direct call to the regmap_assign_bits()
helper and save a couple lines of code.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

dts: spacemit: set console baud rate on bpif3

Because the default console's baud rate is not set, defconfig kernels do
not have any serial output on this platform. Set the baud rate to
115200, matching what is used by U-Boot etc on this platform.

Suggested-by: Vivian Wang <wangruikang@iscas.ac.cn>
Fixes: d60d57ab6b2a8 ("riscv: dts: spacemit: add Banana Pi BPI-F3 board device tree")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Yixun Lan <dlan@kernel.org>
Link: https://lore.kernel.org/r/20260430-reword-overstep-3be08b7eab25@spud
Signed-off-by: Yixun Lan <dlan@kernel.org>

lib/vsprintf: Limit the returning size to INT_MAX

The return value of vsnprintf() and bstr_printf() can overflow INT_MAX
and return a minus value. In the @size is checked input overflow, but
it does not check the output, which is expected required size.

This should never happen but it should be checked and limited.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/177452713020.197965.3164174544083829000.stgit@devnote2
Signed-off-by: Petr Mladek <pmladek@suse.com>

lib/vsprintf: Fix to check field_width and precision

Check the field_width and presition correctly. Previously it depends
on the bitfield conversion from int to check out-of-range error.
However, commit 938df695e98d ("vsprintf: associate the format state
with the format pointer") changed those fields to int.
We need to check the out-of-range correctly without bitfield
conversion.

Fixes: 938df695e98d ("vsprintf: associate the format state with the format pointer")
Reported-by: David Laight <david.laight.linux@gmail.com>
Closes: https://lore.kernel.org/all/20260318151250.40fef0ab@pumpkin/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/177452712047.197965.16376597502504928495.stgit@devnote2
Signed-off-by: Petr Mladek <pmladek@suse.com>

x86/xen: Fix a potential problem in xen_e820_resolve_conflicts()

When fixing a conflict in xen_e820_resolve_conflicts(), the loop over
the E820 map entries needs to be restarted, as the E820 map will have
been modified by the fix. Otherwise entries might be skipped by
accident.

Fixes: be35d91c8880 ("xen: tolerate ACPI NVS memory overlapping with Xen allocated memory")
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: xen-devel@lists.xenproject.org
Link: https://patch.msgid.link/20260505080653.197775-1-jgross@suse.com

dt-bindings: crypto: qcom-qce: Add Qualcomm Eliza QCE

Document the QCE crypto engine on Qualcomm Eliza SoC, fully compatible
with earlier generations.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reviewed-by: Harshal Dev <harshal.dev@oss.qualcomm.com>
Reviewed-by: Kuldeep Singh <kuldeep.singh@oss.qualcomm.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: qat - fix heartbeat error injection

The current implementation of the heartbeat error injection uses
adf_disable_arb_thd() to stop a specific accelerator engine thread
from processing requests. This does not reliably prevent the device
from generating responses.

Fix the error injection by disabling the device arbiter through
exit_arb() instead. This properly simulates a device failure by
stopping all arbitration, which results in missing responses for
sent requests.

Remove the now unused adf_disable_arb_thd() function and its
declaration.

Fixes: e2b67859ab6e ("crypto: qat - add heartbeat error simulator")
Signed-off-by: Damian Muszynski <damian.muszynski@intel.com>
Reviewed-by: Ahsan Atta <ahsan.atta@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

dt-bindings: crypto: qcom-qce: Document the Milos crypto engine

Document the crypto engine on the Milos platform.

Signed-off-by: Alexander Koskovich <akoskovich@pm.me>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reviewed-by: Kuldeep Singh <kuldeep.singh@oss.qualcomm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

keys: cleanup dead code in Kconfig for FIPS_SIGNATURE_SELFTEST

There is already an 'if ASYMMETRIC_KEY_TYPE' condition wrapping
FIPS_SIGNATURE_SELFTEST, making the 'depends on' statement a
duplicate dependency (dead code).

I propose leaving the outer 'if ASYMMETRIC_KEY_TYPE...endif' and removing
the individual 'depends on' statement.

This dead code was found by kconfirm, a static analysis tool for Kconfig.

Signed-off-by: Julian Braha <julianbraha@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

s390/cio: Purge based on the cdev's online status

Ensure that all devices currently offline are purged correctly.

Previously, purging logic relied on the internal FSM state to
determine whether a device was offline. However, devices with a
target state of offline could be skipped if CIO internal
processing was still ongoing during the purge operation.

Update the purge decision logic to rely on the online variable
in the cdev structure instead of the internal FSM state,
providing a more reliable indication of actual device
availability.

Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>

s390: Remove extra check of task_stack_page()

There is no need to call task_stack_page(),
because try_get_task_stack() already takes care of that.

Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>

rhashtable: Add bucket_table_free_atomic() helper

rhashtable_insert_rehash() allocates a new bucket table
with GFP_ATOMIC, as it is called from an RCU read-side
critical section.

If rhashtable_rehash_attach() then fails, the new table
is freed via kvfree(). This is unsafe, since kvfree() may
fall back to vfree() for vmalloc-backed allocations, which
can sleep and trigger:

BUG: sleeping function called from invalid context

Add bucket_table_free_atomic(), which uses kvfree_atomic()
so the table can be freed safely from non-sleeping context.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

mm/slab: Add kvfree_atomic() helper

kvmalloc() now supports non-sleeping GFP flags, including
the vmalloc fallback path. This means it may return vmalloc
memory even for GFP_ATOMIC and GFP_NOWAIT allocations.

Freeing such memory with kvfree() may then end up calling
vfree(), which is not safe for non-sleeping contexts.

Introduce kvfree_atomic() helper for such cases. It mirrors
kvfree(), but uses vfree_atomic() for vmalloced memory.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Acked-by: Harry Yoo (Oracle) <harry@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

rhashtable: drop ht->mutex in rhashtable_free_and_destroy()

rhashtable_free_and_destroy() is a single-shot teardown routine:
cancel_work_sync() has already quiesced the deferred rehash worker, and
the function's documented contract requires the caller to guarantee no
other concurrent access to the rhashtable. Under those conditions
ht->mutex is not protecting anything -- taking it is a leftover from
the original teardown path.

That leftover is actively harmful: it closes a circular lock-class
dependency with fs_reclaim. The deferred rehash worker takes ht->mutex
and then allocates GFP_KERNEL memory in bucket_table_alloc(),
establishing

    &ht->mutex  ->  fs_reclaim

After commit b32c4a213698 ("xattr: add rhashtable-based simple_xattr
infrastructure") introduced simple_xattr_ht_free(), which calls
rhashtable_free_and_destroy(), the simple_xattrs teardown became
reachable from evict() under the dcache shrinker. The subsequent
per-subsystem adaptations made the reverse edge concrete in three
independent code paths:

  * commit 52b364fed6e1 ("shmem: adapt to rhashtable-based simple_xattrs with lazy allocation")
  * commit 5bd97f5c5f24 ("kernfs: adapt to rhashtable-based simple_xattrs with lazy allocation")
  * commit 50704c391fbf ("pidfs: adapt to rhashtable-based simple_xattrs")

Any of the three closes the cycle

    fs_reclaim  ->  &ht->mutex

which lockdep reports as follows. This particular splat was observed
organically on a workstation kernel built from vfs-7.1-rc1.xattr at
~35h uptime under normal mixed workload, with CONFIG_PROVE_LOCKING=y.
The path happens to go through kernfs:

  WARNING: possible circular locking dependency detected
  7.0.0-faeab166167f-with-fixes-v1+ #191 Tainted: G     U
  kswapd0/243 is trying to acquire lock:
  ffff8882e475c0f8 (&ht->mutex){+.+.}-{4:4},
    at: rhashtable_free_and_destroy+0x36/0x740
  but task is already holding lock:
  ffffffffa8ad1d00 (fs_reclaim){+.+.}-{0:0},
    at: balance_pgdat+0x995/0x1600

  the existing dependency chain (in reverse order) is:

  -> #1 (fs_reclaim){+.+.}-{0:0}:
         __lock_acquire+0x506/0xbf0
         lock_acquire.part.0+0xc7/0x280
         fs_reclaim_acquire+0xd9/0x130
         __kvmalloc_node_noprof+0xcd/0xb40
         bucket_table_alloc.isra.0+0x5a/0x440
         rhashtable_rehash_alloc+0x4e/0xd0
         rht_deferred_worker+0x14b/0x440
         process_one_work+0x8fd/0x16a0
         worker_thread+0x601/0xff0
         kthread+0x36b/0x470
         ret_from_fork+0x5bf/0x910
         ret_from_fork_asm+0x1a/0x30

  -> #0 (&ht->mutex){+.+.}-{4:4}:
         check_prev_add+0xdb/0xce0
         validate_chain+0x554/0x780
         __lock_acquire+0x506/0xbf0
         lock_acquire.part.0+0xc7/0x280
         __mutex_lock+0x1b2/0x2550
         rhashtable_free_and_destroy+0x36/0x740
         kernfs_put.part.0+0x119/0x570
         evict+0x3b6/0x9c0
         __dentry_kill+0x181/0x540
         shrink_dentry_list+0x135/0x440
         prune_dcache_sb+0xdb/0x150
         super_cache_scan+0x2ff/0x520
         do_shrink_slab+0x35a/0xee0
         shrink_slab_memcg+0x457/0x950
         shrink_slab+0x43b/0x550
         shrink_one+0x31a/0x6f0
         shrink_many+0x31e/0xc80
         shrink_node+0xeb3/0x14a0
         balance_pgdat+0x8ed/0x1600
         kswapd+0x2f3/0x530
         kthread+0x36b/0x470
         ret_from_fork+0x5bf/0x910
         ret_from_fork_asm+0x1a/0x30

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(fs_reclaim);
                                 lock(&ht->mutex);
                                 lock(fs_reclaim);
    lock(&ht->mutex);

Note that lockdep tracks lock classes, not instances: the two
&ht->mutex sites are on different rhashtable objects (the deferred
worker was triggered by some unrelated rhashtable growth), but because
rhashtable_init() uses a single static lockdep key for all rhashtables,
this is a real class-level cycle. Once reported, lockdep disables
itself for the remainder of the boot, masking any subsequent locking
bugs.

Drop the mutex. After cancel_work_sync() the rehash worker is quiesced
and, per this function's contract, no other concurrent access is
possible; the tables are therefore owned exclusively by this function
and can be walked without any lock held.

Switch the table walks from rht_dereference() (which requires
ht->mutex to be held under CONFIG_PROVE_RCU) to rcu_dereference_raw(),
which has no lockdep annotation. rht_ptr_exclusive() already uses
rcu_dereference_protected(p, 1) and needs no change.

This is the only place in lib/rhashtable.c where &ht->mutex is
acquired from a path reachable under fs_reclaim; the deferred worker
is the only other site and it is the forward edge. Removing the
acquisition here therefore eliminates the class cycle for all three
subsystems that use simple_xattrs, not just the one in the splat
above. No locking-semantics change is introduced for correct users;
incorrect users would already be racing with rehash worker completion
regardless of the mutex.

Synthetic reproduction of the splat within a few-minute window was
unsuccessful across several attempts (tmpfs and kernfs zombies via
cgroupfs with open-fd-through-rmdir, with and without swap, up to
~60k reclaim-path executions of simple_xattr_ht_free() in a single
run), consistent with the rare coincidence-of-edges profile of the
bug: the forward edge is already registered in /proc/lockdep on any
idle system via rht_deferred_worker, but the reverse edge requires
evict() to complete kernfs_put()'s final release inside the fs_reclaim
critical section, which in my attempts was ordered against rather than
interleaved with the worker.

Fixes: b32c4a213698 ("xattr: add rhashtable-based simple_xattr infrastructure")
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

block: only read from sqe on initial invocation of blkdev_uring_cmd()

This passthrough helper currently only supports discards. Part of that
command is the start and length, which is read from the SQE. It does
so on every invocation, where it really should just make it stable
on the first invocation. This avoids needing to copy the SQE upfront,
as we only really need those two 8b values stored in our per-req
payload.

Cc: stable@vger.kernel.org # 6.17+
Signed-off-by: Jens Axboe <axboe@kernel.dk>

x86/efi: Restore IRQ state in EFI page fault handler

The kernel's softirq API does not permit re-enabling softirqs while IRQs
are disabled. The reason for this is that local_bh_enable() will not
only re-enable delivery of softirqs over the back of IRQs, it will also
handle any pending softirqs immediately, regardless of whether IRQs are
enabled at that point.

For this reason, commit

d02198550423 ("x86/fpu: Improve crypto performance by making kernel-mode FPU reliably usable in softirqs")

disables softirqs only when IRQs are enabled, as it is not permitted
otherwise, but also unnecessary, given that asynchronous softirq
delivery never happens to begin with while IRQs are disabled.

However, this does mean that entering a kernel mode FPU section with
IRQs enabled and leaving it with IRQs disabled leads to problems, as
identified by Sashiko [0]: the EFI page fault handler is called from
page_fault_oops() with IRQs disabled, and thus ends the kernel mode FPU
section with IRQs disabled as well, regardless of whether IRQs were
enabled when it was started. This may result in schedule() being called
with a non-zero preempt_count, causing a BUG().

So take care to re-enable IRQs when handling any EFI page faults if they
were taken with IRQs enabled.

[0] https://sashiko.dev/#/patchset/20260430074107.27051-1-ivan.hu%40canonical.com

Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Ivan Hu <ivan.hu@canonical.com>
Cc: x86@kernel.org
Cc: <stable@vger.kernel.org>
Fixes: d02198550423 ("x86/fpu: Improve crypto performance by making kernel-mode FPU reliably usable in softirqs")
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

media: v4l2-subdev: Fail {enable,disable}_streams and s_streaming nicely

If a sub-device does not set enable_streams() and disable_streams() pad
ops while it sets the s_stream() video op to
v4l2_subdev_s_stream_helper(), enabling or disabling streaming either way
on the sub-device will result calling v4l2_subdev_s_stream_helper() and
v4l2_subdev_{enable,disable}_streams() recursively, exhausting the stack.
Return -ENOIOCTLCMD in this case to handle the situation gracefully.

Fixes: b62949ddaa52 ("media: subdev: Support single-stream case in v4l2_subdev_enable/disable_streams()")
Cc: stable@vger.kernel.org
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>

drm/i915/display: enable ccs modifiers on dg2

Since Xe driver aux ccs enablement dg2 ccs modifiers have been
disabled on i915 driver. Here allow dg2 to use ccs again for framebuffers.

Fixes: 6a99e91a6ca8 ("drm/i915/display: Detect AuxCCS support via display parent interface")
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patch.msgid.link/20260427165715.864721-1-juhapekka.heikkila@gmail.com
(cherry picked from commit aee13ba1448213975f36942ba5d1ce693eb5c002)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>

cpufreq: qcom-cpufreq-hw: Fix possible double free

qcom_cpufreq.data is allocated with devm_kzalloc() in probe() as an
array of per-domain data. qcom_cpufreq_hw_cpu_init() stores a pointer to
one element of this array in policy->driver_data.

qcom_cpufreq_hw_cpu_exit() currently calls kfree() on policy->driver_data.
This is not valid because the memory is devm-managed. For the first
domain, this can free the devm-managed allocation while the devres entry
is still active, leading to a possible double free when the platform
device is later detached. For other domains, the pointer may refer to an
element inside the array rather than the allocation base.

Remove the kfree(data) call and let devres release qcom_cpufreq.data.

This issue was found by a static analysis tool I am developing.

Fixes: 054a3ef683a1 ("cpufreq: qcom-hw: Allocate qcom_cpufreq_data during probe")
Cc: stable@vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

xfrm: esp: avoid in-place decrypt on shared skb frags

MSG_SPLICE_PAGES can attach pages from a pipe directly to an skb. TCP
marks such skbs with SKBFL_SHARED_FRAG after skb_splice_from_iter(),
so later paths that may modify packet data can first make a private
copy. The IPv4/IPv6 datagram append paths did not set this flag when
splicing pages into UDP skbs.

That leaves an ESP-in-UDP packet made from shared pipe pages looking
like an ordinary uncloned nonlinear skb. ESP input then takes the no-COW
fast path for uncloned skbs without a frag_list and decrypts in place
over data that is not owned privately by the skb.

Mark IPv4/IPv6 datagram splice frags with SKBFL_SHARED_FRAG, matching
TCP. Also make ESP input fall back to skb_cow_data() when the flag is
present, so ESP does not decrypt externally backed frags in place.
Private nonlinear skb frags still use the existing fast path.

This intentionally does not change ESP output. In esp_output_head(),
the path that appends the ESP trailer to existing skb tailroom without
calling skb_cow_data() is not reachable for nonlinear skbs:
skb_tailroom() returns zero when skb->data_len is nonzero, while ESP
tailen is positive. Thus ESP output will either use the separate
destination-frag path or fall back to skb_cow_data().

Fixes: cac2661c53f3 ("esp4: Avoid skb_cow_data whenever possible")
Fixes: 03e2a30f6a27 ("esp6: Avoid skb_cow_data whenever possible")
Fixes: 7da0dde68486 ("ip, udp: Support MSG_SPLICE_PAGES")
Fixes: 6d8192bd69bb ("ip6, udp6: Support MSG_SPLICE_PAGES")
Reported-by: Hyunwoo Kim <imv4bel@gmail.com>
Reported-by: Kuan-Ting Chen <h3xrabbit@gmail.com>
Tested-by: Hyunwoo Kim <imv4bel@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Kuan-Ting Chen <h3xrabbit@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

objtool/klp: Cache dont_correlate() result

Cache the dont_correlate() result once per symbol at the start of
correlate_symbols(). This reduces klp diff time on an arm64 LTO
vmlinux.o from 2m51s to 35s.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Improve and simplify prefix symbol detection

Only create prefix symbols for functions that have
__patchable_function_entries entries, since those are the only C
functions where prefix NOPs are intentional.

This both simplifies the detection and makes it more accurate.

Note that assembly functions using SYM_TYPED_FUNC_START() can also have
prefixed NOPs, but that macro already creates their __cfi_ symbols.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Fix kCFI prefix finding/cloning

With CFI+CALL_PADDING, Clang places .Ltmp labels at the start of the NOP
padding (offset 5) between the __cfi_ prefix and the function entry
point.  get_func_prefix() only checks the immediately previous symbol,
so the intervening .Ltmp label causes it to miss the __cfi_ prefix
symbol.

This results in klp-diff not cloning the kCFI type hash into the
livepatch module, causing a CFI failure at module load when calling
callback functions through indirect calls:

  CFI failure at __klp_enable_patch+0xab/0x140
    (target: pre_patch_callback+0x0/0x80 [livepatch_combined];
     expected type: 0xde073954)

Instead of walking backward through the section's symbol list, just use
find_func_containing() for the byte before the function.  This works now
that __cfi_ symbols are being grown by objtool to fill the padding.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Grow __cfi_* prefix symbols for all CFI+CALL_PADDING

For all CONFIG_CFI+CONFIG_CALL_PADDING configs, for C functions, the
__cfi_ symbols only cover the 5-byte kCFI type hash.  After that there
also N bytes of NOP padding between the hash and the function entry
which aren't associated with any symbol.

The NOPs can be replaced with actual code at runtime.  Without a symbol,
unwinders and tooling have no way of knowing where those bytes belong.

Grow the existing __cfi_* symbols to fill that gap.

Note that assembly functions with SYM_TYPED_FUNC_START() aren't affected
by this issue, their __cfi_ symbols also cover the padding.

Also, CONFIG_PREFIX_SYMBOLS has no reason to exist: CONFIG_CALL_PADDING
is what causes the compiler to emit NOP padding before function entry
(via -fpatchable-function-entry), so it's the right condition for
creating prefix symbols.

Remove CONFIG_PREFIX_SYMBOLS, as it's no longer needed.  Simplify the
LONGEST_SYM_KUNIT_TEST dependency accordingly.  Rework objtool's
arguments a bit to handle the variety of prefix/cfi-related cases.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Fix position-dependent checksums for non-relocated jumps/calls

When computing klp checksums, instructions with non-relocated jump/call
destination offsets are problematic because the offset values can change
when surrounding code has moved, causing the function to be incorrectly
marked as changed.

Specifically, that includes jumps from alternatives to the end of the
alternative, which from objtool's perspective are jumps to the end of
the alternative instruction block in the original function.

Note that 'jump_dest' jumps don't include sibling calls (those use
call_dest), nor do they include jumps to/from .cold sub functions (those
are cross-section and need a reloc).

Fix it by hashing the opcode bytes (excluding the immediate operand)
along with a position-independent representation of the destination.
For calls, use the function name, and for jumps, use the destination's
offset within its function.

[Note the "9 bit hole" comment was wrong: it has been 8 bits since
commit 70589843b36f ("objtool: Add option to trace function validation")
added the 'trace' field. Adding the 4-bit 'immediate_len' field now
leaves a 4-bit hole.]

Fixes: 0d83da43b1e1 ("objtool/klp: Add --checksum option to generate per-function checksums")
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Add insn_sym() helper

Alternative replacement instructions awkwardly have insn->sym set to the
function they get patched to rather than the symbol (or rather lack
thereof) they belong to in the file.

This makes it difficult to know where a given instruction actually
lives.

Add a new insn_sym() helper which preserves the existing semantic of
insn->sym. Rename insn->sym to insn->_sym, which contains the actual
ELF binary symbol (or NULL, for alternative replacements) an instruction
lives in.

The private insn->_sym value will be needed for a subsequent patch.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Add correlation debugging output

Add debugging messages to show how duplicate symbols get correlated, and
split the --debug feature into --debug-correlate and --debug-clone.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Rewrite symbol correlation algorithm

Rewrite the symbol correlation code, using a tiered list of
deterministic strategies in a loop.  For duplicately named symbols, each
tier applies a filter with the goal of finding a 1:1 deterministic
correlation between the original and patched version of the symbol.

The three matching strategies are:

  find_twin(): A funnel of progressively tighter filters.  Candidates
  with the same demangled name are counted at four levels: name, scope
  (local-vs-global), file (strict file association), and checksum
  (unchanged functions).  The widest level that yields a 1:1 match wins,
  narrower levels are only tried when the wider level is ambiguous.

  find_twin_suffixed(): Uses already-correlated LLVM symbol pairs to map
  .llvm.<hash> suffixes from orig to patched.  Because all promoted
  symbols from the same TU share the same hash, one correlated pair
  seeds the mapping for the entire TU.

  find_twin_positional(): Last resort, matches symbols by position among
  same-named candidates, similar to livepatch sympos.  Used for data
  objects like __quirk variables where no deterministic filter can
  distinguish the candidates.

Overall this works much better than the existing algorithm, particularly
with LTO kernels.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Calculate object checksums

Start checksumming data objects in preparation for revamping the
correlation algorithm.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

klp-build: Validate short-circuit prerequisites

The --short-circuit option implicitly requires that certain directories
are already in klp-tmp. Enforce that to prevent confusing errors.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Remove "objtool --checksum"

The checksum functionality has been moved to "objtool klp checksum"
which is now used by klp-build. Remove the now-dead --checksum and
--debug-checksum options from the default objtool command.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

klp-build: Use "objtool klp checksum" subcommand

Use the new "objtool klp checksum" subcommand instead of injecting
--checksum into every objtool invocation via OBJTOOL_ARGS during the
kernel build.

This decouples checksum generation from the build, running it in
separate post-build passes, making the code (and the patch generation
pipeline itself) more modular.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Add "objtool klp checksum" subcommand

Move the checksum functionality out of the main objtool command into a
new "objtool klp checksum" subcommand.

This has the benefit of making the code (and the patch generation
process itself) more modular.

For bisectability, both "objtool --checksum" and "objtool klp checksum"
work for now. The former will be removed after klp-build has been
converted to use the new subcommand.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Consolidate file decoding into decode_file()

decode_sections() relies on CFI and cfi_hash initialization done
separately in check(), making it unusable outside of check().

Consolidate the initialization into decode_sections() and rename it to
decode_file(), and make it global along with free_insns() and
insn_reloc() for use by other objtool components -- namely, the checksum
code which will be moving to another file.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Extricate checksum calculation from validate_branch()

In preparation for porting the checksum code to other arches, make its
functionality independent from the CFG reverse engineering code.

Move it into a standalone calculate_checksums() function which iterates
all functions and instructions directly, rather than being called inline
from do_validate_branch().

Since checksum_update_insn() is no longer called during CFG traversal,
it needs to manually iterate the alternatives.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Add is_cold_func() helper

Add an is_cold_func() helper. No functional changes intended.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Add is_alias_sym() helper

Improve readability with a new is_alias_sym() helper.

No functional changes intended.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Handle Clang .data..Lanon anonymous data sections

Clang generates anonymous data sections named .data..Lanon.<hash>.
These need section-symbol references in the same way as .data..Lubsan
(GCC) and .data..L__unnamed_ (Clang UBSAN) sections. Without this,
convert_reloc_sym() fails when processing relocations that reference
these sections.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Create empty checksum sections for function-less object files

If an object file has no functions, objtool has nothing to checksum, so
it doesn't create the .discard.sym_checksum symbol.

Then when 'objtool klp diff' reads symbol checksums, it errors out due
to the missing .discard.sym_checksum section.

Instead, just create an empty checksum section to signal to
read_sym_checksums() that the file has been processed.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Include libsubcmd headers directly from source tree

Instead of installing libsubcmd headers to a build output directory and
including from there, include directly from tools/lib/ where they
already exist. This fixes clangd indexing which otherwise can't find
libsubcmd headers.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Don't set sym->file for section symbols

Section symbols aren't grouped after their corresponding FILE symbols.
Their sym->file should really be NULL rather than whatever random FILE
happened to be last.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

klp-build: Remove redundant SRC and OBJ variables

SRC and OBJ are both set to $(pwd) and are always identical. The script
already enforces that klp-build runs from the kernel root directory, and
builds are done in-place, making these variables unnecessary.

Suggested-by: Song Liu <song@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

klp-build: Print "objtool klp diff" command in verbose mode

Print the full objtool command line when '--verbose' is given to help
with debugging.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

klp-build: Reject patches to realmode

Realmode code is compiled as a separate 16-bit binary and embedded into
the kernel image via rmpiggy.S. It can't be livepatched.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

klp-build: Reject patches to vDSO

vDSO code runs in userspace and can't be livepatched. Such patches also
cause spurious "new function" errors due to generated files like
vdso*-image.c having unstable line numbers across builds.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

klp-build: Fix patch cleanup on interrupt

If a build error occurs and the user hits Ctrl-C while a large patch is
being reverted during cleanup, the cleanup EXIT trap gets re-triggered
and tries to re-revert the already partially-reverted patch.  That
causes 'patch -R' to repeatedly prompt

  "Unreversed patch detected!  Ignore -R? [n]"

for each already-reverted hunk, with no way to break out.

Fix it by adding '--force' to the patch revert command in
revert_patch(), which causes it to silently ignore already-reverted
hunks.  And ignore errors, as the cleanup is always best-effort.

For similar reasons, add to APPLIED_PATCHES before (rather than after)
applying the patch in apply_patch() so an interrupted apply will also
get cleaned up.

Fixes: d36a7343f4ba ("livepatch/klp-build: switch to GNU patch and recountdiff")
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

klp-build: Suppress excessive fuzz output by default

When a patch applies with fuzz, the detailed output from the patch tool
can be very noisy, especially for big patches.

Suppress the fuzz details by default, while keeping the "applied with
fuzz" warning. The noise can be restored with '--verbose'.

Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

klp-build: Validate patch file existence

Make sure all patch files actually exist. Otherwise there can be
confusing errors later.

Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

klp-build: Don't use errexit

The errtrace option (combined with the ERR trap) already serves the same
function (and more) as errexit, so errexit is redundant. And it has
more pitfalls. Remove it.

Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

klp-build: Fix checksum comparison for changed offsets

The klp-build -f/--show-first-changed feature uses diff to compare
checksum log lines between original and patched objects.  However, diff
compares entire lines, including the offset field.  When a function is
at a different section offset, the offset field differs even though the
instruction checksum is identical, causing the wrong instruction to be
printed.

Only compare the checksum field when looking for the first changed
instruction.  Also print both the original and patched offsets when they
differ.

Fixes: 78be9facfb5e ("livepatch/klp-build: Add --show-first-changed option to show function divergence")
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

klp-build: Fix hang on out-of-date .config

If .config is out of date with the kernel source, 'make syncconfig'
hangs while waiting for user input on new config options. Detect the
mismatch and return an error.

Fixes: 6f93f7b06810 ("livepatch/klp-build: Fix inconsistent kernel version")
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Fix reloc hash collision in find_reloc_by_dest_range()

In find_reloc_by_dest_range(), hash collisions can cause a high-offset
relocation to appear when probing a low-offset hash bucket.

Only return early when the best match found so far genuinely belongs to
the current bucket (its offset is within the bucket's stride range).
Otherwise, continue scanning later buckets which may contain
lower-offset matches.

This ensures the first reloc in the range gets returned.

Fixes: 74b873e49d92 ("objtool: Optimize find_rela_by_dest_range()")
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Fix reloc corruption in convert_reloc_sym_to_secsym()

Use the section symbol's index instead of the old symbol's index when
updating the ELF relocation entry in convert_reloc_sym_to_secsym().

Found by Sashiko review.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Don't correlate .rodata.cst* constant pool objects

Clang aggregates UBSAN type descriptors into shared anonymous
.data..L__unnamed_* sections.  This data is used by UBSAN trap handlers.

When a changed function has an UBSAN bounds check, klp-diff clones the
entire UBSAN data section associated with the TU.  Relocations within
the cloned section that reference named rodata objects in .rodata.cst*
(like 'exponent', 'pirq_ali_set.irqmap') become KLP relocations because
those objects now get correlated.

That results in a .klp.rela.vmlinux..data section which can easily have
thousands of KLP relocs, most of which are completely superfluous, used
by functions which aren't cloned to the patch module.

The .rodata.cst* sections are SHF_MERGE constant pool sections
containing small fixed-size data (lookup tables, bitmasks) that is only
read by value.  Pointer identity is never relevant for these objects, so
correlating them is unnecessary.

Exclude .rodata.cst* objects from correlation so they get cloned as
local data instead of generating KLP relocations.

It might be possible to someday treat UBSAN data sections as special
sections, and only extract the few needed entries.  But this works for
now.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Fix pointer comparisons for rodata objects

klp-diff treats all rodata as uncorrelated, so any reference to it uses
a duplicated copy rather than using a KLP reloc.

For the contents of the data itself, a duplicated copy is fine.
However, pointer comparisons (e.g., f->f_op == &foo_ops) are broken.

Fix it by correlating non-anonymous rodata objects.

Also, use a new find_symbol_containing_inclusive() helper for matching
the end of a symbol so bounds calculations don't get broken, for the
case where an array or other symbol's ending address is used as part of
a bounds calculation.

While these are really two distinct changes, they need to be done in the
same patch so as to avoid introducing bisection regressions.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Simplify reloc symbol conversion

Inline section_reference_needed() and is_reloc_allowed() into
convert_reloc_sym() and remove the redundant is_reloc_allowed() check in
clone_reloc().

Move the is_sec_sym() checks into the convert callees so they become
no-ops when the reloc is already in the right format. This allows
convert_reloc_sym() to unconditionally dispatch to the right converter
based on section type.

Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Move mark_rodata() to elf.c

Move the sec->rodata marking from check.c to elf.c so it's set during
ELF reading rather than during the check pipeline. This makes the
rodata flag available to all objtool users, including klp-diff which
reads ELF files directly without running check().

Add an is_rodata_sec() helper to elf.h for consistency with
is_text_sec() and is_string_sec().

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Fix relocation conversion failures for R_X86_64_NONE

Objtool has some hacks which NOP out certain calls/jumps and replace
their relocations with R_X86_64_NONE. The klp-diff relocation
extraction code will error out when trying to copy these relocations due
to their negative addend, which would only makes sense for a PC-relative
branch instruction. Just ignore them.

Fixes: dd590d4d57eb ("objtool/klp: Introduce klp diff subcommand for diffing object files")
Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Fix kCFI trap handling

.kcfi_traps contains references to kCFI trap instruction locations.
When a KCFI type check fails at an indirect call, the trap handler looks
up the faulting address in this section.

Add it to the special sections list so the entries get extracted for the
changed functions they reference.

Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Fix extraction of text annotations for alternatives

Objtool is failing to extract text annotations which reference
.altinstr_replacement instructions:

  1) Alternative replacement fake symbols are NOTYPE rather than FUNC,
     and they don't have sym->included set, thus they aren't recognized
     by should_keep_special_sym().

  2) .discard.annotate_insn gets processed before .altinstr_replacement,
     so the referenced (fake) symbols don't have clones yet.

Fix the first issue by checking for a valid clone instead of
sym->included and by accepting NOTYPE symbols when processing
.discard.annotate_insn.

Fix the second issue by deferring text annotation processing until after
the other special sections have been cloned.

Fixes: dd590d4d57eb ("objtool/klp: Introduce klp diff subcommand for diffing object files")
Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Fix XXH3 state memory leak

The XXH3 state allocated in checksum_init() is never freed. Free it in
checksum_finish().

Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Fix cloning of zero-length section symbols

Fix NULL dereference when cloning a symbol from an empty section.
sec->data is only populated for sections with non-zero size.

Fixes: dd590d4d57eb ("objtool/klp: Introduce klp diff subcommand for diffing object files")
Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Fix handling of zero-length .altinstr_replacement sections

When a section is empty (e.g. only zero-length alternative
replacements), there are no symbols to convert a section symbol
reference to. Skip the reloc instead of erroring out.

Fixes: dd590d4d57eb ("objtool/klp: Introduce klp diff subcommand for diffing object files")
Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Fix --debug-checksum for duplicate symbol names

find_symbol_by_name() only returns the first match, so
--debug-checksum=<func> silently ignores any subsequent duplicately
named functions after the first.

Fix that, along with a new for_each_sym_by_name() helper.

Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool: Replace iterator callback with for_each_sym_by_mangled_name()

Convert the callback-based iterate_sym_by_demangled_name() with a new
for_each_sym_by_demangled_name() macro. This eliminates the callback
struct/function and makes the code more compact and readable.

Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Fix create_fake_symbols() skipping entsize-based sections

create_fake_symbols() has two phases: creating symbols from
ANNOTATE_DATA_SPECIAL entries, and a fallback that uses sh_entsize for
special sections like .static_call_sites.

When .discard.annotate_data is absent, the function returns early,
skipping the entsize fallback and silently allowing unsupported
module-local static call keys through.

Fix it by jumping to the entsize phase instead of returning early.

Fixes: dd590d4d57eb ("objtool/klp: Introduce klp diff subcommand for diffing object files")
Assisted-by: Claude:claude-4-opus
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Improve local label check

Clang emits various .L-prefixed local symbols beyond .Ltmp*, such as
.L__const.* for local constant data. These are assembler-local labels
not present in kallsyms, so they can never be resolved at module load
time.

Broaden the check from .Ltmp* to all .L* symbols so they get cloned into
the patch module instead.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Don't report uncorrelated functions as new

Clang LTO uses __UNIQUE_ID() to generate some uniquely named wrapper
functions, like initstubs. If they're uncorrelated, prevent them from
being reported as new functions and included unnecessarily.

Note that dont_correlate() already includes prefix functions, so prefix
functions are still being ignored here.

Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Don't correlate __initstub__ symbols

With LTO, the initcall infrastructure generates __initstub__kmod_*
wrapper functions in .init.text. These are the LTO equivalent of
__initcall__kmod_* data pointers, which are already excluded from
correlation.

These are __init functions whose memory is freed after boot, so there's
no reason to include or reference them in a livepatch module.

Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Don't correlate absolute symbols

Some arch/x86/crypto/*.S files define local .set/.equ constants that get
duplicated in vmlinux.o. This causes klp-diff to fail with "Multiple
correlation candidates" errors since it can't uniquely match these
between orig and patched builds.

Skip ABS symbols in dont_correlate(). They're purely compile-time
assembly constants that are never referenced by relocations, so they
don't need correlation.

Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Don't correlate __ADDRESSABLE() symbols

Symbols created by __ADDRESSABLE() are only used to convince the
toolchain not to optimize out the referenced symbol.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Fix .data..once static local non-correlation

While there was once a section named .data.once, it has since been
renamed to .data..once with commit dbefa1f31a91 ("Rename .data.once to
.data..once to fix resetting WARN*_ONCE"). Fix it.

Fixes: dd590d4d57eb ("objtool/klp: Introduce klp diff subcommand for diffing object files")
Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

objtool/klp: Fix is_uncorrelated_static_local() for Clang

For naming function-local static locals, GCC uses <var>.<id>, e.g.
__already_done.15, while Clang uses <func>.<var> with optional .<id>,
e.g. create_worker.__already_done.111

The existing is_uncorrelated_static_local() check only matches the GCC
convention where the variable name is a prefix. Handle both cases by
checking for a prefix match (GCC) and by checking after the first dot
separator (Clang).

Fixes: dd590d4d57eb ("objtool/klp: Introduce klp diff subcommand for diffing object files")
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>

cpufreq: apple-soc: Use FIELD_MODIFY()

Use FIELD_MODIFY() to remove open-coded bit manipulation.
No functional change intended.

Signed-off-by: Hans Zhang <18255117159@163.com>
Reviewed-by: Joshua Peisach <jpeisach@ubuntu.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

cpufreq/amd-pstate: Use FIELD_MODIFY()

Use FIELD_MODIFY() to remove open-coded bit manipulation.
No functional change intended.

Signed-off-by: Hans Zhang <18255117159@163.com>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

drm/i915/hdcp: Drop mgr->base.lock acquisition in intel_conn_to_vcpi()

Now that intel_conn_to_vcpi() reads the MST topology state via
drm_atomic_get_new_mst_topology_state(), the topology state belongs
to this atomic commit and is already serialized through the atomic
state's private object machinery. There is no need to additionally
take mgr->base.lock here.
Taking it from the HDCP enable path in commit_tail with
state->base.acquire_ctx is also unsafe: by this point the
acquire_ctx is no longer in a state where new modeset locks may be
acquired through it, which produces a modeset-lock splat on MST +
HDCP. Drop the drm_modeset_lock() call.

Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Tested-by: Santhosh Reddy Guddati <santhosh.reddy.guddati@intel.com>
Reviewed-by: Santhosh Reddy Guddati <santhosh.reddy.guddati@intel.com>
Link: https://patch.msgid.link/20260504101117.3619984-3-suraj.kandpal@intel.com

drm/i915/hdcp: Use new MST topology state in intel_conn_to_vcpi()

intel_conn_to_vcpi() runs from the HDCP enable path in commit_tail
and looks up the VCPI via mgr->base.state. When an ALLOCATE_PAYLOAD
is being driven on the mgr just before HDCP enable, that payload
list is being mutated in place, so the lookup can miss the port and
trip drm_WARN_ON(!payload), causing HDCP to be programmed with
VCPI 0.
Use drm_atomic_get_new_mst_topology_state() to read the topology
state attached to this atomic commit (stable, decided in
atomic_check), and bail out cleanly when no topology state or
payload is present for this port instead of WARNing.

Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Tested-by: Santhosh Reddy Guddati <santhosh.reddy.guddati@intel.com>
Reviewed-by: Santhosh Reddy Guddati <santhosh.reddy.guddati@intel.com>
Link: https://patch.msgid.link/20260504101117.3619984-2-suraj.kandpal@intel.com

cpufreq: qcom: Unify user-visible "Qualcomm" name

Various names for Qualcomm as a company are used in user-visible config
options. Switch to unified "Qualcomm" so it will be easier for users to
identify the options when for example running menuconfig.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

net/sched: speedup tc_dump_qdisc() when tcm_handle is provided

"tc qdisc show ... handle xxx" filtering can be done by the kernel.

A followup patch can do the same for tcm_parent.

iproute2/tc needs a small companion patch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260503114515.2460477-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: airoha: Introduce airoha_fe_get()/airoha_qdma_get() register read helpers

Add airoha_fe_get() and airoha_qdma_get() as utility routines for reading
a masked field from a specified register.
This is a non-functional refactor, no logical changes are introduced to
the existing codebase.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260501-airoha_fe_get-airoha_qdma_get-v3-1-126c6f647ccb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: mt7530: fix .get_stats64 sleeping in atomic context

The .get_stats64 callback runs in atomic context, but on
MDIO-connected switches every register read acquires the MDIO bus
mutex, which can sleep:
[   12.645973] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:609
[   12.654442] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 759, name: grep
[   12.663377] preempt_count: 0, expected: 0
[   12.667410] RCU nest depth: 1, expected: 0
[   12.671511] INFO: lockdep is turned off.
[   12.675441] CPU: 0 UID: 0 PID: 759 Comm: grep Tainted: G S      W           7.0.0+ #0 PREEMPT
[   12.675453] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
[   12.675456] Hardware name: Bananapi BPI-R64 (DT)
[   12.675459] Call trace:
[   12.675462]  show_stack+0x14/0x1c (C)
[   12.675477]  dump_stack_lvl+0x68/0x8c
[   12.675487]  dump_stack+0x14/0x1c
[   12.675495]  __might_resched+0x14c/0x220
[   12.675504]  __might_sleep+0x44/0x80
[   12.675511]  __mutex_lock+0x50/0xb10
[   12.675523]  mutex_lock_nested+0x20/0x30
[   12.675532]  mt7530_get_stats64+0x40/0x2ac
[   12.675542]  dsa_user_get_stats64+0x2c/0x40
[   12.675553]  dev_get_stats+0x44/0x1e0
[   12.675564]  dev_seq_printf_stats+0x24/0xe0
[   12.675575]  dev_seq_show+0x14/0x3c
[   12.675583]  seq_read_iter+0x37c/0x480
[   12.675595]  seq_read+0xd0/0xec
[   12.675605]  proc_reg_read+0x94/0xe4
[   12.675615]  vfs_read+0x98/0x29c
[   12.675625]  ksys_read+0x54/0xdc
[   12.675633]  __arm64_sys_read+0x18/0x20
[   12.675642]  invoke_syscall.constprop.0+0x54/0xec
[   12.675653]  do_el0_svc+0x3c/0xb4
[   12.675662]  el0_svc+0x38/0x200
[   12.675670]  el0t_64_sync_handler+0x98/0xdc
[   12.675679]  el0t_64_sync+0x158/0x15c

For MDIO-connected switches, poll MIB counters asynchronously using a
delayed workqueue every second and let .get_stats64 return the cached
values under a spinlock. A mod_delayed_work() call on each read
triggers an immediate refresh so counters stay responsive when queried
more frequently.

MMIO-connected switches (MT7988, EN7581, AN7583) are not affected
because their regmap does not sleep, so they continue to read MIB
counters directly in .get_stats64.

Fixes: 88c810f35ed5 ("net: dsa: mt7530: implement .get_stats64")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Acked-by: Chester A. Unal <chester.a.unal@arinc9.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/6940b913da2c29156f0feff74b678d3c526ee84c.1777719253.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipmr: Add __rcu to netns_ipv4.mrt.

kernel test robot reported this Sparse warning:

  $ make C=1 net/ipv4/ipmr.o
  net/ipv4/ipmr.c:312:24: error: incompatible types in comparison expression (different address spaces):
  net/ipv4/ipmr.c:312:24:    struct mr_table [noderef] __rcu *
  net/ipv4/ipmr.c:312:24:    struct mr_table *

Let's add __rcu annotation to netns_ipv4.mrt.

Fixes: b3b6babf4751 ("ipmr: Free mr_table after RCU grace period.")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202605030032.glNApko7-lkp@intel.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502180755.359554-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

psp: strip variable-length PSP header in psp_dev_rcv()

psp_dev_rcv() unconditionally removes a fixed PSP_ENCAP_HLEN, even
when psph->hdrlen indicates that the PSP header carries optional
fields. A frame whose PSP header advertises a non-zero VC or any
extension would therefore be silently mis-decapsulated: option bytes
would spill into the inner packet head and downstream parsing would
fail on a corrupted skb.

Compute the full PSP header length from psph->hdrlen, pull the
optional bytes into the linear region, and strip the whole header
when decapsulating. Optional fields (VC, ...) are still ignored,
just discarded with the rest of the header instead of leaking.
crypt_offset and the VIRT flag are intentionally not validated here
- callers know their device's PSP implementation and can decide.

Both in-tree callers gate on hardware-validated PSP, so this is a
correctness fix rather than a reachable corruption path under
current configurations.

Fixes: 0eddb8023cee ("psp: provide decapsulation and receive helper for drivers")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Daniel Zahka <daniel.zahka@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: David Carlier <devnexen@gmail.com>
Link: https://patch.msgid.link/20260502141945.14484-1-devnexen@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: prevent possible UAF in rtnl_prop_list_size()

I was mistaken by synchronize_rcu() [1] call in netdev_name_node_alt_destroy(),
giving a false sense of RCU safety at delete times.

We have to use list_del_rcu() to not confuse potential readers
in rtnl_prop_list_size().

[1] This synchronize_rcu() call was later removed in commit 723de3ebef03
("net: free altname using an RCU callback").

Fixes: 9f30831390ed ("net: add rcu safety to rtnl_prop_list_size()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260502124102.499204-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: realtek: replace magic number with register bit macros

Replace magic number with register bit macros. The description of
the RTL8211B interrupt register is obtained from publicly available
datasheet (RTL8211B(L) Rev. 1.5 Datasheet)

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/20260502092857.156831-1-olek2@wp.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'mptcp-misc-fixes-for-v7-1-rc3'

Matthieu Baerts says:

====================
mptcp: misc fixes for v7.1-rc3

Here are various unrelated fixes:

- Patch 1: increment the right MIB counter. A fix for v5.7.

- Patch 2: set the right MPTCP reset reason. A fix for v5.9.

- Patch 3: fix rx timestamp corruption when on MPTCP passive fastopen. A
fix for v6.2.

- Patch 4: increase sockopt seq after having set TCP_MAXSEG to propagate
it to newer subflows later. A fix for 6.17.
====================

Link: https://patch.msgid.link/20260501-net-mptcp-misc-fixes-7-1-rc3-v1-0-b70118df778e@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: sockopt: increase seq in mptcp_setsockopt_all_sf

mptcp_setsockopt_all_sf() was missing a call to sockopt_seq_inc(). This
is required not to cause missing synchronization for newer subflows
created later on.

This helper is called each time a socket option is set on subflows, and
future ones will need to inherit this option after their creation.

Fixes: 51c5fd09e1b4 ("mptcp: add TCP_MAXSEG sockopt support")
Cc: stable@vger.kernel.org
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260501-net-mptcp-misc-fixes-7-1-rc3-v1-4-b70118df778e@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: fix rx timestamp corruption on fastopen

The skb cb offset containing the timestamp presence flag is cleared
before loading such information. Cache such value before MPTCP CB
initialization.

Fixes: 36b122baf6a8 ("mptcp: add subflow_v(4,6)_send_synack()")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260501-net-mptcp-misc-fixes-7-1-rc3-v1-3-b70118df778e@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: use MPTCP_RST_EMPTCP for ACK HMAC validation failure

When HMAC validation fails on a received ACK + MP_JOIN in
subflow_syn_recv_sock(), the subflow is reset with reason
MPTCP_RST_EPROHIBIT ("Administratively prohibited"). This is
incorrect: HMAC validation failure is an MPTCP protocol-level
error, not an administrative policy denial.

The mirror site on the client, in subflow_finish_connect(), already
uses MPTCP_RST_EMPTCP ("MPTCP-specific error") for the same kind of
HMAC failure on the SYN/ACK + MP_JOIN. Use the same reason on the
server side for symmetry and accuracy.

Suggested-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Fixes: 443041deb5ef ("mptcp: fix NULL pointer in can_accept_new_subflow")
Cc: stable@vger.kernel.org
Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260501-net-mptcp-misc-fixes-7-1-rc3-v1-2-b70118df778e@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: use MPJoinSynAckHMacFailure for SynAck HMAC failure

In subflow_finish_connect(), HMAC validation of the server's HMAC
in SYN/ACK + MP_JOIN increments MPTCP_MIB_JOINACKMAC ("HMAC was
wrong on ACK + MP_JOIN") on failure. The function processes the
SYN/ACK, not the ACK; the matching MPTCP_MIB_JOINSYNACKMAC counter
("HMAC was wrong on SYN/ACK + MP_JOIN") exists but is not
incremented anywhere in the tree.

The mirror site on the server, subflow_syn_recv_sock(), already
uses JOINACKMAC correctly for ACK HMAC failure. Use JOINSYNACKMAC
at the SYN/ACK validation site so each counter reflects the packet
whose HMAC actually failed.

Suggested-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Fixes: fc518953bc9c ("mptcp: add and use MIB counter infrastructure")
Cc: stable@vger.kernel.org
Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260501-net-mptcp-misc-fixes-7-1-rc3-v1-1-b70118df778e@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: mana: hardening: Reject zero max_num_queues from GDMA_QUERY_MAX_RESOURCES

In a CVM environment, hardware responses cannot be trusted. The
GDMA_QUERY_MAX_RESOURCES command returns resource limits used to
determine the maximum number of queues.

In mana_gd_query_max_resources(), gc->max_num_queues is initialized
from num_online_cpus() and successively clamped by the hardware-reported
max_eq, max_cq, max_sq, max_rq, and num_msix_usable values. If any of
these hardware values is zero, gc->max_num_queues becomes zero and the
function returns success. This leads to a confusing failure later when
alloc_etherdev_mq() is called with zero queues, returning NULL and
producing a misleading -ENOMEM error.

Add an explicit zero check for gc->max_num_queues after all clamping
steps and return -ENOSPC for a clear early failure, consistent with the
existing gc->num_msix_usable <= 1 guard.

Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Link: https://patch.msgid.link/20260430083627.1873757-1-ernis@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: mana: hardening: Reject zero max_num_queues from MANA_QUERY_VPORT_CONFIG

As a part of MANA hardening for CVM, validate that max_num_sq and
max_num_rq returned by MANA_QUERY_VPORT_CONFIG are not zero. These
values flow into apc->num_queues, which is used as an allocation count
and loop bound. A zero value would result in zero-size allocations and
incorrect driver behavior.

Return -EPROTO if either value is zero.

Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
Link: https://patch.msgid.link/20260430085638.1875400-1-ernis@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

vsock/virtio: fix potential unbounded skb queue

virtio_transport_inc_rx_pkt() checks vvs->rx_bytes + len > vvs->buf_alloc.

virtio_transport_recv_enqueue() skips coalescing for packets
with VIRTIO_VSOCK_SEQ_EOM.

If fed with packets with len == 0 and VIRTIO_VSOCK_SEQ_EOM,
a very large number of packets can be queued
because vvs->rx_bytes stays at 0.

Fix this by estimating the skb metadata size:

(Number of skbs in the queue) * SKB_TRUESIZE(0)

Fixes: 077706165717 ("virtio/vsock: don't use skbuff state to account credit")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: "Eugenio Pérez" <eperezma@redhat.com>
Cc: virtualization@lists.linux.dev
Link: https://patch.msgid.link/20260430122653.554058-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-bridge-mcast-support-exponential-field-encoding'

Ujjal Roy says:

====================
net: bridge: mcast: support exponential field encoding

Description:
This series addresses a mismatch in how multicast query
intervals and response codes are handled across IPv4 (IGMPv3)
and IPv6 (MLDv2). While decoding logic currently exists,
the corresponding encoding logic is missing during query
packet generation. This leads to incorrect intervals being
transmitted when values exceed their linear thresholds.

The patches introduce a unified floating-point encoding
approach based on RFC3376 and RFC3810, ensuring that large
intervals are correctly represented in QQIC and MRC fields
using the exponent-mantissa format.

Key Changes:
* ipv4: igmp: get rid of IGMPV3_{QQIC,MRC} and simplify calculation
  Removes legacy macros in favor of a cleaner, unified
  calculation for retrieving intervals from encoded fields,
  improving code maintainability.

* ipv6: mld: rename mldv2_mrc() and add mldv2_qqi()
  Standardizes MLDv2 terminology by renaming mldv2_mrc()
  to mldv2_mrd() (Maximum Response Delay) and introducing
  a new API mldv2_qqi for QQI calculation, improving code
  readability.

* ipv4: igmp: encode multicast exponential fields
  Introduces the logic to dynamically calculate the exponent
  and mantissa using bit-scan (fls). This ensures QQIC and
  MRC fields (8-bit) are properly encoded when transmitting
  query packets with intervals that exceed their respective
  linear threshold value of 128 (for QQI/MRT).

* ipv6: mld: encode multicast exponential fields
  Applies similar encoding logic for MLDv2. This ensures
  QQIC (8-bit) and MRC (16-bit) fields are properly encoded
  when transmitting query packets with intervals that exceed
  their respective linear thresholds (128 for QQI; 32768
  for MRD).

* selftests: net: bridge: add MRC and QQIC field encoding tests
  Updates bridge selftests to validate both linear and non-linear
  (exponential) encoding for MRC and QQIC fields, ensuring
  protocol compliance across IGMPv3 and MLDv2.

Impact:
These changes ensure that multicast queriers and listeners
stay synchronized on timing intervals, preventing protocol
timeouts or premature group membership expiration caused
by incorrectly formatted packet headers.
====================

Link: https://patch.msgid.link/20260502131907.987-1-royujjal@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: net: bridge: add MRC and QQIC field encoding tests

Enhance vlmc_query_intvl_test and vlmc_query_response_intvl_test in
bridge_vlan_mcast.sh to validate IGMPv3/MLDv2 protocol compliance for
MRC and QQIC field encoding across both linear and exponential ranges.

TEST: Vlan multicast snooping enable                                [ OK ]
TEST: Vlan mcast_query_interval global option default value         [ OK ]
TEST: Number of tagged IGMPv2 general query                         [ OK ]
TEST: IGMPv3 QQIC linear value 60(s)                                [ OK ]
TEST: MLDv2 QQIC linear value 60(s)                                 [ OK ]
TEST: IGMPv3 QQIC non linear value 160(s)                           [ OK ]
TEST: MLDv2 QQIC non linear value 160(s)                            [ OK ]
TEST: Vlan mcast_query_response_interval global option default value   [ OK ]
TEST: IGMPv3 MRC linear value of 60(x0.1s)                          [ OK ]
TEST: MLDv2 MRC linear value of 24000(ms)                           [ OK ]
TEST: IGMPv3 MRC non linear value of 240(x0.1s)                     [ OK ]
TEST: MLDv2 MRC non linear value of 48000(ms)                       [ OK ]

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Ujjal Roy <royujjal@gmail.com>
Link: https://patch.msgid.link/20260502131907.987-6-royujjal@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv6: mld: encode multicast exponential fields

In MLD, MRC and QQIC fields are not correctly encoded when
generating query packets. Since the receiver of the query
interprets these fields using the MLDv2 floating-point
decoding logic, any value that exceeds the linear threshold
is incorrectly parsed as an exponential value, leading to
an incorrect interval calculation.

Encode and assign the corresponding protocol fields during
query generation. Introduce the logic to dynamically
calculate the exponent and mantissa using bit-scan (fls).
This ensures MRC (16-bit) and QQIC (8-bit) fields are
properly encoded when transmitting query packets with
intervals that exceed their respective linear thresholds
(32768 for MRD; 128 for QQI).

RFC3810: If Maximum Response Code >= 32768, the Maximum
Response Code field represents a floating-point value as
follows:
     0 1 2 3 4 5 6 7 8 9 A B C D E F
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |1| exp |          mant         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

RFC3810: If QQIC >= 128, the QQIC field represents a
floating-point value as follows:
     0 1 2 3 4 5 6 7
    +-+-+-+-+-+-+-+-+
    |1| exp | mant  |
    +-+-+-+-+-+-+-+-+

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Ujjal Roy <royujjal@gmail.com>
Link: https://patch.msgid.link/20260502131907.987-5-royujjal@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv4: igmp: encode multicast exponential fields

In IGMP, MRC and QQIC fields are not correctly encoded
when generating query packets. Since the receiver of the
query interprets these fields using the IGMPv3 floating-
point decoding logic, any value that exceeds the linear
threshold is incorrectly parsed as an exponential value,
leading to an incorrect interval calculation.

Encode and assign the corresponding protocol fields during
query generation. Introduce the logic to dynamically
calculate the exponent and mantissa using bit-scan (fls).
This ensures MRC and QQIC fields (8-bit) are properly
encoded when transmitting query packets with intervals
that exceed their respective linear threshold value of
128 (for MRT/QQI).

RFC3376: for both MRC and QQIC, values >= 128 represent
the same floating-point encoding as follows:
     0 1 2 3 4 5 6 7
    +-+-+-+-+-+-+-+-+
    |1| exp | mant  |
    +-+-+-+-+-+-+-+-+

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Ujjal Roy <royujjal@gmail.com>
Link: https://patch.msgid.link/20260502131907.987-4-royujjal@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>