git.ipfire.org Git - thirdparty/kernel/linux.git/log

hrtimer: Fix trace oddity

It turns out that __run_hrtimer() will trace like:

<idle>-0 [032] d.h2. 20705.474563: hrtimer_cancel: hrtimer=0xff2db8f77f8226e8
<idle>-0 [032] d.h1. 20705.474563: hrtimer_expire_entry: hrtimer=0xff2db8f77f8226e8 now=20699452001850 function=tick_nohz_handler/0x0

Which is a bit nonsensical, the timer doesn't get canceled on
expiration. The cause is the use of the incorrect debug helper.

Fixes: c6a2a1770245 ("hrtimer: Add tracepoint for hrtimers")
Reported-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260121143208.219595606@infradead.org

rseq: Lower default slice extension

Change the minimum slice extension to 5 usec.

Since slice_test selftest reaches a staggering ~350 nsec extension:

Task: slice_test    Mean: 350.266 ns
  Latency (us)    | Count
  ------------------------------
  EXPIRED         | 238
  0 us            | 143189
  1 us            | 167
  2 us            | 26
  3 us            | 11
  4 us            | 28
  5 us            | 31
  6 us            | 22
  7 us            | 23
  8 us            | 32
  9 us            | 16
  10 us           | 35

Lower the minimal (and default) value to 5 usecs -- which is still massive.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260121143208.073200729@infradead.org

rseq: Move slice_ext_nsec to debugfs

Move changing the slice ext duration to debugfs, a sliglty less permanent
interface.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260121143207.923520192@infradead.org

rseq: Allow registering RSEQ with slice extension

Since glibc cares about the number of syscalls required to initialize a new
thread, allow initializing rseq with slice extension on. This avoids having to
do another prctl().

Requested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260121143207.814193010@infradead.org

selftests/rseq: Implement time slice extension test

Provide an initial test case to evaluate the functionality. This needs to be
extended to cover the ABI violations and expose the race condition between
observing granted and arriving in rseq_slice_yield().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251215155709.320325431@linutronix.de

entry: Hook up rseq time slice extension

Wire the grant decision function up in exit_to_user_mode_loop()

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251215155709.258157362@linutronix.de

rseq: Implement rseq_grant_slice_extension()

Provide the actual decision function, which decides whether a time slice
extension is granted in the exit to user mode path when NEED_RESCHED is
evaluated.

The decision is made in two stages. First an inline quick check to avoid
going into the actual decision function. This checks whether:

#1 the functionality is enabled

#2 the exit is a return from interrupt to user mode

#3 any TIF bit, which causes extra work is set. That includes TIF_RSEQ,
    which means the task was already scheduled out.

The slow path, which implements the actual user space ABI, is invoked
when:

  A) #1 is true, #2 is true and #3 is false

     It checks whether user space requested a slice extension by setting
     the request bit in the rseq slice_ctrl field. If so, it grants the
     extension and stores the slice expiry time, so that the actual exit
     code can double check whether the slice is already exhausted before
     going back.

  B) #1 - #3 are true _and_ a slice extension was granted in a previous
     loop iteration

     In this case the grant is revoked.

In case that the user space access faults or invalid state is detected, the
task is terminated with SIGSEGV.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251215155709.195303303@linutronix.de

rseq: Reset slice extension when scheduled

When a time slice extension was granted in the need_resched() check on exit
to user space, the task can still be scheduled out in one of the other
pending work items. When it gets scheduled back in, and need_resched() is
not set, then the stale grant would be preserved, which is just wrong.

RSEQ already keeps track of that and sets TIF_RSEQ, which invokes the
critical section and ID update mechanisms.

Utilize them and clear the user space slice control member of struct rseq
unconditionally within the existing user access sections. That's just an
unconditional store more in that path.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/20251215155709.131081527@linutronix.de

rseq: Implement time slice extension enforcement timer

If a time slice extension is granted and the reschedule delayed, the kernel
has to ensure that user space cannot abuse the extension and exceed the
maximum granted time.

It was suggested to implement this via the existing hrtick() timer in the
scheduler, but that turned out to be problematic for several reasons:

   1) It creates a dependency on CONFIG_SCHED_HRTICK, which can be disabled
      independently of CONFIG_HIGHRES_TIMERS

   2) HRTICK usage in the scheduler can be runtime disabled or is only used
      for certain aspects of scheduling.

   3) The function is calling into the scheduler code and that might have
      unexpected consequences when this is invoked due to a time slice
      enforcement expiry. Especially when the task managed to clear the
      grant via sched_yield(0).

It would be possible to address #2 and #3 by storing state in the
scheduler, but that is extra complexity and fragility for no value.

Implement a dedicated per CPU hrtimer instead, which is solely used for the
purpose of time slice enforcement.

The timer is armed when an extension was granted right before actually
returning to user mode in rseq_exit_to_user_mode_restart().

It is disarmed, when the task relinquishes the CPU. This is expensive as
the timer is probably the first expiring timer on the CPU, which means it
has to reprogram the hardware. But that's less expensive than going through
a full hrtimer interrupt cycle for nothing.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/20251215155709.068329497@linutronix.de

rseq: Implement syscall entry work for time slice extensions

The kernel sets SYSCALL_WORK_RSEQ_SLICE when it grants a time slice
extension. This allows to handle the rseq_slice_yield() syscall, which is
used by user space to relinquish the CPU after finishing the critical
section for which it requested an extension.

In case the kernel state is still GRANTED, the kernel resets both kernel
and user space state with a set of sanity checks. If the kernel state is
already cleared, then this raced against the timer or some other interrupt
and just clears the work bit.

Doing it in syscall entry work allows to catch misbehaving user space,
which issues an arbitrary syscall, i.e. not rseq_slice_yield(), from the
critical section. Contrary to the initial strict requirement to use
rseq_slice_yield() arbitrary syscalls are not considered a violation of the
ABI contract anymore to allow onion architecture applications, which cannot
control the code inside a critical section, to utilize this as well.

If the code detects inconsistent user space that result in a SIGSEGV for
the application.

If the grant was still active and the task was not preempted yet, the work
code reschedules immediately before continuing through the syscall.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251215155709.005777059@linutronix.de

rseq: Implement sys_rseq_slice_yield()

Provide a new syscall which has the only purpose to yield the CPU after the
kernel granted a time slice extension.

sched_yield() is not suitable for that because it unconditionally
schedules, but the end of the time slice extension is not required to
schedule when the task was already preempted. This also allows to have a
strict check for termination to catch user space invoking random syscalls
including sched_yield() from a time slice extension region.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20251215155708.929634896@linutronix.de

rseq: Add prctl() to enable time slice extensions

Implement a prctl() so that tasks can enable the time slice extension
mechanism. This fails, when time slice extensions are disabled at compile
time or on the kernel command line and when no rseq pointer is registered
in the kernel.

That allows to implement a single trivial check in the exit to user mode
hotpath, to decide whether the whole mechanism needs to be invoked.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251215155708.858717691@linutronix.de

rseq: Add statistics for time slice extensions

Extend the quick statistics with time slice specific fields.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251215155708.795202254@linutronix.de

rseq: Provide static branch for time slice extensions

Guard the time slice extension functionality with a static key, which can
be disabled on the kernel command line.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251215155708.733429292@linutronix.de

rseq: Add fields and constants for time slice extension

Aside of a Kconfig knob add the following items:

   - Two flag bits for the rseq user space ABI, which allow user space to
     query the availability and enablement without a syscall.

   - A new member to the user space ABI struct rseq, which is going to be
     used to communicate request and grant between kernel and user space.

   - A rseq state struct to hold the kernel state of this

   - Documentation of the new mechanism

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251215155708.669472597@linutronix.de

sched/debug: Convert copy_from_user() + kstrtouint() to kstrtouint_from_user()

Using kstrtouint_from_user() instead of copy_from_user() + kstrtouint()
makes the code simpler and less error-prone.

Suggested-by: Yury Norov <ynorov@nvidia.com>
Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Yury Norov <ynorov@nvidia.com>
Link: https://patch.msgid.link/20260117145615.53455-2-fushuai.wang@linux.dev

sched/fair: Remove nohz.nr_cpus and use weight of cpumask instead

nohz.nr_cpus was observed as contended cacheline when running
enterprise workload on large systems.

Fundamental scalability challenge with nohz.idle_cpus_mask
and nohz.nr_cpus is the following:

(1) nohz_balancer_kick() observes (reads) nohz.nr_cpus
     (or nohz.idle_cpu_mask) and nohz.has_blocked to  see whether there's
     any nohz balancing work to do, in every scheduler tick.

(2) nohz_balance_enter_idle() and nohz_balance_exit_idle()
     (through nohz_balancer_kick() via sched_tick()) modify (write)
     nohz.nr_cpus (and/or nohz.idle_cpu_mask) and nohz.has_blocked.

The characteristic frequencies are the following:

(1) nohz_balancer_kick() happens at scheduler (busy)tick frequency
     on CPU(which has not gone idle). This is a relatively constant
     frequency  in the ~1 kHz range or lower.

(2) happens at idle enter/exit frequency on every CPU that goes to idle.
     This is workload dependent, but can easily be hundreds of kHz for
     IO-bound loads and high CPU counts. Ie. can be orders of magnitude
     higher than (1), in which case a cachemiss at every invocation of (1)
     is almost inevitable. idle exit will trigger (1) on the CPU
     which is coming out of idle.

There's two types of costs from these functions:

(A) scheduler tick cost via (1): this happens on busy CPUs too, and is
     thus a primary scalability cost. But the rate here is constant and
     typically much lower than (B), hence the absolute benefit to workload
     scalability will be lower as well.

(B) idle cost via (2): going-to-idle and coming-from-idle costs are
     secondary concerns, because they impact power efficiency more than
     they impact scalability. But in terms of absolute cost this scales
     up with nr_cpus as well, and a much faster rate, and thus may also
     approach and negatively impact system limits like
     memory bus/fabric bandwidth.

Note that nohz.idle_cpus_mask and nohz.nr_cpus may appear to reside in the
same cacheline, however under CONFIG_CPUMASK_OFFSTACK=y the backing storage
for nohz.idle_cpus_mask will be elsewhere. With CPUMASK_OFFSTACK=n,
the nohz.idle_cpus_mask and rest of nohz fields are in different cachelines
under typical NR_CPUS=512/2048. This implies two separate cachelines
being dirtied upon idle entry / exit.

nohz.nr_cpus can be derived from the mask itself. Its usage doesn't warrant
a functionally correct value. This means one less cacheline being dirtied in
idle entry/exit path which helps to save some bus bandwidth w.r.t to those
nohz functions(approx 50%). This in turn helps to improve enterprise
workload throughput.

On system with 480 CPUs, running "hackbench 40 process 10000 loops"
(Avg of 3 runs)
baseline:
     0.81%  hackbench          [k] nohz_balance_exit_idle
     0.21%  hackbench          [k] nohz_balancer_kick
     0.09%  swapper            [k] nohz_run_idle_balance

With patch:
     0.35%  hackbench          [k] nohz_balance_exit_idle
     0.09%  hackbench          [k] nohz_balancer_kick
     0.07%  swapper            [k] nohz_run_idle_balance

[Ingo Molnar: scalability analysis changlog]

Reviewed-and-tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://patch.msgid.link/20260115073524.376643-4-sshegde@linux.ibm.com

sched/fair: Change likelyhood of nohz.nr_cpus

These days most of the system have multi cores. The likelyhood of
at least one or more CPUs in nohz (idle state) is higher.

Give accurate hint to the branch predictor.

Reviewed-and-tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://patch.msgid.link/20260115073524.376643-3-sshegde@linux.ibm.com

sched/fair: Move checking for nohz cpus after time check

Current code does.
- Read nohz.nr_cpus
- Check if the time has passed to do NOHZ idle balance

Instead do this.
- Check if the time has passed to do NOHZ idle balance
- Read nohz.nr_cpus

This will skip the read most of the time in normal system usage.
i.e when there are nohz.nr_cpus (system is not 100% busy).

Note that when there are no idle CPUs(100% busy), even if the flag gets
set to NOHZ_STATS_KICK | NOHZ_NEXT_KICK, find_new_ilb will fail and
there will be no NOHZ idle balance. In such cases there will be a very
narrow window where, kick_ilb will be called un-necessarily.
However current functionality is still retained.

Note: This patch doesn't solve any cacheline overheads. No improvement
in performance apart from saving a few cycles of reading nohz.nr_cpus

Reviewed-and-tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://patch.msgid.link/20260115073524.376643-2-sshegde@linux.ibm.com

sched/fair: Fix math notation errors in avg_vruntime comment

The avg_vruntime comment contains a couple of mathematical notation
issues:

- The summation over w_i * (V - v_i) is written in an ambiguous form
- The delta term refers to v instead of v0, which is inconsistent
with the code and preceding explanation

Fix these to make the comment mathematically correct and consistent
with the implementation.

Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260114090035.19033-1-zhanxusheng@xiaomi.com

sched: Fix build for modules using set_tsk_need_resched()

Commit adcc3bfa8806 ("sched: Adapt sched tracepoints for RV task model")
added a tracepoint to the need_resched action that can be triggered also
by set_tsk_need_resched.
This function was previously accessible from out-of-tree modules but
it's no longer available because the __trace_set_need_resched() symbol
is not exported (together with the tracepoint itself, which was exported
in a separate patch) and building such modules fails.

Export __trace_set_need_resched to modules to fix those build issues.

Fixes: adcc3bfa8806 ("sched: Adapt sched tracepoints for RV task model")
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Phil Auld <pauld@redhat.com>
Link: https://patch.msgid.link/20260112140413.362202-1-gmonaco@redhat.com

sched: Export hidden tracepoints to modules

The tracepoints sched_entry, sched_exit and sched_set_need_resched
are not exported to tracefs as trace events, this allows only kernel
code to access them. Helper modules like [1] can be used to still have
the tracepoints available to ftrace for debugging purposes, but they do
rely on the tracepoints being exported.

Export the 3 not exported tracepoints.
Note that sched_set_state is already exported as the macro is called
from modules.

[1] - https://github.com/qais-yousef/sched_tp.git

Fixes: adcc3bfa8806 ("sched: Adapt sched tracepoints for RV task model")
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Phil Auld <pauld@redhat.com>
Link: https://patch.msgid.link/20251205131621.135513-9-gmonaco@redhat.com

sched: Further restrict the preemption modes

The introduction of PREEMPT_LAZY was for multiple reasons:

  - PREEMPT_RT suffered from over-scheduling, hurting performance compared to
    !PREEMPT_RT.

  - the introduction of (more) features that rely on preemption; like
    folio_zero_user() which can do large memset() without preemption checks.

    (Xen already had a horrible hack to deal with long running hypercalls)

  - the endless and uncontrolled sprinkling of cond_resched() -- mostly cargo
    cult or in response to poor to replicate workloads.

By moving to a model that is fundamentally preemptable these things become
managable and avoid needing to introduce more horrible hacks.

Since this is a requirement; limit PREEMPT_NONE to architectures that do not
support preemption at all. Further limit PREEMPT_VOLUNTARY to those
architectures that do not yet have PREEMPT_LAZY support (with the eventual goal
to make this the empty set and completely remove voluntary preemption and
cond_resched() -- notably VOLUNTARY is already limited to !ARCH_NO_PREEMPT.)

This leaves up-to-date architectures (arm64, loongarch, powerpc, riscv, s390,
x86) with only two preemption models: full and lazy.

While Lazy has been the recommended setting for a while, not all distributions
have managed to make the switch yet. Force things along. Keep the patch minimal
in case of hard to address regressions that might pop up.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Link: https://patch.msgid.link/20251219101502.GB1132199@noisy.programming.kicks-ass.net

sched: Reorder some fields in struct rq

This colocates some hot fields in "struct rq" to be on the same cache line
as others that are often accessed at the same time or in similar ways.

Using data from a Google-internal fleet-scale profiler, I found three
distinct groups of hot fields in struct rq:

- (1) The runqueue lock: __lock.

- (2) Those accessed from hot code in pick_next_task_fair():
      nr_running, nr_numa_running, nr_preferred_running,
      ttwu_pending, cpu_capacity, curr, idle.

- (3) Those accessed from some other hot codepaths, e.g.
      update_curr(), update_rq_clock(), and scheduler_tick():
      clock_task, clock_pelt, clock, lost_idle_time,
      clock_update_flags, clock_pelt_idle, clock_idle.

The cycles spent on accessing these different groups of fields broke down
roughly as follows:

- 50% on group (1) (the runqueue lock, always read-write)
- 39% on group (2) (load:store ratio around 38:1)
-  8% on group (3) (load:store ratio around 5:1)
-  3% on all the other fields

Most of the fields in group (3) are already in a cache line grouping; this
patch just adds "clock" and "clock_update_flags" to that group. The fields
in group (2) are scattered across several cache lines; the main effect of
this patch is to group them together, on a single line at the beginning of
the structure. A few other less performance-critical fields (nr_switches,
numa_migrate_on, has_blocked_load, nohz_csd, last_blocked_load_update_tick)
were also reordered to reduce holes in the data structure.

Since the runqueue lock is acquired from so many different contexts, and is
basically always accessed using an atomic operation, putting it on either
of the cache lines for groups (2) or (3) would slow down accesses to those
fields dramatically, since those groups are read-mostly accesses.

To test this, I wrote a focused load test that would put load on the
pick_next_task_fair() path. A parent process would fork many child
processes, and each child would nanosleep() for 1 msec many times in a
loop. The load test was monitored with "perf", and I looked at the amount
of cycles that were spent with sched_balance_rq() on the stack. The test
was reliably spending ~5% of all of its cycles there. I ran it 100 times
on a pair of 2-socket Intel Haswell machines (72 vCPUs per machine) - one
running the tip of sched/core, the other running this change - using 360
child processes and 8192 1-msec sleeps per child.  The mean cycle count
dropped from 5.14B to 4.91B, or a *4.6% decrease* in relevant scheduler
cycles.

Given that this change reduces cache misses in a very hot kernel codepath,
there's likely to be additional application performance improvement due to
reduced cache conflicts from kernel data structures.

On a Power11 system with 128-byte cache lines, my test showed a ~5%
decrease in relevant scheduler cycles, along with a slight increase in user
time - both positive indicators. This data comes from
https://lore.kernel.org/lkml/affdc6b1-9980-44d1-89db-d90730c1e384@linux.ibm.com/
This is the case even though the additional "____cacheline_aligned" that
puts the runqueue lock on the next cache line adds an additional 64 bytes
of padding on those machines. This patch does not change the size of
"struct rq" on machines with 64-byte cache lines.

I also ran "hackbench" to try to test this change, but it didn't show
conclusive results.  Looking at a CPU cycle profile of the hackbench run,
it was spending 95% of its cycles inside __alloc_skb(), __kfree_skb(), or
kmem_cache_free() - almost all of which was spent updating memcg counters
or contending on the list_lock in kmem_cache_node. And it spent less than
0.5% of its cycles inside either schedule() or try_to_wake_up().  So it's
not surprising that it didn't show useful results here.

The "__no_randomize_layout" was added to reflect the fact that performance
of code that references this data structure is unusually sensitive to
placement of its members.

Signed-off-by: Blake Jones <blakejones@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Reviewed-by: Josh Don <joshdon@google.com>
Tested-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Link: https://patch.msgid.link/20251202023743.1524247-1-blakejones@google.com

sched/fair: Use cpumask_weight_and() in sched_balance_find_dst_group()

In the group_has_spare case, the function creates a temporary cpumask
to just calculate weight of (p->cpus_ptr & sched_group_span(local)).

We've got a dedicated helper for it.

Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Fernand Sieber <sieberf@amazon.com>
Link: https://patch.msgid.link/20251207034247.402926-1-yury.norov@gmail.com

sched/fair: Simplify task_numa_find_cpu()

Use for_each_cpu_and() and drop some housekeeping code.

Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Phil Auld <pauld@redhat.com>
Link: https://patch.msgid.link/20251207033037.399608-1-yury.norov@gmail.com

sched/fair: Drop useless cpumask_empty() in find_energy_efficient_cpu()

cpumask_empty() call is O(N) and useless because the previous
cpumask_and() returns false for empty 'cpus'. Drop it.

Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://patch.msgid.link/20251207040543.407695-1-yury.norov@gmail.com

sched/fair: Fix sched_avg fold

After the robot reported a regression wrt commit: 089d84203ad4 ("sched/fair:
Fold the sched_avg update"), Shrikanth noted that two spots missed a factor
se_weight().

Fixes: 089d84203ad4 ("sched/fair: Fold the sched_avg update")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202512181208.753b9f6e-lkp@intel.com
Debugged-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251218102020.GO3707891@noisy.programming.kicks-ass.net

sched: Fix faulty assertion in sched_change_end()

Commit 47efe2ddccb1f ("sched/core: Add assertions to QUEUE_CLASS") added an
assert to sched_change_end() verifying that a class demotion would result in a
reschedule.

As it turns out; rt_mutex_setprio() does not force a resched on class
demontion. Furthermore, this is only relevant to running tasks.

Change the warning into a reschedule and make sure to only do so for running
tasks.

Fixes: 47efe2ddccb1f ("sched/core: Add assertions to QUEUE_CLASS")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251216141725.GW3707837@noisy.programming.kicks-ass.net

sched/core: Rework sched_class::wakeup_preempt() and rq_modified_*()

Change sched_class::wakeup_preempt() to also get called for
cross-class wakeups, specifically those where the woken task
is of a higher class than the previous highest class.

In order to do this, track the current highest class of the runqueue
in rq::next_class and have wakeup_preempt() track this upwards for
each new wakeup. Additionally have schedule() re-set the value on
pick.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251127154725.901391274@infradead.org

sched/fair: Sort out 'blocked_load*' namespace noise

There's three layers of logic in the scheduler that
deal with 'has_blocked' (load) handling of the NOHZ code:

  (1) nohz.has_blocked,
  (2) rq->has_blocked_load, deal with NOHZ idle balancing,
  (3) and cfs_rq_has_blocked(), which is part of the layer
      that is passing the SMP load-balancing signal to the
      NOHZ layers.

The 'has_blocked' and 'has_blocked_load' names are used
in a mixed fashion, sometimes within the same function.

Standardize on 'has_blocked_load' to make it all easy
to read and easy to grep.

No change in functionality.

Suggested-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Link: https://patch.msgid.link/aS6yvxyc3JfMxxQW@gmail.com

sched/fair: Introduce and use the vruntime_cmp() and vruntime_op() wrappers for wrapped-signed aritmetics

We have to be careful with vruntime comparisons and subtraction,
due to the possibility of wrapping, so we have macros like:

   #define vruntime_gt(field, lse, rse) ({ (s64)((lse)->field - (rse)->field) > 0; })

Which is used like this:

if (vruntime_gt(min_vruntime, se, rse))
se->min_vruntime = rse->min_vruntime;

Replace this with an easier to read pattern that uses the regular
arithmetics operators:

if (vruntime_cmp(se->min_vruntime, ">", rse->min_vruntime))
se->min_vruntime = rse->min_vruntime;

Also replace vruntime subtractions with vruntime_op():

-       delta = (s64)(sea->vruntime - seb->vruntime) +
-               (s64)(cfs_rqb->zero_vruntime_fi - cfs_rqa->zero_vruntime_fi);
+       delta = vruntime_op(sea->vruntime, "-", seb->vruntime) +
+               vruntime_op(cfs_rqb->zero_vruntime_fi, "-", cfs_rqa->zero_vruntime_fi);

In the vruntime_cmp() and vruntime_op() macros use Use __builtin_strcmp(),
because of __HAVE_ARCH_STRCMP might turn off the compiler optimizations
we rely on here to catch usage bugs.

No change in functionality.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched/fair: Rename cfs_rq::avg_vruntime to ::sum_w_vruntime, and helper functions

The ::avg_vruntime field is a  misnomer: it says it's an
'average vruntime', but in reality it's the momentary sum
of the weighted vruntimes of all queued tasks, which is
at least a division away from being an average.

This is clear from comments about the math of fair scheduling:

    * \Sum (v_i - v0) * w_i := cfs_rq->avg_vruntime

This confusion is increased by the cfs_avg_vruntime() function,
which does perform the division and returns a true average.

The sum of all weighted vruntimes should be named thusly,
so rename the field to ::sum_w_vruntime. (As arguably
::sum_weighted_vruntime would be a bit of a mouthful.)

Understanding the scheduler is hard enough already, without
extra layers of obfuscated naming. ;-)

Also rename related helper functions:

  sum_vruntime_add()    => sum_w_vruntime_add()
  sum_vruntime_sub()    => sum_w_vruntime_sub()
  sum_vruntime_update() => sum_w_vruntime_update()

With the notable exception of cfs_avg_vruntime(), which
was named accurately.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251201064647.1851919-7-mingo@kernel.org

sched/fair: Rename cfs_rq::avg_load to cfs_rq::sum_weight

The ::avg_load field is a long-standing misnomer: it says it's an
'average load', but in reality it's the momentary sum of the load
of all currently runnable tasks. We'd have to also perform a
division by nr_running (or use time-decay) to arrive at any sort
of average value.

This is clear from comments about the math of fair scheduling:

* \Sum w_i := cfs_rq->avg_load

The sum of all weights is ... the sum of all weights, not
the average of all weights.

To make it doubly confusing, there's also an ::avg_load
in the load-balancing struct sg_lb_stats, which *is* a
true average.

The second part of the field's name is a minor misnomer
as well: it says 'load', and it is indeed a load_weight
structure as it shares code with the load-balancer - but
it's only in an SMP load-balancing context where
load = weight, in the fair scheduling context the primary
purpose is the weighting of different nice levels.

So rename the field to ::sum_weight instead, which makes
the terminology of the EEVDF math match up with our
implementation of it:

* \Sum w_i := cfs_rq->sum_weight

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251201064647.1851919-6-mingo@kernel.org

sched/fair: Separate se->vlag from se->vprot

There's no real space concerns here and keeping these fields
in a union makes reading (and tracing) the scheduler code harder.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251201064647.1851919-4-mingo@kernel.org

sched/fair: Clean up comments in 'struct cfs_rq'

- Fix vertical alignment
- Fix typos
- Fix capitalization

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251201064647.1851919-3-mingo@kernel.org

sched/fair: Join two #ifdef CONFIG_FAIR_GROUP_SCHED blocks

Join two identical #ifdef blocks:

  #ifdef CONFIG_FAIR_GROUP_SCHED
  ...
  #endif

  #ifdef CONFIG_FAIR_GROUP_SCHED
  ...
  #endif

Also mark nested #ifdef blocks in the usual fashion, to make
it more apparent where in a nested hierarchy of #ifdefs we
are at a glance.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Link: https://patch.msgid.link/20251201064647.1851919-2-mingo@kernel.org

sched/core: Add assertions to QUEUE_CLASS

Add some checks to the sched_change pattern to validate assumptions
around changing classes.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251127154725.771691954@infradead.org

sched/fair: Limit hrtick work

The task_tick_fair() function does:

- update the hierarchical runtimes
- drive NUMA-balancing
- update load-balance statistics
- drive force-idle preemption

All but the very first can be limited to the periodic tick. Let hrtick
only update accounting and drive preemption, not load-balancing and
other bits.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20250918080205.563385766@infradead.org

sched/fair: Remove superfluous rcu_read_lock()

With fair switched to rcu_dereference_all() validation, having IRQ or
preemption disabled is sufficient, remove the rcu_read_lock()
clutter.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251127154725.647502625@infradead.org

sched/fair: Switch to rcu_dereference_all()

With the {rcu,sched,bh} RCU flavours being unified, it doesn't really
make sense to check for just the rcu one. Switch to the _all family of
verification which includes all 3 of the listed flavours.

Notably, this will enable us to remove some superfluous
rcu_read_lock() regions when we know they are inside preempt/IRQ
disabled regions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched/headers: Rename rcu_dereference_check_sched_domain() => rcu_dereference_sched_domain()

Remove check from the name for being surplus to requirements.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched/fair: Avoid rq->lock bouncing in sched_balance_newidle()

While poking at this code recently I noted we do a pointless
unlock+lock cycle in sched_balance_newidle(). We drop the rq->lock (so
we can balance) but then instantly grab the same rq->lock again in
sched_balance_update_blocked_averages().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251127154725.532469061@infradead.org

sched/fair: Fold the sched_avg update

Nine (and a half) instances of the same pattern is just silly, fold the lot.

Notably, the half instance in enqueue_load_avg() is right after setting
cfs_rq->avg.load_sum to cfs_rq->avg.load_avg * get_pelt_divider(&cfs_rq->avg).
Since get_pelt_divisor() >= PELT_MIN_DIVIDER, this ends up being a no-op
change.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://patch.msgid.link/20251127154725.413564507@infradead.org

<linux/compiler_types.h>: Add the __signed_scalar_typeof() helper

Define __signed_scalar_typeof() to declare a signed scalar type, leaving
non-scalar types unchanged.

To be used to clean up the scheduler load-balancing code a bit.

[ mingo: Split off this patch from the scheduler patch. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://patch.msgid.link/20251127154725.413564507@infradead.org

Linux 6.19-rc1

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"The only core fix is in doc; all the others are in drivers, with the
  biggest impacts in libsas being the rollback on error handling and in
  ufs coming from a couple of error handling fixes, one causing a crash
  if it's activated before scanning and the other fixing W-LUN
  resumption"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: qcom: Fix confusing cleanup.h syntax
  scsi: libsas: Add rollback handling when an error occurs
  scsi: device_handler: Return error pointer in scsi_dh_attached_handler_name()
  scsi: ufs: core: Fix a deadlock in the frequency scaling code
  scsi: ufs: core: Fix an error handler crash
  scsi: Revert "scsi: libsas: Fix exp-attached device scan after probe failure scanned in again after probe failed"
  scsi: ufs: core: Fix RPMB link error by reversing Kconfig dependencies
  scsi: qla4xxx: Use time conversion macros
  scsi: qla2xxx: Enable/disable IRQD_NO_BALANCING during reset
  scsi: ipr: Enable/disable IRQD_NO_BALANCING during reset
  scsi: imm: Fix use-after-free bug caused by unfinished delayed work
  scsi: target: sbp: Remove KMSG_COMPONENT macro
  scsi: core: Correct documentation for scsi_device_quiesce()
  scsi: mpi3mr: Prevent duplicate SAS/SATA device entries in channel 1
  scsi: target: Reset t_task_cdb pointer in error case
  scsi: ufs: core: Fix EH failure after W-LUN resume error

Merge tag 'ceph-for-6.19-rc1' of https://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
"We have a patch that adds an initial set of tracepoints to the MDS
  client from Max, a fix that hardens osdmap parsing code from myself
  (marked for stable) and a few assorted fixups"

* tag 'ceph-for-6.19-rc1' of https://github.com/ceph/ceph-client:
  rbd: stop selecting CRC32, CRYPTO, and CRYPTO_AES
  ceph: stop selecting CRC32, CRYPTO, and CRYPTO_AES
  libceph: make decode_pool() more resilient against corrupted osdmaps
  libceph: Amend checking to fix `make W=1` build breakage
  ceph: Amend checking to fix `make W=1` build breakage
  ceph: add trace points to the MDS client
  libceph: fix log output race condition in OSD client

Merge tag 'tomoyo-pr-20251212' of git://git.code.sf.net/p/tomoyo/tomoyo

Pull tomoyo update from Tetsuo Handa:
"Trivial optimization"

* tag 'tomoyo-pr-20251212' of git://git.code.sf.net/p/tomoyo/tomoyo:
tomoyo: Use local kmap in tomoyo_dump_page()

Merge tag 'smp-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull CPU hotplug fix from Ingo Molnar:

- Fix CPU hotplug callbacks to disable interrupts on UP kernels

* tag 'smp-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu: Make atomic hotplug callbacks run with interrupts disabled on UP

Merge tag 'perf-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf event fixes from Ingo Molnar:

- Fix NULL pointer dereference crash in the Intel PMU driver

- Fix missing read event generation on task exit

- Fix AMD uncore driver init error handling

- Fix whitespace noise

* tag 'perf-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix NULL event dereference crash in handle_pmi_common()
  perf/core: Fix missing read event generation on task exit
  perf/x86/amd/uncore: Fix the return value of amd_uncore_df_event_init() on error
  perf/uprobes: Remove <space><Tab> whitespace noise

Merge tag 'irq-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Ingo Molnar:

- Fix error code in the irqchip/mchp-eic driver

- Fix setup_percpu_irq() affinity assumptions

- Remove the unused irq_domain_add_tree() function

* tag 'irq-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/mchp-eic: Fix error code in mchp_eic_domain_alloc()
  irqdomain: Delete irq_domain_add_tree()
  genirq: Allow NULL affinity for setup_percpu_irq()

Merge tag 'core-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc core fixes from Ingo Molnar:

- Improve bug reporting

- Suppress W=1 format warning

- Improve rseq scalability on Clang builds

* tag 'core-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rseq: Always inline rseq_debug_syscall_return()
  bug: Hush suggest-attribute=format for __warn_printf()
  bug: Let report_bug_entry() provide the correct bugaddr

Merge tag 'mm-nonmm-stable-2025-12-11-11-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc updates from Andrew Morton:
"There are no significant series in this small merge. Please see the
  individual changelogs for details"

[ Editor's note: it's mainly ocfs2 and a couple of random fixes ]

* tag 'mm-nonmm-stable-2025-12-11-11-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm: memfd_luo: add CONFIG_SHMEM dependency
  mm: shmem: avoid build warning for CONFIG_SHMEM=n
  ocfs2: fix memory leak in ocfs2_merge_rec_left()
  ocfs2: invalidate inode if i_mode is zero after block read
  ocfs2: avoid -Wflex-array-member-not-at-end warning
  ocfs2: convert remaining read-only checks to ocfs2_emergency_state
  ocfs2: add ocfs2_emergency_state helper and apply to setattr
  checkpatch: add uninitialized pointer with __free attribute check
  args: fix documentation to reflect the correct numbers
  ocfs2: fix kernel BUG in ocfs2_find_victim_chain
  liveupdate: luo_core: fix redundant bound check in luo_ioctl()
  ocfs2: validate inline xattr size and entry count in ocfs2_xattr_ibody_list
  fs/fat: remove unnecessary wrapper fat_max_cache()
  ocfs2: replace deprecated strcpy with strscpy
  ocfs2: check tl_used after reading it from trancate log inode
  liveupdate: luo_file: don't use invalid list iterator

Merge tag 'mm-stable-2025-12-11-11-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM updates from Andrew Morton:

- "powerpc/pseries/cmm: two smaller fixes" (David Hildenbrand)
   fixes a couple of minor things in ppc land

- "Improve folio split related functions" (Zi Yan)
   some cleanups and minorish fixes in the folio splitting code

* tag 'mm-stable-2025-12-11-11-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/damon/tests/core-kunit: avoid damos_test_commit stack warning
  mm: vmscan: correct nr_requested tracing in scan_folios
  MAINTAINERS: add idr core-api doc file to XARRAY
  mm/hugetlb: fix incorrect error return from hugetlb_reserve_pages()
  mm: fix CONFIG_STACK_GROWSUP typo in mm.h
  mm/huge_memory: fix folio split stats counting
  mm/huge_memory: make min_order_for_split() always return an order
  mm/huge_memory: replace can_split_folio() with direct refcount calculation
  mm/huge_memory: change folio_split_supported() to folio_check_splittable()
  mm/sparse: fix sparse_vmemmap_init_nid_early definition without CONFIG_SPARSEMEM
  powerpc/pseries/cmm: adjust BALLOON_MIGRATE when migrating pages
  powerpc/pseries/cmm: call balloon_devinfo_init() also without CONFIG_BALLOON_COMPACTION

file: ensure cleanup

Brown paper bag time. This is a silly oversight where I missed to drop
the error condition checking to ensure we clean up on early error
returns. I have an internal unit testset coming up for this which will
catch all such issues going forward.

Reported-by: Chris Mason <clm@fb.com>
Reported-by: Jeff Layton <jlayton@kernel.org>
Fixes: 011703a9acd7 ("file: add FD_{ADD,PREPARE}()")
Signed-off-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

x86/hv: Add gitignore entry for generated header file

Commit 7bfe3b8ea6e3 ("Drivers: hv: Introduce mshv_vtl driver") added a
new generated header file for the offsets into the mshv_vtl_cpu_context
structure to be used by the low-level assembly code. But it didn't add
the .gitignore file to go with it, so 'git status' and friends will
mention it.

Let's add the gitignore file before somebody thinks that generated
header should be committed.

Fixes: 7bfe3b8ea6e3 ("Drivers: hv: Introduce mshv_vtl driver")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'drm-fixes-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel

Pull more drm fixes from Dave Airlie:
"These are the enqueued fixes that ended up in our fixes branch,
  nouveau mostly, along with some small fixes in other places.

  plane:
   - Handle IS_ERR vs NULL in drm_plane_create_hotspot_properties()

  ttm:
   - fix devcoredump for evicted bos

  panel:
   - Fix stack usage warning in novatek-nt35560

  nouveau:
   - alloc fwsec sb at boot to avoid s/r problems
   - fix strcpy usage
   - fix i2c encoder crash

  bridge:
   - Ignore spurious PLL_UNLOCK bit in ti-sn65dsi83

  mgag200:
   - Fix bigendian handling in mgag200

  tilcdc:
   - Fix probe failure in tilcdc"

* tag 'drm-fixes-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel:
  drm/mgag200: Fix big-endian support
  drm/tilcdc: Fix removal actions in case of failed probe
  drm/ttm: Avoid NULL pointer deref for evicted BOs
  drm: nouveau: Replace sprintf() with sysfs_emit()
  drm/nouveau: fix circular dep oops from vendored i2c encoder
  drm/nouveau: refactor deprecated strcpy
  drm/plane: Fix IS_ERR() vs NULL check in drm_plane_create_hotspot_properties()
  drm/bridge: ti-sn65dsi83: ignore PLL_UNLOCK errors
  drm/nouveau/gsp: Allocate fwsec-sb at boot
  drm/panel: novatek-nt35560: avoid on-stack device structure

Merge tag 'drm-next-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
"This is the weekly fixes for what is in next tree, mostly amdgpu and
  some i915, panthor and a core revert.

  core:
   - revert dumb bo 8 byte alignment

  amdgpu:
   - SI fix
   - DC reduce stack usage
   - HDMI fixes
   - VCN 4.0.5 fix
   - DP MST fix
   - DC memory allocation fix

  amdkfd:
   - SVM fix
   - Trap handler fix
   - VGPR fixes for GC 11.5

  i915:
   - Fix format string truncation warning
   - FIx runtime PM reference during fbdev BO creation

  panthor:
   - fix UAF

  renesas:
   - fix sync flag handling"

* tag 'drm-next-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel:
  Revert "drm/amd/display: Fix pbn to kbps Conversion"
  drm/amd: Fix unbind/rebind for VCN 4.0.5
  drm/i915: Fix format string truncation warning
  drm/i915/fbdev: Hold runtime PM ref during fbdev BO creation
  drm/amd/display: Improve HDMI info retrieval
  drm/amdkfd: bump minimum vgpr size for gfx1151
  drm/amd/display: shrink struct members
  drm/amdkfd: Export the cwsr_size and ctl_stack_size to userspace
  drm/amd/display: Refactor dml_core_mode_support to reduce stack frame
  drm/amdgpu: don't attach the tlb fence for SI
  drm/amd/display: Use GFP_ATOMIC in dc_create_plane_state()
  drm/amdkfd: Trap handler support for expert scheduling mode
  drm/amdkfd: Use huge page size to check split svm range alignment
  drm/rcar-du: dsi: Handle both DRM_MODE_FLAG_N.SYNC and !DRM_MODE_FLAG_P.SYNC
  drm/gem-shmem: revert the 8-byte alignment constraint
  drm/gem-dma: revert the 8-byte alignment constraint
  drm/panthor: Prevent potential UAF in group creation

Merge tag 'i3c/for-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux

Pull further i3c update from Alexandre Belloni:
"We are removing a legacy API callback and having this sooner rather
  than later will help ensuring no one introduces a new driver using it.

  I've also added patches removing the "__free(...) = NULL" pattern
  because I'm sure we won't avoid people sending those following the
  mailing list discussion..."

* tag 'i3c/for-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
  i3c: adi: Fix confusing cleanup.h syntax
  i3c: master: Fix confusing cleanup.h syntax
  i3c: master: cleanup callback .priv_xfers()
  i3c: master: switch to use new callback .i3c_xfers() from .priv_xfers()

Merge tag 'rtc-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
"Subsystem:
   - stop setting max_user_freq from the individual drivers as this has
     not been hardware related for a while

  New drivers:
   - Andes ATCRTC100
   - Apple SMC
   - Nvidia VRS

  Drivers:
   - renesas-rtca3: add RZ/V2H support
   - tegra: add ACPI support"

* tag 'rtc-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (34 commits)
  rtc: spacemit: MFD_SPACEMIT_P1 as dependencies
  rtc: atcrtc100: Fix signedness bug in probe()
  rtc: max31335: Fix ignored return value in set_alarm
  rtc: gamecube: Check the return value of ioremap()
  Documentation: ABI: testing: Fix "upto" typo in rtc-cdev
  rtc: Add new rtc-macsmc driver for Apple Silicon Macs
  dt-bindings: rtc: Add Apple SMC RTC
  MAINTAINERS: drop unneeded file entry in NVIDIA VRS RTC DRIVER
  rtc: isl12026: Add id_table
  rtc: renesas-rtca3: Add support for multiple reset lines
  dt-bindings: rtc: renesas,rz-rtca3: Add RZ/V2H support
  rtc: tegra: Replace deprecated SIMPLE_DEV_PM_OPS
  rtc: tegra: Add ACPI support
  rtc: tegra: Use devm_clk_get_enabled() in probe
  rtc: Kconfig: add MC34708 to mc13xxx help text
  rtc: s35390a: use u8 instead of char for register buffer
  rtc: nvvrs: add NVIDIA VRS RTC device driver
  dt-bindings: rtc: Document NVIDIA VRS RTC
  rtc: atcrtc100: Add ATCRTC100 RTC driver
  MAINTAINERS: Add entry for ATCRTC100 RTC driver
  ...

Merge tag 'pwm/for-6.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux

Pull pwm fix from Uwe Kleine-König:
"Fix missing th1520 Kconfig dependencies

  This tightens the dependency for the new pwm driver written in Rust to
  make build bots and obviously also users happy"

* tag 'pwm/for-6.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
  pwm: th1520: Fix missing Kconfig dependencies

Merge tag 'gpio-fixes-for-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio updates from Bartosz Golaszewski:

- fix spinlock op type after conversion to lock guards

- fix a memory leak in error path in gpio-regmap

- Kconfig fixes in GPIO drivers

- add a GPIO ACPI quirk for Dell Precision 7780

- set of fixes for shared GPIO management

* tag 'gpio-fixes-for-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: shared: make locking more fine-grained
  gpio: shared: fix auxiliary device cleanup order
  gpio: shared: check if a reference is populated before cleaning its resources
  gpio: shared: fix NULL-pointer dereference in teardown path
  gpio: shared: ignore disabled nodes when traversing the device-tree
  gpiolib: acpi: Add quirk for Dell Precision 7780
  gpio: tb10x: fix OF_GPIO dependency
  gpio: qixis: select CONFIG_REGMAP_MMIO
  gpio: regmap: Fix memleak in error path in gpio_regmap_register()
  gpio: mmio: fix bad guard conversion

Merge tag 'pci-v6.19-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI fix from Bjorn Helgaas:

- Initialize rzg3s_pcie_msi_irq() MSI status bitmap before use (Claudiu
Beznea)

* tag 'pci-v6.19-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI: rzg3s-host: Initialize MSI status bitmap before use

Merge tag 'soundwire-6.19-rc1_updated' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire

Pull soundwire updates from Vinod Koul:

- Support for multiple sections in a BPT stream

- Align DMA frame with BPT frames

- Qualcomm support for v3.1.0 controllers

* tag 'soundwire-6.19-rc1_updated' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: intel_ace2x: handle multi BPT sections
  soundwire: pass sdw_bpt_section to cdns BPT helpers
  soundwire: introduce BPT section
  soundwire: intel_ace2x: add fake frame to BRA read command
  soundwire: cadence_master: add fake_size parameter to sdw_cdns_prepare_read_dma_buffer
  ASoC: SOF: Intel: export hda_sdw_bpt_get_buf_size_aligment
  soundwire: cadence: export sdw_cdns_bpt_find_bandwidth
  soundwire: cadence_master: set data_per_frame as frame capability
  soundwire: only compute BPT stream in sdw_compute_dp0_port_params
  soundwire: cadence_master: make frame index trace more readable
  soundwire: qcom: adding support for v3.1.0
  dt-bindings: soundwire: qcom: Document v3.1.0 version of IP block
  soundwire: qcom: prepare for v3.x
  soundwire: qcom: deprecate qcom,din/out-ports
  dt-bindings: soundwire: qcom: deprecate qcom,din/out-ports
  soundwire: qcom: remove unused rd_fifo_depth
  of: base: Add of_property_read_u8_index

Merge tag 'sound-fix-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"The only slightly large change is the enablement of CIX HD-audio
  controller, which took a bit time to be cooked up, while most of other
  changes are device-specific small trivial fixes:

   - Default disablement of the kconfig for decades old pre-release
     alsa-lib PCM API; it's only the default config value change, so it
     can't lead to any regressions for the existing setups

   - Support for CIX HD-audio controller

   - A few ASoC ACP fixes

   - Fixes for ASoC cirrus, bcm, wcd, qcom, ak platforms

   - Trivial hardening for FireWire and USB-audio

   - HD-audio Intel binding fix and quirks"

* tag 'sound-fix-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (30 commits)
  ALSA: hda/tas2781: Add new quirk for HP new project
  ALSA: hda: cix-ipbloq: Use modern PM ops
  ALSA: hda: intel-dsp-config: Prefer legacy driver as fallback
  ASoC: amd: acp: update tdm channels for specific DAI
  ASoC: cs35l56: Fix incorrect select SND_SOC_CS35L56_CAL_SYSFS_COMMON
  ALSA: firewire-motu: add bounds check in put_user loop for DSP events
  ASoC: cs35l41: Always return 0 when a subsystem ID is found
  ALSA: uapi: Fix typo in asound.h comment
  ALSA: Do not build obsolete API
  ALSA: hda: add CIX IPBLOQ HDA controller support
  ALSA: hda/core: add addr_offset field for bus address translation
  ALSA: hda: dt-bindings: add CIX IPBLOQ HDA controller support
  ALSA: hda/realtek: Add support for ASUS UM3406GA
  ALSA: hda/realtek: Add support for HP Turbine Laptops
  ALSA: usb-audio: Initialize status1 to fix uninitialized symbol errors
  ALSA: firewire-motu: fix buffer overflow in hwdep read for DSP events
  ALSA: hda: cs35l41: Fix NULL pointer dereference in cs35l41_hda_read_acpi()
  ASoC: cros_ec_codec: Remove unnecessary selection of CRYPTO
  ASoc: qcom: q6afe: fix bad guard conversion
  ASoC: rockchip: Fix Wvoid-pointer-to-enum-cast warning (again)
  ...

Merge tag 'drm-misc-fixes-2025-12-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

drm-misc-fixes for v6.19-rc1:
- Fix stack usage warning in novatek-nt35560.
- Fix s/r, i2c issues in nouveau and update string handling.
- Ignore spurious PLL_UNLOCK bit in ti-sn65dsi83.
- Handle IS_ERR vs NULL in drm_plane_create_hotspot_properties().
- Fix devcoredump crash on reading evicted bo's.
- Fix bigendian handling in mgag200.
- Fix probe failure in tilcdc.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/6c371dc1-08bf-4a34-895c-9ef348b6061b@linux.intel.com

i3c: adi: Fix confusing cleanup.h syntax

Initializing automatic __free variables to NULL without need (e.g.
branches with different allocations), followed by actual allocation is
in contrary to explicit coding rules guiding cleanup.h:

"Given that the "__free(...) = NULL" pattern for variables defined at
the top of the function poses this potential interdependency problem the
recommendation is to always define and assign variables in one statement
and not group variable definitions at the top of the function when
__free() is used."

Code does not have a bug, but is less readable and uses discouraged
coding practice, so fix that by moving declaration to the place of
assignment.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://patch.msgid.link/20251208020750.4727-4-krzysztof.kozlowski@oss.qualcomm.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>

i3c: master: Fix confusing cleanup.h syntax

Initializing automatic __free variables to NULL without need (e.g.
branches with different allocations), followed by actual allocation is
in contrary to explicit coding rules guiding cleanup.h:

"Given that the "__free(...) = NULL" pattern for variables defined at
the top of the function poses this potential interdependency problem the
recommendation is to always define and assign variables in one statement
and not group variable definitions at the top of the function when
__free() is used."

Code does not have a bug, but is less readable and uses discouraged
coding practice, so fix that by moving declaration to the place of
assignment.

Not that other existing usage of __free() in this context is a corret
exception initialized to NULL, because the actual allocation is branched
in if().

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://patch.msgid.link/20251208020750.4727-3-krzysztof.kozlowski@oss.qualcomm.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>

i3c: master: cleanup callback .priv_xfers()

Remove the .priv_xfers() callback from the framework after all master
controller drivers have switched to use the new .i3c_xfers() callback.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
Tested-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com>
Link: https://patch.msgid.link/20251203-i3c_xfer_cleanup_master-v2-2-7dd94d04ee2d@nxp.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>

Merge tag 'loongarch-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch updates from Huacai Chen:

- Add basic LoongArch32 support

   Note: Build infrastructures of LoongArch32 are not enabled yet,
   because we need to adjust irqchip drivers and wait for GNU toolchain
   be upstream first.

- Select HAVE_ARCH_BITREVERSE in Kconfig

- Fix build and boot for CONFIG_RANDSTRUCT

- Correct the calculation logic of thread_count

- Some bug fixes and other small changes

* tag 'loongarch-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (22 commits)
  LoongArch: Adjust default config files for 32BIT/64BIT
  LoongArch: Adjust VDSO/VSYSCALL for 32BIT/64BIT
  LoongArch: Adjust misc routines for 32BIT/64BIT
  LoongArch: Adjust user accessors for 32BIT/64BIT
  LoongArch: Adjust system call for 32BIT/64BIT
  LoongArch: Adjust module loader for 32BIT/64BIT
  LoongArch: Adjust time routines for 32BIT/64BIT
  LoongArch: Adjust process management for 32BIT/64BIT
  LoongArch: Adjust memory management for 32BIT/64BIT
  LoongArch: Adjust boot & setup for 32BIT/64BIT
  LoongArch: Adjust common macro definitions for 32BIT/64BIT
  LoongArch: Add adaptive CSR accessors for 32BIT/64BIT
  LoongArch: Add atomic operations for 32BIT/64BIT
  LoongArch: Add new PCI ID for pci_fixup_vgadev()
  LoongArch: Add and use some macros for AVEC
  LoongArch: Correct the calculation logic of thread_count
  LoongArch: Use unsigned long for _end and _text
  LoongArch: Use __pmd()/__pte() for swap entry conversions
  LoongArch: Fix arch_dup_task_struct() for CONFIG_RANDSTRUCT
  LoongArch: Fix build errors for CONFIG_RANDSTRUCT
  ...

Merge tag 'libcrypto-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull crypto library fixes from Eric Biggers:
"Fixes for some recent regressions as well as some longstanding issues:

   - Fix incorrect output from the arm64 NEON implementation of GHASH

   - Merge the ksimd scopes in the arm64 XTS code to reduce stack usage

   - Roll up the BLAKE2b round loop on 32-bit kernels to greatly reduce
     code size and stack usage

   - Add missing RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS dependency

   - Fix chacha-riscv64-zvkb.S to not use frame pointer for data"

* tag 'libcrypto-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  crypto: arm64/ghash - Fix incorrect output from ghash-neon
  crypto/arm64: sm4/xts - Merge ksimd scopes to reduce stack bloat
  crypto/arm64: aes/xts - Use single ksimd scope to reduce stack bloat
  lib/crypto: blake2s: Replace manual unrolling with unrolled_full
  lib/crypto: blake2b: Roll up BLAKE2b round loop on 32-bit
  lib/crypto: riscv: Depend on RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS
  lib/crypto: riscv/chacha: Avoid s0/fp register

Merge tag 'block-6.19-20251211' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

- Always initialize DMA state, fixing a potentially nasty issue on the
   block side

- btrfs zoned write fix with cached zone reports

- Fix corruption issues in bcache with chained bio's, and further make
   it clear that the chained IO handler is simply a marker, it's not
   code meant to be executed

- Kill old code dealing with synchronous IO polling in the block layer,
   that has been dead for a long time. Only async polling is supported
   these days

- Fix a lockdep issue in tag_set management, moving it to RCU

- Fix an issue with ublks bio_vec iteration

- Don't unconditionally enforce blocking issue of ublk control
   commands, allow some of them with non-blocking issue as they
   do not block

* tag 'block-6.19-20251211' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  blk-mq-dma: always initialize dma state
  blk-mq: delete task running check in blk_hctx_poll()
  block: fix cached zone reports on devices with native zone append
  block: Use RCU in blk_mq_[un]quiesce_tagset() instead of set->tag_list_lock
  ublk: don't mutate struct bio_vec in iteration
  block: prohibit calls to bio_chain_endio
  bcache: fix improper use of bi_end_io
  ublk: allow non-blocking ctrl cmds in IO_URING_F_NONBLOCK issue

Merge tag 'io_uring-6.19-20251211' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring fix from Jens Axboe:
"Single fix for io_uring headed to stable, fixing an issue introduced
  with the min_wait support earlier this year, where SQPOLL didn't get
  correctly woken if an event arrived once the event waiting has
  finished the min_wait portion.

  As we already have regression tests for this added and people
  reporting new failures there, let's get this one flushed out
  so it can bubble back down to stable as well"

* tag 'io_uring-6.19-20251211' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring: fix min_wait wakeups for SQPOLL

Merge tag 'v6.19-rc-smb3-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:

- minor cleanup

- minor update to comment to avoid confusion about fs type

* tag 'v6.19-rc-smb3-server-fixes' of git://git.samba.org/ksmbd:
  smb/server: add comment to FileSystemName of FileFsAttributeInformation
  smb/server: remove unused nterr.h
  smb/server: rename include guard in smb_common.h

Merge tag 'v6.19-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

- Fix incorrect error code defines

- Add missing error code definitions

- Add parenthesis around NT_STATUS code defines to fix checkpatch
   warnings

- Remove some duplicated protocol definitions, moving to common code
   shared by client and server

- Add missing protocol documentation reference (for change notify)

- Correct struct definition (for duplicate_extents_to_file_ex)

* tag 'v6.19-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb/client: remove DeviceType Flags and Device Characteristics definitions
  smb: move File Attributes definitions into common/fscc.h
  smb: update struct duplicate_extents_to_file_ex
  smb: move file_notify_information to common/fscc.h
  smb: move SMB2 Notify Action Flags into common/smb2pdu.h
  smb: move notify completion filter flags into common/smb2pdu.h
  smb/client: add parentheses to NT error code definitions containing bitwise OR operator
  smb: add documentation references for smb2 change notify definitions
  smb/client: add 4 NT error code definitions
  smb/client: fix NT_STATUS_UNABLE_TO_FREE_VM value
  smb/client: fix NT_STATUS_DEVICE_DOOR_OPEN value
  smb/client: fix NT_STATUS_NO_DATA_DETECTED value

Merge tag 'nfs-for-6.19-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
"Bugfixes:
   - Fix 'nlink' attribute update races when unlinking a file
   - Add missing initialisers for the directory verifier in various
     places
   - Don't regress the NFSv4 open state due to misordered racing replies
   - Ensure the NFSv4.x callback server uses the correct transport
     connection
   - Fix potential use-after-free races when shutting down the NFSv4.x
     callback server
   - Fix a pNFS layout commit crash
   - Assorted fixes to ensure correct propagation of mount options when
     the client crosses a filesystem boundary and triggers the VFS
     automount code
   - More localio fixes

  Features and cleanups:
   - Add initial support for basic directory delegations
   - SunRPC back channel code cleanups"

* tag 'nfs-for-6.19-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (24 commits)
  NFSv4: Handle NFS4ERR_NOTSUPP errors for directory delegations
  nfs/localio: remove 61 byte hole from needless ____cacheline_aligned
  nfs/localio: remove alignment size checking in nfs_is_local_dio_possible
  NFS: Fix up the automount fs_context to use the correct cred
  NFS: Fix inheritance of the block sizes when automounting
  NFS: Automounted filesystems should inherit ro,noexec,nodev,sync flags
  Revert "nfs: ignore SB_RDONLY when mounting nfs"
  Revert "nfs: clear SB_RDONLY before getting superblock"
  Revert "nfs: ignore SB_RDONLY when remounting nfs"
  NFS: Add a module option to disable directory delegations
  NFS: Shortcut lookup revalidations if we have a directory delegation
  NFS: Request a directory delegation during RENAME
  NFS: Request a directory delegation on ACCESS, CREATE, and UNLINK
  NFS: Add support for sending GDD_GETATTR
  NFSv4/pNFS: Clear NFS_INO_LAYOUTCOMMIT in pnfs_mark_layout_stateid_invalid
  NFSv4.1: protect destroying and nullifying bc_serv structure
  SUNRPC: new helper function for stopping backchannel server
  SUNRPC: cleanup common code in backchannel request
  NFSv4.1: pass transport for callback shutdown
  NFSv4: ensure the open stateid seqid doesn't go backwards
  ...

rseq: Always inline rseq_debug_syscall_return()

To get the full benefit of:

  eaa9088d568c ("rseq: Use static branch for syscall exit debug when GENERIC_IRQ_ENTRY=y")

clang needs an __always_inline instead of a plain inline qualifier:

$ for i in {1..10}; do taskset -c 4 perf5 bench syscall basic -l 100000000 | grep "ops/sec"; done

Before      After
ops/sec  15424491    15872221   +2.9%

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/20251205100753.4073221-1-edumazet@google.com

bug: Hush suggest-attribute=format for __warn_printf()

Recent additions to this function cause GCC 14.3.0 to get excited
(W=1) and suggest a missing attribute:

lib/bug.c: In function '__warn_printf':
lib/bug.c:187:25: error: function '__warn_printf' be a candidate for 'gnu_printf' format attribute [-Werror=suggest-attribute=format]
187 | vprintk(fmt, *args);
| ^~~~~~~

Disable the diagnostic locally, following the pattern used for stuff
like va_format().

Fixes: 5c47b7f3d1a9 ("bug: Add BUG_FORMAT_ARGS infrastructure")
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251207-warn-printf-gcc-v1-1-b597d612b94b@google.com

bug: Let report_bug_entry() provide the correct bugaddr

report_bug_entry() always provides zero for bugaddr but could easily
extract the correct address from the provided bug_entry. Just do that to
have proper warning messages.

E.g. adding an artificial:

  void foo(void) { WARN_ONCE(1, "bar"); }

function generates this warning message:

  WARNING: arch/s390/kernel/setup.c:1017 at 0x0, CPU#0: swapper/0/0
                                            ^^^

With the correct bug address this changes to:

  WARNING: arch/s390/kernel/setup.c:1017 at foo+0x1c/0x40, CPU#0: swapper/0/0
                                            ^^^^^^^^^^^^^

Fixes: 7d2c27a0ec5e ("bug: Add report_bug_entry()")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251208200658.3431511-1-hca@linux.ibm.com

Merge tag 'drm-intel-next-fixes-2025-12-12' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

drm/i915 fixes for v6.19-rc1:
- Fix format string truncation warning
- FIx runtime PM reference during fbdev BO creation

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patch.msgid.link/281309f78560bcceebac8d5c0511efe66baf641c@intel.com

perf/x86/intel: Fix NULL event dereference crash in handle_pmi_common()

handle_pmi_common() may observe an active bit set in cpuc->active_mask
while the corresponding cpuc->events[] entry has already been cleared,
which leads to a NULL pointer dereference.

This can happen when interrupt throttling stops all events in a group
while PEBS processing is still in progress. perf_event_overflow() can
trigger perf_event_throttle_group(), which stops the group and clears
the cpuc->events[] entry, but the active bit may still be set when
handle_pmi_common() iterates over the events.

The following recent fix:

7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss")

moved the cpuc->events[] clearing from x86_pmu_stop() to x86_pmu_del() and
relied on cpuc->active_mask/pebs_enabled checks. However,
handle_pmi_common() can still encounter a NULL cpuc->events[] entry
despite the active bit being set.

Add an explicit NULL check on the event pointer before using it,
to cover this legitimate scenario and avoid the NULL dereference crash.

Fixes: 7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss")
Reported-by: kitta <kitta@linux.alibaba.com>
Co-developed-by: kitta <kitta@linux.alibaba.com>
Signed-off-by: Evan Li <evan.li@linux.alibaba.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251212084943.2124787-1-evan.li@linux.alibaba.com
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220855

Merge tag 'amd-drm-fixes-6.19-2025-12-11' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-fixes-6.19-2025-12-11:

amdgpu:
- SI fix
- DC reduce stack usage
- HDMI fixes
- VCN 4.0.5 fix
- DP MST fix
- DC memory allocation fix

amdkfd:
- SVM fix
- Trap handler fix
- VGPR fixes for GC 11.5

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20251211195600.1641924-1-alexander.deucher@amd.com

Merge tag 'drm-misc-next-fixes-2025-12-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next-fixes for v6.19-rc1:
- Fix uaf in panthor.
- Revert 8 byte alignment constraint for pitch in dumb bo's.
- Fix DRM_MODE_FLAG_N.SYNC and !DRM_MODE_FLAG_P.SYNC handling renasas.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/a82c2a2a-314f-403b-85bf-9b3ee09b903c@linux.intel.com

ALSA: hda/tas2781: Add new quirk for HP new project

Add new vendor_id and subsystem_id in quirk for HP new project (NexusX).

Signed-off-by: Baojun Xu <baojun.xu@ti.com>
Link: https://patch.msgid.link/20251211092427.1648-1-baojun.xu@ti.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: hda: cix-ipbloq: Use modern PM ops

When building without CONFIG_PM_SLEEP, there are several warnings (or
errors with CONFIG_WERROR=y / W=e) from the cix-ipbloq driver:

  sound/hda/controllers/cix-ipbloq.c:378:12: error: 'cix_ipbloq_hda_runtime_resume' defined but not used [-Werror=unused-function]
    378 | static int cix_ipbloq_hda_runtime_resume(struct device *dev)
        |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  sound/hda/controllers/cix-ipbloq.c:362:12: error: 'cix_ipbloq_hda_runtime_suspend' defined but not used [-Werror=unused-function]
    362 | static int cix_ipbloq_hda_runtime_suspend(struct device *dev)
        |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  sound/hda/controllers/cix-ipbloq.c:349:12: error: 'cix_ipbloq_hda_resume' defined but not used [-Werror=unused-function]
    349 | static int cix_ipbloq_hda_resume(struct device *dev)
        |            ^~~~~~~~~~~~~~~~~~~~~
  sound/hda/controllers/cix-ipbloq.c:336:12: error: 'cix_ipbloq_hda_suspend' defined but not used [-Werror=unused-function]
    336 | static int cix_ipbloq_hda_suspend(struct device *dev)
        |            ^~~~~~~~~~~~~~~~~~~~~~

When CONFIG_PM and CONFIG_PM_SLEEP are unset, SET_SYSTEM_SLEEP_PM_OPS()
and SET_RUNTIME_PM_OPS() evaluate to nothing, so these functions appear
unused to the compiler in this configuration.

Use the modern SYSTEM_SLEEP_PM_OPS and RUNTIME_PM_OPS macros to resolve
these warnings, which is what they are intended to do. Additionally,
wrap &cix_ipbloq_hda_pm in pm_ptr() to ensure the compiler can drop the
entire structure when CONFIG_PM is unset.

Fixes: d91e9bd10125 ("ALSA: hda: add CIX IPBLOQ HDA controller support")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20251211-hda-cix-ipbloq-modern-pm-ops-v1-1-c7a5580af021@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>

Merge tag 'asoc-fix-v6.19-merge-window' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.19

A small pile of fixes that came in during the merge window, it's all
fairly standard device specific stuff.

smb/client: remove DeviceType Flags and Device Characteristics definitions

These definitions are already in common/smb2pdu.h, so remove the duplicated
ones from the client.

Co-developed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: move File Attributes definitions into common/fscc.h

These definitions are specified in MS-FSCC 2.6, so move them into fscc.h.

Modify the following places:

  - FILE_ATTRIBUTE__MASK -> FILE_ATTRIBUTE_MASK
  - Update FILE_ATTRIBUTE_MASK value
  - cpu_to_le32(constant) -> cpu_to_le32(MACRO DEFINITION)

Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: update struct duplicate_extents_to_file_ex

Add the missing field to the structure (see MS-FSCC 2.3.9.2), and correct
the section number in the documentation reference.

Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>

Merge tag 'for-6.19/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mikulas Patocka:

- convert crypto_shash users to direct crypto library use with simpler
   and faster code and reduced stack usage (Eric Biggers):

     - the dm-verity SHA-256 conversion also teaches it to do two-way
       interleaved hashing for added performance

     - dm-crypt MD5 conversion (used for Loop-AES compatibility)

- added document for for takeover/reshape raid1 -> raid5 examples (Heinz Mauelshagen)

- fix dm-vdo kerneldoc warnings (Matthew Sakai)

- various random fixes and cleanups

* tag 'for-6.19/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (29 commits)
  dm pcache: fix segment info indexing
  dm pcache: fix cache info indexing
  dm-pcache: advance slot index before writing slot
  dm raid: add documentation for takeover/reshape raid1 -> raid5 table line examples
  dm log-writes: Add missing set_freezable() for freezable kthread
  dm-raid: fix possible NULL dereference with undefined raid type
  dm-snapshot: fix 'scheduling while atomic' on real-time kernels
  dm: ignore discard return value
  MAINTAINERS: add Benjamin Marzinski as a device mapper maintainer
  dm-mpath: Simplify the setup_scsi_dh code
  dm vdo: fix kerneldoc warnings
  dm-bufio: align write boundary on physical block size
  dm-crypt: enable DM_TARGET_ATOMIC_WRITES
  dm: test for REQ_ATOMIC in dm_accept_partial_bio()
  dm-verity: remove useless mempool
  dm-verity: disable recursive forward error correction
  dm-ebs: Mark full buffer dirty even on partial write
  dm mpath: enable DM_TARGET_ATOMIC_WRITES
  dm verity fec: Expose corrected block count via status
  dm: Don't warn if IMA_DISABLE_HTABLE is not enabled
  ...

Merge tag 'spi-fix-v6.19-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
"A few small fixes for SPI that came in during the merge window,
  nothing too exciting here"

* tag 'spi-fix-v6.19-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: microchip-core: Fix an error handling path in mchp_corespi_probe()
  spi: cadence-qspi: Fix runtime PM imbalance in probe

Merge tag 'regulator-fix-v6.19-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
"A few fixes that came in during the merge window, nothing too
  exciting - the one core fix improves error propagation from gpiolib
  which hopefully shouldn't actually happen but is safer"

* tag 'regulator-fix-v6.19-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: spacemit: Align input supply name with the DT binding
  regulator: fixed: Rely on the core freeing the enable GPIO
  regulator: check the return value of gpiod_set_value_cansleep()

mm: memfd_luo: add CONFIG_SHMEM dependency

The new memfd code fails to link without SHMEM:

aarch64-linux-ld: mm/memfd_luo.o: in function `memfd_luo_retrieve_folios':
memfd_luo.c:(.text.memfd_luo_retrieve_folios+0xdc): undefined reference to `shmem_add_to_page_cache'
memfd_luo.c:(.text.memfd_luo_retrieve_folios+0x11c): undefined reference to `shmem_inode_acct_blocks'
memfd_luo.c:(.text.memfd_luo_retrieve_folios+0x134): undefined reference to `shmem_recalc_inode'

Add a Kconfig dependency to disallow that configuration.

Link: https://lkml.kernel.org/r/20251204100203.1034394-1-arnd@kernel.org
Fixes: b3749f174d68 ("mm: memfd_luo: allow preserving memfd")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm: shmem: avoid build warning for CONFIG_SHMEM=n

The newly added 'flags' variable is unused and causes a warning if
CONFIG_SHMEM is disabled, since the shmem_acct_size() macro it is passed
into does nothing:

mm/shmem.c: In function '__shmem_file_setup':
mm/shmem.c:5816:23: error: unused variable 'flags' [-Werror=unused-variable]
5816 | unsigned long flags = (vm_flags & VM_NORESERVE) ? SHMEM_F_NORESERVE : 0;
| ^~~~~

Replace the two macros with equivalent inline functions to get the
argument checking.

Link: https://lkml.kernel.org/r/20251204102905.1048000-1-arnd@kernel.org
Fixes: 6ff1610ced56 ("mm: shmem: use SHMEM_F_* flags instead of VM_* flags")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: guoweikang <guoweikang.kernel@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ocfs2: fix memory leak in ocfs2_merge_rec_left()

In 'ocfs2_merge_rec_left()', do not reset 'left_path' to NULL after
move, thus allowing 'ocfs2_free_path()' to free it before return.

Link: https://lkml.kernel.org/r/20251205065159.392749-1-dmantipov@yandex.ru
Fixes: 677b975282e4 ("ocfs2: Add support for cross extent block")
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Reported-by: syzbot+cfc7cab3bb6eaa7c4de2@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=cfc7cab3bb6eaa7c4de2
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ocfs2: invalidate inode if i_mode is zero after block read

A panic occurs in ocfs2_unlink due to WARN_ON(inode->i_nlink == 0) when
handling a corrupted inode with i_mode=0 and i_nlink=0 in memory.

This "zombie" inode is created because ocfs2_read_locked_inode proceeds
even after ocfs2_validate_inode_block successfully validates a block that
structurally looks okay (passes checksum, signature etc.) but contains
semantically invalid data (specifically i_mode=0).  The current validation
function doesn't check for i_mode being zero.

This results in an in-memory inode with i_mode=0 being added to the VFS
cache, which later triggers the panic during unlink.

Prevent this by adding an explicit check for (i_mode == 0, i_nlink == 0,
non-orphan) within ocfs2_validate_inode_block.  If the check is true,
return -EFSCORRUPTED to signal corruption.  This causes the caller
(ocfs2_read_locked_inode) to invoke make_bad_inode(), correctly preventing
the zombie inode from entering the cache.

Link: https://lkml.kernel.org/r/20251202224507.53452-2-eraykrdg1@gmail.com
Co-developed-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com>
Signed-off-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com>
Signed-off-by: Ahmet Eray Karadag <eraykrdg1@gmail.com>
Reported-by: syzbot+55c40ae8a0e5f3659f2b@syzkaller.appspotmail.com
Fixes: https://syzkaller.appspot.com/bug?extid=55c40ae8a0e5f3659f2b
Link: https://lore.kernel.org/all/20251022222752.46758-2-eraykrdg1@gmail.com/T/
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: David Hunter <david.hunter.linux@gmail.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ocfs2: avoid -Wflex-array-member-not-at-end warning

-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.

Use the new TRAILING_OVERLAP() helper to fix the following warning:

fs/ocfs2/xattr.c:52:41: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

This helper creates a union between a flexible-array member (FAM) and a
set of MEMBERS that would otherwise follow it.

This overlays the trailing MEMBER struct ocfs2_extent_rec er; onto the FAM
struct ocfs2_xattr_value_root::xr_list.l_recs[], while keeping the FAM and
the start of MEMBER aligned.

The static_assert() ensures this alignment remains, and it's intentionally
placed inmediately after the related structure --no blank line in between.

Link: https://lkml.kernel.org/r/aRKm_7aN7Smc3J5L@kspp
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ocfs2: convert remaining read-only checks to ocfs2_emergency_state

Now that the centralized `ocfs2_emergency_state()` helper is available,
refactor remaining filesystem-wide checks for `ocfs2_is_soft_readonly` and
`ocfs2_is_hard_readonly` to use this new function.

To ensure strict consistency with the previous behavior and guarantee no
functional changes, the call sites continue to explicitly return -EROFS
when the emergency state is detected. This standardizes the check logic
while preserving the existing error handling flow.

Link: https://lkml.kernel.org/r/3421641b54ad6b6e4ffca052351b518eacc1bd08.1764728893.git.eraykrdg1@gmail.com
Co-developed-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com>
Signed-off-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com>
Signed-off-by: Ahmet Eray Karadag <eraykrdg1@gmail.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: David Hunter <david.hunter.linux@gmail.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ocfs2: add ocfs2_emergency_state helper and apply to setattr

Patch series "ocfs2: Refactor read-only checks to use
ocfs2_emergency_state", v4.

Following the fix for the `make_bad_inode` validation failure (syzbot ID:
b93b65ee321c97861072), this separate series introduces a new helper
function, `ocfs2_emergency_state()`, to improve and centralize read-only
and error state checking.

This is modeled after the `ext4_emergency_state()` pattern, providing a
single, unified location for checking all filesystem-level emergency
conditions.  This makes the code cleaner and ensures that any future
checks (e.g., for fatal error states) can be added in one place.

This series is structured as follows:

1.  The first patch introduces the `ocfs2_emergency_state()` helper
    (currently checking for -EROFS) and applies it to `ocfs2_setattr`
    to provide a "fail-fast" mechanism, as suggested by Albin
    Babu Varghese.
2.  The second patch completes the refactoring by converting all
    remaining read-only checks throughout OCFS2 to use this new helper.

This patch (of 2):

To centralize error checking, follow the pattern of other filesystems like
ext4 (which uses `ext4_emergency_state()`), and prepare for future
enhancements, this patch introduces a new helper function:
`ocfs2_emergency_state()`.

The purpose of this helper is to provide a single, unified location for
checking all filesystem-level emergency conditions.  In this initial
implementation, the function only checks for the existing hard and soft
read-only modes, returning -EROFS if either is set.

This provides a foundation where future checks (e.g., for fatal error
states returning -EIO, or shutdown states) can be easily added in one
place.

This patch also adds this new check to the beginning of `ocfs2_setattr()`.
This ensures that operations like `ftruncate` (which triggered the
original BUG) fail-fast with -EROFS when the filesystem is already in a
read-only state.

Link: https://lkml.kernel.org/r/cover.1764728893.git.eraykrdg1@gmail.com
Link: https://lkml.kernel.org/r/e9e975bcaaff8dbc155b70fbc1b2798a2e36e96f.1764728893.git.eraykrdg1@gmail.com
Co-developed-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com>
Signed-off-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com>
Signed-off-by: Ahmet Eray Karadag <eraykrdg1@gmail.com>
Suggested-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: David Hunter <david.hunter.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>