git.ipfire.org Git - thirdparty/linux.git/log

s390/ap: Restrict driver_override versus apmask and aqmask use

Introduce a restriction for the driver_override feature versus apmask
and aqmask:
- driver_override is only allowed when the apmask and aqmask values
both are default (=0xffff..ffff).
- apmask and aqmask modifications are only allowed when there is no
driver_override on any AP device active.
So in the end the user has to choose: either use apmask/aqmask to
divide the AP devices into host owned and vfio owned, or use the
driver_override feature, but do not mix these two approaches.
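
The two rules can be modelled in a few lines of user-space C. This is
only a sketch of the policy described above, not the AP bus code: the
256 bit mask width, the type ap_mask_t and all function names are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed mask width for this sketch: 256 bits, stored as 32 bytes. */
#define MASK_BYTES 32

typedef struct {
	unsigned char bits[MASK_BYTES];
} ap_mask_t;

/* Set every bit, i.e. the default value 0xffff..ffff. */
void mask_set_default(ap_mask_t *m)
{
	assert(m != NULL);
	memset(m->bits, 0xff, sizeof(m->bits));
}

/* True if the mask still has its default (all bits one) value. */
bool mask_is_default(const ap_mask_t *m)
{
	for (size_t i = 0; i < sizeof(m->bits); i++)
		if (m->bits[i] != 0xff)
			return false;
	return true;
}

/* Rule 1: driver_override requires default apmask and aqmask. */
bool driver_override_allowed(const ap_mask_t *apm, const ap_mask_t *aqm)
{
	return mask_is_default(apm) && mask_is_default(aqm);
}

/* Rule 2: mask updates require that no device has an override set. */
bool mask_update_allowed(unsigned int active_overrides)
{
	return active_overrides == 0;
}
```

In this model, clearing a single bit in either mask is enough to make
driver_override_allowed() return false.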

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/ap: Rename mutex ap_perms_mutex to ap_attr_mutex

The mutex ap_perms_mutex was already used not only for protection
of the struct ap_perms ap_perms variable but also for a consistent
update of the AP bus sysfs attributes apmask and aqmask.

So rename this mutex to ap_attr_mutex which better reflects the
current use. This is also a preparation for an upcoming patch which
will use this mutex to lock updates on a new sysfs attribute.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/ap: Support driver_override for AP queue devices

Add a new sysfs attribute driver_override to the AP queue's
directory. Writing a string to it overrides the default driver
determination and the drivers are matched against this string
instead. This overrules the driver binding determined by the
apmask/aqmask bitmask fields.

According to the common understanding of how the driver_override
behavior shall work, there is no further checking done, neither of
the string which is given as override driver nor of whether this
device is currently in use by an mdev device. Another patch may limit
this behavior to refuse a mixed usage of the driver_override and
apmask/aqmask feature.

As there exists some tooling for this kind of driver_override
(see package driverctl) the AP bus behavior for re-binding
should be compatible with this. The steps for a driver_override are:
1) unbind the current driver from the device. For example
    echo "17.0005" > /sys/devices/ap/card17/17.0005/driver/unbind
2) set the new driver for this device in the sysfs
    driver_override attribute. For example
    echo "vfio_ap" > /sys/devices/ap/card17/17.0005/driver_override
3) trigger a bus reprobe of this device. For example
    echo "17.0005" > /sys/bus/ap/drivers_probe
With the driverctl package this is more comfortable and
the settings get persisted:
  driverctl -b ap set-override 17.0005 vfio_ap
and unset with
  driverctl -b ap unset-override 17.0005
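
For tooling written in C, the three manual steps could be wrapped
roughly as follows. This is a hedged sketch: the function names are
made up, the sysfs layout is taken from the example paths above, and
error handling is kept minimal.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the sysfs path for one attribute of an AP queue device.
 * queue is the device name as used above, e.g. "17.0005"; the card
 * directory is derived from the part before the dot.
 * Returns 0 on success, -1 on a malformed name or a too small buffer.
 */
int ap_queue_attr_path(char *buf, size_t len, const char *queue,
		       const char *attr)
{
	const char *dot;
	int n;

	assert(buf != NULL && queue != NULL && attr != NULL);
	dot = strchr(queue, '.');
	if (dot == NULL || dot == queue)
		return -1;
	n = snprintf(buf, len, "/sys/devices/ap/card%.*s/%s/%s",
		     (int)(dot - queue), queue, queue, attr);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write a string to a sysfs file. Returns 0 on success, -1 on error. */
int sysfs_write(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int rc;

	if (f == NULL)
		return -1;
	rc = (fputs(value, f) < 0) ? -1 : 0;
	if (fclose(f) != 0)
		rc = -1;
	return rc;
}

/* Run the three steps: unbind, set driver_override, reprobe. */
int ap_override_driver(const char *queue, const char *driver)
{
	char path[128];

	if (ap_queue_attr_path(path, sizeof(path), queue, "driver/unbind"))
		return -1;
	if (sysfs_write(path, queue))
		return -1;
	if (ap_queue_attr_path(path, sizeof(path), queue, "driver_override"))
		return -1;
	if (sysfs_write(path, driver))
		return -1;
	return sysfs_write("/sys/bus/ap/drivers_probe", queue);
}
```

Calling ap_override_driver("17.0005", "vfio_ap") needs root privileges
and a system where that queue exists. If no driver is currently bound,
the unbind file does not exist and this sketch gives up at step 1.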

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/ap: Use all-bits-one apmask/aqmask for vfio in_use() checks

For the in_use() check of an updated apmask the host's aqmask
was provided to the vfio function. Similarly, on an update of the
aqmask the host's apmask was provided to the vfio in_use()
function. This led to false results on the check for apmask or
aqmask updates. For example, with only one APQN, an attempt to
re-assign exactly this card back to the host did not make the
in_use() check complain.

The correct behavior is achieved by providing a full mask
for aqmask when an adapter is to be checked and, similarly, a full
mask for apmask when a domain is to be checked for usage.
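
Why the second mask matters can be shown with a toy model. The code
below is not the vfio_ap implementation: it assumes that an in_use()
style check reports a hit when a used APQN lies in the cross product of
the two masks it is given, and all names in it are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define AP_BITS 256	/* assumed number of adapters and of domains */

struct ap_mask {
	bool bit[AP_BITS];
};

struct apqn {
	unsigned int card;
	unsigned int domain;
};

/* Set or clear every bit of a mask. */
void mask_fill(struct ap_mask *m, bool value)
{
	for (size_t i = 0; i < AP_BITS; i++)
		m->bit[i] = value;
}

/*
 * Toy stand-in for an in_use() callback: report whether any APQN
 * from the used list lies in the cross product of apm and aqm.
 */
bool demo_in_use(const struct ap_mask *apm, const struct ap_mask *aqm,
		 const struct apqn *used, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(used[i].card < AP_BITS && used[i].domain < AP_BITS);
		if (apm->bit[used[i].card] && aqm->bit[used[i].domain])
			return true;
	}
	return false;
}
```

With a used APQN (0x17, 5) and an adapter mask that contains only card
0x17, the model reports a hit for an all-bits-one domain mask, but
misses it as soon as the domain mask lacks bit 5.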

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/debug: Update description of resize operation

With commit 1204777867e8 ("s390/debug: keep debug data on resize")
the behavior of a debug area resize operation was changed. Update the
associated documentation to reflect this change.

Fixes: 1204777867e8 ("s390/debug: keep debug data on resize")
Reported-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

Merge branch 'compat-removal'

Heiko Carstens says:

====================

Remove s390 compat support to allow for code simplification and especially
reduced test effort. To the best of our knowledge there aren't any 31 bit
binaries out in the world anymore that would matter for newer kernels or
newer distributions.

Distributions have not provided compat packages for quite some time or even
have CONFIG_COMPAT disabled.

Instead of adding deprecation warnings to config options, or adding kernel
messages, just remove the code. Deprecation warnings haven't proven to be
useful. If it turns out there is still a reason to keep the compat support
this series can be reverted at any time in the future.

====================

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/syscalls: Switch to generic system call table generation

The s390 syscall.tbl format differs slightly from most others, and
therefore requires an s390 specific system call table generation
script.

With compat support gone, use the opportunity to switch to generic
system call table generation. The ABI for all 64 bit system calls is
now common, since there is no need to specify if system call entry
points are only for 64 bit anymore.

Furthermore create the system call table in C instead of assembler
code in order to get type checking for all system call functions
contained within the table.

Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/syscalls: Remove system call table pointer from thread_struct

With compat support gone there is only one system call table
left. Therefore remove the sys_call_table pointer from
thread_struct and use the sys_call_table directly.

Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/uapi: Remove 31 bit support from uapi header files

Since the kernel does not support running 31 bit / compat binaries
anymore, remove also the corresponding 31 bit support from uapi header
files.

Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390: Remove compat support

There shouldn't be any 31 bit code around anymore that matters.
Remove the compat layer support required to run 31 bit code.

Reason for removal is code simplification and reduced test effort.

Note that this comes without any deprecation warnings added to config
options, or kernel messages, since most likely those would be ignored
anyway.

If it turns out there is still a reason to keep the compat layer this
can be reverted at any time in the future.

Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

tools: Remove s390 compat support

Remove s390 compat support from everything within tools, since s390 compat
support will be removed from the kernel.

Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Weißschuh <linux@weissschuh.net> # tools/nolibc selftests/nolibc
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net> # selftests/vDSO
Acked-by: Alexei Starovoitov <ast@kernel.org> # bpf bits
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/syscalls: Add pt_regs parameter to SYSCALL_DEFINE0() syscall wrapper

All system call wrappers should match the sys_call_ptr_t type. This is not
the case for system calls without parameters. Add the missing pt_regs
parameter there too.

Note: this is currently not a problem, since the parameter is unused.
However, it prevents the creation of a correctly typed system call
table in C. With the current assembler implementation this works
because of missing type checking.
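
A reduced user-space model shows why a common signature is needed once
the table is written in C. The struct layout, the demo_ function names
and the use of gprs[2] for the first argument are assumptions made for
this sketch, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's register frame. */
struct pt_regs {
	unsigned long gprs[16];
};

/* Every table entry must have exactly this type. */
typedef long (*sys_call_ptr_t)(struct pt_regs *regs);

/* A system call with one argument reads it from the register frame. */
long demo_sys_double(struct pt_regs *regs)
{
	return 2 * (long)regs->gprs[2];
}

/* A system call without arguments still takes regs so the types match. */
long demo_sys_answer(struct pt_regs *regs)
{
	(void)regs;	/* unused, present only for the common signature */
	return 42;
}

/* Writing the table in C lets the compiler check each entry's type. */
const sys_call_ptr_t demo_sys_call_table[] = {
	[0] = demo_sys_answer,
	[1] = demo_sys_double,
};

/* Dispatch by system call number through the typed table. */
long demo_do_syscall(size_t nr, struct pt_regs *regs)
{
	assert(nr < sizeof(demo_sys_call_table) / sizeof(demo_sys_call_table[0]));
	return demo_sys_call_table[nr](regs);
}
```

If demo_sys_answer() were declared as long demo_sys_answer(void), the
initializer of the table would no longer match sys_call_ptr_t and the
compiler would be expected to complain.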

Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/kvm: Use psw32_t instead of psw_compat_t

kvm_s390_handle_lpsw() makes use of the psw_compat_t type even though
the code has nothing to do with CONFIG_COMPAT, for which the type is
supposed to be used. Use psw32_t instead.

Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/ptrace: Rename psw_t32 to psw32_t

Use a standard "_t" suffix for psw_t32 and rename it to psw32_t.

Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/fault: Print unmodified PSW address on protection exception

In case of a kernel crash caused by a protection exception, print the
unmodified PSW address as reported by the CPU. The protection exception
handler modifies the PSW address in order to keep fault handling easy,
however that leads to misleading call traces.

Therefore restore the original PSW address before printing it.

Before this change the output in case of a protection exception looks like
this:

Oops: 0004 ilc:2 [#1]SMP
Krnl PSW : 0704c00180000000 000003ffe0b40d78 (sysrq_handle_crash+0x28/0x40)
            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
...
Krnl Code: 000003ffe0b40d66: e3e0f0980024        stg     %r14,152(%r15)
            000003ffe0b40d6c: c010fffffff2        larl    %r1,000003ffe0b40d50
           #000003ffe0b40d72: c0200046b6bc        larl    %r2,000003ffe1417aea
           >000003ffe0b40d78: 92021000            mvi     0(%r1),2
            000003ffe0b40d7c: c0e5ffae03d6        brasl   %r14,000003ffe0101528

With this change it looks like this:

Oops: 0004 ilc:2 [#1]SMP
Krnl PSW : 0704c00180000000 000003ffe0b40dfc (sysrq_handle_crash+0x2c/0x40)
            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
...
Krnl Code: 000003ffe0b40dec: c010fffffff2        larl    %r1,000003ffe0b40dd0
            000003ffe0b40df2: c0200046b67c        larl    %r2,000003ffe1417aea
           *000003ffe0b40df8: 92021000            mvi     0(%r1),2
           >000003ffe0b40dfc: c0e5ffae03b6        brasl   %r14,000003ffe0101568
            000003ffe0b40e02: 0707                bcr     0,%r7

Note that with this change the PSW address points to the instruction after
the instruction which caused the exception, as is expected for
protection exceptions.

This also replaces the '#' marker in the disassembly with '*', which allows
distinguishing between new and old behavior.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/uprobes: Use __forward_psw() instead of private implementation

With adjust_psw_addr() the uprobes code contains more or less a private
__forward_psw() implementation. Switch it to use __forward_psw(), and
remove adjust_psw_addr().

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/processor: Add __forward_psw() helper

Similar to __rewind_psw() add the counterpart __forward_psw(). This
helps to make code more readable if a PSW address has to be forwarded,
since it is more natural to write

addr = __forward_psw(psw, ilen);

instead of

addr = __rewind_psw(psw, -ilen);

This also renames the ilc parameter of __rewind_psw() to ilen, since
the parameter reflects an instruction length, and not an instruction
length code. Also change the type of ilen from unsigned long to long
so it reflects that lengths can be negative or positive.
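
How the two helpers relate can be sketched in plain C. The following is
a user-space approximation, not the kernel source: the psw_t layout, the
values of the addressing mode bits and the demo_ prefixed names are
assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Addressing mode bits of the PSW mask; values are assumptions here. */
#define PSW_MASK_EA 0x0000000100000000ULL
#define PSW_MASK_BA 0x0000000080000000ULL

typedef struct {
	uint64_t mask;
	uint64_t addr;
} psw_t;

/* Address wrap-around mask for 64, 31 or 24 bit addressing mode. */
uint64_t psw_addr_mask(psw_t psw)
{
	if (psw.mask & PSW_MASK_EA)
		return UINT64_MAX;
	if (psw.mask & PSW_MASK_BA)
		return (1ULL << 31) - 1;
	return (1ULL << 24) - 1;
}

/* Move the PSW address back by ilen bytes; ilen may be negative. */
uint64_t demo_rewind_psw(psw_t psw, long ilen)
{
	assert(ilen % 2 == 0);	/* s390 instructions are 2, 4 or 6 bytes */
	return (psw.addr - (uint64_t)ilen) & psw_addr_mask(psw);
}

/* Move the PSW address forward: a rewind by the negated length. */
uint64_t demo_forward_psw(psw_t psw, long ilen)
{
	return demo_rewind_psw(psw, -ilen);
}
```

The signed ilen is what makes the one-line forward helper possible;
with an unsigned parameter the negation would have to be spelled out by
every caller.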

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/fpu: Fix false-positive kmsan report in fpu_vstl()

A false-positive kmsan report is detected when running the ping command.

The inline assembly instruction 'vstl' can write a varying number of
bytes depending on the value of the 'index' argument. If 'index' > 0,
'vstl' writes at least 2 bytes.

clang generates the kmsan write helper call depending on the inline
assembly constraints. Constraints are evaluated at compile time, but
the value of the 'index' argument is known only at runtime.

clang currently generates a call to __msan_instrument_asm_store with 1 byte
as size. Manually call the kmsan function to indicate the correct number of
bytes written and fix the false-positive report.
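
The size that has to be reported can be illustrated with a small
emulation in portable C. It assumes that 'vstl' stores index + 1 bytes,
capped at the 16 byte vector size; that cap and all names below are
assumptions of this sketch, and the real fix calls a kmsan helper
rather than memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VECTOR_BYTES 16

/* Number of bytes stored for a given index: index + 1, capped at 16. */
size_t vstl_store_size(uint32_t index)
{
	return index >= VECTOR_BYTES ? VECTOR_BYTES : (size_t)index + 1;
}

/*
 * Emulate the store: copy the leftmost bytes of the vector to dst and
 * return how many bytes were written. That return value is the size a
 * caller would have to report to the sanitizer.
 */
size_t emulated_vstl(const uint8_t vec[VECTOR_BYTES], uint32_t index,
		     void *dst)
{
	size_t size = vstl_store_size(index);

	assert(dst != NULL);
	memcpy(dst, vec, size);
	return size;
}
```

Reporting a fixed size of 1 byte, as the compiler generated call does,
would leave every byte after the first one marked as uninitialized in
this model whenever index is greater than zero.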

This change fixes the following kmsan reports:

[   36.563119] =====================================================
[   36.563594] BUG: KMSAN: uninit-value in virtqueue_add+0x35c6/0x7c70
[   36.563852]  virtqueue_add+0x35c6/0x7c70
[   36.564016]  virtqueue_add_outbuf+0xa0/0xb0
[   36.564266]  start_xmit+0x288c/0x4a20
[   36.564460]  dev_hard_start_xmit+0x302/0x900
[   36.564649]  sch_direct_xmit+0x340/0xea0
[   36.564894]  __dev_queue_xmit+0x2e94/0x59b0
[   36.565058]  neigh_resolve_output+0x936/0xb40
[   36.565278]  __neigh_update+0x2f66/0x3a60
[   36.565499]  neigh_update+0x52/0x60
[   36.565683]  arp_process+0x1588/0x2de0
[   36.565916]  NF_HOOK+0x1da/0x240
[   36.566087]  arp_rcv+0x3e4/0x6e0
[   36.566306]  __netif_receive_skb_list_core+0x1374/0x15a0
[   36.566527]  netif_receive_skb_list_internal+0x1116/0x17d0
[   36.566710]  napi_complete_done+0x376/0x740
[   36.566918]  virtnet_poll+0x1bae/0x2910
[   36.567130]  __napi_poll+0xf4/0x830
[   36.567294]  net_rx_action+0x97c/0x1ed0
[   36.567556]  handle_softirqs+0x306/0xe10
[   36.567731]  irq_exit_rcu+0x14c/0x2e0
[   36.567910]  do_io_irq+0xd4/0x120
[   36.568139]  io_int_handler+0xc2/0xe8
[   36.568299]  arch_cpu_idle+0xb0/0xc0
[   36.568540]  arch_cpu_idle+0x76/0xc0
[   36.568726]  default_idle_call+0x40/0x70
[   36.568953]  do_idle+0x1d6/0x390
[   36.569486]  cpu_startup_entry+0x9a/0xb0
[   36.569745]  rest_init+0x1ea/0x290
[   36.570029]  start_kernel+0x95e/0xb90
[   36.570348]  startup_continue+0x2e/0x40
[   36.570703]
[   36.570798] Uninit was created at:
[   36.571002]  kmem_cache_alloc_node_noprof+0x9e8/0x10e0
[   36.571261]  kmalloc_reserve+0x12a/0x470
[   36.571553]  __alloc_skb+0x310/0x860
[   36.571844]  __ip_append_data+0x483e/0x6a30
[   36.572170]  ip_append_data+0x11c/0x1e0
[   36.572477]  raw_sendmsg+0x1c8c/0x2180
[   36.572818]  inet_sendmsg+0xe6/0x190
[   36.573142]  __sys_sendto+0x55e/0x8e0
[   36.573392]  __s390x_sys_socketcall+0x19ae/0x2ba0
[   36.573571]  __do_syscall+0x12e/0x240
[   36.573823]  system_call+0x6e/0x90
[   36.573976]
[   36.574017] Byte 35 of 98 is uninitialized
[   36.574082] Memory access of size 98 starts at 0000000007aa0012
[   36.574218]
[   36.574325] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G    B            N  6.17.0-dirty #16 NONE
[   36.574541] Tainted: [B]=BAD_PAGE, [N]=TEST
[   36.574617] Hardware name: IBM 3931 A01 703 (KVM/Linux)
[   36.574755] =====================================================

[   63.532541] =====================================================
[   63.533639] BUG: KMSAN: uninit-value in virtqueue_add+0x35c6/0x7c70
[   63.533989]  virtqueue_add+0x35c6/0x7c70
[   63.534940]  virtqueue_add_outbuf+0xa0/0xb0
[   63.535861]  start_xmit+0x288c/0x4a20
[   63.536708]  dev_hard_start_xmit+0x302/0x900
[   63.537020]  sch_direct_xmit+0x340/0xea0
[   63.537997]  __dev_queue_xmit+0x2e94/0x59b0
[   63.538819]  neigh_resolve_output+0x936/0xb40
[   63.539793]  ip_finish_output2+0x1ee2/0x2200
[   63.540784]  __ip_finish_output+0x272/0x7a0
[   63.541765]  ip_finish_output+0x4e/0x5e0
[   63.542791]  ip_output+0x166/0x410
[   63.543771]  ip_push_pending_frames+0x1a2/0x470
[   63.544753]  raw_sendmsg+0x1f06/0x2180
[   63.545033]  inet_sendmsg+0xe6/0x190
[   63.546006]  __sys_sendto+0x55e/0x8e0
[   63.546859]  __s390x_sys_socketcall+0x19ae/0x2ba0
[   63.547730]  __do_syscall+0x12e/0x240
[   63.548019]  system_call+0x6e/0x90
[   63.548989]
[   63.549779] Uninit was created at:
[   63.550691]  kmem_cache_alloc_node_noprof+0x9e8/0x10e0
[   63.550975]  kmalloc_reserve+0x12a/0x470
[   63.551969]  __alloc_skb+0x310/0x860
[   63.552949]  __ip_append_data+0x483e/0x6a30
[   63.553902]  ip_append_data+0x11c/0x1e0
[   63.554912]  raw_sendmsg+0x1c8c/0x2180
[   63.556719]  inet_sendmsg+0xe6/0x190
[   63.557534]  __sys_sendto+0x55e/0x8e0
[   63.557875]  __s390x_sys_socketcall+0x19ae/0x2ba0
[   63.558869]  __do_syscall+0x12e/0x240
[   63.559832]  system_call+0x6e/0x90
[   63.560780]
[   63.560972] Byte 35 of 98 is uninitialized
[   63.561741] Memory access of size 98 starts at 0000000005704312
[   63.561950]
[   63.562824] CPU: 3 UID: 0 PID: 192 Comm: ping Tainted: G    B            N  6.17.0-dirty #16 NONE
[   63.563868] Tainted: [B]=BAD_PAGE, [N]=TEST
[   63.564751] Hardware name: IBM 3931 A01 703 (KVM/Linux)
[   63.564986] =====================================================

Fixes: dcd3e1de9d17 ("s390/checksum: provide csum_partial_copy_nocheck()")
Signed-off-by: Aleksei Nikiforov <aleksei.nikiforov@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai: Calculate size of reserved PAI extension control block area

The PAI extension 1 control block area is 512 bytes in total.
It currently contains three address pointers which refer to counter
memory blocks followed by a reserved area.
Calculate the reserved area instead of hardcoding its size. This
makes the code more readable and maintainable.
No functional change.
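
The idea can be sketched as follows. The structure and field names are
placeholders and the three pointers are modelled as 64 bit integers;
only the 512 byte total and the count of three pointers come from the
description above.

```c
#include <assert.h>
#include <stdint.h>

#define PAIEXT_CB_SIZE 512	/* total size of the control block */

struct demo_paiext_cb {
	uint64_t ptr0;	/* three 8 byte address fields; names are made up */
	uint64_t ptr1;
	uint64_t ptr2;
	/* Reserved area: whatever is left of the 512 bytes. */
	uint8_t reserved[PAIEXT_CB_SIZE - 3 * sizeof(uint64_t)];
};

/* Fail the build if the layout ever drifts from 512 bytes. */
static_assert(sizeof(struct demo_paiext_cb) == PAIEXT_CB_SIZE,
	      "control block must be 512 bytes");
```

If a fourth pointer is added later, only the multiplier in the array
size needs to change, and the compile-time check guards the total.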

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Suggested-by: Jan Polensky <japo@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/mm: Let dump_fault_info() print additional information

Let dump_fault_info() print additional information to make debugging
easier:

Print "FSI" if the access-exception-fetch/store-indication facility is
installed. If it is installed the TEID may also indicate if an exception
happened because of a fetch or a store operation.

Print "SOP", "ESOP-1", or "ESOP-2" depending on the type of the installed
Suppression-on-Protection facility. This also gives additional information
about the validity and meaning of the TEID bits.

The output is changed from something like:

Failing address: 0000000000000000 TEID: 0000000000000803

to

Failing address: 0000000000000000 TEID: 0000000000000803 ESOP-2 FSI
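
A user-space sketch of how such a line could be assembled is shown
below. The enum, the function names and the facility detection inputs
are hypothetical; only the output format follows the example line
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Type of the installed Suppression-on-Protection facility. */
enum sop_type { SOP_PLAIN, SOP_ESOP1, SOP_ESOP2 };

/* Map the facility type to the string that gets printed. */
const char *sop_name(enum sop_type type)
{
	switch (type) {
	case SOP_ESOP1:
		return "ESOP-1";
	case SOP_ESOP2:
		return "ESOP-2";
	default:
		return "SOP";
	}
}

/*
 * Format the fault information line into buf. " FSI" is appended only
 * if the fetch/store-indication facility is installed.
 * Returns the result of snprintf().
 */
int format_fault_info(char *buf, size_t len, unsigned long long addr,
		      unsigned long long teid, enum sop_type type, bool fsi)
{
	assert(buf != NULL);
	return snprintf(buf, len, "Failing address: %016llx TEID: %016llx %s%s",
			addr, teid, sop_name(type), fsi ? " FSI" : "");
}
```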

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/mm: Change comment and die() message if teid.b61 is zero

The comments in do_protection() give the impression that a TEID, where bit
61 is zero, indicates a low address protection exception. This is not
necessarily true: what it means depends on the type of
Suppression-on-Protection facility of the machine (see Principles of
Operation).

Rework the comments and the die() message to reflect this. This may also
help to avoid confusion.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/mm: Remove unused flush_tlb()

flush_tlb() exists for historic reasons and was never used. Remove it.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

Merge branch 'pai-pmu-merge'

Thomas Richter says:

====================

The PAI PMUs pai_crypto and pai_ext both operate on memory
mapped counters supported by z16 and follow on machines.
These memory mapped counters have a lot in common, like:
- validation, installing and removing events
- starting and stopping events
- retrieving counter values
- collecting sample data.

However both PMU drivers have slightly different parameters,
for example:
- different mapped memory size
- different number of supported counters
- different counter numbers and names
- different bits in the CR0 register
- different anchor address in lowcore

Due to these different parameters, two independent
PMUs have been developed. However both PMU drivers
have very much in common and most of the PMU call back
functions look very similar and are sometimes identical.

This patch set combines both independent PMU device drivers
perf_pai_crypto.c and perf_pai_ext.c into one device driver.
The new device driver operates on a table which contains
the different parameters and uses common functions for
event operations.

Result is one PAI PMU driver which supports both PMUs.
It is also extendable to support new PAI PMUs.

====================

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai: Rename perf_pai_crypto.c to perf_pai.c

Rename perf_pai_crypto.c to perf_pai.c. The new perf_pai.c
contains both PAI device drivers:
- pai_crypto for PAI crypto counter set
- pai_ext for PAI NNPA counter set
The rename reflects this common driver supporting both PMUs.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Merge pai_ext PMU into pai_crypto

Combine PAI cryptography and PAI extension (NNPA) PMUs in one driver.
Remove file perf_pai_ext.c and registration of PMU "pai_ext"
from perf_pai_crypto.c.

Includes:
- Shared alloc/free and sched_task handling
- NNPA events with exclude_kernel enforced, exclude_user rejected
- Setup CR0 bits for both PMUs

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Introduce PAI crypto specific event delete function

Introduce PAI crypto specific event delete function to handle
additional actions to be done at event removal.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Make pai_root per-PMU and unify naming

Prepare the common PAI PMU driver to handle multiple PMUs.

Convert pai_root into an array indexed by PAI_PMU_IDX(event)
so that per-CPU state becomes per-PMU. Adjust all call sites
accordingly. Rename KMSG_COMPONENT and the s390dbf buffer from
"pai_crypto" to "pai" for consistent naming.

No functional change intended beyond log identifiers.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Rename paicrypt_copy() to pai_copy()

To support one common PAI PMU device driver which handles
both PMUs pai_crypto and pai_ext, use a common naming scheme
for structures and variables suitable for both device drivers.
Rename paicrypt_copy() to pai_copy() to indicate its common usage.
No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Add common pai_del() function

To support one common PAI PMU device driver which handles
both PMUs pai_crypto and pai_ext, use a common naming scheme
for structures and variables suitable for both device drivers.
Add a common usable function pai_del() for the event on a CPU.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Add common pai_stop() function

To support one common PAI PMU device driver which handles
both PMUs pai_crypto and pai_ext, use a common naming scheme
for structures and variables suitable for both device drivers.

Add a common usable function pai_stop() for the event on a CPU.
Call this common pai_stop() from paicrypt_del().

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Add common pai_add() function

To support one common PAI PMU device driver which handles
both PMUs pai_crypto and pai_ext, use a common naming scheme
for structures and variables suitable for both device drivers.

Add a common usable function pai_add() for the event on a CPU.
Call this common pai_add() from paicrypt_add().

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Add common pai_start() function

To support one common PAI PMU device driver which handles
both PMUs pai_crypto and pai_ext, use a common naming scheme
for structures and variables suitable for both device drivers.

Add a common usable function pai_start() for the event on a CPU.
The function expects a PAI PMU specific read function as second
parameter to read out the start value for an event.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Add common pai_read() function

To support one common PAI PMU device driver which handles
both PMUs pai_crypto and pai_ext, use a common naming scheme
for structures and variables suitable for both device drivers.

Add a common usable function pai_read() to read counter values.
The function expects a PAI PMU specific read function as second
parameter.
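
The pattern of a common function that takes a PMU specific read
function as its second parameter can be sketched like this. The event
structure, the callback type and the delta accounting are assumptions
for illustration and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a perf event. */
struct demo_event {
	uint64_t prev;	/* raw counter value at the previous read */
	uint64_t count;	/* accumulated delta */
};

/* PMU specific callback type: returns the current raw counter value. */
typedef uint64_t (*demo_getctr_fn)(struct demo_event *event);

/* Common start: record the start value using the PMU's read function. */
void demo_pai_start(struct demo_event *event, demo_getctr_fn fct)
{
	assert(event != NULL && fct != NULL);
	event->prev = fct(event);
}

/* Common read: fetch the raw value via the callback and add the delta. */
void demo_pai_read(struct demo_event *event, demo_getctr_fn fct)
{
	uint64_t now;

	assert(event != NULL && fct != NULL);
	now = fct(event);
	event->count += now - event->prev;	/* unsigned math handles wrap */
	event->prev = now;
}

/* Fake raw counter and read function for one PMU, for demonstration. */
uint64_t demo_crypto_raw;

uint64_t demo_crypto_getctr(struct demo_event *event)
{
	(void)event;
	return demo_crypto_raw;
}
```

A second PMU would only have to supply its own read function; the
start and read logic stays shared.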

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Unify sample push logic and update context handling

Unify naming and logic for PAI PMU drivers to support both PMUs
pai_crypto and pai_ext.

Rename paicrypt_push_sample() to pai_push_sample() to reflect
its common usage. Add detailed comments about invocation context
and scheduler callbacks. Use struct pai_pmu to determine area_size
instead of PAGE_SIZE for counter backup.
Remove obsolete variable paicrypt_cnt.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Rename paicrypt_have_samples() to pai_have_samples()

To support one common PAI PMU device driver which handles
both PMUs pai_crypto and pai_ext, use a common naming scheme
for structures and variables suitable for both device drivers.
Rename paicrypt_have_samples() to pai_have_samples() to reflect
its common usage. No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Rename paicrypt_getctr() to pai_getctr()

To support one common PAI PMU device driver which handles
both PMUs pai_crypto and pai_ext, use a common naming scheme
for structures and variables suitable for both device drivers.

Rename paicrypt_getctr() to pai_getctr() to reflect its common
purpose. pai_getctr() now uses pai_pmu table to extract PAI PMU
characteristics such as kernel_offset inside the counter area page.
Also rename paicrypt_have_sample() to pai_have_sample().

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Rename paicrypt_getdata() to pai_getdata()

To support one common PAI PMU device driver which handles
both PMUs pai_crypto and pai_ext, use a common naming scheme
for structures and variables suitable for both device drivers.
Rename paicrypt_getdata() to pai_getdata(). Use the PAI PMU
characteristics in the pai_pmu table to determine the number
of counters to be extracted.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Rename some function for common usage.

To support one common PAI PMU device driver which handles
both PMUs pai_crypto and pai_ext, use a common naming scheme
for structures and variables suitable for both device drivers.
Rename functions
- paicrypt_free() -> pai_free()
- paicrypt_destroy_event() -> pai_destroy_event()
- paicrypt_destroy_event_cpu() -> pai_destroy_event_cpu()
to reflect their future common usage.
No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Introduce generic event init using pai_pmu[]

To support one common PAI PMU device driver which handles
both PMUs pai_crypto and pai_ext, use a common naming scheme
for structures and variables suitable for both device drivers.

Rework PAI crypto event initialization. Add a common
function for event initialization. It uses the PAI characteristics
stored in the pai_pmu table instead of hardcoded values.
Enlarge pai_event_valid() to check all event validation aspects.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Add PAI crypto characteristics table for parameters

Create and add a PMU characteristics table to store the parameters
of the PAI crypto PMU. This table contains PMU details such as
- number of available counters
- name of these counters to export to /sysfs
- Size of the memory mapped counter area
- base number of first counter
- etc

Also define a PMU specific initialization function to be called when
a PAI PMU feature is supported. At device driver initialization
test these features and if available use instruction qpaci to
retrieve the number of available counters. Also export these counter
names to /sysfs and register this PMU.
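
Such a table might look roughly like the sketch below. The structure
layout, the index names and every numeric value are placeholders chosen
for illustration; they are not the parameters used by the driver or the
hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_pai_idx { DEMO_PAI_CRYPTO, DEMO_PAI_EXT, DEMO_PAI_MAX };

struct demo_pai_pmu {
	const char *name;	/* PMU name */
	size_t area_size;	/* size of the memory mapped counter area */
	unsigned int base;	/* number of the first counter */
	unsigned int num_avail;	/* counters available on this machine */
};

/* All values below are placeholders, not real hardware parameters. */
struct demo_pai_pmu demo_pai_pmu[DEMO_PAI_MAX] = {
	[DEMO_PAI_CRYPTO] = {
		.name = "pai_crypto", .area_size = 4096,
		.base = 0x1000, .num_avail = 8,
	},
	[DEMO_PAI_EXT] = {
		.name = "pai_ext", .area_size = 512,
		.base = 0x1800, .num_avail = 4,
	},
};

/* Common check driven by the table instead of hardcoded values. */
bool demo_pai_event_valid(enum demo_pai_idx idx, unsigned int event_nr)
{
	const struct demo_pai_pmu *pmu;

	assert(idx < DEMO_PAI_MAX);
	pmu = &demo_pai_pmu[idx];
	return event_nr >= pmu->base &&
	       event_nr < pmu->base + pmu->num_avail;
}
```

In a model like this, num_avail would be filled in at initialization
time once the number of available counters is known.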

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Rename paicrypt_root_alloc() and paicrypt_root_free()

To support one common PAI PMU device driver which handles
both PMUs pai_crypto and pai_ext, use a common naming scheme
for structures and variables suitable for both device drivers.
Rename functions paicrypt_root_alloc() and paicrypt_root_free()
to pai_root_alloc() and pai_root_free().
No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Rename structure paicrypt_root

To support one common PAI PMU device driver which handles
both PMUs pai_crypto and pai_ext, use a common naming scheme
for structures and variables suitable for both device drivers.
Rename structure paicrypt_root to pai_root.
No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Rename structure paicrypt_map to pai_map

To support one common PAI PMU device driver which handles
both PMUs pai_crypto and pai_ext, use a common naming scheme
for structures and variables suitable for both device drivers.
Rename structure paicrypt_map to pai_map.
No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Rename structure paicrypt_mapptr to pai_mapptr

To support one common PAI PMU device driver which handles
both PMUs pai_crypto and pai_ext, use a common naming scheme
for structures and variables suitable for both device drivers.
Rename structure paicrypt_mapptr to pai_mapptr.
No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Rename member paicrypt_map::page

Rename member page in struct paicrypt_map to area. This rename
creates consistent naming for both PMU drivers, paicrypto and
paiext. No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Rename variable cfm_dbg

The global variable cfm_dbg points to the s390dbf debug buffer.
Rename it to paidbg to better reflect its purpose.
No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/sclp_mem: Consider global memory_hotplug.memmap_on_memory setting

When the global kernel command line parameter
memory_hotplug.memmap_on_memory is set to false, per-memory-block
memmap_on_memory setting can still be set to true. However, when
configuring a memory block, add_memory_resource() would configure it
without memmap_on_memory.

i.e.
Even if the MHP_MEMMAP_ON_MEMORY flag is set,
mhp_supports_memmap_on_memory() returns false unless the kernel command
line parameter "memory_hotplug.memmap_on_memory" is enabled. When both
the flag and the cmdline parameter are set, the memory block can be
configured with or without memmap_on_memory support.

To ensure consistent behavior, permit configuring per-memory-block
memmap_on_memory only when the memory_hotplug.memmap_on_memory kernel
command line parameter is enabled.

This is similar to commit 73954d379efd ("dax: add a sysfs knob to
control memmap_on_memory behavior")

Fixes: ff18dcb19aab ("s390/sclp: Add support for dynamic (de)configuration of memory")
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/hiperdispatch: Decrease steal time threshold

Higher steal time thresholds favor low utilization scenarios, which is not
the common case for s390. Set steal time threshold to a lower value to
prioritize vertical high and medium CPUs sooner and allow high utilization
scenarios to benefit from it.

Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Mete Durlu <meted@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/smp: Mark pcpu_delegate() and smp_call_ipl_cpu() as __noreturn

pcpu_delegate() never returns to its caller. If the target CPU is the
current CPU, it calls __pcpu_delegate(), whose delegate function is not
supposed to return. In any case, even if __pcpu_delegate() unexpectedly
returns, pcpu_delegate() sends SIGP_STOP to the current CPU and waits
in an infinite loop. Annotate pcpu_delegate() with the __noreturn
attribute to improve compiler optimizations.

Also annotate smp_call_ipl_cpu() accordingly since it always calls
pcpu_delegate().

[hca: Merge two patches from Thorsten Blum]

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/nmi: Annotate s390_handle_damage() with __noreturn

s390_handle_damage() ends by calling the non-returning function
disabled_wait() and therefore also never returns. Annotate it with the
__noreturn compiler attribute to improve compiler optimizations.

Remove the unreachable infinite while loop.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390: Fix double word in comments

Remove the repeated word "the" in comments.

Signed-off-by: Bo Liu <liubo03@inspur.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

Merge branch 'dat-enhancement-1'

Heiko Carstens says:

====================

Add the Dat-Enhancement facility 1 to the list of facilities which are
required to start the kernel. The facility provides the CSPG and IDTE
instructions. In particular the CSPG instruction can be used to replace a
valid page table entry with a different page table entry, which also
differs in the page frame real address.

Without the CSPG instruction it is possible to use the CSP instruction to
change valid page table entries, however it only allows to change the lower
or higher 32 bits of such entries, which means it cannot be used to change
the page frame real address of valid page table entries.

Given that there is code around (e.g. HugeTLB vmemmap optimization) which
requires to change valid page table entries of the kernel mapping, without
the detour over an invalid page table entry, make the CSPG instruction
unconditionally available.

The Dat-Enhancement facility 1 is available since z990, which is older than
the currently supported minimum architecture (z10). Therefore adding this
to the architecture level set shouldn't cause any problems.

====================

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/mm: Replace the CSP instruction with CSPG

The CSPG instruction is part of the Dat-Enhancement facility 1, which
is always available. Given that it can be used everywhere where also
the CSP instruction can be used, replace CSP with CSPG everywhere.

This allows to remove the csp() inline assembly. Also remove the
unused gmap_pmdp_csp() function.

Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/mm: Remove cpu_has_idte()

Remove cpu_has_idte(). The IDTE instruction is part of the
Dat-Enhancement facility 1, which is always available.
Therefore remove the helper and now superfluous code.

Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390: Add Dat-Enhancement facility 1 to architecture level set

Add the Dat-Enhancement facility 1 to the list of facilities which are
required to start the kernel. The facility provides the CSPG and IDTE
instructions. In particular the CSPG instruction can be used to replace a
valid page table entry with a different page table entry, which also
differs in the page frame real address.

Without the CSPG instruction it is possible to use the CSP instruction to
change valid page table entries, however it only allows to change the lower
or higher 32 bits of such entries, which means it cannot be used to change
the page frame real address of valid page table entries.

Given that there is code around (e.g. HugeTLB vmemmap optimization) which
requires to change valid page table entries of the kernel mapping, without
the detour over an invalid page table entry, make the CSPG instruction
unconditionally available.
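
As a rough userspace analogy (not the kernel's implementation), a 64-bit
compare-and-swap shows why a full-width swap is wanted here: the whole
entry, including its address part, is replaced in one step. The function
name replace_entry() and the sample values are hypothetical, and the TLB
handling that the real instruction performs is not modeled.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Analogy only: a C11 64-bit compare-and-swap stands in for a full-width
 * swap of a table entry. Nothing here touches real page tables or TLBs.
 */
static bool replace_entry(_Atomic uint64_t *entry, uint64_t old_val,
			  uint64_t new_val)
{
	/* Succeeds only if *entry still holds old_val; all 64 bits change at once. */
	return atomic_compare_exchange_strong(entry, &old_val, new_val);
}
```

A swap that is limited to one 32-bit half cannot express such an update
when the old and new values differ in both halves.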

The Dat-Enhancement facility 1 is available since z990, which is older than
the currently supported minimum architecture (z10). Therefore adding this
to the architecture level set shouldn't cause any problems.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/ap: Don't leak debug feature files if AP instructions are not available

If no AP instructions are available the AP bus module leaks registered
debug feature files. Change function call order to fix this.

Fixes: cccd85bfb7bf ("s390/zcrypt: Rework debug feature invocations.")
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/ptrace: Explicitly include <linux/typecheck.h>

The psw_bits() macro makes use of typecheck() although typecheck.h
is included. Add the missing include to avoid potential future compile
problems.

[hca@linux.ibm.com: change commit message]

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/ap: Expose ap_bindings_complete_count counter via sysfs

The AP bus udev event BINDINGS=complete is sent out the first time
all devices detected by the AP bus scan have been bound to device
drivers. This is the ideal time to, for example,
change the AP bus masks apmask and aqmask to re-establish a
persistent change on the decision about which cards/domains
should be available for the host and which should go into the
pool for kvm guests.

However, if exactly this initial udev event is sent out early
in the boot process a udev rule may not have been established
yet and thus this event will never be recognized. To provide
some indication of whether the AP bus bindings complete event has
already happened, the internal ap_bindings_complete_count
counter is exposed via sysfs with this patch.

Suggested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/smp: Fix fallback CPU detection

In case SCLP CPU detection does not work a fallback mechanism using SIGP is
in place. Since a cleanup this does not work correctly anymore: new CPUs
are only considered if their type matches the boot CPU.

Before the cleanup the information if a CPU type should be considered was
also part of a structure generated by the fallback mechanism and indicated
that a CPU type should not be considered when adding CPUs.

Since the rework a global SCLP state is used instead. If the global SCLP
state indicates that the CPU type should be considered and the fallback
mechanism is used, there may be a mismatch with CPU types if CPUs are
added. This can lead to a system with only a single CPU even though there
are many more CPUs.

Address this by simply copying the boot cpu type into the generated data
structure from the fallback mechanism.

Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Fixes: d08d94306e90 ("s390/smp: cleanup core vs. cpu in the SCLP interface")
Reviewed-by: Mete Durlu <meted@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pci: Highlight failure to enable PCI function

Emit an error log when a PCI function cannot be enabled for use, despite
being reported as configured to the system.

This brings to attention situations where functions might go missing
without notice. Going unnoticed is less likely when functions are added
to the system through hotplug, but will produce the same error log.

Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

Merge branch 'ap-bus-trace-events'

Harald Freudenberger says:

====================

Investigations related to runtime of crypto requests have revealed a lack of
performance or runtime information with crypto requests. There are the two
zcrypt ioctl trace events covering the entry and exit of an ioctl with
crypto requests giving the overall runtime within the kernel. However,
there is no way to figure out the time where a request is enqueued into the
AP bus queue but not pushed into the firmware queue. Then there is no
information about the runtime of a request during processing in the
firmware. And finally some info about pulling the reply from the firmware
and delivering it into user space is missing.

This series is aiming to provide a way to collect measurements which can be
used to cover this runtime information for each crypto request/reply.

====================

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/ap: Introduce new AP nqap and dqap trace events

Introduce two new AP bus related tracepoint events:
- There is a tracepoint s390_ap_nqap event immediately after a request
  has been pushed into the AP firmware queue with the NQAP AP command.
- The other tracepoint s390_ap_dqap event fires immediately after a
  reply has been pulled out of the AP firmware queue via DQAP AP
  command.
Both events are triggered unconditionally and may need filtering.
Filtering can be done based on the status value which is part of
the nqap and dqap trace. So for example a
  echo "!(status & 0x00ff0000)" >.../s390_ap_dqap/filter
filters out all trace events which have a response_code != 0
leaving just the successful nqap and dqap invocations.
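
The predicate used in the filter expression above can be sketched in
plain C. This is an illustration only; the helper names are hypothetical
and the mask is simply the constant from the echo command.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask from the filter expression above: the bits holding response_code. */
#define AP_STATUS_RC_MASK 0x00ff0000u

/* Extract the 8-bit response code from the 32-bit status value. */
static unsigned int ap_status_response_code(uint32_t status)
{
	return (status & AP_STATUS_RC_MASK) >> 16;
}

/* Same predicate as the filter "!(status & 0x00ff0000)". */
static bool ap_status_passes_filter(uint32_t status)
{
	return !(status & AP_STATUS_RC_MASK);
}
```

A status value with a response code of zero passes the filter; any
non-zero response code is dropped.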

The idea of these two trace events focuses on performance to measure
the runtime of a crypto request/reply as close as possible at the
firmware level. In combination with the two zcrypt tracepoints (see
the zcrypt.h trace event definition file) this gives measurement data
about the runtime of a request/reply within the zcrypt and AP bus
layer. However, with having the status of these AP commands in hand
also other usage may be possible.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/ap: Extend struct ap_queue_status with some convenience fields

Sometimes there is a different view of the AP status word
needed. So here is a slight rework of the struct ap_queue_status
to open up the possibility to have different ways of accessing
the AP status bits and fields.

The new struct ap_queue_status

struct ap_queue_status {
	union {
		unsigned int value : 32;
		struct {
			unsigned int status_bits : 8;
			unsigned int rc : 8;
			unsigned int : 16;
		};
		struct {
			unsigned int queue_empty : 1;
			unsigned int replies_waiting : 1;
			unsigned int queue_full : 1;
			unsigned int : 3;
			unsigned int async : 1;
			unsigned int irq_enabled : 1;
			unsigned int response_code : 8;
			unsigned int : 16;
		};
	};
};

comprises the old struct ap_queue_status but extends it
to have this also accessible as an unsigned int required
for example for a simple print or trace of the whole value.

Note that this rework is fully backward compatible to the
existing code exploiting the struct ap_queue_status.
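
For readers who only have the raw 32-bit value (for example from a trace),
the fields can also be decoded with explicit masks. The following is a
sketch under the assumption that bit-fields are allocated from the most
significant bit downwards, as on s390; the APQS_* names and struct
apqs_view are hypothetical and not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical masks, derived from the field order of the struct above
 * assuming most-significant-bit-first bit-field allocation. */
#define APQS_QUEUE_EMPTY     0x80000000u
#define APQS_REPLIES_WAITING 0x40000000u
#define APQS_QUEUE_FULL      0x20000000u
#define APQS_ASYNC           0x02000000u
#define APQS_IRQ_ENABLED     0x01000000u
#define APQS_RC_MASK         0x00ff0000u

/* Decoded view of the status word, for printing or comparison. */
struct apqs_view {
	bool queue_empty;
	bool replies_waiting;
	bool queue_full;
	bool async;
	bool irq_enabled;
	unsigned int response_code;
};

static struct apqs_view apqs_decode(uint32_t value)
{
	struct apqs_view v = {
		.queue_empty     = (value & APQS_QUEUE_EMPTY) != 0,
		.replies_waiting = (value & APQS_REPLIES_WAITING) != 0,
		.queue_full      = (value & APQS_QUEUE_FULL) != 0,
		.async           = (value & APQS_ASYNC) != 0,
		.irq_enabled     = (value & APQS_IRQ_ENABLED) != 0,
		.response_code   = (value & APQS_RC_MASK) >> 16,
	};
	return v;
}
```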

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Anthony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/zcrypt: Rework zcrypt request and reply trace event definition

This is a slight rework of the s390_zcrypt_req and s390_zcrypt_rep
trace event:
- the psmid has been added to the s390_zcrypt_rep
- "dev" renamed to "card"
- "domain" renamed to "dom"
The motivation of these changes is to make these traces more
aligned to new upcoming traces for AP bus related trace events.
Additionally the psmid is needed to match the reply (and thus
indirectly the request) to AP bus related trace events where only
the psmid uniquely identifies AP messages.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Anthony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/ptdump: Use seq_puts() in pt_dump_seq_puts() macro

The pt_dump_seq_puts() macro incorrectly uses seq_printf() instead of
seq_puts(). This is both a performance issue and conceptually wrong,
as the macro name suggests plain string output (puts) but the
implementation uses formatted output (printf).

The macro is used in dump_pagetables.c:67-68 and 131 to output
constant strings. Using seq_printf() adds unnecessary overhead for
format string parsing.

Signed-off-by: Josephine Pfeiffer <hi@josie.lol>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

Merge branch 'tape-block-sizes'

Jan Höppner says:

====================

The tape device driver is limited to a block size of 65535 bytes since a
single CCW can only transfer up to 64K-1 bytes (The count field is a
16bit value). This series introduces data chaining for all read/write
functions to support block sizes larger than 65535.

The tape device type 3490 (emulated) and 3590/3592 can handle up to
256K. [1]

[1] https://www.ibm.com/docs/en/zos/3.1.0?topic=blksize-system-determined-block-size

====================

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/tape: Add support for bigger block sizes

The tape device type 3590/3592 and emulated 3490 VTS can handle a block
size of up to 256K bytes. Currently the tape device driver is limited to
a block size of 65535 bytes (64K-1). This limitation stems from the
maximum of 65535 bytes of data that can be transferred with one
Channel-Command Word (CCW).

To work around this limitation data chaining is used which uses several
CCW to transfer an entire 256K block of data. A single CCW holds a
maximum of 65535 bytes of data.

Set MAX_BLOCKSIZE to 262144 (= 256K) to allow for data transfers with
larger block sizes. The read_block() and write_block() discipline
functions calculate the number of CCWs required based on the IDAL buffer
array size that was created for a given block size. If there is more
than one CCW required for the data transfer, the new helper function
tape_ccw_dc_idal() is used to build the data chain accordingly.
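
The arithmetic behind the data chain can be sketched as follows. This is
not the driver code: struct sketch_ccw is a simplified stand-in for a
CCW, SKETCH_FLAG_DC is an assumed value for the data-chaining flag, and
build_data_chain() is a hypothetical helper that only mirrors the idea
of the chain described above.

```c
#include <stddef.h>
#include <stdint.h>

#define CCW_MAX_BYTE_COUNT 65535u	/* 16-bit count field, per the text */
#define SKETCH_FLAG_DC     0x80u	/* assumed data-chaining flag value */

/* Simplified stand-in for a CCW; not the kernel's structure. */
struct sketch_ccw {
	uint8_t  cmd_code;
	uint8_t  flags;
	uint16_t count;
};

/* Number of CCWs needed to transfer 'size' bytes. */
static size_t ccws_needed(size_t size)
{
	return (size + CCW_MAX_BYTE_COUNT - 1) / CCW_MAX_BYTE_COUNT;
}

/*
 * Fill consecutive CCWs so that together they cover 'size' bytes. Every
 * CCW except the last carries the data-chaining flag. Returns the CCW
 * following the chain, or NULL if 'max' entries are not enough.
 */
static struct sketch_ccw *build_data_chain(struct sketch_ccw *ccw, size_t max,
					   uint8_t cmd, size_t size)
{
	size_t n = ccws_needed(size);

	if (n == 0 || n > max)
		return NULL;
	for (size_t i = 0; i < n; i++, ccw++) {
		size_t chunk = size > CCW_MAX_BYTE_COUNT ? CCW_MAX_BYTE_COUNT : size;

		ccw->cmd_code = cmd;	/* set on every CCW for illustration */
		ccw->flags = (i + 1 < n) ? SKETCH_FLAG_DC : 0;
		ccw->count = (uint16_t)chunk;
		size -= chunk;
	}
	return ccw;
}
```

With these numbers a 262144 byte block needs five CCWs: four carrying
65535 bytes each and a last one carrying the remaining 4 bytes.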

The Interruption-Response Block (irb) is added to the tape_request
struct so that the tapechar_read/write() functions can analyze what data
was read or written accordingly.

Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/tape: Introduce idal buffer array

The tape device driver uses a single idal_buffer for I/O. While the
buffer itself can be arbitrarily big, the limit for data transfer for a
single Channel-Command Word is at 65535 bytes (64K-1) since the count
field specifying the amount of data designated by the CCW is a 16-bit
unsigned value.

Provide functionality that allocates an array of multiple IDAL buffers
with the limitation mentioned above in mind.
A call to idal_buffer_array_alloc() allocates an array with a certain
amount of IDAL buffers which is determined based on the total size of
@size. Each individual buffer is limited to a size of CCW_MAX_BYTE_COUNT
(65535 bytes).

Add helper functions that determine the size (# of elements) and the
total data size covered by the array as well.

Current users of the single IDAL buffer are adapted to use the new
functions with one buffer to allocate.

The single IDAL buffer is removed from the tape_char_data struct.

Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/tape: Move idal allocation to core functions

Currently tapechar_check_idalbuffer() is part of tape_char.c and is used
to ensure the idal buffer is big enough for the requested I/O and
reallocates a new one if required. The same is done in tape_std.c when a
fixed block size is set using the mtsetblk command. This is essentially
duplicate code.

The allocation of the buffer that is required for I/O can be considered
core functionality. Move the idal buffer allocation to tape_core.c,
make it generally available, and reduce code duplication.

Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/tape: Fix return value of ccw helper functions

In contrast to all other helper functions used to build CCW chains,
tape_ccw_cc_idal() and tape_ccw_end_idal() return values using
post-increments, which results in returning the same CCW pointer.

Though, the intent of the CCW helper functions is to return the _next_
CCW in the chain, which can then be processed.

There is currently no actual issue, as tape_ccw_cc_idal() is not used
yet and tape_ccw_end_idal() is only used at the end of a chain.

Change both functions' return statements to ccw + 1 and bring them in line
with the other helper functions.
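
The difference between the two return styles can be shown with a small
stand-alone sketch. struct sketch_ccw and both helper names are
hypothetical and only illustrate the pointer semantics, not the actual
tape helpers.

```c
#include <stdint.h>

/* Simplified stand-in for a CCW; not the kernel's structure. */
struct sketch_ccw {
	uint8_t  cmd_code;
	uint8_t  flags;
	uint16_t count;
};

/* Post-increment pattern: the value returned is the pointer before the increment. */
static struct sketch_ccw *fill_ccw_postinc(struct sketch_ccw *ccw, uint8_t cmd)
{
	ccw->cmd_code = cmd;
	return ccw++;		/* returns the same CCW that was just filled */
}

/* Pattern after the fix: return the next CCW in the chain. */
static struct sketch_ccw *fill_ccw_next(struct sketch_ccw *ccw, uint8_t cmd)
{
	ccw->cmd_code = cmd;
	return ccw + 1;
}
```

A caller that chains helpers by feeding each return value into the next
call would overwrite the same CCW again with the post-increment variant.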

Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/tape: Remove extra CCW allocation for error recovery

The Read Opposite error recovery code required 2 extra CCWs to be
allocated in order to transform the request. As this error recovery code
for both 34xx and 3590 was removed the additional allocation isn't
required anymore. Reduce it to two.

Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/tape: Remove 3590 Read Opposite error recovery

On old native type 3590 tape devices a Read Opposite error recovery
procedure on Error Recovery Action Code (ERA) 26 was issued if a Read
Forward command failed. This recovery procedure was implemented with the
Read Backward command. This is no longer supported.

Remove 3590 ERA 26 and Read Backward related recovery code.

Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/tape: Remove 34xx Read Opposite error recovery

On old native type 3490 tape devices a Read Opposite error recovery
procedure on Error Recovery Action Code (ERA) 26 was issued if a Read
Forward command failed. This recovery procedure was implemented with the
Read Backward command.

As a preparation for a subsequent commit, that adds support for bigger
block sizes, remove the 34xx ERA 26 related recovery code. The recovery
code would need to be adapted to the bigger block sizes, without any
possibility to be tested, as modern Virtual Tape Servers (VTS) do
neither report ERA 26 on a Read Forward command failure nor support the
error recovery procedure anymore.

Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/tape: Remove count parameter from read/write_block functions

The count parameter of the read/write_block discipline functions was
never used. Remove it.

Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

Merge branch 'memory-hotplug'

Sumanth Korikkar says:

====================

Provide a new interface for dynamic configuration and
deconfiguration of hotplug memory on s390, allowing with/without
memmap_on_memory support. It is a follow up on the discussion with David
when introducing memmap_on_memory support for s390 and support dynamic
(de)configuration of memory:
https://lore.kernel.org/all/ee492da8-74b4-4a97-8b24-73e07257f01d@redhat.com/
https://lore.kernel.org/all/20241202082732.3959803-1-sumanthk@linux.ibm.com/

The original motivation for introducing memmap_on_memory on s390 was to
avoid using online memory to store struct pages metadata, particularly
for standby memory blocks. This became critical in cases where there was
an imbalance between standby and online memory, potentially leading to
boot failures due to insufficient memory for metadata allocation.

To address this, memmap_on_memory was utilized on s390. However, in its
current form, it adds struct pages metadata at the start of each memory
block at the time of addition (only standby memory), and this
configuration is static. It cannot be changed at runtime (for example,
when the user needs contiguous physical memory).

In order to provide more flexibility to the user and overcome the above
limitation, add an option to dynamically configure and deconfigure
hotpluggable memory block with/without memmap_on_memory.

With the new interface, s390 will not add all possible hotplug memory in
advance, like before, to make it visible in sysfs for online/offline
actions. Instead, before a memory block can be set online, it has to be
configured via a new interface in /sys/firmware/memory/memoryX/config,
which makes s390 similar to other architectures, i.e. adding hotpluggable
memory is controlled by the user instead of happening at boot time.

s390 kernel sysfs interface to configure/deconfigure memory with
memmap_on_memory (with upcoming lsmem changes):

* Initial memory layout:
lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
RANGE                 SIZE   STATE BLOCK CONFIGURED MEMMAP_ON_MEMORY
0x00000000-0x7fffffff   2G  online 0-15  yes        no
0x80000000-0xffffffff   2G offline 16-31 no         yes

* Configure memory
echo 1 > /sys/firmware/memory/memory16/config
lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
RANGE                  SIZE  STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
0x00000000-0x7fffffff    2G  online  0-15  yes        no
0x80000000-0x87ffffff  128M offline    16  yes        yes
0x88000000-0xffffffff  1.9G offline 17-31  no         yes

* Deconfigure memory
echo 0 > /sys/firmware/memory/memory16/config
lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
RANGE                 SIZE   STATE BLOCK CONFIGURED MEMMAP_ON_MEMORY
0x00000000-0x7fffffff   2G  online 0-15  yes        no
0x80000000-0xffffffff   2G offline 16-31 no         yes

* Enable memmap_on_memory and online it.
(Deconfigure first)
echo 0 > /sys/devices/system/memory/memory5/online
echo 0 > /sys/firmware/memory/memory5/config

lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
RANGE                  SIZE  STATE  BLOCK CONFIGURED MEMMAP_ON_MEMORY
0x00000000-0x27ffffff  640M  online 0-4   yes        no
0x28000000-0x2fffffff  128M offline 5     no         no
0x30000000-0x7fffffff  1.3G  online 6-15  yes        no
0x80000000-0xffffffff    2G offline 16-31 no         yes

(Enable memmap_on_memory and online it)
echo 1 > /sys/firmware/memory/memory5/memmap_on_memory
echo 1 > /sys/firmware/memory/memory5/config
echo 1 > /sys/devices/system/memory/memory5/online

lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
RANGE                  SIZE  STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
0x00000000-0x27ffffff  640M  online  0-4   yes        no
0x28000000-0x2fffffff  128M  online  5     yes        yes
0x30000000-0x7fffffff  1.3G  online  6-15  yes        no
0x80000000-0xffffffff    2G  offline 16-31 no         yes

* Disable memmap_on_memory and online it.
(Deconfigure first)
echo 0 > /sys/devices/system/memory/memory5/online
echo 0 > /sys/firmware/memory/memory5/config

lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
RANGE                  SIZE  STATE  BLOCK CONFIGURED MEMMAP_ON_MEMORY
0x00000000-0x27ffffff  640M  online 0-4   yes        no
0x28000000-0x2fffffff  128M offline 5     no         yes
0x30000000-0x7fffffff  1.3G  online 6-15  yes        no
0x80000000-0xffffffff    2G offline 16-31 no         yes

(Disable memmap_on_memory and online it)
echo 0 > /sys/firmware/memory/memory5/memmap_on_memory
echo 1 > /sys/firmware/memory/memory5/config
echo 1 > /sys/devices/system/memory/memory5/online

lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
RANGE                  SIZE  STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
0x00000000-0x7fffffff  2G    online  0-15  yes        no
0x80000000-0xffffffff  2G    offline 16-31 no         yes

* Userspace changes:
lsmem/chmem tool is also changed to use the new interface. I will send
it to util-linux soon.
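
The block numbers and address ranges in the lsmem listings above are
related by simple arithmetic. The following sketch assumes the 128M
block size shown in those listings; the helper names are hypothetical,
and only the sysfs path layout is taken from the text.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* 128M per memory block, as in the lsmem output above (assumption). */
#define SKETCH_BLOCK_SIZE (128ULL << 20)

/* First address covered by memory block 'id'. */
static uint64_t block_start(unsigned int id)
{
	return (uint64_t)id * SKETCH_BLOCK_SIZE;
}

/* Last address covered by memory block 'id'. */
static uint64_t block_end(unsigned int id)
{
	return block_start(id) + SKETCH_BLOCK_SIZE - 1;
}

/* Format the config attribute path of a block; returns snprintf()'s result. */
static int block_config_path(char *buf, size_t len, unsigned int id)
{
	return snprintf(buf, len, "/sys/firmware/memory/memory%u/config", id);
}
```

For block 16 this gives the range 0x80000000-0x87ffffff and for block 5
the range 0x28000000-0x2fffffff, matching the listings.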

Patch 1 adds support for removal of boot-allocated memory blocks.

Patch 2 provides option to dynamically configure and deconfigure memory
with/without memmap_on_memory.

Patch 3 removes MHP_OFFLINE_INACCESSIBLE from s390. The mhp flag was
used to mark memory as not accessible until memory hotplug online phase
begins.  However, with patch 2, it is no longer essential. Memory can be
brought to accessible state before adding memory, as the memory is added
during runtime now instead of boot time.

Patch 4 removes the MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers. They
are no longer needed.  Memory can be brought to accessible state before
adding memory now, with runtime (de)configuration of memory.

Note: The patches apply to the linux-next branch.

v3:
Thanks David
* Avoid goto label in create_standby_sclp_mems().
* Use unsigned long instead of u64.
* Add Acked-by.

v2:
Thanks David
* Rename struct mblock/mblock_arg with struct sclp_mem/sclp_mem_arg.
* Rename all mblocks/mblock references with sclp_mems/sclp_mem -
  structures, functions.
* Rename create_online_mblock() with create_configured_sclp_mem().
* Rename config_mblock_show()/config_mblock_store() with
  config_sclp_mem_show()/config_sclp_mem_store().
* Remove contains_standby_increment() and
  sclp_mem_notifier. sclp mem state change is performed when
  adding/removing memory. sclp memory notifier - no longer needed with
  this patchset.
* Recover sclp mem state when add_memory() fails.
* Refactor and add function init_sclp_mem().
* Use unsigned long instead of unsigned long long.
* Simplify and correct kobj handling. Thanks Heiko.

====================

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/con3270: Use scnprintf() instead of sprintf()

Use scnprintf() instead of sprintf() for those cases where the destination
is an array and the size of the array is known at compile time.

This prevents theoretical buffer overflows, but also avoids that people
again and again spend time to figure out if the code is actually safe.
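
The return value semantics that make scnprintf() convenient can be
sketched in userspace. sketch_scnprintf() below is a hypothetical
stand-in built on vsnprintf(), not the kernel function: it returns the
number of characters actually stored, never more than size - 1.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace stand-in for scnprintf(): returns the number of characters
 * actually written to buf (excluding the trailing NUL).
 */
static int sketch_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	if (size == 0)
		return 0;
	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);
	if (n < 0) {
		/* Encoding error: leave an empty string behind. */
		buf[0] = '\0';
		return 0;
	}
	/* vsnprintf() reports the untruncated length; clamp it. */
	return (size_t)n >= size ? (int)(size - 1) : n;
}
```

In contrast, snprintf() returns the length the output would have had
without truncation, which can exceed the buffer size.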

Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/tape: Use scnprintf() instead of sprintf()

Use scnprintf() instead of sprintf() for those cases where the destination
is an array and the size of the array is known at compile time.

This prevents theoretical buffer overflows, but also avoids that people
again and again spend time to figure out if the code is actually safe.

Reviewed-by: Jan Höppner <hoeppner@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/dcss: Use scnprintf() instead of sprintf()

Use scnprintf() instead of sprintf() for those cases where the destination
is an array and the size of the array is known at compile time.

This prevents theoretical buffer overflows, but also avoids that people
again and again spend time to figure out if the code is actually safe.

Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/cio: Use scnprintf() instead of sprintf()

Use scnprintf() instead of sprintf() for those cases where the destination
is an array and the size of the array is known at compile time.

This prevents theoretical buffer overflows, but also avoids that people
again and again spend time to figure out if the code is actually safe.

Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/early: Use scnprintf() instead of sprintf()

Use scnprintf() instead of sprintf() for those cases where the destination
is an array and the size of the array is known at compile time.

This prevents theoretical buffer overflows, but also avoids that people
again and again spend time to figure out if the code is actually safe.

Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/pai_crypto: Adjust paicrypt_copy() return statement

Adjust the return statement in paicrypt_copy() to the same statement
as in paiext_copy(). Use one common style. No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/sysinfo: Replace sprintf() with snprintf() for buffer safety

Replace sprintf() with snprintf() when formatting symlink target name
to prevent potential buffer overflow. The link_to buffer is only 10
bytes, and using snprintf() ensures proper bounds checking if the
topology nesting limit value is unexpectedly large.

Signed-off-by: Josephine Pfeiffer <hi@josie.lol>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/extmem: Replace sprintf() with snprintf() for buffer safety

Replace unsafe sprintf() calls with snprintf() in segment_save() to
prevent potential buffer overflows. The function builds command strings
by repeatedly appending to a fixed-size buffer, which could overflow if
segment ranges are numerous or values are large.
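
The general pattern of appending to a fixed-size buffer with bounds
checking can be sketched as follows. This is not the segment_save()
code: struct sketch_range, append_ranges() and the output format are
hypothetical and only show how the remaining space is tracked.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical range descriptor used only for this sketch. */
struct sketch_range {
	unsigned long start;
	unsigned long end;
};

/*
 * Append " START END" for each range to a fixed-size buffer. Returns the
 * resulting string length, or -1 if the output did not fit.
 */
static int append_ranges(char *buf, size_t size,
			 const struct sketch_range *ranges, size_t count)
{
	size_t off = 0;

	if (size == 0)
		return -1;
	buf[0] = '\0';
	for (size_t i = 0; i < count; i++) {
		/* Only the space that is still free is offered to snprintf(). */
		int n = snprintf(buf + off, size - off, " %lX %lX",
				 ranges[i].start, ranges[i].end);

		if (n < 0 || (size_t)n >= size - off)
			return -1;	/* truncated; buf is still NUL-terminated */
		off += (size_t)n;
	}
	return (int)off;
}
```

Checking the return value before advancing the offset keeps the offset
from moving past the end of the buffer when output is truncated.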

Signed-off-by: Josephine Pfeiffer <hi@josie.lol>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

s390/cmm: Replace sprintf() with scnprintf() for buffer safety

Replace sprintf() with scnprintf() in cmm_timeout_handler() to prevent
potential buffer overflow. The scnprintf() function ensures we don't
write beyond the buffer size and provides safer string formatting.

Signed-off-by: Josephine Pfeiffer <hi@josie.lol>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

Linux 6.18-rc2

Merge tag 'sched_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Borislav Petkov:

- Make sure the check for lost pelt idle time is done unconditionally
   to have correct lost idle time accounting

- Stop the deadline server task before a CPU goes offline

* tag 'sched_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix pelt lost idle time detection
  sched/deadline: Stop dl_server before CPU goes offline

Merge tag 'perf_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

- Make sure perf reporting works correctly in setups using
   overlayfs or FUSE

- Move the uprobe optimization to a better location logically

* tag 'perf_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix MMAP2 event device with backing files
  perf/core: Fix MMAP event path names with backing files
  perf/core: Fix address filter match with backing files
  uprobe: Move arch_uprobe_optimize right after handlers execution

Merge tag 'x86_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

- Reset the why-the-system-rebooted register on AMD to avoid stale bits
   remaining from previous boots

- Add a missing barrier in the TLB flushing code to prevent erroneously
   not flushing a TLB generation

- Make sure cpa_flush() does not overshoot when computing the end range
   of a flush region

- Fix resctrl bandwidth counting on AMD systems when the number of
   monitoring groups created exceeds the number the hardware can track

* tag 'x86_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/CPU/AMD: Prevent reset reasons from being retained across reboot
  x86/mm: Fix SMP ordering in switch_mm_irqs_off()
  x86/mm: Fix overflow in __cpa_addr()
  x86/resctrl: Fix miscount of bandwidth event when reactivating previously unavailable RMID

Merge tag 'rust-rustfmt' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux

Pull rustfmt fixes from Miguel Ojeda:
"Rust 'rustfmt' cleanup

  'rustfmt', by default, formats imports in a way that is prone to
  conflicts while merging and rebasing, since in some cases it condenses
  several items into the same line.

  Document in our guidelines that we will handle this for the moment
  with the trailing empty comment workaround and make the tree
  'rustfmt'-clean again"

* tag 'rust-rustfmt' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
  rust: bitmap: fix formatting
  rust: cpufreq: fix formatting
  rust: alloc: employ a trailing comment to keep vertical layout
  docs: rust: add section on imports formatting

Merge tag 'tpmdd-next-v6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull tpm fix from Jarkko Sakkinen:
"Correct the state transitions for ARM FF-A to match the spec and how
tpm_crb behaves on other platforms"

* tag 'tpmdd-next-v6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  tpm_crb: Add idle support for the Arm FF-A start method

Merge tag 'pci-v6.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci fixes from Bjorn Helgaas:

- Search for MSI Capability with correct ID to fix an MSI regression on
   platforms with Cadence IP (Hans Zhang)

- Revert early bridge resource set up to fix resource assignment
   failures that broke at least alpha boot and Snapdragon ath12k WiFi
   (Ilpo Järvinen)

- Implement VMD .irq_startup()/.irq_shutdown() to fix IRQ issues that
   caused boot crashes and broken devices below VMD (Inochi Amaoto)

- Select CONFIG_SCREEN_INFO on X86 to fix black screen on boot when
   SCREEN_INFO not selected (Mario Limonciello)

* tag 'pci-v6.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/VGA: Select SCREEN_INFO on X86
  PCI: vmd: Override irq_startup()/irq_shutdown() in vmd_init_dev_msi_info()
  PCI: Revert early bridge resource set up
  PCI: cadence: Search for MSI Capability with correct ID

Merge tag 'cxl-fixes-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull Compute Express Link fixes from Dave Jiang:
"A small collection of CXL fixes. In addition to some misc fixes for
  the CXL subsystem, a number of fixes for CXL extended linear cache
  support are included to make it functional again.

   - Avoid missing port component registers setup due to dport
     enumeration failure

   - Add check for no entries in cxl_feature_info to avoid accessing
     an invalid pointer

   - Use %pa printk format to emit resource_size_t in
     validate_region_offset()

  CXL extended linear cache support fixes:

   - Fix setup of memory resource in cxl_acpi_set_cache_size()

   - Set range param for region_res_match_cxl_range() as const
     (addresses a compile warning for match_region_by_range() fix)

   - Fix match_region_by_range() to use region_res_match_cxl_range()

   - Subtract to find an hpa_alias0 in cxl_poison events to correct the
     alias math calculation"

* tag 'cxl-fixes-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/trace: Subtract to find an hpa_alias0 in cxl_poison events
  cxl/region: Use %pa printk format to emit resource_size_t
  cxl: Fix match_region_by_range() to use region_res_match_cxl_range()
  cxl: Set range param for region_res_match_cxl_range() as const
  cxl/acpi: Fix setup of memory resource in cxl_acpi_set_cache_size()
  cxl/features: Add check for no entries in cxl_feature_info
  cxl/port: Avoid missing port component registers setup

Merge tag 'hid-for-linus-2025101701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:

- fix for sticky fingers handling in hid-multitouch (Benjamin
   Tissoires)

- fix for reporting of 0 battery levels (Dmitry Torokhov)

- build fix for hid-haptic in certain configurations (Jonathan Denose)

- improved probing and reduced kernel log spam in hid-nintendo
   (Vicki Pfau)

- fix for OOB in hid-cp2112 (Deepak Sharma)

- interrupt handling fix for intel-thc-hid (Even Xu)

- a couple of new device IDs and device-specific quirks

* tag 'hid-for-linus-2025101701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: logitech-hidpp: Add HIDPP_QUIRK_RESET_HI_RES_SCROLL
  selftests/hid: add tests for missing release on the Dell Synaptics
  HID: multitouch: fix sticky fingers
  HID: multitouch: fix name of Stylus input devices
  HID: hid-input: only ignore 0 battery events for digitizers
  HID: hid-debug: Fix spelling mistake "Rechargable" -> "Rechargeable"
  HID: Kconfig: Fix build error from CONFIG_HID_HAPTIC
  HID: nintendo: Rate limit IMU compensation message
  HID: nintendo: Wait longer for initial probe
  HID: core: Add printk_ratelimited variants to hid_warn() etc
  HID: quirks: Add ALWAYS_POLL quirk for VRS R295 steering wheel
  HID: quirks: avoid Cooler Master MM712 dongle wakeup bug
  HID: cp2112: Add parameter validation to data length
  HID: intel-thc-hid: intel-quickspi: Add ARL PCI Device Id's
  HID: intel-thc-hid: Intel-quickspi: switch first interrupt from level to edge detection
  HID: intel-thc-hid: intel-quicki2c: Fix wrong type casting

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:

- Replace bpf_map_kmalloc_node() with kmalloc_nolock() to fix kmemleak
   imbalance in tracking of bpf_async_cb structures (Alexei Starovoitov)

- Make selftests/bpf arg_parsing.c more robust to errors (Andrii
   Nakryiko)

- Fix redefinition of 'off' as different kind of symbol when I40E
   driver is builtin (Brahmajit Das)

- Do not disable preemption in bpf_test_run (Sahil Chandna)

- Fix memory leak in __lookup_instance error path (Shardul Bankar)

- Ensure test data is flushed to disk before reading it (Xing Guo)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Fix redefinition of 'off' as different kind of symbol
  bpf: Do not disable preemption in bpf_test_run().
  bpf: Fix memory leak in __lookup_instance error path
  selftests: arg_parsing: Ensure data is flushed to disk before reading.
  bpf: Replace bpf_map_kmalloc_node() with kmalloc_nolock() to allocate bpf_async_cb structures.
  selftests/bpf: make arg_parsing.c more robust to crashes
  bpf: test_run: Fix ctx leak in bpf_prog_test_run_xdp error path

Merge tag 'exfat-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat

Pull exfat fixes from Namjae Jeon:

- Fix out-of-bounds in FS_IOC_SETFSLABEL

- Add validation for stream entry size to prevent infinite loop

* tag 'exfat-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: fix out-of-bounds in exfat_nls_to_ucs2()
  exfat: fix improper check of dentry.stream.valid_size

Merge tag 'nfs-for-6.18-2' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client fixes from Anna Schumaker:

- Fix for FlexFiles mirror->dss allocation

- Apply delay_retrans to async operations

- Check if suid/sgid is cleared after a write when needed

- Fix setting the state renewal timer for early mounts after a reboot

* tag 'nfs-for-6.18-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFS4: Fix state renewals missing after boot
  NFS: check if suid/sgid was cleared after a write as needed
  NFS4: Apply delay_retrans to async operations
  NFSv4/flexfiles: fix to allocate mirror->dss before use

Merge tag '6.18-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:
"smb client fixes, security and smbdirect improvements, and some minor cleanup:

   - Important OOB DFS fix

   - Fix various potential tcon refcount leaks

   - smbdirect (RDMA) fixes (following up from test event a few weeks
     ago):

      - Fixes to improve and simplify handling of memory lifetime of
        smbdirect_mr_io structures, when a connection gets disconnected

      - Make sure we really wait to reach SMBDIRECT_SOCKET_DISCONNECTED
        before destroying resources

      - Make sure the send/recv submission/completion queues are large
        enough to prevent ib_post_send() from failing under pressure

   - convert cifs.ko to use the recommended crypto libraries (instead of
     crypto_shash), this also can improve performance

   - Three small cleanup patches"

* tag '6.18-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: (24 commits)
  smb: client: Consolidate cmac(aes) shash allocation
  smb: client: Remove obsolete crypto_shash allocations
  smb: client: Use HMAC-MD5 library for NTLMv2
  smb: client: Use MD5 library for SMB1 signature calculation
  smb: client: Use MD5 library for M-F symlink hashing
  smb: client: Use HMAC-SHA256 library for SMB2 signature calculation
  smb: client: Use HMAC-SHA256 library for key generation
  smb: client: Use SHA-512 library for SMB3.1.1 preauth hash
  cifs: parse_dfs_referrals: prevent oob on malformed input
  smb: client: Fix refcount leak for cifs_sb_tlink
  smb: client: let smbd_destroy() wait for SMBDIRECT_SOCKET_DISCONNECTED
  smb: move some duplicate definitions to common/cifsglob.h
  smb: client: let destroy_mr_list() keep smbdirect_mr_io memory if registered
  smb: client: let destroy_mr_list() call ib_dereg_mr() before ib_dma_unmap_sg()
  smb: client: call ib_dma_unmap_sg if mr->sgt.nents is not 0
  smb: client: improve logic in smbd_deregister_mr()
  smb: client: improve logic in smbd_register_mr()
  smb: client: improve logic in allocate_mr_list()
  smb: client: let destroy_mr_list() remove locked from the list
  smb: client: let destroy_mr_list() call list_del(&mr->list)
  ...

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
"ARM:

   - Fix the handling of ZCR_EL2 in NV VMs

   - Pick the correct translation regime when doing a PTW on the back of
     a SEA

   - Prevent userspace from injecting an event into a vcpu that isn't
     initialised yet

   - Move timer save/restore to the sysreg handling code, fixing EL2
     timer access in the process

   - Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug

   - Fix trapping configuration when the host isn't GICv3

   - Improve the detection of HCR_EL2.E2H being RES1

   - Drop a spurious 'break' statement in the S1 PTW

   - Don't try to access SPE when owned by EL3

  Documentation updates:

   - Document the failure modes of event injection

   - Document that a GICv3 guest can be created on a GICv5 host with
     FEAT_GCIE_LEGACY

  Selftest improvements:

   - Add a selftest for the effective value of HCR_EL2.AMO

   - Address build warning in the timer selftest when building with
     clang

   - Teach irqfd selftests about non-x86 architectures

   - Add missing sysregs to the set_id_regs selftest

   - Fix vcpu allocation in the vgic_lpi_stress selftest

   - Correctly enable interrupts in the vgic_lpi_stress selftest

  x86:

   - Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test
     for the bug fixed by commit 3ccbf6f47098 ("KVM: x86/mmu: Return
     -EAGAIN if userspace deletes/moves memslot during prefault")

   - Don't try to get PMU capabilities from perf when running on a
     system with hybrid CPUs/PMUs, as perf will rightly WARN.

  guest_memfd:

   - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a
     more generic KVM_CAP_GUEST_MEMFD_FLAGS

   - Add a guest_memfd INIT_SHARED flag and require userspace to
     explicitly set said flag to initialize memory as SHARED,
     irrespective of MMAP.

     The behavior merged in 6.18 is that enabling mmap() implicitly
     initializes memory as SHARED, which would result in an ABI
     collision for x86 CoCo VMs as their memory is currently always
     initialized PRIVATE.

   - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with
     private memory, to enable testing such setups, i.e. to hopefully
     flush out any other lurking ABI issues before 6.18 is officially
     released.

   - Add testcases to the guest_memfd selftest to cover guest_memfd
     without MMAP, and host userspace accesses to mmap()'d private
     memory"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (46 commits)
  arm64: Revamp HCR_EL2.E2H RES1 detection
  KVM: arm64: nv: Use FGT write trap of MDSCR_EL1 when available
  KVM: arm64: Compute per-vCPU FGTs at vcpu_load()
  KVM: arm64: selftests: Fix misleading comment about virtual timer encoding
  KVM: arm64: selftests: Add an E2H=0-specific configuration to get_reg_list
  KVM: arm64: selftests: Make dependencies on VHE-specific registers explicit
  KVM: arm64: Kill leftovers of ad-hoc timer userspace access
  KVM: arm64: Fix WFxT handling of nested virt
  KVM: arm64: Move CNT*CT_EL0 userspace accessors to generic infrastructure
  KVM: arm64: Move CNT*_CVAL_EL0 userspace accessors to generic infrastructure
  KVM: arm64: Move CNT*_CTL_EL0 userspace accessors to generic infrastructure
  KVM: arm64: Add timer UAPI workaround to sysreg infrastructure
  KVM: arm64: Make timer_set_offset() generally accessible
  KVM: arm64: Replace timer context vcpu pointer with timer_id
  KVM: arm64: Introduce timer_context_to_vcpu() helper
  KVM: arm64: Hide CNTHV_*_EL2 from userspace for nVHE guests
  Documentation: KVM: Update GICv3 docs for GICv5 hosts
  KVM: arm64: gic-v3: Only set ICH_HCR traps for v2-on-v3 or v3 guests
  KVM: arm64: selftests: Actually enable IRQs in vgic_lpi_stress
  KVM: arm64: selftests: Allocate vcpus with correct size
  ...

Merge tag 'powerpc-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Madhavan Srinivasan:

- Fix to handle NULL pointer dereference at irq domain teardown

- Fix for handling extraction of struct xive_irq_data

- Fix to skip parameter area allocation when fadump disabled

Thanks to Ganesh Goudar, Hari Bathini, Nam Cao, Ritesh Harjani (IBM),
Sourabh Jain, and Venkat Rao Bagalkote.

* tag 'powerpc-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/fadump: skip parameter area allocation when fadump is disabled
  powerpc, ocxl: Fix extraction of struct xive_irq_data
  powerpc/pseries/msi: Fix NULL pointer dereference at irq domain teardown

Merge tag 'slab-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:

- Fixes for two bugs that can be triggered when debugging options are
   enabled (Hao Ge, Vlastimil Babka)

* tag 'slab-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  slab: reset slab->obj_ext when freeing and it is OBJEXTS_ALLOC_FAIL
  slab: fix clearing freelist in free_deferred_objects()