Matthew Brost [Thu, 18 Dec 2025 22:37:45 +0000 (14:37 -0800)]
drm/xe: Increase log level for unhandled page faults
Set the kernel log level for unhandled page faults to match the log
level (info) for engine resets. Currently, dmesg output can be confusing
because it shows an engine reset without indicating the page fault that
caused it. Without this change, the GuC log must be examined to
determine the source of the engine reset.
Swaraj Gaikwad [Tue, 9 Dec 2025 09:48:36 +0000 (09:48 +0000)]
drm/xe: Fix documentation heading levels in xe_guc_pc.c
Sphinx reports htmldocs warnings:
Documentation/gpu/xe/xe_firmware:31: ./drivers/gpu/drm/xe/xe_guc_pc.c:76: ERROR: A level 2 section cannot be used here.
Documentation/gpu/xe/xe_firmware:31: ./drivers/gpu/drm/xe/xe_guc_pc.c:87: ERROR: A level 2 section cannot be used here.
The xe_guc_pc.c documentation is included inside xe_firmware.rst.
The headers in the C file currently use '=' underlines, which conflict with
the parent document's section levels.
Fix this by demoting "Frequency management" and "Render-C States" headers
from '=' to '-' to correctly nest them as subsections.
Thomas Hellström [Wed, 17 Dec 2025 09:34:41 +0000 (10:34 +0100)]
drm/xe: Drop preempt-fences when destroying imported dma-bufs.
When imported dma-bufs are destroyed, TTM is not fully
individualizing the dma-resv, but it *is* copying the fences that
need to be waited for before declaring idle. So in the case where
the bo->resv != bo->_resv we can still drop the preempt-fences, but
make sure we do that on bo->_resv which contains the fence-pointer
copy.
In the case where the copying fails, bo->_resv will typically not
contain any fences pointers at all, so there will be nothing to
drop. In that case, TTM would have ensured all fences that would
have been copied are signaled, including any remaining preempt
fences.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Fixes: fa0af721bd1f ("drm/ttm: test private resv obj on release/destroy") Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.16+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Tested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251217093441.5073-1-thomas.hellstrom@linux.intel.com
Ashutosh Dixit [Fri, 12 Dec 2025 06:18:49 +0000 (22:18 -0800)]
drm/xe/oa: Disallow 0 OA property values
An OA property value of 0 is invalid and will cause a NPD.
Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6452 Fixes: cc4e6994d5a2 ("drm/xe/oa: Move functions up so they can be reused for config ioctl") Cc: stable@vger.kernel.org Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Harish Chegondi <harish.chegondi@intel.com> Link: https://patch.msgid.link/20251212061850.1565459-3-ashutosh.dixit@intel.com
Ashutosh Dixit [Fri, 12 Dec 2025 06:18:48 +0000 (22:18 -0800)]
drm/xe/oa: Move default oa unit assignment earlier during stream open
De-referencing param.oa_unit, when an OA unit id is not provided during
stream open, results in NPD below.
Oops: general protection fault, probably for non-canonical address...
KASAN: null-ptr-deref in range...
RIP: 0010:xe_oa_stream_open_ioctl+0x169/0x38a0
xe_observation_ioctl+0x19f/0x270
drm_ioctl_kernel+0x1f4/0x410
Fix this by moving default oa unit assignment before the dereference.
Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6840 Fixes: c7e269aa565f ("drm/xe/oa: Allow exec_queue's to be specified only for OAG OA unit") Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Harish Chegondi <harish.chegondi@intel.com> Link: https://patch.msgid.link/20251212061850.1565459-2-ashutosh.dixit@intel.com
drm/xe/pf: Add handling for MLRC adverse event threshold
Since it is illegal to register a MLRC context when scheduler groups are
enabled, the GuC consider the VF doing so as an adverse event. Like for
other adverse event, there is a threshold for how many times the event
can happen before the GuC throws an error, which we need to add support
for.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251216214902.1429-5-michal.wajdeczko@intel.com
Michal Wajdeczko [Tue, 16 Dec 2025 21:48:58 +0000 (22:48 +0100)]
drm/xe/pf: Prepare for new threshold KLVs
We want to extend our macro-based KLV list definitions with new
information about the version from which given KLV is supported.
Prepare our code generators to emit dedicated version check if
a KLV was defined with the version information.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251216214902.1429-4-michal.wajdeczko@intel.com
There are already few places in the code where we need to check GuC
firmware version. Wrap existing raw conditions into a named helper
macro to make it clear and avoid explicit call of the MAKE_GUC_VER.
Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251216214902.1429-3-michal.wajdeczko@intel.com
Michal Wajdeczko [Wed, 17 Dec 2025 22:40:18 +0000 (23:40 +0100)]
drm/xe: Introduce IF_ARGS macro utility
We want to extend our macro-based KLV list definitions with new
information about the version from which given KLV is supported.
Add utility IF_ARGS macro that can be used in code generators to
emit different code based on the presence of additional arguments.
Introduce macro itself and extend our kunit tests to cover it.
We will use this macro in next patch.
Fixes: 4ac9048d0501 ("drm/xe: Wait on in-syncs when swicthing to dma-fence mode") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251217132412.435755-1-tapani.palli@intel.com
Dan Carpenter [Fri, 5 Dec 2025 11:10:31 +0000 (14:10 +0300)]
drm/xe/vf: fix return type in vf_migration_init_late()
The vf_migration_init_late() function is supposed to return zero on
success and negative error codes on failure. The error code
eventually gets propagated back to the probe() function and returned.
The problem is it's declared as type bool so it returns true on
error. Change it to type int instead.
Fixes: 2e2dab20dd66 ("drm/xe/vf: Enable VF migration only on supported GuC versions") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/aTK9pwJ_roc8vpDi@stanley.mountain Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Ashutosh Dixit [Fri, 5 Dec 2025 21:26:13 +0000 (13:26 -0800)]
drm/xe/oa: Always set OAG_OAGLBCTXCTRL_COUNTER_RESUME
Reports can be written out to the OA buffer using ways other than periodic
sampling. These include mmio trigger and context switches. To support these
use cases, when periodic sampling is not enabled,
OAG_OAGLBCTXCTRL_COUNTER_RESUME must be set.
Ashutosh Dixit [Fri, 5 Dec 2025 21:26:11 +0000 (13:26 -0800)]
drm/xe/oa/uapi: Expose MERT OA unit
A MERT OA unit is available in the SoC on some platforms. Add support
for this OA unit and expose it to userspace. The MERT OA unit does not
have any HW engines attached, but is otherwise similar to an OAM unit.
Matthew Brost [Fri, 12 Dec 2025 18:28:47 +0000 (10:28 -0800)]
drm/xe: Add more GT stats around pagefault mode switch flows
Add GT stats to measure the time spent switching between pagefault mode
and dma-fence mode. Also add a GT stat to indicate when pagefault
suspend is skipped because the system is idle. These metrics will help
profile pagefault workloads while 3D and display are enabled.
Matthew Brost [Fri, 12 Dec 2025 18:28:45 +0000 (10:28 -0800)]
drm/xe: Wait on in-syncs when swicthing to dma-fence mode
If a dma-fence submission has in-fences and pagefault queues are running
work, there is little incentive to kick the pagefault queues off the
hardware until the dma-fence submission is ready to run. Therefore, wait
on the in-fences of the dma-fence submission before removing the
pagefault queues from the hardware.
v2:
- Fix kernel doc (CI)
- Don't wait under lock (Thomas)
- Make wait interruptable
Matthew Brost [Fri, 12 Dec 2025 18:28:44 +0000 (10:28 -0800)]
drm/xe: Skip exec queue schedule toggle if queue is idle during suspend
If an exec queue is idle, there is no need to issue a schedule disable
to the GuC when suspending the queue’s execution. Opportunistically skip
this step if the queue is idle and not a parallel queue. Parallel queues
must have their scheduling state flipped in the GuC due to limitations
in how submission is implemented in run_job().
Also if all pagefault queues can skip the schedule disable during a
switch to dma-fence mode, do not schedule a resume for the pagefault
queues after the next submission.
v2:
- Don't touch the LRC tail is queue is suspended but enabled in run_job
(CI)
Matthew Brost [Fri, 12 Dec 2025 18:28:42 +0000 (10:28 -0800)]
drm/xe: Use usleep_range for accurate long-running workload timeslicing
msleep is not very accurate in terms of how long it actually sleeps,
whereas usleep_range is precise. Replace the timeslice sleep for
long-running workloads with the more accurate usleep_range to avoid
jitter if the sleep period is less than 20ms.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251212182847.1683222-3-matthew.brost@intel.com
Matthew Brost [Fri, 12 Dec 2025 18:28:41 +0000 (10:28 -0800)]
drm/xe: Adjust long-running workload timeslices to reasonable values
A 10ms timeslice for long-running workloads is far too long and causes
significant jitter in benchmarks when the system is shared. Adjust the
value to 5ms for preempt-fencing VMs, as the resume step there is quite
costly as memory is moved around, and set it to zero for pagefault VMs,
since switching back to pagefault mode after dma-fence mode is
relatively fast.
Also change min_run_period_ms to 'unsiged int' type rather than 's64' as
only positive values make sense.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20251212182847.1683222-2-matthew.brost@intel.com
Shuicheng Lin [Fri, 5 Dec 2025 23:47:18 +0000 (23:47 +0000)]
drm/xe/oa: Limit num_syncs to prevent oversized allocations
The OA open parameters did not validate num_syncs, allowing
userspace to pass arbitrarily large values, potentially
leading to excessive allocations.
Add check to ensure that num_syncs does not exceed DRM_XE_MAX_SYNCS,
returning -EINVAL when the limit is violated.
v2: use XE_IOCTL_DBG() and drop duplicated check. (Ashutosh)
Fixes: c8507a25cebd ("drm/xe/oa/uapi: Define and parse OA sync properties") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251205234715.2476561-6-shuicheng.lin@intel.com
Shuicheng Lin [Fri, 5 Dec 2025 23:47:17 +0000 (23:47 +0000)]
drm/xe: Limit num_syncs to prevent oversized allocations
The exec and vm_bind ioctl allow userspace to specify an arbitrary
num_syncs value. Without bounds checking, a very large num_syncs
can force an excessively large allocation, leading to kernel warnings
from the page allocator as below.
Introduce DRM_XE_MAX_SYNCS (set to 1024) and reject any request
exceeding this limit.
v2: Add "Reported-by" and Cc stable kernels.
v3: Change XE_MAX_SYNCS from 64 to 1024. (Matt & Ashutosh)
v4: s/XE_MAX_SYNCS/DRM_XE_MAX_SYNCS/ (Matt)
v5: Do the check at the top of the exec func. (Matt)
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Reported-by: Koen Koning <koen.koning@intel.com> Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6450 Cc: <stable@vger.kernel.org> # v6.12+ Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michal Mrozek <michal.mrozek@intel.com> Cc: Carl Zhang <carl.zhang@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Ivan Briano <ivan.briano@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251205234715.2476561-5-shuicheng.lin@intel.com
Michal Wajdeczko [Mon, 15 Dec 2025 17:04:33 +0000 (18:04 +0100)]
drm/xe/guc: Fix version check for page-reclaim feature
Page reclamation interfaces were introduced in GuC firmware version
70.31.0 (which corresponds to GuC ABI version 1.14.0), but since this
feature is also available for the VFs and VFs don't know the firmware
version, use GuC compatibility version check instead.
Fixes: 77ebc7c10d16 ("drm/xe/guc: Add page reclamation interface to GuC") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Brian Nguyen <brian3.nguyen@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251215170433.196398-1-michal.wajdeczko@intel.com
Linus Torvalds [Sun, 14 Dec 2025 03:35:35 +0000 (15:35 +1200)]
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"The only core fix is in doc; all the others are in drivers, with the
biggest impacts in libsas being the rollback on error handling and in
ufs coming from a couple of error handling fixes, one causing a crash
if it's activated before scanning and the other fixing W-LUN
resumption"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: qcom: Fix confusing cleanup.h syntax
scsi: libsas: Add rollback handling when an error occurs
scsi: device_handler: Return error pointer in scsi_dh_attached_handler_name()
scsi: ufs: core: Fix a deadlock in the frequency scaling code
scsi: ufs: core: Fix an error handler crash
scsi: Revert "scsi: libsas: Fix exp-attached device scan after probe failure scanned in again after probe failed"
scsi: ufs: core: Fix RPMB link error by reversing Kconfig dependencies
scsi: qla4xxx: Use time conversion macros
scsi: qla2xxx: Enable/disable IRQD_NO_BALANCING during reset
scsi: ipr: Enable/disable IRQD_NO_BALANCING during reset
scsi: imm: Fix use-after-free bug caused by unfinished delayed work
scsi: target: sbp: Remove KMSG_COMPONENT macro
scsi: core: Correct documentation for scsi_device_quiesce()
scsi: mpi3mr: Prevent duplicate SAS/SATA device entries in channel 1
scsi: target: Reset t_task_cdb pointer in error case
scsi: ufs: core: Fix EH failure after W-LUN resume error
Linus Torvalds [Sun, 14 Dec 2025 03:24:10 +0000 (15:24 +1200)]
Merge tag 'ceph-for-6.19-rc1' of https://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"We have a patch that adds an initial set of tracepoints to the MDS
client from Max, a fix that hardens osdmap parsing code from myself
(marked for stable) and a few assorted fixups"
* tag 'ceph-for-6.19-rc1' of https://github.com/ceph/ceph-client:
rbd: stop selecting CRC32, CRYPTO, and CRYPTO_AES
ceph: stop selecting CRC32, CRYPTO, and CRYPTO_AES
libceph: make decode_pool() more resilient against corrupted osdmaps
libceph: Amend checking to fix `make W=1` build breakage
ceph: Amend checking to fix `make W=1` build breakage
ceph: add trace points to the MDS client
libceph: fix log output race condition in OSD client
Linus Torvalds [Sat, 13 Dec 2025 18:12:46 +0000 (06:12 +1200)]
Merge tag 'smp-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug fix from Ingo Molnar:
- Fix CPU hotplug callbacks to disable interrupts on UP kernels
* tag 'smp-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu: Make atomic hotplug callbacks run with interrupts disabled on UP
Linus Torvalds [Sat, 13 Dec 2025 18:10:35 +0000 (06:10 +1200)]
Merge tag 'perf-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event fixes from Ingo Molnar:
- Fix NULL pointer dereference crash in the Intel PMU driver
- Fix missing read event generation on task exit
- Fix AMD uncore driver init error handling
- Fix whitespace noise
* tag 'perf-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Fix NULL event dereference crash in handle_pmi_common()
perf/core: Fix missing read event generation on task exit
perf/x86/amd/uncore: Fix the return value of amd_uncore_df_event_init() on error
perf/uprobes: Remove <space><Tab> whitespace noise
Linus Torvalds [Sat, 13 Dec 2025 18:07:09 +0000 (06:07 +1200)]
Merge tag 'irq-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
- Fix error code in the irqchip/mchp-eic driver
- Fix setup_percpu_irq() affinity assumptions
- Remove the unused irq_domain_add_tree() function
* tag 'irq-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/mchp-eic: Fix error code in mchp_eic_domain_alloc()
irqdomain: Delete irq_domain_add_tree()
genirq: Allow NULL affinity for setup_percpu_irq()
Linus Torvalds [Sat, 13 Dec 2025 18:04:16 +0000 (06:04 +1200)]
Merge tag 'core-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc core fixes from Ingo Molnar:
- Improve bug reporting
- Suppress W=1 format warning
- Improve rseq scalability on Clang builds
* tag 'core-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq: Always inline rseq_debug_syscall_return()
bug: Hush suggest-attribute=format for __warn_printf()
bug: Let report_bug_entry() provide the correct bugaddr
Linus Torvalds [Sat, 13 Dec 2025 08:55:12 +0000 (20:55 +1200)]
Merge tag 'mm-nonmm-stable-2025-12-11-11-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc updates from Andrew Morton:
"There are no significant series in this small merge. Please see the
individual changelogs for details"
[ Editor's note: it's mainly ocfs2 and a couple of random fixes ]
* tag 'mm-nonmm-stable-2025-12-11-11-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: memfd_luo: add CONFIG_SHMEM dependency
mm: shmem: avoid build warning for CONFIG_SHMEM=n
ocfs2: fix memory leak in ocfs2_merge_rec_left()
ocfs2: invalidate inode if i_mode is zero after block read
ocfs2: avoid -Wflex-array-member-not-at-end warning
ocfs2: convert remaining read-only checks to ocfs2_emergency_state
ocfs2: add ocfs2_emergency_state helper and apply to setattr
checkpatch: add uninitialized pointer with __free attribute check
args: fix documentation to reflect the correct numbers
ocfs2: fix kernel BUG in ocfs2_find_victim_chain
liveupdate: luo_core: fix redundant bound check in luo_ioctl()
ocfs2: validate inline xattr size and entry count in ocfs2_xattr_ibody_list
fs/fat: remove unnecessary wrapper fat_max_cache()
ocfs2: replace deprecated strcpy with strscpy
ocfs2: check tl_used after reading it from trancate log inode
liveupdate: luo_file: don't use invalid list iterator
Brown paper bag time. This is a silly oversight where I missed to drop
the error condition checking to ensure we clean up on early error
returns. I have an internal unit testset coming up for this which will
catch all such issues going forward.
Reported-by: Chris Mason <clm@fb.com> Reported-by: Jeff Layton <jlayton@kernel.org> Fixes: 011703a9acd7 ("file: add FD_{ADD,PREPARE}()") Signed-off-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 13 Dec 2025 07:57:41 +0000 (19:57 +1200)]
x86/hv: Add gitignore entry for generated header file
Commit 7bfe3b8ea6e3 ("Drivers: hv: Introduce mshv_vtl driver") added a
new generated header file for the offsets into the mshv_vtl_cpu_context
structure to be used by the low-level assembly code. But it didn't add
the .gitignore file to go with it, so 'git status' and friends will
mention it.
Let's add the gitignore file before somebody thinks that generated
header should be committed.
Linus Torvalds [Sat, 13 Dec 2025 05:39:28 +0000 (17:39 +1200)]
Merge tag 'drm-fixes-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel
Pull more drm fixes from Dave Airlie:
"These are the enqueued fixes that ended up in our fixes branch,
nouveau mostly, along with some small fixes in other places.
plane:
- Handle IS_ERR vs NULL in drm_plane_create_hotspot_properties()
ttm:
- fix devcoredump for evicted bos
panel:
- Fix stack usage warning in novatek-nt35560
nouveau:
- alloc fwsec sb at boot to avoid s/r problems
- fix strcpy usage
- fix i2c encoder crash
bridge:
- Ignore spurious PLL_UNLOCK bit in ti-sn65dsi83
mgag200:
- Fix bigendian handling in mgag200
tilcdc:
- Fix probe failure in tilcdc"
* tag 'drm-fixes-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel:
drm/mgag200: Fix big-endian support
drm/tilcdc: Fix removal actions in case of failed probe
drm/ttm: Avoid NULL pointer deref for evicted BOs
drm: nouveau: Replace sprintf() with sysfs_emit()
drm/nouveau: fix circular dep oops from vendored i2c encoder
drm/nouveau: refactor deprecated strcpy
drm/plane: Fix IS_ERR() vs NULL check in drm_plane_create_hotspot_properties()
drm/bridge: ti-sn65dsi83: ignore PLL_UNLOCK errors
drm/nouveau/gsp: Allocate fwsec-sb at boot
drm/panel: novatek-nt35560: avoid on-stack device structure
i915:
- Fix format string truncation warning
- FIx runtime PM reference during fbdev BO creation
panthor:
- fix UAF
renesas:
- fix sync flag handling"
* tag 'drm-next-2025-12-13' of https://gitlab.freedesktop.org/drm/kernel:
Revert "drm/amd/display: Fix pbn to kbps Conversion"
drm/amd: Fix unbind/rebind for VCN 4.0.5
drm/i915: Fix format string truncation warning
drm/i915/fbdev: Hold runtime PM ref during fbdev BO creation
drm/amd/display: Improve HDMI info retrieval
drm/amdkfd: bump minimum vgpr size for gfx1151
drm/amd/display: shrink struct members
drm/amdkfd: Export the cwsr_size and ctl_stack_size to userspace
drm/amd/display: Refactor dml_core_mode_support to reduce stack frame
drm/amdgpu: don't attach the tlb fence for SI
drm/amd/display: Use GFP_ATOMIC in dc_create_plane_state()
drm/amdkfd: Trap handler support for expert scheduling mode
drm/amdkfd: Use huge page size to check split svm range alignment
drm/rcar-du: dsi: Handle both DRM_MODE_FLAG_N.SYNC and !DRM_MODE_FLAG_P.SYNC
drm/gem-shmem: revert the 8-byte alignment constraint
drm/gem-dma: revert the 8-byte alignment constraint
drm/panthor: Prevent potential UAF in group creation
Matt Roper [Fri, 12 Dec 2025 18:14:13 +0000 (10:14 -0800)]
drm/xe/lnl: Drop pre-production workaround support
LNL has been out long enough that all of our internal usage of
pre-production hardware has been phased out and we no longer need to
maintain workarounds that were exclusive to pre-production parts.
Production LNL hardware always has B0 or later steppings for both
graphics and media IP. Eliminate all workarounds that were exclusive to
A-step hardware and set the 'has_prod_wa_only' device flag for LNL to
make sure we warn and taint if someone tries to load the driver on an
old pre-production part.
Matt Roper [Fri, 12 Dec 2025 18:14:12 +0000 (10:14 -0800)]
drm/xe: Track pre-production workaround support
When we're initially enabling driver support for a new platform/IP, we
usually implement all workarounds documented in the WA database in the
driver. Many of those workarounds are restricted to early steppings
that only showed up in pre-production hardware (i.e., internal test
chips that are not available to the general public). Since the
workarounds for early, pre-production steppings tend to be some of the
ugliest and most complicated workarounds, we generally want to eliminate
them and simplify the code once the platform has launched and our
internal usage of those pre-production parts have been phased out.
Let's add a flag to the device info that tracks which platforms still
have support for pre-production workarounds for so that we can print a
warning and taint if someone tries to load the driver on a
pre-production part for a platform without pre-production workarounds.
This will help our internal users understand the likely problems they'll
encounter if they try to load the driver on an old pre-production
device.
The Xe behavior here is similar to what we've done for many years on
i915 (see intel_detect_preproduction_hw()), except that instead of
manually coding up ranges of device steppings that we believe to be
pre-production hardware, Xe will use the hardware's own production vs
pre-production fusing status, which we can read from the FUSE2 register.
This fuse didn't exist on older Intel hardware, but should be present on
all platforms supported by the Xe driver.
Going forward, let's set the expectation that we'll start looking into
removing pre-production workarounds for a platform around the time that
platforms of the next major IP stepping are having their force_probe
requirement lifted. This timing is just a rough guideline; there may be
cases where some instances of pre-production parts are still being
actively used in CI farms, internal device pools, etc. and we'll need to
wait a bit longer for those to be swapped out.
v2:
- Fix inverted forcewake check
v3:
- Invert flag and add it to the platforms on which we still have
pre-prod workarounds. (Jani, Lucas)
v4:
- Avoid checking pre-production on VF since they don't have access to
the FUSE2 register.
Linus Torvalds [Sat, 13 Dec 2025 05:15:16 +0000 (17:15 +1200)]
Merge tag 'i3c/for-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux
Pull further i3c update from Alexandre Belloni:
"We are removing a legacy API callback and having this sooner rather
than later will help ensuring no one introduces a new driver using it.
I've also added patches removing the "__free(...) = NULL" pattern
because I'm sure we won't avoid people sending those following the
mailing list discussion..."
* tag 'i3c/for-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
i3c: adi: Fix confusing cleanup.h syntax
i3c: master: Fix confusing cleanup.h syntax
i3c: master: cleanup callback .priv_xfers()
i3c: master: switch to use new callback .i3c_xfers() from .priv_xfers()
Linus Torvalds [Sat, 13 Dec 2025 05:09:06 +0000 (17:09 +1200)]
Merge tag 'rtc-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
"Subsystem:
- stop setting max_user_freq from the individual drivers as this has
not been hardware related for a while
New drivers:
- Andes ATCRTC100
- Apple SMC
- Nvidia VRS
Linus Torvalds [Sat, 13 Dec 2025 04:36:57 +0000 (16:36 +1200)]
Merge tag 'gpio-fixes-for-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio updates from Bartosz Golaszewski:
- fix spinlock op type after conversion to lock guards
- fix a memory leak in error path in gpio-regmap
- Kconfig fixes in GPIO drivers
- add a GPIO ACPI quirk for Dell Precision 7780
- set of fixes for shared GPIO management
* tag 'gpio-fixes-for-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: shared: make locking more fine-grained
gpio: shared: fix auxiliary device cleanup order
gpio: shared: check if a reference is populated before cleaning its resources
gpio: shared: fix NULL-pointer dereference in teardown path
gpio: shared: ignore disabled nodes when traversing the device-tree
gpiolib: acpi: Add quirk for Dell Precision 7780
gpio: tb10x: fix OF_GPIO dependency
gpio: qixis: select CONFIG_REGMAP_MMIO
gpio: regmap: Fix memleak in error path in gpio_regmap_register()
gpio: mmio: fix bad guard conversion
Linus Torvalds [Sat, 13 Dec 2025 04:09:10 +0000 (16:09 +1200)]
Merge tag 'sound-fix-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"The only slightly large change is the enablement of CIX HD-audio
controller, which took a bit time to be cooked up, while most of other
changes are device-specific small trivial fixes:
- Default disablement of the kconfig for decades old pre-release
alsa-lib PCM API; it's only the default config value change, so it
can't lead to any regressions for the existing setups
- Support for CIX HD-audio controller
- A few ASoC ACP fixes
- Fixes for ASoC cirrus, bcm, wcd, qcom, ak platforms
- Trivial hardening for FireWire and USB-audio
- HD-audio Intel binding fix and quirks"
* tag 'sound-fix-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (30 commits)
ALSA: hda/tas2781: Add new quirk for HP new project
ALSA: hda: cix-ipbloq: Use modern PM ops
ALSA: hda: intel-dsp-config: Prefer legacy driver as fallback
ASoC: amd: acp: update tdm channels for specific DAI
ASoC: cs35l56: Fix incorrect select SND_SOC_CS35L56_CAL_SYSFS_COMMON
ALSA: firewire-motu: add bounds check in put_user loop for DSP events
ASoC: cs35l41: Always return 0 when a subsystem ID is found
ALSA: uapi: Fix typo in asound.h comment
ALSA: Do not build obsolete API
ALSA: hda: add CIX IPBLOQ HDA controller support
ALSA: hda/core: add addr_offset field for bus address translation
ALSA: hda: dt-bindings: add CIX IPBLOQ HDA controller support
ALSA: hda/realtek: Add support for ASUS UM3406GA
ALSA: hda/realtek: Add support for HP Turbine Laptops
ALSA: usb-audio: Initialize status1 to fix uninitialized symbol errors
ALSA: firewire-motu: fix buffer overflow in hwdep read for DSP events
ALSA: hda: cs35l41: Fix NULL pointer dereference in cs35l41_hda_read_acpi()
ASoC: cros_ec_codec: Remove unnecessary selection of CRYPTO
ASoc: qcom: q6afe: fix bad guard conversion
ASoC: rockchip: Fix Wvoid-pointer-to-enum-cast warning (again)
...
Brian Nguyen [Fri, 12 Dec 2025 21:32:36 +0000 (05:32 +0800)]
drm/xe: Add debugfs support for page reclamation
Allow for runtime modification to page reclamation feature through
debugfs configuration. This parameter will only take effect if the
platform supports the page reclamation feature by default.
v2:
- Minor comment tweaks. (Shuicheng)
- Convert to kstrtobool_from_user. (Michal)
- Only expose page reclaim file if page reclaim flag
initially supported and with that, remove
xe_match_desc usage. (Michal)
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-22-brian3.nguyen@intel.com
Brian Nguyen [Fri, 12 Dec 2025 21:32:35 +0000 (05:32 +0800)]
drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim
There are additional hardware managed L2$ flushing such as the
transient display. In those scenarios, page reclamation is
unnecessary resulting in redundant cacheline flushes, so skip
over those corresponding ranges.
v2:
- Elaborated on reasoning for page reclamation skip based on
Tejas's discussion. (Matthew A, Tejas)
v3:
- Removed MEDIA_IS_ON due to racy condition resulting in removal of
relevant registers and values. (Matthew A)
- Moved l3 policy access to xe_pat. (Matthew A)
v4:
- Updated comments based on previous change. (Tejas)
- Move back PAT index macros to xe_pat.c.
Brian Nguyen [Fri, 12 Dec 2025 21:32:34 +0000 (05:32 +0800)]
drm/xe: Append page reclamation action to tlb inval
Add page reclamation action to tlb inval backend. The page reclamation
action is paired with range tlb invalidations so both are issued at the
same time.
Page reclamation will issue the TLB invalidation with an invalid seqno
and a H2G page reclamation action with the fence's corresponding seqno
and handle the fence accordingly on page reclaim action done handler.
If page reclamation fails, tlb timeout handler will be responsible for
signalling fence and cleaning up.
v2:
- add send_page_reclaim to patch.
- Remove flush_cache and use prl_sa pointer to determine PPC flush
instead of explicit bool. Add NULL as fallback for others. (Matthew B)
v3:
- Add comments for flush_cache with media.
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-20-brian3.nguyen@intel.com
Brian Nguyen [Fri, 12 Dec 2025 21:32:32 +0000 (05:32 +0800)]
drm/xe: Suballocate BO for page reclaim
Page reclamation feature needs the PRL to be suballocated into a
GGTT-mapped BO. On allocation failure, fallback to default tlb
invalidation with full PPC flush.
PRL's BO allocation is managed in separate pool to ensure 4K alignment
for proper GGTT address.
With BO, pass into TLB invalidation backend and modify fence to
accomadate accordingly.
v2:
- Removed page reclaim related variables from TLB fence. (Matthew B)
- Allocate PRL bo size to num_entries. (Matthew B)
- Move PRL bo allocation to tlb_inval run_job. (Matthew B)
v5:
- Use xe_page_reclaim_list_valid. (Matthew B)
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-18-brian3.nguyen@intel.com
Brian Nguyen [Fri, 12 Dec 2025 21:32:31 +0000 (05:32 +0800)]
drm/xe: Create page reclaim list on unbind
Page reclaim list (PRL) is preparation work for the page reclaim feature.
The PRL is firstly owned by pt_update_ops and all other page reclaim
operations will point back to this PRL. PRL generates its entries during
the unbind page walker, updating the PRL.
This PRL is restricted to a 4K page, so 512 page entries at most.
v2:
- Removed unused function. (Shuicheng)
- Compacted warning checking, update commit message,
spelling, etc. (Shuicheng, Matthew B)
- Fix kernel docs
- Moved PRL max entries overflow handling out from
generate_reclaim_entry to caller (Shuicheng)
- Add xe_page_reclaim_list_init for clarity. (Matthew B)
- Modify xe_guc_page_reclaim_entry to use macros
for greater flexbility. (Matthew B)
- Add fallback for PTE outside of page reclaim supported
4K, 64K, 2M pages (Matthew B)
- Invalidate PRL for early abort page walk.
- Removed page reclaim related variables from tlb fence
(Matthew Brost)
- Remove error handling in *alloc_entries failure. (Matthew B)
v3:
- Fix NULL pointer dereference check.
- Modify reclaim_entry to QW and bitfields accordingly. (Matthew B)
- Add vm_dbg prints for PRL generation and invalidation. (Matthew B)
v4:
- s/GENMASK/GENMASK_ULL && s/BIT/BIT_ULL (CI)
v5:
- Addition of xe_page_reclaim_list_is_new() to avoid continuous
allocation of PRL if consecutive VMAs cause a PRL invalidation.
- Add xe_page_reclaim_list_valid() helpers for clarity. (Matthew B)
- Move xe_page_reclaim_list_entries_put in
xe_page_reclaim_list_invalidate.
Brian Nguyen [Fri, 12 Dec 2025 21:32:30 +0000 (05:32 +0800)]
drm/xe/guc: Add page reclamation interface to GuC
Add page reclamation related changes to GuC interface, handlers, and
senders to support page reclamation.
Currently TLB invalidations will perform an entire PPC flush in order to
prevent stale memory access for noncoherent system memory. Page
reclamation is an extension of the typical TLB invalidation
workflow, allowing disabling of full PPC flush and enable selective PPC
flushing. Selective flushing will be decided by a list of pages whom's
address is passed to GuC at time of action.
Page reclamation interfaces require at least GuC FW ver 70.31.0.
v2:
- Moved send_page_reclaim to first patch usage.
- Add comments explaining shared done handler. (Matthew B)
- Add FW version fallback to disable page reclaim
on older versions. (Matthew B, Shuicheng)
Oak Zeng [Fri, 12 Dec 2025 21:32:29 +0000 (05:32 +0800)]
drm/xe: Add page reclamation info to device info
Starting from Xe3p, HW adds a feature assisting range based page
reclamation. Introduce a bit in device info to indicate whether
device has such capability.
Signed-off-by: Oak Zeng <oak.zeng@intel.com> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-15-brian3.nguyen@intel.com
Brian Nguyen [Fri, 12 Dec 2025 21:32:28 +0000 (05:32 +0800)]
drm/xe/xe_tlb_inval: Modify fence interface to support PPC flush
Allow tlb_invalidation to control when driver wants to flush the
Private Physical Cache (PPC) as a process of the tlb invalidation
process.
Default behavior is still to always flush the PPC but driver now has the
option to disable it.
v2:
- Revise commit/kernel doc descriptions. (Shuicheng)
- Remove unused function. (Shuicheng)
- Remove bool flush_cache parameter from fence,
and various function inputs. (Matthew B)
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-14-brian3.nguyen@intel.com
Matthew Brost [Fri, 12 Dec 2025 21:32:27 +0000 (05:32 +0800)]
drm/xe: Do not forward invalid TLB invalidation seqnos to upper layers
Certain TLB invalidation operations send multiple H2G messages per seqno
with only the final H2G containing the valid seqno - the others carry an
invalid seqno. The G2H handler drops these invalid seqno to aovid
prematurely signaling a TLB invalidation fence.
With TLB_INVALIDATION_SEQNO_INVALID used to indicate in progress
multi-step TLB invalidations, reset tdr to ensure that timeout
won't prematurely trigger when G2H actions are still ongoing.
v2: Remove lock from xe_tlb_inval_reset_timeout. (Matthew B)
v3: Squash with dependent patch from Matthew Brost' series.
Dave Airlie [Sat, 13 Dec 2025 00:54:28 +0000 (10:54 +1000)]
Merge tag 'drm-misc-fixes-2025-12-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.19-rc1:
- Fix stack usage warning in novatek-nt35560.
- Fix s/r, i2c issues in nouveau and update string handling.
- Ignore spurious PLL_UNLOCK bit in ti-sn65dsi83.
- Handle IS_ERR vs NULL in drm_plane_create_hotspot_properties().
- Fix devcoredump crash on reading evicted bo's.
- Fix bigendian handling in mgag200.
- Fix probe failure in tilcdc.
Initializing automatic __free variables to NULL without need (e.g.
branches with different allocations), followed by actual allocation is
in contrary to explicit coding rules guiding cleanup.h:
"Given that the "__free(...) = NULL" pattern for variables defined at
the top of the function poses this potential interdependency problem the
recommendation is to always define and assign variables in one statement
and not group variable definitions at the top of the function when
__free() is used."
Code does not have a bug, but is less readable and uses discouraged
coding practice, so fix that by moving declaration to the place of
assignment.
Initializing automatic __free variables to NULL without need (e.g.
branches with different allocations), followed by actual allocation is
in contrary to explicit coding rules guiding cleanup.h:
"Given that the "__free(...) = NULL" pattern for variables defined at
the top of the function poses this potential interdependency problem the
recommendation is to always define and assign variables in one statement
and not group variable definitions at the top of the function when
__free() is used."
Code does not have a bug, but is less readable and uses discouraged
coding practice, so fix that by moving declaration to the place of
assignment.
Not that other existing usage of __free() in this context is a corret
exception initialized to NULL, because the actual allocation is branched
in if().
Jan Maslak [Wed, 10 Dec 2025 14:56:18 +0000 (15:56 +0100)]
drm/xe: Restore engine registers before restarting schedulers after GT reset
During GT reset recovery in do_gt_restart(), xe_uc_start() was called
before xe_reg_sr_apply_mmio() restored engine-specific registers. This
created a race window where the scheduler could run jobs before hardware
state was fully restored.
This caused failures in eudebug tests (xe_exec_sip_eudebug@breakpoint-
waitsip-*) where TD_CTL register (containing TD_CTL_GLOBAL_DEBUG_ENABLE)
wasn't restored before jobs started executing. Breakpoints would fail to
trigger SIP entry because the debug enable bit wasn't set yet.
Fix by moving xe_uc_start() after all MMIO register restoration,
including engine registers and CCS mode configuration, ensuring all
hardware state is fully restored before any jobs can be scheduled.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Jan Maslak <jan.maslak@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251210145618.169625-2-jan.maslak@intel.com
Linus Torvalds [Fri, 12 Dec 2025 17:44:03 +0000 (05:44 +1200)]
Merge tag 'loongarch-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Add basic LoongArch32 support
Note: Build infrastructures of LoongArch32 are not enabled yet,
because we need to adjust irqchip drivers and wait for GNU toolchain
be upstream first.
- Select HAVE_ARCH_BITREVERSE in Kconfig
- Fix build and boot for CONFIG_RANDSTRUCT
- Correct the calculation logic of thread_count
- Some bug fixes and other small changes
* tag 'loongarch-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (22 commits)
LoongArch: Adjust default config files for 32BIT/64BIT
LoongArch: Adjust VDSO/VSYSCALL for 32BIT/64BIT
LoongArch: Adjust misc routines for 32BIT/64BIT
LoongArch: Adjust user accessors for 32BIT/64BIT
LoongArch: Adjust system call for 32BIT/64BIT
LoongArch: Adjust module loader for 32BIT/64BIT
LoongArch: Adjust time routines for 32BIT/64BIT
LoongArch: Adjust process management for 32BIT/64BIT
LoongArch: Adjust memory management for 32BIT/64BIT
LoongArch: Adjust boot & setup for 32BIT/64BIT
LoongArch: Adjust common macro definitions for 32BIT/64BIT
LoongArch: Add adaptive CSR accessors for 32BIT/64BIT
LoongArch: Add atomic operations for 32BIT/64BIT
LoongArch: Add new PCI ID for pci_fixup_vgadev()
LoongArch: Add and use some macros for AVEC
LoongArch: Correct the calculation logic of thread_count
LoongArch: Use unsigned long for _end and _text
LoongArch: Use __pmd()/__pte() for swap entry conversions
LoongArch: Fix arch_dup_task_struct() for CONFIG_RANDSTRUCT
LoongArch: Fix build errors for CONFIG_RANDSTRUCT
...
Jagmeet Randhawa [Thu, 11 Dec 2025 21:21:46 +0000 (05:21 +0800)]
drm/xe: Increase TDF timeout
There are some corner cases where flushing transient
data may take slightly longer than the 150us timeout
we currently allow. Update the driver to use a 300us
timeout instead based on the latest guidance from
the hardware team. An update to the bspec to formally
document this is expected to arrive soon.
- Fix chacha-riscv64-zvkb.S to not use frame pointer for data"
* tag 'libcrypto-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
crypto: arm64/ghash - Fix incorrect output from ghash-neon
crypto/arm64: sm4/xts - Merge ksimd scopes to reduce stack bloat
crypto/arm64: aes/xts - Use single ksimd scope to reduce stack bloat
lib/crypto: blake2s: Replace manual unrolling with unrolled_full
lib/crypto: blake2b: Roll up BLAKE2b round loop on 32-bit
lib/crypto: riscv: Depend on RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS
lib/crypto: riscv/chacha: Avoid s0/fp register
Linus Torvalds [Fri, 12 Dec 2025 10:04:18 +0000 (22:04 +1200)]
Merge tag 'block-6.19-20251211' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- Always initialize DMA state, fixing a potentially nasty issue on the
block side
- btrfs zoned write fix with cached zone reports
- Fix corruption issues in bcache with chained bio's, and further make
it clear that the chained IO handler is simply a marker, it's not
code meant to be executed
- Kill old code dealing with synchronous IO polling in the block layer,
that has been dead for a long time. Only async polling is supported
these days
- Fix a lockdep issue in tag_set management, moving it to RCU
- Fix an issue with ublks bio_vec iteration
- Don't unconditionally enforce blocking issue of ublk control
commands, allow some of them with non-blocking issue as they
do not block
* tag 'block-6.19-20251211' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
blk-mq-dma: always initialize dma state
blk-mq: delete task running check in blk_hctx_poll()
block: fix cached zone reports on devices with native zone append
block: Use RCU in blk_mq_[un]quiesce_tagset() instead of set->tag_list_lock
ublk: don't mutate struct bio_vec in iteration
block: prohibit calls to bio_chain_endio
bcache: fix improper use of bi_end_io
ublk: allow non-blocking ctrl cmds in IO_URING_F_NONBLOCK issue
Linus Torvalds [Fri, 12 Dec 2025 10:01:32 +0000 (22:01 +1200)]
Merge tag 'io_uring-6.19-20251211' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fix from Jens Axboe:
"Single fix for io_uring headed to stable, fixing an issue introduced
with the min_wait support earlier this year, where SQPOLL didn't get
correctly woken if an event arrived once the event waiting has
finished the min_wait portion.
As we already have regression tests for this added and people
reporting new failures there, let's get this one flushed out
so it can bubble back down to stable as well"
* tag 'io_uring-6.19-20251211' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring: fix min_wait wakeups for SQPOLL
Linus Torvalds [Fri, 12 Dec 2025 09:59:19 +0000 (21:59 +1200)]
Merge tag 'v6.19-rc-smb3-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- minor cleanup
- minor update to comment to avoid confusion about fs type
* tag 'v6.19-rc-smb3-server-fixes' of git://git.samba.org/ksmbd:
smb/server: add comment to FileSystemName of FileFsAttributeInformation
smb/server: remove unused nterr.h
smb/server: rename include guard in smb_common.h
Linus Torvalds [Fri, 12 Dec 2025 09:52:42 +0000 (21:52 +1200)]
Merge tag 'nfs-for-6.19-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Bugfixes:
- Fix 'nlink' attribute update races when unlinking a file
- Add missing initialisers for the directory verifier in various
places
- Don't regress the NFSv4 open state due to misordered racing replies
- Ensure the NFSv4.x callback server uses the correct transport
connection
- Fix potential use-after-free races when shutting down the NFSv4.x
callback server
- Fix a pNFS layout commit crash
- Assorted fixes to ensure correct propagation of mount options when
the client crosses a filesystem boundary and triggers the VFS
automount code
- More localio fixes
Features and cleanups:
- Add initial support for basic directory delegations
- SunRPC back channel code cleanups"
* tag 'nfs-for-6.19-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (24 commits)
NFSv4: Handle NFS4ERR_NOTSUPP errors for directory delegations
nfs/localio: remove 61 byte hole from needless ____cacheline_aligned
nfs/localio: remove alignment size checking in nfs_is_local_dio_possible
NFS: Fix up the automount fs_context to use the correct cred
NFS: Fix inheritance of the block sizes when automounting
NFS: Automounted filesystems should inherit ro,noexec,nodev,sync flags
Revert "nfs: ignore SB_RDONLY when mounting nfs"
Revert "nfs: clear SB_RDONLY before getting superblock"
Revert "nfs: ignore SB_RDONLY when remounting nfs"
NFS: Add a module option to disable directory delegations
NFS: Shortcut lookup revalidations if we have a directory delegation
NFS: Request a directory delegation during RENAME
NFS: Request a directory delegation on ACCESS, CREATE, and UNLINK
NFS: Add support for sending GDD_GETATTR
NFSv4/pNFS: Clear NFS_INO_LAYOUTCOMMIT in pnfs_mark_layout_stateid_invalid
NFSv4.1: protect destroying and nullifying bc_serv structure
SUNRPC: new helper function for stopping backchannel server
SUNRPC: cleanup common code in backchannel request
NFSv4.1: pass transport for callback shutdown
NFSv4: ensure the open stateid seqid doesn't go backwards
...
Brendan Jackman [Sun, 7 Dec 2025 03:53:18 +0000 (03:53 +0000)]
bug: Hush suggest-attribute=format for __warn_printf()
Recent additions to this function cause GCC 14.3.0 to get excited
(W=1) and suggest a missing attribute:
lib/bug.c: In function '__warn_printf':
lib/bug.c:187:25: error: function '__warn_printf' be a candidate for 'gnu_printf' format attribute [-Werror=suggest-attribute=format]
187 | vprintk(fmt, *args);
| ^~~~~~~
Disable the diagnostic locally, following the pattern used for stuff
like va_format().
Heiko Carstens [Mon, 8 Dec 2025 20:06:58 +0000 (21:06 +0100)]
bug: Let report_bug_entry() provide the correct bugaddr
report_bug_entry() always provides zero for bugaddr but could easily
extract the correct address from the provided bug_entry. Just do that to
have proper warning messages.
E.g. adding an artificial:
void foo(void) { WARN_ONCE(1, "bar"); }
function generates this warning message:
WARNING: arch/s390/kernel/setup.c:1017 at 0x0, CPU#0: swapper/0/0
^^^
With the correct bug address this changes to:
WARNING: arch/s390/kernel/setup.c:1017 at foo+0x1c/0x40, CPU#0: swapper/0/0
^^^^^^^^^^^^^
Evan Li [Fri, 12 Dec 2025 08:49:43 +0000 (16:49 +0800)]
perf/x86/intel: Fix NULL event dereference crash in handle_pmi_common()
handle_pmi_common() may observe an active bit set in cpuc->active_mask
while the corresponding cpuc->events[] entry has already been cleared,
which leads to a NULL pointer dereference.
This can happen when interrupt throttling stops all events in a group
while PEBS processing is still in progress. perf_event_overflow() can
trigger perf_event_throttle_group(), which stops the group and clears
the cpuc->events[] entry, but the active bit may still be set when
handle_pmi_common() iterates over the events.
The following recent fix:
7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss")
moved the cpuc->events[] clearing from x86_pmu_stop() to x86_pmu_del() and
relied on cpuc->active_mask/pebs_enabled checks. However,
handle_pmi_common() can still encounter a NULL cpuc->events[] entry
despite the active bit being set.
Add an explicit NULL check on the event pointer before using it,
to cover this legitimate scenario and avoid the NULL dereference crash.
Fixes: 7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss") Reported-by: kitta <kitta@linux.alibaba.com> Co-developed-by: kitta <kitta@linux.alibaba.com> Signed-off-by: Evan Li <evan.li@linux.alibaba.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://patch.msgid.link/20251212084943.2124787-1-evan.li@linux.alibaba.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220855
drm/xe/multi_queue: Support active group after primary is destroyed
Add support to keep the group active after the primary queue is
destroyed. Instead of killing the primary queue during exec_queue
destroy ioctl, kill it when all the secondary queues of the group
are killed.
drm/xe/multi_queue: Teardown group upon job timeout
Upon a job timeout, teardown the multi-queue group by
triggering TDR on all queues of the multi-queue group
and by skipping timeout checks in them.
v5: Ban the group while triggering TDR for the guc
reported errors
Add FIXME in TDR to take multi-queue group off HW
(Matt Brost)
v6: Trigger cleanup of group only for multi-queue case
drm/xe/multi_queue: Set QUEUE_DRAIN_MODE for Multi Queue batches
To properly support soft light restore between batches
being arbitrated at the CFEG, PIPE_CONTROL instructions
have a new bit in the first DW, QUEUE_DRAIN_MODE. When
set, this indicates to the CFEG that it should only
drain the current queue.
Additionally we no longer want to set the CS_STALL bit
for these multi queue queues as this causes the entire
pipeline to stall waiting for completion of the prior
batch, preventing this soft light restore from occurring
between queues in a queue group.
v4: Assert !multi_queue where applicable (Matt Roper)
drm/xe/multi_queue: Handle tearing down of a multi queue
As all queues of a multi queue group use the primary queue of the group
to interface with GuC. Hence there is a dependency between the queues of
the group. So, when primary queue of a multi queue group is cleaned up,
also trigger a cleanup of the secondary queues also. During cleanup, stop
and re-start submission for all queues of a multi queue group to avoid
any submission happening in parallel when a queue is being cleaned up.
v2: Initialize group->list_lock, add fs_reclaim dependency, remove
unwanted secondary queues cleanup (Matt Brost)
v3: Properly handle cleanup of multi-queue group (Matt Brost)
v4: Fix IS_ENABLED(CONFIG_LOCKDEP) check (Matt Brost)
Revert stopping/restarting of submissions on queues of the
group in TDR as it is not needed.
drm/xe/multi_queue: Add support for multi queue dynamic priority change
Support dynamic priority change for multi queue group queues via
exec queue set_property ioctl. Issue CGP_SYNC command to GuC through
the drm scheduler message interface for priority to take effect.
v2: Move is_multi_queue check to exec_queue layer and assert
is_multi_queue being set in guc submission layer (Matt Brost)
v3: Assert CGP_SYNC message length is valid (Matt Brost)
drm/xe/multi_queue: Add exec_queue set_property ioctl support
This patch adds support for exec_queue set_property ioctl.
It is derived from the original work which is part of
https://patchwork.freedesktop.org/series/112188/
Currently only DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE_PRIORITY
property can be dynamically set.
v2: Check for and update kernel-doc which property this ioctl
supports (Matt Brost)
Only MULTI_QUEUE_PRIORITY property is valid for secondary queues of a
multi queue group. MULTI_QUEUE_PRIORITY only applies to multi queue group
queues. Detect invalid user queue property setting and return error.
drm/xe/multi_queue: Add multi queue priority property
Add support for queues of a multi queue group to set
their priority within the queue group by adding property
DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE_PRIORITY.
This is the only other property supported by secondary
queues of a multi queue group, other than
DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE.
v2: Add kernel doc for enum xe_multi_queue_priority,
Add assert for priority values, fix includes and
declarations (Matt Brost)
v3: update uapi kernel-doc (Matt Brost)
v4: uapi change due to rebase
drm/xe/multi_queue: Add GuC interface for multi queue support
Implement GuC commands and response along with the Context
Group Page (CGP) interface for multi queue support.
Ensure that only primary queue (q0) of a multi queue group
communicate with GuC. The secondary queues of the group only
need to maintain LRCA and interface with drm scheduler.
Use primary queue's submit_wq for all secondary queues of a multi
queue group. This serialization avoids any locking around CGP
synchronization with GuC.
v2: Fix G2H_LEN_DW_MULTI_QUEUE_CONTEXT value, add more comments
(Matt Brost)
v3: Minor code refactro, use xe_gt_assert
v4: Use xe_guc_ct_wake_waiters(), remove vf recovery support
(Matt Brost)