From: Satyanarayana K V P Date: Thu, 12 Jun 2025 08:04:02 +0000 (+0530) Subject: drm/xe: Add helper function to inject fault into ct_dead_capture() X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=87c648c31322459ed1284ee18ab9c256cb076fd0;p=thirdparty%2Fkernel%2Flinux.git drm/xe: Add helper function to inject fault into ct_dead_capture() When injecting fault to xe_guc_ct_send_recv() & xe_guc_mmio_send_recv() functions, the CI test systems are going out of space and crashing. To avoid this issue, a new helper function is created and when fault is injected into this xe_is_injection_active() helper function, ct dead capture is avoided which suppresses ct dumps in the log. Signed-off-by: Satyanarayana K V P Suggested-by: John Harrison Reviewed-by: John Harrison Cc: Michal Wajdeczko Cc: Jani Nikula Signed-off-by: John Harrison Link: https://lore.kernel.org/r/20250612080402.22011-1-satyanarayana.k.v.p@intel.com --- diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h index 0bc3bc8e68030..e4da797a984b5 100644 --- a/drivers/gpu/drm/xe/xe_device.h +++ b/drivers/gpu/drm/xe/xe_device.h @@ -195,6 +195,8 @@ void xe_device_declare_wedged(struct xe_device *xe); struct xe_file *xe_file_get(struct xe_file *xef); void xe_file_put(struct xe_file *xef); +int xe_is_injection_active(void); + /* * Occasionally it is seen that the G2H worker starts running after a delay of more than * a second even after being queued and activated by the Linux workqueue subsystem. This diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c index e303fec181747..37509f6195035 100644 --- a/drivers/gpu/drm/xe/xe_guc_ct.c +++ b/drivers/gpu/drm/xe/xe_guc_ct.c @@ -2033,6 +2033,24 @@ void xe_guc_ct_print(struct xe_guc_ct *ct, struct drm_printer *p, bool want_ctb) } #if IS_ENABLED(CONFIG_DRM_XE_DEBUG) + +#ifdef CONFIG_FUNCTION_ERROR_INJECTION +/* + * This is a helper function which assists the driver in identifying if a fault + * injection test is currently active, allowing it to reduce unnecessary debug + * output. Typically, the function returns zero, but the fault injection + * framework can alter this to return an error. Since faults are injected + * through this function, it's important to ensure the compiler doesn't optimize + * it into an inline function. To avoid such optimization, the 'noinline' + * attribute is applied. Compiler optimizes the static function defined in the + * header file as an inline function. + */ +noinline int xe_is_injection_active(void) { return 0; } +ALLOW_ERROR_INJECTION(xe_is_injection_active, ERRNO); +#else +int xe_is_injection_active(void) { return 0; } +#endif + static void ct_dead_capture(struct xe_guc_ct *ct, struct guc_ctb *ctb, u32 reason_code) { struct xe_guc_log_snapshot *snapshot_log; @@ -2043,6 +2061,12 @@ static void ct_dead_capture(struct xe_guc_ct *ct, struct guc_ctb *ctb, u32 reaso if (ctb) ctb->info.broken = true; + /* + * Huge dump is getting generated when injecting error for guc CT/MMIO + * functions. So, let us suppress the dump when fault is injected. + */ + if (xe_is_injection_active()) + return; /* Ignore further errors after the first dump until a reset */ if (ct->dead.reported)