Maxime Ripard [Fri, 21 Jun 2024 15:20:44 +0000 (16:20 +0100)]
drm/vc4: Introduce generation number enum
With the introduction of the BCM2712 support, we will get yet another
generation of display engine to support.
The binary check of whether it's VC5 or not thus doesn't work anymore,
especially since some parts of the driver will have changed with BCM2711,
and some others with BCM2712.
Let's introduce an enum to store the generation the driver is running
on, which should provide more flexibility.
Dom Cobley [Fri, 21 Jun 2024 15:20:43 +0000 (16:20 +0100)]
drm/vc4: hvs: Remove ABORT_ON_EMPTY flag
ABORT_ON_EMPTY chooses whether the HVS abandons the current frame
when it experiences an underflow, or attempts to continue.
In theory the frame should be black from the point of underflow,
compared to a shift of sebsequent pixels to the left.
Unfortunately it seems to put the HVS is a bad state where it is not
possible to recover simply. This typically requires a reboot
following the 'flip done timed out message'.
Discussion with Broadcom has suggested we don't use this flag.
All their testing is done with it disabled.
Additionally setting BLANK_INSERT_EN causes the HDMI to output
blank pixels on an underflow which avoids it losing sync.
After this change a 'flip done timed out' due to sdram bandwidth
starvation or too low a clock is recoverable once the situation improves.
Dave Stevenson [Fri, 21 Jun 2024 15:20:41 +0000 (16:20 +0100)]
drm/vc4: hvs: Fix dlist debug not resetting the next entry pointer
The debug function to display the dlists didn't reset next_entry_start
when starting each display, so resulting in not stopping the
list at the correct place.
Dom Cobley [Fri, 21 Jun 2024 15:20:40 +0000 (16:20 +0100)]
drm/vc4: hdmi: Avoid hang with debug registers when suspended
Trying to read /sys/kernel/debug/dri/1/hdmi1_regs
when the hdmi is disconnected results in a fatal system hang.
This is due to the pm suspend code disabling the dvp clock.
That is just a gate of the 108MHz clock in DVP_HT_RPI_MISC_CONFIG,
which results in accesses hanging AXI bus.
Dave Stevenson [Fri, 21 Jun 2024 15:20:39 +0000 (16:20 +0100)]
drm/vc4: plane: YUV planes require vertical scaling to always be enabled
It has been observed that a YUV422 unity scaled plane isn't displayed.
Enabling vertical scaling on the UV planes solves this. There is
already a similar clause to always enable horizontal scaling on the
UV planes.
Maxime Ripard [Fri, 21 Jun 2024 15:20:37 +0000 (16:20 +0100)]
drm/vc4: crtc: Move assigned_channel to a variable
We access multiple times the vc4_crtc_state->assigned_channel variable
in the vc4_crtc_get_scanout_position() function, so let's store it in a
local variable.
Maxime Ripard [Fri, 21 Jun 2024 15:20:34 +0000 (16:20 +0100)]
drm/vc4: hvs: Print error if we fail an allocation
We need to allocate a few additional structures when checking our
atomic_state, especially related to hardware SRAM that will hold the
plane descriptors (DLIST) and the current line context (LBM) during
composition.
Since those allocation can fail, let's add some error message in that
case to help debug what goes wrong.
Maxime Ripard [Fri, 21 Jun 2024 15:20:32 +0000 (16:20 +0100)]
drm/vc4: hdmi: Warn if writing to an unknown HDMI register
The VC4 HDMI driver has a bunch of accessors to read from a register.
The read accessor was warning when accessing an unknown register, but
the write one was just returning silently.
Let's make sure we warn also when writing to an unknown register.
Dave Stevenson [Fri, 21 Jun 2024 15:20:30 +0000 (16:20 +0100)]
drm/vc4: hvs: Set AXI panic modes for the HVS
The HVS can change AXI request mode based on how full the COB
FIFOs are.
Until now the vc4 driver has been relying on the firmware to
have set these to sensible values.
With HVS channel 2 now being used for live video, change the
panic mode for all channels to be explicitly set by the driver,
and the same for all channels.
Dom Cobley [Fri, 21 Jun 2024 15:20:28 +0000 (16:20 +0100)]
drm/vc4: hdmi: Avoid log spam for audio start failure
We regularly get dmesg error reports of:
[ 18.184066] hdmi-audio-codec hdmi-audio-codec.3.auto: ASoC: error at snd_soc_dai_startup on i2s-hifi: -19
[ 18.184098] MAI: soc_pcm_open() failed (-19)
These are generated for any disconnected hdmi interface when pulseaudio
attempts to open the associated ALSA device (numerous times). Each open
generates a kernel error message, generating general log spam.
The error messages all come from _soc_pcm_ret in sound/soc/soc-pcm.c#L39
which suggests returning ENOTSUPP, rather that ENODEV will be quiet.
And indeed it is.
Christian König [Mon, 26 Aug 2024 12:25:40 +0000 (14:25 +0200)]
drm/doc: Document submission error signaling
Different approaches have been tried to signal resets and other errors in
vendor specific ways which not only resulted in a wide variety of
implementations but also repeating the same bugs and problems over different
drivers.
Document that drivers should use dma_fence based error signaling which is
vendor agnostic and allows userspace to query submission errors in generic
non-vendor specific code.
Christian König [Mon, 26 Aug 2024 12:25:38 +0000 (14:25 +0200)]
drm/sched: add optional errno to drm_sched_start()
The current implementation of drm_sched_start uses a hardcoded
-ECANCELED to dispose of a job when the parent/hw fence is NULL.
This results in drm_sched_job_done being called with -ECANCELED for
each job with a NULL parent in the pending list, making it difficult
to distinguish between recovery methods, whether a queue reset or a
full GPU reset was used.
To improve this, we first try a soft recovery for timeout jobs and
use the error code -ENODATA. If soft recovery fails, we proceed with
a queue reset, where the error code remains -ENODATA for the job.
Finally, for a full GPU reset, we use error codes -ECANCELED or
-ETIME. This patch adds an error code parameter to drm_sched_start,
allowing us to differentiate between queue reset and GPU reset
failures. This enables user mode and test applications to validate
the expected correctness of the requested operation. After a
successful queue reset, the only way to continue normal operation is
to call drm_sched_job_done with the specific error code -ENODATA.
v1: Initial implementation by Jesse utilized amdgpu_device_lock_reset_domain
and amdgpu_device_unlock_reset_domain to allow user mode to track
the queue reset status and distinguish between queue reset and
GPU reset.
v2: Christian suggested using the error codes -ENODATA for queue reset
and -ECANCELED or -ETIME for GPU reset, returned to
amdgpu_cs_wait_ioctl.
v3: To meet the requirements, we introduce a new function
drm_sched_start_ex with an additional parameter to set
dma_fence_set_error, allowing us to handle the specific error
codes appropriately and dispose of bad jobs with the selected
error code depending on whether it was a queue reset or GPU reset.
v4: Alex suggested using a new name, drm_sched_start_with_recovery_error,
which more accurately describes the function's purpose.
Additionally, it was recommended to add documentation details
about the new method.
v5: Fixed declaration of new function drm_sched_start_with_recovery_error.(Alex)
v6 (chk): rebase on upstream changes, cleanup the commit message,
drop the new function again and update all callers,
apply the errno also to scheduler fences with hw fences
v7 (chk): rebased
drm/bochs: Use GEM SHMEM helpers for memory management
Replace GEM VRAM with GEM SHMEM in bochs. The new memory manager
stores buffer objects in system memory. Makes the driver's memory
management more reliably.
Most of the changes are hidden in external helpers that allocate
buffers. Replacing DRM_GEM_VRAM_DRIVER with DRM_GEM_SHMEM_DRIVER_OPS
swaps these. With GEM VRAM, the video memory was updated directly by
the DRM client. The biggest change within bochs is in atomic_update,
which now updates video memory via memcpy() from the BO in system
memory. Shadow-plane helpers maintaining the pointers to the buffer's
data, so bochs doesn't have to. The update is triggered by each page
flip's call to the framebuffer's dirty helper. The driver supports
damage clipping to minimize memcpy() overhead.
The advantage of GEM SHMEM is that it makes memory management
more reliable. Given DRM's double buffering during page flips, the
minimum amount of video memory is three times the maximum consumption
in some pathological cases. For example, if the maximum size of a GEM
buffer is 1920x1080-32 (i.e., 32-bit FullHD), the buffer size is
8 MiB. Display hardware has to provide at lease 24 MiB to reliably
page flip such configurations. This cannot always be guaranteed and
bochs already contains code to rule out <4 MiB configurations. With
GEM SHMEM, only 8 MiB of video memory are required for the given
example. Unsupported modes can be sorted out easily.
Remove the simple display pipeline in favor of the regular atomic
helpers in bochs. The simple-pipe helpers are considered deprecated
in DRM.
This effectivly inlines the simple-pipe code for plane and CRTC
support. Instead of a single update helper, there's now a mode-set
helper for the CRTC and an update helper for the plane. The encoder
changes type from NONE ot VIRTUAL.
Removing simple-pipe helpers from bochs will allow for related
cleanups in GEM VRAM helpers.
drm/bochs: Allocate DRM device in struct bochs_device
Allocate an instance of struct drm_device in struct bochs_device. Also
remove all uses of dev_private from bochs and upcast from the embedded
instance if necessary.
Implement a read function for struct drm_edid and read the EDID data
with drm_edit_read_custom(). Update the connector data accordingly.
The EDID data comes from the emulator itself and the connector stores
a copy in its EDID property. The drm_edid field in struct bochs_device
is therefore not required. Remove it.
If qemu provides no EDID data, install default display modes as before.
drm/msm: add another DRM_DISPLAY_DSC_HELPER selection
In the drm/msm driver both DSI and DPU subdrivers use drm_dsc_*
functions, but only DSI selects DRM_DISPLAY_DSC_HELPER symbol. Add
missing select to the DPU subdriver too.
Fixes: ca097d4d94d8 ("drm/display: split DSC helpers from DP helpers") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202409040129.rqhtRTeC-lkp@intel.com/ Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240905-fix-dsc-helpers-v1-2-3ae4b5900f89@linaro.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
The Xe driver shares display code with the i915 driver, pulling in the
dependency on the DSC helpers this way. However when working on
separating DRM_DISPLAY_DSC_HELPER this was left unnoticed. Add missing
dependency.
Fixes: ca097d4d94d8 ("drm/display: split DSC helpers from DP helpers") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202409032226.x6f4SWQl-lkp@intel.com/ Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240905-fix-dsc-helpers-v1-1-3ae4b5900f89@linaro.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
Mary Guillemard [Fri, 30 Aug 2024 08:03:50 +0000 (10:03 +0200)]
drm/panthor: Add DEV_QUERY_TIMESTAMP_INFO dev query
Expose timestamp information supported by the GPU with a new device
query.
Mali uses an external timer as GPU system time. On ARM, this is wired to
the generic arch timer so we wire cntfrq_el0 as device frequency.
This new uAPI will be used in Mesa to implement timestamp queries and
VK_KHR_calibrated_timestamps.
Since this extends the uAPI and because userland needs a way to advertise
those features conditionally, this also bumps the driver minor version.
v2:
- Rewrote to use GPU timestamp register
- Added timestamp_offset to drm_panthor_timestamp_info
- Add missing include for arch_timer_get_cntfrq
- Rework commit message
v3:
- Add panthor_gpu_read_64bit_counter
- Change panthor_gpu_read_timestamp to use
panthor_gpu_read_64bit_counter
v4:
- Fix multiple typos in uAPI documentation
- Mention behavior when the timestamp frequency is unknown
- Use u64 instead of unsigned long long
for panthor_gpu_read_timestamp
- Apply r-b from Mihail
Lu Baolu [Mon, 2 Sep 2024 01:46:58 +0000 (09:46 +0800)]
drm/nouveau/tegra: Use iommu_paging_domain_alloc()
In nvkm_device_tegra_probe_iommu(), a paging domain is allocated for @dev
and attached to it on success. Use iommu_paging_domain_alloc() to make it
explicit.
Jani Nikula [Wed, 4 Sep 2024 08:52:06 +0000 (11:52 +0300)]
drm/bridge/tdp158: fix build failure
ARCH=arm build fails with:
CC [M] drivers/gpu/drm/bridge/ti-tdp158.o
../drivers/gpu/drm/bridge/ti-tdp158.c: In function ‘tdp158_enable’:
../drivers/gpu/drm/bridge/ti-tdp158.c:31:9: error: implicit declaration of function ‘gpiod_set_value_cansleep’ [-Werror=implicit-function-declaration]
31 | gpiod_set_value_cansleep(tdp158->enable, 1);
| ^~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/bridge/ti-tdp158.c: In function ‘tdp158_probe’:
../drivers/gpu/drm/bridge/ti-tdp158.c:80:26: error: implicit declaration of function ‘devm_gpiod_get_optional’; did you mean ‘devm_regulator_get_optional’? [-Werror=implicit-function-declaration]
80 | tdp158->enable = devm_gpiod_get_optional(dev, "enable", GPIOD_OUT_LOW);
| ^~~~~~~~~~~~~~~~~~~~~~~
| devm_regulator_get_optional
../drivers/gpu/drm/bridge/ti-tdp158.c:80:65: error: ‘GPIOD_OUT_LOW’ undeclared (first use in this function)
80 | tdp158->enable = devm_gpiod_get_optional(dev, "enable", GPIOD_OUT_LOW);
| ^~~~~~~~~~~~~
../drivers/gpu/drm/bridge/ti-tdp158.c:80:65: note: each undeclared identifier is reported only once for each function it appears in
Add the proper gpio consumer #include to fix this, and juggle the
include order to be a bit more pleasant on the eye while at it.
Fixes: a15710027afb ("drm/bridge: add support for TI TDP158") Cc: Marc Gonzalez <mgonzalez@freebox.fr> Cc: Robert Foss <rfoss@kernel.org> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Neil Armstrong <neil.armstrong@linaro.org> Cc: Laurent Pinchart <Laurent.pinchart@ideasonboard.com> Cc: Jonas Karlman <jonas@kwiboo.se> Cc: Jernej Skrabec <jernej.skrabec@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240904085206.3331553-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Jani Nikula [Tue, 3 Sep 2024 17:34:37 +0000 (20:34 +0300)]
drm/mm: annotate drm_mm_node_scanned_block() with __maybe_unused
Clang build with CONFIG_DRM_DEBUG_MM=n, CONFIG_WERROR=y, and W=1 leads to:
CC [M] drivers/gpu/drm/drm_mm.o
../drivers/gpu/drm/drm_mm.c:614:20: error: function 'drm_mm_node_scanned_block' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
static inline bool drm_mm_node_scanned_block(const struct drm_mm_node *node)
^
Fix this by annotating drm_mm_node_scanned_block() with __maybe_unused.
Andy Shevchenko [Thu, 29 Aug 2024 15:46:40 +0000 (18:46 +0300)]
drm/mm: Mark drm_mm_interval_tree*() functions with __maybe_unused
The INTERVAL_TREE_DEFINE() uncoditionally provides a bunch of helper
functions which in some cases may be not used. This, in particular,
prevents kernel builds with clang, `make W=1` and CONFIG_WERROR=y:
Marc Gonzalez [Mon, 12 Aug 2024 14:51:02 +0000 (16:51 +0200)]
drm/bridge: add support for TI TDP158
TDP158 is an AC-coupled DVI / HDMI to TMDS level shifting Redriver.
It supports DVI 1.0, HDMI 1.4b and 2.0b.
It supports 4 TMDS channels, HPD, and a DDC interface.
It supports dual power supply rails (1.1V on VDD, 3.3V on VCC)
for power reduction. Several methods of power management are
implemented to reduce overall power consumption.
It supports fixed receiver EQ gain using I2C or pin strap to
compensate for different lengths input cable or board traces.
Features
- AC-coupled TMDS or DisplayPort dual-mode physical layer input
to HDMI 2.0b TMDS physical layer output supporting up to 6Gbps
data rate, compatible with HDMI 2.0b electrical parameters
- DisplayPort dual-mode standard version 1.1
- Programmable fixed receiver equalizer up to 15.5dB
- Global or independent high speed lane control, pre-emphasis
and transmit swing, and slew rate control
- I2C or pin strap programmable
- Configurable as a DisplayPort redriver through I2C
- Full lane swap on main lanes
- Low power consumption (200 mW at 6Gbps, 8 mW in shutdown)
https://www.ti.com/lit/ds/symlink/tdp158.pdf
On our board, I2C_EN is pulled high.
Thus, this code defines a module_i2c_driver.
The default settings work fine for our use-case.
So this basic driver doesn't need to tweak any I2C registers.
Marc Gonzalez [Mon, 12 Aug 2024 14:51:01 +0000 (16:51 +0200)]
dt-bindings: display: bridge: add TI TDP158
TDP158 is an AC-coupled DVI / HDMI to TMDS level shifting Redriver.
It supports DVI 1.0, HDMI 1.4b and 2.0b.
It supports 4 TMDS channels, HPD, and a DDC interface.
It supports dual power supply rails (1.1V on VDD, 3.3V on VCC)
for power reduction. Several methods of power management are
implemented to reduce overall power consumption.
It supports fixed receiver EQ gain using I2C or pin strap to
compensate for different lengths input cable or board traces.
Features
- AC-coupled TMDS or DisplayPort dual-mode physical layer input
to HDMI 2.0b TMDS physical layer output supporting up to 6Gbps
data rate, compatible with HDMI 2.0b electrical parameters
- DisplayPort dual-mode standard version 1.1
- Programmable fixed receiver equalizer up to 15.5dB
- Global or independent high speed lane control, pre-emphasis
and transmit swing, and slew rate control
- I2C or pin strap programmable
- Configurable as a DisplayPort redriver through I2C
- Full lane swap on main lanes
- Low power consumption (200 mW at 6Gbps, 8 mW in shutdown)
https://www.ti.com/lit/ds/symlink/tdp158.pdf
Like the TFP410, the TDP158 can be set up in 2 different ways:
1) hard-coding its configuration settings using pin-strapping resistors
2) placing it on an I2C bus, and defer set-up until run-time
The mode is selected via pin 8 = I2C_EN
I2C_EN high = I2C Control Mode
I2C_EN low = Pin Strap Mode
drm/imx: parallel-display: switch to imx_legacy_bridge / drm_bridge_connector
Use the imx_legacy bridge driver instead of handlign display modes via
the connector node.
All existing usecases already support attaching using
the DRM_BRIDGE_ATTACH_NO_CONNECTOR flag, while the imx_legacy bridge
doesn't support creating connector at all. Switch to
drm_bridge_connector at the same time.
drm/imx: ldb: switch to imx_legacy_bridge / drm_bridge_connector
Use the imx_legacy bridge driver instead of handlign display modes via
the connector node.
All existing usecases already support attaching using
the DRM_BRIDGE_ATTACH_NO_CONNECTOR flag, while the imx_legacy bridge
doesn't support creating connector at all. Switch to
drm_bridge_connector at the same time.
i.MX DRM DT bindings allow using either a proper panel / bridge graph to
provide information about connected panels, or just a display-timings DT
node, describing just the timings and the flags. Add helper bridge
driver supporting the latter usecase. It will be used by both LDB and
parallel-display drivers.
Defer panel handling to drm_panel_bridge, unifying codepaths for the
panel and bridge cases. The MFD_SYSCON symbol is moved to select to
prevent Kconfig symbol loops.
None of the boards ever supported by the upstream kernel used the custom
DDC bus support with the LDB connector. If a need arises to do so, one
should use panel-simple and its DDC bus code. Drop ddc-i2c-bus support
from the imx-ldb driver.
Bindings for the imx-ldb never allowed specifying the EDID in DT. None
of the existing DT files use it. Drop it now in favour of using debugfs
overrides or the drm.edid_firmware support.
drm/imx: parallel-display: drop edid override support
None of the in-kernel DT files ever used edid override with the
fsl-imx-drm driver. In case the EDID needs to be specified manually, DRM
core allows one to either override it via the debugfs or to load it via
request_firmware by using DRM_LOAD_EDID_FIRMWARE. In all other cases
EDID and/or modes are to be provided as a part of the panel driver.
dt-bindings: display: imx/ldb: drop ddc-i2c-bus property
The in-kernel DT files do not use ddc-i2c-bus property with the iMX LVDS
Display Bridge. If in future a need arises to support such usecase, the
panel-simple should be used, which is able to handle the DDC bus.
dt-bindings: display: fsl-imx-drm: drop edid property support
None of the in-kernel DT files ever used edid override with the
fsl-imx-drm driver. In case the EDID needs to be specified manually, DRM
core allows one to either override it via the debugfs or to load it via
request_firmware by using DRM_LOAD_EDID_FIRMWARE. In all other cases
EDID and/or modes are to be provided as a part of the panel driver.
Drop the edid property from the fsl-imx-drm bindings.
Currently the DRM DSC functions are selected by the
DRM_DISPLAY_DP_HELPER Kconfig symbol. This is not optimal, since the DSI
code (both panel and host drivers) end up selecting the seemingly
irrelevant DP helpers. Split the DSC code to be guarded by the separate
DRM_DISPLAY_DSC_HELPER Kconfig symbol.
sizeof(unsigned long) * 8 is the number of bits in an unsigned long
variable, replace it with BITS_PER_LONG macro to make them simpler.
And fix the warning:
WARNING: Comparisons should place the constant on the right side of the test
#23: FILE: drivers/gpu/drm/panthor/panthor_mmu.c:2696:
+ if (BITS_PER_LONG < va_bits) {
Mary Guillemard [Mon, 19 Aug 2024 08:02:23 +0000 (10:02 +0200)]
drm/panfrost: Add cycle counter job requirement
Extend the uAPI with a new job requirement flag for cycle
counters. This requirement is used by userland to indicate that a job
requires cycle counters or system timestamp to be propagated. (for use
with write value timestamp jobs)
We cannot enable cycle counters unconditionally as this would result in
an increase of GPU power consumption. As a result, they should be left
off unless required by the application.
If a job requires cycle counters or system timestamps propagation, we
must enable cycle counting before issuing a job and disable it right
after the job completes.
Since this extends the uAPI and because userland needs a way to advertise
features like VK_KHR_shader_clock conditionally, we bumps the driver
minor version.
v2:
- Rework commit message
- Squash uAPI changes and implementation in this commit
- Simplify changes based on Steven Price comments
v3:
- Add Steven Price r-b
- Fix a codestyle issue
Mary Guillemard [Mon, 19 Aug 2024 08:02:22 +0000 (10:02 +0200)]
drm/panfrost: Add SYSTEM_TIMESTAMP and SYSTEM_TIMESTAMP_FREQUENCY parameters
Expose system timestamp and frequency supported by the GPU.
Mali uses an external timer as GPU system time. On ARM, this is wired to
the generic arch timer so we wire cntfrq_el0 as device frequency.
This new uAPI will be used in Mesa to implement timestamp queries and
VK_KHR_calibrated_timestamps.
v2:
- Rewrote to use GPU timestamp register
- Add missing include for arch_timer_get_cntfrq
- Rework commit message
v3:
- Move panfrost_cycle_counter_get and panfrost_cycle_counter_put to
panfrost_ioctl_query_timestamp
- Handle possible overflow in panfrost_timestamp_read
Matt Coster [Fri, 30 Aug 2024 15:06:01 +0000 (15:06 +0000)]
drm/imagination: Use pvr_vm_context_get()
I missed this open-coded kref_get() while trying to debug a refcount
bug, so let's use the helper function here to avoid that waste of time
again in the future.
Dave Airlie [Fri, 30 Aug 2024 03:41:26 +0000 (13:41 +1000)]
Merge tag 'drm-intel-next-2024-08-29' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
Cross-driver (xe-core) Changes:
- Require BMG scanout buffers to be 64k physically aligned (Maarten)
Core (drm) Changes:
- Introducing Xe2 ccs modifiers for integrated and discrete graphics (Juha-Pekka)
Driver Changes:
- General cleanup and more work moving towards intel_display isolation (Jani)
- New display workaround (Suraj)
- Use correct cp_irq_count on HDCP (Suraj)
- eDP PSR fix when CRC is enabled (Jouni)
- Fix DP MST state after a sink reset (Imre)
- Fix Arrow Lake GSC firmware version (John)
- Use chained DSBs for LUT programming (Ville)
Dave Airlie [Fri, 30 Aug 2024 03:41:05 +0000 (13:41 +1000)]
Merge tag 'drm-xe-next-2024-08-28' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
UAPI Changes:
- Fix OA format masks which were breaking build with gcc-5
Cross-subsystem Changes:
Driver Changes:
- Use dma_fence_chain_free in chain fence unused as a sync (Matthew Brost)
- Refactor hw engine lookup and mmio access to be used in more places
(Dominik, Matt Auld, Mika Kuoppala)
- Enable priority mem read for Xe2 and later (Pallavi Mishra)
- Fix PL1 disable flow in xe_hwmon_power_max_write (Karthik)
- Fix refcount and speedup devcoredump (Matthew Brost)
- Add performance tuning changes to Xe2 (Akshata, Shekhar)
- Fix OA sysfs entry (Ashutosh)
- Add first GuC firmware support for BMG (Julia)
- Bump minimum GuC firmware for platforms under force_probe to match LNL
and BMG (Julia)
- Fix access check on user fence creation (Nirmoy)
- Add/document workarounds for Xe2 (Julia, Daniele, John, Tejas)
- Document workaround and use proper WA infra (Matt Roper)
- Fix VF configuration on media GT (Michal Wajdeczko)
- Fix VM dma-resv lock (Matthew Brost)
- Allow suspend/resume exec queue backend op to be called multiple times
(Matthew Brost)
- Add GT stats to debugfs (Nirmoy)
- Add hwconfig to debugfs (Matt Roper)
- Compile out all debugfs code with ONFIG_DEUBG_FS=n (Lucas)
- Remove dead kunit code (Jani Nikula)
- Refactor drvdata storing to help display (Jani Nikula)
- Cleanup unsused xe parameter in pte handling (Himal)
- Rename s/enable_display/probe_display/ for clarity (Lucas)
- Fix missing MCR annotation in couple of registers (Tejas)
- Fix DGFX display suspend/resume (Maarten)
- Prepare exec_queue_kill for PXP handling (Daniele)
- Fix devm/drmm issues (Daniele, Matthew Brost)
- Fix tile and ggtt fini sequences (Matthew Brost)
- Fix crashes when probing without firmware in place (Daniele, Matthew Brost)
- Use xe_managed for kernel BOs (Daniele, Matthew Brost)
- Future-proof dss_per_group calculation by using hwconfig (Matt Roper)
- Use reserved copy engine for user binds on faulting devices
(Matthew Brost)
- Allow mixing dma-fence jobs and long-running faulting jobs (Francois)
- Cleanup redundant arg when creating use BO (Nirmoy)
- Prevent UAF around preempt fence (Auld)
- Fix display suspend/resume (Maarten)
- Use vma_pages() helper (Thorsten)
- Calculate pagefault queue size (Stuart, Matthew Auld)
- Fix missing pagefault wq destroy (Stuart)
- Fix lifetime handling of HW fence ctx (Matthew Brost)
- Fix order destroy order for jobs (Matthew Brost)
- Fix TLB invalidation for media GT (Matthew Brost)
- Document GGTT (Rodrigo Vivi)
- Refactor GGTT layering and fix runtime outer protection (Rodrigo Vivi)
- Handle HPD polling on display pm runtime suspend/resume (Imre, Vinod)
- Drop unrequired NULL checks (Apoorva, Himal)
- Use separate rpm lockdep map for non-d3cold-capable devices (Thomas Hellström)
- Support "nomodeset" kernel command-line option (Thomas Zimmermann)
- Drop force_probe requirement for LNL and BMG (Lucas, Balasubramani)
host1x:
- fix syncpoint IRQ during resume
- use iommu_paging_domain_alloc()
imx:
- ipuv3: convert to struct drm_edid
omapdrm:
- improve error handling
panel:
- add support for BOE TV101WUM-LL2 plus DT bindings
- novatek-nt35950: improve error handling
- nv3051d: improve error handling
- panel-edp: add support for BOE NE140WUM-N6G; revert support for
SDC ATNA45AF01
- visionox-vtdr6130: improve error handling; use
devm_regulator_bulk_get_const()
renesas:
- rz-du: add support for RZ/G2UL plus DT bindings
Kunwu Chan [Fri, 23 Aug 2024 08:07:24 +0000 (16:07 +0800)]
gpu: host1x: Make host1x_context_device_bus_type constant
Since commit d492cc2573a0 ("driver core: device.h: make struct
bus_type a const *"), the driver core can properly handle constant
struct bus_type, move the host1x_context_device_bus_type variable
to be a constant structure as well, placing it into read-only memory
which can not be modified at runtime.
Mikko Perttunen [Thu, 25 Apr 2024 05:02:34 +0000 (08:02 +0300)]
gpu: host1x: Handle CDMA wraparound when debug printing
During channel debug information dump, when printing CDMA
opcodes, the circular nature of the CDMA pushbuffer wasn't being
taken into account, sometimes accessing past the end. Change
the printing to take this into account.
Mikko Perttunen [Wed, 24 Apr 2024 05:13:35 +0000 (08:13 +0300)]
drm/tegra: gem: Don't attach dma-bufs when not needed
The dma-buf import code currently attaches and maps all imported
dma-bufs to the drm device to get their sgt for mapping to the
directly managed IOMMU domain.
In many cases, like for newer chips (Tegra186+), the directly
managed IOMMU domain is, however, not used. Mapping to the drm
device can also cause issues e.g. with swiotlb since it is not
a real device.
To improve the situation, only attach and map imported dma-bufs
when required.
Ville Syrjälä [Mon, 24 Jun 2024 19:10:32 +0000 (22:10 +0300)]
drm/i915/dsb: Use chained DSBs for LUT programming
In order to better handle the necessary DSB DEwake tricks let's
switch over to using a chained DSB for the actual LUT programming.
The CPU will start 'dsb_color_commit', which in turn will start the
chained 'dsb_color_vblank'.
Ville Syrjälä [Mon, 24 Jun 2024 19:10:31 +0000 (22:10 +0300)]
drm/i915/dsb: s/dsb/dsb_color_vblank/
We'll soon utilize several DSBs during the commit. To that end
rename the current crtc_state->dsb to crtc_state->dsb_color_vblank
to better reflect its role (color managemnent stuff programmed during
vblank).
Ville Syrjälä [Mon, 24 Jun 2024 19:10:30 +0000 (22:10 +0300)]
drm/i915/dsb: Clear DSB_ENABLE_DEWAKE once the DSB is done
In order to avoid the DSB keeping the DEwake permanently
asserted we must clear DSB_PMCTRL_2.DSB_FORCE_DEWAKE once
we are done. For good measure do the same for
DSB_PMCTRL.DSB_ENABLE_DEWAKE.
Experimentally this doens't seem to be actually necessary
(unlike with DSB_FORCE_DEWAKE). That is, the DSB_ENABLE_DEWAKE
doesn't seem to do anything whenever the DSB is not active.
But I'd hate to waste a ton of power in case there I'm wrong
and there is some way DEwake could remaing asserted. One extra
register write is a small price to pay for some peace of mind.
Ville Syrjälä [Mon, 24 Jun 2024 19:10:29 +0000 (22:10 +0300)]
drm/i915/dsb: Allow intel_dsb_chain() to use DSB_WAIT_FOR_VBLANK
Allow intel_dsb_chain() to start the chained DSB
at start of the undelaye vblank. This is slightly
more involved than simply setting the bit as we
must use the DEwake mechanism to eliminate pkgC
latency.
And DSB_ENABLE_DEWAKE itself is problematic in that
it allows us to configure just a single scanline,
and if the current scanline is already past that
DSB_ENABLE_DEWAKE won't do anything, rendering the
whole thing moot.
The current workaround involves checking the pipe's current
scanline with the CPU, and if it looks like we're about to
miss the configured DEwake scanline we set DSB_FORCE_DEWAKE
to immediately assert DEwake. This is somewhat racy since the
hardware is making progress all the while we're checking it on
the CPU.
We can make things less racy by chaining two DSBs and handling
the DSB_FORCE_DEWAKE stuff entirely without CPU involvement:
1. CPU starts the first DSB immediately
2. First DSB configures the second DSB, including its dewake_scanline
3. First DSB starts the second w/ DSB_WAIT_FOR_VBLANK
4. First DSB asserts DSB_FORCE_DEWAKE
5. First DSB waits until we're outside the dewake_scanline-vblank_start
window
6. First DSB deasserts DSB_FORCE_DEWAKE
That will guarantee that the we are fully awake when the second
DSB starts to actually execute.
Ville Syrjälä [Mon, 24 Jun 2024 19:10:28 +0000 (22:10 +0300)]
drm/i915/dsb: Introduce intel_dsb_chain()
In order to handle the DEwake tricks without involving
the CPU we need a mechanism by which one DSB can start
another one. Add a basic function to do so. We'll extend
it later with additional code to actually deal with
DEwake.
Add functions to emit a DSB scanline window wait instructions.
We can either wait for the scanline to be IN the window
or OUT of the window.
The hardware doesn't handle wraparound so we must manually
deal with it by swapping the IN range to the inverse OUT
range, or vice versa.
Also add a bit of paranoia to catch the edge case of waiting
for the entire frame. That doesn't make sense since an IN
wait would be a nop, and an OUT wait would imply waiting
forever. Most of the time this also results in both scanline
ranges (original and inverted) to have lower=upper+1
which is nonsense from the hw POV.
For now we are only handling the case where the scanline wait
happens prior to latching the double buffered registers during
the commit (which might change the timings due to LRR/VRR/etc.)
Ville Syrjälä [Mon, 24 Jun 2024 19:10:26 +0000 (22:10 +0300)]
drm/i915/dsb: Precompute DSB_CHICKEN
Adjust the code that determines the correct DSB_CHICKEN value
to be usable for use within DSB commands themselves. Ie.
precompute it based on our knowledge of what the hardware state
(VRR vs. not mainly) will be at the time of the commit.
Ville Syrjälä [Mon, 24 Jun 2024 19:10:25 +0000 (22:10 +0300)]
drm/i915/dsb: Account for VRR properly in DSB scanline stuff
When determining various scanlines for DSB use we should take into
account whether VRR is active at the time when the DSB uses said
scanline information. For now all DSB scanline usage occurs prior
to the actual commit, so we only need to care about the state of
VRR at that time.
I've decided to move intel_crtc_scanline_to_hw() in its entirety
to the DSB code as it will also need to know the actual state
of VRR in order to do its job 100% correctly.
TODO: figure out how much of this could be moved to some
more generic place and perhaps be shared with the CPU
vblank evasion code/etc...
Ville Syrjälä [Mon, 24 Jun 2024 19:10:24 +0000 (22:10 +0300)]
drm/i915/dsb: Fix dewake scanline
Currently we calculate the DEwake scanline based on
the delayed vblank start, while in reality it should be computed
based on the undelayed vblank start (as that is where the DSB
actually starts). Currently it doesn't really matter as we
don't have any vblank delay configured, but that may change
in the future so let's be accurate in what we do.
We can also remove the max() as intel_crtc_scanline_to_hw()
can deal with negative numbers, which there really shouldn't
be anyway.
Ville Syrjälä [Mon, 24 Jun 2024 19:10:23 +0000 (22:10 +0300)]
drm/i915/dsb: Shuffle code around
Relocate intel_dsb_dewake_scanline() and dsb_chicken() upwards
in the file. I need to reuse these while emitting DSB
commands, and I'd like to keep the DSB command emission
stuff more or less grouped together in the file.
Also drop the intel_ prefix from intel_dsb_dewake_scanline() since
it's all internal stuff and thus doesn't need so much namespacing.
Ville Syrjälä [Mon, 24 Jun 2024 19:10:22 +0000 (22:10 +0300)]
drm/i915/dsb: Convert dewake_scanline to a hw scanline number earlier
Currently we switch from out software idea of a scanline
to the hw's idea of a scanline during the commit phase in
_intel_dsb_commit(). While that is slightly easier due to
fastsets fiddling with the timings, we'll also need to
generate proper hw scanline numbers already when emitting
DSB scanline wait instructions. So this approach won't
do in the future. Switch to hw scanline numbers earlier.
Also intel_dsb_dewake_scanline() itself already makes
some assumptions about VRR that don't take into account
VRR toggling during fastsets, so technically delaying
the sw->hw conversion doesn't even help us.
The other reason for delaying the conversion was that we
are using intel_get_crtc_scanline() during intel_dsb_commit()
which gives us the current sw scanline. But this is pretty
low level stuff anyway so just using raw PIPEDSL reads seems
fine here, and that of course gives us the hw scanline
directly, reducing the need to do so many conversions.
v2: Return the non-hw scanline from intel_dsb_dewake_scanline()