media: chips-media: wave5: Improve performance of decoder
The current decoding method was to wait until each frame was
decoded after feeding a bitstream. As a result, performance was low
and Wave5 could not achieve max pixel processing rate.
Update driver to use an asynchronous approach for decoding and feeding a
bitstream in order to achieve full capabilities of the device.
WAVE5 supports command-queueing to maximize performance by pipelining
internal commands and by hiding wait cycle taken to receive a command
from Host processor.
Instead of waiting for each command to be executed before sending the
next command, Host processor just places all the commands in the
command-queue and goes on doing other things while the commands in the
queue are processed by VPU.
While Host processor handles its own tasks, it can receive VPU interrupt
request (IRQ).
In this case, host processor can simply exit interrupt service routine
(ISR) without accessing to host interface to read the result of the
command reported by VPU.
After host processor completed its tasks, host processor can read the
command result when host processor needs the reports and does
response processing.
To achieve this goal, the device_run() calls v4l2_m2m_job_finish
so that next command can be sent to VPU continuously, if there is
any result, then irq is triggered and gets decoded frames and returns
them to upper layer.
Theses processes work independently each other without waiting
a decoded frame.
Signed-off-by: Jackson Lee <jackson.lee@chipsnmedia.com> Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com> Tested-by: Brandon Brnich <b-brnich@ti.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>