Age | Commit message (Collapse) | Author |
|
Create KUnit tests to test the conversion between YUV and RGB. Test each
conversion and range combination with some common colors.
The code used to compute the expected result can be found in comment.
[Louis Chauvet:
- fix minor formating issues (whitespace, double line)
- change expected alpha from 0x0000 to 0xffff
- adapt to the new get_conversion_matrix usage
- apply the changes from Arthur
- move struct pixel_yuv_u8 to the test itself]
Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20250415-yuv-v18-6-f2918f71ec4b@bootlin.com
Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
|
|
Add support to the YUV formats bellow:
- NV12/NV16/NV24
- NV21/NV61/NV42
- YUV420/YUV422/YUV444
- YVU420/YVU422/YVU444
The conversion from yuv to rgb is done with fixed-point arithmetic, using
32.32 fixed-point numbers and the drm_fixed helpers.
To do the conversion, a specific matrix must be used for each color range
(DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
the `conversion_matrix` struct, along with the specific y_offset needed.
This matrix is queried only once, in `vkms_plane_atomic_update` and
stored in a `vkms_plane_state`. Those conversion matrices of each
encoding and range were obtained by rounding the values of the original
conversion matrices multiplied by 2^32. This is done to avoid the use of
floating point operations.
The same reading function is used for YUV and YVU formats. As the only
difference between those two category of formats is the order of field, a
simple swap in conversion matrix columns allows using the same function.
[Louis Chauvet:
- Adapted Arthur's work
- Implemented the read_line_t callbacks for yuv
- add struct conversion_matrix
- store the whole conversion_matrix in the plane state
- remove struct pixel_yuv_u8
- update the commit message
- Merge the modifications from Arthur]
Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20250415-yuv-v18-2-f2918f71ec4b@bootlin.com
Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
|
|
Re-introduce a line-by-line composition algorithm for each pixel format.
This allows more performance by not requiring an indirection per pixel
read. This patch is focused on readability of the code.
Line-by-line composition was introduced by [1] but rewritten back to
pixel-by-pixel algorithm in [2]. At this time, nobody noticed the impact
on performance, and it was merged.
This patch is almost a revert of [2], but in addition efforts have been
made to increase readability and maintainability of the rotation handling.
The blend function is now divided in two parts:
- Transformation of coordinates from the output referential to the source
referential
- Line conversion and blending
Most of the complexity of the rotation management is avoided by using
drm_rect_* helpers. The remaining complexity is around the clipping, to
avoid reading/writing outside source/destination buffers.
The pixel conversion is now done line-by-line, so the read_pixel_t was
replaced with read_pixel_line_t callback. This way the indirection is only
required once per line and per plane, instead of once per pixel and per
plane.
The read_line_t callbacks are very similar for most pixel format, but it
is required to avoid performance impact. Some helpers for color
conversion were introduced to avoid code repetition:
- *_to_argb_u16: perform colors conversion. They should be inlined by the
compiler, and they are used to avoid repetition between multiple variants
of the same format (argb/xrgb and maybe in the future for formats like
bgr formats).
This new algorithm was tested with:
- kms_plane (for color conversions)
- kms_rotation_crc (for rotations of planes)
- kms_cursor_crc (for translations of planes)
- kms_rotation (for all rotations and formats combinations) [3]
The performance gain was mesured with kms_fb_stress [4] with some
modification to fix the writeback format.
The performance improvement is around 5 to 10%.
[1]: commit 8ba1648567e2 ("drm: vkms: Refactor the plane composer to accept
new formats")
https://lore.kernel.org/all/20220905190811.25024-7-igormtorrente@gmail.com/
[2]: commit 322d716a3e8a ("drm/vkms: isolate pixel conversion
functionality")
https://lore.kernel.org/all/20230418130525.128733-2-mcanal@igalia.com/
[3]: https://lore.kernel.org/igt-dev/20240313-new_rotation-v2-0-6230fd5cae59@bootlin.com/
[4]: https://lore.kernel.org/all/20240422-kms_fb_stress-dev-v5-0-0c577163dc88@riseup.net/
Reviewed-by: José Expósito <jose.exposito89@gmail.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241118-yuv-v14-8-2dbc2f1e222c@bootlin.com
Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
|
|
Introduce two typedefs: pixel_read_t and pixel_write_t. It allows the
compiler to check if the passed functions take the correct arguments.
Such typedefs will help ensuring consistency across the code base in
case of update of these prototypes.
Rename input/output variable in a consistent way between read_line and
write_line.
A warn has been added in get_pixel_*_function to alert when an unsupported
pixel format is requested. As those formats are checked before
atomic_update callbacks, it should never happen.
Document for those typedefs.
Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Reviewed-by: José Expósito <jose.exposito89@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241118-yuv-v14-3-2dbc2f1e222c@bootlin.com
Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
|
|
All convertions from the ARGB16161616 format follow the same structure.
Instead of repeting the same structure for each supported format, create
a function to encapsulate the common logic and isolate the pixel
conversion functions in a callback function.
Suggested-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20230515135204.115393-4-mcanal@igalia.com
|
|
Currently, the pixel conversion functions repeat the same loop to
iterate the rows. Instead of repeating the same code for each pixel
format, create a function to wrap the loop and isolate the pixel
conversion functionality.
Suggested-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20230418130525.128733-2-mcanal@igalia.com
|
|
Currently the blend function only accepts XRGB_8888 and ARGB_8888
as a color input.
This patch refactors all the functions related to the plane composition
to overcome this limitation.
The pixels blend is done using the new internal format. And new handlers
are being added to convert a specific format to/from this internal format.
So the blend operation depends on these handlers to convert to this common
format. The blended result, if necessary, is converted to the writeback
buffer format.
This patch introduces three major differences to the blend function.
1 - All the planes are blended at once.
2 - The blend calculus is done as per line instead of per pixel.
3 - It is responsible to calculates the CRC and writing the writeback
buffer(if necessary).
These changes allow us to allocate way less memory in the intermediate
buffer to compute these operations. Because now we don't need to
have the entire intermediate image lines at once, just one line is
enough.
| Memory consumption (output dimensions) |
|:--------------------------------------:|
| Current | This patch |
|:------------------:|:-----------------:|
| Width * Heigth | 2 * Width |
Beyond memory, we also have a minor performance benefit from all
these changes. Results running the IGT[1] test
`igt@kms_cursor_crc@pipe-a-cursor-512x512-onscreen` ten times:
| Frametime |
|:------------------------------------------:|
| Implementation | Current | This commit |
|:---------------:|:---------:|:------------:|
| frametime range | 9~22 ms | 5~17 ms |
| Average | 11.4 ms | 7.8 ms |
[1] IGT commit id: bc3f6833a12221a46659535dac06ebb312490eb4
V2: Improves the performance drastically, by performing the operations
per-line and not per-pixel(Pekka Paalanen).
Minor improvements(Pekka Paalanen).
V3: Changes the code to blend the planes all at once. This improves
performance, memory consumption, and removes much of the weirdness
of the V2(Pekka Paalanen and me).
Minor improvements(Pekka Paalanen and me).
V4: Rebase the code and adapt it to the new NUM_OVERLAY_PLANES constant.
V5: Minor checkpatch fixes and the removal of TO-DO item(Melissa Wen).
Several security/robustness improvents(Pekka Paalanen).
Removes check_planes_x_bounds function and allows partial
partly off-screen(Pekka Paalanen).
V6: Fix a mismatch of some variable sizes (Pekka Paalanen).
Several minor improvements (Pekka Paalanen).
Reviewed-by: Melissa Wen <mwen@igalia.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Igor Torrente <igormtorrente@gmail.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220905190811.25024-7-igormtorrente@gmail.com
|