On some implementations eg. AMD RDNA2 the driver can generate a
more optimal code path knowing whether outputs are indexed using the
local invocation index or not.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16736>
We accidentally compared the stencil layout to the color/depth layout.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16663>
In wsi_destroy_image(), if buffer_blit_queue then
do not call extra free. This will fix assert in
debug release and accessing out of allocated memory.
Fixes:
7bd5aa111c
("vulkan/wsi: add a private transfer pool to exec the DRI_PRIME blit")
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset's avatarSamuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16649>
1. Lowers NV_mesh_shader TASK_COUNT output to launch_mesh_workgroups.
2. Removes all code after launch_mesh_workgroups, enforcing the
fact that it's a terminating instruction.
3. Ensures that task shaders always have at least one
launch_mesh_workgroups instruction, so the backend doesn't
need to implement a special case when the shader doesn't have it.
4. Optionally, implements task_payload using shared memory when
task_payload atomics are used.
This is useful when the backend is otherwise not capable of
handling the same atomic features as it can for shared memory.
If this is used, the backend only has to implement the basic
load/store operations for task_payload.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16720>
The new intrinsic launches mesh shader workgroups
from a task shader, with explicit task_payload.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16720>
We're getting a source stride of 0 here sometimes, which I believe means
to just use the natural stride, which is what we wanted anyway. No need
to fall back to a temporary image in that case.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16631>
The S3TC compressor code doesn't support this, but our lack of checking
was being papered over by the stride checks being overly picky. This
is needed to prevent regressions in the next commit.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16631>
The compressor can handle 3 or 4-component sources, so allow a GL_RGBA
source and just pass that along with the correct number of components.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16631>
VK_KHR_format_feature_flags2 is mostly about define a new 64-bit
VkFormatFeatureFlagBits2KHR format feature flag type, as 29 bits of
the 32-bit VkFormatFeatureFlagBits are already in use.
So all the bits from VkFormatFeatureFlagBits are being replicated, and
most of the work here consist on switch to the new flags.
From the new (not replicated from VkFormatFeatureFlagBits) flag bits,
we don't support
VK_FORMAT_FEATURE_2_STORAGE_READ_WITHOUT_FORMAT_BIT_KHR or
VK_FORMAT_FEATURE_2_STORAGE_WRITE_WITHOUT_FORMAT_BIT_KHR, as right now
we require the format on the shader for doing the read and stores.
We use now VK_FORMAT_FEATURE_2_SAMPLED_IMAGE_DEPTH_COMPARISON_BIT_KHR,
but only applying it for depth formats.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16718>
On GPUs that support AFBC with tiled headers, try to use tiled headers instead
of linear headers. This should be a bit more efficient for the caches.
Additionally, on Mali, tiled headers are tied to solid colour blocks, so this
has the effect of enabling AFBC with solid colour blocks where supported.
Unfortunately, results are disappointing. Mali-G52:
-btexture from 856fps to 859fps
-bdesktop from 292fps to 294fps
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>
Tiled headers and bounds checking were introduced with v7. The flags don't exist
on v6. Fix the XML accordingly so we don't accidentally use features too new for
the hardware.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>
These check the alignments are correct. Of course, ideally these cases aren't
hit in practice, since it's a waste of memory.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>
The header size is the header stride times the number of rows in the header
(number of tiles of superblocks). We already calculate the header stride, so
eliminate the separate header size calculation.
Delete the old header size calculation. It has no notion of wide blocks, let
alone tiled AFBC headers.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>
Extract a helper for calculating AFBC strides. This is used in two places in
pan_layout. It will need extension for tiled AFBC, and the extended version
could benefit from unit testing.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>
Let's keep all the AFBC computations inside the layout code, to keep pan_cs
dumb. This helper will need some extension for tiled AFBC.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>
Due to the order of binding shaders, GALLIUM_HUD triggered a NULL pointer
dereference in the new shader variants code.
Fixes: 0fcddd4d2c ("pan/bi: Rework varying linking on Valhall")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16707>
When we do SVC temporal encoding, we see output bitsream is not proper. To fix
this , incase of SVC passing session init varaible display_remote as enable.
Signed-off-by: SureshGuttula <suresh.guttula@amd.com>
Reviewed-by: Thong Thai <thong.thai@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16468>