These tests are flaky due to missing barriers, exposed by 211db6d333
("radv: Fix redundant subpass barriers due to erroneous comparison").
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16762>
These tests are flaky and should not be treated as expected-fail.
This also removes the duplicates from the fail list which was breaking CTS
runner.
Fixes: cd14431b8c ("radv/ci: skip dEQP-VK.fragment_operations.transient_attachment_bit")
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16762>
Sometimes you can end up with tex instructions that have sampler deref srcs, even though
they don't need them, e.g. a txs. In this case, still fix up those derefs in the sampler
splitting pass rather than leaving them pointing to a typed sampler.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16639>
There is also a hypothetical scenario where
transformData is 0 and transformOffset is not 0
and we end up reading from transformOffset because
transform_addr is not 0.
VkAccelerationStructureBuildRangeInfoKHR spec:
If VkAccelerationStructureGeometryTrianglesDataKHR::transformData is not NULL, a single VkTransformMatrixKHR structure is consumed from VkAccelerationStructureGeometryTrianglesDataKHR::transformData, at an offset of transformOffset. This matrix describes a transformation from the space in which the vertices for all triangles in this geometry are described to the space in which the acceleration structure is defined.
Which I think means, that we should ignore
transformOffset if transformData is NULL.
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16719>
VkAccelerationStructureBuildRangeInfoKHR spec:
If the geometry uses indices, primitiveCount × 3 indices are consumed from VkAccelerationStructureGeometryTrianglesDataKHR::indexData, starting at an offset of primitiveOffset. The value of firstVertex is added to the index values before fetching vertices.
If the geometry does not use indices, primitiveCount × 3 vertices are consumed from VkAccelerationStructureGeometryTrianglesDataKHR::vertexData, starting at an offset of primitiveOffset + VkAccelerationStructureGeometryTrianglesDataKHR::vertexStride × firstVertex.
Meaning: We always add firstVertex * vertexStride
to the vertex address and add primitiveOffset
either to the vertex address or the index address,
depending on wether indices are used.
Also add missing handling with instances.
Fixes: 0dad88b ("radv: Implement device-side BVH building.")
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16719>
Currently dpb_enc referneces not updated properly when index 0, as
we are skipping clearing that ref.
This patch will fix this for index 0. So that when ever we set
non_referenced flag, that is not used as ref and not pushed to DPB.
This is helping in SVC encoding.
Signed-off-by: SureshGuttula <suresh.guttula@amd.com>
Reviewed-by: Thong Thai <thong.thai@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16745>
Those shaders are just like the blorp ones.
v2: Use a single internal cache for blorp/RT (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7f1e82306c ("anv: Switch to the new common pipeline cache")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16741>
Fix defect reported by Coverity Scan.
Uninitialized scalar field (UNINIT_CTOR)
uninit_member: Non-static class member m_uiOffset is not initialized in
this constructor nor in any functions that it calls.
Fixes: b171a6baa2 ("d3d12: Add video encode implementation of pipe_video_codec")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16682>
For cases with lots of very small primitives, this may improve
performance because we're not executing those dead channels all the
time.
Shader-db reports no instruction or cycle-count changes. However, by
hacking up the driver to report when this optimization triggers, it
appears to affect about 10% of shader-db.
v2 (Kenneth Graunke): Always enable VMask prior to XeHP for now,
because using VMask on those platforms allows us to perform the
eliminate_find_live_channel() optimization. However, XeHP doesn't
seem to have packed fragment shader dispatch, so we lose that
optimization regardless, and there's no reason not to avoid vmask.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1054>
As described in src/util/detect_os.h, the DETECT_OS macros are always
defined to a 0 or 1 value, and they should be used with #if rather
than #ifdef.
Commit 54b7227f15 accidentally disabled those extensions on all
platforms, so enable them again.
Fixes: 54b7227f15 ("egl/wgl: On win32, there is no support for EGL_EXT_device and EGL_EXT_platform_device")
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16751>
It turns out that we need a fragment shader for streamout. Whh? From
Lionel's reading of simulator sources, it seems the streamout unit is
looking at enabled next stages. It'll generate output to the clipper in
the following cases :
- 3DSTATE_STREAMOUT::ForceRendering = ON
- PS enabled
- Stencil test enabled
- depth test enabled
- depth write enabled
- some other depth/hiz clear condition
Forcing rendering without a PS seems like a recipe for hangs so it's
probably better to just enable the PS in this case.
Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16506>
Actually compile and cache the no-op fragment shader but remove it from
the pipeline if we determine it's a no-op. This way we always have it
even if it's not strictly needed.
Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16506>
Starting with Ivy Bridge, we implement alpha-to-coverage by writting
gl_SampleMask with a pattern based on alpha. This will show up in
wm_prog_data::uses_omask so we don't need to look at the key.
Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16506>
Add unit tests ensuring the optimization applies in all the cases we care about,
as functional integration tests (CTS and Piglit) won't test this. Also add unit
tests for a few cases where we specifically cannot fuse, in case these cases are
missed by the tests.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16725>
On some implementations eg. AMD RDNA2 the driver can generate a
more optimal code path knowing whether outputs are indexed using the
local invocation index or not.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16736>
We accidentally compared the stencil layout to the color/depth layout.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16663>
In wsi_destroy_image(), if buffer_blit_queue then
do not call extra free. This will fix assert in
debug release and accessing out of allocated memory.
Fixes:
7bd5aa111c
("vulkan/wsi: add a private transfer pool to exec the DRI_PRIME blit")
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset's avatarSamuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16649>
1. Lowers NV_mesh_shader TASK_COUNT output to launch_mesh_workgroups.
2. Removes all code after launch_mesh_workgroups, enforcing the
fact that it's a terminating instruction.
3. Ensures that task shaders always have at least one
launch_mesh_workgroups instruction, so the backend doesn't
need to implement a special case when the shader doesn't have it.
4. Optionally, implements task_payload using shared memory when
task_payload atomics are used.
This is useful when the backend is otherwise not capable of
handling the same atomic features as it can for shared memory.
If this is used, the backend only has to implement the basic
load/store operations for task_payload.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16720>
The new intrinsic launches mesh shader workgroups
from a task shader, with explicit task_payload.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16720>
We're getting a source stride of 0 here sometimes, which I believe means
to just use the natural stride, which is what we wanted anyway. No need
to fall back to a temporary image in that case.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16631>
The S3TC compressor code doesn't support this, but our lack of checking
was being papered over by the stride checks being overly picky. This
is needed to prevent regressions in the next commit.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16631>
The compressor can handle 3 or 4-component sources, so allow a GL_RGBA
source and just pass that along with the correct number of components.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16631>