LD_VAR_BUF instructions on Valhall take a source format, indicating the
in-memory format of the varying independent from the register format, which we
still model within the compiler for compatibility with Bifrost. (Prior to
Valhall, source format is specified in the attribute descriptor as a physical
pixel format.)
Model this information, allowing us to generate fp16 LD_VAR_BUF instructions
correctly on Valhall.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16752>
Fixes a vector dimension validation failure in
dEQP-GLES3.functional.shaders.indexing.varying_array.vec4_static_write_dynamic_read
after we enable fp16 varyings.
No shader-db changes, as we don't yet support fp16 varyings.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16752>
As fusing VAR_TEX is an optimization, it's helpful to have unit tests since
functional tests won't check that the optimization triggers when expected.
Originally written when I was touching the VAR_TEX code. Those changes have
since been dropped by the unit test remains useful.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16752>
Use vk_buffer as a base for radv_buffer and
replace manual handling of VK_WHOLE_SIZE with
vk_buffer_range.
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16764>
VkBufferCreateFlags is correct.
Fixes: f6ae21b ("vulkan: Add a base struct for buffers")
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16764>
The register precolouring logic assumes that coverage masks are always in R60,
so spilling them causes incorrect results. We could do better. Fixes on Valhall:
dEQP-GLES3.functional.ubo.random.all_per_block_buffers.28
Fixes: 3df5446cbd ("pan/bi: Simplify register precolouring in the IR")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16748>
On Valhall, the driver should set this flag if the hardware may rotate
primitives. This happens if:
1. The rasterization of lines does not matter, AND
2. The provoking vertex does not matter.
The first condition we may satisfy by checking for LINES and the second by
checking for flat shading. Otherwise, we should set this flag to allow
optimizations. This may be more efficient for tiling.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16748>
These basically correspond to the alpha_zero_nop and alpha_one_store flags we
already compute and set. Except those flags don't exist on Valhall, so these
need to be used instead (on Bifrost, in addition .. unclear why the duplication
on Bifrost).
Set these flags when we can. Ostensibly this is for performance (neglible
improvement on glmark2 score), but mostly I want to get us using the hardware
optimally.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16748>
This will allow to translate the function and factors earlier.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16672>
These tests are flaky due to missing barriers, exposed by 211db6d333
("radv: Fix redundant subpass barriers due to erroneous comparison").
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16762>
These tests are flaky and should not be treated as expected-fail.
This also removes the duplicates from the fail list which was breaking CTS
runner.
Fixes: cd14431b8c ("radv/ci: skip dEQP-VK.fragment_operations.transient_attachment_bit")
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16762>
Sometimes you can end up with tex instructions that have sampler deref srcs, even though
they don't need them, e.g. a txs. In this case, still fix up those derefs in the sampler
splitting pass rather than leaving them pointing to a typed sampler.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16639>
There is also a hypothetical scenario where
transformData is 0 and transformOffset is not 0
and we end up reading from transformOffset because
transform_addr is not 0.
VkAccelerationStructureBuildRangeInfoKHR spec:
If VkAccelerationStructureGeometryTrianglesDataKHR::transformData is not NULL, a single VkTransformMatrixKHR structure is consumed from VkAccelerationStructureGeometryTrianglesDataKHR::transformData, at an offset of transformOffset. This matrix describes a transformation from the space in which the vertices for all triangles in this geometry are described to the space in which the acceleration structure is defined.
Which I think means, that we should ignore
transformOffset if transformData is NULL.
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16719>
VkAccelerationStructureBuildRangeInfoKHR spec:
If the geometry uses indices, primitiveCount × 3 indices are consumed from VkAccelerationStructureGeometryTrianglesDataKHR::indexData, starting at an offset of primitiveOffset. The value of firstVertex is added to the index values before fetching vertices.
If the geometry does not use indices, primitiveCount × 3 vertices are consumed from VkAccelerationStructureGeometryTrianglesDataKHR::vertexData, starting at an offset of primitiveOffset + VkAccelerationStructureGeometryTrianglesDataKHR::vertexStride × firstVertex.
Meaning: We always add firstVertex * vertexStride
to the vertex address and add primitiveOffset
either to the vertex address or the index address,
depending on wether indices are used.
Also add missing handling with instances.
Fixes: 0dad88b ("radv: Implement device-side BVH building.")
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16719>
Currently dpb_enc referneces not updated properly when index 0, as
we are skipping clearing that ref.
This patch will fix this for index 0. So that when ever we set
non_referenced flag, that is not used as ref and not pushed to DPB.
This is helping in SVC encoding.
Signed-off-by: SureshGuttula <suresh.guttula@amd.com>
Reviewed-by: Thong Thai <thong.thai@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16745>
Those shaders are just like the blorp ones.
v2: Use a single internal cache for blorp/RT (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7f1e82306c ("anv: Switch to the new common pipeline cache")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16741>
Fix defect reported by Coverity Scan.
Uninitialized scalar field (UNINIT_CTOR)
uninit_member: Non-static class member m_uiOffset is not initialized in
this constructor nor in any functions that it calls.
Fixes: b171a6baa2 ("d3d12: Add video encode implementation of pipe_video_codec")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16682>
For cases with lots of very small primitives, this may improve
performance because we're not executing those dead channels all the
time.
Shader-db reports no instruction or cycle-count changes. However, by
hacking up the driver to report when this optimization triggers, it
appears to affect about 10% of shader-db.
v2 (Kenneth Graunke): Always enable VMask prior to XeHP for now,
because using VMask on those platforms allows us to perform the
eliminate_find_live_channel() optimization. However, XeHP doesn't
seem to have packed fragment shader dispatch, so we lose that
optimization regardless, and there's no reason not to avoid vmask.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1054>
As described in src/util/detect_os.h, the DETECT_OS macros are always
defined to a 0 or 1 value, and they should be used with #if rather
than #ifdef.
Commit 54b7227f15 accidentally disabled those extensions on all
platforms, so enable them again.
Fixes: 54b7227f15 ("egl/wgl: On win32, there is no support for EGL_EXT_device and EGL_EXT_platform_device")
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16751>
It turns out that we need a fragment shader for streamout. Whh? From
Lionel's reading of simulator sources, it seems the streamout unit is
looking at enabled next stages. It'll generate output to the clipper in
the following cases :
- 3DSTATE_STREAMOUT::ForceRendering = ON
- PS enabled
- Stencil test enabled
- depth test enabled
- depth write enabled
- some other depth/hiz clear condition
Forcing rendering without a PS seems like a recipe for hangs so it's
probably better to just enable the PS in this case.
Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16506>
Actually compile and cache the no-op fragment shader but remove it from
the pipeline if we determine it's a no-op. This way we always have it
even if it's not strictly needed.
Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16506>
Starting with Ivy Bridge, we implement alpha-to-coverage by writting
gl_SampleMask with a pattern based on alpha. This will show up in
wm_prog_data::uses_omask so we don't need to look at the key.
Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16506>
Add unit tests ensuring the optimization applies in all the cases we care about,
as functional integration tests (CTS and Piglit) won't test this. Also add unit
tests for a few cases where we specifically cannot fuse, in case these cases are
missed by the tests.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16725>