This allows us to do API specific checks before removing variable
without filling nir_remove_dead_variables() with API specific code.
In the following patches we will use this to support the removal
of dead uniforms in GLSL.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4797>
With this, Doom Eternal should now run with ACO on GFX8+.
The generated 8/16-bit storage code is okay but the generated int8/int16
code is currently pretty bad but it works and apparently Doom Eternal
doesn't actually use it (even though it requires it).
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4707>
Fixes dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 hang with ACO
(features needed for the test are implemented in a later commit)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
When we originally wrote spirv_to_nir we didn't have a good scalar value
union to handily use so we rolled our own thing for spec constants. Now
that we have nir_const_value, we can use that and simplify a bunch of
the spec constant logic.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4675>
shuffle is implemented with shared VGPRs with ACO and Wave64.
Fixes dEQP-VK.subgroups.shuffle.framebuffer.subgroupshuffle*_vertex
with Wave64.
Fixes: c24d9522da ("radv: Enable ACO for NGG VS/TES, but disable NGG for ACO GS.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4595>
This disables shaderFloat16 on GFX8 because only GFX9+ supports
double rate packed math.
This improves consistency regarding other AMD Vulkan drivers and
it makes no sense to enable that feature without packed math.
This also reduces performance with Wolfeinstein Youngblood if
fp16 is forced enabled on GFX8, while it's similar on GFX9.
We might re-introduce that feature in the future with ACO support
if it ends up being faster and correct.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4453>
This replaces emit_vertex with:
if (vertex_count < max_vertices) {
emit_vertex_with_counter vertex_count ...
vertex_count += 1
}
Which is exactly what NIR->LLVM was doing but at NIR level. This
pass is already called by ACO.
pipeline-db changes on GFX10:
Totals from affected shaders:
SGPRS: 1952 -> 1912 (-2.05 %)
VGPRS: 2112 -> 2044 (-3.22 %)
Code Size: 189368 -> 185620 (-1.98 %) bytes
Max Waves: 494 -> 491 (-0.61 %)
No pipeline-db changes on other generations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4182>
Statistics will be added to ACO in later commits.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2965>
If compute shaders require a specific subgroup size (ie. Wave32),
we have to return the correct one.
Fixes dEQP-VK.subgroups.size_control.compute.required_subgroup_size_*.
Fixes: fb07fd4e6c ("radv: implement VK_EXT_subgroup_size_control")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>
It's more accurate.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3975>
GS is the same on GFX6, but GFX6 isn't fully supported yet.
v4: fix regclass
v7: rebase after shader args MR
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
v2: implement GFX10
v3: rebase
v7: rebase after shader args MR
v8: fix gs_vtx_offset usage on GFX9/GFX10
v8: use unreachable() instead of printing intrinsic
v8: rename output_state to ge_output_state
v8: fix formatting around nir_foreach_variable()
v8: rename some helpers in the scheduler
v8: rename p_memory_barrier_all to p_memory_barrier_common
v8: fix assertion comparing ctx.stage against vertex_geometry_gs
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
This extension allows to use LOD with image read/write operations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The Vulkan spec says that StorageImageMultisample and ImageMSArray
SPIRV-V capabilities must be enabled if the
shaderStorageImageMultisample feature is supported.
This fixes a warning with RenderDoc.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2212
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This shouldn't introduce any functional changes for RadeonSI
when NIR is enabled because these operations are already lowered.
pipeline-db (NAVI10/LLVM):
SGPRS: 9043 -> 9051 (0.09 %)
VGPRS: 7272 -> 7292 (0.28 %)
Code Size: 638892 -> 621628 (-2.70 %) bytes
LDS: 1333 -> 1331 (-0.15 %) blocks
Max Waves: 1614 -> 1608 (-0.37 %)
Found this while glancing at some F12019 shaders.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The former was always true and hence dead code. We will want to
explicitly declare the ring offset register with ACO, but we also want
to declare the scratch offset too, and we can't try to disable it since
ACO also supports spilling and the determination of whether spilling has
to happen occurs well after setting up registers. So replace
supports_spill with something that will actually be used for ACO.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
ACO sets this itself and will have to set it differently in the future to
support shaderDenormFlushToZeroFloat64.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
This option is useless and shouldn't be used at all.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
In particular, increase the cost of 64-bit integer division.
Fixes huge shaders with dEQP-VK.spirv_assembly.type.scalar.i64.mod_geom
, with ACO used for GS this creates shaders requiring a branch with
>32767 dword offset.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>