The LLVM backend keeps track of 16-bit output variables and it will
miscompile shaders when these outputs aren't the correct bitsize.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17706>
Previously, it was in a divergent branch, therefore
it could hang the GPU when a workgroup had a primitive-only wave.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17581>
There's no dependency between them.
This can simplify the compiler backend translation by
always storing prim id before vertex export, which also
benefits the LLVM backend in latter changes.
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17581>
We are going to reuse them outside of radv_pipeline.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16531>
When culling enabled, it will use LDS space, which overlap with
the prim id export.
Fixes: e97f0463a8 ("ac/nir: Implement NGG deferred attribute culling in NIR.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17593>
These set the pass and make sure we don't have multiple submissions
at the same time touching the perf counters/pass at the same time.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16879>
Change NGG GS emit vertex code to emit combined shared stores,
also change the export vertex code to emit combined shared loads.
This results in more optimal code generation, ie. fewer LDS
instructions are generated.
GS vertices are stored using an odd stride to minimize the chance
of bank conflicts, which means that unfortunately
we still can't use an alignment higher than 4 here,
so the best we can get are some ds_read2_b32 instructions.
Fossil DB stats on Navi 21 (formerly Sienna Cichlid):
Totals from 135 (0.10% of 128653) affected shaders:
VGPRs: 6416 -> 6512 (+1.50%)
CodeSize: 529436 -> 503792 (-4.84%)
MaxWaves: 2952 -> 2924 (-0.95%)
Instrs: 93384 -> 90176 (-3.44%)
Latency: 290283 -> 293611 (+1.15%); split: -0.36%, +1.50%
InvThroughput: 81218 -> 82598 (+1.70%)
Copies: 6603 -> 6606 (+0.05%)
PreVGPRs: 5037 -> 5076 (+0.77%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11425>
radeonsi may add extra dword to the stride, so let's pass it
directly.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16788>
This is used by radeonsi to save some lds space when all LS output
is passed by register.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
Used by radeonsi and radv to reflect true wave size used, not minimal size.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
radeonsi won't emit tess factor in the lower pass, need to keep
the output for llvm backend to pass it as parameter. This is used
by radeonsi for an optimization to save LDS write.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
radeonsi load this from SGPR arg, can't use static value because TCS output
and TES input may not match (TCS output is not a key for TES) and
determined in runtime.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
NV_mesh_shader workgroups are only 1 dimensional, so it's OK to
only add firstTask to the X dimension.
However, let's keep the Y and Z dimensions so this doesn't mess up
the workgroup ID for future 3 dimensional task shader dispatches.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17023>
This allows future mesh shaders to use a 3D workgroup ID.
Also changes how the NV_mesh_shader first_task is emulated.
The new code moves the responsibility from ac_nir into radv.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17023>
amdgpu returns 12 TB of GTT on Kaveri, which resulted in 0 KB of GTT
after the conversion to uint32_t, which caused us to report 0 as the UBO
size, which disabled UBOs and downgraded the driver to OpenGL 3.0.
Fixes: aee8ee17a5 - radeonsi: change max TBO/SSBO sizes again and rework max alloc size
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6642
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16885>
Allow 1/4 of the max heap size, but maximum of 512 MB on 32-bit
architectures.
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16901>