Statistics only work in non-NGG mode. If screen->use_ngg is true, we can't
know if the draw will actually use NGG or not, so this commit switch
to a shader based implementation of this counter.
To avoid modifying si_query, the shader implementation behaves like the hw
one: it uses the same buffer size and offset.
The emulation path activation in the shader is controlled by vs_state_bit[31].
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15861>
If dual source blending is enabled, use export targets 21 and 22.
Also we have to swap odd/even lanes between export target 21 and 22.
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16328>
This will help me see all places where we use "info", which will
be moved from si_shader_selector to shader variants.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14414>
This stops pipe_stream_output_info from create_*s_state context functions
because NIR contains everything and can do more advanced shader linking
this way.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14414>
because si_get_nir_shader runs NIR passes and some of them can introduce
new loads.
Fixes: 3fb77ef2e0 - radeonsi: do opt_large_constants & lower_indirect_derefs after uniform inlining
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14528>
This will allow further optimizations for shader variants that change
GS outputs (affecting the copy shader), and this is mainly about sharing
optimizations with NGG instead of having a totally separate codepath for
legacy GS.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14266>
The samplemask VGPR that we had to pass to the epilog increased VGPR usage
by 1 for all shaders. Do it in the main function by using the mono key
structure, which causes on-demand compilation and stall, but we'll save
the VGPR.
57794 shaders in 35145 tests
Totals:
SGPRS: 2715856 -> 2716272 (0.02 %)
VGPRS: 1776168 -> 1718432 (-3.25 %)
Spilled SGPRs: 3704 -> 3630 (-2.00 %)
Spilled VGPRs: 1727 -> 1733 (0.35 %)
Private memory VGPRs: 256 -> 256 (0.00 %)
Scratch size: 2008 -> 2016 (0.40 %) dwords per thread
Code Size: 61429584 -> 61393288 (-0.06 %) bytes
Max Waves: 838645 -> 840484 (0.22 %)
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14266>
This generally works well.
There are new cases that select Wave32, and there are shader profiles
which adjust that.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13966>
use a user SGPR instead. This will be needed in the future.
Also don't upload small_prim_precision because it's passed via
VS_STATE_BITS.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13811>
The downside is that this duplicates shader code for clip/cull distances
in both the position and parameter portions of the shader.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13811>
ps is for the pixel shader, while ge is for VS, TCS, TES, and GS.
si_shader_key: 68 bytes
si_shader_key_ge: 68 bytes
si_shader_key_ps: 28 bytes
The only notable change is that si_shader_select_with_key is changed
to a C++ template. Other changes are trivial.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13285>
This patch is to to remove PKT3_CONTEXT_REG_RMW from radeonsi.
and avoid multiple command buffer(PM4 packet)creation for R_02881C_PA_CL_VS_OUT_CNTL.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12789>
It regresses the first snx test because it adds CPU overhead, and there is
no way to work around it. The average effect on viewperf is 0, meaning that
a few cases improve, while a few others regress.
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13279>
This helps some viewperf subtests.
Only view XY culling is done. Edgeflags are always disabled with lines.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13048>
This replaces vs_output_param_offset by vs_output_ps_input_cntl,
which is easier to use.
For geometry shaders, vs_output_ps_input_cntl is stored in the GS si_shader
structure, not gs_copy_shader. This requires that gs_copy_shader compilation
is finished before the GS main shader part, so that GS can initialize
vs_output_ps_input_cntl using the compiled GS copy shader.
output_semantic_to_slot becomes unused, so it's removed.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12343>
GLES2 drivers are allowed to reject some GLSL constructs, like dynamic
loop bounds (which neither i915g nor vc4 can fully support), but gallium
hasn't had any way to trigger a link failure. Add a return msg to the
finalize_nir hook, which is called at the end of GLSL linking, and use
that. This means that some other callers of finalize need to do something
with the msg, and we (for now) just throw it away.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12218>
Do it when we bind shaders.
The advantages are:
- no need to memset the fields when any shader variant state is changed
(e.g. culling on/off)
- no need to recompute the fields every time that happens
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12656>