Commit Graph

442 Commits

Author SHA1 Message Date
Pierre-Eric Pelloux-Prayer 9685b5785b radeonsi: add SI_PROFILE_CLAMP_DIV_BY_ZERO
To enable divide by zero clamping per shader, instead of per app.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14931>
2022-02-16 11:13:25 +00:00
Marek Olšák afdfcdd542 radeonsi: determine MEM_ORDERED after generating a shader variant
because si_get_nir_shader runs NIR passes and some of them can introduce
new loads.

Fixes: 3fb77ef2e0 - radeonsi: do opt_large_constants & lower_indirect_derefs after uniform inlining

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14528>
2022-01-18 11:11:08 +00:00
Marek Olšák 08cc73a218 radeonsi: rename uses_vmem_* flags
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14528>
2022-01-18 11:11:08 +00:00
Marek Olšák d4a1766a5a radeonsi: move the GS copy shader into shader variants
This will allow further optimizations for shader variants that change
GS outputs (affecting the copy shader), and this is mainly about sharing
optimizations with NGG instead of having a totally separate codepath for
legacy GS.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14266>
2022-01-05 12:46:31 +00:00
Marek Olšák 8ed9d38e73 radeonsi: move si_nir_scan_shader into si_shader_info.c
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14266>
2022-01-05 12:46:31 +00:00
Marek Olšák 198ad7e4dc radeonsi: move smoothing to the main shader part to remove 1 live VGPR
The samplemask VGPR that we had to pass to the epilog increased VGPR usage
by 1 for all shaders. Do it in the main function by using the mono key
structure, which causes on-demand compilation and stall, but we'll save
the VGPR.

57794 shaders in 35145 tests
Totals:
SGPRS: 2715856 -> 2716272 (0.02 %)
VGPRS: 1776168 -> 1718432 (-3.25 %)
Spilled SGPRs: 3704 -> 3630 (-2.00 %)
Spilled VGPRs: 1727 -> 1733 (0.35 %)
Private memory VGPRs: 256 -> 256 (0.00 %)
Scratch size: 2008 -> 2016 (0.40 %) dwords per thread
Code Size: 61429584 -> 61393288 (-0.06 %) bytes
Max Waves: 838645 -> 840484 (0.22 %)

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14266>
2022-01-05 12:46:31 +00:00
Marek Olšák 12b942bd16 radeonsi: pass sample_coverage VGPR index to the PS prolog instead of guessing
The code was correct, but little confusing. This is cleaner.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14266>
2022-01-05 12:46:30 +00:00
Marek Olšák 3283df1425 radeonsi: remove unused si_shader::prolog2
This became unused when the GS prolog was removed.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14266>
2022-01-05 12:46:30 +00:00
Marek Olšák af9ec3c45d radeonsi: add shader profiles that disable binning
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13966>
2021-12-11 20:07:35 +00:00
Marek Olšák b3b2f97f2e radeonsi: add Wave32 heuristics and shader profiles
This generally works well.

There are new cases that select Wave32, and there are shader profiles
which adjust that.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13966>
2021-12-11 20:07:35 +00:00
Marek Olšák cd86f1dc2b radeonsi: rename si_get_shader_wave_size and make it non-inline
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13878>
2021-11-26 11:35:05 +00:00
Marek Olšák 1ef027851d radeonsi: propagate si_shader::wave_size to VGT_SHADER_STAGES
instead of hardcoding them

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13878>
2021-11-26 11:35:05 +00:00
Marek Olšák bc57488936 radeonsi: add si_shader::wave_size because it will vary
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13878>
2021-11-26 11:35:05 +00:00
Marek Olšák 41523773f5 radeonsi: add wave32 flag into prolog/epilog keys
It will vary between shaders.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13878>
2021-11-26 11:35:05 +00:00
Marek Olšák ba6d389fa7 radeonsi: don't use GS SGPR6 for the small prim cull info
use a user SGPR instead. This will be needed in the future.

Also don't upload small_prim_precision because it's passed via
VS_STATE_BITS.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13811>
2021-11-16 19:41:07 +00:00
Marek Olšák 513bd6acca radeonsi: cull against clip planes, clipvertex, clip/cull distances in shader
The downside is that this duplicates shader code for clip/cull distances
in both the position and parameter portions of the shader.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13811>
2021-11-16 19:41:07 +00:00
Marek Olšák 881c459191 radeonsi: unify how ngg_cull_flags are set
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13811>
2021-11-16 19:41:07 +00:00
Marek Olšák 5a5263d65d radeonsi: unify GFX9_VSGS_NUM_USER_SGPR and GFX9_TESGS_NUM_USER_SGPR
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
2021-11-16 02:11:46 +00:00
Marek Olšák 9151ac3531 ac,radeonsi: cull small lines in the shader using the diamond exit rule
It also splits clip_half_line_width into X and Y components for tighter
view culling.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13700>
2021-11-16 02:11:46 +00:00
Marek Olšák 8cf802e8ef radeonsi: replace the GS prolog with a monolithic shader variant
It only exists because of the hw bug and is used very rarely.
Let's simplify it.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13393>
2021-10-18 18:08:59 +00:00
Marek Olšák 8c5a32b5fe radeonsi: split si_shader_key into ps and ge parts to minimize memcmp overhead
ps is for the pixel shader, while ge is for VS, TCS, TES, and GS.

si_shader_key: 68 bytes
si_shader_key_ge: 68 bytes
si_shader_key_ps: 28 bytes

The only notable change is that si_shader_select_with_key is changed
to a C++ template. Other changes are trivial.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13285>
2021-10-16 10:41:51 +00:00
Marek Olšák 385c9e1caf radeonsi: si_state_shaders.c -> cpp
We'll add some templates here.

Why is `extern "C"` not needed for exported functions?

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13285>
2021-10-16 10:41:51 +00:00
Arvind Yadav 8f9945a75b radeonsi: remove the use of PKT3_CONTEXT_REG_RMW
This patch is to to remove PKT3_CONTEXT_REG_RMW from radeonsi.
and avoid multiple command buffer(PM4 packet)creation for R_02881C_PA_CL_VS_OUT_CNTL.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12789>
2021-10-13 10:28:14 +00:00
Marek Olšák 844f66bf38 radeonsi: remove GS fast launch
It regresses the first snx test because it adds CPU overhead, and there is
no way to work around it. The average effect on viewperf is 0, meaning that
a few cases improve, while a few others regress.

Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13279>
2021-10-11 07:13:48 +00:00
Marek Olšák f00d3e2909 radeonsi: implement shader-based culling for lines
This helps some viewperf subtests.
Only view XY culling is done. Edgeflags are always disabled with lines.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13048>
2021-09-28 17:30:06 +00:00
Marek Olšák 0030bdf9a6 radeonsi: add gfx10 helpers for determining whether edgeflags are enabled
They will return false when culling lines.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13048>
2021-09-28 17:30:06 +00:00
Marek Olšák edb5fa4d59 radeonsi: eliminate redundant SPI_SHADER_PGM_RSRC3/4_GS register writes
They don't change much.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12343>
2021-09-14 15:24:11 +00:00
Marek Olšák 3df035d08c radeonsi: put si_pm4_state at the beginning of si_shader
instead of allocating it separately. This removes pointer indirections.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12343>
2021-09-14 15:24:11 +00:00
Marek Olšák 5824ab569e radeonsi: precompute more spi_map code
This replaces vs_output_param_offset by vs_output_ps_input_cntl,
which is easier to use.

For geometry shaders, vs_output_ps_input_cntl is stored in the GS si_shader
structure, not gs_copy_shader. This requires that gs_copy_shader compilation
is finished before the GS main shader part, so that GS can initialize
vs_output_ps_input_cntl using the compiled GS copy shader.

output_semantic_to_slot becomes unused, so it's removed.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12343>
2021-09-14 15:24:11 +00:00
Marek Olšák 57f9452b46 radeonsi: precompute num_interp for si_emit_spi_map
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12343>
2021-09-14 15:24:11 +00:00
Marek Olšák 46802f7b60 radeonsi: interleave si_shader_info::input_* in memory for faster emit_spi_map
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12343>
2021-09-14 15:24:11 +00:00
Marek Olšák 7a20110ad3 radeonsi: precompute si_vgt_stages_key for NGG in si_shader
to remove this overhead from si_update_shaders

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12343>
2021-09-14 15:24:11 +00:00
Marek Olšák 03b5a94258 radeonsi: add const to the key parameter in si_shader_select_with_key
The keys will match the current state, so we shouldn't change them.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12343>
2021-09-14 15:24:11 +00:00
Marek Olšák eddb65ffb0 radeonsi: don't use NGG passthrough if culling is possible for better perf
Switching NGG passthrough on/off decreases performance because it causes
context rolls.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12812>
2021-09-10 23:32:03 +00:00
Marek Olšák 1f8be99621 radeonsi: enable shader-based prim culling with polygon mode
Polygon mode should have no effect on culling, so keep it enabled.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12812>
2021-09-10 23:32:03 +00:00
Marek Olšák 576f8394db radeonsi: remove the primitive discard compute shader
It doesn't always work, it's only useful on gfx9 and older, and it's too
complicated.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4011

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12812>
2021-09-10 23:32:03 +00:00
Emma Anholt 17332ceb0f mesa/st: Add an optional GLSL link fail msg to finalize_nir.
GLES2 drivers are allowed to reject some GLSL constructs, like dynamic
loop bounds (which neither i915g nor vc4 can fully support), but gallium
hasn't had any way to trigger a link failure.  Add a return msg to the
finalize_nir hook, which is called at the end of GLSL linking, and use
that.  This means that some other callers of finalize need to do something
with the msg, and we (for now) just throw it away.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12218>
2021-09-06 18:09:25 +00:00
Marek Olšák c005b2cd4b radeonsi: move as_ls/es/ngg setting out of si_shader_selector_key
Do it when we bind shaders.

The advantages are:
- no need to memset the fields when any shader variant state is changed
  (e.g. culling on/off)
- no need to recompute the fields every time that happens

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12656>
2021-09-01 00:42:57 +00:00
Marek Olšák 08310f85ae radeonsi: remove instancing support from the prim discard compute shader
It's not important for workstation apps on Vega.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12656>
2021-09-01 00:42:57 +00:00
Ian Romanick 5f2dbd45f2 gallium: Remove "optimize" parameter from pipe_screen::finalize_nir
As part of adding support for inline uniforms in Iris, I was going to
add a finalize_nir hook.  I went looking to see how other drivers use
the "optimize" parameter, and I discovered that *nobody* uses it at all.

v2: Fix typo in commit message.  Noticed by Mike.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12317>
2021-08-13 15:45:29 -07:00
Pierre-Eric Pelloux-Prayer 9fe8ae3fcd radeonsi: don't create an infinite number of variants
If a shader has code like this:

   uniform float timestamp;
   ...
   if (timestamp > 0.0)
      do_something()

And timestamp is modified each frame, we'll end up generating a new
variant per frame.

This commit introduces a hard limit on the number of variants we generate
for a single shader.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5121
Fixes: b7501184b9 ("radeonsi: implement inlinable uniforms")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12138>
2021-08-09 10:26:54 +00:00
Marek Olšák 06da711350 radeonsi: remove the GDS variants of compute-based primitive discard
The GDS ordered append variant is unstable due to kernel and firmware bugs.
The unordered GDS variant isn't faster than the memory-based variant.

Only the memory-based variant is kept.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11510>
2021-06-28 13:23:14 +00:00
Marek Olšák fc95ba6c86 radeonsi: remove the Z culling option from the primitive discard CS
Not useful.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11102>
2021-06-21 19:03:29 +00:00
Marek Olšák 1e9cc86511 radeonsi: merge 2 conditional blocks with same condition into 1 in culling code
The block only loads input VGPRs from LDS, and the next block uses them.
The entering condition is the same, even though the second block is
the next shader part beginning with the prolog.

Simply move the VGPR loads into the prolog.

This decreases the shader code size by 12 bytes.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11102>
2021-06-21 19:03:29 +00:00
Pierre-Eric Pelloux-Prayer b78a38bd02 radeonsi: use si_nir_is_output_const_if_tex_is_const
When a blending mode producing "color = src * dst" is used and we
can determine that dst is 1, then the draw call can dropped completely.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10979>
2021-06-15 11:18:02 +02:00
Marek Olšák c53f25b668 radeonsi: kill 16-bit VS outputs if PS doesn't use them or doing Z-only draw
The kill_outputs logic uses our internal IO indices. Just add indices for
16-bit varyings. We don't have enough free indices to use, but we can reuse
the indices that GLES doesn't have. Those are all the legacy desktop GL
varyings.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9051>
2021-04-13 21:10:43 -04:00
Marek Olšák 7db43960f6 radeonsi: implement 16-bit VS->PS varyings
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9051>
2021-04-13 21:10:43 -04:00
Pierre-Eric Pelloux-Prayer a27ea38d2a radeonsi/sqtt: keep a copy of the uploaded shader code
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>
2021-03-05 13:10:11 +00:00
Marek Olšák e9e385b084 radeonsi: gather shader info about VMEM usage for MEM_ORDERED
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9028>
2021-02-17 04:49:24 -05:00
Marek Olšák 27e22f025c radeonsi: gather shader info about indirect UBO/SSBO/samplers/images
A future commit will use it.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9028>
2021-02-17 04:49:24 -05:00