If the first time the preamble is written, one of the rings
isn't allocated, we wouldn't write the RING_SIZE to the preamble.
Later, when the preamble gets updated after the ring allocation,
the new RING_SIZE packet would overwrite other packets.
To prevent this, always write the RING_SIZE (the alternative would
be to write NOP packets).
This fixes the "*ERROR* Illegal register access in command stream" hangs
I observed on GFX8.
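The layout problem can be sketched with a toy model. This is not the radeonsi code: the packet encodings, struct preamble and build_preamble() are hypothetical names used only to show why writing RING_SIZE unconditionally keeps the other packets at the same offsets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet encodings, for illustration only. */
#define PKT_RING_SIZE 0xA0000000u
#define PKT_OTHER     0xB0000000u
#define MAX_DW        8

struct preamble {
   uint32_t dw[MAX_DW];
   size_t num_dw;
};

static void emit(struct preamble *p, uint32_t value)
{
   assert(p->num_dw < MAX_DW);
   p->dw[p->num_dw++] = value;
}

/* Build the preamble. ring_size == 0 models "ring not allocated yet".
 * always_write_ring_size selects the behavior described above. */
static void build_preamble(struct preamble *p, uint32_t ring_size,
                           bool always_write_ring_size)
{
   p->num_dw = 0;
   if (ring_size || always_write_ring_size)
      emit(p, PKT_RING_SIZE | ring_size);
   emit(p, PKT_OTHER | 1);
   emit(p, PKT_OTHER | 2);
}
```

With always_write_ring_size set, the preamble has the same length and the same packet offsets before and after the ring is allocated, so a later update only changes the RING_SIZE value. Without it, the first slot holds a different packet once the ring exists.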
Fixes: 32c7805ccc ("radeonsi: merge all preamble states into one")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16962>
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16604>
The preamble will be skipped by the kernel if there is no context switch.
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16509>
Tess registers are appended. GS registers are appended or overwritten
if they are already set. There are separate TMZ and non-TMZ preambles.
The preamble will be passed to the kernel as an IB to execute on a context
switch only.
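The "appended or overwritten if they are already set" rule can be sketched as a set-or-append on a small register list. struct reg_list and reg_list_set() are hypothetical; how radeonsi actually tracks registers already present in the preamble is not described here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_REGS 16

/* Hypothetical register write: an offset and the value to program. */
struct reg_write {
   uint32_t offset;
   uint32_t value;
};

struct reg_list {
   struct reg_write regs[MAX_REGS];
   unsigned num;
};

/* Overwrite the value if the register is already in the list,
 * otherwise append it. Returns false if the list is full. */
static bool reg_list_set(struct reg_list *list, uint32_t offset, uint32_t value)
{
   assert(list);
   for (unsigned i = 0; i < list->num; i++) {
      if (list->regs[i].offset == offset) {
         list->regs[i].value = value;
         return true;
      }
   }
   if (list->num == MAX_REGS)
      return false;
   list->regs[list->num].offset = offset;
   list->regs[list->num].value = value;
   list->num++;
   return true;
}
```

Setting the same offset twice leaves one entry holding the latest value, so the list does not grow when a register is updated.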
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16509>
We can't use UINT16_ABGR for the alpha channel. Always use 32_ABGR.
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16509>
Instead of storing them in a linked list, put them in an array
in si_shader_selector. The keys are also stored separately, to
avoid pointer chasing when searching for a variant in si_shader_select_with_key.
The main point here is to simplify the code by storing everything
in the selector instead of splitting the list storage between the selector
and the shaders; this shouldn't affect performance in a meaningful way.
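A minimal sketch of the layout, assuming simplified stand-in types. struct variant_key, struct variant, struct selector, selector_find() and selector_add() are hypothetical and much smaller than the real si_shader_selector; the point is only that the lookup scans a contiguous key array and touches a variant pointer on a match.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the real key and shader types. */
struct variant_key { uint32_t bits[4]; };
struct variant { int id; };

struct selector {
   struct variant_key *keys;  /* contiguous keys, scanned during lookup */
   struct variant **variants; /* variants[i] was compiled for keys[i] */
   unsigned num_variants;
   unsigned max_variants;
};

/* Linear search over the key array; no variant is dereferenced. */
static struct variant *selector_find(const struct selector *sel,
                                     const struct variant_key *key)
{
   for (unsigned i = 0; i < sel->num_variants; i++) {
      if (memcmp(&sel->keys[i], key, sizeof(*key)) == 0)
         return sel->variants[i];
   }
   return NULL;
}

/* Append a key/variant pair, growing both arrays together.
 * Returns 0 on success, -1 if allocation fails. */
static int selector_add(struct selector *sel, const struct variant_key *key,
                        struct variant *v)
{
   assert(sel && key && v);
   if (sel->num_variants == sel->max_variants) {
      unsigned new_max = sel->max_variants ? sel->max_variants * 2 : 4;
      struct variant_key *keys = realloc(sel->keys, new_max * sizeof(*keys));
      if (!keys)
         return -1;
      sel->keys = keys;
      struct variant **variants =
         realloc(sel->variants, new_max * sizeof(*variants));
      if (!variants)
         return -1;
      sel->variants = variants;
      sel->max_variants = new_max;
   }
   sel->keys[sel->num_variants] = *key;
   sel->variants[sel->num_variants] = v;
   sel->num_variants++;
   return 0;
}
```

The memcmp comparison assumes the key struct has no padding bytes, which holds for this stand-in but is an assumption for any real key type.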
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16273>
This uses the common helper code to implement the tess ring sizing.
One open question is whether radeonsi should be using tess_offchip_ring_offset
in some of the places where it's using tess_factor_ring_size.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16415>
If dual source blending is enabled, use export targets 21 and 22.
We also have to swap odd/even lanes between export targets 21 and 22.
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16328>
gfx11 passes scratch base address using
SPI_GFX_SCRATCH_BASE_LO and _HI registers. Make the
code changes to support the same.
v5: remove type cast from 64bit to 32bit (Marek Olšák)
v4: combine scratch_memory and scratch_state atom (Marek Olšák)
v3: skip shader relocs for gfx11
v2: make atom for scratch_memory (Indrajit)
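As a rough illustration of feeding a 64-bit base into a LO/HI register pair, the sketch below splits a value into two 32-bit halves. struct scratch_base_regs and split_scratch_base() are hypothetical, and any shift or alignment the hardware expects before the split is an open assumption that is deliberately not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two register values; not a Mesa type. */
struct scratch_base_regs {
   uint32_t lo; /* value intended for SPI_GFX_SCRATCH_BASE_LO */
   uint32_t hi; /* value intended for SPI_GFX_SCRATCH_BASE_HI */
};

/* Split a 64-bit value into low and high 32-bit halves.
 * Assumption: the caller has already applied whatever encoding
 * (shift, alignment) the hardware requires. */
static struct scratch_base_regs split_scratch_base(uint64_t base)
{
   struct scratch_base_regs regs;
   regs.lo = base & 0xffffffffu; /* low 32 bits */
   regs.hi = base >> 32;         /* high 32 bits */
   return regs;
}
```

Keeping the value 64-bit until the final split avoids losing the upper bits of the address early.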
Signed-off-by: Yogesh mohan marimuthu <yogesh.mohanmarimuthu@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16328>
The addition of the "compute" parameter is for a future change.
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15966>
This will help me see all places where we use "info", which will
be moved from si_shader_selector to shader variants.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14414>
This stops using pipe_stream_output_info from create_*s_state context functions
because NIR contains everything and can do more advanced shader linking
this way.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14414>
This may be needed by ACO, but it doesn't do anything for LLVM yet other
than making the initial LLVM IR smaller.
It will be needed by a future commit, which rewrites ac_optimize_vs_outputs
in NIR, which relies on NIR matching the shader key.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14414>
The code was compiling monolithic PS if a shader output didn't
have a color buffer, but dual src blending never has a color buffer
for mrt1.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15965>
It was mistakenly added to indicate it's for a User-Mode Driver,
but all defined registers in Mesa are.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
To avoid dragging gl.h into places it has no business being,
define the tessellation primitive mode as an enum.
This has a lot of fallout all over the place.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>
because si_get_nir_shader runs NIR passes and some of them can introduce
new loads.
Fixes: 3fb77ef2e0 ("radeonsi: do opt_large_constants & lower_indirect_derefs after uniform inlining")
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14528>
This will allow further optimizations for shader variants that change
GS outputs (affecting the copy shader), and this is mainly about sharing
optimizations with NGG instead of having a totally separate codepath for
legacy GS.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14266>
The samplemask VGPR that we had to pass to the epilog increased VGPR usage
by 1 for all shaders. Do it in the main function by using the mono key
structure, which causes on-demand compilation and a stall, but we'll save
the VGPR.
57794 shaders in 35145 tests
Totals:
SGPRS: 2715856 -> 2716272 (0.02 %)
VGPRS: 1776168 -> 1718432 (-3.25 %)
Spilled SGPRs: 3704 -> 3630 (-2.00 %)
Spilled VGPRs: 1727 -> 1733 (0.35 %)
Private memory VGPRs: 256 -> 256 (0.00 %)
Scratch size: 2008 -> 2016 (0.40 %) dwords per thread
Code Size: 61429584 -> 61393288 (-0.06 %) bytes
Max Waves: 838645 -> 840484 (0.22 %)
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14266>