Commit Graph

266 Commits

Author SHA1 Message Date
Samuel Pitoiset 2d295ab3f3 radv: add llvm_compiler_shader() helper
To match aco_compile_shader().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
2020-03-13 10:22:13 +00:00
Samuel Pitoiset 4d991c2de4 radv: remove unnecessary LLVM includes
They are already included from src/amd/llvm.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
2020-03-13 10:22:13 +00:00
Timur Kristóf 346bd0c623 radv: Move some helper functions to the radv_shader.h header file.
Move calculate_tess_lds_size and get_tcs_num_patches to radv_shader.h
ACO will need to call these functions too.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>
2020-03-11 08:34:10 +00:00
Daniel Schürmann 61fb17e8d7 amd: join emit_kill() from radv and radeonsi in ac_nir_to_llvm
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4047>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4047>
2020-03-09 12:29:32 +00:00
Eric Anholt 13a276ed3b radv: Squelch possibly-undefined warning
The same condition is used in the def as in the use, but gcc wasn't
figuring it out.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3867>
2020-02-18 15:35:32 -08:00
Samuel Pitoiset e4752dafed radv/gfx10: implement NGG GS queries
The number of generated primitives is only counted by the hardware
if GS uses the legacy path. For NGG GS, we need to accumulate that
value in the NGG GS itself. To achieve that, we use a plain GDS
atomic operation.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380>
2020-01-29 17:40:48 +01:00
Rhys Perry 4363a1f75b amd/common,radv: move vertex_format_table to ac_shader_util.{h,c}
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:39:52 +00:00
Samuel Pitoiset fce28a7341 radv/gfx10: simplify some duplicated NGG GS code
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3382>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3382>
2020-01-15 07:45:29 +00:00
Samuel Pitoiset 1db276ba23 radv/gfx10: add support for NGG passthrough mode
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-13 08:14:45 +01:00
Samuel Pitoiset 471738e97b radv/gfx10: do not declare LDS for NGG if useless
Only needed for NGG without passthrough mode or for NGG streamout.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-13 08:14:43 +01:00
Marek Olšák d1c8aeb24f ac: unify primitive export code
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-08 16:00:38 -05:00
Marek Olšák 1c77a18cc2 ac: unify build_sendmsg_gs_alloc_req
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-08 16:00:36 -05:00
Samuel Pitoiset 13b4e9adcf ac: declare an enum for the OOB select field on GFX10
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>
2019-12-19 15:15:32 +01:00
Samuel Pitoiset d399f4f414 radv/gfx10: fix ngg_get_ordered_id
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3133>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3133>
2019-12-17 12:34:18 +00:00
Samuel Pitoiset b37c91c12e radv: handle unaligned vertex fetches on GFX6/GFX10
The Vulkan spec doesn't have any words for vertex attributes alignment.

Fixes a test failure on GFX6 and a GPU hang on GFX10 with:
dEQP-VK.spirv_assembly.instruction.spirv1p4.entrypoint.tess_con_pc_entry_point

vkpipeline-db results on GFX10:
Totals from affected shaders:
SGPRS: 463772 -> 472972 (1.98 %)
VGPRS: 343208 -> 343752 (0.16 %)
Spilled SGPRs: 323 -> 336 (4.02 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 13806200 -> 14164472 (2.60 %) bytes
Max Waves: 84021 -> 83755 (-0.32 %)

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2161
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-13 09:54:07 +00:00
Samuel Pitoiset 4cacba0c86 radv/gfx10: fix the vertex order for triangle strips emitted by a GS
My fix wasn't totally correct as pointed out by Marek.
Ported from RadeonSI.

Fixes: deafe4cc58 ("radv/gfx10: fix primitive indices orientation for NGG GS")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-04 08:28:57 +01:00
Samuel Pitoiset dac6bd29ae radv: simplify a check in radv_fixup_vertex_input_fetches()
The number of loaded channels should always be > 0 now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-04 08:04:05 +01:00
Marek Olšák f671cc4d95 ac: set swizzled bit in cache policy as a hint not to merge loads/stores
LLVM now merges loads and stores for all opcodes, so this must be set.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-25 16:48:27 -05:00
Connor Abbott e7f4cadd02 radv: Replace supports_spill with explict_scratch_args
The former was always true and hence dead code. We will want to
explicitly declare the ring offset register with ACO, but we also want
to declare the scratch offset too, and we can't try to disable it since
ACO also supports spilling and the determination of whether spilling has
to happen occurs well after setting up registers. So replace
supports_spill with something that will actually be used for ACO.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-25 14:17:51 +01:00
Connor Abbott 66c703b3e8 radv: Move argument declaration out of nir_to_llvm
Now it's executed for ACO too.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-25 14:17:51 +01:00
Connor Abbott 3b143369a5 ac/nir, radv, radeonsi: Switch to using ac_shader_args
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2019-11-25 14:17:10 +01:00
Connor Abbott 43da33c169 radv: Rename ac_arg_regfile
We'll duplicate this in a header file in the next commit, and then
remove the original enum. Just rename it temporarily so that things
keep building.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-25 14:12:46 +01:00
Samuel Pitoiset 519d9b30de radv: remove useless RADV_DEBUG=unsafemath debug option
This option is useless and shouldn't be used at all.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-15 09:07:34 +01:00
Rhys Perry de998d3eb5 radv: fix radv_nir_get_max_workgroup_size when nir=NULL
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 84a1a2578 ('compiler: pack shader_info from 160 bytes to 96 bytes')
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-11 20:44:12 +00:00
Samuel Pitoiset deafe4cc58 radv/gfx10: fix primitive indices orientation for NGG GS
The primitive indices have to be swapped to follow the drawing
order.

This fixes corruption with Overwatch when NGG GS is force enabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-07 19:21:15 +00:00
Samuel Pitoiset d3f9957de4 radv: determine shaders wavesize at pipeline level
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-06 09:20:34 +01:00
Samuel Pitoiset d1e1f7c4d5 radv: hardcode the number of waves for the GFX6 LS-HS bug
It's always 64.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-06 09:20:32 +01:00
Samuel Pitoiset 1e36a8f41d radv: declare NGG scratch for VS or TES and only on GFX10
Do not need to declare it for other stages because this is for
streamout.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-31 06:51:01 +00:00
Samuel Pitoiset 7c50214aab radv: implement VK_KHR_shader_float_controls
This exposes what's required for DX and this is what we already
configure. The driver flushes denorms for FP32 and preserves them
for FP16/FP64. Note that we can't allow both preserving and
flushing denorms because this won't work for merged shaders. This
will require LLVM to update the float mode register to make it work.

Only enabled on GFX8+ with the LLVM path because it's untested on
previous chips and ACO doesn't support it.

This extension is required for SPIRV 1.4.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-18 16:55:58 +02:00
Samuel Pitoiset e19d1ee2d1 radv/gfx10: fix NGG streamout with triangle strips for VS
The number of vertices has to be adjusted with the output primitive
type.

This fixes dEQP-VK.transform_feedback.simple.triangle_strip_*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:09:35 +02:00
Samuel Pitoiset 08ab13d340 radv/gfx10: fix storing/loading NGG stream outputs for GS
The GS outputs are stored differently in the LDS storage, they
are indexed by out_idx which is incremented for each stored DWORD.
Thus, we need a different path for exporting the stream outputs.

This fixes a bunch of CTS failures when NGG GS is force enabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:09:32 +02:00
Samuel Pitoiset 3be21b5ab1 radv/gfx10: use the component mask when storing/loading NGG stream outputs
It's unnecessary to store/load more components that needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:09:30 +02:00
Samuel Pitoiset 60f8224171 radv/gfx10: fix storing/loading NGG stream outputs for VS and TES
The LDS storage allocated for stream outputs is 4 * N, where N
is the number of outputs. So, we have to store/load with N as index
and not with the output location as index.

This doesn't fix anything known but it should fix out-of-bounds
access and it also reduces the number of outputs written to the
LDS storage.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:09:27 +02:00
Rhys Perry 3c966fd688 aco,radv: rename record_llvm_ir/llvm_ir_string to record_ir/ir_string
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 11:08:47 +01:00
Rhys Perry b3f71685d9 radv: never kill a NGG GS shader
Seems to fix a hang with excessive vertex emissions when NGG is used for
GS.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-18 19:26:58 +00:00
Samuel Pitoiset 68820007fd radv: fix loading 64-bit GS inputs
We have to load 2 32-bit integer and to cast correctly.

This fixes crashes with gs-double-interpolator.vk_shader_test.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111734
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-18 17:16:36 +02:00
Rhys Perry ffabcbba60 radv: always emit a position export in gs copy shaders
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: f8d0337299 ('radv: add multiple streams support for the GS copy shader')
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-16 19:42:30 +00:00
Rhys Perry 0f29c9df31 radv: keep GS threads with excessive emissions which could write to memory
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-16 19:42:30 +00:00
Samuel Pitoiset d0fd82b502 radv/gfx10: implement NGG streamout
It's still disabled by default because transform feedback randomly
hangs and it seems like it's related to GDS (cf. RadeonSI).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset 67093ed3a3 radv/gfx10: unconditionally declare scratch space for NGG streamout without GS
Streamout outputs are stored in the ESGS ring.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset 5ebc76471c radv/gfx10: adjust the GS NGG scratch size for streamout
It needs more space for multiple streams.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset a15b3bcf1a radv/gfx10: add an option to switch from legacy to NGG streamout
This internal option is turned off by default because NGG streamout
still hangs. It seems like it's related to GDS as RadeonSI.

That option will be turned on once all issues are resolved.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset 8137df3a46 radv: fix allocating number of user sgprs if streamout is used
streamout_buffers is assigned after that function, so the previous
fix was completely wrong. This probably fix something when streamout
buffers and push constants are used/inlined in the same shader.

Fixes: 378e2d2414 ("radv: fix computing number of user SGPRs for streamout buffers")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-13 07:55:30 +02:00
Samuel Pitoiset 538766792d radv/gfx10: declare a LDS symbol for the NGG emit space
This fixes some interactions when NGG GS is enabled. It fixes:

- dEQP-VK.clipping.user_defined.clip_cull_distance_dynamic_index.*geom*
- dEQP-VK.tessellation.geometry_interaction.passthrough.*

For some reasons, using the computed ESGS ring size randomly hangs
with CTS. For now, just use the maximum LDS size for ESGS.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:27:01 +02:00
Samuel Pitoiset a9af11f1fa radv: fill shader info for all stages in the pipeline
This shouldn't be in NIR->LLVM because ACO also needs the shader
info. This will also help for computing some NGG values that are
necessary for declaring LDS symbols.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:26:45 +02:00
Samuel Pitoiset 8cf297c7b1 radv: do not pass all compiler options to the shader info pass
Only the pipeline layout and the shader keys are needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:26:42 +02:00
Samuel Pitoiset 0bf51b6941 radv/gfx10: determine the number of vertices per primitive for TES
This doesn't fix anything known but it's correct now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 17:36:49 +02:00
Samuel Pitoiset c6be5cefba radv/gfx10: make use the output usage mask when exporting NGG GS params
It shouldn't matter much because output varyings should have been
compacted during NIR shader linking but it mirrors what the driver
does when emitting NGG GS vertex parameters.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 17:25:28 +02:00
Samuel Pitoiset b1a872f0c0 radv/gfx10: account for the subpass view for the NGG GS storage
If the fragment shader needs the layer index, we have to allocate
one more dword in the NGG GS storage. Found by inspection. This
doesn't fix anything known.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 17:25:28 +02:00
Samuel Pitoiset f31fb33432 radv: calculate esgs_itemsize in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:24 +02:00