Commit Graph

356 Commits

Author SHA1 Message Date
Marek Olšák dd9801a918 radeonsi: rename SI_SGPR_RW_BUFFERS to SI_SGPR_INTERNAL_BINDINGS
They are just internal buffers and images.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8653>
2021-01-22 16:45:30 +00:00
Marek Olšák 488cd3b93f radeonsi: clear dirty_states if si_pm4_bind_state is unbinding or no-op
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8653>
2021-01-22 16:45:30 +00:00
Marek Olšák 961aa67adf radeonsi: add a specialized function for CP DMA L2 prefetch
This radically simplifies the code to decrease CPU overhead in si_draw_vbo.
The generic CP DMA copy function is too complicated.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8548>
2021-01-18 01:17:19 +00:00
Marek Olšák 0eca4660a5 radeonsi: make cik_emit_prefetch_L2 templated and move it to si_state_draw.cpp
This is a great candidate for a template. There are a lot of conditions
that are already templated in si_draw_vbo.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8548>
2021-01-18 01:17:19 +00:00
Marek Olšák 4056e953fe radeonsi: move emit_cache_flush functions into si_gfx_cs.c
This is a better place for them. They are not inlined anyway.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8548>
2021-01-18 01:17:19 +00:00
Marek Olšák df456312c2 radeonsi: constant buffer cleanups
si_set_clip_state unreferenced a NULL pointer = no-op.
si_set_tess_state can just pass the user buffer to si_set_rw_buffer directly.

Then si_upload_const_buffer can be static.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8548>
2021-01-18 01:17:19 +00:00
Pierre-Eric Pelloux-Prayer aa9fe1e423 radeonsi: pass radeon_cmdbuf to emit_cache_flush
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:09:25 +01:00
Marek Olšák dffc27e5e1 radeonsi: fix small primitive culling with MSAA force-disabled and smoothing
The problem was that the shader constants were based on the framebuffer
sample count and ignored the multisample enable state and the line/polygon
smoothing state, which uses MSAA rasterization that only sets SampleMaskIn
to get the coverage for alpha-blended smoothing (the PS epilog computes
the alpha channel from SampleMaskIn and blending generates the AA results).

- This is a complete rework that adds a new state for NGG cull constants.
- It fixes the same thing for the prim discard compute shader.
- It documents how VS_STATE.SMALL_PRIM_PRECISION is encoded.

It fixes blue corruption in Unigine Heaven with MSAA and Medium details
or better.

Fixes: 7648060dc0 - radeonsi: enable NGG culling by default on gfx10.3 dGPUs

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8022>
2020-12-16 00:43:45 -05:00
Marek Olšák 85af48b0ee radeonsi: allow including a few files from C++
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7807>
2020-12-09 16:01:21 -05:00
Marek Olšák c3432ad852 radeonsi: add an option to enable 2x2 coarse shading for non-GUI elements
This is for experiments with VRS.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7646>
2020-11-17 22:16:19 +00:00
Marek Olšák 8ab15c9e33 radeonsi: move si_upload_vertex_buffer_descriptors into si_state_draw.c
It will be inlined there.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6786>
2020-09-24 13:08:03 +00:00
Marek Olšák c56fbed99b radeonsi: kill point size VS output if it's not used by the rasterizer
Fixed-func shaders can contain the output, because their generator
doesn't consider the current primitive type into account.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6620>
2020-09-07 11:27:30 +00:00
Marek Olšák b1cb72c449 radeonsi: change PIPE_SHADER to MESA_SHADER (si_shader_selector::type)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6340>
2020-09-02 23:03:00 -04:00
Marek Olšák 69014d8c94 radeonsi: implement CP register shadowing
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5798>
2020-07-22 12:08:19 -04:00
Timothy Arceri bba766d85d radeonsi: fix SI_NUM_ATOMS
This is not used anywhere so maybe we should just drop it instead.

Fixes: 639b673fc3 ("radeonsi: don't use an indirect table for state atoms")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5766>
2020-07-08 03:04:03 +00:00
Timothy Arceri 4686a95621 r600/radeonsi: silence zero-length-bounds gcc warnings
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5766>
2020-07-08 03:04:03 +00:00
Pierre-Eric Pelloux-Prayer 5a05f9714b radeonsi: bump SI_NUM_SHADER_BUFFERS to 32
Some app uses more than 8 SSBOs (https://gitlab.freedesktop.org/mesa/mesa/-/issues/2946),
so increase SI_NUM_SHADER_BUFFERS to 32 (which allows 16 SSBOs).

Since we're now using a 64 bits number to track buffers, we could bump
SI_NUM_SHADER_BUFFERS to 48 but that would conflict with Mesa's
MAX_COMBINED_ATOMIC_BUFFERS limit (= 90).

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2122
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5632>
2020-06-30 09:23:14 +02:00
Marek Olšák 1c1d34a67a radeonsi: rename init_config states to cs_preamble states
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5603>
2020-06-26 07:02:57 +00:00
Marek Olšák caeb44aa24 radeonsi: split si_all_descriptors_begin_new_cs and rename functions
A future commit will extend it.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5095>
2020-05-23 03:45:07 -04:00
Pierre-Eric Pelloux-Prayer 8873ea0e25 radeonsi: determine secure flag must be set for gfx IB
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>
2020-05-11 10:25:53 +02:00
Marek Olšák d3da73954a radeonsi: add SI_IMAGE_ACCESS_DCC_OFF to ignore DCC for shader images
A shader-based DCC decompress pass will use this.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>
2020-04-30 22:27:31 +00:00
Pierre-Eric Pelloux-Prayer d7008fe46a radeonsi: switch to 3-spaces style
Generated automatically using clang-format and the following config:

AlignAfterOpenBracket: true
AlignConsecutiveMacros: true
AllowAllArgumentsOnNextLine: false
AllowShortCaseLabelsOnASingleLine: false
AllowShortFunctionsOnASingleLine: false
AlwaysBreakAfterReturnType: None
BasedOnStyle: LLVM
BraceWrapping:
  AfterControlStatement: false
  AfterEnum: true
  AfterFunction: true
  AfterStruct: false
  BeforeElse: false
  SplitEmptyFunction: true
BinPackArguments: true
BinPackParameters: true
BreakBeforeBraces: Custom
ColumnLimit: 100
ContinuationIndentWidth: 3
Cpp11BracedListStyle: false
Cpp11BracedListStyle: true
ForEachMacros:
  - LIST_FOR_EACH_ENTRY
  - LIST_FOR_EACH_ENTRY_SAFE
  - util_dynarray_foreach
  - nir_foreach_variable
  - nir_foreach_variable_safe
  - nir_foreach_register
  - nir_foreach_register_safe
  - nir_foreach_use
  - nir_foreach_use_safe
  - nir_foreach_if_use
  - nir_foreach_if_use_safe
  - nir_foreach_def
  - nir_foreach_def_safe
  - nir_foreach_phi_src
  - nir_foreach_phi_src_safe
  - nir_foreach_parallel_copy_entry
  - nir_foreach_instr
  - nir_foreach_instr_reverse
  - nir_foreach_instr_safe
  - nir_foreach_instr_reverse_safe
  - nir_foreach_function
  - nir_foreach_block
  - nir_foreach_block_safe
  - nir_foreach_block_reverse
  - nir_foreach_block_reverse_safe
  - nir_foreach_block_in_cf_node
IncludeBlocks: Regroup
IncludeCategories:
  - Regex:           '<[[:alnum:].]+>'
    Priority:        2
  - Regex:           '.*'
    Priority:        1
IndentWidth: 3
PenaltyBreakBeforeFirstCallParameter: 1
PenaltyExcessCharacter: 100
SpaceAfterCStyleCast: false
SpaceBeforeCpp11BracedList: false
SpaceBeforeCtorInitializerColon: false
SpacesInContainerLiterals: false

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4319>
2020-03-30 11:05:52 +00:00
Marek Olšák 0db74f479b radeonsi: use the live shader cache
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>
2020-01-24 20:29:29 -05:00
Marek Olšák aa2d846604 radeonsi/gfx10: move GE_PC_ALLOC setting to shader states
The value is not changed. I just use a different way to compute it.

The value will vary with NGG culling.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák 5fa2ab831e radeonsi: fork tgsi_shader_info and tgsi_tessctrl_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák 1e03b63b3b radeonsi: rename desc_list_byte_size -> vb_desc_list_alloc_size
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-13 15:57:07 -05:00
Marek Olšák fd84e422b6 radeonsi: clean up messy si_emit_rasterizer_prim_state
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-08 15:48:49 -05:00
Marek Olšák b64a3240c2 radeonsi: determine accurately if line stippling is enabled for performance
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-08 15:48:47 -05:00
Marek Olšák 451bc91158 radeonsi: disallow compute-based culling if polygon mode is enabled
Polygon mode can generate thick points or lines.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák 62229e8949 radeonsi: use IR SHA1 as the cache key for the in-memory shader cache
instead of using whole IR binaries. This saves some memory.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-05 23:28:42 -05:00
Marek Olšák 743a9d85e2 radeonsi: add FMASK slots for shader images (for MSAA images)
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:31 -04:00
Marek Olšák 4dde40908f radeonsi/gfx10: set PA_CL_VS_OUT_CNTL with CONTEXT_REG_RMW to fix edge flags
We need two different values of the register, one for NGG and one for
legacy, in order to fix edge flags for the legacy pipeline.

Passing the ngg flag to emit_clip_regs would be too complicated,
so CONTEXT_REG_RMW is used for partial register updates.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák 776f05a307 radeonsi/gfx10: fix the PRIMITIVES_GENERATED query if using legacy streamout
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák a9bb566955 radeonsi: move some global shader cache flags to per-binary flags
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák 810846e157 radeonsi/gfx10: fix the legacy pipeline by storing as_ngg in the shader cache
It could load an NGG shader when we want a legacy shader and vice versa.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák 8b68511ebc radeonsi: DCC MSAA blending bug - include logic op, limit to Navi14 and older
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:50 -04:00
Marek Olšák e718f8e713 radeonsi: simplify si_get_input_prim and remove incorrect TODO comment
u_vertices_per_prim(QUADS) is the same as TRIANGLES.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-23 15:03:49 -04:00
Marek Olšák e08463ac22 radeonsi/gfx10: update a tunable max_es_verts_base for NGG
We have to fix the computation so as not to break quads.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák a4b3eea325 radeonsi/gfx10: consolidate & improve input_prim determination for NGG
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák b680f723f8 radeonsi/gfx10: export correct PrimitiveID from NGG vertex shaders
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák 6920f09f4b radeonsi/gfx10: fix GL_LINE polygon mode for decomposed primitives
We need to tell PA to accept edge flags generated by the input assembler,
because decomposed primitives shouldn't draw inner edges.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle 792a638b03 radeonsi/gfx10: implement streamout-related queries
The NGG hardware pipeline doesn't track these statistics automatically,
and in fact *cannot* track them automatically when API geometry shaders
are involved, so we accumulate statistics in the shader using atomic
adds.

This implementation accumulates statistics via the memory system and
the RW buffer descriptor setup. We could use GDS, but since these
atomics aren't latency-sensitive, that basically just trades off
L2$ bandwidth vs. export bus bandwidth. One single memory transaction
per shader workgroup doesn't seem too bad. The result ring buffer in
memory is needed either way to avoid pipeline stalls.

The shader code contains the atomic unconditionally, though the
GFX10_GS_QUERY_BUF is a null buffer when no queries are active. The
atomic is simply discarded by the shader hardware in that case.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle 77e715541c radeonsi/gfx10: emit VGT_GS_OUT_PRIM_TYPE from draw and add it to VS_STATE
With NGG, the VGT_GS_OUT_PRIM_TYPE can change without a shader change.

The VS_STATE is required for both streamout and culling from a vertex
shader without pre-compiling outprim-specific variants.

We could consider compiling specialized variants in the future. We
could also consider compiling the NGG logic as an epilog.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle 016a465d7d radeonsi/gfx10: implement gfx10_shader_ngg
For pipelines without API GS. We will later expand this to cover NGG
geometry shaders as well.

Note that the vtx offset passed into the GS part is just the
vertex index multiplied by VGT_ESGS_RING_ITEMSIZE.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle 7bb9bb0540 radeonsi/gfx10: implement gfx10_emit_cache_flush
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle 595a7f7c47 radeonsi/gfx10: add pipe_screen::make_texture_descriptor
Texture descriptors in gfx10 are very different.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle 064f195ef0 radeonsi: make si_restore_qbo_state externally available
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Marek Olšák 0f1b070bad radeonsi: remove old_va parameter from si_rebind_buffer by remembering offsets
This is a prerequisite for the next commit.

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
2019-05-16 13:14:55 -04:00
Marek Olšák 04122532e3 radeonsi: invalidate caches at the beginning of the prim discard compute IB
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:13:36 -04:00
Marek Olšák c9b7a37b8f radeonsi: cull primitives with async compute for large draw calls
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:13:34 -04:00