KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Marek Olšák	eddb65ffb0	radeonsi: don't use NGG passthrough if culling is possible for better perf Switching NGG passthrough on/off decreases performance because it causes context rolls. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12812>	2021-09-10 23:32:03 +00:00
Marek Olšák	1f8be99621	radeonsi: enable shader-based prim culling with polygon mode Polygon mode should have no effect on culling, so keep it enabled. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12812>	2021-09-10 23:32:03 +00:00
Marek Olšák	576f8394db	radeonsi: remove the primitive discard compute shader It doesn't always work, it's only useful on gfx9 and older, and it's too complicated. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4011 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12812>	2021-09-10 23:32:03 +00:00
Emma Anholt	17332ceb0f	mesa/st: Add an optional GLSL link fail msg to finalize_nir. GLES2 drivers are allowed to reject some GLSL constructs, like dynamic loop bounds (which neither i915g nor vc4 can fully support), but gallium hasn't had any way to trigger a link failure. Add a return msg to the finalize_nir hook, which is called at the end of GLSL linking, and use that. This means that some other callers of finalize need to do something with the msg, and we (for now) just throw it away. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12218>	2021-09-06 18:09:25 +00:00
Marek Olšák	c005b2cd4b	radeonsi: move as_ls/es/ngg setting out of si_shader_selector_key Do it when we bind shaders. The advantages are: - no need to memset the fields when any shader variant state is changed (e.g. culling on/off) - no need to recompute the fields every time that happens Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12656>	2021-09-01 00:42:57 +00:00
Marek Olšák	08310f85ae	radeonsi: remove instancing support from the prim discard compute shader It's not important for workstation apps on Vega. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12656>	2021-09-01 00:42:57 +00:00
Ian Romanick	5f2dbd45f2	gallium: Remove "optimize" parameter from pipe_screen::finalize_nir As part of adding support for inline uniforms in Iris, I was going to add a finalize_nir hook. I went looking to see how other drivers use the "optimize" parameter, and I discovered that nobody uses it at all. v2: Fix typo in commit message. Noticed by Mike. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12317>	2021-08-13 15:45:29 -07:00
Pierre-Eric Pelloux-Prayer	9fe8ae3fcd	radeonsi: don't create an infinite number of variants If a shader has code like this: uniform float timestamp; ... if (timestamp > 0.0) do_something() And timestamp is modified each frame, we'll end up generating a new variant per frame. This commit introduces a hard limit on the number of variants we generate for a single shader. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5121 Fixes: `b7501184b9` ("radeonsi: implement inlinable uniforms") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12138>	2021-08-09 10:26:54 +00:00
Marek Olšák	06da711350	radeonsi: remove the GDS variants of compute-based primitive discard The GDS ordered append variant is unstable due to kernel and firmware bugs. The unordered GDS variant isn't faster than the memory-based variant. Only the memory-based variant is kept. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11510>	2021-06-28 13:23:14 +00:00
Marek Olšák	fc95ba6c86	radeonsi: remove the Z culling option from the primitive discard CS Not useful. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11102>	2021-06-21 19:03:29 +00:00
Marek Olšák	1e9cc86511	radeonsi: merge 2 conditional blocks with same condition into 1 in culling code The block only loads input VGPRs from LDS, and the next block uses them. The entering condition is the same, even though the second block is the next shader part beginning with the prolog. Simply move the VGPR loads into the prolog. This decreases the shader code size by 12 bytes. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11102>	2021-06-21 19:03:29 +00:00
Pierre-Eric Pelloux-Prayer	b78a38bd02	radeonsi: use si_nir_is_output_const_if_tex_is_const When a blending mode producing "color = src * dst" is used and we can determine that dst is 1, then the draw call can be dropped completely. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10979>	2021-06-15 11:18:02 +02:00
Marek Olšák	c53f25b668	radeonsi: kill 16-bit VS outputs if PS doesn't use them or doing Z-only draw The kill_outputs logic uses our internal IO indices. Just add indices for 16-bit varyings. We don't have enough free indices to use, but we can reuse the indices that GLES doesn't have. Those are all the legacy desktop GL varyings. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9051>	2021-04-13 21:10:43 -04:00
Marek Olšák	7db43960f6	radeonsi: implement 16-bit VS->PS varyings Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9051>	2021-04-13 21:10:43 -04:00
Pierre-Eric Pelloux-Prayer	a27ea38d2a	radeonsi/sqtt: keep a copy of the uploaded shader code Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Marek Olšák	e9e385b084	radeonsi: gather shader info about VMEM usage for MEM_ORDERED Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9028>	2021-02-17 04:49:24 -05:00
Marek Olšák	27e22f025c	radeonsi: gather shader info about indirect UBO/SSBO/samplers/images A future commit will use it. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9028>	2021-02-17 04:49:24 -05:00
Marek Olšák	19e6601413	radeonsi: do late NIR optimizations after uniform inlining This was missing. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9028>	2021-02-17 04:49:24 -05:00
Marek Olšák	ffbf3a5f8b	radeonsi: simplify the NGG culling condition in si_draw_vbo Changes: - disallow NGG culling for GS, fast launch for tess using template args (GS can't do NGG culling, tess can't do fast launch) - skip checking current_rast_prim with tessellation (bake the condition into ngg_cull_vert_threshold) - use only 1 vertex count threshold for enabling NGG shader culling to simplify it. I think it doesn't have a big impact. The threshold computation depends on more parameters than just fast launch. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8434>	2021-02-02 05:42:32 +00:00
Marek Olšák	dd9801a918	radeonsi: rename SI_SGPR_RW_BUFFERS to SI_SGPR_INTERNAL_BINDINGS They are just internal buffers and images. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8653>	2021-01-22 16:45:30 +00:00
Marek Olšák	62703b79a5	radeonsi: remove si_gs_prolog_bits::gfx9_prev_is_vs It didn't do anything useful. GS doesn't use the other user SGPRs. If we decrease the number of user SGPRs we declare for the GS prolog, we can remove gfx9_prev_is_vs. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8344>	2021-01-06 23:28:04 -05:00
Marek Olšák	b6b6d1ff3c	radeonsi: fix hang caused by for loop with exec=0 in LS and ES LLVM expects that exec != 0 when entering loops and generates this code that becomes an infinite loop if exec == 0: BB5_1: vcc_lo = (inverted terminating condition) s_and_b32 vcc_lo, exec_lo, vcc_lo s_cbranch_vccnz BB5_3 // jump if vcc != 0 (break statement) // ... loop body ... s_branch BB5_1 BB5_3: For non-monolithic VS before TCS, VS before GS, and TES before GS, we set exec = (thread enabled mask), which sets 0 for HS-only and GS-only waves, causing the infinite loop condition above. Fix it as follows: - set exec = ~0 at the beginning - wrap the whole shader (LS and ES) in a conditional block, so that HS-only and GS-only waves jump over it and never enter such a loop The TES before GS hang can be reproduced by gfxbench: testfw_app --gfx egl -w 1920 -h 1080 --gl_api gles -t gl_tess Fixes: `68d6d097f1` - radeonsi/gfx9: add GFX9 and VEGA10 enums Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8344>	2021-01-06 23:28:01 -05:00
Yogesh mohan marimuthu	8a22fc9502	radeonsi: enable vrs2x2 coarse shading if flat shading (v9) Enable vrs2x2 coarse shading if flat shading as per idea and guidance given by Marek. is_flat_shading variable in struct si_shader_info is set based on the data from gather_intrinsic_info() function and struct si_state_rasterizer. If is_flat_shading_variable is set, then in function si_emit_db_render_state() vrs2x2 shading is enabled in hardware. v2: Fix review comments from Pierre-Eric. Code optimizations. v3: Fix indentation style issue. v4: Fix review comments from Marek. Fixed logical issue pointed by Marek where info->is_flat_shading variable can be corrupted and other code cleanup. v5: Make the code compact as suggested by Pierre-Eric. v6: Fix new review comments from Marek. v7: use info->uses_interp_color variable fix from Marek. v8: Fix coding style comment from Marek. v9: Add uses_fbfetch_output check as suggested by Marek. Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8161>	2021-01-06 10:12:10 +05:30
Marek Olšák	76eb3478cf	radeonsi: take color interpolation into account for shader variants Fixes: - Sample shading now uses per-sample interpolation for colors if colors are the only inputs. (this is the only case that was broken) Optimizations: - BC_OPTIMIZE (barycentric optimization) is now enabled with MSAA if colors are qualified with both center and centroid. (BC_OPTIMIZE means that the hardware skips initializing centroid (i,j) if they are equal to center (i,j)) - If MSAA is disabled and at least 2 out of (center, centroid, sample) are used by all inputs now including colors, center is forced for all inputs. - If INTERP_MODE_COLOR is not used and the legacy GL shade model is flat, the shader variant for flat shading is not generated. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8225>	2021-01-05 02:43:55 +00:00
Marek Olšák	fe839baf6a	radeonsi: fix future C++ compile failures and warnings Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7807>	2020-12-09 16:01:29 -05:00
Marek Olšák	85af48b0ee	radeonsi: allow including a few files from C++ Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7807>	2020-12-09 16:01:21 -05:00
Marek Olšák	c7470c1760	radeonsi: don't set DrawID and StartInstance if they are unused Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7721>	2020-12-01 15:33:03 -05:00
Marek Olšák	623ea81530	radeonsi: don't update provoking vertex and outprim states in SGPR if unused Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7721>	2020-12-01 15:33:03 -05:00
Marek Olšák	4641dca269	radeonsi: don't update indexed flag in SGPR if it's unused to skip the register update when switching between indexed and non-indexed Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7721>	2020-12-01 15:33:03 -05:00
Marek Olšák	509142876b	radeonsi: add AMD_DEBUG=nofastlaunch for debugging Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7721>	2020-12-01 15:33:03 -05:00
Marek Olšák	aaed7a29be	radeonsi: implement GS fast launch for indexed triangle strips This increases performance for indexed triangle strips up to +100%. In practice, it's limited by memory bandwidth and compute power, so 256-bit memory bus and a lot of CUs are recommended. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7681>	2020-11-27 06:16:59 +00:00
Marek Olšák	61fe66a2e4	radeonsi: pass VS->TCS IO via VGPRs if VS and TCS have the same thread count It can only be done if a TCS input is accessed without indirect indexing and with gl_InvocationID as the vertex index, and the number of VS and TCS threads is the same. This eliminates LDS stores and loads for VS->TCS IO, reducing shader lifetime and LDS traffic. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:21 +00:00
Marek Olšák	1190808eca	radeonsi: if VS and TCS have the same number of threads, merge the conditionals Instead of: if (VS) { VS; } if (TCS) { TCS; } Do this if the number of threads is the same in VS and TCS: exec = enabled_threads; VS; TCS; Skipping declare_vb_descriptor_input_sgprs is needed to match the VS return values. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:21 +00:00
Marek Olšák	c4310f70aa	radeonsi: swap DrawId and StartInstance SGPR locations We need to change both values at the same time, so they need to be next to each other. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7441>	2020-11-18 01:41:25 +00:00
Marek Olšák	b7501184b9	radeonsi: implement inlinable uniforms This improves performance for uber shaders. It must be enabled using the new driconf option. The driver compiles the specialized shaders in another thread without stalls, same as all other optimizations. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7057>	2020-10-30 11:07:22 +00:00
Marek Olšák	1de0bf0a56	radeonsi: remove indirection when loading position at the end for NGG culling If we store the position into LDS after we know the new thread ID, we don't need to remember the old thread ID. The culling code only needs W, X/W, Y/W, so we have to keep those. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7172>	2020-10-17 01:58:19 +00:00
Marek Olšák	f5912c6d32	radeonsi: kill disabled clip distances and planes at per-channel granularity Apps often enable only 1 plane for gl_ClipVertex, which means 1 scalar clip distance. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6948>	2020-10-01 16:29:46 +00:00
Marek Olšák	30c3b2c0b6	radeonsi: simplify NGG culling enablement and add radeonsi_shader_culling option Add a vertex count threshold into si_shader_selector to simplify the draw_vbo code. The new option is supposed to be used in 00-mesa-defaults.conf and should be tweaked for best performance unlike the AMD_DEBUG experimental options. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6948>	2020-10-01 16:29:46 +00:00
Marek Olšák	d1d27e9db4	radeonsi: remove redundant info.uses_fbfetch Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6782>	2020-09-25 04:37:23 -04:00
Marek Olšák	98a52fecda	radeonsi: implement 16-bit FS color outputs This removes type conversions from 16 bits to 32 bits in the main function and then back to 16 bits in the epilog. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6622>	2020-09-22 02:44:53 +00:00
Marek Olšák	c56fbed99b	radeonsi: kill point size VS output if it's not used by the rasterizer Fixed-func shaders can contain the output, because their generator doesn't take the current primitive type into account. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6620>	2020-09-07 11:27:30 +00:00
Marek Olšák	1dd243d4f5	radeonsi: use shader_info::cs::local_size_variable to clean up some code Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6624>	2020-09-07 11:15:41 +00:00
Marek Olšák	757f790ad8	radeonsi: remove redundant si_shader_info::uses_derivatives Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6624>	2020-09-07 11:15:41 +00:00
Marek Olšák	f3f08bca23	radeonsi: remove redundant si_shader_selector::max_gs_stream Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6624>	2020-09-07 11:15:41 +00:00
Marek Olšák	2b4fa68808	radeonsi: remove redundant GS variables in si_shader_selector Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6624>	2020-09-07 11:15:40 +00:00
Marek Olšák	7960668dc9	radeonsi: remove redundant si_shader_info::writes_memory Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6624>	2020-09-07 11:15:40 +00:00
Marek Olšák	83cdffd435	radeonsi: rename num_memory_instructions -> num_memory_stores it only counts stores Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6624>	2020-09-07 11:15:40 +00:00
Marek Olšák	c8ab5899c1	radeonsi: reduce type sizes in si_shader_selector Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6624>	2020-09-07 11:15:40 +00:00
Marek Olšák	99c4e61084	radeonsi: remove redundant si_shader_info::uses_kill Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6624>	2020-09-07 11:15:40 +00:00
Marek Olšák	8df349a31e	radeonsi: merge uses_persp_opcode_interp_sample/uses_linear_opcode_interp_sample Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6624>	2020-09-07 11:15:40 +00:00
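
Commit 9fe8ae3fcd above describes capping the number of shader variants so that a uniform changing every frame cannot generate a new specialized variant per frame. The following is a minimal sketch of that idea only, not the radeonsi code: the names `MAX_INLINED_VARIANTS`, `struct selector`, and `get_specialized_variant`, as well as the cap value of 4, are hypothetical and chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cap; the real driver's limit is not stated in the log. */
#define MAX_INLINED_VARIANTS 4

/* One specialized variant, keyed by the uniform value that was inlined. */
struct variant {
    uint32_t inlined_value;
};

/* Hypothetical per-shader bookkeeping of the variants created so far. */
struct selector {
    struct variant variants[MAX_INLINED_VARIANTS];
    unsigned num_variants;
};

/*
 * Returns true if a variant specialized for `value` already exists or
 * could be created. Returns false once the cap is reached, in which
 * case the caller would fall back to the generic (non-inlined) shader.
 */
static bool get_specialized_variant(struct selector *sel, uint32_t value)
{
    for (unsigned i = 0; i < sel->num_variants; i++) {
        if (sel->variants[i].inlined_value == value)
            return true; /* reuse the existing variant */
    }

    if (sel->num_variants >= MAX_INLINED_VARIANTS)
        return false; /* hard limit hit: do not create another variant */

    sel->variants[sel->num_variants++].inlined_value = value;
    return true;
}
```

With a uniform that takes a new value on each of ten frames, this sketch creates four variants and reports a fallback for the remaining six, so the variant count stays bounded regardless of how long the application runs.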


409 Commits