KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Marek Olšák	17164d4e27	radeonsi/gfx10: don't declare any LDS for NGG if it's not used Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-27 13:50:57 -05:00
Marek Olšák	378444ce90	radeonsi: allow generating VS prologs with 0 inputs If "ls_vgpr_fix" is set, we use a prolog, but it can have 0 inputs. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>	2019-12-16 20:06:07 +00:00
Marek Olšák	442ef8c3e3	radeonsi: keep serialized NIR instead of nir_shader in si_shader_selector This decreases memory usage, because serialized NIR is more compact. The main shader part is compiled from nir_shader. Monolithic shader variants are compiled from nir_binary. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-05 23:28:45 -05:00
Marek Olšák	fff884e09d	radeonsi/nir: implement pipe_screen::finalize_nir Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-23 21:12:52 -04:00
Marek Olšák	268e0e01f3	radeonsi/nir: simplify si_lower_nir signature just a cleanup	2019-10-15 21:52:09 -04:00
Marek Olšák	eec7b0a865	radeonsi: use simple_mtx_t instead of mtx_t Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-07 20:05:07 -04:00
Marek Olšák	4dde40908f	radeonsi/gfx10: set PA_CL_VS_OUT_CNTL with CONTEXT_REG_RMW to fix edge flags We need two different values of the register, one for NGG and one for legacy, in order to fix edge flags for the legacy pipeline. Passing the ngg flag to emit_clip_regs would be too complicated, so CONTEXT_REG_RMW is used for partial register updates. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	1426acf9e7	radeonsi/gfx10: remove incorrect ngg/pos_writes_edgeflag variables It varies depending on si_shader_key::as_ngg. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	223b3174bd	radeonsi/nir: always lower ballot masks as 64-bit, codegen handles it This fixes KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks. This solution is better, because the IR isn't dependent on wave32.	2019-08-19 17:23:38 -04:00
Marek Olšák	5167ca27fa	gallium: add TGSI_SEMANTIC_DEFAULT_OUTER/INNER_LEVEL for radeonsi NIR support.	2019-08-12 14:52:17 -04:00
Marek Olšák	902dd50cf0	gallium: add AMD-specific compute TGSI enums for tgsi_to_nir	2019-08-12 14:52:17 -04:00
Marek Olšák	6a2bdb8d01	gallium: add TGSI_PROPERTY_VS_BLIT_SGPRS_AMD for tgsi_to_nir needed by radeonsi NIR support	2019-08-12 14:52:17 -04:00
Marek Olšák	f064b530f6	radeonsi/gfx10: remove an obsolete VGT_REUSE_OFF workaround Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:09:01 -04:00
Marek Olšák	e777720173	radeonsi/nir: lower PS inputs before scanning the shader Lowering PS inputs can eliminate some of them, which messes up persp/linear barycentric coord usage info. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:46 -04:00
Marek Olšák	9ac7d0a0e2	radeonsi: fix packing of key.mono.u.ps	2019-07-30 22:06:23 -04:00
Marek Olšák	8f72f137ad	radeonsi/gfx10: add as_ngg variant for TES as ES to select Wave32/64 Legacy GS has to use Wave64, so TES before GS has to use Wave64 too. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	88efb63caf	radeonsi/gfx10: implement Wave32 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	a8a526c5cb	radeonsi/gfx10: set as_ngg for GS prolog as_ngg is required by Wave32. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	0f30223cf4	radeonsi/gfx10: combine hw edgeflags with user edgeflags for correct behavior Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	985a59e0d1	radeonsi/gfx10: don't compile the GS copy shader if it's 100% not needed Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	1c99a13f89	radeonsi: decrease maximum supported GENERIC varying index from 42 to 31 This can decrease LDS and/or memory usage for shader outputs when geometry shaders or tessellation is used. Only PS inputs support higher indices and those aren't eliminated by kill_outputs. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	3be4ed2fe1	radeonsi: fix and clean up shader_type passing - don't pass it via a parameter if it can be derived from other parameters - set shader_type for ac_rtld_open - use enum pipe_shader_type instead of unsigned Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	f66ee5af2f	radeonsi: determine the rasterization primitive type accurately (v2) v2: reworked version to fix bugs and make it more efficient Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	32694456f7	radeonsi/gfx10: jump over the shader query atomic if the queries are disabled Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	b680f723f8	radeonsi/gfx10: export correct PrimitiveID from NGG vertex shaders Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	6920f09f4b	radeonsi/gfx10: fix GL_LINE polygon mode for decomposed primitives We need to tell PA to accept edge flags generated by the input assembler, because decomposed primitives shouldn't draw inner edges. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	77e715541c	radeonsi/gfx10: emit VGT_GS_OUT_PRIM_TYPE from draw and add it to VS_STATE With NGG, the VGT_GS_OUT_PRIM_TYPE can change without a shader change. The VS_STATE is required for both streamout and culling from a vertex shader without pre-compiling outprim-specific variants. We could consider compiling specialized variants in the future. We could also consider compiling the NGG logic as an epilog. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	4ecc39e1aa	radeonsi/gfx10: NGG geometry shader PM4 and upload Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	016a465d7d	radeonsi/gfx10: implement gfx10_shader_ngg For pipelines without API GS. We will later expand this to cover NGG geometry shaders as well. Note that the vtx offset passed into the GS part is just the vertex index multiplied by VGT_ESGS_RING_ITEMSIZE. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	8ec60d3031	radeonsi/gfx10: add as_ngg shader key bit Also add the shader main part NGG variant, so that in principle we can switch between legacy in NGG modes. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	226f650d92	radeonsi/gfx10: document NGG shader stages Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	b519ddc35c	radeonsi/gfx9: declare LDS ESGS ring as an explicit symbol on LLVM >= 9 This will make it easier to use LDS for other purposes in geometry shaders in the future. The lifetime of the esgs_ring variable is as follows: - declared as [0 x i32] while compiling shader parts or monolithic shaders - just before uploading, gfx9_get_gs_info computes (among other things) the final ESGS ring size (this depends on both the ES and the GS shader) - during upload, the "esgs_ring" symbol is given to ac_rtld as a shared LDS symbol, which will lead to correctly laying out the LDS including other LDS objects that may be defined in the future - si_shader_gs uses shader->config.lds_size as the LDS size This change depends on the LLVM changes for emitting LDS symbols into the ELF file. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	f8315ae04b	amd/rtld: layout and relocate LDS symbols Upcoming changes to LLVM will emit LDS objects as symbols in the ELF symbol table, with relocations that will be resolved with this change. Callers will also be able to define LDS symbols that are shared between shader parts. This will be used by radeonsi for the ESGS ring in gfx9+ merged shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	ca21ba2a08	radeonsi: inline si_shader_binary_read_config into its only caller Since it can only be used for reading the config of an individual, non-combined shader, it is not very reusable anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	bf8a1ca902	radeonsi: use the new run-time linker for shaders v2: - fix a memory leak Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	bf11c594dd	radeonsi: return bool from si_shader_binary_upload We didn't really use error codes anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	8b1343ca79	radeonsi: let si_shader_create return a boolean We didn't really use error codes anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	77b05cc42d	radeonsi: use ac_shader_config Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Marek Olšák	c9b7a37b8f	radeonsi: cull primitives with async compute for large draw calls Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:13:34 -04:00
Marek Olšák	4eb377d1c3	radeonsi: add si_vs_prolog_bits::unpack_instance_id_from_vertex_id:1 The prim discard compute shader bakes InstanceID into the output index buffer. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:10:07 -04:00
Marek Olšák	ccfcb9d818	ac: rename SI-CIK-VI to GFX6-GFX7-GFX8 Acked-by: Dave Airlie <airlied@redhat.com> We already use GFX9 and I don't want us to have confusing naming in the driver. GFXn naming is better from the driver perspective, because it's the real version of the gfx portion of the hw. Also, CIK means Bonaire-Kaveri-Kabini, it doesn't mean CI. It shouldn't confuse our SDMA, UVD, VCE etc. code much. Those have nothing to do with GFXn and they have their own version numbers.	2019-05-15 20:54:10 -04:00
Nicolai Hähnle	d814c21b1b	radeonsi: overhaul the vertex fetch fixup mechanism The overall goal is to support unaligned loads from vertex buffers natively on SI. In the unaligned case, we fall back to the general case implementation in ac_build_opencoded_load_format. Since this function is fully general, we will also use it going forward for cases requiring fully manual format conversions of dwords anyway. This requires a different encoding of the fix_fetch array, which will now contain the entire format information if a fixup is required. Having to check the alignment of vertex buffers is awkward. To keep the impact on the fast path minimal, the si_context will keep track of which vertex buffers are (not) at least dword-aligned, while the si_vertex_elements will note which vertex buffers have some (at most dword) alignment requirement. Vertex buffers should be dword-aligned most of the time, which allows a fast early-out in almost all cases. Add the radeonsi_vs_fetch_always_opencode configuration variable for testing purposes. Note that it can only be used reliably on LLVM >= 9, because support for byte and short load is required. v2: - add a missing check to si_bind_vertex_elements Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-13 17:07:23 +02:00
Timothy Arceri	a004e95dd7	radeonsi/nir: create si_nir_opts() helper We will make use of this in the following commit. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-01 09:41:07 +10:00
Marek Olšák	501ff90a95	radeonsi: rename r600_resource -> si_resource Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 13:32:18 -05:00
Timothy Arceri	2817a4ec0b	radeonsi: remove unrequired param in si_nir_scan_tess_ctrl() Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-02 10:01:15 +11:00
Samuel Pitoiset	3fbdcd942f	amd: remove support for LLVM 6.0 User are encouraged to switch to LLVM 7.0 released in September 2018. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-06 14:02:56 +01:00
Sonny Jiang	084cf3b966	radeonsi:optimizing SET_CONTEXT_REG for shaders vgt_vertex_reuse Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-05 19:04:13 -04:00
Sonny Jiang	ce1d72609d	radeonsi:optimizing SET_CONTEXT_REG for shaders Tessellation Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-05 19:04:13 -04:00
Sonny Jiang	4052624398	radeonsi:optimizing SET_CONTEXT_REG for shaders GS Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-05 19:04:13 -04:00
Marek Olšák	c5442c1165	radeonsi: add TGSI_SEMANTIC_CS_USER_DATA for reading up to 4 SGPRs with TGSI	2018-08-29 15:31:42 -04:00

1 2 3 4 5 ...

320 Commits