KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Nicolai Hähnle	74a26af913	amd/common/gfx10: add register JSON A small number of fields now need new disambiguation. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Sagar Ghuge	456557a837	nir: Add lower_rotate flag and set to true in all drivers Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 10:14:22 -07:00
Samuel Pitoiset	d8b079e4c7	radv: rework how the number of VGPRs is computed Just a cleanup, it shouldn't change anything. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-01 14:59:27 +02:00
Samuel Pitoiset	f4d2c47cf6	radv: the number of VGPR_COMP_CNT for GS is expected to be 0 on GFX8 Just move around the switch case. GFX9+ is handled below. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-01 14:59:19 +02:00
Samuel Pitoiset	b4477fa4d4	radv: reduce number of VGPRs for TESS_EVAL if primitive ID is not used We only need to 2. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-01 14:59:17 +02:00
Daniel Schürmann	0daeb1d127	amd/common: lower bitfield_extract to ubfe/ibfe. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Daniel Schürmann	48a75e7af0	amd/common: lower bitfield_insert to bfm & bitfield_select Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Daniel Schürmann	c58dff753c	radv: enable AMD_shader_ballot with RADV_PERFTEST_SHADER_BALLOT ('shader_ballot') Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Daniel Schürmann	7a858f274c	spirv/nir: add support for AMD_shader_ballot and Groups capability This commit also renames existing AMD capabilities: - gcn_shader -> amd_gcn_shader - trinary_minmax -> amd_trinary_minmax Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Nicolai Hähnle	f480b8aaa4	amd/common: use generated register header	2019-06-03 20:05:20 -04:00
Caio Marcelo de Oliveira Filho	e45bf01940	spirv: Change spirv_to_nir() to return a nir_shader spirv_to_nir() returned the nir_function corresponding to the entrypoint, as a way to identify it. There's now a bool is_entrypoint in nir_function and also a helper function to get the entry_point from a nir_shader. The return type reflects better what the function name suggests. It also helps drivers avoid the mistake of reusing internal shader references after running NIR_PASS on it. When using NIR_TEST_CLONE or NIR_TEST_SERIALIZE, those would be invalidated right in the first pass executed. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-29 10:34:35 -07:00
Caio Marcelo de Oliveira Filho	a3bfdacb6c	radv: Don't re-use entry_point pointer from spirv_to_nir Replace its uses with checking for is_entrypoint and calling nir_shader_get_entrypoint(). This is a preparation to change spirv_to_nir() return type. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-29 10:34:35 -07:00
Caio Marcelo de Oliveira Filho	31a7476335	spirv, radv, anv: Replace ptr_type with addr_format Instead of setting the glsl types of the pointers for each resource, set the nir_address_format, from which we can derive the glsl_type, and in the future the bit pattern representing a NULL pointer. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Samuel Pitoiset	d7501834cd	radv: add a workaround for Monster Hunter World and LLVM 7&8 The load/store optimizer pass doesn't handle WaW hazards correctly and this is the root cause of the reflection issue with Monster Hunter World. AFAIK, it's the only game that are affected by this issue. This is fixed with LLVM r361008, but we need a workaround for older LLVM versions unfortunately. Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-17 11:41:19 +02:00
Marek Olšák	ccfcb9d818	ac: rename SI-CIK-VI to GFX6-GFX7-GFX8 Acked-by: Dave Airlie <airlied@redhat.com> We already use GFX9 and I don't want us to have confusing naming in the driver. GFXn naming is better from the driver perspective, because it's the real version of the gfx portion of the hw. Also, CIK means Bonaire-Kaveri-Kabini, it doesn't mean CI. It shouldn't confuse our SDMA, UVD, VCE etc. code much. Those have nothing to do with GFXn and they have their own version numbers.	2019-05-15 20:54:10 -04:00
Jonathan Marek	d0bff89159	nir: allow specifying a set of opcodes in lower_alu_to_scalar This can be used by both etnaviv and freedreno/a2xx as they are both vec4 architectures with some instructions being scalar-only. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-10 15:10:41 +00:00
Ian Romanick	1f1007a4ed	nir: Initialize lower_flrp_progress everywhere I don't know why I thought NIR_PASS always set the progress variable. Derp. Fixes: `d41cdef2a5` ("nir: Use the flrp lowering pass instead of nir_opt_algebraic") Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Coverity CID: 1444996 Coverity CID: 1444995 Coverity CID: 1444994 Coverity CID: 1444993 Coverity CID: 1444991 Coverity CID: 1444989	2019-05-09 10:03:51 -07:00
Timothy Arceri	e19a8fe033	radv: call constant folding before opt algebraic The pattern of calling opt algebraic first seems to have originated in i965. The order in OpenGL drivers generally doesn't matter because the GLSL IR optimisations do constant folding before opt algebraic. However in Vulkan drivers calling opt algebraic first can result in missed constant folding opportunities. vkpipeline-db results (VEGA64): Totals from affected shaders: SGPRS: 3160 -> 3176 (0.51 %) VGPRS: 3588 -> 3580 (-0.22 %) Spilled SGPRs: 52 -> 44 (-15.38 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 12 -> 12 (0.00 %) dwords per thread Code Size: 261812 -> 261036 (-0.30 %) bytes LDS: 7 -> 7 (0.00 %) blocks Max Waves: 346 -> 348 (0.58 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-08 19:45:01 +10:00
Ian Romanick	d41cdef2a5	nir: Use the flrp lowering pass instead of nir_opt_algebraic I tried to be very careful while updating all the various drivers, but I don't have any of that hardware for testing. :( i965 is the only platform that sets always_precise = true, and it is only set true for fragment shaders. Gen4 and Gen5 both set lower_flrp32 only for vertex shaders. For fragment shaders, nir_op_flrp is lowered during code generation as a(1-c)+bc. On all other platforms 64-bit nir_op_flrp and on Gen11 32-bit nir_op_flrp are lowered using the old nir_opt_algebraic method. No changes on any other Intel platforms. v2: Add panfrost changes. Iron Lake and GM45 had similar results. (Iron Lake shown) total cycles in shared programs: 188647754 -> 188647748 (<.01%) cycles in affected programs: 5096 -> 5090 (-0.12%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12% Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-06 22:52:29 -07:00
Juan A. Suarez Romero	06c9d7f9f9	radv: enable descriptor indexing capabilities This enables the remaining capabilities in SPV_EXT_descriptor_indexing. Fixes: `0e10790558` "radv: Enable VK_EXT_descriptor_indexing." Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-30 09:23:23 +02:00
Bas Nieuwenhuizen	5c3467e74a	radv: Run the new ycbcr lowering pass. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Samuel Pitoiset	b3e3440c87	radv: add VK_NV_compute_shader_derivates support Only computeDerivativeGroupLinear is supported for now. All crucible tests pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-22 14:51:57 +02:00
Samuel Pitoiset	9cf55b022d	radv: add VK_KHR_shader_atomic_int64 but disable it for now No support for 64-bit compare&swap atomic operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-17 21:59:56 +02:00
Samuel Pitoiset	ecbe6cb805	radv: sort the shader capabilities alphabetically Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-16 09:14:22 +02:00
Samuel Pitoiset	14f03978ed	radv: enable VK_KHR_shader_float16_int8 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-15 10:43:55 +02:00
Timothy Arceri	e30804c602	nir/radv: remove restrictions on opt_if_loop_last_continue() When I implemented opt_if_loop_last_continue() I had restricted this pass from moving other if-statements inside the branch opposite the continue. At the time it was causing a bunch of spilling in shader-db for i965. However Samuel Pitoiset noticed that making this pass more aggressive significantly improved the performance of Doom on RADV. Below are the statistics he gathered. 28717 shaders in 14931 tests Totals: SGPRS: 1267317 -> 1267549 (0.02 %) VGPRS: 896876 -> 895920 (-0.11 %) Spilled SGPRs: 24701 -> 26367 (6.74 %) Code Size: 48379452 -> 48507880 (0.27 %) bytes Max Waves: 241159 -> 241190 (0.01 %) Totals from affected shaders: SGPRS: 23584 -> 23816 (0.98 %) VGPRS: 25908 -> 24952 (-3.69 %) Spilled SGPRs: 503 -> 2169 (331.21 %) Code Size: 2471392 -> 2599820 (5.20 %) bytes Max Waves: 586 -> 617 (5.29 %) The codesize increases is related to Wolfenstein II it seems largely due to an increase in phis rather than the existing jumps. This gives +10% FPS with Doom on my Vega56. Rhys Perry also benchmarked Doom on his VEGA64: Before: 72.53 FPS After: 80.77 FPS v2: disable pass on non-AMD drivers Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-09 11:29:41 +10:00
Samuel Pitoiset	c25f63872b	radv: partially enable VK_KHR_shader_float16_int8 Only 8-bit integers for now, float16 requires a bit more work. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 18:53:59 +02:00
Rhys Perry	0af95f0ffc	radv: lower 16-bit flrp Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-01 09:58:48 +02:00
Samuel Pitoiset	8a6e61cc52	radv: do not lower frexp_exp and frexp_sig Hardware has two instructions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-28 13:02:51 +01:00
Samuel Pitoiset	23d30f4099	spirv,nir: lower frexp_exp/frexp_sig inside a new NIR pass This lowering isn't needed for RADV because AMDGCN has two instructions. It will be disabled for RADV in an upcoming series. While we are at it, factorize a little bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-22 19:41:46 +01:00
Rhys Perry	037f11d42e	radv: enable VK_KHR_8bit_storage Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:27 +01:00
Jason Ekstrand	08f804ec0c	anv,radv,turnip: Lower TG4 offsets with nir_lower_tex v2: turn on for turnip as well (Karol Herbst) Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-03-21 02:58:41 +00:00
Samuel Pitoiset	71ffa00fc6	radv: enable lower_mul_2x32_64 Fixes: `58bcebd987` ("spirv: Allow [i/u]mulExtended to use new nir opcode") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-06 22:41:20 +01:00
Bas Nieuwenhuizen	13ab63bb62	radv: Implement VK_EXT_buffer_device_address. v2: Also update the release notes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:37:38 +01:00
Samuel Pitoiset	5e7f800f32	radv: fix build Fixes: `9b9ccee4d6` ("radv: take LDS into account for compute shader occupancy stats") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-01 15:31:55 +01:00
Timothy Arceri	9b9ccee4d6	radv: take LDS into account for compute shader occupancy stats Ported from `d205faeb6c`. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-01 22:25:30 +11:00
Timothy Arceri	a53d68d318	ac/radv/radeonsi: add ac_get_num_physical_sgprs() helper Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-01 22:25:30 +11:00
Bas Nieuwenhuizen	ead54d4a42	radv/winsys: Set winsys bo priority on creation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-29 15:56:41 +01:00
Karol Herbst	9b24028426	nir: rename nir_var_function to nir_var_function_temp Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:41 +01:00
Karol Herbst	d0c6ef2793	nir: rename global/local to private/function memory the naming is a bit confusing no matter how you look at it. Within SPIR-V "global" memory is memory accessible from all threads. glsl "global" memory normally refers to shader thread private memory declared at global scope. As we already use "shared" for memory shared across all thrads of a work group the solution where everybody could be happy with is to rename "global" to "private" and use "global" later for memory usually stored within system accessible memory (be it VRAM or system RAM if keeping SVM in mind). glsl "local" memory is memory only accessible within a function, while SPIR-V "local" memory is memory accessible within the same workgroup. v2: rename local to function as well v3: rename vtn_variable_mode_local as well Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-08 18:51:46 +01:00
Jason Ekstrand	05d72d6d48	spirv: Sort supported capabilities Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-07 18:41:15 -06:00
Jason Ekstrand	63b9aa2e25	spirv: Add support for using derefs for UBO/SSBO access For now, it's hidden behind a cap. Hopefully, we can eventually drop that along with all the manual offset code in spirv_to_nir. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	adc155a815	spirv: Add explicit pointer types Instead of baking in uvec2 for UBO and SSBO pointers and uint for push constant and shared memory pointers, make it configurable. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	fc9c4f89b8	nir: Move propagation of cast derefs to a new nir_opt_deref pass We're going to want to do more deref optimizations going forward and this gives us a central place to do them. Also, cast propagation will get a bit more complicated with the addition of ptr_as_array derefs. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Samuel Pitoiset	9606310081	radv: enable shaderStorageImageMultisample feature on GFX8+ Untested on older chips. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:19 +01:00
Samuel Pitoiset	0a7e767e58	radv: drop the amdgpu-skip-threshold=1 workaround for LLVM 8 This workaround has been introduced by `135e4d434f` for fixing DXVK GPU hangs with many games. It is no longer needed since LLVM r345718. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 12:09:57 +01:00
Ian Romanick	378f996771	nir/opt_peephole_select: Don't peephole_select expensive math instructions On some GPUs, especially older Intel GPUs, some math instructions are very expensive. On those architectures, don't reduce flow control to a csel if one of the branches contains one of these expensive math instructions. This prevents a bunch of cycle count regressions on pre-Gen6 platforms with a later patch (intel/compiler: More peephole select for pre-Gen6). v2: Remove stray #if block. Noticed by Thomas. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-17 13:47:06 -08:00
Ian Romanick	09b7e1d8e4	nir/opt_peephole_select: Don't try to remove flow control around indirect loads That flow control may be trying to avoid invalid loads. On at least some platforms, those loads can also be expensive. No shader-db changes on any Intel platform (even with the later patch "intel/compiler: More peephole select"). v2: Add a 'indirect_load_ok' flag to nir_opt_peephole_select. Suggested by Rob. See also the big comment in src/intel/compiler/brw_nir.c. v3: Use nir_deref_instr_has_indirect instead of deref_has_indirect (from nir_lower_io_arrays_to_elements.c). v4: Fix inverted condition in brw_nir.c. Noticed by Lionel. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-17 13:47:06 -08:00
Samuel Pitoiset	3fbdcd942f	amd: remove support for LLVM 6.0 User are encouraged to switch to LLVM 7.0 released in September 2018. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-06 14:02:56 +01:00
Nicolai Hähnle	8c97abc066	radv: include LLVM IR in the VK_AMD_shader_info "disassembly" Helpful for debugging compiler backend problems: this allows us to easily retrieve the LLVM IR from RenderDoc. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-11-09 14:54:37 +01:00

1 2 3 4

161 Commits