KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Marek Olšák	98a52fecda	radeonsi: implement 16-bit FS color outputs This removes type conversions from 16 bits to 32 bits in the main function and then back to 16 bits in the epilog. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6622>	2020-09-22 02:44:53 +00:00
Vinson Lee	50f1cd4076	ac/llvm: Fix nonportable sizeof. Fix defect reported by Coverity. Sizeof not portable (SIZEOF_MISMATCH) suspicious_sizeof: Passing argument vec_size * 8UL /* sizeof (LLVMValueRef ) / to function __builtin_alloca and then casting the return value to LLVMValueRef * is suspicious. In this particular case sizeof (LLVMValueRef *) happens to be equal to sizeof (LLVMValueRef), but this is not a portable assumption. Fixes: `ca74603b4f` ("ac/llvm: add better code for isign") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6682>	2020-09-14 15:25:45 -07:00
Pierre-Eric Pelloux-Prayer	82d2d73e03	amd/llvm: switch to 3-spaces style Follow-up of !4319 using the same clang-format config. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5310>	2020-09-07 10:00:20 +02:00
Marek Olšák	e8d55e6db3	ac/llvm: fix b2f for v2f16 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6284>	2020-09-06 14:36:21 +00:00
Marek Olšák	d9a77f9ca3	ac/llvm: add better code for fsign There are 2 improvements: - better code for 16, 32, and 64 bits - vector support for 16 and 32 bits Totals: SGPRS: 2639738 -> 2625882 (-0.52 %) VGPRS: 1534120 -> 1533916 (-0.01 %) Spilled SGPRs: 3541 -> 3557 (0.45 %) Spilled VGPRs: 33 -> 33 (0.00 %) Private memory VGPRs: 256 -> 256 (0.00 %) Scratch size: 292 -> 292 (0.00 %) dwords per thread Code Size: 55640332 -> 55384892 (-0.46 %) bytes Max Waves: 964785 -> 964857 (0.01 %) Totals from affected shaders: SGPRS: 377352 -> 363496 (-3.67 %) VGPRS: 209800 -> 209596 (-0.10 %) Spilled SGPRs: 1979 -> 1995 (0.81 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 256 -> 256 (0.00 %) Scratch size: 256 -> 256 (0.00 %) dwords per thread Code Size: 12549300 -> 12293860 (-2.04 %) bytes Max Waves: 105762 -> 105834 (0.07 %) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6284>	2020-09-06 14:36:21 +00:00
Marek Olšák	ca74603b4f	ac/llvm: add better code for isign There are 2 improvements: - select v_med3_i32 - support vectors Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6284>	2020-09-06 14:36:21 +00:00
Marek Olšák	7acc7ec33b	ac/llvm: fix unaligned VS input loads on gfx10.3 Fixes: `a23802bcb9` Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6595>	2020-09-04 11:07:41 +00:00
Daniel Schürmann	a79dad950b	nir,amd: remove trinary_minmax opcodes These consist of the variations nir_op_{i\|u\|f}{min\|max\|med}3 which are either lowered in the backend (LLVM) anyway or can be recombined by the backend (ACO). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6421>	2020-08-24 20:56:11 +00:00
Bas Nieuwenhuizen	40e00c800c	amd/llvm: Mark pointer function arguments as 32-byte aligned. Otherwise LLVM does not see the pointers as allowing speculative loads. The pipeline-db results are pretty wild, but mostly what is to be expected from allowing more code movement in LLVM: Totals from affected shaders: SGPRS: 157728 -> 168336 (6.73 %) VGPRS: 158628 -> 158664 (0.02 %) Spilled SGPRs: 10845 -> 24753 (128.24 %) Spilled VGPRs: 13 -> 13 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 8 -> 8 (0.00 %) dwords per thread Code Size: 17189180 -> 17313712 (0.72 %) bytes LDS: 204 -> 204 (0.00 %) blocks Max Waves: 5700 -> 5687 (-0.23 %) Wait states: 0 -> 0 (0.00 %) This gives some boosts for shaders we can move a descriptor load outside a loop. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3159>	2020-07-08 23:47:06 +00:00
Marek Olšák	2b8b62c55b	ac/nir: fix 64-bit division for GL CTS This fixes: KHR-GL45.gpu_shader_fp64.builtin.mod_* Fixes: `ba2ec1f3` "ac/nir: use llvm.amdgcn.rcp in ac_build_fdiv()" Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5531>	2020-06-23 04:46:55 +00:00
Pierre-Eric Pelloux-Prayer	993c64e6fe	ac/llvm: load 1 byte at a time if unaligned on gfx10 If buffer or stride is unaligned we use the same trick as on gfx6: load 1 byte at a time and recompose the output if needed. This change fixes lots of deqp/glcts tests: - dEQP-GLES2.functional.draw.random.1, 10, ... - dEQP-GLES2.functional.vertex_arrays.multiple_attributes.stride.3_float2_0_float2_0_float2_17, ... - dEQP-GLES2.functional.vertex_arrays.single_attribute.first.byte_first24_offset1_stride2_quads256, ... - dEQP-GLES2.functional.vertex_arrays.single_attribute.strides.buffer_0_17_byte2_vec4_dynamic_draw_quads_1, ... - dEQP-GLES31.functional.draw_indirect.random.14, ... Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5502>	2020-06-19 09:20:16 +02:00
Samuel Pitoiset	008b0d1701	ac/nir: adjust an assertion for D16 on GFX6-GFX7 16-bit types can be used with MUBUF on GFX6-GFX7. Fixes: `c3e0ba52a0` ("ac/nir: support 16-bit data in buffer_load_format opcodes") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5325>	2020-06-08 08:45:32 +02:00
Marek Olšák	c6c8a9bd55	ac/nir: support v2f16 derivatives Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003>	2020-06-02 16:29:25 -04:00
Marek Olšák	70b6d54011	ac/nir: support 16-bit data in image opcodes Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003>	2020-06-02 16:29:25 -04:00
Marek Olšák	c3e0ba52a0	ac/nir: support 16-bit data in buffer_load_format opcodes Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003>	2020-06-02 16:29:25 -04:00
Marek Olšák	b819ba949b	ac/nir: remove type and num_channels args from ac_build_buffer_store_common They were only used for type overloading where we can just use the type of data. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003>	2020-06-02 16:29:25 -04:00
Marek Olšák	b98df7bf50	ac/nir: support vector types in the type suffix of overloaded intrinsics Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003>	2020-06-02 16:29:25 -04:00
Marek Olšák	e5ea87cde8	ac/nir: use more types from ac_llvm_context Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003>	2020-06-02 16:29:25 -04:00
Samuel Pitoiset	14292310d9	ac/nir: implement nir_intrinsic_shader_clock with device scope Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5117>	2020-05-24 20:37:58 +02:00
Samuel Pitoiset	b034f6cf2a	ac/nir: fix shader clock with subgroup scope The compiler should emit s_memtime instead of s_memrealtime for the subgroup scope. I don't know why this LLVM 9 checks was for but LLVM 8 also has this amdgcn intrinsic. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5117>	2020-05-24 20:37:54 +02:00
Michel Dänzer	2a6811f0f9	Revert "ac,radeonsi: fix compilations issues with LLVM 11" This reverts commit `42b1696ef6`. The corresponding LLVM changes were reverted. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5087>	2020-05-19 07:19:35 +00:00
Samuel Pitoiset	0d63a1a84d	ac/llvm: add support for texturing with clamped LOD This is a requirement for the shaderResourceMinLod feature which allows to clamp LOD. This uses all image_sample__cl variants. All dEQP-VK.glsl.texture_functions.textureclamp.* pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4989>	2020-05-14 10:05:44 +00:00
Pierre-Eric Pelloux-Prayer	7e7bb38bd8	radeonsi: fix export count Fixes: `17acff01a0` ("radeonsi: skip vs output optimizations for some outputs") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2877 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4871>	2020-05-04 15:11:09 +02:00
Samuel Pitoiset	42b1696ef6	ac,radeonsi: fix compilations issues with LLVM 11 Latest LLVM replaced LLVMVectorTypeKind. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2826 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4755>	2020-04-27 17:13:36 +00:00
Marek Olšák	b4fd8f1919	ac,radeonsi: simplify checking for Navi1x chips Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4698>	2020-04-24 10:38:54 +00:00
Pierre-Eric Pelloux-Prayer	17acff01a0	radeonsi: skip vs output optimizations for some outputs If PT_SPRITE_TEX is enabled, PS inputs are overriden at runtime so we can't apply the vs output optim. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2747 Fixes: `3ec9975555` ("radeonsi: eliminate trivial constant VS outputs") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4559>	2020-04-20 08:45:16 +02:00
Samuel Pitoiset	ba2ec1f369	ac/nir: use llvm.amdgcn.rcp in ac_build_fdiv() Instead of emitting 1.0 / x which includes a slow division that LLVM doesn't always optimize even if the metadata is correctly set. No pipeline-db changes with VEGA10/LLVM 9. pipeline-db (VEGA10/LLVM 10): Totals from affected shaders: SGPRS: 6672 -> 6672 (0.00 %) VGPRS: 6652 -> 6652 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 561780 -> 561692 (-0.02 %) bytes Max Waves: 1043 -> 1043 (0.00 %) pipeline-db (VEGA10/LLVM 11 - 92744f62478): Totals from affected shaders: SGPRS: 84608 -> 83768 (-0.99 %) VGPRS: 106768 -> 106636 (-0.12 %) Spilled SGPRs: 1625 -> 1713 (5.42 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 10850936 -> 10726712 (-1.14 %) bytes Max Waves: 3152 -> 3180 (0.89 %) LLVM 11 (master) is more affected than previous versions, but based on the small impact with LLVM 9/10, I decided to emit it unconditionally. Cc: 20.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>	2020-03-27 08:05:43 +01:00
Pierre-Eric Pelloux-Prayer	5533c41541	ac: fix ac_build_is_helper_invocation when postponed_kill is null If there was no demote() in the shader, ac_build_is_helper_invocation behaves exactly the same as ac_build_load_helper_invocation, i.e. the helper lanes are the same as they were at the beginning of the shader. Fixes: `de57ea2a3d` ("amd/llvm: implement nir_intrinsic_demote(_if) and nir_intrinsic_is_helper_invocation") Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4301> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4301>	2020-03-25 08:19:38 +01:00
Marek Olšák	303842b2db	ac: fix fast division This stopped working with LLVM 11 and might occasionally have been broken on older LLVM, because the metadata was set on the mul, not on the rcp. Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4268> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4268>	2020-03-21 22:34:17 +00:00
Marek Olšák	63a5051ea6	ac: set new LLVM denormal flags See: https://reviews.llvm.org/D71358 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>	2020-03-17 20:47:48 +00:00
Samuel Pitoiset	cc320ef9af	ac/llvm: add missing optimization barrier for 64-bit readlanes Otherwise, LLVM optimizes it but it's actually incorrect. Fixes: `0f45d4dc2b` ("ac: add ac_build_readlane without optimization barrier") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3585> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3585>	2020-03-12 08:46:42 +01:00
Marek Olšák	fc65df5651	ac: add a bug workaround for the 100% NGG culling case Fixes: `8db00a51f8` - radeonsi/gfx10: implement NGG culling for 4x wave32 subgroups Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4079>	2020-03-09 16:08:11 -04:00
Daniel Schürmann	de57ea2a3d	amd/llvm: implement nir_intrinsic_demote(_if) and nir_intrinsic_is_helper_invocation The current implementation uses a temporary helper variable to ensure correct behavior until LLVM provides an intrinsic. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4047>	2020-03-09 12:29:32 +00:00
Samuel Pitoiset	9e5d2a73c5	ac/llvm: flush denorms for nir_op_fmed3 on GFX8 and older gens The hardware doesn't flush denorms, exactly like fmin/fmax, so we have to do it manually. This doesn't fix anything known. Fixes: `d6a07732c9` ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>	2020-02-27 08:04:33 +01:00
Samuel Pitoiset	30ac733680	ac/llvm: fix 16-bit fmed3 on GFX8 and older gens 16-bit med3 is only supported on GFX9+. Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.mid3.f16.*. Fixes: `d6a07732c9` ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>	2020-02-27 08:04:30 +01:00
Samuel Pitoiset	50b8c25274	ac/llvm: fix 64-bit fmed3 Lower 64-bit fmed3 because LLVM doesn't expose an intrinsic. Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.mid3.f64.*. Fixes: `d6a07732c9` ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>	2020-02-27 08:04:28 +01:00
Samuel Pitoiset	a31bcf2be6	ac/llvm: fix missing casts in ac_build_readlane() Because ac_build_optimization_barrier() overwrites the original src_type, we have to keep track of it before emitting that barrier. Otherwise, wrong conversions are expected for pointers or small bitsizes. By doing this, we no longer need to do the cast dance in ac_build_readlane_no_opt_barrier(), it was just necessary for ac_build_optimization_barrier(). This fixes a bunch of crashes with subgroups related tests when RADV_DEBUG=checkir is enabled, and it also fixes a compiler crash with The Surge 2. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2395 Fixes: `0f45d4dc2b` ("ac: add ac_build_readlane without optimization barrier") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535>	2020-01-24 07:40:07 +01:00
Marek Olšák	4e4b2d13f0	ac: add helper ac_build_triangle_strip_indices_to_triangle Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	0f45d4dc2b	ac: add ac_build_readlane without optimization barrier Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	77393cf39b	ac: add prefix bitcount functions Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Timur Kristóf	eccac46cdc	ac/llvm: Fix ac_build_reduce in wave32 mode. Previously, when cluster_size was set to 0, it always worked as if the cluster size was 64. This commit fixes it in wave32 mode by changing to work as if the cluster size was set to 32. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2020-01-10 12:30:44 +01:00
Marek Olšák	9b71041627	ac: add ac_build_s_endpgm Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 16:03:48 -05:00
Marek Olšák	1c44480538	ac: add 128-bit bitcount Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 16:00:41 -05:00
Marek Olšák	d1c8aeb24f	ac: unify primitive export code Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 16:00:38 -05:00
Marek Olšák	1c77a18cc2	ac: unify build_sendmsg_gs_alloc_req Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 16:00:36 -05:00
Marek Olšák	e5e3ffa6b9	ac: fix ac_get_i1_sgpr_mask for Wave32 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>	2019-12-16 20:06:07 +00:00
Bas Nieuwenhuizen	e09426ad6b	amd/llvm: Refactor ac_build_scan. Split out the logic for exclusive scans into a separate function that makes clear what it does instead of having this opaque 60 line if. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-28 11:35:11 +01:00
Samuel Pitoiset	0812dbd403	ac: add 8-bit and 16-bit supports to ac_build_permlane16() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-27 07:42:42 +01:00
Samuel Pitoiset	c9aa843961	radv/gfx10: fix implementation of exclusive scans This implementation is loosely based on ROCm. https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl This fixes dEQP-VK.subgroups.arithmetic..subgroupexclusive on GFX10. Fixes: `227c29a80d` ("amd/common/gfx10: implement scan & reduce operations") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-27 07:39:26 +01:00
Samuel Pitoiset	f6770b9726	ac/llvm: fix warning in ac_build_canonicalize() ../src/amd/llvm/ac_llvm_build.c: In function ‘ac_build_canonicalize’: ../src/amd/llvm/ac_llvm_build.c:4567:9: warning: ‘intr’ may be used uninitialized in this function [-Wmaybe-uninitialized] 4567 \| return ac_build_intrinsic(ctx, intr, type, params, 1, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 4568 \| AC_FUNC_ATTR_READNONE); \| ~~~~~~~~~~~~~~~~~~~~~~ ../src/amd/llvm/ac_llvm_build.c:4567:9: warning: ‘type’ may be used uninitialized in this function [-Wmaybe-uninitialized] Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-26 08:35:10 +01:00

1 2

68 Commits