KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Dave Airlie	1fc19a0f27	radv: merge tess rings into a single bo Inspired by a passing commit to radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-27 00:54:59 +00:00
Samuel Pitoiset	e05507a427	ac/nir: use ordered float comparisons except for not equal Original patch from Timothy Arceri, I have just fixed the not equal case locally. This fixes one important rendering issue in Wolfenstein 2 (the cutscene transition issue). RadeonSI uses the same ordered comparisons, so I guess that what we should do as well. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104302 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2018-02-26 13:59:04 +01:00
Timothy Arceri	9873bd9dcd	ac: make use of ac_get_llvm_num_components() helper Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-26 11:43:47 +11:00
James Legg	afd8fd0656	radv: Really use correct HTILE expanded words. When transitioning to an htile compressed depth format, Set the full depth range, so later rasterization can pass HiZ. Previously, for depth only formats, the depth range was set to 0 to 0. This caused unwanted HiZ rejections with a VK_FORMAT_D16_UNORM depth buffer (VK_FORMAT_D32_SFLOAT was not affected somehow). These values are derived from PAL [0], since I can't find the specification describing the htile values. [0] `5cba4ecbda/src/core/hw/gfxip/gfx9/gfx9MaskRam.cpp (L1500)` CC: Dave Airlie <airlied@redhat.com> CC: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> CC: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Fixes: `5158603182` "radv: Use correct HTILE expanded words."	2018-02-24 02:16:22 +01:00
Mauro Rossi	8eed942136	radv/extensions: fix c_vk_version for patch == None Similar to `cb0d1ba156` ("anv/extensions: Fix VkVersion::c_vk_version for patch == None") fixes the following building errors: out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/radv_entrypoints.c:1161:48: error: use of undeclared identifier 'None'; did you mean 'long'? return instance && VK_MAKE_VERSION(1, 0, None) <= core_version; ^~~~ long external/mesa/include/vulkan/vulkan.h:34:43: note: expanded from macro 'VK_MAKE_VERSION' (((major) << 22) \| ((minor) << 12) \| (patch)) ^ ... fatal error: too many errors emitted, stopping now [-ferror-limit=] 20 errors generated. Fixes: `e72ad05c1d` ("radv: Return NULL for entrypoints when not supported.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-24 00:31:31 +01:00
Bas Nieuwenhuizen	032870beda	radv: Fix autotools build. Somewhere along the way the Makefile changes got lost ... Fixes: `4db78f3a6b` "radv: Put supported extensions in a struct." Acked-by: Dave Airlie <airlied@redhat.com>	2018-02-23 01:54:12 +01:00
Bas Nieuwenhuizen	e72ad05c1d	radv: Return NULL for entrypoints when not supported. This implements strict checking for the entrypoint ProcAddr functions. - InstanceProcAddr with instance = NULL, only returns the 3 allowed entrypoints. - DeviceProcAddr does not return any instance entrypoints. - InstanceProcAddr does not return non-supported or disabled instance entrypoints. - DeviceProcAddr does not return non-supported or disabled device entrypoints. - InstanceProcAddr still returns non-supported device entrypoints. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	414f5e0e14	radv: Reword radv_entrypoints_gen.py With a big inspiration from anv as always ... Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	076f7cfc6b	radv: Track enabled extensions. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	4db78f3a6b	radv: Put supported extensions in a struct. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Samuel Pitoiset	d6b7539206	ac/nir: remove emission of nir_op_fpow fpow is now lowered at NIR level. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:44:46 +01:00
Samuel Pitoiset	7aa008d1d7	radv: enable lowering of fpow to fexp2 and flog2 There is no fpow in hardware, so it's always lowered somewhere, but it appears that lowering at NIR level is better. Figured while comparing compute shaders between RadeonSI and RADV. Polaris10: Totals from affected shaders: SGPRS: 18936 -> 18904 (-0.17 %) VGPRS: 12240 -> 12220 (-0.16 %) Spilled SGPRs: 2809 -> 2809 (0.00 %) Code Size: 718116 -> 719848 (0.24 %) bytes Max Waves: 1409 -> 1410 (0.07 %) Vega10: Totals from affected shaders: SGPRS: 18392 -> 18392 (0.00 %) VGPRS: 12008 -> 11920 (-0.73 %) Spilled SGPRs: 3001 -> 2981 (-0.67 %) Code Size: 777444 -> 778788 (0.17 %) bytes Max Waves: 1503 -> 1504 (0.07 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:40:47 +01:00
Samuel Pitoiset	a01e9996b5	ac/nir: set GLC=1 for load/store of coherent/volatile images This disables persistence accross wavefronts. F1 2017 and Wolfenstein 2 appear to use some coherent images but this patch doesn't seem to change anything. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:39:55 +01:00
Timothy Arceri	6d338d757f	ac/radeonsi: pass type to load_tess_varyings() We need this to be able to load 64bit varyings. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
James Zhu	f0ad908e79	amd/common:add uvd hevc enc support check in hw query Based on amdgpu hardware query information to check if UVD hevc enc support Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-21 13:53:38 -05:00
Samuel Pitoiset	a6accad68f	ac/nir: add glsl_is_array_image() helper For consistency. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-21 09:41:51 +01:00
Samuel Pitoiset	ff83dfb364	ac/nir: set the DA field when performing atomics on 3D images This doesn't fix anything known but it should definitely be set. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-21 09:41:49 +01:00
Dave Airlie	baa0feb73d	radv: don't send num_tcs_input_cp to sgprs. We never use it in the shaders. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:36 +00:00
Dave Airlie	952222ddd4	radv/tess: don't need to look in constant for vertices_per_patch This just avoids passing this value via user sgprs. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:28 +00:00
Dave Airlie	77fd1b9187	ac/radv: cleanup some tcs output values access Just consolidates some code to make it easier to change. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:23 +00:00
Dave Airlie	0e6f0d400b	ac/radv: remove total_vertices variable This just removes an unneeded variable. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:19 +00:00
Dave Airlie	e9b9fb3616	ac/radv: don't mark tess inner as used if we don't use it. This just avoids marking it as a used output if we don't actually use it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:15 +00:00
Dave Airlie	d5b2d7ed67	ac/nir: to integer the args to bcsel. dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw was hitting an llvm assert due to one value being an int and the other a float. This just casts both values to integer and fixes the test. Fixes: dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-20 23:15:18 +00:00
Samuel Pitoiset	1ac741d690	ac/nir: move ac_declare_lds_as_pointer() outside of the switch Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-20 10:44:59 +01:00
Samuel Pitoiset	b5d111ae76	radv: allow to force family using RADV_FORCE_FAMILY Useful for pipeline-db. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-20 10:44:47 +01:00
Samuel Pitoiset	549c7f3724	radv: compact varyings after removing unused ones It makes no sense to compact before, and the description of nir_compact_varyings() confirms that. Polaris10: Totals from affected shaders: SGPRS: 108528 -> 108128 (-0.37 %) VGPRS: 74548 -> 74500 (-0.06 %) Spilled SGPRs: 844 -> 814 (-3.55 %) Code Size: 3007328 -> 2992932 (-0.48 %) bytes Max Waves: 16019 -> 16009 (-0.06 %) Vega10: Totals from affected shaders: SGPRS: 106088 -> 106232 (0.14 %) VGPRS: 74652 -> 74700 (0.06 %) Spilled SGPRs: 692 -> 658 (-4.91 %) Code Size: 2967708 -> 2953028 (-0.49 %) bytes Max Waves: 18178 -> 18162 (-0.09 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-19 12:19:17 +01:00
Marek Olšák	931ec80eeb	radeonsi: implement 32-bit pointers in user data SGPRs (v2) User SGPRs changes: VS: 14 -> 9 TCS: 14 -> 10 TES: 10 -> 6 GS: 8 -> 4 GSCOPY: 2 -> 1 PS: 9 -> 5 Merged VS-TCS: 24 -> 16 Merged VS-GS: 18 -> 11 Merged TES-GS: 18 -> 11 SGPRS: 2170102 -> 2158430 (-0.54 %) VGPRS: 1645656 -> 1641516 (-0.25 %) Spilled SGPRs: 9078 -> 8810 (-2.95 %) Spilled VGPRs: 130 -> 114 (-12.31 %) Scratch size: 1508 -> 1492 (-1.06 %) dwords per thread Code Size: 52094872 -> 52692540 (1.15 %) bytes Max Waves: 371848 -> 372723 (0.24 %) v2: - the shader cache needs to take address32_hi into account - set amdgpu-32bit-address-high-bits Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)	2018-02-17 04:52:17 +01:00
Marek Olšák	0977b7f7b3	ac: query high bits of 32-bit address space	2018-02-17 04:51:58 +01:00
Bas Nieuwenhuizen	05d84ed68a	radv: Always lower indirect derefs after nir_lower_global_vars_to_local. Otherwise new local variables can cause hangs on vega. CC: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105098 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-15 23:45:59 +01:00
Samuel Pitoiset	579b33c1fd	ac/nir: do not reserve user SGPRs for unused descriptor sets In theory this might lead to corruption if we bind a descriptor set which is unused, because LLVM is smart and it can re-use unused user SGPRs. In practice, this doesn't seem to fix anything. As a side effect, this will reduce the number of emitted SH_REG packets. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-15 14:53:30 +01:00
Samuel Pitoiset	309854148c	ac/shader: fix gathering of desc_set_used_mask This was quite wrong. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-15 14:53:30 +01:00
Samuel Pitoiset	61a4fc3ecc	ac/shader: be a little smarter when scanning vertex buffers Although meta shaders don't use any vertex buffers, there is no behaviour change but I think it's better to do this. Though, this saves two user SGPRs for push constants inlining or something else. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-15 14:53:30 +01:00
Timothy Arceri	9740c8a8aa	ac: implement nir_intrinsic_image_samples Fixes cts test: KHR-GL45.shader_texture_image_samples_tests.image_functional_test Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Timothy Arceri	3ad52501dc	ac/nir_to_llvm: fix image size for arrays of arrays Fixes cts test: KHR-GL44.shader_image_size.advanced-changeSize Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Samuel Pitoiset	ad4b58ea70	ac/nir: rename nir_to_llvm_context to radv_shader_context There is still more to do in that area, but it's a good start. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:16 +01:00
Samuel Pitoiset	141db61509	ac: remove nir_to_llvm_context from ac_nir_translate() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:14 +01:00
Samuel Pitoiset	a541117ff4	ac/nir: remove nir_to_llvm_context::nir link Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:12 +01:00
Samuel Pitoiset	e9f0205ca2	ac: move the outputs array to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:10 +01:00
Samuel Pitoiset	07e4268f36	ac/shader: scan force_persample Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:08 +01:00
Bas Nieuwenhuizen	7461bd5b8f	ac: Use the renumbered const address space for LLVM 7. The LLVM AMDGPU backend decided to renumber the constant address space .... Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-14 01:05:03 +01:00
Timothy Arceri	10457712ed	ac/nir: add nir_intrinsic_{load,store}_shared support Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-13 14:43:05 +11:00
Timothy Arceri	c787cbfa33	ac/nir_to_llvm: add support for nir_intrinsic_shared_atomic_* Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-13 14:43:05 +11:00
Eric Anholt	1aed66dc1e	radv: Fix compiler warning about uninitialized 'set' The compiler doesn't figure out that we only get result == VK_SUCCESS if set got initialized. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 20:48:47 +00:00
Eric Anholt	091bff8317	ac/nir: Fix compiler warning about uninitialized dw_addr. Even switching the def's condition to be the same chip revision check as the use, the compiler doesn't figure it out. Just NULL-init it. Fixes: `ec53e52742` ("ac/nir: Add ES output to LDS for GFX9.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 20:48:29 +00:00
Samuel Pitoiset	f4e85ba93f	ac/nir: remove backlink to nir_to_llvm_context Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:39 +01:00
Samuel Pitoiset	be5f6eb13e	ac/nir: remove nir_to_llvm_context::module Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:36 +01:00
Samuel Pitoiset	90a815ddeb	ac/nir: remove nir_to_llvm_context::builder Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:34 +01:00
Samuel Pitoiset	759acfa180	ac/nir: drop nir_to_llvm_context from glsl_to_llvm_type() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:31 +01:00
Samuel Pitoiset	e7373a6498	ac/nir: drop nir_to_llvm_context from visit_var_atomic() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:29 +01:00
Samuel Pitoiset	485346b05a	ac/nir: drop nir_to_llvm_context from visit_vulkan_resource_reindex() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:27 +01:00
Samuel Pitoiset	cd6dfacda9	ac/nir: drop nir_to_llvm_context from visit_load_push_constant() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:25 +01:00
Samuel Pitoiset	5c9e398c83	ac/nir: drop nir_to_llvm_context from cast_ptr() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:23 +01:00
Samuel Pitoiset	5ef5944848	ac/nir: drop nir_to_llvm_context from visit_load_local_invocation_index() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:21 +01:00
Samuel Pitoiset	da8b0b8264	ac/nir: drop nir_to_llvm_context from emit_f2f16() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:19 +01:00
Samuel Pitoiset	e32f374944	ac: remove unused parameters in abi::load_tess_coord() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:17 +01:00
Samuel Pitoiset	1e69db003d	ac/nir: remove useless bitcast in load_tess_coord() nir_intrinsic_load_tess_coord always returns a v3i32. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:15 +01:00
Samuel Pitoiset	ed179fbdf3	ac: add load_resource() to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:13 +01:00
Samuel Pitoiset	ecf229706f	ac: add load_sample_mask_in() to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:11 +01:00
Samuel Pitoiset	0f48eeea05	ac: move view_index to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:09 +01:00
Samuel Pitoiset	0efbede949	ac: move push_constants to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:07 +01:00
Samuel Pitoiset	460d3ce726	ac: move tg_size to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:04 +01:00
Samuel Pitoiset	054c92190c	ac/nir: remove unused nir_to_llvm_context:{defs,phis} Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:02 +01:00
Timothy Arceri	ef8082baf8	ac: convert nir_op_f2f32 src to a float Fixes the following piglit test: ./bin/arb_vertex_attrib_64bit-check-explicit-location -auto -fbo Where we would end up with the nir such as: vec1 64 ssa_11 = pack_64_2x32_split ssa_9, ssa_10 vec1 32 ssa_12 = f2f32 ssa_2 And our pack_64_2x32_split nir to llvm code always produces a 64bit integer as output. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Timothy Arceri	1b1e5f8edf	ac: fix some 64bit unpack asserts Previously the asserts did not take swizzles into account. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Samuel Pitoiset	3a2bb4db23	ac/nir: compute correct number of user SGPRs on GFX9 For merged shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 10:16:04 +01:00
Timothy Arceri	c77078c942	ac: pass struct ac_llvm_context to emit_membar() Fixes segfault in piglit test: ./bin/arb_shader_image_load_store-shader-mem-barrier --quick -auto -fbo Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 12:51:27 +11:00
Timothy Arceri	12a2350e6d	ac: add 64bit support to ac_find_lsb() v2: use LLVMBuildTrunc() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:42:59 +11:00
Timothy Arceri	a9f6b392c7	ac: move get_elem_bits() to ac_llvm_build.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:42:59 +11:00
Timothy Arceri	19f9839f0b	ac: add 64bit bitCount support v2: use LLVMBuildTrunc() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:42:59 +11:00
Samuel Pitoiset	bb750d265c	ac/nir: clean up handle_fs_outputs_post() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:33 +01:00
Samuel Pitoiset	528bc14fa5	ac/nir: add radv_load_output() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:30 +01:00
Samuel Pitoiset	834d9845ca	ac/shader: scan info about output PS declarations NIR->LLVM should only be a translation pass, and all scan stuff should be done before. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:27 +01:00
Samuel Pitoiset	a8e04e91de	ac/nir: add radv_export_param() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:26 +01:00
Samuel Pitoiset	e3cfd6b805	ac/nir: remove set but unused export_mask Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:24 +01:00
Samuel Pitoiset	724136d590	ac/nir: remove dead code in handle_vs_outputs_post() The memcpy can't be reached because the condition is always false. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:22 +01:00
Samuel Pitoiset	c63d8d0284	ac/nir: remove useless check in si_llvm_init_export_args() values can't be NULL because we use ac_build_export_null() now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:20 +01:00
Samuel Pitoiset	26ab5a4269	ac/nir: use ac_build_export_null() The number of enabled channels should be 0 when exporting null. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:11:44 +01:00
Samuel Pitoiset	bd9f7b7635	ac: add ac_build_export_null() helper Imported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-08 22:11:42 +01:00
Fredrik Höglund	5a38d8f103	radv: implement VK_EXT_external_memory_host Ported from the radeonsi GL_AMD_pinned_memory implementation. Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 00:46:07 +01:00
Samuel Pitoiset	757d36ee70	ac/nir: use new pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics Ported from RadeonSI. Only one F1 2017 shader is affected, code size decreased from 532 to 488 on both Polaris10 and Vega10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-07 12:42:13 +01:00
Samuel Pitoiset	2f54d7382d	ac/nir: avoid loading unused VS input components Polaris10: Totals from affected shaders: SGPRS: 122840 -> 120984 (-1.51 %) VGPRS: 78812 -> 78440 (-0.47 %) Spilled SGPRs: 177 -> 129 (-27.12 %) Code Size: 2950028 -> 2941276 (-0.30 %) bytes Max Waves: 17899 -> 17976 (0.43 %) Vega10: Totals from affected shaders: SGPRS: 117144 -> 115776 (-1.17 %) VGPRS: 77580 -> 77532 (-0.06 %) Spilled SGPRs: 0 -> 152 (0.00 %) Code Size: 3352656 -> 3347860 (-0.14 %) bytes Max Waves: 19756 -> 19866 (0.56 %) This increases SGPRs spilling a bit with Talos, but I have some other ideas that might reduce it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-07 12:42:09 +01:00
Samuel Pitoiset	1c57a6da5e	ac/shader: scan vertex inputs usage mask Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-07 12:42:07 +01:00
Samuel Pitoiset	3488a3f033	radv: run nir_opt_shrink_load LLVM can't shrink loads. Polaris10: Totals from affected shaders: SGPRS: 62528 -> 59955 (-4.11 %) VGPRS: 44708 -> 44616 (-0.21 %) Spilled SGPRs: 16 -> 8 (-50.00 %) Code Size: 1355504 -> 1355172 (-0.02 %) bytes Max Waves: 11710 -> 11670 (-0.34 %) Vega10: Totals from affected shaders: SGPRS: 51448 -> 50371 (-2.09 %) VGPRS: 39140 -> 39048 (-0.24 %) Spilled SGPRs: 16 -> 16 (0.00 %) Code Size: 1307188 -> 1304296 (-0.22 %) bytes Max Waves: 11312 -> 11292 (-0.18 %) This reduces SGPRs spilling in MadMax, and it also reduces number of SGPRs in DOW3 and F12017. The number of waves slightly decreases in F1 but I don't see any performance changes after benchmarking it. Talos and Serious Sam are not affected because they don't use any push constants. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-06 23:08:44 +01:00
Timothy Arceri	9c52902c76	ac/radeonsi: add num_work_groups to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	f12e2f9c12	ac: implement nir_intrinsic_shader_clock Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	b7b89bbddb	ac/radeonsi: create ac_build_shader_clock() helper Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	d116af383f	ac/radeonsi: add load_local_group_size() to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	e3ebffdbb0	ac: don't call emit_outputs() for compute Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	c8066cdfa7	ac/radeonsi: add local_invocation_ids to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	fa5239c153	ac/radeonsi: add workgroup_ids to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Bas Nieuwenhuizen	c7d640fbbf	ac/nir: fix GS load input type. Fixes: `df1d5174fc` "ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-06 21:52:38 +01:00
Dave Airlie	e7e81f362d	radv: don't support tc-compat on multisample d32s8 at all. RX550 fails dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_2 So increase the range of the workaround. Fixes: `f4c534ef6` (radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-06 19:56:00 +00:00
Samuel Pitoiset	0170ae1e23	ac/nir: remove emission of nir_op_fdiv RadeonSI and RADV lower fdiv. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-05 23:09:34 +01:00
Samuel Pitoiset	a1d568c830	ac/nir: fix a crash in load_gs_input() on pre-GFX9 chips Fixes: `df1d5174fc` ("ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-05 11:05:52 +01:00
Marek Olšák	3bf1e036e8	amd: remove support for LLVM 3.9 Only these are supported: - LLVM 4.0 - LLVM 5.0 - LLVM 6.0 - master (7.0) Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-02 23:47:40 +01:00
Marek Olšák	847d0a393d	radeonsi: use pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-02 16:46:22 +01:00
Samuel Pitoiset	df1d5174fc	ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load The old one generates useless instructions in there, found while comparing geometry shaders between RadeonSI and RADV. This improves all Vulkan demos that use geometry shaders, +4% for deferredshadows, +9% for viewportarray, +7% for geometryshader on Polaris10. This seems to also improve DOW3 a little bit (+1%). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-02 12:32:21 +01:00
Bas Nieuwenhuizen	2ffe395cba	radv: Don't expose VK_KHX_multiview on android. deqp does not allow any KHX extensions, and since deqp is included in android-cts, android does not allow any khx extensions. So disable VK_KHX_multiview on android. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> CC: 18.0 <mesa-stable@lists.freedesktop.org>	2018-02-01 23:32:48 +01:00
Marek Olšák	b0a6053a99	ac/nir: use ac_build_buffer_load_format for image buffer loads Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-01 16:20:19 +01:00
Marek Olšák	bac9fa9f17	ac: add glc parameter to ac_build_buffer_load_format Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-01 16:20:19 +01:00

1 2 3 4 5 ...

2070 Commits