KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Timothy Arceri	b99ebaa4fd	ac: move some helpers to ac_llvm_build.c We will call these from the radeonsi NIR backend. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	2deb822075	ac: add store_tcs_outputs() to the abi Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	b104e7e172	ac: call load_tcs_input() via the abi This also enables some code sharing with tes. V2: drop type param and just use ctx->i32 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	b09a3196e0	ac: add load_tes_inputs() to the abi V2: drop type param and just use ctx->i32 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Samuel Pitoiset	a4d2782664	amd/common: scan if gl_PrimitiveID is used before translating to LLVM It makes more sense to move all scan stuff in the same place. Also, we don't really need to duplicate the uses_primid field for each stages. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-04 18:43:09 +01:00
Samuel Pitoiset	3b2cb2f99a	amd/common: scan if gl_InvocationID is used Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-04 18:43:07 +01:00
Bas Nieuwenhuizen	79724c89f8	ac: rename has_sync_file to has_fence_to_handle. sync_files are in linux since 4.7, while the amdgpu fence_to_handle ioctl is only in 4.15. In particular we don't need it for sync_file in radv, because everything happens via syncobjs, which got support earlier than fence_to_handle. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-04 01:12:09 +01:00
Bas Nieuwenhuizen	c99426ea83	ac/nir: Handle loading data from compact arrays. Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-04 00:14:23 +01:00
Marek Olšák	4f19cc82f9	ac: rename has_syncobj_wait -> has_syncobj_wait_for_submit Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-04 00:07:45 +01:00
Samuel Pitoiset	3260a96c17	amd/common: rework set_userdata_location() and rename to set_loc() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:17 +01:00
Samuel Pitoiset	4221a816e2	amd/common: rename set_userdata_location_shader() to set_loc_shader() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:15 +01:00
Samuel Pitoiset	5081fd398e	amd/common: replace set_userdata_location_indirect() by set_loc_desc() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:13 +01:00
Samuel Pitoiset	f8202ef683	amd/common: rename radv_define_vs_user_sgprs_phase2() ... to set_vs_specific_input_locs(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:11 +01:00
Samuel Pitoiset	9d5a1787ee	amd/common: rename radv_define_common_user_sgprs_phase2() ... to set_global_input_locs(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:08 +01:00
Samuel Pitoiset	9a2393a510	amd/common: rename add_user_sgpr_array_argument() to add_array_arg() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:06 +01:00
Samuel Pitoiset	b6217bdbee	amd/common: replace add_sgpr_argument() by add_arg() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:04 +01:00
Samuel Pitoiset	32bbc9eb0f	amd/common: replace add_user_sgpr_argument() by add_arg() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:02 +01:00
Samuel Pitoiset	e946b5360d	amd/common: replace add_vgpr_argument() by add_arg() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:59 +01:00
Samuel Pitoiset	f1242a8976	amd/common: add new add_arg() helper for SGPRs/VGPRs arguments The idea is to clean up the add arguments logic. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:57 +01:00
Samuel Pitoiset	bedfa06eaf	amd/common: rename radv_define_common_user_sgprs_phase1() ... to declare_global_input_sgprs(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:55 +01:00
Samuel Pitoiset	0f58f67abe	amd/common: rename radv_define_vs_user_sgprs_phase1() ... to declare_vs_specific_inputs_sgprs(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:53 +01:00
Samuel Pitoiset	5c91c1614c	amd/common: do not try to declare input VS SGPRs for GS It's a no-op anyway but it looked strange to me, remove it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:51 +01:00
Samuel Pitoiset	fc35a071b6	amd/common: add declare_vs_input_vgprs() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:49 +01:00
Samuel Pitoiset	3015668cad	amd/common: add declare_tes_input_vgprs() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:47 +01:00
Samuel Pitoiset	62942aa8c6	amd/common: remove unnecessary num_user_sgprs_used Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:46 +01:00
Samuel Pitoiset	6edf1fcdf5	amd/common: remove unnecessary user_sgpr_count Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:44 +01:00
Dave Airlie	cf363e4405	amd/common/radv/radeonsi: use register defines for dcc block sizes. These are just taken from amdvlk, we probably knew these already, but may as well port them now. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-27 11:10:35 +10:00
Samuel Pitoiset	38f9b87af2	amd/common: add ac_export_mrt_z() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-22 10:38:49 +01:00
Samuel Pitoiset	03ef264146	amd/common: pass the family to ac_llvm_context_init() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-22 10:38:44 +01:00
Samuel Pitoiset	4237c3d645	radv: properly load unused gl_LocalInvocationID/gl_WorkGroupID components F1 2017 looks good now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-19 21:26:25 +01:00
Samuel Pitoiset	0c4a30eb51	radv: do not add extra SGPR when push constants are not used Just because the vertex stage needs push constants does not mean the other stages need them too. This should reduce the number of loaded SGPRs in some situations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-19 21:22:18 +01:00
Samuel Pitoiset	39097282f7	radv: change the needs_push_constants logic Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-19 21:22:16 +01:00
Samuel Pitoiset	1cecaa9174	radv: remove one useless check in ac_nir_shader_info_pass() pipeline->layout can't be NULL now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-19 21:22:12 +01:00
Dave Airlie	dd517ad96d	ac/nir: fix lds store for patch outputs. This wasn't calculating the correct value, this along with a nir patch fixes a regression in: dEQP-VK.tessellation.shader_input_output.barrier Fixes: `043d14db30` (ac/nir: don't write tcs outputs to LDS that aren't read back.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-19 06:44:24 +10:00
Samuel Pitoiset	79b34d0832	amd/common: add ac_vgt_gs_mode() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-18 11:50:50 +01:00
Samuel Pitoiset	55f8431c76	amd/common: add ac_get_cb_shader_mask() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-18 11:50:48 +01:00
Bas Nieuwenhuizen	b308bb8773	amd/common: Add detection of the syncobj wait/signal/reset ioctls. First amdgpu bump after inclusion was 20 (which was done for local BOs). Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-18 09:31:06 +01:00
Samuel Pitoiset	225b198802	amd/common: add ac_build_waitcnt() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:24:44 +01:00
Samuel Pitoiset	24601810e9	amd/common: more use of i32_1 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:24:42 +01:00
Samuel Pitoiset	ec4e566560	amd/common: more use of i32_0 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:24:41 +01:00
Samuel Pitoiset	d43e72fd8c	radeonsi: make use of ac_build_fdiv() And move the comment to amd/common. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:24:38 +01:00
Samuel Pitoiset	88522e2bcd	radv: export SampleMask from pixel shaders at full rate Use 16_ABGR instead of 32_ABGR if Z isn't written. Ported from RadeonSI. No CTS regressions on Polaris. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:23:28 +01:00
Samuel Pitoiset	91f4d746e4	amd/common: add ac_get_spi_shader_z_format() ac_shader_util.c will contain shader helpers for RadeonSI and RADV. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:23:23 +01:00
Samuel Pitoiset	90c3bf0789	radv: do not load the local invocation index when it's unused Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:22:26 +01:00
Samuel Pitoiset	e001944410	amd/common: scan which components of gl_LocalInvocationID are used Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:22:04 +01:00
Samuel Pitoiset	42285ed8c3	amd/common: scan which components of gl_WorkGroupID are used Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:22:02 +01:00
Samuel Pitoiset	2e58ef46a8	radv: replace grid_components_used by uses_grid_size Use a boolean instead because the number of needed SGPRs is always 3. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:19:42 +01:00
Samuel Pitoiset	97e57740d8	radv: always emit all compute block components The number of grid components is always 3 when gl_NumWorkGroups is declared, because it relies on the number of components of nir_intrinsic_load_num_work_groups. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:19:39 +01:00
Timothy Arceri	a5f9ac2928	ac: fix nir_op_f2f64 Without this we get the error "FPExt only operates on FP" when converting the following: vec1 32 ssa_5 = b2f ssa_4 vec1 64 ssa_6 = f2f64 ssa_5 Which results in: %44 = and i32 %43, 1065353216 %45 = fpext i32 %44 to double With this patch we now get: %44 = and i32 %43, 1065353216 %45 = bitcast i32 %44 to float %46 = fpext float %45 to double Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-13 13:20:28 +11:00
Bas Nieuwenhuizen	3342a432fa	ac/nir: Support vulkan_resource_reindex. Fixes: `93b4cb61eb` "spirv: Allow OpPtrAccessChain for block indices" Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-12 00:16:18 +01:00
Bas Nieuwenhuizen	368f49b284	ac/nir: Don't load the descriptor in vulkan_resource_index. To support the reindex intrinsic, we need the result to be something on which we can adjust the index/address. Since it is all within a basic block, the compiler should be able to merge any extra loads. v2: Change visit_get_buffer_size too. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-12 00:16:18 +01:00
Samuel Pitoiset	5f81a43535	radv: use a faster version for nir_op_pack_half_2x16 This patch is ported from RadeonSI and it has two effects. It fixes a rendering issue which affects F1 2017 and Dawn of War 3 (Vega only) because LLVM was ending up by generating the new v_mad_mix_{hi,lo} instructions which appear to be buggy in some way. Not sure if Mesa is generating something wrong or if the issue is in LLVM only. Anyway, that explains why the DOW3 issue can't be reproduced with GL on Vega. It also improves performance because v_cvt_pkrtz_f16 is faster, and because I guess the rounding mode behaviour is similar between GL and VK, we can use it. About performance, it improves Talos by +3/4% but I don't see any other impacts. No CTS regressions on Polaris. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-07 17:21:50 +01:00
Timothy Arceri	ccd1810bba	ac: add si_nir_load_input_gs() to the abi V2: make use of driver_location and don't expose NIR to the ABI. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:19 +11:00
Timothy Arceri	caf15ce670	ac: move build_varying_gather_values() to ac_llvm_build.h and expose Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:19 +11:00
Timothy Arceri	6fd6cb6616	ac: add basic nir -> llvm type helper Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Marek Olšák	186adc514b	ac/surface: always compute DCC info when DCC is possible on GFX9 The same code for VI doesn't check for scanout either. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-30 18:46:11 +01:00
Marek Olšák	e4cce7dbba	radeonsi: dismantle si_common_screen_init/destroy Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	757ea3e613	radeonsi: move/remove ac_shader_binary helpers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	e3c0a5b6e8	ac/surface: enable DCC computation for MSAA Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Dylan Baker	5060c51b6f	meson: build r600 driver v4: - Ensure inc_amd_common defined when radeonsi is disabled (needed by r600) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Tested-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-28 14:06:33 -08:00
Nicolai Hähnle	377a062321	ac/surface: fix indentation Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Nicolai Hähnle	97f42d11df	amd/common: sid.h cleanups Fix a bunch of labels indicating when registers were added/removed and normalize the SI-class GRBM_GFX_INDEX. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Marek Olšák	6b8909f2d1	ac: pack legacy_surf_level better r600_texture: 1488 -> 1248 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:46:16 +01:00
Marek Olšák	ec15ff78c3	ac: change legacy_surf_level::slice_size to dword units The next commit will reduce the size even more. v2: typecast to uint64_t manually v3: add more typecasts, add asserts Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:44:04 +01:00
Marek Olšák	474b4a9191	ac: pack ac_surface better r600_texture: 1736 -> 1488 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:12:38 +01:00
Dave Airlie	043d14db30	ac/nir: don't write tcs outputs to LDS that aren't read back. If the TCS doesn't read back the outputs, no need to store them to LDS in the first place. (except for tess factors). This seems to give about 50fps (3290->3330) with tessellation demo. I haven't tested if it impacts DoW3 at all. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-27 13:50:24 +10:00
Boyuan Zhang	436a3f8d6d	radeon/common: add vcn enc ip info query New ip info query is needed for vcn encode Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Timothy Arceri	b73ce64fb8	ac: add gs_{prim,invocation}_id to the abi Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-16 10:54:03 +11:00
Dylan Baker	46a7fdd7ca	meson: Remove build_by_default from amd code This is the same logic as the previous two patches. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-13 13:43:20 -08:00
Timothy Arceri	8fe6abd964	ac: add emit_vertex to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-12 11:08:26 +11:00
Dave Airlie	6bec8bcd79	ac/nir: add support for all intrinsics. (v2) This is derived from tgsi/radeonsi code from the GLSL intrinsics. This should pre-fix radv for the upcoming spirv patches. v2: actually use wait_cnt, sleep deprived dad time! (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-09 01:25:59 +00:00
Marek Olšák	7f33e94e43	amd/addrlib: update to latest version This uses C++11 initializer lists. I just overwrote all Mesa files with internal addrlib and discarded hunks that we should probably keep, but I might have missed something. The code depending on ADDR_AM_BUILD is removed. We can add it back next time if needed. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-08 00:55:13 +01:00
Marek Olšák	cde664ab81	radeonsi: use ac_create_target_machine Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-07 17:58:38 +01:00
Marek Olšák	81f81fdb54	radeonsi: use ac_get_llvm_processor_name Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-07 17:58:36 +01:00
Marek Olšák	24e9004708	radeonsi: remove unused field in the PCI ID table Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2017-11-07 17:26:36 +01:00
Dave Airlie	0084f4a422	ac/nir: for ubo load use correct num_components I was hacking something stupid in doom, and hit an assert for the bitcast following this, it definitely looks like this should be the number of 32-bit components, not the instr level ones. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-07 14:54:19 +10:00
Timothy Arceri	6e2eb96b64	ac: remove the remaining duplicate llvm types Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	e73a467005	ac: remove unused v4f32 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	7f4966731f	ac: add v2f32 to the common code and make use of it Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	cd6cfd1095	ac: use the ac f16 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	8f651ae062	ac: use the ac f32 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	368654a299	ac: use the ac f64 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	d927db0672	ac: use the common v8i32 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	9db51b2393	ac: use the common v4i32 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	ee376ac6f4	ac: add v3i32 to the common code and make use of it Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	309a51411d	ac: add v2i32 to the common code and use it Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	c64cfa0392	ac: use the ac i64 llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	3d45acf71c	ac: remove unused i16 llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	4d4799643d	ac: use the ac ivoidt llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	209ad5c16f	ac: use the ac i8 llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	21d71189ec	ac: use the ac i1 llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	bd59a0bb8b	ac: use the ac i32 llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	439a2febc4	ac/radeonsi: add support for tex instr without a dereference These are produced by nir_lower_bitmap(). Adding the missing dereference would cause other issues that need to be hacked around, such as skipping sampler lowering and uniform location assignment, so this change seems the correct way to go. Fixes 194 piglit crashes on radeonsi using NIR. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:19:51 +11:00
Marek Olšák	529cdce799	radeonsi: remove 'Authors:' comments It's inaccurate. Instead, see the copyright and use "git log" and "git blame" to know the authorship. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-02 18:19:03 +01:00
Dave Airlie	16cfbef44c	ac/llvm: drop pointless wrappers around umsb/imsb Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 15:59:34 +10:00
Dave Airlie	82d47b9d38	ac/llvm: consolidate find lsb function. This was the same between si and ac. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 15:59:31 +10:00
Dave Airlie	de2b241111	ac/llvm: drop v4f32empty. (v2) This was unused. v2: drop args. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 15:59:22 +10:00
Dave Airlie	a76b6c2192	ac/llvm: add i1false/i1true to common code. These get used in fair few places. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 15:59:18 +10:00
Dave Airlie	88b7ddbe65	ac/llvm: use the ac i32 0/1 and f32 0/1 llvm types. This just avoids having two copies of these. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 15:59:13 +10:00
Dave Airlie	f925f5b074	ac/nir: move lds declaration/load/store into shared code. This was duplicated between both drivers, share here. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 15:59:11 +10:00
Matthew Nicholls	27a0b24bf2	ac/nir: generate correct instruction for atomic min/max on unsigned images v2: fix silly typo Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-25 20:52:58 +02:00
Timothy Arceri	8ebaf8192a	ac: add support for explicit component packing This is needed for RADV to support explicit component packing. This is also required to use the new NIR component splitting / packing passes. V2: - add component packing support for interpolate_at* intrinsics - improve store packing support when not all varyings are scalar; as spotted by Bas, the store source was incorrectly offset. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-25 17:02:40 +11:00
Marek Olšák	2a414c3961	radeonsi: postponed KILL isn't postponed anymore, but maintains WQM This restores performance for the drirc workaround, i.e. KILL_IF does: visible = src0 >= 0; kill_flag &= visible; // accumulate kills amdgcn_kill(wqm_vote(visible)); // kill fully dead quads only And all helper pixels are killed at the end of the shader: amdgcn_kill(kill_flag); Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-24 14:56:34 +02:00
Marek Olšák	478afbe525	ac: use llvm.amdgcn.kill with LLVM 6.0 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-24 14:56:34 +02:00
Marek Olšák	1ff9e27cbd	ac: replace ac_build_kill with ac_build_kill_if_false This will be a new LLVM intrinsic and will also work nicely with llvm.amdgcn.wqm.vote. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-24 14:56:34 +02:00
Eric Anholt	ba85525fce	ac: Silence a compiler warning about results[0]. We know that num_components will be > 0, but the compiler doesn't. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-23 10:14:40 -07:00
Eric Anholt	34c04c734f	ac: Fix a compiler warning for possibly undefined "name" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-23 10:14:40 -07:00
Nicolai Hähnle	f9ccfda9bc	amd/common/gfx9: workaround DCC corruption more conservatively Fixes KHR-GL45.texture_swizzle.smoke and others on Vega. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102809 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-23 18:10:20 +02:00
Bas Nieuwenhuizen	a548b727a1	ac/nir: Only clamp shadow reference on radeonsi. Vulkan CTS does not expect the value to be clamped (at least for D32), and it makes a difference even though depth is in [0,1], due to strict inequalities. I couldn't find anything in the Vulkan spec about this, but the test seemed to be copied from GL tests and the GL spec only specifies clamping for fixed point formats. Hence I expect radeonsi to run into this at some point as well, but given that they still have a usecase with the Z16->Z32 promotion, I'll leave that for someone else to clean up. This at least fixes radv dEQP-VK.texture.shadow.* on VI. Fixes: `0f9e32519b` 'ac/nir: clamp shadow texture comparison value on VI' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-23 09:13:38 +02:00
Bas Nieuwenhuizen	2c5b43c87f	ac/nir: Fix nir_texop_lod on GFX for 1D arrays. Fixes: `1bcb953e16` 'radv: handle GFX9 1D textures' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-23 00:27:44 +02:00
Dave Airlie	da9c3cd3ee	radv/ac/nir: only emit tess factors to storage if tes reads them Otherwise we just need to write them to the tf ring. This seems to improve the tessellation demo on Bonaire ~2190->~2230 fps Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-23 07:10:29 +10:00
Bas Nieuwenhuizen	ad727b96b6	ac/nir: Account for compact array index in GS input load from LDS. Mirrors the vram path. Fixes: `d4ecc3c929` 'ac/nir: Add loading from LDS for merged GS.' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-21 22:29:40 +02:00
Bas Nieuwenhuizen	24fe4e6143	ac/nir: Set larger workgroup size for GS on GFX9. They don't take a single wave anymore and we need the barriers. Fixes: `6bc42855f9` 'radv: enable GS on GFX9' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-21 12:46:44 +02:00
Bas Nieuwenhuizen	9e82f2b3ea	ac/nir: Take the max workgroup size of all provided shaders. Fixes: `ffaf4d608a` 'radv: Enable tessellation shaders for GFX9.' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-21 12:46:28 +02:00
Andres Rodriguez	92724338ba	radv: Expose VK_EXT_global_priority Expose the extension string as supported Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:01:44 +02:00
Jason Ekstrand	59fb59ad54	nir: Get rid of nir_shader::stage It's redundant with nir_shader::info::stage. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-20 12:49:17 -07:00
Bas Nieuwenhuizen	9961ae2447	ac/nir: Fix up GS input vgprs. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 06:23:37 +01:00
Bas Nieuwenhuizen	d4ecc3c929	ac/nir: Add loading from LDS for merged GS. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 06:23:29 +01:00
Bas Nieuwenhuizen	ec53e52742	ac/nir: Add ES output to LDS for GFX9. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 06:23:18 +01:00
Bas Nieuwenhuizen	3e77333030	ac/nir: Add merged GS function. [airlied: merged fixup + and fixed up a couple more bits]. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 06:23:14 +01:00
Dave Airlie	1dda214d9c	ac/nir: init full exec mask for merged shaders. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 01:50:40 +02:00
Timothy Arceri	bebfeb7e1c	ac: move some code out of loop in store_tcs_output() Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-10-20 08:01:26 +11:00
Bas Nieuwenhuizen	ce03c119ce	radv: Add code to compile merged shaders. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:23 +02:00
Bas Nieuwenhuizen	640f2c458f	ac/nir: Add LS-HS input VGPR workaround. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:19 +02:00
Bas Nieuwenhuizen	0a182e73d9	ac/nir: Compile the bodies of multiple shaders. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:15 +02:00
Bas Nieuwenhuizen	56d8af1ec5	ac/nir: Expand user SGPR descriptions a bit. To prevent VS/TCS collisions in merged shaders. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:07 +02:00
Bas Nieuwenhuizen	25efef40d2	ac/nir: Don't write to the dynamic HS word on GFX9. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:04 +02:00
Bas Nieuwenhuizen	d8bd693d03	ac/nir: Add function creation for merged LS+HS. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:00 +02:00
Bas Nieuwenhuizen	0cdc8b26f8	ac/nir: Make scan_shader_output_decl less dependent on the context. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:24:56 +02:00
Bas Nieuwenhuizen	6078a3bd51	ac/nir: Allow ac_shader_variant_info to contain info about multiple stages. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:24:51 +02:00
Bas Nieuwenhuizen	a996ed1f9b	ac/nir: Change interface to allow multiple source shaders. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:24:47 +02:00
Bas Nieuwenhuizen	872b21487c	ac/nir: Add HS calling convention. Needed for GFX9 merged shaders. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:24:42 +02:00
Bas Nieuwenhuizen	163a4bf386	ac: Parse the new HS RSRC1 register. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:24:20 +02:00
Marek Olšák	854593b8eb	ac: clean up ac_build_indexed_load function interfaces Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	11adea4b24	ac: add radeon_info::has_sync_file Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-12 21:04:56 +02:00
Marek Olšák	5f2073be32	ac/surface: add ac_surface::is_displayable Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-12 19:03:33 +02:00
Marek Olšák	7b697c8b78	amd: move r600d_common.h into r600g Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-09 16:27:06 +02:00
Marek Olšák	76997e9133	radeonsi: shrink r600d_common.h and stop using it Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-09 16:27:05 +02:00
Marek Olšák	bcd3e761a3	ac: properly document a buffer.store LLVM workaround Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-06 02:56:11 +02:00
Marek Olšák	94d800bfa3	ac: silence a warning	2017-10-04 17:00:05 +02:00
Dave Airlie	4e93d6baae	radv: emit fmuladd instead of fma to llvm. For Vulkan SPIR-V the spec states fma() Inherited from OpFMul followed by OpFAdd. Matt says the backend will do the right thing depending on the hardware being compiled for, if you use the fmuladd intrinsic. Using the Mad Max pts test, on high settings at 4K: CHP: 55->60 HGDD: 46->50 LM: 55->60 No change on Stronghold. Thanks to Feral for spending the time to track this down. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-04 06:22:44 +01:00
Nicolai Hähnle	052b974fed	amd/common: move ac_build_phi from radeonsi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-02 12:17:15 +02:00
Nicolai Hähnle	4c56e07029	radeonsi: clamp depth comparison value only for fixed point formats The hardware usually does this automatically. However, we upgrade depth to Z32_FLOAT to enable TC-compatible HTILE, which means the hardware no longer clamps the comparison value for us. The only way to tell in the shader whether a clamp is required seems to be to communicate an additional bit in the descriptor table. While VI has some unused bits in the resource descriptor, those bits have unfortunately all been used in gfx9. So we use an unused bit in the sampler state instead. Fixes dEQP-GLES3.functional.texture.shadow.2d.linear.equal_depth_component32f and many other tests in dEQP-GLES3.functional.texture.shadow.* Fixes: `d4d9ec55c5` ("radeonsi: implement TC-compatible HTILE") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 11:44:50 +02:00
Nicolai Hähnle	a6ea4c1b93	amd/common: save an instruction in the build_cube_select sequence Avoid a v_cndmask: the absolute value is free due to input modifiers. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 11:43:07 +02:00
Nicolai Hähnle	5be5c1e0fa	amd/common: fix build_cube_select Fix the custom cube coord selection sequence to be identical to the hardware v_cubesc/tc and OpenGL spec. Affects texture sampling with user-provided derivatives. Fixes dEQP-GLES3.functional.shaders.texture_functions.texturegrad.* Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 11:43:04 +02:00
Nicolai Hähnle	9ddc6e16a9	amd/common: remove ac_shader_abi::chip_class Redundant with the recently added ac_llvm_context::chip_class. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-29 11:37:03 +02:00
Dylan Baker	673dda8330	meson: build "radv" vulkan driver for radeon hardware This builds, installs, and has been tested on a r290x (Hawaii) with the Vulkan CTS. It dies horribly in a fire at the same point for the meson build as the autotools build. v2: - enable radv by default - add shader cache support and enforce that it's built for radv v3: - Fix typo in meson_options (Nicholas) - strip trailing 'svn' from llvm version before setting the version preprocessor flag (Bas) - Check for LLVM module requirements Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-09-27 09:12:34 -07:00
Nicolai Hähnle	eb71394ff3	ac/surface: handle error when choosing preferred swizzle mode CID: 1418140 Fixes: `c4ac522511` ("ac/surface: handle S8 on gfx9") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-09-21 11:00:00 +02:00
Dave Airlie	c4ac522511	ac/surface: handle S8 on gfx9 If we don't have a depth piece, we don't get a correct swizzle mode and we hit an assert in addrlib. In case of no depth get the preferred swizzle mode for stencil alone. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "17.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-20 15:32:05 +10:00
Nicolai Hähnle	94736d31c3	amd/common: add workaround for cube map array layer clamping Fixes dEQP-GLES31.functional.texture.filtering.cube_array.* Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-18 11:25:18 +02:00
Nicolai Hähnle	6772452e4c	amd/common: remove has_ds_bpermute argument from ac_build_ddxy Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-18 11:25:18 +02:00
Nicolai Hähnle	3db86d86ed	amd/common: add chip_class to ac_llvm_context Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-18 11:25:18 +02:00
Nicolai Hähnle	e0af3bed2c	amd/common: round cube array slice in ac_prepare_cube_coords The NIR-to-LLVM pass already does this; now the same fix covers radeonsi as well. Fixes various tests of dEQP-GLES31.functional.texture.filtering.cube_array.combinations.* Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-09-18 11:25:18 +02:00
Bas Nieuwenhuizen	979978ee06	radv: Check for GFX9 for 1D arrays in image_size intrinsic. Only on GFX9 we implement them as 2D images. This fixes: dEQP-VK.image.image_size.1d_array.readonly_12x34 dEQP-VK.image.image_size.1d_array.readonly_1x1 dEQP-VK.image.image_size.1d_array.readonly_32x32 dEQP-VK.image.image_size.1d_array.readonly_7x1 dEQP-VK.image.image_size.1d_array.readonly_writeonly_12x34 dEQP-VK.image.image_size.1d_array.readonly_writeonly_1x1 dEQP-VK.image.image_size.1d_array.readonly_writeonly_32x32 dEQP-VK.image.image_size.1d_array.readonly_writeonly_7x1 dEQP-VK.image.image_size.1d_array.writeonly_12x34 dEQP-VK.image.image_size.1d_array.writeonly_1x1 dEQP-VK.image.image_size.1d_array.writeonly_32x32 dEQP-VK.image.image_size.1d_array.writeonly_7x1 Fixes: `1bcb953e16` "radv: handle GFX9 1D textures" Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-09-15 22:06:56 +02:00
Samuel Pitoiset	f0d09d9012	radeonsi: move si_get_wave_info() to AMD common code This will allow us to use it from radv. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-09-14 10:37:57 +02:00
Nicolai Hähnle	cffc0ae0d9	ac/surface: match Z and stencil tile config Fixes various piglit tests on Stoney, see the comment. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-13 18:27:01 +02:00
Nicolai Hähnle	481df8032b	ac/surface: sanity-check that we got a TC-compatible HTILE if requested Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-13 18:26:59 +02:00
Nicolai Hähnle	b8c6e88848	amd/common: get ME/PFP/CE firmware feature versions as well Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-13 18:25:06 +02:00
Dave Airlie	aba441be44	radv/ac: bump params array for image atomic comp swap For the comp_swap case this was overflowing and crashing sometimes. Fixes: dEQP-VK.image.atomic_operations.compare_exchange.* Cc: "17.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-13 17:17:02 +10:00
Dave Airlie	1bcb953e16	radv: handle GFX9 1D textures As GFX9 can't handle 1D depth textures, radeonsi and apparently pro just update all 1D textures to 2D, and work around it. This ports the workarounds from radeonsi. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-13 08:40:41 +10:00
Connor Abbott	b909d278d0	ac: remove bitcast_to_float() ac_to_float() does a superset of what it does. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-08 04:24:56 +01:00
Connor Abbott	50967cd0b0	ac: move ac_to_integer() and ac_to_float() to ac_llvm_build.c We'll need to use ac_to_integer() for other stuff in ac_llvm_build.c. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-09-08 04:24:02 +01:00
Connor Abbott	fafa299511	ac: fix ac_get_type_size() for doubles Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-09-08 04:19:47 +01:00
Dave Airlie	4cab214e76	radv/ac: use ac_get_type_size. Just moved to newly shared code. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-08 04:15:50 +01:00
Connor Abbott	b8a51c8c4b	radeonsi: move the guts of ARB_shader_group_vote emission to ac Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-08 04:12:49 +01:00
Connor Abbott	bd73b89792	radeonsi: move si_emit_ballot() to ac Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-08 04:12:42 +01:00
Connor Abbott	ac27fa7294	radeonsi: move emit_optimization_barrier() to ac Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-08 04:06:47 +01:00
Connor Abbott	c181d4f2b7	radeonsi: move llvm_get_type_size() to ac Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-08 04:04:16 +01:00
Marek Olšák	4bd2bdbb3c	ac/surface: add radeon_surf::has_stencil for convenience Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-07 17:59:37 +02:00
Dave Airlie	e852ecd22b	ac/surface: reduce gfx9_surface_layout size. 152->144. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-07 11:00:08 +10:00
Nicolai Hähnle	552aaa11ed	ac/debug: take ASIC generation into account when printing registers There were some overlapping changes in gfx9 especially in the CB/DB blocks which made register dumps rather misleading. The split is along the lines of the header files, so we'll print VI-only fields on SI and CI, for example, but we won't print GFX9 fields on SI/CI/VI, and we won't print SI/CI/VI fields on GFX9. Acked-by: Marek Olšák <marek.olsak@amd.com>	2017-09-06 09:59:19 +02:00
Nicolai Hähnle	274f1dace7	amd/common: pass chip_class to ac_dump_reg Acked-by: Marek Olšák <marek.olsak@amd.com>	2017-09-06 09:59:17 +02:00
Nicolai Hähnle	925ad7d2f6	ac/sid_tables: add FieldTable object Automatically re-use table entries like StringTable and IntTable do. This allows us to get rid of the "fields_owner" logic, and simplifies the next change. Acked-by: Marek Olšák <marek.olsak@amd.com>	2017-09-06 09:59:14 +02:00
Nicolai Hähnle	981335b704	ac/sid_tables: remove unused variable varname_values Acked-by: Marek Olšák <marek.olsak@amd.com>	2017-09-06 09:59:07 +02:00
Dave Airlie	b880cd3b59	radv/gfx9: fix buffer size on gfx9. The VI sizing only applies to VI. This fixes: dEQP-VK.image.image_size.buffer.* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-06 03:05:44 +01:00
Dave Airlie	979be4f9c8	ac: reorg ac_shader_binary struct to take less space. This reduces the size from 96 to 80 bytes by putting all the 32-bit sizes at the start. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-04 08:40:37 +10:00
Samuel Pitoiset	12cbd9a13f	radeonsi: move si_vm_fault_occured() to AMD common code For radv, in order to report VM faults when detected. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-09-01 09:46:32 +02:00
Bas Nieuwenhuizen	46dd30d08f	ac/debug: Support multiple trace ids for nested IBs. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-08-29 23:05:59 +02:00
Grazvydas Ignotas	29f46488cc	ac/nir: remove misleading condition location is never set to INTERP_SAMPLE, and Nicolai comments: "... that part is misleading. location refers to the base location, not the final location of the sample, and it can never be INTERP_SAMPLE." Suggested-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>	2017-08-29 01:36:57 +03:00
Grazvydas Ignotas	2b4e31bc9b	ac/nir: silence maybe-uninitialized warnings These are likely false positives, but are also annoying because they show up on every "make install", which causes ac_nir_to_llvm to be rebuilt here. Initializing those variables to NULL should be harmless even when unnecessary. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-29 01:16:58 +03:00
Grazvydas Ignotas	15800180f3	amd: add .editorconfig amd/common/ and amd/vulkan/ are using tabs for indent, which doesn't match the settings in root .editorconfig, so let's override. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-29 01:08:58 +03:00
Marek Olšák	39205f216e	gallium/radeon: set EVENT_WRITE_EOP.INT_SEL = wait for write confirmation Ported from Vulkan. Not sure what this is good for.. maybe write confirmation from L2 flushes? Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-28 21:45:33 +02:00
Marek Olšák	d500c9b060	Revert "radeonsi: get the raster config from AMDGPU on SI" This reverts commit `fc99cb3c9e`. "The performance went down from 64.7 to 51.4 fps in Valley and from 30.8 to 25.1 fps in Heaven on Radeon HD 7970. Other games seem to have also a 10-25% performance decrease." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102429 It looks like we can't use the raster config values from the kernel.	2017-08-27 22:27:23 +02:00
Mauro Rossi	725741f10d	ac/debug: use util_strchrnul() to fix android build error Similar to `e09d04cd56` "radeonsi: use util_strchrnul() to fix android build error" Android Bionic does not support strchrnul() string function, gallium auxiliary util/u_string.h provides util_strchrnul() This change avoids the following warning and error: external/mesa/src/amd/common/ac_debug.c:501:15: warning: implicit declaration of function 'strchrnul' is invalid in C99 char *end = strchrnul(out, '\n'); ^ external/mesa/src/amd/common/ac_debug.c:501:9: error: incompatible integer to pointer conversion initializing 'char *' with an expression of type 'int' char *end = strchrnul(out, '\n'); ^ ~~~~~~~~~~~~~~~~~~~~ 1 warning and 1 error generated. Fixes: `c2c3912410` "ac/debug: annotate IB dumps with the raw values" Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Rob Herring <robh@kernel.org>	2017-08-24 17:23:24 -05:00
Marek Olšák	fc99cb3c9e	radeonsi: get the raster config from AMDGPU on SI Not sure yet if we wanna do this on CIK and VI too. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-24 23:54:55 +02:00
Bas Nieuwenhuizen	180c1b924e	ac/nir: Add shader support for multiviews. It uses a user SGPR to pass the view index to the shaders, except for the fragment shader where we use layer=view (which comes in handy when we want to do the NV ext that allows us to execute pre-FS stages once instead of per view). Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-08-24 19:20:47 +02:00
Bas Nieuwenhuizen	eec5578158	ac/nir: Make shader key a struct. Some bits can be passed to almost every shader, and I don't like adding 5 variables. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-08-24 19:20:47 +02:00
Bas Nieuwenhuizen	3d5f29f5f9	ac/nir: Implement input attachments with layered rendering. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-08-24 19:20:47 +02:00
Bas Nieuwenhuizen	c848e642d2	ac/nir: Determine if input attachments are used in the info pass. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-08-24 19:20:47 +02:00
Bas Nieuwenhuizen	43595db302	ac/nir: Cast sources of integer ops to int. The int32->float semantic conversion got dropped in a testcase, because the src was already float. On closer inspection I decided to add a few more casts for integer op operands to be safe too. Cc: 17.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-08-24 19:20:47 +02:00
Bas Nieuwenhuizen	6bafb56df6	radv: Implement bc optimize. Seems like we actually enabled it already, but did not implement the shader part. With this patch we do. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-08-24 00:57:03 +02:00
Bas Nieuwenhuizen	a7f5545ede	ac/nir: refactor input variable iteration. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-08-24 00:57:03 +02:00
Nicolai Hähnle	8937ac9a13	ac/debug: invoke valgrind checks while parsing IBs Help catch garbage data written into IBs. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-23 13:54:07 +02:00
Nicolai Hähnle	c2c3912410	ac/debug: annotate IB dumps with the raw values Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-23 13:54:05 +02:00
Nicolai Hähnle	cfb3824c23	ac/debug: use an explicit getter for fetching words from the IB Guard against out-of-bounds accesses, and prepare for upcoming changes. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-23 13:54:03 +02:00
Marek Olšák	759526813b	ac/surface/gfx9: don't allow DCC for the smallest mipmap levels This fixes garbage there if we don't flush TC L2 after rendering. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-22 13:29:47 +02:00
Marek Olšák	776fcccabf	gallium/radeon: clean up EOP_DATA_SEL magic numbers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-22 13:29:47 +02:00
Marek Olšák	fdef2f0fd1	radeonsi/gfx9: properly handle imported textures with unexpected swizzle mode Cc: 17.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-22 13:29:47 +02:00
Nicolai Hähnle	fbbb5f71cd	amd/common: split out ac_parse_ib_chunk from ac_parse_ib Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-22 09:50:46 +02:00
Dave Airlie	b040f51b61	ac/nir: fixup layer/viewport export for GFX9. GFX9 moved where the viewport index export goes. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-21 04:26:37 +01:00
Dave Airlie	4c02e2bd95	radv: disable texture gather workaround on gfx9. Not required anymore. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-17 02:24:36 +01:00
Marek Olšák	4630ede102	ac: fail shader compilation if libelf is replaced by an incompatible version UE4Editor has this issue. This commit prevents hangs (release build) or assertion failures (debug build). It doesn't fix the editor, but catastrophic scenarios are prevented. Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-08-10 13:24:23 +02:00
Connor Abbott	c12c2e40a3	ac/nir: fix saturate emission The .f32 was already getting added by emit_intrin_2f_param(). Noticed when enabling LLVM module verification. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-08-08 11:58:21 -07:00
Dave Airlie	3f389f75b6	radv: fix f16->f32 denorm handling for SI/CIK. (v2) This just copies the code from the -pro shaders, and fixes the tests on CIK. With this CIK passes the same set of conformance tests as VI. Fixes: `83e58b03` (radv: flush f32->f16 conversion denormals to zero. (v2)) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-07 00:00:05 +01:00
Andres Rodriguez	6130c8e6e7	ac/gpu: add driver/device UUID query helpers We need vulkan and gl to produce the same UUIDs. Therefore we should keep the mechanism to compute these in a common location to guarantee they are updated in lockstep. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-06 12:42:07 +10:00
Marek Olšák	c60c5accd1	ac/surface: align DCC size for surfaces that use tile swizzle Note that dcc_alignment = pipe_interleave_bytes * num_pipes * num_banks, which is greater than the previous open-coded alignment. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-04 02:10:04 +02:00
Marek Olšák	0141beadd8	ac/surface: limit tile swizzle to non-mipmaps on SI Mipmapping with tile swizzle doesn't work. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-04 02:10:04 +02:00
Marek Olšák	2b7e85562a	ac/surface: enable tile swizzle for mipmapped textures The tile swizzle computation was done after the whole miptree was computed, but that was too late, because at that point AddrSurfInfoOut contained information about the smallest miplevel, which is never 2D-tiled. The correct way is to do the computation before the second level is computed. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-04 02:10:04 +02:00
Marek Olšák	6fb382d9fb	ac/surface: set structure size and handle errors for AddrComputeBaseSwizzle Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-04 02:10:04 +02:00
Marek Olšák	59144d4bf5	ac/surface: increment surf_index only when tile swizzle is allowed Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-04 02:10:04 +02:00
Marek Olšák	9059400247	ac/surface: compute tile swizzle only when it's allowed Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-04 02:10:04 +02:00
Marek Olšák	4e757d591d	ac/surface: add RADEON_SURF_SHAREABLE Shareable textures won't use tile swizzle. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-04 02:10:04 +02:00
Marek Olšák	d311e837f4	ac/surface: remove RADEON_SURF_HAS_TILE_MODE_INDEX it's useless Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-04 02:10:04 +02:00
Marek Olšák	4662e45350	ac/surface: move tile_swizzle to ac_surface and document it Gfx9 will use it too. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-04 02:10:04 +02:00
Nicolai Hähnle	dfc1502c84	radeonsi: fix streamout overflow predication on VI+ There is a firmware regression that causes failures. Work around it by using the compute shader for query_buffer_objects to summarize the query results. v2: rename to PREDICATION_OP_BOOL64 (consistent with sid.h) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-02 09:48:53 +02:00
Bas Nieuwenhuizen	341578a6ae	ac/nir: Add float cast before shadow comparator clamp. LLVM complained about passing an i32 to a float clamp. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Fixes: `0f9e32519b` "ac/nir: clamp shadow texture comparison value on VI" Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-02 08:43:13 +02:00
Dave Airlie	cb6f16dce9	radeon/ac: use ds_swizzle for derivs on si/cik. This looks like it's supported since llvm 3.9 at least, so switch over radeonsi and radv to using it, -pro also uses this. We can now drop creating lds for these operations as the ds_swizzle operation doesn't actually write to lds at all. Acked-by: Marek Olšák <marek.olsak@amd.com> (stable requested due to fixing radv CIK conformance tests) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-02 00:12:01 +01:00
Connor Abbott	ddd9e11795	ac/nir: fix nir_op_unpack_64_2x32_split_y emission This was broken thanks to a typo in `b2367cf`. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-01 12:20:49 -07:00
Connor Abbott	6d731c5651	ac/nir: fix lsb emission This makes it match radeonsi. The LLVM backend itself will emit the correct instruction, but LLVM might do incorrect optimizations since it thinks the output is undefined when the input is 0, even though it's not supposed to be. We really need a new intrinsic, or for the backend to become smarter and recognize this pattern. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <basni@google.com>	2017-08-01 12:20:49 -07:00
Dave Airlie	df61a05019	radv: handle 10-bit format clamping workaround. This fixes: dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.* for a2r10g10b10 formats as destination on SI/CIK hardware. This adds support to the meta program for emitting 10-bit outputs, and adds 10-bit support to the fragment shader key. It also only does the int8/10 on SI/CIK. Fixes: `f4e499ec7` (radv: add initial non-conformant radv vulkan driver) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-01 00:10:23 +01:00
Nicolai Hähnle	b7d36efc2d	ac/nir: implement load_frag_coord intrinsic Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:44 +02:00
Nicolai Hähnle	bcf85fcd9a	ac/nir: pass ac_llvm_context to unpack_param Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:44 +02:00
Nicolai Hähnle	1c64637c26	ac/nir,radeonsi: add and use ac_shader_abi::frag_pos v2: update for LLVMValueRefs in ac_shader_abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:43 +02:00
Nicolai Hähnle	f03c54e05a	ac/nir,radeonsi: add and use ac_shader_abi::{ancillary,sample_coverage} v2: update for LLVMValueRefs in ac_shader_abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:43 +02:00
Nicolai Hähnle	7de445377c	ac/nir,radv: move force_persample to ac_shader_info::force_persample Avoid accessing radv-specific structures during the meat of NIR-to-LLVM translation. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:43 +02:00
Nicolai Hähnle	a69afb68c9	radeonsi: use new function ac_build_umin for edgeflag clamping Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:42 +02:00
Nicolai Hähnle	0f9e32519b	ac/nir: clamp shadow texture comparison value on VI Needed for TC-compatible HTILE in radeonsi for test cases like piglit spec/arb_texture_rg/execution/fs-shadow2d-red-01.shader_test Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:42 +02:00
Nicolai Hähnle	ac2ab5acad	ac/nir: add always_vector argument to ac_build_gather_values_extended This simplifies a bunch of places that no longer need special treatment of value_count == 1. We rely on LLVM to optimize away the 1-element vector types. This fixes a bunch of bugs where 1-element arrays are indexed indirectly. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:42 +02:00
Nicolai Hähnle	e247357240	ac/nir,radeonsi: add ac_shader_abi::front_face v2: update for LLVMValueRefs in ac_shader_abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:42 +02:00
Nicolai Hähnle	28634ff7d3	ac/nir: pass ac_nir_context to emit_ddxy Allocating the ddxy_lds is considered to be part of the API shader translation and not part of the ABI. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:41 +02:00
Nicolai Hähnle	c5f3912e13	ac/nir: pass ac_nir_context to SSBO intrinsic handlers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:41 +02:00
Nicolai Hähnle	b78eae6f2a	ac/nir: load buffer descriptors via ac_shader_abi::load_ssbo Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:40 +02:00
Nicolai Hähnle	aa66fec47e	ac/nir: pass ac_nir_context to emit_discard_if Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:40 +02:00
Nicolai Hähnle	4ba201ee36	ac/nir: extract shader_info->fs.can_discard from NIR shader info Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:40 +02:00
Nicolai Hähnle	9061dca872	ac/nir: handle old-style shadow tex instructions correctly The first element is only extracted for new-style shadow tex. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:39 +02:00
Nicolai Hähnle	07597632a5	ac/nir: whitespace fixes Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:39 +02:00
Nicolai Hähnle	ba06e8bbe8	ac/nir: use shader_info pass to determine whether instance_id is used This improves the separation of ABI and NIR translation. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:39 +02:00
Nicolai Hähnle	be0488a173	ac/nir: move setting shader_info->fs.writes_memory to radv-specific code Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:39 +02:00
Nicolai Hähnle	f37f9aed84	ac/nir: add image and write parameter to ac_shader_abi::load_sampler_desc Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:38 +02:00
Nicolai Hähnle	b36b6f76fa	ac/nir: add support for arrays-of-arrays to get_sampler_desc Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:38 +02:00
Nicolai Hähnle	35b7b3a80f	ac/nir: pass ac_nir_context to tex_fetch_ptrs and related functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:37 +02:00
Nicolai Hähnle	6ff5317589	ac/nir: add and use ac_shader_abi::load_sampler_desc Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:37 +02:00
Nicolai Hähnle	57fbf3f9eb	ac/nir: pass ac_nir_context to visit_tex and various related functions Get most of the churn out of the way before actually loading samplers via the ABI. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:37 +02:00
Nicolai Hähnle	7763c7b2ba	ac/nir,radeonsi: add ac_shader_abi::chip_class Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:37 +02:00
Nicolai Hähnle	d007919d99	ac/nir,radeonsi: add ac_shader_abi::load_ubo Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:36 +02:00
Nicolai Hähnle	220ed150bc	ac/nir: pass ac_nir_context to visit_load_ubo_buffer Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:35 +02:00
Nicolai Hähnle	df62e5eed0	ac/nir: pass ac_nir_context to visit_{load,store}_var and get_deref_offset helper Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:35 +02:00
Nicolai Hähnle	e139705c98	ac/nir: pass ac_llvm_context to some helper functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:35 +02:00
Nicolai Hähnle	cb96a36b04	ac/nir: pass ac_nir_context to visit_intrinsic Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:35 +02:00
Nicolai Hähnle	48737e1890	ac/nir: add ac_nir_context::main_function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:35 +02:00
Nicolai Hähnle	2be774b196	ac/nir: split scanning outputs from setting up output allocas The scanning phase sets the driver_location, because it is part of the ABI: radeonsi does the assignment differently. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:34 +02:00
Nicolai Hähnle	1a508cf8d6	ac/nir: pass ac_llvm_context to build_alloca helpers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:34 +02:00
Nicolai Hähnle	b99a169869	ac/nir: use ac_shader_abi::emit_outputs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:34 +02:00
Nicolai Hähnle	0c3b6a4bd9	ac,radeonsi: add ac_shader_abi::emit_outputs for hardware VS shaders Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:34 +02:00
Nicolai Hähnle	9df23db13d	radeonsi: translate NIR to LLVM Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:33 +02:00
Nicolai Hähnle	73c7e92d3a	ac/nir: add ac_shader_abi::inputs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:32 +02:00
Nicolai Hähnle	b2367cfcc7	ac/nir: begin splitting off ac_nir_context The eventual goal is to hide all radv-specific details behind ac_nir_context::abi, so that the NIR->LLVM code can be re-used by radeonsi. During development, we live with a partial split, where some of the NIR->LLVM code still relies on linking back to the nir_to_llvm_context (which should ultimately be renamed to reflect that it's radv-specific). The idea is to get rid of these backlinks over time. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:32 +02:00
Nicolai Hähnle	fa5ae8db2e	ac/nir: start using ac_shader_abi v2: update for LLVMValueRefs in ac_shader_abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:31 +02:00
Nicolai Hähnle	61ad2f13c3	ac,radeonsi: move some VS input descriptions to ac_shader_abi v2: use LLVM values instead of function parameter indices Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:31 +02:00
Dave Airlie	e77ff11ffe	radv/ac: port SI TC L1 write corruption fix. This ports `72e46c988` to radv. radeonsi: apply a TC L1 write corruption workaround for SI Fixes: `f4e499ec7` (radv: add initial non-conformant radv vulkan driver) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-26 23:39:24 +01:00
Dave Airlie	a81e99f50a	radv/ac: realign SI workaround with radeonsi. This ports: `da7453666a` radeonsi: don't apply the Z export bug workaround to Hainan to radv. Just noticed in passing. Fixes: `f4e499ec7` (radv: add initial non-conformant radv vulkan driver) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-26 23:38:17 +01:00
Marek Olšák	5e81df0f10	ac/surface: fix hybrid graphics where APU=GFX9, dGPU=older v2: don't do it for compressed textures (bpp = 0) Cc: 17.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2017-07-26 19:53:26 +02:00
Dave Airlie	80562f2b77	ac/gpu: add code to detect if kernel supports sync objects. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-21 21:31:54 +01:00
Connor Abbott	91dd2ca99f	ac/nir: rewrite shared variable handling (v2) Translate the NIR variables directly to LLVM instead of lowering to a TGSI-style giant array of vec4's and then back to a variable. This should fix indirect dereferences, make shared variables more tightly packed, and make LLVM's alias analysis more precise. This should fix an upcoming Feral title, which has a compute shader that was failing to compile because the extra padding made us run out of LDS space. v2: Combine the previous two patches into one, only use this for shared variables for now until LLVM becomes smarter. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Alex Smith <asmith@feralinteractive.com>	2017-07-17 14:16:03 -07:00
Marek Olšák	3d1a576fa6	ac/gpu_info: if clock crystal frequency is 0, print an error and set 1 During bring-up, this is often 0. Prevent automatic disablement of ARB_timer_query and demotion of the OpenGL version to 3.2 by setting a non-zero frequency. Print an error message instead. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-07-17 10:56:59 -04:00
Marek Olšák	ddbd2f4c54	ac/surface/gfx9: flags.texture currently refers to TC-compatible HTILE This should lead to better MSAA performance on GFX9. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-07-17 10:56:46 -04:00
Marek Olšák	4560f2b90a	radeonsi: merge si_llvm_get_amdgpu_target into ac_get_llvm_target Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-07-17 10:50:39 -04:00
Dave Airlie	acf1e132af	amd/addrlib: fix typo in api name. This fixes the misspelling of ALIGNMENTS in addrlib. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-17 01:44:14 +01:00
Dave Airlie	f8d5b377c8	radv: set cb base tile swizzles for MRT speedups (v4) This patch uses addrlib to work out the tile swizzles according to the surface index. It seems to produce the same values as amdgpu-pro for the deferred test. v2: don't apply swizzle to CMASK. the eg docs don't mention it, and we clearly don't align cmask for that. v3: disable surf index for dedicated images, as these will most likely be shared, and I don't think the metadata has space for this info in it yet. v4: update for shareable images, rename combined_swizzle to tile_swizzle This gets the deferred demo from 730->950fps on my rx480. (dcc cmask elim predication patches get it further) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-17 01:43:41 +01:00
Dave Airlie	7b5f2e0070	radv/ac: drop setting xnack Since radv uses compute rings and we can't know when we are setting up the shaders what ring they are to be used on, we should just use the default xnack setting. This may be suboptimal in some places, but if we hit a problem, we likely should try and address this between llvm and mesa. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-09 22:21:43 +01:00
Dave Airlie	edf2acbeb1	radv: add support for using addrlib max alignment. Rather than using 64k, use what addrlib returns as the base alignment for vulkan allocations. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-09 22:17:59 +01:00
Alex Smith	c2a5cb6427	ac/nir: Fix ordering of parameters for image atomic cmpswap intrinsics The NIR parameters are ordered "compare, data", matching GLSL, but both the image and buffer LLVM intrinsics take them the other way around. This is already handled correctly for SSBO atomics. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver"	2017-07-07 00:57:25 +02:00
Dave Airlie	09d7c7be4f	radv: enable sisched toggle in perftest flags. RADV_PERFTEST=sisched to enable it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-06 23:07:49 +01:00
Dave Airlie	d97275e42c	ac/llvm: set xnack like radeonsi does. Use family, but only set xnack+ for gfx9. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-06 23:07:45 +01:00
Dave Airlie	01e958d631	ac/llvm: create features list using snprintf. Just more moving code around before adding things to it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-06 23:06:04 +01:00
Dave Airlie	9d9f051390	ac/radv: change api to create target machine This just modifies the API to make it easier to add other flags to target machine creation. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-06 23:05:59 +01:00
Bas Nieuwenhuizen	860a8e6b99	ac/nir: Move VS position exports before param exports. According to Nicolai the SX can already start work when all the position exports are done, so do those first. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-07-05 20:23:00 +02:00
Connor Abbott	2ec77f7a3c	ac/nir: fix 64-bit shifts NIR always makes the shift amount 32 bits, but LLVM asserts if the two sources aren't the same type. Zero-extend the shift amount to make LLVM happy. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-07-03 11:58:59 -07:00
Connor Abbott	7168425dd7	ac/nir: implement 64-bit packing and unpacking We implement the split opcodes, and tell NIR to lower the original ones. The lowering to LLVM is a little more complicated, but NIR can optimize the split ones a little better, and some NIR lowering passes that we might want to use (particularly for doubles) emit the split ones. This should fix pack/unpackDouble2x32, which seems to have been broken since we enabled the Float64 capability. It will also fix pack/unpackInt2x32 when we enable the Int64 capability. Fixes: `798ae37c` ("radv: Enable Float64 support.") Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-07-03 11:58:58 -07:00
Bas Nieuwenhuizen	87d3349393	radv: Use v4i32 variant of llvm.SI.load.const. We apparently still used v16i8 .... As radeonsi doesn't use it with LLVM version checks I don't think we need them either. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-06-30 23:30:55 +02:00
Dave Airlie	ff422500cc	ac/nir: remove last remnants of v16i8 llvm doesn't need this workaround anymore. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-28 20:22:30 +01:00
Alex Smith	909184ac9c	ac/nir: Use correct LLVM intrinsics for atomic ops on imageBuffers The buffer intrinsics should be used instead of the image ones. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-06-28 21:05:04 +02:00
James Legg	69a17da037	ac/nir: assert printfs will fit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-06-28 21:05:04 +02:00
James Legg	6fc41bb4d5	ac/nir: Make intrinsic_name buffer long enough When using cmpswap on an image, it was being truncated to llvm.amdgcn.image.atomic.cmpswa, with the coords type missing entirely. v2: Add stable CC CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-06-28 21:05:04 +02:00
Nicolai Hähnle	2ce126df3a	ac/nir: convert emit helpers to ac_llvm_context Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-27 10:28:30 +10:00
Nicolai Hähnle	58d496c8e2	ac/nir: remove unused nir_to_llvm_context::has_ddxy Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-27 10:28:30 +10:00
Nicolai Hähnle	6ecef25545	ac/nir: implement nir_op_f2b Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-27 10:28:30 +10:00
Nicolai Hähnle	dacf73e527	ac/nir: implement nir_op_{b2i,i2b} Booleans in NIR are ~0 for true, b2i returns 0/1. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-27 10:28:30 +10:00
Nicolai Hähnle	77d7764d5e	ac/nir: convert type helpers to ac_llvm_context Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-27 10:28:30 +10:00
Nicolai Hähnle	b7bd49158e	ac/llvm: fix type of second llvm.cttz.* parameter LLVM has required an i1 here for a long time. llvm.ctlz.* was fixed in commit `edd23e0606` ("ac/llvm: fix various findMSB bugs"). Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-27 10:28:30 +10:00
Nicolai Hähnle	e8ba03d32a	ac/shader_info: fix a comment Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-27 10:28:29 +10:00
Nicolai Hähnle	edfd3be77e	ac: add ac_llvm_context::v8i32 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-27 10:28:29 +10:00
Nicolai Hähnle	331a574732	ac: add ac_llvm_context::{i,f}32_{0,1} Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-27 10:28:29 +10:00
Nicolai Hähnle	7bf8c944dc	ac: add ac_llvm_context::{i16, i64, f16, f64} Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-27 10:28:29 +10:00
Dave Airlie	6a68170c83	radv: handle primitive id input into fragment shader with no geom shader Fixes: dEQP-VK.pipeline.framebuffer_attachment.no_attachments dEQP-VK.pipeline.framebuffer_attachment.no_attachments_ms Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-26 08:45:30 +10:00
Dave Airlie	a563f611c3	radv: set prim_id for geometry shaders Noticed in passing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-26 08:45:22 +10:00
Dave Airlie	4042892cee	radv: set use_prim_id for tess shaders correctly. Just noticed in passing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-26 08:45:14 +10:00
Marek Olšák	0f827b51c0	radeonsi/gfx9: fix TC-compatible stencil compression Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-19 20:15:36 +02:00
Marek Olšák	064f07fef3	ac/sid.h: don't use parentheses in PKT3_RELEASE_MEM definition The parser skips the line if it contains parentheses. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-19 20:15:36 +02:00
Marek Olšák	ed291cea3d	ac: parse EVENT_WRITE_EOP, RELEASE_MEM, WAIT_REG_MEM, NOWHERE Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-19 20:15:36 +02:00
Nicolai Hähnle	67e49a7f65	amd/common: fix off-by-one in sid_tables.py The very last entry in the sid_strings_offsets table ended up missing, leading to out-of-bounds reads and potential crashes. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-19 12:03:59 +02:00
Emil Velikov	84bf7e5ad6	ac: resolve conflicts introduced with "ac: remove amdgpu.h dependency" The commit did not add the relevant includes - in particular stdint.h and stdbool.h for the respective standard types. At the same time, the amdgpu_device_handle typedef redeclaration was off. Fixes: `81945ded0d` ("ac: remove amdgpu.h dependency") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101471 Cc: Mark Janes <mark.a.janes@intel.com> Cc: Gregor Münch <gr.muench@gmail.com> Reported-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reported-by: Mark Janes <mark.a.janes@intel.com> Reported-by: Gregor Münch <gr.muench@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-06-17 11:37:51 +01:00
Emil Velikov	81945ded0d	ac: remove amdgpu.h dependency Add a couple of forward declarations and drop the amdgpu.h requirement. With this we can build the r300 and r600 drivers without the need for amdgpu. v2: - Add amdgpu.h include in the C file (Marek) - Add a comment about pre C11 typedef redeclaration warning (Eric) Cc: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101189 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-06-16 12:41:44 +01:00
Dave Airlie	95c0591087	ac/gpu: drop duplicated code line. has_hw_decode is assigned twice. Pointed out by coverity. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-13 10:01:40 +10:00
Grazvydas Ignotas	19f6cc3cba	ac/nir: remove another unused variable Declared by each loop already. Trivial. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>	2017-06-08 00:02:42 +03:00
Grazvydas Ignotas	7dfa54399c	ac/nir: convert several ifs to a switch Also solve "outinfo may be used uninitialized" warning by putting in an unreachable(). Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-06-08 00:02:26 +03:00
Grazvydas Ignotas	ae3262c1f2	ac/nir: mark some arguments const Most functions are only inspecting nir, so nir related arguments can be marked const. Some more can be done if/when some nir changes are accepted. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-06-08 00:02:02 +03:00
Dave Airlie	1ec4f008a2	ac/nir: move gpr counting inside argument handling. This just moves this code in here so it's cleaner. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-07 06:00:30 +01:00
Dave Airlie	7b46e2a74b	ac/nir: assign argument param pointers in one place. Instead of having the fragile code to do a second pass, just give the pointers you want params in to the initial code, then call a later pass to assign them. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-07 06:00:23 +01:00
Dave Airlie	b19cafd441	ac/nir: consolidate setting userdata location Just pass a pointer and increment inside the function, makes the code less error prone. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-07 05:59:57 +01:00
Eric Engestrom	63a8a88ac4	tree-wide: remove trailing backslash Simple search for a backslash followed by two newlines. If one of the newlines were to be removed, this would cause issues, so let's just remove these trailing backslashes. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-06-07 01:18:09 +01:00
Bas Nieuwenhuizen	ecdace80f4	ac/surface: Fix HTILE for radv. We always compute HTILE size using addrlib, even when not TC compatible. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-06-06 03:17:02 +02:00
Dave Airlie	0063da8393	radv: add some misc gfx9 pieces. This just adds the strings and includes the gfx9 register defs in some files that we need them in. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-06 09:43:21 +10:00
Nicolai Hähnle	dfc06d2fac	radv: use ac_surface data structures This is mostly mechanical changes of renaming types and introducing "legacy" everywhere. It doesn't use the ac_surface computation functions yet. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-06-05 10:44:09 +10:00
Nicolai Hähnle	e07d5c7296	ac/surface/gfx6: explicitly support S8 surfaces This is needed by radv for dEQP-VK.renderpass.simple.stencil Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-06-05 10:43:29 +10:00
Dave Airlie	72f0830ecd	ac/nir: set workgroup size attribute to correct value. This ports: `55445ff189` from radeonsi radeonsi: tell LLVM not to remove s_barrier instructions LLVM 5.0 removes s_barrier instructions if the max-work-group-size attribute is not set. What a surprise. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-05 01:37:44 +01:00
Dave Airlie	68c812f699	ac: add new helper function to add an integer target dependent function attr. This is needed to add the max workgroup size attribute. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-05 01:37:29 +01:00
Leo Liu	ea79c0440c	amd/common: set vcn dec as hw decode as well Recommit after issue resolved by the previous patch. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-05-29 14:32:29 -04:00
Leo Liu	0abc24723c	amd/common: add vcn dec ip info query for amdgpu version 3.17 Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-05-29 14:32:29 -04:00
Marek Olšák	e019ea8f4b	radeonsi: move building llvm.SI.load.const into ac_build_buffer_load Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-29 01:52:16 +02:00
Marek Olšák	e1942c970f	radeonsi: rename readonly_memory -> can_speculate This is more accurate. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-29 01:52:16 +02:00
Dave Airlie	e1409f7302	Revert "amd/common: add vcn dec ip info query" This reverts commit `524d4fff9e`. This commit breaks amdgpu on kernels with no DEC IP support. Caught by the airlied CI system.	2017-05-26 16:36:57 +10:00
Dave Airlie	ae1f32915b	Revert "amd/common: set vcn dec as hw decode as well" This reverts commit `50d322be2f`. A previous patch breaks amdgpu on non-vcn decode systems, but we have to revert this first.	2017-05-26 16:36:38 +10:00
Leo Liu	50d322be2f	amd/common: set vcn dec as hw decode as well Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-05-25 11:40:20 -04:00
Leo Liu	524d4fff9e	amd/common: add vcn dec ip info query Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-05-25 11:40:20 -04:00
Leo Liu	c23ffafc50	radeon: rename has_uvd info to has_hw_decode Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-05-25 11:40:20 -04:00
Christian König	5318870f54	winsys/amdgpu: align VA allocations to fragment size v2 BOs larger than the minimum fragment size should have their VA aligned to at least the fragment size for optimal performance. v2: drop unused leftover from initial implementation Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-05-24 10:32:19 +02:00
Nicolai Hähnle	70215a23c6	ac: add missing extern "C" guards Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-05-18 11:48:53 +02:00
Nicolai Hähnle	6c01c4b907	ac: add radeon_info::num_{sdma,compute}_rings Vulkan needs them. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-05-18 11:48:53 +02:00
Nicolai Hähnle	c488bf24ed	ac: add radeon_surf::htile_slice_size Vulkan needs it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-05-18 11:48:52 +02:00
Nicolai Hähnle	98a2492290	ac_surface: use radeon_info from ac_gpu_info Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-05-18 11:48:52 +02:00
Nicolai Hähnle	988c866212	ac/radeonsi: move radeon_info initialization to amd/common v2: update Android.common.mk (Emil) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-05-18 11:48:52 +02:00
Nicolai Hähnle	de9dd4f9f1	ac/radeonsi: move struct radeon_info to ac_gpu_info.h Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-05-18 11:48:52 +02:00
Nicolai Hähnle	4d6e75776d	ac/radeonsi: move some aspects of sanity checking to ac_surface Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-05-18 11:48:52 +02:00
Nicolai Hähnle	00f466bad9	ac/radeonsi: add ac_compute_surface to automatically switch gfx6 vs. gfx9 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-05-18 11:48:52 +02:00
Nicolai Hähnle	8aabed64c3	ac/radeonsi: move the bulk of gfx9_surface_init to ac_surface We can now merge the two *_surface_init functions. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-05-18 11:48:51 +02:00
Nicolai Hähnle	db77cd879b	ac/radeonsi: move the bulk of gfx6_surface_init to ac_surface Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-05-18 11:48:51 +02:00
Nicolai Hähnle	f187a49322	ac/radeonsi: move amdgpu_addr_create to ac_surface v2: - update Android.common.mk (Emil) - rebase on top of Raven support Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2017-05-18 11:48:51 +02:00
Nicolai Hähnle	15a844986a	ac/radeonsi: move surface definitions to new header ac_surface.h Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-05-18 11:48:51 +02:00
Marek Olšák	bd4b224fa6	gallium/radeon: use a top-of-pipe timestamp for the start of TIME_ELAPSED Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-17 20:28:44 +02:00
Nicolai Hähnle	3accda4b82	ac/debug: handle index field in SET_*_REG correctly Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-05-16 16:11:53 +02:00
Marek Olšák	7622181cad	radeonsi/gfx9: add support for Raven Cc: 17.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-15 13:00:26 +02:00
Marek Olšák	efdb378c36	amd/addrlib: import Raven support Cc: 17.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-15 13:00:26 +02:00
Jason Ekstrand	b86dba8a0e	nir: Embed the shader_info in the nir_shader again Commit `e1af20f18a` changed the shader_info from being embedded into being just a pointer. The idea was that sharing the shader_info between NIR and GLSL would be easier if it were a pointer pointing to the same shader_info struct. This, however, has caused a few problems: 1) There are many things which generate NIR without GLSL. This means we have to support both NIR shaders which come from GLSL and ones that don't and need to have an info elsewhere. 2) The solution to (1) raises all sorts of ownership issues which have to be resolved with ralloc_parent checks. 3) Ever since `00620782c9`, we've been using nir_gather_info to fill out the final shader_info. Thanks to cloning and the above ownership issues, the nir_shader::info may not point back to the gl_shader anymore and so we have to do a copy of the shader_info from NIR back to GLSL anyway. All of these issues go away if we just embed the shader_info in the nir_shader. There's a little downside of having to copy it back after calling nir_gather_info but, as explained above, we have to do that anyway. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-05-09 15:07:47 -07:00
Marek Olšák	34bc470fa6	ac: fix broken elimination of duplicated VS exports The renumbering code didn't take into account that multiple VS exports can have the same PARAM index. This also significantly simplifies the renumbering. Thankfully, we have piglits for this: spec@arb_gpu_shader5@arb_gpu_shader5-interpolateatcentroid-packing spec@glsl-1.50@execution@interface-blocks-complex-vs-fs Reported by Michel Dänzer. Fixes: `b08715499e` ("ac: eliminate duplicated VS exports") Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-08 19:18:29 +02:00
Dave Airlie	a096d8d3f7	radv: enable POLARIS12 support. This just adds the chip in the right places. We don't set the partial_vs_wave workaround, as radeonsi doesn't, but have to confirm it's not required. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-05-05 11:07:40 +10:00
Marek Olšák	12beef0374	radeonsi: drop support for LLVM 3.8 LLVM 3.8: - had broken indirect resource indexing - didn't have scratch coalescing - was the last user of problematic v16i8 - only supported OpenGL 4.1 This leaves us with LLVM 3.9 and LLVM 4.0 support for Mesa 17.2. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-05 00:23:44 +02:00
Marek Olšák	4d32b4ac99	radeonsi: stop using v16i8 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-05 00:23:44 +02:00
Marek Olšák	283a1d1e27	radeonsi/gfx9: make some PA & DB registers match the closed Vulkan driver Cc: 17.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-05 00:23:44 +02:00
Marek Olšák	b08715499e	ac: eliminate duplicated VS exports Only very few shaders have them (from 48486 shaders): shaders/private/left_4_dead_2/765.shader_test - ac: 1 matches 2 shaders/private/left_4_dead_2/877.shader_test - ac: 1 matches 6 shaders/private/left_4_dead_2/2141.shader_test - ac: 1 matches 6 shaders/private/ue4_effects_cave/11.shader_test - ac: 4 matches 5 shaders/private/ue4_effects_cave/14.shader_test - ac: 5 matches 6 shaders/private/ue4_effects_cave/46.shader_test - ac: 5 matches 6 shaders/private/ue4_effects_cave/42.shader_test - ac: 4 matches 5 shaders/private/ue4_effects_cave/104.shader_test - ac: 4 matches 5 shaders/private/f1-2015/336.shader_test - ac: 3 matches 4 shaders/private/f1-2015/948.shader_test - ac: 6 matches 7 shaders/private/f1-2015/602.shader_test - ac: 0 matches 3 shaders/private/f1-2015/600.shader_test - ac: 0 matches 3 shaders/private/f1-2015/1214.shader_test - ac: 0 matches 1 shaders/private/f1-2015/988.shader_test - ac: 4 matches 5 shaders/private/ue4_elemental/149.shader_test - ac: 3 matches 4 shaders/private/ue4_elemental/346.shader_test - ac: 4 matches 5 shaders/private/ue4_elemental/178.shader_test - ac: 3 matches 4 shaders/private/ue4_elemental/136.shader_test - ac: 4 matches 5 shaders/private/ue4_elemental/168.shader_test - ac: 4 matches 5 shaders/private/ue4_elemental/690.shader_test - ac: 3 matches 4 shaders/private/ue4_elemental/19.shader_test - ac: 5 matches 6 shaders/private/dota2/1901.shader_test - ac: 0 matches 5 shaders/private/dota2/1357.shader_test - ac: 0 matches 5 shaders/private/dota2/1375.shader_test - ac: 0 matches 5 shaders/private/dota2/1369.shader_test - ac: 0 matches 5 shaders/private/dota2/1583.shader_test - ac: 0 matches 5 shaders/private/dota2/1811.shader_test - ac: 0 matches 5 shaders/private/dota2/1893.shader_test - ac: 0 matches 5 shaders/private/dota2/1533.shader_test - ac: 0 matches 5 shaders/private/dota2/1951.shader_test - ac: 0 matches 5 shaders/private/dota2/1361.shader_test - ac: 0 
matches 5 shaders/private/mad_max/2792.shader_test - ac: 0 matches 1 shaders/private/mad_max/2794.shader_test - ac: 0 matches 1 shaders/private/mad_max/2780.shader_test - ac: 0 matches 1 shaders/private/mad_max/2902.shader_test - ac: 0 matches 1 shaders/private/bioshock-infinite/3050.shader_test - ac: 3 matches 7 shaders/private/bioshock-infinite/2544.shader_test - ac: 3 matches 6 shaders/private/bioshock-infinite/3062.shader_test - ac: 3 matches 8 shaders/private/bioshock-infinite/2012.shader_test - ac: 3 matches 7 shaders/private/bioshock-infinite/3058.shader_test - ac: 3 matches 7 shaders/private/bioshock-infinite/3270.shader_test - ac: 3 matches 7 shaders/private/bioshock-infinite/732.shader_test - ac: 3 matches 7 shaders/private/bioshock-infinite/3026.shader_test - ac: 3 matches 7 shaders/private/bioshock-infinite/3258.shader_test - ac: 3 matches 6 shaders/private/bioshock-infinite/3198.shader_test - ac: 3 matches 6 shaders/private/bioshock-infinite/3046.shader_test - ac: 3 matches 7 shaders/private/bioshock-infinite/3168.shader_test - ac: 3 matches 6 shaders/private/bioshock-infinite/2550.shader_test - ac: 3 matches 6 shaders/private/bioshock-infinite/3210.shader_test - ac: 3 matches 6 shaders/private/bioshock-infinite/3032.shader_test - ac: 3 matches 6 shaders/private/bioshock-infinite/668.shader_test - ac: 3 matches 7 Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-03 20:55:00 +02:00


900 Commits