mirrors/mesa - Frog Git

Commit Graph

Author	SHA1	Message	Date
Nicolai Hähnle	74a26af913	amd/common/gfx10: add register JSON A small number of fields now need new disambiguation. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	536782b0b7	amd/common: add GFX10 chips Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Marek Olšák	78cdf9a99f	amd/addrlib: add gfx10 support Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Samuel Pitoiset	83297baf2d	ac: compute the DCC fast clear size per slice on GFX8 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-02 09:37:44 +02:00
Samuel Pitoiset	6517d226ac	ac: compute the size of one DCC slice on GFX8 Addrlib doesn't provide this info. Because DCC is linear, at least on GFX8, it's easy to compute the size of one slice. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-02 09:37:41 +02:00
Emil Velikov	4ec32413f3	ac: change ac_query_gpu_info() signature Currently libdrm_amdgpu provides a typedef of the various handles. While the goal was to make those opaque, it effectively became part of the API. To the best of my knowledge there are two ways to have opaque handles: - "typedef void foo;" - rather messy IMHO - "struct foo;" and use "struct foo " through the API. In our case amdgpu_device_handle is used only internally, plus respective code is not used or applicable for r300 and r600. Hence we copied the typedef. Seemingly this will be a problem since libdrm_amdgpu wants to change the API, while not updating the code(?). Either way, we can safely s/amdgpu_device_handle/void */ and carry on. Cc: Michel Dänzer <michel@daenzer.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak at amd.com>	2019-06-28 17:49:32 +01:00
Samuel Pitoiset	34bef8a0d7	radv: clear CMASK layers instead of the whole buffer on GFX8 This reduces the size of fill operations needed to clear CMASK for layered color textures. GFX9 unsupported for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-25 16:36:28 +02:00
Samuel Pitoiset	476b907a3b	radv: clear FMASK layers instead of the whole buffer on GFX8 This reduces the size of fill operations needed to clear FMASK for layered color textures. GFX9 unsupported for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-25 16:36:25 +02:00
Marek Olšák	ac4b1e2f0a	radeonsi: set the calling convention for inlined function calls otherwise the behavior is undefined Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-24 21:04:10 -04:00
Nicolai Hähnle	bd3a3fd25a	amd/rtld: update the ELF representation of LDS symbols The initial prototype used a processor-specific symbol type, but feedback suggests that an approach using processor-specific section name that encodes the alignment analogous to SHN_COMMON symbols is preferred. This patch keeps both variants around for now to reduce problems with LLVM compatibility as we switch branches around. This also cleans up the error reporting in this function. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-24 21:04:10 -04:00
Marek Olšák	0032f6b8a0	ac/surface: remove addrlib_family_rev_id Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-24 21:04:10 -04:00
Daniel Schürmann	0daeb1d127	amd/common: lower bitfield_extract to ubfe/ibfe. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Daniel Schürmann	48a75e7af0	amd/common: lower bitfield_insert to bfm & bitfield_select Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Nicolai Hähnle	21dd881416	ac/rtld: report better error messages for LDS overallocation Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-19 20:30:32 -04:00
Marek Olšák	b64bd5887e	ac/rtld: check correct LDS max size Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-19 20:30:32 -04:00
Nicolai Hähnle	1ee0f0d315	radeonsi: add s_sethalt to shaders for debugging Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-19 20:30:32 -04:00
Nicolai Hähnle	87182200c7	ac/rtld: fix sorting of LDS symbols by alignment Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-19 20:30:32 -04:00
Connor Abbott	53a7649e5d	ac/nir: Set speculatable for buffer loads where allowed This brings the nir path in line with the TGSI path. Totals from affected shaders: SGPRS: 2984 -> 2984 (0.00 %) VGPRS: 2792 -> 2652 (-5.01 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 247380 -> 248072 (0.28 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 121 -> 132 (9.09 %) Wait states: 0 -> 0 (0.00 %) Most of the change came from DiRT: Showdown, and came from sinking SSBO loads. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:28 +02:00
Connor Abbott	3bf8981c51	ac,radeonsi: Always mark buffer stores as inaccessiblememonly inaccessiblememonly means that it doesn't modify memory accessible via normal LLVM pointers. This lets LLVM's dead store elimination, memcpy forwarding, etc. ignore functions with this attribute. We don't represent descriptors as pointers, so this property is always true of buffer and image stores. There are plans to represent descriptors via pointers, but this just means that now nothing is inaccessiblememonly, as LLVM will then understand loads/stores via its usual alias analysis. Radeonsi was mistakenly only setting it if the driver could prove that there were no reads, and then it was cargo-culted into ac_llvm_build and ac_llvm_to_nir. Rip it out of everything. statistics with nir enabled: Totals from affected shaders: SGPRS: 152 -> 152 (0.00 %) VGPRS: 128 -> 132 (3.12 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 9324 -> 9244 (-0.86 %) bytes LDS: 2 -> 2 (0.00 %) blocks Max Waves: 17 -> 17 (0.00 %) Wait states: 0 -> 0 (0.00 %) The only difference was a manhattan31 shader. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-19 14:08:27 +02:00
Samuel Pitoiset	4c7ef1b02e	ac: make ac_compute_cmask() a static function Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 11:30:47 +02:00
Samuel Pitoiset	b5012a0518	ac: update llvm.amdgcn.icmp intrinsic name for LLVM 9+ LLVM r363339 changed llvm.amdgcn.icmp.i* to llvm.amdgcn.icmp.i64.i*. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 08:58:33 +02:00
Marek Olšák	abe9a51d27	ac: add radeon_info::is_amdgpu instead of checking drm_major == 3 and clean up Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-14 13:31:18 -04:00
Daniel Schürmann	deedc0b31d	amd/common: add support for AMD_shader_ballot functions Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Nicolai Hähnle	f8315ae04b	amd/rtld: layout and relocate LDS symbols Upcoming changes to LLVM will emit LDS objects as symbols in the ELF symbol table, with relocations that will be resolved with this change. Callers will also be able to define LDS symbols that are shared between shader parts. This will be used by radeonsi for the ESGS ring in gfx9+ merged shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	1ff2440eee	amd/common: use ARRAY_SIZE for the LLVM command line options This is more convenient for changing it around during debug. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	3c958d924a	amd/common: add ac_compile_module_to_elf A new variant of ac_compile_module_to_binary that allows us to keep the entire ELF around. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	77b05cc42d	radeonsi: use ac_shader_config Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	b3be346c68	amd/common: add a more powerful runtime linker Using an explicit linker instead of just concatenating .text sections will allow us to start using .rodata sections and explicit descriptions of data on LDS that is shared between stages. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	c129cb3861	amd/common: clarify ac_shader_binary::lds_size Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 18:33:21 -04:00
Nicolai Hähnle	2e96c01073	amd/common: extract ac_parse_shader_binary_config Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 18:33:08 -04:00
Marek Olšák	4773f5a293	radeonsi: use the ac helper for index buffer stores in the culling shader	2019-06-11 20:05:21 -04:00
Connor Abbott	9d93d2a404	ac/nir: Remove stale TODO While we're here, copy the comment explaining this from radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-06 17:14:28 +02:00
Marek Olšák	ff63b99531	ac: rename LLVM <= 7 helpers for readability Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-04 18:53:46 -04:00
Marek Olšák	c9b64b58de	ac: fix a typo in ac_build_wg_scan_bottom Cc: 19.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-04 18:53:46 -04:00
Rhys Perry	73dda85512	ac/nir: mark some texture intrinsics as convergent Otherwise LLVM can sink them and their texture coordinate calculations into divergent branches. v2: simplify the conditions on which the intrinsic is marked as convergent v3: only mark as convergent in FS and CS with derivative groups Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-04 17:30:53 +01:00
Samuel Pitoiset	33f4e04d5a	ac,radv: do not emit vec3 for raw load/store on SI It's unsupported, only load/store format with vec3 are supported. Fixes: `6970a9a6ca` ("ac,radv: remove the vec3 restriction with LLVM 9+") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-04 08:47:26 +02:00
Marek Olšák	b2bbd1a27b	ac/registers: don't use the si, cik, vi names, use gfxN trivial	2019-06-03 20:06:41 -04:00
Nicolai Hähnle	f480b8aaa4	amd/common: use generated register header	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	cf51009ad2	amd/common: unify PITCH_GFX6 and PITCH_GFX9 The definition of the fields differs, but PITCH_GFX9 is a mere extension of PITCH_GFX6 that does not conflict with any other fields. This aligns the definitions with what will be generated from the register JSON. The information about how large the fields really are is preserved in the register database.	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	e04215815e	amd/common: rename R_3F2_CONTROL to IB_CONTROL for disambiguation This "register" name collides with R_370_CONTROL. This aligns the definitions with what will be generated from the register JSON.	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	cd247cf456	amd/common: cleanup DATA_FORMAT/NUM_FORMAT field names The field layout wasn't actually changed in gfx9, so having the suffix isn't very useful. The field contents were changed, but this is reflected in the V_xxx_xxx definitions and is taken into account by the ac_debug logic based on the register JSON. This aligns the definitions with what will be generated from the register JSON.	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	ef6ef098af	amd/common: derive ac_debug tables from register JSON	2019-06-03 20:05:20 -04:00
Marek Olšák	486bc1e17e	ac: use amdgpu-flat-work-group-size Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-03 14:32:47 -04:00
Samuel Pitoiset	6970a9a6ca	ac,radv: remove the vec3 restriction with LLVM 9+ This change requires LLVM r356755. 32706 shaders in 16744 tests Totals: SGPRS: 1448848 -> 1455984 (0.49 %) VGPRS: 1016684 -> 1016220 (-0.05 %) Spilled SGPRs: 25871 -> 25815 (-0.22 %) Spilled VGPRs: 122 -> 122 (0.00 %) Scratch size: 11964 -> 11956 (-0.07 %) dwords per thread Code Size: 55324500 -> 55301152 (-0.04 %) bytes Max Waves: 235660 -> 235586 (-0.03 %) Totals from affected shaders: SGPRS: 293704 -> 300840 (2.43 %) VGPRS: 246716 -> 246252 (-0.19 %) Spilled SGPRs: 159 -> 103 (-35.22 %) Scratch size: 188 -> 180 (-4.26 %) dwords per thread Code Size: 8653664 -> 8630316 (-0.27 %) bytes Max Waves: 60811 -> 60737 (-0.12 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 11:30:08 +02:00
Marek Olšák	b257956021	ac: treat Mullins as Kabini, remove the enum it's the same design	2019-05-27 15:10:51 -04:00
Jason Ekstrand	f2dc0f2872	nir: Drop imov/fmov in favor of one mov instruction The difference between imov and fmov has been a constant source of confusion in NIR for years. No one really knows why we have two or when to use one vs. the other. The real reason is that they do different things in the presence of source and destination modifiers. However, without modifiers (which many back-ends don't have), they are identical. Now that we've reworked nir_lower_to_source_mods to leave one abs/neg instruction in place rather than replacing them with imov or fmov instructions, we don't need two different instructions at all anymore. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Acked-by: Rob Clark <robdclark@chromium.org>	2019-05-24 08:38:11 -05:00
Samuel Pitoiset	d7501834cd	radv: add a workaround for Monster Hunter World and LLVM 7&8 The load/store optimizer pass doesn't handle WaW hazards correctly and this is the root cause of the reflection issue with Monster Hunter World. AFAIK, it's the only game that is affected by this issue. This is fixed with LLVM r361008, but we need a workaround for older LLVM versions unfortunately. Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-17 11:41:19 +02:00
Marek Olšák	9d1485554c	ac: match radeonsi code in ac_shader_binary_read_config	2019-05-16 13:15:36 -04:00
Marek Olšák	894e017c9c	r600+radeonsi: use ctx_query_reset_status on radeon This allows a nice cleanup, because the winsys always handles it.	2019-05-16 13:15:36 -04:00
Marek Olšák	b19884e08e	winsys/amdgpu: add a parallel compute IB coupled with a gfx IB Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:07:00 -04:00
Marek Olšák	eda281e977	ac: add LLVM code for triangle culling Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:06:58 -04:00
Marek Olšák	ccfcb9d818	ac: rename SI-CIK-VI to GFX6-GFX7-GFX8 Acked-by: Dave Airlie <airlied@redhat.com> We already use GFX9 and I don't want us to have confusing naming in the driver. GFXn naming is better from the driver perspective, because it's the real version of the gfx portion of the hw. Also, CIK means Bonaire-Kaveri-Kabini, it doesn't mean CI. It shouldn't confuse our SDMA, UVD, VCE etc. code much. Those have nothing to do with GFXn and they have their own version numbers.	2019-05-15 20:54:10 -04:00
Marek Olšák	e5cc363f43	ac: add comments to chip enums Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (except GFX2 changes) Reviewed-by: Dave Airlie <airlied@redhat.com> (except <= GFX5 changes)	2019-05-15 20:54:10 -04:00
Marek Olšák	6b0b8f132a	ac: use 1D GEPs for descriptors and constants just a cleanup Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-14 15:15:11 -04:00
Nicolai Hähnle	81fe33735a	amd/common: add ac_build_opencoded_fetch_format Implement software emulation of buffer_load_format for all types required by vertex buffer fetches. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-13 17:07:23 +02:00
Samuel Pitoiset	4f18c43d1d	radv: apply the indexing workaround for atomic buffer operations on GFX9 Because the new raw/struct intrinsics are buggy with LLVM 8 (they weren't marked as source of divergence), we fall back to the old intrinsics for atomic buffer operations only. This means we need to apply the indexing workaround for GFX9. The load/store operations still use the new LLVM 8 intrinsics. The fact that we need another workaround is painful but we should be able to clean that up a bit once LLVM 7 support is dropped. This fixes a GPU hang with AC Odyssey and some rendering problems with Nioh. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110573 Fixes: `31164cf5f7` ("ac/nir: only use the new raw/struct image atomic intrinsics with LLVM 9+") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-03 17:59:12 +02:00
Samuel Pitoiset	492e828848	ac: tidy up ac_build_llvm8_tbuffer_{load,store} For consistency with ac_build_llvm8_buffer_{load,store}_common helpers and that will help a bit for removing the vec3 restriction. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-02 09:24:05 +02:00
Eric Engestrom	7ca8ba199f	delete autotools .gitignore files One special case, `src/util/xmlpool/.gitignore` is not entirely deleted, as `xmlpool.pot` still gets generated (eg. by `ninja xmlpool-pot`). Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-29 21:17:19 +00:00
Rhys Perry	bd4c661ad0	ac,ac/nir: use a better sync scope for shared atomics https://reviews.llvm.org/rL356946 (present in LLVM 9 and later) changed the meaning of the "system" sync scope, making it no longer restricted to the memory operation's address space. So a single address space sync scope is needed for shared atomic operations (such as "system-one-as" or "workgroup-one-as") otherwise buffer_wbinvl1 and s_waitcnt instructions can be created at each shared atomic operation. This mostly reimplements LLVMBuildAtomicRMW and LLVMBuildAtomicCmpXchg to allow for more sync scopes and uses the new functions in ac->nir with the "workgroup-one-as" or "workgroup" sync scopes. F1 2017 (4K, Ultra High settings, TAA), avg FPS : 59 -> 59.67 (+1.14%) Strange Brigade (4K, ~highest settings), avg FPS : 51.5 -> 51.6 (+0.19%) RotTR/mountain (4K, VeryHigh settings, FXAA), avg FPS : 57.2 -> 57.2 (+0.0%) RotTR/tomb (4K, VeryHigh settings, FXAA), avg FPS : 42.5 -> 43.0 (+1.17%) RotTR/valley (4K, VeryHigh settings, FXAA), avg FPS : 40.7 -> 41.6 (+2.21%) Warhammer II/fallen, avg FPS : 31.63 -> 31.83 (+0.63%) Warhammer II/skaven, avg FPS : 37.77 -> 38.07 (+0.79%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-29 18:20:44 +01:00
Bas Nieuwenhuizen	427024bf2e	ac/nir: Add support for planes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 19:56:20 +00:00
Marek Olšák	2313176817	ac: add REWIND and GDS registers to register headers Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:28:56 -04:00
Marek Olšák	35cd57df2e	ac: add ac_get_i1_sgpr_mask Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:28:56 -04:00
Marek Olšák	bfb9287599	ac: add radeon_info::is_pro_graphics Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:28:56 -04:00
Marek Olšák	64d6cc982d	ac: add radeon_info::marketing_name, replacing the winsys callback Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:28:56 -04:00
Samuel Pitoiset	2b515a8259	ac/nir: use the new raw/struct SSBO atomic intrinsics for comp_swap This is actually fixed now. This change requires LLVM r358579. Make sure to have it in your tree, otherwise the following piglit will hang: tests/spec/arb_shader_storage_buffer_object/execution/ssbo-atomicCompSwap-int.shader_test Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-19 09:20:15 +02:00
Samuel Pitoiset	895e10d2db	ac/nir: only use the new raw/struct SSBO atomic intrinsics with LLVM 9+ They are buggy with older LLVM version, see r358579. Fixes: `78c551aca1` ("ac/nir: use new LLVM 8 intrinsics for SSBO atomics except cmpswap") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-19 09:20:13 +02:00
Samuel Pitoiset	31164cf5f7	ac/nir: only use the new raw/struct image atomic intrinsics with LLVM 9+ They are buggy with LLVM 8 because they weren't marked as source of divergence, see r358579. Fixes: `dd0172e865` ("radv: Use structured intrinsics instead of indexing workaround for GFX9.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-19 09:20:09 +02:00
Samuel Pitoiset	ad6dc13fc7	ac: use struct/raw store intrinsics for 8-bit/16-bit int with LLVM 9+ This change requires LLVM r356465. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-17 22:10:30 +02:00
Samuel Pitoiset	26ea506235	ac: use struct/raw load intrinsics for 8-bit/16-bit int with LLVM 9+ This change requires LLVM r356465. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-17 22:10:28 +02:00
Samuel Pitoiset	6fd5e39b60	ac: add support for more types with struct/raw LLVM intrinsics LLVM 9+ now supports 8-bit and 16-bit types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-17 22:10:25 +02:00
Samuel Pitoiset	d118e382dd	ac/nir: add 64-bit SSBO atomic operations support Except compare&swap which is still buggy. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-17 21:59:54 +02:00
Samuel Pitoiset	78c551aca1	ac/nir: use new LLVM 8 intrinsics for SSBO atomics except cmpswap Use the raw version (ie. IDXEN=0) because vindex is unused. Use the old intrinsic for compare&swap because the new one hangs the GPU for some reason. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-17 21:59:52 +02:00
Bas Nieuwenhuizen	af9534b9f3	ac: Move has_local_buffers disable to radeonsi. In radv we had a separate flag to actually use it + an env option to experimentally use it. The common code setting has_local_buffers to false of course broke that experimental option. Also the "enable on APU" did not make sense for RADV as it is still disabled by default. Fixes: `b21a4efb55` "radv/winsys: allow local BOs on APUs" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-15 20:39:28 +02:00
Marek Olšák	dbab755ecf	ac: fix incorrect bindless atomic code in visit_image_atomic Coverity: CID 1444664 Fixes: `d62d434fe9` ("ac/nir_to_llvm: add image bindless support") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-15 12:52:02 -04:00
Rhys Perry	8671cfe2a2	nir,ac/nir: fix cube_face_coord Seems it was missing the "/ ma + 0.5" and the order was swapped. Fixes: `a1a2a8dfda` ('nir: add AMD_gcn_shader extended instructions') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-15 17:22:47 +01:00
Karol Herbst	14531d676b	nir: make nir_const_value scalar v2: remove & operator in a couple of memsets add some memsets v3: fixup lima Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)	2019-04-14 22:25:56 +02:00
Karol Herbst	adb2263014	amd/nir: some cleanups Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-14 22:25:56 +02:00
Marek Olšák	f4ae188d50	ac: use the common helper ac_apply_fmask_to_sample Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-12 11:35:31 -04:00
Marek Olšák	971bc10177	radeonsi: set AC_FUNC_ATTR_READNONE for image opcodes where it was missing Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-12 11:34:39 -04:00
Samuel Pitoiset	6718bb57ac	ac/nir: remove some useless integer casts for ALU operations Sources are always cast to integers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:55 +02:00
Samuel Pitoiset	8a6442075f	ac/nir: remove useless integer cast in visit_image_load() ac_build_image_opcode() casts if necessary and buffer images are cast too. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:55 +02:00
Samuel Pitoiset	ffbb62f808	ac/nir: remove useless integer cast in adjust_sample_index_using_fmask() It's already cast if necessary in ac_build_image_opcode(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:55 +02:00
Samuel Pitoiset	7b5b27a685	ac/nir: remove useless LLVMGetUndef for nir_op_pack_64_2x32_split Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:55 +02:00
Samuel Pitoiset	fd4041987b	ac: add ac_build_load_helper_invocation() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:55 +02:00
Samuel Pitoiset	590a4c8981	ac: add ac_build_ddxy_interp() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:55 +02:00
Samuel Pitoiset	4cb13e9462	ac: add ac_build_umax() and use it where possible This changes the predicate from LessThan to Equal. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:55 +02:00
Samuel Pitoiset	cf88bfa75a	ac/nir: make use of ac_build_umin() where possible This changes the predicate from LessThan to Equal. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:54 +02:00
Samuel Pitoiset	15dd81913f	ac/nir: make use of ac_build_imin() where possible This changes the predicate from LessThan to Equal. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:54 +02:00
Samuel Pitoiset	d7a0c0d53b	ac/nir: make use of ac_build_imax() where possible Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 17:30:54 +02:00
Timothy Arceri	d62d434fe9	ac/nir_to_llvm: add image bindless support With this all piglit bindless image tests pass on radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Timothy Arceri	55fb93b586	ac/nir_to_llvm: make get_sampler_desc() more generic and pass it the image intrinsic This will be required by the bindless support in the following patches. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Karol Herbst	d7bbb3caf1	glsl_to_nir: handle bindless textures v2: add support for AMD Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Rhys Perry	fd1fc255d9	ac: add 16-bit support to ac_build_ddxy() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-10 09:05:58 +02:00
Samuel Pitoiset	bc6d486c78	ac/nir: fix nir_op_b2f16 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-10 09:05:55 +02:00
Samuel Pitoiset	27b8f3ecc3	ac/nir: fix intrinsic names for atomic operations with LLVM 9+ This fixes the following LLVM error when using RADV_DEBUG=checkir: Intrinsic name not mangled correctly for type arguments! Should be: llvm.amdgcn.buffer.atomic.add.i32 i32 (i32, <4 x i32>, i32, i32, i1)* @llvm.amdgcn.buffer.atomic.add The cmpswap operation still uses the old intrinsic. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-08 13:16:50 +02:00
Marek Olšák	b563460b49	radeonsi: enable displayable DCC on Ravens	2019-04-04 09:53:24 -04:00
Marek Olšák	1f21396431	radeonsi: add support for displayable DCC for multi-RB chips A compute shader is used to reorder DCC data from aligned to unaligned.	2019-04-04 09:53:24 -04:00
Marek Olšák	2c09eb4122	radeonsi: add support for displayable DCC for 1 RB chips This is the simpler codepath - just disable RB and pipe alignment for DCC.	2019-04-04 09:53:24 -04:00
Samuel Pitoiset	d099bc5829	ac: add 8-bit and 64-bit support to ac_build_bitfield_reverse() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 18:53:57 +02:00
Samuel Pitoiset	2cecf6c5cc	ac: add 8-bit support to ac_build_umsb() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 18:53:55 +02:00
Samuel Pitoiset	a45d9e3e8d	ac: add 8-bit support to ac_find_lsb() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 18:53:53 +02:00
Samuel Pitoiset	89cf8ca0ae	ac: add 8-bit support to ac_build_bit_count() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 18:53:52 +02:00
Samuel Pitoiset	869af0464a	ac/nir: add support for nir_op_b2i8 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 18:53:49 +02:00
Samuel Pitoiset	4d5fce29c3	ac: fix ac_build_umsb() for 16-bit integer type Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 09:51:56 +02:00
Samuel Pitoiset	7a088d1ac8	ac: fix ac_find_lsb() for 16-bit integer type Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 09:51:54 +02:00
Samuel Pitoiset	b16dffff23	ac: fix ac_build_bitfield_reverse() for 16-bit integer type Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 09:51:52 +02:00
Samuel Pitoiset	9d13b9e53e	ac: fix ac_build_bit_count() for 16-bit integer type Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 09:51:49 +02:00
Samuel Pitoiset	e39a6b940f	ac/nir: fix nir_op_b2i16 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-01 09:51:47 +02:00
Timothy Arceri	4478c5374b	Revert "ac/nir: use new LLVM 8 intrinsics for SSBO atomic operations" This reverts commit `29132af234`. It seems the new intrinsic causes a hang on radeonsi (VEGA) when running the piglit test: tests/spec/arb_shader_storage_buffer_object/execution/ssbo-atomicCompSwap-int.shader_test	2019-03-29 21:04:01 +11:00
Samuel Pitoiset	cc752dea61	ac: fix return type for llvm.amdgcn.frexp.exp.i32.64 This fixes the following piglit with RadeonSI tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-29 09:18:24 +01:00
Samuel Pitoiset	52c02d921f	ac: add ac_build_frex_exp() helper and 16-bit/32-bit support Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-28 13:02:48 +01:00
Samuel Pitoiset	1bf9311c59	ac: add ac_build_frexp_mant() helper and 16-bit/32-bit support Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-28 13:02:46 +01:00
Samuel Pitoiset	d6a07732c9	ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-27 14:45:52 +01:00
Nicolai Hähnle	e16ac33f37	amd/surface: provide firstMipIdInTail for metadata surface calculations This field was added in a recent addrlib update, and while there currently seems to be no issue with skipping it, we will have to set it correctly in the future. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-03-26 10:00:55 +01:00
Bas Nieuwenhuizen	82075e3c42	ac/nir: Return frag_coord as integer. To preserve the invariant that nir ssa defs are integers or pointers in LLVM. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-03-26 09:41:15 +01:00
Rhys Perry	f736250ab4	ac/nir: implement 16-bit pack/unpack opcodes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-03-22 12:50:16 +01:00
Samuel Pitoiset	00327f827f	ac: fix incorrect argument type for tbuffer.{load,store} with LLVM 7 GLC/SLC are boolean. This fixes the following LLVM error when checkir is set: Intrinsic has incorrect argument type! void (i32, <4 x i32>, i32, i32, i32, i32, i32, i32, i32, i32)* @llvm.amdgcn.tbuffer.store.i32 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 14:02:00 +01:00
Samuel Pitoiset	20cac1f498	ac: fix 16-bit shifts This fixes the following LLVM error when checkir is set: Type too small for ZExt Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 14:01:58 +01:00
Samuel Pitoiset	2ac5c5c1b5	ac: add 16-bit support to fract Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 12:13:09 +01:00
Samuel Pitoiset	0eb1478ac2	ac: add 16-bit support to fsign Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 12:13:07 +01:00
Samuel Pitoiset	ff11c9dcc7	ac: add f16_0 and f16_1 constants Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 12:13:05 +01:00
Rhys Perry	3cc72a88d8	ac/nir: implement 8-bit conversions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:25 +01:00
Rhys Perry	c73f8b6576	ac/nir: add 8-bit types to glsl_base_to_llvm_type Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:22 +01:00
Rhys Perry	9c5067acf1	ac/nir: implement 8-bit ssbo stores Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:20 +01:00
Samuel Pitoiset	b235d77e18	ac: add ac_build_tbuffer_store_byte() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:18 +01:00
Rhys Perry	b12e074b89	ac/nir: implement 8-bit push constant, ssbo and ubo loads Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:16 +01:00
Samuel Pitoiset	104dbc64a5	ac: add ac_build_tbuffer_load_byte() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:14 +01:00
Samuel Pitoiset	6e632eb24b	ac: add various int8 definitions Original patch by Rhys Perry. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-21 09:02:10 +01:00
Samuel Pitoiset	72e366b4c2	ac: use new LLVM 8 intrinsics in ac_build_buffer_store_dword() New buffer intrinsics have a separate soffset parameter. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:19:19 +01:00
Samuel Pitoiset	9d960c17a8	ac: use new LLVM 8 intrinsic when storing 16-bit values vindex is always 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:19:14 +01:00
Samuel Pitoiset	2a9d331898	ac: add ac_build_{struct,raw}_tbuffer_store() helpers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:19:12 +01:00
Samuel Pitoiset	30c2aca67f	ac: use new LLVM 8 intrinsics in ac_build_buffer_load() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:19:08 +01:00
Samuel Pitoiset	da46dbb1be	ac/nir: use ac_build_buffer_store_dword() for SSBO store operations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:19:06 +01:00
Samuel Pitoiset	6b573c00c9	ac/nir: use ac_build_buffer_load() for SSBO load operations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:19:02 +01:00
Samuel Pitoiset	29132af234	ac/nir: use new LLVM 8 intrinsics for SSBO atomic operations Use the raw version (ie. IDXEN=0) because vindex is unused. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:18:56 +01:00
Samuel Pitoiset	b39844457f	ac/nir: remove one useless check in visit_store_ssbo() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:18:54 +01:00
Samuel Pitoiset	a2073f49f1	ac: add ac_build_buffer_store_format() helper Similar to ac_build_buffer_load_format(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:18:50 +01:00
Samuel Pitoiset	4debe49d44	ac/nir: set attrib flags for SSBO and image store operations For consistency regarding other store operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:18:37 +01:00
Samuel Pitoiset	1b553dd47f	ac: make use of ac_get_store_intr_attribs() where possible Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 22:18:35 +01:00
Samuel Pitoiset	f4f0e3a395	ac: use llvm.amdgcn.fract intrinsic for nir_op_ffract Noticed with a Doom shader. 29077 shaders in 15096 tests Totals: SGPRS: 1282125 -> 1282133 (0.00 %) VGPRS: 908716 -> 908616 (-0.01 %) Spilled SGPRs: 24811 -> 24779 (-0.13 %) Code Size: 49048176 -> 48936488 (-0.23 %) bytes Max Waves: 244232 -> 244226 (-0.00 %) Totals from affected shaders: SGPRS: 229584 -> 229592 (0.00 %) VGPRS: 163268 -> 163168 (-0.06 %) Spilled SGPRs: 8682 -> 8650 (-0.37 %) Code Size: 12819572 -> 12707884 (-0.87 %) bytes Max Waves: 24398 -> 24392 (-0.02 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-20 09:06:35 +01:00
Timothy Arceri	010570c8e3	ac/nir_to_llvm: add assert to emit_bcsel() nir to llvm assumes we have already split vectors to scalars via nir_lower_alu_to_scalar(). Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-18 09:39:04 +11:00
Samuel Pitoiset	cbf022cb31	ac: use the raw tbuffer version for 16-bit SSBO loads vindex is always 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-13 14:16:14 +01:00
Samuel Pitoiset	045fae0f73	ac: add ac_build_{struct,raw}_tbuffer_load() helpers The struct version sets IDXEN=1, while the raw version sets IDXEN=0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-13 14:15:05 +01:00
Samuel Pitoiset	489dac0d21	ac: rework typed buffers loads for LLVM 7 Be more generic, this will be used by an upcoming series. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-13 13:31:06 +01:00
Rhys Perry	0f025bbccc	ac/nir: fix 16-bit ssbo stores Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-03-12 15:51:52 +01:00
Timothy Arceri	54522d0506	nir: rename glsl_type_is_struct() -> glsl_type_is_struct_or_ifc() Replace done using: find ./src -type f -exec sed -i -- \ 's/glsl_type_is_struct(/glsl_type_is_struct_or_ifc(/g' {} \; Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 13:10:02 +11:00
Timothy Arceri	8294295dbd	glsl: rename record_location_offset() -> struct_location_offset() Replace done using: find ./src -type f -exec sed -i -- \ 's/record_location_offset(/struct_location_offset(/g' {} \; Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-06 13:10:02 +11:00
Bas Nieuwenhuizen	a1fdd4a4a7	radv: Fix float16 interpolation set up. float16 types can have non-flat interpolation so set up the HW correctly for that. Fixes: `62024fa775` "radv: enable VK_KHR_16bit_storage extension / 16bit storage features" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-22 17:06:55 +01:00
Bas Nieuwenhuizen	1ef2855692	radv: Handle clip+cull distances more generally as compact arrays. Needed for https://gitlab.freedesktop.org/mesa/mesa/merge_requests/248 . That MR keeps the clip and cull arrays split. So we have to handle - compact arrays with location_frac != 0 - VARYING_SLOT_CLIP_DIST1 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-20 22:49:52 +00:00
Kenneth Graunke	ba7519ca36	radeonsi: Go back to using llvm.pow intrinsic for nir_op_fpow ARB_vertex_program and ARB_fragment_program define 0^0 = 1 (while GLSL leaves it undefined). Performing fpow lowering in NIR would break this behavior, preventing us from using prog_to_nir. According to llvm/lib/Target/AMDGPU/SIInstructions.td, POW_common expands to <V_LOG_F32_e32, V_EXP_F32_e32, V_MUL_LEGACY_F32_e32>, which presumably does a zero-wins multiply. Lowering in NIR results in a non-legacy multiply, where: pow(0, 0) = 2^(log2(0) * 0) = 2^(-INF * 0) = 2^(-NaN) = -NaN which isn't the desired result. This reverts: - commit `d6b7539206` (ac/nir: remove emission of nir_op_fpow) - commit `22430224fe` (radeonsi/nir: enable lowering of fpow) and prevents a regression in gl-1.0-spot-light with AMD_DEBUG=nir after enabling prog_to_nir in st/mesa later in this series. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-19 15:56:19 -08:00
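
The 0^0 argument in commit ba7519ca36 can be illustrated with a small sketch. It is not Mesa or LLVM code: the helper names (`log2_hw`, `mul_ieee`, `mul_legacy`, `pow_lowered`, `pow_legacy`) are hypothetical, and the "zero-wins" behaviour of the legacy multiply is modelled as the commit message presumes it to be, not taken from hardware documentation.

```python
import math

def log2_hw(x):
    # Assumption: hardware-style log2 returns -inf for 0 instead of raising,
    # whereas Python's math.log2(0.0) raises ValueError.
    return -math.inf if x == 0.0 else math.log2(x)

def mul_ieee(a, b):
    # Ordinary IEEE multiply: -inf * 0.0 yields NaN.
    return a * b

def mul_legacy(a, b):
    # Assumed "zero-wins" multiply: any zero operand forces the result to 0.
    if a == 0.0 or b == 0.0:
        return 0.0
    return a * b

def pow_lowered(x, y):
    # pow expressed as 2^(log2(x) * y) with a non-legacy multiply.
    return 2.0 ** mul_ieee(log2_hw(x), y)

def pow_legacy(x, y):
    # Same expansion, but with the zero-wins multiply.
    return 2.0 ** mul_legacy(log2_hw(x), y)

print(pow_lowered(0.0, 0.0))  # NaN: the result the commit wants to avoid
print(pow_legacy(0.0, 0.0))   # 1.0: matches the ARB_vertex_program rule
```

Under these assumptions the two expansions differ only at the multiply, which is why the choice of multiply decides whether pow(0, 0) comes out as 1 or NaN.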

1330 Commits