mirrors/mesa - Frog Git

Commit Graph

Author	SHA1	Message	Date
Dave Airlie	303d22f319	radv/ac: round cube array coordinate before fixup. This fixes: dEQP-VK.glsl.texture_functions.texture.samplercubearray* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-04 05:39:07 +10:00
Dave Airlie	5821f676ee	radv: move to using common buffer load format. Get rid of usage of SI.vs.load.input. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-04 05:37:52 +10:00
Dave Airlie	cb1518e96b	radv/ac: setup lds for tessellation This seems to get lost in the rebases, should fix the tessellation demos, crash in llvm. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:17:15 +10:00
Dave Airlie	aaabdd6bc6	radv/ac: handle writing out tess factors. This ports the code from radeonsi to build the if/endif, and ports the tess factor emission code. This code has an optimisation TODO that we can deal with later. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:47 +10:00
Dave Airlie	94f9591995	radv/ac: add support for TCS/TES inputs/outputs. This adds support for the tessellation inputs/outputs to the shader compiler, this is one of the main pieces of the patch. It is very similar to the radeonsi code (post merge we should consider if there are better sharing opportunities). The main difference from radeonsi is that we can have "compact" varyings for clip/cull/tess factors, and we have to add special handling for these. This consists of treating the const index from the deref differently depending on the compactness. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:42 +10:00
Dave Airlie	5ab1289b48	radv/ac: add clip support for tess eval shader. As this may be the last shader to emit clip distances. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:37 +10:00
Dave Airlie	326b9bc6dc	radv/ac: hook up tessellation intrinsics. This just adds support for the nir intrinsics that tessellation uses. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:32 +10:00
Dave Airlie	d8ab71b207	radv/ac: hook up shader information handling for tessellation This hooks up the tessellation shader info to the nir values and ctx generated ones. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:27 +10:00
Dave Airlie	5b40eab00a	radv: add tess ctrl stage barrier workaround for SI. This just ports the workaround from radeonsi. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:04 +10:00
Dave Airlie	3a633cc2cb	radv/ac: add support for patch inputs to unique index code. This adds support for tessellation patch inputs to the code that finds the unique parameter index. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:15:57 +10:00
Dave Airlie	60326a7afc	radv/ac: setup tessellation shader inputs. This just configures all the register inputs for the tessellation related stages. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:15:41 +10:00
Dave Airlie	3968162751	radv/ac: setup tess rings on compiler side. This just sets up the necessary pointers on the compiler side for the rings needed for tessellation. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:15:35 +10:00
Dave Airlie	2b3c4bcc1f	radv/ac: add tess changes to shader keys/info This adds the tess pieces for shader keys and shader info, it adds the necessary bits to the vertex key/info as well. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:15:22 +10:00
Dave Airlie	a5136a97f7	radv: use defines for ring descriptor offsets. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:15:12 +10:00
Dave Airlie	97e0ff30c0	radv: handle clip dist in es outputs. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:14:53 +10:00
Dave Airlie	6279646306	radv: drop unneeded start Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:14:39 +10:00
Dave Airlie	a58d03a5a2	radv: fixup geometry clip emission since using the geom pass Fixes: 2b35b60d: radv: move to using nir clip/cull merge pass. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:14:38 +10:00
Marek Olšák	d60f72a9f0	radeonsi/gfx9: image descriptor changes in immutable fields The border color swizzle logic was copied from Vulkan. It doesn't make any sense to me, but it passes all piglits except the stencil ones. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	2862300d9e	radeonsi/gfx9: init_config changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	71ad666414	radeonsi/gfx9: CP DMA changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	ef97cc0cae	radeonsi/gfx9: add IB parser support Both GFX6 and GFX9 fields are printed next to each other in parsed IBs. The Python script parses both headers like one stream and tries to merge all definitions. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	68d6d097f1	radeonsi/gfx9: add GFX9 and VEGA10 enums Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	5691e14735	amd: GFX9 packet changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	ecbdfbeb05	amd: define event types for GFX9 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	00e777b61c	amd: add texture format definitions for GFX9 the DATA_FORMAT and NUM_FORMAT fields are the same, but some of the enums differ, thus add GFX6 and GFX9 suffixes, so that the IB parser can show enums for both. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	e6c520362d	amd: resolve remaining definition conflicts with gfx9d.h Add _GFX6 and _GFX9 suffixes to conflicting definitions. sid.h and gfx9d.h can now be included in the same file. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	7e7043c31c	amd: normalize register definition formatting This resolves trivial conflicts with gfx9d.h caused by different formatting. Some fields are also renamed. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	db04d4ccaa	amd: import GFX9 register definitions Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	7f160efcde	amd/addrlib: import gfx9 support	2017-03-30 14:44:33 +02:00
Dave Airlie	a930c2c612	radv: fix mask attribs properly. some days it just doesn't pay to get out of bed. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-30 13:09:30 +10:00
Dave Airlie	aa27a9f687	radv: fix regression with mask attrib setting code. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-30 12:07:32 +10:00
Dave Airlie	2b35b60df1	radv: move to using nir clip/cull merge pass. Doing this before tessellation makes doing some bits of tessellation a bit cleaner. It also cleans up a bit of the llvm generator code. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-30 11:04:56 +10:00
Dave Airlie	d43691ce77	radv: add parameter to emit_waitcnt. This is just a precursor for tess support, which needs to pass different values here. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-28 17:40:03 +10:00
Dave Airlie	931a8d0c9a	radv: rework vertex/export shader output handling In order to facilitate adding tess support, split the vs/es output info into a separate block, so we make it easier to have the tess shaders export the same info. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-28 17:39:59 +10:00
Emil Velikov	95ab07c586	ac: consistently use ifndef guards over pragma once Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:21 +00:00
Marek Olšák	84012262ea	ac: fix build with LLVM 5.0svn Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-22 17:54:42 +01:00
Alex Smith	ce4058dafd	radv/ac: Fix shared memory offset calculation The index passed to get_shared_memory_ptr is an attribute slot index, i.e. the index of a vec4 within LDS. Therefore this must be scaled by sizeof(vec4) to give the LDS byte offset. Fixes: `f4e499ec79` ("radv: add initial non-conformant radv vulkan driver") Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> CC: <mesa-stable@lists.freedesktop.org>	2017-03-17 09:35:48 +01:00
James Legg	e88cac1df0	radv: Fix using more than 4 bound descriptor sets Avoid a buffer overflow in ac_nir_to_llvm.c's create_function when using more than 4 descriptor sets. radv claims support for 8. Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-17 09:12:43 +01:00
Dave Airlie	7372e3cf5f	radv/ac: workaround regression in llvm 4.0 release LLVM 4.0 released with a pretty messy regression, that will hopefully get fixed in the future. This workaround was proposed by Tom, and it fixes the CTS regressions here at least, I'm not sure if this will cause any major side effects, but correctness over speed and all that. radeonsi should possibly consider the same workaround until an llvm fix can be found. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-15 09:51:53 +10:00
Dave Airlie	3ece76f03d	radv/ac: gather4 cube workaround integer This fix is extracted from amdgpu-pro shader traces. It appears the gather4 workaround for integer types doesn't work for cubes, so instead it forces a float scaled sample, then converts to integer. It modifies the descriptor before calling the gather. This also produces some ugly asm code for reasons specified in the patch, llvm could probably do better than dumping sgprs to vgprs. This fixes: dEQP-VK.glsl.texture_gather.basic.cube.rgba8* Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-15 09:51:53 +10:00
Jason Ekstrand	762a6333f2	nir: Rework conversion opcodes The NIR story on conversion opcodes is a mess. We've had way too many of them, naming is inconsistent, and which ones have explicit sizes was sort-of random. This commit re-organizes things and makes them all consistent: - All non-bool conversion opcodes now have the explicit size in the destination and are named <src_type>2<dst_type><size>. - Integer <-> integer conversion opcodes now only come in i2i and u2u forms (i2u and u2i have been removed) since the only difference between the different integer conversions is whether or not they sign-extend when up-converting. - Boolean conversion opcodes all have the explicit size on the bool and are named <src_type>2<dst_type>. Making things consistent also allows nir_type_conversion_op to be moved to nir_opcodes.c and auto-generated using mako. This will make adding int8, int16, and float16 versions much easier when the time comes. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-03-14 07:36:40 -07:00
Dave Airlie	b8ee70384a	radv: setup llvm target data layout Ported from radeonsi, pointed out by Tom. "This prevents LLVM from using sext instructions for local memory offsets and allows the backend to fold immediate offsets into the instruction. This also prevents some incorrect code generation for ptrtoint and inttoptr instructions." Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tom Stellard <tstellar@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-14 10:33:59 +10:00
Dave Airlie	e27fdbcb4c	radv/ac: move to new image intrinsics. This hooks up radv to the new image intrinsic builders. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-13 09:44:53 +10:00
Emil Velikov	a1d186cb70	amd: remove shebang from python scripts Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:46 +00:00
Emil Velikov	f6180a5ab7	amd: remove execute bit from python scripts Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:46 +00:00
Fredrik Höglund	162beb2abb	radv/ac: fix multiple descriptor sets with dynamic buffers The dynamic_offset_offset in the descriptor set binding layout is relative to the dynamic_offset_start for the set in the pipeline layout. Cc: 17.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-07 20:23:32 +01:00
Dave Airlie	03f5405fc2	amd/common: document PREDICATION OP 3 as 64-bit bool. This just documents some info for possible future use. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-07 15:20:01 +10:00
Dave Airlie	5c45d2051a	radv/ac: introduce i1true/i1false to context. This uses these in a few places, and fixes one or two cases which were using da as 32-bit instead of bool. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-07 08:17:03 +10:00
Dave Airlie	ca884aef86	radv/ac: handle Z export using new builder. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-07 08:17:03 +10:00
Dave Airlie	bf2be50774	radv/ac: move to using common ac_get_image_intr_name. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-07 08:17:03 +10:00
Dave Airlie	10ae83a9c2	radeonsi/ac: move get_image_intr_name to common This code is used in radv, so move to common build code. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-07 08:17:03 +10:00
Marek Olšák	7e1faa79d3	radeonsi: drop support for LLVM 3.6 & 3.7 They are too old. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-06 14:13:04 +01:00
Marek Olšák	d5d74fe2b5	radeonsi: set the convergent attribute where needed Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-06 14:13:04 +01:00
Marek Olšák	ef883fc554	gallivm,ac: add LP_FUNC_ATTR_CONVERGENT Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-06 14:13:04 +01:00
Marek Olšák	9b08f044be	radeonsi: fix LLVM 3.9 - don't use non-matching attributes on declarations Call site attributes are used since LLVM 4.0. This also reverts commit `b19caecbd6` "radeon/ac: fix intrinsic version check", because this is the correct fix. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-06 14:13:04 +01:00
Dave Airlie	2e73ccb485	radv/ac: use bitfield extract new intrinsics. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-06 15:27:33 +10:00
Dave Airlie	9c7309b09b	radv/ac: move to new kill build. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-06 15:27:33 +10:00
Dave Airlie	a2652719f3	radv/ac: move to using new export intrinsics. This uses the new code in build to do exports. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-06 15:27:33 +10:00
Dave Airlie	2830ece0fc	radv/ac: switch to new intrinsics for pkrtz and clamp. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-06 15:27:32 +10:00
Dave Airlie	b19caecbd6	radeon/ac: fix intrinsic version check Reported-by: 375gnu@gmail.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100068 Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-06 06:05:58 +10:00
Marek Olšák	7f1446a8a1	ac: normalize build helper names s/emit/build/ Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 17:30:07 +01:00
Marek Olšák	8bde7fb3fc	ac: replace SI.vs.load.input with amdgcn.buffer.load.format Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 17:30:07 +01:00
Marek Olšák	94811dc66c	radeonsi: move SI.vs.load.input building into amd/common Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 17:30:07 +01:00
Marek Olšák	97e21cfa25	ac: replace llvm.SI.tbuffer.store with llvm.amdgcn.buffer.store if ADD_TID=0 ADD_TID doesn't work. Needs more investigation. v2: remove leftover dead code Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)	2017-03-03 15:29:30 +01:00
Marek Olšák	8cfdbba6c7	ac: remove offen parameter from ac_build_buffer_store_dword Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	27439dfdae	radeonsi: merge and simplify tbuffer_store functions Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	d4324ddb89	radeonsi: replace AMDGPU.bfe.* with amdgcn.*bfe Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	9c09592086	radeonsi: move kill intrinsic building into amd/common just a cleanup Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	e729dc7c46	radeonsi: set readnone on reads from read-only memory	2017-03-03 15:29:30 +01:00
Marek Olšák	653ac0b389	radeonsi: replace SI.packf16 with amdgcn.cvt.pkrtz	2017-03-03 15:29:30 +01:00
Marek Olšák	4b2e5b9389	ac: replace old image intrinsics with new ones Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	ad18d7f040	radeonsi: move image intrinsic building to amd/common Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	2b3ebe307c	ac: replace SI.export with amdgcn.exp.* Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	369f4a8726	radeonsi: move llvm.SI.export building to amd/common Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	9af03318aa	ac: unify build_type_name_for_intr functions Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	b5744310d4	gallivm, ac: add writeonly and inaccessiblememonly attributes Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Tobias Klausmann	6d600cf632	amd/common: Fix build with new ac_add_function_attr() Fix usage of ac_add_function_attr() and make it known! common/ac_nir_to_llvm.c: In function 'create_llvm_function': common/ac_nir_to_llvm.c:265:4: error: implicit declaration of function 'ac_add_function_attr' [-Werror=implicit-function-declaration] ac_add_function_attr(main_function, i + 1, AC_FUNC_ATTR_BYVAL); ^~~~~~~~~~~~~~~~~~~~ Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-01 23:53:38 +01:00
Marek Olšák	940da36a65	gallivm,ac: add function attributes at call sites instead of declarations They can vary at call sites if the intrinsic is NOT a legacy SI intrinsic. We need this to force readnone or inaccessiblememonly on some amdgcn intrinsics. This is only used with LLVM 4.0 and later. Intrinsics only used with LLVM <= 3.9 don't need the LEGACY flag. gallivm and ac code is in the same patch, because splitting would be more complicated with all the LEGACY uses all over the place. v2: don't change the prototype of lp_add_function_attr. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> (v1)	2017-03-01 18:59:36 +01:00
Marek Olšák	408f370710	gallivm,ac: remove unused FUNC_ATTR_LAST enums Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-03-01 18:59:36 +01:00
Dave Airlie	e66be3d3bb	radv: fix txs for sampler buffers I messed this up when I wrote it, this fixes: dEQP-VK.memory.pipeline_barrier.uniform_texel_buffer. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-01 08:02:24 +10:00
Marek Olšák	8c838730d0	amd/common: fix ASICREV_IS_POLARIS11_M for Polaris12 Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-28 21:44:30 +01:00
Bas Nieuwenhuizen	137b06b437	radv/ac: Use constants for immutable samplers. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-28 20:48:14 +01:00
Timothy Arceri	f0aaa4b3a4	radeon/ac: make ac_shader_binary_config_start() available externally The read config functions are different for r600 and radeonsi so we can't just share the one in amd common. So just share this instead. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-28 13:20:31 +11:00
Timothy Arceri	affc8314cb	radeon/ac: add llvm_ir_string to ac_shader_binary struct Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-28 13:20:31 +11:00
Bas Nieuwenhuizen	336b05c49a	radv/ac: Add integer->integer casts. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-26 19:59:27 +01:00
Marek Olšák	c7878b0167	ac: silence a warning trivial	2017-02-25 00:16:38 +01:00
Dave Airlie	ccb70d6f53	radv: add sample mask output support This adds support to write to sample mask from the fragment shader. We can optimise this later like radeonsi. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-24 10:31:53 +10:00
Dave Airlie	8282c5c771	radv/ac: refactor our fmask sample index fixup. This refactors out the sample index fixup between txf and image load. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-24 10:31:49 +10:00
Dave Airlie	5e9ead0fa2	radv: fetch sample index via fmask for image coord as well. This follows the txf_ms code, I can't figure out why amdgpu-pro doesn't do this in their shaders, they must know something we don't. This fixes: dEQP-VK.pipeline.multisample_shader_builtin.sample_id.* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-24 10:31:44 +10:00
Dave Airlie	bdcbe7c76b	radv: add sample mask input support Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-24 10:31:35 +10:00
Dave Airlie	fc430c391b	radv: fix interpolation at wrong place for offset interp The code was interpolating at the offset from the sample, not the offset from the center. Also fix for persample interpolation modes we should force the pixel center to be at the sample. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-24 10:31:19 +10:00
Dave Airlie	b71e6538a8	radv/ac: handle gs->copy shader clip distances. This fixes up the clip distance passing between the geometry shader and the copy shader. It packs the clip and cull distances into one or two consecutive slots, and avoids wasting space and make sure the gs output and copy shader input agree on where things are stored. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-23 15:31:41 +10:00
Dave Airlie	bec584ec0e	radv/ac: pass clips properly from vertex->geometry shader stages. This works out the geometry shader clip/cull inputs separately to the outputs, and uses that information to read from the ES->GS ring buffer. It stores the clip/cull distances packed into one or two slots. It fixes the es output emission and gs input reading to match. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-23 15:31:37 +10:00
Dave Airlie	c2cfb54f13	radv/ac: rename num clips/cull to output clips/culls As geom shaders can have different ones on entry and exit. also move to uint8_t as these are never that big. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-23 15:31:10 +10:00
Marek Olšák	675ef9c0c7	ac/llvm: use min+max instead of AMDGPU.clamp on LLVM 5.0 It selects v_med3_f32, which has the same rate & size. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 02:58:43 +01:00
Marek Olšák	660b55e6d9	radeonsi: stop using TGSI_OPCODE_CLAMP by moving it amd/common Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 02:58:43 +01:00
Marek Olšák	edd23e0606	ac/llvm: fix various findMSB bugs sffbh needs to be suffixed with ".i32" Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-18 06:24:32 +10:00
Dave Airlie	ebed22ec67	radv/ac: use shared umsb helper. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 22:57:16 +00:00
Dave Airlie	0ec66b9969	radeon/ac: add emit umsb shared code. Since we shared imsb, makes sense to share umsb. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 22:57:16 +00:00
Dave Airlie	4617ad07e0	radeon/ac: use llvm.amdgcn.sffbh intrinsic instead of AMDGPU.flbit.i32 Use the newer intrinsic. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 22:57:16 +00:00
Dave Airlie	fb15a1e9dd	radv/ac: use shader imsb emission code. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 22:57:15 +00:00
Dave Airlie	cae1ff1a4b	radeon/ac: add ac_emit_imsb helper. We want to use a different intrinsic on newer llvm, so move this code to a shared area. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 22:57:15 +00:00
Dave Airlie	a465eae38f	radv: fix warning since using common gs emit code Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-14 20:02:13 +00:00
Dave Airlie	e3324e0c60	radv/ac: use sendmsg emission interface. This uses the common code to emit the correct intrinsic. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-14 00:03:18 +00:00
Dave Airlie	f32955be43	radeon/ac/llvm: add support for sendmsg emission This lets us use the new intrinsic on the correct version of llvm. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-14 00:02:50 +00:00
Dave Airlie	62fef3e159	radv/ac: use common interp code for new intrinsics This uses the common fs interp code to use the new llvm intrinsics so llvm can drop the old ones. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-14 07:48:01 +10:00
Dave Airlie	a864ef7f48	radv/ac: avoid the fmask path when doing txs. This fixes the vulkan samples deferredmultisampling test. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-06 22:57:52 +00:00
Dave Airlie	13a28ff236	radeon/ac: move common llvm build functions to a separate file. Suggested by Marek. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-07 05:46:35 +10:00
Dave Airlie	106a51440d	radv: fix shared memory load/stores. If we have an indirect index here we need to scale it by attribute slots, e.g. if this is vec2[256] then we get an indir_index in the 0..255 range but the vec2 are aligned inside vec4 slots. So scale the indir index, then extract the channels. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 19:53:03 +00:00
Dave Airlie	a1a8aef4c9	radv/ac: correctly size shared memory usage. We count the number of slots used, but slots are vec4 sized, so we have to scale by 16 not 4. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 19:52:13 +00:00
Dave Airlie	66463b7f75	radv: fix compute shared memory stores since 64-bit. These regressed and caused doom to stop loading. Fixes: `03724af26` radv/ac: Implement Float64 load/store var. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 19:51:52 +00:00
Dave Airlie	6cc3c46f58	radv/ac: move to using shared emit_ddxy code. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 09:54:04 +10:00
Dave Airlie	c9a2fc3679	radeonsi/ac: move most of emit_ddxy to shared code. We can reuse this in radv. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 09:54:04 +10:00
Dave Airlie	278d5ef70a	radv/ac: use shared thread id code Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 09:54:04 +10:00
Dave Airlie	c5f0a56aeb	radeonsi/ac: move get thread id to shared code. radv will use this. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 09:54:04 +10:00
Dave Airlie	1c5c268a8a	radv/ac: migrate to using shared code for some load/store stuff. This migrates to the code shared with radeonsi. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 09:54:04 +10:00
Dave Airlie	b3c28942c7	radeonsi/ac: move tbuffer store and buffer load to shared code. These are all reuseable by radv. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 09:54:04 +10:00
Dave Airlie	a9773311f6	radeonsi/ac: move a bunch of load/store related things to common code. These are all shareable with radv, so start migrating them to the common code. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 09:54:04 +10:00
Dave Airlie	fa316ed02f	radv/ac: handle clip/cull distance sizing in geometry shader outputs Otherwise we were writing these as 4 components, and things went bad. Fixes (the remaining): dEQP-VK.clipping.user_defined..vert_geom. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-02 08:25:04 +10:00
Dave Airlie	230e308ff9	radv/ac: add const_index to fetch index for gs inputs This fixes clip distance fetches as they are single item loads with a const_index like float[1]. Fixes: dEQP-VK.clipping.user_defined.*.vert_geom.[0-6] Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-02 08:25:04 +10:00
Dave Airlie	dc68b920df	radeonsi/ac: move frag interp emission code to shared llvm code. This code should be used in radv, so move it to a shared location in advance of doing that. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-02 08:24:53 +10:00
Bas Nieuwenhuizen	80f4331ed1	radv/ac: Add draw index support. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-01 19:49:40 +01:00
Bas Nieuwenhuizen	441ee1e65b	radv/ac: Implement Float64 SSBO loads. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:34 +01:00
Bas Nieuwenhuizen	bb1ce63002	radv/ac: Implement Float64 UBO loads. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:29 +01:00
Bas Nieuwenhuizen	03724af262	radv/ac: Implement Float64 load/store var. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:05 +01:00
Bas Nieuwenhuizen	91074bb11b	radv/ac: Implement Float64 SSBO stores. No f16 support as I'm not quite sure about alignment yet. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:05 +01:00
Bas Nieuwenhuizen	29577b2123	radv/ac: Add core Float64 support. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:05 +01:00
Dave Airlie	8477aa71d9	radv/ac: apply slice rounding to 1d arrays as well. Fixes: dEQP-VK.glsl.texture_functions.texture.1darray Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 11:13:15 +10:00
Dave Airlie	ca822e1b7c	radv: handle layer export from vs->fs properly Fixes: dEQP-VK.geometry.layered.1d_array.fragment_layer Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:30:49 +10:00
Dave Airlie	fd4ea9e62d	radv/ac: handle primitive id Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:30:08 +10:00
Dave Airlie	4ec294adce	radv/ac: handle emitting vertex outputs to esgs ring. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:30:05 +10:00
Dave Airlie	ac642c6195	radv/ac: handle gs inputs This handles geometry shader inputs written by the vertex (es) shader to the esgs ring. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:30:01 +10:00
Dave Airlie	80cdf2c17e	radv/ac: add geom input support to get deref offset. This just adds the API and fixes up the callers. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:59 +10:00
Dave Airlie	23999a363b	radv/ac: handle invocation and primitive id intrinsics Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:55 +10:00
Dave Airlie	63fa6c6eb4	radv/ac: handle geometry emit vertex and end prim intrinsics. This handles emitting things to the gsvs ring, and sending the correct GS msgs. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:52 +10:00
Dave Airlie	2a56186d57	radv/ac: handle emitting gs epilogue Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:48 +10:00
Dave Airlie	a615a01942	radv/ac: add copy shader creation This create the gs copy shader and compiles it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:40 +10:00
Dave Airlie	09cd037ca4	radv/ac: setup function parameters for vs as es and copy shader. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:33 +10:00
Dave Airlie	e1e9301b2a	radv: pass some necessary gs info back to state handling. We need this info to program some registers. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:30 +10:00
Dave Airlie	2a57bddd4c	radv/ac: propogate as_es flag into shader info from key. This just places the flag into the shader info so we can use it from the driver after we create the shader. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:23 +10:00
Dave Airlie	ec7bf863d2	radv/ac: start setting up the geom shader rings (v2) This sets up the rings and adds the variables needed to make them work. v2: rework for sharing ring and scratch Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:17 +10:00
Dave Airlie	ca91db2402	radv/ac: handle geom shader sgpr/vgpr inputs This just sets up the gpr inputs. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:13 +10:00
Dave Airlie	374e978438	radv/ac: add geom shader sendmsg defines. This just adds some defines needed for geom shaders. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:10 +10:00
Dave Airlie	583cf8efd4	radv/ac: add some geom shader info from nir->ac shader. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:28:50 +10:00
Dave Airlie	0ecd426490	radv/ac: implement txs for buffer textures. This fixes a bunch of buffer related: dEQP-VK.memory.pipeline_barrier.* tests, that were crashing in LLVM due to this being missing. Reviewed-by: Andres Rodriguez<andresx7@gmail.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 06:26:53 +10:00
Dave Airlie	ecc3fa3ba3	radv/ac: handle nir irem opcode. This fixes: dEQP-VK.spirv_assembly.instruction.compute.opsrem.* Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org" Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 05:38:57 +10:00
Dave Airlie	059dd17175	radv/ac: fix multisample subpass image. We weren't adding the fragment position properly. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 04:44:59 +10:00
Bas Nieuwenhuizen	29c1f67e9f	radv/ac: Add compiler support for spilling. Based on code written by Dave Airlie. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-30 02:07:12 +01:00
Bas Nieuwenhuizen	0fca80b3db	various: Fix missing DumpModule with recent LLVM. Since LLVM revision 293359 DumpModule gets only implemented when either a debug build or LLVM_ENABLE_DUMP is set. This patch adds a direct replacement for the function for radv and radeonsi, However, as I don't know a good place to put common LLVM code for all three I inlined the implementation for LLVMPipe. v2: Use the new code for LLVM 3.4+ instead of LLVM 5+ & fixed indentation Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-01-29 10:25:00 +01:00
Bas Nieuwenhuizen	96c60b7f07	radv/ac: Use base in push constant loads. Apparently the source is not an address but an offset, so we actually need to use the base. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> CC: <mesa-stable@lists.freedesktop.org>	2017-01-28 03:07:39 +01:00
Dave Airlie	7886100811	radv/ac: split part of llvm compile into a separate function This is needed to have common code for gs copy shader emission. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-18 06:21:05 +10:00
Dave Airlie	5dadd7ca27	radv/ac: switch an if to switch makes it easier to add other shader stages. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-18 06:20:48 +10:00
Dave Airlie	6b635bbe16	radv: add support for writing layer/viewport index (v2) This just adds the infrastructure to allow writing layer and viewport index. It's just a first patch out of the geom shader tree, and doesn't do much on its own. v2: add missing if statement change (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-18 06:20:44 +10:00
Bas Nieuwenhuizen	3b4bf8aa63	ac/debug: Decrease num_dw for type 2 NOP's. Otherwise we read past the end of the buffer. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-17 20:54:57 +01:00
Dave Airlie	d4392a877c	radv/ac: use ctx->voidt in more places. (v2) Just noticed this while in the area. v2: one replacement was incorrect. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-17 06:55:51 +10:00
Nicolai Hähnle	1007047ca1	ac/nir: use ac_emit_fdiv throughout ... and eliminate emit_fdiv and nir_to_llvm_context::fpmath_md_*, which are now unused. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:39:22 +01:00
Nicolai Hähnle	38c67f77ed	ac/nir: use ac_build_gather_values[_extended] throughout ... and eliminate the non-ac copies. Mostly straight-forward search & replace. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:39:20 +01:00
Nicolai Hähnle	2c9d26a356	ac/nir: use ac_emit_llvm_intrinsic throughout ... by straight-forward search & replace, and eliminate emit_llvm_intrinsic. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:39:17 +01:00
Nicolai Hähnle	a0ce09b4b2	amd/common: unify cube map coordinate handling between radeonsi and radv Code is taken from a combination of radv (for the more basic functions, to avoid gallivm dependencies) and radeonsi (for the new and improved derivative calculations). v2: add 0.5 offset to tex coords only after derivative calculation v3: - really only touch the first three coordinates - rebase on the removal of the 1.5 --> 0.5 offset change Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:39:10 +01:00
Grazvydas Ignotas	c728051131	ac/debug: move .gitignore for sid_tables.h too `b838f642` "ac/debug: Move sid_tables.h generation to common code." moved sid_tables.h but forgot the corresponding .gitignore. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-13 00:37:52 +01:00
Dave Airlie	ada66480b2	radv/ac: add support for multi sample image coords This just adds the nir->llvm support, enabling the extension causes some failures on llvm 3.9 at least, but this code seems fine. NIR passes the sampler in src[1].x, and we LLVM/SI requires it as the last parameters in the coords (coord[2] for 2D, coord[3] for 2DArray). Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-10 12:59:31 +10:00
Bas Nieuwenhuizen	8cb60c7dd3	ac/debug: Dump indirect buffers. This is for handling chained command buffers and secondary command buffers. It doesn't handle the trace id for secondary command buffers yet, but I don't think that is possible in general with just writes, as we could call a secondary command buffer multiple times. I think this is good enough for now, as the most useful case is the chaining when we grow an IB. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-09 21:44:08 +01:00
Bas Nieuwenhuizen	0ef1b4d5b1	ac/debug: Move IB decode to common code. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-09 21:43:59 +01:00
Bas Nieuwenhuizen	b838f64237	ac/debug: Move sid_tables.h generation to common code. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-09 21:43:54 +01:00
Marek Olšák	29d6a367a6	radeonsi: do all math in bytes in SI DMA code Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:43:24 +01:00
Dave Airlie	4813c9ade7	radv: handle multi-component shared load/stores. This was seen in doom shaders, so handle it properly. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave AIrlie <airlied@redhat.com>	2016-12-26 10:31:20 +10:00
Fredrik Höglund	27a8aab882	radv: fix dual source blending Add the index to the location when assigning driver locations for output variables. Otherwise two fragment shader outputs declared as: layout (location = 0, index = 0) out vec4 output1; layout (location = 0, index = 1) out vec4 output2; will end up aliasing one another. Note that this patch will make the second output variable in the above example alias a possible third output variable with location = 1 and index = 0. But this shouldn't be a problem in practice since only one color attachment is supported when dual-source blending is used. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-22 02:07:17 +01:00
Junwei Zhang	018ead4266	radeonsi: add Polaris12 support (v3) v2: use gfxip names for llvm 4.0+ v3: use tonga for llvm <= 3.8, drop gfxip name, we can just change that we change the other asics. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-12-21 15:10:03 -05:00
Bas Nieuwenhuizen	bfee9866ea	radv: Use RELEASE_MEM packet for MEC timestamp query. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-18 20:52:37 +01:00
Ilia Mirkin	fd249c803e	treewide: s/comparitor/comparator/ git grep -l comparitor \| xargs sed -i 's/comparitor/comparator/g' Just happened to notice this in a patch that was sent and included one of the tokens in question. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-12 22:13:07 -05:00
Grazvydas Ignotas	90c29784c6	radv/ac: some fix maybe-uninitialized warnings Mark some paths unreachable so that compiler knows variables are initialized in all valid paths. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-10 21:46:56 +01:00
Dave Airlie	bd56de88df	radv/ac: no need to pass nir to the post outputs handling We don't use the nir shader in here at all. Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:10:34 +00:00
Dave Airlie	d38eece4e6	radv: fix warnings in ubo load code. Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:06:30 +00:00
Dave Airlie	0fafe94a39	radv/ac: pass a mask of array params not a number. This makes it easier to add new params before the array ones. Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:06:18 +00:00
Dave Airlie	c46c376977	radv/ac: don't pass nir to create_function This isn't needed for later things like geom shader copy shaders, we won't have NIR. Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:05:52 +00:00
Dave Airlie	e54af02567	radv/ac: use build_gep0 instead of opencoding it. Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:03:39 +00:00
Dave Airlie	c7dc1b010a	radv: make push constants optional We don't set the push constants slot up unless something will cause us to need it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-07 23:26:19 +00:00
Dave Airlie	dfef9c7c1f	radv: only emit descriptor sgprs when needed This only emits enough descriptor sgprs for the number of sets in the layout, and only emits the descriptors necessary for the current stage. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-07 23:25:54 +00:00
Dave Airlie	ae61ddabe8	radv: move userdata sgpr ownership to compiler side. This isn't fully what we want yet, but is a good step on the way. This allows the compiler to create the information structures for the state setting side, however the state setting still expects things to be pretty much in 2 sgpr wide register sets, and can't handle the indirect setting yet. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-07 23:25:49 +00:00
Bas Nieuwenhuizen	92d7563fba	ac/nir: Only use the first component for SSBO atomics. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-05 01:40:54 +01:00
Dave Airlie	8033f78f94	radv: fix another regression since shadow fixes. This fixes: dEQP-VK.glsl.texture_gather.basic.2d.depth32f.* Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-05 10:14:37 +10:00
Marek Olšák	77014a0ad3	radeonsi: document a CP DMA bug that doesn't need a workaround yet This one is easy to miss, because it's not documented in any internal doc. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-01 02:16:51 +01:00
Bas Nieuwenhuizen	abc887faa1	ac/nir: Fix out of bounds array access. With nir_intrinsic_ssbo_atomic_comp_swap we run out of params. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-30 07:09:38 +01:00
Dave Airlie	f3a3fea973	radv: force persample shading when required. We need to force persample shading when a) shader uses sample_id b) shader uses sample_position c) shader uses sample qualifier. Also since ps_iter_samples can now change independently of the rasterizer samples we need to move setting the regs more often. This fixes: dEQP-VK.pipeline.multisample_interpolation.centroid_interpolate_at_consistency.* dEQP-VK.pipeline.multisample_interpolation.centroid_qualifier_inside_primitive.137_191_1.* dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_distinct_values.* dEQP-VK.pipeline.multisample_interpolation.sample_qualifier_distinct_values.128_128_1.* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-29 22:48:03 +00:00
Bas Nieuwenhuizen	b8c9ce4459	ac/nir: Fix accessing an unitialized value. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-29 20:13:28 +01:00
Bas Nieuwenhuizen	05533ce418	radv: Use different intrinsic for ubo loads. Not sure about the deprecation path, but this intrinsic can be lowered to SMEM loads. This results in a significant Talos performance improvement. v2: Fix for LLVM attribute changes. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-29 08:36:16 +01:00
Dave Airlie	020978af12	radv: brown-paper bag for a forgotten else. This fixes the fix: radv/ac/llvm: fix regression with shadow samplers fix Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-28 16:23:10 +10:00
Dave Airlie	b2e217369e	radv/ac/llvm: fix regression with shadow samplers fix This fixes b56b54cbf1d8e70c87a434da5350d11533e5fed8: radv/ac/llvm: shadow samplers only return one value It makes sure we only do that for shadow sampling, as opposed to sizing requests. Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-28 15:43:59 +10:00
Dave Airlie	b56b54cbf1	radv/ac/llvm: shadow samplers only return one value. The intrinsic engine asserts in llvm due to this. Reported-by: Christoph Haag <haagch+mesadev@frickel.club> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-27 23:05:01 +00:00
Dave Airlie	bb8ac18340	radv: fix texel fetch offset with 2d arrays. The code didn't limit the offsets to the number supplied, so if we expected 3 but only got 2 we were accessing undefined memory. This fixes random failures in: dEQP-VK.glsl.texture_functions.texelfetchoffset.sampler2darray_* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-24 18:06:05 +10:00
Fredrik Höglund	5cbcbc75f4	radv: add support for anisotropic filtering on SI-CI Ported from radeonsi. Note that si_make_texture_descriptor() already sets img7 to the mask value referred to in the comment. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-24 08:19:06 +10:00
Dave Airlie	220912e214	radv: fix sample id loading The sample id is packed into bits 8-12, so adjust things properly. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-22 17:15:57 +10:00
Dave Airlie	3c6151ccaf	radv/ac: add implementation of load_sample_pos intrinsic. This fixes a bunch of crashes in CTS tests looking for this. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-22 17:15:54 +10:00
Dave Airlie	5697cfb7ec	radv/ac: cleanup ddxy emission This cleans up the ddxy emission along the same lines as radeonsi. It also means we don't use LDS on VI chips we use the dspermute interface, it also removes some duplicated code. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-22 17:15:43 +10:00
Dave Airlie	b1340fd708	radv: spir-v allows texture size query with and without lod. The translation to llvm was failing here due to required lod. This fixes some new SteamVR shaders. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-21 09:00:22 +10:00
Dave Airlie	713522fb8d	ac/nir/llvm: fix channel in texture gather lowering code. This fixes a number of CTS tests like: dEQP-VK.glsl.texture_gather.basic.2d.rgba8ui.size_npot.clamp_to_edge_repeat Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-16 09:18:15 +10:00
Mauro Rossi	95ed2d9d2c	amd: flatten amd/common makefile structure This pulls amd/common build rules into upper level makefile, along with amd/addlib which is already there. v2: [Emil Velikov] - Move NEED_RADEON_LLVM conditional, drop amd/common from SUBDIRS - Drop AM_ from common_libamd_common_la* Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-15 20:04:37 +00:00
Daniel Scharrer	0b98e885e7	ac/nir/llvm: Fix setting function attributes for intrinsics This fixes a NULL pointer dereference for intrinsics with more than one function attribute introduced in commit `2fdaf38`. The fix is ported from the lp_build_intrinsic changes in commit `8bdd52c`. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-11-11 22:40:32 +01:00
Dave Airlie	2de85eb97a	radv: fix texturesamples to handle single sample case We can only read the valid samples if this is an MSAA texture, which means the type field must be 0x14 or 0x15. This fixes: dEQP-VK.glsl.texture_functions.query.texturesamples.* Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-11 09:35:43 +10:00
Dave Airlie	19decd8ce4	radv: fixup botched llvm API changes. Reported-by: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-10 14:12:32 +10:00
Dave Airlie	2fdaf38c01	ac/nir/llvm: adopt to new LLVM attribute API. Ported from corresponding changes to gallivm. tested build against 3.9 and master. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-10 13:29:12 +10:00
Dave Airlie	dd77faeca2	ac/nir: add support for discard_if intrinsic (v2) We are going to start lowering to this in NIR code, so prepare radv for it. v2: handle conversion to kilp properly (nha) Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-10 05:46:20 +10:00
Dave Airlie	bafc75b437	radv: emit correct last export when Z/stencil export is enabled I was getting a random GPU hang in the renderpass simple tests, it turns out sometimes radv emitted the wrong thing "last". This fixes the logic to emit Z/stencil last if they occur, and not mark a color output as last. Also this relies on the Z/STENCIL being the first two fragment outputs, which they are so yay. Fixes: dEQP-VK.renderpass.simple.color_depth (random hangs) Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-09 06:05:03 +10:00
Nicolai Hähnle	908100cfae	amd/common: add ac_is_sgpr_param helper Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:06:27 +01:00
Nicolai Hähnle	2ff5df8f50	amd/common: build also for gallium drivers At least when LLVM is used, which is basically always (unless you're only building r600 without OpenCL). Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:06:24 +01:00
Nicolai Hähnle	8eabee9ec0	amd/common: move llvm helper prototype to ac_llvm_util.h Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:05:46 +01:00
Marek Olšák	d3244c47ce	amd: fix a typo in PIXEL_PIPE_STAT_RESET definition Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-01 22:33:13 +01:00
Dave Airlie	d548fa882b	radv/ac/llvm: trim texture return values The intrinsic engine asserts in llvm due to this, as we put a vec4 into a vec1, and the next instruction isn't expecting it. So trim the vector at the end before inserting it. Reported-by: Christoph Haag <haagch+mesadev@frickel.club> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-27 11:42:03 +10:00
Marek Olšák	edf56fb428	gallium/radeon: fix a ZPASS comment, EVENT_WRITE_EOP fixups Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Timothy Arceri	e1af20f18a	nir/i965/anv/radv/gallium: make shader info a pointer When restoring something from shader cache we won't have and don't want to create a nir_shader this change detaches the two. There are other advantages such as being able to reuse the shader info populated by GLSL IR. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Dave Airlie	d842546ad1	radv: use emit_icmp for samples_identical On a debug llvm build we'd assert on the next compare when the return from samples_identical was i1 instead of i32. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-20 01:43:55 +01:00
Dave Airlie	67c91ef2a2	radv: fix samples_identical return value. This was returning an inversion, so not doing as it should have. We need to compare the fmask value with 0, and return the result from that.	2016-10-19 17:39:01 +10:00
Dave Airlie	63406b669e	radv: fix fmask ptr issue We were using the wrong descriptor in the fmask picking code.	2016-10-19 13:16:25 +10:00
Dave Airlie	b0e11a153c	radv: start using defines for the user sgpr offsets This adds some comments and adds defines for the user sgprs, so that we can move them around easier later and not have to change/revalidate every one of these. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 10:17:48 +10:00
Tom Stellard	5c66d46d6a	radv: Use new image load/store intrinsic signatures v2 These were changed in LLVM r284024. v2: - Only use float types for vdata of llvm.amdgcn.image.store. LLVM doesn't support integer types for this intrinsic. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 04:48:11 +10:00
Tom Stellard	30e63fb0e4	radv: Fix incorrect comment Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 04:48:11 +10:00
Dave Airlie	f4e499ec79	radv: add initial non-conformant radv vulkan driver This squashes all the radv development up until now into one for merging. History can be found: https://github.com/airlied/mesa/tree/semi-interesting This requires llvm 3.9 and is in no way considered a conformant vulkan implementation. It can run a number of vulkan applications, and supports all GPUs using the amdgpu kernel driver. Thanks to Intel for providing anv and spirv->nir, and Emil Velikov for reviewing build integration. Parts of this are: Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Authors: Bas Nieuwenhuizen and Dave Airlie Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-07 09:16:09 +10:00
Tom Stellard	91ec6e5664	radeonsi/compute: Use the HSA abi for non-TGSI compute shaders v3 This patch switches non-TGSI compute shaders over to using the HSA ABI described here: https://github.com/RadeonOpenCompute/ROCm-Docs/blob/master/AMDGPU-ABI.md The HSA ABI provides a much cleaner interface for compute shaders and allows us to share more code in the compiler with the HSA stack. The main changes in this patch are: - We now pass the scratch buffer resource into the shader via user sgprs rather than using relocations. - Grid/Block sizes are now passed to the shader via the dispatch packet rather than at the beginning of the kernel arguments. Typically for HSA, the CP firmware will create the dispatch packet and set up the user sgprs automatically. However, in Mesa we let the driver do this work. The main reason for this is that I haven't researched how to get the CP to do all these things, and I'm not sure if it is supported for all GPUs. v2: - Add comments explaining why we are setting certain bits of the scratch resource descriptor. v3: - Use amdgcn-mesa-mesa3d triple instead of amdgcn--mesa3d. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-16 23:07:10 +00:00
Dave Airlie	69fca64259	amd/addrlib: move addrlib from amdgpu winsys to common code Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-06 10:06:33 +10:00
Dave Airlie	a86be7b6ad	radeon: move radeon_family/chip_class defintions to common This just moves these to a common header file. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-06 10:06:04 +10:00
Dave Airlie	f1f1ba3781	radeonsi: move sid.h/r600d_common.h to a common place. Step one to merging radv would be to move some files around. This only adds the include path to r600/radeonsi, because later we want to avoid having to add it to the generic target paths. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-06 10:05:13 +10:00


2021 Commits