KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Timothy Arceri	8086fa1bcd	radv: use nir_split_array_vars() We call in the opt loop in case another pass results in an array with indirect access being turned into direct access. Totals from affected shaders: SGPRS: 512 -> 496 (-3.12 %) VGPRS: 456 -> 452 (-0.88 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 40040 -> 39664 (-0.94 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 41 -> 43 (4.88 %) Wait states: 0 -> 0 (0.00 %) All affected shaders are from Batman Arkham City. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-18 15:04:09 +11:00
Timothy Arceri	06675711e7	radv: use nir_opt_find_array_copies() Totals from affected shaders: SGPRS: 1112 -> 1112 (0.00 %) VGPRS: 1492 -> 1196 (-19.84 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 112172 -> 101316 (-9.68 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 93 -> 98 (5.38 %) Wait states: 0 -> 0 (0.00 %) All affected shaders are from "Batman: Arkham City" over DXVK. The pass detects that the temporary array created by DXVK for storing TCS inputs is a copy of the input arrays and allows us to avoid copying all of the input data and then indirecting on it with if-ladders, instead we just do indirect indexing. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-18 15:04:09 +11:00
Timothy Arceri	9d5b106b2e	radv: use nir_opt_copy_prop_vars and nir_opt_dead_write_vars Totals from affected shaders: SGPRS: 2856 -> 2856 (0.00 %) VGPRS: 3236 -> 3248 (0.37 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 236560 -> 233548 (-1.27 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 277 -> 283 (2.17 %) Wait states: 0 -> 0 (0.00 %) Even in the cases where we have increased VGPR use it appears the NIR is improved significantly. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-18 15:04:09 +11:00
Timothy Arceri	72e4287e8f	radv: make use of nir_lower_load_const_to_scalar() This allows NIR to CSE more operations. LLVM does this also so the impact is limited, however doing this in NIR allows other opts to make progress. For example in radeonsi more loops are unrolled in Civilization Beyond Earth. The actual pipeline-db stats are not overwhelming but even in the negatively affected shaders the NIR is clearly better. It just happens that the code shuffling and in some cases calls to max rather than a flt result in the final output from LLVM not giving as good numbers. However this is an incremental opt that further passes build off so the change should be made IMO. Totals from affected shaders: SGPRS: 20192 -> 20184 (-0.04 %) VGPRS: 19516 -> 19524 (0.04 %) Spilled SGPRs: 437 -> 444 (1.60 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 1527444 -> 1522276 (-0.34 %) bytes LDS: 6 -> 6 (0.00 %) blocks Max Waves: 1018 -> 1016 (-0.20 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-25 09:31:22 +10:00
Samuel Pitoiset	35656823b9	radv: enable VK_SUBGROUP_FEATURE_ARITHMETIC_BIT All CTS pass on Polaris/Vega with LLVM 6, 7 and master, so I think it's safe to enable the feature. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-19 13:36:10 +02:00
Samuel Pitoiset	08103c5f65	radv: enable shaderInt16 capability Not sure if this is all wired up. CTS does pass and the Tangrams demo works fine on Vega. There are corruption issues on Polaris but not sure if that is related to 16-bit support. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-17 15:18:39 +02:00
Bas Nieuwenhuizen	d97c892584	radv: Set the user SGPR MSB for Vega. Otherwise using 32 user SGPRs would be broken. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-09-16 12:50:58 +02:00
Jason Ekstrand	44ec31cd75	nir: Drop the vs_inputs_dual_locations option It was very inconsistently handled; the only things that made use of it were glsl_to_nir, glspirv, and nir_gather_info. In particular, nir_lower_io completely ignored it so anyone using nir_lower_io on 64-bit vertex attributes was going to be in for a shock. Also, as of the previous commit, it's set by every driver that supports 64-bit vertex attributes. There's no longer any reason to have it be an option so let's just delete it. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-09-06 16:07:50 -05:00
Samuel Pitoiset	24ee53231d	radv: remove dead variables after splitting per member structs Otherwise, nir_lower_clip_cull_distance_arrays might report wrong number of output clips/culls because it relies on shader output variables and some of them might be dead. This fixes a rendering issue with Dolphin and Super Mario Sunshine. Fixes: `b0c643d8f5` ("spirv: Use NIR per-member splitting") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107610 CC: 18.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-22 13:57:18 +02:00
Dave Airlie	b88468f15c	radv: return binary code_size not variant code size to cache The code sizes return here get passed to the cache shader insert function, which then memcpy from the code ptr, and causes all sorts of valgrind errors like: ==6755== Invalid read of size 8 ==6755== at 0x4C32FEE: memcpy@GLIBC_2.2.5 (vg_replace_strmem.c:1021) ==6755== by 0x2305D4C7: radv_pipeline_cache_insert_shaders (radv_pipeline_cache.c:416) ==6755== by 0x2305791D: radv_create_shaders (radv_pipeline.c:2158) ==6755== by 0x2305C523: radv_pipeline_init (radv_pipeline.c:3404) ==6755== by 0x2305C890: radv_graphics_pipeline_create (radv_pipeline.c:3515) ==6755== by 0x230188AB: radv_device_init_meta_blit_color (radv_meta_blit.c:871) ==6755== by 0x2301D50E: radv_device_init_meta_blit_state (radv_meta_blit.c:1278) ==6755== by 0x23011893: radv_device_init_meta (radv_meta.c:352) ==6755== by 0x2300744B: radv_CreateDevice (radv_device.c:1576) ==6755== by 0x5187D0F: ??? (in /usr/lib64/libvulkan.so.1.1.77) ==6755== by 0x518F6A3: ??? (in /usr/lib64/libvulkan.so.1.1.77) ==6755== by 0x5192A42: vkCreateDevice (in /usr/lib64/libvulkan.so.1.1.77) ==6755== Address 0x22a58548 is 4 bytes after a block of size 116 alloc'd ==6755== at 0x4C2EBAB: malloc (vg_replace_malloc.c:299) ==6755== by 0x23089DC4: ac_elf_read (ac_binary.c:144) ==6755== by 0x23090A60: ac_compile_module_to_binary (ac_llvm_helper.cpp:162) ==6755== by 0x23053F06: compile_to_memory_buffer (radv_llvm_helper.cpp:58) ==6755== by 0x23053F06: radv_compile_to_binary (radv_llvm_helper.cpp:98) ==6755== by 0x23052769: ac_llvm_compile (radv_nir_to_llvm.c:3394) ==6755== by 0x23052823: ac_compile_llvm_module (radv_nir_to_llvm.c:3418) ==6755== by 0x23053C05: radv_compile_nir_shader (radv_nir_to_llvm.c:3542) ==6755== by 0x23061B4E: shader_variant_create (radv_shader.c:580) ==6755== by 0x23061CFD: radv_shader_variant_create (radv_shader.c:634) ==6755== by 0x23057765: radv_create_shaders (radv_pipeline.c:2123) ==6755== by 0x2305C523: radv_pipeline_init (radv_pipeline.c:3404) ==6755== by 0x2305C890: radv_graphics_pipeline_create (radv_pipeline.c:3515) Since we are just inserting the code into the cache, we can avoid these bad reads and data in the cache by just using the binary code size here. Fixes: `939e5a382` (radv: add padding for the UMR disassembler) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-28 06:20:20 +10:00
Daniel Schürmann	62024fa775	radv: enable VK_KHR_16bit_storage extension / 16bit storage features Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:26 +02:00
Danylo Piliaiev	494a206229	radv: Fix incorrect assumption about ternary operator precedence Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 10:04:27 +02:00
Dave Airlie	6f3aee40f9	radv: using tls to store llvm related info and speed up compiles (v10) This uses the common compiler passes abstraction to help radv avoid fixed cost compiler overheads. This uses a linked list per thread stored in thread local storage, with an entry in the list for each target machine. This should remove all the fixed overheads setup costs of creating the pass manager each time. This takes a demo app time to compile the radv meta shaders on nocache and exit from 1.7s to 1s. It also has been reported to take the startup time of uncached shaders on RoTR from 12m24s to 11m35s (Alex) v2: fix llvm6 build, inline emit function, handle multiple targets in one thread v3: rebase and port onto new structure v4: rename some vars (Bas) v5: drag all code into radv for now, we can refactor it out later for radeonsi if we make it shareable v6: use a bit more C++ in the wrapper v7: logic bugs fixed so it actually runs again. v8: rebase on top of radeonsi changes. v9: drop some C++ headers, cleanup list entry v10: use pop_back (didn't have enough caffeine) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-10 07:58:03 +10:00
Dave Airlie	7398913a62	ac/radv: move llvm compiler info to struct and init in one place This ports radv to the shared code, however due to a bug in LLVM version prior to 7, radv cannot add target info at this stage, as it would leak one for every shader compile, however I'd prefer to keep this llvm damage in the shared code, since it isn't the driver at fault here. We just add a flag to denote if the driver can support leaking the target info or not, and the common code does the right thing depending on the llvm version. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 10:29:16 +10:00
Dave Airlie	35c82af539	radv/radeonsi: add a check ir tm options This doesn't do much yet, but it makes it easier to move the code to a common shared code base. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:32:35 +10:00
Dave Airlie	e1387eaf12	radv: create/destroy passmgr at the higher level. This is prep work for moving this to a per-thread struct Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:31:05 +10:00
Dave Airlie	f2b3e96e75	radv: drop copy of ac_create_target_machine. Once we split the init once stuff out, this can be shared again. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:15:35 +10:00
Dave Airlie	473be16c74	ac/radv: split the non-common init_once code from the common target code. (v2) This just splits out the non-shared code and reuses ac_get_llvm_target in radv. v2: rebase on Marek's patch - fixup brace position/whitespace Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:15:23 +10:00
Samuel Pitoiset	939e5a3823	radv: add padding for the UMR disassembler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-02 10:42:17 +02:00
Samuel Pitoiset	bcbd8dd6c9	radv: enable VK_EXT_shader_stencil_export The driver already supports exporting the stencil value. The following CTS test now passes: dEQP-VK.pipeline.shader_stencil_export.op_replace Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-26 10:40:10 +02:00
Bas Nieuwenhuizen	8c4f430d43	radv: Enable lower_io_to_temporaries after deref changes. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 21:23:06 -07:00
Rob Clark	d143f6c856	move lower_deref_instrs Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	3573570afe	radv: Disable lower_io_to_temporaries during deref changes. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	c11833ab24	nir,spirv: Rework function calls This commit completely reworks function calls in NIR. Instead of having a set of variables for the parameters and return value, nir_call_instr now simply has a number of sources which get mapped to load_param intrinsics inside the functions. It's up to the client API to build an ABI on top of that. In SPIR-V, out parameters are handled by passing the result of a deref through as an SSA value and storing to it. The virtue of this approach can be seen by how much it allows us to delete from core NIR. In particular, nir_inline_functions gets halved and goes from a fairly difficult pass to understand in detail to almost trivial. It also simplifies spirv_to_nir somewhat because NIR functions never were a good fit for SPIR-V. Unfortunately, there is no good way to do this without a mega-commit. Core NIR and SPIR-V have to be changed at the same time. This also requires changes to anv and radv because nir_inline_functions couldn't handle deref instructions before this change and can't work without them after this change. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:58 -07:00
Jason Ekstrand	b0c643d8f5	spirv: Use NIR per-member splitting Before, we were doing structure splitting in spirv_to_nir. Unfortunately, this doesn't really work when you think about passing struct pointers into functions. Doing it later in NIR is a much better plan. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	74212c2414	anv,i965,radv,st,ir3: Call nir_lower_deref_instrs This inserts a call to nir_lower_deref_instrs at every call site of glsl_to_nir, spirv_to_nir, and prog_to_nir. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Eric Engestrom	d85fef1e34	radv: fix reported number of available VGPRs It's a bit late to round up after an integer division. Fixes: `de88979413` "radv: Implement VK_AMD_shader_info" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Alex Smith <asmith@feralinteractive.com>	2018-06-18 17:08:22 +01:00
Samuel Pitoiset	bfca15e16a	radv: add RADV_DEBUG=checkir This allows to run the LLVM verifier pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:08 +02:00
Samuel Pitoiset	135e4d434f	radv: add a workaround for DXVK hangs by setting amdgpu-skip-threshold Workaround for bug in llvm that causes the GPU to hang in presence of nested loops because there is an exec mask issue. The proper solution is to fix LLVM but this might require a bunch of work. This fixes a bunch of GPU hangs that happen with DXVK. Vega10: Totals from affected shaders: SGPRS: 110456 -> 110456 (0.00 %) VGPRS: 122800 -> 122800 (0.00 %) Spilled SGPRs: 7478 -> 7478 (0.00 %) Spilled VGPRs: 36 -> 36 (0.00 %) Code Size: 9901104 -> 9922928 (0.22 %) bytes Max Waves: 7143 -> 7143 (0.00 %) Code size slightly increases because it inserts more branch instructions but that's expected. I don't see any real performance changes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105613 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-09 14:16:49 +02:00
Bas Nieuwenhuizen	38933c1151	radv: Add option to print errors even in optimized builds. Errors are not that common of a case so we can eat a slight perf hit in having to call a function and do a runtime check. In turn this makes debugging random errors happening for end users easier, because they don't have to have a debug build on hand. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-31 11:51:23 +02:00
Samuel Pitoiset	38a8c5903b	radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS Do not lower FS inputs because this moves all load_var instructions at beginning of shaders and because interp_var_at_sample (and friends) seem broken. That might be eventually enabled later on if we really want to preload all FS inputs at beginning. Polaris10: Totals from affected shaders: SGPRS: 54072 -> 54264 (0.36 %) VGPRS: 38580 -> 38124 (-1.18 %) Spilled SGPRs: 652 -> 652 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 2128116 -> 2127380 (-0.03 %) bytes Max Waves: 8048 -> 8086 (0.47 %) Vega10: Totals from affected shaders: SGPRS: 52616 -> 52656 (0.08 %) VGPRS: 37536 -> 37116 (-1.12 %) Spilled SGPRs: 828 -> 828 (0.00 %) Code Size: 2043756 -> 2042672 (-0.05 %) bytes Max Waves: 9176 -> 9254 (0.85 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-24 09:18:57 +02:00
Samuel Pitoiset	ded1509587	radv: call nir_split_var_copies() before nir_lower_var_copies() This does nothing special currently because we don't create any copy_var instructions, but this is needed for the next patch. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-24 09:18:54 +02:00
Samuel Pitoiset	d8a61d3232	radv: set amdgpu-32bit-address-high-bits LLVM attribute Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:15 +02:00
Samuel Pitoiset	6211799aff	radv: remove the radv_finishme() when compiling shaders Having an entrypoint different than "main" doesn't mean we have multiple shaders per module. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 13:48:24 +02:00
Samuel Pitoiset	1e86eaf7d8	radv: remove radv_device::llvm_supports_spill It's always true. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 13:48:21 +02:00
Samuel Pitoiset	8ade3e4684	radv: allow to dump the GS copy shader with RADV_DEBUG="shaders" Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-14 12:38:00 +02:00
Timothy Arceri	ce188813bf	radv: add initial support for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT When VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT is set we skip NIR linking optimisations and only run over the NIR optimisation loop once similar to the GLSLOptimizeConservatively constant used by some GL drivers. We need to run over the opts at least once to avoid errors in LLVM (e.g. dead vars it can't handle) and also to reduce the time spent compiling the IR in LLVM. With this change the Blacksmith Unity demos compilation times go from 329760 ms -> 299881 ms when using Wine and DXVK. V2: add bit to radv_pipeline_key Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106246	2018-05-13 09:58:33 +10:00
Samuel Pitoiset	3a410f0afc	radv: minor cleanups in radv_fill_shader_variant() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-11 12:35:05 +02:00
Iago Toral Quiroga	2d648e5ba3	compiler/lower_64bit_packing: rename the pass to be more generic It can do 32-bit packing too now. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Marek Olšák	43f0a10051	radeonsi: add triple into si_compiler Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Dave Airlie	f77caa7411	ac/radv/radeonsi: refactor max simd waves into common code. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-24 09:08:33 +10:00
Bas Nieuwenhuizen	dffdef6737	radv: Add Vega M support. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-19 16:36:21 +02:00
Bas Nieuwenhuizen	0e10790558	radv: Enable VK_EXT_descriptor_indexing. This adds everything except non-uniform indexing, which needs a bit more work and testing. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Daniel Schürmann	f2c6a55061	radv: enable subgroup capabilities Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 01:03:15 +02:00
Samuel Pitoiset	466aba9fa2	radv: add RADV_NUM_PHYSICAL_VGPRS constant Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 14:28:13 +02:00
Samuel Pitoiset	2f7bb93146	radv: add radv_get_num_physical_sgprs() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 14:28:13 +02:00
Samuel Pitoiset	acf60abc54	radv: enable VK_EXT_shader_viewport_index_layer The driver already supports exporting the Layer and ViewportIndex built-ins from vertex or tessellation shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-03 14:05:46 +02:00
Daniel Schürmann	b91cd5dba4	radv: enable VK_AMD_shader_trinary_minmax extension Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 01:29:39 +02:00
Timothy Arceri	9a243eccae	radv: don't lower indirects until after opts have run Noticed while passing by. Not sure if it impacts anything, but likely to impact GFX9 more than anything else since we lower inputs, outputs and locals there. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-20 15:01:44 +11:00
Dave Airlie	e8d9b7ab02	radv: lower constant initializers on output variables earlier If a shader only writes to an output via a constant initializer we need to lower it before we call nir_remove_dead_variables so that this pass sees the stores from the initializer and doesn't kill the output. Fixes test failures in new work-in-progress CTS tests: dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.float This is ported from anv: `99b57daf4a` anv/pipeline: lower constant initializers on output variables earlier from Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:40 +00:00

108 Commits