KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Rob Clark	064c806d23	freedreno/ir3: Add load/store_global lowering Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13300>	2021-10-21 18:59:57 +00:00
Rob Clark	f45b7c58c4	freedreno/ir3: Lower 64b phis Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13300>	2021-10-21 18:59:57 +00:00
Danylo Piliaiev	bee9212efb	ir3/freedreno: add 64b undef lowering Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13300>	2021-10-21 18:59:57 +00:00
Rob Clark	2d65e6f56d	freedreno/ir3: 64b intrinsic lowering Both for OpenCL and VK_KHR_buffer_device_address Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13300>	2021-10-21 18:59:57 +00:00
Rob Clark	96b37b9546	freedreno/ir3: Remove used unused Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13160>	2021-10-04 15:10:07 +00:00
Emma Anholt	1cc8523c5c	freedreno/ir3: Use LDIB for coherent image loads on a5xx. If the coherent flag is present, then we need to not have an incoherent cache between us and previous stores to the image that were also decorated as coherent. isam apparently (unsurprisingly) goes through a texture cache. Use ldib instead, so that we don't get the wrong result. We would need a similar fix for a4xx, but that uses ldgb and I don't have hardware to test on. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12704>	2021-09-03 18:17:07 +00:00
Danylo Piliaiev	6373dd814a	ir3/a6xx,freedreno: account for resinfo return size dependency on IBO_0_FMT On a6xx resinfo returns size in bytes divided by IBO_0_FMT format size (not just size in dwords), we have to shift it back to NIR meaning which is size in bytes. Make freedreno use 16b buffers when they are supported in order to be able to depend on hardware capabilities when lowering ssbo size. Fixes: `ce1a381e57` "turnip: enable VK_KHR_16bit_storage on A650" Fixes cts tests: dEQP-VK.ssbo.unsized_array_length.float_offset_explicit_size dEQP-VK.ssbo.unsized_array_length.float_no_offset_whole_size dEQP-VK.compute.basic.write_multiple_unsized_arr_single_invocation and many more Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12485>	2021-09-01 16:09:20 +03:00
Emma Anholt	83e9a7fbcf	freedreno/ir3: Align driver param upload size/offset for indirect uploads. For indirect draws, we have to upload some of the params as indirect references, which have a more strict size requirement. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12455>	2021-08-19 14:43:06 -07:00
Eric Anholt	5b5dcbfe89	freedreno/a6xx: Skip setting up image dims constants. We just use resinfo anyway. Notably, a6xx was only doing its setup in the FS case and not CS. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12258>	2021-08-18 00:15:18 +00:00
Eric Anholt	994793c500	freedreno/ir3: Move a6xx's get_ssbo_size shl to NIR. Just cleaning up a TODO. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12258>	2021-08-18 00:15:18 +00:00
Eric Anholt	547a2aa051	freedreno/ir3: Use the resinfo path for ssbo sizes on GL, too. Less state walking at draw time, in exchange for a SHL in the lookup. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12258>	2021-08-18 00:15:18 +00:00
Emma Anholt	513920ba82	freedreno/ir3: Only lower cube image sizes once. shader variants can cause ir3_nir_finalize() to run more than once, which would make us keep dividing the size by 6. Fixes: `a48fc88571` ("freedreno/a6xx: Apply the cube image size lowering to GL, too.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12258>	2021-08-18 00:15:18 +00:00
Emma Anholt	a48fc88571	freedreno/a6xx: Apply the cube image size lowering to GL, too. Fixes KHR-GLES31.core.texture_cube_map_array.texture_size_compute_sh. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12256>	2021-08-17 20:00:49 +00:00
Rob Clark	7ba6100c2a	freedreno/ir3/lower_io_offsets: Drop gpu_id param It was unused. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12159>	2021-08-06 18:51:50 +00:00
Rob Clark	cc72eeb077	freedreno/ir3: Reduce use of compiler->gpu_id For the same reason as previous patch. Mostly we only care about the generation, so convert things to use compiler->gen instead. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12159>	2021-08-06 18:51:50 +00:00
Timothy Arceri	a9ed4538ab	nir: add indirect loop unrolling to compiler options This is where it should be rather than having to pass it into the optimisation pass every time. It also allows us to call the loop analysis pass without having to duplicate these options which we will do later in this series. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>	2021-08-03 10:54:50 +00:00
Connor Abbott	177138d8cb	ir3: Reformat source with clang-format Generated using: cd src/freedreno/ir3 && clang-format -i {**,.}/*.c {**,.}/*.h -style=file Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11801>	2021-07-12 20:57:21 +00:00
Connor Abbott	d53984ce97	ir3/nir: Lower indirect references of compact variables Fixes Sascha Willems "tessellation" demo on Turnip (it contains indirect dereference of tessellation levels). Fixes: `643f2cb` ("ir3, tu: Cleanup indirect i/o lowering") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11781>	2021-07-09 09:48:21 +00:00
Connor Abbott	8176657ead	ir3/nir: Call nir_lower_subgroups Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>	2021-07-08 16:02:41 +00:00
Connor Abbott	68b8b9e9e1	tu, ir3: Plumb through support for CS subgroup size/id The way that the blob obtains the subgroup id on compute shaders is by just and'ing gl_LocalInvocationIndex with 63, since it advertises a subgroupSize of 64. In order to support VK_EXT_subgroup_size_control and expose a subgroupSize of 128, we'll have to do something a little more flexible. Sometimes we have to fall back to a subgroup size of 64 due to various constraints, and in that case we have to fake a subgroup size of 128 while actually using 64 under the hood, by just pretending that the upper 64 invocations are all disabled. However when computing the subgroup id we need to use the "real" subgroup size. For this purpose we plumb through a driver param which exposes the real subgroup size. If the user forces a particular subgroup size then we lower load_subgroup_size in nir_lower_subgroups, otherwise we let it through, and we assume when translating to ir3 that load_subgroup_size means "give me the actual subgroup size that you decided in RA" and give you the driver param. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>	2021-07-08 16:02:41 +00:00
Rob Clark	140ce4f8ed	freedreno+ir3: Enable INT16 Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11545>	2021-06-29 23:27:28 +00:00
Emma Anholt	caa5c5b12e	freedreno/ir3: Move NIR printing to mesa_log. Now we can get some NIR debug on Android. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9262>	2021-06-18 18:18:35 +00:00
Rhys Perry	1cbcfb8b38	nir, nir/algebraic: add byte/word insertion instructions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>	2021-06-08 08:57:42 +00:00
Caio Marcelo de Oliveira Filho	c8a7bd0dc8	nir: Rename WORK_GROUP (and similar) to WORKGROUP Be consistent with other usages in Vulkan and SPIR-V, and the recently added workgroup_size field. Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>	2021-06-07 22:34:42 +00:00
Caio Marcelo de Oliveira Filho	a71a780598	nir: Rename nir_intrinsic_load_local_group_size to nir_intrinsic_load_workgroup_size Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>	2021-06-07 22:34:42 +00:00
Connor Abbott	0ab01f4215	ir3: Call nir_lower_wrmask() again after lowering scratch I forgot that after rebasing on large_consts support that this is now called after the first time nir_lower_wrmask is called and can generate partial writemasks that need to be lowered. While we're here, also call the main optimization loop if things are lowered to scratch because it generates address arithmetic that may need to be cleaned up. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10922>	2021-05-21 20:45:07 +00:00
Connor Abbott	a40714abf7	nir/lower_phis_to_scalar: Add "lower_all" option We don't want to have to deal with vector phis in freedreno, because vectors are always split/unsplit around vectorized instructions anyways, and the stated reason for not scalarising them (it hurting coalescing) won't apply to us because we won't be using nir_from_ssa. Add this option so that we don't have to do the equivalent thing while translating from NIR. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10809>	2021-05-17 09:59:45 +00:00
Danylo Piliaiev	d8ab0ec8e4	turnip: implement VK_KHR_vulkan_memory_model No handling of Acquire/Release because at the moment the scheduler works as if any barrier is Acq+Rel. Instead of removing scoped_barrier with scope/mode that for TCS corresponds to a control_barrier or a memory_barrier_tcs_patch in ir3_nir_lower_tess_ctrl - remove them in emit_intrinsic_barrier. And do the same for memory_barrier_tcs_patch and control_barrier. While in any case hw fence/barrier shouldn't be emitted for them, they still affect ordering of stores, and in future the ir3 backend may want to have that information. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9054>	2021-05-05 10:05:38 +00:00
Connor Abbott	643f2cb8a3	ir3, tu: Cleanup indirect i/o lowering Do all the necessary lowering in one place, during finalization, and stop uselessly calling nir_lower_indirect_derefs in turnip. Splitting i/o to elements should no longer be necessary since we use the i/o semantics instead of variables now. This has the side effect that we no longer generate enormous if-ladders for tess/GS shaders with turnip. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7274>	2021-04-26 17:07:02 +00:00
Eric Anholt	7d234da6ee	freedreno: Fix YUV sampler regression. We have to keep sampler uniforms around for later YUV lowering, and we only need to remove uniforms that take up storage space. Code comes from radeonsi. Closes: #4644. Fixes: `de17b4aab5` ("freedreno: Remove uniform variables after finalizing NIR.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10246>	2021-04-15 16:20:15 +00:00
Connor Abbott	c68ea960a7	ir3, tu: Add compiler flag for robust UBO behavior This needs to be part of the compiler because it's the only piece that we always have access to in all the places ir3_optimize_loop() is called, and it's only enabled for the whole Vulkan device. Right now it's just used for constraining vectorization, but the next commit adds another use. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>	2021-04-15 16:05:11 +02:00
Marek Olšák	fb29cef8dd	nir: add many passes that lower and optimize 16-bit input/outputs and samplers Added: * a pass that renumbers bases of IO intrinsics * a pass that converts mediump IO to 16 bits, optionally using the new packed varying slots * a pass that sets (forces) mediump in IO intrinsics (for testing) * a pass that remaps VARYING_SLOT_VAR[0..15]_16BIT to VARYING_SLOT_VAR[0..31] (if some shader stages don't want packed varyings) * a pass that folds type conversions around texture opcodes into those opcodes (e.g. tex(f2f32(coord), ..) is changed into tex accepting f16) * a pass that changes (legalizes) sampler src and dst types based on specified hw constraints (e.g. derivatives must be the same type as coordinates) Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9050>	2021-04-13 05:07:42 +00:00
Rhys Perry	a2619b97f5	nir/lower_idiv: add options to use fp32 for 8-bit division lowering Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>	2021-04-12 16:19:46 +00:00
Danylo Piliaiev	2087168a30	turnip,ir3: account for dispatch group offsets Fixes tests: dEQP-VK.compute.device_group.dispatch_base Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9339>	2021-03-29 14:31:44 +03:00
Danylo Piliaiev	ae3b95daa7	turnip: lower device index to zero Vulkan 1.1 has VK_KHR_device_group and VK_KHR_device_group_creation promoted to core, thus we should handle DeviceIndex built-in. While we are here, also add these extensions to the extensions list, even though they are not doing anything useful. Fixes test: dEQP-VK.compute.device_group.device_index Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9516>	2021-03-11 21:12:52 +00:00
Jason Ekstrand	117668b811	nir: Make nir_ssa_def_rewrite_uses take an SSA value This commit replaces the new_src parameter of nir_ssa_def_rewrite_uses() with an SSA def, removes nir_ssa_def_rewrite_uses_ssa(), and rewrites all the users as needed. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@collabora.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>	2021-03-08 16:59:55 +00:00
Eric Anholt	c93fd1046a	freedreno: Use the mesa/st frontend lowering of GL_CLAMP. 350 lines of code for this stupid feature, and we weren't even doing it right for CS/GS/tess. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9267>	2021-02-25 00:38:11 +00:00
Eric Anholt	5fa27e6670	freedreno: Drop custom driver lowering of GL's color clamping. The mesa/st frontend can do it for us now that we don't need to worry about breaking precompiles. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8997>	2021-02-24 21:48:54 +00:00
Eric Anholt	3b9f6af1a9	freedreno: Drop custom driver lowering of two-sided color. The GL frontend can do it for us now, so just use their code instead of our own shader variants. In the past we had to hide the GL shader variants in the driver to get precompiles from st, but no longer as of !8601. I tested with drawoverhead -test 6 (shader program change, n=30) and -test 1 (no statechanges, n=43) and saw no change in driver overhead. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8997>	2021-02-24 21:48:54 +00:00
Eric Anholt	de17b4aab5	freedreno: Remove uniform variables after finalizing NIR. mesa/st optimizes the uniform storage if you have the finalize hook in place, causing the uniforms declared to potentially not have storage in the ParameterValues list any more. If you leave your uniforms around in the NIR, then a later finalization after variant creation will re-add the uniforms to parameters, defeating the optimization and likely reallocating the uniform storage (causing use-after-free). So, we have to do this before we can start using variants in mesa/st. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8997>	2021-02-24 21:48:54 +00:00
Daniel Schürmann	bd8e84eb8d	nir: replace .lower_sub with .has_fsub and .has_isub This allows a more fine-grained control about whether a backend supports one of these instructions. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>	2021-01-11 19:13:51 +00:00
Rhys Perry	f199b7188b	nir/load_store_vectorize: add data as callback args Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>	2021-01-07 16:34:53 +00:00
Rhys Perry	00c8bec47b	nir: add nir_load_store_vectorize_options Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>	2021-01-07 16:34:53 +00:00
Erik Faye-Lund	5461e21245	Revert "freedreno/ir3: Use get_once() for one-time init" This reverts commit `b4ad27a986`. Acked-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7760>	2020-11-25 09:44:11 +00:00
Rob Clark	b4ad27a986	freedreno/ir3: Use get_once() for one-time init Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7644>	2020-11-24 21:03:34 +00:00
Connor Abbott	bac6cc586f	ir3: Enable nir_lower_vars_to_scratch on a6xx Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7386>	2020-11-19 17:55:58 +01:00
Connor Abbott	4970aa5577	ir3: Initial support for private memory Add information that the driver will need to setup registers, and implement support for load_scratch/store_scratch using private memory. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7386>	2020-11-19 17:55:03 +01:00
Eric Anholt	1f44053301	freedreno+turnip: Upload large shader constants as a UBO. Right now if the shader indirects on some large constant array, we see NIR load_consts (usually from the const file) of its contents into general registers, then indirection on the GPRs. This often results in register allocation failures, as it's easy to go beyond the ~256 dwords of registers per invocation. By moving the large constants to a UBO, we can load an arbitrary number of them. They also can be theoretically moved to the constant reg file (~2k dwords), though you're unlikely to hit this path without an indirect load on your large constant, and we don't yet let UBO indirect loads get moved to constant regs. This possibly won't work out right if we have 16-bit load_constants, but without other MRs in flight we won't see 16-bit temps to be lowered to this. This allows 2 kerbal-space-program shaders to compile that previously would fail, and fixes the new dEQP-VK and -GLES2 tests I wrote that dynamically index a 40-element temporary array of float/vec2/vec3/vec4 with constant element initializers. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2789 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5810>	2020-11-16 13:55:41 -08:00
Rob Clark	cf9ef90066	freedreno/ir3: Add pass to deal with load_uniform base offsets With indirect load_uniform, we can only encode 10b of constant base offset. This pass detects problematic cases and peels out the high bits of the base offset. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7612>	2020-11-13 22:44:04 +00:00
Connor Abbott	f2ae8d116a	freedreno/a6xx: Implement user clip/cull distances Also, plumb things through ir3 so that we don't lower clip planes to discard anymore. This seems to fix some artifacts in the neverball trace. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6959>	2020-10-23 11:09:18 +00:00


150 Commits