Commit Graph

150 Commits

Author SHA1 Message Date
Rob Clark 064c806d23 freedreno/ir3: Add load/store_global lowering
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13300>
2021-10-21 18:59:57 +00:00
Rob Clark f45b7c58c4 freedreno/ir3: Lower 64b phis
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13300>
2021-10-21 18:59:57 +00:00
Danylo Piliaiev bee9212efb ir3/freedreno: add 64b undef lowering
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13300>
2021-10-21 18:59:57 +00:00
Rob Clark 2d65e6f56d freedreno/ir3: 64b intrinsic lowering
Both for OpenCL and VK_KHR_buffer_device_address

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13300>
2021-10-21 18:59:57 +00:00
Rob Clark 96b37b9546 freedreno/ir3: Remove used unused
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13160>
2021-10-04 15:10:07 +00:00
Emma Anholt 1cc8523c5c freedreno/ir3: Use LDIB for coherent image loads on a5xx.
If the coherent flag is present, then we need to not have an incoherent
cache between us and previous stores to the image that were also decorated
as coherent.  isam apparently (unsurprisingly) goes through a texture
cache.  Use ldib instead, so that we don't get the wrong result.

We would need a similar fix for a4xx, but that uses ldgb and I don't
have hardware to test on.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12704>
2021-09-03 18:17:07 +00:00
Danylo Piliaiev 6373dd814a ir3/a6xx,freedreno: account for resinfo return size dependency on IBO_0_FMT
On a6xx resinfo returns size in bytes divided by IBO_0_FMT format size
(not just size in dwords), we have to shift it back to NIR meaning which
is size in bytes.

Make freedreno use 16b buffers when they are supported in order to be
able to depend on hardware capabilities when lowering ssbo size.

Fixes: ce1a381e57 "turnip: enable VK_KHR_16bit_storage on A650"

Fixes cts tests:
    dEQP-VK.ssbo.unsized_array_length.float_offset_explicit_size
    dEQP-VK.ssbo.unsized_array_length.float_no_offset_whole_size
    dEQP-VK.compute.basic.write_multiple_unsized_arr_single_invocation
and many more

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12485>
2021-09-01 16:09:20 +03:00
Emma Anholt 83e9a7fbcf freedreno/ir3: Align driver param upload size/offset for indirect uploads.
For indirect draws, we have to upload some of the params as indirect
references, which have a more strict size requirement.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12455>
2021-08-19 14:43:06 -07:00
Eric Anholt 5b5dcbfe89 freedreno/a6xx: Skip setting up image dims constants.
We just use resinfo anyway.  Notably, a6xx was only doing its setup in the
FS case and not CS.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12258>
2021-08-18 00:15:18 +00:00
Eric Anholt 994793c500 freedreno/ir3: Move a6xx's get_ssbo_size shl to NIR.
Just cleaning up a TODO.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12258>
2021-08-18 00:15:18 +00:00
Eric Anholt 547a2aa051 freedreno/ir3: Use the resinfo path for ssbo sizes on GL, too.
Less state walking at draw time, in exchange for a SHL in the lookup.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12258>
2021-08-18 00:15:18 +00:00
Emma Anholt 513920ba82 freedreno/ir3: Only lower cube image sizes once.
shader variants can cause ir3_nir_finalize() to run more than once, which
would make us keep dividing the size by 6.

Fixes: a48fc88571 ("freedreno/a6xx: Apply the cube image size lowering to GL, too.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12258>
2021-08-18 00:15:18 +00:00
Emma Anholt a48fc88571 freedreno/a6xx: Apply the cube image size lowering to GL, too.
Fixes KHR-GLES31.core.texture_cube_map_array.texture_size_compute_sh.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12256>
2021-08-17 20:00:49 +00:00
Rob Clark 7ba6100c2a freedreno/ir3/lower_io_offsets: Drop gpu_id param
It was unused.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12159>
2021-08-06 18:51:50 +00:00
Rob Clark cc72eeb077 freedreno/ir3: Reduce use of compiler->gpu_id
For the same reason as previous patch.  Mostly we only care about the
generation, so convert things to use compiler->gen instead.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12159>
2021-08-06 18:51:50 +00:00
Timothy Arceri a9ed4538ab nir: add indirect loop unrolling to compiler options
This is where it should be rather than having to pass it into the
optimisation pass every time.

It also allows us to call the loop analysis pass without having to
duplicate these options which we will do later in this series.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>
2021-08-03 10:54:50 +00:00
Connor Abbott 177138d8cb ir3: Reformat source with clang-format
Generated using:

cd src/freedreno/ir3 && clang-format -i {**,.}/*.c {**,.}/*.h -style=file

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11801>
2021-07-12 20:57:21 +00:00
Connor Abbott d53984ce97 ir3/nir: Lower indirect references of compact variables
Fixes Sascha Willems "tessellation" demo on Turnip (it contains
indirect dereference of tessellation levels).

Fixes: 643f2cb ("ir3, tu: Cleanup indirect i/o lowering")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11781>
2021-07-09 09:48:21 +00:00
Connor Abbott 8176657ead ir3/nir: Call nir_lower_subgroups
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott 68b8b9e9e1 tu, ir3: Plumb through support for CS subgroup size/id
The way that the blob obtains the subgroup id on compute shaders is by
just and'ing gl_LocalInvocationIndex with 63, since it advertizes a
subgroupSize of 64. In order to support VK_EXT_subgroup_size_control and
expose a subgroupSize of 128, we'll have to do something a little more
flexible. Sometimes we have to fall back to a subgroup size of 64 due to
various constraints, and in that case we have to fake a subgroup size of
128 while actually using 64 under the hood, by just pretending that the
upper 64 invocations are all disabled. However when computing the
subgroup id we need to use the "real" subgroup size. For this purpose we
plumb through a driver param which exposes the real subgroup size. If
the user forces a particular subgroup size then we lower
load_subgroup_size in nir_lower_subgroups, otherwise we let it through,
and we assume when translating to ir3 that load_subgroup_size means
"give me the *actual* subgroup size that you decided in RA" and give you
the driver param.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Rob Clark 140ce4f8ed freedreno+ir3: Enable INT16
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11545>
2021-06-29 23:27:28 +00:00
Emma Anholt caa5c5b12e freedreno/ir3: Move NIR printing to mesa_log.
Now we can get some NIR debug on Android.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9262>
2021-06-18 18:18:35 +00:00
Rhys Perry 1cbcfb8b38 nir, nir/algebraic: add byte/word insertion instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:42 +00:00
Caio Marcelo de Oliveira Filho c8a7bd0dc8 nir: Rename WORK_GROUP (and similar) to WORKGROUP
Be consistent with other usages in Vulkan and SPIR-V, and the recently
added workgroup_size field.

Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>
2021-06-07 22:34:42 +00:00
Caio Marcelo de Oliveira Filho a71a780598 nir: Rename nir_intrinsic_load_local_group_size to nir_intrinsic_load_workgroup_size
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>
2021-06-07 22:34:42 +00:00
Connor Abbott 0ab01f4215 ir3: Call nir_lower_wrmask() again after lowering scratch
I forgot that after rebasing on large_consts support that this is now
called after the first time nir_lower_wrmask is called and can generate
partial writemasks that need to be lowered. While we're here, also call
the main optimization loop if things are lowered to scratch because it
generates address arithmetic that may need to be cleaned up.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10922>
2021-05-21 20:45:07 +00:00
Connor Abbott a40714abf7 nir/lower_phis_to_scalar: Add "lower_all" option
We don't want to have to deal with vector phis in freedreno, because
vectors are always split/unsplit around vectorized instructions anyways,
and the stated reason for not scalarising them (it hurting coalescing)
won't apply to us because we won't be using nir_from_ssa. Add this
option so that we don't have to do the equivalent thing while
translating from NIR.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10809>
2021-05-17 09:59:45 +00:00
Danylo Piliaiev d8ab0ec8e4 turnip: implement VK_KHR_vulkan_memory_model
No handling of Acquire/Release because at the moment scheduler
works as if any barrier is Acq+Rel.

Instead of removing scoped_barrier with scope/mode that for TCS
corresponds to a control_barrier or a memory_barrier_tcs_patch in
ir3_nir_lower_tess_ctrl - remove them in emit_intrinsic_barrier.
And do the same for memory_barrier_tcs_patch and control_barrier.
While in any case hw fence/barrier shouldn't be emitted for them,
they still affect ordering of stores, and in feature ir3 backend
may want to have that information.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9054>
2021-05-05 10:05:38 +00:00
Connor Abbott 643f2cb8a3 ir3, tu: Cleanup indirect i/o lowering
Do all the necessary lowering in one place, during finalization, and
stop uselessly calling nir_lower_indirect_derefs in turnip. Splitting
i/o to elements should no longer be necessary since we use the i/o
semantics instead of variables now.

This has the side effect that we no longer generate enormous if-ladders
for tess/GS shaders with turnip.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7274>
2021-04-26 17:07:02 +00:00
Eric Anholt 7d234da6ee freedreno: Fix YUV sampler regression.
We have to keep sampler uniforms around for later YUV lowering, and we
only need to remove uniforms that take up storage space.  Code comes from
radeonsi.

Closes: #4644.
Fixes: de17b4aab5 ("freedreno: Remove uniform variables after finalizing NIR.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10246>
2021-04-15 16:20:15 +00:00
Connor Abbott c68ea960a7 ir3, tu: Add compiler flag for robust UBO behavior
This needs to be part of the compiler because it's the only piece that
we always have access to in all the places ir3_optimize_loop() is
called, and it's only enabled for the whole Vulkan device. Right now
it's just used for constraining vectorization, but the next commit adds
another use.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>
2021-04-15 16:05:11 +02:00
Marek Olšák fb29cef8dd nir: add many passes that lower and optimize 16-bit input/outputs and samplers
Added:
* a pass that renumbers bases of IO intrinsics
* a pass that converts mediump IO to 16 bits, optionally using the new
  packed varying slots
* a pass that sets (forces) mediump in IO intrinsics (for testing)
* a pass that remaps VARYING_SLOT_VAR[0..15]_16BIT to VARYING_SLOT_VAR[0..31]
  (if some shader stages don't want packed varyings)
* a pass that folds type conversions around texture opcodes into those
  opcodes (e.g. tex(f2f32(coord), ..) is changed into tex accepting f16)
* a pass that changes (legalizes) sampler src and dst types based on specified
  hw constraints (e.g. derivatives must be the same type as coordinates)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9050>
2021-04-13 05:07:42 +00:00
Rhys Perry a2619b97f5 nir/lower_idiv: add options to use fp32 for 8-bit division lowering
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>
2021-04-12 16:19:46 +00:00
Danylo Piliaiev 2087168a30 turnip,ir3: account for dispatch group offsets
Fixes tests:
 dEQP-VK.compute.device_group.dispatch_base

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9339>
2021-03-29 14:31:44 +03:00
Danylo Piliaiev ae3b95daa7 turnip: lower device index to zero
Vulkan 1.1 has VK_KHR_device_group and VK_KHR_device_group_creation
promoted to core, thus we should handle DeviceIndex built-in.

While we are here, also add these extensions to the extensions list,
even though they are not doing anything useful.

Fixes test:
 dEQP-VK.compute.device_group.device_index

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9516>
2021-03-11 21:12:52 +00:00
Jason Ekstrand 117668b811 nir: Make nir_ssa_def_rewrite_uses take an SSA value
This commit replaces the new_src parameter of nir_ssa_def_rewrite_uses()
with an SSA def, removes nir_ssa_def_rewrite_uses_ssa(), and rewrites
all the users as needed.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
2021-03-08 16:59:55 +00:00
Eric Anholt c93fd1046a freedreno: Use the mesa/st frontend lowering of GL_CLAMP.
350 lines of code for this stupid feature, and we weren't even doing it
right for CS/GS/tess.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9267>
2021-02-25 00:38:11 +00:00
Eric Anholt 5fa27e6670 freedreno: Drop custom driver lowering of GL's color clamping.
The mesa/st frontend can do it for us now that we don't need to worry
about breaking precompiles.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8997>
2021-02-24 21:48:54 +00:00
Eric Anholt 3b9f6af1a9 freedreno: Drop custom driver lowering of two-sided color.
The GL frontend can do it for us now, so just use their code instead of
our own shader variants.  In the past we had to do hide the GL shader
variants in the driver to get precompiles from st, but no longer as of
!8601.

I tested with drawoverhead -test 6 (shader program change, n=30) and -test
1 (no statechanges, n=43) and saw no change in driver overhead.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8997>
2021-02-24 21:48:54 +00:00
Eric Anholt de17b4aab5 freedreno: Remove uniform variables after finalizing NIR.
mesa/st optimizes the uniform storage if you have the finalize hook in
place, causing the uniforms declared to potentially not have storage in
the ParameterValues list any more.  If you leave your uniforms around in
the NIR, then a later finalization after variant creation will re-add the
uniforms to parameters, defeating the optimization and likely reallocating
the uniform storage (causing use-after-free).  So, we have to do this
before we can start using variants in mesa/st.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8997>
2021-02-24 21:48:54 +00:00
Daniel Schürmann bd8e84eb8d nir: replace .lower_sub with .has_fsub and .has_isub
This allows a more fine-grained control about whether
a backend supports one of these instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>
2021-01-11 19:13:51 +00:00
Rhys Perry f199b7188b nir/load_store_vectorize: add data as callback args
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry 00c8bec47b nir: add nir_load_store_vectorize_options
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Erik Faye-Lund 5461e21245 Revert "freedreno/ir3: Use get_once() for one-time init"
This reverts commit b4ad27a986.

Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7760>
2020-11-25 09:44:11 +00:00
Rob Clark b4ad27a986 freedreno/ir3: Use get_once() for one-time init
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7644>
2020-11-24 21:03:34 +00:00
Connor Abbott bac6cc586f ir3: Enable nir_lower_vars_to_scratch on a6xx
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7386>
2020-11-19 17:55:58 +01:00
Connor Abbott 4970aa5577 ir3: Initial support for private memory
Add information that the driver will need to setup registers, and
implement support for load_scratch/store_scratch using private memory.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7386>
2020-11-19 17:55:03 +01:00
Eric Anholt 1f44053301 freedreno+turnip: Upload large shader constants as a UBO.
Right now if the shader indirects on some large constant array, we see NIR
load_consts (usually from the const file) of its contents into general
registers, then indirection on the GPRs.  This often results in register
allocation failures, as it's easy to go beyond the ~256 dwords of
registers per invocation.

By moving the large constants to a UBO, we can load an arbitrary number of
them.  They also can be theoretically moved to the constant reg file (~2k
dwords), though you're unlikely to hit this path without an indirect load
on your large constant, and we don't yet let UBO indirect loads get moved
to constant regs.

This possibly won't work out right if we have 16-bit load_constants, but
without other MRs in flight we won't see 16-bit temps to be lowered to
this.

This allows 2 kerbal-space-program shaders to compile that previously
would fail, and fixes the new dEQP-VK and -GLES2 tests I wrote that
dynamically index a 40-element temporary array of float/vec2/vec3/vec4
with constant element initializers.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2789
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5810>
2020-11-16 13:55:41 -08:00
Rob Clark cf9ef90066 freedreno/ir3: Add pass to deal with load_uniform base offsets
With indirect load_uniform, we can only encode 10b of constant base
offset.  This pass detects problematic cases and peels out the high
bits of the base offset.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7612>
2020-11-13 22:44:04 +00:00
Connor Abbott f2ae8d116a freedreno/a6xx: Implement user clip/cull distances
Also, plumb things through ir3 so that we don't lower clip planes to
discard anymore.

This seems to fix some artifacts in the neverball trace.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6959>
2020-10-23 11:09:18 +00:00