2409 Commits

Author SHA1 Message Date
Rob Clark 78c8a8af80 freedreno: Generate device-info tables at build time
This way we can make the tables const.  At the same time, for a6xx, this
introduces a "sub-generation template" to reduce the copy/paste for
parameters which are keyed to the sub-generation.  It also explicitly
lists every supported GPU, to get rid of duplicate lists of supported
gpus between the device-info and drivers.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11790>
2021-07-14 01:58:00 +00:00
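As a rough illustration of the "sub-generation template" idea in 78c8a8af80, the sketch below expands an explicit list of GPUs by merging each entry with the defaults of its sub-generation. All names and values here (`SUBGEN_TEMPLATES`, `reg_size_vec4`, `num_ccu`, `gpu_a`, and so on) are hypothetical placeholders, not the contents or layout of the actual freedreno tables.

```python
# Hypothetical sub-generation templates: parameters shared by every GPU
# in a sub-generation (names and values are illustrative only).
SUBGEN_TEMPLATES = {
    "subgen_x": {"reg_size_vec4": 96, "has_feature_y": False},
    "subgen_z": {"reg_size_vec4": 64, "has_feature_y": True},
}

# Every supported GPU is listed explicitly, with only its own overrides.
GPUS = {
    "gpu_a": {"subgen": "subgen_x", "num_ccu": 1},
    "gpu_b": {"subgen": "subgen_x", "num_ccu": 2},
    "gpu_c": {"subgen": "subgen_z", "num_ccu": 3},
}


def build_device_table(gpus, templates):
    """Expand each GPU entry into a full parameter set."""
    table = {}
    for name, entry in gpus.items():
        params = dict(templates[entry["subgen"]])  # start from the template
        # Per-GPU values are layered on top of the shared defaults.
        params.update({k: v for k, v in entry.items() if k != "subgen"})
        table[name] = params
    return table


DEVICE_TABLE = build_device_table(GPUS, SUBGEN_TEMPLATES)
```

A build-time generator could then write `DEVICE_TABLE` out as a constant table, and the same explicit GPU list could serve as the single list of supported devices.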
Rob Clark 0eda0188aa freedreno: Rename *_dev_info
Everywhere else symbols/types/etc are shortened to "fd_*", so let's do the
same here for consistency.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11790>
2021-07-14 01:58:00 +00:00
Jonathan Marek 1a6dd7f9b1 freedreno/common: unhardcode CCU color cache offset
Replace it with a calculation which works for all current GPUs.

Duplicated the calculation in both drivers because freedreno_dev_info isn't
meant for derived parameters (and drivers might want to just calculate on
the fly instead).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11790>
2021-07-14 01:58:00 +00:00
Jonathan Marek d34b18a6ce tu: remove workaround for conditional rendering + hw binning
- It hurts users with newer firmware who don't need the workaround
- Kernel now rejects older firmware due to security issues, so this will
  prevent users from using older firmware anyway.
- Only whitelisting 650 enables the workaround by default for any new GPUs

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11790>
2021-07-14 01:58:00 +00:00
Emma Anholt 10d8e123c5 freedreno: Optimize duplicate obj-obj ring relocs.
No need to include the same BO multiple times in the long-lived ringbuffer
object's list of relocs to be added to the submit.

Improves non-TC drawoverhead -test 9 (8 tex updates) throughput by 1.4901%
+/- 0.8705% (n=20)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11697>
2021-07-13 22:12:56 +00:00
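The deduplication described in 10d8e123c5 can be pictured with the toy model below: a long-lived ring object records each referenced BO once, no matter how often it is referenced. `RingObject` and `emit_reloc` are hypothetical names for this sketch and do not mirror the driver's actual data structures.

```python
class RingObject:
    """Toy stand-in for a long-lived ringbuffer object (hypothetical)."""

    def __init__(self):
        self.reloc_bos = []  # BOs to add to the submit, in first-seen order
        self._seen = set()   # identities of BOs already recorded

    def emit_reloc(self, bo):
        # Record each BO only once, however many times it is referenced.
        if id(bo) not in self._seen:
            self._seen.add(id(bo))
            self.reloc_bos.append(bo)
```

With this shape, the cost of adding the object's BOs to a submit scales with the number of distinct BOs rather than the number of references.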
Emma Anholt 737d4caa83 freedreno: Suballocate our long-lived ring objects.
On drawoverhead -test 9 (8 texture changes), this saves us 172kb of
memory.  That's only ~1% of the GEM memory while the test is running, but
more importantly it saves us 29% of the gem BO allocations.

non-TC drawoverhead -test 9 (8 texture change) throughput 0.449019% +/-
0.336296% (n=100), but this gets better as we get better suballocation
density.

Note that this means that all fd_ringbuffer_new_object calls can now
return data aligned to 64 bytes, instead of 4k.  We may find that we need
to increase it if some of our objects (tex consts, sampler consts, etc.)
require more alignment than that.  But, this may help non-drawoverhead
perf if any of our RB objects have a cache in front of them (indirect
consts?) and we don't have most of our data in the same cache set any
more.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11697>
2021-07-13 22:12:56 +00:00
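To make the alignment change in 737d4caa83 concrete, here is a minimal bump-style suballocator that hands out 64-byte-aligned ranges from a shared buffer and only opens a new buffer when the current one is full. The 4096-byte buffer size and the `Suballocator` class are assumptions made for this sketch; the commit does not describe the allocator at this level of detail.

```python
ALIGNMENT = 64   # alignment stated in the commit message
BO_SIZE = 4096   # assumed size of one backing buffer, for illustration


def align_up(value, alignment):
    """Round value up to the next multiple of a power-of-two alignment."""
    return (value + alignment - 1) & ~(alignment - 1)


class Suballocator:
    def __init__(self):
        self.bo_count = 0
        self.offset = BO_SIZE  # forces a new BO on the first allocation

    def alloc(self, size):
        """Return (bo_index, offset) for a 64-byte-aligned range."""
        if size > BO_SIZE:
            raise ValueError("request larger than one backing buffer")
        start = align_up(self.offset, ALIGNMENT)
        if start + size > BO_SIZE:
            self.bo_count += 1  # current BO is full: open a new one
            start = 0
        self.offset = start + size
        return (self.bo_count - 1, start)
```

Small objects now share a backing buffer, which is where the reduction in BO allocations would come from.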
Rob Clark 7f5a01a47d freedreno/ir3: Add float immed "FLUT" support
We can encode a limited set of float immeds into cat2 instructions,
using hw's float lookup table (FLUT) feature.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/36
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8705>
2021-07-13 14:40:30 +00:00
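The lookup-table idea in 7f5a01a47d can be sketched as follows: a float immediate is usable inline only if its exact bit pattern appears in a small fixed table. The table contents below are hypothetical placeholders, since the real hardware values are not listed in the commit message, and `flut_index` is a name invented for this sketch.

```python
import struct

# Hypothetical table contents; the real hardware values are not given here.
FLUT = [0.0, 0.5, 1.0, 2.0]


def float_bits(value):
    """Return the IEEE-754 single-precision bit pattern of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


# Match on bit patterns so that, for example, -0.0 is distinct from 0.0.
FLUT_INDEX = {float_bits(v): i for i, v in enumerate(FLUT)}


def flut_index(value):
    """Return the table index if value can be encoded inline, else None."""
    return FLUT_INDEX.get(float_bits(value))
```

A value that returns `None` here would need some other path, such as being lowered to a constant.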
Rob Clark 4b2afd11cc freedreno/computerator: Add script to probe FLUT values
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8705>
2021-07-13 14:40:30 +00:00
Rob Clark 4e802538e7 turnip: Split tu6_emit_xs()
Emit all the state layout config (such as push-const CONSTLEN) first,
before emitting anything that depends on that state.  This fixes an
issue that was showing up when FLUT is enabled in ir3 (which results
in higher probability of not having any immediates lowered to
push-consts).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8705>
2021-07-13 14:40:30 +00:00
Rob Clark 71003e3c84 turnip: avoid some UB
Reduce a bit of extra noise that makes diffing cmdstream traces more
annoying.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8705>
2021-07-13 14:40:30 +00:00
Connor Abbott baf3cc3f6f ir3/print: Manual formatting fixups
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11801>
2021-07-12 20:57:21 +00:00
Connor Abbott 177138d8cb ir3: Reformat source with clang-format
Generated using:

cd src/freedreno/ir3 && clang-format -i {**,.}/*.c {**,.}/*.h -style=file

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11801>
2021-07-12 20:57:21 +00:00
Connor Abbott 082871bb35 freedreno: Add some options to .clang-format
In preparation for reformatting ir3.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11801>
2021-07-12 20:57:21 +00:00
Connor Abbott 2e76f7b60c ir3: Manually reformat some places
clang-format does a bad job with a few tables and macros, and there were
some places it was doing wonky things because comments were longer than
80 characters and it tries to fix that without reformatting the comment
itself. Add magic comments to tell it to turn itself off and retab those
places manually (well, with a regex!).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11801>
2021-07-12 20:57:21 +00:00
Connor Abbott f69a99081b ir3: Update .editorconfig and .dir-locals.el
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11801>
2021-07-12 20:57:21 +00:00
Connor Abbott 0f28e1aad3 ir3/lower_parallelcopy: Don't manually set wrmask
It's automatically set. This avoids some weird line wrapping with
clang-format.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11801>
2021-07-12 20:57:21 +00:00
Connor Abbott 1514744a16 ir3: Add ir3_collect() for fixed-size collects
This avoids having to specify the size, and fixes weird formatting with
clang-format.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11801>
2021-07-12 20:57:21 +00:00
Connor Abbott 49a39fbf0c ir3: Add missing include to ir3_parser.y
This prevents build errors in the generated ir3_parser.h when
clang-format reshuffles the header includes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11801>
2021-07-12 20:57:21 +00:00
Emma Anholt a7e753cb96 turnip: Fix allocation size for vkCmdUpdateBuffer.
tu_cs_alloc() takes a size in dwords, not bytes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11743>
2021-07-12 17:15:56 +00:00
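The unit mismatch behind a7e753cb96 is easy to reproduce in miniature: a dword is 4 bytes, so passing a byte count where a dword count is expected asks for four times too much. The helper below is a generic sketch of the conversion and is not the code of the actual fix.

```python
DWORD_BYTES = 4  # one dword is 32 bits


def bytes_to_dwords(size_bytes):
    """Round a byte count up to whole 32-bit dwords."""
    return (size_bytes + DWORD_BYTES - 1) // DWORD_BYTES
```

For example, a 64-byte payload needs `bytes_to_dwords(64)` dwords, that is 16, rather than 64.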
Danylo Piliaiev 1c6c200c0d ir3: add newly found shlg.b16 instruction
Example of blob's output:
  (nop3) shlg.b16 hr8.x, (r)8, (r)hr8.x, 12

It does: (src2 << src1) | src2

src1 and src2 could be GPRs, relative GPRs, relative consts,
or immediates. However, they could not be plain const registers.

The blob uses it in conjunction with the "samgq" instruction.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11760>
2021-07-09 13:00:29 +00:00
Michel Dänzer f5e6674f98 ci: Rename Debian based build jobs from meson-* to debian-*
meson has been the only build system in tree for some time, so the
meson- prefix was a bit meaningless.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11412>
2021-07-09 10:24:41 +00:00
Michel Dänzer df185ae030 ci: Add debian/ prefix to job names for Debian based docker images
And move the image build scripts to a subdirectory correspondingly.

Preparation for adding images based on other OSs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11412>
2021-07-09 10:24:41 +00:00
Michel Dänzer 55caa3abb1 turnip: Mark local variable ASSERTED
It's only used in assert. Avoids compiler warning/error with assertions disabled:

../src/freedreno/vulkan/tu_cs.h: In function 'tu_cs_reserve':
../src/freedreno/vulkan/tu_cs.h:208:13: error: unused variable 'result' [-Werror=unused-variable]
  208 |    VkResult result = tu_cs_reserve_space(cs, reserved_size);
      |             ^~~~~~

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11412>
2021-07-09 10:24:41 +00:00
Connor Abbott d53984ce97 ir3/nir: Lower indirect references of compact variables
Fixes Sascha Willems "tessellation" demo on Turnip (it contains
indirect dereference of tessellation levels).

Fixes: 643f2cb ("ir3, tu: Cleanup indirect i/o lowering")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11781>
2021-07-09 09:48:21 +00:00
Jason Ekstrand d4b482d378 android: Drop the Android.mk build system
Android.mk files haven't really been supported by Mesa devs for a long
time.  Most of us have been willing to update Makefile.sources if we
remember and sometimes we try to blind code some Android.mk for a new
generator.  However, the reality is that it breaks regularly and ends up
being maintained by the Android community.  To address this problem
another approach was implemented in !10183 utilizing the maintained
meson build system.  The old Android.mk files are no longer required.

This commit was created with the following commands:

    git rm **/Android.mk
    git rm **/Android.*.mk
    git rm **/Makefile.sources
    git rm CleanSpec.mk

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4487
Acked-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9728>
2021-07-08 14:44:02 -05:00
Jason Ekstrand 624e799cc3 nir: Drop nir_ssa_def::name and nir_register::name
We say that they're for debug only but we don't really have a good
policy around when to set them and when not to.  In particular,
nir_lower_system_values and nir_lower_vars_to_ssa which are the chief
producers of SSA values which might reasonably have a name do not bother
to set one.  We have some names set from things like BLORP and RADV's
meta shaders but AFAICT, they're setting a name more because it's there
than because they actually care.

Also, most things other than nir_clone and nir_serialize don't bother to
try and preserve them.  You can see in the diffstat of this commit
exactly what passes attempt to preserve names.  Notably missing from the
list is opt_algebraic which is the single largest source of SSA def
churn and it happily throws names away.

These observations lead me to question whether or not names are actually
useful at all or if they're just taking up space (8B per instruction)
and wasting CPU cycles (to ralloc_strdup on the off chance we do have
one).  I don't think I can think of a single time in recent history
where I've been debugging a shader issue and an SSA value name has been
there and been useful.  If anything, the few times they are there, they
just throw me off because they mess up the indentation in nir_print.

iris shader-db on my system gets runtime -2.07734% +/- 1.26933% (n=5)

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5439>
2021-07-08 17:34:41 +00:00
Connor Abbott 266d3d5814 tu: Update subgroup properties
Everything should be in place for this to actually work. Support a size
of 128, unlike the blob. I've also plumbed through ballot support, so
enable that.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott 75516e0595 ir3/legalize: Fix loop convergence behavior
This prevents the previous commit from being undone by the jump
optimizations in legalize, and fixes another potential case where
instead of a continue we have an if/else at the end of a loop.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott 0fa93fb662 ir3: Fix convergence behavior for loops with continues
When loops have continue statements, it's expected that when we execute
a divergent continue (i.e. a continue where not all of the threads
active at the start take it) we keep going with the rest of the loop
body and then reconverge at the start of the next iteration. However the
Adreno ISA seems to always take a branch that jumps backwards, assuming
it's the bottom of a loop, so we get a different, undesired convergence
behavior. There's no way I know of to control this behavior in the
instruction set, so we have to instead insert a "continue block" at the
end of the loop where continue statements reconverge which then jumps
back to the top of the loop. Since this doesn't correspond 1:1 with any
NIR block we have to make control flow handling in NIR->ir3 a bit more
complicated, unfortunately.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
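The "continue block" transformation from 0fa93fb662 can be illustrated on a toy control-flow graph: every edge that jumps back to the loop header is rerouted through one new block, which then holds the only backwards jump. The dictionary-based CFG and the block names are assumptions of this sketch and are unrelated to the actual ir3 data structures.

```python
def insert_continue_block(cfg, header, loop_blocks, name="continue_block"):
    """Route every back-edge to `header` through one new block.

    cfg maps a block name to its list of successor names. Returns a new
    CFG; the input is left unmodified.
    """
    new_cfg = {}
    for block, succs in cfg.items():
        if block in loop_blocks:
            # Back-edges (from continues or the loop end) now target the
            # shared continue block, where the threads reconverge.
            succs = [name if s == header else s for s in succs]
        new_cfg[block] = list(succs)
    new_cfg[name] = [header]  # the only backwards jump left in the loop
    return new_cfg


# Toy loop: "body" may take a continue (edge to "header") or fall to "tail".
loop_cfg = {
    "pre": ["header"],
    "header": ["body", "exit"],
    "body": ["header", "tail"],
    "tail": ["header"],
    "exit": [],
}
rewritten = insert_continue_block(loop_cfg, "header", {"header", "body", "tail"})
```

After the rewrite, the continue in `body` is a forward jump, so threads that skip it still run `tail` before everything meets again in `continue_block`.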
Connor Abbott b1b80c06a7 ir3: Implement nir subgroup intrinsics
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott 5d5d752319 ir3: Handle shared registers in lower_parallelcopy
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott 17f7453d45 ir3: Add subgroup pseudoinstructions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott 232ec710fd ir3: Support any/all/getone branches
This plumbs through the support in the IR.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott 7a8e0b15e2 ir3: Cleanup ir3_legalize jump optimization
Do the optimization parts in their own loop, and be more robust when
detecting the useless jumps.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott 43e926a3af ir3/sched: Handle branch condition in split_pred()
Before this, if there was a block with multiple things writing p0.x,
it was a tossup whether the right one would be used as the branch
condition. Found by inspection.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott bb3212dd4d ir3: Fix infinite loop in scheduler when splitting
When we go to split e.g. a p0.x producer, the only other instructions
ready to schedule are often only p0.x producers. It could happen that
they all have a lower priority than the split instruction. Then we would
immediately schedule the split instruction again, then again try to
schedule one of the other producers, be blocked, and split it, around
and around again, leading to an infinite loop. The following commit
triggered this with
dEQP-GLES3.functional.shaders.discard.dynamic_loop_always on a3xx.

Fixes: d2f4d33 ("freedreno/ir3: new pre-RA scheduler")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott 2ff3ab0aed ir3: Make MOVMSK use repeat
MOVMSK is a bit of a special case, because it takes multiple cycles (and
therefore reduces the nops needed if it's between some other assigner
and consumer) however weird things happen if you try to start reading
the first component while it isn't finished yet. On balance making it
use repeat seems to result in fewer special cases.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott 66a275d50f ir3: Fix shared reg delay
Based on computerator experiments, this is actually 6, including for
movmsk.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott b1b4ce7be2 ir3: Actually allow shared reg moves to be folded
I realized that shared registers were never actually getting folded,
even after adding them to valid_flags, because the move wasn't even
being considered.

I looked at the other uses of is_same_type_mov(), and they should be ok
with this.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott b32188cdba ir3: Better valid flags for shared regs
Shared registers seem to use the same port as consts, so the same
restrictions for cat2/cat3 apply to them.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott 590efd180b ir3: Prevent propagating shared regs out of loops
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott 394c597b1b ir3: Handle unreachable blocks
This fixes a pre-existing bug in ir3, but it showed up even more due to
other changes in this series and it interacts with the logical/physical
CFG split. When both sides of an if end with a jump, a block may become
unreachable via the logical CFG, which can cause problems because it has
no predecessors to figure out the location of live-in non-shared
values. In this case we assume that nir_opt_if has removed any code in
these blocks and just skip processing live-ins for these blocks,
pretending that they aren't live.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott 22ae91b284 ir3: Handle shared register liveness correctly
As explained in the comments added, we need to add extra edges to the
CFG which are ignored except for shared registers. This plumbs through
support for this.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott 8176657ead ir3/nir: Call nir_lower_subgroups
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Connor Abbott 68b8b9e9e1 tu, ir3: Plumb through support for CS subgroup size/id
The way that the blob obtains the subgroup id on compute shaders is by
just and'ing gl_LocalInvocationIndex with 63, since it advertises a
subgroupSize of 64. In order to support VK_EXT_subgroup_size_control and
expose a subgroupSize of 128, we'll have to do something a little more
flexible. Sometimes we have to fall back to a subgroup size of 64 due to
various constraints, and in that case we have to fake a subgroup size of
128 while actually using 64 under the hood, by just pretending that the
upper 64 invocations are all disabled. However when computing the
subgroup id we need to use the "real" subgroup size. For this purpose we
plumb through a driver param which exposes the real subgroup size. If
the user forces a particular subgroup size then we lower
load_subgroup_size in nir_lower_subgroups, otherwise we let it through,
and we assume when translating to ir3 that load_subgroup_size means
"give me the *actual* subgroup size that you decided in RA" and give you
the driver param.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
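The masking step described in 68b8b9e9e1 (`index & 63` for a size of 64) generalizes to any power-of-two size once the real subgroup size is known. The sketch below returns both the masked value and the index of the subgroup within the workgroup; the shift is an addition of this sketch rather than something the commit states, and `subgroup_ids` is a hypothetical helper name.

```python
def subgroup_ids(local_invocation_index, real_subgroup_size):
    """Split a flat local invocation index using the real subgroup size.

    Returns (subgroup index, invocation index within the subgroup).
    real_subgroup_size must be a power of two, e.g. 64 or 128.
    """
    mask = real_subgroup_size - 1                # 63 for a size of 64
    shift = real_subgroup_size.bit_length() - 1  # log2 of the size
    return local_invocation_index >> shift, local_invocation_index & mask
```

For invocation 130, a real size of 64 gives `(2, 2)` while a real size of 128 gives `(1, 2)`, which is why the computation has to use the size actually chosen rather than the advertised one.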
Hyunjun Ko 9507705693 turnip/kgsl: new flag TU_USE_KGSL
In some cases the kgsl backend is used on Linux. That is still not a
usual setup, but we need to consider it too.

Regarding the timeline semaphore feature, we could implement it for
the kgsl backend in the future, and it should probably use the
existing code in tu_drm.

See #4738, #4907

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11488>
2021-07-01 04:22:55 +00:00
Rob Clark 140ce4f8ed freedreno+ir3: Enable INT16
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11545>
2021-06-29 23:27:28 +00:00
Connor Abbott 42b3d83dd4 ir3/lower_parallelcopy: Use SWZ
shader-db results on a650:

total instructions in shared programs: 1575484 -> 1574866 (-0.04%)
instructions in affected programs: 32579 -> 31961 (-1.90%)
helped: 75
HURT: 0
helped stats (abs) min: 1 max: 98 x̄: 8.24 x̃: 7
helped stats (rel) min: 0.41% max: 30.12% x̄: 2.47% x̃: 1.13%
95% mean confidence interval for instructions value: -10.97 -5.51
95% mean confidence interval for instructions %-change: -3.44% -1.51%
Instructions are helped.

total nops in shared programs: 355742 -> 355628 (-0.03%)
nops in affected programs: 18635 -> 18521 (-0.61%)
helped: 55
HURT: 147
helped stats (abs) min: 1 max: 14 x̄: 4.76 x̃: 6
helped stats (rel) min: 1.41% max: 100.00% x̄: 8.13% x̃: 4.76%
HURT stats (abs)   min: 1 max: 2 x̄: 1.01 x̃: 1
HURT stats (rel)   min: 0.56% max: 25.00% x̄: 2.09% x̃: 1.20%
95% mean confidence interval for nops value: -0.98 -0.15
95% mean confidence interval for nops %-change: -1.93% 0.55%
Inconclusive result (%-change mean confidence interval includes 0).

total non-nops in shared programs: 1219742 -> 1219238 (-0.04%)
non-nops in affected programs: 61125 -> 60621 (-0.82%)
helped: 220
HURT: 0
helped stats (abs) min: 1 max: 99 x̄: 2.29 x̃: 1
helped stats (rel) min: 0.19% max: 29.17% x̄: 0.90% x̃: 0.40%
95% mean confidence interval for non-nops value: -3.26 -1.32
95% mean confidence interval for non-nops %-change: -1.24% -0.56%
Non-nops are helped.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11565>
2021-06-29 08:08:12 +00:00
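The percentages in the shader-db report for 42b3d83dd4 are plain relative changes, which can be re-derived from the before and after counts quoted above. The helper name `percent_change` is introduced only for this worked example.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Instruction counts quoted in the report above.
total = percent_change(1575484, 1574866)
affected = percent_change(32579, 31961)
```

Rounded to two decimals these give -0.04 and -1.90, matching the "total instructions" and "instructions in affected programs" lines.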
Connor Abbott 92bb37cb59 ir3: Add min gen for multi-mov instructions
swz works on a5xx/a6xx but not a3xx according to CI. I don't have any
access to a4xx HW so I can't tell whether it works there.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11565>
2021-06-29 08:08:12 +00:00
Connor Abbott 78ab6250b5 ir3: Print multi-mov instructions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11565>
2021-06-29 08:08:12 +00:00