Commit Graph

64 Commits

Author SHA1 Message Date
Rob Clark b1465c382b freedreno/ir3/ra: assign vreg names to all array elements
We shouldn't divide-by-two for half-reg arrays.  We set the proper node
interference class, based on `arr->half`.

Fixes a RA fail with 16b arrays:

  src/freedreno/ir3/ir3_ra.c:633: name_to_array: Assertion `!"invalid array name"' failed.

Caused by use/def iterators returning `arr->length` vreg namess, but
only assigning the array half that many names.

Also, since we are assigning unique vreg names to each array element,
there is no need to try and convert from half-reg to it's conflicting
full reg when pre-coloring the array elements.  Getting us closer to
having half-arrays work sanely with split-register-file (a5xx and
earlier).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
2020-07-18 09:21:09 -07:00
Rob Clark 6317f7d574 freedreno/ir3/ra: debug msgs tweak
Print out the assigned vreg names earlier.  Also print the few special
nodes.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
2020-07-18 09:14:13 -07:00
Rob Clark 37e0e0791f freedreno/ir3/ra: be better at failing
It doesn't happen much.  But it's annoying when we hit an impossible
condition deep in RA 90% thru a long test run.  Add some ra_assert()/
ra_unreachable() helper macros so we can bail cleanly and fail RA.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5907>
2020-07-14 23:26:15 +00:00
Connor Abbott 4554b946c3 ir3: Include ir3_compiler from ir3_shader
I wanted to access the ir3_compiler from a small helper inside
ir3_shader.h, which currently isn't possible.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5607>
2020-06-26 09:34:33 +00:00
Rob Clark 6da0647987 freedreno/ir3/ra: fix pre-color edge case
Fixes a case where you have something like:

   aVecOutput.z = aScalarInput;

In particular, skipping over things that are not the first component is
wrong.. in the above case the input we need to precolor is the 3rd
component.  But we need to adjust the target register according to the
offset.

Fixes android.hardware.nativehardware.cts.AHardwareBufferNativeTests

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5601>
2020-06-25 04:40:40 +00:00
Rob Clark 1cc4cf141a freedreno/ir3: make mergedregs a property of the variant
Rather than assuming a6xx+ means mergedregs.  We can actually (mostly?)
do splitregs on a6xx as well.  And GS/DS/HS currently require it, which
might be papering over a bug, or might be something to do with how
chaining shaders works.  At any rate, we should at least be consistent,
and not have the compiler thinking we are doing mergedregs when we are
actually doing splitregs.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5458>
2020-06-18 02:46:28 +00:00
Rob Clark 38df3f899d freedreno/ir3: decouple regset from gpu gen
Allow different regset's to coexist, so we can make mergedregs vs split
reg file a variant property.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5458>
2020-06-18 02:46:28 +00:00
Rob Clark ee29c682fe freedreno/ir3: limit pre-fetched tex dest
Teach RA to setup additional interference to prevent textures fetched
before the FS starts from ending up in a register that is too high to
encode.

Fixes mis-rendering in multiple playcanv.as webgl apps.

Note that the regression was not actually 733bee57eb8's fault, but
that was the commit that exposed the problem.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3108
Fixes: 733bee57eb ("glsl: lower samplers with highp coordinates correctly")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>
2020-06-11 21:59:54 +00:00
Rob Clark 3c355f1ae8 freedreno/ir3/validate: add checking for types and opcodes
For cases where instructions have a src and/or dst type, validate that
it matches the src/dst register types.  And for cases where there are
different opcodes for half vs full, validate that the opcode matches.

Now that we maintain this properly throughout the stages of the ir, we
can drop the fixups from the RA pass.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark fcfe5eff63 freedreno/ir3: make input/output iterators declare cursor ptr
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark 65f604e3b3 freedreno/ir3: make foreach_src declare cursor ptr
To match how the newer iterators work.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark 9beb2baaff freedreno/ir3: juggle around ir3_debug_print()
In a later patch, this will get folded into an IR3_PASS() macro, at
least for most passes.  But to do that, it is better to standardize
on printing the ir3 after the pass.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Eric Anholt b420d04e1f freedreno/ir3: Fix register allocation assertion failures.
We were failing to tell the allocator about the restriction that scalar
texture instructions (allocated as scalar regs) couldn't be allocated such
that the start of the full unwritemasked vector started before r0.  There
was a patch in select_reg_callback on a6xx that tried to work around that,
but you could still end up backed into a corner you shouldn't be because
we didn't tell the RA what it needed.

Fixes compiler assertion failures on a300-a400's blit_z shader, used for
Z32F gmem blits.

Looks like as a result we get tighter register allocation but more nops:

instructions in affected programs: 757945 -> 760356 (0.32%)
nops in affected programs: 317983 -> 320468 (0.78%)
non-nops in affected programs: 27525 -> 27451 (-0.27%)
mov in affected programs: 3098 -> 3023 (-2.42%)
dwords in affected programs: 109664 -> 110656 (0.90%)
last-baryf in affected programs: 112701 -> 112847 (0.13%)
full in affected programs: 4326 -> 4011 (-7.28%)
sstall in affected programs: 120550 -> 120836 (0.24%)
(ss) in affected programs: 13939 -> 13918 (-0.15%)
(sy) in affected programs: 3006 -> 2786 (-7.32%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:32 +00:00
Rob Clark 656051d735 freedreno/ir3/ra: only assign array base in first pass
In particular, we specifically don't want to let the base change between
passes, as it could end up conflicting with registers assigned in the
first pass.

Mostly-closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2838
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>
2020-04-28 20:06:49 +00:00
Rob Clark 3d8ec96762 freedreno/ir3/ra: split out helper for array assignment
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>
2020-04-28 20:06:49 +00:00
Rob Clark 6313b8d881 freedreno/ir3/ra: use ir3_debug_print helper
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>
2020-04-28 20:06:49 +00:00
Rob Clark 8b3ac7084a freedreno/ir3/ra: remove unused variable
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>
2020-04-28 20:06:49 +00:00
Connor Abbott 94cb129d51 ir3/ra: Fix off-by-one issues with live-range extension
The intersects() function assumes that inside each instruction values
always die before they are defined, so that if the end of one range is
the same instruction as the beginning of the next then they don't
intersect. However, this isn't the case for values that become live at
the beginning of a basic block, which become live *before* the first
instruction, or instructions that die at the end of a basic block which
die after the last instruction.

For example, imagine that we have two values, A which is defined earlier
in the block and B which is defined in the last instruction of the block
and both die at the end of the basic block (e.g. are used in the next
iteration of a loop). We would compute a range for A of, say, (10, 20)
and for B of (20, 20) since each block's end_ip is the same as the ip of
the last instruction, and RA would consider them to not interfere.
There's a similar problem with values that become live at the beginning.

The fix is to offset the block's start_ip and end_ip by one so that they
don't correspond to any actual instruction. One way to think about this
is that we're adding fake instructions at the beginning and end of a
block where values become live & die. We could invert the order, so that
values consumed by each instruction are considered dead at the end of
the previous instruction, but then values that become dead at the
beginning of the basic block would incorrectly have an empty live range,
with a similar problem at the end of the basic block if we try to say
that values are defined at the beginning of the next instruction. So
the extra padding instructions are unavoidable.

This fixes an accidental infinite loop in the shader for
dEQP-VK.spirv_assembly.type.scalar.u32.switch_vert.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4614>
2020-04-18 17:31:56 +00:00
Rob Clark 96ff2a4099 freedreno/ir3/ra: handle array case for SFU select_reg opt
The src of the SFU instruction could also be array/reg (non-SSA).
Handle this case too.

The postsched cp pass makes this scenario more likely.

Fixes: cc82521de4 ("freedreno/ir3: round-robin RA")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
2020-04-13 20:47:28 +00:00
Kristian H. Kristensen 5ec1f264f1 freedreno/ir3: Fix sz vs class confusion
Add bounds checking to make sure we don't silently access out of
bounds again.

Fixes: 90f7d12236 ("freedreno/ir3/ra: pick higher numbered scalars in first pass")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4503>
2020-04-10 10:24:14 -07:00
Connor Abbott de7d90ef53 ir3: Plumb through support for a1.x
This will need to be used in some cases for the upcoming bindless
support, plus ldc.k instructions which push data from a UBO to const
registers.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09 15:56:55 +00:00
Rob Clark c2d0cc8b8d freedreno/ir3: fixup cat3 32b vs 16b
These should be keyed on src arg type.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4423>
2020-04-04 00:07:10 +00:00
Rob Clark 90f7d12236 freedreno/ir3/ra: pick higher numbered scalars in first pass
Since we are re-assigning the scalars anyways in the second pass, assign
them to the highest free reg in the first pass (rather than lowest) to
allow packing vecN regs as low as possible.

Note this required some changes specifically for tex instructions with a
single component writemask that is not necessarily .x, as previously
these would get assigned in the first RA pass, and since they are still
scalar, we'd end up w/ some r47.* and other similarly way-to-high
assignments after the 2nd pass.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>
2020-03-27 22:41:36 +00:00
Rob Clark 1da90ca9bf freedreno/ir3/ra: compute register target from liveranges
Using the output of the first pass isn't ideal, as it can bake in the
losses from fragmentation which the scalar pass is intended to fill in.
This gets worse when we start using "vectorish" instructions, due to
higher use of vecN values.

Instead, we can just use the outputs of the liveness analysis to get a
more accurate # of maximum live values at any point.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>
2020-03-27 22:41:36 +00:00
Rob Clark d2cc92c747 freedreno/ir3/ra: fix array liveranges
Fixes: 1b658533e1 ("freedreno/ir3: extend liverange of arrays")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>
2020-03-27 22:41:36 +00:00
Rob Clark 6347c2ea89 freedreno/ir3/ra: add def/use iterators
Decouple the messy logic of figuring out vreg names defined/used by an
instruction from the logic of what to do about it by introducing
iterators.  There is still *some* array vs ssa special casing in
ra_block_compute_live_ranges(), but less than before.  And this will
avoid introducing a second copy of the def/use logic in a following
patch which uses the liveranges to calculate the maximum # of live
values (which is the optimal target for max physical register window
to round-robin within).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>
2020-03-27 22:41:36 +00:00
Rob Clark bf0aa7ed90 freedreno/ir3/ra: drop extending output live-ranges
This is no longer needed as we create meta:collect instructions in the
end block, which achieves the same result.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>
2020-03-27 22:41:36 +00:00
Rob Clark 0e7d24b532 freedreno/ir3/ra: add helper to map name to array
For vreg names that refer to arrays rather than SSA values, this is the
counterpart to name_to_instr().

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>
2020-03-27 22:41:36 +00:00
Rob Clark d99d358389 freedreno/ir3/ra: fix target register calculation
Account for the # of regs an instruction writes, and fix an off-by-one.

(We are about to replace this with calculating the register target using
the live-ranges, but in debugging that it was useful to assert() if it
chose a higher target.)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>
2020-03-27 22:41:36 +00:00
Rob Clark d20a06e401 freedreno/ir3/ra: add helper to map name to instruction
Extract out a helper from the select_reg callback.  And include all the
instructions in the hashtable, not just SFU.  This will be useful in the
following commits.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>
2020-03-27 22:41:36 +00:00
Rob Clark 29992a039e freedreno/ir3/ra: split-up
Split out regset and shared header, since the RA pass is already getting
large-ish.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>
2020-03-27 22:41:36 +00:00
Rob Clark 6da53911c1 freedreno/ir3/ra: add debug option for RA debug msgs
Similar to the debug switch for sched debug msgs

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>
2020-03-27 22:41:36 +00:00
Rob Clark a0de0db0e4 freedreno/ir3: small cleanup and comments
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>
2020-03-27 22:41:36 +00:00
Rob Clark cc82521de4 freedreno/ir3: round-robin RA
In the second (scalar pass) use the information about # of registers
used in the first pass as the target max, and round-robin within that
range.  This generally gives the post-RA sched pass more opportunities
to re-order instructions to remove nop's.

Also, we can be a bit clever when assigning dest registers for SFU
instructions, by picking the register used for it's src (if available
and already assigned).  This avoids some (ss) syncs caused by write
after read hazards.  (Ie. the SFU instruction will read it's own src
before writing dest.)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
2020-03-10 16:01:39 +00:00
Rob Clark b2b349096f freedreno/ir3: track register usage in first RA pass
We'll use the feedback from the first pass to select a target register
usage in the second pass.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
2020-03-10 16:01:39 +00:00
Rob Clark c1f4367461 freedreno/ir3: don't precolor unassigned inputs
Fixes crash seen in:
dEQP-VK.glsl.conversions.matrix_to_matrix.mat4_to_mat3x4_vertex

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3989>
2020-02-28 16:53:41 +00:00
Rob Clark 2cf4b5f29e freedreno/ir3: track half-precision live values
In schedule live value tracking, differentiate between half vs full
precision.  Half-precision live values are less costly than full
precision.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3989>
2020-02-28 16:53:41 +00:00
Hyunjun Ko c822460f85 freedreno/ir3: handle half registers for arrays during register allocation.
So far we only handle full regs of arrays during pre-allocation.
This patch is to handle half regs of arrays and also consider the size
of half regs when finding out conflicts.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3822>
2020-02-24 17:24:13 +00:00
Hyunjun Ko d70192e697 freedreno/ir3: Add cat4 mediump opcodes
v2: Reworked to assign half-opcodes in ir3_ra.c (krh).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3737>
2020-02-07 09:51:25 -08:00
Rob Clark 2ffe44ec0a freedreno/ir3: add RA sanity check
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
2020-02-01 02:40:22 +00:00
Rob Clark 3e79c4f0ed freedreno/ir3: two pass register allocation
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
2020-02-01 02:40:22 +00:00
Rob Clark b0293af7a5 freedreno/ir3: don't precolor unused inputs
This apparently can happen with gs/tess.  And will cause problems with
two-pass-ra, so lets just skip them.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
2020-02-01 02:40:22 +00:00
Rob Clark 0f78c32492 freedreno/ir3: post-RA sched pass
After RA, we can schedule to increase parallelism (reduce nop's) without
worrying about increasing register pressure.  This pass lets us cut down
the instruction count ~10%, and prioritize bary.f, kill, etc, which
would tend to increase register pressure if we tried to do that before
RA.

It should be more useful if RA round-robin'd register choices.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
2020-02-01 02:40:22 +00:00
Rob Clark 9a9f78f1f9 freedreno/ir3/ra: make use()/def() functions instead of macros
Originally these were nested functions, which worked nicely, giving us
the function of a local macro that was actual 'c' syntax (ie. not token
pasted macro).  But these were converted to macros because clang doesn't
let us have nice gcc extensions.

Extract these back out into functions, before adding more things and
making the macros even more cumbersome.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
2020-02-01 02:40:22 +00:00
Rob Clark a5f24f966a freedreno/ir3: a bit more optmsgs debug
Also dump where arrays are allocated.  This was useful for debugging.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
2020-02-01 02:40:22 +00:00
Rob Clark 54c795f829 freedreno/ir3: fix crash when no non-input instructions
This scenario can come up with block-sched and nop-sched moved to after
RA.  So lets fix it first to keep things bisectable.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
2020-02-01 02:40:22 +00:00
Kristian H. Kristensen f9d35ea55b ir3: Set up full/half register conflicts correctly
Setting up transitive conflicts between a full register and its two
half registers (eg r0.x and hr0.x and hr0.y) will make the half
registers conflict.  They don't actually conflict and this prevents us
from using both at the same time.

Add and use a new ra helper that sets up transitive conflicts between
a register and its subregisters, except it carefully avoids the
subregister conflict.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2020-01-09 16:03:25 -08:00
Rob Clark 3b8feefd9c freedreno/ir3: add iterator macros
So many open coded list iterators were getting annoying.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-13 09:25:40 -08:00
Hyunjun Ko 407f8c71d3 freedreno/ir3: fixup when changing to mad.f16
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-11-20 14:09:43 +01:00
Rob Clark b22617fb57 freedreno/ir3: fix gpu hang with pre-fs-tex-fetch
For pre-fs-dispatch texture fetch, we need to assign bary_ij to r0.x,
even if it is not used in the shader (ie. only varying use is for tex
coords).  But if, for example, gl_FragCoord is used, it could get
assigned on top of bary_ij, resulting in a GPU hang.

The solution to this is two-fold: (1) the inputs/outputs rework has the
benefit of making RA realize bary_ij is a vec2, even if there are no
split/collect instructions (due to no varying fetches in the shader
itself).  And (2) extend the live ranges of meta:input instructions to
the first non-input, to prevent RA from assigning the same register to
multiple inputs.

Backport note: because of (1) above, a better solution for 19.3 would be
to revert f30c256ec0.

Fixes: f30c256ec0 ("freedreno/ir3: enable pre-fs texture fetch for a6xx")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:57:52 -08:00