Commit Graph

609 Commits

Author SHA1 Message Date
Connor Abbott 168c42290f ir3: Don't calculate num_samp ourselves
In addition to duplicating what core NIR does better, this was wrong for
Vulkan, where it should be 0 as there are no non-bindless samplers.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5519>
2020-06-17 14:36:50 +00:00
Connor Abbott 6f2981176d ir3: Pass reserved_user_consts to ir3_shader_from_nir()
ir3_shader_from_nir() calls ir3_optimize_nir(), which currently sets up
the const state. However, we need to know the number of user consts
reserved by the driver before setting up the const state, which means
that this information needs to be passed into ir3_shader_from_nir()
somehow rather than being set in the shader.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5500>
2020-06-17 13:13:05 +00:00
Rob Clark 680ca5b393 freedreno/ir3: add post-scheduler cp pass
A pass to eliminate extra mov's from an array.  We need to do this after
scheduling so we know that there are not any potentially conflicting
array writes between the original `mov` and it's use(s).

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2124
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
2020-06-16 20:56:15 +00:00
Rob Clark a60d48a863 freedreno/ir3/cp: extract valid_flags
We'll also need this in the postsched-cp pass.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
2020-06-16 20:56:15 +00:00
Rob Clark 5f1f8f7b17 freedreno/ir3: delay test support for vectorish instructions
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
2020-06-16 20:56:15 +00:00
Rob Clark 92d6eb4dd5 freedreno/ir3: add helpers to move instructions
A bit cleaner than open coding the list manipulation.  Plus I want to
use it in the next patch, rather than adding more open coded list
futzing.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
2020-06-16 20:56:15 +00:00
Rob Clark 9eed0c6011 freedreno/ir3/delay: calculate delay properly for (rptN)'d instructions
When a sequence of same instruction is encoded with repeat flag,
destination registers are written on successive cycles.  Teach the
delay calculation about this.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
2020-06-16 20:56:15 +00:00
Rob Clark c3b30963dd freedreno/ir3: add test for delay slot calculation
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
2020-06-16 20:56:15 +00:00
Rob Clark a69d28769a freedreno/ir3/print: print (r) flag
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
2020-06-16 20:56:15 +00:00
Rob Clark cd376a1434 freedreno/ir3/legalize: don't allow (nopN) if (rptN)
These two encodings are mutually exclusive.  If the instruction is a
vector(ish) `(rptN)` instruction, then we can't fold a `(nopN)` post-
delay into it.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
2020-06-16 20:56:15 +00:00
Rob Clark f35f711c71 freedreno/ir3/cp: properly handle already-folded RELATIV
In the `try_swap_mad_two_srcs()` case, valid_flags() gets called both
for the src that we want to try to fold, and for the other src that we
are trying to swap to make that possible.  It can happen in the 2nd case
that a RELATIV src has already been folded.  Since `ssa()` returns non-
null in both the `IR3_REG_SSA` and `IR3_REG_ARRAY` cases (in the later
case, it is the dependent array access that the current instruction
cannot be moved ahead of), we need to explicitly check that the src
reg we are looking at is still an SSA src.

Reported-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
2020-06-16 20:56:15 +00:00
Rob Clark 1bee79996b freedreno/ir3/validate: also check instr->address
Verify that instructions which have a relative src and/or dest, have
`instr->address`.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
2020-06-16 20:56:15 +00:00
Rob Clark f598786775 freedreno/sched: reset delay counters at start of block
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
2020-06-16 20:56:15 +00:00
Rob Clark 28a14787c0 freedreno/ir3: don't rely on intr->num_components
It is better to use `nir_intrinsic_dest_components()` which also handles
the case of intrinsics with a fixed number of dest components.

Somehow this starts showing up with a nir_serialize round-trip with
shader-cache.  But we really shouldn't have been relying on
`intr->num_components` directly.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5371>
2020-06-16 02:48:18 +00:00
Rob Clark 1a33faea8c freedreno/ir3: move the libdrm dependency out of shared code
The only reason for this dependency was the fd_bo used for the uploaded
shader.  But this isn't used by turnip.  Now that we've unified the
cleanup path from gallium, it isn't hard to pull the fd_bo upload/free
parts into ir3_gallium.

This cleanup has the added benefit that the shader disk-cache will not
have to deal with it.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5476>
2020-06-15 15:46:37 +00:00
Jonathan Marek 1d9e6e456a freedreno/ir3: fix ir3_nir_move_varying_inputs
ir3_nir_move_varying_inputs is broken when there a load input outside of
the first block which depends on the result of a previous load input.

This simplification/rework avoids the problem, and should also be faster.

Fixes this dEQP-VK test:

dEQP-VK.pipeline.multisample_interpolation.offset_interpolate_at_pixel_center.128_128_1.samples_2

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5465>
2020-06-14 17:53:47 +00:00
Rob Clark ee29c682fe freedreno/ir3: limit pre-fetched tex dest
Teach RA to setup additional interference to prevent textures fetched
before the FS starts from ending up in a register that is too high to
encode.

Fixes mis-rendering in multiple playcanv.as webgl apps.

Note that the regression was not actually 733bee57eb8's fault, but
that was the commit that exposed the problem.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3108
Fixes: 733bee57eb ("glsl: lower samplers with highp coordinates correctly")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>
2020-06-11 21:59:54 +00:00
Rob Clark f80092dad2 freedreno/ir3: remove RA "q-values" optimization
This is mainly the "piglit optimization" (ie, since piglit launches an
separate process for for each test).  It was never wired up for a6xx,
and makes register class setup unnecessarily complicated.  Remove it to
simplify the next patch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>
2020-06-11 21:59:54 +00:00
Rob Clark 562aaea07c freedreno/ir3: respect tex prefetch limits
Refactor a bit the limit checking in the bindless case, and add tex/samp
limit checking for the non-bindless case, to ensure we do not try to
prefetch textures which cannot be encoded in the # of bits available.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>
2020-06-11 21:59:54 +00:00
Rob Clark 4cabc25fa4 freedreno/ir3: add debug code to print conflicting half-regs
I keep re-typing this from time to time when debugging various things.
Which is dumb.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>
2020-06-11 21:59:54 +00:00
Eric Anholt 0bacb280a8 freedreno/ir3: Handle cases where we decide not to lower UBO 0 loads.
We advertize 4096 vec4s of GL uniform storage, but the HW can only store
512 vec4s in the const buffer.

Closes: #3049
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>
2020-06-05 13:43:30 -07:00
Eric Anholt e349f50279 freedreno/ir3: Drop the max_const on a6xx to 512.
The GLES blob on the p3a limits constlen to 512 between VS and FS across
a6xx gpu ids (615, 630, 640, and 650).  Experimentally, exceeding that
limit in any one stage results in rendering corruption or GPU hangs
(though my most detailed testing had a loop limit in a uniform, so that
may the cause of the hang).  Clamp the limit we use inside of a shader so
we don't exceed it within a stage.

This commit doesn't resovle limiting inter-stage.  Experimentally, I've
found that I can push up to a total of ~768 vec4s between VS and FS on
a630, with or without uniform updates between each draw.  We'll need to do
some shader key-based limiting of constlen at draw time to respect that
limit, but that's left for future work, and this commit is enough for the
google earth case that initiated this work.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>
2020-06-05 13:36:29 -07:00
Eric Anholt 486b894307 freedreno/ir3: Account for driver params in UBO max const upload.
The const state setup needs to be able to push its driver params, so
account for them in the analyze_ubo_ranges.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>
2020-06-05 13:36:29 -07:00
Eric Anholt a25347ab92 freedreno/ir3: Stop shifting UBO 1 down to be UBO 0.
It turns out the GL uniforms file is larger than the hardware constant
file, so we need to limit how many UBOs we lower to constbuf loads.  To do
actual UBO loads, we'll need to be able to upload UBO 0's pointer or
descriptor.

No difference on nohw 1 UBO update drawoverhead case (n=35).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>
2020-06-05 13:36:29 -07:00
Eric Anholt 9e58ab09ff freedreno/ir3: Drop unnecessary alignment of pushed UBO size.
The analysis pass gives us vec4-aligned size, and all of our other
constbuf allocations here are in vec4 units, so we can just divide by 16.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>
2020-06-05 13:36:29 -07:00
Eric Anholt 07ec745014 freedreno/ir3: Stop pushing immediates once we've filled the constbuf.
If we filled the constbuf up with UBOs, we may need to avoid generating
more immediate push constants.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>
2020-06-05 13:36:29 -07:00
Eric Anholt ab29f2da42 freedreno/ir3: Refactor ir3_cp's lower_immed().
There was duplicated handling in the callers that we can just move inside.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>
2020-06-05 13:36:29 -07:00
Rob Clark ebcf3545db freedreno/ir3: split kill from no_earlyz
Unlike other conditions which prevent early-discard of fragments, kill
does not prevent early LRZ test.  Split `has_kill` from `no_earlyz` so
we can take advantage of this.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5298>
2020-06-04 02:34:54 +00:00
Timothy Arceri 04dbf709ed nir: add callback to nir_remove_dead_variables()
This allows us to do API specific checks before removing variable
without filling nir_remove_dead_variables() with API specific code.

In the following patches we will use this to support the removal
of dead uniforms in GLSL.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4797>
2020-06-03 02:22:23 +00:00
Dylan Baker a8e2d79e02 meson: use gnu_symbol_visibility argument
This uses a meson builtin to handle -fvisibility=hidden. This is nice
because we don't need to track which languages are used, if C++ is
suddenly added meson just does the right thing.

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4740>
2020-06-01 18:59:18 +00:00
Kristian H. Kristensen f4e64e9f53 freedreno/ir3: Avoid {0} initializer for struct reginfo
First element is not a scalar.  Just initialize the struct like we do
elsewhere.

src/freedreno/ir3/disasm-a3xx.c:958:33: warning: suggest braces around
initialization of subobject [-Wmissing-braces]

Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5174>
2020-05-26 12:46:18 -07:00
Eric Anholt 5ec3747fbe freedreno/ir3: Use RESINFO for a6xx image size queries.
The closed GL driver uses resinfo on images with the writeonly flag (using
the texture-path's getsize only for readonly images).  The closed vulkan
driver seems to use resinfo regardless.  Using resinfo doesn't need any
fixups after the instruction.  It also avoids one of the needs for the
TEX_CONST state for the image, which is awkward to set up in the GL
driver.

The new handler goes into ir3_a6xx to be next to the other current image
code, but the a4xx version is left in place because it wants a bunch of
sampler helpers.

Fixes assertion failure in dEQP-VK.image.image_size.buffer.readonly_32.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>
2020-05-26 18:17:46 +00:00
Eric Anholt 2ec4c53ef9 freedreno/ir3: Move handle_bindless_cat6 to compiler_nir and reuse.
There was an open coded version for ldc, and now we can drop that.  I
needed to do it for resinfo as well.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>
2020-05-26 18:17:46 +00:00
Eric Anholt 2068b01430 freedreno/ir3: Refactor out IBO source references.
All the users of the unsigned result just wanted an ir3_instruction to
reference.  Move a6xx's helpers to ir3_image.c and inline the old unsigned
results version.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>
2020-05-26 18:17:46 +00:00
Eric Anholt 00b9099dd5 freedreno: Set the immediate flag in a4/a5xx resinfos.
Noticed comparing our RESINFO asm to qcom's for the same test, and if I
drop this bit their disasm switches from immediate to reg.  ldgb seems to
have the same behavior.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>
2020-05-26 18:17:46 +00:00
Eric Anholt ae00da5ddb freedreno: Fix resinfo asm, which doesn't have srcs besides IBO number.
In the process, clarify what's going on with the LDC/LDIB case.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>
2020-05-26 18:17:46 +00:00
Eric Anholt c1cb75678d freedreno: Add more resinfo/ldgb testcases.
Since I'm going to start using the resinfo opcode, make sure we can disasm
the blob's instances of it that I've found.  And, since resinfo disasm
will impact ldgb on pre-a6xx, include some of those too.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>
2020-05-26 18:17:46 +00:00
Eric Anholt 5d4a911d8c freedreno: Fix printing of unused src in disasm of cat6 RESINFO.
Compare to QC's disasm right next to ours, and we clearly had an extra src
that wouldn't make sense.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>
2020-05-26 18:17:46 +00:00
Rob Clark 3c355f1ae8 freedreno/ir3/validate: add checking for types and opcodes
For cases where instructions have a src and/or dst type, validate that
it matches the src/dst register types.  And for cases where there are
different opcodes for half vs full, validate that the opcode matches.

Now that we maintain this properly throughout the stages of the ir, we
can drop the fixups from the RA pass.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark f484d63617 freedreno/ir3: add helpers to deal with src/dst types
Add some helpers to properly maintain src/dst types, and in the cases
where opcode depends on src or dst type, maintain that as well.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark 3561d34fff freedreno/ir3: add simple validate pass
We can add to this as we notice other things that are worth validating
between ir3 passes.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark 554f3d54ca freedreno/ir3: fix mismatched wrmask for overlapping VS inputs
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark 16cd232dbc freedreno/ir3/cp: fix cmps folding
When we start doing cp iteratively, we hit the case that we've already
`cmps.s.*` into a `cmps.s.ne p0.x, ...`..  when we try to do that again
we can invert the logic condition.  So check specifically the condition
to prevent this.

TODO we could maybe be more clever about this to combine conditions.
But why isn't that happening in nir?  For example, see
dEQP-GLES31.functional.ssbo.layout.single_basic_array.packed.bool

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark 39de27d3b9 freedreno/ir3/print: print cat2 condition
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark 7b86b5ed7d freedreno/ir3: fix immed type in create_addr0()
We can also remove a bunch of manual src/dst flag munging, since the
instruction builders handle this automatically now.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark 3474ba53b5 freedreno/ir3/cf: handle multiple cov's properly
There can be multiple (for ex.) f32f16's from a single source, in
particular appearing in different blocks.  We need to update all uses
of the src which had conversion folded in, not all the uses of the
individual cov.  Also, to avoid invalidating the ssa use info that was
gathered at the beginning of the pass, don't actually eliminate the
cov, but instead change it to a simple mov that the cp pass can gobble
up.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark 3db5d146e9 freedreno/ir3: fix mismatched flags on split
We have to fixup the meta:split half flag, because `ir3_split_dest()` is
called before we fixup the dest type.  But we should fixup both the
split src and dest, as well as the thing it is splitting.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark b24b6a8365 freedreno/ir3/group: fix for half-regs
If we're inserting a mov to resolve a conflict between meta:collect's
(ie. for .zyx type swizzles, etc), we should use the correct precision.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark fcfe5eff63 freedreno/ir3: make input/output iterators declare cursor ptr
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark c1d33eed41 freedreno/ir3: make foreach_ssa_src declar cursor ptr
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark 65f604e3b3 freedreno/ir3: make foreach_src declare cursor ptr
To match how the newer iterators work.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark 599fd861d4 freedreno/ir3: be iterative
It does pick up a few more cf/cp opportunities, according to sharder-db.
But don't think it will be measurable.

But this will allow some future simplification to cp by pulling out it's
internal iteration.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark b828929ac9 freedreno/ir3: move where we preserve binning pass inputs
For a6xx, since we use same VBO state for binning and VS, we need to
preserve potentially unused inputs.  This needs to be done before DCE.
So move it before we add earlier DCE passes.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark d0cfc06a2c freedreno/ir3: add IR3_PASS() macro
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark c9e5605720 freedreno/ir3/postsched: report progress
Or do the easy thing and claim we always changed something.  It is kinda
hard and not worth the effort to determine for real.

Also rip out unused error handling.  This pass should never fail.  And
we weren't even actually checking the return.

And while we're at it, switch over to taking the 'struct ir3 ir*`
instead of ctx, to standardize with the other passes.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark c953794cd6 freedreno/ir3/legalize: report progress
It always does something.  Just return true for IR3_PASS()

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark c3630c9d29 freedreno/ir3/group: report progress
Not iterative, but this will let IR3_PASS() macro know if there are any
changes to print.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark 721147a05d freedreno/ir3/deps: report progress
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark e4ecfde2dd freedreno/ir3/cp: report progress
Later when we do this pass iteratively, we can drop some of the internal
iteration and just rely on this pass getting run until there is no more
progress.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark 372e466301 freedreno/cf: report progress
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark b6d121502d freedreno/ir3/dce: report progress
Eventually we'll pull the iteration out of the pass itself, but the
first step is to just report progress.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark 9beb2baaff freedreno/ir3: juggle around ir3_debug_print()
In a later patch, this will get folded into an IR3_PASS() macro, at
least for most passes.  But to do that, it is better to standardize
on printing the ir3 after the pass.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Rob Clark 947aa23eff freedreno/ir3: remove Sethi-Ullman numbering pass
We haven't used this for a while.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
2020-05-19 16:06:17 +00:00
Ilia Mirkin b5accb3ff9 freedreno/a3xx: parameterize ubo optimization
A3xx apparently has higher alignment requirements than later gens for
indirect const uploads. It also has fewer of them. Add compiler
parameters for both settings, and set accordingly for a3xx and a4xx+.
This fixes all the ubo test failures caused by this optimization.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5077>
2020-05-17 19:51:40 -04:00
Ilia Mirkin 9048adbd24 freedreno/ir3: avoid applying (sat) on bary.f
This causes failures on a3xx resulting in the non-sensical dEQP failures
on packUnorm2x16. The same test uses ldlv on a4xx+, so just disallow
(sat) on bary.f on all generations.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5074>
2020-05-17 21:17:57 +00:00
Connor Abbott 2a9d12d513 ir3: Fixup dual-source blending slot
The hardware expects that where MRT0 and MRT1 would normally go are the
dual sources for MRT0, whereas GLSL has an extra "index" parameter that
indicates which source it is. Remap it when handling FS outputs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
2020-05-14 18:15:31 +00:00
Rob Clark cf21b76383 freedreno/ir3: use lower_wrmasks pass
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2020-05-13 20:24:53 -07:00
Rob Clark a506d49fae nir: add helper to copy const_index[]
It seems less brittle to not assume they are in the same order for src
and dst instructions.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2020-05-13 20:24:45 -07:00
Rob Clark ea6b404294 freedreno/ir3: use const_index accessors
Cleans up a couple spots that were still open-coding this.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2020-05-13 20:24:38 -07:00
Kristian H. Kristensen 14969aab11 freedreno/ir3: Drop wrmask for ir3 local and global store intrinsics
These intrinsics are supposed to map to the underlying hardware
instructions, which don't have wrmask. We use them when we lower
store_output in the geometry pipeline and since store_output gets
lowered to temps, we always see full wrmasks there.
2020-05-13 20:24:33 -07:00
Eric Anholt 112c65825f freedreno/a6xx: Use LDC for UBO loads.
It saves addressing math, but may cause multiple loads to be done and
bcseled due to NIR not giving us good address alignment information
currently.  I don't have any workloads I know of using non-const-uploaded
UBOs, so I don't have perf numbers for it

This makes us match the GLES blob's behavior, and turnip (other than being
bindful).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Eric Anholt ab93a631b4 freedreno: Trim num_ubos to just the ones we haven't lowered to constbuf.
With the upcoming LDC usage in the GL driver, we don't want to be
uploading descriptors for every UBO when they aren't actually in use.
Trimming NIR's num_ubos will avoid that, and cleans up num_ubo handling
elsewhere right now.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Eric Anholt d5176c453e freedreno/ir3: Move i/o offset lowering after analyze_ubo_ranges.
I found that when moving more UBOs to load_ubo_ir3, analyze_ubo_ranges
would move things back in a broken way.  We can just run this pass later
and drop the _ir3 path.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Eric Anholt 5387c27140 freedreno/ir3: Leave the cursor alone during ir3_nir_try_propagate_bit_shift.
Otherwise, we might end up inserting the nir_intrinsic_load_ubo_ir3()
after the non-offset src's definition, leading to nir_validate() failures.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Eric Anholt e0a4d1c4e5 freedreno/ir3: Clean up a silly nir_src_for_ssa(src.ssa).
Just copy the src through.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Rob Clark d6706fdc46 freedreno/ir3/sched: try to avoid syncs
Similar to what we do in postsched.  It is useful for pre-RA sched to be
a bit aware of things that would cause syncs.  In particular for the tex
fetches, since the vecN src/dst tends to limit postsched's ability to
re-order them.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
2020-05-13 03:28:40 +00:00
Rob Clark d95a6e3a0c freedreno/ir3/sched: avoid scheduling outputs
If an instruction's only use is as an output, and it increases register
pressure, then try to avoid scheduling it until there are no other
options.

A semi-common pattern is `fragcolN.a = 1.0`, this pushes all these
immed loads to the end of the shader.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
2020-05-13 03:28:40 +00:00
Rob Clark 488cf208d5 freedreno/ir3/postsched: try to avoid (sy) syncs
Similar to avoidance of `(ss)` syncs, it turns out to be helpful to
avoid `(sy)` syncs as well.  This helps us turn an tex, (sy)alu, tex,
(sy)alu sequence into tex, tex, (sy)alu, alu, which is a big win in
gfxbench gl_fill2.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
2020-05-13 03:28:40 +00:00
Rob Clark 25f4fb346e freedreno/ir3/postsched: reset sfu_delay on sync
Once we schedule an instruction that will require an `(ss)` sync flag,
there is no need to delay any further instructions that consume an
SFU result (until the next SFU instruction is scheduled).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
2020-05-13 03:28:40 +00:00
Rob Clark f351e1d137 freedreno/ir3: limit # of tex prefetch by shader size
It seems for short frag shaders, too much prefetch can be detrimental.
I think what we *really* want to do is decide after pre-RA sched, when
we also know about nop's and what the actual ir3 instruction count is.
But that will require re-working how prefetch lowering works.  For now
this is a super crude heuristic to attempt to approximate a good
solution.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
2020-05-13 03:28:40 +00:00
Rob Clark d69f6fd852 freedreno/ir3: fix indirect cb0 load_ubo lowering
We can no longer assume that `state->ranges[0]` is block 0.  It *often*
is, but when we encounter a "real" ubo that we lower to `load_uniform`
before a block 0 `load_ubo`, it could end up another entry in the table.
Resulting in the second pass after gathering ubo ranges, not finding a
valid range.  Which results in a `load_ubo` for a thing that is not
actually a ubo making it's way into ir3 frontend.  Resulting in grabbing
what we think is a ubo address out of some unrelated const register, and
trying to dereference that.  Which as you can imagine, fails in amusing
ways.

Fixes: fc850080ee ("ir3: Rewrite UBO push analysis to support bindless")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
2020-05-12 23:51:46 +00:00
Rob Clark c4dc877cb5 freedreno/ir3: don't allow negative const_offset
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
2020-05-12 23:51:46 +00:00
Eric Anholt f789c5975c freedreno: Fix non-constbuf-upload UBO block indices and count.
The nir_analyze_ubo_ranges pass removes all UBO block 0 loads to reverse
what nir_lower_uniforms_to_ubo() had done, and we only upload UBO pointers
to the HW for UBO block 1-N, so let's just fix up the shader state.

Fixes an off by one in const state layout setup, and some really dodgy
register addressing trying to deal with dynamic UBO indices when the UBO
pointers happen to be at the start of the constbuf.

There's no fixes tag, though this fixes a bug from September, because it
would require the num_ubos fix in nir_lower_uniforms_to_ubo.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>
2020-05-12 17:01:55 +00:00
Eric Anholt 554b959df0 freedreno: Replace OUT_RELOCD with permanently flagging shader BOs for it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
2020-05-12 16:30:57 +00:00
Hyunjun Ko 094c7646a3 freedreno,tu: Don't request fragcoord components not being read.
v1. Replace the existed bool type with new bitfield and edit register
files to take a mask instead of duplicating codes to do masking.

v2. Use fragcoord_compmask != 0 instead of fragcoord_compmask > 0 since
it represents a bitfield.

Tested with
  dEQP-VK.glsl.builtin_var.simple.fragcoord_xyz/w
  dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_xyz/w

Closes: #2680

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4723>
2020-05-08 17:45:03 +00:00
Eric Anholt 9a6bbf4c80 freedreno/ir3: Disable sin/cos range reduction for mediump.
robclark noted that the blob wasn't doing range reduction in the mediump
case, and I confirmed it on
dEQP-GLES3.functional.shaders.operator.angle_and_trigonometry.sin.mediump_float_fragment
vs
dEQP-GLES3.functional.shaders.operator.angle_and_trigonometry.sin.highp_float_fragment.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4893>
2020-05-05 17:23:34 +00:00
Eric Anholt 5c81f51c3c freedreno/ir3: Define the bindful uniform/nonuniform desc modes for cat6 a6xx.
These come from the disasm tests, and fix our disasm of blob's
uniform/nonuniform cat6 operands.  We also now include human-readable names
for all the modes we know about (though bindless gets distinguished by its
.baseN, like Connor's original disasm).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>
2020-05-04 11:15:50 -07:00
Eric Anholt 97b21110b8 freedreno/ir3: Sync some new changes from envytools.
With this I also brought in a few new control flow instruction disasm
tests that I'd made back when I wrote the disasm test, but which were too
far from correct to include until now.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>
2020-05-04 11:14:46 -07:00
Eric Anholt 1e5b0c92c5 freedreno/ir3: Add some more tests of cat6 disasm.
I put these together from traces I had while trying to do LDC for GL.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>
2020-05-04 11:14:46 -07:00
Eric Anholt 29f58cfbd0 freedreno/ir3: Set up outputs for multi-slot varyings.
Necessary to avoid compiler assertion failures in:

dEQP-GLES31.functional.program_interface_query.program_output.type.interface_blocks.out.named_block_explicit_location.struct.mat3x2

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:32 +00:00
Eric Anholt 88dcfaf0ee freedreno/ir3: Stop initializing regid of so->outputs during setup.
It's unused and overwritten by ir3_compile_shader_nir().

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:32 +00:00
Eric Anholt 8c1c218909 freedreno/ir3: Improve shader key normalization.
We can remove a bunch of conditional code at key comparison time by
computing a bitmask of used key bits at ir3_shader creation time.  This
also gives us a nice place to put additional key simplification to reduce
how many variants we create (like skipping rastflat if we don't read
colors in the FS, or skipping vclamp_color if we don't write colors).

It does mean walking the whole key to AND it, but the key is just 28 bytes
so far so that seems pretty fine.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:32 +00:00
Eric Anholt 6f1e3235f2 freedreno: Emit debug messages when doing draw-time recompiles of shaders.
Right now that's "always" unless you have shaderdb set.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:32 +00:00
Eric Anholt a361567c46 freedreno/ir3: Remove unused half precision shader key flag.
The code using it was removed in 4af86bd0b9 ("freedreno/ir3: remove
half-precision output")

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:32 +00:00
Eric Anholt 05be0659fe freedreno: Fix assertion failures on GS/tess shaders with shader-db enabled.
We weren't filling in the tess mode of the key, or setting has_gs on GS
shaders, resulting in assertion failures when NIR intrinsics didn't get
lowered.

We have to make a guess at prim mode for TCS, but it should be better to
have some shader-db coverage than none, and it will avoid these failures
happening when we start precompiling shaders.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:32 +00:00
Eric Anholt f91e49ee29 freedreno/ir3: Skip tess epilogue if the program is missing stores.
Some of the negative API tests make shaders for tess stages that don't do
all the stores they need to.  Once we start precompiling (or doing
shader-db of tess), we need to at least not segfault when generating them.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:32 +00:00
Eric Anholt b420d04e1f freedreno/ir3: Fix register allocation assertion failures.
We were failing to tell the allocator about the restriction that scalar
texture instructions (allocated as scalar regs) couldn't be allocated such
that the start of the full unwritemasked vector started before r0.  There
was a patch in select_reg_callback on a6xx that tried to work around that,
but you could still end up backed into a corner you shouldn't be because
we didn't tell the RA what it needed.

Fixes compiler assertion failures on a300-a400's blit_z shader, used for
Z32F gmem blits.

Looks like as a result we get tighter register allocation but more nops:

instructions in affected programs: 757945 -> 760356 (0.32%)
nops in affected programs: 317983 -> 320468 (0.78%)
non-nops in affected programs: 27525 -> 27451 (-0.27%)
mov in affected programs: 3098 -> 3023 (-2.42%)
dwords in affected programs: 109664 -> 110656 (0.90%)
last-baryf in affected programs: 112701 -> 112847 (0.13%)
full in affected programs: 4326 -> 4011 (-7.28%)
sstall in affected programs: 120550 -> 120836 (0.24%)
(ss) in affected programs: 13939 -> 13918 (-0.15%)
(sy) in affected programs: 3006 -> 2786 (-7.32%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:32 +00:00
Kristian H. Kristensen 73f34e0d46 freedreno/ir3: Drop hack to clean up split vars
When the GS lowering was working on store_output intrinsics, we had to
clean up the split vars to avoid getting confused.  Now that we shadow
the output vars instead, there's no confusion and we can drop this
hack.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:31 +00:00
Kristian H. Kristensen dd8d257a30 freedreno/ir3: Lower GS builtins before lowering IO
We mostly got away with replacing a store_output with a store_var, but
for complex types like structs, that doesn't work. Once the IO has
been lowered from vars to intrinsic, we've lost the deref chains and
can't properly shadow the outputs.

This commits moves the GS lowering up so we do it before the output
variables get lowered to store_output.  This way the pass works much
like nir_lower_io_to_temporaries() and cleanly shadows the outputs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:31 +00:00
Kristian H. Kristensen 79355fd901 freedreno/ir3: Add ir3_nir_lower_to_explicit_input() pass
This pass lowers per-vertex input intrinsics to load_shared_ir3. This
was open coded in the TCS and GS lowering passes before - this way we
can share it. Furthermore, we'll need to run the rest of the GS
lowering earlier (before lowering IO) so we need to split off this
part that operates on the IO intrinsics first.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:31 +00:00