Commit Graph

78 Commits

Author SHA1 Message Date
Danylo Piliaiev e6f5480180 ir3: Add cat7 sleep instruction
Has short and long variants, long seem to be ~20 times longer.
The exact difference between it and a bunch of nops is unknown.

The emission of this instruction were not observed in the wild.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14419>
2023-02-21 19:59:14 +00:00
Danylo Piliaiev 121e4ca87d ir3: Add cat5/cat7 cache related instructions
- tcinv - Likely Texture Cache Invalidate (unverified)
- icinv - Mostly sure that it is Instruction Cache Invalidate
- dccln - Data Cache Clean
- dcinv - Data Cache Invalidate
- dcflu - Data Cache Flush

The emission of these instructions were not observed in the wild.

TODO: find out the difference between .shr and .all modes of
      dccln, dcinv, dcflu.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14419>
2023-02-21 19:59:14 +00:00
Amber 228d812a0c ir3, isaspec: add raw instruction to assembler/disassembler.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20789>
2023-01-26 14:26:11 +00:00
Connor Abbott 66b9c05bb9 ir3: Add missing cat5 encoding to asm parser
We were missing the case where there is a sampler and texture but the
texture offset is encoded in a1.x.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18840>
2022-10-04 14:00:50 +00:00
Connor Abbott 221a912b8c ir3: Refactor ir3_compiler_create() to take an options struct
This will let us add more options without creating too much churn.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Connor Abbott acba08b58f ir3: Implement and document ldc.k
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Connor Abbott 944f4e6f8a ir3: Better assemble/disassemble stc
Add in the type, even though it turns out to not be that useful. Add
in support for assembling it. Add some notes based on computerator
experiments. And add support for the indirect a1.x mode that's needed
for storing c64.x and later.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Danylo Piliaiev c1d5c318bc ir3: New cat3 instructions
* shrm - (src2 >> src1) & src3
* shlm - (src2 << src1) & src3
* shrg - (src2 >> src1) | src3
* shlg - (src2 << src1) | src3
* andg - (src2 & src1) | src3
* dp2acc - dot product of two {i,u}8vec2 packed into
  SRC1 and SRC2, added to 32b SRC3
* dp4acc - dot product of two {i,u}8vec4 packed into
  SRC1 and SRC2, added to 32b SRC3
* wmm - vec4(x_1, x_2, x_3, x_4) * (y_1 + y_2 + y_3 + y_4), which is
  duplicated (1 << (SRC3 / 32)) times starting from DST register
* wmm.accu - same as wmm but result is added to DST registers, however
  the first reg in each vec4 result is overwritten instead of
  accumulating.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>
2022-01-10 13:20:39 +02:00
Danylo Piliaiev d1c49901df ir3: Add gen4 new subgroup instructions
* getlast.w8 #4 - Perform jump for the first (CLUSTER_SIZE-1)
   fibers in a subgroup
* brcst.active.w8 - necessary to implement arithmetic subgroup
   operations with prefix sum.
* quad_shuffle.brcst - subgroupQuadBroadcast
* quad_shuffle.horiz - subgroupQuadSwapHorizontal
* quad_shuffle.vert - subgroupQuadSwapVertical
* quad_shuffle.diag - subgroupQuadSwapDiagonal
* getfiberid - gl_SubgroupID

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13817>
2021-12-07 20:45:53 +00:00
Danylo Piliaiev 5d5b1fc472 freedreno/ir3: add a6xx global atomics and separate atomic opcodes
Separating atomic opcodes makes possible to express a6xx global
atomics which take iova in SRC1. They would be needed by
VK_KHR_buffer_device_address.
The change also makes easier to distiguish atomics in conditions.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8717>
2021-11-23 18:26:37 +00:00
Danylo Piliaiev ed16eedb2d ir3: print half-dst/src for ldib.b/stib.b
So it would print:
 ldib.b.untyped.1d.u16.1.imm.base0 hr0.z, r0.x, 0
instead of:
 ldib.b.untyped.1d.u16.1.imm.base0 r0.z, r0.x, 0

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13876>
2021-11-22 12:32:15 +00:00
Connor Abbott 6f5c0d209c ir3/postsched: Rewrite delay handling
Analogous to the pre-RA scheduler. Unfortunately this time it's a bit
more involved because we have to correctly handle (rptN), which is
already relevant for swz. This means we need the index of the
destination register that conflicts with the source register, to handle
swz, and we need to expose that part of ir3_delay. But once that's done,
we can delete ir3_delay_calc_postra.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
Connor Abbott 140e117f2b ir3/delay: Ignore earlier definitions to the same register
We have a situation in some skia shaders like:

add.f r0.x, ...
(rpt2)nop
mul.f ..., r0.x
sam (xyzw) r0.x, ...
rcp ..., r0.x

Notice that rcp uses the result of the sam instruction, not the add.f,
but we didn't keep track of which instructions kill the sources in
ir3_delay, so we'd add an extra nop, resulting in a disagreement betwen
ir3_delay and the scheduling graph. Since postsched is correct, fix
ir3_delay. This only results in some very slight shader-db changes but
keeps the next commit from changing things.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
Emma Anholt 9e04f97d8e freedreno: Fix the uniform/nonuniform handling for cat5 bindful modes.
We can see from the dynamically_uniform (compiler doesn't know if you're
uniform or not) vs uniform (compiler can see it's uniform) case in the
blob which is which.  Now that we have the right names, also use the
nonunif flag for encoding the actual non-uniform mode (previously, we were
always setting it always in a way that meant uniform).

I verified this behavior back to a418 with samplers.  The a3xx blob I have
only does GLES3, so we don't have the opaque_type_indexing tests to see.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13601>
2021-11-10 17:48:59 +00:00
Matt Turner a150e31910 ir3: Add support for (dis)assembling flat.b
flat.b is a variant of the bary.f instruction that does not perform
interpolation of the varying input.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13558>
2021-11-04 02:59:28 +00:00
Rob Clark 834e8066c1 freedreno/ir3/tests: Add some 8/16b ldg/stg tests
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13426>
2021-10-19 16:04:42 +00:00
Rob Clark 8657e201d0 freedreno/ir3/tests: Don't skip encode test if decode fails
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13426>
2021-10-19 16:04:42 +00:00
Rob Clark bfd8b7c930 freedreno/ir3/tests: Add additional disasm test vectors
Add branch with negative offset, and a couple others to trigger issues I
found while adding pack_field() overflow asserts.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13353>
2021-10-15 15:52:33 +00:00
Rob Clark c0ecfeb023 freedreno/ir3/tests: Fix indentation
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13353>
2021-10-15 15:52:33 +00:00
Connor Abbott 470bf75ff8 ir3: Fix handling cat6 immediates
We were treating them the same as regular cat2/cat3/cat4 immediates, but
that's not right because cat6 sources are only 8 bits.

Our bindless code was handling this before for bindless resources, and
it was disabled for most other things, so this was mostly harmless, but
fixing it will be necessary for handling ldc offsets.

In addition enable tests for this that were just commented out, and add
a custom test making sure that the immediate source is treated as
unsigned.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13142>
2021-10-12 11:30:52 +00:00
Danylo Piliaiev d590515112 ir3: support source modes for resinfo.b
IBO/SSBO may have dynamic index, previously we just silently ignored
this fact. However resinfo supports different modes.

Fixes vkd3d test "test_null_uav"

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13224>
2021-10-07 08:19:13 +00:00
Emma Anholt 2b6729883a freedreno/ir3: Add encode/decode support for a5xx's LDIB.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12704>
2021-09-03 18:17:07 +00:00
Rob Clark 7806843866 freedreno/all: Introduce fd_dev_id
Move away from using gpu_id as the primary means to identify which
adreno we are running on, as future GPUs (starting with 7c3) stop
providing a gpu_id as a new naming scheme is introduced.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12159>
2021-08-06 18:51:50 +00:00
Connor Abbott 177138d8cb ir3: Reformat source with clang-format
Generated using:

cd src/freedreno/ir3 && clang-format -i {**,.}/*.c {**,.}/*.h -style=file

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11801>
2021-07-12 20:57:21 +00:00
Connor Abbott 2e76f7b60c ir3: Manually reformat some places
clang-format does a bad job with a few tables and macros, and there were
some places it was doing wonky things because comments were longer than
80 characters and it tries to fix that without reformatting the comment
itself. Add magic comments to tell it to turn itself off and retab those
places manually (well, with a regex!).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11801>
2021-07-12 20:57:21 +00:00
Danylo Piliaiev 1c6c200c0d ir3: add newly found shlg.b16 instruction
Example of blob's output:
  (nop3) shlg.b16 hr8.x, (r)8, (r)hr8.x, 12

It does: (src2 << src1) | src2

src1 and src2 could be GPRs, relative GPRs, relative consts,
or immidiates. However, they could not be plain const registers.

Blob does use it in conjuncture with "samgq" instruction.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11760>
2021-07-09 13:00:29 +00:00
Danylo Piliaiev fdc0f489e0 ir3: add ldg.a,stg.a which allow complex in-place offset calculation
The full form for ldg.a/stg.a offset is:
 g[reg_address + reg_offset << (imm_shift + 2) + imm_offset << 2]

where imm_shift is in [0, 3] and imm_offset is in [0, 3]

a6xx blob was found to produce a bit simplier offset calculations
for TES/TCS shaders in GTA V:

 [c002000a_03c14215] ldg.a.f32 r2.z, g[r1.y+((r2.z+1)<<2)], 3;
 [c0020004_01c14609] ldg.a.f32 r1.x, g[r1.y+((r1.x+3)<<2)], 1;

Our new syntax:
 stg.a.u32 g[r2.x+(r1.x+1)<<2], r5.x, 1
 stg.a.u32 g[r2.x+r1.x<<4+3<<2], r5.x, 1
 ldg.a.f32 r1.w, g[r1.y+(r1.w+1)<<2], 3
 ldg.a.f32 r1.w, g[r1.y+r1.w<<5+2<<2], 3

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11431>
2021-06-25 15:39:51 +00:00
Connor Abbott 132dfacdcb freedreno/tests: Convert to srcs/dsts
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11469>
2021-06-23 17:20:29 +00:00
Connor Abbott 58d82add87 ir3: Rewrite delay calculation
The old delay calculation relied on the SSA information staying around,
and wouldn't work once we start introducing phi nodes and making
"normal" values defined in multiple blocks not array regs anymore.
What's worse is that properly inserting phi nodes when splitting live
ranges would make that code even more complicated, and this was the last
place post-RA that actually needed that information.

The new version only compares the physical registers of sources and
destinations. It works by going backwards up to a maximum number of
cycles, so it might be slightly slower when the definition is closer but
should be faster when it is farther away.

To avoid complicating the new method, the old method is kept around, but
only for pre-RA scheduling and it can therefore be drastically
simplified as the array case can be dropped.

ir3_delay_calc() is split into a few variants to avoid an explosion of
boolean arguments in users, especially now that merged_regs now has to
be passed to it.

The new method is a little more complicated when it comes to handling
(rptN), because both the assigner and consumer may be (rptN). This adds
some unit tests for those cases, in addition to dropping the to-SSA code
in the test harness since it's no longer needed.

Finally, ir3_legalize has to be switched to using physical registers for
the branch condition. This was the one place where IR3_REG_SSA remained
after RA.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>
2021-06-10 12:20:38 -07:00
Connor Abbott edf23e15eb ir3: Prepare for instructions with multiple destinations
To simplify the pre-RA merge set code and express the result live-range
splitting in RA, we need to add support for parallel copy instructions,
and for the merge set code these parallel copies need to be in SSA form.
Parallel copies have multiple destinations by necessity, but there was
no way to express this in the existing IR. In particular there was no
support for marking a register as being a destination, and no support
for indicating which destination register out of several an SSA source
refers to. This replaces ir3_register::instr with ir3_register::def and
re-purposes ir3_register::instr. I haven't propagated this into common
helpers, like ssa(), because that would vastly increase the amount of
churn and the number of places that produce such instructions should be
limited -- only RA will create parallel copies and they will be
destroyed right after RA. In the future swz will have multiple
destinations too, but it will only be created after RA via parallel copy
lowering.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>
2021-06-10 12:20:38 -07:00
Connor Abbott 0ef021be4a ir3: Add ir3_start_block()
Name based on nir_start_block(). A number of places were already
open-coding this, convert them.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>
2021-06-10 12:20:38 -07:00
Connor Abbott 08499369d0 ir3: Assemble and disassemble swz/gat/sct
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10291>
2021-04-19 16:10:44 +00:00
Connor Abbott c68ea960a7 ir3, tu: Add compiler flag for robust UBO behavior
This needs to be part of the compiler because it's the only piece that
we always have access to in all the places ir3_optimize_loop() is
called, and it's only enabled for the whole Vulkan device. Right now
it's just used for constraining vectorization, but the next commit adds
another use.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>
2021-04-15 16:05:11 +02:00
Danylo Piliaiev ce1a381e57 turnip: enable VK_KHR_16bit_storage on A650
A650 can use the same SSBO descriptor for both 32-bit and 16-bit access,
which makes it easy to enable this extension.

Passes tests that run under:

dEQP-VK.spirv_assembly.instruction.*.16bit_storage.*

Rebased and modified commit from Jonathan Marek.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
2021-04-01 17:51:07 +00:00
Danylo Piliaiev 00d6ccebf9 ir3/isa: account for randomly set by blob lowest bit of ibo atomics
As far as I could see - blob randomly sets the lowest bit of atomic.b.*
instructions.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9841>
2021-03-31 15:03:35 +00:00
Danylo Piliaiev c0a62b203e ir3/isa,parser: fix encoding and parsing of bindless s2en SAM
Before, decoding showed that there is an error:
 sam.base0 (f32)(xyzw)r0.x, r0.z, a1.x   ; no field 'HAS_SAMP',
  WARNING: unexpected bits[0:7] in #cat5-samp-s2en-bindless-a1: 0x1 vs 0x0

After:
 sam.base0 (f32)(xyzw)r0.x, r0.z, s#1, a1.x

Fixes textures on the ground in TauCeti Vulkan Technology Benchmark

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9628>
2021-03-17 12:07:54 +00:00
Hyunjun Ko e9fd2a2a58 ir3: Add nonuniform encodings to ir3 encoder and parser
By keeping track of nonuniform access from nir and storing it to ir3.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9125>
2021-03-17 01:09:30 +00:00
Danylo Piliaiev 2764cf8d32 ir3: use OPC_GETBUF to get size of sampler buffers
The maximum value which OPC_GETSIZE could return for one dimension
is 0x007ff0, however sampler buffer could be much bigger.
Blob uses OPC_GETBUF for them.

Fixes tests:
 dEQP-VK.memory.pipeline_barrier.transfer_dst_uniform_texel_buffer.1048576

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9391>
2021-03-10 17:10:45 +00:00
Danylo Piliaiev 5e2cee57c5 freedreno/ir3/parser: add cat7 support
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8420>
2021-01-15 10:08:38 +00:00
Rob Clark 11cba228fd freedreno/ir3: Small resinfo disasm tweak
Add the 'type' field.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7997>
2021-01-13 18:32:47 +00:00
Rob Clark 01e8bd55de freedreno/ir3/tests: Switch disasm test over to new decoder
Also, uncomment the `stc` test vectors (since the new decoder decodes
these properly) and comment out an instruction which looks suspiciously
like -6.0 in hex.

This also switches the parser back to `atomic.b.op` from `atomic.op.b`
which was a short-term workaround to make it easier for the legacy
disassembler.

Also switch the binary encoding for ldib to clear b0, because the new
disassembler warns about unexpected dontcare bits (which cases the
disasm to not match).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7997>
2021-01-13 18:32:47 +00:00
Rob Clark e1f8aaf9d2 freedreno/ir3: Fix ldg decoding/parsing
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7997>
2021-01-13 18:32:47 +00:00
Rob Clark 32a6a13052 freedreno/ir3/parser: Fix pre-a6xx stib parsing
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8175>
2021-01-06 16:46:53 +00:00
Rob Clark 859c92d7ee freedreno/ir3/parser: a6xx ldib/stib parsing
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8175>
2021-01-06 16:46:53 +00:00
Rob Clark b7ea6ec178 freedreno/ir3: Fix pre-a6xx ldgb/stib parsing
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8175>
2021-01-06 16:46:53 +00:00
Rob Clark 050a449dbb freedreno/ir3: Explicitly flag disasm test vectors that don't parse
Mark the test cases which aren't supported by ir3_parser.y explicitly,
so we notice future regressions.  And likewise, fail when we see an
unexpected pass, so we don't forget to update the test vectors in the
future as ir3_parser improves.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8175>
2021-01-06 16:46:53 +00:00
Rob Clark b073dae5f0 freedreno/ir3: Fix ldg decoding/parsing
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8175>
2021-01-06 16:46:53 +00:00
Rob Clark a7e88787f6 freedreno/ir3/parser: Fixup stg parsing and add more tests
The offset can also be a register, in which case we need to shuffle
around the src order.  Add a few more test vectors to cover each
permutation (no offset, immed offset, gpr offset).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8175>
2021-01-06 16:46:52 +00:00
Rob Clark d6fa130dda freedreno/ir3/parser: Add stgb support
Note that this conflicts with `stc` on a6xx+, so a good test that the
(new) disasm can handle both cases properly.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8175>
2021-01-06 16:46:52 +00:00
Rob Clark 1746c4d211 freedreno/ir3/parser: Fix pre-a6xx resinfo
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8175>
2021-01-06 16:46:52 +00:00