Commit Graph

236 Commits

Author SHA1 Message Date
Rob Clark 5eed59cc87 freedreno/ir3+tu: Calculate subgroup size in ir3
TBD if the size changes for a7xx, but at least let's have it in one
place instead of duplicating in turnip and gallium.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21846>
2023-03-13 17:31:22 +00:00
Rob Clark c449e63809 freedreno/ir3: c++-proof the headers
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21846>
2023-03-13 17:31:22 +00:00
Rob Clark bff0ff5ae3 freedreno/ir3: Don't use negative opc for meta instructions
Stricter compilers complain about this, ie:

  error: left operand of shift expression ‘(-1 << 7)’ is negative [-fpermissive]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21846>
2023-03-13 17:31:22 +00:00
Rob Clark 7c7761574e freedreno/ir3: Un-inline enums
It seems to be a thing that c++ dislikes

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21846>
2023-03-13 17:31:22 +00:00
Danylo Piliaiev 121e4ca87d ir3: Add cat5/cat7 cache related instructions
- tcinv - Likely Texture Cache Invalidate (unverified)
- icinv - Mostly sure that it is Instruction Cache Invalidate
- dccln - Data Cache Clean
- dcinv - Data Cache Invalidate
- dcflu - Data Cache Flush

The emission of these instructions were not observed in the wild.

TODO: find out the difference between .shr and .all modes of
      dccln, dcinv, dcflu.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14419>
2023-02-21 19:59:14 +00:00
Rob Clark eaf272aa93 ir3: Quiet unused variable warning
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21195>
2023-02-08 18:27:55 +00:00
Amber 228d812a0c ir3, isaspec: add raw instruction to assembler/disassembler.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20789>
2023-01-26 14:26:11 +00:00
Eric Engestrom cf520806b1 freedreno/ir3: fix -Wundef warning
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19963>
2022-11-23 19:41:44 +00:00
Yonggang Luo e399dc3544 util: normalize include files under src/util/*.h with util/ prefix in mesa code base
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19546>
2022-11-10 06:27:25 +00:00
Danylo Piliaiev 33e60798e1 ir3: Prevent reordering movmsk with kill
`kill` changes which fibers are active, thus reodering instructions
which depend on which fibers are active - is wrong.

The issue was hidden because only `ballot(true)` is translated to movmsk
immidiately, while others are passed as MACRO and don't properly
take part in ir3_sched (which does the reordering).

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7162

Fixes CTS test (on gen3+):
 dEQP-VK.spirv_assembly.instruction.terminate_invocation.terminate.subgroup_ballot

Fixes: b1b80c06a7
("ir3: Implement nir subgroup intrinsics")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18413>
2022-09-14 11:56:28 +00:00
Chia-I Wu b1cb764316 ir3: fix predicate splitting in scheduler
Fix up src->def->instr, not src->instr.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7014
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18096>
2022-08-26 17:36:18 +00:00
Chia-I Wu 8001c78d49 ir3: set UL flag before ir3_lower_subgroups
ir3_legalize_relative, extracted from ir3_legalize, assumes a0 is loaded
first in each block if there is any user in the block.
ir3_lower_subgroups breaks the assumption.  We need to do
ir3_legalize_relative first.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6902
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17720>
2022-07-27 17:08:03 +00:00
Marek Olšák c9ca8abe4f Change all debug_assert calls to assert
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17403>
2022-07-10 00:50:35 +00:00
Emma Anholt 01988667fd ir3: Retire the cp postsched pass now that we do RA in SSA.
Before, we needed CP post-sched to copy-propagate references to NIR
registers produced by out-of-ssa.  Now that we're in SSA, this pass ends
up not doing anything useful, and actually gets in the way by occasionally
creating a cycle in the DAG.

The entire shader-db impact is:

instructions HURT:   shaders/closed/steam/tropico-5/78.shader_test FRAG: 238 -> 242 (1.68%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17320>
2022-07-04 22:15:58 +00:00
Connor Abbott acba08b58f ir3: Implement and document ldc.k
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Connor Abbott 00d7ad334a ir3/legalize: Handle inserting (ei) with preamble
Make sure that shaders with a preamble are still considered
early-release so that we don't regress them.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Connor Abbott ccc64b7e00 ir3: Plumb through store_uniform_ir3 intrinsic
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Connor Abbott 3244e659e0 ir3: Implement basic shader preamble intrinsics
These will be used to implement the ir3-specific shader preamble
lowering in NIR. shps is conceptually similar to getone (although it
technically can't be duplicated) and shpe is similar to other barriers,
since it has to happen after any stores to the constant file in the
preamble. Add NIR intrinsics and plumbs them through ir3.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Rob Clark 05d6877235 freedreno/ir3: Don't try re-swapping cat3 srcs
This can lead us to endless loops of "progress".. Note fixes commit
commit really just exposed an existing problem.

Fixes: 9c9e8c3349 ("nir: Reorder ffma and fsub combining")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6133
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15336>
2022-03-12 16:42:00 +00:00
Rob Clark fa59556e1a freedreno/ir3: Remove unused define
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15336>
2022-03-12 16:42:00 +00:00
Connor Abbott 1a78604d20 ir3: Add support for subgroup arithmetic
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
2022-03-10 17:15:29 +00:00
Connor Abbott a433db60c1 ir3: Track physical edges when inserting (ss) for shared regs
Normally this wouldn't matter, but it will matter for the upcoming scan
macro because the running tally is communicated through a shared
register across a physical edge. It may also matter if a live-range
split occurs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
2022-03-10 17:15:29 +00:00
Connor Abbott 2ff5826f09 ir3/ra: Add IR3_REG_EARLY_CLOBBER
We'll need this to model the subgroup reduction macros.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
2022-03-10 17:15:29 +00:00
Ilia Mirkin 96211adf77 freedreno/a4xx: add swizzles to shader keys for tg4 workaround
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>
2022-03-03 18:26:43 +00:00
Danylo Piliaiev c1d5c318bc ir3: New cat3 instructions
* shrm - (src2 >> src1) & src3
* shlm - (src2 << src1) & src3
* shrg - (src2 >> src1) | src3
* shlg - (src2 << src1) | src3
* andg - (src2 & src1) | src3
* dp2acc - dot product of two {i,u}8vec2 packed into
  SRC1 and SRC2, added to 32b SRC3
* dp4acc - dot product of two {i,u}8vec4 packed into
  SRC1 and SRC2, added to 32b SRC3
* wmm - vec4(x_1, x_2, x_3, x_4) * (y_1 + y_2 + y_3 + y_4), which is
  duplicated (1 << (SRC3 / 32)) times starting from DST register
* wmm.accu - same as wmm but result is added to DST registers, however
  the first reg in each vec4 result is overwritten instead of
  accumulating.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>
2022-01-10 13:20:39 +02:00
Connor Abbott cb45120556 ir3: Use (ss) for instructions writing shared regs
The blob uses *both* nops and (ss). It turns out that in some rare cases
the hardware does take more than 6 cycles, at least for movmsk, but
adding nops is unnecessary. I believe the extra nops are only there due
to the immaturity of the blob's implementation of subgroup ops, so we
don't have to copy them - just handle shared reg producers the same as
SFU instructions.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
2022-01-07 14:26:08 +00:00
Connor Abbott 7e60978d30 ir3: Introduce systall metric and new helper functions
Add new centralized functions which will replace the various places we
hardcode 10 for the number of (ss) nops, add numbers for soft (sy) nops
based on similar computerator experiments with ldc, sam, and ldib (the
most common (sy) producers), and add a "systall" metric which is
analogous to sstall. This also fixes some cases where we'd erroniously
count ldl* as (sy) producers instead of (ss) producers when calculating
sstall.

This only switches over the metric reporting to the new functions, so
there is no behavior change. The following commit will switch over
the rest of the compiler.

While we're at it, remove max_sun as it's never set.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
2022-01-07 14:26:08 +00:00
Danylo Piliaiev c749da6135 ir3,turnip: Add support for GL_KHR_shader_subgroup_quad
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13817>
2021-12-07 20:45:53 +00:00
Danylo Piliaiev ded51fd39e ir3: Use getfiberid for SubgroupInvocationID on gen4
Since it requires (ss) categorize it as is_sfu() and not is_mem().

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13817>
2021-12-07 20:45:53 +00:00
Danylo Piliaiev d1c49901df ir3: Add gen4 new subgroup instructions
* getlast.w8 #4 - Perform jump for the first (CLUSTER_SIZE-1)
   fibers in a subgroup
* brcst.active.w8 - necessary to implement arithmetic subgroup
   operations with prefix sum.
* quad_shuffle.brcst - subgroupQuadBroadcast
* quad_shuffle.horiz - subgroupQuadSwapHorizontal
* quad_shuffle.vert - subgroupQuadSwapVertical
* quad_shuffle.diag - subgroupQuadSwapDiagonal
* getfiberid - gl_SubgroupID

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13817>
2021-12-07 20:45:53 +00:00
Danylo Piliaiev a78c36ecc6 ir3/cp: Prevent setting an address on subgroup macros
These macros expand to a mov in an if statement which breaks address
assumption that instruction which produces address and consumes it
are in the same block.

Fixes test:
 dEQP-VK.subgroups.ballot_broadcast.framebuffer.subgroupbroadcast_intvertex

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13931>
2021-11-25 12:18:48 +00:00
Danylo Piliaiev 5d5b1fc472 freedreno/ir3: add a6xx global atomics and separate atomic opcodes
Separating atomic opcodes makes possible to express a6xx global
atomics which take iova in SRC1. They would be needed by
VK_KHR_buffer_device_address.
The change also makes easier to distiguish atomics in conditions.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8717>
2021-11-23 18:26:37 +00:00
Connor Abbott 23a5f1a5ac ir3: Stop inserting nops during scheduling
Not necessary since nothing uses it anymore. This might have a slight
effect on spilling with multiple blocks, but no shader-db difference
because nothing spills.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
Connor Abbott 6f5c0d209c ir3/postsched: Rewrite delay handling
Analogous to the pre-RA scheduler. Unfortunately this time it's a bit
more involved because we have to correctly handle (rptN), which is
already relevant for swz. This means we need the index of the
destination register that conflicts with the source register, to handle
swz, and we need to expose that part of ir3_delay. But once that's done,
we can delete ir3_delay_calc_postra.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
Connor Abbott 140e117f2b ir3/delay: Ignore earlier definitions to the same register
We have a situation in some skia shaders like:

add.f r0.x, ...
(rpt2)nop
mul.f ..., r0.x
sam (xyzw) r0.x, ...
rcp ..., r0.x

Notice that rcp uses the result of the sam instruction, not the add.f,
but we didn't keep track of which instructions kill the sources in
ir3_delay, so we'd add an extra nop, resulting in a disagreement betwen
ir3_delay and the scheduling graph. Since postsched is correct, fix
ir3_delay. This only results in some very slight shader-db changes but
keeps the next commit from changing things.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
Connor Abbott d9a91318b1 ir3/sched: Rewrite delay handling
The old code walked the instructions between each ready instruction and
each of its parents for every instruction, which can quickly become
accidentally quadratic. Instead we keep track of the current
"instruction pointer" of the to-be-scheduled instruction, and for each
ready instruction calculate an "earliest possible IP" which is the IP
that needs to be reached before we can schedule it. Because this stays
constant as soon as an instruction becomes ready, we never have to
recompute it and each call to ir3_delay_calc_prera() becomes a simple
comparison and subtract. We only need to iterate over the children and
update their earliest_ip when scheduling an instruction, and we already
do that in util_day_prune_head() so it should be cheap.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
Matt Turner 2ab0cf2b54 freedreno/ir3: Use flat.b to load flat varyings on a6xx
The flat.b/bary.f cat2 instruction should be faster than an ldlv cat6
instruction, even with a couple of additional moves (which will be
removed in the next patch).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13558>
2021-11-04 02:59:28 +00:00
Matt Turner 2ee1b5a526 freedreno/ir3: Add infrastructure for flat.b
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13558>
2021-11-04 02:59:28 +00:00
Rob Clark f58438320c freedreno/ir3: Add ihadd/uhadd
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13300>
2021-10-21 18:59:57 +00:00
Rob Clark 81eefe0090 freedreno/ir3: 8bit fixes
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13300>
2021-10-21 18:59:57 +00:00
Rob Clark 8b0550f09f freedreno/isa: Fixes for validation
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13353>
2021-10-15 15:52:33 +00:00
Connor Abbott 470bf75ff8 ir3: Fix handling cat6 immediates
We were treating them the same as regular cat2/cat3/cat4 immediates, but
that's not right because cat6 sources are only 8 bits.

Our bindless code was handling this before for bindless resources, and
it was disabled for most other things, so this was mostly harmless, but
fixing it will be necessary for handling ldc offsets.

In addition enable tests for this that were just commented out, and add
a custom test making sure that the immediate source is treated as
unsigned.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13142>
2021-10-12 11:30:52 +00:00
Connor Abbott 1ed9a2f50c ir3: Handle special regs in regmask
Use the same hack as post-RA scheduling.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13142>
2021-10-12 11:30:52 +00:00
Connor Abbott a37f9602b7 ir3: Remove separate regmask.h
Inline it into its one user. There's no point in keeping it separate,
and in order to handle special registers it will have to become a bit
more intertwined with core ir3.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13142>
2021-10-12 11:30:52 +00:00
Connor Abbott 0209311c6e ir3: Use source in ir3_output_conv_src_type()
This was incorrectly converted when splitting the regs array. Noticed by
inspection.

Fixes: d3e08327cf ("ir3/core: Switch to srcs/dsts arrays")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13220>
2021-10-06 13:15:50 +00:00
Emma Anholt 1cc8523c5c freedreno/ir3: Use LDIB for coherent image loads on a5xx.
If the coherent flag is present, then we need to not have an incoherent
cache between us and previous stores to the image that were also decorated
as coherent.  isam apparently (unsurprisingly) goes through a texture
cache.  Use ldib instead, so that we don't get the wrong result.

We would need a similar fix for a4xx, but that uses ldgb and I don't
have hardware to test on.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12704>
2021-09-03 18:17:07 +00:00
Connor Abbott 62a7acee93 ir3: Make ir3_register::name 32-bits
It was overflowing with
dEQP-VK.spirv_assembly.instruction.compute.spirv_ids_abuse.lots_ids.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12487>
2021-08-31 20:33:49 +00:00
Connor Abbott 82c3dc220e ir3: Make instruction IP 32 bits
a6xx supports shaders with more than 64k dwords, or at least the shader
size register has increased in size, and the matching name is gone so
there's no reason to be clever here. This doesn't fix anything at the
moment.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12487>
2021-08-31 20:33:49 +00:00
Connor Abbott 9fd1616842 ir3: Remove ir3_instr::name
Unused since the switch to new RA.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12487>
2021-08-31 20:33:49 +00:00
Connor Abbott 0b39f4ab42 ir3, turnip, freedreno: Report stp/ldp in shader stats
This is important after spilling, so that we get an indication when a
change causes spilling.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>
2021-08-20 10:37:36 +00:00