Commit Graph

475 Commits

Author SHA1 Message Date
Boris Brezillon 2f68c58767 pan/blend: Allow passing blend constants through a sysval
This is needed to allow lowering blend operations in fragment shaders
when the blend operation uses dynamic blend constants.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13060>
2021-09-30 16:54:42 +02:00
Boris Brezillon f256ec2a88 pan/bi: Fix 1DArray image coordinate retrieval
In NIR image instructions, the array index always follows the last
image coordinate, meaning that array index is in coord.y for 1D
arrays. But the current panfrost ABI wants it in coord.z regardless of
the image dimension.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13078>
2021-09-28 18:44:05 +00:00
Boris Brezillon 2dd13257c8 pan/bi: Allow passing RT conversion descriptors to fragment shaders
The Vulkan copy shaders sometimes need to partially write a texel and
issue a load on the FRAGOUT variable in that case, but they do know
the format of the tile buffer in advance in that case. Let's not add an
RT_CONVERSION sysval if we can avoid it.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12095>
2021-09-21 14:59:38 +02:00
Boris Brezillon 04d6c593d0 pan/bi: Relax check on 8bit swizzles
Allow extracting components Y, Z or W from an 8bit vector.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12095>
2021-09-21 14:58:54 +02:00
Alyssa Rosenzweig 61c8e39649 pan/bi: Use CLPER_V6 on Mali G31
Apparently, CLPER_V7 is missing from Mali G31, but CLPER_V6 works. Fixes
INSTR_INVALID_ENC faults and failures in
dEQP-GLES3.functional.shaders.derivate.* on Dvalin.

Technically not an errata but an implementation difference. I suspect
Mali G51 will need this as well, should we ever allowlist it.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>
2021-08-20 20:13:27 +00:00
Alyssa Rosenzweig bfd3ae35c9 pan/bi: Use ST_TILE for multisampled blend output
ST_TILE lets us specify an explicit sample, whereas BLEND replicates to
all samples. This fully fixes the interaction between blend shaders and
multisampling on Bifrost, manifesting as
dEQP-GLES3.functional.fragment_ops.random.* failures with the
configuration rgba8888d24s8ms4.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>
2021-08-20 20:13:27 +00:00
Alyssa Rosenzweig 16394dc71a pan/bi: Set the sample ID for blend shader LD_TILE
Use the explicit sample mode and set the sample ID in the pixel indices
structure to the current sample ID. This fixes tilebuffer loads in blend
shaders on multisampled framebuffers.

Make sure the new routine is broken out to a helper for use with ST_TILE
in the next commit.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>
2021-08-20 20:13:27 +00:00
Alyssa Rosenzweig 9f19a883bc pan/bi: Extract load_sample_id to a helper
Will be reused in the next commit.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>
2021-08-20 20:13:27 +00:00
Icecream95 a9ab168e16 pan/bi,pan/mdg: Fix memory leak of hash tables
Despite being created with a ralloc context, some memory is still
leaked when not manually destroying hash tables.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12376>
2021-08-16 14:18:36 +00:00
Alyssa Rosenzweig d74ab1e4d9 pan/bi: Fuse DISCARD with conditions
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
2021-08-11 14:59:26 -04:00
Alyssa Rosenzweig ac636f5adb pan/bi: Use FCLAMP pseudo op for clamp prop
Map nir_op_fsat/etc to FCLAMP pseudo ops, instead of FADD. There are
significantly fewer knobs on FCLAMP, meaning significantly fewer things
to get wrong.

This fixes two(!) classes of bugs:

* Swizzles (failing to lower/compose swizzles on clamps)
* Numerical bugs (incorrectly treating +0.0 as an additive identity)

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
2021-08-11 14:59:26 -04:00
Alyssa Rosenzweig 89e452883a pan/bi: Use FABSNEG pseudo ops for modifier prop
Simplifies pattern matching. This commit by itself fixes multiple
numerical issues -- the previous fabsneg check failed to check the round
mode or the sign of the zero. That will break Vulkan/OpenCL.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
2021-08-11 14:59:26 -04:00
Alyssa Rosenzweig ec76119dfb pan/bi: Constant fold texturing lowerings
This ensures we can constant fold the ALU ops used to lower:

* explicit LOD calculations
* array textures
* texture offsets
* multisample indices

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
2021-08-11 14:59:24 -04:00
Alyssa Rosenzweig 3958f00215 pan/bi: Add a noopt debug option
To rule out buggy optimization passes when debugging.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12328>
2021-08-11 18:00:45 +00:00
Alyssa Rosenzweig ff03f096bf pan/bi: Make bi_opt_push_ubo optional
It's an optimization pass -- omitting it should not cause MMU faults
(!). Make sure the UBO push mask is set regardless of whether the pass
is called, and just call the pass when required.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12328>
2021-08-11 18:00:45 +00:00
Icecream95 ee2bb57f1e pan/bi: Use the computed scale for fexp NaN propagation
This makes pow(NaN, x) return NaN rather than 1.0.

Fixes: 499397700c ("pan/bi: Don't lower fpow")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5189
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12269>
2021-08-10 22:42:08 +00:00
Timothy Arceri a9ed4538ab nir: add indirect loop unrolling to compiler options
This is where it should be rather than having to pass it into the
optimisation pass every time.

It also allows us to call the loop analysis pass without having to
duplicate these options which we will do later in this series.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>
2021-08-03 10:54:50 +00:00
Alyssa Rosenzweig 1e29f57b3a pan/bi: Validate the live set starts empty
Otherwise there is an uninitialized read, and the register allocation
will fail. (In the sense of failing a precondition. This manifests as
synthetic interference leading to higher register pressure and useless
moves. The allocation itself is ok, but it indicates a real bug.)

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12130>
2021-07-29 23:40:46 +00:00
Alyssa Rosenzweig c4f8b52e06 pan/bi: Lower fragment output with <4 components
This avoids undefined behaviour in the shader, which will fail
validation added later in the series. shader-db results are neglible --
the extra moves required in a few cases are cancelled out by the extra
moves eliminated by allowing register allocation to work properly.

total instructions in shared programs: 146903 -> 146907 (<.01%)
instructions in affected programs: 33 -> 37 (12.12%)
helped: 0
HURT: 1

total tuples in shared programs: 123616 -> 123613 (<.01%)
tuples in affected programs: 764 -> 761 (-0.39%)
helped: 6
HURT: 4
helped stats (abs) min: 1.0 max: 4.0 x̄: 1.67 x̃: 1
helped stats (rel) min: 0.54% max: 5.88% x̄: 2.64% x̃: 1.86%
HURT stats (abs)   min: 1.0 max: 2.0 x̄: 1.75 x̃: 2
HURT stats (rel)   min: 4.55% max: 13.33% x̄: 8.57% x̃: 8.19%
95% mean confidence interval for tuples value: -1.73 1.13
95% mean confidence interval for tuples %-change: -2.72% 6.41%
Inconclusive result (value mean confidence interval includes 0).

total clauses in shared programs: 25656 -> 25654 (<.01%)
clauses in affected programs: 43 -> 41 (-4.65%)
helped: 2
HURT: 1
helped stats (abs) min: 1.0 max: 2.0 x̄: 1.50 x̃: 1
helped stats (rel) min: 6.25% max: 12.50% x̄: 9.38% x̃: 9.38%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 33.33% max: 33.33% x̄: 33.33% x̃: 33.33%

total cycles in shared programs: 12114.21 -> 12114.12 (<.01%)
cycles in affected programs: 27.42 -> 27.33 (-0.30%)
helped: 4
HURT: 3
helped stats (abs) min: 0.04166700000000034 max: 0.08333299999999966 x̄: 0.06 x̃: 0
helped stats (rel) min: 0.57% max: 1.59% x̄: 1.02% x̃: 0.96%
HURT stats (abs)   min: 0.0416669999999999 max: 0.08333299999999999 x̄: 0.06 x̃: 0
HURT stats (rel)   min: 4.17% max: 16.67% x̄: 8.80% x̃: 5.56%
95% mean confidence interval for cycles value: -0.07 0.05
95% mean confidence interval for cycles %-change: -2.90% 9.27%
Inconclusive result (value mean confidence interval includes 0).

total arith in shared programs: 4601.08 -> 4601.04 (<.01%)
arith in affected programs: 29 -> 28.96 (-0.14%)
helped: 6
HURT: 4
helped stats (abs) min: 0.04166700000000001 max: 0.08333299999999966 x̄: 0.06 x̃: 0
helped stats (rel) min: 0.57% max: 10.00% x̄: 3.63% x̃: 1.39%
HURT stats (abs)   min: 0.04166700000000001 max: 0.08333399999999991 x̄: 0.07 x̃: 0
HURT stats (rel)   min: 5.56% max: 16.67% x̄: 10.85% x̃: 10.60%
95% mean confidence interval for arith value: -0.05 0.05
95% mean confidence interval for arith %-change: -3.95% 8.28%
Inconclusive result (value mean confidence interval includes 0).

total quadwords in shared programs: 110008 -> 110002 (<.01%)
quadwords in affected programs: 1090 -> 1084 (-0.55%)
helped: 11
HURT: 8
helped stats (abs) min: 1.0 max: 7.0 x̄: 2.18 x̃: 1
helped stats (rel) min: 0.61% max: 13.16% x̄: 4.07% x̃: 1.82%
HURT stats (abs)   min: 1.0 max: 6.0 x̄: 2.25 x̃: 1
HURT stats (rel)   min: 3.70% max: 42.86% x̄: 12.55% x̃: 7.50%
95% mean confidence interval for quadwords value: -1.76 1.13
95% mean confidence interval for quadwords %-change: -2.95% 8.81%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12130>
2021-07-29 23:40:46 +00:00
Alyssa Rosenzweig a5727909e1 pan/bi: Rename CLPER_V7 back to CLPER
v6 is really the oddball here. CLPER on v9 supports a superset of v7.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12026>
2021-07-28 00:26:06 +00:00
Alyssa Rosenzweig be95198de5 pan/bi: DCE after bifrost_nir_lower_algebraic_late
Needed for sat_signed to fuse, since we run modifier prop before backend
DCE.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12026>
2021-07-28 00:26:06 +00:00
Alyssa Rosenzweig 8cbabbd532 pan/bi: Only call clause code on Bifrost
Valhall will have its own simpler code path.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12026>
2021-07-28 00:26:06 +00:00
Jason Ekstrand 74ec2b12be nir/lower_tex: Rework invalid implicit LOD lowering
Only fragment and some compute shaders support implicit derivatives.
They're totally meaningless without helper invocations and some
understanding of the dispatch pattern.  We've got code to lower
nir_texop_tex in these shader stages to use an explicit derivative of 0
but it was pretty badly broken:

 1. It only handled nir_texop_tex, not nir_texop_txb or nir_texop_lod.

 2. It didn't take min_lod into account

 3. It was conflated with adding a missing LOD parameter to opcodes
    which expect one such as nir_texop_txf.  While not really a bug,
    this does make it way harder to reason about the code.

 4. Unless you set a flag (which most drivers don't), it left the
    opcode nir_texop_tex instead of nir_texop_txl which it should have
    been.

This reworks it to go through roughly the same path as other LOD
lowering only with a constant lod of 0 instead of calling out to
nir_texop_lod.  We also get rid of the lower_tex_without_implicit_lod
flag because most drivers set it and those that don't are probably
subtly broken.  If someone really wants to get nir_texop_tex in their
vertex shaders, they can write a new patch to add the flag back in.

Fixes: e382890e25 "nir: set default lod to texture opcodes that..."
Fixes: d5ac5d6e83 "nir: Add option to lower tex to txl when..."
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>
2021-07-23 15:53:57 +00:00
Jason Ekstrand cdde108af5 panfrost: Don't handle nir_texop_txf_ms_mcs
It's an intel-specific opcode and will never be seen on panfrost

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>
2021-07-23 15:53:57 +00:00
Alyssa Rosenzweig b56f5c27de pan/bi: Add explicit cast for lod_or_mode
The enums alias. Fixes the following warning under clang:

../src/panfrost/bifrost/bifrost_compile.c:2515:25: warning: implicit conversion from enumeration type 'enum bifrost_texture_fetch' to different enumeration type 'enum bifrost_lod_mode' [-Wenum-conversion]
                        BIFROST_TEXTURE_FETCH_TEXEL;

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12003>
2021-07-22 17:55:49 +00:00
Alyssa Rosenzweig 84e2e5f23c pan/bi: Clean up useless casts
Left over from removing inheritance with sed instead of coccinelle.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11936>
2021-07-18 01:49:26 +00:00
Alyssa Rosenzweig 1fcea30295 pan/bi: Copy block bi_block
Gets rid of the silly inheritance everywhere, which has caused _far_
more problems in practice than it has fixed. It was an idea I tried
before the pandemic. It didn't work. I'm finally cleaning it up.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11936>
2021-07-18 01:49:26 +00:00
Alyssa Rosenzweig 8bee48a635 pan/bi: Copy back add_successor
Trying to get back independent block types.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11936>
2021-07-18 01:49:26 +00:00
Alyssa Rosenzweig 5996622050 pan/bi: Calculate dependency graph when bundling
Code is ported from Midgard, modified to be scalar, post-RA, and to put
the arrays on the worklist instead of the instruction to save memory.
This enables out-of-order scheduling.

total tuples in shared programs: 128691 -> 125304 (-2.63%)
tuples in affected programs: 114091 -> 110704 (-2.97%)
helped: 844
HURT: 377
helped stats (abs) min: 1.0 max: 150.0 x̄: 4.88 x̃: 3
helped stats (rel) min: 0.30% max: 26.42% x̄: 5.56% x̃: 4.35%
HURT stats (abs)   min: 1.0 max: 8.0 x̄: 1.94 x̃: 1
HURT stats (rel)   min: 0.20% max: 33.33% x̄: 6.84% x̃: 3.23%
95% mean confidence interval for tuples value: -3.16 -2.38
95% mean confidence interval for tuples %-change: -2.19% -1.27%
Tuples are helped.

total clauses in shared programs: 27579 -> 26059 (-5.51%)
clauses in affected programs: 20606 -> 19086 (-7.38%)
helped: 941
HURT: 39
helped stats (abs) min: 1.0 max: 21.0 x̄: 1.66 x̃: 1
helped stats (rel) min: 0.69% max: 44.44% x̄: 10.48% x̃: 9.09%
HURT stats (abs)   min: 1.0 max: 2.0 x̄: 1.15 x̃: 1
HURT stats (rel)   min: 1.89% max: 10.00% x̄: 4.73% x̃: 4.55%
95% mean confidence interval for clauses value: -1.63 -1.47
95% mean confidence interval for clauses %-change: -10.27% -9.48%
Clauses are helped.

total cycles in shared programs: 12262.54 -> 12154.79 (-0.88%)
cycles in affected programs: 2210.54 -> 2102.79 (-4.87%)
helped: 374
HURT: 56
helped stats (abs) min: 0.041665999999999315 max: 6.25 x̄: 0.30 x̃: 0
helped stats (rel) min: 0.42% max: 26.00% x̄: 6.90% x̃: 7.14%
HURT stats (abs)   min: 0.041665999999999315 max: 0.5833319999999986 x̄: 0.11 x̃: 0
HURT stats (rel)   min: 0.16% max: 100.00% x̄: 55.17% x̃: 50.00%
95% mean confidence interval for cycles value: -0.29 -0.21
95% mean confidence interval for cycles %-change: -1.37% 3.73%
Inconclusive result (%-change mean confidence interval includes 0).

total arith in shared programs: 4852.29 -> 4658.13 (-4.00%)
arith in affected programs: 4525.17 -> 4331 (-4.29%)
helped: 1112
HURT: 166
helped stats (abs) min: 0.041665999999999315 max: 6.25 x̄: 0.19 x̃: 0
helped stats (rel) min: 0.42% max: 33.33% x̄: 6.59% x̃: 5.36%
HURT stats (abs)   min: 0.041665999999999315 max: 0.5833319999999986 x̄: 0.07 x̃: 0
HURT stats (rel)   min: 0.16% max: 100.00% x̄: 25.05% x̃: 2.40%
95% mean confidence interval for arith value: -0.17 -0.14
95% mean confidence interval for arith %-change: -3.44% -1.51%
Arith are helped.

total quadwords in shared programs: 117141 -> 111394 (-4.91%)
quadwords in affected programs: 104390 -> 98643 (-5.51%)
helped: 1245
HURT: 76
helped stats (abs) min: 1.0 max: 69.0 x̄: 4.74 x̃: 4
helped stats (rel) min: 0.28% max: 35.00% x̄: 7.88% x̃: 6.45%
HURT stats (abs)   min: 1.0 max: 8.0 x̄: 2.01 x̃: 1
HURT stats (rel)   min: 0.20% max: 10.00% x̄: 3.52% x̃: 4.25%
95% mean confidence interval for quadwords value: -4.61 -4.09
95% mean confidence interval for quadwords %-change: -7.56% -6.88%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10961>
2021-07-12 23:29:12 +00:00
Alyssa Rosenzweig 6bf8e960fa pan/bi: Do helper termination analysis on clauses
Unlike the dependency analysis for the skip bits which is a function of
the data flow graph, the thread termination analysis is dependent on the
actual sequence of instructions. As such, it must be done after
scheduling to be correct in the presence of out-of-order scheduling.

Furthermore it's specified in terms of clauses, not instructions.
Reflecting this in our code gets a nice simplification. As a side effect
this puts extra td flags on subsequent clauses, which matches the DDK's
behaviour. (Maybe td is just a hint?)

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10961>
2021-07-12 23:29:12 +00:00
Alyssa Rosenzweig 499397700c pan/bi: Don't lower fpow
We can fuse the intermediate multiply with the FMA_RSCALE in the
exponent code and save an instruction. Whether this is better than
adding a NIR op remains to be seen.

total instructions in shared programs: 146614 -> 146190 (-0.29%)
instructions in affected programs: 40724 -> 40300 (-1.04%)
helped: 157
HURT: 0
helped stats (abs) min: 1.0 max: 9.0 x̄: 2.70 x̃: 2
helped stats (rel) min: 0.22% max: 10.34% x̄: 1.37% x̃: 1.20%
95% mean confidence interval for instructions value: -3.00 -2.40
95% mean confidence interval for instructions %-change: -1.58% -1.15%
Instructions are helped.

total tuples in shared programs: 128116 -> 127696 (-0.33%)
tuples in affected programs: 33421 -> 33001 (-1.26%)
helped: 150
HURT: 0
helped stats (abs) min: 1.0 max: 16.0 x̄: 2.80 x̃: 2
helped stats (rel) min: 0.28% max: 4.37% x̄: 1.36% x̃: 1.07%
95% mean confidence interval for tuples value: -3.24 -2.36
95% mean confidence interval for tuples %-change: -1.50% -1.21%
Tuples are helped.

total clauses in shared programs: 27531 -> 27483 (-0.17%)
clauses in affected programs: 719 -> 671 (-6.68%)
helped: 20
HURT: 0
helped stats (abs) min: 1.0 max: 8.0 x̄: 2.40 x̃: 1
helped stats (rel) min: 1.61% max: 12.90% x̄: 6.96% x̃: 5.33%
95% mean confidence interval for clauses value: -3.48 -1.32
95% mean confidence interval for clauses %-change: -9.10% -4.82%
Clauses are helped.

total cycles in shared programs: 12250.81 -> 12233.69 (-0.14%)
cycles in affected programs: 1251.50 -> 1234.38 (-1.37%)
helped: 141
HURT: 0
helped stats (abs) min: 0.041665999999999315 max: 0.6666670000000003 x̄: 0.12 x̃: 0
helped stats (rel) min: 0.29% max: 5.00% x̄: 1.48% x̃: 1.20%
95% mean confidence interval for cycles value: -0.14 -0.10
95% mean confidence interval for cycles %-change: -1.63% -1.32%
Cycles are helped.

total arith in shared programs: 4840.25 -> 4822.71 (-0.36%)
arith in affected programs: 1324.08 -> 1306.54 (-1.32%)
helped: 151
HURT: 0
helped stats (abs) min: 0.041665999999999315 max: 0.6666670000000003 x̄: 0.12 x̃: 0
helped stats (rel) min: 0.29% max: 5.00% x̄: 1.43% x̃: 1.13%
95% mean confidence interval for arith value: -0.13 -0.10
95% mean confidence interval for arith %-change: -1.59% -1.28%
Arith are helped.

total texture in shared programs: 1666.50 -> 1666.50 (0.00%)
texture in affected programs: 0 -> 0
helped: 0
HURT: 0

total vary in shared programs: 639.06 -> 639.06 (0.00%)
vary in affected programs: 0 -> 0
helped: 0
HURT: 0

total ldst in shared programs: 9682 -> 9682 (0.00%)
ldst in affected programs: 0 -> 0
helped: 0
HURT: 0

total quadwords in shared programs: 116758 -> 116378 (-0.33%)
quadwords in affected programs: 28054 -> 27674 (-1.35%)
helped: 148
HURT: 2
helped stats (abs) min: 1.0 max: 16.0 x̄: 2.58 x̃: 2
helped stats (rel) min: 0.29% max: 5.13% x̄: 1.54% x̃: 1.23%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.67% max: 0.85% x̄: 0.76% x̃: 0.76%
95% mean confidence interval for quadwords value: -2.94 -2.12
95% mean confidence interval for quadwords %-change: -1.69% -1.33%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
2021-07-09 23:15:28 +00:00
Alyssa Rosenzweig 91f130fa1e pan/bi: Factor out exp2/log2 code
Will be reused for fpow.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
2021-07-09 23:15:28 +00:00
Alyssa Rosenzweig 0991e9592d pan/bi: Comment the fexp2 implementation
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
2021-07-09 23:15:28 +00:00
Alyssa Rosenzweig 297ac6e453 pan/bi: Simplify cube map descriptor generation
We don't need to do the bitwise manipulation ourselves, we can just use
a bitwise MUX instead.

total instructions in shared programs: 146840 -> 146614 (-0.15%)
instructions in affected programs: 15037 -> 14811 (-1.50%)
helped: 109
HURT: 0
helped stats (abs) min: 2.0 max: 4.0 x̄: 2.07 x̃: 2
helped stats (rel) min: 0.86% max: 4.00% x̄: 1.70% x̃: 1.77%
95% mean confidence interval for instructions value: -2.15 -2.00
95% mean confidence interval for instructions %-change: -1.81% -1.59%
Instructions are helped.

total tuples in shared programs: 128149 -> 128116 (-0.03%)
tuples in affected programs: 2896 -> 2863 (-1.14%)
helped: 16
HURT: 0
helped stats (abs) min: 1.0 max: 5.0 x̄: 2.06 x̃: 1
helped stats (rel) min: 0.65% max: 2.33% x̄: 1.16% x̃: 0.70%
95% mean confidence interval for tuples value: -3.01 -1.12
95% mean confidence interval for tuples %-change: -1.50% -0.83%
Tuples are helped.

total cycles in shared programs: 12257.10 -> 12250.81 (-0.05%)
cycles in affected programs: 449.87 -> 443.58 (-1.40%)
helped: 92
HURT: 0
helped stats (abs) min: 0.0416660000000002 max: 0.20833400000000069 x̄: 0.07 x̃: 0
helped stats (rel) min: 0.93% max: 2.53% x̄: 1.40% x̃: 1.26%
95% mean confidence interval for cycles value: -0.08 -0.06
95% mean confidence interval for cycles %-change: -1.48% -1.32%
Cycles are helped.

total arith in shared programs: 4847.33 -> 4840.25 (-0.15%)
arith in affected programs: 490.37 -> 483.29 (-1.44%)
helped: 109
HURT: 0
helped stats (abs) min: 0.0416660000000002 max: 0.20833400000000069 x̄: 0.06 x̃: 0
helped stats (rel) min: 0.93% max: 5.56% x̄: 1.51% x̃: 1.26%
95% mean confidence interval for arith value: -0.07 -0.06
95% mean confidence interval for arith %-change: -1.64% -1.39%
Arith are helped.

total quadwords in shared programs: 116775 -> 116758 (-0.01%)
quadwords in affected programs: 1331 -> 1314 (-1.28%)
helped: 7
HURT: 0
helped stats (abs) min: 1.0 max: 4.0 x̄: 2.43 x̃: 3
helped stats (rel) min: 0.91% max: 2.38% x̄: 1.65% x̃: 1.39%
95% mean confidence interval for quadwords value: -3.48 -1.38
95% mean confidence interval for quadwords %-change: -2.27% -1.04%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
2021-07-09 23:15:28 +00:00
Alyssa Rosenzweig 99b2dddebf pan/bi: Add a constant subexpression elimination pass
ALU only. Intended to clean up the lowerings used with complex
texturings. Ex: if a shader reads two cube maps at the same coordinates,
this deduplicates the cube map transformation.

This needs to happen in the backend since we do the cube map
transformation with the backend builder, rather than special NIR ops.
This is a tradeoff.

Pass based on ir3's, which in turn is inspired by NIR's.

total instructions in shared programs: 148799 -> 147348 (-0.98%)
instructions in affected programs: 20509 -> 19058 (-7.07%)
helped: 145
HURT: 0
helped stats (abs) min: 4.0 max: 30.0 x̄: 10.01 x̃: 8
helped stats (rel) min: 1.92% max: 54.55% x̄: 10.87% x̃: 7.41%
95% mean confidence interval for instructions value: -10.73 -9.28
95% mean confidence interval for instructions %-change: -12.81% -8.94%
Instructions are helped.

total tuples in shared programs: 129992 -> 128908 (-0.83%)
tuples in affected programs: 17624 -> 16540 (-6.15%)
helped: 145
HURT: 0
helped stats (abs) min: 2.0 max: 25.0 x̄: 7.48 x̃: 7
helped stats (rel) min: 0.74% max: 42.86% x̄: 9.16% x̃: 7.22%
95% mean confidence interval for tuples value: -7.96 -6.99
95% mean confidence interval for tuples %-change: -10.52% -7.79%
Tuples are helped.

total clauses in shared programs: 27632 -> 27582 (-0.18%)
clauses in affected programs: 1077 -> 1027 (-4.64%)
helped: 44
HURT: 0
helped stats (abs) min: 1.0 max: 3.0 x̄: 1.14 x̃: 1
helped stats (rel) min: 2.50% max: 16.67% x̄: 4.99% x̃: 4.45%
95% mean confidence interval for clauses value: -1.26 -1.01
95% mean confidence interval for clauses %-change: -5.70% -4.27%
Clauses are helped.

total cycles in shared programs: 12323 -> 12285.63 (-0.30%)
cycles in affected programs: 618.25 -> 580.88 (-6.05%)
helped: 120
HURT: 0
helped stats (abs) min: 0.08333299999999966 max: 0.5416680000000014 x̄: 0.31 x̃: 0
helped stats (rel) min: 0.77% max: 66.67% x̄: 7.60% x̃: 7.37%
95% mean confidence interval for cycles value: -0.33 -0.29
95% mean confidence interval for cycles %-change: -8.73% -6.47%
Cycles are helped.

total arith in shared programs: 4916.75 -> 4866.88 (-1.01%)
arith in affected programs: 677.79 -> 627.92 (-7.36%)
helped: 145
HURT: 0
helped stats (abs) min: 0.08333299999999966 max: 1.0833329999999997 x̄: 0.34 x̃: 0
helped stats (rel) min: 0.77% max: 66.67% x̄: 12.81% x̃: 7.87%
95% mean confidence interval for arith value: -0.37 -0.32
95% mean confidence interval for arith %-change: -15.33% -10.29%
Arith are helped.

total quadwords in shared programs: 118117 -> 117262 (-0.72%)
quadwords in affected programs: 15283 -> 14428 (-5.59%)
helped: 143
HURT: 0
helped stats (abs) min: 1.0 max: 23.0 x̄: 5.98 x̃: 5
helped stats (rel) min: 0.44% max: 25.71% x̄: 7.56% x̃: 5.56%
95% mean confidence interval for quadwords value: -6.46 -5.50
95% mean confidence interval for quadwords %-change: -8.59% -6.53%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
2021-07-09 23:15:28 +00:00
Alyssa Rosenzweig 3fef3b6afc pan/bi: Analyze helper invocations
Set the .skip bit on texture instructions and the terminate discarded
threads bit on the clause header based on data flow analysis of helper
invocations. This code is adapted from Midgard, which requires the same
analysis with a few details changed.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
2021-07-09 23:15:28 +00:00
Alyssa Rosenzweig 134c74e301 pan/bi: Track LOD mode even for TEXC
Redundant with the texture operation descriptor, but we don't want to
parse that in the rest of the compiler. Handling it as a pseudo-modifier
lets us share a code path with TEXS.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
2021-07-09 23:15:28 +00:00
Alyssa Rosenzweig b7ca125278 pan/bi: Report cycle counts
Based on analysis of results from the Mali Offline Compiler. I am
uncertain how well these translate to real life, and they are
normalized counts only...

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
2021-07-09 23:15:28 +00:00
Alyssa Rosenzweig 113157dafe pan/bi: Try to hit full occupancy on v7
Bifrost v7 trades off register pressure and occupancy. If we restrict to
[R0, R15] U [R48, R63], we get full occupancy, but if we use the full
register file, we only get half occupancy. Try to allocate just 32
registers, and only use the full 64 registers if that would spill.

Clever heuristics could make this both more effective (live range
splitting, shuffling, spilling if deemed acceptable) and cheaper at
compile-time (tracking maximum liveness to determine if it's possible to
hit at all). For now, this should suffice.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
2021-07-09 23:15:28 +00:00
Icecream95 c246af0dd8 panfrost: Only upload UBOs when needed
If all of the used values from a UBO are pushed, it doesn't need to be
uploaded.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11700>
2021-07-03 13:23:29 +00:00
Jason Ekstrand 0afbfee8da nir,panfrost: Suffix fsat_signed and fclamp_pos with _mali
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11463>
2021-06-21 09:03:34 -05:00
Alyssa Rosenzweig 295e65bb50 pan/bi: Add back custom algebraic opts
Right now just do a trivial one to test the infrastructure. In the next
commit we'll use this for a more interesting optimization that's a bit
painful in BIR but trivial with nir_search.

total instructions in shared programs: 149566 -> 149562 (<.01%)
instructions in affected programs: 502 -> 498 (-0.80%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.33 x̃: 1
helped stats (rel) min: 0.38% max: 1.30% x̄: 0.97% x̃: 1.21%

total tuples in shared programs: 130957 -> 130487 (-0.36%)
tuples in affected programs: 54752 -> 54282 (-0.86%)
helped: 303
HURT: 2
helped stats (abs) min: 1 max: 29 x̄: 1.56 x̃: 1
helped stats (rel) min: 0.13% max: 7.14% x̄: 1.08% x̃: 0.92%
HURT stats (abs)   min: 1 max: 2 x̄: 1.50 x̃: 1
HURT stats (rel)   min: 1.89% max: 2.99% x̄: 2.44% x̃: 2.44%
95% mean confidence interval for tuples value: -1.79 -1.30
95% mean confidence interval for tuples %-change: -1.17% -0.95%
Tuples are helped.

total clauses in shared programs: 27877 -> 27827 (-0.18%)
clauses in affected programs: 1556 -> 1506 (-3.21%)
helped: 45
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.11 x̃: 1
helped stats (rel) min: 1.43% max: 9.52% x̄: 3.88% x̃: 3.57%
95% mean confidence interval for clauses value: -1.21 -1.02
95% mean confidence interval for clauses %-change: -4.38% -3.39%
Clauses are helped.

total quadwords in shared programs: 119058 -> 118563 (-0.42%)
quadwords in affected programs: 33777 -> 33282 (-1.47%)
helped: 250
HURT: 2
helped stats (abs) min: 1 max: 29 x̄: 1.99 x̃: 1
helped stats (rel) min: 0.23% max: 11.11% x̄: 1.67% x̃: 1.40%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.64% max: 2.00% x̄: 1.82% x̃: 1.82%
95% mean confidence interval for quadwords value: -2.27 -1.66
95% mean confidence interval for quadwords %-change: -1.80% -1.49%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11327>
2021-06-15 20:27:22 +00:00
Alyssa Rosenzweig 41070fedca pan/bi: Propagate fabs/neg/sat
Initial support for modifier propagation. Bifrost makes this
unreasonably hard.

total instructions in shared programs: 151604 -> 150761 (-0.56%)
instructions in affected programs: 48773 -> 47930 (-1.73%)
helped: 212
HURT: 0
helped stats (abs) min: 1 max: 28 x̄: 3.98 x̃: 1
helped stats (rel) min: 0.29% max: 12.70% x̄: 1.75% x̃: 1.26%
95% mean confidence interval for instructions value: -4.71 -3.25
95% mean confidence interval for instructions %-change: -1.97% -1.53%
Instructions are helped.

total tuples in shared programs: 131876 -> 131560 (-0.24%)
tuples in affected programs: 25393 -> 25077 (-1.24%)
helped: 104
HURT: 3
helped stats (abs) min: 1 max: 28 x̄: 3.08 x̃: 2
helped stats (rel) min: 0.34% max: 8.57% x̄: 1.55% x̃: 1.04%
HURT stats (abs)   min: 1 max: 2 x̄: 1.33 x̃: 1
HURT stats (rel)   min: 0.51% max: 2.86% x̄: 1.30% x̃: 0.53%
95% mean confidence interval for tuples value: -3.63 -2.28
95% mean confidence interval for tuples %-change: -1.73% -1.21%
Tuples are helped.

total clauses in shared programs: 28122 -> 28032 (-0.32%)
clauses in affected programs: 2720 -> 2630 (-3.31%)
helped: 58
HURT: 1
helped stats (abs) min: 1 max: 6 x̄: 1.57 x̃: 1
helped stats (rel) min: 0.88% max: 14.29% x̄: 4.06% x̃: 3.67%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 7.69% max: 7.69% x̄: 7.69% x̃: 7.69%
95% mean confidence interval for clauses value: -1.85 -1.20
95% mean confidence interval for clauses %-change: -4.60% -3.13%
Clauses are helped.

total quadwords in shared programs: 119778 -> 119509 (-0.22%)
quadwords in affected programs: 20698 -> 20429 (-1.30%)
helped: 95
HURT: 1
helped stats (abs) min: 1 max: 28 x̄: 2.85 x̃: 2
helped stats (rel) min: 0.38% max: 7.14% x̄: 1.50% x̃: 1.13%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23%
95% mean confidence interval for quadwords value: -3.49 -2.11
95% mean confidence interval for quadwords %-change: -1.71% -1.20%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11327>
2021-06-15 20:27:22 +00:00
Alyssa Rosenzweig e41d8ed007 pan/bi: Report tuples, not nops, in shader-db
More useful in practice, I find.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11327>
2021-06-15 20:27:22 +00:00
Alyssa Rosenzweig ecd9a3fed5 pan/bi: Handle fsat_signed and fclamp_pos
Translates to Mali-specific clamps.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11327>
2021-06-15 20:27:22 +00:00
Alyssa Rosenzweig a5ec1e7f75 pan/bi: Force u32 for flat varyings
Since the GLSL compilers will pack together flat varyings with no regard
to type, under the assumption the backend can deal with it. I guess we
can deal with it then... Fixes fails in
dEQP-GLES31.functional.separate_shader.random.*

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
2021-06-10 18:06:10 +00:00
Alyssa Rosenzweig 6dbaae77d9 pan/bi: Emit a dummy ATEST if needed
Match what the blob does, since Bifrost has so many random errata we'd
be fools not to.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
2021-06-10 18:06:10 +00:00
Alyssa Rosenzweig b947ab8b10 pan/bi: Lower 64-bit ints again
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
2021-06-10 18:06:10 +00:00
Alyssa Rosenzweig 95458c4033 pan/bi: Lower stores with component != 0
If the shader packs multiple varyings into the same location with
different location_frac, we'll need to lower to a single varying store
that collects all of the channels together. This is not trivial during
code gen, but it is trivial to do in NIR right before codegen by relying
on nir_lower_io_to_temporaries. Since we're guaranteed all varyings will
be written exactly once, in the exit block, we can scan the shader
linearly and collect stores together in a single pass.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
2021-06-10 18:06:10 +00:00
Alyssa Rosenzweig de42707101 pan/bi: Lower loads with component > 0
We have no native way to swizzle out a nonzero component in a load, but
we can simply load extra components and do the swizzle in shader
instructions. This is inefficient, since it loads data to discard
immediately, but it's required for conformance in some edge cases.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11123>
2021-06-10 18:06:10 +00:00