Commit Graph

1743 Commits

Author SHA1 Message Date
Alyssa Rosenzweig 90beea75f6 pan/bi: Don't reorder push with no_ubo_to_push
Otherwise, load_push_constant won't work properly. This could probably be made
to work if we tried hard enough, but we still don't want reordering for internal
(meta) shaders which are layed out deliberately.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16916>
2022-06-08 13:42:42 +00:00
Alyssa Rosenzweig 17ea1642e2 pan/bi: Implement load_push_constant
Bifrost supports "fast access uniforms" loaded from a single contiguous buffer.
This maps directly to Vulkan push constants, with some caveats:

* No indirect access. Indirects need to be lowered to a UBO pull.
* Strict alignment requirements. These will be met in practice.

Implement the NIR intrinsic and map it to the native hardware construct.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16916>
2022-06-08 13:42:42 +00:00
Alyssa Rosenzweig 28801cfba0 pan/va: Unit test constant lowering pass
Like other optimizations, breaking this pass may not affect functional
correctness. It's also dead simple to unit test the pass, so we have no excuse
not to. Add unit tests for the functionality we currently support, since we just
extended it and want to make sure everything still works.

This includes tests for use of modifiers to get more small constants. There are
lots of subtle gotchas there, so let's add lots of unit tests to make sure we
got it right.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862>
2022-06-06 18:10:24 +00:00
Alyssa Rosenzweig 9cfafbb09b pan/va: Try widening small constants
Many small integers are availabled as small constants, but the table of small
constants is tightly packed. Zero and sign extensions are usually required to
access small integers. When packing constants, try zero/sign extension for
unsigned/signed integer instructions respectively.

total instructions in shared programs: 2716912 -> 2707795 (-0.34%)
instructions in affected programs: 1045609 -> 1036492 (-0.87%)
helped: 4460
HURT: 125
helped stats (abs) min: 1.0 max: 58.0 x̄: 2.14 x̃: 1
helped stats (rel) min: 0.14% max: 23.85% x̄: 1.35% x̃: 0.88%
HURT stats (abs)   min: 1.0 max: 68.0 x̄: 3.41 x̃: 1
HURT stats (rel)   min: 0.34% max: 3.88% x̄: 0.93% x̃: 0.70%
95% mean confidence interval for instructions value: -2.09 -1.89
95% mean confidence interval for instructions %-change: -1.33% -1.25%
Instructions are helped.

total cycles in shared programs: 141984.06 -> 141932.42 (-0.04%)
cycles in affected programs: 552.08 -> 500.44 (-9.35%)
helped: 18
HURT: 0
helped stats (abs) min: 0.015625 max: 11.0 x̄: 2.87 x̃: 0
helped stats (rel) min: 0.50% max: 19.64% x̄: 5.36% x̃: 1.53%
95% mean confidence interval for cycles value: -5.17 -0.56
95% mean confidence interval for cycles %-change: -9.28% -1.44%
Cycles are helped.

total cvt in shared programs: 13805.05 -> 13663.39 (-1.03%)
cvt in affected programs: 6127.45 -> 5985.80 (-2.31%)
helped: 4460
HURT: 125
helped stats (abs) min: 0.015625 max: 0.90625 x̄: 0.03 x̃: 0
helped stats (rel) min: 0.35% max: 50.00% x̄: 5.19% x̃: 4.00%
HURT stats (abs)   min: 0.015625 max: 1.0625 x̄: 0.05 x̃: 0
HURT stats (rel)   min: 0.77% max: 9.30% x̄: 3.40% x̃: 2.78%
95% mean confidence interval for cvt value: -0.03 -0.03
95% mean confidence interval for cvt %-change: -5.10% -4.81%
Cvt are helped.

total ls in shared programs: 129545 -> 129494 (-0.04%)
ls in affected programs: 495 -> 444 (-10.30%)
helped: 6
HURT: 0
helped stats (abs) min: 2.0 max: 11.0 x̄: 8.50 x̃: 11
helped stats (rel) min: 1.49% max: 19.64% x̄: 13.95% x̃: 19.64%
95% mean confidence interval for ls value: -12.68 -4.32
95% mean confidence interval for ls %-change: -23.23% -4.67%
Ls are helped.

total quadwords in shared programs: 1476416 -> 1469824 (-0.45%)
quadwords in affected programs: 121208 -> 114616 (-5.44%)
helped: 820
HURT: 16
helped stats (abs) min: 8.0 max: 32.0 x̄: 8.28 x̃: 8
helped stats (rel) min: 1.39% max: 50.00% x̄: 11.00% x̃: 10.00%
HURT stats (abs)   min: 8.0 max: 32.0 x̄: 12.50 x̃: 8
HURT stats (rel)   min: 1.38% max: 10.00% x̄: 6.19% x̃: 7.14%
95% mean confidence interval for quadwords value: -8.14 -7.63
95% mean confidence interval for quadwords %-change: -11.20% -10.15%
Quadwords are helped.

total threads in shared programs: 53633 -> 53663 (0.06%)
threads in affected programs: 39 -> 69 (76.92%)
helped: 33
HURT: 3
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.64 1.02
95% mean confidence interval for threads %-change: 73.27% 101.73%
Threads are helped.

total spills in shared programs: 154 -> 103 (-33.12%)
spills in affected programs: 75 -> 24 (-68.00%)
helped: 6
HURT: 0

total fills in shared programs: 656 -> 656 (0.00%)
fills in affected programs: 148 -> 148 (0.00%)
helped: 2
HURT: 4

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862>
2022-06-06 18:10:23 +00:00
Alyssa Rosenzweig 72146051d5 pan/va: Try negating small constants when lowering
If a constant is used with a floating point instruction with a floating-point
negate modifier, we can use the modifier to negate constants in the table for
free. Each floating point in the table is positive, so this is required for
negative small constants.

total instructions in shared programs: 2728438 -> 2716912 (-0.42%)
instructions in affected programs: 1418220 -> 1406694 (-0.81%)
helped: 6053
HURT: 94
helped stats (abs) min: 1.0 max: 43.0 x̄: 1.94 x̃: 1
helped stats (rel) min: 0.06% max: 18.18% x̄: 1.34% x̃: 0.84%
HURT stats (abs)   min: 1.0 max: 5.0 x̄: 2.34 x̃: 2
HURT stats (rel)   min: 0.09% max: 21.43% x̄: 1.87% x̃: 0.91%
95% mean confidence interval for instructions value: -1.93 -1.82
95% mean confidence interval for instructions %-change: -1.34% -1.25%
Instructions are helped.

total cycles in shared programs: 142103 -> 141984.06 (-0.08%)
cycles in affected programs: 766.70 -> 647.77 (-15.51%)
helped: 97
HURT: 0
helped stats (abs) min: 0.015625 max: 40.0 x̄: 1.23 x̃: 0
helped stats (rel) min: 0.27% max: 41.24% x̄: 3.63% x̃: 2.08%
95% mean confidence interval for cycles value: -2.41 -0.04
95% mean confidence interval for cycles %-change: -4.68% -2.57%
Cycles are helped.

total cvt in shared programs: 13983.34 -> 13805.05 (-1.28%)
cvt in affected programs: 7952.45 -> 7774.16 (-2.24%)
helped: 6049
HURT: 98
helped stats (abs) min: 0.015625 max: 0.359375 x̄: 0.03 x̃: 0
helped stats (rel) min: 0.25% max: 100.00% x̄: 4.74% x̃: 2.52%
HURT stats (abs)   min: 0.015625 max: 0.078125 x̄: 0.04 x̃: 0
HURT stats (rel)   min: 0.17% max: 100.00% x̄: 5.48% x̃: 2.54%
95% mean confidence interval for cvt value: -0.03 -0.03
95% mean confidence interval for cvt %-change: -4.83% -4.32%
Cvt are helped.

total ls in shared programs: 129660 -> 129545 (-0.09%)
ls in affected programs: 601 -> 486 (-19.13%)
helped: 7
HURT: 0
helped stats (abs) min: 3.0 max: 40.0 x̄: 16.43 x̃: 8
helped stats (rel) min: 2.88% max: 41.24% x̄: 17.41% x̃: 12.50%
95% mean confidence interval for ls value: -31.42 -1.44
95% mean confidence interval for ls %-change: -29.25% -5.58%
Ls are helped.

total quadwords in shared programs: 1482728 -> 1476416 (-0.43%)
quadwords in affected programs: 131200 -> 124888 (-4.81%)
helped: 798
HURT: 15
helped stats (abs) min: 8.0 max: 24.0 x̄: 8.06 x̃: 8
helped stats (rel) min: 0.34% max: 50.00% x̄: 10.15% x̃: 6.67%
HURT stats (abs)   min: 8.0 max: 8.0 x̄: 8.00 x̃: 8
HURT stats (rel)   min: 1.49% max: 100.00% x̄: 11.25% x̃: 2.78%
95% mean confidence interval for quadwords value: -7.92 -7.60
95% mean confidence interval for quadwords %-change: -10.52% -8.99%
Quadwords are helped.

total threads in shared programs: 53585 -> 53633 (0.09%)
threads in affected programs: 51 -> 99 (94.12%)
helped: 49
HURT: 1
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.88 1.04
95% mean confidence interval for threads %-change: 90.97% 103.03%
Threads are helped.

total spills in shared programs: 125 -> 154 (23.20%)
spills in affected programs: 75 -> 104 (38.67%)
helped: 3
HURT: 4

total fills in shared programs: 800 -> 656 (-18.00%)
fills in affected programs: 476 -> 332 (-30.25%)
helped: 7
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862>
2022-06-06 18:10:23 +00:00
Alyssa Rosenzweig cecfa0c44a pan/va: Record which instructions are signed
We need to distinguish signed integer instructions from unsigned integer
instructions, to distinguish sign-extension and zero-extension of sources.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862>
2022-06-06 18:10:23 +00:00
Alyssa Rosenzweig e57dfed419 pan/bi: Implement b2i with MUX
The result_type modifier propagation looks for MUX instructions, so using this
canonical b2i implementation allows the sequence b2i(cmp) to be fused.
It's also faster on its own: on Valhall, MUX may be implemented as CSEL on the
CVT unit, while AND may only be implemented on the SFU unit. So in case this
doesn't get fused, we expect 4x better throughput for b2i with this
implementation. Similarly, on Bifrost, MUX may be scheduled to either unit (as
CSEL on FMA or MUX on ADD), whereas AND may only be scheduled to FMA.

Results on Mali-G52:

total instructions in shared programs: 2419171 -> 2414814 (-0.18%)
instructions in affected programs: 272203 -> 267846 (-1.60%)
helped: 767
HURT: 0
helped stats (abs) min: 1.0 max: 138.0 x̄: 5.68 x̃: 2
helped stats (rel) min: 0.12% max: 15.57% x̄: 2.09% x̃: 0.68%
95% mean confidence interval for instructions value: -6.68 -4.68
95% mean confidence interval for instructions %-change: -2.37% -1.82%
Instructions are helped.

total tuples in shared programs: 1932822 -> 1929234 (-0.19%)
tuples in affected programs: 76485 -> 72897 (-4.69%)
helped: 380
HURT: 3
helped stats (abs) min: 1.0 max: 138.0 x̄: 9.46 x̃: 1
helped stats (rel) min: 0.14% max: 15.96% x̄: 3.81% x̃: 0.92%
HURT stats (abs)   min: 1.0 max: 6.0 x̄: 2.67 x̃: 1
HURT stats (rel)   min: 0.38% max: 8.57% x̄: 3.80% x̃: 2.44%
95% mean confidence interval for tuples value: -11.30 -7.44
95% mean confidence interval for tuples %-change: -4.27% -3.22%
Tuples are helped.

total clauses in shared programs: 356094 -> 355992 (-0.03%)
clauses in affected programs: 3264 -> 3162 (-3.12%)
helped: 80
HURT: 0
helped stats (abs) min: 1.0 max: 9.0 x̄: 1.27 x̃: 1
helped stats (rel) min: 0.81% max: 50.00% x̄: 4.83% x̃: 3.39%
95% mean confidence interval for clauses value: -1.49 -1.06
95% mean confidence interval for clauses %-change: -6.23% -3.43%
Clauses are helped.

total cycles in shared programs: 167337.10 -> 167329.19 (<.01%)
cycles in affected programs: 510.08 -> 502.17 (-1.55%)
helped: 80
HURT: 2
helped stats (abs) min: 0.041665999999999315 max: 0.7916659999999993 x̄: 0.10 x̃: 0
helped stats (rel) min: 0.51% max: 13.64% x̄: 2.12% x̃: 1.34%
HURT stats (abs)   min: 0.041665999999999315 max: 0.0416669999999999 x̄: 0.04 x̃: 0
HURT stats (rel)   min: 0.39% max: 2.78% x̄: 1.58% x̃: 1.58%
95% mean confidence interval for cycles value: -0.12 -0.07
95% mean confidence interval for cycles %-change: -2.59% -1.48%
Cycles are helped.

total arith in shared programs: 73819.54 -> 73669.25 (-0.20%)
arith in affected programs: 2840.54 -> 2690.25 (-5.29%)
helped: 383
HURT: 3
helped stats (abs) min: 0.041665999999999315 max: 5.75 x̄: 0.39 x̃: 0
helped stats (rel) min: 0.33% max: 18.81% x̄: 4.39% x̃: 0.98%
HURT stats (abs)   min: 0.041665999999999315 max: 0.25 x̄: 0.11 x̃: 0
HURT stats (rel)   min: 0.39% max: 8.96% x̄: 4.04% x̃: 2.78%
95% mean confidence interval for arith value: -0.47 -0.31
95% mean confidence interval for arith %-change: -4.93% -3.71%
Arith are helped.

total quadwords in shared programs: 1679798 -> 1676259 (-0.21%)
quadwords in affected programs: 72826 -> 69287 (-4.86%)
helped: 381
HURT: 15
helped stats (abs) min: 1.0 max: 142.0 x̄: 9.35 x̃: 1
helped stats (rel) min: 0.25% max: 18.87% x̄: 4.33% x̃: 1.13%
HURT stats (abs)   min: 1.0 max: 6.0 x̄: 1.47 x̃: 1
HURT stats (rel)   min: 0.30% max: 6.25% x̄: 0.77% x̃: 0.35%
95% mean confidence interval for quadwords value: -10.76 -7.11
95% mean confidence interval for quadwords %-change: -4.71% -3.56%
Quadwords are helped.

Results on Mali-G57:

total instructions in shared programs: 2704193 -> 2699317 (-0.18%)
instructions in affected programs: 293366 -> 288490 (-1.66%)
helped: 758
HURT: 5
helped stats (abs) min: 1.0 max: 151.0 x̄: 6.45 x̃: 2
helped stats (rel) min: 0.11% max: 22.22% x̄: 2.05% x̃: 0.64%
HURT stats (abs)   min: 1.0 max: 7.0 x̄: 2.20 x̃: 1
HURT stats (rel)   min: 0.22% max: 1.69% x̄: 0.87% x̃: 1.08%
95% mean confidence interval for instructions value: -7.42 -5.36
95% mean confidence interval for instructions %-change: -2.27% -1.79%
Instructions are helped.

total cycles in shared programs: 141711.73 -> 141711.84 (<.01%)
cycles in affected programs: 214.36 -> 214.47 (0.05%)
helped: 4
HURT: 42
helped stats (abs) min: 0.015625 max: 0.359375 x̄: 0.20 x̃: 0
helped stats (rel) min: 1.85% max: 12.78% x̄: 9.12% x̃: 10.93%
HURT stats (abs)   min: 0.015625 max: 0.09375 x̄: 0.02 x̃: 0
HURT stats (rel)   min: 0.17% max: 17.65% x̄: 0.84% x̃: 0.34%
95% mean confidence interval for cycles value: -0.02 0.03
95% mean confidence interval for cycles %-change: -1.23% 1.17%
Inconclusive result (value mean confidence interval includes 0).

total cvt in shared programs: 14479.14 -> 14474.19 (-0.03%)
cvt in affected programs: 2877.05 -> 2872.09 (-0.17%)
helped: 508
HURT: 209
helped stats (abs) min: 0.015625 max: 0.453125 x̄: 0.02 x̃: 0
helped stats (rel) min: 0.25% max: 16.67% x̄: 1.23% x̃: 0.37%
HURT stats (abs)   min: 0.015625 max: 0.296875 x̄: 0.03 x̃: 0
HURT stats (rel)   min: 0.15% max: 18.18% x̄: 1.70% x̃: 0.34%
95% mean confidence interval for cvt value: -0.01 -0.00
95% mean confidence interval for cvt %-change: -0.57% -0.18%
Cvt are helped.

total sfu in shared programs: 7875.69 -> 7590.75 (-3.62%)
sfu in affected programs: 1567.38 -> 1282.44 (-18.18%)
helped: 906
HURT: 0
helped stats (abs) min: 0.0625 max: 8.625 x̄: 0.31 x̃: 0
helped stats (rel) min: 2.38% max: 100.00% x̄: 16.80% x̃: 5.63%
95% mean confidence interval for sfu value: -0.37 -0.26
95% mean confidence interval for sfu %-change: -18.43% -15.17%
Sfu are helped.

total quadwords in shared programs: 1468152 -> 1465800 (-0.16%)
quadwords in affected programs: 37104 -> 34752 (-6.34%)
helped: 161
HURT: 2
helped stats (abs) min: 8.0 max: 80.0 x̄: 14.71 x̃: 8
helped stats (rel) min: 1.67% max: 20.00% x̄: 8.05% x̃: 7.69%
HURT stats (abs)   min: 8.0 max: 8.0 x̄: 8.00 x̃: 8
HURT stats (rel)   min: 3.57% max: 3.85% x̄: 3.71% x̃: 3.71%
95% mean confidence interval for quadwords value: -16.29 -12.57
95% mean confidence interval for quadwords %-change: -8.58% -7.22%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16857>
2022-06-06 16:08:25 +00:00
Alyssa Rosenzweig 8f3b62f87e pan/va: Add MUX lowering tests
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16857>
2022-06-06 16:08:25 +00:00
Alyssa Rosenzweig 677a66b3eb pan/va: Lower MUX to CSEL where possible
CSEL executes on the conversion unit (CVT), while MUX executes on the special
function unit (SFU). Throughput on CVT is 4x higher than SFU, so this is
(almost) always an optimization.

The "real" MUX is still used for unusual cases, like 8-bit and bitselect.

Note that it's easier for us to use MUX everywhere for the IR. This is an easy
fixup to get better codegen on Valhall without touching the core Bifrost code.

shader-db is a bit of a toss up: register pressure and instruction count are
hurt in some cases due to restrictions on FAU access. In particular, a shader
that muxes between two uniforms needs an extra move due to extra constant
(zero). However, in terms of throughput this is still a win: 2 CVT instructions
(MOV + CSEL) have 2x throughput to 1 SFU instruction (MUX). The MOV has
opportunities for CSE, but that can hurt pressure in turn. Overall, cycles are
helped substantially.

total instructions in shared programs: 2728438 -> 2731597 (0.12%)
instructions in affected programs: 414391 -> 417550 (0.76%)
helped: 87
HURT: 1063
helped stats (abs) min: 1.0 max: 6.0 x̄: 5.17 x̃: 6
helped stats (rel) min: 0.19% max: 15.79% x̄: 4.12% x̃: 4.11%
HURT stats (abs)   min: 1.0 max: 56.0 x̄: 3.40 x̃: 2
HURT stats (rel)   min: 0.11% max: 23.43% x̄: 1.15% x̃: 0.63%
95% mean confidence interval for instructions value: 2.47 3.03
95% mean confidence interval for instructions %-change: 0.61% 0.90%
Instructions are HURT.

total cycles in shared programs: 142103 -> 142015.75 (-0.06%)
cycles in affected programs: 1263.45 -> 1176.20 (-6.91%)
helped: 281
HURT: 176
helped stats (abs) min: 0.015625 max: 2.234375 x̄: 0.50 x̃: 0
helped stats (rel) min: 0.71% max: 54.17% x̄: 16.93% x̃: 15.31%
HURT stats (abs)   min: 0.015625 max: 30.0 x̄: 0.30 x̃: 0
HURT stats (rel)   min: 0.84% max: 120.00% x̄: 7.16% x̃: 5.00%
95% mean confidence interval for cycles value: -0.33 -0.05
95% mean confidence interval for cycles %-change: -9.08% -6.22%
Cycles are helped.

total cvt in shared programs: 13983.34 -> 14891.70 (6.50%)
cvt in affected programs: 7498.36 -> 8406.72 (12.11%)
helped: 71
HURT: 4711
helped stats (abs) min: 0.0625 max: 0.0625 x̄: 0.06 x̃: 0
helped stats (rel) min: 5.41% max: 40.00% x̄: 10.23% x̃: 9.30%
HURT stats (abs)   min: 0.015625 max: 2.640625 x̄: 0.19 x̃: 0
HURT stats (rel)   min: 0.18% max: 141.18% x̄: 16.21% x̃: 9.52%
95% mean confidence interval for cvt value: 0.18 0.20
95% mean confidence interval for cvt %-change: 15.21% 16.42%
Cvt are HURT.

total sfu in shared programs: 11320.44 -> 7882.56 (-30.37%)
sfu in affected programs: 7618.50 -> 4180.62 (-45.13%)
helped: 4782
HURT: 0
helped stats (abs) min: 0.0625 max: 10.5625 x̄: 0.72 x̃: 0
helped stats (rel) min: 1.34% max: 100.00% x̄: 41.91% x̃: 37.50%
95% mean confidence interval for sfu value: -0.75 -0.68
95% mean confidence interval for sfu %-change: -42.68% -41.14%
Sfu are helped.

total ls in shared programs: 129660 -> 129690 (0.02%)
ls in affected programs: 25 -> 55 (120.00%)
helped: 0
HURT: 1

total quadwords in shared programs: 1482728 -> 1484128 (0.09%)
quadwords in affected programs: 58624 -> 60024 (2.39%)
helped: 24
HURT: 195
helped stats (abs) min: 8.0 max: 8.0 x̄: 8.00 x̃: 8
helped stats (rel) min: 3.70% max: 20.00% x̄: 10.34% x̃: 10.00%
HURT stats (abs)   min: 8.0 max: 24.0 x̄: 8.16 x̃: 8
HURT stats (rel)   min: 1.41% max: 50.00% x̄: 4.84% x̃: 2.56%
95% mean confidence interval for quadwords value: 5.70 7.09
95% mean confidence interval for quadwords %-change: 2.22% 4.14%
Quadwords are HURT.

total spills in shared programs: 125 -> 127 (1.60%)
spills in affected programs: 0 -> 2
helped: 0
HURT: 1

total fills in shared programs: 800 -> 828 (3.50%)
fills in affected programs: 0 -> 28
helped: 0
HURT: 1

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16857>
2022-06-06 16:08:25 +00:00
Alyssa Rosenzweig 3741606b25 pan/va: Implement more lanes
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16857>
2022-06-06 16:08:25 +00:00
Alyssa Rosenzweig 1768afa5b9 pan/bi: Extract MUX to CSEL optimization
It's portable, and useful to both Bifrost and Valhall, in the clause scheduler
and in an instruction selection respectively. Move it from the Bifrost clause
scheduler to common code so we can share the benefits.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16857>
2022-06-06 16:08:25 +00:00
Alyssa Rosenzweig e1fb182d90 pan/va: Do not insert NOPs into empty shaders
It's unnecessary and breaks the empty shader optimizations. Noticed while
inspecting a trace from dEQP-GLES3.functional.color_clear.masked_scissored_rgb,
which does not produce any varyings other than gl_Position in its vertex shader
and hence should omit the varying shader.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16868>
2022-06-06 14:28:59 +00:00
Alyssa Rosenzweig 3b3cd59fb8 panfrost: Launch transform feedback shaders
We now have infrastructure in place to generate variants of vertex shaders
specialized for transform feedback. All that's left is launching these
compute-like kernels before the IDVS job, implementing both the
transform feedback and the regular rasterization pipeline. This implements
transform feedback on Valhall, passing the relevant GLES3.1 tests.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>
2022-06-04 14:35:56 +00:00
Alyssa Rosenzweig 4e341e70d8 pan/bi: Handle transform feedback intrinsics
Translate the intrinsics we introduced to lower away transform feedback into
Panfrost system values which the GL driver can handle.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>
2022-06-04 14:35:56 +00:00
Alyssa Rosenzweig ae3fa6cc1d pan/bi: Add transform feedback lowering pass
Add a simple NIR-based implementation of transform feedback, appropriate for
OpenGL ES 3.1 class hardware (compute but no geometry or tessellation shaders).
Stores to varyings that will be captured are replaced by stores to transform
feedback buffers and some addressing math. This allows implementing the semantic
of transform feedback in a compute-like stage.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>
2022-06-04 14:35:56 +00:00
Alyssa Rosenzweig 7535362204 pan/bi: Fix clper_xor on Mali-G31
Mali-G31 has the old CLPER instruction, not the new one, which means we don't
get to specify a custom lane op. But the clper_xor helper incorrectly checked
the arch, not the implementation quirk.

Fixes: c00e7b729f ("pan/bi: Optimize abs(derivative)")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16846>
2022-06-02 20:32:43 -04:00
Alyssa Rosenzweig ad5c84999b pan/bi: Rework Valhall register alignment
Because we lower SPLIT and COLLECT before RA, we need to consider offsets when
determining the dimensions of vectors, in order to align properly. Lowering
COLLECT post-RA would avoid this special case.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>
2022-06-02 17:13:16 +00:00
Alyssa Rosenzweig 0770e7a90c pan/bi: Align 64-bit register sources
Similar idea to aligning staging register sources.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>
2022-06-02 17:13:16 +00:00
Alyssa Rosenzweig 8553dd97ad pan/bi: Allow vec6 for collects
Hit for some Valhall texturing instructions.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>
2022-06-02 17:13:16 +00:00
Icecream95 1bfff407b9 pan/bi: Use nodearrays for linear constraints
Speeds up compiling shaders/skia/781.shader_test in shader-db by 8x
(Icecream95).

...At least it did before I extended to support register allocation of vec8.  On
Valhall, texture instructions require up to 8 consecutive registers. To handle
this, provide for vec8 register allocation. Liveness was already (accidentally?)
vec8. The increased memory requirement is acceptable given that the interference
matrix is now stored sparsely (Alyssa).

Icecream95 reports the vec8 changes hurt RA performance by about 1% on average.
I consider this acceptable for now.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>
2022-06-02 17:13:16 +00:00
Icecream95 c70daa74f0 pan/bi: Add nodearray datastructure
This is an array which can either be sparse or dense, and was designed
to be used to track liveness and interference information.

Either a sparse array with sorted indices or dense array is used.
Other data structures were tried, such as red-black trees or hash
tables, but they were slower. When used for storing constraints, the
indices do not have to be sorted as duplicating elements is okay, but
the speedup from that was not enough to justify the extra complexity.

v2: Add a comment about how to potentially speed it up. But it seems
  fast enough even without this change.
v3: Use a custom struct rather than relying on util_dynarray.
v4: Split out functions only used for liveness analysis, rather than the simpler
  data structure needed for the register interference matrix. If we need to
  optimize liveness, that can follow on after. Also make it for vec8 (Alyssa).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>
2022-06-02 17:13:16 +00:00
Icecream95 c24b78cceb pan/bi: Reverse linear constraint bits
This will make it simpler to implement parallel RA where multiple
possible registers for a node are tested at once.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>
2022-06-02 17:13:16 +00:00
Alyssa Rosenzweig bc4d42023d pan/bi: Respect swizzles in nir_op_pack_64_2x32_split
Triggered a BIR validation error, which made debugging a breeze. That validation
pass (dimensionality checks) gets a lot of use, it seems :-)

Fixes:

   dEQP-VK.ssbo.layout.2_level_array.std430.row_major_mat4x2_comp_access_store_cols

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16724>
2022-06-01 20:08:42 +00:00
Alyssa Rosenzweig 5067a26f44 pan/bi: Use flow control lowering on Valhall
Logically at the same part of the compile pipeline as clause scheduling on
Bifrost. Lots of similarities, too. Now that we generate flow control only as a
late pass, various hacks in the compiler are no longer necessary and are
dropped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig a394c32cd2 pan/va: Unit test flow control merging
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig 4b06e7f5b6 pan/va: Unit test flow control insertion
Test that we correctly track the scoreboard, helper invocations, reconvergence,
and ends and insert NOPs to effect this expected flow control.

As the pass inserts NOPs but does not otherwise modify the shader, this is easy
to test with well-defined behaviour of the pass.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig 0fa9204049 pan/va: Respect assigned slots
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig 492f4055dd pan/va: Assign slots roundrobin
This should reduce false dependencies with asynchronous instructions.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig aa7393f81a pan/va: Add flow control merging pass
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig 03d8439c0a pan/va: Terminate helper threads
On Bifrost, to terminate helper threads we set the td bit on the clause. On
Valhall, we need to use the .discard flow control. Extend the flow control NOP
insertion to insert NOP.discard where necessary to terminate helper threads.
This should reduce wasted work in fragment shaders.

This requires fairly involved data flow analysis, but the handling here should
be optimal.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig 41b39d6d5d pan/va: Do scoreboard analysis
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig 7e3b9cf754 pan/va: Add pass to insert flow control
To set flow control modifiers correctly and efficiently, we need a pass that
runs after register allocation and scheduling, but before packing. Add such a
pass.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig 82b1897900 pan/bi: Print flow control on instructions
This helps debug the flow control lowering passes on Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig c0180f6bd3 pan/bi: Export helper termination analysis
The current helper termination analysis code is hardwired for clauses, so it
won't work for Valhall. However, the bulk of it is dataflow analysis which is
portable between Bifrost and Valhall. Export the interesting bits so we can
reuse them on Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig 7bb635316b pan/bi: Export bi_block_add_successor
For use in unit tests that need to create blocks.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig d7c6b7c9d2 pan/bi: Extract bit_block helper
Convenience for unit tests which need to create multiple blocks, to test global
passes.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig b0edd92156 pan/bi: Add a trivial ctx->inputs for unit tests
So we can unit test the flow control insertion which needs to gate some
behaviour on not being in a blend shader.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig 218148d38a pan/bi: Add ASSERT_SHADER_EQUAL macro
Useful for whole-program unit tests.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig 4627cd99de pan/bi: Preserve flow control for non-psiz variant
Otherwise we will get INSTR_INVALID_ENC faults when deleting the final STORE.end
instruction, after we rework our flow control code.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig c846e0812b pan/bi: Add slot to bi_instr
For better handling of message-passing instructions.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig 616df0e97d pan/bi: Extend bi_scoreboard_state for finer tracking
We need to insert dependencies for varyings and memory access. Currently, the
Bifrost scoreboarding pass just treats these as barriers, but this is too heavy
handed. Extend the scoreboard data structure so we can do better.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>
2022-06-01 16:14:38 +00:00
Daniel Schürmann bd151a256e nir/opt_vectorize: add callback for max vectorization width
The callback allows to request different vectorization factors
per instruction depending on e.g. bitsize or opcode.

This patch also removes using the vectorize_vec2_16bit option
from nir_opt_vectorize().

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13080>
2022-06-01 11:41:44 +00:00
Alyssa Rosenzweig 0170500627 pan/bi: Interpolate varyings at 16-bit
On Bifrost, we have a single "load float varying" instruction that controls the
bit size of the result, allowing us to fold a f2f16 into the load. However, the
larger benefit is that 16-bit varying loads are interpolated at 16-bit. Arm
claims that the varying unit has a 32-bit data path, allowing 16-bit varyings to
be interpolated in half the cycles from 32-bit. This change should therefore
improve performance for workloads that are varying units. This means we want to
be aggressive about 16-bit varying loads, even if it costs some extra f2f32
instructions.

glmark2 total score on Mali-G52 up from 1173fps to 1218fps with particular wins
in -brefract, -bshadow, -bjellyfish, and -bshading.

total instructions in shared programs: 2432246 -> 2423668 (-0.35%)
instructions in affected programs: 516056 -> 507478 (-1.66%)
helped: 3641
HURT: 432
helped stats (abs) min: 1.0 max: 12.0 x̄: 2.91 x̃: 2
helped stats (rel) min: 0.08% max: 54.55% x̄: 9.88% x̃: 5.71%
HURT stats (abs)   min: 1.0 max: 42.0 x̄: 4.71 x̃: 4
HURT stats (rel)   min: 0.23% max: 200.00% x̄: 12.58% x̃: 6.37%
95% mean confidence interval for instructions value: -2.21 -2.00
95% mean confidence interval for instructions %-change: -7.92% -7.07%
Instructions are helped.

total tuples in shared programs: 1941309 -> 1934647 (-0.34%)
tuples in affected programs: 353169 -> 346507 (-1.89%)
helped: 3233
HURT: 453
helped stats (abs) min: 1.0 max: 14.0 x̄: 2.46 x̃: 2
helped stats (rel) min: 0.12% max: 50.00% x̄: 9.90% x̃: 5.56%
HURT stats (abs)   min: 1.0 max: 25.0 x̄: 2.85 x̃: 2
HURT stats (rel)   min: 0.22% max: 150.00% x̄: 8.96% x̃: 5.26%
95% mean confidence interval for tuples value: -1.89 -1.72
95% mean confidence interval for tuples %-change: -8.01% -7.15%
Tuples are helped.

total clauses in shared programs: 357354 -> 356610 (-0.21%)
clauses in affected programs: 25794 -> 25050 (-2.88%)
helped: 994
HURT: 317
helped stats (abs) min: 1.0 max: 3.0 x̄: 1.16 x̃: 1
helped stats (rel) min: 1.49% max: 33.33% x̄: 10.78% x̃: 10.00%
HURT stats (abs)   min: 1.0 max: 4.0 x̄: 1.31 x̃: 1
HURT stats (rel)   min: 1.19% max: 50.00% x̄: 13.56% x̃: 8.33%
95% mean confidence interval for clauses value: -0.63 -0.50
95% mean confidence interval for clauses %-change: -5.63% -4.16%
Clauses are helped.

total cycles in shared programs: 167697.96 -> 167431.15 (-0.16%)
cycles in affected programs: 12638.29 -> 12371.48 (-2.11%)
helped: 2652
HURT: 350
helped stats (abs) min: 0.04166399999999726 max: 0.75 x̄: 0.11 x̃: 0
helped stats (rel) min: 0.12% max: 100.00% x̄: 14.39% x̃: 5.04%
HURT stats (abs)   min: 0.041665999999999315 max: 0.5833329999999997 x̄: 0.11 x̃: 0
HURT stats (rel)   min: 0.00% max: 75.00% x̄: 7.90% x̃: 4.71%
95% mean confidence interval for cycles value: -0.09 -0.08
95% mean confidence interval for cycles %-change: -12.56% -11.02%
Cycles are helped.

total arith in shared programs: 74169.46 -> 73891.71 (-0.37%)
arith in affected programs: 13885.87 -> 13608.12 (-2.00%)
helped: 3215
HURT: 445
helped stats (abs) min: 0.04166399999999726 max: 0.5416680000000014 x̄: 0.10 x̃: 0
helped stats (rel) min: 0.12% max: 100.00% x̄: 14.16% x̃: 6.67%
HURT stats (abs)   min: 0.041665999999999315 max: 1.125 x̄: 0.12 x̃: 0
HURT stats (rel)   min: 0.00% max: 100.00% x̄: 9.76% x̃: 5.49%
95% mean confidence interval for arith value: -0.08 -0.07
95% mean confidence interval for arith %-change: -11.91% -10.59%
Arith are helped.

total texture in shared programs: 11936 -> 11931 (-0.04%)
texture in affected programs: 20 -> 15 (-25.00%)
helped: 10
HURT: 0
helped stats (abs) min: 0.5 max: 0.5 x̄: 0.50 x̃: 0
helped stats (rel) min: 14.29% max: 100.00% x̄: 45.71% x̃: 33.33%
95% mean confidence interval for texture value: -0.50 -0.50
95% mean confidence interval for texture %-change: -73.16% -18.26%
Texture are helped.

total vary in shared programs: 4180.88 -> 3447.19 (-17.55%)
vary in affected programs: 2109.88 -> 1376.19 (-34.77%)
helped: 2202
HURT: 39
helped stats (abs) min: 0.0625 max: 1.4375 x̄: 0.34 x̃: 0
helped stats (rel) min: 2.38% max: 66.67% x̄: 40.43% x̃: 50.00%
HURT stats (abs)   min: 0.125 max: 0.375 x̄: 0.26 x̃: 0
HURT stats (rel)   min: 0.00% max: 300.00% x̄: 92.54% x̃: 23.08%
95% mean confidence interval for vary value: -0.34 -0.32
95% mean confidence interval for vary %-change: -39.22% -37.01%
Vary are helped.

total quadwords in shared programs: 1689664 -> 1684852 (-0.28%)
quadwords in affected programs: 265522 -> 260710 (-1.81%)
helped: 2864
HURT: 447
helped stats (abs) min: 1.0 max: 14.0 x̄: 2.10 x̃: 2
helped stats (rel) min: 0.15% max: 31.58% x̄: 6.05% x̃: 4.65%
HURT stats (abs)   min: 1.0 max: 22.0 x̄: 2.67 x̃: 2
HURT stats (rel)   min: 0.27% max: 38.46% x̄: 6.79% x̃: 4.55%
95% mean confidence interval for quadwords value: -1.54 -1.37
95% mean confidence interval for quadwords %-change: -4.55% -4.08%
Quadwords are helped.

total threads in shared programs: 53656 -> 53688 (0.06%)
threads in affected programs: 32 -> 64 (100.00%)
helped: 32
HURT: 0
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for threads value: 1.00 1.00
95% mean confidence interval for threads %-change: 100.00% 100.00%
Threads are helped.

total preloads in shared programs: 116212 -> 103476 (-10.96%)
preloads in affected programs: 45222 -> 32486 (-28.16%)
helped: 3022
HURT: 11
helped stats (abs) min: 1.0 max: 11.0 x̄: 4.23 x̃: 4
helped stats (rel) min: 7.14% max: 68.75% x̄: 30.39% x̃: 25.00%
HURT stats (abs)   min: 2.0 max: 4.0 x̄: 3.45 x̃: 4
HURT stats (rel)   min: 14.29% max: 50.00% x̄: 25.93% x̃: 25.00%
95% mean confidence interval for preloads value: -4.26 -4.14
95% mean confidence interval for preloads %-change: -30.68% -29.69%
Preloads are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Tested-by: Chris Healy cphealy@gmail.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16752>
2022-05-30 17:49:44 -04:00
Alyssa Rosenzweig 93f69e4b1c pan/bi: Model Valhall source formats
LD_VAR_BUF instructions on Valhall take a source format, indicating the
in-memory format of the varying independent from the register format, which we
still model within the compiler for compatibility with Bifrost. (Prior to
Valhall, source format is specified in the attribute descriptor as a physical
pixel format.)

Model this information, allowing us to generate fp16 LD_VAR_BUF instructions
correctly on Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16752>
2022-05-30 17:49:44 -04:00
Alyssa Rosenzweig 06886c3861 pan/bi: Make LD_VAR w=format instead of w=vecsize
Fixes a vector dimension validation failure in
dEQP-GLES3.functional.shaders.indexing.varying_array.vec4_static_write_dynamic_read
after we enable fp16 varyings.

No shader-db changes, as we don't yet support fp16 varyings.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16752>
2022-05-30 17:49:44 -04:00
Alyssa Rosenzweig a9b13a1867 pan/va: Fill in missing src_flat16 enum
Valhall gains(?) the ability to flatshade 16-bit varyings, this is indicated by
a particular source format.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16752>
2022-05-30 17:49:44 -04:00
Alyssa Rosenzweig e898e2466b pan/bi: Add VAR_TEX fusing unit test
As fusing VAR_TEX is an optimization, it's helpful to have unit tests since
functional tests won't check that the optimization triggers when expected.
Originally written when I was touching the VAR_TEX code. Those changes have
since been dropped by the unit test remains useful.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16752>
2022-05-30 17:48:59 -04:00
Alyssa Rosenzweig 42a4a123a6 pan/bi: Don't allow spilling coverage mask writes
The register precolouring logic assumes that coverage masks are always in R60,
so spilling them causes incorrect results. We could do better. Fixes on Valhall:

   dEQP-GLES3.functional.ubo.random.all_per_block_buffers.28

Fixes: 3df5446cbd ("pan/bi: Simplify register precolouring in the IR")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16748>
2022-05-30 14:00:55 +00:00
Alyssa Rosenzweig 67f5721349 panfrost: Set allow_rotating_primitives
On Valhall, the driver should set this flag if the hardware may rotate
primitives. This happens if:

1. The rasterization of lines does not matter, AND
2. The provoking vertex does not matter.

The first condition we may satisfy by checking for LINES and the second by
checking for flat shading. Otherwise, we should set this flag to allow
optimizations. This may be more efficient for tiling.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16748>
2022-05-30 14:00:55 +00:00
Alyssa Rosenzweig 01ba3460a9 pan/bi: Test CMP result_type optimization
Add unit tests ensuring the optimization applies in all the cases we care about,
as functional integration tests (CTS and Piglit) won't test this. Also add unit
tests for a few cases where we specifically cannot fuse, in case these cases are
missed by the tests.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16725>
2022-05-27 12:14:22 +00:00