Commit Graph

2409 Commits

Author SHA1 Message Date
Tomeu Vizoso 26079868a7 ci/freedreno: Add depth32f_stencil8 flakes
Started happening after disabling cpufreq, devfreq and runtime PM.

At least one of these fail in each run, so it's blocking MRs.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7987>
2021-05-26 18:20:19 +00:00
Antonio Caggiano 8e470457de ci: Add a manual job for tracking the performance of Freedreno
Use Piglit's replay profile to measure and store the time that frames
take to render in the GPU.

This job won't run automatically in regular pipelines, but will be
triggered automatically by a script for every successful pre-merge
pipeline.

This is because we want to generate performance data for every relevant
commit merged in main, but we don't want to keep a device busy during
the pre-merge run.

Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7987>
2021-05-26 18:20:19 +00:00
Emma Anholt ee408df29c ci/freedreno: Consolidate ssbo.fragment_binding_array flake annotation.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10933>
2021-05-24 16:42:33 +00:00
Emma Anholt fec60d5bee ci/freedreno: Drop VK flake annotations not seen in the last ~year.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10933>
2021-05-24 16:42:33 +00:00
Emma Anholt 4caf9b430d ci/freedreno: Add a link explaining get_display_plane_capabilities
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10933>
2021-05-24 16:42:33 +00:00
Emma Anholt a8c3783982 ci/freedreno: Drop a630 flake annotation from the go-fast changes.
The async fix seems to have fixed it, haven't seen this one since May 3rd.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10933>
2021-05-24 16:42:33 +00:00
Emma Anholt 1dbaaa22f9 ci/freedreno: Clear stale validation failure flake annotation.
Haven't seen it in my current set of IRC logs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10933>
2021-05-24 16:42:33 +00:00
Emma Anholt a40479b6bc ci/freedreno: Clear compswap flake annotation.
These flakes disappeared around 2020-08 and the only sign since then has
been some flakes on versions of the ir3 RA rewrite.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10933>
2021-05-24 16:42:33 +00:00
Connor Abbott 9350900fcd ir3: Only use per-wave pvtmem layout for compute
The blob seems to do this since a630, and it fixes
spec@glsl-1.30@execution@fs-large-local-array on a650.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10922>
2021-05-21 20:45:07 +00:00
Connor Abbott 0ab01f4215 ir3: Call nir_lower_wrmask() again after lowering scratch
I forgot that after rebasing on large_consts support that this is now
called after the first time nir_lower_wrmask is called and can generate
partial writemasks that need to be lowered. While we're here, also call
the main optimization loop if things are lowered to scratch because it
generates address arithmetic that may need to be cleaned up.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10922>
2021-05-21 20:45:07 +00:00
Emma Anholt 307139c7f9 vulkan: Avoid stomping array padding in the MemoryProperties wrapper.
The deqp test for it expects that the unused array elements are untouched,
so make sure they don't get replaced with random stack data.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10737>
2021-05-20 21:41:06 +00:00
Connor Abbott de9f2170cc ir3: Use round-to-nearest-even for fquantize2f16
We're supposed to map a floating-point value too large to be represented
as fp16 to infinity, however round-to-zero naturally rounds it down to
the largest representable fp16 number instead. The blob emits a bunch of
fixup code to work around this, but instead we can just do what all the
other drivers seem to do and use round-to-nearest-even instead.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10897>
2021-05-20 18:45:59 +00:00
Ian Romanick 7d85dc4f35 nir/algebraic: Equality comparison inversions require sources be numbers
v2: Update A630 expected image checksum for minetest.trace.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Tiger Lake
total instructions in shared programs: 21036690 -> 21049485 (0.06%)
instructions in affected programs: 852085 -> 864880 (1.50%)
helped: 240
HURT: 2514
helped stats (abs) min: 1 max: 46 x̄: 2.45 x̃: 2
helped stats (rel) min: 0.15% max: 4.30% x̄: 0.79% x̃: 0.55%
HURT stats (abs)   min: 1 max: 198 x̄: 5.32 x̃: 2
HURT stats (rel)   min: 0.06% max: 10.71% x̄: 1.48% x̃: 1.04%
95% mean confidence interval for instructions value: 4.14 5.15
95% mean confidence interval for instructions %-change: 1.23% 1.34%
Instructions are HURT.

total cycles in shared programs: 856045255 -> 855816220 (-0.03%)
cycles in affected programs: 16743786 -> 16514751 (-1.37%)
helped: 790
HURT: 1973
helped stats (abs) min: 1 max: 10766 x̄: 627.97 x̃: 18
helped stats (rel) min: <.01% max: 32.59% x̄: 3.01% x̃: 0.64%
HURT stats (abs)   min: 1 max: 4078 x̄: 135.36 x̃: 18
HURT stats (rel)   min: <.01% max: 54.56% x̄: 2.80% x̃: 0.82%
95% mean confidence interval for cycles value: -131.36 -34.42
95% mean confidence interval for cycles %-change: 0.88% 1.40%
Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).

total spills in shared programs: 9771 -> 9766 (-0.05%)
spills in affected programs: 47 -> 42 (-10.64%)
helped: 1
HURT: 0

total fills in shared programs: 9451 -> 9430 (-0.22%)
fills in affected programs: 91 -> 70 (-23.08%)
helped: 1
HURT: 0

LOST:   16
GAINED: 51

All Intel GPUs from Sandybridge through Ice Lake had similar results. (Ice Lake shown)
total instructions in shared programs: 20024781 -> 20025568 (<.01%)
instructions in affected programs: 103309 -> 104096 (0.76%)
helped: 12
HURT: 389
helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1
helped stats (rel) min: 0.20% max: 2.70% x̄: 1.36% x̃: 1.37%
HURT stats (abs)   min: 1 max: 8 x̄: 2.06 x̃: 1
HURT stats (rel)   min: 0.05% max: 7.14% x̄: 1.25% x̃: 0.95%
95% mean confidence interval for instructions value: 1.78 2.15
95% mean confidence interval for instructions %-change: 1.06% 1.28%
Instructions are HURT.

total cycles in shared programs: 979419070 -> 979439180 (<.01%)
cycles in affected programs: 4968711 -> 4988821 (0.40%)
helped: 60
HURT: 381
helped stats (abs) min: 1 max: 1296 x̄: 96.92 x̃: 26
helped stats (rel) min: <.01% max: 27.10% x̄: 1.64% x̃: 0.65%
HURT stats (abs)   min: 1 max: 7320 x̄: 68.04 x̃: 30
HURT stats (rel)   min: <.01% max: 19.77% x̄: 1.32% x̃: 0.87%
95% mean confidence interval for cycles value: 10.25 80.95
95% mean confidence interval for cycles %-change: 0.69% 1.15%
Cycles are HURT.

LOST:   1
GAINED: 2

GM45 and Iron Lake had similar results. (Iron Lake shown)
total instructions in shared programs: 8128474 -> 8132527 (0.05%)
instructions in affected programs: 642323 -> 646376 (0.63%)
helped: 12
HURT: 1972
helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4
helped stats (rel) min: 0.72% max: 1.72% x̄: 1.09% x̃: 0.83%
HURT stats (abs)   min: 1 max: 16 x̄: 2.07 x̃: 3
HURT stats (rel)   min: 0.12% max: 7.14% x̄: 0.77% x̃: 0.70%
95% mean confidence interval for instructions value: 1.99 2.10
95% mean confidence interval for instructions %-change: 0.74% 0.79%
Instructions are HURT.

total cycles in shared programs: 238280994 -> 238294376 (<.01%)
cycles in affected programs: 8841250 -> 8854632 (0.15%)
helped: 84
HURT: 1192
helped stats (abs) min: 4 max: 64 x̄: 12.50 x̃: 8
helped stats (rel) min: 0.02% max: 1.61% x̄: 0.28% x̃: 0.17%
HURT stats (abs)   min: 2 max: 198 x̄: 12.11 x̃: 12
HURT stats (rel)   min: 0.02% max: 8.03% x̄: 0.28% x̃: 0.14%
95% mean confidence interval for cycles value: 9.65 11.32
95% mean confidence interval for cycles %-change: 0.22% 0.27%
Cycles are HURT.

No fossil-db changes on any Intel platform.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>
2021-05-20 01:39:35 +00:00
Ian Romanick 4246c2869c nir/algebraic: Invert comparisons less often
This fixes the piglit test range_analysis_fsat_of_nan.shader_test.  That
test contains some code like

    o = saturate(X) > 0 ? vec4(1.0, 0.0, 0.0, 1.0)
                        : vec4(0.0, 1.0, 0.0, 1.0);

A clever optimizer will convert this to

    o = vec4(float(saturate(X) > 0),
             float(!(saturate(X) > 0)),
             0, 1);

Due to the ordering of optimizations in the compiler, the `saturate`
operations are removed.  This is safe even in the presense of NaN.

    o = vec4(float(X > 0), float(!(X > 0)), 0, 1);

Since the calculations are not marked precise, an overzealous
optimizer may reduce this to

    o = vec4(float(X > 0), float(X <= 0), 0, 1);

This will result in black being output.  The GLSL spec gives quite a bit
of leeway with respect to NaN, but that seems too far.  The shader
author asked for a result of red or green.  A result of black is still
"undefined behavior," but it's also a little mean.

This also enables CSE to do its job better.

v2: Update A530 expected image checksum for minetest.trace.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4531
Fixes: 0dbda153aa ("nir/algebraic: Flag inexact optimizations")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Tiger Lake
total instructions in shared programs: 21041563 -> 21041789 (<.01%)
instructions in affected programs: 992066 -> 992292 (0.02%)
helped: 526
HURT: 548
helped stats (abs) min: 1 max: 16 x̄: 2.48 x̃: 2
helped stats (rel) min: 0.04% max: 5.56% x̄: 0.74% x̃: 0.49%
HURT stats (abs)   min: 1 max: 27 x̄: 2.80 x̃: 2
HURT stats (rel)   min: 0.04% max: 4.55% x̄: 0.59% x̃: 0.38%
95% mean confidence interval for instructions value: -0.00 0.42
95% mean confidence interval for instructions %-change: -0.12% <.01%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 855885569 -> 856118189 (0.03%)
cycles in affected programs: 343637248 -> 343869868 (0.07%)
helped: 907
HURT: 541
helped stats (abs) min: 1 max: 7724 x̄: 206.45 x̃: 36
helped stats (rel) min: <.01% max: 29.97% x̄: 1.01% x̃: 0.37%
HURT stats (abs)   min: 1 max: 14177 x̄: 776.09 x̃: 31
HURT stats (rel)   min: <.01% max: 29.94% x̄: 1.24% x̃: 0.35%
95% mean confidence interval for cycles value: 84.30 237.00
95% mean confidence interval for cycles %-change: -0.32% -0.01%
Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).

LOST:   3
GAINED: 5

Ice Lake
total instructions in shared programs: 20027107 -> 20025352 (<.01%)
instructions in affected programs: 1068856 -> 1067101 (-0.16%)
helped: 1153
HURT: 273
helped stats (abs) min: 1 max: 14 x̄: 1.83 x̃: 1
helped stats (rel) min: 0.03% max: 5.66% x̄: 0.61% x̃: 0.35%
HURT stats (abs)   min: 1 max: 15 x̄: 1.29 x̃: 1
HURT stats (rel)   min: 0.16% max: 1.30% x̄: 0.58% x̃: 0.60%
95% mean confidence interval for instructions value: -1.33 -1.13
95% mean confidence interval for instructions %-change: -0.43% -0.34%
Instructions are helped.

total cycles in shared programs: 979499227 -> 979448725 (<.01%)
cycles in affected programs: 344261539 -> 344211037 (-0.01%)
helped: 1079
HURT: 441
helped stats (abs) min: 1 max: 9384 x̄: 147.78 x̃: 48
helped stats (rel) min: <.01% max: 31.83% x̄: 0.90% x̃: 0.33%
HURT stats (abs)   min: 1 max: 7220 x̄: 247.07 x̃: 32
HURT stats (rel)   min: <.01% max: 31.30% x̄: 1.52% x̃: 0.53%
95% mean confidence interval for cycles value: -70.01 3.56
95% mean confidence interval for cycles %-change: -0.35% -0.05%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 10564 -> 10568 (0.04%)
spills in affected programs: 143 -> 147 (2.80%)
helped: 0
HURT: 1

total fills in shared programs: 11343 -> 11347 (0.04%)
fills in affected programs: 287 -> 291 (1.39%)
helped: 0
HURT: 1

LOST:   3
GAINED: 2

Skylake
total instructions in shared programs: 18192274 -> 18190128 (-0.01%)
instructions in affected programs: 1000188 -> 998042 (-0.21%)
helped: 1149
HURT: 55
helped stats (abs) min: 1 max: 14 x̄: 1.92 x̃: 1
helped stats (rel) min: 0.04% max: 6.67% x̄: 0.67% x̃: 0.42%
HURT stats (abs)   min: 1 max: 2 x̄: 1.05 x̃: 1
HURT stats (rel)   min: 0.16% max: 0.55% x̄: 0.27% x̃: 0.26%
95% mean confidence interval for instructions value: -1.87 -1.69
95% mean confidence interval for instructions %-change: -0.67% -0.58%
Instructions are helped.

total cycles in shared programs: 960856054 -> 960728040 (-0.01%)
cycles in affected programs: 340840968 -> 340712954 (-0.04%)
helped: 1079
HURT: 233
helped stats (abs) min: 1 max: 7640 x̄: 170.95 x̃: 46
helped stats (rel) min: <.01% max: 30.20% x̄: 0.96% x̃: 0.28%
HURT stats (abs)   min: 1 max: 6864 x̄: 242.23 x̃: 26
HURT stats (rel)   min: <.01% max: 34.64% x̄: 2.10% x̃: 0.22%
95% mean confidence interval for cycles value: -135.62 -59.53
95% mean confidence interval for cycles %-change: -0.59% -0.25%
Cycles are helped.

LOST:   15
GAINED: 1

Broadwell
total instructions in shared programs: 17855624 -> 17853580 (-0.01%)
instructions in affected programs: 1012209 -> 1010165 (-0.20%)
helped: 1105
HURT: 52
helped stats (abs) min: 1 max: 13 x̄: 1.90 x̃: 1
helped stats (rel) min: 0.03% max: 6.67% x̄: 0.67% x̃: 0.36%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.13% max: 0.52% x̄: 0.26% x̃: 0.25%
95% mean confidence interval for instructions value: -1.86 -1.67
95% mean confidence interval for instructions %-change: -0.68% -0.58%
Instructions are helped.

total cycles in shared programs: 1029905447 -> 1029840699 (<.01%)
cycles in affected programs: 347102680 -> 347037932 (-0.02%)
helped: 1007
HURT: 211
helped stats (abs) min: 1 max: 1360 x̄: 89.76 x̃: 48
helped stats (rel) min: <.01% max: 16.26% x̄: 0.69% x̃: 0.25%
HURT stats (abs)   min: 1 max: 1297 x̄: 121.51 x̃: 20
HURT stats (rel)   min: <.01% max: 31.31% x̄: 1.21% x̃: 0.20%
95% mean confidence interval for cycles value: -62.39 -43.92
95% mean confidence interval for cycles %-change: -0.47% -0.25%
Cycles are helped.

total spills in shared programs: 20335 -> 20333 (<.01%)
spills in affected programs: 19 -> 17 (-10.53%)
helped: 2
HURT: 0

total fills in shared programs: 25905 -> 25899 (-0.02%)
fills in affected programs: 23 -> 17 (-26.09%)
helped: 2
HURT: 0

LOST:   9
GAINED: 0

Haswell
total instructions in shared programs: 16418516 -> 16417293 (<.01%)
instructions in affected programs: 223785 -> 222562 (-0.55%)
helped: 590
HURT: 67
helped stats (abs) min: 1 max: 15 x̄: 2.19 x̃: 1
helped stats (rel) min: 0.03% max: 6.52% x̄: 0.87% x̃: 0.60%
HURT stats (abs)   min: 1 max: 2 x̄: 1.04 x̃: 1
HURT stats (rel)   min: 0.04% max: 1.85% x̄: 0.44% x̃: 0.25%
95% mean confidence interval for instructions value: -2.01 -1.71
95% mean confidence interval for instructions %-change: -0.80% -0.67%
Instructions are helped.

total cycles in shared programs: 1037179754 -> 1037084874 (<.01%)
cycles in affected programs: 352541071 -> 352446191 (-0.03%)
helped: 1093
HURT: 182
helped stats (abs) min: 1 max: 888 x̄: 111.03 x̃: 64
helped stats (rel) min: <.01% max: 27.30% x̄: 0.84% x̃: 0.20%
HURT stats (abs)   min: 1 max: 6777 x̄: 145.49 x̃: 21
HURT stats (rel)   min: <.01% max: 24.10% x̄: 1.99% x̃: 0.29%
95% mean confidence interval for cycles value: -88.10 -60.73
95% mean confidence interval for cycles %-change: -0.58% -0.29%
Cycles are helped.

total spills in shared programs: 17457 -> 17456 (<.01%)
spills in affected programs: 12 -> 11 (-8.33%)
helped: 1
HURT: 0

total fills in shared programs: 20387 -> 20385 (<.01%)
fills in affected programs: 15 -> 13 (-13.33%)
helped: 1
HURT: 0

LOST:   6
GAINED: 1

Ivy Bridge and earlier platforms had similar results. (Ivy Bridge shown)
total instructions in shared programs: 15515482 -> 15513998 (<.01%)
instructions in affected programs: 239739 -> 238255 (-0.62%)
helped: 573
HURT: 57
helped stats (abs) min: 1 max: 20 x̄: 2.73 x̃: 2
helped stats (rel) min: 0.03% max: 9.84% x̄: 0.94% x̃: 0.55%
HURT stats (abs)   min: 1 max: 2 x̄: 1.39 x̃: 1
HURT stats (rel)   min: 0.09% max: 1.85% x̄: 0.52% x̃: 0.35%
95% mean confidence interval for instructions value: -2.57 -2.14
95% mean confidence interval for instructions %-change: -0.89% -0.73%
Instructions are helped.

total cycles in shared programs: 584509880 -> 584463152 (<.01%)
cycles in affected programs: 11765280 -> 11718552 (-0.40%)
helped: 661
HURT: 152
helped stats (abs) min: 1 max: 3073 x̄: 101.99 x̃: 32
helped stats (rel) min: <.01% max: 34.38% x̄: 1.46% x̃: 0.50%
HURT stats (abs)   min: 1 max: 6637 x̄: 136.10 x̃: 15
HURT stats (rel)   min: <.01% max: 24.19% x̄: 1.75% x̃: 0.25%
95% mean confidence interval for cycles value: -82.79 -32.16
95% mean confidence interval for cycles %-change: -1.11% -0.61%
Cycles are helped.

LOST:   9
GAINED: 0

Tiger Lake
Instructions in all programs: 160905127 -> 160900949 (-0.0%)
SENDs in all programs: 6812418 -> 6812085 (-0.0%)
Loops in all programs: 38225 -> 38225 (+0.0%)
Cycles in all programs: 7431911114 -> 7433914697 (+0.0%)
Spills in all programs: 192582 -> 192582 (+0.0%)
Fills in all programs: 304539 -> 304537 (-0.0%)

Ice Lake
Instructions in all programs: 145296733 -> 145292370 (-0.0%)
SENDs in all programs: 6863818 -> 6863485 (-0.0%)
Loops in all programs: 38219 -> 38219 (+0.0%)
Cycles in all programs: 8798257570 -> 8800204360 (+0.0%)
Spills in all programs: 216880 -> 216880 (+0.0%)
Fills in all programs: 334250 -> 334248 (-0.0%)

Skylake
Instructions in all programs: 135891485 -> 135887357 (-0.0%)
SENDs in all programs: 6803031 -> 6802698 (-0.0%)
Loops in all programs: 38216 -> 38216 (+0.0%)
Cycles in all programs: 8442221881 -> 8444201959 (+0.0%)
Spills in all programs: 194839 -> 194839 (+0.0%)
Fills in all programs: 301116 -> 301114 (-0.0%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>
2021-05-20 01:39:35 +00:00
Connor Abbott e894e83e47 ir3/cf: Rewrite pass
The old pass had a few bugs:
- It tried to avoid folding f2f32 into f2f16, but didn't consider
  conversions that were already folded in.
- It didn't prevent folding an f2f16 or f2f32 into a non-floating-point
  op.

In addition it wasn't written in a manner which made handling integer
conversions practical. This rewrites the pass to instead calculate the
"type" of the conversion source and then check whether folding the
conversion is allowed. This allows us to cleanly separate the
declarative part where we describe how the HW works from the policy part
where we decide whether the transform is allowed, and makes it simple to
add support for folding integer conversions.

Closes: #3208
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10859>
2021-05-19 20:03:19 +00:00
Danylo Piliaiev 931ad19a18 turnip: make cmdstream bo's read-only to GPU
Would allow earlier faults instead of having corrupted cmdstream.

This was already done to Freedreno long ago in:
    04aff7e4 "freedreno: make cmdstream bo's read-only to GPU"

Since private memory should be GPU writable it is now allocated
separately, instead of suballocation from now read-only cmdstream.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10807>
2021-05-17 18:29:09 +00:00
Danylo Piliaiev 413e7c6dc8 turnip: make possible to create read-only bo with tu_bo_init_new
GPU won't be able to write to such BOs, which would to useful for
cmdstream BOs.

Move "bool dump" to the new flags along the way.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10807>
2021-05-17 18:29:09 +00:00
Connor Abbott a40714abf7 nir/lower_phis_to_scalar: Add "lower_all" option
We don't want to have to deal with vector phis in freedreno, because
vectors are always split/unsplit around vectorized instructions anyways,
and the stated reason for not scalarising them (it hurting coalescing)
won't apply to us because we won't be using nir_from_ssa. Add this
option so that we don't have to do the equivalent thing while
translating from NIR.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10809>
2021-05-17 09:59:45 +00:00
Danylo Piliaiev 3a29e45a90 turnip: do not ignore early_fragment_tests
Specifying "early_fragment_tests" in fragment shader takes precedence
over our internal conditions.

Fixes test:
 dEQP-VK.fragment_operations.early_fragment.early_fragment_tests_stencil

Fixes: b2a60c157e "turnip: add LRZ early-z support"

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10803>
2021-05-17 11:56:32 +03:00
Rob Clark a5a86adc23 freedreno/a6xx: Add a few registers
Based-on: https://patchwork.freedesktop.org/patch/429745/?series=89269&rev=2
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10829>
2021-05-16 18:47:55 +00:00
Dmitry Baryshkov 0c30ad402d freedreno/regs: split DSI PHY registers to separate xml files.
In-kernel DSI PHY driver is being more and more split from the main DSI
driver. Split PHY registers from main dsi.xml file to ease further split
of the drivers.

Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10817>
2021-05-16 15:53:14 +00:00
Juan A. Suarez Romero 629e8347ad ci: Update VK-GL-CTS to 1.2.6.1
Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10800>
2021-05-14 20:35:24 +00:00
Emma Anholt 341ecb2dfc ci/freedreno: Skip refract on a306 now that it hangchecks sometimes.
Not every MR, but several per day since I landed the apitrace switch which
increased trace replay resolution to 1080p.  Hopefully some day we can
tune the hangcheck to be less aggressive.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10793>
2021-05-14 16:59:13 +00:00
Danylo Piliaiev 5a133ef1f2 ci/turnip: drop fail annotation for image.extend_operands_spirv1p4.*
They were fixed in
ed20e69b "vtn: Handle ZeroExtend/SignExtend image operands"

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10783>
2021-05-13 17:18:48 +00:00
Danylo Piliaiev 9a477ccbea ci/turnip: drop fail annotation for float_control tests
These tests are NotSupported and therefore cannot fail.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10783>
2021-05-13 17:18:48 +00:00
Rob Clark 6c530ebf40 freedreno/ir3: Don't force RTNE if rounding mode is undefined
Forcing round-to-nearest-even results in loss of opportunities for
conversion folding, causing a regression in gfxbench gl_alu2.

Fixes: de195671bd ("ir3: nir_op_f2f16 should round to even")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10773>
2021-05-12 19:05:27 +00:00
Hyunjun Ko 3f229e34c9 turnip: Implement VK_KHR_timeline_semaphore.
Implements non-shareable timelines using legacy syncobjs,
inspired by anv/radv implementation.

v1. Avoid memcpy in/out_syncobjs and fix some mistakes.
v2.
 - Handle vkQueueWaitIdle.
 - Add enum tu_semaphore_type.
 - Fix to handle VK_SEMAPHORE_WAIT_ANY_BIT_KHR correctly.
 - Fix a crash of dEQP-VK.synchronization.timeline_semaphore.device_host.misc.max_difference_value.

v3. Avoid indefinite waiting in vkQueueWaitIdle by calling
tu_device_submit_deferred_locked itself.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10126>
2021-05-12 05:07:44 +00:00
Hyunjun Ko daefc6e2a4 turnip: prep work for timeline semaphore support
Small refactor to classify semphore types, currently only binary
syncobj is being used though.

v1. Fix a crash of dEQP-VK.api.null_handle.destroy_semaphore

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10126>
2021-05-12 05:07:44 +00:00
Emma Anholt 7520ac54dd ci: Switch to apitraces for glmark2
This brings in upstream mediump fixes, and should also replay faster than
.rdc files.

Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10295>
2021-05-11 20:07:29 +00:00
Emma Anholt 9a5c9ff342 turnip: Drop fail annotation for driver_properties.
These subtests weren't run in CI, and the whole set is skipped since
dropping to 1.1.

Fixes: 7bcda21441 ("turnip: Demote API version to 1.1.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10734>
2021-05-11 14:16:25 +00:00
Emma Anholt 63a3d18ae1 ci/turnip: Add some links to issues and MRs for some test failures.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10734>
2021-05-11 14:16:25 +00:00
Emma Anholt 90b08175b7 ci/turnip: Clean up some stale fail annotations.
This test group was fixed in the deqp 1.2.6.0 uprev, but we do a
fractional run that didn't include these tests.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10734>
2021-05-11 14:16:25 +00:00
Danylo Piliaiev 811f289c56 turnip: copy all layers specified in vkCmdCopyImage
When copying layered images we ignored .layerCount parameter.

Fixes mis-rendering of walls in D3D11 game "Company Of Heroes 2".

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10736>
2021-05-11 11:42:12 +00:00
Emma Anholt d93acf1001 freedreno: Update editorconfig and emacs settings for freedreno reformat.
Fixes: 2d439343ea ("freedreno: Re-indent")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10742>
2021-05-10 23:16:00 +00:00
Eric Anholt 3eee475e39 turnip: Claim 2 discrete queue priorities.
The spec requires at least 2, but says "No specific guarantees are made
about higher priority queues receiving more processing time or better
quality of service than lower priority queues."  So, we can just leave the
priorities as a stub.

Fixes dEQP-VK.info.device_properties

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10470>
2021-05-10 17:21:02 +00:00
Eric Anholt d8099df65a turnip: Drop wideLines properties since we don't support wide lines.
The blob doesn't expose wideLines either, and
dEQP-VK.info.device_properties fails if you claim wide line properties
without it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10470>
2021-05-10 17:21:02 +00:00
Rob Clark 3a772be026 freedreno: Add perfetto renderpass support
Add a custom DataSource to provide trace events for render stages.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9901>
2021-05-10 15:34:07 +00:00
Rob Clark 133a3e4dd3 freedreno/pps: Detect GPU suspend on newer kernels
We can avoid re-sending the configuration cmdstream constantly if we
know the device has not suspended since the last sampling period.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9901>
2021-05-10 15:34:07 +00:00
Rob Clark e63ef520fe freedreno/drm: Add support to query device suspend count
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9901>
2021-05-10 15:34:07 +00:00
Rob Clark 3e13e45467 freedreno: Add freedreno pps driver
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9901>
2021-05-10 15:34:07 +00:00
Danylo Piliaiev daad8f2245 freedreno/a5xx: SP_BLEND_CNTL has per-mrt blend enable bit
Blending in SP_BLEND_CNTL is not a binary flag but the same
mask as in RB_BLEND_CNTL. It is a per-mrt enable bit for blending.

Copied form a6xx, on a5xx it should be have the same since it seems
to have the same structure layout.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10682>
2021-05-07 19:12:17 +00:00
Danylo Piliaiev 14da2444a9 turnip,freedreno/a6xx: SP_BLEND_CNTL has per-mrt blend enable bit
Blending in SP_BLEND_CNTL is not a binary flag but the same
mask as in RB_BLEND_CNTL. It is a per-mrt enable bit for blending.

Example SP_BLEND_CNTL produced by blob on a630 and different MRT
blendings:
  SP_BLEND_CNTL: { UNK8 | 0x6 }
  SP_BLEND_CNTL: { ENABLED | UNK8 | 0xe }
(Decoded before this commit)

Fixes mis-rendering with D3D11 game "Spelunky 2".

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10682>
2021-05-07 19:12:17 +00:00
Emma Anholt 0cd63e891d turnip: Move the extension tables to tu_device.c
Following intel's lead in 27d49670.  In the dEQP-VK.info.* tests, this
bumps apiVersion from 1.1.128 to 1.1.177.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10635>
2021-05-06 00:14:12 +00:00
Emma Anholt c5438450ad turnip: Switch to the shared vulkan ICD generator.
One less python script to maintain.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10635>
2021-05-06 00:14:12 +00:00
Eric Anholt cc5df4398a freedreno/a5xx: Fix up border color pointers.
We were forgetting to increment in the loop, but also it looks from blob
dumps on Pixel 2 like all the pointers it emitted were shifted up by 3
compared to our xml, and that's the same shift that a6xx uses for its
pointers.  None of the tests seem to use more than one
border-color-requiring texture, so it's hard to tell.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9904>
2021-05-05 21:28:03 +00:00
Rob Clark b447db41fc freedreno/tools: Fix async flush vs fdperf/computerator
They need to wait on the ready fence to ensure the submit has been
flushed to the kernel.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10626>
2021-05-05 20:32:31 +00:00
Eric Anholt 7bcda21441 turnip: Demote API version to 1.1.
We don't support major 1.2 required extensions like timeline semaphores.
Fixes many complaints in the dEQP-VK.info.vulkan1p2.* group.

We were originally bumped to 1.2 in 75755e0eba ("turnip: Pretend to
support Vulkan 1.2") but hopefully that build issue has been fixed in the
entrypoint reworks since then.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10471>
2021-05-05 17:09:09 +00:00
Danylo Piliaiev d8ab0ec8e4 turnip: implement VK_KHR_vulkan_memory_model
No handling of Acquire/Release because at the moment scheduler
works as if any barrier is Acq+Rel.

Instead of removing scoped_barrier with scope/mode that for TCS
corresponds to a control_barrier or a memory_barrier_tcs_patch in
ir3_nir_lower_tess_ctrl - remove them in emit_intrinsic_barrier.
And do the same for memory_barrier_tcs_patch and control_barrier.
While in any case hw fence/barrier shouldn't be emitted for them,
they still affect ordering of stores, and in feature ir3 backend
may want to have that information.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9054>
2021-05-05 10:05:38 +00:00
Danylo Piliaiev a898828a63 ir3: update bar/fence bits in accordance to blob
On a6xx blob uses .l rather differently from a5xx.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9054>
2021-05-05 10:05:38 +00:00
Danylo Piliaiev cb8a00791c ir3: memory_barrier also controls shared memory access order
nir_intrinsic_memory_barrier has the same semantic as memoryBarrier()
in GLSL, which is:

GLSL 4.60, 4.10. "Memory Qualifiers":
 "The built-in function memoryBarrier() can be used if needed to
 guarantee the completion and relative ordering of memory accesses
 performed by a single shader invocation."

GLSL 4.60, 8.17. "Shader Memory Control Functions":
 "The built-in functions memoryBarrier() and groupMemoryBarrier() wait
 for the completion of accesses to all of the above variable types."

Fixes tests:
 dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_local.image.guard_nonlocal.workgroup.comp
 dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_nonlocal.workgroup.guard_local.image.comp

Fixes: 819a613a ("freedreno/ir3: moar better scheduler")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9054>
2021-05-05 10:05:38 +00:00
Eric Anholt 89114225b5 tunrip: Add support for VK_EXT_separate_stencil_usage.
We were implictly including it in exposing VK 1.2, but we weren't making
use of the supplied struct.  Actually enabling it gives us a chance to do
slightly better at Z/S UBWC, and means we won't lose the separate usage
test coverage when switching back to exposing VK 1.1.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10594>
2021-05-04 20:30:50 +00:00
Eric Anholt 7c52a79057 ci/freedreno: Add another db820c flake that's appeared in the last few months.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10597>
2021-05-04 01:03:50 +00:00
Connor Abbott 9fa587ae96 ir3: Don't assume regs[1] exists in ir3_fixup_src_type()
It won't exist for phi nodes because they are only partially constructed
beforehand. Move it into the switch arguments where we know it's needed.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>
2021-05-03 19:52:31 +00:00
Connor Abbott 3c8a5d7e17 ir3: Rework outputs
Instead of using a separate outputs array, make the "end" instruction
(or chmask) take the outputs as sources. This works better for the new
RA, because it better models the fact that outputs are consumed all at
the same time. With the old model, each output collect would be assumed
dead after it was processed and subsequent collects could use it when
inserting shuffle code, which wouldn't work, and the new RA also deletes
collect instructions after lowering them to moves so the information
would be gone after RA.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>
2021-05-03 19:52:31 +00:00
Connor Abbott dd55bd8f68 ir3: Make predecessors an array
We need a stable order in order to create phi instructions. In the
future we can make this more sophisticated in order to make manipulating
the CFG easier, but for now that only happens after RA, so we won't have
to worry about it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>
2021-05-03 19:52:31 +00:00
Connor Abbott 0bd68b8386 ir3: Refactor nir->ir3 block handling
Originally I wrote this to support multiple ir3 blocks per NIR block,
but this turned out to be more useful for creating a stable ordering to
the predecessors. We compute the predecessors ourselves, rather than
relying on NIR, so that the array of predecessors we create in the next
commit has a stable order we can rely on when creating phi nodes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>
2021-05-03 19:52:31 +00:00
Connor Abbott d28b22374c ir3/cp_postsched: Fixup SSA use pointer for direct reads
There's an optimization here to sink direct (i.e. not relative) reads of
an array past unrelated direct writes. However, since each write
actually reads, modifies, and then writes again to the array, this means
that we need to read the latest updated array. The old RA used the array
id instead of the SSA information, so it didn't care, but the new RA
uses ->instr instead and ignores the array id because arrays are now SSA
so it needs to be correct.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>
2021-05-03 19:52:31 +00:00
Connor Abbott 40a1c4ba2d ir3/postsched: Fix ir3_postsched_node::delay calculation
This wasn't using the same calculation that add_reg_dep() was using to
get the index into state->regs, so it was using the wrong register. Fix
this by folding it into add_reg_dep().

This shouldn't fix anything, because it's just used for scheduler
priorities, but it should reduce nop's and syncs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>
2021-05-03 19:52:31 +00:00
Connor Abbott 4b41ffc231 ir3/delay: Remove special case for array deps
The case it was trying to handle (array read-after-write depedendencies)
is already handled by the normal SSA source handling, so this is just
useless.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>
2021-05-03 19:52:31 +00:00
Connor Abbott 873e21f4e9 ir3/postsched: Use correct src index
Match what ir3_delay_calc() does. Caught by an assert later.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>
2021-05-03 19:52:31 +00:00
Connor Abbott af7f29a78e ir3/sched: Use correct src index
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>
2021-05-03 19:52:31 +00:00
Connor Abbott 7df7bab03b ir3/cp: Clone registers for compare-folding optimization
Sharing the same register between instructions happened to work with the
old RA, but not with the new RA because they may get different register
assignments.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>
2021-05-03 19:52:31 +00:00
Connor Abbott e597f8b122 ir3/postsched: Fix dependencies for a0.x/p0.x
a0.x is written as a half-reg, but just interpreting it as "hr61.x" will
result in it overlapping with r30.z in merged mode, which is not what
the hardware does at all. This introduced a spurious dependency on
a write to r30.z which resulted in an assert tripping. Just pretend it's
a full reg instead.

This fixes
spec@arb_tessellation_shader@execution@variable-indexing@vs-output-array-vec3-index-wr-before-tcs
with the new RA.

Fixes: 0f78c32 ("freedreno/ir3: post-RA sched pass")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>
2021-05-03 19:52:31 +00:00
Danylo Piliaiev ea72be8b7c ir3: do not fold cmps from different blocks with non-null address
Scheduling don't like address being in the different block from
the instruction.

Fixes a crash in the trace of "War Thunder" (DX11)

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10355>
2021-05-03 17:25:05 +00:00
Connor Abbott 3d5c1c4989 tu: Fix SP_GS_PRIM_SIZE for large sizes
Based on the previous commit.

Fixes: 012773b ("turnip: Configure VPC for geometry shaders")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10551>
2021-05-03 14:06:24 +00:00
Connor Abbott 0157076982 freedreno/a6xx: Better document SP_GS_PRIM_SIZE
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10551>
2021-05-03 14:06:24 +00:00
Rob Clark c554757bc2 freedreno/drm: Initialize control->fence
Don't rely on getting a zero'd out buffer, we could hit the bo-cache.

Fixes: 7dabd62464 ("freedreno/drm: Userspace fences")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10567>
2021-05-02 15:17:25 +00:00
Rob Clark 928453ccb2 freedreno/ci: Mark client_wait_sync_finish as flake
This one has shown up a couple times since fd/go-fast, I'm still trying
to reproduce/debug.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10530>
2021-05-01 08:47:50 -07:00
Rob Clark 5181f40670 freedreno/drm: Allow FD_BO_PREP_FLUSH without _NOSYNC
This provides the upper layer (gallium, etc) a way to ensure that
rendering involving the bo has been flushed all the way to the kernel.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10530>
2021-05-01 08:46:27 -07:00
Rob Clark 9fa3312773 freedreno/ci: Isolate dEQP-EGL reset_context tests
To reduce flakes, separate out the dEQP-EGL tests that are intentionally
triggering GPU hangs.  This avoids some kernel side issues with bad
handling of ringbuffer-full scenarios, causing innocent tests to flake.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10560>
2021-05-01 02:37:05 +00:00
Eric Anholt 0987df6a3e ci/freedreno: Mark dEQP-EGL flakes reported on IRC since its introduction.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10552>
2021-04-30 16:44:20 -07:00
Danylo Piliaiev 1201aa9332 ir3: do not move varying inputs that depend on unmovable instrs
Not all varying fetches could be pulled into the start block.
If there are fetches we couldn't pull, like load_interpolated_input
with offset which depends on a non-reorderable ssbo load or on a
phi node, this pass is skipped since it would be hard to find a place
to set (ei) flag (beside at the very end).

We also don't have to manually set (ei) in such cases since a5xx and
a6xx do automatically release varying storage at the end.
Earlier gens need further testing, however they do not support
interpolateAt* functions at moment, so unless we would like to support
sample shading on them - they are fine.

Fixes crash in GTA V.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10483>
2021-04-30 14:49:18 +00:00
Juan A. Suarez Romero e532a47f76 util/hash_table: do not leak u64 struct key
For non 64bit devices the key stored in hash_table_u64 is wrapped in
hash_key_u64 structure, which is never free.

This commit fixes this issue by just removing the user-defined
`delete_function` parameter in hash_table_u64_{destroy,clear} (which
nobody is using) and using instead a delete function to free this
structure.

Fixes: 608257cf82 ("i965: Fix INTEL_DEBUG=bat")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10480>
2021-04-29 12:58:23 +02:00
Rob Clark 1d19325483 freedreno/ci: Disable counterstrike trace on a306 for now
The combination of removing bottlenecks in userspace (userspace fences,
etc) and slow GPU results in hitting full ringbuffer on a306.  Haven't
figured out a reasonable way to work around that in userspace until a
kernel fix is in place, so disable this one for now.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark f92f31455a freedreno/drm: Assume explicit fences if in_fence_fd
If we ever see explicit fencing used, then we can disable implicit
fencing, even for internal or unfenced batches.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark e9a9ac6f77 freedreno/drm: Async submit support
Move the submit ioctl to it's own thread to unblock the driver thread
and let it move on to the next frame.

Note that I did experiment with doing the append_bo() parts
synchronously on the theory that we should be more likely to hit the
fast path if we did that part of submit merging before the bo was
potentially re-used in the next batch/submit.  It helped some things
by a couple percent, but hurt more things.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark 2c9e8db28d freedreno/drm: pipe should hold reference to device
A more direct solution would be for bo's to have a reference to the
device.  But bo's are ref/unrefd more frequently.

This avoids async submits unrefing a bo after the device handle-
table is freed.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark ad9654a4c1 freedreno/drm: fd_submit should hold ref to fd_pipe
Also, move this into the base class, no reason for it to be in backend.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark cccdc513e3 freedreno/drm/sp: Implement deferred submit merging
For submits flushed with (a) no required fence, and (b) no externally
visible effects (ie. imported/exported bo), we can defer flushing the
submit and merge it into a later submit.

This is a bit more work in userspace, but it cuts down the number of
submit ioctls.  And a common case is that later submits overlap in the
bo's used (for example, blit upload to a buffer, which is then used in
the following draw pass), so it reduces the net amount of work needed
to be done in the kernel to handle the submit ioctl.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/19
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark c7dc5cf3cb freedreno/drm/sp: Split submit prep and finish
For deferred submits, we still need to do the prep steps immediately,
but the ioctl construction can be deferred.. prepare for that by
splitting the prep out.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark 62a6773d80 freedreno/drm: Add pipe tracking for deferred submits
Now that we have some bo state tracking for userspace fences, we can
build on this to add a way for the pipe implementation to defer a submit
flush in order to merge submits into a single ioctl.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark aafcd8aacb freedreno: Re-work fd_submit fence interface
Move everything into a struct assocated with the pipe_fence_handle, so
that the drm layer can fill in the seqn/fd fences directly.

This will give us a comvenient place to insert a util_queue_fence in the
next commit.

While we're at it, extract the uint32_t fence (previously called
'timestamp' in place, a kgsl legacy) into a struct that encapsulates
both the kernel fence and the userspace fence.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark d531c8d22a freedreno/drm: Reference count submits
To merge submits, we'll need drm to internally hold an extra reference
to the submit.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark 49852ace2a freedreno/drm: Inline the fence-table
In the common case, a bo will have no more than a single fence attached.
We can inline the storage to avoid a separate allocation in this case.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark 7dabd62464 freedreno/drm: Userspace fences
Add a per-fd_pipe fence "timeline" so we can detect cases where we don't
need to call into the kernel to determine if a fd_bo is still busy.

This reuses table_lock, rather than introducing a per-bo lock to protect
fence state updates because (a) the common / hotpath pattern is to
update fences on a lot of objects, but checking the fence state of a
single object is less common, and (b) because we already hold the table
lock in common spots where we need to check the bo's fence state (ie.
allocations from the bo-cache).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark df78934cdf freedreno/drm: Add locked version fd_{bo,pipe}_del()
This will be needed in the next patch, so we can reuse the bo table_lock
for fence related locking.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark 8f5c89350f freedreno/drm: Move the growable array helper
We'll need this to track userspace fences attached to a fd_bo.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark c98ada6ad1 freedreno/drm: Add FD_BO_PREP_FLUSH
There are a couple cases where we want to use _NOSYNC, but at the same
time we want to ensure that rendering related to a bo is actually
flushed.

This doesn't do anything yet, but when we start deferring/merging
submits we'll need a way to trigger anything deferred to flush.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark 8ab227c373 freedreno/drm: Cleanup bo cpu_prep flags
Also add some STATIC_ASSERT()

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark 7f0abd9048 freedreno/drm: Cleanup bo allocation flags
Most of them were actually unused.  The memory type (KMEM vs SMI) only
applied to very old a2xx era devices that had a small/fast stacked
memory (SMI) vs normal memory (KMEM).  And the cache flags are ignored
(ie. everything is writecombine), but we can add new cache flags later
when they actually do something.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark ef0c5007f2 freedreno/drm: Move submit->primary to base class
Gets rid of a bit of duplication between the two current
implementations, and will be needed in next patch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark 224dbd77d5 freedreno: Small indent fix
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Danylo Piliaiev addab037f0 tu: do not corrupt unwritten render targets
There is no point in having a write to an attachment enabled when there
is no corresponding output in the shader. Per VK spec it is an UB,
however a few apps depend on attachment not being changed if FS doesn't
have corresponding output.

Fixes water in Genshin Impact.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10489>
2021-04-28 08:51:49 +00:00
Eric Anholt 7ae0719117 turnip: Only write the tu_RegisterDeviceEXT() out fence on success.
Fixes a double-free in dEQP-VK.wsi.display_control.register_device_event
where the fence that we destroyed got destroyed again.  Leaving it unset
in the error path leaves the test's NULL in place.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10473>
2021-04-27 16:19:26 +00:00
Italo Nicola 8074a040e7 util: add util_sign_extend
This code is taken from src/freedreno/isa/decode.c.
Since we need a similar function in panfrost, it's probably good to move
it to utils.

Signed-off-by: Italo Nicola <italonicola@collabora.com>
Acked-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9461>
2021-04-27 07:04:07 +00:00
Connor Abbott 643f2cb8a3 ir3, tu: Cleanup indirect i/o lowering
Do all the necessary lowering in one place, during finalization, and
stop uselessly calling nir_lower_indirect_derefs in turnip. Splitting
i/o to elements should no longer be necessary since we use the i/o
semantics instead of variables now.

This has the side effect that we no longer generate enormous if-ladders
for tess/GS shaders with turnip.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7274>
2021-04-26 17:07:02 +00:00
Connor Abbott decfea2f4e ir3: Prevent oob writes to inputs/outputs array
Don't setup inputs and outputs if we aren't using
load_input/store_output intrinsics. While it's mostly harmless, there
may be more outputs than expected which would lead to an oob write of
the outputs array when setting the register id to INVALID_REG.

Also be more paranoid with asserts to catch this.

Fixes: a6291b1 ("freedreno/ir3: rework setup_{input,output} to make struct varyings work")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7274>
2021-04-26 17:07:02 +00:00
Robert Foss 9967dabe91 freedreno/regs: add 5nm DSI PHY/PLL regs
This is for the kernel driver.

Signed-off-by: Robert Foss <robert.foss@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10368>
2021-04-21 15:42:03 +00:00
Danylo Piliaiev 9402d5a6b5 ir3: make possible to specify branchstack up to 64
On a6xx/a5xx there is such dependency between branchstack bitfield
and the amount of nested ifs, which could be seen with blob:

IFs   BRANCHSTACK
0	0
1	1
2	2
3	2
4	3
5	3
6	4
...
59	30
60	31
61	31
62	32
63	32
64	32

Remove open-coded branchstack for a5xx compute along the way.

Fixes tests:
 dEQP-VK.spirv_assembly.instruction.compute.float16.opvectorshuffle.344
 dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.344_vert
 dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.444_geom
 dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.244_tessc
 dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.344_frag

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9859>
2021-04-21 11:57:07 +00:00
Danylo Piliaiev e7eed45869 ir3: do not double threadsize when exceeding branchstack limit
We can't support more than compiler->branchstack_size diverging threads
in a wave. Thus, doubling the threadsize is only possible if we don't
exceed the branchstack size limit.

As of blob version 512.490.0 - it doesn't have this heuristics.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9859>
2021-04-21 11:57:07 +00:00
Danylo Piliaiev 1e33b6a32b turnip: enable shaderInt16
We should have everything to enable it.
16b integer division is lowered by nir_lower_idiv.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10054>
2021-04-20 20:32:20 +00:00
Danylo Piliaiev d918bbfa1c ir3: treat 16b imul as mul.s24
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10054>
2021-04-20 20:32:20 +00:00
Rob Clark 5bf7475460 ir3: handle 16b op_i2b1
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10054>
2021-04-20 20:32:20 +00:00
Samuel Iglesias Gonsálvez b2a60c157e turnip: add LRZ early-z support
Imported the logic from Freedreno driver.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7186>
2021-04-20 10:01:58 +00:00
Samuel Iglesias Gonsálvez af049b6668 turnip: fix setting dynamic state mask for VK_DYNAMIC_STATE_STENCIL_OP_EXT case
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7186>
2021-04-20 10:01:58 +00:00
Samuel Iglesias Gonsálvez 88c7aa0b3e turnip: group all geometry constant draw states in one
Thus, we can free some draw state slots for future use.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7186>
2021-04-20 10:01:58 +00:00
Samuel Iglesias Gonsálvez 2c0c696f16 turnip: update LRZ state based on stencil test state
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7186>
2021-04-20 10:01:58 +00:00
Samuel Iglesias Gonsálvez ff8e3547b3 turnip: implement LRZ direction
There are some LRZ compare op switches that are not supported by
the HW, like GREATER* <-> LESS* ones.

This patch tracks the direction of the switch and disables LRZ
if needed.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7186>
2021-04-20 10:01:58 +00:00
Eric Anholt 8a8e55d6a8 ci/freedreno: Test dEQP-EGL against Xorg.
This should help us be able to refactor core EGL code with more
confidence, and increase our confidence uprevving Mesa in ChromeOS.

Part of #1884

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10240>
2021-04-19 20:53:27 +00:00
Danylo Piliaiev 64367f2359 turnip: implement VK_KHR_shader_terminate_invocation
OpTerminateInvocation provides the behavior required by the GLSL
discard statement, which we already implement.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9460>
2021-04-19 17:11:36 +00:00
Danylo Piliaiev 9dd9424a85 turnip: implement VK_EXT_shader_demote_to_helper_invocation
The "demote" intrinsic has the semantics of D3D discard, which means
it doesn't change the control flow, allowing derivatives to work.

On A6xx there is no known way to check whether invocation was demoted,
thus we use nir_lower_is_helper_invocation.

Add "logical" OPC_DEMOTE which is later translated to "kill".
Such separation is necessary to run "kill" specific optimizations
which are invalid for "demote".

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9460>
2021-04-19 17:11:36 +00:00
Connor Abbott 08499369d0 ir3: Assemble and disassemble swz/gat/sct
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10291>
2021-04-19 16:10:44 +00:00
Connor Abbott d48d43039a ir3: Improve cat1 modifier disassembly
Remove bit that shouldn't be part of (rptN), and rewrite the handling of
(even) and (pos_infinity) to uncover a missing (neg_infinity) modifier.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10291>
2021-04-19 16:10:44 +00:00
Connor Abbott 4c5b696cc3 ir3/parser: Fix oob write with immediates array
immediates_count and immediates_size are supposed to have the same
units, but it was only incrementing immediates_count by 1. While we're
here, also fix the case where constants are specified out-of-order.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10291>
2021-04-19 16:10:44 +00:00
Rob Clark c74d93cf01 freedreno/fdl: Re-indent
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>
2021-04-17 15:38:56 +00:00
Rob Clark 6050976232 freedreno/perfcntrs: Re-indent
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>
2021-04-17 15:38:56 +00:00
Rob Clark d26a224ca9 freedreno/ir2: Re-indent
clang-format -fallback-style=none --style=file -i src/freedreno/ir2/*.[ch]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>
2021-04-17 15:38:56 +00:00
Rob Clark 2dbf09c2b4 freedreno/drm-shim: Re-indent
clang-format -fallback-style=none --style=file -i src/freedreno/drm-shim/*.[ch]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>
2021-04-17 15:38:56 +00:00
Rob Clark 45856c5fbc freedreno/decode: Re-indent
clang-format -fallback-style=none --style=file -i src/freedreno/decode/*.[ch]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>
2021-04-17 15:38:56 +00:00
Rob Clark 3894bc9664 freedreno/computerator: Re-indent
clang-format -fallback-style=none --style=file -i src/freedreno/computerator/*.[ch]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>
2021-04-17 15:38:56 +00:00
Rob Clark ccd68b672a freedreno/common: Re-indent
clang-format -fallback-style=none --style=file -i src/freedreno/common/*.[ch]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>
2021-04-17 15:38:56 +00:00
Rob Clark f5918f750f freedreno/afuc: Re-indent
clang-format -fallback-style=none --style=file -i src/freedreno/afuc/*.[ch]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>
2021-04-17 15:38:56 +00:00
Rob Clark b94db11708 freedreno/drm: Re-indent
clang-format -fallback-style=none --style=file -i src/freedreno/drm/*.[ch]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>
2021-04-17 15:38:56 +00:00
Eric Anholt 23159f1a7a ci/freedreno: Skip some precision tests on a530.
These have flaked as Timeouts in CI in the last month.  .precision.* is
generally very slow (some in the 15s-30s range), but it's unclear to me
why they sometimes spike up to 60 seconds (thermal throttling?).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10274>
2021-04-16 04:34:14 +00:00
Eric Anholt 7d234da6ee freedreno: Fix YUV sampler regression.
We have to keep sampler uniforms around for later YUV lowering, and we
only need to remove uniforms that take up storage space.  Code comes from
radeonsi.

Closes: #4644.
Fixes: de17b4aab5 ("freedreno: Remove uniform variables after finalizing NIR.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10246>
2021-04-15 16:20:15 +00:00
Michel Dänzer d200f45875 Use explicit break instead of fall-through to break-only case
clang generates a warning if there's no explicit break or fall-through
annotation. The latter would be kind of silly in this case, and not
robust against any future changes turning the fall-through invalid.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>
2021-04-15 16:01:22 +00:00
Michel Dänzer 2928c21eb7 Convert most remaining free-form fall-through comments to FALLTHROUGH
One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is
explicitly disabled.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>
2021-04-15 16:01:22 +00:00
Connor Abbott cf727e6ba4 tu: Expose VK_EXT_robustness2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>
2021-04-15 16:05:13 +02:00
Connor Abbott 0fb14420da tu: Handle null descriptors
Writing all 0's, including for the format, seems to work. Actually
setting the format seems to break textureSize() (getsize returns 1 for
some reason).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>
2021-04-15 16:05:13 +02:00
Connor Abbott f58ece08da tu: Handle robust UBO behavior for pushed UBO ranges
If we push a UBO range but then find out at draw-time that part of the
pushed range is out of range of the UBO descriptor, then we have to fill
in the rest of the range with 0's to mimic the bounds-checking that ldc
would've done.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>
2021-04-15 16:05:13 +02:00
Connor Abbott cb02a48f83 tu: Correctly preserve old push descriptor contents
We were never setting set->size, so we were always copying 0 bytes. But
as we only copy the contents when the layout and therefore the size is
the same, we don't have to take the old size into account anyway.

This fixes some VK_EXT_robustness2 tests that use push descriptors.

Fixes: 6d4f33e ("turnip: initial implementation of VK_KHR_push_descriptor")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>
2021-04-15 16:05:13 +02:00
Connor Abbott c68ea960a7 ir3, tu: Add compiler flag for robust UBO behavior
This needs to be part of the compiler because it's the only piece that
we always have access to in all the places ir3_optimize_loop() is
called, and it's only enabled for the whole Vulkan device. Right now
it's just used for constraining vectorization, but the next commit adds
another use.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>
2021-04-15 16:05:11 +02:00
Connor Abbott 8f54028479 ir3: Reduce max const file indirect offset base to 9 bits
This fixes
dEQP-VK.robustness.robustness2.bind.notemplate.r32i.dontunroll.nonvolatile.uniform_buffer.no_fmt_qual.len_260.samples_1.1d.frag,
which accesses the shader UBO with c<a0.x + 512> due to the constant
data UBO coming before it in the const file. The len_256 variant has a
smaller constant data UBO, so it uses c<a0.x + 256> instead, and that
works, so 512 seems to be the real limit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>
2021-04-15 16:03:54 +02:00
Connor Abbott 8e11f0560e ir3: Fix list corruption in legalize_block()
We forgot to remove the instruction under consideration from instr_list
before inserting it into the block's list, which caused instr_list to
become corrupted. This happened to work but caused further corruption in
some rare scenarios.

Fixes: adf1659 ("freedreno/ir3: use standard list implementation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>
2021-04-15 16:03:54 +02:00
Eric Anholt 6d510fd473 ci/freedreno: Merge a630 piglit to a single job.
piglit_gl clocked in at 6:12 end-to-end runtime, and piglit_shader spent
2:53 in deqp-runner, so merging them together should be about 9 minutes.
Removing a boot should save us a minute or two of runner time per
pipeline.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10243>
2021-04-15 10:06:14 +00:00
Samuel Iglesias Gonsálvez 029bc53be6 turnip: fix typo in tu_CmdBeginRenderPass2()
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>
2021-04-15 09:51:25 +02:00
Samuel Iglesias Gonsálvez d52917f858 turnip/lrz: added support for depth bounds test enable
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>
2021-04-15 09:51:25 +02:00
Samuel Iglesias Gonsálvez 2161aebf8d turnip: document GRAS_LRZ_CNTL's UNK5 bitfield
It is used by the blob to enable depth bounds test for LRZ.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>
2021-04-15 09:51:25 +02:00
Samuel Iglesias Gonsálvez 54cf12774a turnip/lrz: add support for VK_EXT_extended_dynamic_state
When the depth or stencil state changes dynamically, that might affect
LRZ state and we need to recalculate it and emit it again.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>
2021-04-15 09:51:20 +02:00
Samuel Iglesias Gonsálvez 6d6cbb7361 turnip: refactor how LRZ state is calculated
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>
2021-04-15 09:50:51 +02:00
Samuel Iglesias Gonsálvez 43ebba4e88 turnip: initialize pipeline->rb_{stencil,depth}_cntl always
This change will simplify further changes on LRZ state management.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>
2021-04-15 09:50:51 +02:00
Samuel Iglesias Gonsálvez 1f9fb7677b turnip: move pipeline gras_su and rb{stencil,depth}_cntl_mask initialization
Move them up, so they are initialized even when the dynamic state is
not used.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>
2021-04-15 09:50:51 +02:00
Rob Clark 31782330da freedreno: Add missing foreach macros and update indentation
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10247>
2021-04-14 16:53:26 -07:00
Rob Clark 2fb3984805 freedreno: Add .clang-format
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8883>
2021-04-14 19:52:21 +00:00
Connor Abbott 2deead184c ir3/sched: Don't schedule too many tex/SFU instructions
Consider a simple loop that does a series of texture instructions and
then reduces the results:

vec4 sum = vec4(0);
for (int i = 0; i < N; i++) {
   sum += texture(...);
}

Assume that the loop is unrolled and we schedule the resulting basic
block. Right now, after we schedule the first texture instruction, the
only instructions available to schedule that don't incur a sync are the
instructions to setup the second texture instruction. So we keep picking
the texture instructions, no matter how large N is, resulting in a
pathological schedule for register pressure when N is very large:

sum1 = texture(...);
sum2 = texture(...);
sum3 = texture(...);
...
sum = sum1 + sum2 + sum3 + ...;

In particular this happens with some CTS tests for VK_EXT_robustness2,
where a loop like that with many iterations is marked as [[unroll]],
forcing NIR to unroll it.

This solution is a balance between the current approach and always
scheduling for register pressure (and ignoring sync's). We only allow a
certain number of texture fetches to be in flight before considering
textures to "sync", even though they don't really, both because they
likely *will* sync in reality (overflowing the internal queue of waiting
texture instructions) and because at some point we need the normal
algorithm to kick in and start lowering register pressure.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7571>
2021-04-14 17:33:58 +00:00
Connor Abbott 7821e5a3f8 ir3/sched: Don't penalize uses of already-waited tex/SFU
Once we insert a use of a given tex or SFU instruction, then we must
wait for that tex/SFU instruction (as well as all earlier ones) to
complete, so we shouldn't penalize further uses, even if a subsequent
tex/SFU instruction gets scheduled after the first use. This especially
matters after the next commit when we start forcibly breaking up long
sequences of texture instructions, since if we schedule a group of 8
texture instructions then we want to schedule the uses of those
instructions in parallel with the next 8 texture instructions to reduce
register pressure.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7571>
2021-04-14 17:33:58 +00:00
Michel Dänzer af0fde955c ci: Move docker images from Debian buster to bullseye
Among other things, this gets us GCC 10 (was 6).

Requires some changes to third party components we use:

* Install apitrace (& waffle) from Debian; was hitting issues with the
  local build, and it's the same version 9.0 anyway.
* Update Fossilize to a newer commit which builds with GCC 10.
* apt.llvm.org repositories are no longer needed.
* Use an SPIRV-LLVM-Translator commit which builds with LLVM 11.0.1.
* Install XCB packages from Debian, 1.13 fails to build with Python 3.9.
* Install wayland-protocols from Debian, 1.12 is too old for
  libgtk-3-dev in bullseye.

LLVM 7/8 packages are no longer available.

Also adapt expected test results to Xvfb now exposing multi-samle
GLXFBConfigs.

v2:
* Install clang instead of clang-11.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3124
Reviewed-by: Eric Anholt <eric@anholt.net> # v1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9833>
2021-04-14 13:05:08 +00:00
Connor Abbott 271c18f48e tu: Expose VK_KHR_relaxed_block_layout
This was absorbed into Vulkan 1.1, but we forgot to expose it
separately. It's a subset of what's allowed by
VK_EXT_scalar_block_layout.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8695>
2021-04-14 11:48:38 +00:00
Connor Abbott 765c3b85a5 tu: Expose VK_KHR_spirv_1_4 and VK_EXT_scalar_block_layout
VK_KHR_spirv_1_4 is trivial because vtn already supports all the added
SPIR-V features that aren't gated behind Vulkan extensions. I've
observed some robustness2 CTS tests requiring this. However there are
a few tests currently failing due to lacking spilling.

VK_EXT_scalar_block_layout should also be trivial, since support for
"straddling" UBO loads was added recently for other reasons. This is
used by every robustness2 CTS test.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8695>
2021-04-14 11:48:38 +00:00
Juan A. Suarez Romero 9e5762c387 ci: Update VK-GL-CTS to 1.2.6.0
v2:
 - Bump up MESA_ROOTFS_TAG instead of arm_build (Michel)

Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10136>
2021-04-14 08:06:55 +00:00
Marek Olšák fb29cef8dd nir: add many passes that lower and optimize 16-bit input/outputs and samplers
Added:
* a pass that renumbers bases of IO intrinsics
* a pass that converts mediump IO to 16 bits, optionally using the new
  packed varying slots
* a pass that sets (forces) mediump in IO intrinsics (for testing)
* a pass that remaps VARYING_SLOT_VAR[0..15]_16BIT to VARYING_SLOT_VAR[0..31]
  (if some shader stages don't want packed varyings)
* a pass that folds type conversions around texture opcodes into those
  opcodes (e.g. tex(f2f32(coord), ..) is changed into tex accepting f16)
* a pass that changes (legalizes) sampler src and dst types based on specified
  hw constraints (e.g. derivatives must be the same type as coordinates)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9050>
2021-04-13 05:07:42 +00:00
Rhys Perry a2619b97f5 nir/lower_idiv: add options to use fp32 for 8-bit division lowering
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>
2021-04-12 16:19:46 +00:00
Danylo Piliaiev 16fd5bd996 turnip: support copying both aspects of D32_SFLOAT_S8_UINT
We cannot copy both aspects at the same time, so copy them one by one.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10140>
2021-04-12 14:36:30 +00:00
Connor Abbott ba796d5115 ir3/postsched: Make sure to schedule inputs before kill
Before, we would prefer to schedule inputs before kills, which works
assuming that the live range of the bary_ij system value don't get
split and therefore all bary.f are ready at the start of the block.
However live range splitting can mess up that assumption and cause a
kill to get scheduled before a move that leads to a bary.f.

This fixes even e.g. dEQP-GLES2.functional.shaders.discard.basic_always
on a3xx before introducing CSE of collect instructions, but even after
that it could be a problem theoretically as the register allocator
doesn't guarantee that any live ranges aren't split.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10143>
2021-04-09 16:31:29 +00:00
Matt Turner 4251e9cddf ir3: Don't count (nopX) towards the wrong category
Prior to this commit

   (nop3) mad.f32 r0.y, c0.x, r1.w, c0.y

was counted as 4 cat3 instructions (and still 3 cat0/nops) in shader-db
results. With this change, it is counted as only 1 cat3 instruction.

Probably never going to have better shader-db results than this in my
career:

total cat2 in shared programs: 1214667 -> 732058 (-39.73%)
cat2 in affected programs: 1194729 -> 712120 (-40.39%)
helped: 8551
HURT: 0

total cat3 in shared programs: 376448 -> 274745 (-27.02%)
cat3 in affected programs: 344918 -> 243215 (-29.49%)
helped: 7222
HURT: 0

Reviewed-by: Rob Clark <robdclark@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10116>
2021-04-09 14:26:35 +00:00
Bas Nieuwenhuizen 4ca4de50f7 nir: Remove nir_shader->shared_size.
The same info is in shader_info. Dedupe.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10094>
2021-04-08 14:39:28 +00:00
Chad Versace 5e6db19168 anv: Remove vkCreateDmaBufINTEL (v4)
Superceded by VK_EXT_image_drm_format_modifier.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1466>
2021-04-08 14:15:55 +00:00
Chad Versace 0845cabc72 vulkan: Track dependencies of Python imports
The meson.build was unaware of transitive dependencies introduced by
Python imports.

Android still needs fixing. But I did not update the Android files lest
I break the build.

Ideally, we would fix this by using a Python runner that generates
a depfile, similar to how meson creates depfiles for C files by passing
flags -MD -MQ -MF to gcc. But this patch gets the job done, without
stalling on the ideal general solution, by manually tracking the Python
imports in new 'foo_depend_files' variables.

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1466>
2021-04-08 14:15:54 +00:00
Connor Abbott 5a70c4d4a0 ir3: Don't copy propagate arrays in ir3_cp
We don't check whether there's an intervening write in this pass, which
makes it incorrect. ir3_cp_postsched does check correctly, but we were
accidentally doing it here anyway for some sources.

While we're here, delete some code that was only used in the array case.

Fixes: f370e954 ("freedreno/ir3: handle const/immed/abs/neg in cp")
Reviewed-by: Rob Clark <robdclark@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10076>
2021-04-07 14:35:13 +00:00
Connor Abbott 1ad5ee5a04 ir3/cp_postsched: Set address of uses for relative mov's
Fixes: 680ca5b ("freedreno/ir3: add post-scheduler cp pass")
Reviewed-by: Rob Clark <robdclark@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10076>
2021-04-07 14:35:13 +00:00
Connor Abbott dcc26a3945 ir3: Fix valid flags for STIB
Disallow immediates for the source. This was hidden by the fact that we
didn't copy-propagate trivial collect instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10076>
2021-04-07 14:35:13 +00:00
Connor Abbott 94beaa1d92 ir3/legalize: Fix last input (ss) insertion
If there was a mix of ldlv and bary.f and we inserted an (ss) *after*
the last input which was a bary.f, then last_input_needs_ss would get
unset, even though it shouldn't. For figuring out whether we need the
(ss), we need to know whether there are any pending ldlv's when
last_input gets executed, not at the end of the block, which means that
the existing code's strategy of inserting it after the whole block has
been processed won't work. Rework it to do the last_input processing in
the main loop instead.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10076>
2021-04-07 14:35:13 +00:00
Connor Abbott 35ffe4fec1 freedreno/a3xx: Fix SP_FS_CTRL_REG1_INITIALOUTSTANDING
Unfortunately this didn't fix anything, but I thought I might as well
include it.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10076>
2021-04-07 14:35:13 +00:00
Danylo Piliaiev 519eb735a3 turnip: implement variableMultisampleRate
If subpass doesn't have depth/color attachments - samples count is
devised from VkPipelineMultisampleStateCreateInfo::rasterizationSamples.
Without variableMultisampleRate enabled all pipelines in such subpass
should have the same samples count; variableMultisampleRate allows
to have pipelines with different number of samples in one subpass,
given that it doesn't have depth/color attachments.

Blob doesn't have it enabled but there is no known reason for this.

Passes:
 dEQP-VK.pipeline.multisample.variable_rate.*

Fixes test:
 dEQP-VK.pipeline.framebuffer_attachment.no_attachments_ms

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9556>
2021-04-07 12:04:45 +00:00
Alejandro Piñeiro 1e0a69afa7 vulkan: track number of bindings instead of max binding for CreateDescriptorSetLayout
As that handles better, and more clear, the case of bindingCount being
zero. For the case of Anvil and Turnip, this avoids allocating a
non-needed binding when bindingCount is zero.

Inspired on radv, that was what it was doing so far.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4526

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9905>
2021-04-05 20:17:53 +00:00
Danylo Piliaiev 0709a6b363 turnip: fix alignment of non-32b types in workgroup memory
Fixes tests:
 dEQP-VK.spirv_assembly.instruction.compute.workgroup_memory.float16

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10039>
2021-04-05 17:31:11 +00:00
Alyssa Rosenzweig 06ebbde630 vulkan: Deduplicate mesa stage conversion
Across every driver...

v2: Add casts to appease -fpermissive used on CI.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9477>
2021-04-03 17:34:39 +00:00
Danylo Piliaiev 0ec495e3c9 turnip: handle format list for compressed formats
Compressed formats may have compatible formats, however they could
only be sampled, so we should not call tu6_format_color with them.

tu6_format_texture should have the same behaviour for checking swap
so use it for all cases.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10009>
2021-04-02 21:52:05 +00:00
Eric Anholt f67b6f9c47 ci/freedreno: Fix up the a5xx border color flake annotation.
Looks like I put it in the wrong file back when I first caught it.  It's a
one-or-twice-a-week back flake that seems to happen.  The upcoming
deqp-runner uprev would have caught this mistake.

Fixes: 957132294f ("ci/a5xx: Increase the gles3/31 coverage.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9806>
2021-04-02 18:42:04 +00:00
Eric Anholt adf04d1af4 ci/freedreno: Switch to the trimmed glxgears trace.
The old one had a ton of frames and took ~5 minutes on a306.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9957>
2021-04-01 21:04:11 +00:00
Eric Anholt fe5349f70c freedreno/a6xx: Fix alpha tests.
Apparently I inverted the sense of this flag back when we didn't have
piglit testing.  Fixes terrible rendering in minetest, HL2, CS:Source, and
CS.

Fixes: 0369dd9077 ("freedreno/a6xx: Add ARB_depth_clamp and separate clamp support.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9957>
2021-04-01 21:04:11 +00:00
Eric Anholt 3043940183 freedreno/a5xx: Fix alpha test vs early Z bugs.
Just like with discards, we have to disable early Z writes when alpha test
is enabled.

Fixes rendering on HL2, CS: Source, counter-strike, and minetest.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9957>
2021-04-01 21:04:11 +00:00
Eric Anholt c9fd8c2570 ci/freedreno: Add trace testing on a3xx, a5xx.
Having compared rendering between a6xx and these, I found several bugs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9957>
2021-04-01 21:04:11 +00:00
Eric Anholt 8e3a1d0dd2 ci/freedreno: Rename a306-test and a530-test to drop "arm64" from the name.
We don't have an armhf variant, and probably won't.  Now matches a630.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9957>
2021-04-01 21:04:11 +00:00
Eric Anholt ec54546b2a ci/freedreno: Add more new traces for a630 (minetest, TDM, pioneer, glyphy).
These are all recent traces that have been added.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9957>
2021-04-01 21:04:11 +00:00
Danylo Piliaiev ce1a381e57 turnip: enable VK_KHR_16bit_storage on A650
A650 can use the same SSBO descriptor for both 32-bit and 16-bit access,
which makes it easy to enable this extension.

Passes tests that run under:

dEQP-VK.spirv_assembly.instruction.*.16bit_storage.*

Rebased and modified commit from Jonathan Marek.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
2021-04-01 17:51:07 +00:00
Jonathan Marek 14acc64c3b turnip: enable VK_KHR_shader_float16_int8
ir3 supports 16-bit floats, so we can enable this.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
2021-04-01 17:51:07 +00:00
Danylo Piliaiev 64aaa4afc3 turnip: enable infinities for f16 math and document the register
When float16 is enabled this will allow to pass a number of
float16 tests.

When A6XX_SP_FLOAT_CNTL_F16_NO_INF is set - all operations which
generate +-infinity generate +-MAX_HALF_FLOAT.

Fixes some tests from:
 dEQP-VK.spirv_assembly.instruction.*.float16.*
 dEQP-VK.spirv_assembly.instruction.*.float_controls.fp16.*

E.g.:
 dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_1.sinh_vert
 dEQP-VK.spirv_assembly.instruction.compute.float16.arithmetic_4.length
 dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.log_denorm_flush_to_zero_nostorage
 dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.log2_denorm_flush_to_zero_nostorage
 dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.inv_sqrt_denorm_flush_to_zero_nostorage

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
2021-04-01 17:51:07 +00:00
Danylo Piliaiev 14460faa64 ir3: convert shift amount to 16b for 16b shifts
NIR has shifts defined as:
 opcode("*shr", 0, tuint, [0, 0], [tuint, tuint32], False, ...

However, in ir3 we have to ensure that both operators of shift
instruction have the same bitness.

Let's hope that in future the additional COV for constants would
be optimized away.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
2021-04-01 17:51:07 +00:00
Jonathan Marek 3777ecdf11 turnip: implement VK_KHR_shader_float_controls
This matches the blob and doesn't require actually implementing controls
since the supported modes are just what the HW does.

Passes tests under:

dEQP-VK.spirv_assembly.*.float_controls.*

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
2021-04-01 17:51:07 +00:00
Danylo Piliaiev de195671bd ir3: nir_op_f2f16 should round to even
cat1 instructions round to zero by default.

When fp16 is enabled this will fix:
 dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_conv_from_fp32_nostorage_frag
 dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_conv_from_fp32_nostorage_vert
 dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.rounding_rte_conv_from_fp32_nostorage

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
2021-04-01 17:51:07 +00:00
Michel Dänzer 6652c5018c ci: Merge ARM testing docker images to a single arm_test one
The merged image contains kernels & rootfs for both arm64 & armhf
baremetal test jobs, and is smaller than either arm{64,hf}_test image
before.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9955>
2021-04-01 16:35:26 +00:00
Michel Dänzer 4b20bd7425 ci: Build ARM baremetal rootfs in native container
Doing so in an x86 container via qemu was slow, and started failing
recently after updating to a newer qemu version.

This also results in smaller arm*_test* docker images, since we need to
install fewer Debian packages in them.

As a bonus, this turns some piglit tests from fail to pass (Or maybe
they'll turn out to be flakes? They've passed at least 3 times in a
row).

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9955>
2021-04-01 16:35:26 +00:00
Eric Anholt 0be9a40225 ci/freedreno: Demote a630-asan to a manual test for now.
It's flaky in producing Missing results. I've got an uprev that should
avoid the issue (and possibly a followon actual fix), but it's blocked on
being able to rebuild the arm containers.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9932>
2021-03-31 17:15:27 +00:00
Danylo Piliaiev 00d6ccebf9 ir3/isa: account for randomly set by blob lowest bit of ibo atomics
As far as I could see - blob randomly sets the lowest bit of atomic.b.*
instructions.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9841>
2021-03-31 15:03:35 +00:00
Bas Nieuwenhuizen 83c92a48b7 vulkan: Fix descriptor set creation with zero bindings.
MAX2(count * struct size, 1) results in 1 for count=0, not the size of a struct.

Since this MAX only seems to exist so we can keep using NULL for error reporting,
just refactor to return a VkResult.

Fixes: ad241b15a9 ("vk: consolidate dynamic descriptor binding sorting")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4522
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9880>
2021-03-29 23:32:50 +00:00
Matt Turner 0b35987895 tu: Skip tu_tiling_config_update_tile_layout() if not using gmem
Otherwise pass->tile_align_w will be 0, leading to a divide by zero and
undefined behavior. In practice, I saw this lead to an infinite loop in
tests like

dEQP-VK.draw.instanced.draw_indexed_indirect_vk_primitive_topology_line_list_attrib_divisor_0_multiview

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9606>
2021-03-29 21:58:24 +00:00
Eric Anholt 99838513ae freedreno/a5xx: Add support for clip distances and use them for userclip.
A little low-stakes RE effort as I unwind from fighting CI all day.  Comes
from diffing dEQP-VK.clipping.user_defined.clip_distance.vert.* on the
blob and comparing to a6xx behavior.  (My blob doesn't do tess, so if
there are equivalent tess fields for some of these, I didn't find them)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9870>
2021-03-29 21:24:16 +00:00
Vinson Lee a5d5cbdf08 freedreno: Fix file descriptor leak.
Fix defect reported by Coverity Scan.

Resource leak (RESOURCE_LEAK)
leaked_handle: Handle variable fd going out of scope leaks the handle.

Fixes: 5a13507164 ("freedreno/perfcntrs: add fdperf")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9889>
2021-03-29 17:08:56 +00:00
Danylo Piliaiev 2087168a30 turnip,ir3: account for dispatch group offsets
Fixes tests:
 dEQP-VK.compute.device_group.dispatch_base

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9339>
2021-03-29 14:31:44 +03:00
Eric Anholt 1e8792ea5f freedreno/a6xx: Use the frontend userclip lowering.
This ends up being way more piglit-conformant than our backend lowering.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9815>
2021-03-26 20:51:18 +00:00
Danylo Piliaiev 243724031b turnip: clamp to zero negative upper left corner of viewport
We cannot send negative viewport coordinates to the hardware,
so clamp them since negative min.x/y is valid per spec.
The negative origin still counts in calculations of guardband.

Fixes crash in 3DMark's "Sling Shot Extreme" test.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9629>
2021-03-25 17:57:59 +00:00
Danylo Piliaiev 56909868cd turnip: implement VK_KHR_pipeline_executable_properties
Loosely based on ANV implementation.

For executable's internal representation we output:
- Initial NIR after spirv_to_nir
- Final optimized NIR
- IR3 disassembly

Note, that vkGetPipelineExecutablePropertiesKHR is required to
return executable properties even if pipeline was not created with
CAPTURE_STATISTICS or CAPTURE_INTERNAL_REPRESENTATIONS bits set.
So the executables array is unconditionally populated, however
NIR and IR3 disassemlies are filled only when
CAPTURE_INTERNAL_REPRESENTATIONS is set.

Passes dEQP-VK.pipeline.executable_properties.*
Works with RenderDoc.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8877>
2021-03-25 13:53:33 +00:00
Tomeu Vizoso f64ef064de ci/fdo: Use trimmed traces for Valve games
Now that we have trimmed versions of them, we can run them in a more
reasonable amount of time.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9771>
2021-03-24 20:54:58 +00:00
Eric Anholt 6eee6769e9 turnip: Fix KGSL build since common dispatch rework.
Fixes: 59d70c47c7 ("turnip: Use the common dispatch framework")
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9764>
2021-03-24 17:25:07 +00:00
Samuel Pitoiset 2c2ea54020 turnip: use common entrypoints for VK_KHR_create_renderpass2
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9601>
2021-03-24 11:21:53 +00:00
Rob Clark e7202e889b freedreno: Split out devicetree helpers
The freedreno pps datasource is going to need the same, so split out
helpers that can be re-used.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9758>
2021-03-22 20:46:17 +00:00
Rob Clark 9479ae9761 freedreno/fdperf: Use os_read_file()
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9758>
2021-03-22 20:46:17 +00:00
Rob Clark 5871f4177c freedreno: Make headers C++ happy
We'll need a few of these for the C++ based gfx-pps performance counter
collector datasource.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9758>
2021-03-22 20:46:17 +00:00
Eric Anholt 431b0ef9ee freedreno/a6xx: Rename the RB_BLIT_INFO.INTEGER field to SAMPLE_0.
As @samuelig found, this is the field for disabling sample averaging and
using sample 0 instead.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9661>
2021-03-22 19:07:08 +00:00
Danylo Piliaiev a5b37c64d1 turnip: expose several already implemented extensions
They were promoted to Vulkan 1.1 and we already support them.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9757>
2021-03-22 18:20:57 +00:00
Connor Abbott d8a2abe348 freedreno/computerator: Add script for finding reg file size
This helps with finding the various parameters introduced in the last
commit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9498>
2021-03-22 18:03:16 +00:00
Connor Abbott d274649799 freedreno/computerator: Use threadsize calculated by ir3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9498>
2021-03-22 18:03:16 +00:00
Connor Abbott 7ecc70b31c turnip: Use threadsize calculated by ir3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9498>
2021-03-22 18:03:16 +00:00
Connor Abbott fd7960e191 ir3: Calcuate max_waves and threadsize
max_waves is just for shader-db stats for now, but threadsize will
replace the various mechanisms used to determine threadsize across the
different gen's. Calculating these correctly entails adding a bunch of
details about the sizes of various things to ir3. In the future we will
use the guts of the max_waves calculation to inform RA decisions as
well, which is why the max_waves calculation is broken up into register
dependent/independent pieces.

Something should be said about the units of reg_size_vec4. These units
were chosen for two reasons:

1. As said in the comment, it makes some calculations easier.
2. For a4xx/a5xx, where we don't know as much because we haven't done
   the same sorts of experiments to probe for the HW configuration, it
   corresponds more directly to things that are known. The existing code
   switches to the smaller threadsize when r24.x or higher is used,
   which translates directly to a reg_size_vec4 of 48. If we chose
   different units (e.g. multiplying by wave_granularity and/or
   threadsize_base), then to match the same behavior we'd have to set
   reg_size_vec4 based on some other parameters that aren't 100% known.
   If someone comes along and updates them, they might inadvertantly
   break it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9498>
2021-03-22 18:03:16 +00:00
Connor Abbott cbc68c79a5 freedreno: Add local_size to ir3_shader_variant
We want to use the local_size when available to calculate the threadsize
in ir3, and we need it to work with e.g. computerator where we don't
have a nir shader. Add a local_size field and use that in computerator
instead of of a separate structure that's inaccessable to core ir3.

Also set a dummy local_size in the tests to avoid a divide-by-zero.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9498>
2021-03-22 18:03:16 +00:00
Mike Blumenkrantz ad241b15a9 vk: consolidate dynamic descriptor binding sorting
this code was duplicated across several drivers

Reviewed-by: Adam Jackson <ajax@redhat.com>
turnip changes Reviewed-by: Hyunjun Ko <zzoon@igalia.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9480>
2021-03-22 16:51:55 +00:00
Danylo Piliaiev 208250b376 ir3: update info about applicability of saturation modifier
On a6xx saturation doesn't work on cat4 and on bary.f

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9751>
2021-03-22 15:02:14 +00:00
Rob Clark 9aef029635 freedreno/ir3: Precompute whether we need driver-params
To save a bit of extra math in the draw-path.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9581>
2021-03-20 12:13:09 -07:00
Rob Clark b5e1e99da1 freedreno/drm: Inline iova calculation
The shift/or are frequently zero, so this lets the compiler optimize out
some draw-overhead hotpath.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9581>
2021-03-20 12:13:09 -07:00
Rob Clark 93d5349fa5 freedreno/drm: Move emit_reloc_tail to head
Get this out of the way first to avoid some register push/pop.  Only
reloc->bo is needed after writing the address into cmdstream, so this
turns msm_submit_append_bo() into a tail call.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9581>
2021-03-20 12:13:09 -07:00
Rob Clark 684586b96e freedreno/drm: Split 64b vs 32b paths
No need to 'if (gpu_id >= 500)' on every reloc

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9581>
2021-03-20 12:13:09 -07:00
Rob Clark 9168d9cbfb freedreno/drm: Split softpin "reloc" functions
"OBJECT" rb's are long lived, and generating them is not a hotpath, but
relocs to "STREAMING" rb's are a hot path.  But we can decouple these.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9581>
2021-03-20 12:13:08 -07:00
Eric Anholt 4eb7c4d60c ci/freedreno: Mark all of dEQP TF as flaky.
I keep working on stabilizing it, but no luck yet.  Stop blocking CI on
our flakes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9715>
2021-03-19 22:07:57 +00:00
Danylo Piliaiev 9efec45b0c ir3: disallow .sat on SEL instructions
Saturation is unsupported on SEL instructions.

Fixes main menu rendering in Genshin Impact.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9666>
2021-03-19 17:09:07 +00:00
Eric Anholt 5da520cf3d freedreno/ir3: Demote centroid usage to pixel on non-msaa.
Like with the sample qualifier on all GPUs, use pixel on older HW when
MSAA rasterization is disabled to get reliable results.  Since I ran many
CI jobs on this, this updates the A530 TF flakes list, though I don't
think that this MR necessarily made it flakier (we were already struggling
on a5xx TF, which was what was motivating me to look at this!)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9641>
2021-03-18 10:46:09 -07:00
Danylo Piliaiev b804abd61d freedreno/isa: assert if field's range is out of bitset's range
Also, update outdated comment along the way.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9628>
2021-03-17 12:07:54 +00:00
Danylo Piliaiev 42c81e1901 ir3: match mova1 mnemonic when writing to A1
For MOV to A1 blob uses "mova1" mnemonic, which is mov.u16u16;
change s16 to u16 when creating MOV to A1 in order to match the blob.

Before, couldn't be parsed back:
 mov.s16s16 ha0.y, 0

After, could be parsed back and matches blob behaviour:
 mova1 a1.x, 0

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9628>
2021-03-17 12:07:54 +00:00
Danylo Piliaiev c0a62b203e ir3/isa,parser: fix encoding and parsing of bindless s2en SAM
Before, decoding showed that there is an error:
 sam.base0 (f32)(xyzw)r0.x, r0.z, a1.x   ; no field 'HAS_SAMP',
  WARNING: unexpected bits[0:7] in #cat5-samp-s2en-bindless-a1: 0x1 vs 0x0

After:
 sam.base0 (f32)(xyzw)r0.x, r0.z, s#1, a1.x

Fixes textures on the ground in TauCeti Vulkan Technology Benchmark

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9628>
2021-03-17 12:07:54 +00:00
Samuel Iglesias Gonsálvez 0acd7df67b turnip: set depth plane control zmode to A6XX_LATE_Z when sample mask is written
Otherwise, gl_SampleMask[] writes are ignored and the stencil test works like
if all samples were enabled.

Fixes: dEQP-VK.renderpass.suballocation.multisample.*s8*

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9478>
2021-03-17 09:05:33 +00:00
Iago Toral Quiroga 1e4abf1fe3 vulkan/util: call glsl_type_singleton_init_or_ref from vk_instance_init
v2: link libvulkan_util with libglsl so it can find the glsl singleton symbols.
v3: link with libcompiler instead of libglsl (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> for the v3dv bits.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> for the turnip bits.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> for the radv bits.
Acked-by: Dave Airlie <airlied@redhat.com> for the lvp bits.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9457>
2021-03-17 08:15:36 +01:00
Hyunjun Ko d9fcf5de55 turnip: Enable nonuniform descriptor indexing
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9125>
2021-03-17 01:09:30 +00:00
Hyunjun Ko e9fd2a2a58 ir3: Add nonuniform encodings to ir3 encoder and parser
By keeping track of nonuniform access from nir and storing it to ir3.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9125>
2021-03-17 01:09:30 +00:00
Hyunjun Ko 433cdd1cff ir3: fix has_src() to return correctly in ir3_nir_lower_tex_prefetch
This seems to be originally introduced from 2a0d45ae6c, and 562aaea07c
misused the method.

Fixes: 2a0d45ae6c "freedreno/ir3: Add a NIR pass to select tex instructions eligible for pre-fetch"
Fixes: 562aaea07c "freedreno/ir3: respect tex prefetch limits"

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9125>
2021-03-17 01:09:30 +00:00
Hyunjun Ko e0e55b181f turnip: Return correct value of tu6_load_state_size
The state of active_desc_sets in pipeline should be set before
allocation of the pipeline so we get correct size of descriptor
sets and reserve enough space upfront.

Otherwise we might hit assert(pipeline->cs.bo_count == 1).

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9125>
2021-03-17 01:09:30 +00:00
Danylo Piliaiev e767208069 ir3: fix oob access to regs array for getbuf,getinfo,rgetinfo
Since they have zero source registers, src->regs[1] is out of bounds.
It probably wasn't able to cause any harm, but it's always better
be safe.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4209

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9609>
2021-03-16 22:36:12 +00:00
Eric Anholt f3a7a8a4dc ci/freedreno: Switch the piglit testing to the new piglit runner.
Getting piglit to fit onto our test devices was proving difficult, and we
need the ability to handle flakes, so switch to the rust piglit runner
that @pepp wrote as part of the deqp-runner repo which gives us flake
detection, sharding across boards, fractional runs, and almost half the
runtime.

It doesn't handle piglit subtests yet, but if you can't run piglit's
python on your devices because it's too bloated and unstable, this is a
way forward.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9468>
2021-03-16 22:19:30 +00:00
Eric Anholt 739486de2f freedreno/a5xx: Fix the max texture buffer size.
The GLES minmax is 65536.  The blob vulkan exposes 65536 on both a5xx and
a6xx, but try just doing the same as we do for a6xx.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9617>
2021-03-16 16:15:48 +00:00
Eric Anholt b93d21810a freedreno/a5xx: Fix the texel buffer alignment requirement.
Info comes from the a540 vulkan blob driver minTexelBufferOffsetAlignment.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9617>
2021-03-16 16:15:48 +00:00
Danylo Piliaiev b8ca39a80d turnip: implement intrinsic_vulkan_resource_reindex
Descriptor arrays are continuous, so it's just an addition of offset.

Fixes test:
 dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.dynamic_offset.select_descriptor_array

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9495>
2021-03-15 23:56:26 +00:00
Eric Anholt 3dc8102420 ci/freedreno: Add three more a5xx flakes from the last day.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9575>
2021-03-15 22:45:13 +00:00
Mike Blumenkrantz 71b17149e8 tu: use common interfaces for shader modules
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9508>
2021-03-15 21:47:44 +00:00
Danylo Piliaiev 914e7a7f73 turnip: set zmode to A6XX_EARLY_Z if FS forces early fragment test
Specifying "early_fragment_tests" in fragment shader takes precedence
over our internal conditions.

Fixes test:
 dEQP-VK.fragment_operations.early_fragment.early_fragment_tests_stencil

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9569>
2021-03-12 20:11:28 +00:00
Danylo Piliaiev 1a2f1e3f47 turnip: fill VkMemoryDedicatedRequirements
We support VK_KHR_dedicated_allocation so we must fill
VkMemoryDedicatedRequirements.

Vulkan spec states:

 "[...] requiresDedicatedAllocation may be VK_TRUE under one of the
 following conditions:

 The pNext chain of VkImageCreateInfo for the call to vkCreateImage used
 to create the image being queried included a VkExternalMemoryImageCreateInfo
 structure, and any of the handle types specified in
 VkExternalMemoryImageCreateInfo::handleTypes requires dedicated allocation,
 as reported by vkGetPhysicalDeviceImageFormatProperties2 in
 VkExternalImageFormatProperties::externalMemoryProperties.externalMemoryFeatures,
 the requiresDedicatedAllocation field will be set to VK_TRUE."

All handle types require dedicated allocation at the moment.

Fixes:
 dEQP-VK.api.external.memory.opaque_fd.dedicated.image.info
 dEQP-VK.memory.requirements.dedicated_allocation.buffer.regular
 dEQP-VK.memory.requirements.dedicated_allocation.image.transient_tiling_optimal

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9086>
2021-03-12 11:56:47 +02:00
Danylo Piliaiev ae3b95daa7 turnip: lower device index to zero
Vulkan 1.1 has VK_KHR_device_group and VK_KHR_device_group_creation
promoted to core, thus we should handle DeviceIndex built-in.

While we are here, also add these extensions to the extensions list,
even though they are not doing anything useful.

Fixes test:
 dEQP-VK.compute.device_group.device_index

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9516>
2021-03-11 21:12:52 +00:00
Connor Abbott ee1f140fd9 freedreno/a6xx: Cleanup SP_XS_CTRL_REG0 definitions
The registers were actually different per-stage even though we used the
same type, which resulted in a bunch of incorrectly programmed fields
and confusion. Move the stage-specific values to the registers
themselves, which makes things much less confusing and makes it possible
to set "mergedregs" correctly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9493>
2021-03-11 20:58:39 +00:00
Connor Abbott 9a5596d679 freedreno/registers: Handle typed registers with fields
When a bitset is "inline" it should act as-if the its fields were
inserted into the register itself. However when initializing the
register's bitfield we weren't doing a deep copy of the inline bitfield,
so if the register defined additional fields then they would get added
to the original inline bitfield and any further registers with the same
type would get them. Fix this.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9493>
2021-03-11 20:58:39 +00:00
Connor Abbott 1d8bf2d0bf freedreno/computerator: Fix thrsz type
And use it for the other thread size field, too

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9493>
2021-03-11 20:58:39 +00:00
Yannik Marek 369f9d225d turnip: fix alpha to coverage in no color and unused attachment cases
In cases where the alpha coverage is enabled but the color attachment is
either unused or absent there should be a dummy mrt to make the draw behave
correctly.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Yannik Marek <yannik@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8952>
2021-03-10 22:02:43 +00:00
Matt Turner 6ceb6b509e turnip: Remove unused TU_DEBUG_IR3 flag
Replaced by IR3_SHADER_DEBUG=disasm,{vs,...,cs} and unused since the
commit referenced below.

Fixes: 808992fc50 ("tu: Use the ir3 shader API")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8249>
2021-03-10 18:59:22 +00:00
Eric Anholt eba1b2a1ba ci/freedreno: Mark another a5xx TF flake.
Showed up with an iommu fault preceding it each time it failed.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9488>
2021-03-10 18:44:16 +00:00
Jason Ekstrand 4fb6c051c9 anv: Move vk_format helpers to common code
The Android ones we put in anv_android.c.  Maybe one day we'll want a
vk_android.h to put some common Android stuff but, for now, let's keep
it contained to ANV's android code.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>
2021-03-10 18:17:31 +00:00
Jason Ekstrand 2523c47720 turnip: Move the CreateRenderPass wrapper to common code
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>
2021-03-10 18:17:31 +00:00
Danylo Piliaiev 2764cf8d32 ir3: use OPC_GETBUF to get size of sampler buffers
The maximum value which OPC_GETSIZE could return for one dimension
is 0x007ff0, however sampler buffer could be much bigger.
Blob uses OPC_GETBUF for them.

Fixes tests:
 dEQP-VK.memory.pipeline_barrier.transfer_dst_uniform_texel_buffer.1048576

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9391>
2021-03-10 17:10:45 +00:00
Danylo Piliaiev 8e6ed9948e freedreno/a5xx: port handling of PIPE_BUFFER textures from a6xx
Otherwise, we won't be able to use OPC_GETBUF to get their size.

After this change we also could get rid of the hack for OPC_GETSIZE
which scaled the size for texture buffers.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9391>
2021-03-10 17:10:44 +00:00
Danylo Piliaiev d968995c67 turnip: fix SP_HS_WAVE_INPUT_SIZE value
It appears that storage for varyings in a wave has an upper
limit of wavesize * max_a831 where max_a831 is 64.
Exceeding the limit seam to force gpu to reduce primitives
processed per wave, at least calculations make sense with
such interpretation.

With blob SP_HS_WAVE_INPUT_SIZE never exceeds 64 and setting
it to 65 in freedreno leads to a hang.

Copied from the commit to freedreno e5499ca2

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8187>
2021-03-10 16:50:11 +00:00
Connor Abbott 7b7532b806 freedreno/computerator: Add branching example
Mainly to be able to test label resolution without having to replace a
shader.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9463>
2021-03-10 16:23:04 +00:00
Connor Abbott 19c7b6f9d6 ir3/parser: Add ability to specify branchstack
This lets you test branching with computerator.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9463>
2021-03-10 16:23:04 +00:00
Connor Abbott a820eb537c ir3/parser: Support labels
This fixes the assembly for many scenarios where you want to use shader
replacement.

Note: unfortunately this leaks the identifier string created while
lexing, but I couldn't find a way to avoid leaking it except for
bringing in ralloc or something (which would be way more complicated).
The only other place doing something similar in mesa is the glsl parser,
which is using ralloc (actually a linear context).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9463>
2021-03-10 16:23:04 +00:00
Connor Abbott 534658f79b freedreno/computerator: Fix example assembly
Use the new bindless cat6 syntax for a6xx.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9463>
2021-03-10 16:23:04 +00:00