Commit Graph

150783 Commits

Author SHA1 Message Date
Samuel Pitoiset af2951dde8 radv/ci: update list of expected failures
Add dEQP-VK.glsl.builtin.precision_double.determinant.compute.mat3
which fails on all generations.

It looks like CTS should relax tolerance slightly.

Co-authored-by: Charlie Turner <cturner@igalia.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15234>
2022-03-04 18:43:18 +01:00
Samuel Pitoiset 51c6fdf708 radv/ci: skip dEQP-VK.renderpass2.depth_stencil_resolve.*_samplemask
They randomly hang on Navi10 and randomly fail on Sienna Cichlid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15234>
2022-03-04 18:43:16 +01:00
Juan A. Suarez Romero 7ffee7f1ab v3d: rebind sampler view if resource changed the BO
When discarding the whole resource to create a new one, if this resource
is used by a sampler view, a rebind must be done to use the new
resource.

But this must be done when setting the sampler views, because we don't
have access to those samplers before.

v2:
 - Pack shader state on setting sampler views (Iago)
 - Use a serial ID to know when to rebind sampler views (Juan)

v3:
 - Move check to caller (Iago)
 - Keep rebind sampler view on BO change (Iago)

v4:
 - Rename "serial_bo" to "serial_id" (Iago)
 - Add comments (Iago)

Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6027
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15171>
2022-03-04 17:20:28 +00:00
Alyssa Rosenzweig 7bda838c56 panfrost: Push twice as many uniforms
The limit for Bifrost is twice as high as previously thought -- the limit is 64
*slots* of FAU, not 64 words. Each slot is 2 words. We can push twice as much,
saving a considerable number of cycles in some cases.

total instructions in shared programs: 2454260 -> 2431502 (-0.93%)
instructions in affected programs: 845176 -> 822418 (-2.69%)
helped: 3376
HURT: 304
helped stats (abs) min: 1.0 max: 60.0 x̄: 7.92 x̃: 6
helped stats (rel) min: 0.13% max: 45.45% x̄: 4.60% x̃: 4.11%
HURT stats (abs)   min: 1.0 max: 60.0 x̄: 13.06 x̃: 8
HURT stats (rel)   min: 0.16% max: 35.09% x̄: 7.58% x̃: 6.52%
95% mean confidence interval for instructions value: -6.50 -5.87
95% mean confidence interval for instructions %-change: -3.75% -3.43%
Instructions are helped.

total tuples in shared programs: 1963383 -> 1951560 (-0.60%)
tuples in affected programs: 638622 -> 626799 (-1.85%)
helped: 2959
HURT: 573
helped stats (abs) min: 1.0 max: 54.0 x̄: 5.61 x̃: 4
helped stats (rel) min: 0.15% max: 28.57% x̄: 3.61% x̃: 3.12%
HURT stats (abs)   min: 1.0 max: 50.0 x̄: 8.35 x̃: 6
HURT stats (rel)   min: 0.25% max: 27.34% x̄: 6.24% x̃: 4.92%
95% mean confidence interval for tuples value: -3.61 -3.08
95% mean confidence interval for tuples %-change: -2.18% -1.85%
Tuples are helped.

total clauses in shared programs: 387817 -> 365111 (-5.85%)
clauses in affected programs: 135527 -> 112821 (-16.75%)
helped: 3489
HURT: 25
helped stats (abs) min: 1.0 max: 43.0 x̄: 6.52 x̃: 5
helped stats (rel) min: 0.82% max: 58.33% x̄: 17.48% x̃: 15.87%
HURT stats (abs)   min: 1.0 max: 3.0 x̄: 1.56 x̃: 1
HURT stats (rel)   min: 2.94% max: 11.11% x̄: 6.87% x̃: 6.67%
95% mean confidence interval for clauses value: -6.67 -6.26
95% mean confidence interval for clauses %-change: -17.65% -16.96%
Clauses are helped.

total cycles in shared programs: 201842.21 -> 168754.04 (-16.39%)
cycles in affected programs: 84035.50 -> 50947.33 (-39.37%)
helped: 3547
HURT: 136
helped stats (abs) min: 0.041665999999999315 max: 54.0 x̄: 9.33 x̃: 8
helped stats (rel) min: 0.17% max: 80.77% x̄: 36.10% x̃: 36.84%
HURT stats (abs)   min: 0.041665999999999315 max: 1.0 x̄: 0.12 x̃: 0
HURT stats (rel)   min: 0.18% max: 12.24% x̄: 1.18% x̃: 0.61%
95% mean confidence interval for cycles value: -9.26 -8.71
95% mean confidence interval for cycles %-change: -35.34% -34.11%
Cycles are helped.

total arith in shared programs: 74918.46 -> 75022.62 (0.14%)
arith in affected programs: 22471.04 -> 22575.21 (0.46%)
helped: 1571
HURT: 1492
helped stats (abs) min: 0.041665999999999315 max: 1.125 x̄: 0.17 x̃: 0
helped stats (rel) min: 0.17% max: 40.00% x̄: 2.50% x̃: 1.96%
HURT stats (abs)   min: 0.041665999999999315 max: 2.375 x̄: 0.25 x̃: 0
HURT stats (rel)   min: 0.16% max: 100.00% x̄: 5.35% x̃: 2.37%
95% mean confidence interval for arith value: 0.02 0.05
95% mean confidence interval for arith %-change: 1.08% 1.56%
Arith are HURT.

total ldst in shared programs: 174812 -> 137889 (-21.12%)
ldst in affected programs: 81319 -> 44396 (-45.41%)
helped: 3722
HURT: 0
helped stats (abs) min: 1.0 max: 62.0 x̄: 9.92 x̃: 8
helped stats (rel) min: 1.82% max: 100.00% x̄: 47.18% x̃: 43.75%
95% mean confidence interval for ldst value: -10.20 -9.64
95% mean confidence interval for ldst %-change: -47.97% -46.39%
Ldst are helped.

total quadwords in shared programs: 1757124 -> 1714130 (-2.45%)
quadwords in affected programs: 584065 -> 541071 (-7.36%)
helped: 3474
HURT: 173
helped stats (abs) min: 1.0 max: 90.0 x̄: 12.66 x̃: 9
helped stats (rel) min: 0.26% max: 34.18% x̄: 8.78% x̃: 8.33%
HURT stats (abs)   min: 1.0 max: 26.0 x̄: 5.76 x̃: 4
HURT stats (rel)   min: 0.45% max: 20.66% x̄: 4.48% x̃: 2.63%
95% mean confidence interval for quadwords value: -12.21 -11.37
95% mean confidence interval for quadwords %-change: -8.36% -7.95%
Quadwords are helped.

total threads in shared programs: 52898 -> 53142 (0.46%)
threads in affected programs: 262 -> 506 (93.13%)
helped: 250
HURT: 6
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.92 0.99
95% mean confidence interval for threads %-change: 93.69% 99.28%
Threads are helped.

total spills in shared programs: 161 -> 107 (-33.54%)
spills in affected programs: 54 -> 0
helped: 27
HURT: 0

total fills in shared programs: 1386 -> 796 (-42.57%)
fills in affected programs: 590 -> 0
helped: 27
HURT: 0

Fixes: d4dccea0ba ("panfrost: Add UBO push data structure")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15239>
2022-03-04 15:22:04 +00:00
Alyssa Rosenzweig e7cfe18099 pan/bi: Run CSE after lowering FAU
Lowering FAU can add moves from uniforms. If a uniform is moved out to a
register mulitple times in a basic block, these moves can be CSE'd, saving
instructions at the cost of register pressure.

854 shaders in my shader-db are helped on cycle count (average 2.94% reduction
in cycles). Only 9 shaders have hurt thread count, and there is no change in
spills or fills. Overall, this seems to be a win.

Prevents instruction count regressions from the next commit.

total instructions in shared programs: 2454423 -> 2444690 (-0.40%)
instructions in affected programs: 386274 -> 376541 (-2.52%)
helped: 2105
HURT: 0
helped stats (abs) min: 1.0 max: 116.0 x̄: 4.62 x̃: 2
helped stats (rel) min: 0.04% max: 27.27% x̄: 3.64% x̃: 1.92%
95% mean confidence interval for instructions value: -4.91 -4.33
95% mean confidence interval for instructions %-change: -3.83% -3.45%
Instructions are helped.

total tuples in shared programs: 1963534 -> 1957106 (-0.33%)
tuples in affected programs: 233562 -> 227134 (-2.75%)
helped: 1491
HURT: 117
helped stats (abs) min: 1.0 max: 63.0 x̄: 4.44 x̃: 2
helped stats (rel) min: 0.04% max: 24.53% x̄: 4.39% x̃: 2.59%
HURT stats (abs)   min: 1.0 max: 5.0 x̄: 1.61 x̃: 1
HURT stats (rel)   min: 0.18% max: 8.33% x̄: 1.44% x̃: 1.05%
95% mean confidence interval for tuples value: -4.28 -3.71
95% mean confidence interval for tuples %-change: -4.20% -3.73%
Tuples are helped.

total clauses in shared programs: 387848 -> 387079 (-0.20%)
clauses in affected programs: 13718 -> 12949 (-5.61%)
helped: 583
HURT: 60
helped stats (abs) min: 1.0 max: 16.0 x̄: 1.42 x̃: 1
helped stats (rel) min: 1.11% max: 25.00% x̄: 8.28% x̃: 6.67%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.86% max: 20.00% x̄: 4.58% x̃: 4.00%
95% mean confidence interval for clauses value: -1.29 -1.10
95% mean confidence interval for clauses %-change: -7.57% -6.58%
Clauses are helped.

total cycles in shared programs: 201866.21 -> 201682.92 (-0.09%)
cycles in affected programs: 6241.79 -> 6058.50 (-2.94%)
helped: 952
HURT: 98
helped stats (abs) min: 0.04166399999999726 max: 2.625 x̄: 0.20 x̃: 0
helped stats (rel) min: 0.12% max: 26.00% x̄: 4.05% x̃: 2.38%
HURT stats (abs)   min: 0.041665999999999315 max: 0.16666700000000034 x̄: 0.07 x̃: 0
HURT stats (rel)   min: 0.18% max: 8.70% x̄: 1.60% x̃: 1.43%
95% mean confidence interval for cycles value: -0.19 -0.16
95% mean confidence interval for cycles %-change: -3.80% -3.24%
Cycles are helped.

total arith in shared programs: 74924.00 -> 74660.12 (-0.35%)
arith in affected programs: 9303.67 -> 9039.79 (-2.84%)
helped: 1513
HURT: 118
helped stats (abs) min: 0.04166399999999726 max: 2.625 x̄: 0.18 x̃: 0
helped stats (rel) min: 0.07% max: 33.33% x̄: 4.68% x̃: 2.67%
HURT stats (abs)   min: 0.041665999999999315 max: 0.16666800000000137 x̄: 0.07 x̃: 0
HURT stats (rel)   min: 0.18% max: 8.70% x̄: 1.55% x̃: 1.37%
95% mean confidence interval for arith value: -0.17 -0.15
95% mean confidence interval for arith %-change: -4.48% -3.98%
Arith are helped.

total quadwords in shared programs: 1757254 -> 1751978 (-0.30%)
quadwords in affected programs: 197399 -> 192123 (-2.67%)
helped: 1464
HURT: 110
helped stats (abs) min: 1.0 max: 51.0 x̄: 3.73 x̃: 2
helped stats (rel) min: 0.04% max: 21.95% x̄: 4.16% x̃: 2.52%
HURT stats (abs)   min: 1.0 max: 7.0 x̄: 1.71 x̃: 1
HURT stats (rel)   min: 0.21% max: 13.04% x̄: 1.65% x̃: 0.93%
95% mean confidence interval for quadwords value: -3.58 -3.13
95% mean confidence interval for quadwords %-change: -3.97% -3.53%
Quadwords are helped.

total threads in shared programs: 52899 -> 52890 (-0.02%)
threads in affected programs: 18 -> 9 (-50.00%)
helped: 0
HURT: 9
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: -1.00 -1.00
95% mean confidence interval for threads %-change: -50.00% -50.00%
Threads are HURT.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15239>
2022-03-04 15:22:04 +00:00
Henry Goffin c8f644ec44 frontends/va: ignore incoming frame_num from VA picture parameters
The Gallium pipe video "frame_num" variable is internally used as a
counter of elapsed reference frames since the last IDR. The incoming
frame_num field from VA picture parameters is not equivalent; the VA
value may wrap to zero prematurely, as it is a 16-bit struct field with
a documented max value of 2^(log2_max_frame_num_minus4 + 4)-1.

This change improves "infinite GOP" single-client live streaming, where
it is reasonable for the server to desire an endless series of P-frames
without IDR. Without this change, it is difficult/impossible for an
application to encode a P- or B-frame after the VA frame_num field wraps
around to zero, depending on the backend encoder implementation.

This change has no effect on existing applications that always signal an
IDR frame and reset the VA frame_num to zero before it wraps around. For
example, the FFmpeg vaapi encoder ignores the VA documentation and sends
an un-wrapped VA frame_num, which results in identical computation of
the internal frame_num (as long as each GOP is less than 65536 frames).

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5768

Reviewed-by: Thong Thai <thong.thai@amd.com>

patch revision 3: correctly avoid incrementing frame_num when the encoded
frame is not a reference, per h264 spec and ffmpeg behavior

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14332>
2022-03-04 14:17:20 +00:00
Rhys Perry d28b6b6856 aco: rework removal of jumps over branches
Only allow this in situations where we know it's safe. In particular, this
stops removal of unconditional branches like with
block_kind_continue_or_break.

Fixes dEQP-VK.graphicsfuzz.fragcoord-control-flow hang.

fossil-db (Sienna Cichlid):
Totals from 34 (0.02% of 162293) affected shaders:
Instrs: 84115 -> 84178 (+0.07%); split: -0.00%, +0.08%
CodeSize: 463372 -> 463624 (+0.05%); split: -0.00%, +0.06%
Latency: 3467316 -> 3467652 (+0.01%)
InvThroughput: 3085493 -> 3085578 (+0.00%)
Branches: 3221 -> 3284 (+1.96%); split: -0.03%, +1.99%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: f030b75b7d ("aco: relax condition to remove branches in case of few instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15214>
2022-03-04 12:32:36 +00:00
Samuel Pitoiset 059f870d74 ac/nir: implement nir_op_pack_{uint,sint}_2x16
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>
2022-03-04 08:06:56 +00:00
Samuel Pitoiset 9b113f1b6c aco: implement nir_op_pack_{uint,sint}_2x16
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>
2022-03-04 08:06:56 +00:00
Samuel Pitoiset 6532307555 nir: introduce nir_pack_{sint,uint}_2x16 instructions
These instructions have AMD hardware equivalent and they will be used
to lower fragment shader outputs in NIR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>
2022-03-04 08:06:56 +00:00
Xiaohui Gu 4d81c60e11 iris: Mark a dirty update when vs_needs_sgvs_element value changed
Add vs_needs_sgvs_element value check when updating vertex
element dirty state in iris_update_compiled_vs to solve
render error of Android game "Genshin Impact".

Signed-off-by: Xiaohui Gu <xiaohui.gu@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15142>
2022-03-04 05:41:38 +00:00
Yiwei Zhang aaa25cda0b venus: add VK_EXT_image_robustness support
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>
2022-03-04 01:04:13 +00:00
Yiwei Zhang ba212bf888 venus: add VK_EXT_provoking_vertex support
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>
2022-03-04 01:04:13 +00:00
Yiwei Zhang 33ba61b059 venus: add VK_EXT_line_rasterization support
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>
2022-03-04 01:04:13 +00:00
Yiwei Zhang 58182eb096 venus: update to latest venus protocol
Added the below extension support:
- VK_EXT_line_rasterization
- VK_EXT_provoking_vertex

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>
2022-03-04 01:04:13 +00:00
Yiwei Zhang 20efd9eff3 venus: group extensions promoted to 1.3
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>
2022-03-04 01:04:13 +00:00
Yiwei Zhang fe3815b7fa venus: clean up physical device features and properties
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>
2022-03-04 01:04:13 +00:00
Daniel Schürmann ca4595e01a nir/opt_shrink_vectors: update docstring
in order to reflect the various recent improvements.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468>
2022-03-04 00:18:58 +00:00
Daniel Schürmann 405829cd85 nir/opt_shrink_vectors: remove duplicate components from vecN
vecN instructions which are only used by other ALU
will now get duplicate channels removed.

i915g:
total instructions in shared programs: 396309 -> 396294 (<.01%)
instructions in affected programs: 186 -> 171 (-8.06%)

r300:
total instructions in shared programs: 1165059 -> 1164354 (-0.06%)
instructions in affected programs: 35884 -> 35179 (-1.96%)
total temps in shared programs: 165497 -> 165326 (-0.10%)
temps in affected programs: 2990 -> 2819 (-5.72%)

softpipe:
total instructions in shared programs: 2860028 -> 2859084 (-0.03%)
instructions in affected programs: 55539 -> 54595 (-1.70%)
total temps in shared programs: 516939 -> 516546 (-0.08%)
temps in affected programs: 6623 -> 6230 (-5.93%)

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468>
2022-03-04 00:18:58 +00:00
Daniel Schürmann e5963478c2 nir/opt_shrink_vectors: shrink load_const properly
This patch enables removal of arbitrary channels in
load_const instructions, if they are either unused or
duplicates of other channels and only used by ALU.

Totals from 692 (0.51% of 134913) affected shaders: (GFX10.3)
VGPRs: 21832 -> 21544 (-1.32%)
CodeSize: 1322016 -> 1313080 (-0.68%); split: -0.68%, +0.01%
Instrs: 243635 -> 242231 (-0.58%); split: -0.58%, +0.00%
Latency: 1856138 -> 1857237 (+0.06%); split: -0.09%, +0.15%
InvThroughput: 424298 -> 421671 (-0.62%); split: -0.62%, +0.01%
VClause: 4580 -> 4583 (+0.07%); split: -0.02%, +0.09%
SClause: 14336 -> 14354 (+0.13%); split: -0.04%, +0.17%
Copies: 8897 -> 8859 (-0.43%); split: -0.45%, +0.02%
PreSGPRs: 20439 -> 20437 (-0.01%)
PreVGPRs: 16011 -> 15907 (-0.65%); split: -0.97%, +0.32%

i915g:
total instructions in shared programs: 396471 -> 396309 (-0.04%)
instructions in affected programs: 6408 -> 6246 (-2.53%)
total const in shared programs: 56458 -> 56422 (-0.06%)
const in affected programs: 407 -> 371 (-8.85%)
LOST:   shaders/closed/steam/trine-2/fp-3.shader_test FS

r300:
total instructions in shared programs: 1164421 -> 1165059 (0.05%)
instructions in affected programs: 143981 -> 144619 (0.44%)
total temps in shared programs: 165488 -> 165497 (<.01%)
temps in affected programs: 318 -> 327 (2.83%)
total consts in shared programs: 922140 -> 921952 (-0.02%)
consts in affected programs: 12438 -> 12250 (-1.51%)

softpipe:
total instructions in shared programs: 2859978 -> 2860028 (<.01%)
instructions in affected programs: 183355 -> 183405 (0.03%)
total temps in shared programs: 517071 -> 516939 (-0.03%)
temps in affected programs: 1416 -> 1284 (-9.32%)
total imm in shared programs: 103601 -> 102767 (-0.81%)
imm in affected programs: 3928 -> 3094 (-21.23%)

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468>
2022-03-04 00:18:58 +00:00
Dave Airlie a10b5d7086 crocus: change the line width workaround for gfx4/5
This fixes piglit line-flat-clip-color and the hud fps counter.

Fixes: 6b7a68b7c2 ("crocus: add missing line smooth bits.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15229>
2022-03-04 00:06:28 +00:00
Chia-I Wu bbbbf39559 venus: abort when stuck
This gives

  MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 4096
  MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 8192
  MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 12288
  MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 16384
  MESA-VIRTIO: debug: aborting
  Aborted

which should be more friendly than printing the messages forever.

On my i7-7820HQ, this aborts after roughly 4+8+16+32=60 seconds

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15200>
2022-03-03 21:48:13 +00:00
Daniel Schürmann ccf4bcd162 aco/ra: don't immediately assign a register for p_branch
These get now assigned after handling phis.

Totals from 564 (0.42% of 134913) affected shaders: (GFX10.3)
CodeSize: 5519744 -> 5515308 (-0.08%)
Instrs: 1063045 -> 1061936 (-0.10%)
Latency: 11880452 -> 11875904 (-0.04%)
InvThroughput: 2259933 -> 2259581 (-0.02%); split: -0.02%, +0.00%
Copies: 86908 -> 85799 (-1.28%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Rhys Perry c3070773f8 aco/tests: add test for branch definition RA
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Rhys Perry 32d0bae8ec aco: fix branch definition validation
Like how they have to be register allocated differently, branch
definitions at merge block predecessors need to be validated differently.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Rhys Perry bed5a31005 aco: add validate_instr_defs()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Rhys Perry d5349a99c2 aco/ra: fix register allocation of branch definitions
fossil-db (Sienna Cichlid):
Totals from 704 (0.52% of 134913) affected shaders:
CodeSize: 7177288 -> 7182072 (+0.07%); split: -0.00%, +0.07%
Instrs: 1371781 -> 1372977 (+0.09%); split: -0.00%, +0.09%
Latency: 17993572 -> 18001344 (+0.04%); split: -0.00%, +0.04%
InvThroughput: 4198996 -> 4199569 (+0.01%); split: -0.00%, +0.01%
Copies: 122456 -> 123516 (+0.87%); split: -0.01%, +0.88%
Branches: 43815 -> 43818 (+0.01%); split: -0.02%, +0.03%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Rhys Perry 608d48b787 aco/ra: add get_reg_phi() helper
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Rhys Perry ceca5e68c4 aco: remove vcc hint from branch definitions
This doesn't seem to have much benefit anymore.

fossil-db (Sienna Cichlid):
Totals from 198 (0.15% of 134913) affected shaders:
CodeSize: 2610536 -> 2610872 (+0.01%); split: -0.01%, +0.02%
Instrs: 479001 -> 479085 (+0.02%); split: -0.01%, +0.03%
Latency: 7310684 -> 7300735 (-0.14%); split: -0.16%, +0.02%
InvThroughput: 2439084 -> 2437446 (-0.07%); split: -0.07%, +0.00%
SClause: 14760 -> 14722 (-0.26%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Pavel Ondračka 558f632967 r300: schedule TEX instructions before OUT instructions
NIR-to-TGSI produces partial output writes contrary to the old paths
that always wrote the full outputs. Therefore if there is now a partial
output write ready to be scheduled and nothing else besides a tex
is ready, we would schedule the output write first. This was not a
problem before as usually at last some component of the full output write
depended on the tex result.

This is not optimal from the performance point of view and resulted in
~20% slowdown in the Unigine demos. The docs say:

The first OUTPUT instruction will reserve space in the output register
fifo. This space is limited, therefore issuing an OUTPUT earlier than
necessary may cause threads to stall earlier than necessary. You
should not set an ALU instruction as type OUTPUT unless it is actually
writing to an output register, or it is the last instruction of
the program.

Fix it by explicitly prefering a TEX before OUT and restore the
performance: 9.66 -> 12.12 fps (as compared to 11.83 with the old
glsl-to-TGSI path) in Unigine Sanctuary. No change in Lightsmark or
GLmark.

This is also a win from the intructions point of view as we are usually
able to schedule the partial output writes in a single pair at the end.

total instructions in shared programs: 106009 -> 105891 (-0.11%)
instructions in affected programs: 10153 -> 10035 (-1.16%)
helped: 118
HURT: 0

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5840
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15165>
2022-03-03 20:05:32 +00:00
Pavel Ondračka aff1a85c09 r300: remove some dead logic in tex pair scheduling
The max_score == -1 condition is already before so this
will never trigger. Its unclear what was the intention anyway. Now we
emit either:
- if we have accumulated enough tex intructions for a full block
- if we have nothing else to emit
- or if we can emit all remaining tex instructions already.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15165>
2022-03-03 20:05:32 +00:00
Igor Torrente 688b23885b Venus: Add `vn_physical_device_{features, properties}` for better organization
New extensions properties/feature are being put in the `vn_physical_device`
which is not ideal from an organization point of view.

Here the `vn_physical_device_{features,properties}` are two new struct to
help the `vn_physical_device` organzation.

Signed-off-by: Igor Torrente <igor.torrente@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15170>
2022-03-03 19:43:52 +00:00
Ilia Mirkin 539fae796a freedreno/a4xx: fix integer tg4
Something is slightly off in the integer values returned. It passes many
tests without the fixup, but the dEQP-GLES31 tests complain. The blob
ends up doing 3x gathers, and selects between them based on getinfo
results. Since we already have a per-sampler key with some spare bits,
just stick the bit-size info in there. And we can derive signedness from
the associated type info.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>
2022-03-03 18:26:43 +00:00
Ilia Mirkin 96211adf77 freedreno/a4xx: add swizzles to shader keys for tg4 workaround
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>
2022-03-03 18:26:43 +00:00
Ilia Mirkin 68a2d25d0d freedreno/a4xx: move tex_type to header
This will be used in several places. Factor it out for common use.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>
2022-03-03 18:26:43 +00:00
Ilia Mirkin 8ed07c0da9 nir: remove bogus logic to allow cube + offset to work
This was done for an a4xx hack which is now removed. No API allows cube
texturing to have offsets.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>
2022-03-03 18:26:43 +00:00
Ilia Mirkin 37306ba3f1 freedreno/ir3: remove bogus tg4 -> tex lowering pass
It can't be done. This just provides bad results. The blob had a
comparable approach where they fixed up coordinates, but that also can't
work with a separate texture definition with nearest filtering. By then,
might as well provide a unswizzled variant instead, and using native
functionality.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>
2022-03-03 18:26:43 +00:00
Alex Xu (Hello71) 80bf9c7b97 r300/compiler/tests: print regoff_t as size_t
fixes compilation on musl

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13949>
2022-03-03 17:48:17 +00:00
Samuel Pitoiset 516aee64cc radv,aco: do not lower nir_op_pack_{unorm,snorm}_2x16
v_cvt_pknorm_{u16,i16}_f32 can be emitted instead, it's supported on
all generations.

No fossils-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15215>
2022-03-03 14:54:12 +01:00
Michel Zou f1f1b3d7f8 vulkan/wsi: drop unused wsi_create_win32_image
fixes: ed391d2a

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15088>
2022-03-03 06:13:07 +00:00
Andrii Simiklit ddf2778269 glsl: add member's location layout qualifier rules for `arrayed` in/out blocks
From Section 4.4.1 (Input Layout Qualifiers) of the GLSL 4.50 spec:

     "For some blocks declared as arrays, the location can only be applied
     at the block level: When a block is declared as an array where
     additional locations are needed for each member for each block array
     element, it is a compile-time error to specify locations on the block
     members. That is, when locations would be under specified by applying
     them on block members, they are not allowed on block members. For
     arrayed interfaces (those generally having an extra level of
     arrayness due to interface expansion), the outer array is stripped
     before applying this rule"

From Section 1.2.1 (Changes from Revision 6 of GLSL Version) of the GLSL 4.50 spec:

     "Private Bug 15678: Don’t allow location = on block members where
      the block needs an array of locations"

From Section 4.4.1 (Input Layout Qualifiers) of the GLSL ES 3.20 spec

     "If an input is declared as an array of blocks, excluding per-vertex-arrays
      as required for tessellation, it is an error to declare a member of
      the block with a location qualifier"

From Section 1.1.3 (Changes from GLSL ES 3.2 revision 3) of the GLSL ES 3.20 spec:

     "Arrayed blocks cannot have layout location qualifiers on members"

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11522>
2022-03-03 05:42:45 +00:00
Mike Blumenkrantz 0313110c92 zink: ci updates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>
2022-03-03 05:21:40 +00:00
Mike Blumenkrantz 712ce86bd1 zink: split primitives generated queries if xfb/gs states change
if one of these states change then it affects which result needs to be
used for that query, so split it up over multiple query ids to make sure
the correct result is obtained

fixes (lavapipe):
GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_pause_resume
GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_states

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>
2022-03-03 05:21:40 +00:00
Mike Blumenkrantz 0cb3ae949c zink: split out query suspending into util function
no functional changes

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>
2022-03-03 05:21:40 +00:00
Mike Blumenkrantz 5aecec48ee zink: update query states before starting renderpass during draw
this gives some leeway for doing transfer ops without crashing the renderpass

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>
2022-03-03 05:21:40 +00:00
Ilia Mirkin 965ab44c50 nvc0: disable EXT_texture_sRGB_RG8
Looks like the green component doesn't get srgb-decoding, and no obvious
way to force it. It works fine on nv50 though.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15211>
2022-03-03 04:37:12 +00:00
Ilia Mirkin 897a7fbbf1 mesa: enable GL_EXT_texture_sRGB_RG8 on desktop
Looks like an extension number was assigned in late 2020. This makes it
possible to hook up this format to teximage-colors without teaching it
about ES.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15211>
2022-03-03 04:37:12 +00:00
Mike Blumenkrantz af5f49f663 zink: remove loop from generated tcs
this is already using per-vertex io, no need to add conditionals to verify

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15225>
2022-03-03 02:58:43 +00:00
Rob Clark 7e63fa2bb1 freedreno/registers: Add a couple regs we need for kernel
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15221>
2022-03-03 02:19:47 +00:00
Dave Airlie 34379a937f gallivm/llvmpipe: add support for NIR to the linear/aos paths.
When the AOS/linear code was added it only worked with TGSI which
meant nothing in mesa upstream was really using it.

This adds support to analyse NIR shaders, and adds aos support
to the backend.

AOS support is limited to mov,vec,fmul,tex sampling in order to
accelerate mostly compositing operations. I've tested weston uses
the fast path. gnome-shell can't use it yet as we can't optimise
the depth test paths.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15140>
2022-03-03 01:39:39 +00:00