150613 Commits

Author SHA1 Message Date
Alyssa Rosenzweig 5b1a00c565 panfrost: Inline pan_blit_emit_dcd
Easier to follow the logic without having a million arguments passed around.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig c9784c9512 panfrost: Decouple tiler job and DCD emit
We can share the "emit quad" logic, even though the DCDs differ.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig a13d87c484 panfrost: Annotate slow clears as such
We should realistically be using the clear shaders from PanVK once they're moved
to common.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig 1eb3dbafdb panfrost: Set defaults for deprecated DCD fields
These are always set to true. Don't pollute the driver code with them, make
their existence a local detail to pre-Valhall XML and that's it.

Functional change: "four components per vertex" is now set on vertex job DCDs.
This should be a no-op.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig bd3d7e33b6 panfrost: Use pan_shader_prepare_rsd in blitter
This reduces code duplication and will ease Valhall porting. Functional changes
on v7:

* Shader contains barrier is now set (perf loss, fixed later in series)
* Shader register allocation is now set (perf win)
* Point sprite inverted, no-op for blit shaders

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig 6fc81f163e pan/mdg: Fix partial execution mode names
cont -> skip, last -> kill, and fix the special case handling. It's just an
enum. Makes the disassembly easier to read and closer to Bifrost.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Danylo Piliaiev 7e703e4428 turnip: Always use GMEM for feedback loops in autotuner
For ordinary feedback loops GMEM is a lot faster than sysmem since
we don't set SINGLE_PRIM mode.

For feedback loops with ordered rasterization GMEM should also be
faster.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>
2022-02-23 11:31:59 +00:00
Danylo Piliaiev ebc23ac963 turnip: Implement VK_ARM_rasterization_order_attachment_access
Trivially implemented by using A6XX_GRAS_SC_CNTL_SINGLE_PRIM_MODE.

This extension is useful for emulators e.g. AetherSX2 PS2 emulator and
could drastically improve performance when blending is emulated.

Relevant tests:
dEQP-VK.rasterization.rasterization_order_attachment_access.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>
2022-02-23 11:31:59 +00:00
Danylo Piliaiev d6c89e1e4a turnip: Merge LRZ and DEPTH_PLANE draw states
They were emitted at the same time. Frees 1 draw state for us to use.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>
2022-02-23 11:31:59 +00:00
Danylo Piliaiev dab34bd5c8 turnip: Use LATE_Z when there might be depth/stencil feedback loop
Otherwise a shader invocation would read the value which should have
been set AFTER this shader invocation.

Fixes tests:
 dEQP-VK.rasterization.rasterization_order_attachment_access.depth.samples_1.multi_draw_barriers
 dEQP-VK.rasterization.rasterization_order_attachment_access.stencil.samples_1.multi_draw_barriers

Fixes: 71595a189a ("tu: Fix feedback loops in sysmem mode")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>
2022-02-23 11:31:59 +00:00
Paulo Zanoni d10fd5b7c9 iris: fix register spilling on compute shaders on XeHP
XeHP scratch space is handled differently. Commit ae18e1e707
implemented support for it, but handled it differently between render
and compute shaders: it calculates scratch_addr differently and
doesn't pin the buffer on compute. Make it work on compute shaders by
calling pin_scratch_space() from iris_compute_walker(), which fixes
both the address and the pinning.

This commit can be verified by the two-year-old-but-still-unreviewed
Piglit MR 234. You can also verify this by running a very simple
compute shader with INTEL_DEBUG=spill_fs.

References: https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/234
Fixes: ae18e1e707 ("iris: Add support for scratch on XeHP")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15070>
2022-02-22 22:16:57 +00:00
Kenneth Graunke c46d3acf0e anv: Raise vertex input bindings and attributes limits slightly
This raises our vertex input bindings limit from 28 to 31, and our
vertex input attribute limit from 28 to 29.  We could theoretically
go higher, but it will take additional work.

The 3DSTATE_VERTEX_BUFFERS and 3DSTATE_VERTEX_ELEMENTS limits are 33
vertex buffers, and 34 vertex elements.  But we need up to two vertex
elements for system values (FirstVertex, BaseVertex, BaseInstance,
DrawID), and we currently use two vertex bindings for those.

There is another hidden limit: our compiler backend only supports the
push model for VS inputs currently.  3DSTATE_VS only allows URB Read
Lengths between [0, 15], which is measured in pairs of inputs, which
means we can theoretically push no more than 32 vertex elements.  This
is no artificial limit either, as a vec4 element takes up 4 registers
in the payload, and 32 * 4 = 128, the entire size of our register file.
Plus, the VS Thread payload needs at least g0 and g1 for other things,
so we can really only push 31.

We can theoretically support one additional binding, by combining our
two SGV bindings into a single upload.  In order to support additional
vertex elements, we would need to add support to the backend compiler
for the pull model for VS inputs.
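
The arithmetic above can be sketched in C. This is only one reading of the
message that happens to reproduce the new limits of 31 and 29. The constants
are taken from the text, while every macro and function name is hypothetical
and not part of anv.

```c
#include <assert.h>

/* Constants copied from the commit message. All names are hypothetical. */
#define HW_MAX_VERTEX_BUFFERS   33  /* 3DSTATE_VERTEX_BUFFERS limit */
#define HW_MAX_VERTEX_ELEMENTS  34  /* 3DSTATE_VERTEX_ELEMENTS limit */
#define SGV_BINDINGS             2  /* bindings currently used for system values */
#define SGV_ELEMENTS             2  /* elements needed for system values */
#define GRF_COUNT              128  /* entire register file */
#define GRF_PER_VEC4             4  /* registers per vec4 element in the payload */
#define PAYLOAD_RESERVED_GRFS    2  /* g0 and g1 */

/* Elements that fit in the payload once g0 and g1 are set aside. */
static int max_pushed_elements(void)
{
   return (GRF_COUNT - PAYLOAD_RESERVED_GRFS) / GRF_PER_VEC4;
}

/* Bindings left for the application after the two SGV bindings. */
static int max_user_bindings(void)
{
   return HW_MAX_VERTEX_BUFFERS - SGV_BINDINGS;
}

/* Attributes left for the application: the tighter of the hardware limit
 * and the push limit, minus the system value elements. */
static int max_user_attributes(void)
{
   int pushed = max_pushed_elements();
   int limit = pushed < HW_MAX_VERTEX_ELEMENTS ? pushed : HW_MAX_VERTEX_ELEMENTS;
   return limit - SGV_ELEMENTS;
}

/* Sanity check against the numbers quoted in the message. */
static void check_limits(void)
{
   assert(max_pushed_elements() == 31);
   assert(max_user_bindings() == 31);
   assert(max_user_attributes() == 29);
}
```

Under this reading the push model, not 3DSTATE_VERTEX_ELEMENTS, is the binding
constraint for attributes, which is why raising it further needs the pull
model.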

References: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5917
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14991>
2022-02-22 21:31:06 +00:00
Mike Blumenkrantz dabba7d726 zink: ci updates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15067>
2022-02-22 21:16:55 +00:00
Mike Blumenkrantz 3029000389 zink: remove zink_descriptor_util_init_null_set()
no longer used

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15067>
2022-02-22 21:16:55 +00:00
Mike Blumenkrantz 7266182be0 zink: allow null descriptor set layouts
I got confused while writing this somehow because of the null descriptor
feature, which enables drivers to consume a null descriptor, which has no
relation to a descriptor layout containing no descriptors

failing to accurately use zero descriptors can put layouts over the maximum
per-stage limits, which causes tests to crash

fixes (lavapipe):
KHR-GL46.shading_language_420pack.binding_uniform_block_array
KHR-GL46.multi_bind.dispatch_bind_buffers_base

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15067>
2022-02-22 21:16:55 +00:00
Timur Kristóf 3759a16d8a ac/nir/ngg: Fix mixed up primitive ID after culling.
When NGG culling is enabled, make sure that the correct
primitive ID is exported by each lane.

Fixes: e97f0463a8 "ac/nir: Implement NGG deferred attribute culling in NIR."
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6050
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15055>
2022-02-22 18:15:24 +00:00
Mike Blumenkrantz c063d8ff64 zink: prune ci lists
I don't know why I thought running GL3.2 and GL4.6 was a good idea,
but it wasn't

Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15065>
2022-02-22 18:02:00 +00:00
Emma Anholt 59bc17d57a turnip: Request no implicit sync when we have no implicit-sync WSI BOs.
I chose to implement this as a global flag in the device, because
otherwise we would end up with extra draw overhead trying to avoid it in
the implicit-sync WSI case, and you're probably going to end up needing
implicit sync anyway because you used one of the BOs in any of the
submitted cmdbufs.  To do better than this, we would probably want a
skip-implicit-sync flag on the BOs in the BO list, rather than global on
the submit.

Reports about venus on turnip say that this flag reduces worst-case
QueueSubmit time in a game workload from ~10ms to ~4ms.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14838>
2022-02-22 17:36:05 +00:00
Samuel Pitoiset 83ee08f6d1 radv: fix build on BSD
Just disable inotify for BSD systems.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6060
Fixes: c50557d961 ("radv: allow applications to dynamically change RADV_FORCE_VRS")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15105>
2022-02-22 17:16:21 +00:00
Alyssa Rosenzweig 2e86767370 pan/bi: Add BIFROST_MESA_DEBUG=nosb option
To disable the new scoreboarding optimizations when debugging.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig c81c022e66 pan/bi: Implement basic scoreboarding pass
Extend our existing bi_scoreboard infrastructure with a simple data flow
analysis pass that calculates which dependency slots need waiting. We
still lack a heuristic for selecting dependency slots.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig 8f25d88d90 pan/bi: Print scoreboarding state
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig 6ad9a7f650 pan/bi: Add scoreboard state to IR
To a limited degree, scoreboarding must be global, so add the data
structures for tracking this to the IR.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig 91c02893d8 pan/bi: Clean up nits in liveness analysis
Fix minor silly things.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig 734a8bdc5d pan/bi: Use bi_exit_block
The "generic" one is a vestige of Midgard.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig 75406a561f pan/bi: Add bi_{start, exit}_block helpers
Useful for data flow analysis.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig e5423bb129 pan/bi: Do not cull post-RA staging writes
Bifrost post-RA dead code elimination can cull the destinations of
regular ALU instructions, by weakening from a register write to a
temporary write. However, there is no way to suppress staging writes, so
culling the destinations will result in invalid code generation.
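
As a rough illustration of the rule, here is a toy C model. The types and
names are hypothetical and much simpler than the real Bifrost IR. It only
shows the shape of the check: a dead destination may be weakened to a
temporary write for a regular ALU instruction, but a staging write is left
alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction model. Not the Bifrost IR. */
enum write_kind {
   WRITE_REGISTER,   /* regular ALU result written to a register */
   WRITE_TEMPORARY,  /* result kept only in a temporary */
   WRITE_STAGING,    /* staging write, which cannot be suppressed */
};

struct toy_instr {
   enum write_kind write;
   bool dest_is_live;   /* is the destination read later? */
};

/* Returns true if the destination was culled. */
static bool toy_cull_dest(struct toy_instr *I)
{
   assert(I != NULL);

   /* Live destinations must be preserved. */
   if (I->dest_is_live)
      return false;

   /* Only register writes of regular ALU instructions can be weakened.
    * Staging writes fall through here and are kept as-is. */
   if (I->write != WRITE_REGISTER)
      return false;

   I->write = WRITE_TEMPORARY;
   return true;
}
```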

Fixes a regression in
dEQP-GLES3.functional.shaders.switch.switch_in_for_loop_static_vertex
with scoreboarding. The root cause there is the backend dead code
elimination not being sufficiently aggressive in the presence of control
flow. Usually this does not matter, since the backend optimizations are
intended to be local with global optimizations happening in NIR.
Unfortunately, our implementation of IDVS hits this hard. That will need
to be optimized (probably by specializing IDVS shaders in NIR instead of
the backend). In the meantime, let's fix the actual bug affecting
scoreboarding.

No shader-db changes.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig 87d46f40c8 pan/bi: Cull DTSEL_IMM dests in post-RA DCE
They are useless (given the semantics of DTSEL_IMM) and complicate
scoreboarding. Just remove them in the pass that removes all the other silly
register destinations.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig 956b969616 pan/bi: Clarify requirement for barriers
Barriers need to wait on all outstanding messages. This is more of an API
requirement than a hardware requirement, but it's still an invariant the
scoreboarding pass must respect.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Erik Faye-Lund 9a14ddc22d docs: add license to the redirects script
I always intended this to be covered by the MIT license like with the
rest of my contributions, but somehow forgot to add it.

Let's add that license to make things clear.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14751>
2022-02-22 16:15:47 +00:00
Adam Jackson 2eb644e470 mesa: Enable GL_NV_pack_subimage
This just legalizes a few of the pixelstore pack parameters in GLES2
that are already legal in desktop and GLES3. glamor takes advantage of
this in the GetImage and software-fallback paths.

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14977>
2022-02-22 10:45:28 -05:00
Alyssa Rosenzweig 606ac8d61e pan/bi: Enable nir_opt_shrink_vectors
total instructions in shared programs: 1939513 -> 1935815 (-0.19%)
instructions in affected programs: 809066 -> 805368 (-0.46%)
helped: 3195
HURT: 865
helped stats (abs) min: 1.0 max: 15.0 x̄: 1.99 x̃: 1
helped stats (rel) min: 0.10% max: 25.00% x̄: 2.26% x̃: 1.28%
HURT stats (abs)   min: 1.0 max: 22.0 x̄: 3.09 x̃: 2
HURT stats (rel)   min: 0.10% max: 83.33% x̄: 2.67% x̃: 1.39%
95% mean confidence interval for instructions value: -1.00 -0.82
95% mean confidence interval for instructions %-change: -1.34% -1.08%
Instructions are helped.

total tuples in shared programs: 1523194 -> 1521789 (-0.09%)
tuples in affected programs: 745526 -> 744121 (-0.19%)
helped: 2947
HURT: 1844
helped stats (abs) min: 1.0 max: 18.0 x̄: 2.06 x̃: 1
helped stats (rel) min: 0.15% max: 25.00% x̄: 2.65% x̃: 1.59%
HURT stats (abs)   min: 1.0 max: 29.0 x̄: 2.54 x̃: 1
HURT stats (rel)   min: 0.09% max: 40.00% x̄: 2.32% x̃: 1.52%
95% mean confidence interval for tuples value: -0.39 -0.20
95% mean confidence interval for tuples %-change: -0.85% -0.62%
Tuples are helped.

total clauses in shared programs: 329158 -> 325350 (-1.16%)
clauses in affected programs: 111654 -> 107846 (-3.41%)
helped: 2787
HURT: 498
helped stats (abs) min: 1.0 max: 17.0 x̄: 1.57 x̃: 1
helped stats (rel) min: 0.76% max: 40.00% x̄: 6.92% x̃: 5.26%
HURT stats (abs)   min: 1.0 max: 3.0 x̄: 1.14 x̃: 1
HURT stats (rel)   min: 0.87% max: 50.00% x̄: 4.73% x̃: 3.77%
95% mean confidence interval for clauses value: -1.21 -1.10
95% mean confidence interval for clauses %-change: -5.39% -4.93%
Clauses are helped.

total cycles in shared programs: 172084.50 -> 166827.62 (-3.05%)
cycles in affected programs: 74698.83 -> 69441.96 (-7.04%)
helped: 3706
HURT: 568
helped stats (abs) min: 0.041665999999999315 max: 19.0 x̄: 1.44 x̃: 1
helped stats (rel) min: 0.24% max: 75.00% x̄: 9.48% x̃: 6.90%
HURT stats (abs)   min: 0.041665999999999315 max: 1.0 x̄: 0.15 x̃: 0
HURT stats (rel)   min: 0.25% max: 50.00% x̄: 2.21% x̃: 1.42%
95% mean confidence interval for cycles value: -1.28 -1.18
95% mean confidence interval for cycles %-change: -8.18% -7.67%
Cycles are helped.

total arith in shared programs: 57145.04 -> 57211.37 (0.12%)
arith in affected programs: 27595.12 -> 27661.46 (0.24%)
helped: 1933
HURT: 2259
helped stats (abs) min: 0.041665999999999315 max: 0.75 x̄: 0.09 x̃: 0
helped stats (rel) min: 0.16% max: 33.33% x̄: 2.74% x̃: 1.52%
HURT stats (abs)   min: 0.04166399999999726 max: 1.3333329999999997 x̄: 0.11 x̃: 0
HURT stats (rel)   min: 0.10% max: 100.00% x̄: 2.79% x̃: 1.62%
95% mean confidence interval for arith value: 0.01 0.02
95% mean confidence interval for arith %-change: 0.07% 0.40%
Arith are HURT.

total texture in shared programs: 12857 -> 12857 (0.00%)
texture in affected programs: 0 -> 0
helped: 0
HURT: 0

total vary in shared programs: 11157.75 -> 10222 (-8.39%)
vary in affected programs: 5643 -> 4707.25 (-16.58%)
helped: 3196
HURT: 0
helped stats (abs) min: 0.125 max: 1.875 x̄: 0.29 x̃: 0
helped stats (rel) min: 2.78% max: 75.00% x̄: 18.49% x̃: 15.00%
95% mean confidence interval for vary value: -0.30 -0.29
95% mean confidence interval for vary %-change: -18.88% -18.11%
Vary are helped.

total ldst in shared programs: 146420 -> 140270 (-4.20%)
ldst in affected programs: 66027 -> 59877 (-9.31%)
helped: 2942
HURT: 10
helped stats (abs) min: 1.0 max: 19.0 x̄: 2.09 x̃: 2
helped stats (rel) min: 0.90% max: 100.00% x̄: 16.81% x̃: 8.33%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 2.22% max: 50.00% x̄: 13.03% x̃: 3.33%
95% mean confidence interval for ldst value: -2.15 -2.02
95% mean confidence interval for ldst %-change: -17.53% -15.89%
Ldst are helped.

total quadwords in shared programs: 1398329 -> 1392117 (-0.44%)
quadwords in affected programs: 704641 -> 698429 (-0.88%)
helped: 3677
HURT: 1299
helped stats (abs) min: 1.0 max: 26.0 x̄: 2.51 x̃: 1
helped stats (rel) min: 0.10% max: 26.92% x̄: 2.64% x̃: 1.89%
HURT stats (abs)   min: 1.0 max: 20.0 x̄: 2.31 x̃: 1
HURT stats (rel)   min: 0.11% max: 44.44% x̄: 2.34% x̃: 1.55%
95% mean confidence interval for quadwords value: -1.34 -1.16
95% mean confidence interval for quadwords %-change: -1.44% -1.25%
Quadwords are helped.

total threads in shared programs: 35234 -> 35311 (0.22%)
threads in affected programs: 119 -> 196 (64.71%)
helped: 91
HURT: 14
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.60 0.87
95% mean confidence interval for threads %-change: 70.08% 89.92%
Threads are helped.

total loops in shared programs: 125 -> 125 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 149 -> 144 (-3.36%)
spills in affected programs: 22 -> 17 (-22.73%)
helped: 1
HURT: 0

total fills in shared programs: 966 -> 956 (-1.04%)
fills in affected programs: 44 -> 34 (-22.73%)
helped: 1
HURT: 0
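
For readers unfamiliar with these reports, the percentages on the "total" and
"affected" lines appear to be plain relative changes, (after - before) /
before. A minimal C sketch, assuming that formula (the helper name is
hypothetical and this is not the report tool's code):

```c
#include <assert.h>

/* Relative change in percent. Returns 0 when there is no baseline, which
 * the report prints as "0 -> 0" without a percentage. */
static double pct_change(double before, double after)
{
   if (before == 0.0)
      return 0.0;
   return 100.0 * (after - before) / before;
}

/* Check against the instruction counts quoted above. */
static void check_instruction_lines(void)
{
   double total = pct_change(1939513.0, 1935815.0);    /* reported -0.19% */
   double affected = pct_change(809066.0, 805368.0);   /* reported -0.46% */

   assert(total > -0.195 && total < -0.185);
   assert(affected > -0.465 && affected < -0.455);
}
```

Both lines share the same absolute delta of 3698 instructions. Only the
denominator differs, which is why the "affected" percentage is always the
larger of the two.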

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15090>
2022-02-22 15:21:09 +00:00
Alyssa Rosenzweig e0e63c2a8e pan/bi: Specialize IDVS in NIR
It's a bit more code, but it's needed to chew through control flow since we
don't have a backend version of dead_cf. Results are really good, meaning I
really screwed this up the first time around (hence the cc mesa-stable).

total instructions in shared programs: 1963576 -> 1939513 (-1.23%)
instructions in affected programs: 671053 -> 646990 (-3.59%)
helped: 4436
HURT: 729
helped stats (abs) min: 1.0 max: 43.0 x̄: 5.75 x̃: 6
helped stats (rel) min: 0.21% max: 100.00% x̄: 6.47% x̃: 5.17%
HURT stats (abs)   min: 1.0 max: 22.0 x̄: 2.01 x̃: 1
HURT stats (rel)   min: 0.50% max: 50.00% x̄: 10.45% x̃: 9.09%
95% mean confidence interval for instructions value: -4.77 -4.55
95% mean confidence interval for instructions %-change: -4.36% -3.80%
Instructions are helped.

total tuples in shared programs: 1533335 -> 1523194 (-0.66%)
tuples in affected programs: 483167 -> 473026 (-2.10%)
helped: 3414
HURT: 1288
helped stats (abs) min: 1.0 max: 20.0 x̄: 3.73 x̃: 2
helped stats (rel) min: 0.27% max: 100.00% x̄: 4.87% x̃: 3.03%
HURT stats (abs)   min: 1.0 max: 19.0 x̄: 2.02 x̃: 1
HURT stats (rel)   min: 0.24% max: 38.10% x̄: 8.10% x̃: 5.88%
95% mean confidence interval for tuples value: -2.28 -2.03
95% mean confidence interval for tuples %-change: -1.62% -1.02%
Tuples are helped.

total clauses in shared programs: 351432 -> 329158 (-6.34%)
clauses in affected programs: 142237 -> 119963 (-15.66%)
helped: 5328
HURT: 3
helped stats (abs) min: 1.0 max: 43.0 x̄: 4.18 x̃: 4
helped stats (rel) min: 0.74% max: 100.00% x̄: 19.44% x̃: 17.24%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 9.09% max: 12.50% x̄: 10.90% x̃: 11.11%
95% mean confidence interval for clauses value: -4.25 -4.11
95% mean confidence interval for clauses %-change: -19.72% -19.12%
Clauses are helped.

total cycles in shared programs: 202830.92 -> 172084.50 (-15.16%)
cycles in affected programs: 117078.42 -> 86332 (-26.26%)
helped: 5450
HURT: 1
helped stats (abs) min: 0.083333 max: 49.0 x̄: 5.64 x̃: 5
helped stats (rel) min: 1.42% max: 100.00% x̄: 27.94% x̃: 25.64%
HURT stats (abs)   min: 0.25 max: 0.25 x̄: 0.25 x̃: 0
HURT stats (rel)   min: 2.46% max: 2.46% x̄: 2.46% x̃: 2.46%
95% mean confidence interval for cycles value: -5.74 -5.54
95% mean confidence interval for cycles %-change: -28.30% -27.58%
Cycles are helped.

total arith in shared programs: 57274.29 -> 57145.04 (-0.23%)
arith in affected programs: 16418.33 -> 16289.08 (-0.79%)
helped: 2442
HURT: 1784
helped stats (abs) min: 0.041665999999999315 max: 0.75 x̄: 0.14 x̃: 0
helped stats (rel) min: 0.23% max: 100.00% x̄: 5.51% x̃: 2.87%
HURT stats (abs)   min: 0.041665999999999315 max: 0.9166670000000003 x̄: 0.12 x̃: 0
HURT stats (rel)   min: 0.00% max: 100.00% x̄: 25.13% x̃: 9.09%
95% mean confidence interval for arith value: -0.04 -0.03
95% mean confidence interval for arith %-change: 6.61% 8.24%
Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).

total texture in shared programs: 12857 -> 12857 (0.00%)
texture in affected programs: 0 -> 0
helped: 0
HURT: 0

total vary in shared programs: 11157.75 -> 11157.75 (0.00%)
vary in affected programs: 0 -> 0
helped: 0
HURT: 0

total ldst in shared programs: 177208 -> 146420 (-17.37%)
ldst in affected programs: 117098 -> 86310 (-26.29%)
helped: 5447
HURT: 0
helped stats (abs) min: 1.0 max: 49.0 x̄: 5.65 x̃: 5
helped stats (rel) min: 1.92% max: 100.00% x̄: 27.91% x̃: 25.64%
95% mean confidence interval for ldst value: -5.75 -5.55
95% mean confidence interval for ldst %-change: -28.27% -27.56%
Ldst are helped.

total quadwords in shared programs: 1436507 -> 1398329 (-2.66%)
quadwords in affected programs: 515101 -> 476923 (-7.41%)
helped: 5150
HURT: 111
helped stats (abs) min: 1.0 max: 39.0 x̄: 7.46 x̃: 6
helped stats (rel) min: 0.17% max: 100.00% x̄: 10.02% x̃: 8.24%
HURT stats (abs)   min: 1.0 max: 9.0 x̄: 2.01 x̃: 1
HURT stats (rel)   min: 0.43% max: 21.62% x̄: 3.57% x̃: 1.94%
95% mean confidence interval for quadwords value: -7.41 -7.11
95% mean confidence interval for quadwords %-change: -9.98% -9.49%
Quadwords are helped.

total threads in shared programs: 35025 -> 35228 (0.58%)
threads in affected programs: 218 -> 421 (93.12%)
helped: 208
HURT: 5
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.91 0.99
95% mean confidence interval for threads %-change: 93.40% 99.55%
Threads are helped.

total loops in shared programs: 128 -> 125 (-2.34%)
loops in affected programs: 3 -> 0
helped: 3
HURT: 0
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%

total spills in shared programs: 158 -> 149 (-5.70%)
spills in affected programs: 15 -> 6 (-60.00%)
helped: 9
HURT: 0

total fills in shared programs: 1133 -> 966 (-14.74%)
fills in affected programs: 197 -> 30 (-84.77%)
helped: 9
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15090>
2022-02-22 15:21:09 +00:00
Alyssa Rosenzweig 3c1021cd1e panvk: Use more reliable assert for UBO pushing
The important thing isn't the number of words pushed, it's that there are no
UBOs required for us to upload. Check that instead.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15090>
2022-02-22 15:21:09 +00:00
Georg Lehmann d223b7f096 radv, aco: Add u_foreach_bit to .clang-format.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15083>
2022-02-22 14:57:29 +00:00
Xaver Hugl a6be12fdad gbm: improve documentation about the lifetime of resources
Signed-off-by: Xaver Hugl <xaver.hugl@gmail.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10906>
2022-02-22 14:42:52 +01:00
Marek Olšák 62074cb4ac ac: update shadowed registers
based on PAL

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák e74929bfef radeonsi: move Arcturus code outside the gfx9 branch
preparation for a future commit

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák c740fd18ba ac/llvm: replace structured by vindex != NULL in ac_build_buffer_store_common
"raw" (IDXEN=0) and "structured" (IDXEN=1) do bounds checking differently.
From `si_make_buffer_descriptor`:
    * - For VMEM and inst.IDXEN == 0 or STRIDE == 0, it's in byte units.
    * - For VMEM and inst.IDXEN == 1 and STRIDE != 0, it's in units of STRIDE.

so there is a difference between setting vindex = i32_0 and vindex = NULL.
Instead of having the `structured` flag, we can just check if vindex is NULL.
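
To make the difference concrete, here is a toy C model of the two rules quoted
above. It is not the hardware's actual range check. The names are
hypothetical, and "size_field" stands in for whichever descriptor field the
quoted comment describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Unit, in bytes, in which the descriptor's size field is interpreted,
 * following the two rules quoted from si_make_buffer_descriptor. */
static uint32_t size_unit_bytes(bool idxen, uint32_t stride)
{
   if (!idxen || stride == 0)
      return 1;        /* byte units */
   return stride;      /* units of STRIDE */
}

/* Byte range that the same size field covers under each mode. */
static uint64_t covered_bytes(uint32_t size_field, bool idxen, uint32_t stride)
{
   return (uint64_t)size_field * size_unit_bytes(idxen, stride);
}

/* Same descriptor, different instruction mode, different range. */
static void check_modes(void)
{
   /* vindex == NULL, modeled as IDXEN=0: size counts bytes. */
   assert(covered_bytes(64, false, 16) == 64);
   /* vindex == i32_0, modeled as IDXEN=1: size counts 16-byte records. */
   assert(covered_bytes(64, true, 16) == 1024);
   /* STRIDE == 0 falls back to byte units even with IDXEN=1. */
   assert(covered_bytes(64, true, 0) == 64);
}
```

In this model, passing an index of zero still selects the structured
interpretation, so it is not interchangeable with passing no index at all.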

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 1038382baf ac/llvm: replace structured by vindex != NULL in ac_build_tbuffer_store
"raw" (IDXEN=0) and "structured" (IDXEN=1) do bounds checking differently.
From `si_make_buffer_descriptor`:
    * - For VMEM and inst.IDXEN == 0 or STRIDE == 0, it's in byte units.
    * - For VMEM and inst.IDXEN == 1 and STRIDE != 0, it's in units of STRIDE.

so there is a difference between setting vindex = i32_0 and vindex = NULL.
Instead of having the `structured` flag, we can just check if vindex is NULL.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák c8e2c6faf6 radeonsi: use SET_SH_REG_INDEX with index=3 for registers containing CU_EN
This matches PAL and RADV behavior. It's for preemption.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 79a7ab642a ac/surface: add more elements to meta equations because HTILE can use them
according to gfx10SwizzlePattern.h

Fixes: 9fabbf2150 - ac/surface: copy the HTILE equations to the surface

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 9a28f79f7b ac/surface/tests: fix missing NUM_PKRS extraction in test_modifier
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 12e00be09b radeonsi: apply the LLVM discard bug workaround to LLVM 13 only
It was fixed in LLVM 14.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 21f169b2fb ac,radeonsi: rework and optimize how TMPRING_SIZE is set
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Yogesh Mohan Marimuthu 3c7df183a3 radeonsi: prepare clamp, alpha test before mrtz prepare
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Yogesh Mohan Marimuthu c268dafae5 radeonsi: move clamp, alpha test from si_export_mrt_color() to new function
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 1485f683e3 radeonsi: fix the unaligned clear_buffer fallback with TC
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 4e49a05e37 radeonsi: increase the tessellation factor ring size
based on PAL

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 37c26a72a4 radeonsi: remove bit gaps in SI_RESOURCE_FLAG_*
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00