Commit Graph

150929 Commits

Author SHA1 Message Date
Mike Blumenkrantz 351378ae80 zink: remove a bunch of flakes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15219>
2022-03-03 01:14:20 +00:00
Dave Airlie e1e9a44a69 lavapipe: always set read/write on ssbo/images.
This fixes a regressions with overlap in llvmpipe, this is pessimistic
we should write code to make it work properly.

Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15219>
2022-03-03 01:14:20 +00:00
Alyssa Rosenzweig b48236ea3e pan/bi: Add arithmetic flag to RSHIFT ops
Models ops like ARSHIFT_OR.i32 on Valhall without adding piles of new
instructions to the IR.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:44 +00:00
Alyssa Rosenzweig 0b0e74ae82 pan/bi: Extend LD_TILE with a register format
Required for Valhall. NIR has the information anyway, pass it along.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:44 +00:00
Alyssa Rosenzweig 74107abfc6 pan/bi: Add BRANCHZI instruction
Technically this is just JUMP on Valhall, but the semantic is an indirect branch
based on comparing with zero. It can also be used as a conservative branch (like
BRANCHC), but this isn't modeled.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:44 +00:00
Alyssa Rosenzweig 3dc2095b07 pan/bi: Model LD_BUFFER instructions
We'll use these to read from UBOs on Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:44 +00:00
Alyssa Rosenzweig 5796777889 pan/bi: Model offset for LOAD/STORE
Needed to model the immediate offset on Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:43 +00:00
Alyssa Rosenzweig 039bb4e68c pan/bi: Model pos/vary segments in STORE instructions
For Bifrost, we model load/store segments, for example for thread local storage.
We need something similar on Valhall -- access modifiers. There are four access
modifiers on Valhall, controlling memory subsystem optimizations for the access:

none: Nothing may be assumed. Corresponds to "global".

istream: Internally streaming within the GPU. Corresponds to "pos", as it's
used for position stores.

estream: Externally streaming outside the GPU. Corresponds to "vary", as it's
used for varying stores.

force: Force access in discarded threads. Corresponds to "tl", as it's required
for correct behaviour of helper invocations that use the stack.

If these access modifiers end up being useful outside these fixed purposes, we
may need to rework this part of the IR. For now, this should suffice.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:43 +00:00
Alyssa Rosenzweig aaa39f0e60 pan/bi: Model LEA_BUF_IMM in the IR
Required for varying stores in malloced IDVS jobs on Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:43 +00:00
Alyssa Rosenzweig eba9ef4c25 pan/bi: Add LD_VAR_BUF_IMM.f16/f32 instructions
For use on Valhall with memory-allocated IDVS jobs.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:43 +00:00
Alyssa Rosenzweig 48a398bf5b pan/bi: Generalize I->table for Valhall
Can be reused for resource tables in a natural way.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:43 +00:00
Alyssa Rosenzweig 20891e75c2 pan/bi: Extend BLEND to take a register format
Needed on Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:43 +00:00
Alyssa Rosenzweig 3c817ed511 pan/bi: Model Valhall texture instructions
These act like a TEXC+immediate.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:43 +00:00
Alyssa Rosenzweig 234d3efb9b pan/va: Add memory access modifier to LOADs
Might be required for correct spilling in some circumstances.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:43 +00:00
Alyssa Rosenzweig 79aa4af078 pan/va: Remap "store segment" to "memory access"
For now, the difference does not matter. However it's better to model the actual
hardware behaviour, rather than isomorphic driver behaviour, when we can do so.
So fix the names.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:43 +00:00
Alyssa Rosenzweig 254a641290 pan/va: Fix LEA_BUF_IMM definition
Technically the table is folded, too; the 0xD refers to table 61. But this
instruction is more general than previously thought.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:43 +00:00
Alyssa Rosenzweig 7c798fbb9f pan/va: Fix definitions of LD_VAR_BUF_IMM
So close! However, LD_VAR_IMM is something else -- Bifrost-style varying
interpolation, without a hardware buffer. For ES3, we'll need to support both.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:43 +00:00
Alyssa Rosenzweig c62836661e pan/va: Add TEX_GATHER instruction
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:43 +00:00
Alyssa Rosenzweig 65cb3af38a pan/va: Add TEX_DUAL instruction
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:43 +00:00
Alyssa Rosenzweig 47b70ca584 pan/va: Add modifiers required for gathers
Mostly isomorphic to Bifrost-style gathers.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:43 +00:00
Alyssa Rosenzweig 431e7e54a6 pan/va: Handle force_enum differing from name
Needed for secondary register width, for dual texturing.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
2022-03-03 00:41:43 +00:00
Ian Romanick 0a6c6dcb00 i915g: Emit better code for SEQ(x, 0) and SNE(x, 0)
total instructions in shared programs: 789000 -> 788481 (-0.07%)
instructions in affected programs: 16179 -> 15660 (-3.21%)
helped: 157
HURT: 0
helped stats (abs) min: 3 max: 12 x̄: 3.31 x̃: 3
helped stats (rel) min: 1.56% max: 14.29% x̄: 4.24% x̃: 2.56%
95% mean confidence interval for instructions value: -3.51 -3.10
95% mean confidence interval for instructions %-change: -4.70% -3.78%
Instructions are helped.

LOST:   0
GAINED: 3

v2: Drop setting src1 to zero.  Suggested by Emma.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15210>
2022-03-03 00:07:58 +00:00
Ian Romanick 374da6fc41 i915g: Handle constants composed exclusively of 0 or ±1 specially
This can avoid some cases where a constant has to be loaded into a
temporary register.

v2: Update i915-g33-fails.txt.

total instructions in shared programs: 788625 -> 782376 (-0.79%)
instructions in affected programs: 166269 -> 160020 (-3.76%)
helped: 1578
HURT: 0
helped stats (abs) min: 3 max: 21 x̄: 3.96 x̃: 3
helped stats (rel) min: 1.56% max: 33.33% x̄: 4.82% x̃: 3.45%
95% mean confidence interval for instructions value: -4.06 -3.86
95% mean confidence interval for instructions %-change: -5.00% -4.64%
Instructions are helped.

LOST:   0
GAINED: 35

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15210>
2022-03-03 00:07:58 +00:00
Ian Romanick 06eb9fb125 nir/algebraic: Optimize some cases of (sXX(a, b) != 0.0)
I noticed the SGE case while looking at the output of
shaders/closed/steam/trine-2/fp-3.shader_test on i915g.  These are
especially bad on i915 that needs two instructions to implement SNE.

An alternative would be to duplicate the sne(sXX(a, b), 0.0) rules in an
algebraic pass that occurs after bool_to_float.  Doing the work earlier
seems preferable.

i915
total instructions in shared programs: 788274 -> 788223 (<.01%)
instructions in affected programs: 666 -> 615 (-7.66%)
helped: 5
HURT: 0
helped stats (abs) min: 9 max: 12 x̄: 10.20 x̃: 9
helped stats (rel) min: 5.00% max: 11.11% x̄: 8.12% x̃: 8.16%
95% mean confidence interval for instructions value: -12.24 -8.16
95% mean confidence interval for instructions %-change: -10.81% -5.43%
Instructions are helped.

LOST:   0
GAINED: 2

The two gained shaders are assembly fragment programs in Euro Truck
Simulator 2.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15210>
2022-03-03 00:07:58 +00:00
Ian Romanick 7d055c93e0 i915g/ci: update piglit fails
I believe these were fixed by !14573.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15210>
2022-03-03 00:07:58 +00:00
Emma Anholt d506d910e4 nir: Switch to using nir_vec_scalars() for things that used nir_channel().
This should reduce follow-on optimization work to copy-propagate and
dead-code away the movs generated in construction of vectors.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14865>
2022-03-02 22:28:58 +00:00
Emma Anholt 16c064dfaf nir: Add a helper for setting up a nir_ssa_scalar struct.
Trivial, but will help users avoid some struct constructions that can be
awkward in C.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14865>
2022-03-02 22:28:58 +00:00
Emma Anholt d95f9d189a nir: Introduce a nir_vec_scalars() helper using nir_ssa_scalar.
Many users of nir_vec() do so by nir_channel()-ing a new ssa defs as movs
from other vectors to put the new vector together, which then just have to
get copy-propagated into the ALU srcs and DCEed away the temporary movs.
If they instead take nir_ssa_scalar, we can avoid that extra work.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14865>
2022-03-02 22:28:58 +00:00
Dave Airlie 48b3ef625e vulkan/wsi: handle queue families properly for non-concurrent sharing mode.
"queueFamilyIndexCount is the number of queue families having access to the image(s) of the
swapchain when imageSharingMode is VK_SHARING_MODE_CONCURRENT.
pQueueFamilyIndices is a pointer to an array of queue family indices having access to the
images(s) of the swapchain when imageSharingMode is VK_SHARING_MODE_CONCURRENT."

If the type isn't concurrent, don't attempt to access the arrays.

dEQP-VK.wsi.xlib.swapchain.create.exclusive_nonzero_queues on lavapipe.

Fixes: 5b13d74583 ("vulkan/wsi/drm: Break create_native_image in pieces")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15101>
2022-03-02 20:47:16 +00:00
Emma Anholt 221ce1b35a ci/freedreno: Consolidate some information about an a630 flake.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15197>
2022-03-02 19:53:26 +00:00
Emma Anholt a64408dcd5 ir3: Don't assert on not finding the VS output for an FS input.
It should return undefined data, not terminate the program.  Fixes some
piglit tests poking at the UB handling.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15197>
2022-03-02 19:53:26 +00:00
Rhys Perry feb7e30e2d radv: include disable_aniso_single_level and adjust_frag_coord_z in key
Fixes potential pipeline caching bug.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15175>
2022-03-02 19:05:28 +00:00
Emma Anholt 2de15273d5 freedreno: Improve robustness behavior for VBs with offset > size.
We were just emitting the bad reloc (either an assert fail on a debug
build or for a release build likely a GPU hang from the resulting fault).
Given that the GLES 3.2 spec's robust context requirement says we should
return undefined data but not terminate for element indices outside of the
VB, ignoring the offset in this case seems like a better behavior to have
in all cases.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15198>
2022-03-02 17:31:49 +00:00
Emma Anholt 92945825f5 freedreno: Fix start_slot handling in set_vertex_buffers.
mesa/st only ever provides a 0 for this value, but let's be ready if it
ever does something else.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15198>
2022-03-02 17:31:49 +00:00
Emma Anholt 35ddb65ea6 freedreno: Use the resource size rather than BO size for VFD_FETCH[].SIZE.
We should be using the API size to clamp, rather than what we allocated
the BO into.

Fixes: #13
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15198>
2022-03-02 17:31:49 +00:00
Erik Faye-Lund b21e7e1ef7 docs: match build-flags markup with meson docs
In meson.rst, we document build-flags with double backticks, which puts
it inside a code-tag in the rendered HTML instead of a cite-tag like we
currently do.

Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15213>
2022-03-02 14:36:42 +00:00
Erik Faye-Lund 23d3fb6da2 docs: fix a broken link
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15213>
2022-03-02 14:36:42 +00:00
Lionel Landwerlin 96c8880900 intel/fs: fix total_scratch computation
We only have a single prog_data::total_scratch for all shader variants
(SIMD 8, 16, 32). Therefore we should always max the total_scratch on
top of existing variant.

We probably haven't run into that issue before because we compile by
increasing SIMD size and higher SIMD size is more likely to spill. But
for bindless shaders with return shaders, if the last return part
doesn't spill, we completely ignore the previous parts' scratch
computation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15193>
2022-03-02 13:13:03 +00:00
Juan A. Suarez Romero 5b43075888 v3d: enable texture filtering anisotropic
Seems we already had implemented this feature (see commit 521e1d0275
"broadcom/vc5: Add support for anisotropic filtering"), but we didn't
enable the proper capability.

Also update the maximum level of anistropy supported.

Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4201
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15180>
2022-03-02 12:55:17 +00:00
Caio Oliveira dc77542ed2 intel/compiler: Use pass helper in brw_nir_adjust_offset_for_arrayed_indices
Also change the code to preserve certain metadata: control flow is not changed
so both block indices and dominance information is preserved.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15206>
2022-03-02 10:46:23 +00:00
Iago Toral Quiroga f761f8fd9e broadcom/compiler: simplify node/temp translation during register allocation
Now that we don't sort our nodes we can arrange them so we can
easily translate between nodes and temps without a mapping table,
just applying an offset.

To do this we have a single array of nodes where twe put first the nodes
for accumulators and then the nodes for temps. With this setup we can
ensure that for any given temp T, its node is always T + ACC_COUNT.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Iago Toral Quiroga 871b0a7f6a broadcom/compiler: don't sort nodes for register allocation
Nodes are allocated in order to registers so initially sorting
was used to ensure that nodes with smaller life ranges would
be assigned first and therefore be more likely to get
accumulators.

However, since d81a6e5f1d now we don't rely on order to make
decisions about accumulators and instead we make policy decisions
based on actual liveness, so sorting is no longer strictly
relevant to this decision.

Furthermore, we are not re-sorting nodes after each spill either,
since that would probably require that we rebuild the interference
graph after each spill (the graph identifies nodes by their index).

Shader-db results show a significant improvement in instruction
counts, due to more optimal accumulator assignments. The reason for
this is that we use a round-robin policy for choosing the next
accumulator to assign. The idea behind this is preventing nearby
temps to be assigned to the same accumulator so that QPU scheduling
is more flexible, but if we  sort our nodes, we are basically not
assigning temps in program order any more and the round-robin policy
becomes less effective:

total instructions in shared programs: 13000420 -> 12663189 (-2.59%)
instructions in affected programs: 11791267 -> 11454036 (-2.86%)
helped: 62890
HURT: 19987

total threads in shared programs: 415874 -> 415870 (<.01%)
threads in affected programs: 20 -> 16 (-20.00%)
helped: 2
HURT: 4

total uniforms in shared programs: 3711652 -> 3711624 (<.01%)
uniforms in affected programs: 43430 -> 43402 (-0.06%)
helped: 134
HURT: 173

total max-temps in shared programs: 2144876 -> 2138822 (-0.28%)
max-temps in affected programs: 123334 -> 117280 (-4.91%)
helped: 4112
HURT: 1195

total spills in shared programs: 3870 -> 3860 (-0.26%)
spills in affected programs: 1013 -> 1003 (-0.99%)
helped: 14
HURT: 12

total fills in shared programs: 5560 -> 5573 (0.23%)
fills in affected programs: 1765 -> 1778 (0.74%)
helped: 14
HURT: 17

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Iago Toral Quiroga 4483cd24af broadcom/compiler: sink uniform loads
total instructions in shared programs: 13014428 -> 13000420 (-0.11%)
instructions in affected programs: 743624 -> 729616 (-1.88%)
helped: 1392
HURT: 611

total threads in shared programs: 415858 -> 415874 (<.01%)
threads in affected programs: 16 -> 32 (100.00%)
helped: 8
HURT: 0

total uniforms in shared programs: 3720410 -> 3711652 (-0.24%)
uniforms in affected programs: 113442 -> 104684 (-7.72%)
helped: 635
HURT: 29

total max-temps in shared programs: 2154268 -> 2144876 (-0.44%)
max-temps in affected programs: 61279 -> 51887 (-15.33%)
helped: 1124
HURT: 187

total spills in shared programs: 4002 -> 3870 (-3.30%)
spills in affected programs: 265 -> 133 (-49.81%)
helped: 6
HURT: 0

total fills in shared programs: 5788 -> 5560 (-3.94%)
fills in affected programs: 603 -> 375 (-37.81%)
helped: 6
HURT: 0

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Iago Toral Quiroga e228642cf5 broadcom/compiler: move constants before their first user
For us they are basically uniforms too so we want to make their
lifespans short to facilitate allocating them to accumulators.

total instructions in shared programs: 13043585 -> 13015385 (-0.22%)
instructions in affected programs: 8326040 -> 8297840 (-0.34%)
helped: 24939
HURT: 19894

total threads in shared programs: 415860 -> 415858 (<.01%)
threads in affected programs: 4 -> 2 (-50.00%)
helped: 0
HURT: 1

total uniforms in shared programs: 3721953 -> 3720451 (-0.04%)
uniforms in affected programs: 96134 -> 94632 (-1.56%)
helped: 744
HURT: 435

total max-temps in shared programs: 2173431 -> 2154260 (-0.88%)
max-temps in affected programs: 264598 -> 245427 (-7.25%)
helped: 10858
HURT: 841

total spills in shared programs: 4005 -> 4010 (0.12%)
spills in affected programs: 700 -> 705 (0.71%)
helped: 5
HURT: 10

total fills in shared programs: 5801 -> 5817 (0.28%)
fills in affected programs: 1346 -> 1362 (1.19%)
helped: 6
HURT: 11

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Iago Toral Quiroga a1998a9f43 broadcom/compiler: disallow TMU spills if max tmu spills is 0
If we are compiling with a strategy that does not allow TMU spills
we should not allow spilling anything that is not a uniform.
Otherwise the RA cost/benefit algorithm may choose to spill a
temp that is not uniform and that will cause us to immediately
fail the strategy and fallback to the next one, even if we
could've instead chosen to spill more uniforms to compile the
program successfully with that strategy.

Some relevant shader-db stats:

total instructions in shared programs: 13040711 -> 13043585 (0.02%)
instructions in affected programs: 234238 -> 237112 (1.23%)
helped: 73
HURT: 172

total threads in shared programs: 415664 -> 415860 (0.05%)
threads in affected programs: 196 -> 392 (100.00%)
helped: 98
HURT: 0

total uniforms in shared programs: 3717266 -> 3721953 (0.13%)
uniforms in affected programs: 12831 -> 17518 (36.53%)
helped: 6
HURT: 100

total max-temps in shared programs: 2174177 -> 2173431 (-0.03%)
max-temps in affected programs: 4597 -> 3851 (-16.23%)
helped: 79
HURT: 21

total spills in shared programs: 4010 -> 4005 (-0.12%)
spills in affected programs: 55 -> 50 (-9.09%)
helped: 5
HURT: 0

total fills in shared programs: 5820 -> 5801 (-0.33%)
fills in affected programs: 186 -> 167 (-10.22%)
helped: 5
HURT: 0

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Iago Toral Quiroga cbb4d0dded broadcom/compiler: increase cost of TMU spills to 10
Our cost was 5 which matches the number of instructions we have to
add for a TMU spill (a fill is 4 instructions).

Uniform spills on the other hand add an extra instruction for each
fill and remove one instruction for the spill itself. These have
a cost of 1.

Therefore, if we have a single spill+fill, we end up with +9
instructions if it is a TMU spill and +0 instructions with a uniform
spill, so making the former only 5 times more costly is probably
not a good idea, and this is without even considering the added
latency of the TMU accesses.

Relevant shader-db changes show this causes as a marginal instruction
count increase in a few shaders but better thread counts and lower
TMU spilling overall:

total instructions in shared programs: 13037315 -> 13040711 (0.03%)
instructions in affected programs: 370106 -> 373502 (0.92%)
helped: 187
HURT: 321

total threads in shared programs: 415090 -> 415664 (0.14%)
threads in affected programs: 574 -> 1148 (100.00%)
helped: 287
HURT: 0

total uniforms in shared programs: 3706674 -> 3717266 (0.29%)
uniforms in affected programs: 63075 -> 73667 (16.79%)
helped: 40
HURT: 395

total max-temps in shared programs: 2176080 -> 2174177 (-0.09%)
max-temps in affected programs: 15838 -> 13935 (-12.02%)
helped: 316
HURT: 34

total spills in shared programs: 4247 -> 4010 (-5.58%)
spills in affected programs: 2599 -> 2362 (-9.12%)
helped: 107
HURT: 14

total fills in shared programs: 6121 -> 5820 (-4.92%)
fills in affected programs: 3622 -> 3321 (-8.31%)
helped: 108
HURT: 13

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Marek Olšák a02dd17cb3 radeonsi: fix an assertion failure with register shadowing
The problem is that dirty_states must be 0 for any state that is NULL
in "queued". This code was flagging dirty_states for such states because
it was only looking at "emitted". It should have been looking at "queued".

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15209>
2022-03-01 22:30:24 +00:00
Marek Olšák 0f96948dfa radeonsi: fix register shadowing after the pm4 state size was decreased
Fixes: 946bd90a09 "radeonsi: decrease the size of si_pm4_state::pm4 except for cs_preamble_state"

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15209>
2022-03-01 22:30:24 +00:00
Marek Olšák 66e20d2bf7 ac: add an environment variable that parses IBs in files
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15209>
2022-03-01 22:30:24 +00:00
Marek Olšák 3394f0ae14 ac: define PKT3_ATOMIC_MEM
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15209>
2022-03-01 22:30:24 +00:00