Commit Graph

139585 Commits

Author SHA1 Message Date
Emma Anholt d95f9d189a nir: Introduce a nir_vec_scalars() helper using nir_ssa_scalar.
Many users of nir_vec() do so by nir_channel()-ing a new ssa defs as movs
from other vectors to put the new vector together, which then just have to
get copy-propagated into the ALU srcs and DCEed away the temporary movs.
If they instead take nir_ssa_scalar, we can avoid that extra work.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14865>
2022-03-02 22:28:58 +00:00
Dave Airlie 48b3ef625e vulkan/wsi: handle queue families properly for non-concurrent sharing mode.
"queueFamilyIndexCount is the number of queue families having access to the image(s) of the
swapchain when imageSharingMode is VK_SHARING_MODE_CONCURRENT.
pQueueFamilyIndices is a pointer to an array of queue family indices having access to the
images(s) of the swapchain when imageSharingMode is VK_SHARING_MODE_CONCURRENT."

If the type isn't concurrent, don't attempt to access the arrays.

dEQP-VK.wsi.xlib.swapchain.create.exclusive_nonzero_queues on lavapipe.

Fixes: 5b13d74583 ("vulkan/wsi/drm: Break create_native_image in pieces")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15101>
2022-03-02 20:47:16 +00:00
Emma Anholt 221ce1b35a ci/freedreno: Consolidate some information about an a630 flake.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15197>
2022-03-02 19:53:26 +00:00
Emma Anholt a64408dcd5 ir3: Don't assert on not finding the VS output for an FS input.
It should return undefined data, not terminate the program.  Fixes some
piglit tests poking at the UB handling.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15197>
2022-03-02 19:53:26 +00:00
Rhys Perry feb7e30e2d radv: include disable_aniso_single_level and adjust_frag_coord_z in key
Fixes potential pipeline caching bug.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15175>
2022-03-02 19:05:28 +00:00
Emma Anholt 2de15273d5 freedreno: Improve robustness behavior for VBs with offset > size.
We were just emitting the bad reloc (either an assert fail on a debug
build or for a release build likely a GPU hang from the resulting fault).
Given that the GLES 3.2 spec's robust context requirement says we should
return undefined data but not terminate for element indices outside of the
VB, ignoring the offset in this case seems like a better behavior to have
in all cases.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15198>
2022-03-02 17:31:49 +00:00
Emma Anholt 92945825f5 freedreno: Fix start_slot handling in set_vertex_buffers.
mesa/st only ever provides a 0 for this value, but let's be ready if it
ever does something else.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15198>
2022-03-02 17:31:49 +00:00
Emma Anholt 35ddb65ea6 freedreno: Use the resource size rather than BO size for VFD_FETCH[].SIZE.
We should be using the API size to clamp, rather than what we allocated
the BO into.

Fixes: #13
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15198>
2022-03-02 17:31:49 +00:00
Lionel Landwerlin 96c8880900 intel/fs: fix total_scratch computation
We only have a single prog_data::total_scratch for all shader variants
(SIMD 8, 16, 32). Therefore we should always max the total_scratch on
top of existing variant.

We probably haven't run into that issue before because we compile by
increasing SIMD size and higher SIMD size is more likely to spill. But
for bindless shaders with return shaders, if the last return part
doesn't spill, we completely ignore the previous parts' scratch
computation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15193>
2022-03-02 13:13:03 +00:00
Juan A. Suarez Romero 5b43075888 v3d: enable texture filtering anisotropic
Seems we already had implemented this feature (see commit 521e1d0275
"broadcom/vc5: Add support for anisotropic filtering"), but we didn't
enable the proper capability.

Also update the maximum level of anistropy supported.

Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4201
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15180>
2022-03-02 12:55:17 +00:00
Caio Oliveira dc77542ed2 intel/compiler: Use pass helper in brw_nir_adjust_offset_for_arrayed_indices
Also change the code to preserve certain metadata: control flow is not changed
so both block indices and dominance information is preserved.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15206>
2022-03-02 10:46:23 +00:00
Iago Toral Quiroga f761f8fd9e broadcom/compiler: simplify node/temp translation during register allocation
Now that we don't sort our nodes we can arrange them so we can
easily translate between nodes and temps without a mapping table,
just applying an offset.

To do this we have a single array of nodes where twe put first the nodes
for accumulators and then the nodes for temps. With this setup we can
ensure that for any given temp T, its node is always T + ACC_COUNT.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Iago Toral Quiroga 871b0a7f6a broadcom/compiler: don't sort nodes for register allocation
Nodes are allocated in order to registers so initially sorting
was used to ensure that nodes with smaller life ranges would
be assigned first and therefore be more likely to get
accumulators.

However, since d81a6e5f1d now we don't rely on order to make
decisions about accumulators and instead we make policy decisions
based on actual liveness, so sorting is no longer strictly
relevant to this decision.

Furthermore, we are not re-sorting nodes after each spill either,
since that would probably require that we rebuild the interference
graph after each spill (the graph identifies nodes by their index).

Shader-db results show a significant improvement in instruction
counts, due to more optimal accumulator assignments. The reason for
this is that we use a round-robin policy for choosing the next
accumulator to assign. The idea behind this is preventing nearby
temps to be assigned to the same accumulator so that QPU scheduling
is more flexible, but if we  sort our nodes, we are basically not
assigning temps in program order any more and the round-robin policy
becomes less effective:

total instructions in shared programs: 13000420 -> 12663189 (-2.59%)
instructions in affected programs: 11791267 -> 11454036 (-2.86%)
helped: 62890
HURT: 19987

total threads in shared programs: 415874 -> 415870 (<.01%)
threads in affected programs: 20 -> 16 (-20.00%)
helped: 2
HURT: 4

total uniforms in shared programs: 3711652 -> 3711624 (<.01%)
uniforms in affected programs: 43430 -> 43402 (-0.06%)
helped: 134
HURT: 173

total max-temps in shared programs: 2144876 -> 2138822 (-0.28%)
max-temps in affected programs: 123334 -> 117280 (-4.91%)
helped: 4112
HURT: 1195

total spills in shared programs: 3870 -> 3860 (-0.26%)
spills in affected programs: 1013 -> 1003 (-0.99%)
helped: 14
HURT: 12

total fills in shared programs: 5560 -> 5573 (0.23%)
fills in affected programs: 1765 -> 1778 (0.74%)
helped: 14
HURT: 17

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Iago Toral Quiroga 4483cd24af broadcom/compiler: sink uniform loads
total instructions in shared programs: 13014428 -> 13000420 (-0.11%)
instructions in affected programs: 743624 -> 729616 (-1.88%)
helped: 1392
HURT: 611

total threads in shared programs: 415858 -> 415874 (<.01%)
threads in affected programs: 16 -> 32 (100.00%)
helped: 8
HURT: 0

total uniforms in shared programs: 3720410 -> 3711652 (-0.24%)
uniforms in affected programs: 113442 -> 104684 (-7.72%)
helped: 635
HURT: 29

total max-temps in shared programs: 2154268 -> 2144876 (-0.44%)
max-temps in affected programs: 61279 -> 51887 (-15.33%)
helped: 1124
HURT: 187

total spills in shared programs: 4002 -> 3870 (-3.30%)
spills in affected programs: 265 -> 133 (-49.81%)
helped: 6
HURT: 0

total fills in shared programs: 5788 -> 5560 (-3.94%)
fills in affected programs: 603 -> 375 (-37.81%)
helped: 6
HURT: 0

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Iago Toral Quiroga e228642cf5 broadcom/compiler: move constants before their first user
For us they are basically uniforms too so we want to make their
lifespans short to facilitate allocating them to accumulators.

total instructions in shared programs: 13043585 -> 13015385 (-0.22%)
instructions in affected programs: 8326040 -> 8297840 (-0.34%)
helped: 24939
HURT: 19894

total threads in shared programs: 415860 -> 415858 (<.01%)
threads in affected programs: 4 -> 2 (-50.00%)
helped: 0
HURT: 1

total uniforms in shared programs: 3721953 -> 3720451 (-0.04%)
uniforms in affected programs: 96134 -> 94632 (-1.56%)
helped: 744
HURT: 435

total max-temps in shared programs: 2173431 -> 2154260 (-0.88%)
max-temps in affected programs: 264598 -> 245427 (-7.25%)
helped: 10858
HURT: 841

total spills in shared programs: 4005 -> 4010 (0.12%)
spills in affected programs: 700 -> 705 (0.71%)
helped: 5
HURT: 10

total fills in shared programs: 5801 -> 5817 (0.28%)
fills in affected programs: 1346 -> 1362 (1.19%)
helped: 6
HURT: 11

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Iago Toral Quiroga a1998a9f43 broadcom/compiler: disallow TMU spills if max tmu spills is 0
If we are compiling with a strategy that does not allow TMU spills
we should not allow spilling anything that is not a uniform.
Otherwise the RA cost/benefit algorithm may choose to spill a
temp that is not uniform and that will cause us to immediately
fail the strategy and fallback to the next one, even if we
could've instead chosen to spill more uniforms to compile the
program successfully with that strategy.

Some relevant shader-db stats:

total instructions in shared programs: 13040711 -> 13043585 (0.02%)
instructions in affected programs: 234238 -> 237112 (1.23%)
helped: 73
HURT: 172

total threads in shared programs: 415664 -> 415860 (0.05%)
threads in affected programs: 196 -> 392 (100.00%)
helped: 98
HURT: 0

total uniforms in shared programs: 3717266 -> 3721953 (0.13%)
uniforms in affected programs: 12831 -> 17518 (36.53%)
helped: 6
HURT: 100

total max-temps in shared programs: 2174177 -> 2173431 (-0.03%)
max-temps in affected programs: 4597 -> 3851 (-16.23%)
helped: 79
HURT: 21

total spills in shared programs: 4010 -> 4005 (-0.12%)
spills in affected programs: 55 -> 50 (-9.09%)
helped: 5
HURT: 0

total fills in shared programs: 5820 -> 5801 (-0.33%)
fills in affected programs: 186 -> 167 (-10.22%)
helped: 5
HURT: 0

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Iago Toral Quiroga cbb4d0dded broadcom/compiler: increase cost of TMU spills to 10
Our cost was 5 which matches the number of instructions we have to
add for a TMU spill (a fill is 4 instructions).

Uniform spills on the other hand add an extra instruction for each
fill and remove one instruction for the spill itself. These have
a cost of 1.

Therefore, if we have a single spill+fill, we end up with +9
instructions if it is a TMU spill and +0 instructions with a uniform
spill, so making the former only 5 times more costly is probably
not a good idea, and this is without even considering the added
latency of the TMU accesses.

Relevant shader-db changes show this causes as a marginal instruction
count increase in a few shaders but better thread counts and lower
TMU spilling overall:

total instructions in shared programs: 13037315 -> 13040711 (0.03%)
instructions in affected programs: 370106 -> 373502 (0.92%)
helped: 187
HURT: 321

total threads in shared programs: 415090 -> 415664 (0.14%)
threads in affected programs: 574 -> 1148 (100.00%)
helped: 287
HURT: 0

total uniforms in shared programs: 3706674 -> 3717266 (0.29%)
uniforms in affected programs: 63075 -> 73667 (16.79%)
helped: 40
HURT: 395

total max-temps in shared programs: 2176080 -> 2174177 (-0.09%)
max-temps in affected programs: 15838 -> 13935 (-12.02%)
helped: 316
HURT: 34

total spills in shared programs: 4247 -> 4010 (-5.58%)
spills in affected programs: 2599 -> 2362 (-9.12%)
helped: 107
HURT: 14

total fills in shared programs: 6121 -> 5820 (-4.92%)
fills in affected programs: 3622 -> 3321 (-8.31%)
helped: 108
HURT: 13

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Marek Olšák a02dd17cb3 radeonsi: fix an assertion failure with register shadowing
The problem is that dirty_states must be 0 for any state that is NULL
in "queued". This code was flagging dirty_states for such states because
it was only looking at "emitted". It should have been looking at "queued".

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15209>
2022-03-01 22:30:24 +00:00
Marek Olšák 0f96948dfa radeonsi: fix register shadowing after the pm4 state size was decreased
Fixes: 946bd90a09 "radeonsi: decrease the size of si_pm4_state::pm4 except for cs_preamble_state"

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15209>
2022-03-01 22:30:24 +00:00
Marek Olšák 66e20d2bf7 ac: add an environment variable that parses IBs in files
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15209>
2022-03-01 22:30:24 +00:00
Marek Olšák 3394f0ae14 ac: define PKT3_ATOMIC_MEM
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15209>
2022-03-01 22:30:24 +00:00
Marek Olšák ff9e4409c1 ac: parse SET_SH_REG_INDEX packet
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15209>
2022-03-01 22:30:24 +00:00
Marek Olšák 0cae7a59c0 ac/llvm: update LLVM processor names for gfx10.3
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15209>
2022-03-01 22:30:24 +00:00
Marek Olšák 87d83f4103 ci: add point coord failures to d3d12
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:56 +00:00
Marek Olšák 5ca7c20cf7 st/mesa: do nir_lower_io() for inputs & outputs with transform feedback info
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:56 +00:00
Marek Olšák ee4c5b1699 gallium/aux: add helper nir_gather_stream_output_info
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák 2a708efec3 gallium/util: add util_dump_stream_output_info
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák 1dcd1eac6a nir: pass nir_shader into nir_recompute_io_bases instead of func_impl
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák 606811bded nir: add nir_print_xfb_info
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák ad68a1ee5a nir: add nir_gather_xfb_info_from_intrinsics for lowered IO
Drivers will use this.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák d4c051b047 nir: add nir_lower_io_passes() with new transform feedback
moved from radeonsi without the vectorization, which won't be needed for
now. We will lower IO in st/mesa instead of radeonsi to get the transform
feedback info into store instructions.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák 3528dcdfa1 nir: add nir_io_semantics::no_varying, no_sysval_output, and helpers
This is for drivers that have separate store instructions for varyings,
system value outputs (such as clip distances), and transform feedback.
The flags tell the driver not to store the output to those locations.

This will be used by radeonsi initially, and then maybe by a new linker.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák 548b2d47b2 nir: scalarize transform feedback info in nir_lower_io_to_scalar
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák cc5505088b nir: add shader_info::xfb_strides
NIR now fully contains pipe_stream_output_info in shader_info and IO
intrinsics if lower_io_variables is true. radeonsi will not use
pipe_stream_output_info after this, and other drivers are free to follow
that.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák 4636fa7f38 nir: add transform feedback info into nir_intrinsic_store_output
This will allow compaction of transform feedback varyings because they
are no longer tied to varying slots with this information.
It will also make transform feedback info available to all NIR passes
after IO is lowered. It's meant to replace pipe_stream_output_info.

Other intrinsics are not used with transform feedback.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák 2c6e41bfe1 nir: fix nir_io_semantics::gs_streams in nir_lower_io_to_scalar
gs_streams is relative to the component. Also clear the high bits.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák 73ef225fc2 nir: validate write_mask for all intrinsics that have it
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14388>
2022-03-01 21:59:55 +00:00
Marek Olšák dd733fa52e radeonsi: fix broken VK-GL buffer interop
Fixes: ad9b5ac0a1 - radeonsi: more fixes for si_buffer_from_winsys_buffer for GL-VK interop
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6063

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15124>
2022-03-01 20:51:04 +00:00
Nanley Chery c1a7d520f3 anv: Disable aux if the explicit modifier lacks it
For dmabuf imports, configure the primary surface without support for
compression if the modifier doesn't specify it. This helps to create VkImages
with memory requirements that are compatible with the buffers apps provide.

Suggested-by: Philip Langdale <philipl@overt.org>
Cc: 22.0 <mesa-stable>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5940
Tested-by: Philip Langdale <philipl@overt.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15181>
2022-03-01 20:05:50 +00:00
Nanley Chery cada519482 anv: Refactor anv_image_init_from_create_info
Use a variable to store the anv_image_create_info struct. We'll modify it for a
bug fix in the next patch.

Cc: 22.0 <mesa-stable>
Tested-by: Philip Langdale <philipl@overt.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15181>
2022-03-01 20:05:50 +00:00
Nanley Chery 8d2b7e558b anv: Change a parameter of the implicit layout fn
Replace the create_info parameter with isl_extra_usage_flags to more closely
match the parameters of explicit layout function.

Tested-by: Philip Langdale <philipl@overt.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15181>
2022-03-01 20:05:50 +00:00
Alyssa Rosenzweig c3eee6327c pan/va: Add missing copyright notice
Minor.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:23 +00:00
Alyssa Rosenzweig eda00fd39d pan/bi: Extract INSTRUCTION_CASE macro
Useful across multiple optimization tests.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:23 +00:00
Alyssa Rosenzweig ffde1f359b pan/bi: Adapt bi_lower_branch for Valhall
Disable the Bifrost optimization; it's not portable.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:23 +00:00
Alyssa Rosenzweig f3937d9874 pan/bi: Trade off registers/threads on Valhall
It's only v6 that's missing this feature.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:23 +00:00
Alyssa Rosenzweig 7637502c8d pan/bi: Add BI_SUBGROUP_SUBGROUP16 option
Valhall uses 16-wide warps.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig ec9c1f8fa6 pan/bi: Wire Valhall disassembler into compiler
Useful when we grow Valhall support (soon!)

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig 31e991d801 pan/bi: Support standalone Valhall disassembly
$ bifrost_compiler disasm --gpu=G78 foo.bin

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig 600f689a98 pan/bi: Allow CSE of preloaded registers
Needed to CSE `LEA_VARY` in varying shaders on Valhall.

No shader-db changes on Bifrost.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig 3154df232b pan/bi: Use a progress loop for constant folding
Needed to fold the dependent patterns produced by texture instructions
during NIR->Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig e5582710f3 pan/bi: Mark NOP as having no destinations
More accurate and more convenient.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig 2604c65174 panfrost: Unify barrier+helper handling
These are unified in the hardware, so let's unify them in pan_shader_info.
Hoisting this logic to pan_shader.c avoids the need to duplicate this logic for
Midgard/Bifrost (RSD packing) and Valhall (SPD packing).

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig 30d0c2e390 panfrost: Set texel_interleave on Valhall
Instead of specifying the tiling on the texture descriptor, Valhall specifies it
on the plane descriptor. There is a new flag on the texture descriptor
specifying only whether the planes are interleaved or not.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig 407bda4d8c panfrost: Adapt estimate_texture_payload_size to Valhall
The plane descriptor is larger than earlier surface descriptors, so we need to
be somewhat careful here. This removes a memory micro-optimization in the
interest of simplifying the code.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig 469a36d071 panfrost: Don't emit compression tags on Valhall
Unnecessary. To avoid even more #if/#endif soup, merge the v4, v5-v8, and v9
paths together -- by returning 0 as the compression tag on v4 or v9.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig 087b63cb07 panfrost: Allow uploading fragment SPDs
SPDs don't have the state dependence that fragment RSDs do.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig e42b0c68f4 panfrost: Don't pack blend constants with blend shaders
It's probably harmless, but it is logically meaningless. The DDK doesn't do it,
I don't see a reason for us to, either. In theory this should be a small
overhead win.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig 111f5af303 panfrost: Generalize some is_bifrost users
Valhall would want these too. Regretting the is_bifrost check at all..

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig 36a2b8d039 panfrost: Add PAN_MESA_DEBUG=dump option
To dump all graphics memory via the new pandecode_dump_mappings function(),
since for Valhall I have to do this often enough to warrant a dynamic flag.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig 28743a5556 panfrost: Rename prepare_rsd->prepare_shader
This hook will be repurposed on Valhall to prepare the Shader Program
Descriptor, which takes the role of the RSD. Rename to avoid confusion.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig 631c01fc42 panfrost: Add an enum for Valhall resource tables
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig f3c971e0fe panfrost: Make Divisor E an integer on v9
For consistency with previous architecture's XML files. Logically this is an
1-bit unsigned integer, not a boolean.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig b19afaf307 panfrost: Clarify contains descriptor? bit
Influences cache prefetching. I don't see a good reason to put anything other
than descriptors inside shader resources, meaning always setting this bit is
appropriate (at least for GLES).

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig 1df6b0d7e2 panfrost: Remove Invalidate Cache from Valhall job header
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig 217e038289 panfrost: Add Tile Render Order enum to fragment jobs
Not sure what this is needed for yet.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Alyssa Rosenzweig 52ccd21e6b panfrost: Extend SPD size
There is software-defined state at the end we don't need. Model in the XML for
correct behaviour.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15204>
2022-03-01 19:43:22 +00:00
Thong Thai 0136545d16 radeonsi: add check for graphics to si_try_normal_clear
Cc: mesa-stable
Signed-off-by: Thong Thai <thong.thai@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15177>
2022-03-01 19:05:06 +00:00
Lionel Landwerlin 214092da87 anv: fix fast clear type value with external images
Disable fast clear if not supported by the external modifier.

v2: Set fast_clear value to NONE in case of import/export from/to external

v3: Move logic next to existing acquire/release checks (Nanley)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6056
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15096>
2022-03-01 17:37:13 +00:00
Oleksandr Gabrylchuk 02fab4cf9e venus: Implement guest vram blob type.
Add support of GUEST_VRAM type of blob. These are dedicated heap memory
allocations required for vk support on hypervisors that don't support
runtime injections of host memory into guest physical address space.

The flow of usage:
1) Host VM reserves dedicated heap memory
2) Device get info about memory reservations and report it to guest
using mmio registers
3) Guest virtio-gpu driver on starts checks mmio registers for
physical address and length of reserved region. Then it reserves it
in guest.
4) On each call of vkAllocateMemory() guest driver gets chunk of
required memory and send it to host using sg list. It uses one sg
entry for 1 blob call. Heap is managed on guest using drm memory
manager (drm_mm).

Signed-off-by: Oleksandr.Gabrylchuk <Oleksandr.Gabrylchuk@opensynergy.com>
Signed-off-by: Andrii Pauk <Andrii.Pauk@opensynergy.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14536>
2022-03-01 17:25:56 +00:00
Marek Olšák fd3451babd amd: update addrlib
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Tested-by: Yifan Zhang <yifan1.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15155>
2022-03-01 17:03:00 +00:00
Marek Olšák f8cf5ea982 amd: add support for gfx1036 and gfx1037 chips
Both are identified as GFX1036 for simplicity.

Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Tested-by: Yifan Zhang <yifan1.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15155>
2022-03-01 17:03:00 +00:00
Marek Olšák 48046d5bd8 ac: set correct cache size per TCC for Yellow Carp
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Tested-by: Yifan Zhang <yifan1.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15155>
2022-03-01 17:03:00 +00:00
Samuel Pitoiset 4380916b76 radv: disable DCC for Fable Anniversary, Dragons Dogma, GTA IV and more
Also Starcraft 2 and The Force Unleashed II.

These games are known to be affected by the feedback loop issue. We will
fix this properly soon but as a hotfix disabling DCC should be enough.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4424
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15203>
2022-03-01 16:33:18 +00:00
Vadym Shovkoplias dc921f7377 iris: Do not apply SCANOUT allocation flags for SHARED-only requests
It provides similar solution as in [1].

This was workaround for the users of gbm_bo_create_with_modifiers(),
which were unable to specify the buffer usage (GPU / GPU+DISPLAY).

But after the commit [2] this become possible. And forcing usage to
GBM_BO_USE_SCANOUT migrated directly into gbm_bo_create_with_modifiers
[3], allowing us to remove such workarounds from the drivers.

[1]: ef3b31c9 ("v3d: Don't force SCANOUT for PIPE_BIND_SHARED requests")
[2]: 268e12c6 ("gbm: add gbm_{bo,surface}_create_with_modifiers2")
[3]: ad50b47a ("gbm: assume USE_SCANOUT in create_with_modifiers")

Suggested-by: Roman Stratiienko <roman.o.stratiienko@globallogic.com>
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5642
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14264>
2022-03-01 16:04:44 +00:00
Timur Kristóf 93087f71e6 ac/nir: Extract final mesh shader output counts to a separate function.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15199>
2022-03-01 15:37:12 +00:00
Timur Kristóf 11957d3863 aco: Remove superfluous code for mesh shader workgroup ID.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15199>
2022-03-01 15:37:12 +00:00
Timur Kristóf 2d5aae032b ac/nir: Properly invalidate mesh shader metadata.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15199>
2022-03-01 15:37:12 +00:00
Timur Kristóf 3a3bd9cff1 ac/nir: Fix workgroup ID in mesh shader waves other than the first.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15199>
2022-03-01 15:37:12 +00:00
Timur Kristóf 57775dd76a ac/nir: Store mesh shader API and HW workgroup size in lowering state.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15199>
2022-03-01 15:37:12 +00:00
Timur Kristóf d0f45c7c49 ac/nir: Reuse existing nir_builder for emit_ms_finale.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15199>
2022-03-01 15:37:12 +00:00
Timur Kristóf 74f1e7965e ac/nir: Use vertex count minus 1 to determine max index in mesh shaders.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15199>
2022-03-01 15:37:12 +00:00
Charlie Turner 16b417b8d6 ci, valve: Add the dEQP runners for Valve CI
v2.

  - Build the runner image as part of the CI for the boot2container
  project, rather than as a manually step using the build instructions
  in valve-trigger.dockerfile.

  - Depend on a non-default kernel build hosted in the valve-infra
  package repository. This does reduce the current caching feature of
  local artifacts, but makes it easier to chop and change kernels on a
  per-project or even per-test basis.

v3.

  - Depend on a kernel built and stored in the valve-infra generic
  package repo.

  - Build the runner container using ci-templates as part of the CI in
  valve-infra.

  - Now that the runner container is built in the valve-infra CI, I
  dropped the source import of client.py and message.py. They are
  built in the runner container.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14660>
2022-03-01 13:04:14 +00:00
Charlie Turner f0aee991bf amd, ci: Categorize the sections of the CI file.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14660>
2022-03-01 13:04:14 +00:00
Charlie Turner 58186df32c amd, ci: Drop log level in SPIRV -> NIR code generator.
See 786fa3435c for the rationale of this variable, but the point is to
avoid many error reports for conformance conformance issues within the
VK-CTS shaders.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14660>
2022-03-01 13:04:14 +00:00
Charlie Turner cc327a0fe4 amd, ci: Remove unused runners.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14660>
2022-03-01 13:04:14 +00:00
Samuel Pitoiset 1e010348ee radv: remove color exports in presence of holes
If there is holes, eg. if only MRT0 and MRT2 are exported, we have to
set MRT1 to SPI_SHADER_32_R to avoid a GPU hang but the export can
still be removed from the fragment shader.

fossils-db (Sienna Cichlid):
Totals from 565 (0.42% of 134913) affected shaders:
VGPRs: 13328 -> 11456 (-14.05%)
CodeSize: 613232 -> 548224 (-10.60%); split: -11.13%, +0.53%
LDS: 284672 -> 296960 (+4.32%)
MaxWaves: 17624 -> 17684 (+0.34%)
Instrs: 113056 -> 100445 (-11.15%); split: -11.68%, +0.53%
Latency: 684327 -> 639348 (-6.57%); split: -7.17%, +0.60%
InvThroughput: 122877 -> 104382 (-15.05%); split: -15.18%, +0.13%
VClause: 2601 -> 2323 (-10.69%); split: -10.77%, +0.08%
SClause: 5629 -> 5443 (-3.30%); split: -3.91%, +0.60%
Copies: 9393 -> 8720 (-7.16%); split: -8.22%, +1.05%
PreSGPRs: 14623 -> 13666 (-6.54%); split: -6.76%, +0.22%
PreVGPRs: 9847 -> 8503 (-13.65%)

fossils-db (Polaris10):
Totals from 565 (0.42% of 135960) affected shaders:
SGPRs: 28064 -> 27104 (-3.42%)
VGPRs: 12516 -> 10544 (-15.76%); split: -15.79%, +0.03%
CodeSize: 516920 -> 456536 (-11.68%); split: -11.68%, +0.00%
MaxWaves: 4369 -> 4418 (+1.12%)
Instrs: 97771 -> 85903 (-12.14%); split: -12.14%, +0.00%
Latency: 767482 -> 708545 (-7.68%); split: -7.97%, +0.29%
InvThroughput: 280017 -> 235744 (-15.81%)
VClause: 2270 -> 2090 (-7.93%); split: -8.50%, +0.57%
SClause: 5185 -> 5012 (-3.34%); split: -3.86%, +0.52%
Copies: 8328 -> 7555 (-9.28%); split: -9.35%, +0.07%
Branches: 1143 -> 1113 (-2.62%)
PreSGPRs: 13816 -> 12725 (-7.90%); split: -7.92%, +0.02%
PreVGPRs: 9707 -> 8270 (-14.80%)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15108>
2022-03-01 12:28:47 +01:00
Rhys Perry f800af2231 ac/nir: remove TCS nir_var_shader_out memory barrier
nir_var_shader_out writes are only used for later TES invocations, so I
don't think there's any need for the TCS workgroup to wait for them.

fossil-db (Sienna Cichlid):
Totals from 1691 (1.04% of 162293) affected shaders:
Instrs: 710699 -> 709008 (-0.24%)
CodeSize: 3830168 -> 3823404 (-0.18%)
Latency: 3396997 -> 3007934 (-11.45%)
InvThroughput: 1212094 -> 1082823 (-10.67%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15195>
2022-03-01 11:02:43 +00:00
Caio Oliveira 7460199a2f intel/compiler: Lower Task/Mesh I/O before SIMD specific lowering
These are the same for all variants, so just lower it before cloning
the nir_shader for each of them.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15019>
2022-03-01 07:35:13 +00:00
Danylo Piliaiev 549e861dc1 turnip: Implement VK_EXT_physical_device_drm
Copied from ANV and V3DV.

v1. Fix a build error for clang "unannotated fall-through between switch labels"
( Hyunjun Ko <zzoon.ko@igalia.com> )

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6011

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14971>
2022-03-01 07:10:40 +00:00
Pierre-Eric Pelloux-Prayer bb6ba8f21f radeonsi/drirc: use force_gl_vendor for Maya
Otherwise OpenCL initialization fails with "unknown vendor id 0".

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15151>
2022-03-01 07:43:26 +01:00
Ilia Mirkin d3196bac51 nouveau: add dEQP/GLCTS run failure info for GF108/GT215
I happened to have these plugged in. Ran them against mesa 21.3 and
recent VK-GL-CTS tree (shortly after vulkan-cts-1.2.8).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14797>
2022-02-28 22:36:31 -05:00
Nanley Chery dc05615ec1 Revert "anv: Require the local heap for CCS on XeHP"
This reverts commit 382f6ccda8.

The spec requires that all color images created with the same tiling
(and a few other properties) support the same memoryTypeBits. So this
wasn't a valid change. It also wasn't necessary - we already have a
mechanism in anv_BindImageMemory2 for disabling compression if the BO
doesn't support it.

With this, XeHP passes the tests in
dEQP-VK.memory.requirements.*tiling_optimal

Fixes: 382f6ccd ("anv: Require the local heap for CCS on XeHP")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15068>
2022-03-01 00:02:51 +00:00
Nanley Chery 203c8be09f anv: Add a perf warning in anv_BindImageMemory2
It reports: "BO lacks implicit CCS. Disabling the CCS aux usage."

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15068>
2022-03-01 00:02:51 +00:00
Nanley Chery 74e446b45b anv: Fall back to HiZ when disabling CCS on HiZ+CCS
When an image configured for HIZ_CCS/HIZ_CCS_WT is bound to a BO lacking
implicit CCS, we disable any compression it may have had. Such images
are still compatible with ISL_AUX_USAGE_HIZ however. Fall back to that
aux usage to retain the performance benefit.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15068>
2022-03-01 00:02:51 +00:00
Nanley Chery ffbde42b93 anv: Don't disable HiZ/MCS in anv_BindImageMemory2
When an image is bound to a BO lacking implicit CCS, we disable any
compression it may have had. This is unnecessary in the cases where the
compression type doesn't depend on the BO having implicit CCS support.
Avoid this disabling for ISL_AUX_USAGE_MCS and ISL_AUX_USAGE_HIZ.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15068>
2022-03-01 00:02:51 +00:00
Connor Abbott ed9a0d48a9 ir3: Use isam for bindless images
In the bindless case, we don't have to keep any shadow descriptors and
can just reuse the IBO descriptor as a texture descriptor. Now that
we're emitting the swizzle we can just flip this on.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
2022-02-28 23:33:22 +00:00
Connor Abbott 06485f7d3d tu: Call nir_opt_access
This adds some small optimizations, and enables lowering to isam in more
cases where the app didn't specify readonly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
2022-02-28 23:33:22 +00:00
Connor Abbott 58d72f45e5 ir3/nir: Fix 1d array readonly images
ncoords includes the array index, and the NIR source has the array index
as its last component, so we have to insert the extra y coordinate in
the middle in this case.

Fixes: 0bb0cac ("freedreno/ir3: handle image buffer")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
2022-02-28 23:33:22 +00:00
Connor Abbott 21ac044c3e ir3: Don't always set bindless_tex with readonly images
Fixes: 274f381 ("ir3: Plumb through bindless support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
2022-02-28 23:33:22 +00:00
Connor Abbott bb1e0eba08 freedreno/fdl: Set swizzle on storage descriptor
It appears to be unused by ldib/stib, but it will let us use isam on IBO
descriptors for bindless images.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
2022-02-28 23:33:22 +00:00
Connor Abbott 00be8c4619 freedreno: Replace A6XX_IBO with A6XX_TEX_CONST
Since these were reverse-engineered, it's become clear that IBO
descriptors are just a subset of texture descriptors, and bindless reads
of readonly images actually use isam on the IBO descriptor, further
confirming that the two are always compatible, even if not all of the
texture fields exist for IBOs. It's pointless to have a separate type
for IBOs, and just leads to things getting out-of-sync unnecessarily
which has already happened. Just remove it and use TEX_CONST insted.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
2022-02-28 23:33:22 +00:00
Connor Abbott e1c4c2ac60 ir3: Use CAN_REORDER instead of NON_WRITEABLE
CAN_REORDER takes volatile into account, and is closer to what we
actually require to use texture instructions, which is that we can
arbitrarily reorder loads.

Fixes: aa93896 ("freedreno/ir3: adjust condition for when to use ldib")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
2022-02-28 23:33:22 +00:00
Danylo Piliaiev 95fabff8de turnip: Set drmFormatModifierTilingFeatures
From Vulkan spec for VkDrmFormatModifierProperties2EXT:

 "drmFormatModifierTilingFeatures is a bitmask of VkFormatFeatureFlagBits
  that are supported by any image created with format and drmFormatModifier."

 "The returned drmFormatModifierTilingFeatures must contain at least one bit."

 "Therefore, if the returned drmFormatModifier is DRM_FORMAT_MOD_LINEAR,
  then drmFormatModifierPlaneCount must equal the format planecount, and
  drmFormatModifierTilingFeatures must be identical to the
  VkFormatProperties2::linearTilingFeatures returned in the same pNext chain."

Relevant tests: dEQP-VK.drm_format_modifiers.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15032>
2022-02-28 22:53:40 +00:00
Mike Blumenkrantz 473f488639 zink: add layer asserts for 3d imageview creation
make sure there's no other mishaps here in the future

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15172>
2022-02-28 17:42:58 +00:00
Mike Blumenkrantz 8e67928862 zink: more accurately clamp 3d fb surfaces to corresponding 2d target
if more than 1 layer is being bound, this is an array, otherwise it's just
regular 2d

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15172>
2022-02-28 17:42:58 +00:00
Mike Blumenkrantz 59b0105e65 zink: clamp 3d/array shader images to lower dimensionality using layer counts
this creates the view type expected by the shader instead of doing weird stuff
like trying to create a 3D imageview with layers > 1

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15172>
2022-02-28 17:42:58 +00:00
Mike Blumenkrantz 26d05e5a38 zink: directly create surfaces for shader images
avoid the implicit clamping of fb surfaces in zink_create_surface()
in order to provide more granularity

no functional changes

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15172>
2022-02-28 17:42:58 +00:00
Mike Blumenkrantz 69ec429c00 zink: restrict clear flushing on sampler/image bind to compute binds
this is otherwise going to be handled on the next renderpass start

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15172>
2022-02-28 17:42:58 +00:00
Mike Blumenkrantz b7b494299d zink: use VK_EXT_depth_clip_control when available
this saves a few ALUs in vertex stages

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15174>
2022-02-28 17:21:29 +00:00
Mike Blumenkrantz 95708c13ee glx/drisw: handle GL_RESET_NOTIFICATION_STRATEGY
fixes (llvmpipe):
KHR-NoContext.gl45.robustness.lose_context_on_reset

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15061>
2022-02-28 16:10:00 +00:00
Mike Blumenkrantz fba13486df zink: update psiz handling to fix xfb output
now when gl_PointSize and gl_PointSizeMESA are both present, the former
will be used for xfb with a new location and the latter will be
exported by the shader

fixes (zink):
GTF-GL46.gtf30.GL3Tests.transform_feedback.transform_feedback_builtins

Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15184>
2022-02-28 15:42:20 +00:00
Mike Blumenkrantz b28cff9f4a nir/lower_psiz_mov: stop clobbering existing exports
for this pass to work with xfb, the original value in the shader must be
preserved when xfb is active, and the driver must export only the newly
created output

Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15184>
2022-02-28 15:42:19 +00:00
Mike Blumenkrantz 3267417c22 nir/lower_psiz: create the store instruction more accurately
creating this at the start of the shader means it will get optimized out
when the pass is used to overwrite existing psiz values, and creating it
at the end means it will get optimized out in geometry shaders, so instead
just walk the instructions and create another store right after the existing one

Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15184>
2022-02-28 15:42:19 +00:00
Jonathan Gray 6250a3bc18 util: use correct type in sysctl argument
Fixes build on OpenBSD/macppc powerpc

error: incompatible pointer types passing 'int *' to parameter of type 'size_t *'
    (aka 'unsigned long *') [-Werror,-Wincompatible-pointer-types]

Fixes: 01bd21eef8 ("gallium: Import Dennis Smit cpu detection code.")
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6511>
2022-02-28 14:28:23 +00:00
Jonathan Gray 0536b69133 util: fix build with clang 10 on mips64
On mips64, the compiler does not allow use of non-zero argument with
__builtin_frame_address(). However, the returned frame address is only
used when PIPE_ARCH_X86 is defined. The compile error can be avoided
by making #ifdef PIPE_ARCH_X86 cover the getting of frame address too.

The argument checking of __builtin_frame_address() has been present
as a debug assert in clang 8. In clang 10, there is a proper runtime
check for the argument. This is why the build has not failed before.

Fixes: dc94a0506f ("gallium: Do not add -Wframe-address option for gcc <= 4.4.")
from Visa Hankala

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6511>
2022-02-28 14:28:23 +00:00
Jonathan Gray f12c107b03 util/u_atomic: fix build on clang archs without 64-bit atomics
Make this build on clang architectures that don't have 64-bit atomic
instructions.  Clang doesn't allow redeclaration (and therefore
redefinition) of the __sync_* builtins.  Use #pragma redefine_extname
to work around that restriction.  Clang also turns __sync_add_and_fetch
into __sync_fetch_and_add (and __sync_sub_and_fetch into
__sync_fetch_and_sub) in certain cases, so provide these functions as
well.

Fixes: a6a38a038b ("util/u_atomic: provide 64bit atomics where they're missing")
patch from Mark Kettenis

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6511>
2022-02-28 14:28:23 +00:00
Daniel Stone d07df90bf4 Revert "CI: Disable Panfrost T720 jobs"
This reverts commit 35209b94a6c7d88fb67b6446fda8f8daf556c911.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15191>
2022-02-28 14:08:34 +00:00
Daniel Stone bd55458304 Revert "CI: Disable panfrost-t760"
This reverts commit b9b444e0b8bc318cea2a93ec04b0a383c444180e.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15191>
2022-02-28 14:08:34 +00:00
Adrián Larumbe 6d0824abcc panfrost: fix segfault in pandecode
The structure wrapped around the rb tree node was being freed, but not the node
itself, which caused a segmentation fault when accessing its parent node.

Add rb tree node remove call to fix it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15188>
2022-02-28 12:40:32 +00:00
Daniel Stone af0f9a31b3 CI: Disable Panfrost T720 jobs
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15189>
2022-02-28 07:08:38 +00:00
Daniel Stone 114e48e923 CI: Disable panfrost-t760
The DUTs are extremely tempremental for some reason.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15186>
2022-02-27 20:26:17 +00:00
Mike Blumenkrantz e1964e1dde zink: don't free non-fbfetch dsl structs when switching to fbfetch
this triggers invalid access when recycling in-flight non-fbfetch sets

cc: mesa-stable

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15173>
2022-02-26 15:26:08 +00:00
Mike Blumenkrantz 03a80490a4 zink: free push descriptor pools on deinit
these are owned by the context, so destroy them when the context
requests destruction

cc: mesa-stable

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15173>
2022-02-26 15:26:08 +00:00
Mike Blumenkrantz 698ae34844 zink: fix cached descriptor set invalidation for array bindings
need to iterate over the descriptors in the binding to invalidate the whole
thing here

=================================================================
==546534==ERROR: AddressSanitizer: heap-use-after-free on address 0x61a0000ae6c0 at pc 0x7fe20e26fd9d bp 0x7ffd92be6bc0 sp 0x7ffd92be6bb8
READ of size 8 at 0x61a0000ae6c0 thread T0
    #0 0x7fe20e26fd9c in zink_descriptor_set_refs_clear ../src/gallium/drivers/zink/zink_descriptors.c:950
    #1 0x7fe20e401304 in zink_destroy_surface ../src/gallium/drivers/zink/zink_surface.c:340
    #2 0x7fe20e21311b in zink_surface_reference ../src/gallium/drivers/zink/zink_surface.h:106
    #3 0x7fe20e21a5b9 in zink_sampler_view_destroy ../src/gallium/drivers/zink/zink_context.c:835
    #4 0x7fe20c41d35f in tc_sampler_view_destroy ../src/gallium/auxiliary/util/u_threaded_context.c:1848
    #5 0x7fe20e210ff7 in pipe_sampler_view_reference ../src/gallium/auxiliary/util/u_inlines.h:216
    #6 0x7fe20e22d592 in zink_set_sampler_views ../src/gallium/drivers/zink/zink_context.c:1532
    #7 0x7fe20c41a3d8 in tc_call_set_sampler_views ../src/gallium/auxiliary/util/u_threaded_context.c:1393
    #8 0x7fe20c411706 in tc_batch_execute ../src/gallium/auxiliary/util/u_threaded_context.c:211
    #9 0x7fe20c4124ba in _tc_sync ../src/gallium/auxiliary/util/u_threaded_context.c:362
    #10 0x7fe20c42b728 in tc_destroy ../src/gallium/auxiliary/util/u_threaded_context.c:4250
    #11 0x7fe20b65176a in st_destroy_context_priv ../src/mesa/state_tracker/st_context.c:387
    #12 0x7fe20b65669f in st_destroy_context ../src/mesa/state_tracker/st_context.c:1009
    #13 0x7fe20b7055ab in st_context_destroy ../src/mesa/state_tracker/st_manager.c:944
    #14 0x7fe20a9c75bd in dri_destroy_context ../src/gallium/frontends/dri/dri_context.c:256
    #15 0x7fe20a9d4bef in driDestroyContext ../src/gallium/frontends/dri/dri_util.c:534
    #16 0x7fe22361f25c in drisw_destroy_context ../src/glx/drisw_glx.c:429
    #17 0x7fe223625d95 in glXDestroyContext ../src/glx/glxcmds.c:523
    #18 0x7fe22636aaeb in glXDestroyContext /home/zmike/src/libglvnd-v1.3.2/src/GLX/libglx.c:332
    #19 0x7fe2269d9e7d in glXDestroyContext /home/zmike/src/libglvnd-v1.3.2/src/GL/g_libglglxwrapper.c:384
    #20 0x41b88a in tcu::lnx::x11::glx::GlxRenderContext::~GlxRenderContext() /home/zmike/src/VK-GL-CTS/framework/platform/lnx/X11/tcuLnxX11GlxPlatform.cpp:734
    #21 0x41b8e9 in tcu::lnx::x11::glx::GlxRenderContext::~GlxRenderContext() /home/zmike/src/VK-GL-CTS/framework/platform/lnx/X11/tcuLnxX11GlxPlatform.cpp:735
    #22 0x2323aa7 in deqp::gles31::Context::destroyRenderContext() /home/zmike/src/VK-GL-CTS/modules/gles31/tes31Context.cpp:77
    #23 0x2323969 in deqp::gles31::Context::~Context() /home/zmike/src/VK-GL-CTS/modules/gles31/tes31Context.cpp:55
    #24 0x232278e in deqp::gles31::TestPackage::deinit() /home/zmike/src/VK-GL-CTS/modules/gles31/tes31TestPackage.cpp:102
    #25 0x2c866c2 in tcu::DefaultHierarchyInflater::leaveTestPackage(tcu::TestPackage*) /home/zmike/src/VK-GL-CTS/framework/common/tcuTestHierarchyIterator.cpp:75
    #26 0x2c87058 in tcu::TestHierarchyIterator::next() /home/zmike/src/VK-GL-CTS/framework/common/tcuTestHierarchyIterator.cpp:252
    #27 0x2c365da in tcu::TestSessionExecutor::iterate() /home/zmike/src/VK-GL-CTS/framework/common/tcuTestSessionExecutor.cpp:122
    #28 0x2c00b0c in tcu::App::iterate() /home/zmike/src/VK-GL-CTS/framework/common/tcuApp.cpp:221
    #29 0x4141b7 in main /home/zmike/src/VK-GL-CTS/framework/platform/tcuMain.cpp:58
    #30 0x7fe2263e155f in __libc_start_call_main (/lib64/libc.so.6+0x2d55f)
    #31 0x7fe2263e160b in __libc_start_main_impl (/lib64/libc.so.6+0x2d60b)
    #32 0x413fa4 in _start (/home/zmike/src/VK-GL-CTS/build/external/openglcts/modules/glcts+0x413fa4)

0x61a0000ae6c0 is located 64 bytes inside of 1328-byte region [0x61a0000ae680,0x61a0000aebb0)
freed by thread T0 here:
    #0 0x7fe226cb6627 in free (/usr/lib64/libasan.so.6+0xae627)
    #1 0x7fe20aab1751 in unsafe_free ../src/util/ralloc.c:302
    #2 0x7fe20aab16c8 in unsafe_free ../src/util/ralloc.c:295
    #3 0x7fe20aab13c3 in ralloc_free ../src/util/ralloc.c:265
    #4 0x7fe20e269234 in descriptor_pool_free ../src/gallium/drivers/zink/zink_descriptors.c:286
    #5 0x7fe20e26937d in descriptor_pool_delete ../src/gallium/drivers/zink/zink_descriptors.c:296
    #6 0x7fe20e26ff53 in zink_descriptor_pool_reference ../src/gallium/drivers/zink/zink_descriptors.c:967
    #7 0x7fe20e270db2 in zink_descriptor_program_deinit ../src/gallium/drivers/zink/zink_descriptors.c:1071
    #8 0x7fe20e3b6536 in zink_destroy_gfx_program ../src/gallium/drivers/zink/zink_program.c:695
    #9 0x7fe20e1eaaf9 in zink_gfx_program_reference ../src/gallium/drivers/zink/zink_program.h:242
    #10 0x7fe20e20d386 in zink_shader_free ../src/gallium/drivers/zink/zink_compiler.c:2099
    #11 0x7fe20e3b9f0b in zink_delete_shader_state ../src/gallium/drivers/zink/zink_program.c:1074
    #12 0x7fe20c3e29ad in util_shader_reference ../src/gallium/auxiliary/util/u_live_shader_cache.c:188
    #13 0x7fe20e3ba11e in zink_delete_cached_shader_state ../src/gallium/drivers/zink/zink_program.c:1093
    #14 0x7fe20c41709e in tc_call_delete_fs_state ../src/gallium/auxiliary/util/u_threaded_context.c:998
    #15 0x7fe20c411706 in tc_batch_execute ../src/gallium/auxiliary/util/u_threaded_context.c:211
    #16 0x7fe20c4124ba in _tc_sync ../src/gallium/auxiliary/util/u_threaded_context.c:362
    #17 0x7fe20c423683 in tc_flush ../src/gallium/auxiliary/util/u_threaded_context.c:3003
    #18 0x7fe20b62d996 in st_flush ../src/mesa/state_tracker/st_cb_flush.c:60
    #19 0x7fe20b62dbe3 in st_glFlush ../src/mesa/state_tracker/st_cb_flush.c:94
    #20 0x7fe20ae4bded in _mesa_make_current ../src/mesa/main/context.c:1493
    #21 0x7fe20ae49702 in _mesa_free_context_data ../src/mesa/main/context.c:1187
    #22 0x7fe20b65668b in st_destroy_context ../src/mesa/state_tracker/st_context.c:1005
    #23 0x7fe20b7055ab in st_context_destroy ../src/mesa/state_tracker/st_manager.c:944
    #24 0x7fe20a9c75bd in dri_destroy_context ../src/gallium/frontends/dri/dri_context.c:256
    #25 0x7fe20a9d4bef in driDestroyContext ../src/gallium/frontends/dri/dri_util.c:534
    #26 0x7fe22361f25c in drisw_destroy_context ../src/glx/drisw_glx.c:429
    #27 0x7fe223625d95 in glXDestroyContext ../src/glx/glxcmds.c:523
    #28 0x7fe22636aaeb in glXDestroyContext /home/zmike/src/libglvnd-v1.3.2/src/GLX/libglx.c:332
    #29 0x7fe2269d9e7d in glXDestroyContext /home/zmike/src/libglvnd-v1.3.2/src/GL/g_libglglxwrapper.c:384

previously allocated by thread T0 here:
    #0 0x7fe226cb691f in __interceptor_malloc (/usr/lib64/libasan.so.6+0xae91f)
    #1 0x7fe20aab0c81 in ralloc_size ../src/util/ralloc.c:120
    #2 0x7fe20aab0e33 in rzalloc_size ../src/util/ralloc.c:153
    #3 0x7fe20aab12c8 in rzalloc_array_size ../src/util/ralloc.c:233
    #4 0x7fe20e26c76d in allocate_desc_set ../src/gallium/drivers/zink/zink_descriptors.c:657
    #5 0x7fe20e26e9cb in zink_descriptor_set_get ../src/gallium/drivers/zink/zink_descriptors.c:840
    #6 0x7fe20e2747aa in zink_descriptors_update ../src/gallium/drivers/zink/zink_descriptors.c:1424
    #7 0x7fe20e36fc48 in void zink_draw<(zink_multidraw)1, (zink_dynamic_state)2, true, false>(pipe_context*, pipe_draw_info const*, unsigned int, pipe_draw_indirect_info const*, pipe_draw_start_count_bias const*, unsigned int, pipe_vertex_state*, unsigned int) ../src/gallium/drivers/zink/zink_draw.cpp:788
    #8 0x7fe20e29166d in zink_draw_vbo<(zink_multidraw)1, (zink_dynamic_state)2, true> ../src/gallium/drivers/zink/zink_draw.cpp:907
    #9 0x7fe20c424982 in tc_call_draw_single ../src/gallium/auxiliary/util/u_threaded_context.c:3155
    #10 0x7fe20c411706 in tc_batch_execute ../src/gallium/auxiliary/util/u_threaded_context.c:211
    #11 0x7fe20c4124ba in _tc_sync ../src/gallium/auxiliary/util/u_threaded_context.c:362
    #12 0x7fe20c41f7a9 in tc_texture_map ../src/gallium/auxiliary/util/u_threaded_context.c:2279
    #13 0x7fe20b630757 in pipe_texture_map_3d ../src/gallium/auxiliary/util/u_inlines.h:572
    #14 0x7fe20b6341f6 in st_ReadPixels ../src/mesa/state_tracker/st_cb_readpixels.c:546
    #15 0x7fe20b42fea7 in read_pixels ../src/mesa/main/readpix.c:1178
    #16 0x7fe20b42fea7 in _mesa_ReadnPixelsARB ../src/mesa/main/readpix.c:1195
    #17 0x7fe20b42ffc0 in _mesa_ReadPixels ../src/mesa/main/readpix.c:1210
    #18 0x2a6d094 in glu::readPixels(glu::RenderContext const&, int, int, tcu::PixelBufferAccess const&) /home/zmike/src/VK-GL-CTS/framework/opengl/gluPixelTransfer.cpp:61
    #19 0x29eaa06 in deqp::gls::ShaderExecUtil::FragmentOutExecutor::execute(int, void const* const*, void* const*) /home/zmike/src/VK-GL-CTS/modules/glshared/glsShaderExecUtil.cpp:677
    #20 0x25a600b in iterate /home/zmike/src/VK-GL-CTS/modules/gles31/functional/es31fOpaqueTypeIndexingTests.cpp:585
    #21 0x2322b53 in deqp::gles31::TestCaseWrapper<deqp::gles31::TestPackage>::iterate(tcu::TestCase*) /home/zmike/src/VK-GL-CTS/modules/gles31/tes31TestCaseWrapper.hpp:86
    #22 0x2c376fd in tcu::TestSessionExecutor::iterateTestCase(tcu::TestCase*) /home/zmike/src/VK-GL-CTS/framework/common/tcuTestSessionExecutor.cpp:302
    #23 0x2c366e3 in tcu::TestSessionExecutor::iterate() /home/zmike/src/VK-GL-CTS/framework/common/tcuTestSessionExecutor.cpp:139
    #24 0x2c00b0c in tcu::App::iterate() /home/zmike/src/VK-GL-CTS/framework/common/tcuApp.cpp:221
    #25 0x4141b7 in main /home/zmike/src/VK-GL-CTS/framework/platform/tcuMain.cpp:58
    #26 0x7fe2263e155f in __libc_start_call_main (/lib64/libc.so.6+0x2d55f)

cc: mesa-stable

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15173>
2022-02-26 15:26:08 +00:00
Mike Blumenkrantz 62b8daa889 zink: set shader key size to 0 for non-generated tcs
Test case 'dEQP-GLES31.functional.shaders.builtin_functions.common.modf.vec2_mediump_tess_control'..
=================================================================
==539161==ERROR: AddressSanitizer: unknown-crash on address 0x60400008cfef at pc 0x7fffdb47b2d6 bp 0x7fffffffa490 sp 0x7fffffffa488
READ of size 4 at 0x60400008cfef thread T0
    #0 0x7fffdb47b2d5 in XXH_read32 ../src/util/xxhash.h:531
    #1 0x7fffdb47bfbf in XXH_readLE32 ../src/util/xxhash.h:608
    #2 0x7fffdb47bfbf in XXH_readLE32_align ../src/util/xxhash.h:620
    #3 0x7fffdb47bfbf in XXH32_endian_align ../src/util/xxhash.h:797
    #4 0x7fffdb47bfbf in XXH32 ../src/util/xxhash.h:831
    #5 0x7fffdb480b49 in _mesa_hash_data ../src/util/hash_table.c:631
    #6 0x7fffded8c10a in shader_module_hash ../src/gallium/drivers/zink/zink_program.c:82
    #7 0x7fffded8cad8 in get_shader_module_for_stage ../src/gallium/drivers/zink/zink_program.c:144
    #8 0x7fffded8cf64 in update_gfx_shader_modules ../src/gallium/drivers/zink/zink_program.c:182
    #9 0x7fffded8dcc2 in zink_update_gfx_program ../src/gallium/drivers/zink/zink_program.c:257
    #10 0x7fffdec63463 in update_gfx_program ../src/gallium/drivers/zink/zink_draw.cpp:223
    #11 0x7fffded7aab9 in update_gfx_pipeline<true> ../src/gallium/drivers/zink/zink_draw.cpp:445
    #12 0x7fffded4a88b in void zink_draw<(zink_multidraw)1, (zink_dynamic_state)2, true, false>(pipe_context*, pipe_draw_info const*, unsigned int, pipe_draw_indirect_info const*, pipe_draw_start_count_bias const*, unsigned int, pipe_vertex_state*, unsigned int) ../src/gallium/drivers/zink/zink_draw.cpp:777
    #13 0x7fffdec6c5b2 in zink_draw_vbo<(zink_multidraw)1, (zink_dynamic_state)2, true> ../src/gallium/drivers/zink/zink_draw.cpp:907
    #14 0x7fffdcdff982 in tc_call_draw_single ../src/gallium/auxiliary/util/u_threaded_context.c:3155
    #15 0x7fffdcdec706 in tc_batch_execute ../src/gallium/auxiliary/util/u_threaded_context.c:211
    #16 0x7fffdcded4ba in _tc_sync ../src/gallium/auxiliary/util/u_threaded_context.c:362
    #17 0x7fffdcdfa492 in tc_buffer_map ../src/gallium/auxiliary/util/u_threaded_context.c:2251
    #18 0x7fffdb7f2439 in pipe_buffer_map_range ../src/gallium/auxiliary/util/u_inlines.h:393
    #19 0x7fffdb7f56c2 in _mesa_bufferobj_map_range ../src/mesa/main/bufferobj.c:488
    #20 0x7fffdb803300 in map_buffer_range ../src/mesa/main/bufferobj.c:3734
    #21 0x7fffdb8036e7 in _mesa_MapBufferRange ../src/mesa/main/bufferobj.c:3817
    #22 0x29ecb02 in deqp::gls::ShaderExecUtil::BufferIoExecutor::readOutputBuffer(void* const*, int) /home/zmike/src/VK-GL-CTS/modules/glshared/glsShaderExecUtil.cpp:1069
    #23 0x29ee499 in deqp::gls::ShaderExecUtil::TessControlExecutor::execute(int, void const* const*, void* const*) /home/zmike/src/VK-GL-CTS/modules/glshared/glsShaderExecUtil.cpp:1390
    #24 0x246264c in deqp::gles31::Functional::CommonFunctionCase::iterate() /home/zmike/src/VK-GL-CTS/modules/gles31/functional/es31fShaderCommonFunctionTests.cpp:400
    #25 0x2322b53 in deqp::gles31::TestCaseWrapper<deqp::gles31::TestPackage>::iterate(tcu::TestCase*) /home/zmike/src/VK-GL-CTS/modules/gles31/tes31TestCaseWrapper.hpp:86
    #26 0x2c376fd in tcu::TestSessionExecutor::iterateTestCase(tcu::TestCase*) /home/zmike/src/VK-GL-CTS/framework/common/tcuTestSessionExecutor.cpp:302
    #27 0x2c366e3 in tcu::TestSessionExecutor::iterate() /home/zmike/src/VK-GL-CTS/framework/common/tcuTestSessionExecutor.cpp:139
    #28 0x2c00b0c in tcu::App::iterate() /home/zmike/src/VK-GL-CTS/framework/common/tcuApp.cpp:221
    #29 0x4141b7 in main /home/zmike/src/VK-GL-CTS/framework/platform/tcuMain.cpp:58
    #30 0x7ffff6dbc55f in __libc_start_call_main (/lib64/libc.so.6+0x2d55f)
    #31 0x7ffff6dbc60b in __libc_start_main_impl (/lib64/libc.so.6+0x2d60b)
    #32 0x413fa4 in _start (/home/zmike/src/VK-GL-CTS/build/external/openglcts/modules/glcts+0x413fa4)

0x60400008cff1 is located 0 bytes to the right of 33-byte region [0x60400008cfd0,0x60400008cff1)
allocated by thread T0 here:
    #0 0x7ffff769191f in __interceptor_malloc (/usr/lib64/libasan.so.6+0xae91f)
    #1 0x7fffded8c608 in get_shader_module_for_stage ../src/gallium/drivers/zink/zink_program.c:115
    #2 0x7fffded8cf64 in update_gfx_shader_modules ../src/gallium/drivers/zink/zink_program.c:182
    #3 0x7fffded8dcc2 in zink_update_gfx_program ../src/gallium/drivers/zink/zink_program.c:257
    #4 0x7fffdec63463 in update_gfx_program ../src/gallium/drivers/zink/zink_draw.cpp:223
    #5 0x7fffded7aab9 in update_gfx_pipeline<true> ../src/gallium/drivers/zink/zink_draw.cpp:445
    #6 0x7fffded4a88b in void zink_draw<(zink_multidraw)1, (zink_dynamic_state)2, true, false>(pipe_context*, pipe_draw_info const*, unsigned int, pipe_draw_indirect_info const*, pipe_draw_start_count_bias const*, unsigned int, pipe_vertex_state*, unsigned int) ../src/gallium/drivers/zink/zink_draw.cpp:777
    #7 0x7fffdec6c5b2 in zink_draw_vbo<(zink_multidraw)1, (zink_dynamic_state)2, true> ../src/gallium/drivers/zink/zink_draw.cpp:907
    #8 0x7fffdcdff982 in tc_call_draw_single ../src/gallium/auxiliary/util/u_threaded_context.c:3155
    #9 0x7fffdcdec706 in tc_batch_execute ../src/gallium/auxiliary/util/u_threaded_context.c:211
    #10 0x7fffdcded4ba in _tc_sync ../src/gallium/auxiliary/util/u_threaded_context.c:362
    #11 0x7fffdcdfa492 in tc_buffer_map ../src/gallium/auxiliary/util/u_threaded_context.c:2251
    #12 0x7fffdb7f2439 in pipe_buffer_map_range ../src/gallium/auxiliary/util/u_inlines.h:393
    #13 0x7fffdb7f56c2 in _mesa_bufferobj_map_range ../src/mesa/main/bufferobj.c:488
    #14 0x7fffdb803300 in map_buffer_range ../src/mesa/main/bufferobj.c:3734
    #15 0x7fffdb8036e7 in _mesa_MapBufferRange ../src/mesa/main/bufferobj.c:3817
    #16 0x29ecb02 in deqp::gls::ShaderExecUtil::BufferIoExecutor::readOutputBuffer(void* const*, int) /home/zmike/src/VK-GL-CTS/modules/glshared/glsShaderExecUtil.cpp:1069
    #17 0x29ee499 in deqp::gls::ShaderExecUtil::TessControlExecutor::execute(int, void const* const*, void* const*) /home/zmike/src/VK-GL-CTS/modules/glshared/glsShaderExecUtil.cpp:1390
    #18 0x246264c in deqp::gles31::Functional::CommonFunctionCase::iterate() /home/zmike/src/VK-GL-CTS/modules/gles31/functional/es31fShaderCommonFunctionTests.cpp:400
    #19 0x2322b53 in deqp::gles31::TestCaseWrapper<deqp::gles31::TestPackage>::iterate(tcu::TestCase*) /home/zmike/src/VK-GL-CTS/modules/gles31/tes31TestCaseWrapper.hpp:86
    #20 0x2c376fd in tcu::TestSessionExecutor::iterateTestCase(tcu::TestCase*) /home/zmike/src/VK-GL-CTS/framework/common/tcuTestSessionExecutor.cpp:302
    #21 0x2c366e3 in tcu::TestSessionExecutor::iterate() /home/zmike/src/VK-GL-CTS/framework/common/tcuTestSessionExecutor.cpp:139
    #22 0x2c00b0c in tcu::App::iterate() /home/zmike/src/VK-GL-CTS/framework/common/tcuApp.cpp:221
    #23 0x4141b7 in main /home/zmike/src/VK-GL-CTS/framework/platform/tcuMain.cpp:58
    #24 0x7ffff6dbc55f in __libc_start_call_main (/lib64/libc.so.6+0x2d55f)

cc: mesa-stable

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15173>
2022-02-26 15:26:08 +00:00
Mike Blumenkrantz 861fc10bfc zink: skip extra descriptor lookups for images during barrier updates
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15173>
2022-02-26 15:26:08 +00:00
Mike Blumenkrantz cd7ea80e70 zink: add layout to sampler descriptor hash
this can have more than one value, so avoid stale cache entries

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15173>
2022-02-26 15:26:08 +00:00
Mike Blumenkrantz 7431c30999 zink: fix typo for image descriptor rebinds
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15173>
2022-02-26 15:26:08 +00:00
Mike Blumenkrantz a179977b8e zink: update descriptor refs after starting renderpass
this ensures that swapchain images will have been acquired before potentially
accessing swapchain images bound as descriptors

fixes caselist like:
dEQP-GLES31.functional.fbo.color.texcubearray.r8ui
dEQP-GLES31.functional.primitive_bounding_box.blit_fbo.blit_default_to_fbo

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15173>
2022-02-26 15:26:08 +00:00
Jonathan Gray f0398180a5 radv: use MAJOR_IN_SYSMACROS for sysmacros.h include
fixes build on OpenBSD
../src/amd/vulkan/radv_device.c:35:10: fatal error: 'sys/sysmacros.h' file not found

Fixes: 7aaa54feb5 ("radv: implement VK_EXT_physical_device_drm")
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13448>
2022-02-26 01:00:29 +00:00
Jonathan Gray afece589dc util: fix util_cpu_detect_once() build on OpenBSD
Correct type for sysctl argument to fix the build.

../src/util/u_cpu_detect.c:631:29: error: incompatible pointer types passing 'int *' to parameter of type 'size_t *' (aka 'unsigned long *') [-Werror,-Wincompatible-pointer-types]
      sysctl(mib, 2, &ncpu, &len, NULL, 0);
                            ^~~~

Fixes: 5623c75e40 ("util: Fix setting nr_cpus on some BSD variants")
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13448>
2022-02-26 01:00:29 +00:00
Jonathan Gray 623ff4ec42 util: fix u_print.cpp build on OpenBSD
move include so va_list will be picked up via stdarg.h

In file included from ../src/util/u_printf.cpp:24:
../src/util/u_printf.h:43:41: error: unknown type name 'va_list'; did you mean '__va_list'?
size_t u_printf_length(const char *fmt, va_list untouched_args);
                                        ^~~~~~~
                                        __va_list
/usr/include/machine/_types.h:126:27: note: '__va_list' declared here
typedef __builtin_va_list       __va_list;
                                ^

and add includes to u_printf.h as suggested by Ilia Mirkin
stdarg.h for va_list and stddef.h for size_t

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13448>
2022-02-26 01:00:29 +00:00
Jonathan Gray 7d609431d4 util: unbreak non-linux mips64 build
Put linux specific path inside an ifdef.  Unbreaks mips64 build on
OpenBSD and likely other systems without Elf64_auxv_t.

Fixes: 88b234d7a7 ("gallivm: add basic mips64 support and set mcpu to mips64r5 on ls3a4000")
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15166>
2022-02-26 10:47:43 +11:00
Marcin Ślusarz e5c39bc427 intel/compiler: optimize flat inputs mask calculation
Don't bother looking at urb if variable is not flat.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15169>
2022-02-25 22:34:22 +00:00
Marcin Ślusarz e2cb562dd1 intel/compiler: ignore per-primitive attrs when calculating flat input mask
If we say that per-primitive attributes are flat (which is communicated by
3DSTATE_SBE.ConstantInterpolationEnable), GPU freaks out and applies it
to other (non-flat) attributes.

Fixes: be89ea3231 ("intel/compiler: Handle per-primitive inputs in FS")

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15169>
2022-02-25 22:34:22 +00:00
Alyssa Rosenzweig 216da26b3f pan/va: Add TEX_FETCH assembler case
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15182>
2022-02-25 21:53:03 +00:00
Alyssa Rosenzweig 794836daf0 pan/va: Handle sr_write_count in the disassembler
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15182>
2022-02-25 21:53:02 +00:00
Alyssa Rosenzweig eee6dad0c9 pan/va: Fix definitions of TEX_SINGLE and TEX_FETCH
Fix the definitions of the basic texturing instructions. In particular, a
register format and a write mask were previously missing, as well as incorrect
handling of staging registers.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15182>
2022-02-25 21:53:02 +00:00
Alyssa Rosenzweig a58807fa95 pan/va: Don't use staging index as a sideband
It would cause us to get incorrect disassembly when the syntax is flipped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15182>
2022-02-25 21:53:02 +00:00
Alyssa Rosenzweig 49a4cc6af8 pan/va: Handle extended staging counts in assembler
Needed for texturing.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15182>
2022-02-25 21:53:02 +00:00
Alyssa Rosenzweig 142ba9fea6 pan/va: Allow forcing enums for 1-bit modifiers
Ocassionally the 0 value has a meaningful value that's not meaningfully default,
so we want an enum to encode both possible states.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15182>
2022-02-25 21:53:02 +00:00
Alyssa Rosenzweig 20fce28dfd pan/va: Add MUX.v2i16 and MUX.v4i8 opcodes
Basically identical to MUX.i32, slight differences in opcode and swizzling only.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15182>
2022-02-25 21:53:02 +00:00
Alyssa Rosenzweig 97f8fad37b pan/va: Remove incorrect TEX test cases
Not close enough to salvage; TEX is going to be redefined.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15182>
2022-02-25 21:53:02 +00:00
Emma Anholt b1f349dff4 nir: Allow the _replicates opcodes to have num_components != 4.
This required relaxing a core NIR assertion which I don't think is doing
any important validation.

The shader-db effects here are small, but they're important for avoiding a
regression when we start doing per-component DCE in opt_shrink_vectors
(https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468)

softpipe shader-db:
total instructions in shared programs: 2859777 -> 2859454 (-0.01%)
instructions in affected programs: 18881 -> 18558 (-1.71%)
total temps in shared programs: 293994 -> 293914 (-0.03%)
temps in affected programs: 418 -> 338 (-19.14%)

i915g:
total instructions in shared programs: 407562 -> 407544 (<.01%)
instructions in affected programs: 570 -> 552 (-3.16%)

r300:
total instructions in shared programs: 1414450 -> 1414459 (<.01%)
instructions in affected programs: 44494 -> 44503 (0.02%)
total vinst in shared programs: 473782 -> 473727 (-0.01%)
vinst in affected programs: 1102 -> 1047 (-4.99%)
total sinst in shared programs: 231224 -> 231216 (<.01%)
sinst in affected programs: 432 -> 424 (-1.85%)
total temps in shared programs: 197605 -> 197607 (<.01%)
temps in affected programs: 103 -> 105 (1.94%)

crocus hsw:
total instructions in shared programs: 8158185 -> 8158134 (<.01%)
instructions in affected programs: 10927 -> 10876 (-0.47%)

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15178>
2022-02-25 12:31:48 -08:00
Daniel Schürmann f030b75b7d aco: relax condition to remove branches in case of few instructions
This patch relaxes the conditions under which
we remove branch instructions.

Totals from 27246 (20.20% of 134913) affected shaders: (GFX10.3)
CodeSize: 193413312 -> 192924928 (-0.25%)
Instrs: 36146788 -> 36024692 (-0.34%)
Latency: 528374112 -> 528469044 (+0.02%); split: -0.01%, +0.02%
InvThroughput: 106198759 -> 106216583 (+0.02%); split: -0.00%, +0.02%
Branches: 1040640 -> 918543 (-11.73%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8647>
2022-02-25 15:38:08 +00:00
Samuel Pitoiset 53ca85ac2a radv,drirc: move RADV workarounds to 00-radv-defaults.conf
Because we have to maintain two different packages of Mesa, one
specific to RADV and another one for RadeonSI and such, it's a bit
annoying to have to synchronize the drirc entries. Currently, only our
Mesa package installs 00-mesa-defaults.conf which means we have to
backport the drirc RADV changes.

This splits 00-mesa-defaults.conf in two to move the drirc RADV entries
to src/amd/vulkan/00-radv-defaults.conf. Meson will install the file
only if RADV is built.

There is still a caveat for common drirc workarounds like for WSI but
they are rare enough and we could still duplicate them if needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15152>
2022-02-25 15:05:56 +01:00
Timur Kristóf 1ca6b2f216 aco: Support memory modes properly with load/store_buffer_amd.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15161>
2022-02-25 14:08:39 +01:00
Timur Kristóf ba4b48e787 aco: Support task_payload with barriers, refactor allowed storage class.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15161>
2022-02-25 14:08:36 +01:00
Timur Kristóf cd0dd5d6b7 aco: Add storage class for Task Shader payload.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15161>
2022-02-25 13:20:08 +01:00
Timur Kristóf 962b2fe214 spirv: Use task_payload mode for generic task outputs and mesh inputs.
This new mode will be only used for the actual payload variables and
not the number of launched mesh shader workgroups, which will still
be treated as an output.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14930>
2022-02-25 06:52:07 +00:00
Timur Kristóf f629fbd778 nir: Add new variable mode for task/mesh payload.
Task shader outputs work differently than other shaders, so they
need special consideration. Essentially, they have two kinds of
outputs:

1. Number of mesh shader workgroups to launch.
Will be still represented by a shader output.

2. Optional payload of up to (at least) 16K bytes.
These payload variables behave similarly to shared memory, but
the spec doesn't actually define them as shared memory (also, they
may be implemented differently by each backend), so we need to add
a new NIR variable mode for them.

These payload variables can't be represented by shader outputs
because the 16K bytes don't fit the 32x vec4 model that NIR uses
for its output variables.

This patch adds a new NIR variable mode: nir_var_mem_task_payload
and corresponding explicit I/O intrinsics, as well as support for
this new mode in nir_lower_io.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14930>
2022-02-25 06:52:07 +00:00
Timur Kristóf d2d6eca081 radv: Refactor mesh shader draws and add num_workgroups.
Several of the new draw packets need this argument
including all of the taskmesh commands, so it's
best to always declare it.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
2022-02-25 06:31:33 +00:00
Timur Kristóf bf519a7d47 ac/nir: Refactor mesh shader output code to smaller functions.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
2022-02-25 06:31:33 +00:00
Timur Kristóf a84789f795 ac/nir: Make sure to exclude special outputs from arrayed output masks.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
2022-02-25 06:31:33 +00:00
Timur Kristóf 3956c03b05 ac/nir: Sanitize mesh shader primitive indices using umin.
This makes our implementation friendlier to potentially buggy shaders,
meaning that it will less likely to hang the GPU.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
2022-02-25 06:31:33 +00:00
Timur Kristóf 0746b98f4a ac/nir: Properly handle when mesh API workgroup size is smaller than HW.
The problem is that the real workgroup launched on NGG HW
can be larger than the size specified by the API, and the
extra waves need to keep up with barriers in the API waves.

There are 2 different cases:

1. The whole API workgroup fits in a single wave.
   We can shrink the barriers to subgroup scope and
   don't need to insert any extra ones.

2. The API workgroup occupies multiple waves, but not
   all. In this case, we emit code that consumes every
   barrier on the extra waves.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
2022-02-25 06:31:33 +00:00
Timur Kristóf d88516a23f ac/nir: Move LDS area for primitive count to the beginning.
This makes it impossible for out of bounds vertex and primitive
attribute stores and indices stores to overwrite this.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
2022-02-25 06:31:33 +00:00
Timur Kristóf 9cc9cf77a8 aco: Fix multiview view index for mesh shaders.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
2022-02-25 06:31:33 +00:00
Timur Kristóf 082b691141 aco: Fix workgroup_id.y and .z for NV_mesh_shader.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
2022-02-25 06:31:33 +00:00
Timur Kristóf 10ebfb3bf2 aco: Allow 1-byte loads and stores with load/store_buffer_amd
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
2022-02-25 06:31:33 +00:00
Timur Kristóf 1ee3d49e3e radv: Better exclude special MS outputs from driver location assignment.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
2022-02-25 06:31:33 +00:00
Guilherme Gallo d1c6185b5a ci: skqp: Add Vulkan support for a630_skqp job
This commit adds support for Vulkan backend on a630_skqp job.

= Needed changes
- Needed to install libvulkan-dev package on system
- Refactored the way the available skqp reports are printed
  tested in development builds with skia tools

Piglit expectations had to be updated in various drivers due to !14750 not
having bumped the tags when it tried to uprev.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14686>
2022-02-25 05:50:06 +00:00
Guilherme Gallo 8bfef8bf6b ci: skqp: Build skqp from android-cts-10.0_r11 tag with Clang
The Android CTS 10 version is relative old when compared with skia main
branch, which was being used before. Some modifications in the skqp
build/runner scripts were needed to make it run on CI.

- skqp versions from android-cts have already all assets inside
  platform_tools folder.
  - along with the assets, are the render and unit files which are
    expected to pass in the Android CTS execution.
  - removed custom test files from the a630 folder, to make it comply
    with the CTS expectations.
- include new patches to remove Python2 dependencies and avoid the
  installation of it in rootfs.
- strip binariesthe built binaries `skqp` and `list_gpu_unit_tests`, as
  `is_debug = false` gn argument did not work, maybe it is not well
      tested in development builds with skia tools
- use Clang instead of GCC. The GCC support is not so graceful as it is
  in the skia main branch, some NEON instructions needs to be turned off
  in the GCC compilation, causing different tests result. This change
  does not imply a bigger rootfs, since the built skqp binary uses GCC
  libc++ and other library runtimes. So clang is just a build
  dependency.

= Changes in skqp results =

Some errors were found for GL backend and unit tests. GLES and VK tests are green.
All the failed tests were classified as expected to fail in the render and unit tests list.

```
gl_blur2rectsnonninepatch
gl_bug339297_as_clip
gl_bug6083
gl_dashtextcaps
```

```
SRGBReadWritePixels (../../tests/SRGBReadWritePixelsTest.cpp:214	Could not create sRGB surface context. [OpenGL])
```

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14686>
2022-02-25 05:50:06 +00:00
Mike Blumenkrantz 07c0801e60 lavapipe: EXT_depth_clip_control
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15126>
2022-02-25 05:27:27 +00:00
Emma Anholt f458c7f200 ci/zink: Add testing of dEQP GLES3.1/3.2.
I think this has been kind of just an oversight.  Increases runtime by a
minute, to 5:30.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15159>
2022-02-24 16:16:57 -08:00
Emma Anholt b4132bd026 ci/zink: Move testing to shared 64-core runners at Google.
Now the main deqp and piglit run takes about 4:30 of runner time in a
single job.

Added a couple of flakes that hit this MR, but which I think predate it
(probably due to not having #zink-ci until recently).

Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15159>
2022-02-24 16:15:56 -08:00
Alyssa Rosenzweig 988d5aae74 panfrost: Flush resources when shadowing
When we shadow a resource, the backing BO is changed; as such,
existing references to the resource become invalid. So batches accessing the
resource need to be flushed (or otherwise have their references invalidated).

The wrong behaviour change (not flushing) was introduced when we started
tracking resources instead of BOs. The issue manifested as a severe performance
regression in glmark2's -bbuffer test, particular the subdata subtest. The issue
is magnified on slow CPUs; without the fix, the test becomes completely CPU
bound

Relevant glmark2 -bbuffer test from 43fps to 84fps.

Apparently, this causes functional issues too -- this performance-minded change
also fixes a few piglits.

Fixes: cecb889481 ("panfrost: Do tracking of resources, not BOs")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Chris Healy <cphealy@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13502>
2022-02-24 23:11:20 +00:00
Alyssa Rosenzweig 5536852d60 panfrost: Handle NULL samplers
Fixes a NULL dereference in Piglit fp-fragment-position, getting the
test to pass.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13203>
2022-02-24 22:48:30 +00:00
Alyssa Rosenzweig 53ef20f08d panfrost: Handle NULL sampler views
Fixes a NULL dereference in Piglit fp-fragment-position.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13203>
2022-02-24 22:48:30 +00:00
Alyssa Rosenzweig 304851422a panfrost: Fix set_sampler_views for big GL
Roughly use the freedreno logic to handle all the extra things that will
come up in our Piglit sooner than later.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13203>
2022-02-24 22:48:30 +00:00
Alyssa Rosenzweig 4b2769493e panfrost/ci: Update xfails list
These tests seem to be passing now.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13203>
2022-02-24 22:48:30 +00:00
Kenneth Graunke 97f18d2929 blorp: Add blorp_measure hooks to the blitter codepaths
I had missed these when hooking up the original support.

Fixes: 31eeb72e45 ("blorp: Add support for blorp_copy via XY_BLOCK_COPY_BLT")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15157>
2022-02-24 21:42:16 +00:00
Kenneth Graunke e6b7e74308 iris: Set MI_FLUSH_DW::PostSyncOperation correctly
The MI_FLUSH_DW post-sync operation uses the same encoding as the
PIPE_CONTROL one so we can use the same helper.  Write PS Depth Count
is not supported, of course, as the blitter has no depth pipeline.

This means that we can write the timestamp register from the blitter.

Fixes: 604d97671b ("iris: Add support for flushing the blitter (hackily)")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15157>
2022-02-24 21:42:16 +00:00
Pavel Ondračka c393753daa r300: add predicate instructions to statistics of vertex shaders
All of IF, ELSE, ENDIF, BREAK and CONTINUE were already translated
to the predication instructions in rc_vert_fc so all the flow control
we count at the moment is just BGNLOOP and ENDLOOP.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15077>
2022-02-24 21:31:03 +00:00
Pavel Ondračka 8eb9bffdfc r300: report number of loops in shader statistics
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15077>
2022-02-24 21:31:03 +00:00
Pavel Ondračka 517b37a08c r300: use %u specifiers when printing unsigned stats values
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6019
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15077>
2022-02-24 21:31:03 +00:00
Pavel Ondračka e7978412c3 r300: only print shader statistics when compilation succeeds
This allows to disregard the huge shaders that won't run anyway
and hopefully make catching shader regressions that result in a
compile failure easier.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15077>
2022-02-24 21:31:03 +00:00
Mike Blumenkrantz b124f83bc2 zink: add a flake channel
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15129>
2022-02-24 20:13:53 +00:00
Alyssa Rosenzweig cd2a4cc47c pan/bi: Unit test message preloading optimization
To make sure it is applied in the cases we expect it to be, to avoid code
generation regressions. Functional regressions are expected to be caught by
integration-testing, so that is not focused on here.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9438>
2022-02-24 14:13:21 -05:00
Alyssa Rosenzweig eb1479bda2 pan/bi: Support message preloading
Preload LD_VAR_IMM or VAR_TEX instructions in the first block of fragment
shaders on v7. Preloaded messages write to fixed registers; when replacing
instructions we insert moves from the registers at the start of the program and
hope coalescing goes to town. (Admittedly we don't do any coalescing yet...)
The extra moves hurts instruction count in some cases; the win for cycle count
should cancel this out. When we get smarter copy prop or RA, those moves should
go away anyway.

This optimization may hurt register pressure by extending the lifetime of up to
eight registers written in the first block. This is expected to be acceptable:
on a large shader-db, there are no additional spills/fills, and only two shaders
are hurt on thread count.

This optimization only applies to v7, as the hardware was not introduced on v6
and was removed for Valhall.

total instructions in shared programs: 2451624 -> 2454286 (0.11%)
instructions in affected programs: 909046 -> 911708 (0.29%)
helped: 4719
HURT: 3341
helped stats (abs) min: 1.0 max: 10.0 x̄: 1.49 x̃: 1
helped stats (rel) min: 0.08% max: 33.33% x̄: 6.79% x̃: 3.92%
HURT stats (abs)   min: 1.0 max: 50.0 x̄: 2.90 x̃: 2
HURT stats (rel)   min: 0.12% max: 66.67% x̄: 6.39% x̃: 3.45%
95% mean confidence interval for instructions value: 0.27 0.39
95% mean confidence interval for instructions %-change: -1.55% -1.11%
Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).

total tuples in shared programs: 1969529 -> 1963429 (-0.31%)
tuples in affected programs: 601327 -> 595227 (-1.01%)
helped: 5907
HURT: 1297
helped stats (abs) min: 1.0 max: 8.0 x̄: 1.41 x̃: 1
helped stats (rel) min: 0.07% max: 33.33% x̄: 7.25% x̃: 5.26%
HURT stats (abs)   min: 1.0 max: 40.0 x̄: 1.73 x̃: 1
HURT stats (rel)   min: 0.16% max: 31.75% x̄: 3.38% x̃: 2.02%
95% mean confidence interval for tuples value: -0.88 -0.81
95% mean confidence interval for tuples %-change: -5.52% -5.15%
Tuples are helped.

total clauses in shared programs: 401689 -> 387830 (-3.45%)
clauses in affected programs: 136944 -> 123085 (-10.12%)
helped: 8427
HURT: 4
helped stats (abs) min: 1.0 max: 4.0 x̄: 1.65 x̃: 2
helped stats (rel) min: 0.49% max: 50.00% x̄: 19.88% x̃: 18.18%
HURT stats (abs)   min: 1.0 max: 4.0 x̄: 2.50 x̃: 2
HURT stats (rel)   min: 1.96% max: 19.05% x̄: 14.18% x̃: 17.86%
95% mean confidence interval for clauses value: -1.66 -1.63
95% mean confidence interval for clauses %-change: -20.15% -19.58%
Clauses are helped.

total cycles in shared programs: 202735.83 -> 201862.21 (-0.43%)
cycles in affected programs: 16295.46 -> 15421.83 (-5.36%)
helped: 3349
HURT: 1962
helped stats (abs) min: 0.041665999999999315 max: 1.0 x̄: 0.32 x̃: 0
helped stats (rel) min: 0.24% max: 100.00% x̄: 40.77% x̃: 33.33%
HURT stats (abs)   min: 0.041665999999999315 max: 1.5833329999999997 x̄: 0.10 x̃: 0
HURT stats (rel)   min: 0.09% max: 31.40% x̄: 2.95% x̃: 1.94%
95% mean confidence interval for cycles value: -0.17 -0.16
95% mean confidence interval for cycles %-change: -25.48% -23.76%
Cycles are helped.

total arith in shared programs: 74665.50 -> 74920.00 (0.34%)
arith in affected programs: 16059.92 -> 16314.42 (1.58%)
helped: 860
HURT: 3409
helped stats (abs) min: 0.041665999999999315 max: 0.25 x̄: 0.06 x̃: 0
helped stats (rel) min: 0.24% max: 37.50% x̄: 4.73% x̃: 2.56%
HURT stats (abs)   min: 0.041665999999999315 max: 1.5833329999999997 x̄: 0.09 x̃: 0
HURT stats (rel)   min: 0.09% max: 100.00% x̄: 8.99% x̃: 4.21%
95% mean confidence interval for arith value: 0.06 0.06
95% mean confidence interval for arith %-change: 5.83% 6.62%
Arith are HURT.

total texture in shared programs: 13083.50 -> 11877 (-9.22%)
texture in affected programs: 1663 -> 456.50 (-72.55%)
helped: 2377
HURT: 3
helped stats (abs) min: 0.5 max: 1.0 x̄: 0.51 x̃: 0
helped stats (rel) min: 6.25% max: 100.00% x̄: 87.12% x̃: 100.00%
HURT stats (abs)   min: 0.5 max: 0.5 x̄: 0.50 x̃: 0
HURT stats (rel)   min: 0.00% max: 25.00% x̄: 16.67% x̃: 25.00%
95% mean confidence interval for texture value: -0.51 -0.50
95% mean confidence interval for texture %-change: -87.98% -86.00%
Texture are helped.

total vary in shared programs: 10220.62 -> 4183.88 (-59.06%)
vary in affected programs: 10126.50 -> 4089.75 (-59.61%)
helped: 8538
HURT: 0
helped stats (abs) min: 0.125 max: 1.0 x̄: 0.71 x̃: 0
helped stats (rel) min: 7.14% max: 100.00% x̄: 74.74% x̃: 87.50%
95% mean confidence interval for vary value: -0.71 -0.70
95% mean confidence interval for vary %-change: -75.32% -74.16%
Vary are helped.

total quadwords in shared programs: 1766717 -> 1757161 (-0.54%)
quadwords in affected programs: 553801 -> 544245 (-1.73%)
helped: 6760
HURT: 711
helped stats (abs) min: 1.0 max: 11.0 x̄: 1.58 x̃: 1
helped stats (rel) min: 0.09% max: 29.41% x̄: 5.31% x̃: 4.84%
HURT stats (abs)   min: 1.0 max: 33.0 x̄: 1.54 x̃: 1
HURT stats (rel)   min: 0.10% max: 31.13% x̄: 2.53% x̃: 1.61%
95% mean confidence interval for quadwords value: -1.31 -1.25
95% mean confidence interval for quadwords %-change: -4.67% -4.46%
Quadwords are helped.

total threads in shared programs: 52899 -> 52897 (<.01%)
threads in affected programs: 4 -> 2 (-50.00%)
helped: 0
HURT: 2

total preloads in shared programs: 0 -> 116492
preloads in affected programs: 0 -> 116492
helped: 0
HURT: 8604
HURT stats (abs)   min: 2.0 max: 24.0 x̄: 13.54 x̃: 14
HURT stats (rel)   min: 0.00% max: 0.00% x̄: 0.00% x̃: 0.00%
95% mean confidence interval for preloads value: 13.45 13.63
95% mean confidence interval for preloads %-change: 0.00% 0.00%
Preloads are HURT.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9438>
2022-02-24 14:09:14 -05:00
Alyssa Rosenzweig c8437cd415 pan/bi: Account for message preloading in shaderdb
If a message-passing instruction like LD_VAR is preloaded, it will no longer be
counted in the shader cycle counts. Add a special message preload counter that
approximates the cost of preloading, so this information doesn't get a lost.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9438>
2022-02-24 12:51:04 -05:00
Alyssa Rosenzweig 19541dc8c8 pan/bi: Add bi_before_nonempty_block helper
To be used in the message preloading pass.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9438>
2022-02-24 12:51:04 -05:00
Alyssa Rosenzweig 6618697e0e panfrost: Pack message preloads from compiler
Include full message preload descriptors in the RSD on v7, and do the obvious
packing for fragment shader message preloads.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9438>
2022-02-24 12:51:04 -05:00
Alyssa Rosenzweig bd06a26662 panfrost: Add an unpacked message preload struct
The compiler will soon produce preloaded messages, but it should not pack them
itself, as this would require depending on GenXML or handcoding bitfields / bit
packs in the compiler. Instead, add a struct encoding the unpacked form of the
message, used as ABI between the compiler and the common driver.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9438>
2022-02-24 12:51:04 -05:00
Alyssa Rosenzweig 2d0c4973dc panfrost: Remove Message Preload Descriptor from v6.xml
It is an anachronism, as this descriptor was added in v7 and, seemingly, removed
immediately after. Good work.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9438>
2022-02-24 12:50:58 -05:00
Igor Torrente b130f8f4cf venus: add macros to help with future extensions
Currently we have to add almost the same code to the
`vn_physical_device_init_{features, properties}` to add
the extension to the `physical_dev->{features, properties}`
list.

These macros improves the code reusage.

Signed-off-by: Igor Torrente <igor.torrente@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15059>
2022-02-24 15:55:57 +00:00
Alyssa Rosenzweig 43bbe367ea panfrost/ci: Move T860 flake to skip
Actually an xfail but occassionally passes and gives us no new information, only
noise.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-and-acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15154>
2022-02-24 14:51:31 +00:00
Alyssa Rosenzweig 5c07f7c427 panfrost/ci: Move T720 flakes to skips
Doesn't seem like these will be resolved anytime soon..

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-and-acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15154>
2022-02-24 14:51:31 +00:00
Iago Toral Quiroga cf99584f51 broadcom/compiler: move uniforms right before their first use after scheduling
On V3D the quality of the code we generate is significantly affected by
how we decide to assign accumulators during register allocation, which
is determined by liveness, favoring short-lived temps.

There are many shaders that end up doing a whole lot of uniform loads
first, and using them later, which is very inconvenient for our register
allocation process because this increases uniform liveness and causes
us to use accumulators less efficientely, leading to significant churn.

To fix this, we move uniforms right before their first use in the same
block, but we need to do this after NIR scheduling, which means we are
doing it in non-SSA form, since the scheduler has a tendency to undo
this optimization and it is not easy to modify it to avoid it, since it
works in more abstract terms, using instruction dependencies, estimated
register pressure and instruction delay information to do its work,
which are very different concepts.

total instructions in shared programs: 13316738 -> 13033613 (-2.13%)
instructions in affected programs: 10389172 -> 10106047 (-2.73%)
helped: 55442
HURT: 16144

total threads in shared programs: 413722 -> 415048 (0.32%)
threads in affected programs: 1428 -> 2754 (92.86%)
helped: 680
HURT: 17

total loops in shared programs: 1716 -> 1690 (-1.52%)
loops in affected programs: 26 -> 0
helped: 26
HURT: 0

total uniforms in shared programs: 3704313 -> 3705181 (0.02%)
uniforms in affected programs: 687730 -> 688598 (0.13%)
helped: 2920
HURT: 7384

total max-temps in shared programs: 2364785 -> 2175190 (-8.02%)
max-temps in affected programs: 1215387 -> 1025792 (-15.60%)
helped: 49667
HURT: 1556

total spills in shared programs: 4241 -> 4248 (0.17%)
spills in affected programs: 642 -> 649 (1.09%)
helped: 11
HURT: 19

total fills in shared programs: 6115 -> 6125 (0.16%)
fills in affected programs: 1276 -> 1286 (0.78%)
helped: 11
HURT: 21

total sfu-stalls in shared programs: 34381 -> 36578 (6.39%)
sfu-stalls in affected programs: 16055 -> 18252 (13.68%)
helped: 3647
HURT: 5206

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15056>
2022-02-24 11:36:00 +00:00
Iago Toral Quiroga f1d20ec67c nir/nir_opt_move: handle non-SSA defs
We just skip register defs and avoid moving register reads across them.
This allows us to run this pass in non-SSA form.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15056>
2022-02-24 11:36:00 +00:00
Iago Toral Quiroga fe2249eac5 nir: add a nir_instr_def_is_register helper
This returns true if the instruction has a dest that is not an SSA value.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15056>
2022-02-24 11:36:00 +00:00
Iago Toral Quiroga 0a04468704 nir/nir_opt_move: allow to move uniform loads
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15056>
2022-02-24 11:36:00 +00:00
Tomeu Vizoso c0695bb473 ci: Allow disabling the whole of the Collabora farm
Add a global-level variable that allows disabling all jobs that would
have gone to the Collabora lab, to be used in case of outages.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15150>
2022-02-24 07:33:45 +01:00
Emma Anholt a5fa7e04d7 ci/lvp: Update the asan fails list.
Many tests had been fixed but weren't being run due to test reshuffles
from uprevs.  Add some explanations for what remains.

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15133>
2022-02-24 02:09:02 +00:00
Alyssa Rosenzweig 6b2eda6b72 pan/bi: Reorder pushed uniforms to avoid moves
On Bifrost and Valhall, push uniforms are loaded into Fast Access Uniform
Random Access Memory (FAU-RAM). FAU-RAM is organized as an array of 64-bit
slots. A given tuple (Bifrost) or instruction (Valhall) may access at most a
single 64-bit slot. If an instruction requires uniforms from multiple 64-bit
slots, a uniform-to-register move must be inserted to avoid the hazard. However,
if an instruction requires a pair of 32-bit uniforms from the same 64-bit slot,
no move is required.

To reduce the number of moves we emit, this commit adds an optimization pass
that reorders pushed uniforms, trying to group uniforms used by the same
instruction. The pass works by creating a graph of pushed uniforms, where edges
denote the "both 32-bit uniforms required by the same instruction" relationship.
We perform depth-first search on this graph to find the connected components,
where each connected component is a cluster of uniforms that are used together.
We then select pairs of uniforms from each connected component. The remaining
unpaired uniforms (from components of odd sizes) are paired together
arbitrarily.

In principle, we should weight the graph by number of occurences and choose
pairs that maximize the total selected edge weight. This is left for
future work, as it is nontrivial -- selecting these edges optimally appears to
be NP-hard at first blush.

Implementation note: As position and varying shaders share FAU on Bifrost, extra
care is taken with a `push_offset` shader stage info parameter that ensures
varying shaders do not reorder uniforms selected by the previous position
shader.

total instructions in shared programs: 2503343 -> 2451758 (-2.06%)
instructions in affected programs: 1553309 -> 1501724 (-3.32%)
helped: 14256
HURT: 8
helped stats (abs) min: 1.0 max: 80.0 x̄: 3.62 x̃: 3
helped stats (rel) min: 0.06% max: 36.36% x̄: 7.31% x̃: 6.67%
HURT stats (abs)   min: 1.0 max: 2.0 x̄: 1.38 x̃: 1
HURT stats (rel)   min: 1.30% max: 12.50% x̄: 4.99% x̃: 3.85%
95% mean confidence interval for instructions value: -3.66 -3.58
95% mean confidence interval for instructions %-change: -7.41% -7.20%
Instructions are helped.

total tuples in shared programs: 2008399 -> 1969627 (-1.93%)
tuples in affected programs: 1146344 -> 1107572 (-3.38%)
helped: 12867
HURT: 147
helped stats (abs) min: 1.0 max: 61.0 x̄: 3.03 x̃: 2
helped stats (rel) min: 0.17% max: 42.86% x̄: 6.79% x̃: 4.65%
HURT stats (abs)   min: 1.0 max: 3.0 x̄: 1.20 x̃: 1
HURT stats (rel)   min: 0.29% max: 20.00% x̄: 2.12% x̃: 1.19%
95% mean confidence interval for tuples value: -3.03 -2.93
95% mean confidence interval for tuples %-change: -6.82% -6.57%
Tuples are helped.

total clauses in shared programs: 408005 -> 401708 (-1.54%)
clauses in affected programs: 90760 -> 84463 (-6.94%)
helped: 6006
HURT: 164
helped stats (abs) min: 1.0 max: 9.0 x̄: 1.08 x̃: 1
helped stats (rel) min: 0.45% max: 33.33% x̄: 12.44% x̃: 14.29%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.64% max: 25.00% x̄: 9.81% x̃: 5.26%
95% mean confidence interval for clauses value: -1.03 -1.01
95% mean confidence interval for clauses %-change: -12.03% -11.66%
Clauses are helped.

total cycles in shared programs: 203308.37 -> 202737.83 (-0.28%)
cycles in affected programs: 19264.71 -> 18694.17 (-2.96%)
helped: 3024
HURT: 41
helped stats (abs) min: 0.041665999999999315 max: 2.5416680000000014 x̄: 0.19 x̃: 0
helped stats (rel) min: 0.17% max: 33.33% x̄: 3.83% x̃: 2.83%
HURT stats (abs)   min: 0.041665999999999315 max: 0.125 x̄: 0.06 x̃: 0
HURT stats (rel)   min: 0.30% max: 5.88% x̄: 1.41% x̃: 0.93%
95% mean confidence interval for cycles value: -0.19 -0.18
95% mean confidence interval for cycles %-change: -3.89% -3.64%
Cycles are helped.

total arith in shared programs: 76265.67 -> 74669.25 (-2.09%)
arith in affected programs: 45001.50 -> 43405.08 (-3.55%)
helped: 12945
HURT: 97
helped stats (abs) min: 0.041665999999999315 max: 2.5416680000000014 x̄: 0.12 x̃: 0
helped stats (rel) min: 0.17% max: 50.00% x̄: 8.06% x̃: 4.88%
HURT stats (abs)   min: 0.041665999999999315 max: 0.125 x̄: 0.05 x̃: 0
HURT stats (rel)   min: 0.21% max: 33.33% x̄: 2.16% x̃: 0.96%
95% mean confidence interval for arith value: -0.12 -0.12
95% mean confidence interval for arith %-change: -8.16% -7.81%
Arith are helped.

total quadwords in shared programs: 1796563 -> 1766803 (-1.66%)
quadwords in affected programs: 948830 -> 919070 (-3.14%)
helped: 12078
HURT: 219
helped stats (abs) min: 1.0 max: 42.0 x̄: 2.49 x̃: 2
helped stats (rel) min: 0.10% max: 33.33% x̄: 5.57% x̃: 5.26%
HURT stats (abs)   min: 1.0 max: 4.0 x̄: 1.21 x̃: 1
HURT stats (rel)   min: 0.33% max: 6.67% x̄: 2.00% x̃: 1.14%
95% mean confidence interval for quadwords value: -2.46 -2.38
95% mean confidence interval for quadwords %-change: -5.52% -5.36%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14163>
2022-02-24 01:35:33 +00:00
Timothy Arceri 6eec8fcbfa glsl/nir: free GLSL IR right after we convert to NIR
Gives us memory back faster which is useful for pathalogical CTS
tests.

The GLSL IR was previously used after converting to NIR for things
like building the GL resource list but we have had a NIR version
for this for some time and I don't believe there are any other
use cases left for keeping the old IR hanging around this long.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15127>
2022-02-24 01:10:49 +00:00
Emma Anholt 0fda2ac4f0 ci/virgl: Drop the bvec4_from_mat4x2_vs xfail.
The fix has landed in VK-GL-CTS 1.3.1.0, we were just not noticing it
because this is also in the flakes list.

Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14962>
2022-02-23 23:09:20 +00:00
Emma Anholt 9e710af830 ci/softpipe: Move most of testing to shared 64-core runners at Google.
The single job takes about 3:30 of runner time.  I don't have a good
explanation for the crash->fail test changes.

Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14962>
2022-02-23 23:09:20 +00:00
Emma Anholt 73b37f9ff0 ci/lavapipe: Test 1/3 of lavapipe on the shared 64-core google runners.
Now we can get through 1/3 of the testsuite in about 3:30, while
previously we did 1/10th.

Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14962>
2022-02-23 23:09:20 +00:00
Emma Anholt 0f64f4bdb5 ci/llvmpipe: Move most of testing to shared 64-core runners at Google.
These runners are configured to have a single job take up the whole
runner, which means we get to use threads to our hearts content.  The pile
of cores means we don't need to spawn separate jobs to try to load-balance
across fdo's shared runner capacity.  Having dedicated runners means we
won't get our MRs blocked as much waiting on non-Mesa testing happening on
fd.o.

We manage to complete all of this llvmpipe testing in about 6:15.

Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14962>
2022-02-23 23:09:20 +00:00
Samuel Pitoiset a2c1fa9137 radv: initialize extra state for internal pipelines at one place
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14650>
2022-02-23 22:29:55 +00:00
Samuel Pitoiset 959e8586aa radv: remove useless radv_blend_state::single_cb_enable field
This was only used for meta operations. DCC/FMASK/FCE pipelines
only declare one color attachment and the color writemask of the
second color attachment is 0 for the HW CB resolve.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14650>
2022-02-23 22:29:55 +00:00
Samuel Pitoiset 8347d3dfd7 radv: initialize VGT_GS_OUT_PRIM_TYPE earlier
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14650>
2022-02-23 22:29:55 +00:00
Samuel Pitoiset 9fb0831ca1 radv: initialize more depth/stencil states earlier
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14650>
2022-02-23 22:29:55 +00:00
Dmitry Baryshkov b4bef890ee freedreno/regs: remove 5nm DSI PHY regs
5nm PHY is a variation of 7nm PHY, they use the same register
definitions. To remove duplication, drop 5nm defs.

Cc: Robert Foss <robert.foss@linaro.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15051>
2022-02-23 21:25:22 +00:00
Dave Airlie b77ef4dd60 draw/so: don't use pre clip pos if we have a tes either.
This check for geom shader needed to be expanded for tess support.

dEQP-VK.transform_feedback.simple.depth_clip_control_tese with lvp

Fixes: dacf8f5f5c ("draw: hook up final bits of tessellation")

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15128>
2022-02-23 20:56:42 +00:00
Alyssa Rosenzweig 31b7ebcbc7 pan/mdg: Fix overflow in intra-bundle interference
There are up to 4 instructions in the latter stage (if a branch is included),
not 3. Bump the limit to fix memory corruption.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15147>
2022-02-23 20:42:33 +00:00
Jordan Justen 0fffaa9fca anv: Align state pools to 2MiB on XeHP
Suggested-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Fixes: c17e2216dd ("anv: Align buffer VMA to 2MiB for XeHP")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15054>
2022-02-23 20:15:24 +00:00
Jordan Justen 5a28d2482f anv: Align GENERAL_STATE_POOL_MIN_ADDRESS to 2MiB
Fixes: c17e2216dd ("anv: Align buffer VMA to 2MiB for XeHP")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15054>
2022-02-23 20:15:24 +00:00
Alyssa Rosenzweig d986731da9 iris,crocus,i915g: Don't stub flush_frontbuffer
This callback is only intended for software rasterizers, layered drivers, and
other special drivers that go through the software winsys path. Remove the
unimplemented stubs from the Intel drivers.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Dave Airlie <airlied@redhat.com> [crocus]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15118>
2022-02-23 19:49:54 +00:00
Alyssa Rosenzweig 51689a2b80 panfrost: Simplify panfrost_resource_get_handle
Unify the exit paths to clean up the logic. There are logically three modes we
support (KMS without renderonly, KMS with renderonly, and FD); these each
correspond to a leg of a small if statement. Outside of the small if's,
everything else should be identical.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: James Jones <jajones@nvidia.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15120>
2022-02-23 18:31:55 +00:00
Alyssa Rosenzweig b5734cc1c4 panfrost: Fix FD resource_get_handle
When handle->type is WINSYS_HANDLE_TYPE_FD, the caller wants a file descriptor
for the BO backing the resource. We previously had two paths for this:

1. If rsrc->scanout is available, we prime the GEM handle from the KMS device
   (rsrc->scanout->handle) to a file descriptor via the KMS device.

2. If rsrc->scanout is not available, we prime the GEM handle from the GPU
   (bo->gem_handle) to a file descriptor via the GPU device.

In both cases, the caller passes in a resource (with BO) and expects out a file
descriptor. There are no direct GEM handles in the function signature; the
caller doesn't care which GEM handle we prime to get the file descriptor. In
principle, both paths produce the same file descriptor for the same BO, since
both GEM handles represent the same underlying resource (viewed from different
devices).

On grounds of redundancy alone, it makes sense to remove the rsrc->scanout path.
Why have a path that only works sometimes, when we have another path that works
always?

In fact, the issues with the rsrc->scanout path are deeper. rsrc->scanout is
populated by renderonly_create_gpu_import_for_resource, which does the
following:

1. Get a file descriptor for the resource by resource_get_handle with
   WINSYS_HANDLE_TYPE_FD
2. Prime the file descriptor to a GEM handle via the KMS device.

Here comes strike number 2: in order to get a file descriptor via the KMS
device, we had to /already/ get a file descriptor via the GPU device. If we go
down the KMS device path, we effectively round trip:

   GPU handle -> fd -> KMS handle -> fd

There is no good reason to do this; if everything works, the fd is the same in
each case. If everything works. If.

The lifetimes of the GPU handle and the KMS handle are not necessarily bound. In
principle, a resource can be created with scanout (constructing a KMS handle).
Then the KMS view can be destroyed (invalidating the GEM handle for the KMS
device), even though the underlying resource is still valid. Notice the GPU
handle is still valid; its lifetime is tied to the resource itself. Then a
caller can ask for the FD for the resource; as the resource is still valid, this
is sensible. Under the scanout path, we try to get the FD by priming the GEM
handle on the KMS device... but that GEM handle is no longer valid, causing the
PRIME ioctl to fail with ENOENT. On the other hand, if we primed the GPU GEM
handle, everything works as expected.

These edge cases are not theoretical; recent versions of Xwayland trigger this
ENOENT, causing issue #5758 on all Panfrost devices. As far as I can tell, no
other kmsro driver has this 'special' kmsro path; the only part of
resource_get_handle that needs special handling for kmsro is getting a KMS
handle.

Let's remove the broken, useless path, fix Xwayland, bring us in line with other
drivers, and delete some code.

Thank you for coming to my ted talk.

Closes: #5758
Fixes: 7da251fc72 ("panfrost: Check in sources for command stream")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-and-tested-by: Jan Palus <jpalus@fastmail.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: James Jones <jajones@nvidia.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Tested-by: Dan Johansen <strit@manjaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15120>
2022-02-23 18:31:55 +00:00
Dmitry Baryshkov 22efeec399 freedreno/registers: add new register for 7nm DSI PHY v4.3 (sm8450)
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15052>
2022-02-23 17:28:17 +00:00
Alyssa Rosenzweig 04b80489d5 ci: Disable windows-vs2019
Currently down.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15148>
2022-02-23 15:12:41 +00:00
Rhys Perry ded9cb904f anv: Enable nir_opt_access
This commit will enable pass for searching readonly / writeonly
access when it's missing.

We don't support shaderStorageImageReadWithoutFormat
and the optimization pass causes those shaders to
take the write-only path which does support formatless.

Following games are affected with positive result:
 - Wolfenstein: Youngblood
 - Wolfenstein II: The New Colossus https://gitlab.freedesktop.org/mesa/mesa/-/issues/3138
 - Rage 2 https://gitlab.freedesktop.org/mesa/mesa/-/issues/5791
 - The Surge 2 https://gitlab.freedesktop.org/mesa/mesa/-/issues/5805
 - Metro Exodus https://gitlab.freedesktop.org/mesa/mesa/-/issues/4703
 - DOOM Eternal https://gitlab.freedesktop.org/mesa/mesa/-/issues/4273

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3138,https://gitlab.freedesktop.org/mesa/mesa/-/issues/5791,https://gitlab.freedesktop.org/mesa/mesa/-/issues/4273
Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15082>
2022-02-23 13:11:12 +00:00
Alyssa Rosenzweig abb7f04674 panfrost: Inline pan_emit_sfbd_tiler
Easier to read, the common code was already common.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig 910d4f8245 panfrost: Remove pan_emit_fbd thunking
Use a common interface.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig 8dc7757754 panfrost: Remove unrelated comment
Not sure what this was supposed to describe, but it's not the code here.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig 099d61c95d panfrost: Use txl instead of tex in the blitter
We always blit from a particular level, so it's a waste to compute the LOD.
This corresponds to a simple texture instruction with implement 0 LOD, which is
the optimal texturing path on Bifrost -- it maps to TEXS_2D but does not require
helper invocations.

Functional change on Bifrost: Blit shaders no longer set .computed_lod or
shader_contains_barrier.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig 5b1a00c565 panfrost: Inline pan_blit_emit_dcd
Easier to follow the logic without having a million arguments passed around.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig c9784c9512 panfrost: Decouple tiler job and DCD emit
We can share the "emit quad" logic, even though the DCDs differ.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig a13d87c484 panfrost: Annotate slow clears as such
We should realistically be using the clear shaders from PanVK once they're moved
to common.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig 1eb3dbafdb panfrost: Set defaults for deprecated DCD fields
There are always set to true. Don't pollute the driver code with them, make
their existence a local detail to pre-Valhall XML and that's it.

Functional change: "four components per vertex" is now set on vertex job DCDs.
This should be a no-op.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig bd3d7e33b6 panfrost: Use pan_shader_prepare_rsd in blitter
This reduces code duplication and will ease Valhall porting. Functional changes
on v7:

* Shader contains barrier is now set (perf loss, fixed later in series)
* Shader register allocation is now set (perf win)
* Point sprite inverted, no-op for blit shaders

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig 6fc81f163e pan/mdg: Fix partial execution mode names
cont -> skip, last -> kill, and fix the special case handling. It's just an
enum. Makes the disassembly easier to read and closer to Bifrost.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123>
2022-02-23 12:56:30 +00:00
Danylo Piliaiev 7e703e4428 turnip: Always use GMEM for feedback loops in autotuner
For ordinary feedback loops GMEM is a lot faster than sysmem since
we don't set SINGLE_PRIM mode.

For feedback loops with ordered rasterization GMEM should also be
faster.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>
2022-02-23 11:31:59 +00:00
Danylo Piliaiev ebc23ac963 turnip: Implement VK_ARM_rasterization_order_attachment_access
Trivially implemented by using A6XX_GRAS_SC_CNTL_SINGLE_PRIM_MODE.

This extension is useful for emulators e.g. AetherSX2 PS2 emulator and
could drastically improve performance when blending is emulated.

Relevant tests:
dEQP-VK.rasterization.rasterization_order_attachment_access.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>
2022-02-23 11:31:59 +00:00
Danylo Piliaiev d6c89e1e4a turnip: Merge LRZ and DEPTH_PLANE draw states
They were emitted at the same time. Frees 1 draw state for us to use.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>
2022-02-23 11:31:59 +00:00
Danylo Piliaiev dab34bd5c8 turnip: Use LATE_Z when there might be depth/stencil feedback loop
Otherwise a shader invocation would read the value which should have
been set AFTER this shader invocation.

Fixes tests:
 dEQP-VK.rasterization.rasterization_order_attachment_access.depth.samples_1.multi_draw_barriers
 dEQP-VK.rasterization.rasterization_order_attachment_access.stencil.samples_1.multi_draw_barriers

Fixes: 71595a189a
("tu: Fix feedback loops in sysmem mode")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>
2022-02-23 11:31:59 +00:00
Paulo Zanoni d10fd5b7c9 iris: fix register spilling on compute shaders on XeHP
XeHP scratch space is handled differently. Commit ae18e1e707
implemented support for it, but handled it differently between render
and compute shaders: it calculates scratch_addr differently and
doesn't pin the buffer on compute. Make it work on compute shaders by
calling pin_scratch_space() from iris_compute_walker(), which fixes
both the address and the pinning.

This commit can be verified by the two-year-old-but-still-unreviewed
Piglit MR 234. You can also verify this by running a very simple
compute shader with INTEL_DEBUG=spill_fs.

References: https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/234
Fixes: ae18e1e707 ("iris: Add support for scratch on XeHP")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15070>
2022-02-22 22:16:57 +00:00
Kenneth Graunke c46d3acf0e anv: Raise vertex input bindings and attributes limits slightly
This raises our vertex input bindings limit from 28 to 31, and our
vertex input attribute limit from 28 to 29.  We could theoretically
go higher, but it will take additional work.

The 3DSTATE_VERTEX_BUFFERS and 3DSTATE_VERTEX_ELEMENTS limits are 33
vertex buffers, and 34 vertex elements.  But we need up to two vertex
elements for system values (FirstVertex, BaseVertex, BaseInstance,
DrawID), and we currently use two vertex bindings for those.

There is another hidden limit: our compiler backend only supports the
push model for VS inputs currently.  3DSTATE_VS only allows URB Read
Lengths between [0, 15], which is measured in pairs of inputs, which
means we can theoretically push no more than 32 vertex elements.  This
is no artifical limit either, as a vec4 element takes up 4 registers
in the payload, and 32 * 4 = 128, the entire size of our register file.
Plus, the VS Thread payload needs at least g0 and g1 for other things,
so we can really only push 31.

We can theoretically support one additional binding, by combining our
two SGV bindings into a single upload.  In order to support additional
vertex elements, we would need to add support to the backend compiler
for the pull model for VS inputs.

References: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5917
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14991>
2022-02-22 21:31:06 +00:00
Mike Blumenkrantz dabba7d726 zink: ci updates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15067>
2022-02-22 21:16:55 +00:00
Mike Blumenkrantz 3029000389 zink: remove zink_descriptor_util_init_null_set()
no longer used

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15067>
2022-02-22 21:16:55 +00:00
Mike Blumenkrantz 7266182be0 zink: allow null descriptor set layouts
I got confused while writing this somehow because of the null descriptor
feature, which enables drivers to consume a null descriptor, which has no
relation to a descriptor layout containing no descriptors

failing to accurately use zero descriptors can put layouts over the maximum
per-stage limits, which causes tests to crash

fixes (lavapipe):
KHR-GL46.shading_language_420pack.binding_uniform_block_array
KHR-GL46.multi_bind.dispatch_bind_buffers_base

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15067>
2022-02-22 21:16:55 +00:00
Timur Kristóf 3759a16d8a ac/nir/ngg: Fix mixed up primitive ID after culling.
When NGG culling is enabled, make sure that the correct
primitive ID is exported by each lane.

Fixes: e97f0463a8 "ac/nir: Implement NGG deferred attribute culling in NIR."
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6050
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15055>
2022-02-22 18:15:24 +00:00
Mike Blumenkrantz c063d8ff64 zink: prune ci lists
I don't know why I thought running GL3.2 and GL4.6 was a good idea,
but it wasn't

Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15065>
2022-02-22 18:02:00 +00:00
Emma Anholt 59bc17d57a turnip: Request no implicit sync when we have no implicit-sync WSI BOs.
I chose to implement this as a global flag in the device, because
otherwise we would end up with extra draw overhead trying to avoid it in
the implicit-sync WSI case, and you're probably going to end up needing
implicit sync anyway because you used one of the BOs in any of the
submitted cmdbufs.  To do better than this, we would probably want a
skip-implicit-sync flag on the BOs in the BO list, rather than global on
the submit.

Reports about venus on turnip say that this flag reduces worst-case
QueueSubmit time in a game workload from ~10ms to ~4ms.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14838>
2022-02-22 17:36:05 +00:00
Samuel Pitoiset 83ee08f6d1 radv: fix build on BSD
Just disable inotify for BDS systems.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6060
Fixes: c50557d961 ("radv: allow applications to dynamically change RADV_FORCE_VRS")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15105>
2022-02-22 17:16:21 +00:00
Alyssa Rosenzweig 2e86767370 pan/bi: Add BIFROST_MESA_DEBUG=nosb option
To disable the new scoreboarding optimizations when debugging.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig c81c022e66 pan/bi: Implement basic scoreboarding pass
Extend our existing bi_scoreboard infrastructure with a simple data flow
analysis pass that calculates which dependency slots need waiting. We
still lack a heuristic for selecting dependency slots.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig 8f25d88d90 pan/bi: Print scoreboarding state
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig 6ad9a7f650 pan/bi: Add scoreboard state to IR
To a limited degree, scoreboarding must be global, so add the data
structures for tracking this to the IR.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig 91c02893d8 pan/bi: Clean up nits in liveness analysis
Fix minor silly things.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig 734a8bdc5d pan/bi: Use bi_exit_block
The "generic" one is a vestige of Midgard.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig 75406a561f pan/bi: Add bi_{start, exit}_block helpers
Useful for data flow analysis.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig e5423bb129 pan/bi: Do not cull post-RA staging writes
Bifrost post-RA dead code elimination can cull the destinations of
regular ALU instructions, by weakening from a register write to a
temporary write. However, there is no way to suppress staging writes, so
culling the destinations will result in invalid code generation.

Fixes a regression in
dEQP-GLES3.functional.shaders.switch.switch_in_for_loop_static_vertex
with scoreboarding. The root cause there is the backend dead code
elimination not being sufficiently aggressive in the presence of control
flow. Usually this does not matter, since the backend optimizations are
intended to be local with global optimizations happening in NIR.
Unfortunately, our implementation of IDVS hits this hard. That will need
to be optimized (probably by specializing IDVS shaders in NIR instead of
the backend). In the mean time, let's fix the actual bug affecting
scoreboarding.

No shader-db changes.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig 87d46f40c8 pan/bi: Cull DTSEL_IMM dests in post-RA DCE
They are useless (given the semantics of DTSEL_IMM) and complicate
scoreboarding. Just remove them in the pass that removes all the other silly
register destinations.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig 956b969616 pan/bi: Clarify requirement for barriers
Barriers need to wait on all outstanding messages. This is more of an API
requirement than a hardware requirement, but it's still an invariant the
scoreboarding pass must respect.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298>
2022-02-22 16:57:30 +00:00
Adam Jackson 2eb644e470 mesa: Enable GL_NV_pack_subimage
This just legalizes a few of the pixelstore pack parameters in GLES2
that are already legal in desktop and GLES3. glamor takes advantage of
this in the GetImage and software-fallback paths.

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14977>
2022-02-22 10:45:28 -05:00
Alyssa Rosenzweig 606ac8d61e pan/bi: Enable nir_opt_shrink_vectors
total instructions in shared programs: 1939513 -> 1935815 (-0.19%)
instructions in affected programs: 809066 -> 805368 (-0.46%)
helped: 3195
HURT: 865
helped stats (abs) min: 1.0 max: 15.0 x̄: 1.99 x̃: 1
helped stats (rel) min: 0.10% max: 25.00% x̄: 2.26% x̃: 1.28%
HURT stats (abs)   min: 1.0 max: 22.0 x̄: 3.09 x̃: 2
HURT stats (rel)   min: 0.10% max: 83.33% x̄: 2.67% x̃: 1.39%
95% mean confidence interval for instructions value: -1.00 -0.82
95% mean confidence interval for instructions %-change: -1.34% -1.08%
Instructions are helped.

total tuples in shared programs: 1523194 -> 1521789 (-0.09%)
tuples in affected programs: 745526 -> 744121 (-0.19%)
helped: 2947
HURT: 1844
helped stats (abs) min: 1.0 max: 18.0 x̄: 2.06 x̃: 1
helped stats (rel) min: 0.15% max: 25.00% x̄: 2.65% x̃: 1.59%
HURT stats (abs)   min: 1.0 max: 29.0 x̄: 2.54 x̃: 1
HURT stats (rel)   min: 0.09% max: 40.00% x̄: 2.32% x̃: 1.52%
95% mean confidence interval for tuples value: -0.39 -0.20
95% mean confidence interval for tuples %-change: -0.85% -0.62%
Tuples are helped.

total clauses in shared programs: 329158 -> 325350 (-1.16%)
clauses in affected programs: 111654 -> 107846 (-3.41%)
helped: 2787
HURT: 498
helped stats (abs) min: 1.0 max: 17.0 x̄: 1.57 x̃: 1
helped stats (rel) min: 0.76% max: 40.00% x̄: 6.92% x̃: 5.26%
HURT stats (abs)   min: 1.0 max: 3.0 x̄: 1.14 x̃: 1
HURT stats (rel)   min: 0.87% max: 50.00% x̄: 4.73% x̃: 3.77%
95% mean confidence interval for clauses value: -1.21 -1.10
95% mean confidence interval for clauses %-change: -5.39% -4.93%
Clauses are helped.

total cycles in shared programs: 172084.50 -> 166827.62 (-3.05%)
cycles in affected programs: 74698.83 -> 69441.96 (-7.04%)
helped: 3706
HURT: 568
helped stats (abs) min: 0.041665999999999315 max: 19.0 x̄: 1.44 x̃: 1
helped stats (rel) min: 0.24% max: 75.00% x̄: 9.48% x̃: 6.90%
HURT stats (abs)   min: 0.041665999999999315 max: 1.0 x̄: 0.15 x̃: 0
HURT stats (rel)   min: 0.25% max: 50.00% x̄: 2.21% x̃: 1.42%
95% mean confidence interval for cycles value: -1.28 -1.18
95% mean confidence interval for cycles %-change: -8.18% -7.67%
Cycles are helped.

total arith in shared programs: 57145.04 -> 57211.37 (0.12%)
arith in affected programs: 27595.12 -> 27661.46 (0.24%)
helped: 1933
HURT: 2259
helped stats (abs) min: 0.041665999999999315 max: 0.75 x̄: 0.09 x̃: 0
helped stats (rel) min: 0.16% max: 33.33% x̄: 2.74% x̃: 1.52%
HURT stats (abs)   min: 0.04166399999999726 max: 1.3333329999999997 x̄: 0.11 x̃: 0
HURT stats (rel)   min: 0.10% max: 100.00% x̄: 2.79% x̃: 1.62%
95% mean confidence interval for arith value: 0.01 0.02
95% mean confidence interval for arith %-change: 0.07% 0.40%
Arith are HURT.

total texture in shared programs: 12857 -> 12857 (0.00%)
texture in affected programs: 0 -> 0
helped: 0
HURT: 0

total vary in shared programs: 11157.75 -> 10222 (-8.39%)
vary in affected programs: 5643 -> 4707.25 (-16.58%)
helped: 3196
HURT: 0
helped stats (abs) min: 0.125 max: 1.875 x̄: 0.29 x̃: 0
helped stats (rel) min: 2.78% max: 75.00% x̄: 18.49% x̃: 15.00%
95% mean confidence interval for vary value: -0.30 -0.29
95% mean confidence interval for vary %-change: -18.88% -18.11%
Vary are helped.

total ldst in shared programs: 146420 -> 140270 (-4.20%)
ldst in affected programs: 66027 -> 59877 (-9.31%)
helped: 2942
HURT: 10
helped stats (abs) min: 1.0 max: 19.0 x̄: 2.09 x̃: 2
helped stats (rel) min: 0.90% max: 100.00% x̄: 16.81% x̃: 8.33%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 2.22% max: 50.00% x̄: 13.03% x̃: 3.33%
95% mean confidence interval for ldst value: -2.15 -2.02
95% mean confidence interval for ldst %-change: -17.53% -15.89%
Ldst are helped.

total quadwords in shared programs: 1398329 -> 1392117 (-0.44%)
quadwords in affected programs: 704641 -> 698429 (-0.88%)
helped: 3677
HURT: 1299
helped stats (abs) min: 1.0 max: 26.0 x̄: 2.51 x̃: 1
helped stats (rel) min: 0.10% max: 26.92% x̄: 2.64% x̃: 1.89%
HURT stats (abs)   min: 1.0 max: 20.0 x̄: 2.31 x̃: 1
HURT stats (rel)   min: 0.11% max: 44.44% x̄: 2.34% x̃: 1.55%
95% mean confidence interval for quadwords value: -1.34 -1.16
95% mean confidence interval for quadwords %-change: -1.44% -1.25%
Quadwords are helped.

total threads in shared programs: 35234 -> 35311 (0.22%)
threads in affected programs: 119 -> 196 (64.71%)
helped: 91
HURT: 14
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.60 0.87
95% mean confidence interval for threads %-change: 70.08% 89.92%
Threads are helped.

total loops in shared programs: 125 -> 125 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 149 -> 144 (-3.36%)
spills in affected programs: 22 -> 17 (-22.73%)
helped: 1
HURT: 0

total fills in shared programs: 966 -> 956 (-1.04%)
fills in affected programs: 44 -> 34 (-22.73%)
helped: 1
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15090>
2022-02-22 15:21:09 +00:00
Alyssa Rosenzweig e0e63c2a8e pan/bi: Specialize IDVS in NIR
It's a bit more code, but it's needed to chew through control flow since we
don't have a backend version of dead_cf. Results are really good, meaning I
really screwed this up the first time around (hence the cc mesa-stable).

total instructions in shared programs: 1963576 -> 1939513 (-1.23%)
instructions in affected programs: 671053 -> 646990 (-3.59%)
helped: 4436
HURT: 729
helped stats (abs) min: 1.0 max: 43.0 x̄: 5.75 x̃: 6
helped stats (rel) min: 0.21% max: 100.00% x̄: 6.47% x̃: 5.17%
HURT stats (abs)   min: 1.0 max: 22.0 x̄: 2.01 x̃: 1
HURT stats (rel)   min: 0.50% max: 50.00% x̄: 10.45% x̃: 9.09%
95% mean confidence interval for instructions value: -4.77 -4.55
95% mean confidence interval for instructions %-change: -4.36% -3.80%
Instructions are helped.

total tuples in shared programs: 1533335 -> 1523194 (-0.66%)
tuples in affected programs: 483167 -> 473026 (-2.10%)
helped: 3414
HURT: 1288
helped stats (abs) min: 1.0 max: 20.0 x̄: 3.73 x̃: 2
helped stats (rel) min: 0.27% max: 100.00% x̄: 4.87% x̃: 3.03%
HURT stats (abs)   min: 1.0 max: 19.0 x̄: 2.02 x̃: 1
HURT stats (rel)   min: 0.24% max: 38.10% x̄: 8.10% x̃: 5.88%
95% mean confidence interval for tuples value: -2.28 -2.03
95% mean confidence interval for tuples %-change: -1.62% -1.02%
Tuples are helped.

total clauses in shared programs: 351432 -> 329158 (-6.34%)
clauses in affected programs: 142237 -> 119963 (-15.66%)
helped: 5328
HURT: 3
helped stats (abs) min: 1.0 max: 43.0 x̄: 4.18 x̃: 4
helped stats (rel) min: 0.74% max: 100.00% x̄: 19.44% x̃: 17.24%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 9.09% max: 12.50% x̄: 10.90% x̃: 11.11%
95% mean confidence interval for clauses value: -4.25 -4.11
95% mean confidence interval for clauses %-change: -19.72% -19.12%
Clauses are helped.

total cycles in shared programs: 202830.92 -> 172084.50 (-15.16%)
cycles in affected programs: 117078.42 -> 86332 (-26.26%)
helped: 5450
HURT: 1
helped stats (abs) min: 0.083333 max: 49.0 x̄: 5.64 x̃: 5
helped stats (rel) min: 1.42% max: 100.00% x̄: 27.94% x̃: 25.64%
HURT stats (abs)   min: 0.25 max: 0.25 x̄: 0.25 x̃: 0
HURT stats (rel)   min: 2.46% max: 2.46% x̄: 2.46% x̃: 2.46%
95% mean confidence interval for cycles value: -5.74 -5.54
95% mean confidence interval for cycles %-change: -28.30% -27.58%
Cycles are helped.

total arith in shared programs: 57274.29 -> 57145.04 (-0.23%)
arith in affected programs: 16418.33 -> 16289.08 (-0.79%)
helped: 2442
HURT: 1784
helped stats (abs) min: 0.041665999999999315 max: 0.75 x̄: 0.14 x̃: 0
helped stats (rel) min: 0.23% max: 100.00% x̄: 5.51% x̃: 2.87%
HURT stats (abs)   min: 0.041665999999999315 max: 0.9166670000000003 x̄: 0.12 x̃: 0
HURT stats (rel)   min: 0.00% max: 100.00% x̄: 25.13% x̃: 9.09%
95% mean confidence interval for arith value: -0.04 -0.03
95% mean confidence interval for arith %-change: 6.61% 8.24%
Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).

total texture in shared programs: 12857 -> 12857 (0.00%)
texture in affected programs: 0 -> 0
helped: 0
HURT: 0

total vary in shared programs: 11157.75 -> 11157.75 (0.00%)
vary in affected programs: 0 -> 0
helped: 0
HURT: 0

total ldst in shared programs: 177208 -> 146420 (-17.37%)
ldst in affected programs: 117098 -> 86310 (-26.29%)
helped: 5447
HURT: 0
helped stats (abs) min: 1.0 max: 49.0 x̄: 5.65 x̃: 5
helped stats (rel) min: 1.92% max: 100.00% x̄: 27.91% x̃: 25.64%
95% mean confidence interval for ldst value: -5.75 -5.55
95% mean confidence interval for ldst %-change: -28.27% -27.56%
Ldst are helped.

total quadwords in shared programs: 1436507 -> 1398329 (-2.66%)
quadwords in affected programs: 515101 -> 476923 (-7.41%)
helped: 5150
HURT: 111
helped stats (abs) min: 1.0 max: 39.0 x̄: 7.46 x̃: 6
helped stats (rel) min: 0.17% max: 100.00% x̄: 10.02% x̃: 8.24%
HURT stats (abs)   min: 1.0 max: 9.0 x̄: 2.01 x̃: 1
HURT stats (rel)   min: 0.43% max: 21.62% x̄: 3.57% x̃: 1.94%
95% mean confidence interval for quadwords value: -7.41 -7.11
95% mean confidence interval for quadwords %-change: -9.98% -9.49%
Quadwords are helped.

total threads in shared programs: 35025 -> 35228 (0.58%)
threads in affected programs: 218 -> 421 (93.12%)
helped: 208
HURT: 5
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.91 0.99
95% mean confidence interval for threads %-change: 93.40% 99.55%
Threads are helped.

total loops in shared programs: 128 -> 125 (-2.34%)
loops in affected programs: 3 -> 0
helped: 3
HURT: 0
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%

total spills in shared programs: 158 -> 149 (-5.70%)
spills in affected programs: 15 -> 6 (-60.00%)
helped: 9
HURT: 0

total fills in shared programs: 1133 -> 966 (-14.74%)
fills in affected programs: 197 -> 30 (-84.77%)
helped: 9
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15090>
2022-02-22 15:21:09 +00:00
Alyssa Rosenzweig 3c1021cd1e panvk: Use more reliable assert for UBO pushing
The important thing isn't the number of words pushed, it's that there are no
UBOs required for us to upload. Check that instead.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15090>
2022-02-22 15:21:09 +00:00
Georg Lehmann d223b7f096 radv, aco: Add u_foreach_bit to .clang-format.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15083>
2022-02-22 14:57:29 +00:00
Xaver Hugl a6be12fdad gbm: improve documentation about the lifetime of resources
Signed-off-by: Xaver Hugl <xaver.hugl@gmail.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10906>
2022-02-22 14:42:52 +01:00
Marek Olšák 62074cb4ac ac: update shadowed registers
based on PAL

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák e74929bfef radeonsi: move Arcturus code outside the gfx9 branch
preparation for a future commit

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák c740fd18ba ac/llvm: replace structured by vindex != NULL in ac_build_buffer_store_common
"raw" (IDXEN=0) and "structured" (IDXEN=1) do bounds checking differently.
From `si_make_buffer_descriptor`:
    * - For VMEM and inst.IDXEN == 0 or STRIDE == 0, it's in byte units.
    * - For VMEM and inst.IDXEN == 1 and STRIDE != 0, it's in units of STRIDE.

so there is a difference between setting vindex = i32_0 and vindex = NULL.
Instead of having the `structured` flag, we can just check if vindex is NULL.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 1038382baf ac/llvm: replace structured by vindex != NULL in ac_build_tbuffer_store
"raw" (IDXEN=0) and "structured" (IDXEN=1) do bounds checking differently.
From `si_make_buffer_descriptor`:
    * - For VMEM and inst.IDXEN == 0 or STRIDE == 0, it's in byte units.
    * - For VMEM and inst.IDXEN == 1 and STRIDE != 0, it's in units of STRIDE.

so there is a difference between setting vindex = i32_0 and vindex = NULL.
Instead of having the `structured` flag, we can just check if vindex is NULL.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák c8e2c6faf6 radeonsi: use SET_SH_REG_INDEX with index=3 for registers containing CU_EN
This matches PAL and RADV behavior. It's for preemption.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 79a7ab642a ac/surface: add more elements to meta equations because HTILE can use them
according to gfx10SwizzlePattern.h

Fixes: 9fabbf2150 - ac/surface: copy the HTILE equations to the surface

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 9a28f79f7b ac/surface/tests: fix missing NUM_PKRS extraction in test_modifier
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 12e00be09b radeonsi: apply the LLVM discard bug workaround to LLVM 13 only
It was fixed in LLVM 14.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 21f169b2fb ac,radeonsi: rework and optimize how TMPRING_SIZE is set
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Yogesh Mohan Marimuthu 3c7df183a3 radeonsi: prepare clamp, alpha test before mrtz prepare
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Yogesh Mohan Marimuthu c268dafae5 radeonsi: move clamp, alpha test from si_export_mrt_color() to new function
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 1485f683e3 radeonsi: fix the unaligned clear_buffer fallback with TC
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 4e49a05e37 radeonsi: increase the tesselation factor ring size
based on PAL

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 37c26a72a4 radeonsi: remove bit gaps in SI_RESOURCE_FLAG_*
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák f865631b1b radeonsi: replace SI_RESOURCE_FLAG_UNMAPPABLE with PIPE_RESOURCE_FLAG_UNMAPPABLE
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 790362c10e radeonsi: don't map buffers that VK made unmappable
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák ad9b5ac0a1 radeonsi: more fixes for si_buffer_from_winsys_buffer for GL-VK interop
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
jiadozhu c9097d8939 radeonsi: fix crash in flush_resource when used with buffers
glWaitSemaphoreEXT triggers si_flush_resource callback
on pipe buffer resources, which may cause segmentation fault.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák f968cb921d radeonsi: reduce the max TBO/SSBO binding size to 512 MB to help 32-bit builds
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 1585b4584d radeonsi: document an unexpected behavior of PS_DONE
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák cc5fd00c5c radeonsi: change ACCUM_ISOLINE to 12
based on PAL

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 9c5b4fa097 radeonsi: program SQ_THREAD_TRACE_CTRL.AUTO_FLUSH_MODE on gfx10.3
discovered internally

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 18a1af4929 radeonsi: always set FLUSH_ON_BINNING_TRANSITION
The hardware does the right thing automatically.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák a87ab82f25 radeonsi: add assertions to check if buffer_map/texture_map calls are valid
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák f23e07525f winsys/amdgpu: fix a warning of defining radeon_screen_create_t twice
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák bf7d25cd1b ac/llvm: remove unused function dpp_row_sl
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák cfaaa0892f ac/surface: don't set the display flag for 1D textures
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 2f2fca24d2 ac/gpu_info: print units for some radeon_info fields
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 53f683ff67 ac: add a gfx9 workaround for high priority compute
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 197467c238 amd: add a workaround for an SQ perf counter bug
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 95af3cc2f8 amd: remove the _UMD suffix from register definitions
It was mistakenly added to indicate it's for a User-Mode Driver,
but all defined registers in Mesa are.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:04 +00:00
Marek Olšák 707a94f3c5 winsys/radeon: fix a hang due to introducing spi_cu_en
Fixes: 5406ad93 "radeonsi: set COMPUTE_DESTINATION_EN_SEn to spi_cu_en"
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5989

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
2022-02-22 11:41:03 +00:00
Iago Toral Quiroga c4a78a2d2a broadcom/compiler: fix register class patching for postponed spills
If we have a postponed spill, the temp we create at ip is no longer
the spilled temp and therefore is affected by the thrsw injection.

Fixes corruption in the additive blending animation demo from
Three.js.

Fixes: f3c3228522 ('broadcom/compiler: do not rebuild the interference graph after each spill')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15112>
2022-02-22 11:17:10 +00:00
Tapani Pälli 09b86b4061 iris: setup internal_format for memory object resources
We need to setup internal_format for resource in case main surface was
not configured (iris_resource_configure_main) which is the case with
vertex buffer objects, otherwise transfer helper will make wrong
decisions when copying such a resource.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14925>
2022-02-22 10:56:21 +00:00
Erik Faye-Lund 5c5243adb7 vulkan/wsi: use buffer-image code-path on Windows
This means we no longer map optimal-tiling images in order to display
them on screen, but instead copy via a buffer, which is guaranteed to
linearize.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12210>
2022-02-22 10:04:34 +00:00
Erik Faye-Lund 2e8c119531 vulkan/wsi: add transition to/from transfer-src state
Without these barriers, we'll try to blit from the image when it's in
the incorrect layout, which is illegal.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12210>
2022-02-22 10:04:34 +00:00
Erik Faye-Lund 25a37fabb7 vulkan/wsi: untangle buffer-images from prime
Not all Vulkan implementations allows rendering to linear images, so in
order to support scanning out from these on Windows we might have to copy
through a buffer like we do in the PRIME path.

To avoid reimplementing the same, let's instead generalize the code a
bit so it doesn't have to specfy any PRIME-specific details.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12210>
2022-02-22 10:04:34 +00:00
Boris Brezillon 32c2db1b25 vulkan/wsi: Don't open-code vk_format_get_blocksize()
We already have a standard helper to retrieve the texel block size from
a VkFormat, let's use it instead of adding a new helper.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Suggested-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12210>
2022-02-22 10:04:34 +00:00
Boris Brezillon 7c1c6606d0 vulkan/wsi: Use ALIGN_POT() instead of open-coding it
align_u32() and ALIGN_POT() are doing the same thing.
Replace align_u32() calls by ALIGN_POT() ones and get rid
of align_u32().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Suggested-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12210>
2022-02-22 10:04:34 +00:00
Erik Faye-Lund 83938a1bb3 vulkan/wsi: pass win32-swapchain directly
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12210>
2022-02-22 10:04:34 +00:00
Marcin Ślusarz fddab17e54 anv: cleanup begin_subpass & end_subpass
Share values of some expressions instead of duplicating the same logic.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15079>
2022-02-22 09:09:05 +00:00
Marcin Ślusarz f91bfc80ba intel/compiler: remove redundant code from fs_visitor::run_*
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15079>
2022-02-22 09:09:05 +00:00
Gert Wollny 2fbb4e85f7 virgl: Enable PIPE_CAP_TGSI_TEXCOORD when the host supports it
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15025>
2022-02-22 08:46:54 +00:00
Juan A. Suarez Romero db8a202b41 vc4: remove redundant initialization
These assignments are not required.

Partially fixes https://gitlab.freedesktop.org/mesa/mesa/-/issues/3966

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15080>
2022-02-22 08:29:51 +00:00
Samuel Pitoiset 369b8cffea radv,aco,llvm: lower adjusting vertex alpha in NIR
Instead of duplicating the same lowering in both compiler backends.
This pass will be used to do more VS input lowering.

fossils-db (Polaris10):
Totals from 48 (0.04% of 135960) affected shaders:
VGPRs: 1692 -> 1684 (-0.47%)
CodeSize: 54016 -> 53964 (-0.10%); split: -0.11%, +0.01%
MaxWaves: 339 -> 341 (+0.59%)
Instrs: 11260 -> 11247 (-0.12%); split: -0.13%, +0.02%
Latency: 88165 -> 88113 (-0.06%); split: -0.07%, +0.01%
InvThroughput: 36153 -> 36093 (-0.17%)
Copies: 583 -> 568 (-2.57%)

fossils-db (Pitcairn):
Totals from 43 (0.03% of 135960) affected shaders:
VGPRs: 1548 -> 1552 (+0.26%)
CodeSize: 47900 -> 47820 (-0.17%)
Instrs: 10751 -> 10731 (-0.19%)
Latency: 83029 -> 82873 (-0.19%)
VClause: 168 -> 164 (-2.38%)
SClause: 393 -> 391 (-0.51%)
Copies: 705 -> 685 (-2.84%)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15076>
2022-02-22 08:08:42 +00:00
Qiang Yu 100c80392a util/util_vertex_state_cache: remove error check when deinit
Application may exit without freeing created display list, this
may leave the cache not empty.

This is triggered by Abaqus which just close X11 display without
calling any of GLX cleanup functions like glXDestroyContext. But
GLX hook to X11 display close function to free GLX screen resource.
So display list as a context resource has not been freed, but
cache as a screen resource is freed.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14926>
2022-02-22 07:10:40 +00:00