1341 Commits

Author SHA1 Message Date
Connor Abbott 3cad11d84a tu: Delete unused tu_clear_blit GS handling
This has been unused for a while since we switched to writing the
array index in the VS.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16147>
2022-05-13 17:07:05 +00:00
Danylo Piliaiev 9a11ad7efd tu: Fix indices of drm_msm_gem_submit_cmd when filling them
For some reason CTS doesn't trigger the issue...
When a submit entry is not filled, the kernel says:
 [drm:msm_ioctl_gem_submit] *ERROR* invalid type: 00000000

Fixes: dbae9fa7d8
("tu: implement sysmem vs gmem autotuner")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16474>
2022-05-12 16:44:09 +00:00
Emma Anholt b282d504a4 turnip: Add a TU_DEBUG=perf debug option.
For doing performance investigation, I often find it useful to have a "are
we tripping over any of our performance TODOs?" flag, so add it and use it
in a few of the TODOs.

This also greatly cleans up the deqp-vk logs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16316>
2022-05-12 01:00:25 +00:00
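
As an illustration of how a comma-separated debug variable such as TU_DEBUG=perf can be turned into a flag mask, here is a minimal sketch. It is not the driver's parser: the option table, the bit positions, and the names sketch_debug_option and sketch_parse_debug are all hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical option table; the names and bit positions are illustrative
 * and do not reflect the driver's real flag values. */
struct sketch_debug_option {
   const char *name;
   uint32_t flag;
};

static const struct sketch_debug_option sketch_options[] = {
   { "startup", 1u << 0 },
   { "perf",    1u << 1 },
   { "layout",  1u << 2 },
};

/* Parse a comma-separated list such as "perf,layout" into a bitmask.
 * Unknown names are ignored; NULL or "" yields 0. */
static uint32_t
sketch_parse_debug(const char *str)
{
   uint32_t flags = 0;

   if (str == NULL)
      return 0;

   while (*str != '\0') {
      size_t len = strcspn(str, ",");   /* length of the current token */
      size_t count = sizeof(sketch_options) / sizeof(sketch_options[0]);

      for (size_t i = 0; i < count; i++) {
         const char *name = sketch_options[i].name;
         if (strlen(name) == len && strncmp(str, name, len) == 0)
            flags |= sketch_options[i].flag;
      }

      str += len;
      if (*str == ',')
         str++;                         /* skip the separator */
   }
   return flags;
}
```

A caller would test the resulting mask before emitting a performance warning, so the log stays quiet unless the perf option was requested.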
Danylo Piliaiev 187d3df52c tu: Do not flush ccu in clear/blits during renderpass
For clears/blits, a CCU flush is not only worse for perf, but also messes up
flush_bits when executed in a conditional set of commands.

We already don't flush for 3d blits.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6419

Fixes: 487aa807bd
("tu: Rewrite flushing to use barriers")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16352>
2022-05-11 08:07:50 +00:00
Danylo Piliaiev db69218cbe tu: Implement VK_EXT_image_view_min_lod
Relevant tests:
 dEQP-VK.texture.mipmap.*.image_view_min_lod.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16292>
2022-05-09 07:53:41 +00:00
Chia-I Wu 53d87865ca turnip: fix drm modifier support with planar formats
We need to advertise the results of tu6_plane_count and handle
VK_IMAGE_ASPECT_MEMORY_PLANE_*_BIT.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6374
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16169>
2022-04-29 22:30:45 +00:00
Danylo Piliaiev 6e6ba85fd9 turnip: Fix tu_debug_flags values clashing
Was not caught during rebase...

Fixes: 725ae34458
("turnip: Add debug option to print gmem load/store skip stats")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16246>
2022-04-29 15:09:36 +00:00
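
The clash fixed in 6e6ba85fd9 is the kind of mistake a small self-check can catch. The sketch below is only an illustration under assumptions: the enum values and the helper sketch_debug_flags_are_unique are hypothetical and are not the driver's tu_debug_flags definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical debug flags; each option must own a distinct bit. */
enum sketch_debug_flags {
   SKETCH_DEBUG_STARTUP   = 1u << 0,
   SKETCH_DEBUG_NIR       = 1u << 1,
   SKETCH_DEBUG_PERF      = 1u << 2,
   SKETCH_DEBUG_LOG_SKIPS = 1u << 3,
};

/* Returns true when no two entries of the list share a bit. */
static bool
sketch_debug_flags_are_unique(const uint32_t *flags, size_t count)
{
   uint32_t seen = 0;

   for (size_t i = 0; i < count; i++) {
      assert(flags[i] != 0);   /* an empty flag would be meaningless */
      if (seen & flags[i])
         return false;         /* bit already claimed by an earlier flag */
      seen |= flags[i];
   }
   return true;
}
```

Running such a check once at startup in debug builds would flag two options that were given the same value after a rebase.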
Danylo Piliaiev 725ae34458 turnip: Add debug option to print gmem load/store skip stats
TU_DEBUG=log_skip_gmem_ops prints stats about skipped gmem
loads/stores every second.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15974>
2022-04-29 09:29:55 +00:00
Danylo Piliaiev 0c489f18cb turnip: Skip load/stores for tiles with no geometry
When HW binning is used tile loads/stores could be skipped
if there is no geometry in the tile.

Loads could be skipped when:
- The attachment won't be resolved, otherwise if load is skipped
  there would be holes in the resolved attachment;
- There is no vkCmdClearAttachments afterwards since it is likely
  a partial clear done via 2d blit (2d blit doesn't produce geometry).

Stores could be skipped when:
- The attachment was not cleared, which may happen by load_op or
  vkCmdClearAttachments;
- When store is not a resolve.

I chose to predicate each load/store separately to allow them to be
skipped when only some attachments are cleared or resolved.

Gmem loads are moved into separate cs because whether to emit
CP_COND_REG_EXEC depends on HW binning being enabled and usage of
vkCmdClearAttachments.

CP_COND_REG_EXEC predicate could be changed during draw_cs only
by perf query, in such case the predicate should be re-emitted.
(At the moment it is always re-emitted before stores)

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15974>
2022-04-29 09:29:55 +00:00
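
The skip conditions listed in 0c489f18cb can be written as two per-attachment predicates. This is a minimal sketch of the rules as the commit message states them, not the driver code; struct sketch_attachment and both function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical per-attachment facts, named after the commit message. */
struct sketch_attachment {
   bool will_be_resolved;   /* attachment is resolved later */
   bool cleared;            /* cleared by load_op or vkCmdClearAttachments */
   bool store_is_resolve;   /* the store itself is a resolve */
};

/* A load for a tile without geometry may be skipped only with HW binning,
 * when the attachment won't be resolved and no vkCmdClearAttachments
 * follows (a 2d-blit clear produces no geometry). */
static bool
sketch_can_skip_load(const struct sketch_attachment *att,
                     bool hw_binning, bool clear_attachments_follows)
{
   return hw_binning && !att->will_be_resolved && !clear_attachments_follows;
}

/* A store for a tile without geometry may be skipped only with HW binning,
 * when the attachment was not cleared and the store is not a resolve. */
static bool
sketch_can_skip_store(const struct sketch_attachment *att, bool hw_binning)
{
   return hw_binning && !att->cleared && !att->store_is_resolve;
}
```

Evaluating the predicates per attachment mirrors the commit's choice to predicate each load/store separately, so one cleared attachment does not force stores for the others.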
Emma Anholt 550975f229 turnip: Don't disable LRZ in subpasses after the first in the easy case.
If it's the same depth/stencil attachment, then there's no need to turn
off LRZ just because the subpass changed.  Doesn't help gfxbench perf yet,
but will with !16014.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15982>
2022-04-19 18:45:30 +00:00
Emma Anholt 7ba63f516a turnip: Ignore TOP/BOTTOM_OF_PIPE bits in subpass src/dst dep flags.
gfxbench sets these between the gbuffer subpass and the following ones.
They should be no-ops as subpass dependencies.  gfxbench vk-5-debug perf
12.8 -> 14.6 fps thanks to getting gmem on the gbuffer rendering.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15982>
2022-04-19 18:45:30 +00:00
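
Treating TOP/BOTTOM_OF_PIPE as no-ops in a subpass dependency amounts to masking those bits out before deciding whether the dependency needs anything done. A minimal sketch, assuming a plain 32-bit stage mask: the numeric values match the Vulkan VkPipelineStageFlagBits definitions, while the SKETCH_ names and the helper are hypothetical.

```c
#include <stdint.h>

/* Values copied from VkPipelineStageFlagBits so the sketch needs no
 * Vulkan headers. */
#define SKETCH_STAGE_TOP_OF_PIPE             0x00000001u
#define SKETCH_STAGE_FRAGMENT_SHADER         0x00000080u
#define SKETCH_STAGE_COLOR_ATTACHMENT_OUTPUT 0x00000400u
#define SKETCH_STAGE_BOTTOM_OF_PIPE          0x00002000u

/* Drop the stage bits that are treated as no-ops for a subpass
 * dependency; whatever remains describes the real dependency. */
static uint32_t
sketch_strip_noop_stages(uint32_t stage_mask)
{
   return stage_mask &
          ~(SKETCH_STAGE_TOP_OF_PIPE | SKETCH_STAGE_BOTTOM_OF_PIPE);
}
```

If the stripped source and destination masks are both empty, the dependency imposes no ordering and need not count against using gmem for the render pass.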
Emma Anholt 7ba0c44607 turnip: Add nir_opt_conditional_discard.
We can easily do discard_if in the backend without control flow, but it
wasn't done in ir3 because the GL frontend already did it for us.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15982>
2022-04-19 18:45:29 +00:00
Emma Anholt ce15bf19fb turnip: Add TU_DEBUG=layout for dumping image layouts.
This was useful for comparing image allocations between gfxbench
gl_5_normal and vk_5_normal to see if rendering was generally equivalent
(formats, MSAA, UBWC choices, and notably gfxbench vk was choosing DXT5
instead of ASTC on non-android builds!)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15982>
2022-04-19 18:45:29 +00:00
Danylo Piliaiev 2c683519e2 turnip: Try harder to keep LRZ valid and fix a few edge cases
Refactored tu6_calculate_lrz_state and added comments.

1) If there is no depth write we could keep LRZ valid with any
compare op, we just have to temporarily disable LRZ for incompatible
ops in such case.

2) Found that VK_COMPARE_OP_EQUAL is not compatible with LRZ,
and since it doesn't change LRZ buffer - LRZ could be just
temporarily disabled. This fixes rendering of grass/trees in
PUBG mobile on angle.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6127

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16014>
2022-04-19 18:06:58 +00:00
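
The two rules in 2c683519e2 can be read as a small decision function. The sketch below is an interpretation, not the refactored tu6_calculate_lrz_state: the enums and the function name are hypothetical, and the final "invalidate" case for an incompatible op with depth writes enabled is an assumption the commit message does not spell out.

```c
#include <stdbool.h>

/* Hypothetical classification of the depth compare op for this draw. */
enum sketch_compare_class {
   SKETCH_OP_LRZ_COMPATIBLE,
   SKETCH_OP_EQUAL,              /* VK_COMPARE_OP_EQUAL */
   SKETCH_OP_OTHER_INCOMPATIBLE,
};

enum sketch_lrz_action {
   SKETCH_LRZ_USE,                 /* LRZ stays enabled */
   SKETCH_LRZ_DISABLE_TEMPORARILY, /* skip LRZ for this draw, keep it valid */
   SKETCH_LRZ_INVALIDATE,          /* assumption: LRZ contents can't be trusted */
};

static enum sketch_lrz_action
sketch_lrz_action_for_draw(enum sketch_compare_class op, bool depth_write)
{
   if (op == SKETCH_OP_LRZ_COMPATIBLE)
      return SKETCH_LRZ_USE;

   /* Rule 1: without depth writes the LRZ buffer is untouched, so any
    * incompatible op only needs a temporary disable.
    * Rule 2: EQUAL doesn't change the LRZ buffer either. */
   if (op == SKETCH_OP_EQUAL || !depth_write)
      return SKETCH_LRZ_DISABLE_TEMPORARILY;

   /* Assumed fallback, not stated in the commit message. */
   return SKETCH_LRZ_INVALIDATE;
}
```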
Emma Anholt 835704e669 turnip: Move autotune buffers to suballoc.
Now the ANGLE trex_200 trace replay does a single BO allocation at startup
for autotune results instead of one per frame (~350 for the whole replay).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15038>
2022-04-12 01:01:56 +00:00
Emma Anholt 7c636acd53 turnip: Get autotune off of ralloc destructors.
We've wanted to remove destructors from ralloc's API for a long time (it's
an extra storage cost per ralloc for a rarely-used feature), and for the
suballoc change we'd need to spend more storage on storing the tu_device
pointer per result since destructors don't get anything else but the
pointer passed into them.

Fixes use-after-frees:

=================================================================
==2383==ERROR: AddressSanitizer: heap-use-after-free on address 0xffff88fe1940 at pc 0xffff934f427c bp 0xfffff5481e90 sp 0xfffff5481ea8
WRITE of size 8 at 0xffff88fe1940 thread T0
    #0 0xffff934f4278 in list_del ../src/util/list.h:108
    #1 0xffff934f4278 in result_destructor ../src/freedreno/vulkan/tu_autotune.c:237
    #2 0xffff9377793c in unsafe_free ../src/util/ralloc.c:300
    #3 0xffff9377793c in ralloc_free ../src/util/ralloc.c:265
    #4 0xffff934f4368 in history_destructor ../src/freedreno/vulkan/tu_autotune.c:229
    #5 0xffff9377793c in unsafe_free ../src/util/ralloc.c:300
    #6 0xffff9377793c in ralloc_free ../src/util/ralloc.c:265
    #7 0xffff934f5990 in tu_autotune_on_submit ../src/freedreno/vulkan/tu_autotune.c:442
[...]

0xffff88fe1940 is located 80 bytes inside of 112-byte region [0xffff88fe18f0,0xffff88fe1960)
freed by thread T0 here:
    #0 0xffff9c1c90d8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
    #1 0xffff934f4368 in history_destructor ../src/freedreno/vulkan/tu_autotune.c:229
    #2 0xffff9377793c in unsafe_free ../src/util/ralloc.c:300
    #3 0xffff9377793c in ralloc_free ../src/util/ralloc.c:265
    #4 0xffff934f5990 in tu_autotune_on_submit ../src/freedreno/vulkan/tu_autotune.c:442
    #5 0xffff935cf2ac in tu_queue_submit_locked ../src/freedreno/vulkan/tu_drm.c:997
[...]

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15038>
2022-04-12 01:01:56 +00:00
Emma Anholt 435d4f08b2 turnip: Reduce the pipeline's CS allocation a bit.
We don't return unused space to the suballocator, so it's a little useful
to limit how much we overallocate to reduce memory footprint.  I took a
look through the tu_cs_emit_array() calls and accounted for a couple of
them in the variant-specific space calculation, then dropped the base
allocation by factors of 2 until we started throwing asserts.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15038>
2022-04-12 01:01:56 +00:00
Emma Anholt 58f6331eec turnip: Skip telling the kernel the BO list when we don't need any.
In fencing, we sometimes do a dummy submit with no nr_cmds.  If we don't
have commands to execute, we don't need to pin or fence any BOs either.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15038>
2022-04-12 01:01:56 +00:00
Emma Anholt dc3203b087 turnip: Sub-allocate pipelines out of a device-global BO pool.
Allocating a BO for each pipeline meant that for apps with many pipelines
(such as Asphalt9 under ANGLE), we would end up spending too much time in
the kernel tracking the BO references.

Looking at CS:Source on zink, before we had 85 BOs for the pipelines for a
total of 1036 kb, and now we have 7 BOs for a total of 896 kb.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15038>
2022-04-12 01:01:56 +00:00
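
To show why sub-allocation cuts the BO count, here is a minimal bump-allocator sketch. It is not the driver's suballocator: the 4096-byte BO size, the struct layout, and all sketch_ names are hypothetical, and like the pool described in this series it never returns unused space.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_POOL_BO_SIZE 4096u   /* hypothetical size of each backing BO */

struct sketch_pool {
   uint32_t bo_count;   /* backing BOs "allocated" so far */
   uint32_t used;       /* bytes consumed in the current BO */
};

struct sketch_suballoc {
   uint32_t bo_index;   /* which backing BO the range lives in */
   uint32_t offset;     /* byte offset inside that BO */
};

/* Carve an aligned range out of the current BO, starting a new BO only
 * when the request does not fit. Freed space is never reused. */
static struct sketch_suballoc
sketch_pool_alloc(struct sketch_pool *pool, uint32_t size, uint32_t align)
{
   assert(size > 0 && size <= SKETCH_POOL_BO_SIZE);
   assert(align != 0 && (align & (align - 1)) == 0);   /* power of two */

   uint32_t offset = (pool->used + align - 1) & ~(align - 1);

   if (pool->bo_count == 0 || offset + size > SKETCH_POOL_BO_SIZE) {
      pool->bo_count++;   /* a real driver would allocate a BO here */
      offset = 0;
   }

   pool->used = offset + size;
   return (struct sketch_suballoc){ pool->bo_count - 1, offset };
}
```

With these illustrative numbers, five 1000-byte pipelines aligned to 64 bytes share two backing BOs instead of needing five, which is the effect the commit reports for CS:Source at a larger scale.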
Emma Anholt e0fbdd3eda turnip: Stop allocating unused pvtmem space in the pipeline CS.
The pvtmem was split off to a separate read/write BO.

Fixes: 931ad19a18 ("turnip: make cmdstream bo's read-only to GPU")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15038>
2022-04-12 01:01:56 +00:00
Emma Anholt 80c44a6626 turnip: Track refcounts on BOs in kgsl as well.
I'm going to be using the BO refcount for the pipeline and autotune buffer
suballocation.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15038>
2022-04-12 01:01:56 +00:00
Connor Abbott 32af90d96f freedreno/a6xx: Fix SP_DS_CTRL_REG0 definition
Bit 20 isn't actually MERGEDREGS, the mode for the entire geometry
pipeline is controlled by SP_VS_CTRL_REG0::MERGEDREGS and it appears to
be something preamble-related instead since writing any register in the
preamble hangs if it's set. This fixes those hangs on freedreno and
turnip since we no longer set it.

Fixes: fccc35c2de ("ir3: Add preamble optimization pass")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15801>
2022-04-08 04:40:17 +00:00
Danylo Piliaiev dde1623ed2 turnip: Implement VK_EXT_primitives_generated_query
Similar to pipeline statistics but done for a single counter.

We use REG_A6XX_RBBM_PRIMCTR_7 to get generated primitives
and not PRIMCTR_8 because PRIMCTR_7 counts pre-clipped prims
while PRIMCTR_8 counts them after clipping.

OpenGL spec for GL_PRIMITIVES_GENERATED says:
 "Subsequent rendering will increment the counter once for every
  vertex that is emitted from the geometry shader, or from the
  vertex shader if no geometry shader is present."

Passes tests:
 dEQP-VK.transform_feedback.primitives_generated_query.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15746>
2022-04-07 08:01:59 +00:00
Danylo Piliaiev a5a97f0b77 turnip: Fix subpassLoad from CUBE input attachments
Cube descriptors require a different sampling instruction in shader,
however we don't know whether image is a cube or not until the start
of a renderpass. We have to patch the descriptor to make it compatible
with how it is sampled in shader.

For reference, subpassLoad is currently translated into isaml.a

Blob v615 also doesn't handle this case correctly.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15734>
2022-04-06 19:42:30 +03:00
Danylo Piliaiev 6c18602164 turnip: Add "unaligned_store" debug option to better test gmem stores
Unaligned store is incredibly rare in CTS, we have to force it to
actually test it.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15532>
2022-04-06 08:44:28 +00:00
Danylo Piliaiev e255305e84 turnip: Ignore aspectMask for D32S8 framebuffer attachment
Vulkan spec says:

 "When an image view of a depth/stencil image is used as a depth/stencil
  framebuffer attachment, the aspectMask is ignored and both depth and
  stencil image subresources are used."

Since we use two planes for D32S8 format we have to add a special
case for depth in addition to already existing case for stencil.

Fixes hang in CTS:
 dEQP-VK.renderpass.depth_stencil_write_conditions.stencil_kill_write_d32sf_s8ui

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15532>
2022-04-06 08:44:28 +00:00
Danylo Piliaiev 72716993b2 turnip: Correctly store separate stencil in gmem store
- When resolving d32s8 to s8 we stored stencil with a wrong format.
- For unaligned multi-sample store we used wrong gmem offset for stencil.

If unaligned store is forced this change fixes a hang in:
 dEQP-VK.renderpass2.depth_stencil_resolve.image_2d_32_32.samples_2.d32_sfloat_s8_uint_separate_layouts.compatibility_depth_zero_stencil_zero_testing_stencil

Fixes: b157a5d0d6
("tu: Implement non-aligned multisample GMEM STORE_OP_STORE")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15532>
2022-04-06 08:44:28 +00:00
Jason Ekstrand bdf52654ac turnip: Enable VK_EXT_debug_utils
It's implemented in common code as long as you use vk_command_buffer.

Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15560>
2022-04-06 01:18:23 +00:00
Connor Abbott b91b90c256 tu: Expose VK_KHR_maintenance4
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15488>
2022-04-05 17:46:35 +00:00
Connor Abbott 5eb63d825f tu: Remove tu_pipeline::layout
This makes it more obvious that the layout is never used after creating
the pipeline, which is required by VK_KHR_maintenance4.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15488>
2022-04-05 17:46:35 +00:00
Connor Abbott 7455a7a44c tu: Fill out maxBufferSize
It seems this is really a workaround for silly issues in
GetBufferMemoryRequirements when you ask for a really large buffer. Just
expose the maximum possible size ATM.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15488>
2022-04-05 17:46:35 +00:00
Connor Abbott d1762b7df0 tu: Implement GetDevice*MemoryRequirements()
Based mostly on anv, which is a bit more optimized than radv - we at
least allocate the image on the stack.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15488>
2022-04-05 17:46:35 +00:00
Emma Anholt e1de9b0de5 turnip: Allow image access on swapped formats.
This is apparently something that gamescope would like to have, and the
CTS's test coverage is happy with it.

Fixes: #6011 (we hope)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15293>
2022-04-02 19:55:40 +00:00
Emma Anholt 4cd51efedb turnip: Disable tiling on 1D images.
If we know the height is 1, then it would be a waste to align each
miplevel to tile height.  For non-mipmapped textures, it doesn't save us
memory (since you still align to 4 on the last miplevel), but it should be
better cache locality by not loading those unused lines.

Incidentally, this gets us some more coverage of swap != WZYX cases in CTS
tests, which often use optimal tiling without also testing linear.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15293>
2022-04-02 19:55:40 +00:00
Emma Anholt 51b04a7dfb turnip: Add support for VK_KHR_format_feature_flags2.
This reports all of our storage formats as supporting read/write without
format, since we don't have any in-shader format conversions.  Similarly,
shadow comparisons were already supported on all the depth formats.

This extension is required for VK 1.3.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15293>
2022-04-02 19:55:40 +00:00
Danylo Piliaiev 5ce06f8474 turnip: Use correct type for OUTARRAY in FormatProperties2
Fixes: 799a9db24c
("turnip: Stop using VK_OUTARRAY_MAKE()")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15694>
2022-04-02 09:51:45 +00:00
Rajnesh Kanwal d5405c1608 vulkan: Move common format function to vulkan/util/vk_format.h
Moving duplicate vk_format helper functions to common
vulkan/util/vk_format.h and also renaming
vk_format_get_component_size_in_bits to match how amd and
freedreno name the same function. Not moving this function
to common code as freedreno's implementation is a bit different.

Signed-off-by: Rajnesh Kanwal <rajnesh.kanwal@imgtec.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15696>
2022-03-31 17:18:22 +00:00
Connor Abbott 9d081d7561 tu: Correctly handle VK_IMAGE_CREATE_EXTENDED_USAGE_BIT
In this case we should relax checks based on the format, since the user
will be responsible for them when creating an image view.

This gets dEQP-VK.image.sample_texture.*_bit_compressed_format_* not
skipping again after VK-GL-CTS 736eec57dc0c ("Fix checkSupport in
compressed texture sampling tests").

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15563>
2022-03-28 17:16:54 +00:00
Danylo Piliaiev 37939e9c54 turnip: Fix the lack of WFM before indirect draws
We have to add WFM to the pending bits when we are flushing into CP
so that indirect draws know when they should apply the WFM workaround.

Fixes CTS tests:
dEQP-VK.draw.renderpass.indirect_draw.*_data_from_compute.indirect_draw_count*

Fixes: abf0ae014a
("tu: Properly handle waiting on an earlier pipeline stage")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15577>
2022-03-28 16:09:07 +00:00
Boris Brezillon 799a9db24c turnip: Stop using VK_OUTARRAY_MAKE()
We're trying to replace VK_OUTARRAY_MAKE() by VK_OUTARRAY_MAKE_TYPED()
so people don't get tempted to use it and make things incompatible with
MSVC (which doesn't support typeof()).

Suggested-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15522>
2022-03-25 11:00:02 +00:00
Danylo Piliaiev 5d151ddfba turnip: Disallow non-linear tiling when casting R8G8 to other fmts
R8G8 has a different block width/height and height alignment from other
formats that would normally be compatible (like R16), and so if we are
trying to, for example, sample R16 as R8G8 we need to demote to linear.

Follows the fix in Freedreno: b97e3bb2e1

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15465>
2022-03-22 13:47:21 +00:00
Danylo Piliaiev a70b197741 turnip: Force linear mode for non-ubwc R8G8 formats
Non-UBWC tiled R8G8 is probably buggy since media formats are always
either linear or UBWC. There is no simple test to reproduce the bug.
However it was observed in the wild leading to an unrecoverable hang
on a650/a660.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5926

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15465>
2022-03-22 13:47:21 +00:00
Connor Abbott fc381fa1e3 tu: Actually expose VK_EXT_texel_buffer_alignment
Oops...

Fixes: 3d04c435 ("tu: Trivially implement VK_EXT_texel_buffer_alignment")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15451>
2022-03-18 18:30:20 +00:00
Jason Ekstrand 2a779f98dc turnip: Drop tu_legacy.c
The remaining three helpers all have helpers in the common code.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15459>
2022-03-18 11:19:08 -05:00
Connor Abbott 3d04c43576 tu: Trivially implement VK_EXT_texel_buffer_alignment
The previous alignment of 64 bytes, which we got from the blob,
indicates that single-texel alignment isn't supported. So just do a
trivial no-op implementation that returns the same alignment as before.
This matches what newer blobs that expose this extension do.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15427>
2022-03-17 20:45:19 +00:00
Connor Abbott 072fdcabcd tu: Enable UniformBufferUpdateAfterBind
UBOs are now read at run-time via the preamble so this can be enabled.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Connor Abbott 9932ca8a3f ir3, turnip: Use ldc.k to push UBOs
This reuses the same UBO analysis to do the pushing in the shader
preamble via the ldc.k instruction instead of in the driver via
CP_LOAD_STATE6. The const_data UBO is exempted as it uses a different
codepath that isn't as critical.

Don't do this on gallium because there are some regressions. Aztec Ruins
in particular regresses a bit, and nothing I've benchmarked benefits.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Connor Abbott 221a912b8c ir3: Refactor ir3_compiler_create() to take an options struct
This will let us add more options without creating too much churn.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Emma Anholt 2f25d16653 turnip: Use the DRM or KGSL GPU reset status ioctls to report device loss.
ANGLE-on-venus-on-turnip and zink-on-turnip want real data here for EGL's
reset tests.

This required moving the remaining GPU-reset-causing tests from flakes or
xfails to skips.  Otherwise, the rest of the caselist associated with them
ends up being marked as fails as well.  The alternative would be to put
these tests in their own test groups with tests_per_group = 1, but that
didn't seem worth the effort.  Or, we could finally do something with
https://gitlab.freedesktop.org/anholt/deqp-runner/-/issues/14.

Fixes: #5955
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14839>
2022-03-16 19:28:04 +00:00
Emma Anholt 3b90d3997a turnip: use vk_shader_module_to_nir().
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15305>
2022-03-15 23:13:16 +00:00
Connor Abbott f9d9c0172a tu: Add an extra storage descriptor for isam
Based on a workaround the blob does.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15288>
2022-03-15 21:36:38 +00:00
Connor Abbott 1ec3d39407 tu: Handle UBO/SSBO descriptors with different sizes
We reuse the otherwise-unused offset channel to represent the array
stride, so that reindexing works properly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15288>
2022-03-15 21:36:38 +00:00
Connor Abbott 5ba3ea1eb3 tu: Rewrite dynamic descriptor handling
We need to prepare for storage buffers having different sizes from
uniform buffers. This switches dynamic_offset_offset to have units of
bytes, the same as offset, and as a nice bonus we can more easily
combine the dynamic and non-dynamic paths in various different places.
This also entails rewriting the code that patches dynamic descriptors,
since we can no longer assume a linear mapping between indices in
dynamicOffsets and descriptor locations which the previous approach
heavily relied on.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15288>
2022-03-15 21:36:38 +00:00
Emma Anholt eb9b092001 turnip: Enable VK_EXT_display_control using the common code.
It's all implemented now, so we can turn it back on.  Passes 15/16 tests
when X11 isn't running, and 1/16 when it is, with no failures in either
mode.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15351>
2022-03-15 20:08:58 +00:00
Connor Abbott cdee38a57b tu: Expose subgroup arithmetic
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
2022-03-10 17:15:29 +00:00
Danylo Piliaiev c4703cd846 tu: Implement VK_EXT_depth_clip_control
Since negativeOneToOne is a static property of the pipeline and
viewport state could be dynamic, we have to defer viewport state
emission until negativeOneToOne value is known.

See: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6070

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14363>
2022-03-10 11:08:50 +02:00
Danylo Piliaiev 2e878293f4 turnip: Make autotuner work with reusable command buffers
To achieve this, each command buffer now has its own GPU memory.

However, the autotuner's BO usage is not optimal: the ideal
pattern would be to use some memory pool to suballocate small
GPU memory chunks, since most command buffers have only a few
renderpasses.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5990

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14996>
2022-03-09 12:56:31 +00:00
Danylo Piliaiev 2cd30266f1 tu: Refactor VS DECODE/DEST to be emitted in two pkt4
Refactor to emit VFD_DECODE and VFD_DEST_CNTL in two packets
regardless of attribute count.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14584>
2022-03-09 08:21:40 +00:00
Rob Clark 711f0d1df4 turnip: Don't call getenv() directly
I noticed it was using getenv directly when I tried to use 'setprop
mesa.tu.debug ..' on android.  Use os_get_option() instead so we get
sysprop fallback on android.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15289>
2022-03-09 00:22:36 +00:00
Danylo Piliaiev e2fc99b188 turnip: Add "rast_order" debug option to force rast order access
Enables rasterization order attachment access for all pipelines,
see VK_ARM_rasterization_order_attachment_access for details.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15262>
2022-03-07 17:07:18 +00:00
Danylo Piliaiev 549e861dc1 turnip: Implement VK_EXT_physical_device_drm
Copied from ANV and V3DV.

v1. Fix a build error for clang "unannotated fall-through between switch labels"
( Hyunjun Ko <zzoon.ko@igalia.com> )

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6011

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14971>
2022-03-01 07:10:40 +00:00
Connor Abbott 06485f7d3d tu: Call nir_opt_access
This adds some small optimizations, and enables lowering to isam in more
cases where the app didn't specify readonly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
2022-02-28 23:33:22 +00:00
Connor Abbott 00be8c4619 freedreno: Replace A6XX_IBO with A6XX_TEX_CONST
Since these were reverse-engineered, it's become clear that IBO
descriptors are just a subset of texture descriptors, and bindless reads
of readonly images actually use isam on the IBO descriptor, further
confirming that the two are always compatible, even if not all of the
texture fields exist for IBOs. It's pointless to have a separate type
for IBOs, and just leads to things getting out-of-sync unnecessarily
which has already happened. Just remove it and use TEX_CONST instead.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
2022-02-28 23:33:22 +00:00
Danylo Piliaiev 95fabff8de turnip: Set drmFormatModifierTilingFeatures
From Vulkan spec for VkDrmFormatModifierProperties2EXT:

 "drmFormatModifierTilingFeatures is a bitmask of VkFormatFeatureFlagBits
  that are supported by any image created with format and drmFormatModifier."

 "The returned drmFormatModifierTilingFeatures must contain at least one bit."

 "Therefore, if the returned drmFormatModifier is DRM_FORMAT_MOD_LINEAR,
  then drmFormatModifierPlaneCount must equal the format planecount, and
  drmFormatModifierTilingFeatures must be identical to the
  VkFormatProperties2::linearTilingFeatures returned in the same pNext chain."

Relevant tests: dEQP-VK.drm_format_modifiers.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15032>
2022-02-28 22:53:40 +00:00
Danylo Piliaiev 7e703e4428 turnip: Always use GMEM for feedback loops in autotuner
For ordinary feedback loops GMEM is a lot faster than sysmem since
we don't set SINGLE_PRIM mode.

For feedback loops with ordered rasterization GMEM should also be
faster.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>
2022-02-23 11:31:59 +00:00
Danylo Piliaiev ebc23ac963 turnip: Implement VK_ARM_rasterization_order_attachment_access
Trivially implemented by using A6XX_GRAS_SC_CNTL_SINGLE_PRIM_MODE.

This extension is useful for emulators e.g. AetherSX2 PS2 emulator and
could drastically improve performance when blending is emulated.

Relevant tests:
dEQP-VK.rasterization.rasterization_order_attachment_access.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>
2022-02-23 11:31:59 +00:00
Danylo Piliaiev d6c89e1e4a turnip: Merge LRZ and DEPTH_PLANE draw states
They were emitted at the same time. Frees 1 draw state for us to use.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>
2022-02-23 11:31:59 +00:00
Danylo Piliaiev dab34bd5c8 turnip: Use LATE_Z when there might be depth/stencil feedback loop
Otherwise a shader invocation would read the value which should have
been set AFTER this shader invocation.

Fixes tests:
 dEQP-VK.rasterization.rasterization_order_attachment_access.depth.samples_1.multi_draw_barriers
 dEQP-VK.rasterization.rasterization_order_attachment_access.stencil.samples_1.multi_draw_barriers

Fixes: 71595a189a
("tu: Fix feedback loops in sysmem mode")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>
2022-02-23 11:31:59 +00:00
Emma Anholt 59bc17d57a turnip: Request no implicit sync when we have no implicit-sync WSI BOs.
I chose to implement this as a global flag in the device, because
otherwise we would end up with extra draw overhead trying to avoid it in
the implicit-sync WSI case, and you're probably going to end up needing
implicit sync anyway because you used one of the BOs in any of the
submitted cmdbufs.  To do better than this, we would probably want a
skip-implicit-sync flag on the BOs in the BO list, rather than global on
the submit.

Reports about venus on turnip say that this flag reduces worst-case
QueueSubmit time in a game workload from ~10ms to ~4ms.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14838>
2022-02-22 17:36:05 +00:00
Danylo Piliaiev a814a4f9db turnip: Add a refcount mechanism to BOs
Until now we have lived without a refcount mechanism in the driver
because in Vulkan the user is responsible for handling the life
span of memory allocations for all Vulkan objects. However,
imported BOs are tricky because the kernel doesn't refcount,
so user-space needs to make sure that it:

1. Does not double-free a BO that was imported into the same device
   used to create it (self-importing).
2. Frees imported BOs that were not allocated through the same
   device.

Our initial implementation always freed BOs when requested,
so we handled 2) correctly but not 1): on drm we would
double-free self-imported BOs because the kernel doesn't return
a unique gem_handle on each import.

Besides this, the submit ioctl checks for duplicates in the
BO list and returns an error if there is one.

This fixes the problem for good by adding refcounts to BOs
so that self-imported BOs have a refcnt > 1 and are only freed
when all references are freed.

KGSL on the other hand does not have the same problems,
at least not with ION buffers which are used for exportable
BOs on pre 5.10 android kernels.
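
The refcounting scheme can be sketched roughly as follows, assuming a
small table indexed by gem_handle. The struct, table size, and function
names are hypothetical and are not the real tu_bo code; the counter is
also not atomic here, so a real driver would need locking or atomics:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a driver BO. */
struct sketch_bo {
   uint32_t gem_handle; /* 0 means the slot is unused */
   int32_t refcnt;
};

#define SKETCH_MAX_BOS 64

/* Indexed by GEM handle, so re-importing the same buffer finds the
 * existing entry instead of creating a second one. */
static struct sketch_bo sketch_bo_table[SKETCH_MAX_BOS];

/* Called for both fresh allocations and imports. */
static struct sketch_bo *
sketch_bo_lookup_or_init(uint32_t gem_handle)
{
   assert(gem_handle != 0 && gem_handle < SKETCH_MAX_BOS);
   struct sketch_bo *bo = &sketch_bo_table[gem_handle];

   if (bo->refcnt > 0) {
      /* Self-import: the kernel returned a handle we already own. */
      bo->refcnt++;
   } else {
      bo->gem_handle = gem_handle;
      bo->refcnt = 1;
   }
   return bo;
}

/* Returns true when the last reference was dropped, i.e. when the
 * caller should actually close the GEM handle. */
static bool
sketch_bo_unref(struct sketch_bo *bo)
{
   assert(bo->refcnt > 0);
   if (--bo->refcnt > 0)
      return false;

   bo->gem_handle = 0;
   return true;
}
```

Because both lookups of the same handle resolve to one entry, the BO
would also appear only once in a submit's BO list.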

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5936
Fixes CTS tests: dEQP-VK.drm_format_modifiers.export_import.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15031>
2022-02-19 15:16:55 +00:00
Yiwei Zhang 2a87a741ae turnip: advertise VK_EXT_queue_family_foreign
Both Venus and Android AHB require this extension.

Turnip ignores VK_SHARING_MODE_EXCLUSIVE so this is a no-op.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Acked-by: Rob Clark <robdclark@chromium.org>
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14836>
2022-02-14 21:27:35 +00:00
Jason Ekstrand bda4c4f6b6 vulkan: Take a vk_command_pool in vk_command_buffer_init()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14917>
2022-02-11 08:06:25 +00:00
Jason Ekstrand d59caf5d11 turnip: Use vk_command_pool
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14917>
2022-02-11 08:06:24 +00:00
Louis-Francis Ratté-Boulianne 5e263cc324 vulkan/runtime: Add a level field to vk_command_buffer
Looks like 3 implementations already have that field in their private
command_buffer struct, and having it at the vk_command_buffer opens the
door for generic (but suboptimal) secondary command buffer support.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14917>
2022-02-11 08:06:24 +00:00
Danylo Piliaiev 97c90c514f turnip: Depth/stencil formats should not expose any bufferFeatures
From the Vulkan 1.3.205 spec, section 19.3 "43.3. Required Format Support":

   Mandatory format support: depth/stencil with VkImageType
   VK_IMAGE_TYPE_2D
   [...]
   bufferFeatures must not support any features for these formats

See https://gitlab.khronos.org/vulkan/vulkan/-/merge_requests/4849
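
The rule quoted above amounts to masking out all buffer features for
depth/stencil formats. A minimal sketch, with a hypothetical format
enum and feature bits standing in for the Vulkan ones:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical formats and feature bits, for illustration only. */
enum sketch_format {
   SKETCH_FORMAT_R8G8B8A8_UNORM,
   SKETCH_FORMAT_D16_UNORM,
   SKETCH_FORMAT_D24_UNORM_S8_UINT,
   SKETCH_FORMAT_S8_UINT,
};

#define SKETCH_FEATURE_UNIFORM_TEXEL_BUFFER (1u << 0)
#define SKETCH_FEATURE_VERTEX_BUFFER        (1u << 1)

static bool
sketch_format_is_depth_or_stencil(enum sketch_format format)
{
   switch (format) {
   case SKETCH_FORMAT_D16_UNORM:
   case SKETCH_FORMAT_D24_UNORM_S8_UINT:
   case SKETCH_FORMAT_S8_UINT:
      return true;
   default:
      return false;
   }
}

/* hw_features: what the hardware could do with this format in a buffer.
 * Depth/stencil formats report no buffer features regardless. */
static uint32_t
sketch_buffer_features(enum sketch_format format, uint32_t hw_features)
{
   if (sketch_format_is_depth_or_stencil(format))
      return 0;
   return hw_features;
}
```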

Fixes CTS tests: dEQP-VK.api.buffer.invalid_buffer_features.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14927>
2022-02-09 20:11:22 +00:00
Danylo Piliaiev 44bdac9849 tu: Implement VK_AMD_buffer_marker to support Graphics Flight Recorder
Graphics Flight Recorder is:
 "The Graphics Flight Recorder (GFR) is a Vulkan layer to help
  trackdown and identify the cause of GPU hangs and crashes.
  It works by instrumenting command buffers with completion tags."

This is a nice little tool which could help quickly identify the call
that hung, or whether a command buffer is executing for too long.

The tiling nature of our GPU shouldn't be a big issue aside from
lower performance.

For the non-segfault case:
- If the hang always happens at the same place in the cmdbuf and the
  draw/dispatch is not finished at that point, there is likely an
  infinite loop in one of the shaders in this draw.
- If the hang happens in a different place each time, there is likely
  nothing wrong: the command buffer just takes too long to execute, so
  try increasing hangcheck_period_ms. If that doesn't help, it is
  likely a synchronization issue.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13553>
2022-02-07 12:53:34 +02:00
Danylo Piliaiev 183bc15bdb turnip: Unconditionaly remove descriptor set from pool's list on free
We didn't remove the desc set from the pool's list if the pool was
host_memory_base. On the other hand, there is no point in removing
the desc set from the list in DestroyDescriptorPool/ResetDescriptorPool.

Fixes: da7a4751
("turnip: Drop references to layout of all sets on pool reset/destruction")

Fixes cts tests:
 dEQP-VK.api.buffer_marker.graphics.default_mem.bottom_of_pipe.memory_dep.draw
 dEQP-VK.api.buffer_marker.graphics.default_mem.bottom_of_pipe.memory_dep.dispatch

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14855>
2022-02-04 21:07:30 +00:00
Dylan Baker 2f916f2be6 meson: add support for `meson devenv` with vulkan
Meson devenv is a feature added in meson 0.58 (thus the feature is
version-guarded) that allows creating a shell environment with
environment variables automatically setup for running the project inside
the build dir. Some variables (such as LD_LIBRARY_PATH and PATH) are set
automatically, others must be added by the project.

For Vulkan it is relatively simple: we create a new, uninstalled ICD
file for each driver and set the VK_ICD_FILENAMES variable
appropriately. This can be used with:

```sh
meson devenv -C $builddir
```

Then Vulkan applications will automatically use the uninstalled Vulkan
driver; there is no need to install it.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14826>
2022-02-04 09:08:47 -08:00
Danylo Piliaiev fded7a95c5 turnip: Expose VK_KHR_shader_non_semantic_info
This is entirely implemented in the SPIR-V frontend.

Relevant CTS tests:
dEQP-VK.spirv_assembly.instruction.compute.non_semantic_info.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14829>
2022-02-04 09:24:06 +00:00
Danylo Piliaiev ff059605aa turnip: Implement VK_KHR_zero_initialize_workgroup_memory
Moved nir_lower_compute_system_values to lower
load_local_invocation_index which could be emitted by
nir_zero_initialize_shared_memory.

Relevant CTS tests:
dEQP-VK.compute.zero_initialize_workgroup_memory.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14829>
2022-02-04 09:24:06 +00:00
Danylo Piliaiev c6d1cac6e5 turnip: Expose VK_EXT_image_robustness
VK_EXT_image_robustness is a strict subset of VK_EXT_robustness2
so we could just expose it.

Relevant CTS tests: dEQP-VK.robustness.image_robustness.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14829>
2022-02-04 09:24:06 +00:00
Danylo Piliaiev 03f9deecb8 turnip: Use the shared helpers to expose 1.3 core extensions/limits
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14829>
2022-02-04 09:24:06 +00:00
Danylo Piliaiev 5e4bf6d100 turnip: Do not use hw binning if tiles per pipe are over the limit
Otherwise GPU would hang.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5951

Freedreno commit as a reference:
39d00722b2

Fixes VK cts tests on a618 if their memory limit is raised to 1024 MB:
 dEQP-VK.pipeline.render_to_image.core.2d_array.huge.width_height.r8g8b8a8_unorm_d16_unorm
 dEQP-VK.pipeline.render_to_image.core.2d_array.huge.width_height.r8g8b8a8_unorm_s8_uint
 dEQP-VK.pipeline.render_to_image.core.2d_array.huge.width_height.r8g8b8a8_unorm_d24_unorm_s8_uint
 dEQP-VK.pipeline.render_to_image.core.2d_array.huge.width_height.r8g8b8a8_unorm_d32_sfloat_s8_uint
 dEQP-VK.pipeline.render_to_image.core.cube.huge.width_height.r8g8b8a8_unorm
 dEQP-VK.pipeline.render_to_image.core.cube.huge.width_height.r8g8b8a8_unorm_d16_unorm
 dEQP-VK.pipeline.render_to_image.core.cube.huge.width_height.r8g8b8a8_unorm_s8_uint
 dEQP-VK.pipeline.render_to_image.core.cube.huge.width_height.r8g8b8a8_unorm_d24_unorm_s8_uint
 dEQP-VK.pipeline.render_to_image.core.cube_array.huge.width_height.r8g8b8a8_unorm
 dEQP-VK.pipeline.render_to_image.core.cube_array.huge.width_height.r8g8b8a8_unorm_d24_unorm_s8_uint

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Tested-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14849>
2022-02-04 06:01:41 +00:00
Danylo Piliaiev c6e8198f1b turnip: Add TU_GMEM envvar to test different gmem sizes
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14849>
2022-02-04 06:01:41 +00:00
Connor Abbott 0248644c89 ir3,tu: Enable subgroup shuffles and relative shuffles
We still don't use the fast path for relative shuffles, that's left for
future work.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14412>
2022-02-01 16:27:46 +00:00
Emma Anholt bf289e3123 turnip: Store the computed iova in the tu_image.
Less of a big deal than for buffers, but let's be consistent in how we
handle our bindings.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14816>
2022-02-01 15:30:12 +00:00
Emma Anholt f460fb3f91 turnip: Store the computed iova in the tu_buffer.
We recently had a bug of forgetting to add the buf->bo_offset.  Make
the easiest field to reach already be bo->iova + buf->bo_offset.  Plus,
it's a little less work at emit time.
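
As a rough sketch of the idea (hypothetical struct and function names,
not the actual tu_buffer layout), the address is computed once at bind
time and every later user reads the precomputed field:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's BO and buffer objects. */
struct sketch_mem_bo {
   uint64_t iova; /* GPU address of the start of the BO */
};

struct sketch_buffer {
   struct sketch_mem_bo *bo;
   uint64_t bo_offset;
   uint64_t iova; /* bo->iova + bo_offset, computed at bind time */
};

static void
sketch_buffer_bind(struct sketch_buffer *buf, struct sketch_mem_bo *bo,
                   uint64_t offset)
{
   assert(buf != NULL && bo != NULL);
   buf->bo = bo;
   buf->bo_offset = offset;
   /* Fold the offset in once so emit code cannot forget to add it. */
   buf->iova = bo->iova + offset;
}
```

Emit-time code would then use buf->iova directly rather than
reconstructing the address from buf->bo and buf->bo_offset.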

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14816>
2022-02-01 15:30:12 +00:00
Chia-I Wu 9eb1592e57 turnip: respect buf->bo_offset in transform feedback
buf->bo->iova should always be offset by buf->bo_offset.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14786>
2022-01-31 18:31:54 +00:00
Danylo Piliaiev 803055ccb4 tu: add debug option to force gmem
With the autotuner we now want to be able to force gmem rendering;
it will still respect existing constraints.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12128>
2022-01-31 15:26:35 +00:00
Danylo Piliaiev dbae9fa7d8 tu: implement sysmem vs gmem autotuner
The implementation is separate from Freedreno due to multithreading
support.

In Vulkan, an application may fill command buffers from many threads
and expects no locking to occur. We do introduce the possibility of
locking on renderpass end; however, assuming that the application
doesn't have a huge number of slightly different renderpasses,
there would be minimal to no contention.

Other assumptions are:
- Application does submit command buffers soon after their creation.

Breaking the above may lead to some decrease in performance or
autotuner turning itself off.

The heuristic is too simplistic at the moment. To find a proper
one, we should run a bunch of traces with sysmem and gmem and
build a better heuristic from the gathered data.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12128>
2022-01-31 15:26:35 +00:00
Connor Abbott 360f7c5d64 tu: Initial link-time optimizations
This is mostly taken from radv, and cleaned up a bit: don't explicitly
list every stage at the beginning, and name the shaders "producer" and
"consumer" to reduce confusion. I also stripped out a lot of other stuff
to get to the bare minimum of calling nir_link_opt_varyings,
nir_remove_unused_varyings, and nir_compact_varyings and then cleaning
up the fallout. In the future we may want to temporarily scalarize I/O
like radv does, and add back a few things like the psize optimization.
In the meantime this already provides a lot of benefit.

Results from the radv fossil_db with some apps not compilable by turnip
removed:

Totals:
MaxWaves: 1637288 -> 1668200 (+1.89%); split: +1.89%, -0.00%
Instrs: 54620287 -> 54114442 (-0.93%); split: -0.98%, +0.05%
CodeSize: 92235646 -> 91277584 (-1.04%); split: -1.07%, +0.03%
NOPs: 11176775 -> 11185206 (+0.08%); split: -0.63%, +0.71%
Full: 1689271 -> 1657175 (-1.90%); split: -1.92%, +0.02%
(ss): 1318763 -> 1317757 (-0.08%); split: -1.40%, +1.32%
(sy): 618795 -> 617724 (-0.17%); split: -0.70%, +0.53%
(ss)-stall: 3496370 -> 3470116 (-0.75%); split: -1.37%, +0.62%
(sy)-stall: 23512954 -> 23511164 (-0.01%); split: -1.04%, +1.03%
STPs: 27557 -> 27461 (-0.35%)
LDPs: 22948 -> 22804 (-0.63%)
Cat0: 11823765 -> 11829681 (+0.05%); split: -0.62%, +0.67%
Cat1: 3120042 -> 2991831 (-4.11%); split: -4.43%, +0.32%
Cat2: 28605309 -> 28324829 (-0.98%); split: -0.98%, +0.00%
Cat3: 7334628 -> 7252342 (-1.12%); split: -1.12%, +0.00%
Cat4: 1216514 -> 1204894 (-0.96%)
Cat5: 863976 -> 861926 (-0.24%)
Cat6: 1648571 -> 1641457 (-0.43%)

Totals from 23575 (16.16% of 145856) affected shaders:
MaxWaves: 258806 -> 289718 (+11.94%); split: +11.94%, -0.00%
Instrs: 7571190 -> 7065345 (-6.68%); split: -7.04%, +0.36%
CodeSize: 13864308 -> 12906246 (-6.91%); split: -7.09%, +0.18%
NOPs: 959185 -> 967616 (+0.88%); split: -7.35%, +8.23%
Full: 313335 -> 281239 (-10.24%); split: -10.36%, +0.11%
(ss): 154628 -> 153622 (-0.65%); split: -11.90%, +11.25%
(sy): 69758 -> 68687 (-1.54%); split: -6.21%, +4.67%
(ss)-stall: 322002 -> 295748 (-8.15%); split: -14.92%, +6.76%
(sy)-stall: 3270366 -> 3268576 (-0.05%); split: -7.45%, +7.40%

STPs: 3624 -> 3528 (-2.65%)
LDPs: 1074 -> 930 (-13.41%)
Cat0: 1022684 -> 1028600 (+0.58%); split: -7.13%, +7.71%
Cat1: 531102 -> 402891 (-24.14%); split: -26.04%, +1.90%
Cat2: 4090309 -> 3809829 (-6.86%); split: -6.86%, +0.00%
Cat3: 1449686 -> 1367400 (-5.68%); split: -5.69%, +0.01%
Cat4: 103543 -> 91923 (-11.22%)
Cat5: 57441 -> 55391 (-3.57%)
Cat6: 316096 -> 308982 (-2.25%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14767>
2022-01-31 12:19:55 +00:00
Danylo Piliaiev da7a475138 turnip: Drop references to layout of all sets on pool reset/destruction
We dropped the references only for non-host_memory_base pools.
Create a list of live descriptor sets to account for all of them.

Fixes: 1b513f49 ("tu: add reference counting for descriptor set layouts")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14235>
2022-01-27 23:47:46 +00:00
Danylo Piliaiev 24144f6f5c turnip/trace: Delete unused start/end_resolve tracepoints
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14391>
2022-01-27 18:59:43 +00:00
Danylo Piliaiev 1989e1e6d8 turnip/perfetto: handle gpu timestamps being non-monotonic
Perfetto requires time in clock snapshots to be monotonic, otherwise
the clock would be excluded.
GPU timestamps start from zero after every suspend-resume cycle
which makes them non-monotonic.

As a solution on msm we check whether GPU was just resumed and
remember previous highest timestamp to then add it to the next
timestamps.

If the functionality to check whether the GPU was resumed is unavailable
or doesn't work, we fall back to checking for a discontinuity
in timestamps. For kgsl we always use the fallback.
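
The offsetting scheme can be sketched as below. The state struct and
function name are hypothetical, and the sketch assumes the caller
reports a resume at most once per suspend-resume cycle:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device state for this sketch. */
struct sketch_ts_state {
   uint64_t offset;      /* added to every raw GPU timestamp */
   uint64_t max_emitted; /* highest adjusted timestamp handed out */
};

/* raw:     timestamp as read from the GPU.
 * resumed: true if the GPU is known to have just been resumed; pass
 *          false when that information is unavailable and rely on the
 *          discontinuity check instead. */
static uint64_t
sketch_monotonic_ts(struct sketch_ts_state *st, uint64_t raw, bool resumed)
{
   /* Fallback: an adjusted value going backwards means the counter
    * restarted from zero without us being told. */
   if (resumed || raw + st->offset < st->max_emitted)
      st->offset = st->max_emitted;

   uint64_t adjusted = raw + st->offset;
   if (adjusted > st->max_emitted)
      st->max_emitted = adjusted;
   return adjusted;
}
```

Both paths rebase the counter on the highest timestamp seen so far, so
the values handed to the trace never decrease.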

Fixes renderstage timeline disappearing in AGI.

Or you could avoid the issue altogether by preventing GPU from going to
sleep by increasing auto suspend delay e.g.:

  echo 5000 > /sys/devices/platform/soc\@0/3d00000.gpu/power/autosuspend_delay_ms

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14391>
2022-01-27 18:59:43 +00:00
Danylo Piliaiev ba7faa6f43 turnip/trace: process u_trace chunks on queue submission
tu_QueuePresentKHR was not the best place since application
isn't required to call it.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14391>
2022-01-27 18:59:43 +00:00
Danylo Piliaiev a6482a3a6e turnip: rename tu_drm_get_timestamp into tu_device_get_gpu_timestamp
It is not drm specific and will be implemented in kgsl.

Change parameter to tu_device along the way.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14391>
2022-01-27 18:59:43 +00:00
Danylo Piliaiev f2c53c2a9b turnip/trace: refactor creation and usage of trace flush data
Fixes the case when last cmd buffer in submission doesn't have
tracepoints leading to flush data not being freed.

Added a few comments, renamed things, refactored allocations - now
the data flow should be a bit more clean.

Extracted submission data creation into tu_u_trace_submission_data_create
which will later be used in tu_kgsl.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14391>
2022-01-27 18:59:43 +00:00
Danylo Piliaiev 95896dee93 turnip/perfetto: Optimize timestamp synchronization
We shouldn't do ioctl to get timestamp if perfetto isn't connected.
Also it's better to sync timestamps after submission since the
call could block until GPU is resumed.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14391>
2022-01-27 18:59:43 +00:00
Connor Abbott 065785e689 tu: Report code size in pipeline statistics
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14754>
2022-01-27 17:16:18 +00:00
Hyunjun Ko 8a5b949a3e turnip: fix leaks of submit requests.
Fixes: 479a1c40 ("turnip: Porting to common vulkan implementation for synchronization.")

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14727>
2022-01-26 22:22:33 +00:00