Commit Graph

3456 Commits

Author SHA1 Message Date
Chia-I Wu 6929ccedff turnip: shared_consts and push_consts are mutually exclusive
Skip gather_push_constants when shared consts are enabled.  This makes
sure push_consts is only zero-initialized, and reserved_user_consts is
0.  This saves some space in the const file.

This change also adds a few asserts and a comment to
lower_load_push_constant.  Because shared consts share the same range
for all stages, we should not apply per-stage offsets in
lower_load_push_constant.  It worked because nir_lower_explicit_io
always sets base to 0 for nir_var_mem_push_const and
shader->push_consts.lo was always 0 for all stages.

Fixes: 0c787d57e6 ("tu: increase maxPushConstantsSize to 256.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17777>
2022-07-29 16:22:43 +00:00
Chia-I Wu 562e5ba286 turnip: remove shared_consts from tu_compiled_shaders
It is set but unused.  We also don't serialize/deserialize shared_consts
to/from the pipeline cache.

Fixes: e1f2cabc5e ("turnip: Change to use shared consts for PushConstants")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17777>
2022-07-29 16:22:43 +00:00
Eric Engestrom 438d5baa36 turnip: expose support for VK_EXT_acquire_drm_display
Turnip supports VK_EXT_direct_mode_display and can use the common
implementation of AcquireDrmDisplayEXT() & GetDrmDisplayEXT() (which use
wsi->can_present_on_device() that turnip implements).

Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17768>
2022-07-29 07:49:47 +00:00
Eric Engestrom 2c67457e5e util/list: rename LIST_ENTRY() to list_entry()
This follows the Linux kernel convention, and avoids collision with
macOS header macro.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6751
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6840
Cc: mesa-stable
Signed-off-by: Eric Engestrom <eric@igalia.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17772>
2022-07-28 10:10:44 +00:00
Danylo Piliaiev a9ebf55d02 turnip: Simple breadcrumbs implementation to debug hangs
A simple implementations of breadcrumbs tracking of GPU progress
intended to be the last resort when debugging unrecoverable hangs.
For best results use Vulkan traces to have a predictable place of hang.

Requires compilation with TU_BREADCRUMBS_ENABLED=1. See tu_cs_breadcrumbs.c
for details on how to use this feature.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15452>
2022-07-28 08:48:39 +00:00
Connor Abbott 9b844d7c42 tu: Add debug option to use emulated renderpass support
This should be useful for stress-testing dynamic rendering.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
2022-07-27 19:40:44 +00:00
Connor Abbott b90d628a7d tu: Use common vk_image_view base struct
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
2022-07-27 19:40:44 +00:00
Connor Abbott 89263fde20 tu: Use common vk_image struct
This eliminates some boilerplate, and will be necessary to use the
common render pass implementation for debugging purposes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
2022-07-27 19:40:44 +00:00
Connor Abbott cb0f414b2a tu: Add support for suspending and resuming renderpasses
This is unfortunately very complicated because we have to stitch
together the state of the suspended passes after the fact, with primary
command buffers at submit time.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
2022-07-27 19:40:44 +00:00
Connor Abbott 3aa20a4409 tu: Split out some state into a separate struct
These bits of state will have to be treated specially when
suspending/resuming a render pass, because they will need to be tracked
across command buffers.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
2022-07-27 19:40:44 +00:00
Connor Abbott 9689433eee tu: Update more state with secondaries
Some of these are actually bugfixes, some like the drawcall information
are just for autotune so they are just performance fixes. However this
came from an audit into what state is used in CmdEndRenderPass().

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
2022-07-27 19:40:44 +00:00
Connor Abbott 79c7c6e492 tu: Remove has_subpass_predication
The workaround this was used for was removed.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
2022-07-27 19:40:44 +00:00
Connor Abbott ed125e6cca tu: Initial support for dynamic rendering
Support for suspend/resume will be added later. This just sets up
the internal render pass, and adds support to pipeline creation and
secondary inheritance.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
2022-07-27 19:40:44 +00:00
Connor Abbott 22be08a21e tu: Remove usage of RenderPassBeginInfo
More preparation for using this with dynamic rendering.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
2022-07-27 19:40:44 +00:00
Connor Abbott cf391db4c6 tu: Move tu_render_pass definition up
So that we can embed one in tu_cmd_buffer.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
2022-07-27 19:40:44 +00:00
Connor Abbott 2b8b5259c7 tu: Disable GMEM for multiview inside tu_render_pass_gmem_config
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
2022-07-27 19:40:44 +00:00
Connor Abbott c42e7aa46c tu: Move TU_DONT_CARE_AS_LOAD into attachment_set_ops()
So that we can share it with dynamic rendering.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
2022-07-27 19:40:44 +00:00
Connor Abbott 907b892bb2 tu: Rewrite tess factor emission
Emit it at EndRenderPass time, if the renderpass has tessellation. This
avoids all the special handling for secondaries, will work with
suspended/resumed render passes, and will handle secondaries that
contain render passes which will be allowed with dynamic rendering.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
2022-07-27 19:40:44 +00:00
Connor Abbott c5be444500 tu: Treat CP_WAIT_FOR_ME as a cache invalidate
The workaround for draws that need a CP_WAIT_FOR_ME didn't work if the
barrier before the draw is in a separate command buffer from the draw.
The barrier would add a pending CP_WAIT_FOR_ME, but it would get dropped
on the floor at the end of the command buffer and the draw wouldn't have
a pending CP_WAIT_FOR_ME so it wouldn't emit one. We don't know in the
barrier if the destination is a draw with the workaround, so we have two
options:

- Emit any pending CP_WAIT_FOR_ME at the end of the command buffer (and
  before secondaries) in case there is a workaround draw later. This
  will emit an extra CP_WAIT_FOR_ME at the end of the command buffer in
  case there is an indirect command barrier.
- Always assume at the beginning of the command buffer that there is a
  pending CP_WAIT_FOR_ME. This will emit an extra CP_WAIT_FOR_ME before
  the first workaround-requiring draw in the command buffer, in case
  there was a barrier earlier.

The only draws requiring a workaround are currently
vkCmdDraw*IndirectCount(), which we assume are rarer than indirect
command barriers, so we implement the second option. This entails
treating it as a cache invalidate.

This fixes some upcoming dynamic rendering CTS tests that do
vkCmdDrawIndirectCount() in a secondary but put the barrier for it in
the primary.

Fixes: 37939e9c54 ("turnip: Fix the lack of WFM before indirect draws")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
2022-07-27 19:40:44 +00:00
Connor Abbott d2ad4c739c tu/lrz: Do not use framebuffer when inheriting LRZ
The only thing it's used for is to get the image view, and we can't rely
on it existing anyway. With dynamic rendering, we only have the format
of the attachments and sample count, so moving forward we can't rely on
anything other than that.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
2022-07-27 19:40:44 +00:00
Georg Lehmann df4b5914cd nir/fold_16bit_tex_image: Default to only_fold_all.
No driver doesn't use this option.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17757>
2022-07-27 18:57:12 +00:00
Chia-I Wu 8001c78d49 ir3: set UL flag before ir3_lower_subgroups
ir3_legalize_relative, extracted from ir3_legalize, assumes a0 is loaded
first in each block if there is any user in the block.
ir3_lower_subgroups breaks the assumption.  We need to do
ir3_legalize_relative first.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6902
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17720>
2022-07-27 17:08:03 +00:00
Danylo Piliaiev 4ba129cd86 tu: Do not dereference descriptorSetLayout in push descriptors tmpl
Fixes crash when capturing with RenderDoc.

From VK spec:

 descriptorSetLayout [...] This parameter is ignored if templateType
 is not VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17751>
2022-07-26 18:18:48 +00:00
Connor Abbott 19a2353446 tu: Fix resolving d32s8 into s8 on fast path
The code assumed that if the source was d32s8 then the destination would
also be d32s8, in particular that depth_base_addr/stencil_base_addr
would also be filled out. Move the destination and source handling into
two different ifs with different conditions.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17684>
2022-07-26 15:00:01 +00:00
Connor Abbott d426ee6a99 tu: Support resolving D24S8 -> S8
This was missed when we added support for VK_KHR_depth_stencil_resolve.
There is a similar feature where the stencil aspect of a D24S8 can be
copied "tightly" in CopyImageToBuffer, but it used the texture swizzle
and so required the 3d path. To get it to work with the 2D path, which
is required for resolves, we have to instead use the A8_UNORM format,
which works for texture sampling even for tiled images. We also have to
reuse the pre-existing image views because subpass resolves work on
image views rather than images, whereas before the fixup was applied
while creating the image view. This means threading through the
corresponding "opposite" format through setup, src, and dst functions,
doing the fixup there (through some shared helpers), and then getting
every user to specify the right format. As a bonus, we no longer need to
force the 3d path for the CopyImageToBuffer and CopyBufferToImage
special cases.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17684>
2022-07-26 15:00:01 +00:00
Chia-I Wu ba461f897b ir3: fix tess param allocation
primitive_param takes up 2 vec4's.  Remove an align that I don't
understand.

The align upset

  Test case 'dEQP-VK.subgroups.ballot_broadcast.graphics.subgroupbroadcast_vec4'..
  deqp-vk: ../src/freedreno/ir3/ir3_nir.c:1039:
  void ir3_setup_const_state(nir_shader *, struct ir3_shader_variant *, struct ir3_const_state *):
  Assertion `constoff <= ir3_max_const(v)' failed.

with an older version (android11-tests-dev branch) of deqp-vk.  This is
because ir3_nir_opt_preamble uses the function for the worst case but
the function fails to replace the align by the worst case.

No regression with dEQP-VK.*tess*.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17570>
2022-07-26 01:04:56 +00:00
Chia-I Wu e3ba8a2f07 ir3: increment constoff right after it is assigned
Minor improvement to readability.  No real change.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17570>
2022-07-26 01:04:56 +00:00
Chia-I Wu 4ae2966616 ir3: remove unused patch_vertices_in
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17570>
2022-07-26 01:04:56 +00:00
Chia-I Wu 74c96af71d ir3: fix output_loc size
It was off-by-one.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17570>
2022-07-26 01:04:56 +00:00
Chia-I Wu 9c106f3ee7 ir3: copy req_local_mem for MESA_SHADER_KERNEL
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17570>
2022-07-26 01:04:56 +00:00
Chia-I Wu 76ea28b9d0 ir3: update ir3_const_state comment
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17570>
2022-07-26 01:04:56 +00:00
David Heidelberg 1a244e1394 ci/freedreno: 3 pixel change in Raven restricted trace
Acked-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17732>
2022-07-25 18:14:40 +02:00
Hyunjun Ko 0c787d57e6 tu: increase maxPushConstantsSize to 256.
Now there are two paths for push constants.

When it's range is under 128b, we can use shared consts.
When it's over 128b, we can instead do loading data through
regular path, which is same as the previous way.

Now we can satisfy emulations like vkd3d that requires 256b for
its root signatures and we think it fairly maps to push constants
rather than inline uniform blocks that requires one indirection.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15503>
2022-07-24 09:03:47 +00:00
Hyunjun Ko e1f2cabc5e turnip: Change to use shared consts for PushConstants
Follow the way blob is doing for PushConstants though it supports only
128b, same as previous.

v1. Rename tu_push_constant_range.count into dwords to redue confusion.
( Danylo Piliaiev <dpiliaiev@igalia.com> )

v2. Enable shared constants only if necessary.

v3. Merge the two draw states TU_DRAW_STATE_SHADER_GEOM_CONST and
TU_DRAW_STATE_FS_CONST as shared constants are used.

Note that this leaves tu_push_constant_range in tu_shader so we could
use it again in the following patch.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15503>
2022-07-24 09:03:47 +00:00
Hyunjun Ko ce8e8051af turnip: clean up unused parameters for user consts.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15503>
2022-07-24 09:03:47 +00:00
Hyunjun Ko e6556b72fb ir3: handle shared consts.
Adds a shared consts base offset and a size of it(dwords) to ir3_compiler
since they might be depending on gpu generations. (Danylo Piliaiev <dpiliaiev@igalia.com> )

Adds a flag to present whether shared consts are enabled to
ir3_shader_options and then it sets to ir3_const_state when creating
an ir3 variant. Although this state is not per-shader state, this is
necessary when figureing out real constlens.

v1. Define a hw quirk for geometry shared const files and use it when
calculating const length.

v2. Don't hardcode when calculating a safe const length.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15503>
2022-07-24 09:03:47 +00:00
Hyunjun Ko b35c4bd050 ir3: change maximum size of const files.
According to the observation on a630/a650/a660, max_const_pipeline has
to be 512 when all geometry stages are present. Otherwise a gpu hang
happens. Acoordingly maximum safe size for each stage should be under
(max_const_pipeline / 5 (stages)).

Only when VS and FS stages are present, the limit is 640.

v1. Align max_const_safe to 4 vec4's.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15503>
2022-07-24 09:03:47 +00:00
Chia-I Wu 8ec81a4b11 turnip: fix an assertion with drm-shim
Fixes

  deqp-vk: ../src/vulkan/runtime/vk_device.c:49:
  get_timeline_mode: Assertion `timeline_type == NULL' failed.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17571>
2022-07-22 02:11:14 +00:00
Chia-I Wu 2d2912f18a freedreno/drm-shim: add a660
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17571>
2022-07-22 02:11:14 +00:00
Emma Anholt 7f4df969c9 Revert "ci/freedreno: Switch a630 to manual/disabled for lab maintenance."
This reverts commit 7e381ba9fc.  2 new
boards are in place, bringing us from 7 to 9.  We hoped for 12, but have
ongoing power stability issues.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17662>
2022-07-22 00:57:23 +00:00
Emma Anholt 94b4c0bc39 ci/turnip: Add a couple of missing a630 fails.
Same as a618.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17662>
2022-07-22 00:57:23 +00:00
Emma Anholt 8a7c4f4202 ci/turnip: Bump up the a630 full run timeout.
Test runtime has crept up with more CTS tests and more features.  The last
vk_full 1/2 run I tried timed out at:

Pass: 268488, Fail: 2, ExpectedFail: 7, Warn: 1, Skip: 602571, Duration: 1:29:29, Remaining: 45

Rude.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17662>
2022-07-22 00:57:23 +00:00
Emma Anholt d8fb219b2f ci/freedreno: Add some more known flakes for a630 from our IRC logs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17662>
2022-07-22 00:57:23 +00:00
Jason Ekstrand 87ab287436 vulkan: Call lower_clip_cull_distance_arrays in vk_spirv_to_nir
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17644>
2022-07-21 21:18:48 +00:00
Georg Lehmann 775578b885 ir3: Stop using nir_legalize_16bit_sampler_srcs.
nir_fold_16bit_tex_image's only_fold_all option ensures that there is never
a mix of bit sizes.

Closes https://gitlab.freedesktop.org/mesa/mesa/-/issues/6899

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16978>
2022-07-21 19:15:04 +00:00
Georg Lehmann 87e3277b82 nir: Rewrite and merge 16bit tex folding pass with 16bit image folding pass.
Allow folding constants/undef sources by sharing more code with the image_store
16bit folding pass.

Allow more than one set of sources because RADV wants two, one for
G16 (ddx/ddy) and one for A16 (all other sources).

Allow folding cube sampling destination conversions on radeonsi/radv because
I think the limitation only applies to sources.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16978>
2022-07-21 19:15:03 +00:00
Georg Lehmann 06b33770b6 ir3: Lower alu to scalar if nir_legalize_16bit_sampler_srcs made progress.
Fixes: 003327dd95 ("freedreno/ir3: Pass 16-bit sampler coordinates when possible.")
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16978>
2022-07-21 19:15:03 +00:00
Georg Lehmann 9fe382ba96 ir3: Only run 16bit tex NIR passes on a5xx+.
16bit types aren't yet supported on older hardware.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16978>
2022-07-21 19:15:03 +00:00
Konstantin Seurer 630df88a74 turnip: Remove format desc null assert
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17490>
2022-07-21 12:48:01 +00:00
Emma Anholt 6e819585da ci/turnip: Add a bit of spilling-vs-ballot testing on a618.
The shared reg usage involved in the subgroup-related macros can cause
trouble for the spiller, and spilling may be implicated in CTS failures
with old versions of the subgroup tests, so let's make sure we get some
coverage.  It does seem to catch a couple of failures.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17642>
2022-07-21 01:25:33 +00:00