Skip gather_push_constants when shared consts are enabled. This makes
sure push_consts is only zero-initialized, and reserved_user_consts is
0. This saves some space in the const file.
This change also adds a few asserts and a comment to
lower_load_push_constant. Because shared consts share the same range
for all stages, we should not apply per-stage offsets in
lower_load_push_constant. It worked because nir_lower_explicit_io
always sets base to 0 for nir_var_mem_push_const and
shader->push_consts.lo was always 0 for all stages.
Fixes: 0c787d57e6 ("tu: increase maxPushConstantsSize to 256.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17777>
Turnip supports VK_EXT_direct_mode_display and can use the common
implementation of AcquireDrmDisplayEXT() & GetDrmDisplayEXT() (which use
wsi->can_present_on_device() that turnip implements).
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17768>
A simple implementations of breadcrumbs tracking of GPU progress
intended to be the last resort when debugging unrecoverable hangs.
For best results use Vulkan traces to have a predictable place of hang.
Requires compilation with TU_BREADCRUMBS_ENABLED=1. See tu_cs_breadcrumbs.c
for details on how to use this feature.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15452>
Some of these are actually bugfixes, some like the drawcall information
are just for autotune so they are just performance fixes. However this
came from an audit into what state is used in CmdEndRenderPass().
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
Emit it at EndRenderPass time, if the renderpass has tessellation. This
avoids all the special handling for secondaries, will work with
suspended/resumed render passes, and will handle secondaries that
contain render passes which will be allowed with dynamic rendering.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
The workaround for draws that need a CP_WAIT_FOR_ME didn't work if the
barrier before the draw is in a separate command buffer from the draw.
The barrier would add a pending CP_WAIT_FOR_ME, but it would get dropped
on the floor at the end of the command buffer and the draw wouldn't have
a pending CP_WAIT_FOR_ME so it wouldn't emit one. We don't know in the
barrier if the destination is a draw with the workaround, so we have two
options:
- Emit any pending CP_WAIT_FOR_ME at the end of the command buffer (and
before secondaries) in case there is a workaround draw later. This
will emit an extra CP_WAIT_FOR_ME at the end of the command buffer in
case there is an indirect command barrier.
- Always assume at the beginning of the command buffer that there is a
pending CP_WAIT_FOR_ME. This will emit an extra CP_WAIT_FOR_ME before
the first workaround-requiring draw in the command buffer, in case
there was a barrier earlier.
The only draws requiring a workaround are currently
vkCmdDraw*IndirectCount(), which we assume are rarer than indirect
command barriers, so we implement the second option. This entails
treating it as a cache invalidate.
This fixes some upcoming dynamic rendering CTS tests that do
vkCmdDrawIndirectCount() in a secondary but put the barrier for it in
the primary.
Fixes: 37939e9c54 ("turnip: Fix the lack of WFM before indirect draws")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
The only thing it's used for is to get the image view, and we can't rely
on it existing anyway. With dynamic rendering, we only have the format
of the attachments and sample count, so moving forward we can't rely on
anything other than that.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17378>
Fixes crash when capturing with RenderDoc.
From VK spec:
descriptorSetLayout [...] This parameter is ignored if templateType
is not VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17751>
The code assumed that if the source was d32s8 then the destination would
also be d32s8, in particular that depth_base_addr/stencil_base_addr
would also be filled out. Move the destination and source handling into
two different ifs with different conditions.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17684>
This was missed when we added support for VK_KHR_depth_stencil_resolve.
There is a similar feature where the stencil aspect of a D24S8 can be
copied "tightly" in CopyImageToBuffer, but it used the texture swizzle
and so required the 3d path. To get it to work with the 2D path, which
is required for resolves, we have to instead use the A8_UNORM format,
which works for texture sampling even for tiled images. We also have to
reuse the pre-existing image views because subpass resolves work on
image views rather than images, whereas before the fixup was applied
while creating the image view. This means threading through the
corresponding "opposite" format through setup, src, and dst functions,
doing the fixup there (through some shared helpers), and then getting
every user to specify the right format. As a bonus, we no longer need to
force the 3d path for the CopyImageToBuffer and CopyBufferToImage
special cases.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17684>
Now there are two paths for push constants.
When it's range is under 128b, we can use shared consts.
When it's over 128b, we can instead do loading data through
regular path, which is same as the previous way.
Now we can satisfy emulations like vkd3d that requires 256b for
its root signatures and we think it fairly maps to push constants
rather than inline uniform blocks that requires one indirection.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15503>
Follow the way blob is doing for PushConstants though it supports only
128b, same as previous.
v1. Rename tu_push_constant_range.count into dwords to redue confusion.
( Danylo Piliaiev <dpiliaiev@igalia.com> )
v2. Enable shared constants only if necessary.
v3. Merge the two draw states TU_DRAW_STATE_SHADER_GEOM_CONST and
TU_DRAW_STATE_FS_CONST as shared constants are used.
Note that this leaves tu_push_constant_range in tu_shader so we could
use it again in the following patch.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15503>
We're about to make it so that the compiler warns/errors if you use the
wrong iterator macro. Fix up a bunch of places where someone used the
wrong one before we break anything.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17630>
RADV and PowerVR use the same implementation.
Turnip does use a slightly modified version but the helper only has one
use -> just inline it and get rid of turnip's vk_format.h.
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17515>
Unlike image writes, buffer writes may access the same memory in
different ways, which we've seen in the past can cause problems. Use an
incoherent access to force flush/invalidate between accesses to the same
buffer, unless we know the barrier applies to images only.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17193>
Fixed:
- VK_QUERY_TYPE_PRIMITIVES_GENERATED_EXT was able to stop prim counter
when pipeline stats query is running.
- This may have happened when prim gen query was in secondary cmdbuf.
- VK_QUERY_TYPE_PRIMITIVES_GENERATED_EXT counting geometry in each tile.
- VK_QUERY_TYPE_PRIMITIVES_GENERATED_EXT counting geometry in each tile
when pipeline stats query is started inside prim gen query and inside
a renderpass.
The matter of VK_QUERY_TYPE_PRIMITIVES_GENERATED_EXT and pipeline stats
interaction is solved by tracking whether pipeline stats query is
running both on CPU (for non secondary cmdbuf case) and on GPU (for
secondary cmdbuf).
Note, prim gen query is not allowed with secondary command buffers, so
only pipeline stats query is tracked on gpu.
See https://gitlab.khronos.org/vulkan/vulkan/-/issues/3142
Counting geometry per each tile is solved by:
- Conditionally executing START/STOP_PRIMITIVE_CTRS to not run in tiling
pass. Solves the case when prim gen query is inside a renderpass.
- Stop prim counters before executing `draw_cs` and restarting them
afterwards. Solves prim gen query being outside a renderpass.
Fixes GL CTS tests with Zink + `TU_DEBUG=gmem`:
GTF-GL46.gtf30.GL3Tests.transform_feedback.transform_feedback_max_separate
GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_basic
GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_framebuffer
GTF-GL46.gtf40.GL3Tests.transform_feedback3.transform_feedback3_streams_overflow
GTF-GL46.gtf40.GL3Tests.transform_feedback3.transform_feedback3_streams_queried
GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_states
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6602
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17163>
For a5xx+ renamed:
- RST_PIX_CNT -> START_FRAGMENT_CTRS
- RST_VTX_CNT -> STOP_FRAGMENT_CTRS
- TILE_FLUSH -> START_COMPUTE_CTRS
- STAT_EVENT -> STOP_COMPUTE_CTRS
I'm not sure about a5xx itself but I'll take a chance of it being
similar to a6xx in this regard.
Knowing this emit_begin_stat_query/emit_end_stat_query can now emit
only events that are needed for the pool's flags.
Also primitive generated query clearly doesn't need fragment and
compute counters.
Passes tests:
dEQP-VK.query_pool.statistics_query.*
dEQP-VK.transform_feedback.primitives_generated_query.*
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17163>