Some of our asserts and other checks depend on the total set of stages,
not just the stages set in the current pCreateInfo. Recording the stage
mask lets us combine them in vk_graphics_pipeline_state_merge().
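A minimal sketch of the idea, assuming a heavily simplified state struct: each partial state records the stages it was built from, and merging ORs the masks so that later checks see the total set. The enum, struct, and function names below are hypothetical and are not the real vk_graphics_pipeline_state definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stage bits, standing in for VkShaderStageFlagBits. */
enum example_stage {
   EXAMPLE_STAGE_VERTEX   = 1u << 0,
   EXAMPLE_STAGE_FRAGMENT = 1u << 1,
};

/* Hypothetical, heavily simplified pipeline state. */
struct example_pipeline_state {
   uint32_t shader_stages; /* every stage that contributed to this state */
};

/* Merging two partial states combines their stage masks. */
static void
example_state_merge(struct example_pipeline_state *dst,
                    const struct example_pipeline_state *src)
{
   assert(dst != NULL && src != NULL);
   dst->shader_stages |= src->shader_stages;
}

/* A check that needs the total set of stages, not just one create info's. */
static bool
example_state_has_fragment(const struct example_pipeline_state *state)
{
   return (state->shader_stages & EXAMPLE_STAGE_FRAGMENT) != 0;
}
```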
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17696>
Test runtime has crept up with more CTS tests and more features. The last
vk_full 1/2 run I tried timed out at:
Pass: 268488, Fail: 2, ExpectedFail: 7, Warn: 1, Skip: 602571, Duration: 1:29:29, Remaining: 45
Rude.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17662>
Add process_frame to pipe_video codec
Add new structures/caps for video post-processing with rotation,
flip, alpha blending, crop, and scaling, via the video engine.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17557>
Previously, it was in a divergent branch, therefore
it could hang the GPU when a workgroup had a primitive-only wave.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17581>
There's no dependency between them.
This can simplify the compiler backend translation by
always storing prim id before vertex export, which also
benefits the LLVM backend in later changes.
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17581>
In particular, we now call it before running dead variables so we get
the XFB info even for things which are never written. This fixes 102
Vulkan CTS tests on ANV and probably turnip as well.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17644>
Rework:
* Ken: Check bo for IRIS_MMAP_NONE rather than the global
intel_vram_all_mappable
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17349>
This can be false on systems where the PCI Base Address Register (BAR)
is too small for the amount of VRAM. Eventually the kernel will be
able to tell us that a system can't map all of VRAM, and
`all_vram_mappable` will then be false.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17349>
We might not be able to map all vram buffers in the future, so only
map the buffer when actually required.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17349>
We cannot rely on unallocated_size on system memory for
VK_EXT_memory_budget.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4aecfbf0f4 ("intel/dev: Add devinfo::mem to store i915 regions information")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17349>
Allow folding constants/undef sources by sharing more code with the image_store
16bit folding pass.
Allow more than one set of sources because RADV wants two, one for
G16 (ddx/ddy) and one for A16 (all other sources).
Allow folding cube sampling destination conversions on radeonsi/radv because
I think the limitation only applies to sources.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16978>
the alternative here is to just spin aimlessly until the process ooms,
which causes problems when trying to detect failures in cts caselists
a separate env var is used so that it can be exported without affecting
ZINK_DEBUG
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17525>
It should be the responsibility of the driver to make sure that "format" is a valid pipe_format.
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17490>
The script that generates the format tables does not set every pipe_format.
In practice, the length of the format tables is equal to PIPE_FORMAT_COUNT.
I just added the explicit size to future-proof it.
(If the largest valid format is not part of the format tables,
there will be a mismatch between the array length and PIPE_FORMAT_COUNT)
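To illustrate the mismatch, here is a small sketch using a hypothetical three-entry format enum (not the real enum pipe_format). With designated initializers, an unsized array ends at the highest index that is actually set, while an explicitly sized array always covers every format and zero-initializes the gaps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical format enum; the real one is enum pipe_format. */
enum example_format {
   EXAMPLE_FORMAT_NONE,
   EXAMPLE_FORMAT_A,
   EXAMPLE_FORMAT_B, /* largest valid format, deliberately missing below */
   EXAMPLE_FORMAT_COUNT,
};

/* Without a size, the array ends at the highest designated index. */
static const int example_table_implicit[] = {
   [EXAMPLE_FORMAT_NONE] = 0,
   [EXAMPLE_FORMAT_A] = 8,
};

/* With an explicit size, missing entries are zero-initialized and every
 * valid format is a safe index. */
static const int example_table_explicit[EXAMPLE_FORMAT_COUNT] = {
   [EXAMPLE_FORMAT_NONE] = 0,
   [EXAMPLE_FORMAT_A] = 8,
};

#define EXAMPLE_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
```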
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17490>
The shared reg usage involved in the subgroup-related macros can cause
trouble for the spiller, and spilling may be implicated in CTS failures
with old versions of the subgroup tests, so let's make sure we get some
coverage. It does seem to catch a couple of failures.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17642>
..Instead of DEBUG so these work in debugoptimized builds.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17408>
In debugoptimized builds, DEBUG is not set (and neither is NDEBUG). The
intention of NDEBUG is to disable assertions. As such, list assertions should be
gated on !NDEBUG as opposed to on DEBUG.
But assert() is already disabled in that case, so we don't need our own special
assert (Eric).
This would have caught an assertion failure (due to the wrong iterator used)
sooner for the Valhall compiler.
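As a rough sketch of the resulting pattern, assuming a hypothetical minimal circular list (not the actual list implementation): the sanity check is a plain assert(), which is compiled out only when NDEBUG is defined and therefore stays active in builds that define neither DEBUG nor NDEBUG.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal circular doubly linked list. */
struct example_list {
   struct example_list *prev, *next;
};

static void
example_list_init(struct example_list *head)
{
   head->prev = head;
   head->next = head;
}

static void
example_list_add(struct example_list *item, struct example_list *head)
{
   /* Plain assert(): no DEBUG-only wrapper, so only NDEBUG disables it. */
   assert(head->next != NULL && head->prev != NULL);
   item->prev = head;
   item->next = head->next;
   head->next->prev = item;
   head->next = item;
}

static size_t
example_list_length(const struct example_list *head)
{
   size_t n = 0;
   for (const struct example_list *it = head->next; it != head; it = it->next)
      n++;
   return n;
}
```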
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17408>
It will get turned into SSA and copy-propagated in NIR, no need to walk
the IR collapsing it here.
iris shader-db results appear to be noise:
total instructions in shared programs: 8932195 -> 8932147 (<.01%)
instructions in affected programs: 537 -> 489 (-8.94%)
LOST: 12
GAINED: 11
lost/gained are simd32 switches in unigine, l4d2, portal2, asphalt9.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17613>
Now that we have no non-NIR drivers, we can retire the old code. We just
need to pass the variable accesses through to it.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17610>
The DISPATCH_TASKMESH_INDIRECT_MULTI_ACE packet has a firmware bug:
it hangs the GPU when the draw count is zero.
This commit adds a workaround sequence using COND_EXEC packets
which make sure that this indirect packet is never executed when
the draw count is zero.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16531>
Add a separate flush_bits field for tracking cache
flushes in the ACE internal cmdbuf.
In barriers and image transitions we add these flush bits to ACE.
Create a semaphore in the upload BO which makes it possible
for ACE to wait for GFX for the purpose of synchronization.
This is necessary when a barrier needs to block task shaders.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16531>
This implements NV_mesh_shader draw calls with task shaders.
- On the GFX side:
DISPATCH_TASKMESH_GFX for all draws
- On the ACE side:
DISPATCH_TASKMESH_DIRECT_ACE for direct draws
DISPATCH_TASKMESH_INDIRECT_MULTI_ACE for indirect draws
Additionally, the NV_mesh_shader indirect BO layout is
incompatible with AMD HW, so we add a function that copies
that into a suitable layout.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16531>
This is mainly going to be used by task shaders, because
the HW implementation mismatches the API:
- In the API, task shaders are considered graphics shaders which
are part of a graphics pipeline and the draws are submitted to
a graphics queue.
- The HW requires the driver to dispatch task shaders on
an async compute queue.
When a pipeline is bound that has a task shader, create a
driver-internal ACE (async compute engine) cmdbuf which
we are going to submit to an ACE queue.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16531>
This is going to be used with task shader dispatches.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16531>
We are going to reuse them outside of radv_pipeline.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16531>
- Move the uses_perf_counters ternary expression out of
the loop into a variable called cs_offset.
- Constify cmd_buffer_count.
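The shape of this cleanup can be sketched as follows. Only the names cs_offset, uses_perf_counters, and cmd_buffer_count come from the description above; the function and what it stores in each slot are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: fills slots[] and returns how many were used. */
static unsigned
example_fill_slots(unsigned *slots, const unsigned cmd_buffer_count,
                   bool uses_perf_counters)
{
   assert(slots != NULL);

   /* Loop-invariant, so it is computed once instead of per iteration. */
   const unsigned cs_offset = uses_perf_counters ? 1 : 0;

   for (unsigned i = 0; i < cmd_buffer_count; i++)
      slots[i + cs_offset] = i;

   return cmd_buffer_count + cs_offset;
}
```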
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16531>
This cleans up radv_flush_constants and also
the new function will be reused later.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16531>
Allow emitting these packets without a radv_cmd_buffer object.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16531>
Initialize the inverted predication VA only when it is used
for the first time.
This is needed to get conditional rendering to work correctly with
task shaders because the internal compute cmdbuf may not exist
yet when conditional rendering starts.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16531>
in some cases it becomes desirable to "maybe" stop and start the current
renderpass, such as when updates MAY result in layout changes for attachments
for such cases, avoid splitting the renderpass unless it actually needs to
be split
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17640>
this is more accurate and fixes usage with lavapipe
cc: mesa-stable
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17627>
in this case, lying about having multiple images and then returning the
same image every time doesn't work, so use the busy flag
and return an available image when possible
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17590>
Fix defects reported by Coverity Scan.
Uninitialized scalar field (UNINIT_CTOR)
uninit_member: Non-static class member sgpr_spill_slots is not
initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member vgpr_spill_slots is not
initialized in this constructor nor in any functions that it calls.
Fixes: 7d34044908 ("aco: refactor VGPR spill/reload lowering")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17583>
the number of viewports in use is based on the outputs of the last vertex
stage, not the viewports passed by the state tracker
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17639>
this is a weird corner case where glsl permits a zero value, so clamp to 1
and then don't emit any vertices to avoid driver hangs
affects:
dEQP-GL45-ES31.functional.geometry_shading.emit.points_emit_0_end_0
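A minimal sketch of the clamp, assuming a hypothetical info struct (none of these names come from the driver): the value handed on is raised to at least 1, and a separate flag records that no vertices should be emitted when the shader declared zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of a geometry shader's vertex output limits. */
struct example_gs_info {
   unsigned declared_max_vertices; /* value from the shader, may be 0 */
   unsigned hw_max_vertices;       /* value handed on, always >= 1 */
   bool emit_enabled;              /* false when nothing may be emitted */
};

static struct example_gs_info
example_fixup_max_vertices(unsigned declared)
{
   struct example_gs_info info;
   info.declared_max_vertices = declared;
   /* Clamp to 1 so a zero never reaches the hardware setup. */
   info.hw_max_vertices = declared ? declared : 1;
   /* A declared zero means the emits themselves are dropped. */
   info.emit_enabled = declared != 0;
   assert(info.hw_max_vertices >= 1);
   return info;
}
```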
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17639>
Two advantages:
* When using NIR_DEBUG=nir_print_xx, the outcome is printed only if
there is a change
* We can use NIR_PASS(_, ...) instead of NIR_PASS_V, which has
slightly more validation checks.
This includes:
* v3d_nir_lower_image_load_store
* v3d_nir_lower_io
* v3d_nir_lower_line_smooth
* v3d_nir_lower_load_store_bitsize
* v3d_nir_lower_robust_buffer_access
* v3d_nir_lower_scratch
* v3d_nir_lower_txf_ms
As we are here we also simplify some of them by using the
nir_shader_instructions_pass helper.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17609>
The trigger for this commit was when we found that we were not calling
nir_metadata_preserve when lowering the layout code. But then I found
that it would be better to just update the code to use
nir_shader_instructions_pass, so we can avoid manually:
* Initializing the nir_builder
* Calling nir_foreach functions (we pass the callback)
* Calling nir_metadata_preserve functions (which, as mentioned, we were not calling)
We also get a nice cleanup of several functions by reducing the number
of parameters (we pass a state struct).
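The pattern can be sketched in standalone C as below. This only mimics the shape of the helper with hypothetical types and names; it is not the NIR API, and the real helper works on nir_shader and nir_instr with a nir_builder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a shader and its instructions. */
struct example_instr { int opcode; };

struct example_shader {
   struct example_instr *instrs;
   size_t num_instrs;
   bool metadata_valid;
};

typedef bool (*example_instr_cb)(struct example_instr *instr, void *state);

/* The helper owns the iteration and the metadata bookkeeping, so a pass
 * only supplies a per-instruction callback. */
static bool
example_instructions_pass(struct example_shader *shader,
                          example_instr_cb cb, void *state)
{
   assert(shader != NULL && cb != NULL);
   bool progress = false;
   for (size_t i = 0; i < shader->num_instrs; i++)
      progress |= cb(&shader->instrs[i], state);

   /* Metadata is only invalidated when something actually changed. */
   if (progress)
      shader->metadata_valid = false;
   return progress;
}

/* Pass-specific data travels in one state struct instead of many
 * function parameters. */
struct example_lower_state { int from, to; unsigned lowered; };

static bool
example_lower_instr(struct example_instr *instr, void *data)
{
   struct example_lower_state *state = data;
   if (instr->opcode != state->from)
      return false;
   instr->opcode = state->to;
   state->lowered++;
   return true;
}
```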
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17609>
Without it we got a metadata assert:
deqp-vk: ../src/compiler/nir/nir_metadata.c:108: nir_metadata_check_validation_flag: Assertion `!(function->impl->valid_metadata & nir_metadata_not_properly_reset)' failed
if we try to use NIR_PASS(_, ...) instead of NIR_PASS_V (which, among other
things, does more validation).
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17609>
VK_NULL_HANDLE descriptor set layouts are allowed when creating pipeline
layouts without VK_PIPELINE_LAYOUT_CREATE_INDEPENDENT_SETS_BIT_EXT.
From VUID-VkPipelineLayoutCreateInfo-graphicsPipelineLibrary-06753:
> If graphicsPipelineLibrary is not enabled, elements of pSetLayouts
> must be valid VkDescriptorSetLayout objects
From VUID-VkPipelineLayoutCreateInfo-pSetLayouts-parameter:
> If setLayoutCount is not 0, pSetLayouts must be a valid pointer to an
> array of setLayoutCount valid or VK_NULL_HANDLE VkDescriptorSetLayout
> handles
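In driver terms this means a loop over pSetLayouts has to tolerate null entries. A minimal sketch, assuming a hypothetical layout struct and helper (not the actual driver code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a descriptor set layout object. */
struct example_set_layout { uint32_t descriptor_count; };

/* Sum descriptors across a pipeline layout's sets, skipping NULL
 * (VK_NULL_HANDLE) entries instead of dereferencing them. */
static uint32_t
example_count_descriptors(const struct example_set_layout *const *layouts,
                          uint32_t set_layout_count)
{
   assert(set_layout_count == 0 || layouts != NULL);
   uint32_t total = 0;
   for (uint32_t i = 0; i < set_layout_count; i++) {
      if (layouts[i] == NULL)
         continue; /* allowed: this set simply has no layout */
      total += layouts[i]->descriptor_count;
   }
   return total;
}
```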
Signed-off-by: Ricardo Garcia <rgarcia@igalia.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17629>
This represents an offset from the actual start of the image data,
not from the start of the memory allocation bound to the image.
Fixes:
dEQP-VK.image.subresource_layout.*
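A small sketch of the distinction, assuming a hypothetical image struct with a fixed number of mip levels: the reported offset is the per-level offset inside the image data, and the offset at which the image was bound into its memory allocation is deliberately not added.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_MAX_LEVELS 4

/* Hypothetical image description. */
struct example_image {
   uint64_t mem_offset; /* where the image starts in its memory allocation */
   uint64_t slice_offsets[EXAMPLE_MAX_LEVELS]; /* from start of image data */
};

static uint64_t
example_subresource_offset(const struct example_image *image, uint32_t level)
{
   assert(level < EXAMPLE_MAX_LEVELS);
   /* Relative to the image itself: mem_offset is not part of the result. */
   return image->slice_offsets[level];
}
```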
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17648>
From the Vulkan spec:
"If poolSizeCount is not 0, pPoolSizes must be a valid pointer to an
array of poolSizeCount valid VkDescriptorPoolSize structures"
So 0 is actually allowed and there is a CTS test to check it is handled gracefully.
Fixes:
dEQP-VK.api.descriptor_pool.zero_pool_size_count
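Handling this gracefully mostly means not assuming the array exists. A minimal sketch, with a hypothetical struct standing in for VkDescriptorPoolSize:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for VkDescriptorPoolSize. */
struct example_pool_size { uint32_t descriptor_count; };

static uint32_t
example_total_descriptors(const struct example_pool_size *sizes,
                          uint32_t pool_size_count)
{
   /* With a count of 0 the pointer may be NULL and is never touched. */
   assert(pool_size_count == 0 || sizes != NULL);
   uint32_t total = 0;
   for (uint32_t i = 0; i < pool_size_count; i++)
      total += sizes[i].descriptor_count;
   return total;
}
```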
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17648>
cached mode was great 2 years ago when template support was less widespread,
but now that templates are everywhere, caching is less performant in
every scenario
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17636>
Previously we would just unroll the loop one extra iteration and let
other optimisation passes clean up the mess. This worked to a degree
but if the loop happened to be nested inside another loop we would
end up with phi chains that would block other passes from being able
to do the cleanup.
With this commit we explicitly clone the variables created by lcssa
and insert them directly in the last continue branch after we are done
unrolling. With this, optimisation passes can recognise that both sides
of the if output the same values and can progress further.
Helps with the issues described in:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/6051
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17611>