On MoltenVK/MacOS when mapping memory that should already have content it does not appear until flushed.
This noticably effects vertex attribute uploads to descrete devices.
Did also try to add the Coherent memory flag, which did work, until there the Coherent type could only be used for transfer usage only.
This is a known limitation of MoltenVK.
See https://github.com/KhronosGroup/MoltenVK/blob/master/Docs/MoltenVK_Runtime_UserGuide.md#known-moltenvk-limitations
Seen when using MoltenVK 1.0.121, 1.2.131
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7470>
MoltenVK, at least upto 1.2.141, does not render triangle fans. This is reflected in the portability EXTX extension.
This code get the extensions properties and features and then sets the have_triangle_fans.
This extension is not avaiable on all systems, so an amout of the code has to be protected by the define VK_EXTX_PORTABILITY_SUBSET_EXTENSION_NAME.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7457>
Extends the flexability of the device info script.
Allows #if defined()/#endif guards around particular data, for platform or version specific structures.
Allows for feature structures to be retreved without having to have a have_ variable as well. Helps if there is more than one feature in the structure.
These changes help towards allowing the use of the portability set extensions.
Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7457>
Set the ZINK_DEBUG environment variable to 'validation' to automatically setup.
The debug util extnsion callback is used to capture information and logs the results to the error stream.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7431>
There are resources that may have more planes than chained resources. The
frontend has no way of figuring out which (if any) chained resource is the
right one to call resource_get_handle with and until a (now reverted)
change to the dri frontend it just always called with the first resource.
The convention of calling with the first resource of a chain allows the
pipe driver, which has the necessary information of how resources and
planes map to each other for a specific format/modifier combination, to do
the necessary walking. Document this as the official calling convention
of this function.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7419>
The ZSA state is not fully self contained, as other states (mostly
shader using discard or writing depth information) have an influence
on whether we can use early Z test/write.
Rework the ZSA state into a derived state that gets updated whenever
a new ZSA or SHADER state is bound. This way we can automatically
enable/disable early Z as needed.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7396>
The information about a shader using discard/kill is interesting
to other parts of the driver, as depth states need to programmed
differently depending on this. As we don't want to deal with
NIR/TGSI differences in other parts of the driver, track this
usage in the common etna_shader_variant.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7396>
Some depth config states changes require the depth cache to be
flushed, leading to a GPU hang if not done. As the conditions that
require the flush are not toally clear, better be safe than sorry
and always flush the cache on depth state changes.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7396>
The RA_EARLY_DEPTH is a depth state and so must be emitted on
dirty ZSA, instead of dirty SHADER.
Fixes: 785e2707b0 (etnaviv: Fix disabling early-z rejection on GC7000L)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7396>
Multiple threads may access st_update_* function at same time. Issues
happen when the threads modify lists managed by shader compiler.
Issues were found with script that runs multithread tests 1000 times in
a row with MESA_GLSL_CACHE_DISABLE=1 set. Problems start when 2
simultaneous st_create_[vp|fp]_variant calls start to compile a new
shader variant for the same program and various nir passes use and
modify same exec_lists.
Example failure:
deqp-egl: ../src/compiler/glsl/list.h:575: exec_list_validate: Assertion `node->next->prev == node' failed.
v2: instead of introducing new mutex, lock shared state
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7418>
ctz is a CL2.0 opcode but 3.0 requires it as well so just add support
for it.
Tested against CTS integer_ops integer_ctz test.
(long line broken up)
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7468>
Identical to GL_MESA_pack_invert in effect, just need to check for a
different enum value for GLES vs GL. The spec claims that "OpenGL 1.5 or
OpenGL ES 1.0 are required", but ReadPixels isn't a thing for ES1 so we
only enable it for ES2+.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3156>
This newly enables the extension for r100 and vieux. As far as I can
tell, that's safe to do. vieux's ReadPixels is just _mesa_readpixels,
which clearly already handles it correctly because the extension is
enabled for classic swrast. r100 has some custom acceleration paths
before falling down to _mesa_readpixels, which winds its way through to
r100_blit, which already has code to handle pack->Invert being set.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3156>
subpass_idx is of type uint32_t.
Fix defects reported by Coverity Scan.
Macro compares unsigned to 0 (NO_EFFECT)
unsigned_compare: This greater-than-or-equal-to-zero comparison of
an unsigned value is always true. subpass_idx >= 0U.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7325>
We need to consider shader calls as potential writes to their payloads.
For other ray-tracing intrinsics, we may not have a shader payload
pointer and have to treat them more like a barrier. We also need to
ensure that global and SSBO reads/writes aren't propagated across shader
call intrinsics.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
The SPV_KHR_ray_tracing extension adds 6 new storage classes which is a
bit on the ridiculous side. In order to avoid adding that many variable
modes to NIR, we make a few simplifying assumptions:
1. CallableData and RayPayload data actually lives on the stack
somewhere, presumably in the caller's stack. We assume that these
are no different from global variables and use nir_var_shader_temp
for them. We still need a separate storage class for the incoming
variants but only so we can figure out which one the incoming one
is and lower it to something useful.
2. There's no difference between incoming CallableData and RayPaolad
data. We can use a single storage class for both.
3. ShaderRecordBuffer data is just a global memory access. This lets
us avoid NIR variables entirely and just fetch the pointer via the
shader_record_ptr system value and it's accessed using a 64-bit
global memory pointer.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
If we were desperate to reduce bits, we could probably also use
shader_in/out for hit attributes as they really are an output from
intersection shaders and read-only in any-hit and closest-hit shaders.
However, other passes such as nir_gether_info like to assume that
anything with nir_var_shader_in/out is indexed using vec4 locations for
interface matching. It's easier to just add a new variable mode.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
These are a bit more tricky than most because they're matrix system
values. We make the intentional choice here to not bother with allowing
indirect addressing of columns for these. Since they're system values,
they may be magically constructed somehow or come from weird hardware so
it's easier on back-ends to just handle any indirects with bcsel.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
This is an operation we have to do already for nir_vector_extract and
I'm about to do something very similar for matrix columns. Having a
more generic helper is useful.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
Missing in this commit are NIR intrinsics for the ObjectToWorld and
WorldToObject built-ins. Those are matrices and so they take a bit more
work and justify a separate commit. For now, we add the enums and leave
the SYSTEM_VALUE <-> nir_intrinsic conversion commented out.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
For now, we assume its a 64-bit global pointer.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
We already fail in these same cases in vk_desc_type_for_mode. These
additional assertions are just extra code to update.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
This particular ordering makes them conveniently match
VkShaderStageFlagBits, which is a property we already take advantage
of in the previous shader stages.
Abbreviations are based on the ones used in glslangValidator.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
some drivers may have pre-lowered gl_ClipDistance to 2x vec4 to match hw
usage, so for those cases we'll be getting deref_var here and then components
will be stored to the deref at some point
fixesmesa/mesa#3480
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6563>
we can just check the bits using clip_distance_array_size here to simplify
everything and more easily determine if we need to be running this pass
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6563>
We are limited to 64 threads per dispatched group, regardless of what
num_cs_threads claims, so advertise that limit correctly.
Fixes (on TGL and up):
dEQP-VK.subgroups.size_control.compute.required_subgroup_size_min
and other *.required_subgroup_size_min tests.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7453>
Add missing GetPhysicalDeviceImageFormatProperties2 logic for the extension
and enable it.
Also stop exposing optimal tiling for formats which are linear only, to
simplify dealing with those.
Passes dEQP-VK.drm_format_modifiers.*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6940>