This adds support to the X11 WSI for explicit synchronization using DRM
syncobjs. It relies on versions 1.4 of the DRI3 and Present extensions.
Signed-off-by: Erik Kurzinger <ekurzinger@nvidia.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27226>
We no longer support the old LINE+MAC lowering, and we already lower
this to MAD in NIR on Gfx11+, so the LINTERP virtual opcode always
corresponds the PLN. The only catch is that LINTERP's operands are
reversed from PLN, so we have to switch them.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28705>
We already have a logical opcode and lower to what is basically a send
instruction. We just weren't using SHADER_OPCODE_SEND, instead having
extra redundant infrastructure for no real gain.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28705>
None of the callers of anv_free_sparse_bindings() check for its return
result, and they also don't have a way to propagate it up the stack.
So just don't return error codes that won't be checked. Instead,
add an assertion so at least we can detect failures in our CI or
development runs.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28724>
The device->using_sparse variable is only used at cmd_buffer_barrier()
to decide if we need to apply the heavier-weight flushes that are only
applicable to sparse resources. The big problem here is that we need
to apply the flushes to the non-image and non-buffer memory barriers,
so we were trying to limit those only to applications that ever submit
a sparse resource to the sparse queue.
The reason why we were applying this only to devices that ever
submitted sparse resources is that dxvk games have this thing where
during startup they create and then delete tiny sparse resources, so
switching device->using_sparse to true at resource creation would make
basically every dxvk game start applying the heavier-weight
workaround.
The problem with all that is that even if an application creates a
sparse resource but doesn't ever bind them, the resource should still
behave as an unbound resource (because they are bound with a NULL
bind), so the flushes affecting them should happen. This case is
exercised by vkd3d-proton/test_buffer_feedback_instructions_sm51.
In order to satisfy all the above cases and only really apply the
heavier-weight flushes to applications actually using sparse
resources, let's just count the number of sparse resources that
currently exist and then apply the workaround only if it's not zero.
That covers the dxvk case since dxvk deletes the resources as soon as
they create, so num_sparse_resources goes back to 0.
Testcase: vkd3d-proton/test_buffer_feedback_instructions_sm51
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10960
Fixes: 6368c1445f ("anv/sparse: add the initial code for Sparse Resources")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28724>
This went unused a while ago. If we decide we want it again we can
just add it back.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28724>
Since we moved the dump_anv_vm_bind() call to anv_sparse_bind(), that
BEGIN/END block stopped making sense, so just keep the first set of
messages.
Also wrap everything around a single INTEL_DEBUG() check so we'll only
run this check once when debug is disabled (we don't care about
running the check multiple times if it's enabled).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28724>
In both cases we end up calling anv_image_aspect_to_plane(), which
already includes the same assertion.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28724>
If isl_surf_get_tile_info() returned the struct instead of having it
passed as a pointer, gcc would have detected this. I can write patches
for that if we want it.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28724>
No need to track is_push_descriptor in templ. No need to conditionally
decide to use set or NULL handle since we pass NULL handle from the cmd
side. Also fixed the arg type mismatch in the template helper.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28686>
So we can use the --depfile option, and not have to worry about new
files. Since it's Rust, the expectation is that you're going to have up
to date dependencies anyway.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28754>
Try to use v_fma_mix{lo,hi}_f16 if possible instead of v_cvt_pkrtz_f16_f32.
To ensure correct rounding we have to make sure that the fp16 rounding mode
can be rtz first.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28670>
Some uses don't have any post-dominator. An example is an atomic that
feeds itself in a loop. No instruction immediately post-dominates
the result of such an atomic because no instruction can strictly
post-dominate itself. This handles that case generally by setting
the root node as the post-dominator for instructions that can't be
reordered.
Fixes: ba54099dce - nir: add a utility computing post-dominance of SSA uses
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28436>
REG_A6XX_HLSQ_VS_CNTL ... REG_A6XX_HLSQ_GS_CNTL are not contiguous
on A7XX, and based on CTS runs with the stomper we cannot stomp
REG_A6XX_SP_VS_OBJ_START safely.
Signed-off-by: Amber Harmonia <amber@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28732>
Reorder its members to fill the current padding hole, reducing the
struct size from 32 to 24.
This struct appears multiple times inside struct anv_image and its
members, so this change brings down sizeof(struct anv_image) from
1744 to 1600.
We went from:
struct anv_image_memory_range {
enum anv_image_memory_binding binding; /* 0 4 */
/* XXX 4 bytes hole, try to pack */
uint64_t offset; /* 8 8 */
uint64_t size; /* 16 8 */
uint32_t alignment; /* 24 4 */
/* size: 32, cachelines: 1, members: 4 */
/* sum members: 24, holes: 1, sum holes: 4 */
/* padding: 4 */
/* last cacheline: 32 bytes */
};
to:
struct anv_image_memory_range {
enum anv_image_memory_binding binding; /* 0 4 */
uint32_t alignment; /* 4 4 */
uint64_t size; /* 8 8 */
uint64_t offset; /* 16 8 */
/* size: 24, cachelines: 1, members: 4 */
/* last cacheline: 24 bytes */
};
Considering we can have tens of thousands of anv_image structs
allocated at the same time on gaming workloads, this can save us a few
MB of memory. It ain't much but it's honest work.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28700>
Otherwise, the returned VA from vkGetBufferDeviceAddress() or via
VK_EXT_device_address_binding_report doesn't match and applications
would have to mask out.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28652>
This seems to cause some troubles for distro builds.
Fixes: 394652e5a0 ("etnaviv: hwdb: Generate hwdb.h")
Closes: #11012
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28745>
Not a whole lot of applications supports Vulkan 1.0, so let's wire up
support for MESA_VK_VERSION_OVERRIDE so we can easily override the
version to when testing.
While we're at it, let's switch to VK_MAKE_API_VERSION, as
VK_MAKE_VERSION is deprecated now.
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28694>
7e82c59fa4 ("Uprev Piglit to dd6f7eaf82e8dd442da28b346c236141cbcce0b1") pulled
in fixes to the testsuite, which makes two more tests pass on GC2000.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28744>
Remove a premature optimization. When PIPE_MAP_DISCARD_WHOLE_RESOURCE
is set we were setting create_new_bo, and then if that was set we skipped
a set of tests which if passed would cause a panfrost_flush_writer.
In fact we need that flush in some cases (e.g. when any batch is
reading the resource). Moreover, we should sometimes copy the resource
(set the copy_resource flag) and that again was being skipped if
create_new_bo was initially true due to PIPE_MAP_DISCARD_WHOLE_RESOURCE
being set.
Cc: mesa-stable
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28406>
X8Z24 surfaces have a don't care stencil channel, which is okay to be
cleared together with the depth channel. Set the depth clear bits
accordingly to allow those clears to use the fast-clear path when
only depth is to be cleared. This change aligns the RS with the BLT
ZS clear path.
Fixes: df63f188e8 ("etnaviv: fix separate depth/stencil clears")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28696>
Now that we properly switch between fast/regular clears for depth/stencil
surfaces as needed and fixed the resulting corner-case issues, there are
two more passing dEQP tests.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28696>
Currently both TS and non-TS paths use the same place to store the compiled
RS commands to clear the surface. In the TS case the commands only initialize
the TS buffer, while the non-TS commands clear the whole buffer. The
assumption here is that a TS enabled surface will only ever be fast cleared,
which doesn't hold anymore, now that we can fall back to slow clears on TS
enabled depth/stencil buffers.
The fallback to a slow clear will overwrite the stored RS commands with a
full buffer clear. If we can transition to a fast clear later, the commands
to initialize the TS buffer will not be regenerated and a full buffer clear
will be submitted instead. In addition to the performance degradation, it
will also leave TS in an inconsistent state, as the TS buffer will not be
initialized, but the TS state still gets marked as valid.
To avoid this confusion and not introduce any more state tracking to remember
the target of the clear commands and regenerate TS clears if needed, simply
split the storage for compiled TS and non-TS clear commands.
Fixes: df63f188e8 ("etnaviv: fix separate depth/stencil clears")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28696>
Now that we switch dynamically between fast (TS) and slow (regular)
clears on TS enabled surfaces, we must trigger reevaluation of the
current TS state also after a slow clear, as otherwise the PE might
continue to use the invalidated TS state.
Fixes: df63f188e8 ("etnaviv: fix separate depth/stencil clears")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28696>
With commit f2506780c8 ("mesa/st: Only set seamless for GLES3") ss->seamless_cube_map
should behave as wanted. For GLES2 it can only be set when PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE
is supported.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28669>
mem_access_base_pointer loads memory (the descriptor) and therefore
needs to be guarded. Fixes
dEQP-VK.spirv_assembly.instruction.terminate_invocation.terminate.no_null_pointer_load.
Fixes: fc8a83c ("gallivm/ssbo: mask offset with exec_mask instead of building the 'if'")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28614>
Since DGC preprocessing for IBO is supported, the driver generates
an indexed indirect draw but SQTT markers were missing and this
introduced complete non-sense in RGP captures.
Fixes: e59a16bbb8 ("radv: use an indirect draw when IBO isn't updated as part of DGC")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28710>
Fixes
dEQP-VK.video.encode.h264_i_p_not_matching_order
dEQP-VK.video.encode.h265_i_p_not_matching_order
Fixes: 54d499818c ("radv/video: add initial support for encoding with h264.")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28734>