The intention is to pick the system memory for the prime blit dst, but
that is not possible when all memory types are advertised to be local.
This fixes venus over vtest (i.e., unix socket) because the driver
provides no PCI bus info and wsi_device_matches_drm_fd returns false. A
driver might also use can_present_on_device to force prime blit.
Fixes: 469875596a ("vulkan/wsi: Fix prime blits to use system memory for the destination")
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11774>
This prevents the previous commit from being undone by the jump
optimizations in legalize, and fixes another potential case where
instead of a continue we have an if/else at the end of a loop.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
When loops have continue statements, it's expected that when we execute
a divergent continue (i.e. a continue where not all of the threads
active at the start take it) we keep going with the rest of the loop
body and then reconverge at the start of the next iteration. However the
Adreno ISA seems to always take a branch that jumps backwards, assuming
it's the bottom of a loop, so we get a different, undesired convergence
behavior. There's no way I know of to control this behavior in the
instruction set, so we have to instead insert a "continue block" at the
end of the loop where continue statements reconverge which then jumps
back to the top of the loop. Since this doesn't correspond 1:1 with any
NIR block we have to make control flow handling in NIR->ir3 a bit more
complicated, unfortunately.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
When we go to split e.g. a p0.x producer, the only other instructions
ready to schedule are often only p0.x producers. It could happen that
they all have a lower priority than the split instruction. Then we would
immediately schedule the split instruction again, then again try to
schedule one of the other producers, be blocked, and split it, around
and around again, leading to an infinite loop. The following commit
triggered this with
dEQP-GLES3.functional.shaders.discard.dynamic_loop_always on a3xx.
Fixes: d2f4d33 ("freedreno/ir3: new pre-RA scheduler")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
MOVMSK is a bit of a special case, because it takes multiple cycles (and
therefore reduces the nops needed if it's between some other assigner
and consumer) however weird things happen if you try to start reading
the first component while it isn't finished yet. On balance making it
use repeat seems to result in a fewer special cases.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
I realized that shared registers were never actually getting folded,
even after adding them to valid_flags, because the move wasn't even
being considered.
I looked at the other uses of is_same_type_mov(), and they should be ok
with this.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
This fixes a pre-existing bug in ir3, but it showed up even more due to
other changes in this series and it interacts with the logical/physical
CFG split. When both sides of an if end with a jump, a block may become
unreachable via the logical CFG, which can cause problems because it has
no predecessors to figure out the location of live-in non-shared
values. In this case we assume that nir_opt_if has removed any code in
these blocks and just skip processing live-ins for these blocks,
pretending that they aren't live.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
The way that the blob obtains the subgroup id on compute shaders is by
just and'ing gl_LocalInvocationIndex with 63, since it advertizes a
subgroupSize of 64. In order to support VK_EXT_subgroup_size_control and
expose a subgroupSize of 128, we'll have to do something a little more
flexible. Sometimes we have to fall back to a subgroup size of 64 due to
various constraints, and in that case we have to fake a subgroup size of
128 while actually using 64 under the hood, by just pretending that the
upper 64 invocations are all disabled. However when computing the
subgroup id we need to use the "real" subgroup size. For this purpose we
plumb through a driver param which exposes the real subgroup size. If
the user forces a particular subgroup size then we lower
load_subgroup_size in nir_lower_subgroups, otherwise we let it through,
and we assume when translating to ir3 that load_subgroup_size means
"give me the *actual* subgroup size that you decided in RA" and give you
the driver param.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
On qualcomm, we have shared registers similar to SGPR's on AMD. However,
there is no readlane or readfirstlane primitive. shared registers can
only be written to when just one lane is active. This means that we have
to lower readInvocation(val, id) to something like:
if (gl_SubgroupInvocation == id) {
scalar_reg = val;
}
return scalar_reg;
However it's a bit difficult to actually get the value of
gl_SubgroupInvocation in the backend, because for compute it requires
some calculations and we don't have any CSE support in the backend. This
intrinsic lets us turn it into
"readInvocationCond(val, id == gl_SubgroupInvocation)" in NIR at which
point the backend code generation is a lot easier.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
Qualcomm has a mode with a subgroup size of 128, so just emitting larger
integer operations and then lowering them later isn't an option. This
makes the pass able to handle the lowering itself, so that we don't have
to go down to 64-thread wavefronts when ballots are used.
(The GLSL and legacy SPIR-V extensions only support a maximum of 64
threads, but I guess we'll cross that bridge when we come to it...)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
Lower it to a vote instead of a ballot. This was only used for AMD, and
in that case they're pretty much the same. However Qualcomm has a vote
builtin, which we want to use instead of ballots.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
Return VK_ERROR_OUT_OF_HOST_MEMORY if
zwp_linux_dmabuf_v1_create_params fails.
Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11589>
If image->buffer cannot be allocated, the value returned by
wsi_create_native_image is returned. However, if we got that far,
that value is VK_SUCCESS.
Fix it and return VK_ERROR_OUT_OF_HOST_MEMORY.
Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11589>
A wl_proxy inherits its queue from its parent.
display->dmabuf.wl_dmabuf already has its queue correctly set up,
so it's unnecessary to set it again on the child
zwp_linux_buffer_params_v1 proxy.
Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11589>
The sole purpose of this wl_proxy is to set the queue to
chain->display->queue. However, wl_proxy inherit their queue from
their parent, so the original wl_drm proxy already has its queue
set up properly (inherited from wl_registry).
Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11589>
While we shrink some variable from "GLuint" to "ubyte",
need to update the check from "x != ~0U" to "x != 0xff" too.
This fixes the crash for SPECviewperf 13 benchmark medical
case.
Fixes: d947e3e2c8 "st/mesa: decrease the size of st_vertex_program"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11757>
Add DEBUG_NAMED_VALUE_END to finalize debug option array (see
lp_screen.c). Otherwise debug_get_flags_option might attempt to read
debug_named_value::name at an offset and SIGSEGV.
Signed-off-by: Heinrich Fink <hfink@snap.com>
Fixes: 991def0edc ("softpipe: Convert to comma-separated SOFTPIPE_DEBUG for debug options.")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11742>
If we're giong to build RADV on Windows, we need to make sure we trigger
the build on all RADV-changes.
Fixes: d18563ea58 ("ci: Update Windows image to build RADV")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11731>
We try to place persistent/coherent buffers from the application in
system memory, because they want the CPU-GPU coherency.
However, our internal u_upload_mgr buffers are also flagged persistent +
coherent, but we absolutely want most of them in device local memory.
Mark had done this correctly in an earlier patch series, but I made a
mistake when refactoring things during upstreaming, and accidentally
put these in SMEM again. This fixes that mistake.
Tested-by: Luis Felipe Strano Moraes <luis.strano@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11681>
With '-Wl,--gc-sections' meson.build cc.has_function() will never fail,
providing wrong input to the build system.
Fixes: 8621bd8d5e ("android: Add scripts to build using meson")
Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Tested-by: Mauro Rossi <issor.oruam@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11637>
Not really arch-dependent but technically uses GenXML. This is pretty
trivial anyway.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11745>
We don't want other files accessing these functions except through the
vtables, since they will soon be architecture dependent.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11745>