llvmpipe expects a valid size parameter, and very bad things can happen
when just VK_WHOLE_SIZE is passed.
This was handled specially before, but got dropped when lavapipe was
converted to use the generated command queue.
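A minimal sketch of the kind of resolution meant here, not the actual
lavapipe code: the VK_WHOLE_SIZE sentinel is replaced by the number of bytes
remaining after the offset before the size is handed on. The helper name
resolve_range and the local WHOLE_SIZE macro (same value as VK_WHOLE_SIZE)
are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as VK_WHOLE_SIZE; redefined here so the sketch builds
 * without the Vulkan headers. */
#define WHOLE_SIZE (~0ULL)

/* Hypothetical helper: turn a Vulkan-style (offset, range) pair into a
 * concrete byte count for a backend that expects a valid size. */
static uint64_t
resolve_range(uint64_t buffer_size, uint64_t offset, uint64_t range)
{
   assert(offset <= buffer_size);
   if (range == WHOLE_SIZE)
      return buffer_size - offset;
   assert(range <= buffer_size - offset);
   return range;
}
```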
Fixes: eb7eccc76f ("lavapipe: Use generated command queue code")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13036>
This knowledge was repeated in multiple places, so move the values to
the intel_device_info struct.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13014>
If an OpVariable's initializer is undef, there is no need to
initialize the variable.
v2: Comment the code (Caio)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13030>
The LLVM-SPIRV translator creates variables with initializers, but
most of those are actually undef initializers. We can just skip
composites that are entirely made of undefs, but for partially undef
composites we will still zero-initialize.
v2: Rename wa_llvm_spirv_undef_initializer to wa_llvm_spirv_ignore_workgroup_initializer (Caio)
Limit workaround to OpenCL (Caio)
Make workaround clearer (Caio)
v3: Only apply workaround on workgroup storage (Caio)
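A rough sketch of the decision described above, assuming a simplified view
of a composite initializer as one "is undef" flag per member. The enum, the
function name and the handling of the fully defined case are hypothetical
and not taken from the spirv_to_nir code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum init_action {
   INIT_SKIP,       /* initializer is entirely undef: emit nothing */
   INIT_ZERO,       /* partially undef composite: zero-initialize */
   INIT_AS_WRITTEN, /* assumption: fully defined, not covered by the text */
};

/* Hypothetical classifier for a workgroup variable's initializer. */
static enum init_action
classify_workgroup_initializer(const bool *member_is_undef, size_t count)
{
   assert(count > 0);

   size_t undef_members = 0;
   for (size_t i = 0; i < count; i++) {
      if (member_is_undef[i])
         undef_members++;
   }

   if (undef_members == count)
      return INIT_SKIP;
   if (undef_members > 0)
      return INIT_ZERO;
   return INIT_AS_WRITTEN;
}
```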
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13030>
We were assertion failing on some large draws due to indices > 16 bits,
despite asking draw to limit the max indices. I haven't managed to track
it down, so flip us back to the older, non-index drawing path that doesn't
hit this bug until it can get fixed. Leave an I915_DEBUG=vbuf flag around
so we can look into this later.
This is a pretty big performance hit for vertex shaders (about 12% in the
numbers below). Using glmark2 -b build:use-vbo=true:
i915g-vbuf: 211 fps
i915g-nonvbuf: 185 fps
i915c: 41 fps
Given how massively better i915g still is than i915c (llvmpipe VS instead
of the classic swrast interpreter), I think it's still worth it to get
i915g correct before we fix this perf regression.
Fixes: #4971
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13052>
The common code fails dEQP-VK.wsi.display_control.register_device_event
due to having a stub NOT_IMPLEMENTED return, and thus fails the CTS. This
is one of our last failures, so disable the extension until it can get
finished off, so we can unblock passing the CTS.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13010>
This patch allows clauses to be formed even if the register pressure
is at the limit, with the effect that VMEM instructions are less
scattered after the first clause in a block.
It respects the previous clause size to avoid excessive moving
of VMEM instructions.
VMEM_CLAUSE_MAX_GRAB_DIST is further reduced to compensate for
some of the effects.
Totals from 28922 (19.26% of 150170) affected shaders: (GFX10.3)
VGPRs: 1546568 -> 1523072 (-1.52%); split: -1.52%, +0.00%
CodeSize: 117524892 -> 117510288 (-0.01%); split: -0.08%, +0.07%
MaxWaves: 605554 -> 611120 (+0.92%)
Instrs: 22292568 -> 22291927 (-0.00%); split: -0.10%, +0.09%
Latency: 488975399 -> 490230904 (+0.26%); split: -0.06%, +0.32%
InvThroughput: 117842300 -> 116521653 (-1.12%); split: -1.15%, +0.03%
VClause: 541550 -> 522464 (-3.52%); split: -9.73%, +6.20%
SClause: 718185 -> 718298 (+0.02%); split: -0.00%, +0.02%
Copies: 1420603 -> 1386949 (-2.37%); split: -2.64%, +0.27%
Branches: 559559 -> 559278 (-0.05%); split: -0.06%, +0.01%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10896>
The X server doesn't get this wrong. It's not the client's job to
correct what the server says here. And if anyone ever implements HDR for
X11, you might in fact want to be able to use floats with a window.
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13002>
In GLX a "tag" usually means a context tag; "fbconfig attribute" is a
bit more obvious.
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13002>
It exactly matches the shader keys now. Everything was copied from
the pipeline key to the shader keys.
There is still some work to completely remove radv_shader_variant_key.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13032>
Just run some selected tests for now because we are missing a lot of
functionality, which would cause so many crashes that the runs
aren't practical.
Once the core functionality is implemented, we can switch to the master
case list with skips.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13016>
Fixes several problems in the pan_blit() logic:
1. We actually need the reciprocal of the depth scaling in z_scale (maybe
we should rename this field z_scale_rcp to make it clear)
2. When Z end < Z start we should subtract one from the cur_layer/layer_offset
instead of doing it on the last_layer field, otherwise there's an
off-by-one error
3. The Z src offset should be adjusted to account for scaling. If we don't
do that we won't sample from the right layer when upscaling.
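One way to read the three points above, sketched outside of pan_blit() with
hypothetical names (struct z_map, make_z_map, src_z_for_layer,
first_dst_layer). It assumes half-open Z ranges that may be reversed and is
only an illustration of the arithmetic, not the driver code.

```c
#include <assert.h>
#include <stdlib.h>

struct z_map {
   float z_scale;   /* source Z advance per destination layer (point 1) */
   float z_offset;  /* source Z sampled for the first layer (point 3) */
   int layer_count;
};

static struct z_map
make_z_map(int src_start, int src_end, int dst_start, int dst_end)
{
   int dst_layers = abs(dst_end - dst_start);
   assert(dst_layers > 0);

   struct z_map m;
   m.layer_count = dst_layers;
   /* Reciprocal of the dst/src depth scaling; negative if src is flipped. */
   m.z_scale = (float)(src_end - src_start) / (float)dst_layers;
   /* Sample at the center of the scaled step, not at a fixed +0.5. */
   m.z_offset = (float)src_start + 0.5f * m.z_scale;
   return m;
}

static float
src_z_for_layer(const struct z_map *m, int i)
{
   assert(i >= 0 && i < m->layer_count);
   return m->z_offset + i * m->z_scale;
}

/* Point 2: with a reversed range the first layer written is start - 1. */
static int
first_dst_layer(int dst_start, int dst_end)
{
   return dst_start < dst_end ? dst_start : dst_start - 1;
}
```

With a 4-layer source blitted to 8 destination layers, two consecutive
destination layers sample inside the same source layer, which is the
upscaling case point 3 is about.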
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12961>
Since we have no guarantee that start < end, we can't really tell
which one the offset applies to. Let the caller take care of that.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12961>
Bit 12 of render->aux1 is GL_CCW/GL_CW. For GL_CCW (default of glFrontFace) we have
to set that bit active.
This is neither what the blob does nor what the original reverse engineering documentation
says. The blob sets this value inverted and does some bogus negation of the fragment
shader's gl_FrontFacing variable instead.
Anyway, doing it this way does not cause regressions but fixes
dEQP-GLES2.functional.shaders.builtin_variable.frontfacing and 4 piglit tests.
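As an illustration only, the bit handling described above could look like
the following. The macro and function names are hypothetical and the enum
simply mirrors the GL_CW/GL_CCW values; none of this is copied from the
driver.

```c
#include <assert.h>
#include <stdint.h>

enum front_face {
   FRONT_FACE_CW  = 0x0900, /* same value as GL_CW */
   FRONT_FACE_CCW = 0x0901, /* same value as GL_CCW */
};

/* Hypothetical name for bit 12 of render->aux1. */
#define AUX1_FRONT_FACE_CCW (1u << 12)

/* Return aux1 with bit 12 set for GL_CCW and cleared for GL_CW. */
static uint32_t
aux1_with_front_face(uint32_t aux1, enum front_face face)
{
   assert(face == FRONT_FACE_CW || face == FRONT_FACE_CCW);
   if (face == FRONT_FACE_CCW)
      return aux1 | AUX1_FRONT_FACE_CCW;
   return aux1 & ~AUX1_FRONT_FACE_CCW;
}
```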
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7690>
The remaining extensions are optional features, just turn on vk 1.2
with them reporting as off.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12953>
Again, if an invoc is passed but the exec mask has the
active lane somewhere other than at 0, we should find the
active lane and extract the value from invoc rather than
using the idx.
This fixes a bunch of VK 1.2 subgroup tests once 1.2 is enabled:
dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_nonconst*
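A scalar sketch of the selection logic, assuming the exec mask is a plain
bitmask and invoc is a per-lane array. The real code builds LLVM IR, so the
function names and types here are stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index of the lowest set bit in the exec mask. */
static unsigned
first_active_lane(uint32_t exec_mask)
{
   assert(exec_mask != 0);
   unsigned lane = 0;
   while (!(exec_mask & (1u << lane)))
      lane++;
   return lane;
}

/* Pick the broadcast source: with an invoc vector, read it at the first
 * active lane; without one, fall back to the plain idx. */
static uint32_t
select_broadcast_index(const uint32_t *invoc, uint32_t exec_mask, uint32_t idx)
{
   if (invoc == NULL)
      return idx;
   return invoc[first_active_lane(exec_mask)];
}
```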
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12953>
These pass all the CTS tests, though not sure how useful they are.
[airlied: these may need some work in the future depending on app expectations]
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12953>
this works by tracking 1024-member arrays of images and textures using idalloc
for indexing. each idalloc id is an index into the array that is returned as a handle,
and this handle is used to index into the array in shaders.
in the driver, VK_EXT_descriptor_indexing features are used to enable updates on the live
bindless descriptor set and leave unused members of the arrays unbound, which works as
long as no member is updated while it is in use. to avoid such updates, idalloc ids must cycle through
a batch once the image/texture handle is destroyed before being returned to the available pool.
in shaders, bindless ops come in one of two types:
- i/o variables
- bindless instructions
for i/o, the image/texture variables have to be rewritten back to the integer
handles which represent them so that the successive shader stage utilizing them
can perform the indexing
for instructions, the src representing the image/texture has to be rewritten as a deref
into the bindless image/texture array
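a minimal sketch of the id lifetime described above, using a tiny
hand-rolled slot table rather than the real idalloc helper. all names here
are hypothetical, and a single in-flight batch is assumed to keep the
example short.

```c
#include <assert.h>
#include <stdint.h>

#define BINDLESS_SLOTS 1024
#define NO_SLOT UINT32_MAX

enum slot_state { SLOT_FREE, SLOT_LIVE, SLOT_RETIRING };

struct bindless_pool {
   enum slot_state state[BINDLESS_SLOTS];
};

/* Hand out the lowest free id; the id doubles as the shader-visible handle. */
static uint32_t
bindless_alloc(struct bindless_pool *p)
{
   for (uint32_t id = 0; id < BINDLESS_SLOTS; id++) {
      if (p->state[id] == SLOT_FREE) {
         p->state[id] = SLOT_LIVE;
         return id;
      }
   }
   return NO_SLOT;
}

/* Destroying a handle only retires the id; it may still be in use. */
static void
bindless_destroy(struct bindless_pool *p, uint32_t id)
{
   assert(id < BINDLESS_SLOTS && p->state[id] == SLOT_LIVE);
   p->state[id] = SLOT_RETIRING;
}

/* Once the batch has completed, retired ids return to the pool. */
static void
bindless_batch_done(struct bindless_pool *p)
{
   for (uint32_t id = 0; id < BINDLESS_SLOTS; id++) {
      if (p->state[id] == SLOT_RETIRING)
         p->state[id] = SLOT_FREE;
   }
}
```

the point of the extra RETIRING state is that a destroyed id cannot be
handed out again, and its array member overwritten, until the batch that
might still reference it is done.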
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12855>
these are going to come through as direct variable derefs, so it's simple
to handle the functionality by reusing the same codepath to generate image
types
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12855>