If we want syntax-highlighting to actually work here, we should make
sure the code actually parses.
This fixes a warning during docs build.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8243>
There's a few more cases that needs proper quoting for Sphinx. Asterisks
and ticks at the start of words, as well as underscores at the end of
symbols, even when they have trailing escaped characters.
We should really find a way to robustly escape these things when
generating them.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8243>
Although I assume that this should be caught by the validation layers,
recently while triaging the following tests:
dEQP-VK.ycbcr.query.*r8g8b8a8_unorm*
We found that they were setting a descriptorCount of zero, because it
was not handling correctly the differences between Vulkan 1.0 and
Vulkan 1.1.
So let's just assert, just in case it happens again, as that would
make the bugfixing far easier.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8614>
The documentation states that if we disable Early Z for the whole
frame in the RCL Tile Rendering Mode packet, then we should not
emit any draw calls with it enabled (which we can do by enabling
it in the CFG_BITS packet).
Since we emit our RCL after recording our draw calls in the BCL
and we were not considering there if any condition for global disable
would be met, it was possible that we end up with an incorrect
configuration when we decide for a global disable in the RCL, which
can cause rendering artifacts. This can be easily observed by simply
forcing the RCL bit to disable early Z in applications that are known
to enable it in CFG_BITS (such as the UE Shooter demo for example).
With this change we keep track of this scenario when we record
draw calls in the BCL and if decide that we need to disable EZ for
the entire job, we make sure we never enable it for any draw calls
in the frame.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8589>
This is an optimization that should make Z/S clears faster. To enable
this we can't have any Z/S loads or stores in the job. Also, it seems
that enabling early Z/S clearing is independent of whether early Z/S
testing is enabled.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8589>
There was a misunderstanding regarding the scope of some hardware bugs
that led us to think that:
1. The Clear Tile Buffer Z/S bit was broken
2. The Clear Tile Buffer RTs bit would also clear Z/S.
1) is not really true, what happened was that some other bugs for which
we need workarounds anyway would have that effect. 2) was only true
for V3D 4.1, so it doesn't affect v3dv.
This change makes proper use of the Z/S bit instead of falling back to
clearing all tile buffers every time we have a Z/S clear. This also
allows us to do color clears on the tile store (which is faster) rather
than falling back to the clear all RTs bit every time we have a Z/S clear.
v2: rewrite the original comment about the hardwarebug description to
include recent discussions with Broadcom instead of keeping it as
is and amending it with an update note.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8589>
Currently, these cannot be vectorized as in NIR
shift operands are 32bit while for 16bit-vectorization
they need to be 16bit.
No fossildb changes.
Fixes: fcd2ef23e5 ('radv: vectorize 16bit instructions')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8612>
Because Gallium and Vulkan disagree on what kind of state strides is, we
need to wrangle this state a bit, and up until now, we've been simply
fixing this up while binding the vertex-buffers.
But this isn't robust, because the vertex element state might be bound
after the vertex-buffer state was bound. We also need to take
binding-map into account, which we're currently missing as well.
Instead, w need to deal with this at a place where we know what's being
used for both of these. So let's do this during draw instead.
Ideally, we'd also do some dirty-tracking to know if this is needed or
not, but I believe Mike has some patches in this areas lined up, so it
might be easier to wait for those.
Fixes: 8d46e35d16 ("zink: introduce opengl over vulkan")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3661
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4125
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8588>
This will allow to propagate and emit sub-register constants
on all hardware generations.
Also fixes GFX8 constant emission to not use SDWA.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
It could happen that due to inconsistent copy-propagation
v1 = p_parallelcopy v2b
instructions were left after optimization on GFX8.
Cc: 20.3
Cc: 21.0
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
When NGG is used, the hw can't know the number of geometry shader
primitives. To fix that, the NGG geometry shader accumulates itself
the number of primitives by using an atomic operation directly to GDS.
Then, begin/query copy the start/stop values from GDS to the
query pool buffer using a PS_DONE event. This was actually wrong
because PS_DONE is completely asynchronous to everything and executed
when the preceding draws finish pixel shaders.
Fix this by using a COPY_DATA packet which is synced with CP. This
fixes random failures on Sienna Cichlid with
dEQP-VK.query_pool.statistics_query.*.geometry_shader_primitives.*.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8590>
Fix defect reported by Coverity Scan.
Uninitialized pointer field (UNINIT_CTOR)
member_not_init_in_gen_ctor: The compiler-generated constructor for this
class does not initialize targ.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8541>
This changes glCallLists from using a switch inside a loop to using loops
inside a switch.
Also fix the comments that didn't tell WHY something was important.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
vbo_save implements this better by putting the vertices into a VBO.
The SET functions removed here were overriding it to use the slower
version that uploads the vertices on every call.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
The glapi scripts are fully capable of generating this correctly for all
GL APIs if we don't set exec="dynamic".
exec="dynamic" should only be used for glBegin, glEnd, and all functions
that are legal inside Begin/End.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
The existing display list code is reused to call display lists from the app
thread. The util_queue_fence_wait call waits for the last call that modifies
display lists (such as glEndList and glDeleteLists), which ensures that
accessing display lists from a non-mesa thread is thread safe because
the wait guarantees that display lists are immutable during the asynchronous
display list execution.
Display lists are executed just like normal display lists except that they
call glthread functions instead of the default GL dispatch. Many calls in
display lists are skipped because glthread only tracks a few states.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
This just tracks matrix stack depths in MatrixStackDepth and everything
else here is needed to make it correct.
Matrix stack depths will be returned by glGetIntegerv without synchronizing.
Display lists will be handled by a separate commit.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
This decreases CPU time spent in the unmarshal_DrawElements function
from 0.44% to 0.26% if no user buffers are present.
Instead of converting all calls to either unmarshal_DrawArraysInstanced-
BaseInstance or unmarshal_DrawElementsInstancedBaseVertexBaseInstance,
which both also conditionally bind uploaded user buffers if needed and
call one of:
- DrawArraysInstancedBaseInstance
- DrawElementsInstancedBaseVertexBaseInstance
- DrawRangeElementsBaseVertex,
add 3 unmarshal draw variants that are specialized version of the above that
never bind uploaded user buffers. This removes all conditionals from
the unmarshal functions for the common case when there are no user buffers.
Unused function enums are used for the various draw variants. For example,
CMD_DrawArrays is used to dispatch DrawArraysInstacedBaseInstance without
user buffers, while CMD_DrawArraysInstacedBaseInstance is used to dispatch
the same with user buffers. glthread isn't flexible enough to do it cleanly.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
It only checked whether the pointer was indices or indirect, but we can
just determine the same thing manually for each draw call.
Simplify it as follows:
- if a call contains a pointer without count and it's either indirect or
indices, set marshal="async". The marshal_sync attribute still determines
when it syncs.
- if a call doesn't contain any pointer without count, remove the marshal
attribute
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
To simplify tracking the checksum validity, only checksum 2D
resources.
The GPU cannot do checksumming when there is too much data to be
written out, so limit the number of bytes per pixel to an architecture
specific value.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7086>