To remove bubbles during layout transitions from UNDEFINED, especially
with MSAA because we might have all.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10004>
From the Vulkan spec 1.2.172:
"If there is no subpass dependency from VK_SUBPASS_EXTERNAL to the
first subpass that uses an attachment, then an implicit subpass
dependency exists from VK_SUBPASS_EXTERNAL to the first subpass
it is used in."
"Similarly, if there is no subpass dependency from the last subpass
that uses an attachment to VK_SUBPASS_EXTERNAL, then an implicit
subpass dependency exists from the last subpass it is used in to
VK_SUBPASS_EXTERNAL."
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9714>
It's not enabled by default because it requires performance testing.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9919>
texture_index is meaningless when a tex_instr has deref src.
Use var->data.binding instead.
CC: mesa-stable
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9931>
texture_index is meaningless when a tex_instr has deref src.
Use var->data.binding instead.
This fixes the incorrect lowering on radeonsi where the same
lowering steps was applied to all tex_instr based on the needs
of the first one (since texture_index is always 0).
CC: mesa-stable
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9931>
Since RADV requires DRM 3.35+, this code can be simplified.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9953>
From the Vulkan spec:
"VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL specifies a
layout for both the depth and stencil aspects of a depth/stencil
format image allowing read only access as a depth/stencil
attachment or in shaders as a sampled image, combined
image/sampler, or input attachment. It is equivalent to
VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL and
VK_IMAGE_LAYOUT_STENCIL_READ_ONLY_OPTIMAL."
So, it should be safe to keep HTILE compressed if the depth/stencil
image isn't going to be sampled. We could probably extend this
to separate depth/stencil layout but that seems a bit more
complicated.
This gives a huge boost to the deferredmultisampling Vulkan demo.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10008>
A break/continue in a loop is typically emitted like this:
if (cond) {
break/continue;
} else {
}
If cond is uniform, we'll emit code for a uniform if statement and
that will emit a branch right before the if to jump directly to the
else (or the block after the else in this case, since the else is
empty) in case cond evaluates to false. This means we end up emitting
two consecutive branch instructions, one before the if and one for the
THEN block right after:
branch(!cond) -> jump to else (or after else) if cond is false
nop
nop
nop
branch -> unconditional jump to break/continue
nop
nop
nop
Instead, if we are in this scenario, we can do better by emitting the
conditional jump directly and avoiding the "jump to else" case:
branch(cond) -> jump to break/continue if cond is true
nop
nop
nop
We need to be careful when emitting the break/continue for the case
where all lanes are disabled to avoid infinite loops: if we have a
break we always want to take the jump, but we don't want to take it
if it is a continue.
total instructions in shared programs: 13563672 -> 13557348 (-0.05%)
instructions in affected programs: 348034 -> 341710 (-1.82%)
helped: 1158
HURT: 10
Instructions are helped.
total uniforms in shared programs: 3779137 -> 3777535 (-0.04%)
uniforms in affected programs: 90583 -> 88981 (-1.77%)
helped: 1169
HURT: 0
Uniforms are helped.
total max-temps in shared programs: 2317670 -> 2317575 (<.01%)
max-temps in affected programs: 1943 -> 1848 (-4.89%)
helped: 85
HURT: 4
Max-temps are helped.
total sfu-stalls in shared programs: 32247 -> 32247 (0.00%)
sfu-stalls in affected programs: 69 -> 69 (0.00%)
helped: 7
HURT: 9
Inconclusive result (value mean confidence interval includes 0).
total inst-and-stalls in shared programs: 13595919 -> 13589595 (-0.05%)
inst-and-stalls in affected programs: 350674 -> 344350 (-1.80%)
helped: 1154
HURT: 11
Inst-and-stalls are helped.
total nops in shared programs: 358202 -> 354325 (-1.08%)
nops in affected programs: 17367 -> 13490 (-22.32%)
helped: 1168
HURT: 1
Nops are helped.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9948>
Earlier, I just tried to copy what iris was doing and, as it turns out,
copied it wrong. Also, Vulkan doesn't have a concept of getting the
conservative coverage in the shader. The spec for SampleMask says:
"Decorating a variable with the SampleMask built-in decoration will
make any variable contain the coverage mask for the current fragment
shader invocation."
And the spec for conservative rasterization says
"When overestimate conservative rasterization is enabled, rather
than evaluating coverage at individual sample locations, a
determination is made of whether any portion of the pixel (including
its edges and corners) is covered by the primitive. If any portion
of the pixel is covered, then all bits of the coverage mask for the
fragment corresponding to that pixel are enabled."
Putting these two together and you get what the Intel HW docs say for
ICMS_NORMAL:
"Input Coverage masks based on inner conservatism and factors in
SAMPLE_MASKs. If Pixel is conservatively fully covered all samples
are enabled."
So I'm pretty sure based on this that the right thing to do here is to
ignore conservative rasterization and leave it set to ICMS_NORMAL
whenever we're not in the post-depth-coverage special case.
While we're here, fix the silly indentation.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4565
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d5b56debde "anv: Implement VK_EXT_conservative_rasterization"
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10017>
Similar to 641320ce02 and 68bb26af63 to avoid Android build errors
Fixes the following building error:
In file included from external/mesa/src/vulkan/util/vk_util.c:28:
external/mesa/src/vulkan/util/vk_util.h:248:30: error: implicit declaration of function 'ffs' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
return (gl_shader_stage) (ffs((uint32_t) vk_stage) - 1);
^
1 error generated.
Fixes: 06ebbde630 ("vulkan: Deduplicate mesa stage conversion")
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10023>
Across every driver...
v2: Add casts to appease -fpermissive used on CI.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9477>
No more progress loop needed. I'm skeptical we really want a dataflow
approach long-term, though, this is annoyingly expensive.
No shader-db changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9421>
We need to update `replacement` with the results of copyprop earlier
within the pass, but after that there's no point running more than once
if we're not going to materialize any new moves.
No shader-db changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9421>
Compressed formats may have compatible formats, however they could
only be sampled, so we should not call tu6_format_color with them.
tu6_format_texture should have the same behaviour for checking swap
so use it for all cases.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10009>
Each call argument in the trace dump should be a separate <arg>
element. In this case the trace_dump_struct_array() needs to have
trace_dump_arg_begin() and trace_dump_arg_end() surrounding it,
otherwise it becomes just random data inside the parent call.
Also fix another case where one <arg> element erraneously contains
another <arg> element by moving trace_dump_arg_end() to its
proper place.
Signed-off-by: Matti Hamalainen <ccr@tnsp.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9994>
The number of source components can be deternmined based on the number
of dest components, and copying other components to the gradient get
call is not required.
Closes#4557
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10011>
Works around the other Missing case we've seen in CI, possibly fixes the
underlying issue, and adds support for comments in xfails lists.
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9806>