We incorrectly utilize the stencil layouts structures even if we
should stick to the depth_stencil ones if the layout includes stencil.
v2: Don't forget stencil only layout (Nanley)
Simplify callers of new helper functions (Nanley)
v3: Store VK_IMAGE_LAYOUT_UNDEFINED when no stencil is available (Nanley)
Use a switch statement (Nanley)
v4: Consider all layouts but depth only to be potential stencil layouts (Lionel)
v5: Refactor helper in vk_image_layout_depth_only() and discard
VkAttachmentDescriptionStencilLayoutKHR in
VkAttachmentDescription2KHR if format is not depth/stencil.
v5: s/LAYOUT_COLOR_ATTACHMENT_OPTIMAL/LAYOUT_DEPTH_ATTACHMENT_OPTIMAL/ (Nanley)
v6: Fix overly harsh assert()
Fixes: c1c346f166 ("anv: implement VK_KHR_separate_depth_stencil_layouts")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4084
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8475>
Saves emitting a MOV at the end of the program to store the output.
softpipe glmark2 -b buffer +9.73451% +/- 3.17924% (n=6)
softpipe glmark2 -b build +5.57621% +/- 1.35074% (n=9)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8023>
Fixes several dEQP-VK.robustness.robustness2.* tests on GFX8. Generations
other than GFX8 don't fail the tests because bounds-checking is done using
the index (making it per-vertex).
fossil-db (Polaris):
Totals from 1387 (0.99% of 140385) affected shaders:
(no statistics affected)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 03a0d39366 ("aco: use MUBUF in some situations instead of splitting vertex fetches")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7834>
We check below if HTILE is in compressed state, so checking if
the image has HTILE is useless because radv_layout_is_htile_compressed()
will return FALSE if no HTILE.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8579>
This is already checked in radv_handle_depth_image_transition().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8579>
Expand the list of bind flags to cache depth and stencil
buffers.
Extensive testing shows the following performance improvements:
Game | % difference in FPS
Plague Inc | 7
Portal 2 | 21
Overcooked 2 | 1.2
Hollow Knight | -1.1
Civilization V | 3.8
Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8560>
From comment in emit_mimg():
We don't need the bias, sample index, compare value or offset to be
computed in WQM but if the p_create_vector copies the coordinates, then it
needs to be in WQM.
fossil-db (GFX10.3):
Totals from 1778 (1.28% of 139391) affected shaders:
SGPRs: 105080 -> 105072 (-0.01%); split: -0.02%, +0.01%
VGPRs: 96800 -> 96776 (-0.02%); split: -0.07%, +0.05%
CodeSize: 10001120 -> 10001384 (+0.00%); split: -0.04%, +0.04%
MaxWaves: 18164 -> 18163 (-0.01%)
Instrs: 1883750 -> 1883598 (-0.01%); split: -0.06%, +0.05%
Cycles: 34800176 -> 34767840 (-0.09%); split: -0.10%, +0.01%
We don't have a p_create_vector if we use NSA.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
If tess consts aren't used they don't get included in constlen,
and we risk overrunning consts of the next stage.
Fixes:
dEQP-VK.tessellation.invariance.outer_edge_index_independence.quads_fractional_even_spacing_ccw
dEQP-VK.tessellation.invariance.outer_triangle_set.quads_fractional_odd_spacing
dEQP-VK.tessellation.invariance.primitive_set.isolines_fractional_odd_spacing_ccw
dEQP-VK.tessellation.invariance.primitive_set.quads_fractional_odd_spacing_cw
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4117
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8578>
We don't support them by hw, so we would need to get them
lowered. This fix some crashes while using renderdoc with UE4 shooter
demo traces.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8582>
Currently, r600/nir doens't have a proper scheduler or optimizer backend,
to be able to make use of this code path without performance regressions,
we enable the sb optimizer also for NIR.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8563>
Some emulated fp64 piglit create code that seems to make the register
allocation with sb worse than the original shader created from NIR, so
fall back to using the un-optimized shader.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8563>
When trying to blit using the TFU, as we are doing an exact copy with no
conversions, we can choose a supported format that is compatible with the
underlying format's texel size.
This allows to use the TFU to blit formats that are not supported, like
r8ui or r16ui.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8495>
This reverts commit aca67a555c, which
regressed the following Piglit test on i915 (and presumably r200):
piglit/spec/!opengl 1.1/sized-texture-format-channels
Specifically, it begins testing glTexImage2D with format GL_RGBA,
type GL_FLOAT, and internalFormat GL_RGB16F, which leads to the
following error:
Mesa 21.0.0-devel implementation error: unexpected format GL_RGB16F in _mesa_choose_tex_format()
Please report at https://gitlab.freedesktop.org/mesa/mesa/-/issues
sized-texture-format-channels: ../../src/mesa/main/teximage.c:2836: _mesa_choose_texture_format: Assertion `f != MESA_FORMAT_NONE' failed.
i915 and r200 unconditionally support ARB_half_float_pixel, but neither
support RGB16F as an internal format. According to Ian's rationale
in the commit message for 1edca151a0
(which enabled that extension for all drivers):
"This extension only adds data types that can be passed to, for
example, glTexImage2D. It does not add internal formats. Since
you can already pass GL_FLOAT to glTexImage2D this shouldn't pose
any additional issues with those drivers. Note that r200 and i915
already supported this extension, and they don't support
floating-point textures either."
So, commit aca67a55c011 enabled half-float internal formats on hardware
that cannot support them. We should revert the change.
v2: Don't reintroduce the _mesa_is_gles3() condition, as that shouldn't
be necessary (feedback from Erik Faye-Lund).
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8458>
_tnl_draw_prims loops over the prims, and for each one, maps the VBOs,
draws, and unmaps them. But it failed to reset nr_bos = 0 between each
loop iteration, which meant that when processing prim[n], the BO list
had all BOs for prior primitives too. Assuming each primitive used the
same VBOs, that means the same VBO would appear in the list multiple
times, and it would try to unmap the same BO multiple times. This
triggered asserts on the second unmap, as it had already been unmapped.
Fixes Piglit's oes_draw_elements_base_vertex-multidrawelements on i915.
Fixes: e99e7aa4c1 ("mesa: switch Draw(Range)Elements(BaseVertex) calls to DrawGallium")
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8522>
Prior to commit e99e7aa4c1 (mesa: switch
Draw(Range)Elements(BaseVertex) calls to DrawGallium), the indices
parameter of glDrawElements (an offset into the VBO) was handled by
setting ib->ptr. With that commit, it instead began binding the
index buffer with offset 0, and adding the offset to draw->start,
which eventually becomes prim->start.
t_draw.c's bind_indices() was trying to convert the relevant section of
the index buffer to GLuints, but was failing to account for start, so
it nabbed the wrong portion of the index buffer.
Fixes: e99e7aa4c1 ("mesa: switch Draw(Range)Elements(BaseVertex) calls to DrawGallium")
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8522>
Commit 4c751ad67a (vbo/dlist: use a shared
index buffer) caused multiple draws to use the same index buffer, and
began setting the primitive's `start` field to the offset needed to
access the right portion of the index buffer.
Unfortunately, t_rebase_prims completely botches handling this case.
Say for example we had start = 40, min_index = 6, max_index = 11.
The actual indexes in the buffer are ib[40..45]. t_rebase_prims,
however, would allocate an index buffer containing only 6 elements,
and populate them with <ib[0..5] - min_index>. For one, this reads
the wrong source data, leading to garbage index values. For another,
it stores the new index buffer in the wrong spot, so drawing will try
and read elements [40..45] of an array of length 6, and crash.
This patch makes t_rebase_prims allocate a larger index buffer, with
the blank space at the beginning, and try to copy the correct section
of index buffer data over. This only works if `start` is the same for
all primitives, however, so if we detect different ones, we recurse
to rebase and call draw() separately for each different start value.
Fixes: 4c751ad67a ("vbo/dlist: use a shared index buffer")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4082
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8522>
We only convert line strips to lines in certain cases, but were flagging
node->merged.prim as GL_LINES even if we simply copied a GL_LINE_STRIP
prim[0] over without modifying it.
Fixes Piglit's lineloop test (which triggers loop -> strip conversion
earlier in this path, then was incorrectly triggering strip -> list
mode modification with no changes to the underlying data).
Fixes: 310991415e ("vbo/dlist: implement primitive merging")
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8522>
I'm can't see why this is necessary. There are already new fields
(node->merged.{min,max}_index) for the new values in the merged case.
But in vbo_save_draw.c, in the !draw_using_merged_prim case, we would
try and use the original node...with the now destroyed min/max index.
Fixes some assert failures when running with swtnl and forcing the
non-merged path (though it takes the merged path by default).
Fixes: 4c751ad67a ("vbo/dlist: use a shared index buffer")
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8522>
Pulls in test fixes for rasterpos on softpipe and for
simple-barrier-atomics on freedreno. Note that many xfail-vs-xskips end
up changing, because apparently the
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8499>
Less main-specific code when we're pulling in util/ formats anyway. Drops
about 20kb from Mesa drivers.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8532>
This was the only code using the "get a pack-a-pixel function pointer"
generated code, switch it over to using util/format's.
This does mean some more format desc lookups in the loop, but this code is
only accessible from classic driver swrast fallbacks at this point, where
GPU read perf completely dominates the profile anyway.
basically no change to driver size.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8532>
In some rare cases, L2 needs to be flushed if an image is affected
by the pipe misaligned issue. This is roughly based on AMDVLK.
I confirmed that disabling TC-compat HTILE, and respectively DCC,
for the relevant images also fixes the regressions below.
This fixes some regressions introduced with L2 coherency for
dEQP-VK.renderpass2.depth_stencil_resolve.image_2d_* and for
dEQP-VK.renderpass2.suballocation.multisample_resolve.*.
Fixes: 4a783a3c78 ("radv: Use L2 coherency on GFX9+.")
Co-Authored-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8557>
The driver used to invalidate the vector cache for meta operations
but this has been removed and I think it should be restored to fix
a bunch of regressions on GFX8.
This probably needs to be cleaned up but this is a hotfix.
This fixes a bunch of regressions and flakes on GFX8 like
dEQP-VK.pipeline.multisample.sample_locations_ext.draw.color.samples_4.*.
Fixes: 8f8d72af55 ("radv: Use access helpers for flushing with meta operations.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8573>
I don't know why this wasn't enabled but I think it should be.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8562>