For some reason, CTS expects E5B9G9R9 and B10G11R11 with
transparent black border clamping to produce alpha 1 instead of 0.
Since border color takes precedence over the texture state swizzle,
the only way to fix this is to lower the texture swizzle in the shader
to set alpha to 1.
Fixes:
dEQP-VK.pipeline.sampler.view_type.*b10g11r11*clamp_to_border_transparent_black
dEQP-VK.pipeline.sampler.view_type.*e5b9g9r9*.clamp_to_border_transparent_black
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
It seems that we only want to set the texture state's depth to the
number of 2D layers divided by 6 when sampling, not when doing
load/store.
This means that we need to generate two different states and choose
the one to use depending on the descriptor.
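A minimal sketch of the rule above. The helper name and its parameters
are hypothetical, chosen for illustration, and are not the driver's
actual code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns the depth to program in the texture
 * state. Only the sampling state of a cube array counts cubes
 * (layers / 6); the load/store state keeps the raw 2D layer count.
 */
static uint32_t
texture_state_depth(uint32_t layer_count, bool is_cube_array,
                    bool for_sampling)
{
   if (is_cube_array && for_sampling) {
      /* A cube array always has a multiple of 6 layers. */
      assert(layer_count % 6 == 0);
      return layer_count / 6;
   }
   return layer_count;
}
```

With two such states generated per view, the descriptor type would
select which one gets used.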
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
For cube array views we want to divide the number of layers by 6.
Fixes:
dEQP-VK.pipeline.image_view.view_type.cube_array.*
dEQP-VK.pipeline.image.suballocation.sampling_type.combined.view_type.cube_array.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
In OpenGL, unnormalized coordinates are implicit based on the sampler
type (rectangle textures), so the compiler can set the flag when needed.
In Vulkan, however, this is configured explicitly in the sampler object,
so the compiler won't set it and we need to do it manually when we are
writing the P1 uniform.
Fixes:
dEQP-VK.pipeline.sampler.exact_sampling.*.unnormalized_coords
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
When translating combined depth/stencil blits to compatible color blits we
should look at the requested region aspects to decide the color
mask to apply.
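As an illustration only, the aspect-to-mask translation could look like
the sketch below. It assumes D24_UNORM_S8_UINT is reinterpreted as a
4x8-bit color format with stencil in the R channel and depth in G, B
and A; that channel layout is an assumption of this sketch, and the
function name is hypothetical. The enum values mirror the Vulkan aspect
and color component bits:

```c
#include <assert.h>
#include <stdint.h>

/* Same values as VK_IMAGE_ASPECT_{DEPTH,STENCIL}_BIT. */
enum { ASPECT_DEPTH = 0x2, ASPECT_STENCIL = 0x4 };
/* Same values as VK_COLOR_COMPONENT_{R,G,B,A}_BIT. */
enum { COLOR_R = 0x1, COLOR_G = 0x2, COLOR_B = 0x4, COLOR_A = 0x8 };

/* Hypothetical helper: pick the color channels to write based on the
 * aspects requested for the region. ASSUMED layout: stencil in R,
 * depth in G/B/A.
 */
static uint32_t
d24s8_aspects_to_color_mask(uint32_t aspects)
{
   assert((aspects & ~(uint32_t)(ASPECT_DEPTH | ASPECT_STENCIL)) == 0);

   uint32_t mask = 0;
   if (aspects & ASPECT_STENCIL)
      mask |= COLOR_R;
   if (aspects & ASPECT_DEPTH)
      mask |= COLOR_G | COLOR_B | COLOR_A;
   return mask;
}
```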
Fixes:
dEQP-VK.api.copy_and_blit.*.buffer_to_depthstencil.buffer_offset_d24_unorm_s8_uint_D
dEQP-VK.api.copy_and_blit.*.buffer_to_depthstencil.buffer_offset_d24_unorm_s8_uint_SD
dEQP-VK.api.copy_and_blit.*.buffer_to_depthstencil.buffer_offset_d24_unorm_s8_uint_S_D
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Currently, we end the current job whenever the user emits a
pipeline barrier, but we then expect to have a valid job when
we emit a draw call.
If by the time we have to emit a draw call we don't have a valid
job, we need to create one by resuming execution of the current
subpass.
Fixes some tests in:
dEQP-VK.renderpass.suballocation.attachment_allocation.input_output.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Instead of asserting that users don't try to create images that
would require 4GB+ of memory, error out with the corresponding
OOM error when the user tries to actually allocate the memory
for the image.
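A rough sketch of the check at allocation time. The result codes and
the function name are hypothetical stand-ins, and the 32-bit size limit
is taken from the 4GB figure above:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result codes standing in for the Vulkan ones. */
typedef enum { ALLOC_OK, ALLOC_ERROR_OOM } alloc_result;

/* Reject allocations of 4GB or more with an OOM error instead of
 * asserting at image creation time.
 */
static alloc_result
check_alloc_size(uint64_t size)
{
   assert(size > 0); /* Vulkan requires a non-zero allocation size. */
   if (size > UINT32_MAX)
      return ALLOC_ERROR_OOM;
   return ALLOC_OK;
}
```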
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
The first attribute must be active if using builtins.
This fixes a lot of simulator crashes for vertex input CTS tests.
Some of these tests still fail after this fix, so there may be
another bug.
Fixes crashes in:
dEQP-VK.pipeline.vertex_input.multiple_attributes.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This is really similar to the existing lower_tex_src_to_offset, but
for now we prefer to keep them independent, in case we start to
find image-specific use cases as we progress fixing CTS tests.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
The hardware doesn't have unorm/snorm pack/unpack variants and we were
already lowering the pack versions of these, so lower the unpack
versions too.
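For reference, this is the arithmetic that a lowering of the two
unpack operations has to reproduce, following the GLSL definitions of
unpackSnorm2x16 and unpackUnorm2x16. It is a plain C sketch of the
math, not the NIR lowering itself, and the function names are
illustrative:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* out[0] comes from the low 16 bits, out[1] from the high 16 bits. */
static void
unpack_snorm_2x16(uint32_t packed, float out[2])
{
   assert(out != NULL);
   for (int i = 0; i < 2; i++) {
      int16_t v = (int16_t)(uint16_t)(packed >> (16 * i));
      float f = (float)v / 32767.0f;
      /* -32768 maps slightly below -1.0, so clamp. */
      out[i] = f < -1.0f ? -1.0f : f;
   }
}

static void
unpack_unorm_2x16(uint32_t packed, float out[2])
{
   assert(out != NULL);
   for (int i = 0; i < 2; i++) {
      uint16_t v = (uint16_t)(packed >> (16 * i));
      out[i] = (float)v / 65535.0f;
   }
}
```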
Fixes:
dEQP-VK.glsl.builtin.function.pack_unpack.unpacksnorm2x16_compute
dEQP-VK.glsl.builtin.function.pack_unpack.unpackunorm2x16_compute
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
If the meta operation did not change descriptor state then we should keep it,
not reset it.
Fixes:
dEQP-VK.fragment_operations.early_fragment.early_fragment_tests_stencil
dEQP-VK.fragment_operations.early_fragment.no_early_fragment_tests_stencil
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
We are treating input attachments as a special case of texture, so the
commit is mostly about integrating them with the existing
SAMPLER/SAMPLER_IMAGE/COMBINED_IMAGE_SAMPLER infrastructure.
This commit doesn't make any special use of the render pass
information, including the dependencies, so it is possible that we
will need to do something else later. But it gets several CTS tests
and two Sascha Willems Vulkan demos working, so let's start with this
and handle any other use cases in follow-up commits.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
So far we were using nir->data.num_textures to fill the default values
for the textures used on the shader, and set the value for the number
of textures used.
But nir->data.num_textures doesn't take into account input
attachments, even after nir_lower_input_attachments. Although that
could make sense from a general point of view, in our case we are
treating input attachments mostly as textures.
This commit counts the number of textures by iterating through the
pipeline combined index map, as it includes both. This also makes
populating the shader key with default values more similar to what is
done at cmd_buffer with real values.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
When rasterization is disabled there are a number of CreateInfo
structs that should be ignored. We were managing this correctly
for some cases, but not all of them. Specifically, viewport state
must be ignored and we weren't doing that.
Fixes:
dEQP-VK.api.descriptor_set.descriptor_set_layout_lifetime.graphics
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Only free the underlying BO when the exported memory object is freed
to avoid multiple frees of the same memory.
The only exception is winsys BOs where we import a BO created in the
display device into the render device. In this case, we only have one
memory object referencing the BO and we want to destroy it with that
memory object.
Fixes:
dEQP-VK.api.external.memory.dma_buf.*
dEQP-VK.api.external.memory.opaque_fd.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
The hardware can't do this, so we need to record a CPU job that will
map the indirect buffer at queue submission time, read the dispatch
parameters and then submit a regular dispatch.
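A sketch of the CPU-side read, assuming the indirect buffer is already
mapped. The three consecutive 32-bit values follow the layout of
VkDispatchIndirectCommand; the struct and function names are
hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as VkDispatchIndirectCommand. */
struct dispatch_params {
   uint32_t x, y, z;
};

/* Hypothetical helper run by the CPU job at queue submission time:
 * copies the workgroup counts out of the mapped indirect buffer.
 * Returns false if the parameters don't fit in the mapped range.
 */
static bool
read_indirect_dispatch(const void *map, size_t map_size, size_t offset,
                       struct dispatch_params *out)
{
   assert(map != NULL && out != NULL);
   if (offset > map_size || map_size - offset < sizeof(*out))
      return false;
   memcpy(out, (const uint8_t *)map + offset, sizeof(*out));
   return true;
}
```

The caller would then record a regular dispatch with the values read.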
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
New jobs need to re-emit all state. Typically, this is achieved
by resetting all dirty state flags when we start a new job, but
for index buffers we were not using a dirty bit because we always
emit them immediately. This patch adds the bit and only tries
to skip index buffer state if the bit is not dirty, which will
ensure that we will always emit it for new jobs.
This fixes a regression in the shadowmapping demo from Sascha Willems
introduced with "v3dv: try harder to skip emission of redundant state".
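The dirty bit scheme described above can be sketched as follows. All
names here are illustrative and do not match the driver's structures:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIRTY_INDEX_BUFFER (1u << 0)
#define DIRTY_ALL          (~0u)

struct cmd_state {
   uint32_t dirty;              /* bitmask of state to re-emit */
   uint32_t bound_index_buffer; /* hypothetical buffer handle */
   unsigned emit_count;         /* how many times we emitted it */
};

/* A new job has no state at all, so everything becomes dirty. */
static void
start_new_job(struct cmd_state *s)
{
   assert(s != NULL);
   s->dirty = DIRTY_ALL;
}

static void
bind_index_buffer(struct cmd_state *s, uint32_t handle)
{
   assert(s != NULL);
   if (s->bound_index_buffer != handle) {
      s->bound_index_buffer = handle;
      s->dirty |= DIRTY_INDEX_BUFFER;
   }
}

/* Skip emission only when the bit is clean. */
static void
emit_index_buffer_if_needed(struct cmd_state *s)
{
   assert(s != NULL);
   if (!(s->dirty & DIRTY_INDEX_BUFFER))
      return;
   s->emit_count++;
   s->dirty &= ~DIRTY_INDEX_BUFFER;
}
```

Rebinding the same buffer within a job is skipped, while starting a new
job forces a re-emit even if the binding did not change.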
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Focused especially on the cache: how many BOs and how much bo_size are
stored in the cache, freed BOs, BOs moved to the cache, etc.
Initially not configured with V3D_DEBUG (like v3d) to avoid a runtime
check on most of v3dv_bo functions.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
V3DV_MAX_BO_CACHE_SIZE can be used to configure it.
So one way to disable the bo cache is setting V3DV_MAX_BO_CACHE_SIZE
to zero. This would still run all the bo_cache size checks, but adding
another envvar just to ensure that nothing related to the cache is
used seemed like overkill.
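A sketch of how the variable could be read. The parsing policy (plain
decimal number, fall back to the default on anything else) and the
function names are assumptions of this example, and the unit of the
value is left to the caller:

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parser: returns default_size when the value is unset
 * or not a plain decimal number. "0" is valid and disables caching.
 */
static uint64_t
parse_max_cache_size(const char *value, uint64_t default_size)
{
   if (value == NULL || !isdigit((unsigned char)value[0]))
      return default_size;

   errno = 0;
   char *end = NULL;
   unsigned long long parsed = strtoull(value, &end, 10);
   assert(end != NULL); /* strtoull always sets a non-NULL endptr. */
   if (errno != 0 || *end != '\0')
      return default_size;
   return (uint64_t)parsed;
}

static uint64_t
max_cache_size_from_env(uint64_t default_size)
{
   return parse_max_cache_size(getenv("V3DV_MAX_BO_CACHE_SIZE"),
                               default_size);
}
```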
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Heavily based on the existing one in the v3d OpenGL driver, but
without references, and with some extra OOM checks (Vulkan CTS has
several OOM tests).
With this commit v3dv_bo_alloc and v3dv_bo_free become frontends to
the bo_cache. The former tries to get a BO from the cache if possible,
and the latter stores the BO in the cache if possible. The former also
takes a new parameter indicating whether the BO to allocate is private.
Like v3d, we only cache private BOs, those created by the driver
for internal use (like CLs, tile_alloc, etc). They are the ones with
the highest chance of being reused (for example, CL BOs are always
4KB, so they can always be reused). User-created BOs can have any
size, including some very large ones for buffers and images, which
makes them far less likely to be reused and would add a lot of memory
pressure if we decided to cache them.
In any case, in practice, we found that we could get a performance
improvement by also caching user-created BOs, but that would need more
care and an analysis to decide which ones make sense. It would also
require changing how the cached BOs are stored by size. Right now
they are kept in an array of list_head, which doesn't work well with
big BOs. If done, that would be handled in a separate commit.
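A simplified sketch of the alloc/free frontends over a cache bucketed
by size in pages. Everything here is illustrative: the names, the
bucket count, the singly linked buckets and the use of calloc/free in
place of real BO creation are assumptions of this example, not the
driver's implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BO_PAGE_SIZE 4096u
#define MAX_BUCKETS  16u /* hypothetical: caches BOs up to 64KB */

struct bo {
   uint32_t size;
   bool is_private;
   struct bo *next; /* link inside a cache bucket */
};

struct bo_cache {
   struct bo *buckets[MAX_BUCKETS]; /* bucket i holds (i + 1)-page BOs */
   uint64_t cached_size;
   uint64_t max_size; /* 0 disables caching */
};

static struct bo *
bo_alloc(struct bo_cache *cache, uint32_t size, bool is_private)
{
   assert(cache != NULL);
   if (size == 0 || size > UINT32_MAX - (BO_PAGE_SIZE - 1))
      return NULL;
   size = (size + BO_PAGE_SIZE - 1) & ~(BO_PAGE_SIZE - 1);

   uint32_t idx = size / BO_PAGE_SIZE - 1;
   if (is_private && idx < MAX_BUCKETS && cache->buckets[idx]) {
      /* Reuse a cached BO of exactly this size. */
      struct bo *bo = cache->buckets[idx];
      cache->buckets[idx] = bo->next;
      cache->cached_size -= bo->size;
      bo->next = NULL;
      return bo;
   }

   struct bo *bo = calloc(1, sizeof(*bo)); /* stand-in for BO creation */
   if (bo == NULL)
      return NULL; /* the caller reports the OOM */
   bo->size = size;
   bo->is_private = is_private;
   return bo;
}

static void
bo_free(struct bo_cache *cache, struct bo *bo)
{
   assert(cache != NULL && bo != NULL);
   uint32_t idx = bo->size / BO_PAGE_SIZE - 1;
   if (bo->is_private && idx < MAX_BUCKETS &&
       cache->cached_size + bo->size <= cache->max_size) {
      bo->next = cache->buckets[idx];
      cache->buckets[idx] = bo;
      cache->cached_size += bo->size;
      return;
   }
   free(bo); /* user BOs and overflow are really freed */
}
```

The fixed number of per-size buckets also shows why this layout is a
poor fit for large BOs: anything beyond the last bucket is never
cached.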
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Both the API user and the driver may attempt to map a BO, possibly
only partially and using different ranges. This is a problem because
we only have a single map per BO. Fix this by making sure that when
a BO is mapped, we always map its entire range. This way if a BO
has been mapped before, we know that map is still valid no matter the
region we need to access now.
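A small sketch of the approach, with calloc standing in for the real
mapping call and hypothetical names throughout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct bo {
   size_t size;
   void *map; /* single map per BO, always covering [0, size) */
};

/* Map the whole BO once; later calls reuse the same map. */
static void *
bo_map(struct bo *bo)
{
   assert(bo != NULL);
   if (bo->map == NULL)
      bo->map = calloc(1, bo->size); /* stand-in for the real mapping */
   return bo->map;
}

/* Partial requests are served from the full map, so any earlier map
 * remains valid regardless of the range asked for now.
 */
static void *
bo_map_range(struct bo *bo, size_t offset, size_t length)
{
   assert(bo != NULL);
   assert(offset <= bo->size && length <= bo->size - offset);
   uint8_t *base = bo_map(bo);
   return base ? base + offset : NULL;
}
```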
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
The blit shader path for buffer to image copies is pretty bad,
since it needs to produce a tiled image from the linear buffer
prior to emitting the blit copy.
This patch adds a new preferred path where we implement the
copy using the CPU, similar to what the GL driver does for
texture uploads. This makes vkQuake2 at least 4x faster when
dynamic lights are enabled (which triggers dynamic texture
updates).
We also tested a GPU path where we use a shader that takes the
linear buffer as a UBO and copies directly from it. This also
shows a clear performance gain, but still worse than the CPU
implementation.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
We had done all the plumbing for this but EZ can be disabled in 3 places
and we were never setting the enable bit in the configuration bits packet.
Also, configuration bits must not enable EZ if this has been disabled in
the RCL for the whole frame, which we do if we don't have a depth
attachment at all.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Otherwise cloned BO lists point to the original list objects and not
the cloned ones, and that will confuse anything that tries to iterate over
them, such as list_length(), leading to infinite loops.
Fixes (in debug mode):
dEQP-VK.api.command_buffers.render_pass_continue
In that test we clone a full CL job from a secondary, and without this,
the BO lists in its CL lists will point to the bo_list field in the
original job, leading to an infinite loop as we assert the expected size
of these lists at queue submit time in handle_cl_job.
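To illustrate the problem: with circular lists, a shallow copy of a
list head still points into the original chain, whose last node links
back to the original head, so iterating from the copy never reaches
its own head. One way to clone correctly is to initialize the new head
and link copies of the entries to it. This is a self-contained sketch
with hypothetical names, not the driver's list helpers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct list_head {
   struct list_head *prev, *next;
};

static void
list_init(struct list_head *head)
{
   head->prev = head;
   head->next = head;
}

static void
list_addtail(struct list_head *item, struct list_head *head)
{
   item->next = head;
   item->prev = head->prev;
   head->prev->next = item;
   head->prev = item;
}

/* Terminates only if the chain links back to this exact head. */
static unsigned
list_length(const struct list_head *head)
{
   unsigned n = 0;
   for (const struct list_head *it = head->next; it != head; it = it->next)
      n++;
   return n;
}

struct bo_entry {
   struct list_head link; /* must stay first for the casts below */
   uint32_t handle;
};

/* Build an independent list on dst instead of copying src's pointers.
 * Returns 0 on success, -1 on allocation failure.
 */
static int
clone_bo_list(struct list_head *dst, const struct list_head *src)
{
   assert(dst != NULL && src != NULL);
   list_init(dst);
   for (const struct list_head *it = src->next; it != src; it = it->next) {
      const struct bo_entry *orig = (const struct bo_entry *)it;
      struct bo_entry *copy = malloc(sizeof(*copy));
      if (copy == NULL)
         return -1;
      copy->handle = orig->handle;
      list_addtail(&copy->link, dst);
   }
   return 0;
}
```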
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>