Basically just reserve the memory in the descriptor sets.
On the shader side we construct a buffer descriptor, since
AFAIU VGPR indexing on 32-bit pointers in LLVM is still broken.
This fully supports update after bind and variable descriptor set
sizes. However, the limits are somewhat arbitrary and are mostly
about finding a reasonable division of a 2 GiB max memory size over
the set.
v2: - rebased on top of master (Samuel)
- remove the loading resources rework (Samuel)
- only load UBO descriptors if it's a pointer (Samuel)
- use LLVMBuildPtrToInt to avoid IR failures (Samuel)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v2)
Sampler descriptors don't have a buffer list.
This fixes some crashes with new CTS
dEQP-VK.binding_model.descriptor_copy.*.sampler_*.
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We shouldn't increment the buffer list pointers twice.
This fixes some crashes with new CTS
dEQP-VK.binding_model.descriptor_copy.*.
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
For example, if a pipeline has two stages VS and FS. And if only
the fragment stage needs dynamic bindings, we shouldn't allocate
an extra user SGPR for the vertex stage. Of course, if the vertex
stage loads constants, it needs an user SGPR.
This should reduce the number of SET_SH_REG packets that are emitted.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
pool->size is a 64-bit unsigned integer too.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This reverts commit 90819abb56.
This logic was wrong, the original code is correct. The direct
impact is that we allocate up to approximately a squared amount
of memory compared to what we should allocate.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
The size has to be multiplied by the number of sets.
This gets rid of the OUT_OF_POOL_KHR error and fixes
a crash with the Tangrams demo.
CC: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We mostly use the same priority for all buffer objects, so
I don't think that matter much. This should reduce CPU
overhead a little bit.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Errors are not that common of a case so we can eat a slight perf
hit in having to call a function and do a runtime check.
In turn this makes debugging random errors happening for end users
easier, because they don't have to have a debug build on hand.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
In order to reduce a performance regression introduced by
4b13fe55a4 ("radv: Keep a global BO list for VkMemory."),
we are going to maintain two different paths.
One when VK_EXT_descriptor_indexing is enabled by the
application because we need to have a global BO list, and
one (the old one) when it's not enabled.
With Talos on Polaris, the global BO list reduces performance
by 10% which is too much for me.
This reverts commit ab6cadd3ec.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The continue means we do alignment differently than during creation,
making the buffer smaller than expected.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Previously we did not care about havin the set storage in order,
but for variable descriptor count we want the highest binding
at the end of the storage.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
The spec is pretty clear that this can be 0, and that it operates
as a reserved binding.
Fixes:
dEQP-VK.binding_model.descriptor_update.empty_descriptor.uniform_buffer
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This allows to update them with only one memcpy().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
The next patch will try and avoid calling the indirect function.
v2: add a missing conversion.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
It appears the latest dota2 vulkan uses this,
and we get a hang in VR mode without it.
v2: remove finishme I left in after finishing.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The vram_list linked list resulted in lots of pointer chasing.
Replacing this with an array instead improves descriptor set
allocation CPU usage by 3x at least (when also considering the free),
because it had to iterate through 300-400 sets on average.
Not a huge improvement as the pre-improvement CPU usage was only
about 2.3% in the busiest thread.
Reviewed-by: Dave Airlie <airlied@redhat.com>
This uses the new kernel interfaces for reduced cs overhead,
We only set the local flag for memory allocations that don't have
a dedicated allocation and ones that aren't imports.
v2: add to all the internal buffer creation paths.
v3: missed some command submission paths, handle 0/empty bo lists.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We are really not going to use a winsys which does not need to store
the va, so might as well store it in a standard field.
Not sure this helps perf much though, as most of the cost is in the
cache miss accessing the bo anyway, which we stil need to do.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
If a cube image has VK_IMAGE_USAGE_STORAGE_BIT set, the type in an image
view's descriptor was set to a 2D array (and a few other fields adjusted
accordingly). This is correct when the image view is actually bound as a
storage image, but not when bound as a sampled image. In that case the
type should be set as a cube.
Fix by generating 2 sets of descriptors at view creation time for both
storage and non-storage usage, and then choose between them based on
descriptor type when writing descriptor sets.
v2: Generate storage descriptors for images with TRANSFER_DST, since
those may be used as storage images internally.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This free was left in after dynamic descriptors were changed to not be
allocated separately from the descriptor set, and can cause a crash.
Fixes: 39644fa40a ("radv: Don't allocate dynamic descriptors separately")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Since the host pool changes,
Fixes:
dEQP-VK.api.descriptor_pool.out_of_pool_memory
Fixes: 126d5ad "radv: Use host memory pool for non-freeable descriptors."
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
With this we don't have any operations on a pool with non-freeable
descriptors left that have O(#descriptors) complexity.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Bas Nieuwenhuizen <basni@google.com>
v2: Handle out of pool memory error.
v3: Actually use VK_ERROR_OUT_OF_POOL_MEMORY_KHR for the error condition.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Bas Nieuwenhuizen <basni@google.com>
Makes more sense when we hash the layout for the pipeline cache.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
All offsets and strides are precomputed by
radv_CreateDescriptorUpdateTemplateKHR and stored in the template.
v2: Move the new struct declarations from radv_descriptor_set.h
to radv_private.h (Bas)
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Replace the !binding_layout->immutable_samplers assertion in
radv_update_descriptor_sets with a conditional.
The Vulkan specification does not say that it is illegal to update
a sampler descriptor when it is immutable; only that pImageInfo is
ignored.
This change is also needed for push descriptors, because valid
descriptors must be pushed for all bindings accessed by shaders,
including immutable sampler descriptors.
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Move the implementation into a separate function that takes a
cmd_buffer and a dstSetOverride parameter.
When cmd_buffer is not NULL, radv_update_descriptor_sets calls
cs_add_buffer directly instead of updating the buffer list.
This will be used to implement VK_KHR_push_descriptor.
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Rather than having an extra memory allocation [that we currently do not
and act accordingly] just make the API take an pointer to a stack
allocated instance.
This and follow-up steps will effectively make the _mesa_sha1_foo simple
define/inlines around their SHA1 counterparts.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
So we don't need to know about radv_sampler in ac_nir_to_llvm.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This one only keeps allocated memory in the list, and list nodes
in the descriptor sets. Thsi doesn't need messing around with
max_sets, and we get automatic merging of free regions.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
We only use the freed ones after all free space has been used. If
the app only allocates small descriptor sets, we might go over
max_sets before the memory is full.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
Fixes: f4e499ec79
There is a new error code in Maintenance1 that is more specific to the
situation: VK_ERROR_OUT_OF_POOL_MEMORY_KHR
Fixes CTS test case:
dEQP-VK.api.descriptor_pool.out_of_pool_memory
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This moves to the shared vk_alloc inlines for vulkan
memory allocations.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This squashes all the radv development up until now into
one for merging.
History can be found:
https://github.com/airlied/mesa/tree/semi-interesting
This requires llvm 3.9 and is in no way considered
a conformant vulkan implementation. It can run a number
of vulkan applications, and supports all GPUs using
the amdgpu kernel driver.
Thanks to Intel for providing anv and spirv->nir,
and Emil Velikov for reviewing build integration.
Parts of this are:
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Authors: Bas Nieuwenhuizen and Dave Airlie
Signed-off-by: Dave Airlie <airlied@redhat.com>