We have some features that seem to slow things down or cause other
possible undesireable side effects, but it would be nice to test
games etc with them easily.
I forsee multisample DCC and maybe some shader opt changes using this.
For now use it for batch chaining.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This just moves lots of stuff to the bind stage rather than
dealing with it in the draw stage.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
There is no need to calculate this at draw time.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
I want to use these in the pipeline setup stage.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We have Vulkan utilities in both src/util and src/vulkan/util. The
latter seems a more appropriate place for Vulkan-specific things, so
move them there.
v2: Android build system changes (from Tapani Pälli)
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
This adds some rb+ support, as on GFX9 we have to disable
it as per radeonsi.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
GFX9 needs to write event EOP to a fence buffer, allocate some
space for this, and just write an ever increasing number to it,
this isn't exactly what radeonsi does, but it seems to work.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is mostly mechanical changes of renaming types and introducing
"legacy" everywhere.
It doesn't use the ac_surface computation functions yet.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Like radeonsi. This saves memory, and the information can easily be
recomputed on the fly where necessary.
Reviewed-by: Dave Airlie <airlied@redhat.com>
For GFX9 these will be needed to be 64-bit, so bump them early,
to avoid it causing any wierdness later.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
In prep for GFX9 refactor some of the eop event writing code
out.
This changes behaviour, but aligns with what radeonsi does,
it does double emits on CIK/VI, whereas previously it only
did this on CIK.
v2: bump the size checks.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This code was in a few places, consolidate into one.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
And correct implementation to specify only what we support.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas pointed out the fs key doesn't take srgb into account,
since there is just one srgb variant, just create a separate
pipeline for it. This also uses dest format to be more consistent
on when srgb matters.
Fixes: 69136f4e63 "radv/meta: add resolve pass using fragment/vertex shaders"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The color clear value is uniform and needs only to be emitted from
the frag shader, so just push it down via a push constant,
and remove the vertex buffer completely.
The depth clear value needs to be emitted from the vertex
shader, but is only a single value.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This drops the resolve workarounds that change an image
tiling mode behinds it's back, this is horrible and breaks
the image_view->image relationship. Remove all this.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
In order to resolve into DCC enabled dests we need to use
the fragment shader. This reuses the code from the compute
path and implements a resolve path in vertex/fragment shader.
This code isn't used until later.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This adds a path to allow compute resolves to be used
for subpass resolves.
This isn't used yet, but will be later.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If we are resolving into an srgb dest, we need to convert
to linear so the store does the conversion back.
This should fix some wierdness seen when we subresolves
hit the compute path.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This still doesn't give us complete pWaitDstStageMask support,
but it should provide enough to be correct if not as efficent as
possible.
If we have wait semaphores we must flush between submits and
flush the shaders as well.
This fixes the remaining fails in:
dEQP-VK.synchronization.op.single_queue.semaphore.*ssbo*
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is to rework the surface code like radeonsi.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This just makes it easier to do the follow in cleanups of the surface.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Gives me approximately a 2% perf increase in bot dota2 & talos.
Having descriptors (both sets and vertex buffers) prefetched
didn't help so I didn't include that.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
v2: Handle out of pool memory error.
v3: Actually use VK_ERROR_OUT_OF_POOL_MEMORY_KHR for the error condition.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Bas Nieuwenhuizen <basni@google.com>
We want to expose more descriptor sets to the applications,
but currently we have a 1:1 mapping between shader descriptor
sets and 2 user sgprs, limiting us to 4 per stage. This commit
check if we don't have enough user sgprs for the number of
bound sets for this shader, we can ask for them to be indirected.
Two sgprs are then used to point to a buffer or 64-bit pointers
to the number of allocated descriptor sets. All shaders point
to the same buffer.
We can use some user sgprs to inline one or two descriptor sets
in future, but until we have a workload that needs this I don't
think we should spend too much time on it.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This allows meta to use push descriptors without disturbing user
push descriptors.
radv_meta_push_descriptor_set differs from vkCmdPushDescriptorSetKHR
in that partial updates are not supported; all descriptors used in
subsequent draw commands must be pushed at the same time.
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
According to the Vulkan spec, VkPipelineInputAssemblyStateCreateInfo's
primitiveRestartEnable flag should only apply to indexed draws, however
it was being enabled regardless of the type of draw. This could cause
problems for non-indexed draws with >=65535 vertices if the previous
indexed draw used 16-bit indices.
Fixes corruption of the credits text in Mad Max.
v2: Reset primitive restart state after executing a secondary command
buffer.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Makes more sense when we hash the layout for the pipeline cache.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
The devil is in the shader again, otherwise this is
fairly straightforward.
The CTS contains no pipeline statistics copy to buffer
testcases, so I did a basic smoketest.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
For using them with both occlusion and pipeline statistics queries.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Adds a shader for writing occlusion query results to a buffer, as the
CP packet isn't support on SI or secondary buffers, and doesn't handle
the availability bit (or partial results) nor truncation to 32-bit.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
All offsets and strides are precomputed by
radv_CreateDescriptorUpdateTemplateKHR and stored in the template.
v2: Move the new struct declarations from radv_descriptor_set.h
to radv_private.h (Bas)
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Move the implementation into a separate function that takes a
cmd_buffer and a dstSetOverride parameter.
When cmd_buffer is not NULL, radv_update_descriptor_sets calls
cs_add_buffer directly instead of updating the buffer list.
This will be used to implement VK_KHR_push_descriptor.
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The current code was broken, and I decided to redesign it instead.
This puts the sample positions for all samples into the queue
constant descriptor buffer after all the spill/ring descriptors.
It then uses a single offset register to point how far into the
samples the samples for num_samples are. This saves one user sgpr
and means we only generate the sample position data in the rare
single case where we need it currently.
This doesn't fix the failing CTS tests without the followup
fix.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This calculates the pipeline state for tessellation.
It moves the gs ring calculation down to below
where the tessellation shaders will be compiled,
as it needs the info from those shaders.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>