RADV_CMD_BUFFER_STATUS_INVALID is not used for now, but I think
it makes sense to declare it. Could be used later with better
command buffer error handling.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
"vkAllocateCommandBuffers can be used to create multiple command
buffers. If the creation of any of those command buffers fails, the
implementation must destroy all successfully created command buffer
objects from this command, set all entries of the pCommandBuffers
array to NULL and return the error."
This has been suggested by gabriel@system.is.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This fixes two CTS regressions:
- dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary
- dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary
These two tests are part the mustpass lists, so presumably they
are correct and my change was wrong.
This reverts commit 0f68208f1d.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Similar to how the driver sets the depth clear regs after a
fast depth clear. Most of the time, this will copy a 32-bit reg
instead of a 64-bit reg.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
For the fast path, radv_fill_buffer() ensures that the BO is
already in the list. For the slow path, the depth surface is
part of the framebuffer which means the BO is added to the list
when the framebuffer is emitted.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
aspects can't be zero and there is an assertion that ensures
it's not in emit_clear().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This should reduce the overhead of adding a BO to the current
list, especially when the list is huge. Also, when a new pipeline
is bound, we only need to update the descriptor, the buffer objects
should already be in the list.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
radv_fill_buffer() ensures that the image BO is added to the list.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
radv_fill_buffer() ensures that the image BO is added to the list.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This should not be needed, if the allocation fails an error is
returned and the host should handle it.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
For consistency and it might help for debugging purposes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Just after the vertex shader.
This seems to give a minor boost for, at least, Serious Sam
Fusion 2017 and Dawn of War 3. I don't see any real impacts
with The Talos Principle.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Will be used for VBO descriptors prefetching.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
For consistency because this function will also prefetch VBO
descriptors.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This register is the same on all gpus so far, so emit it in one
place and also for the pre-gfx9 gpus set the value in the pipeline
creation.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This moves some calculations of register values into the pipeline
construction, it saves looking at outinfo in the cmd buffer emit.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The next patch will try and avoid calling the indirect function.
v2: add a missing conversion.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The function that calls us has just added the buffer to the
list already, no need to try and add it again.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
There's no point recalculating these the whole time on descriptor
emission, just store them at pipeline creation.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Instead of storing all the pointers and zeroing them all out,
just store a valid bitmask in the state. This also moves
the CmdBindPipeline path down the cpu usage path for the
multithreading demo as it no longer has to traverse MAX_SETS
to find the active descriptor sets.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This isn't required to be cleared, since buffers are only linked
by vertex elements, so if elements are clear then no buffers
should be referenced.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If we allocate attachments in the begin command buffer due to the
render pass continue bit, we were leaking them.
Since renderpasses inside a cmd buffer malloc/free these properly,
and set to NULL, we just need to call free at end.
Fixes a memory leak with multithreading demo.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
uint32_t data[MAX_SETS * 2] = {}; was getting executed before
the exit and took significant amounts of time. By having the
check outside the function, we skip the execution of the clear.
Reviewed-by: Dave Airlie <airlied@redhat.com>
This should reduce the time where compute units are idle, mainly
for meta operations because they use a bunch of compute shaders.
This seems to have a really minor positive effect for Talos, at least.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Only needed when the CS path is used.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This uses the new kernel interfaces for reduced cs overhead,
We only set the local flag for memory allocations that don't have
a dedicated allocation and ones that aren't imports.
v2: add to all the internal buffer creation paths.
v3: missed some command submission paths, handle 0/empty bo lists.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
When binding a new pipeline, we applied all dynamic states
without checking if they really need to be re-emitted. This
doesn't seem to be useful for the meta operations because only
the viewports/scissors are updated.
This should reduce the number of commands added to the IB
when a new graphics pipeline is bound.
Also, rename radv_dynamic_state_copy() to radv_bind_dynamic_state()
and set the dirty flags directly there.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
The depth bounds test values are either set at pipeline
creation or dynamically using vkCmdSetDepthBounds().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
radeonsi only emits these when dfsm is enabled, so for now
just hinge them on a flag we never set.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
With merged shaders the vertex shader may not exist. This got in
because the offending patch was written before merged shaders were
upstream, but committed after.
Fixes: 75dfab24a2 'radv: refactor indirect draws with radv_draw_info'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Otherwise for non-indexed draws we set and immediately unset
RADV_CMD_DIRTY_INDEX_BUFFER. As all the set functions should
clear their own bit, this is unnecessary.
Fixes: 341529dbee 'radv: use optimal packet order for draws'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
As they were emitted after the new pipeline, the changed pipeline
detection was not working anymore.
Fixes: 341529dbee 'radv: use optimal packet order for draws'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Ported from RadeonSI. The time where shaders are idle should
be shorter now. This can give a little boost, like +6% with
the dynamicubo Vulkan demo.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The Vulkan specification says:
"... an execution dependency with only VK_PIPELINE_STAGE_TOP_OF_-
PIPE_BIT in the source stage mask will effectively not wait for
any prior commands to complete."
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Indirect draws with a count buffer will be refactored in a
separate patch.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Similar to the dispatch compute logic but for draw calls. For
convenience, indirect draws will be converted in a separate
patch.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Only on CIK and later. We should only update VGT_INDEX_TYPE but
it seems easier to re-emit all the index buffer packets.
Fixes: 966d66f28f (radv: do not re-emit the index buffer for every draw call)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This will allow us to fix the VGT_INDEX_TYPE issue properly.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
To be consistent with the emit function name.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CLEAR_STATE will initialize DB_COUNT_CONTROL to 0 for CIK+.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This fixes some access to the tess eval shader when it's combined
with geometry on gfx9.
This is a review of Bas's commit:
radv: Prevent crashing by accessing TES for VGT reuse depth.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Move it to radv_cmd_buffer_flush_state() because if
rasterizerDiscardEnable is true, the flags are not cleared.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
It can only be changed when CmdBindIndexBuffer() is called
or when a secondary buffer is used. Though not always, but
let's re-emit the packets in this situation for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This saves few CPU cycles when CmdDrawIndexed() is used a lot.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Missed that when I allowed waves to be launched out-of-order.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The spec requires the number of buffer to be greater than 0.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This saves some useless CMASK initializations/eliminations in
the Vulkan SSAO demo.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
My guess is that the GPU is going to report VM faults if
vkCmdDrawIndirectCountAMD() (and friends) are used.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
These registers don't change during the lifetime of the
command buffer, there is no need to re-emit them when
binding a new pipeline.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Ported from RadeonSI, and -pro seems to enable it as well.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We only need to dirty the descriptors when the pipeline is
a new one, because user SGPRs can be potentially different.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
To check a valid usage requirement.
CID: 1401616
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
The number of viewports/scissors can only be specified at pipeline
creation time, so make sure to copy them when binding a new one
because the dynamic state is cleared in BeginCommandBuffer().
Fixes: dcf46e995d ("radv: do not update the number of scissors in vkCmdSetScissor()")
Fixes: 60878dd00c ("radv: do not update the number of viewports in vkCmdSetViewport()")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
We are really not going to use a winsys which does not need to store
the va, so might as well store it in a standard field.
Not sure this helps perf much though, as most of the cost is in the
cache miss accessing the bo anyway, which we stil need to do.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Since most games use only a few, iterating through all of them is
a waste. Simplifies the code too.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
The Vulkan spec (1.0.61) says:
"The number of scissors used by a pipeline is still specified
by the scissorCount member of VkPipelinescissorStateCreateInfo."
So, the number of scissors is defined at pipeline creation
time and shouldn't be updated when they are set dynamically.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
The Vulkan spec (1.0.61) says:
"The number of viewports used by a pipeline is still specified
by the viewportCount member of VkPipelineViewportStateCreateInfo."
So, the number of viewports is defined at pipeline creation
time and shouldn't be updated when they are set dynamically.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Also, it's useless to set the error code twice. Though, we
should probably skip the next commands when the command buffer
is considered invalid.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>