If the implementation reports an offset alignment of 4, it must be able
to handle 4-byte offsets on VAs.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
For internal debug shaders, it is helpful to ensure in-order logs when
sorted for later inspection.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Separate scratch pools by their intended usage. Allows e.g. preprocess buffers to be
allocated differently from normal buffers. Potentially can also allow
for separate pools for host visible scratch memory etc down the line.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The D3D12 docs outline this as an implementation detail explicitly, so
we should do the same thing.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The runtime is specified to validate certain things.
Also, be more robust against unsupported command signatures, since we
might need to draw/dispatch at an offset. Avoids hard GPU crashes.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Some dynamic state is at risk of being spammed with the same arguments
many times. For the dynamic state that is trivial to check, skip
redundant updates.
Ghostwire: Tokyo has been observed to spam the same OMSetStencilRef
value, causing some context rolls. RSSetShadingRate has also been set
redundantly.
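
A minimal sketch of the redundancy check, assuming a simplified state
struct. The names example_dynamic_state, example_set_stencil_ref and
EXAMPLE_DIRTY_STENCIL_REFERENCE are hypothetical and are not the actual
vkd3d-proton identifiers:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified command list state (assumption, not the real struct). */
struct example_dynamic_state
{
    uint32_t stencil_reference;
    unsigned int dirty_flags;
};

#define EXAMPLE_DIRTY_STENCIL_REFERENCE 0x1u

/* Returns true if the value changed and the state was marked dirty. */
static bool example_set_stencil_ref(struct example_dynamic_state *state, uint32_t ref)
{
    if (state->stencil_reference == ref)
        return false; /* Same value as before: nothing to re-emit. */

    state->stencil_reference = ref;
    state->dirty_flags |= EXAMPLE_DIRTY_STENCIL_REFERENCE;
    return true;
}
```

The comparison only works if the cached value matches what the command
buffer actually holds, so the cache has to be reset to a known value
whenever a command list is reset.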
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Primitive restart is only used for strip primitive types, and must be
ignored for lists. Use and require extended_dynamic_state2 for this
purpose.
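
A rough sketch of the decision, assuming a hypothetical topology enum
and helper names (example_topology, example_primitive_restart_enable).
The resulting boolean is the kind of value one would pass to
vkCmdSetPrimitiveRestartEnableEXT from VK_EXT_extended_dynamic_state2:

```c
#include <stdbool.h>

/* Hypothetical topology enum for illustration only. */
enum example_topology
{
    EXAMPLE_TOPOLOGY_POINT_LIST,
    EXAMPLE_TOPOLOGY_LINE_LIST,
    EXAMPLE_TOPOLOGY_LINE_STRIP,
    EXAMPLE_TOPOLOGY_TRIANGLE_LIST,
    EXAMPLE_TOPOLOGY_TRIANGLE_STRIP,
};

static bool example_topology_is_strip(enum example_topology topology)
{
    return topology == EXAMPLE_TOPOLOGY_LINE_STRIP ||
            topology == EXAMPLE_TOPOLOGY_TRIANGLE_STRIP;
}

/* Restart is honored only when the PSO asks for a strip cut value
 * and the currently bound topology is a strip type. */
static bool example_primitive_restart_enable(bool pso_strip_cut_enabled,
        enum example_topology topology)
{
    return pso_strip_cut_enabled && example_topology_is_strip(topology);
}
```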
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
{depth,stencil}AttachmentFormat and p{Depth,Stencil}Attachment are only
allowed if the format contains that aspect. Check this explicitly.
Fixes some validation errors.
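
A sketch of the aspect check, using a hypothetical format enum and
helper names rather than the real VkFormat handling. The idea is that a
format is only reported for an aspect it contains, and the other slot
is left undefined:

```c
/* Hypothetical stand-ins for the relevant depth-stencil formats. */
enum example_format
{
    EXAMPLE_FORMAT_UNDEFINED,
    EXAMPLE_FORMAT_D16_UNORM,
    EXAMPLE_FORMAT_D32_SFLOAT,
    EXAMPLE_FORMAT_S8_UINT,
    EXAMPLE_FORMAT_D24_UNORM_S8_UINT,
    EXAMPLE_FORMAT_D32_SFLOAT_S8_UINT,
};

#define EXAMPLE_ASPECT_DEPTH   0x1u
#define EXAMPLE_ASPECT_STENCIL 0x2u

static unsigned int example_format_aspects(enum example_format format)
{
    switch (format)
    {
        case EXAMPLE_FORMAT_D16_UNORM:
        case EXAMPLE_FORMAT_D32_SFLOAT:
            return EXAMPLE_ASPECT_DEPTH;
        case EXAMPLE_FORMAT_S8_UINT:
            return EXAMPLE_ASPECT_STENCIL;
        case EXAMPLE_FORMAT_D24_UNORM_S8_UINT:
        case EXAMPLE_FORMAT_D32_SFLOAT_S8_UINT:
            return EXAMPLE_ASPECT_DEPTH | EXAMPLE_ASPECT_STENCIL;
        default:
            return 0;
    }
}

/* Value to use for depthAttachmentFormat: the format itself if it
 * has a depth aspect, otherwise undefined. */
static enum example_format example_depth_attachment_format(enum example_format format)
{
    return (example_format_aspects(format) & EXAMPLE_ASPECT_DEPTH) ?
            format : EXAMPLE_FORMAT_UNDEFINED;
}

/* Same for stencilAttachmentFormat and the stencil aspect. */
static enum example_format example_stencil_attachment_format(enum example_format format)
{
    return (example_format_aspects(format) & EXAMPLE_ASPECT_STENCIL) ?
            format : EXAMPLE_FORMAT_UNDEFINED;
}
```

The same aspect test would decide whether pDepthAttachment and
pStencilAttachment are set or left NULL when beginning rendering.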
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
For EXTENDED_USAGE, we still need to restrict image usage when creating
concrete views.
Use VkImageViewUsageCreateInfo to restrict usage flags to the kind of
view we're creating.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Found some validation errors where rt_count != rtv_active_mask,
and blending used rt_count instead of rtv_active_mask. If shader renders
to a NULL attachment, we must make sure that it's part of the PSO
interface.
Also, use rt_count rather than active mask when beginning render pass.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
This is basically required to avoid horrible stutter and poor
performance, and is widely supported.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
For this case, we want to block and tear down the debug ring thread.
It's okay to fish for dead messages in the ring, since we know there
won't be more GPU work submitted.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
If we expect device loss (breadcrumb debug), we need to use DEVICE uncached/coherent,
since we might not be able to flush GPU caches properly.
We also need to remove the idea of being able to copy out the control
block back to host. This is too brittle, and we should just place
the control block in PCI-e BAR instead. Rethink how we pass messages
from GPU to CPU to make it more robust.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The spec says that on device loss, the driver must return DEVICE_LOST in
finite time, but this does not happen on NV drivers. Use a long timeout
instead in this scenario.
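
A sketch of the timeout selection, assuming a hypothetical helper and
an arbitrary 10 second value. The commit does not state the timeout
actually used; Vulkan wait functions take the timeout in nanoseconds:

```c
#include <stdbool.h>
#include <stdint.h>

/* Arbitrary illustrative value (10 seconds in nanoseconds); an assumption. */
#define EXAMPLE_LONG_TIMEOUT_NS (10ull * 1000ull * 1000ull * 1000ull)

/* Wait forever in the normal case, but bound the wait when device loss
 * is expected, since the driver might never report DEVICE_LOST. */
static uint64_t example_wait_timeout_ns(bool expect_device_lost)
{
    return expect_device_lost ? EXAMPLE_LONG_TIMEOUT_NS : UINT64_MAX;
}
```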
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
AMD path for this commit.
The idea is that we can automatically instrument markers with command
list information that we can make some sense of in vkd3d-proton.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
It's redundant to add an UNDEFINED transition here for committed
resources. We need it for sparse and placed resources to handle aliasing
rules, but that's it.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
On some implementations, it doesn't matter for performance what we use,
and we can avoid a lot of ugly barriers this way.
Opt in to use this extension on GPUs we know handle it well,
otherwise, keep using the tracking paths.
With VK_KHR_dynamic_rendering, this is now feasible to do since we no longer
have to deal with shenanigans related to VkRenderPass layouts and
complicated compatibility rules.
To make this work with the existing framework, we just need to consider
that GENERAL can be a common layout alongside DEPTH_STENCIL_OPTIMAL.
Both are common layouts that do not need to be tracked at all.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
When performing a decay of a DSV resource, make sure to transition all
subresources, not just the particular aspect being transitioned.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
We require separate DS layouts.
Fixes validation errors where we transition from read-only, but our
neighbor aspect might have been optimal.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
These only existed for VRS attachment, which is no longer
necessary with VK_KHR_dynamic_rendering.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>