Supports more advanced file operations than we'd normally need.
Intended to be used by the magic disk cache.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
This was off by one at some point, which could overrun a stack buffer,
which is naughty.
Just use ARRAY_SIZE on the dynamic_state_list for the array size instead.
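
A minimal sketch of the idea, not the actual patch: the ARRAY_SIZE
definition, the example_* names and the enum values are assumptions made
for illustration. Only dynamic_state_list is a name taken from this
message. The point is that the capacity is derived from the list itself,
so it cannot drift when entries are added.

```c
#include <assert.h>
#include <stddef.h>

/* Common C idiom. This definition is an assumption, not copied from the tree. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical stand-ins for the real dynamic state values. */
enum example_dynamic_state
{
    EXAMPLE_DYNAMIC_STATE_VIEWPORT,
    EXAMPLE_DYNAMIC_STATE_SCISSOR,
    EXAMPLE_DYNAMIC_STATE_BLEND_CONSTANTS,
    EXAMPLE_DYNAMIC_STATE_STENCIL_REFERENCE,
};

static const enum example_dynamic_state dynamic_state_list[] =
{
    EXAMPLE_DYNAMIC_STATE_VIEWPORT,
    EXAMPLE_DYNAMIC_STATE_SCISSOR,
    EXAMPLE_DYNAMIC_STATE_BLEND_CONSTANTS,
    EXAMPLE_DYNAMIC_STATE_STENCIL_REFERENCE,
};

/* Copies every entry of dynamic_state_list into out and returns the count.
 * The caller sizes out with ARRAY_SIZE(dynamic_state_list), so there is no
 * hand-maintained constant that can end up off by one. */
size_t example_collect_dynamic_states(enum example_dynamic_state *out, size_t out_capacity)
{
    size_t i;

    assert(out_capacity >= ARRAY_SIZE(dynamic_state_list));
    for (i = 0; i < ARRAY_SIZE(dynamic_state_list); i++)
        out[i] = dynamic_state_list[i];
    return ARRAY_SIZE(dynamic_state_list);
}
```

A caller would then declare the stack array as
`enum example_dynamic_state states[ARRAY_SIZE(dynamic_state_list)];`.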
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Needed to support SM 6.0 CBufferLoad.
This path is mostly unused since it's opt-in in DXC and horribly broken
...
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Primitive restart is only used for strip primitive types, and must be
ignored for lists. Use and require extended_dynamic_state2 for this
purpose.
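
Roughly, the decision looks like the sketch below. The enum and function
names are hypothetical stand-ins, not the real vkd3d-proton code. With
extended_dynamic_state2 the resulting boolean would presumably be passed
to vkCmdSetPrimitiveRestartEnableEXT once the topology is known, but
that wiring is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical topology enum, standing in for the D3D12 / Vulkan one. */
enum example_topology
{
    EXAMPLE_TOPOLOGY_POINT_LIST,
    EXAMPLE_TOPOLOGY_LINE_LIST,
    EXAMPLE_TOPOLOGY_LINE_STRIP,
    EXAMPLE_TOPOLOGY_TRIANGLE_LIST,
    EXAMPLE_TOPOLOGY_TRIANGLE_STRIP,
};

/* Returns true only for strip primitive types. */
bool example_topology_is_strip(enum example_topology topology)
{
    assert(topology <= EXAMPLE_TOPOLOGY_TRIANGLE_STRIP);
    switch (topology)
    {
        case EXAMPLE_TOPOLOGY_LINE_STRIP:
        case EXAMPLE_TOPOLOGY_TRIANGLE_STRIP:
            return true;
        default:
            return false;
    }
}

/* Primitive restart requested by the pipeline is honored for strips
 * and ignored for lists. */
bool example_primitive_restart_enable(bool pso_requests_restart, enum example_topology topology)
{
    return pso_requests_restart && example_topology_is_strip(topology);
}
```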
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
There are strict limits on number of descriptors which can be used,
and we have to use MUTABLE + single set to make this work.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
{depth,stencil}AttachmentFormat and p{Depth,Stencil}Attachment are only
allowed if the format contains that aspect. Check this explicitly.
Fixes some validation errors.
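
A sketch of the explicit check, under stated assumptions: the format
enum, the aspect bits and the struct are simplified stand-ins for
VkFormat, VkImageAspectFlags and the dynamic rendering structs, and the
function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced stand-ins for VkFormat and VkImageAspectFlagBits. */
enum example_format
{
    EXAMPLE_FORMAT_UNDEFINED,
    EXAMPLE_FORMAT_D32_SFLOAT,
    EXAMPLE_FORMAT_S8_UINT,
    EXAMPLE_FORMAT_D24_UNORM_S8_UINT,
};

#define EXAMPLE_ASPECT_DEPTH   0x1u
#define EXAMPLE_ASPECT_STENCIL 0x2u

struct example_rendering_formats
{
    enum example_format depth_attachment_format;
    enum example_format stencil_attachment_format;
};

/* Returns the aspects contained in a depth-stencil format. */
uint32_t example_format_aspects(enum example_format format)
{
    switch (format)
    {
        case EXAMPLE_FORMAT_D32_SFLOAT:
            return EXAMPLE_ASPECT_DEPTH;
        case EXAMPLE_FORMAT_S8_UINT:
            return EXAMPLE_ASPECT_STENCIL;
        case EXAMPLE_FORMAT_D24_UNORM_S8_UINT:
            return EXAMPLE_ASPECT_DEPTH | EXAMPLE_ASPECT_STENCIL;
        default:
            return 0;
    }
}

/* Only reports a depth or stencil attachment format when the DSV format
 * actually contains that aspect. Otherwise the field stays UNDEFINED. */
struct example_rendering_formats example_select_ds_formats(enum example_format dsv_format)
{
    struct example_rendering_formats formats = { EXAMPLE_FORMAT_UNDEFINED, EXAMPLE_FORMAT_UNDEFINED };
    uint32_t aspects = example_format_aspects(dsv_format);

    assert((aspects & ~(EXAMPLE_ASPECT_DEPTH | EXAMPLE_ASPECT_STENCIL)) == 0);
    if (aspects & EXAMPLE_ASPECT_DEPTH)
        formats.depth_attachment_format = dsv_format;
    if (aspects & EXAMPLE_ASPECT_STENCIL)
        formats.stencil_attachment_format = dsv_format;
    return formats;
}
```

The same test would decide whether the depth or stencil attachment
pointer is set or left NULL when beginning rendering.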
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
For EXTENDED_USAGE, we still need to restrict image usage when creating
concrete views.
Use VkImageViewUsageCreateInfo to restrict usage flags to the kind of
view we're creating.
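
As a sketch of the masking step only: the flag values and view kinds
below are local stand-ins rather than the real VkImageUsageFlagBits, and
the mapping from view kind to usage is an assumption. In real code the
result would go into VkImageViewUsageCreateInfo::usage, chained through
the pNext of VkImageViewCreateInfo.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for VkImageUsageFlagBits. */
#define EXAMPLE_USAGE_SAMPLED          0x1u
#define EXAMPLE_USAGE_STORAGE          0x2u
#define EXAMPLE_USAGE_COLOR_ATTACHMENT 0x4u
#define EXAMPLE_USAGE_DEPTH_STENCIL    0x8u

/* Hypothetical view kinds, named after the D3D12 descriptor types. */
enum example_view_kind
{
    EXAMPLE_VIEW_SRV,
    EXAMPLE_VIEW_UAV,
    EXAMPLE_VIEW_RTV,
    EXAMPLE_VIEW_DSV,
};

/* Returns the usage a concrete view of this kind should advertise:
 * the image's usage restricted to what the view kind needs. */
uint32_t example_view_usage(uint32_t image_usage, enum example_view_kind kind)
{
    uint32_t wanted;
    uint32_t restricted;

    switch (kind)
    {
        case EXAMPLE_VIEW_SRV: wanted = EXAMPLE_USAGE_SAMPLED; break;
        case EXAMPLE_VIEW_UAV: wanted = EXAMPLE_USAGE_STORAGE; break;
        case EXAMPLE_VIEW_RTV: wanted = EXAMPLE_USAGE_COLOR_ATTACHMENT; break;
        case EXAMPLE_VIEW_DSV: wanted = EXAMPLE_USAGE_DEPTH_STENCIL; break;
        default: wanted = 0; break;
    }

    restricted = image_usage & wanted;
    /* A view may never claim usage the image was not created with. */
    assert((restricted & ~image_usage) == 0);
    return restricted;
}
```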
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
We need to check RTVFormats and IO signature.
If RTVFormat uses a non-null format and the IO signature also has an
active entry, we must fail compilation.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Found some validation errors where rt_count != rtv_active_mask,
and blending used rt_count instead of rtv_active_mask. If shader renders
to a NULL attachment, we must make sure that it's part of the PSO
interface.
Also, use rt_count rather than active mask when beginning render pass.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
This is basically required to avoid horrible stutter and poor
performance, and it is widely supported.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
For this case, we want to block and tear down the debug ring thread.
It's okay to fish for dead messages in the ring, since we know there
won't be more GPU work submitted.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
If we expect device loss (breadcrumb debug), we need to use DEVICE
uncached/coherent memory, since we might not be able to flush GPU caches
properly.
We also need to remove the idea of being able to copy out the control
block back to host. This is too brittle and we should just place the
control block in PCI-e BAR instead. Rethink how we pass messages from
GPU to CPU to make it more robust.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The elected lane must be able to perform side effects, so make sure
helper lanes don't participate.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
If we know the input is wave uniform (progress markers for example),
no need to spam the log.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Some implementations can support markers, but not explicit coherency.
Buffer markers are often uncached either way, so should be fine ...
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The spec says that on device loss, the driver must return DEVICE_LOST in
finite time, but this does not happen on NV drivers. Use a long timeout
instead in this scenario.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
AMD path for this commit.
Idea is that we can automatically instrument markers with command list
information we can make some sense of in vkd3d-proton.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Rather than having to take writer lock on serialize calls from the
outside, we should just take locks when accessing the internal hashmaps
instead.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
If outer code has taken a reader lock, we don't need to lock again.
Also allows a caller to do GetSerializedSize + Serialize under a single
reader lock.
This will be relevant for the magic cache implementation.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
It's redundant to add an UNDEFINED transition here for committed
resources. We need it for sparse and placed resources to handle aliasing
rules, but that's it.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>