When we're using extended dynamic state, we will often end up with dummy
pipeline binds, which we should try to avoid if we can.
Also avoids having to rebind dynamic state redundantly.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Cleans up dynamic state such that we do not have to keep dynamic state
create infos around.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
vkd3d-shader is currently kinda buggy and crashes when you try to trace
DXBC. This used to never be run since it was guarded by
VKD3D_SHADER_DEBUG, but with the move to a static build we merged all
debug logging under VKD3D_DEBUG. Reintroduce different debug channels in
a way that is compatible with a statically linked vkd3d.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Otherwise, if a render pass gets suspended twice in a row, we
never emit the barrier because render_pass_suspended will be
set to false the second time.
Fixes validation errors in Hitman 2.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We shouldn't potentially override names from the standard library, and
this lets us map directly to the __ATOMIC_* memory orders, which is more
correct.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
There is no stdatomic available on MSVC so let's clean things up.
This moves all the atomic helpers to vkd3d_atomic.h and implements every
platform's spinlocks in exactly the same way.
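
As a rough illustration of a spinlock built only on explicit memory
orders, here is a minimal sketch. It assumes the GCC/Clang __atomic
builtins; an MSVC build would need a different backend behind the same
helpers. All example_* names are hypothetical and are not the actual
vkd3d_atomic.h API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the actual vkd3d_atomic.h API.
 * Assumes GCC/Clang __atomic builtins. */
typedef struct
{
    uint32_t locked; /* 0 = free, 1 = held */
} example_spinlock;

static inline void example_spinlock_init(example_spinlock *lock)
{
    __atomic_store_n(&lock->locked, 0u, __ATOMIC_RELAXED);
}

static inline bool example_spinlock_try_acquire(example_spinlock *lock)
{
    /* Cheap relaxed read first, then an acquire exchange to take the lock. */
    return __atomic_load_n(&lock->locked, __ATOMIC_RELAXED) == 0u &&
            __atomic_exchange_n(&lock->locked, 1u, __ATOMIC_ACQUIRE) == 0u;
}

static inline void example_spinlock_acquire(example_spinlock *lock)
{
    while (!example_spinlock_try_acquire(lock))
        ; /* busy-wait until the holder releases the lock */
}

static inline void example_spinlock_release(example_spinlock *lock)
{
    /* Debug check: releasing a lock that is not held is a caller bug. */
    assert(__atomic_load_n(&lock->locked, __ATOMIC_RELAXED) == 1u);
    /* Release store publishes writes made inside the critical section. */
    __atomic_store_n(&lock->locked, 0u, __ATOMIC_RELEASE);
}
```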
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Only support the ANSI and UNICODE versions for now. The PIX3BLOB format is
extremely weird, complicated and undocumented.
We can refer to RenderDoc if we need it later ...
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
debug_marker and debug_report are both deprecated in favor of
debug_utils, and vkd3d was using debug_marker in a buggy way anyway:
debug_marker requires debug_report to work, but debug_report was only
conditionally enabled.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Gets rid of the full barrier on command buffer end.
Instead, do what D3D12 wants, which is to serialize all
ExecuteCommandLists. Simplify the existing timeline semaphore setup for
sparse queues and use it for all submissions.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
D3D12 apparently does this implicitly. Fixes rendering issues in
the AMD CACAO demo on Polaris with RADV, which does not emit a
barrier between the AO compute passes and the tone mapping pass
in the next command buffer.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We should hook this up to the robustness2 feature at some point,
but for now, just use the dummy descriptors. Fixes a crash in
the AMD CACAO demo.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
In D3D12, Update/CopyTileMappings are implicitly synchronized with
respect to other commands executing on the same queue, which means:
- Signal and Execute have to wait for previously submitted
sparse binding operations to complete
- Wait and Execute have to complete before subsequently
submitted sparse binding operations can execute.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This allows us to perform clears inside the render pass even if
the render pass hasn't been started at the time of the clear yet.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We'll need to revisit this as the current implementation is
not only inefficient but also wrong in quite a few ways.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We need access to the resource in order to perform render pass
layout transitions; the view handle alone isn't enough.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Passing the main struct to the public functions allows us
to share common data between multiple types of operations.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Works around an app-bug in SotTR, where the command pool is reset before
the command buffer completes.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
D3D12 supports out-of-order signal and wait. So do Vulkan timeline
semaphores. However, in Vulkan we don't have an infinite number of
virtual queues. We may have to map multiple D3D12 queues onto a single
Vulkan queue, which can lead to a deadlock when the app attempts to
wait-before-signal and the two queues are mapped to the same physical
Vulkan queue.
In order to solve this, we need to hold back submissions until we know
it is safe to do so. To make this work in practice as simply as possible, each
ID3D12CommandQueue has its own submission thread, which will block on an
ID3D12Fence's pending timeline value for a Wait command. The main reason to use a
submission thread is that resolving this directly in
ID3D12CommandQueue::Signal is extremely tricky and potentially
requires recursively locking queues and fences.
Note that we only block on the pending wait value, not the actual wait
value, so there is no real CPU <-> GPU synchronization here. In the
common case, no submission thread will block.
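
The pending-value wait described above could be sketched roughly as
follows. This is a simplified illustration using pthreads; the
example_fence struct and its functions are hypothetical names, and it
assumes pending values only ever increase.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU-side bookkeeping of an ID3D12Fence. */
struct example_fence
{
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    uint64_t pending_value; /* highest value any submitted Signal will reach */
};

static void example_fence_init(struct example_fence *fence)
{
    int ret = pthread_mutex_init(&fence->mutex, NULL);
    assert(ret == 0);
    ret = pthread_cond_init(&fence->cond, NULL);
    assert(ret == 0);
    (void)ret;
    fence->pending_value = 0;
}

/* Called when a Signal is submitted. The GPU need not have reached the
 * value yet; we only record that it eventually will. */
static void example_fence_update_pending(struct example_fence *fence, uint64_t value)
{
    pthread_mutex_lock(&fence->mutex);
    if (value > fence->pending_value)
        fence->pending_value = value;
    pthread_cond_broadcast(&fence->cond);
    pthread_mutex_unlock(&fence->mutex);
}

/* Called by a queue's submission thread before it submits a Wait.
 * Blocks only until a matching Signal has been submitted somewhere. */
static void example_fence_block_until_pending(struct example_fence *fence, uint64_t value)
{
    pthread_mutex_lock(&fence->mutex);
    while (fence->pending_value < value)
        pthread_cond_wait(&fence->cond, &fence->mutex);
    pthread_mutex_unlock(&fence->mutex);
}
```

Once example_fence_block_until_pending() returns, the Wait can be
submitted to Vulkan safely, since the signal it depends on is already
queued; the GPU-side wait itself is still handled by the semaphore.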
The added benefit is that submits are async now, so main thread CPU
overhead might slightly decrease.
To play nice with DXGI swapchain, the external entry point for acquiring
the Vulkan queue needs to drain the submission thread and lock it to ensure
submissions happen in order.
Fixes hangs in The Division 1, which makes use of this D3D12 feature.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The current code uses D3D12 abstractions to create pipelines but
issues raw Vulkan API calls to actually implement the functionality,
which means the code makes assumptions about the exact descriptor
set layout and push constant layout, which is generally a bad idea
now that we have multiple code paths for root constants etc.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Prepares for a rewrite of queue submission. The legacy path is never
run in practice and will likely break in subtle ways.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
And add a function to (re-)apply dynamic state as necessary. This
will allow us to ignore dynamic state not needed by the pipeline,
and may become necessary if we implement shader-based copies etc.
Currently unused; the following commits will subsequently change
state setting methods over.
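
One way such a (re-)apply step might look is a dirty-flag mask that is
filtered by what the bound pipeline actually uses. The sketch below is
only an assumption about the general shape; the flag names, struct and
function are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flags, one bit per piece of dynamic state. */
enum example_dynamic_state_flag
{
    EXAMPLE_DYNAMIC_STATE_VIEWPORT          = 1u << 0,
    EXAMPLE_DYNAMIC_STATE_SCISSOR           = 1u << 1,
    EXAMPLE_DYNAMIC_STATE_BLEND_CONSTANTS   = 1u << 2,
    EXAMPLE_DYNAMIC_STATE_STENCIL_REFERENCE = 1u << 3,
};

struct example_dynamic_state
{
    uint32_t dirty_flags; /* state changed since it was last emitted */
};

/* Returns the mask of states that must be emitted for the bound pipeline.
 * State the pipeline does not use stays dirty, so it is applied later if
 * a pipeline that needs it gets bound. */
static uint32_t example_dynamic_state_flush(struct example_dynamic_state *state,
        uint32_t pipeline_flags)
{
    uint32_t to_apply;

    assert(state);
    to_apply = state->dirty_flags & pipeline_flags;
    /* A real implementation would issue one vkCmdSet* call per set bit here. */
    state->dirty_flags &= ~to_apply;
    return to_apply;
}
```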
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
All attachments must be at least as large as the framebuffer, so using a
max operator is not compliant with Vulkan.
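
A compliant way to derive the framebuffer size is to take the minimum
over all attachments, which guarantees no attachment is smaller than
the framebuffer. A minimal sketch, with hypothetical type and function
names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent type for illustration. */
struct example_extent
{
    uint32_t width;
    uint32_t height;
    uint32_t layers;
};

static uint32_t example_min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* The framebuffer must fit inside every attachment, so take the minimum
 * of each dimension across all attachments. */
static struct example_extent example_framebuffer_extent(
        const struct example_extent *attachments, size_t count)
{
    struct example_extent fb;
    size_t i;

    assert(count > 0);
    fb = attachments[0];
    for (i = 1; i < count; i++)
    {
        fb.width = example_min_u32(fb.width, attachments[i].width);
        fb.height = example_min_u32(fb.height, attachments[i].height);
        fb.layers = example_min_u32(fb.layers, attachments[i].layers);
    }
    return fb;
}
```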
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Logically split up descriptor pool allocation in three types:
- STATIC: Root descriptors and internal allocation.
- VOLATILE: For packed descriptor sets which come from heaps.
- IMMUTABLE_SAMPLER: For immutable samplers. This should be removed once
we start allocating sampler sets at sampler creation time.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
For now this is enabled based on device capabilities, but future changes
may require this to be disabled for certain root signatures.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
When changing tables that only have bindless descriptors,
only update the push constants instead.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>