When we're using extended dynamic state, we will often end up with dummy
pipeline binds, which we should try to avoid if we can.
Also avoids having to rebind dynamic state redundantly.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Cleans up dynamic state such that we do not have to keep dynamic state
create infos around.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fall back when there is a mismatch, which can happen if the application
does not declare inputs to the hull shader (unlikely).
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
When using EXT_extended_dynamic_state, we will be able to compile a
master pipeline. Only in special cases will we have to fall back.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
This can happen in the worst case where we have all bindless sets, and:
- Static samplers
- Packed descriptors (UAV counters on drivers without support for this)
- Root descriptors
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
vkd3d-shader is currently kinda buggy and crashes when you try to trace
DXBC. This used to never be run since it was guarded by
VKD3D_SHADER_DEBUG, but with the move to a static build we merged all
debug logging under VKD3D_DEBUG. Reintroduce different debug channels in
a way that is compatible with a statically linked vkd3d.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Otherwise, if a render pass gets suspended twice in a row, we
never emit the barrier because render_pass_suspended will be
set to false the second time.
Fixes validation errors in Hitman 2.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
The output here is actually for secure presentation and restricting a swapchain to a certain output.
Correctly handle NULL (desktop) targets that we used to have.
Fixes crashes with titles that use fullscreen via an initial fullscreen desc.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
We need these for the upcoming swapchain factory implementation
for standalone D3D12.
They're also probably good to have around in future for the
d3d12 device.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
This reverts commit 0384362065.
It is not allowed to use RS 1.1 serialization for the non-versioned
entry point. RS 1.1 serialization must use the versioned entry point.
Reverting this fixes the relevant test case in d3d12.c:12522.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Saves a few CPU cycles. We expect things to explode anyway when
the app uses a non-UAV descriptor as a UAV in the shader.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Fixes a signedness comparison warning -- shouldn't be a problem as we aren't going to get images with 2m+ tiles.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
We shouldn't risk overriding names from the standard library, and this lets us map directly to __ATOMIC_* memory orders, which is more correct.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Creates linking problems if we want to build vkd3d-shader statically given this links back to something in vkd3d-common.
We don't need this distinction anyways...
Signed-off-by: Joshua Ashton <joshua@froggi.es>
This commit moves the module handling code which was previously dumped in device.c and the code to retrieve the current executable path to its own file.
This also eliminates HAVE_DECL_PROGRAM_INVOCATION_NAME from config.h.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
This isn't going to change. Drivers use this to do special things,
so changing it would probably cause a bunch of random problems anyway.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
There is no stdatomic available on MSVC, so let's clean things up.
This moves all the atomic helpers to vkd3d_atomic.h and implements spinlocks the same way on every platform.
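As a rough illustration of a spinlock built only on a small set of atomic
helpers, here is a minimal sketch. All example_* names are hypothetical and
are not the contents of vkd3d_atomic.h. Only the GCC/Clang branch is shown,
and an MSVC build is assumed to supply equivalent helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the actual vkd3d_atomic.h. */
typedef uint32_t example_spinlock_t;

#if defined(__GNUC__) || defined(__clang__)
/* Map straight onto the compiler's __ATOMIC_* memory orders. */
# define example_atomic_load_relaxed(ptr) \
        __atomic_load_n((ptr), __ATOMIC_RELAXED)
# define example_atomic_exchange_acquire(ptr, val) \
        __atomic_exchange_n((ptr), (val), __ATOMIC_ACQUIRE)
# define example_atomic_store_release(ptr, val) \
        __atomic_store_n((ptr), (val), __ATOMIC_RELEASE)
#else
# error "Assumption: other compilers provide equivalent helpers here."
#endif

static inline void example_spinlock_init(example_spinlock_t *lock)
{
    *lock = 0;
}

/* Returns non-zero if the lock was taken. The relaxed load avoids
 * hammering the cache line with exchanges while the lock is held. */
static inline int example_spinlock_try_acquire(example_spinlock_t *lock)
{
    return example_atomic_load_relaxed(lock) == 0 &&
            example_atomic_exchange_acquire(lock, 1u) == 0;
}

static inline void example_spinlock_acquire(example_spinlock_t *lock)
{
    while (!example_spinlock_try_acquire(lock))
        continue;
}

static inline void example_spinlock_release(example_spinlock_t *lock)
{
    /* Debug check: releasing a lock that is not held is a caller bug. */
    assert(example_atomic_load_relaxed(lock) == 1u);
    example_atomic_store_release(lock, 0u);
}
```

Because the lock code only goes through the three helper macros, the same
spinlock body can be shared by every platform that defines them.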
Signed-off-by: Joshua Ashton <joshua@froggi.es>
There is no reason not to load Vulkan dynamically. Otherwise, we must
have loader dev packages installed, which is not ideal.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Only support ANSI/UNICODE version for now. The PIX3BLOB format is
extremely weird, complicated and undocumented.
We can refer to RenderDoc if we need it later ...
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
debug_marker/debug_report are both deprecated in favor of debug_utils,
and vkd3d was using marker in a buggy way anyway, as debug_marker
requires debug_report to work, but it was only conditionally enabled.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Gets rid of the full barrier on command buffer end.
Instead, do what D3D12 wants, which is to serialize all
ExecuteCommandLists. Simplify the existing timeline semaphore setup for
sparse queues and use it for all submissions.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Otherwise, we may end up failing to allocate memory on Tier 1
hardware, and also fail to use dedicated allocations in some
cases.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We'll need this to more accurately select the memory type for D3D12
heaps based on which resources are allowed to be placed in it.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
D3D12 apparently does this implicitly. Fixes rendering issues in
the AMD CACAO demo on Polaris with RADV, which does not emit a
barrier between the AO compute passes and the tone mapping pass
in the next command buffer.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Need float16_int8 and subgroup with extended types to implement new SM
6.2 features. For now, skip over SM 6.1 features until someone makes use
of them.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
We should hook this up to the robustness2 feature at some point,
but for now, just use the dummy descriptors. Fixes a crash in
the AMD CACAO demo.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
StartTileIndexInOverallResource can be 0 for images that have either
no mip tail or no standard mips, so we need to check the packed mip
count.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
In D3D12, Update/CopyTileMappings are implicitly synchronized with
respect to other commands executing on the same queue, which means:
- Signal and Execute have to wait for previously submitted
sparse binding operations to complete
- Wait and Execute have to complete before subsequently
submitted sparse binding operations can execute.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Some sparse resources may have a metadata aspect on some drivers,
which needs to be bound before the image can be used in any way.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This will serve as a fallback if at least one queue family
does not support sparse binding.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This allows us to perform clears inside the render pass even if
the render pass hasn't been started yet at the time of the clear.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We'll need to revisit this as the current implementation is
not only inefficient but also wrong in quite a few ways.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We need access to the resource in order to perform render pass
layout transitions, just the view handle isn't enough.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Passing the main struct to the public functions allows us
to share common data between multiple types of operations.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Look up the typeless format for any given image format, then
look up the corresponding compatibility list. This also fixes
a potential issue with implicit SRGB <-> UNORM compatibility.
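The two-step lookup can be sketched roughly as follows. This is an
illustration only: the example_* names are hypothetical, the enum is a local
stand-in whose values follow DXGI_FORMAT, and only the R8G8B8A8 family is
listed.

```c
#include <stddef.h>

/* Local stand-ins; values follow DXGI_FORMAT for this one family. */
enum example_format
{
    EXAMPLE_FORMAT_UNKNOWN = 0,
    EXAMPLE_FORMAT_R8G8B8A8_TYPELESS = 27,
    EXAMPLE_FORMAT_R8G8B8A8_UNORM = 28,
    EXAMPLE_FORMAT_R8G8B8A8_UNORM_SRGB = 29,
    EXAMPLE_FORMAT_R8G8B8A8_UINT = 30,
    EXAMPLE_FORMAT_R8G8B8A8_SNORM = 31,
    EXAMPLE_FORMAT_R8G8B8A8_SINT = 32,
};

struct example_compat_list
{
    enum example_format typeless;
    size_t count;
    enum example_format formats[5];
};

static const struct example_compat_list example_compat_lists[] =
{
    { EXAMPLE_FORMAT_R8G8B8A8_TYPELESS, 5,
        { EXAMPLE_FORMAT_R8G8B8A8_UNORM, EXAMPLE_FORMAT_R8G8B8A8_UNORM_SRGB,
          EXAMPLE_FORMAT_R8G8B8A8_UINT, EXAMPLE_FORMAT_R8G8B8A8_SNORM,
          EXAMPLE_FORMAT_R8G8B8A8_SINT } },
};

#define EXAMPLE_COMPAT_LIST_COUNT \
        (sizeof(example_compat_lists) / sizeof(example_compat_lists[0]))

/* Step 1: map any format (typed or typeless) to its typeless format. */
static enum example_format example_get_typeless_format(enum example_format format)
{
    size_t i, j;

    for (i = 0; i < EXAMPLE_COMPAT_LIST_COUNT; i++)
    {
        if (example_compat_lists[i].typeless == format)
            return format;
        for (j = 0; j < example_compat_lists[i].count; j++)
        {
            if (example_compat_lists[i].formats[j] == format)
                return example_compat_lists[i].typeless;
        }
    }
    return EXAMPLE_FORMAT_UNKNOWN;
}

/* Step 2: map the typeless format to its compatibility list. */
static const struct example_compat_list *example_get_compat_list(enum example_format format)
{
    enum example_format typeless = example_get_typeless_format(format);
    size_t i;

    if (typeless == EXAMPLE_FORMAT_UNKNOWN)
        return NULL;
    for (i = 0; i < EXAMPLE_COMPAT_LIST_COUNT; i++)
    {
        if (example_compat_lists[i].typeless == typeless)
            return &example_compat_lists[i];
    }
    return NULL;
}
```

Since UNORM and UNORM_SRGB resolve to the same list in this sketch, their
compatibility is stated explicitly rather than assumed implicitly.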
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This now takes a sampler desc like d3d12_create_static_sampler,
and supports border colors if the provided border color matches
any of the supported Vulkan ones.
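A minimal sketch of the border color matching, assuming the supported set is
the three fixed float border colors of core Vulkan. The example_* names are
hypothetical stand-ins, and what to do when nothing matches is left to the
caller, since that policy is not described here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the three fixed float border colors in core Vulkan. */
enum example_border_color
{
    EXAMPLE_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK, /* (0, 0, 0, 0) */
    EXAMPLE_BORDER_COLOR_FLOAT_OPAQUE_BLACK,      /* (0, 0, 0, 1) */
    EXAMPLE_BORDER_COLOR_FLOAT_OPAQUE_WHITE,      /* (1, 1, 1, 1) */
};

/* Returns true and writes *out if the D3D12-style RGBA border color
 * exactly matches one of the fixed colors, false otherwise. */
static bool example_match_border_color(const float color[4],
        enum example_border_color *out)
{
    static const struct
    {
        float rgba[4];
        enum example_border_color border_color;
    }
    table[] =
    {
        { { 0.0f, 0.0f, 0.0f, 0.0f }, EXAMPLE_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK },
        { { 0.0f, 0.0f, 0.0f, 1.0f }, EXAMPLE_BORDER_COLOR_FLOAT_OPAQUE_BLACK },
        { { 1.0f, 1.0f, 1.0f, 1.0f }, EXAMPLE_BORDER_COLOR_FLOAT_OPAQUE_WHITE },
    };
    size_t i, j;

    assert(color && out);

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
    {
        /* Compare per component so that -0.0f still matches 0.0f. */
        for (j = 0; j < 4; j++)
        {
            if (color[j] != table[i].rgba[j])
                break;
        }
        if (j == 4)
        {
            *out = table[i].border_color;
            return true;
        }
    }
    return false;
}
```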
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
It makes sense to separate this from d3d12_create_sampler since static
samplers and regular samplers differ in border color support.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
And zero-initialize mapped memory allocations, which seems to
fix some font corruption occasionally seen in Resident Evil 2.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Unblocks some games that request Feature Level 12.0, such as
Anno 1800, Monster Hunter World, The Talos Principle. May
cause issues if games use unsupported features.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We no longer require coherent memory types, so we should
always prefer a HOST_CACHED memory type for the readback
heap as well as corresponding custom heaps.
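One way to express "prefer HOST_CACHED, but do not require it" is a two-pass
search over the memory types. The sketch below is an assumption about the
shape of such a search, not the actual implementation. The example_* names
are hypothetical, and the flag values mirror the VkMemoryPropertyFlagBits
bits.

```c
#include <assert.h>
#include <stdint.h>

/* Mirror the VkMemoryPropertyFlagBits bit values. */
#define EXAMPLE_MEMORY_DEVICE_LOCAL  0x1u
#define EXAMPLE_MEMORY_HOST_VISIBLE  0x2u
#define EXAMPLE_MEMORY_HOST_COHERENT 0x4u
#define EXAMPLE_MEMORY_HOST_CACHED   0x8u

/* Returns the index of the first allowed memory type that has all
 * required and preferred flags, else the first one that has just the
 * required flags, else -1. type_mask has one bit per memory type. */
static int example_find_memory_type(const uint32_t *type_flags,
        uint32_t type_count, uint32_t type_mask,
        uint32_t required, uint32_t preferred)
{
    const uint32_t wanted[2] = { required | preferred, required };
    uint32_t pass, i;

    assert(type_flags && type_count <= 32);

    for (pass = 0; pass < 2; pass++)
    {
        for (i = 0; i < type_count; i++)
        {
            if (!(type_mask & (1u << i)))
                continue;
            if ((type_flags[i] & wanted[pass]) == wanted[pass])
                return (int)i;
        }
    }
    return -1;
}

/* Readback-style heaps: must be host-visible, ideally host-cached.
 * HOST_COHERENT is deliberately not part of the required flags. */
static int example_find_readback_memory_type(const uint32_t *type_flags,
        uint32_t type_count, uint32_t type_mask)
{
    return example_find_memory_type(type_flags, type_count, type_mask,
            EXAMPLE_MEMORY_HOST_VISIBLE, EXAMPLE_MEMORY_HOST_CACHED);
}
```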
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Works around an app-bug in SotTR, where the command pool is reset before
the command buffer completes.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
D3D12 supports out-of-order signal and wait. So do Vulkan timeline
semaphores. However, in Vulkan we don't have an infinite amount of
virtual queues. We must potentially map multiple D3D12 queues on top of
Vulkan, which might lead to a deadlock when the app attempts to
wait-before-signal if the two queues are mapped to the same physical
Vulkan queue.
In order to solve this, we need to hold back submissions until we know
it is safe to do so. To make this work in practice as simply as possible, each
ID3D12CommandQueue has its own submission thread, which will block on an
ID3D12Fence's pending timeline value for a Wait command. The main reason to use a
submission thread is that resolving this directly in
ID3D12CommandQueue::Signal is extremely tricky and potentially
requires recursively locking queues and fences.
Note that we only block on the pending wait value, not the actual wait
value, so there is no real CPU <-> GPU synchronization here. In the
common case, no submission thread will block.
The added benefit is that submits are async now, so main thread CPU
overhead might slightly decrease.
To play nice with DXGI swapchain, the external entry point for acquiring
the Vulkan queue needs to drain the submission thread and lock it to ensure
submissions happen in order.
Fixes hangs in The Division 1, which makes use of this D3D12 feature.
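The pending-value idea can be sketched with a mutex and a condition
variable. This is a simplified illustration under stated assumptions: the
example_* names are hypothetical, fence values are assumed to only increase,
and the queue, thread creation and Vulkan submission are left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

struct example_fence
{
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    /* Highest value for which a signal has been submitted. This is
     * CPU-side bookkeeping, not the value the GPU has reached. */
    uint64_t pending_value;
};

static void example_fence_init(struct example_fence *fence)
{
    assert(fence);
    pthread_mutex_init(&fence->mutex, NULL);
    pthread_cond_init(&fence->cond, NULL);
    fence->pending_value = 0;
}

/* Called after a Signal for `value` has been handed to the Vulkan queue. */
static void example_fence_update_pending(struct example_fence *fence, uint64_t value)
{
    pthread_mutex_lock(&fence->mutex);
    if (value > fence->pending_value)
        fence->pending_value = value;
    pthread_cond_broadcast(&fence->cond);
    pthread_mutex_unlock(&fence->mutex);
}

static bool example_fence_is_pending(struct example_fence *fence, uint64_t value)
{
    bool pending;

    pthread_mutex_lock(&fence->mutex);
    pending = fence->pending_value >= value;
    pthread_mutex_unlock(&fence->mutex);
    return pending;
}

/* Called by a queue's submission thread before it submits a Wait.
 * Returns as soon as the matching signal has been submitted, so the
 * Vulkan wait cannot end up queued ahead of its own signal. */
static void example_fence_block_until_pending(struct example_fence *fence, uint64_t value)
{
    pthread_mutex_lock(&fence->mutex);
    while (fence->pending_value < value)
        pthread_cond_wait(&fence->cond, &fence->mutex);
    pthread_mutex_unlock(&fence->mutex);
}
```

In the common case the signal has already been submitted by the time the
wait is processed, so example_fence_block_until_pending() returns without
sleeping.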
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The current code uses D3D12 abstractions to create pipelines but
issues raw Vulkan API calls to actually implement the functionality,
which means the code makes assumptions about the exact descriptor
set layout and push constant layout, which is generally a bad idea
now that we have multiple code paths for root constants etc.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Prepares for a rewrite of queue submission. The legacy path is never
run in practice and will likely break in subtle ways.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
On NVIDIA we sometimes fail to place images on a heap because the memory
region was dedicated. Only bother trying this if heap flags only allow
buffers.
Fixes a GPU crash in The Division.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>