Unblocks some games that request Feature Level 12.0, such as
Anno 1800, Monster Hunter World, and The Talos Principle. May
cause issues if games use unsupported features.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We no longer require coherent memory types, so we should
always prefer a HOST_CACHED memory type for the readback
heap as well as corresponding custom heaps.
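
A minimal sketch of the selection order described above, not the
actual vkd3d code. The ex_* names are hypothetical, and the flag
constants are local stand-ins that mirror the values of the Vulkan
VK_MEMORY_PROPERTY_* bits so the example builds without vulkan.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring the VK_MEMORY_PROPERTY_* bit values. */
enum
{
    EX_MEMORY_DEVICE_LOCAL  = 0x1,
    EX_MEMORY_HOST_VISIBLE  = 0x2,
    EX_MEMORY_HOST_COHERENT = 0x4,
    EX_MEMORY_HOST_CACHED   = 0x8,
};

/* Returns the first memory type allowed by type_mask whose flags
 * contain all bits in required, or -1 if there is none. */
static int ex_find_memory_type(const uint32_t *type_flags, size_t type_count,
        uint32_t type_mask, uint32_t required)
{
    size_t i;

    for (i = 0; i < type_count; i++)
    {
        if (!(type_mask & (1u << i)))
            continue;
        if ((type_flags[i] & required) == required)
            return (int)i;
    }
    return -1;
}

/* Readback heap: try HOST_VISIBLE | HOST_CACHED first and fall back
 * to plain HOST_VISIBLE. HOST_COHERENT is never required. */
static int ex_select_readback_type(const uint32_t *type_flags,
        size_t type_count, uint32_t type_mask)
{
    int index = ex_find_memory_type(type_flags, type_count, type_mask,
            EX_MEMORY_HOST_VISIBLE | EX_MEMORY_HOST_CACHED);

    if (index < 0)
        index = ex_find_memory_type(type_flags, type_count, type_mask,
                EX_MEMORY_HOST_VISIBLE);
    return index;
}
```

With non-coherent memory the caller is assumed to flush and
invalidate mapped ranges explicitly.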
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Works around an app-bug in SotTR, where the command pool is reset before
the command buffer completes.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
D3D12 supports out-of-order signal and wait. So do Vulkan timeline
semaphores. However, in Vulkan we don't have an infinite amount of
virtual queues. We must potentially map multiple D3D12 queues on top of
Vulkan, which might lead to a deadlock when an app attempts to
wait-before-signal if the two queues are mapped to the same physical
Vulkan queue.
In order to solve this, we need to hold back submissions until we know
it is safe to do so. To make this work in practice as simply as possible, each
ID3D12CommandQueue has its own submission thread, which will block on an
ID3D12Fence's pending timeline value for a Wait command. The main reason to use a
submission thread is that resolving this directly in
ID3D12CommandQueue::Signal is extremely tricky and potentially
requires locking queues and fences recursively.
Note that we only block on the pending wait value, not the actual wait
value, so there is no real CPU <-> GPU synchronization here. In the
common case, no submission thread will block.
The added benefit is that submits are async now, so main thread CPU
overhead might slightly decrease.
To play nice with DXGI swapchain, the external entry point for acquiring
the Vulkan queue needs to drain the submission thread and lock it to ensure
submissions happen in order.
Fixes hangs in The Division 1, which makes use of this D3D12 feature.
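
The hold-back rule can be modelled without threads. The sketch below is
a single-threaded simplification with hypothetical ex_* names: where the
real submission thread would block on the fence's pending value, this
model simply stops draining and returns.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fence. pending_value is the highest value for which a
 * signal has been submitted; the GPU may not have reached it yet. */
struct ex_fence
{
    uint64_t pending_value;
};

enum ex_command_type
{
    EX_COMMAND_WAIT,
    EX_COMMAND_SIGNAL,
    EX_COMMAND_EXECUTE,
};

struct ex_command
{
    enum ex_command_type type;
    struct ex_fence *fence; /* NULL for EX_COMMAND_EXECUTE */
    uint64_t value;
};

struct ex_queue
{
    const struct ex_command *commands;
    size_t count;
    size_t next; /* first command that has not been submitted yet */
};

/* Submits commands in order and stops at the first wait whose signal
 * has not been submitted yet. Returns the number of commands submitted
 * by this call. */
static size_t ex_queue_drain(struct ex_queue *queue)
{
    size_t submitted = 0;

    while (queue->next < queue->count)
    {
        const struct ex_command *cmd = &queue->commands[queue->next];

        /* Wait-before-signal: hold this queue back. */
        if (cmd->type == EX_COMMAND_WAIT
                && cmd->fence->pending_value < cmd->value)
            break;

        if (cmd->type == EX_COMMAND_SIGNAL
                && cmd->fence->pending_value < cmd->value)
            cmd->fence->pending_value = cmd->value;

        queue->next++;
        submitted++;
    }
    return submitted;
}
```

Only the pending value is checked, so a queue is released as soon as the
matching signal has been submitted, regardless of GPU progress.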
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The current code uses D3D12 abstractions to create pipelines but
issues raw Vulkan API calls to actually implement the functionality,
which means the code makes assumptions about the exact descriptor
set layout and push constant layout, which is generally a bad idea
now that we have multiple code paths for root constants etc.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Prepares for a rewrite of queue submission. The legacy path is never
run in practice and will likely break in subtle ways.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
On NVIDIA we sometimes fail to place images on a heap because the memory
region was dedicated. Only bother trying this if heap flags only allow
buffers.
Fixes a GPU crash in The Division.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Instead of taking the resource type, take the binding flag.
This allows us to also use this function for UAV counters.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We currently can't implement this in a meaningful way, but we
should return an empty blob in order to not crash applications.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We're going to need this to implement other parts of the
API, so it should be in common code.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
And expose the following feature cap on capable GPUs:
VPAndRTArrayIndexFromAnyShaderFeedingRasterizerSupportedWithoutGSEmulation.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Otherwise, if two built-in outputs share the same register, we
may end up with multiple redundant private variables, only one of
which gets initialized, leading to uninitialized outputs.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
And add a function to (re-)apply dynamic state as necessary. This
will allow us to ignore dynamic state not needed by the pipeline,
and may become necessary if we implement shader-based copies etc.
Currently unused; the following commits will subsequently change
state setting methods over.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Resident Evil 2 needs this method. Since we don't really have a concept
of explicit memory residency in Vulkan, we're fine not doing anything else.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This way we don't have to change all function parameter types
every time we upgrade the interface version.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We need a more extensible struct to contain the pipeline
descriptions in order to be able to support new rendering
features.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
RenderDoc will sometimes report extensions as unsupported, but still
fill out and accept the respective feature structs. Since we assume
extensions to be supported if the feature is enabled, we sometimes
try to use functionality that RenderDoc disables and crash.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Besides cleaning up the code, this also allows us to
use information about the available extensions earlier.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We'll add this to the root descriptor set since moving the binding
to one of the bindless sets would be hard to do; we'd need to track
the binding index of each "bindless" binding for set updates etc.
In order to stay within the limit of 8 sets, we also cannot introduce
a separate set for UAV counters (currently there are 6 bindless sets,
the static sampler set, and the root descriptor set).
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Also stores the type ID of the pointer to the UAV counter struct,
since we need to load the pointer before we can access the counter.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Greatly improves performance in various games that update or
copy a large number of descriptors per frame due to the high
overhead of pthread_mutex_{un}lock.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Allows us to have bindless UAVs without a special code path
for bindless UAV counters for now.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
All attachments must be at least as large as the framebuffer; using a
max operator is not compliant with Vulkan.
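
Since every attachment has to cover the framebuffer, the framebuffer
extent is bounded by the smallest attachment. A minimal sketch with
hypothetical ex_* names, assuming the extent is derived purely from the
attachment sizes:

```c
#include <stddef.h>
#include <stdint.h>

struct ex_extent
{
    uint32_t width;
    uint32_t height;
    uint32_t layers;
};

static uint32_t ex_min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* Takes the minimum over all attachments so that no attachment is
 * smaller than the resulting framebuffer. With count == 0 the result
 * stays at UINT32_MAX; the caller is expected to handle that case. */
static struct ex_extent ex_framebuffer_extent(
        const struct ex_extent *attachments, size_t count)
{
    struct ex_extent fb = { UINT32_MAX, UINT32_MAX, UINT32_MAX };
    size_t i;

    for (i = 0; i < count; i++)
    {
        fb.width = ex_min_u32(fb.width, attachments[i].width);
        fb.height = ex_min_u32(fb.height, attachments[i].height);
        fb.layers = ex_min_u32(fb.layers, attachments[i].layers);
    }
    return fb;
}
```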
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
NVIDIA currently seems to have some issues with bindless CBV on Vulkan,
which have been reported. Somehow, bindless SSBO works around a black
screen on SotTR, as well as some rendering glitches on Control.
AMD won't care since UBOs and SSBOs are basically the same thing.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
This method is meant to process d3d12 device caps after the
device itself has been fully initialized. This helps avoid
code duplication in certain instances and guarantees that we
know about all enabled Vulkan features.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Meant to bundle all d3d12 feature caps and options, of which
we're going to have to add more over time.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Used as a fallback for older NVIDIA generations which do not
support bindless uniform buffers.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Logically split up descriptor pool allocation into three types:
- STATIC: Root descriptors and internal allocation.
- VOLATILE: For the packed descriptor set, which comes from heaps.
- IMMUTABLE_SAMPLER: For immutable samplers. This should be removed once
  we start allocating sampler sets at sampler creation time.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Enables us to unroll root constant buffers. Not loading unneeded
components early may also help some drivers generate better code.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Having a lot of special code here just makes it harder for
us to implement a UBO-specific load path, not to mention that
the mov instruction itself is very rare.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
If we detect that a blob contains a DXIL chunk, use dxil-spirv to
compile the shader to SPIR-V if it is enabled in the build.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
For now this is enabled based on device capabilities, but future changes
may require this to be disabled for certain root signatures.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Will be used on implementations that do not support enough
push constants to hold all root signature data.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Avoids having to use 16-byte array strides when using an
inline uniform block to store the table offsets.
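
For illustration: under std140 rules every element of an array in a
uniform block is padded to a 16-byte stride, so an array of N 32-bit
offsets occupies 16 * N bytes. The sketch below compares that with
packing four offsets into each 16-byte element. That packing scheme is
an assumption made for the example, and the ex_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_STD140_ARRAY_STRIDE 16u

/* Size of "uint offsets[count]" in a std140 block: each element is
 * padded to the 16-byte array stride. */
static size_t ex_scalar_array_size(size_t count)
{
    return count * EX_STD140_ARRAY_STRIDE;
}

/* Size of "uvec4 offsets[(count + 3) / 4]": four offsets share one
 * 16-byte element. */
static size_t ex_packed_array_size(size_t count)
{
    return ((count + 3) / 4) * EX_STD140_ARRAY_STRIDE;
}

/* Locates offset i inside the packed array as a vector index plus a
 * component index within that vector. */
static void ex_packed_index(size_t i, size_t *vector, size_t *component)
{
    *vector = i / 4;
    *component = i % 4;
}
```

For six table offsets this is 96 bytes with scalar elements and 32
bytes when packed.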
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
When changing tables that only have bindless descriptors,
only update the push constants instead.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We still need to pass binding info for each range to the shader compiler,
but bindless ranges will not contribute to the packed descriptor count.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Don't enable any bindless features for now so that we don't
introduce regressions as features get added.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Now that the binding code no longer makes any wild assumptions about
the exact binding layout, we can safely do this. Will make implementing
bindless a bit easier.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Uses the new data structures to iterate over descriptor
tables and populate the packed descriptor set.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Further changes will require a rework of how resource binding
works inside a command list, so for now, this is just a cleanup
that also removes some old code that is no longer needed.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Static samplers are embedded in the root signature, so we can create
a separate descriptor set layout and descriptor set which we only
need to rebind when the root signature itself changes.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Updates the root descriptor set or push descriptor at draw time.
This fixes a potential issue with shader-based clear/copy commands
invalidating previously bound root descriptors.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
- descriptor_index is the index of the descriptor within the packed
descriptor set or root descriptor set. Currently unused.
- binding_index should now index into the root_signature->bindings array.
- vk_set and vk_binding refer to the Vulkan descriptor set index and
binding number of packed descriptors or root descriptors.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Allows us to more easily refactor root signature-related code
without having to worry about root descriptors for now.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
All vkDestroy* functions are defined to perform no operation when passed
a null handle. vkd3d_free should follow regular free semantics.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Removes some unused counters and repurposes the existing ones to
differentiate between bindings (i.e. the array passed to the shader
compiler) and packed descriptors.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Fixes an issue where push constants can be invalidated by
shader-based clear/copy commands.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Uses one push constant range with VK_SHADER_STAGE_ALL. This
will allow us to easily add descriptor table offsets as push
constants.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This modifier can be applied to both destination and source
operands, so for the sake of simplicity and to avoid having
to pass down modifier information explicitly, just store this
state with the register.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
An upcoming change to the binding model will use these to
initialize descriptors that have the wrong resource type
bound, or were left uninitialized by the application.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Otherwise we might run into undefined behaviour if an app
tries to read a NULL UAV or perform atomic operations.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
The primary purpose of this function was to invalidate UAV
counters upon binding a pipeline. This is no longer an issue
and we don't have any other per-pipeline bindings, so this
function can be dropped.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Makes UAV-related code more readable and supports up to 64
UAV bindings, which is enough to support resource binding
tier 2.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This needs a major rework as the current implementation has bugs,
is hard to reason about, and very hard to maintain as we're about
to make major changes to the binding model as a whole.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We're going to need more capabilities outside the 0-63 range
going forward, so a bitmask doesn't cut it and adding extra
struct members for each capability seems excessive.
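
One way to store capability values that do not fit a 64-bit mask is a
small de-duplicating array. This is only a sketch of that idea with
hypothetical ex_* names; the container actually used by this change is
not shown here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct ex_capability_set
{
    uint32_t *values;
    size_t count;
    size_t capacity;
};

static bool ex_capability_set_contains(const struct ex_capability_set *set,
        uint32_t cap)
{
    size_t i;

    for (i = 0; i < set->count; i++)
    {
        if (set->values[i] == cap)
            return true;
    }
    return false;
}

/* Adds cap unless it is already present. Returns false only if the
 * array could not be grown. */
static bool ex_capability_set_add(struct ex_capability_set *set, uint32_t cap)
{
    if (ex_capability_set_contains(set, cap))
        return true;

    if (set->count == set->capacity)
    {
        size_t new_capacity = set->capacity ? set->capacity * 2 : 8;
        uint32_t *new_values = realloc(set->values,
                new_capacity * sizeof(*new_values));

        if (!new_values)
            return false;
        set->values = new_values;
        set->capacity = new_capacity;
    }

    set->values[set->count++] = cap;
    return true;
}

static void ex_capability_set_free(struct ex_capability_set *set)
{
    free(set->values);
    set->values = NULL;
    set->count = 0;
    set->capacity = 0;
}
```

Lookup is linear, which is assumed to be acceptable because a shader
only ever enables a handful of capabilities.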
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Resource index is found in idx[0] in SM 5.0, but idx[1] when using SM
5.1, and register space is encoded separately. An rb_tree keeps track of
the internal resource index idx[0] and can map that to space/binding as
required when emitting SPIR-V.
For this to work, we must also make UAV counters register space aware.
In the earlier implementation, the UAV counter mask was assumed to
correlate 1:1 with register_index, which breaks on SM 5.1.
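
The lookup from internal resource index to space/binding can be
sketched as follows. For brevity this uses a sorted array with
bsearch() as a stand-in for the rb_tree; the struct and function names
are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct ex_resource_info
{
    uint32_t resource_idx;   /* idx[0]: internal resource index */
    uint32_t register_space;
    uint32_t register_index; /* binding within the register space */
};

/* bsearch() comparator: the key is a resource index, the element is a
 * table entry. */
static int ex_compare_resource(const void *key, const void *elem)
{
    uint32_t idx = *(const uint32_t *)key;
    const struct ex_resource_info *info = elem;

    return (idx > info->resource_idx) - (idx < info->resource_idx);
}

/* table must be sorted by resource_idx. Returns NULL if the index was
 * never declared. */
static const struct ex_resource_info *ex_find_resource(
        const struct ex_resource_info *table, size_t count,
        uint32_t resource_idx)
{
    return bsearch(&resource_idx, table, count, sizeof(*table),
            ex_compare_resource);
}
```

Keying UAV counter state on the same internal index rather than on
register_index avoids the 1:1 assumption that breaks on SM 5.1.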
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
On Windows, it is not ideal to rely on Vulkan being available as a
linkable library, as a full install of the Vulkan SDK must be present
and set up. Be friendly and load Vulkan dynamically instead.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Cannot disable VK_EXT_descriptor_indexing as we relied on internal
behavior in RADV related to global_bo_list. Implementing bindless
properly in vkd3d will solve this correctly.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Otherwise, we declare certain input control points twice in shaders that
access them in a fork phase, which is not allowed as per Vulkan spec:
"Any two inputs listed as operands on the same OpEntryPoint must not
be assigned the same location, either explicitly or implicitly"
Fixes invalid SPIR-V and resulting RADV driver crashes in Metro Exodus.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Greatly reduces the VA allocations we have to make, makes returned VAs
more sensible, and better matches the VAs we see on native drivers.
D3D12 usage flags for buffers seem generic enough that there is no
obvious benefit to place smaller VkBuffers on top of VkDeviceMemory.
Ideally, physical_buffer_address is used here, but this works as a good
fallback if that path is added later.
With this patch and previous VA optimization, I'm observing a 2.0-2.5%
FPS uplift on SOTTR when CPU bound.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The alignments are now checked in d3d12_resource_validate_desc().
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
This would cause CoreValidation-Shader-InterfaceTypeMismatch validation
errors from Wine's test_shader_interstage_interface() d3d11 test. This
reverts parts of commits 1eb7eca411 and
04ec461fb4.
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
ID3D12GraphicsCommandList2 and WriteBufferImmediate() are used by
Hitman 2, but implementing the function on top of an AMD extension has
no effect on game behaviour. It's commonly used to write debug info.
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
This method was missing in version 10.0.15063.0 of the SDK, but is
present in version 10.0.18362.0, without a UUID change. Presumably that
means this was simply an omission in the older header, rather than an
API change in the newer header.
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
The right place for alignment validation is d3d12_resource_validate_desc().
The mod alignment test, which returns a size of ~0 on failure, is incorrect
on systems where Vulkan requires alignments of 0x20000 or more, and breaks
Hitman 2, which uses the returned value unchecked and allocates heaps of
0xffffffff bytes.
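
To illustrate the failure: assuming the removed test had the form
"requested alignment modulo required alignment", a D3D12 placement
alignment of 0x10000 checked against a Vulkan requirement of 0x20000
leaves a remainder of 0x10000 and is rejected. The ex_* helpers below
are hypothetical, and the max-based variant is just one possible
alternative, not necessarily what this change does.

```c
#include <stdint.h>

#define EX_INVALID_SIZE (~(uint64_t)0)

/* Assumed form of the removed check: reports ~0 whenever the requested
 * alignment is not a multiple of the required one. */
static uint64_t ex_old_allocation_size(uint64_t size,
        uint64_t requested_alignment, uint64_t required_alignment)
{
    if (requested_alignment % required_alignment)
        return EX_INVALID_SIZE;
    return size;
}

/* Possible alternative: report the stricter of the two alignments
 * instead of failing. */
static uint64_t ex_effective_alignment(uint64_t requested_alignment,
        uint64_t required_alignment)
{
    return requested_alignment > required_alignment
            ? requested_alignment : required_alignment;
}
```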
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Hitman 2 calls GetHeapProperties() for each swapchain buffer and checks if
the creation node mask is 1. If not, it fails to store the resource
pointers for later rendering.
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>