The fix which enabled waveops detection broke HZD, since we never tested
with that feature enabled.
Keep it disabled until we can figure out what is going on.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
USE_PUSH_DESCRIPTORS may be misleading since it would be set even when
we're not using push descriptors at all due to root descriptors being
passed in via VAs. Instead, make the flag represent whether or not we
use a regular descriptor set for root parameters.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We need to know the supported shader model to detect support
for certain features like wave ops correctly.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Previously this would make the user buffer count == 0, which obviously makes apps and assertions not happy.
Fixes a crash in Horizon Zero Dawn when minimized (therefore having a degenerate surface region)
Signed-off-by: Joshua Ashton <joshua@froggi.es>
The packed descriptor index is no longer needed, and causes issues in
case a game sets a root signature, then binds a root descriptor, and
then sets a different root signature which maps the given root parameter
index to a different descriptor since we may now read undefined data
when updating push descriptors.
Fixes #366.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
MSDN states that root signatures across multiple stages in a graphics
pipeline must be identical, but the D3D12 runtime does not validate
this and mixing different root signatures results in undefined
behaviour, so just taking this from the VS should be safe.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We only need to know the pipeline layout for pipeline variant
creation. We are not holding a strong reference to the root
signature anyway, which may be problematic, but this should
not introduce a regression.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Offset buffer state might be the only relevant difference between two
descriptors. In that case we don't need to copy the descriptors, but we
must still copy the offsets.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Otherwise, we may run into issues with an app accessing stale resources
or pointers. NULL descriptors are handled in OMSetRenderTargets.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
The struct definitions were identical anyway, and unifying
these will prevent unnecessary code duplication.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
The only currently known use case for this requires us to actually
perform the dispatch operation. Executing more than one indirect
dispatch command is not meaningful, however there might be
differences in behaviour in case the indirect count is zero.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This logic has to be the same as in d3d12_command_list_update_descriptor_table_offsets,
since not all active descriptor tables are necessarily used by the root signature.
Fixes an assert in the StarsX IrradianceMap demo (Github issue #347).
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This makes headers a dependency rather than a generator target.
This also means we get proper dependency tracking of them between projects.
Supersedes: #225
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Rename the .so objects we build so we don't conflict with vkd3d and don't
accidentally attempt to be built against Wine natively (it won't work).
Not quite ready for a 2.0 release yet, but bump the version to reflect
the intent. This creates a new timeline, completely separate from vkd3d.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The version string is used in logging for information purposes, but pipeline blobs and libraries use a uint64_t-based commit hash. Using a fixed-size integer silences warnings about string length and makes storing build info a little more efficient.
The hash is obtained separately from the version string and is shifted to the left by 4 bits if the working tree is dirty.
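A minimal sketch of the shift described above. The helper name make_build_id and the idea of passing the dirty state as a bool are assumptions for illustration; the commit itself only states that the hash is shifted left by 4 bits for a dirty tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the tree: returns the 64-bit build
 * identifier. A dirty working tree shifts the commit hash left by 4 bits,
 * so it no longer compares equal to the clean build of the same commit. */
static uint64_t make_build_id(uint64_t commit_hash, bool dirty)
{
    return dirty ? commit_hash << 4 : commit_hash;
}
```

Blobs written by a dirty build then carry a different identifier than blobs written by a clean build of the same commit.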
Signed-off-by: Krzysztof Bogacki <krzysztof.bogacki@leancode.pl>
We will not have offset information for root descriptors, so
we can still only use them with four-byte aligned SSBOs.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Introduces 'extra' bindings to bindless sets which can be used to
bind additional storage buffers to the pipeline, which will occur
before the bindless descriptor array in the descriptor set.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We cannot rely on alignment analysis since games are buggy and screw up
RAW vs structured on occasion.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
If the image itself is sRGB or some other format that does not support
STORAGE, we need this flag.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
This can happen on Windows when windows are minimized.
Might not happen in winevulkan, but Vulkan spec outlines this Win32 case
explicitly and it happens on native Windows.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
It is considered a "success", in that fences must be signalled, so make
sure we wait and reset it so we don't risk calling vkAcquireNextImageKHR
later with an already signalled fence.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Only way to implement a D3D12 swapchain.
For now, disable compute paths, we'll introduce it properly after refactor.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Relevant for swapchain since a swapchain resource can be presented right
away without ever having been touched by an API call.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
It is broken by design and won't be needed by a swapchain
implementation which uses user buffers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Buffer views do not necessarily cover the entire resource, so we
should not spawn more workgroups than necessary to clear the view.
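As a rough illustration, the workgroup count can be derived from the view's element count with a round-up division. The function name and the workgroup size passed in are assumptions; the actual clear shader may use a different size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of workgroups needed to cover exactly
 * view_element_count elements, rounding up so a partial group is kept.
 * Written without "count + size - 1" to avoid 32-bit overflow. */
static uint32_t clear_workgroup_count(uint32_t view_element_count,
        uint32_t workgroup_size)
{
    assert(workgroup_size != 0);
    return view_element_count / workgroup_size
            + (view_element_count % workgroup_size != 0);
}
```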
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This will allow us to use the same bindless descriptor set for
different types of descriptor ranges.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This is no longer performance-critical, so in order to simplify changing
the binding model, remove hard-coded descriptor set numbers and instead
look them up based on the requested descriptor properties.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Ignore any indexed draw calls which use a NULL index buffer.
This is not fully correct, but there is no easy way to emulate D3D12
behavior exactly.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
We cannot compare resource pointers or view pointers,
since the pointers might have been recycled.
This leads to a scenario where we're not updating descriptors we're
supposed to, and the GPU reads a stale descriptor.
Fixes a GPU hang in Death Stranding (and possibly lots of other weird
crashes as well).
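The commit text does not say what is compared instead of pointers. One possible approach, sketched here purely as an assumption, is to give every view a unique, never-reused cookie and compare cookies. All names below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view object carrying a unique identifier. */
struct view_sketch
{
    uint64_t cookie;
};

/* Monotonic counter; a real implementation would need this to be atomic. */
static uint64_t next_cookie = 1;

static void view_init(struct view_sketch *view)
{
    view->cookie = next_cookie++;
}

/* What the descriptor remembered about the view it was last written with. */
struct cached_descriptor
{
    uint64_t view_cookie;
};

/* A recycled allocation gets a fresh cookie, so a stale cache entry is
 * detected even when the pointer value happens to be identical. */
static bool descriptor_needs_update(const struct cached_descriptor *cached,
        const struct view_sketch *view)
{
    return cached->view_cookie != view->cookie;
}
```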
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
For correctness, we will need to defer any initial resource state
handling to the queue timeline. Here, we will build an UNDEFINED ->
common layout barrier if (and only if):
- The resource is marked to care about initial layout transition.
- We are the first queue thread to observe that initial_transition
member is 1 (atomic exchange).
- The first use of the resource was not marked to be a discard.
E.g., if the first use of the resource is an alias barrier, we must
not emit an early barrier. The only thing we should do here is to clear the
initial_transition member, and leave it like that.
A command list maintains a list of d3d12_resources which *might* need a
transition. For the first frame a resource is used (or so), it will not
have the flag cleared yet, so multiple command lists might add the
d3d12_resource to its own transition list. This is fine, as the queue
will resolve it.
If multiple queues see the same initial transition, there might be
shenanigans, but the application must ensure there is either a
submission boundary or fence boundary between the uses. Any initial
layout transition will only be submitted after a Wait() is observed, as
submission of the transition command buffer will be in-order with other
submissions.
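The three conditions above can be sketched with C11 atomics as follows. The struct layout and function name are assumptions; only the initial_transition member and the atomic exchange are taken from the description.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for d3d12_resource. */
struct resource_sketch
{
    atomic_uint initial_transition;   /* 1 while the barrier is pending */
    bool cares_about_initial_layout;
};

/* Returns true if the calling queue thread must emit the
 * UNDEFINED -> common layout barrier for this resource. */
static bool queue_needs_initial_barrier(struct resource_sketch *res,
        bool first_use_is_discard)
{
    if (!res->cares_about_initial_layout)
        return false;

    /* Always clear the flag. Only the thread that observed 1 is the
     * first observer; everyone else sees 0 and backs off. */
    if (atomic_exchange(&res->initial_transition, 0u) != 1u)
        return false;

    /* A discarding first use (e.g. an alias barrier) only clears the
     * flag and must not emit an early barrier. */
    return !first_use_is_discard;
}
```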
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
An optimization and a requirement in D3D12. Clearing out an image
through a copy is considered enough to satisfy the requirement to acquire an
alias in the advanced usage model.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Just drop the VkSubpassDependency in this case to satisfy the validator,
since stages == 0 is not allowed.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Use a default format if there is no format specified.
Otherwise, the call fails on both Wine and DXVK DXGIs.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
When building natively on Windows we use dllexport/dllimport for vkd3d/vkd3d_utils public exports.
When building natively on Linux we simply make those visibility default.
Nothing changes for standalone here.
Closes #152
Signed-off-by: Joshua Ashton <joshua@froggi.es>
On systems without extended dynamic state, or for certain pipelines,
it is possible for vk_pso_cache to be VK_NULL_HANDLE, so we need to
check for this during serialization.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This is used extensively by Horizon Zero Dawn, and allows us
to skip the compile screen after the initial first run.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Unused now, instead we should implement D3D12 caching primitives
correctly and rely on the Vulkan driver otherwise.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
... if we have dirty vbo slots left.
Fixes textures when inspecting items in the inventory in RE2 and RE3.
Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
There is no resource state associated with this, so emit the barrier at
the end of a command buffer based on trivial tracking.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Need to handle large (> 4G) jumps in timeline value, which is not
supported by all implementations.
There is no good way to handle that, so rewrite and clean up timeline
semaphore handling by separating the timeline into a virtual timeline
(which can rewind and jump around arbitrarily) and a physical timeline
which increments by one each time.
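A simplified sketch of the split, assuming a small fixed-size list of pending signals. The data structure, names and lookup policy are illustrative assumptions, not the actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 16

/* Hypothetical bookkeeping: each signal pairs the application-visible
 * (virtual) value with a physical value that only ever grows by one. */
struct timeline_sketch
{
    uint64_t physical_value;
    uint64_t virtual_values[MAX_PENDING];
    uint64_t physical_values[MAX_PENDING];
    size_t count;
};

/* Records a signal and returns the physical value to signal on the
 * underlying semaphore. The virtual value may jump or rewind freely. */
static uint64_t timeline_signal(struct timeline_sketch *t, uint64_t virtual_value)
{
    assert(t->count < MAX_PENDING);
    t->physical_value++;
    t->virtual_values[t->count] = virtual_value;
    t->physical_values[t->count] = t->physical_value;
    t->count++;
    return t->physical_value;
}

/* Finds the earliest pending signal that satisfies a wait for
 * virtual_value. Returns 0 if no pending signal satisfies it. */
static uint64_t timeline_physical_for_wait(const struct timeline_sketch *t,
        uint64_t virtual_value)
{
    size_t i;

    for (i = 0; i < t->count; i++)
    {
        if (t->virtual_values[i] >= virtual_value)
            return t->physical_values[i];
    }
    return 0;
}
```

With this split, a jump of more than 4G in the virtual value still advances the physical semaphore by exactly one.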
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
These memory types might end up being used as fallback memory types,
which is problematic due to their tiny sizes, and unexpected performance
behavior. Generally, when we want to fall back, we should cleanly fall
back to system memory rather than a different device local type.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Manages unique static samplers for now, in order to reduce duplicates.
Can be extended to also manage descriptor pools for static samplers in
the future.
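The deduplication idea can be sketched with a small linear cache. The key fields, the capacity and the function name are assumptions made for illustration; a real cache would key on the full sampler description and store Vulkan handles.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SAMPLERS 32

/* Hypothetical, reduced sampler description used as the lookup key. */
struct sampler_key
{
    uint32_t filter;
    uint32_t address_mode;
};

struct sampler_cache
{
    struct sampler_key keys[MAX_SAMPLERS];
    size_t count;
};

/* Returns the index of the unique sampler matching key, adding a new
 * entry on first use. Returns -1 if the cache is full. */
static int sampler_cache_get(struct sampler_cache *cache, struct sampler_key key)
{
    size_t i;

    for (i = 0; i < cache->count; i++)
    {
        if (cache->keys[i].filter == key.filter
                && cache->keys[i].address_mode == key.address_mode)
            return (int)i;
    }

    if (cache->count == MAX_SAMPLERS)
        return -1;

    cache->keys[cache->count] = key;
    return (int)cache->count++;
}
```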
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
RenderDoc does not support external_memory_host yet, and these heaps are
generally only used for debugging, so we should be able to get away with
this in practice.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
D3D12 allows much larger pools to be created for heaps that are not
shader-visible, which some games make use of. Fixes crashes on Nvidia.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Not just the shader visible ones, since we'll be using Vulkan
descriptor set copies to implement D3D12 descriptor copies.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Stores info about where exactly the descriptor is stored in the
Vulkan descriptor pool, and whether we have to worry about an
additional UAV counter descriptor.
This is meant to replace all the other non-static data stored
inside d3d12_desc.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We're not using these anywhere because we need formats to be correct
for image views. Buffer views are used for root descriptors and null
UAV counters.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Fixes a crash on drivers that don't support null descriptors.
Image UAVs and other descriptor types cannot have counters.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Useful to measure submission times, as well as time spent acquiring the
Vulkan queues. This correlates 1:1 with swapchain as well, so it's
useful when we want to get some "X / frame" metrics.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
There are two advantages of doing it like this:
- When profiling is not enabled, we get no overhead for device calls.
- Avoids cluttering up the main implementation.
Disadvantage is that rolling inherited vtables like this is quite
disgusting, but this is C, what you gonna do ...
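A reduced sketch of the pattern: a second vtable whose entries wrap the plain implementation, selected once at device creation. All names are hypothetical, and the counter stands in for real timing code.

```c
struct device;

/* Hypothetical one-entry vtable standing in for the device interface. */
struct device_vtbl
{
    int (*create_thing)(struct device *dev, int arg);
};

struct device
{
    const struct device_vtbl *vtbl;
    unsigned int profiled_calls;
};

/* Main implementation, free of any profiling code. */
static int device_create_thing(struct device *dev, int arg)
{
    (void)dev;
    return arg * 2;
}

/* Profiled variant: records the call, then forwards to the main one. */
static int device_create_thing_profiled(struct device *dev, int arg)
{
    dev->profiled_calls++;
    return device_create_thing(dev, arg);
}

static const struct device_vtbl plain_vtbl = { device_create_thing };
static const struct device_vtbl profiled_vtbl = { device_create_thing_profiled };

/* The vtable is chosen once, so unprofiled devices pay no overhead. */
static void device_init(struct device *dev, int enable_profiling)
{
    dev->vtbl = enable_profiling ? &profiled_vtbl : &plain_vtbl;
    dev->profiled_calls = 0;
}
```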
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Calling this from CopyDescriptorsSimple on its own is a bad idea given it's __stdcall and GCC doesn't like optimizing that.
Also marked it as inline, since it can easily be optimized in the context of CopyDescriptorsSimple.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Manually uses QPC if the Vulkan implementation does not support
the QPC domain by itself.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
When we're using extended dynamic state, we will often end up with dummy
pipeline binds, which we should try to avoid if we can.
Also avoids having to rebind dynamic state redundantly.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Cleans up dynamic state such that we do not have to keep dynamic state
create infos around.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fall back when there is a mismatch, which can happen if application does
not declare inputs to hull shader (unlikely).
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
When using EXT_extended_dynamic_state, we will be able to compile a
master pipeline. Only in special cases will we have to fall back.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
This can happen in the worst case where we have all bindless sets, and:
- Static samplers
- Packed descriptors (UAV counters on drivers without support for this)
- Root descriptors
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
vkd3d-shader is currently kinda buggy and crashes when you try to trace
DXBC. This used to never be run since it was guarded by
VKD3D_SHADER_DEBUG, but with the move to a static build we merged all
debug logging under VKD3D_DEBUG. Reintroduce different debug channels in
a way that is compatible with a statically linked vkd3d.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Otherwise, if a render pass gets suspended twice in a row, we
never emit the barrier because render_pass_suspended will be
set to false the second time.
Fixes validation errors in Hitman 2.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
The output here is actually for secure presentation and restricting a swapchain to a certain output.
Correctly handle NULL (desktop) targets that we used to have.
Fixes crashes with titles that use fullscreen via an initial fullscreen desc.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
We need these for the upcoming swapchain factory implementation
for standalone D3D12.
They're also probably good to have around in future for the
d3d12 device.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
This reverts commit 0384362065.
It is not allowed to use RS 1.1 serialization for the non-versioned
entry point. RS 1.1 serialization must use the versioned entry point.
Reverting this fixes the relevant test case in d3d12.c:12522.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Saves a few CPU cycles. We expect things to explode anyway when
the app uses a non-UAV descriptor as a UAV in the shader.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>