We will need separate descriptor sets to be able to handle typed vs
untyped buffer workarounds.
Also writes multiple descriptors for buffer views to make sure MUTABLE
and SSBO sets are filled (or TEXEL_BUFFER + SSBO for non-mutable).
Applications often get this wrong and use raw buffer in shader where
typed view was written and vice versa.
To mitigate this, just write a typed and untyped view together.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The first range will store the byte offset, the second one will
be the typed buffer range. Typed descriptors should write both.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Co-authored-by: Hans-Kristian Arntzen <post@arntzen-software.no>
This begins the refactor toward letting us use both texel buffer and
SSBO descriptors for typed buffers, which is a better workaround than
force_bindless_texel_buffers.
In this new approach, we store a mask in metadata instead of
set/binding.
When copying a descriptor, we will iterate over the masks and look up
binding directly from device->bindless_state.set_info[].
The mask is represented in terms of info index rather than set index to
avoid needless lookups. Add some new helpers to make this process
easier.
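
A minimal sketch of the mask walk described above, assuming a simplified
stand-in for the entries of set_info[]. The struct layout and the helper
name collect_bindings are hypothetical, not the actual helpers added here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of bindless_state.set_info[]. */
struct set_info
{
    uint32_t set_index;
    uint32_t binding_index;
};

/* Walks every bit set in mask (bit N refers to infos[N]) and stores the
 * binding of each referenced entry in out_bindings, lowest index first.
 * Returns the number of bindings written. */
unsigned int collect_bindings(uint32_t mask, const struct set_info *infos,
        uint32_t *out_bindings)
{
    unsigned int count = 0;

    assert(infos && out_bindings);

    while (mask)
    {
        unsigned int info_index = 0;

        /* Find the lowest set bit; the mask is non-zero here. */
        while (!((mask >> info_index) & 1u))
            info_index++;

        out_bindings[count++] = infos[info_index].binding_index;
        mask &= mask - 1u; /* Clear the bit we just handled. */
    }

    return count;
}
```

Because the mask is indexed by info index, the loop can read the binding
straight from the array without first translating a set index.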
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
We currently never reset occlusion queries. For some reason,
validation layers do not report this.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Unnecessary because the UAV counter buffer is a host memory
allocation anyway in case of host-only descriptor heaps, so
we will not read from uncached memory.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
When reading GPU hang dumps, we can figure out what happened to
descriptor types along the way.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Caused crash when using a driver that did not support
mutable_descriptor_type.
Was using the wrong enum bitfields ... Sigh, type safe enums would be nice.
Regression caused during refactor in review most likely.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The creation infos use the format, which potentially contains other
information as well.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
By resetting query pools in advance, we can reduce the number of
stalls between draw calls in passes with occlusion queries, which
is currently causing serious performance issues in some games.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Since we'll be inserting lots of single queries, we want to
avoid having to resize the range array since that is an O(n)
operation at worst.
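
One common way to keep single-entry appends cheap is geometric capacity
growth, sketched below. This illustrates the general technique only; the
type and function names are hypothetical and the approach actually taken
in this change may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct query_range
{
    uint32_t first;
    uint32_t count;
};

struct query_range_array
{
    struct query_range *ranges;
    size_t count;
    size_t capacity;
};

/* Appends one range. Capacity doubles when exhausted, so the O(n)
 * reallocation happens rarely and appends are O(1) amortized.
 * Returns 0 on success, -1 on allocation failure. */
int query_range_array_push(struct query_range_array *array,
        uint32_t first, uint32_t count)
{
    assert(array);

    if (array->count == array->capacity)
    {
        size_t new_capacity = array->capacity ? array->capacity * 2 : 16;
        struct query_range *new_ranges;

        new_ranges = realloc(array->ranges, new_capacity * sizeof(*new_ranges));
        if (!new_ranges)
            return -1;

        array->ranges = new_ranges;
        array->capacity = new_capacity;
    }

    array->ranges[array->count].first = first;
    array->ranges[array->count].count = count;
    array->count++;
    return 0;
}
```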
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
The common case is that we find an entry, so taking a writer lock should
be the rare case. We need to optimize for the case where the application
hammers the view map with e.g. buffers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Official AMD drivers do not support VK_EXT_conditional_rendering,
so we'll use indirect draws instead to emulate the feature.
This also handles 64-bit predicates in combination with the
Vulkan extension, which was not possible previously.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
The idea is to use indirect draws and dispatches to implement
predication. For predicated indirect draws, we'll use indirect
count.
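
As a CPU-side illustration of the idea, the logic that a predicate
resolve step would apply might look like the sketch below. The function
names and the skip_if_zero flag are hypothetical; in practice this work
would run on the GPU and write into an indirect argument buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Same field order as VkDrawIndirectCommand. */
struct draw_args
{
    uint32_t vertex_count;
    uint32_t instance_count;
    uint32_t first_vertex;
    uint32_t first_instance;
};

static bool predicate_skips(uint64_t predicate, bool skip_if_zero)
{
    return skip_if_zero ? predicate == 0 : predicate != 0;
}

/* Direct draw turned into an indirect draw: the arguments are either
 * copied through or zeroed, which makes the draw a no-op. */
void resolve_predicated_draw(uint64_t predicate, bool skip_if_zero,
        const struct draw_args *in, struct draw_args *out)
{
    assert(in && out);

    if (predicate_skips(predicate, skip_if_zero))
        memset(out, 0, sizeof(*out));
    else
        *out = *in;
}

/* Indirect draw: the argument buffer is left alone and only the draw
 * count consumed by the indirect-count variant is forced to zero. */
uint32_t resolve_predicated_draw_count(uint64_t predicate, bool skip_if_zero,
        uint32_t draw_count)
{
    return predicate_skips(predicate, skip_if_zero) ? 0 : draw_count;
}
```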
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Potentially avoids some unnecessary host memory access. Use BDA for
the compute shader so that we can ignore alignment restrictions on
some GPU architectures.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Command lists may need to allocate temporary device memory for
certain operations. In order to avoid frequent alloc/free calls,
we'll recycle these scratch buffers until a certain threshold.
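
A minimal sketch of such a recycling pool, using malloc/free as a
stand-in for device memory allocation. The names and the threshold value
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_CACHED_SCRATCH_BUFFERS 4u /* Hypothetical threshold. */

struct scratch_pool
{
    void *cached[MAX_CACHED_SCRATCH_BUFFERS];
    unsigned int cached_count;
    size_t buffer_size;
};

/* Reuses a cached buffer if one is available, otherwise allocates. */
void *scratch_pool_acquire(struct scratch_pool *pool)
{
    assert(pool);

    if (pool->cached_count)
        return pool->cached[--pool->cached_count];

    return malloc(pool->buffer_size);
}

/* Keeps the buffer for reuse until the threshold is reached; beyond
 * that, the buffer is actually freed. */
void scratch_pool_release(struct scratch_pool *pool, void *buffer)
{
    assert(pool);

    if (pool->cached_count < MAX_CACHED_SCRATCH_BUFFERS)
        pool->cached[pool->cached_count++] = buffer;
    else
        free(buffer);
}
```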
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Realign VBO strides and offsets if we have to, for the sake of
robustness. Violating these rules is against D3D12 spec, but it does not
cause crashes on native drivers. On RDNA we can hit hangs with unaligned
vertex attributes. It appears that native drivers apply some kind of
fixup here to avoid the crash, even if the result is not what we expect.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
BDA cannot map to their hardware, and we observe a large performance
loss in games which use root CBVs. For this reason, fall back to push
descriptors here.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Ensures that queries are always available and initialized
in the correct order on the GPU timeline.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Game renders the map with wrong descriptor type, which means we must
implement everything as texel buffers to make this work.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
We have observed a lot of large GPU bubbles when using back-to-back
timeline semaphores to synchronize GPU submissions. Use prebaked
pipeline barrier command buffers instead.
To resolve queue sparse serialization, use two binary semaphore pairs.
There is no need to use timeline semaphores in this case.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
This is undefined behaviour in SPIR-V, but well-defined in
DXBC, so we should explicitly 'and' the shift amount with 31.
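
The same hazard exists in C, where shifting a 32-bit value by 32 or more
is undefined, so the masking can be illustrated there. The helper names
below are hypothetical.

```c
#include <stdint.h>

/* Only the low 5 bits of the shift amount are used, so a shift by 32
 * behaves like a shift by 0 instead of being undefined. */
uint32_t dxbc_ishl(uint32_t value, uint32_t shift)
{
    return value << (shift & 31u);
}

uint32_t dxbc_ushr(uint32_t value, uint32_t shift)
{
    return value >> (shift & 31u);
}
```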
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Can just use uvec2. Also improves performance on ACO since ACO cannot
promote uint64_t to SGPR yet, u32x2 however, works fine and can be
bitcast to pointer as well.
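
For illustration, the equivalent split and join of a 64-bit address into
two 32-bit words looks like this in C. The sketch assumes the low word
lives in the first component, and the names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for a shader-side uvec2 holding a 64-bit address. */
struct u32x2
{
    uint32_t x; /* Low 32 bits. */
    uint32_t y; /* High 32 bits. */
};

struct u32x2 va_split(uint64_t va)
{
    struct u32x2 v;

    v.x = (uint32_t)(va & 0xffffffffu);
    v.y = (uint32_t)(va >> 32);
    return v;
}

uint64_t va_join(struct u32x2 v)
{
    return (uint64_t)v.x | ((uint64_t)v.y << 32);
}
```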
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The fix which enabled waveops detection broke HZD, since we never tested
with that feature enabled.
Keep it disabled until we can figure out what is going on.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
USE_PUSH_DESCRIPTORS may be misleading since it would be set even when
we're not using push descriptors at all due to root descriptors being
passed in via VAs. Instead, make the flag represent whether or not we
use a regular descriptor set for root parameters.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We'll always place them at the beginning of the push constant
buffer in order to avoid potential alignment issues.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We need to know the supported shader model to detect support
for certain features like wave ops correctly.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Previously this would make the user buffer count == 0, which obviously makes apps and assertions unhappy.
Fixes a crash in Horizon Zero Dawn when minimized (therefore having a degenerate surface region).
Signed-off-by: Joshua Ashton <joshua@froggi.es>
The packed descriptor index is no longer needed, and causes issues in
case a game sets a root signature, then binds a root descriptor, and
then sets a different root signature which maps the given root parameter
index to a different descriptor since we may now read undefined data
when updating push descriptors.
Fixes #366.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Makes it possible to backtrace which shader we're working with
when we get raw SPIR-V from unrelated sources (Fossilize or RADV crash
dumps for example).
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
MSDN states that root signatures across multiple stages in a graphics
pipeline must be identical, but the D3D12 runtime does not validate
this and mixing different root signatures results in undefined
behaviour, so just taking this from the VS should be safe.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We only need to know the pipeline layout for pipeline variant
creation. We are not holding a strong reference to the root
signature anyway, which may be problematic, but this should
not introduce a regression.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Offset buffer state might be the only relevant difference between two
descriptors. We won't need to copy descriptors, but the offsets must be copied.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Otherwise, we may run into issues with an app accessing stale resources
or pointers. NULL descriptors are handled in OMSetRenderTargets.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
The struct definitions were identical anyway, and unifying
these will prevent unnecessary code duplication.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
The only currently known use case for this requires us to actually
perform the dispatch operation. Executing more than one indirect
dispatch command is not meaningful; however, there might be
differences in behaviour in case the indirect count is zero.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This logic has to be the same as in d3d12_command_list_update_descriptor_table_offsets,
since not all active descriptor tables are necessarily used by the root signature.
Fixes an assert in the StarsX IrradianceMap demo (Github issue #347).
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This makes headers a dependency rather than a generator target.
This also means we get proper dependency tracking of them between projects.
Supersedes: #225
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Otherwise this won't work in MSVC because it'd technically be re-defining the D3D12 function prototypes with the decltypes.
There is no other nice way around this.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Useful for cases where we want to communicate important information to
the log by default, but not consider it an error.
Requested information which would only be logged when explicitly asked
for should also be considered INFO.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Rename the shared objects we build so we don't conflict with vkd3d and don't
accidentally attempt to be built against Wine natively (it won't work).
Not quite ready for a 2.0 release yet, but bump the version to reflect
the intent. This creates a new timeline, completely separate from vkd3d.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The version string is used in logging for informational purposes, but pipeline blobs and libraries use a uint64_t-based commit hash. Using a fixed-size integer silences warnings about string length and makes storing build info a little more efficient.
The hash is obtained separately from the version string and is shifted left by 4 bits if the working tree is dirty.
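
A minimal sketch of the dirty-tree shift, assuming the hash arrives as an
abbreviated hexadecimal string. The function name is hypothetical, and
the actual value is presumably produced by the build system rather than
at runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Parses an abbreviated commit hash (hex digits) into a uint64_t and
 * shifts it left by 4 bits when the working tree is dirty. */
uint64_t build_id_from_commit(const char *short_hash_hex, bool dirty)
{
    uint64_t hash;

    assert(short_hash_hex);

    hash = strtoull(short_hash_hex, NULL, 16);
    return dirty ? hash << 4 : hash;
}
```

Shifting by 4 bits appends one zero hex digit, so a dirty build of commit
abc1234 would yield 0xabc12340 in this sketch.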
Signed-off-by: Krzysztof Bogacki <krzysztof.bogacki@leancode.pl>