Ensures that queries are always available and initialized
in the correct order on the GPU timeline.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We have observed a lot of large GPU bubbles when using back-to-back
timeline semaphores to synchronize GPU submissions. Use prebaked
pipeline barrier command buffers instead.
To resolve queue sparse serialization, use two binary semaphore pairs to
resolve this. There is no need to use timeline semaphores in this case.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
USE_PUSH_DESCRIPTORS may be misleading since it would be set even when
we're not using push descriptors at all due to root descriptors being
passed in via VAs. Instead, make the flag represent whether or not we
use a regular descriptor set for root parameters.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
The packed descriptor index is no longer needed, and causes issues in
case a game sets a root signature, then binds a root descriptor, and
then sets a different root signature which maps the given root parameter
index to a different descriptor since we may now read undefined data
when updating push descriptors.
Fixes#366.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
The struct definitions were identical anyway, and unifying
these will prevent unnecessary code duplication.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
The only currently known use case for this requires us to actually
perform the dispatch operation. Executing more than one indirect
dispatch command is not meaningful, however there might be
differences in behaviour in case the indirect count is zero.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This logic has to be the same as in d3d12_command_list_update_descriptor_table_offsets,
since not all active descriptor tables are necessarily used by the root signature.
Fixes an assert in the StarsX IrradianceMap demo (Github issue #347).
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We will not have offset information for root descriptors, so
we can still only use them with four-byte aligned SSBOs.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We cannot rely on alignment analysis since games are buggy and screw up
RAW vs structured on occasion.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Relevant for swapchain since a swapchain resource can be presented right
away without ever having been touched by an API call.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
It is broken by design and won't be needed by a swapchain
implementation which uses user buffers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Buffer views do not necessarily cover the entire resource, so we
should not spawn more workgroups than necessary to clear the view.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This is no longer performance-critical, so in order to simplify changing
the binding model, remove hard-coded descriptor set numbers and instead
look them up based on the requested descriptor properties.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Ignore any indexed draw calls which uses a NULL index buffer.
This is not fully correct, but there is no easy way to emulate D3D12
behavior exactly.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
For correctness, we will need to defer any initial resource state
handling to the queue timeline. Here, we will build an UNDEFINED ->
common layout barrier if (and only if):
- The resource is marked to care about initial layout transition.
- We are the first queue thread to observe that initial_transition
member is 1 (atomic exchange).
- The first use of the resource was not marked to be a discard.
E.g., if the first use of the resource is an alias barrier, we must
not emit an early barrier. The only we should do here is to clear the
initial_transition member, and leave it like that.
A command list maintains a list of d3d12_resources which *might* need a
transition. For the first frame a resource is used (or so), it will not
have the flag cleared yet, so multiple command lists might add the
d3d12_resource to its own transition list. This is fine, as the queue
will resolve it.
If multiple queues see the same initial transition, there might be
shenanigans, but the application must ensure there is either a
submission boundary or fence boundary between the uses. Any initial
layout transition will only be submitted after a Wait() is observed, as
submission of the transition command buffer will be in-order with other
submissions.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
An optimization and a requirement in D3D12. Clearing out an image
through a copy is considered enough to satisfy the requirement to acquire an
alias in the advanced usage model.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
When building natively on Windows we use dllexport/dllimport for vkd3d/vkd3d_utils public exports.
When building natively on Linux we simply make those visibility default.
Nothing changes for standalone here.
Closes#152
Signed-off-by: Joshua Ashton <joshua@froggi.es>
... if we have dirty vbo slots left.
Fixes textures when inspecting items in the inventory in RE2 and RE3.
Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
There is no resource state associated with this, so emit the barrier at
the end of a command buffer based on trivial tracking.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Need to handle large (> 4G) jumps in timeline value, which is not
supported by all implementations.
There is no good way to handle that, so rewrite and clean up timeline
semaphore handling by separating the timeline into a virtual timeline
(which can rewind and jump around arbitrarely) and a physical timeline
which increments by one each time.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>