We will not have offset information for root descriptors, so
we can still only use them with four-byte aligned SSBOs.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We cannot rely on alignment analysis since games are buggy and screw up
RAW vs structured on occasion.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Relevant for swapchain since a swapchain resource can be presented right
away without ever having been touched by an API call.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
It is broken by design and won't be needed by a swapchain
implementation which uses user buffers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Buffer views do not necessarily cover the entire resource, so we
should not spawn more workgroups than necessary to clear the view.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This is no longer performance-critical, so in order to simplify changing
the binding model, remove hard-coded descriptor set numbers and instead
look them up based on the requested descriptor properties.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Ignore any indexed draw calls which uses a NULL index buffer.
This is not fully correct, but there is no easy way to emulate D3D12
behavior exactly.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
For correctness, we will need to defer any initial resource state
handling to the queue timeline. Here, we will build an UNDEFINED ->
common layout barrier if (and only if):
- The resource is marked to care about initial layout transition.
- We are the first queue thread to observe that initial_transition
member is 1 (atomic exchange).
- The first use of the resource was not marked to be a discard.
E.g., if the first use of the resource is an alias barrier, we must
not emit an early barrier. The only we should do here is to clear the
initial_transition member, and leave it like that.
A command list maintains a list of d3d12_resources which *might* need a
transition. For the first frame a resource is used (or so), it will not
have the flag cleared yet, so multiple command lists might add the
d3d12_resource to its own transition list. This is fine, as the queue
will resolve it.
If multiple queues see the same initial transition, there might be
shenanigans, but the application must ensure there is either a
submission boundary or fence boundary between the uses. Any initial
layout transition will only be submitted after a Wait() is observed, as
submission of the transition command buffer will be in-order with other
submissions.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
An optimization and a requirement in D3D12. Clearing out an image
through a copy is considered enough to satisfy the requirement to acquire an
alias in the advanced usage model.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
When building natively on Windows we use dllexport/dllimport for vkd3d/vkd3d_utils public exports.
When building natively on Linux we simply make those visibility default.
Nothing changes for standalone here.
Closes#152
Signed-off-by: Joshua Ashton <joshua@froggi.es>
... if we have dirty vbo slots left.
Fixes textures when inspecting items in the inventory in RE2 and RE3.
Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
There is no resource state associated with this, so emit the barrier at
the end of a command buffer based on trivial tracking.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Need to handle large (> 4G) jumps in timeline value, which is not
supported by all implementations.
There is no good way to handle that, so rewrite and clean up timeline
semaphore handling by separating the timeline into a virtual timeline
(which can rewind and jump around arbitrarely) and a physical timeline
which increments by one each time.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Useful to measure submission times, as well as time spent acquiring the
Vulkan queues. This correlates 1:1 with swapchain as well, so it's
useful when we want to get some "X / frame" metrics.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Manually uses QPC if the Vulkan implementation does not support
the QPC domain by itself.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
When we're using extended dynamic state, we will often end up with dummy
pipeline binds, which we should try to avoid if we can.
Also avoids having to rebind dynamic state redundantly.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Cleans up dynamic state such that we do not have to keep dynamic state
create infos around.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
vkd3d-shader is currently kinda buggy and crashes when you try to trace
DXBC. This used to never be run since it was guarded by
VKD3D_SHADER_DEBUG, but with the move to a static build we merged all
debug logging under VKD3D_DEBUG. Reintroduce different debug channels in
a way that is compatible with a statically linked vkd3d.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Otherwise, if a render pass gets suspended twice in a row, we
never emit the barrier because render_pass_suspended will be
set to false the second time.
Fixes validation errors in Hitman 2.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We shouldn't potentially override stuff in the std library and this allows us to map directly to __ATOMIC_* memory orders which is more correct.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
There is no stdatomic available on MSVC so let's clean things up.
This moves all the atomic helpers to vkd3d_atomic.h and implements all platform's spinlocks in entirely the same way.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Only support ANSI/UNICODE version for now. The PIX3BLOB format is
extremely weird, complicated and undocumented.
We can refer to RenderDoc if we need it later ...
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
debug_marker/debug_report are both deprecated in favor of debug_utils and vkd3d was using marker in a
buggy way anways, as debug_marker requires debug_report to work, but it was
only conditionally enabled.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Gets rid of the full barrier on command buffer end.
Instead, do what D3D12 wants, which is to serialize all
ExecuteCommandLists. Simplify the existing timeline sempahore setup for
sparse queues and use it for all submissions.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
D3D12 apparently does this implicitly. Fixes rendering issues in
the AMD COCOA demo on Polaris with RADV, which does not emit a
barrier between the AO compute passes and the tone mapping pass
in the next command buffer.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We should hook this up to the robustness2 feature at some point,
but for now, just use the dummy descriptors. Fixes a crash in
the AMD CACAO demo.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
In D3D12, Update/CopyTileMappings are implicitly synchronized with
respect to other commands executing on the same queue, which means:
- Signal and Execute have to wait for previously submitted
sparse binding operations to complete
- Wait and Execute have to complete before subsequently
submitted sparse binding operations can execute.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This allows us to perform clears inside the render pass even if
the render pass hasn't been started at the time of the clear yet.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
We'll need to revisit this as the current implementation is
not only inefficient but also wrong in quite a few ways.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>