Simplifies this to make it easier to add new properties/features
so we don't have a bunch of pointers to things that are just a child
of the device info structure.
Fixes warnings when compiling without traces too.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Gives a massive boost on NVIDIA for some reason.
RADV defers push constant update, so ALL_STAGES doesn't have
that much of a perf hit.
~20% uplift in RE2, ~5% uplift in CP77 from some quick and dirty testing.
Seems to be heavily content dependent either way.
Also a bug fix, since we would clobber graphics push constants from
compute and vice versa if both graphics and compute used the same root
signature.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
As per MSDN, SetName is just a wrapper around SetPrivateData and a specific GUID.
Some apps and tools rely on this to read the name back via GetPrivateData.
So instead, just forward the name to Vulkan in the SetPrivateData call.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Instead, infer the required stages from the D3D12 shader visibility
field from all root parameters that we map to push constants.
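
A minimal sketch of the idea, not the actual implementation. The enum and
struct names below are hypothetical stand-ins; only the numeric values are
meant to mirror D3D12_SHADER_VISIBILITY and VkShaderStageFlagBits. It also
assumes, purely for illustration, that only root constants are mapped to
push constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in; values mirror D3D12_SHADER_VISIBILITY. */
enum shader_visibility
{
    VIS_ALL = 0, VIS_VERTEX = 1, VIS_HULL = 2,
    VIS_DOMAIN = 3, VIS_GEOMETRY = 4, VIS_PIXEL = 5
};

/* Hypothetical stand-in; values mirror VkShaderStageFlagBits. */
enum
{
    STAGE_VERTEX = 0x01, STAGE_TESS_CONTROL = 0x02, STAGE_TESS_EVAL = 0x04,
    STAGE_GEOMETRY = 0x08, STAGE_FRAGMENT = 0x10, STAGE_COMPUTE = 0x20,
    STAGE_ALL_GRAPHICS = 0x1f
};

enum root_param_type { PARAM_DESCRIPTOR_TABLE, PARAM_ROOT_CONSTANTS };

struct root_param
{
    enum root_param_type type;
    enum shader_visibility visibility;
};

uint32_t stages_from_visibility(enum shader_visibility visibility)
{
    switch (visibility)
    {
        case VIS_VERTEX:   return STAGE_VERTEX;
        case VIS_HULL:     return STAGE_TESS_CONTROL;
        case VIS_DOMAIN:   return STAGE_TESS_EVAL;
        case VIS_GEOMETRY: return STAGE_GEOMETRY;
        case VIS_PIXEL:    return STAGE_FRAGMENT;
        /* VISIBILITY_ALL has to cover graphics and compute alike. */
        case VIS_ALL:
        default:           return STAGE_ALL_GRAPHICS | STAGE_COMPUTE;
    }
}

/* OR together the stages of every parameter backed by push constants. */
uint32_t infer_push_constant_stages(const struct root_param *params, size_t count)
{
    uint32_t stages = 0;
    size_t i;

    assert(params != NULL || count == 0);

    for (i = 0; i < count; i++)
    {
        if (params[i].type == PARAM_ROOT_CONSTANTS)
            stages |= stages_from_visibility(params[i].visibility);
    }
    return stages;
}
```

With this, a root signature whose root constants are only visible to the
vertex and pixel stages ends up with VERTEX | FRAGMENT rather than every
stage.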
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
There are pragmatic reasons for not following spec 100% here.
The only known case where UpdateAfterBind robustness is not exposed
seems to be somewhat bogus, and we cannot run D3D12 correctly without
robustness either way.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Can only support a subset in Vulkan without extra heroics. The DXR API
lets you query things that you technically should know a priori in the
application. We might need to allocate some side-channel buffers on
demand, but let's defer that until actually needed ... :\
DXR is also very awkward in that we have a query which is resolved in
UNORDERED_ACCESS state instead of COPY_DEST state, so we'll have to
ping-pong through some barriers redundantly.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
When building acceleration structures, we need to have a
VkAccelerationStructureKHR object, but the D3D12 API just uses a plain
VA = ID3D12Resource::GetGPUVirtualAddress() + offset.
For this to work, we need to resolve the VA back to VkBuffer + offset.
The only VkBuffer we can look up is the original backing memory
allocation in the VA map, and that allocation itself must own a view
map, since we cannot tie the VA to any specific ID3D12Resource.
Since creating an RTAS is not the common path, we allocate the view map
on-demand with CAS.
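
The on-demand allocation can be sketched roughly as below, using C11
atomics. This is an illustration under assumptions, not the actual code:
struct view_map, struct memory_allocation and allocation_get_view_map()
are hypothetical names, and the real view map contents are omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical placeholder for the per-allocation view map. */
struct view_map
{
    int placeholder;
};

/* Hypothetical backing allocation; view_map stays NULL until first use. */
struct memory_allocation
{
    _Atomic(struct view_map *) view_map;
};

struct view_map *allocation_get_view_map(struct memory_allocation *alloc)
{
    struct view_map *expected = NULL;
    struct view_map *map;

    assert(alloc != NULL);

    /* Fast path: some thread has already published a map. */
    map = atomic_load_explicit(&alloc->view_map, memory_order_acquire);
    if (map)
        return map;

    map = calloc(1, sizeof(*map));
    if (!map)
        return NULL;

    /* Publish our map only if the slot is still NULL. */
    if (atomic_compare_exchange_strong_explicit(&alloc->view_map, &expected, map,
            memory_order_acq_rel, memory_order_acquire))
        return map;

    /* Lost the race: drop our copy and use the winner's map,
     * which the failed CAS stored in 'expected'. */
    free(map);
    return expected;
}
```

Allocations that never back an RTAS pay only for one NULL pointer, and
concurrent first users agree on a single map without taking a lock.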
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
RTAS must stay in this resource state forever. The only way to
synchronize them is UAV barriers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Allows local root signatures to work correctly and is also a good
optimization since we no longer need to dereference memory (potentially
cold cache lines) to figure out heap offset in command buffer.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
If we're signalling and waiting on the same physical queue (always true for
current SINGLE_QUEUE define), we can rely on submission boundary
synchronization which doesn't require any extra submissions to resolve.
Avoids awkward GPU driver bubbles with back-to-back signal -> wait pairs
with timeline semaphores.
Observed 2% GPU uplift on RE2 on AMD.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Otherwise, when suballocating memory, GetHeapProperties may
not return the exact same set of flags if we ignore flags
when looking up suitable chunks.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
The difference between a range's offset and the aligned
offset may be greater than the size of that range.
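
To illustrate the failure mode: with unsigned arithmetic, subtracting the
alignment padding from the range size wraps around when the padding is
larger, so the range appears huge. A minimal sketch of a safe check follows;
struct free_range and range_fits() are hypothetical names, and the alignment
is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical free range inside a chunk. */
struct free_range
{
    uint64_t offset;
    uint64_t length;
};

/* Returns true and writes the aligned offset if 'size' bytes fit in the
 * range after aligning its start. Assumes offset + alignment does not
 * overflow 64 bits. */
bool range_fits(const struct free_range *range, uint64_t size,
        uint64_t alignment, uint64_t *aligned_offset)
{
    uint64_t aligned, padding;

    assert(alignment && !(alignment & (alignment - 1)));

    aligned = (range->offset + alignment - 1) & ~(alignment - 1);
    padding = aligned - range->offset;

    /* Test the padding first: length - padding would wrap around to a
     * very large value if padding > length. */
    if (padding > range->length || range->length - padding < size)
        return false;

    *aligned_offset = aligned;
    return true;
}
```

For example, a 0x100 byte range at offset 0x1100 with 0x1000 alignment has
0xf00 bytes of padding, which already exceeds the range, so it must be
rejected regardless of the requested size.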
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This is still useful as a low-level memory allocation function when
we don't want to bother with buffer offsets or D3D12 validation.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Our clear code assumes that this is NULL for allocations owned
by a chunk, so we should actually do it that way. Fixes some
issues where we do not wait for clears to complete if a chunk
gets destroyed.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>