Even when misusing the API, S_OK is still returned on native runtimes.
Keep the error log, and add an error report to command allocator release
if there are still pending submissions.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
D3D12 has some unfortunate rules around CommandQueue::Wait().
It's legal to release the fence early, before the fence actually
completes its wait operation.
The behavior on D3D12 is just to release all waiters.
For out of order signal/wait, we hold off submissions,
so we can implement this implicitly through CPU signal to UINT64_MAX
on fence release. If we have submitted a wait which depends on the
fence, it will complete in finite time, so it still works fine.
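
A minimal sketch of the release rule above, using a toy CPU-side fence
model rather than the real implementation. The names toy_fence,
toy_fence_signal, toy_fence_wait_satisfied and toy_fence_release are
hypothetical. They only illustrate why signalling UINT64_MAX on release
unblocks every wait that is still being held off:

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a fence with a 64-bit payload; not the real d3d12_fence. */
struct toy_fence
{
    uint64_t value;
};

/* CPU-side signal: overwrite the fence payload. */
static void toy_fence_signal(struct toy_fence *fence, uint64_t value)
{
    fence->value = value;
}

/* A wait for `target` is satisfied once the fence value has reached it. */
static bool toy_fence_wait_satisfied(const struct toy_fence *fence, uint64_t target)
{
    return fence->value >= target;
}

/* On release, signal the largest possible value so that every
 * held-off wait, whatever its target, becomes satisfied. */
static void toy_fence_release(struct toy_fence *fence)
{
    toy_fence_signal(fence, UINT64_MAX);
}
```

In this model a wait on value 10 against a fence at value 5 stays blocked
until toy_fence_release() is called, after which any target compares as
satisfied.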
We cannot release the semaphores early in Vulkan, so we must hold on
to a private reference of the ID3D12Fence object until we have observed
that the wait is complete.
To make this work, we refactor waits to use the vkd3d_queue wait list.
On other submits, we resolve the wait. This is a small optimization
since we don't have to perform dummy submits that only perform the wait.
At that time, we signal a timeline semaphore and queue up a d3d12_fence_dec_ref().
Since we're also adding this system where normal submissions signal
timelines, handle the submission counters more correctly by deferring
the decrements until we have waited for the submission itself.
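
A rough sketch of the deferred release idea, under simplifying
assumptions: one timeline value per recorded wait, a fixed-size list, and
no locking. All names here (toy_fence_ref, toy_wait_entry, toy_queue,
toy_queue_add_wait, toy_queue_retire_waits) are hypothetical and do not
mirror the actual vkd3d_queue layout:

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_WAITS 8

/* Toy stand-in for a refcounted fence object. */
struct toy_fence_ref
{
    unsigned int private_refs;
};

/* One pending wait: the fence must stay alive until the queue
 * timeline has been observed at `timeline_value`. */
struct toy_wait_entry
{
    struct toy_fence_ref *fence;
    uint64_t timeline_value;
};

struct toy_queue
{
    struct toy_wait_entry waits[TOY_MAX_WAITS];
    size_t wait_count;
    uint64_t submitted_timeline; /* last value a submission will signal */
};

/* Record a wait: take a private reference and tag the entry with the
 * timeline value signalled by the submission that resolves the wait.
 * Returns 0 on success, -1 if the list is full. */
static int toy_queue_add_wait(struct toy_queue *queue, struct toy_fence_ref *fence)
{
    if (queue->wait_count == TOY_MAX_WAITS)
        return -1;
    fence->private_refs++;
    queue->waits[queue->wait_count].fence = fence;
    queue->waits[queue->wait_count].timeline_value = ++queue->submitted_timeline;
    queue->wait_count++;
    return 0;
}

/* Once the timeline has been observed at `completed`, drop the private
 * reference of every wait known to be done and keep the rest pending. */
static void toy_queue_retire_waits(struct toy_queue *queue, uint64_t completed)
{
    size_t i, kept = 0;
    for (i = 0; i < queue->wait_count; i++)
    {
        if (queue->waits[i].timeline_value <= completed)
            queue->waits[i].fence->private_refs--;
        else
            queue->waits[kept++] = queue->waits[i];
    }
    queue->wait_count = kept;
}
```

The point of the sketch is the ordering: the reference is only dropped
after the timeline value has been observed, never at the time the
application releases the fence.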
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Apparently RT shaders in RE Engine require min16float to
be implemented as native FP16. Fun ... ._.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
With DXR, it seems like some applications require other FL 12.2 features
to be enabled even if they are not actually used. Various RE Engine
titles seem to be affected by this.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
It's technically undefined to use NULL UAV counters,
but drivers all implement some form of robust behavior here
when presented with NULL counters, so we'll have to follow suit.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
We have no way of expressing size / alignment requirements to
applications since the API query does not provide us with heap
information. Reuse the fallback path for promoting placed to committed.
Guardians of the Galaxy hits a case where it tries to place 3x
host-visible 3D images in one heap, and they end up overlapping in
memory due to a 16x16x80 3D texture taking up far less space in optimal
tiling compared to linear tiling on AMD.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
For internal debug shaders, it is helpful to ensure in-order logs when
sorted for later inspection.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
We cannot handle all scenarios if COLLECTIONS are incompatible,
but test the easier cases.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
CP77 relies on this to work somehow ...
The DXR spec seems to suggest this is allowed, but there is no direct
concept for this in Vulkan.
This seems to work on NVIDIA at least, but we're on very shaky ground
here ...
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The hash should only depend on the raw byte stream, not the entire DXBC
blob. Useful now since we can declare root signatures either through a
DXBC blob or as an RDAT object (which is raw).
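
A small sketch of the idea, assuming a 64-bit FNV-1a hash and a
simplified container where the raw payload sits at a known offset. The
hash choice and the names toy_hash_bytes, toy_blob and
toy_root_signature_hash are assumptions for illustration, not the
functions used in the tree:

```c
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a over a byte range. Chosen for illustration only. */
static uint64_t toy_hash_bytes(const void *data, size_t size)
{
    const uint8_t *bytes = data;
    uint64_t hash = 0xcbf29ce484222325ull; /* FNV offset basis */
    size_t i;

    for (i = 0; i < size; i++)
    {
        hash ^= bytes[i];
        hash *= 0x100000001b3ull; /* FNV prime */
    }
    return hash;
}

/* Hypothetical container: some header bytes followed by the raw
 * root signature payload starting at `payload_offset`. */
struct toy_blob
{
    const uint8_t *data;
    size_t size;
    size_t payload_offset;
};

/* Hash only the raw payload, so the result does not depend on how
 * the root signature was wrapped. */
static uint64_t toy_root_signature_hash(const struct toy_blob *blob)
{
    return toy_hash_bytes(blob->data + blob->payload_offset,
            blob->size - blob->payload_offset);
}
```

With this shape, the same payload hashes identically whether it arrives
wrapped in a container or as a raw stream (payload_offset of 0).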
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
To keep things simple, outer code is responsible for keeping the string
alive. Intended to be used for RTPSO entry point name debugging.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>