D3D12 has some unfortunate rules around CommandQueue::Wait().
It's legal for the application to release the fence early, before a
queued wait on that fence has actually completed.
The behavior on D3D12 is simply to release all waiters.
For out-of-order signal/wait, we hold off submissions,
so we can implement this implicitly through a CPU signal to UINT64_MAX
on fence release. If we have already submitted a wait which depends on
the fence, it will complete in finite time, so it still works fine.
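
As a rough illustration of why a CPU signal to UINT64_MAX releases all
waiters: a wait is satisfied once the fence value is at least the wait
target, and no target can exceed UINT64_MAX. The sketch below is a
stand-alone model, not code from this patch; struct mock_fence and both
helpers are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a fence; not the real d3d12_fence layout. */
struct mock_fence
{
    uint64_t value; /* last signalled value */
};

/* A wait for `target` is satisfied once the fence value has reached it. */
bool mock_fence_wait_satisfied(const struct mock_fence *fence, uint64_t target)
{
    return fence->value >= target;
}

/* Models the application dropping its last reference: signalling the
 * maximum value from the CPU unblocks every wait that is still held back,
 * whatever its target value is. */
void mock_fence_release_by_app(struct mock_fence *fence)
{
    fence->value = UINT64_MAX;
}
```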
We cannot release the semaphores early in Vulkan, so we must hold on
to a private reference of the ID3D12Fence object until we have observed
that the wait is complete.
To make this work, we refactor waits to use the vkd3d_queue wait list.
On other submits, we resolve the wait. This is a small optimization
since we don't have to perform dummy submits that only perform the wait.
When the wait is resolved, we signal a timeline semaphore and queue up a
d3d12_fence_dec_ref().
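
The lifetime rule can be sketched as a small stand-alone model: a wait
takes a private reference, the next submit resolves the wait and records
which timeline value must be reached, and the reference is only dropped
once that value has been observed. This is an assumption-laden
illustration, not the code in this patch; tracked_fence, mock_queue and
the helper functions are hypothetical, and a plain counter stands in for
the Vulkan timeline semaphore.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 16

/* Hypothetical stand-in for the private refcount of a fence. */
struct tracked_fence
{
    unsigned int private_refs;
};

/* A reference to drop once the timeline has reached `timeline_value`. */
struct pending_release
{
    struct tracked_fence *fence;
    uint64_t timeline_value;
};

struct mock_queue
{
    uint64_t timeline_value;                      /* last value a submit signals */
    struct tracked_fence *waits[MAX_ENTRIES];     /* waits not yet resolved */
    size_t wait_count;
    struct pending_release releases[MAX_ENTRIES]; /* deferred dec_refs */
    size_t release_count;
};

/* Queue a wait and keep the fence alive with a private reference. */
int mock_queue_add_wait(struct mock_queue *queue, struct tracked_fence *fence)
{
    if (queue->wait_count == MAX_ENTRIES)
        return -1;
    fence->private_refs++;
    queue->waits[queue->wait_count++] = fence;
    return 0;
}

/* A submit resolves all queued waits, signals the next timeline value and
 * queues one deferred dec_ref per resolved wait. Returns the signalled value. */
uint64_t mock_queue_submit(struct mock_queue *queue)
{
    uint64_t value = ++queue->timeline_value;
    size_t i;

    for (i = 0; i < queue->wait_count; i++)
    {
        assert(queue->release_count < MAX_ENTRIES);
        queue->releases[queue->release_count].fence = queue->waits[i];
        queue->releases[queue->release_count].timeline_value = value;
        queue->release_count++;
    }
    queue->wait_count = 0;
    return value;
}

/* Drop references whose timeline value has been observed as complete;
 * keep the rest queued for a later call. */
void mock_queue_retire(struct mock_queue *queue, uint64_t completed_value)
{
    size_t i, kept = 0;

    for (i = 0; i < queue->release_count; i++)
    {
        if (queue->releases[i].timeline_value <= completed_value)
            queue->releases[i].fence->private_refs--;
        else
            queue->releases[kept++] = queue->releases[i];
    }
    queue->release_count = kept;
}
```

In this model the application can drop its own reference at any point
after mock_queue_add_wait(); the fence object stays alive until
mock_queue_retire() has seen the submit's timeline value.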
Since we're also adding this system where normal submissions signal
timelines, handle the submission counters more correctly by deferring
the decrements until we have waited for the submission itself.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>