Compare commits

1238 Commits
v2.0 ... master

Author SHA1 Message Date
Joshua Ashton d00d035321 ci: Use arch-mingw-github-action v8
Fixes safe directory stuff giving invalid version info.
2022-07-26 18:37:26 +00:00
Joshua Ashton 253dc9027a Revert "ci: Workaround safe directory errors in vkd3d_build generation."
This reverts commit 0c4df9b32c.
2022-07-26 18:37:26 +00:00
Derek Lesho 146f5b8a74 vkd3d: Fall back to regular fences when shared timeline semaphores aren't supported.
Signed-off-by: Derek Lesho <dlesho@codeweavers.com>
2022-07-25 23:55:40 +02:00
Hans-Kristian Arntzen db4a8544a1 tests: Avoid potential UB in fence_wait robustness test.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-25 23:11:37 +02:00
Hans-Kristian Arntzen 1d25b29413 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-25 21:55:45 +02:00
Hans-Kristian Arntzen 34a04a1a7f dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-25 18:39:06 +02:00
Hans-Kristian Arntzen b839fe14bb tests: Add test for freeing underlying memory of a reserved resource.
As long as the reserved regions are not used, this is okay.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-25 18:10:08 +02:00
Hans-Kristian Arntzen d3a76eee90 idl: Fix const correctness of UpdateTileMappings.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-25 18:10:08 +02:00
Hans-Kristian Arntzen 481680ecd8 vkd3d: Use IndexFormat as a sentinel for indexed RTAS build.
UE5 seems to only set IndexType to != UNKNOWN when querying RTAS sizes.
This contradicts D3D12 docs, but this matches Vulkan behavior, so do the
same thing. Adds a warn when IBO VA is NULL with non-null format to catch app
bugs.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-25 17:51:02 +02:00
Hans-Kristian Arntzen 11c82c84d1 vkd3d: Add some trace debug logs of RTAS build infos.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-25 17:51:02 +02:00
Hans-Kristian Arntzen c0b9682c69 vkd3d: Small warning fixes.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-25 17:09:07 +02:00
Hans-Kristian Arntzen 9d8abd2db5 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-25 11:32:58 +02:00
Derek Lesho df1829e407 vkd3d: Implement ID3D12Fence sharing on top of D3D12-Fence exportable Vulkan timeline semaphores.
Signed-off-by: Derek Lesho <dlesho@codeweavers.com>
2022-07-25 11:16:53 +02:00
Hans-Kristian Arntzen be2aafff1a vkd3d: Resolve fence waiters early.
Temporarily abandons the idea to fuse waiters with execution.
For whatever reason, this seemed to cause random flicker in Halo Infinite
with async compute on, and I have failed to figure out exactly why.
By playing around with how commands are fused, the results changed
dramatically, which means I doubt vkd3d-proton was actually at fault
here.

There is some questionable code around UpdateTileMappings in the game
where a COPY queue is used, and it does not seem to synchronize this with other
queues as far as I can tell. It is uncertain at this time if D3D12
requires a tile update to synchronize with *every* queue or just the
queue being submitted to. We assume the latter, as it's the only
behavior that makes sense.

It is possible that submitting waits as they are queued up
affects synchronization between queues in unexpected ways.

When separating out the wait operations, everything appears to work.
It is also simpler code.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-21 21:10:34 +02:00
Derek Lesho 849537614a vkd3d: HACK: Don't create host pointer heap for Halo Infinite.
Some usage pattern here is causing a failure inside amdgpu.

Signed-off-by: Derek Lesho <dlesho@codeweavers.com>
2022-07-21 20:48:56 +02:00
Derek Lesho f487db4756 vkd3d: Implement ID3D12Resource sharing.
Signed-off-by: Derek Lesho <dlesho@codeweavers.com>
2022-07-21 20:48:56 +02:00
Hans-Kristian Arntzen 6265a7b5ce tests: Add test creating root signature without RTS0 blob.
We're supposed to fail here, but we ended up failing
due to parsing uninitialized version instead, meaning
it could spuriously succeed or read garbage.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-20 12:00:07 +02:00
Hans-Kristian Arntzen 4f4c96bb11 vkd3d: Fail creating root signatures from blobs without RTS0.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-20 12:00:07 +02:00
Derek Lesho a2439e766f vkd3d: Flush queued waiters before waiting for the sparse binding semaphore.
Fixes a bug in the logic trying to combine the waits by simplifying the code.
Problem discovered by HK.

Signed-off-by: Derek Lesho <dlesho@codeweavers.com>
2022-07-20 01:27:20 +02:00
Hans-Kristian Arntzen 21799b202b tests: Add test verifying private ref behavior of ID3D12Fence.
Attempt to release fences before their signal/waits have been satisfied.
Also tests this behavior for shared fences.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-18 19:00:25 +02:00
Hans-Kristian Arntzen 4ff504b52d vkd3d: Match native runtime better in command allocator reset.
Even when misusing the API, S_OK is still returned on native runtimes.
Keep the error log, and add an error report to command allocator release
if there are still pending submissions.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-18 19:00:25 +02:00
Hans-Kristian Arntzen 6335e411bb vkd3d: Rewrite submission logic for wait fences.
D3D12 has some unfortunate rules around CommandQueue::Wait().
It's legal to release the fence early, before the fence actually
completes its wait operation.

The behavior on D3D12 is just to release all waiters.
For out of order signal/wait, we hold off submissions,
so we can implement this implicitly through CPU signal to UINT64_MAX
on fence release. If we have submitted a wait which depends on the
fence, it will complete in finite time, so it still works fine.

We cannot release the semaphores early in Vulkan, so we must hold on
to a private reference of the ID3D12Fence object until we have observed
that the wait is complete.

To make this work, we refactor waits to use the vkd3d_queue wait list.
On other submits, we resolve the wait. This is a small optimization
since we don't have to perform dummy submits that only perform the wait.
At that time, we signal a timeline semaphore and queue up a d3d12_fence_dec_ref().

Since we're also adding this system where normal submissions signal
timelines, handle the submission counters more correctly by deferring
the decrements until we have waited for the submission itself.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-18 19:00:25 +02:00
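The split between public and private references described in the commit above can be illustrated with a minimal sketch. This is not the project's code: every name below (example_fence and its helpers) is hypothetical, and the model assumes the public references collectively hold one private reference, which is only one way such a scheme could be arranged.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a fence with separate public and private ref-counts. */
struct example_fence
{
    uint32_t public_refs;  /* references held by the application */
    uint32_t private_refs; /* internal references, e.g. held by queued waits */
    uint64_t value;        /* current timeline value */
    bool destroyed;        /* set once the last private reference is dropped */
};

static void example_fence_init(struct example_fence *fence)
{
    fence->public_refs = 1;
    fence->private_refs = 1; /* assumption: public refs share one private ref */
    fence->value = 0;
    fence->destroyed = false;
}

static void example_fence_dec_private(struct example_fence *fence)
{
    if (--fence->private_refs == 0)
        fence->destroyed = true;
}

/* A queued wait keeps the fence alive until the wait has been observed. */
static void example_fence_queue_wait(struct example_fence *fence)
{
    fence->private_refs++;
}

static void example_fence_wait_observed(struct example_fence *fence)
{
    example_fence_dec_private(fence);
}

/* When the last public reference goes away, signal UINT64_MAX so that
 * every pending waiter is released, then drop the shared private ref. */
static void example_fence_release(struct example_fence *fence)
{
    if (--fence->public_refs == 0)
    {
        fence->value = UINT64_MAX;
        example_fence_dec_private(fence);
    }
}

static bool example_wait_satisfied(const struct example_fence *fence, uint64_t wait_value)
{
    return fence->value >= wait_value;
}
```

In this sketch, releasing the fence while a wait is still queued leaves the object alive; it is only marked destroyed once example_fence_wait_observed() drops the last private reference.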
Hans-Kristian Arntzen 11c943dd7e vkd3d: Unblock all fence waiters when public ref-count hits 0.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-18 19:00:25 +02:00
Hans-Kristian Arntzen 5b73139f18 vkd3d: Fail creation of command signature if DGC is not supported.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-12 14:31:53 +02:00
Hans-Kristian Arntzen 73700f4c3a tests: Be robust against missing features when testing indirect state.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-12 14:31:53 +02:00
Hans-Kristian Arntzen a917d60ca5 profiler: Add --delta to profile helper tool.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:59:41 +02:00
Hans-Kristian Arntzen 8d780458f1 profiler: Use rdtsc instead of QPC.
Runs much faster and we don't really need accurate ns readings.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:59:41 +02:00
Hans-Kristian Arntzen 8da6ca6772 common: Add rdtsc helper.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:59:41 +02:00
Hans-Kristian Arntzen 766da69afb vkd3d: Also add profiles for RE3/RE7.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:58:21 +02:00
Hans-Kristian Arntzen b7a960f94f vkd3d: Also add RE workaround for RE2 DXR.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:58:21 +02:00
Hans-Kristian Arntzen ee39209798 vkd3d: Add flag to force native FP16 paths.
Apparently RT shaders in RE Engine require min16float to
be implemented as native FP16. Fun ... ._.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:58:21 +02:00
Hans-Kristian Arntzen afb87e013f vkd3d: Add per-application feature overrides.
With DXR, it seems like some applications require other FL 12.2 features
to be enabled even if they are not actually used. Various RE engine
titles seem to be affected by this.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:58:21 +02:00
Hans-Kristian Arntzen 433262c254 tests: Add headless D3D12 RenderDoc capture support.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:34:14 +02:00
Hans-Kristian Arntzen 277bbe35e8 tests: Test both aligned and "unaligned" argument buffer offsets.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:31:30 +02:00
Hans-Kristian Arntzen 9451fdcab9 tests: Add large root constant CBV to execute indirect advanced.
Tests that we can handle > 128 byte push constant blocks.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:31:30 +02:00
Hans-Kristian Arntzen 0640f44560 tests: Add test for early and late indirect patching.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:31:30 +02:00
Hans-Kristian Arntzen b287864cd1 tests: Remove TODOs from ExecuteIndirect state test.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:31:30 +02:00
Hans-Kristian Arntzen 0a7b13fe7f tests: Add test for advanced ExecuteIndirect features.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:31:30 +02:00
Hans-Kristian Arntzen f704cb9776 vkd3d: Use index type LUT for DGC.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:14:13 +02:00
Hans-Kristian Arntzen e17a7cb40c vkd3d: Attempt to reuse application indirect command buffer.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:14:13 +02:00
Hans-Kristian Arntzen 9e45c72256 tests: Test UAV counter behavior with NULL counters.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:07:47 +02:00
Hans-Kristian Arntzen 2a8c762025 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:07:47 +02:00
Hans-Kristian Arntzen 3b8a13e63d vkd3d-shader: Implement robust UAV counters.
It's technically undefined to use NULL UAV counters,
but drivers all implement some form of robust behavior here
when presented with NULL counters, so we'll have to follow suit.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 15:07:47 +02:00
Hans-Kristian Arntzen 65804bbde5 vkd3d: Ignore cpu_access_domain when reporting heap tier.
For host visible, we only place buffers anyways.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:59:24 +02:00
Hans-Kristian Arntzen 233ff38175 vkd3d: Force LINEAR images to be allocated as committed resources.
We have no way of expressing size / alignment requirements to
applications since the API query does not provide us with heap
information. Reuse the fallback path for promoting placed to committed.

Guardians of the Galaxy hits a case where it tries to place 3x
host-visible 3D images in one heap, and they end up overlapping in
memory due to a 16x16x80 3D texture taking up far less space in optimal
tiling compared to linear tiling on AMD.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:59:24 +02:00
Hans-Kristian Arntzen 4a07d9c038 debug: Add concept of implicit instance index to debug ring.
For internal debug shaders, it is helpful to ensure in-order logs when
sorted for later inspection.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:59:00 +02:00
Hans-Kristian Arntzen bcdac3180a debug: Make Instance sorting easier.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:59:00 +02:00
Hans-Kristian Arntzen df11b5ba5a debug: Pretty-print execute template debug messages.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:59:00 +02:00
Hans-Kristian Arntzen e138a5117a vkd3d: Encode in detail which commands we're emitting in template.
Feed this back to debug ring for less cryptic logs.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:59:00 +02:00
Hans-Kristian Arntzen 96fdb71ae4 vkd3d: Refactor out patch command token enum.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:59:00 +02:00
Hans-Kristian Arntzen fe707989fe vkd3d: Clamp command count in execute indirect path.
Shouldn't be required, but take no chances.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:59:00 +02:00
Hans-Kristian Arntzen 6d3c5d53b0 vkd3d: Add debug ring path for execute indirect template patches.
Somehow inspect draw parameters this way.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:59:00 +02:00
Hans-Kristian Arntzen f93a581dae vkd3d: Trace breadcrumbs for execute indirect templates.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:59:00 +02:00
Hans-Kristian Arntzen b7bbdcabd4 tests: Test that we can deal with local samplers in COLLECTIONS.
We cannot handle all scenarios if COLLECTIONS are incompatible,
but test the easier cases.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:58:19 +02:00
Hans-Kristian Arntzen a28e4b6e11 tests: Add test for querying identifiers from COLLECTION objects.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:58:19 +02:00
Hans-Kristian Arntzen eda0b2fab2 vkd3d: Do a best effort in handling COLLECTION local static samplers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:58:19 +02:00
Hans-Kristian Arntzen 7f5dbcfc40 vkd3d: Add workaround to allow identifiers to be queried from library.
CP77 relies on this to work somehow ...
The DXR spec seems to suggest this is allowed, but there is no direct
concept for this in Vulkan.

This seems to work on NVIDIA at least, but we're on very shaky ground
here ...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:34:34 +02:00
Hans-Kristian Arntzen d333159c86 vkd3d: Disallow querying identifiers from COLLECTION objects.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:34:34 +02:00
Hans-Kristian Arntzen 74eb676cfb vkd3d-shader: Normalize root signature compatibility hashing.
The hash should only depend on the raw byte stream, not the entire DXBC
blob. Useful now since we can declare root signatures either through
DXBC blob or as RDAT object (which is raw).

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:34:34 +02:00
Hans-Kristian Arntzen 5033904e10 debug: Add GLSLC_FLAGS to debug shader build.
When building ray query shaders, need --target-env=spv1.4.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:23:38 +02:00
Hans-Kristian Arntzen b34931eb17 vkd3d: Log how shader identifiers are queried.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:23:38 +02:00
Hans-Kristian Arntzen 7410f53912 vkd3d: Add debug ring support to raytracing shaders.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:23:38 +02:00
Hans-Kristian Arntzen 089d2c6cb7 debug: Add shader override build for ray tracing as well.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:23:38 +02:00
Hans-Kristian Arntzen 03fdbac59e vkd3d: Dump TraceRays parameters to breadcrumbs.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:04:38 +02:00
Hans-Kristian Arntzen 7832eeb60d vkd3d: Add detailed tracing for RTPSO creation.
So much state floating around ...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:04:38 +02:00
Hans-Kristian Arntzen 8a94c3ce0e vkd3d: Add more detailed breadcrumb logging for TraceRays.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:04:38 +02:00
Hans-Kristian Arntzen ddb425c5cb vkd3d: Add support for tag logging in breadcrumbs.
To keep things simple, outer code is responsible for keeping string
alive. Intended to be used for RTPSO entry point name debugging.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:04:38 +02:00
Hans-Kristian Arntzen ad7459551d vkd3d: Trivially ensure tighter packing of entry point struct.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:04:38 +02:00
Hans-Kristian Arntzen e3c36a47dd tests: Add test for default association tiebreak rules.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 13:41:06 +02:00
Hans-Kristian Arntzen ee8b8374b4 tests: Add test for how we handle DXIL embedded subobjects.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 13:41:06 +02:00
Hans-Kristian Arntzen ce00c9322d tests: Add some basic RTPSO validation rules tests.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 13:41:06 +02:00
Hans-Kristian Arntzen b88b04e4f1 vkd3d: Rewrite how submodules are associated with exports.
Handle embedded DXIL subobjects and fix various issues exposed by the
upcoming new tests.

Associating with global root signatures, shader config and pipeline
config needs to be rewritten so that we validate uniqueness late.

The strategy here is to look at all exports we care about and find an
association.

There are many priority levels which are implied by how I understand the
DXR docs. State objects in the API win over embedded DXIL state objects.
Any DXIL state object wins over a collection.

Hit group associations can trump an entry point. It's not entirely clear
how this works, but we let it win if it has higher priority, i.e.
an explicit association directed at the hit group.

There are also cases where explicit assignment trumps explicit default
assignment, which then trumps just declaring a state object.

Collection state is inherited in some cases like AddToStateObject() even
if this seems to be undocumented behavior.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 13:41:06 +02:00
Hans-Kristian Arntzen 4a121b9aaa vkd3d-shader: Forward RDAT subobjects.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:37:34 +02:00
Hans-Kristian Arntzen 0ef6a8b798 vkd3d: Expose utility for creating root signature from raw blob.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:37:34 +02:00
Hans-Kristian Arntzen 49b6e67e7d vkd3d-shader: Expose entry point for raw root signature parsing.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:37:34 +02:00
Hans-Kristian Arntzen 2ef3fd469c vkd3d-common: Add strequal_mixed between WCHAR and ASCII.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:37:34 +02:00
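A comparison between a wide string and an ASCII string, as the commit title above describes, might look roughly like the sketch below. The function name, the signature and the uint16_t stand-in for WCHAR are all assumptions made for illustration; the actual helper may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the 16-bit WCHAR type; an assumption for this sketch. */
typedef uint16_t example_wchar;

/* Hypothetical helper: returns true when the wide string and the ASCII
 * string contain the same characters and end at the same position. */
static bool example_strequal_mixed(const example_wchar *wide, const char *ascii)
{
    for (;;)
    {
        example_wchar w = *wide++;
        unsigned char c = (unsigned char)*ascii++;

        if (w != c)
            return false; /* covers mismatches and differing lengths */
        if (w == 0)
            return true;  /* both strings terminated together */
    }
}
```

Widening each ASCII byte before comparing avoids converting either string into a temporary buffer.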
Hans-Kristian Arntzen 22778b99be vkd3d: Handle default global root signature in RTPSO.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:37:34 +02:00
Hans-Kristian Arntzen b8b2a93aa6 tests: Add test coverage for two stages of AddToStateObject().
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:11:27 +02:00
Hans-Kristian Arntzen 14470d5456 tests: Add test for AddToStateObject.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:11:27 +02:00
Hans-Kristian Arntzen 3aad4edf6e tests: Add default NODE_MASK state object to RTPSO tests.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:11:27 +02:00
Hans-Kristian Arntzen 3c92b3a1bc vkd3d: Implement AddToStateObject().
This is barely implementable, and relies on implementations to do kinda
what we want.

To make this work in practice, we need to allow two pipelines per state
object. One that is created with LIBRARY and one that can be bound. When
incrementing the PSO, we use the LIBRARY one.

It seems to be allowed to create a new library from an old library.
It is more convenient for us if we're allowed to do this, so do this
until we're forced to do otherwise.

DXR 1.1 requires that shader identifiers remain invariant for child
pipelines if the parent pipeline also has them.
Vulkan has no such guarantee, but we can speculate that it works and
validate that identifiers remain invariant. This seems to work fine on
NVIDIA at least ... It probably makes sense that it works for
implementations where pipeline libraries are compiled at that time.

The basic implementation of AddToStateObject() is to consider
the parent pipeline as a COLLECTION pipeline. This composes well and
avoids a lot of extra implementation cruft.

Also adds validation to ensure that COLLECTION global state matches with
other COLLECTION objects and the parent. We will also inherit global
state like root signatures, pipeline config, shader configs etc when
using AddToStateObject().

The tests pass on NVIDIA at least.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:11:27 +02:00
Hans-Kristian Arntzen 8473355a98 vkd3d: Hold private ownership over global root signature.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 11:49:44 +02:00
Hans-Kristian Arntzen 1438ff5637 vkd3d: Allow different but compatible global root signature objects.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 11:49:44 +02:00
Hans-Kristian Arntzen c3ee963d2f vkd3d: Ignore NODE_MASK subobjects.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 11:49:44 +02:00
Hans-Kristian Arntzen 684e41fabe vkd3d: Do not perform initial layout transition for placed RTV / DSV.
Docs explicitly specify that placed RTV / DSV resources must be properly
initialized before use, either on first use or after aliasing barriers,
so there should be no need to perform initial layout transition.

Fixes spurious GPU hangs in Hitman III where application aliases
an indirect buffer and a DSV. The DSV is cleared after the indirect
buffer is consumed, but the initial_layout_transition is triggered and
HTILE init clobbered the buffer.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-30 15:06:59 +02:00
Philip Rebohle 1d869e3e21 vkd3d: Do not execute indirect commands if count buffer is unsupported.
Also be a bit more uniform with using break/return on fail conditions.

Otherwise, the indirect command will read data from the count buffer
instead, which may lead to bugs or GPU hangs.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-06-28 14:57:11 +02:00
Tatsuyuki Ishi 02c7ec404c vkd3d: Fix transfer batch clobbering state in begin_render_pass.
Transfer batch can clobber graphics pipeline for e.g. depth->color copies.
Hence, flushing the batches after applying the graphics pipeline set by the
app can cause correctness issues.

To prevent that, do the transfer batch flush first before we apply any
render-related states.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-06-28 13:53:03 +02:00
Hans-Kristian Arntzen 9b5f3bfc26 vkd3d-shader: Fix GRAD sample on cubes.
offset_component_count was set to 0 for cubes, but GRAD path also
uses the variable to check how many components to use for GRAD.
OFFSET is not supported for cubes, so that's likely why it was bugged.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-28 12:14:41 +02:00
Hans-Kristian Arntzen b4ab6c3f08 cache: Unmap files before attempting to delete.
Native Win32 does not like it.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-28 12:13:03 +02:00
Hans-Kristian Arntzen 707af8152e vkd3d: Add workaround for forced clearing of certain buffers.
If a game uses NOT_ZEROED, it might still rely on buffers being properly
cleared to 0.
Enable this and FORCE_RAW_VA_CBV for Halo Infinite.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-24 15:11:19 +02:00
Hans-Kristian Arntzen bc759be2af vkd3d: Optimize ExecuteIndirect() if no INDIRECT transitions happened.
The D3D12 docs outline this as an implementation detail explicitly, so
we should do the same thing.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-24 14:55:39 +02:00
Hans-Kristian Arntzen 18f1d1c72e vkd3d: Implement ExecuteIndirect with state update.
Implements the most basic iteration where we don't try to take advantage
of index LUT, hoisting CS patching or attempting to reuse application
indirect buffer directly.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-24 14:55:39 +02:00
Hans-Kristian Arntzen 1b704287e5 vkd3d: Enable NV_device_generated_commands extension.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-24 14:55:39 +02:00
Hans-Kristian Arntzen f975f09bb1 meta: Add ExecuteIndirect patch meta shader.
Currently we are translating the index type. This will be changed in a
follow up commit where we move over to index LUT.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 14:39:22 +02:00
Hans-Kristian Arntzen 619a54810d vkd3d: Pass down required memory types to scratch allocators.
Separate scratch pools by their intended usage.
Allows e.g. preprocess buffers to be
allocated differently from normal buffers, which is necessary on
implementations that use special memory types to implement preprocess
buffers.

Potentially can also allow for separate pools for
host visible scratch memory down the line.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 14:39:22 +02:00
Hans-Kristian Arntzen cecb8d6ebc vkd3d: Don't suballocate scratch buffers.
Scratch buffers are 1 MiB blocks which will end
up being suballocated. This was not intended and a fallout from the
earlier change where VA_SIZE was bumped to 2 MiB for Elden Ring.

Introduce a memory allocation flag INTERNAL_SCRATCH which disables
suballocation and VA map insert.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 14:39:22 +02:00
Hans-Kristian Arntzen 8ae391e675 vkd3d: Add more stringent validation for CreateCommandSignature.
The runtime is specified to validate certain things.
Also, be more robust against unsupported command signatures, since we
might need to draw/dispatch at an offset. Avoids hard GPU crashes.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00
Hans-Kristian Arntzen a30205589f common: Assert that alignment is > 0 and POT.
Found bug when allocating device generated commands.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00
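The check named in the commit above (alignment greater than zero and a power of two) can be sketched as follows. The helper names are hypothetical and are not taken from the project; the bit trick itself is the standard one.

```c
#include <assert.h>
#include <stddef.h>

/* True when alignment is non-zero and a power of two (POT). */
static int example_is_pot_alignment(size_t alignment)
{
    return alignment != 0 && (alignment & (alignment - 1)) == 0;
}

/* Hypothetical helper: rounds value up to the next multiple of alignment.
 * The mask trick is only valid for POT alignments, hence the assert. */
static size_t example_align_up(size_t value, size_t alignment)
{
    assert(example_is_pot_alignment(alignment));
    return (value + alignment - 1) & ~(alignment - 1);
}
```

An alignment of 0 turns the mask into all ones and silently returns value - 1, which is the kind of bug an assert like this surfaces early.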
Hans-Kristian Arntzen abdef77695 vkd3d: Add helper to invalidate all state.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00
Hans-Kristian Arntzen c132073df8 vkd3d: Refactor index buffer state to be flushed late.
With ExecuteIndirect state we'll need to modify or refresh index buffer
state.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00
Hans-Kristian Arntzen 128852200a vkd3d: Store the raw VA index in root signature for root descriptors.
Needed when building device generated commands later.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00
Hans-Kristian Arntzen 717026f903 vkd3d: Add VKD3D_CONFIG option to force raw VA CBV descriptors.
For certain ExecuteIndirect() uses, we're forced to use this path
since we have no way to update push descriptors indirectly yet.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00
Hans-Kristian Arntzen b849bd4256 vkd3d: Enable F1 2020 quirks on 2019 as well.
Same game bug.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-20 14:53:16 +02:00
Georg Lehmann d8905afd5d demos: Don't pretend to handle allocation failure.
This function doesn't indicate failure and the possibility of a return
causes -Wmaybe-uninitialized warnings.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2022-06-20 11:36:17 +02:00
Hans-Kristian Arntzen de5b751468 vkd3d: Enable VK_KHR_depth_stencil_resolve.
Required by KHR_dynamic_rendering. Caught by updated validation layers.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-17 11:54:31 +02:00
Hans-Kristian Arntzen 219d9698b3 tests: Fix compiler warnings in various tests.
Mostly related to casting vec4 struct to float where array[4] is expected.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-17 11:54:07 +02:00
Hans-Kristian Arntzen acef5429c5 vkd3d-shader: Workaround trivial compiler warning.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-17 11:54:07 +02:00
Hans-Kristian Arntzen 135aff4685 vkd3d: Remove the global VkPipelineCache.
Just use VK_NULL_HANDLE. We rely on the disk cache to exist anyways
here. We never serialize the global pipeline cache, so it might just
confuse drivers into disabling the disk cache if anything.

Also reduce memory bloat.

Also gets rid of very old NV driver workaround where we forced global
pipeline cache.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-17 11:53:46 +02:00
Hans-Kristian Arntzen 2f6a9e0d55 vkd3d: Do not attempt to clear dedicated memory allocations.
We rely on zerovram behavior in drivers. Opt-in to this path where we
know the implementation does what we want (backed up by testing).

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-17 11:53:28 +02:00
Hans-Kristian Arntzen 3a19dea7c7 tests: Ensure we try to allocate some larger buffers as well.
The suballocation test should also try to allocate >= 2 MiB buffers so
we can verify VRAM clear behavior for dedicated allocations as well.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-17 11:53:28 +02:00
Tatsuyuki Ishi 39d07dea2c vkd3d: Check for alias and batch barriers in CopyTextureRegion batches.
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-06-16 11:54:26 +02:00
Tatsuyuki Ishi 3577ca3144 vkd3d: Introduce transfer batches.
Transfer batches buffer CopyTextureRegion calls for batching.

The flushes need to happen in a few places:
1. ResourceBarrier: This is where the transition from COPY_DEST to another
   state might happen, at which point the writes must be visible. This might
   also transition away from COPY_SRC which invalidates the
   precondition.
2. Copy operations. Copies to the same resource are implicitly ordered.
3. Draws and dispatches. These are not strictly necessary, but we don't
   want too much command reordering so flushing here seems good.
4. Close. So that we don't throw commands into the void.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-06-16 11:54:26 +02:00
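The flush points listed in the commit above can be modelled with a small sketch. It is a simplification under stated assumptions, not the project's implementation: all names are hypothetical, resources are plain integer ids, and "copy operations" is reduced to flushing when a new copy targets a destination that is already pending.

```c
#include <stddef.h>

#define EXAMPLE_MAX_BATCH 16

/* Hypothetical command list that buffers copies until a flush point. */
struct example_cmd_list
{
    int pending_dst[EXAMPLE_MAX_BATCH]; /* destination ids of buffered copies */
    size_t pending_count;
    size_t executed_copies;
    size_t flush_count;
};

/* Executes every buffered copy as one batch. */
static void example_flush(struct example_cmd_list *list)
{
    if (!list->pending_count)
        return;
    list->executed_copies += list->pending_count;
    list->pending_count = 0;
    list->flush_count++;
}

static void example_copy_texture_region(struct example_cmd_list *list, int dst)
{
    size_t i;

    /* Copies to the same resource are implicitly ordered: flush first. */
    for (i = 0; i < list->pending_count; i++)
    {
        if (list->pending_dst[i] == dst)
        {
            example_flush(list);
            break;
        }
    }
    if (list->pending_count == EXAMPLE_MAX_BATCH)
        example_flush(list);
    list->pending_dst[list->pending_count++] = dst;
}

/* Barriers may transition away from COPY_DEST / COPY_SRC. */
static void example_resource_barrier(struct example_cmd_list *list)
{
    example_flush(list);
}

/* Not strictly required, but limits command reordering. */
static void example_draw(struct example_cmd_list *list)
{
    example_flush(list);
}

/* Nothing may be left pending when the list is closed. */
static void example_close(struct example_cmd_list *list)
{
    example_flush(list);
}
```

Two copies to different destinations stay in one batch here, while a third copy to an already pending destination forces the batch out before it is recorded.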
Tatsuyuki Ishi 829ac72e3d vkd3d: Break up CopyTextureRegion into three stages.
A parameter preparation stage, a pre-execution barrier stage, then finally
the execution and post-execution barrier stage.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-06-13 14:40:23 +02:00
Hans-Kristian Arntzen c64916686d vkd3d: Clear SUSPENDED flag properly.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-13 13:46:49 +02:00
Hans-Kristian Arntzen c4b00bbe1e tests: Avoid tripping out of spec UAV casts.
5.3.9.5 in the D3D11 spec explicitly outlines when we can
cast to R32{U,I,F}. The D3D12 validation layers
seem to have missed this.

Fixes assertions in RADV when running test under debug.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-08 17:09:40 +02:00
Hans-Kristian Arntzen fd05839eb9 vkd3d: Only enable native FP16 codegen for RADV.
Regression in Deathloop on NV with native FP16 FSR.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-03 16:15:54 +02:00
Hans-Kristian Arntzen 46470017a3 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-03 16:15:54 +02:00
Georg Lehmann cbca29dd90 tests: Fix -Wstringop-overread warnings.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2022-06-01 20:41:36 +02:00
Hans-Kristian Arntzen c3fb6a6c5e dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:53:02 +02:00
Hans-Kristian Arntzen e8f1936ee2 vkd3d: Convert VKD3D_CONFIG flags to 64-bit constants.
We're soon running out of 32-bit space.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:31:48 +02:00
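As a small illustration of why commit e8f1936ee2 moves the flags to 64-bit constants: a flag at bit 32 or above cannot be represented in a 32-bit mask. The flag names below are hypothetical and are not the actual VKD3D_CONFIG flag names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names; only the 64-bit constant pattern matters here. */
#define EXAMPLE_CONFIG_FLAG_FIRST  (1ull << 0)
#define EXAMPLE_CONFIG_FLAG_BIT_31 (1ull << 31)
#define EXAMPLE_CONFIG_FLAG_BIT_32 (1ull << 32) /* does not fit in 32 bits */

bool example_config_has_flag(uint64_t config, uint64_t flag)
{
    /* The flag must have exactly one bit set. */
    assert(flag != 0 && (flag & (flag - 1)) == 0);
    return (config & flag) != 0;
}
```

Truncating `EXAMPLE_CONFIG_FLAG_BIT_32` to `uint32_t` yields zero, which is the failure mode 64-bit constants avoid.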
Hans-Kristian Arntzen 4166eb042b tests: Add exploratory test for accessing root descriptors with overflow.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:31:38 +02:00
Hans-Kristian Arntzen 7a002698f3 vkd3d-shader: Prefer InBounds access chains for root descriptors.
Gets better codegen, since compiler no longer has to assume
that negative indices can be generated, which means full 64-bit sign
extension and addressing math (slow).

Based on experiments, no native driver lets -1 indices work,
so it's safe to make the u32 assumption.

See test_root_descriptor_offset_sign as a justification for this change.

Also, see https://gitlab.freedesktop.org/mesa/mesa/-/issues/6562
for discussion on InBounds.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:31:38 +02:00
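The addressing difference described in commit 7a002698f3 can be mimicked on the CPU. The sketch below is not shader codegen; it only contrasts treating the same 32-bit index as signed (sign-extended to 64 bits) versus unsigned (the "u32 assumption"). The function names are hypothetical, and the signed variant assumes the usual two's-complement conversion from `uint32_t` to `int32_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Index treated as signed: it must be sign-extended to 64 bits first. */
uint64_t example_address_signed_index(uint64_t base, uint32_t index_bits, uint32_t stride)
{
    int64_t index = (int32_t)index_bits; /* 0xffffffff becomes -1 */
    assert(stride != 0);
    return base + (uint64_t)(index * (int64_t)stride);
}

/* Index treated as u32: plain zero-extension, no negative offsets possible. */
uint64_t example_address_unsigned_index(uint64_t base, uint32_t index_bits, uint32_t stride)
{
    assert(stride != 0);
    return base + (uint64_t)index_bits * stride;
}
```

For small non-negative indices both functions agree; they only diverge for bit patterns such as `0xffffffff`, where the signed version steps backwards from the base address.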
Hans-Kristian Arntzen 896e6fb868 vkd3d-shader: Enable native 16-bit path for min16float DXIL.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:31:22 +02:00
Hans-Kristian Arntzen 8989360087 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:31:22 +02:00
Hans-Kristian Arntzen f804ddc4c7 vkd3d: Allow integer dot product unconditionally.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:31:22 +02:00
Hans-Kristian Arntzen 3b0d7e043d tests: Add more small resource tests to get_resource_tiling test.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-31 16:00:11 +02:00
Hans-Kristian Arntzen 75e0506404 tests: Add test for RTV count > 0 and no pixel shader.
Attempt to bind mismatching format. Observe it is ignored.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-31 16:00:11 +02:00
Hans-Kristian Arntzen 0f9d7dd10d vkd3d: Force RT count to 0 when PS does not exist.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-31 16:00:11 +02:00
Hans-Kristian Arntzen 7acc33ae39 vkd3d: Always return tile shape.
Docs are lying. :\

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-31 16:00:11 +02:00
Hans-Kristian Arntzen 7916d2a6d8 vkd3d: Enable and use VK_KHR_fragment_shader_barycentric.
For now, just keep the NV path as well. It's the exact same extension
basically as the KHR one.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-31 15:59:49 +02:00
Hans-Kristian Arntzen 48157c29e8 khronos: Update Vulkan headers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-31 15:59:49 +02:00
Hans-Kristian Arntzen 467db76f90 vkd3d: Remove obsolete COLOR -> COMPUTE workaround for Deathloop.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-31 15:59:35 +02:00
Hans-Kristian Arntzen 2953ef8688 tests: Remove query TODOs from ray tracing tests.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-30 20:26:50 +02:00
Hans-Kristian Arntzen f964532619 vkd3d: Implement extended DXR queries.
Requires ray_tracing_maintenance1.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-30 20:26:50 +02:00
Hans-Kristian Arntzen 5a0c8289d8 tests: Add test for FirstWSlice/WSlice on 3D UAV.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-30 15:09:09 +02:00
Hans-Kristian Arntzen cca7613bca dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-25 13:08:39 +02:00
Philip Rebohle 910f15dff8 vkd3d: Only set VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT for color attachments.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-05-23 17:17:17 +02:00
Hans-Kristian Arntzen a94e9b8b6a vkd3d: Don't create user descriptors until we have observed a pipeline.
If we don't get a swapchain on first frame for whatever reason, defer
creating the descriptors.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-23 16:58:15 +02:00
Hans-Kristian Arntzen 4ac0a3b455 vkd3d: Robustly fall back to user buffers if we fail to present twice.
If we fail to present after a swapchain recreation, force a SURFACE_LOST
scenario and try again later when things hopefully stabilize.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-23 16:58:15 +02:00
Hans-Kristian Arntzen 300058d9a7 vkd3d: Handle all errors after present, not just OUT_OF_DATE.
Can have SURFACE_LOST here as well for example.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-23 16:58:15 +02:00
Hans-Kristian Arntzen 2e16a777ca vkd3d: Get rid of redundant recreate swapchain call.
It just called create_vulkan_swapchain anyways.
Also, add an extra parameter to support temporary user buffer fallbacks.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-23 16:58:15 +02:00
Hans-Kristian Arntzen ac211d5f6a vkd3d: Remove direct calls to d3d12_swapchain_destroy_views.
Refactor destroy_buffers to destroy_resources as it's more obvious what
it's doing that way.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-23 16:58:15 +02:00
Hans-Kristian Arntzen 1dc4bbe5f2 utils: Report Wine segfault VkResult directly.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-23 16:58:15 +02:00
Tatsuyuki Ishi 2965b7e379 vkd3d/tests: Fix Release orders.
Fixes ASan use-after-free warnings on Release.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-05-23 09:58:30 +02:00
Tatsuyuki Ishi 0d9c0a3903 vkd3d: Fix aligned_alloc ASan errors on native.
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-05-23 09:58:30 +02:00
Robin Kertels 1a773cfb71 tests: Add test for indirect ray tracing.
Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
2022-05-11 19:11:01 +02:00
Robin Kertels cdabda7805 vkd3d: Implement indirect ray tracing.
Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
2022-05-11 19:11:01 +02:00
Robin Kertels 8ac7aaca99 vkd3d: Enable VK_KHR_ray_tracing_maintenance1.
Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
2022-05-11 19:11:01 +02:00
Robin Kertels 7e7c472005 khronos: Update Vulkan headers
Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
2022-05-11 19:11:01 +02:00
Hans-Kristian Arntzen 71940797d1 vkd3d: Check for redundant dynamic state in some cases.
Some dynamic state is at risk of being spammed with the same arguments many
times. For the dynamic state that is trivial to check, do so.

Ghostwire: Tokyo has been observed to spam the same OMSetStencilRef
value causing some context rolls, also RSSetShadingRate has been set
redundantly.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-03 16:30:42 +02:00
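The redundancy check from commit 71940797d1 amounts to comparing against the last value set. A minimal sketch with hypothetical names follows; the real command list tracks considerably more state than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of tracked dynamic state. */
struct example_dynamic_state
{
    uint32_t stencil_ref;
    bool stencil_ref_valid;   /* false until the first set call */
    unsigned int dirty_count; /* how often the state was marked dirty */
};

void example_set_stencil_ref(struct example_dynamic_state *state, uint32_t ref)
{
    assert(state != NULL);
    /* Same value as last time: skip, so no redundant state is emitted. */
    if (state->stencil_ref_valid && state->stencil_ref == ref)
        return;

    state->stencil_ref = ref;
    state->stencil_ref_valid = true;
    state->dirty_count++;
}
```

Calling the setter repeatedly with one value marks the state dirty only once.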
Hans-Kristian Arntzen 4603c25d69 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-28 13:54:11 +02:00
Hans-Kristian Arntzen 97201b8e93 vkd3d: Clean up straggling getenv() calls.
Replace with the new vkd3d_get_env wrapper.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-25 16:42:41 +02:00
Hans-Kristian Arntzen 51199752dd vkd3d: Fix queue creation for queue family -1.
Fixes validation error on Intel where we are trying to create
CONCURRENT family with {0, -1}.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-25 15:54:13 +02:00
Hans-Kristian Arntzen ebe589d622 tests: Add test for waveop in infinite loop convergence.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-25 14:42:18 +02:00
Hans-Kristian Arntzen 55a6847c61 vkd3d: Fix MSVC warning about redundant snprintf argument.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-25 14:41:44 +02:00
Hans-Kristian Arntzen 04c020525c common: Fix missing include.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-22 18:31:59 +02:00
Dean Beeler 063ce7e4bd Use Windows specific environment calls for better Windows compatibility.
Signed-off-by: David McCloskey <davmcclo@gmail.com>
2022-04-22 17:40:21 +02:00
Hans-Kristian Arntzen 2c54e18245 common: Fix _BitScanForward usage on MSVC.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-22 17:11:07 +02:00
Philip Rebohle bb2e35c539 vkd3d: Use vkGetDevice{Buffer,Image}MemoryRequirementsKHR in vkd3d_memory_info_init.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-04-22 11:36:02 +02:00
Philip Rebohle d5ad5bb1de vkd3d: Use vkGetDeviceImageMemoryRequirementsKHR in vkd3d_get_image_allocation_info.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-04-22 11:36:02 +02:00
Philip Rebohle beb58f8472 vkd3d: Enable and require VK_KHR_maintenance4.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-04-22 11:36:02 +02:00
Hans-Kristian Arntzen 358f95aff2 vkd3d: Ignore cached SPIR-V if we're dumping SPIR-V.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-22 11:29:27 +02:00
Philip Rebohle 119e00ed45 vkd3d: Do not add uint format to image format list.
Fixes #1069.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-04-21 13:51:58 +02:00
Philip Rebohle beaedbd857 vkd3d: Use UAV clear fallback based on format compatibility.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-04-21 13:51:58 +02:00
Philip Rebohle 81927c5895 vkd3d: Fix handling of non-zero base layer in ClearUAV fallback path.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-04-21 13:51:58 +02:00
Philip Rebohle e7a6af4971 vkd3d: Use texel buffer views for UAV clears with buffer to image copy.
Allows this to more easily work with more formats.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-04-21 13:51:58 +02:00
Philip Rebohle a1d5e6f39a vkd3d: Re-add R11G11B10 format compatibility info.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-04-21 13:51:58 +02:00
Hans-Kristian Arntzen 4a05360a0a dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-20 16:56:26 +02:00
Hans-Kristian Arntzen 0c4df9b32c ci: Workaround safe directory errors in vkd3d_build generation.
See https://github.com/actions/checkout/issues/760 for reference.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-20 15:54:30 +02:00
Hans-Kristian Arntzen 25c4bc18e7 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-08 13:04:26 +02:00
Hans-Kristian Arntzen 30ec6b7f1f dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-07 13:06:04 +02:00
Hans-Kristian Arntzen c47a6a904b meta: Add docs for magic shader cache.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen 5044975152 vkd3d: Drop redundant validate of PSO state blob from disk cache.
If we get an entry, it's implicitly validated.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen 8dc8b72807 cache: Add some performance information for shader cache operations.
They can take a long time and it's useful to have some reports here.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen ae0dafa3a1 cache: Attempt to use disk cache instead when appropriate.
When the disk cache is used, the cache we give back to applications is a
dummy. Therefore, try to use the disk cache blob if we detect a useless
application blob.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen 6c8542f7d6 vkd3d: Make use of internal pipeline library if we're asked to.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen 2dcb1e2efc cache: Implement an on-disk pipeline library.
With VKD3D_SHADER_CACHE_PATH, we can add automatic serialization of pipeline
blobs to disk, even for games which do not make any use of GetCachedBlob
or ID3D12PipelineLibrary interfaces. Most applications expect drivers to
have some kind of internal caching.

This is implemented as a system where a disk
thread will manage a private ID3D12PipelineLibrary, and new PSOs are
automatically committed to this library. PSO creation will also consult
this internal pipeline library if applications do not provide their own
blob.

The strategy for updating the cache is based on a read-only cache which
is mmaped from disk, with an exclusive write-only portion for new blobs,
which ensures some degree of safety if there are multiple
concurrent processes using the same cache.

The memory layout of the disk cache is optimized to be very efficient
for appending new blobs, just simple fwrites + fflush.
The format is also robust against sliced files, which solves the problem
where applications tear down without destroying the D3D12 device
properly.

This structure is very similar to Fossilize, and in fact the idea is to
move towards actually using the Fossilize format directly later.
This implementation prepares us for this scenario where e.g. Steam could
potentially manage the vkd3d-proton cache.

The main complication in this implementation is that we have to merge
the read-only and write caches.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-06 16:36:26 +02:00
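To illustrate what "robust against sliced files" in commit 2dcb1e2efc can mean for an append-only layout, here is a sketch that works on an in-memory buffer. The record layout (a size plus a checksum header followed by the payload), the checksum function and all names are assumptions for the example; they are not the actual vkd3d-proton or Fossilize format. The reader stops at the first incomplete or corrupt record and keeps everything before it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header; the payload follows immediately after it. */
struct example_record_header
{
    uint32_t payload_size;
    uint32_t checksum;
};

uint32_t example_checksum(const uint8_t *data, size_t size)
{
    uint32_t sum = 0;
    size_t i;
    for (i = 0; i < size; i++)
        sum = sum * 31u + data[i];
    return sum;
}

/* Appends one record at offset and returns the new end offset. */
size_t example_append_record(uint8_t *file, size_t offset, size_t capacity,
        const void *payload, uint32_t payload_size)
{
    struct example_record_header header;

    assert(offset + sizeof(header) + payload_size <= capacity);
    header.payload_size = payload_size;
    header.checksum = example_checksum(payload, payload_size);
    memcpy(file + offset, &header, sizeof(header));
    memcpy(file + offset + sizeof(header), payload, payload_size);
    return offset + sizeof(header) + payload_size;
}

/* Counts complete records; a sliced tail is ignored rather than an error. */
size_t example_count_valid_records(const uint8_t *file, size_t file_size, size_t *valid_bytes)
{
    struct example_record_header header;
    size_t offset = 0, count = 0;

    while (file_size - offset >= sizeof(header))
    {
        memcpy(&header, file + offset, sizeof(header));
        if (header.payload_size > file_size - offset - sizeof(header))
            break; /* payload was cut off */
        if (example_checksum(file + offset + sizeof(header), header.payload_size) != header.checksum)
            break; /* payload is corrupt */
        offset += sizeof(header) + header.payload_size;
        count++;
    }

    *valid_bytes = offset;
    return count;
}
```

A writer that dies mid-append loses at most the record it was writing; earlier records stay readable.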
Hans-Kristian Arntzen 3095ed84d3 cache: Add concept of internal pipeline libraries.
For internal pipeline libraries, we want a somewhat different strategy.

- PSOs are keyed by hash instead of user key.
- We want the option to conditionally store SPIR-V and PSO blobs.
  For internal caches, there isn't much of a reason to store PSO blobs
  since the disk cache is going to be primed anyways.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-05 14:12:20 +02:00
Hans-Kristian Arntzen db9b9a13de cache: Fix misleading comment about chunk alignment.
It's 8. Used to be 4 before some other fixes ...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-05 14:12:20 +02:00
Hans-Kristian Arntzen 637834dc75 vkd3d: Make private_root_signatures actually private.
Makes sure that we drop private root signature device references when
public pipeline state refcount hits 0.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-05 14:12:20 +02:00
Hans-Kristian Arntzen 93928424a9 common: Move time query to common header.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-05 14:12:20 +02:00
Hans-Kristian Arntzen c8b143c0bd common: Add wrapper for _ftelli64/_fseeki64.
MSVC doesn't have ftello64/fseeko64, nor off64_t.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-05 14:12:20 +02:00
Hans-Kristian Arntzen ca0a186a4b common: Add some file utils.
Supports more advanced file operations than we'd normally need.
Intended to be used by magic disk cache.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-05 14:12:20 +02:00
Philip Rebohle c9101b8ec3 tests: Add test to clear R11G11B10 UAV to zero.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-04-05 11:52:23 +02:00
Philip Rebohle 829c02bf90 vkd3d: Remove format compatibility info for R11G11B10.
Not allowing R32 views may give us compression back in some scenarios.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-04-05 11:52:23 +02:00
Philip Rebohle e4184830c5 vkd3d: Add ClearUAV path that uses buffer-to-image copies.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-04-05 11:52:23 +02:00
Philip Rebohle d1425ee4d1 vkd3d: Use VK_ACCESS_MEMORY_{READ,WRITE}_BIT where appropriate
Buggy RADV versions no longer work due to missing extension support.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-04-05 11:52:23 +02:00
Denis Barkar 8dda6df729 vkd3d: Force non-invariant position for Serious Sam 4.
Signed-off-by: Denis Barkar <dbarkar@nvidia.com>
2022-04-01 15:34:52 +02:00
Joshua Ashton 2ed513b99a vkd3d: Remove VKD3D_MAX_DYNAMIC_STATE_COUNT
This was off by one, at some point, which could cause a stack buffer overrun which is naughty.

Replace this with just an ARRAY_SIZE on the dynamic_state_list for the array size.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2022-04-01 15:19:18 +02:00
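The fix in commit 2ed513b99a follows a common C pattern: derive the array bound from the table itself instead of maintaining a separate count constant that can drift. A sketch with hypothetical names and an invented four-entry table:

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical state identifiers, not the real Vulkan enum. */
enum example_dynamic_state
{
    EXAMPLE_DYNAMIC_STATE_VIEWPORT,
    EXAMPLE_DYNAMIC_STATE_SCISSOR,
    EXAMPLE_DYNAMIC_STATE_BLEND_CONSTANTS,
    EXAMPLE_DYNAMIC_STATE_STENCIL_REFERENCE,
};

static const enum example_dynamic_state example_dynamic_state_list[] =
{
    EXAMPLE_DYNAMIC_STATE_VIEWPORT,
    EXAMPLE_DYNAMIC_STATE_SCISSOR,
    EXAMPLE_DYNAMIC_STATE_BLEND_CONSTANTS,
    EXAMPLE_DYNAMIC_STATE_STENCIL_REFERENCE,
};

/* The bound always matches the table, even when entries are added later. */
#define EXAMPLE_DYNAMIC_STATE_COUNT EXAMPLE_ARRAY_SIZE(example_dynamic_state_list)

/* Copies the states selected by enabled_mask into out; returns how many. */
size_t example_collect_enabled(unsigned int enabled_mask,
        enum example_dynamic_state *out, size_t capacity)
{
    size_t i, count = 0;

    for (i = 0; i < EXAMPLE_DYNAMIC_STATE_COUNT; i++)
    {
        if (enabled_mask & (1u << i))
        {
            assert(count < capacity);
            out[count++] = example_dynamic_state_list[i];
        }
    }
    return count;
}
```

A caller sizes its output buffer with the same macro, so the worst case (every state enabled) always fits.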
Hans-Kristian Arntzen 19e088cdfc tests: Add test for weird CBV layouts.
CBufferLoad and 16-bit/64-bit tests.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-30 20:13:32 +02:00
Hans-Kristian Arntzen 241078d7e8 vkd3d: Add scalar UBO layout requirement for SM 6.0.
Needed to support SM 6.0 CBufferLoad.
This path is mostly unused since it's opt-in in DXC and horribly broken
...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-30 20:13:32 +02:00
Hans-Kristian Arntzen e01589a33b dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-30 20:13:32 +02:00
Hans-Kristian Arntzen 2e704c5a5e tests: Test primitive restart behavior on list primitives.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-30 16:12:16 +02:00
Hans-Kristian Arntzen 6f43f450c8 vkd3d: Disable primitive restart when using non-compatible topologies.
Primitive restart is only used for strip primitive types, and must be
ignored for lists. Use and require extended_dynamic_state2 for this
purpose.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-30 16:12:16 +02:00
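The rule stated in commit 6f43f450c8 (primitive restart applies to strips and is ignored for lists) reduces to a small predicate. The enum and function names below are hypothetical; the real code sets the corresponding Vulkan dynamic state rather than returning a boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of topologies. */
enum example_topology
{
    EXAMPLE_TOPOLOGY_POINT_LIST,
    EXAMPLE_TOPOLOGY_LINE_LIST,
    EXAMPLE_TOPOLOGY_LINE_STRIP,
    EXAMPLE_TOPOLOGY_TRIANGLE_LIST,
    EXAMPLE_TOPOLOGY_TRIANGLE_STRIP,
};

bool example_topology_is_strip(enum example_topology topology)
{
    assert(topology >= EXAMPLE_TOPOLOGY_POINT_LIST && topology <= EXAMPLE_TOPOLOGY_TRIANGLE_STRIP);
    return topology == EXAMPLE_TOPOLOGY_LINE_STRIP || topology == EXAMPLE_TOPOLOGY_TRIANGLE_STRIP;
}

/* Restart requested by the pipeline only takes effect for strip topologies. */
bool example_effective_primitive_restart(bool pso_requests_restart, enum example_topology topology)
{
    return pso_requests_restart && example_topology_is_strip(topology);
}
```

Because the topology can change at draw time, the effective value has to be recomputed whenever it is set, which is where a dynamic primitive restart state helps.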
Hans-Kristian Arntzen cfeaa18b09 vkd3d: Enable MUTABLE_SINGLE_SET for Intel GPUs.
There are strict limits on number of descriptors which can be used,
and we have to use MUTABLE + single set to make this work.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-30 12:25:20 +02:00
Hans-Kristian Arntzen da63f0beac vkd3d: Compute range_end after sparse checks in copy tracking.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-30 12:13:25 +02:00
Hans-Kristian Arntzen 35e777f8a0 meta: Update docs for latest breadcrumbs/debug-ring work.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-30 12:13:16 +02:00
Hans-Kristian Arntzen 095a36cbaf meta: Update stale notes about driver versions.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-30 12:13:16 +02:00
Philip Rebohle 6378f1b880 vkd3d: Optimize WriteBufferImmediate for consecutive writes.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-30 11:51:10 +02:00
Philip Rebohle 307190e96b tests: Test WriteBufferImmediate with disjoint ranges.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-30 11:51:10 +02:00
Hans-Kristian Arntzen 2e8fb27182 vkd3d: Correctly handle dynamic depth/stencil attachment infos.
{depth,stencil}AttachmentFormat and p{Depth,Stencil}Attachment are only
allowed if the format contains that aspect. Check this explicitly.

Fixes some validation errors.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-24 17:55:32 +01:00
Hans-Kristian Arntzen 1b5f7e8fc3 vkd3d: Use VkImageViewCreateInfo correctly.
For EXTENDED_USAGE, we still need to restrict image usage when creating
concrete views.
Use VkImageViewUsageCreateInfo to restrict usage flags to the kind of
view we're creating.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-24 17:55:32 +01:00
Hans-Kristian Arntzen cf65a78570 vkd3d: Rename DSV UNKNOWN workaround query.
Make it more obvious what it's really trying to check.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-23 22:36:00 +01:00
Philip Rebohle 1d3957fe6d vkd3d: Do not create pipeline variants for NULL DSV.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-23 22:22:09 +01:00
Philip Rebohle c9abcfa656 vkd3d: Use d3d12_graphics_pipeline_state_has_unknown_dsv_format more consistently.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-23 22:22:09 +01:00
Hans-Kristian Arntzen 03427c6ee6 vkd3d: Explicitly use NULL RTV mask for dual source blending.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-23 14:29:51 +01:00
Hans-Kristian Arntzen 09682f8417 tests: Extend validation tests for dual source blending.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-23 14:29:51 +01:00
Hans-Kristian Arntzen 6273780e50 vkd3d: Accurately validate dual source blend state.
We need to check RTVFormats and IO signature.
If RTVFormat uses a non-null format and the IO signature also has an
active entry, we must fail compilation.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-23 14:29:51 +01:00
Hans-Kristian Arntzen 6e915dd2c0 vkd3d: Use rt_count as basis for binding RTVs.
Found some validation errors where rt_count != rtv_active_mask,
and blending used rt_count instead of rtv_active_mask. If shader renders
to a NULL attachment, we must make sure that it's part of the PSO
interface.

Also, use rt_count rather than active mask when beginning render pass.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-23 14:29:51 +01:00
Philip Rebohle 34f5fc6a31 vkd3d: Do not create pipeline variants for NULL RTVs.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-22 13:06:00 +01:00
Hans-Kristian Arntzen 63530501a5 vkd3d: Require VK_EXT_extended_dynamic_state.
This is basically required to avoid horrible stutter and poor performance,
and it is widely supported.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-16 17:48:21 +01:00
Hans-Kristian Arntzen dd6534f3f8 vkd3d: Report enabled debug ring size as INFO instead of WARN.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen 09997b4dd8 vkd3d: Fish for message clues on device lost.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen 6d35f98e59 vkd3d: Emit deadca7 cookie for num_words in debug ring.
Makes it somewhat feasible to fish for message begin codes in the
stream.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:26:27 +01:00
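The idea behind commit 6d35f98e59 is that a recognizable cookie in each message header lets a reader resynchronize on a ring whose contents are only partially trustworthy. The sketch below assumes a purely hypothetical encoding in which the upper 28 bits of the header word hold `0xdeadca7` and the low 4 bits hold the word count; the actual bit layout used by vkd3d-proton is not given in the commit message and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: cookie in the top 28 bits, num_words in the low 4 bits. */
#define EXAMPLE_COOKIE_MASK  0xfffffff0u
#define EXAMPLE_COOKIE_VALUE 0xdeadca70u

uint32_t example_encode_header(uint32_t num_words)
{
    assert(num_words <= 0xfu);
    return EXAMPLE_COOKIE_VALUE | num_words;
}

int example_is_header(uint32_t word)
{
    return (word & EXAMPLE_COOKIE_MASK) == EXAMPLE_COOKIE_VALUE;
}

uint32_t example_header_num_words(uint32_t word)
{
    assert(example_is_header(word));
    return word & ~EXAMPLE_COOKIE_MASK;
}

/* Returns the index of the next header at or after start, or count if none. */
size_t example_find_next_header(const uint32_t *ring, size_t count, size_t start)
{
    size_t i;

    for (i = start; i < count; i++)
        if (example_is_header(ring[i]))
            return i;
    return count;
}
```

After a device loss, a reader could walk the ring with `example_find_next_header` and decode only the messages whose headers it can positively identify.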
Hans-Kristian Arntzen e61cc0234a vkd3d: Allow debug ring to know about device lost scenarios.
For this case, we want to block and tear down the debug ring thread.
It's okay to fish for dead messages in the ring, since we know there
won't be more GPU work submitted.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen c54895b4b7 vkd3d: Fix overflow of ring_size.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen a6700d3d85 vkd3d: Make debug ring aware of potential crash scenarios.
If we expect device loss (breadcrumb debug), we need to use DEVICE uncached/coherent,
since we might not be able to flush GPU caches properly.

We also need to remove the idea of being able to copy out the control
block back to host. This is too brittle and we should instead just place
the control block in PCI-e BAR instead. Rethink how we pass messages
from GPU to CPU to make it more robust.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen f0cac9d97c debug: Make elects helper-lane aware.
The elected lane must be able to perform side effects, so make sure
helper lanes don't participate.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen 08c0ea209f debug: Add helper Makefile to easily build shader override modules.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen 64d42c08ee debug: Add helpers to do wave uniform debug messages.
If we know the input is wave uniform (progress markers for example),
no need to spam the log.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen 3d8ef2b349 debug: Emit messages more robustly in face of crashes.
Attempt to enforce memory order on the num_words
to only commit complete messages.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen 33b9166fec vkd3d: Make device coherency extension optional for breadcrumbs.
Some implementations can support markers, but not explicit coherency.
Buffer markers are often uncached either way, so should be fine ...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen 972ce74ac6 vkd3d: When using breadcrumbs, consider that WaitSemaphore can be buggy.
Spec says that in device lost, driver must return DEVICE_LOST in finite
time, but this does not happen on NV drivers. Use a long timeout instead
in this scenario.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:07:56 +01:00
Robin Kertels 5f97d1eb70 vkd3d: Implement NV_checkpoint path for breadcrumbs.
Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
Co-authored-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:07:56 +01:00
Robin Kertels a6ea442819 vkd3d: Enable VK_NV_device_diagnostic_checkpoints.
Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen 365dd05557 vkd3d: Add breadcrumbs support.
AMD path for this commit.
Idea is that we can automatically instrument markers with command list
information we can make some sense of in vkd3d-proton.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen 5017b3723c vkd3d: Enable VK_AMD_device_coherent_memory.
For breadcrumbs support, along with buffer marker.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen 6a4f2842cb cache: Move d3d12_pipeline_library to internal references.
Allow us to hold internal magic pipeline libraries without creating
cycles.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 12:29:32 +01:00
Hans-Kristian Arntzen 18a5315db4 cache: Refactor lock strategy of internal hashmaps.
Rather than having to take writer lock on serialize calls from the
outside, we should just take locks when accessing the internal hashmaps
instead.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 12:29:32 +01:00
Hans-Kristian Arntzen 7c228139c3 cache: Refactor out pipeline library serialization.
If outer code has taken a reader lock, we don't need to lock again.
Also allows a caller to do GetSerializedSize + Serialize under one
reader lock.

This will be relevant for magic cache implementation.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 12:29:32 +01:00
Hans-Kristian Arntzen 30b4abcea1 vkd3d: Do not discard images in Clear*View() unless we have to.
It's redundant to add an UNDEFINED transition here for committed
resources. We need it for sparse and placed resources to handle aliasing
rules, but that's it.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-10 15:14:55 +01:00
Hans-Kristian Arntzen 17b1ffb41a vkd3d: Add path to use GENERAL depth-stencil images.
On some implementations, it doesn't matter for performance what we use,
and we can avoid a lot of ugly barriers this way.

Opt in to using this extension on GPUs we know handle it well;
otherwise, keep using the tracking paths.

With VK_KHR_dynamic_rendering, this is now feasible to do since we no longer
have to deal with shenanigans related to VkRenderPass layouts and
complicated compatibility rules.

To make this work with the existing framework, just need to consider
that GENERAL can be a common layout alongside DEPTH_STENCIL_OPTIMAL,
which are both common layouts that do not need to be tracked at all.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-10 15:14:55 +01:00
Hans-Kristian Arntzen f9da3bf564 vkd3d: Add VK_KHR_driver_properties.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-10 15:14:55 +01:00
Hans-Kristian Arntzen 5c70a24de1 tests: Test ref-count behavior of pipeline libraries.
It seems like we have to internally hold ID3D12PipelineState with
private references and hand it out to applications on request.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-09 18:35:09 +01:00
Hans-Kristian Arntzen c6149b47cd cache: Handle ref-count rules for multiple LoadPipeline/StorePipeline.
In pipeline libraries, the library holds on to private references of the
pipelines so that they can be rapidly loaded on-demand.

This behavior is verified by API tests.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-09 18:35:09 +01:00
Hans-Kristian Arntzen cc08339624 vkd3d: Use internal_refcounts for pipeline state.
When we store pipeline state in libraries we have to manage lifetime a
bit differently, which requires internal refcounts of some sort.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-09 18:35:09 +01:00
Hans-Kristian Arntzen 422f6804fb vkd3d: Enable VK_KHR_create_renderpass2.
Required extension by VK_KHR_fragment_shading_rate and
VK_KHR_separate_depth_stencil_layouts, but we don't care about enabling
any features or use it directly.

Needed to silence validation errors.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-09 16:35:05 +01:00
Georg Lehmann 7d4ed66881 meta: Remove VK_KHR_create_renderpass2 from README.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2022-03-08 18:34:18 +01:00
Georg Lehmann 14a06680d9 vkd3d: Remove unused renderpass remains.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2022-03-08 18:34:18 +01:00
Hans-Kristian Arntzen c9bac85dd1 tests: Add test for DSV plane tracking.
Tests various scenarios where we need to handle DSV layouts:
- Clears
- Discards
- Draw
- Transitions

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-08 18:11:50 +01:00
Hans-Kristian Arntzen 409dc57645 vkd3d: Properly decay depth-stencil images.
When performing a decay of a DSV resource, make sure to transition all
subresources, not just the particular aspect being transitioned.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-08 18:11:50 +01:00
Hans-Kristian Arntzen b330900659 vkd3d: Do not transition all aspects for single subresource.
We require separate DS layouts.
Fixes validation errors where we transition from read-only, but our
neighbor aspect might have been optimal.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-08 18:11:50 +01:00
Hans-Kristian Arntzen 92a8c0ad78 meta: Add KHR_dynamic_rendering to list of required features.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-08 18:05:51 +01:00
Hans-Kristian Arntzen c864f1322f khronos: Update Vulkan headers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-08 18:05:35 +01:00
Philip Rebohle 9a408367dc vkd3d: Remove render pass cache.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle 51e6b2bbbe vkd3d: Remove render pass from command list state.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle 94f82d1085 vkd3d: Get rid of pipeline variant flags.
These only existed for VRS attachment, which is no longer
necessary with VK_KHR_dynamic_rendering.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle 1a68267962 vkd3d: Remove framebuffer list from d3d12_command_allocator.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle c4f88951fc vkd3d: Use dynamic rendering for regular draw calls.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle 9673ac173d vkd3d: Use dynamic rendering for pipeline creation.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle 3783eaf4f7 vkd3d: Implement swap chain blits using dynamic rendering.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle 024ef02f9b vkd3d: Implement meta image copies using dynamic rendering.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle 549d4ee63f vkd3d: Remove render pass list from d3d12_command_allocator.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle 6186cc1f0e vkd3d: Implement clears using dynamic rendering.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle 2c92ab7d1e vkd3d: Enable and require VK_KHR_dynamic_rendering.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle ba04b02bf6 khronos: Update Vulkan headers.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Hans-Kristian Arntzen 9fbae668fe vkd3d: Ensure that all SPIR-V modules are properly cached.
When we require inter-stage fixups, we need a solution for partial
validity of the cache. Accept the modules all or nothing.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-08 16:43:30 +01:00
Hans-Kristian Arntzen ce45297695 vkd3d: Enable debug_utils if vk_debug is enabled.
Allows debug callbacks to go through in Wine.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-08 16:40:51 +01:00
LemiSt24 c411d0d0c2 vkd3d: Add case for D3D12_STATE_SUBOBJECT_TYPE_GLOBAL_ROOT_SIGNATURE
Signed-off-by: LemiSt24 <lennard.strohmeyer@gmail.com>
2022-03-07 16:15:22 +01:00
Hans-Kristian Arntzen 3e5aab6fb3 meta: Update version to 2.6.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-04 16:49:28 +01:00
Hans-Kristian Arntzen bc40528b6f meta: Add CHANGELOG for 2.6.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-04 14:54:16 +01:00
Hans-Kristian Arntzen 7cd3b9c917 idl: Fix type of D3D12_ERROR defines.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-04 14:53:57 +01:00
Hans-Kristian Arntzen 9a63df07b8 vkd3d: Add punchthrough path for descriptor copies.
Proves out the viability of this style of implementation. Ideally we'd
have a more officially sanctioned way of doing similar things later :)

Unfortunately, the overhead removal is too great to ignore on target
platform. Makes use of a private (reserved) extension for now ...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-04 13:34:18 +01:00
Hans-Kristian Arntzen 277f485321 vkd3d: Add private extension header.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-04 13:34:18 +01:00
Mike Blumenkrantz 1d76803aff vkd3d: optimize memory access pattern for sampler descriptors
this removes them from the bitscan path

Signed-off-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
2022-03-01 22:50:45 +01:00
Hans-Kristian Arntzen dc622fc715 vkd3d: Recycle command pools in Elden Ring.
Very churny.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 18:40:52 +01:00
Hans-Kristian Arntzen 9817c52d24 vkd3d: Add workaround to ignore mismatch driver/device in PSO library.
Elden Ring does not detect the proper error code and create a new
pipeline library. Instead, create a fresh new library, which works
around the issue.

The game has a pattern of LoadPipeline -> if fail -> CreatePSO ->
StorePipeline. Sometimes, in the same process it will LoadLibrary from
its own cache (could explain some stutters),
so it's very useful to have this either way.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 14:50:57 +01:00
Hans-Kristian Arntzen a8229390f9 vkd3d: Add more pipeline_library_log snippets.
Hook GetCachedBlob and various attempts to use LoadPipeline.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 14:50:57 +01:00
Hans-Kristian Arntzen 12c73ee18a swapchain: More gracefully handle SURFACE_LOST.
Just like handling min/maxImageExtent of 0, we can just fall back to
user buffers.

Elden Ring hits this case on application teardown.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 14:04:06 +01:00
Hans-Kristian Arntzen f39ece9a7c vkd3d: Enable performance workarounds for Elden Ring.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 13:59:08 +01:00
Hans-Kristian Arntzen c19eaac376 vkd3d: Add VKD3D_CONFIG option for command pool recycling.
Normally behaving apps should not benefit from any of this.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 13:59:08 +01:00
Hans-Kristian Arntzen 54fbadcc94 vkd3d: Recycle command pools.
Elden Ring in particular spams command pool frees and allocations despite
this being a very bad idea.

Add a simple 8-entry cache which seems to take care of it.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 13:59:08 +01:00
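The recycling idea described in the commit above can be sketched as a small fixed-size stack of handles. This is only an illustration under stated assumptions: `pool_handle`, `struct pool_cache`, `pool_cache_acquire` and `pool_cache_release` are hypothetical names, and the opaque pointer stands in for a `VkCommandPool`. Only the 8-entry size comes from the commit text.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_CACHE_SIZE 8 /* Matches the 8-entry figure in the commit text. */

/* Opaque stand-in for a VkCommandPool handle; hypothetical for this sketch. */
typedef void *pool_handle;

struct pool_cache
{
    pool_handle entries[POOL_CACHE_SIZE];
    size_t count;
};

/* Try to reuse a cached pool. Returns NULL when the cache is empty, in which
 * case the caller would create a fresh pool. */
static pool_handle pool_cache_acquire(struct pool_cache *cache)
{
    assert(cache);
    if (cache->count == 0)
        return NULL;
    return cache->entries[--cache->count];
}

/* Offer a pool back to the cache. Returns 1 if it was kept for reuse, 0 if
 * the cache is full and the caller should destroy the pool instead. */
static int pool_cache_release(struct pool_cache *cache, pool_handle pool)
{
    assert(cache && pool);
    if (cache->count == POOL_CACHE_SIZE)
        return 0;
    cache->entries[cache->count++] = pool;
    return 1;
}
```

In a real Vulkan backend a recycled pool would presumably be reset (for example with `vkResetCommandPool`) before being handed out again, and access to the cache would need to be synchronized if allocators are freed from multiple threads.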
Hans-Kristian Arntzen 4b07535909 vkd3d: Optimize memory access pattern for single descriptor copies.
We can mark a descriptor as being SINGLE_DESCRIPTOR, which means we
only need one descriptor copy. This way, we can avoid doing somewhat
expensive work (every nanosecond counts here):

- Bitscan loop
- Read deep into d3d12_device guts (often a cache miss). The memory
  index depends on the bitscan, which causes a bubble.

When we have a single descriptor, we can just store the binding
information inline and avoid this jank.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 13:04:43 +01:00
Hans-Kristian Arntzen 84d632f194 vkd3d: Rewrite memory layout for resource descriptors.
Tune memory layout so that we can deduce various information without
making a single pointer dereference:

- d3d12_descriptor_heap*
- heap offset
- Pointer to various side data structures we need to keep around.

Instead of having one big 64 byte data structure with tons of padding,
tune it down to 32 + 8 bytes per descriptor of extra dummy data.

To make all of this work, use a somewhat clever encoding scheme for CPU
VA where lower bits store number of active bits used to encode
descriptor offset. From there, we can mask away bits to recover
d3d12_descriptor_heap. Metadata is stored inline in one big allocation,
and we can just offset from there based on extracted log2i_ceil(descriptor count).

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 13:04:43 +01:00
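The CPU VA encoding described in the commit above can be illustrated with a minimal sketch. The assumptions are loud ones: a 32-byte slot stride (so 5 low bits are free), a heap base aligned to the full span it covers, and the hypothetical names `encode_cpu_va`, `decode_heap_base` and `decode_index`. The actual layout in the commit may differ in every one of these details.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: each descriptor slot is 32 bytes, so the low
 * 5 bits of a slot address are free to carry metadata. */
#define SLOT_SHIFT 5u
#define SLOT_META_MASK ((UINT64_C(1) << SLOT_SHIFT) - 1)

/* Encode a handle. heap_base must be aligned to (1 << (index_bits + SLOT_SHIFT))
 * so that the index and metadata bits of the base are all zero. */
static uint64_t encode_cpu_va(uint64_t heap_base, uint32_t index, unsigned int index_bits)
{
    uint64_t span_mask = (UINT64_C(1) << (index_bits + SLOT_SHIFT)) - 1;

    assert(index_bits <= SLOT_META_MASK);
    assert((heap_base & span_mask) == 0);
    assert(((uint64_t)index >> index_bits) == 0);

    /* Low bits hold the number of active index bits, i.e. ceil(log2(count)). */
    return heap_base | ((uint64_t)index << SLOT_SHIFT) | index_bits;
}

/* Recover the heap base by masking away the index and metadata bits. */
static uint64_t decode_heap_base(uint64_t va)
{
    unsigned int index_bits = (unsigned int)(va & SLOT_META_MASK);
    uint64_t span_mask = (UINT64_C(1) << (index_bits + SLOT_SHIFT)) - 1;

    return va & ~span_mask;
}

/* Recover the descriptor index without touching memory. */
static uint32_t decode_index(uint64_t va)
{
    unsigned int index_bits = (unsigned int)(va & SLOT_META_MASK);
    uint64_t index_mask = (UINT64_C(1) << index_bits) - 1;

    return (uint32_t)((va >> SLOT_SHIFT) & index_mask);
}
```

The point of the scheme is that both the heap and the offset fall out of pure arithmetic on the handle value, so the hot copy path needs no pointer dereference to find them.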
Hans-Kristian Arntzen b309913b6d vkd3d: Use unsafe_impl in CopyDescriptorsSimple.
This is an ultra-hot path and seems to show up somehow on profile.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 13:04:43 +01:00
Hans-Kristian Arntzen dc752991ef common: Add vkd3d_log2i_ceil.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 13:04:43 +01:00
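A ceiling log2 helper of the kind named in the commit above can be sketched as follows. `log2i_ceil_sketch` is a hypothetical stand-in, not the actual `vkd3d_log2i_ceil` implementation, which likely uses a bitscan intrinsic rather than a loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a ceil(log2(x)) helper; returns the smallest n
 * such that (1 << n) >= x. Defined for x >= 1 only. */
static unsigned int log2i_ceil_sketch(uint32_t x)
{
    unsigned int n = 0;

    assert(x >= 1);

    /* Use 64-bit arithmetic so the shift cannot overflow for large x. */
    while ((UINT64_C(1) << n) < x)
        n++;
    return n;
}
```

For a descriptor count of 1000 this yields 10, the number of bits needed to index any slot in the heap.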
Hans-Kristian Arntzen c29d005ef4 vkd3d: Don't enable fast descriptor copy path for descriptor QA.
The hooks are in the generic function.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-24 16:42:00 +01:00
Hans-Kristian Arntzen 8a46c21254 vkd3d: Add VKD3D_CONFIG to skip memory allocator clears.
For cases where games spam committed allocations and don't use
NOT_ZEROED. We still rely on zerovram behavior for initial backing which
should be enough in most cases.

Strictly speaking however, we are forced to clear the allocations every
time if application does not use the flag correctly.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-24 12:52:05 +01:00
Hans-Kristian Arntzen 76ca492a39 vkd3d: Add some debug logging for when clear passes happen.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-24 12:52:05 +01:00
Hans-Kristian Arntzen 83c4e62660 vkd3d: Bump suballocation limit to 2 MiB.
This is a more principled limit since that's the huge page size.

Avoids some allocation spam.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-24 12:14:22 +01:00
Hans-Kristian Arntzen 4bea653504 vkd3d: Fix CopyTiles for suballocated linear resources.
Forgot to offset buffer offset. Fun!
Found when bumping VA allocation limit to 2 MiB instead of 1 MiB.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-24 12:14:22 +01:00
Hans-Kristian Arntzen edbf49aad4 vkd3d: Support opt-in to single MUTABLE set.
Useful for Intel since Intel hardware cannot support more than 1M
descriptors in general, and opting in to correct behavior should improve
CPU overhead as well when copying descriptors.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-21 17:08:25 +01:00
Hans-Kristian Arntzen e0af8f2810 vkd3d: Make error message for buffer alignment more direct.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-21 16:37:12 +01:00
Hans-Kristian Arntzen b066e72243 swapchain: Add env-var to override swapchain images.
For perf debug mostly.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-21 16:36:36 +01:00
Hans-Kristian Arntzen 15704b2419 vkd3d: Optimize descriptor copies for common code paths.
The common path that we really need to optimize for is CBV_SRV_UAV +
Simple + 1 descriptor.

Descriptor benchmark shows an almost 50% reduction in overhead now.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-21 16:35:36 +01:00
Hans-Kristian Arntzen c725c29bb6 vkd3d: Inline query for set/binding from set_index.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-21 16:35:36 +01:00
Hans-Kristian Arntzen 2f6a91e772 vkd3d: De-virtualize query for descriptor size.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-21 16:35:36 +01:00
Hans-Kristian Arntzen 719a38a5fe tests: Add individual descriptor copy tests to descriptor benchmark.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-21 16:35:36 +01:00
Joshua Ashton 2278da339a build: Bump arch-mingw-github-action to v7
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2022-02-20 03:40:40 +00:00
Hans-Kristian Arntzen 1cc8afcc8e vkd3d: Fix potential crashes when VK_KHR_dynamic_rendering is added.
Checking for pNext here is too brittle and causes crashes when dynamic
rendering path is added.
Also need to chain in existing pNexts.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-17 11:27:25 +01:00
Hans-Kristian Arntzen 1112106db0 tests: Verify that runtime validates invalid PSO description for blob.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-17 11:00:03 +01:00
Hans-Kristian Arntzen 624bf53f8b tests: Verify that runtime must validate DXBC blob and RS.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-17 11:00:03 +01:00
Hans-Kristian Arntzen b363d8d2e4 tests: Remove TODO from PSO library test.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-17 11:00:03 +01:00
Hans-Kristian Arntzen 5d345f47cc vkd3d: Rewrite the pipeline library implementation.
This became basically a rewrite in the end, and it got too awkward to
split these commits in any meaningful way.

The goals here were primarily to:

- Support serializing SPIR-V and load SPIR-V.
  To do this robustly requires a lot more validation and checks to make
  sure we end up compiling the same SPIR-V that we load from cache.
  This is critical for performance when games have primed their pipeline
  libraries and expect that loading a PSO should be fast. Without this,
  we will hit vkd3d-shader for every PSO, causing very long load times.
- Implement the required validation for mismatched PSO descriptions.
- Rewrite the binary layout of the pipeline library for flexibility
  concerns and performance.
  If the pipeline library is mmap-ed from disk - which appears to be
  the intended use - we only need to scan through the TOC to fully parse
  the library contents.
  From a flexibility concern, a blob needs to support inlined data,
  but a library can use referential links. We introduce separate
  hashmaps which store deduplicated SPIR-V and pipeline cache blobs,
  which significantly drop memory and storage requirements.
  For future improvements, it should be fairly easy to add information
  which lets us avoid SPIR-V or pipeline cache data altogether if
  relevant changes to Vulkan/drivers are made.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-17 11:00:03 +01:00
Georg Lehmann a078197e16 build: Avoid meson warning.
WARNING: You should add the boolean check kwarg to the run_command call.
         It currently defaults to false,
         but it will default to true in future releases of meson.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2022-02-11 16:21:52 +01:00
Krzysztof Bogacki 9029d1ae23 build: Merge Prepare and Build steps on Windows CI.
Signed-off-by: Krzysztof Bogacki <krzysztof.bogacki@leancode.pl>
2022-02-07 11:12:15 +01:00
Krzysztof Bogacki ae7081eb62 build: Use MSBuild backend on Windows CI.
Signed-off-by: Krzysztof Bogacki <krzysztof.bogacki@leancode.pl>
2022-02-07 11:12:15 +01:00
Hans-Kristian Arntzen 33f17cc74d vkd3d: Add VK_EXT_pipeline_creation_feedback.
Useful when used together with pipeline library logging. Confirms that
we can load pipeline caches as expected.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-04 14:31:34 +01:00
Hans-Kristian Arntzen 3b8265dccc common: Add a timedwait condvar API.
To be used for upcoming disk driver cache implementation which needs to
live on a thread.

Need a separate wrapper since the pthread and SRWLock interfaces are quite
different. Similar rationale as rwlock_t.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-04 14:31:34 +01:00
Hans-Kristian Arntzen a2eddc181b common: Add f32/string hashing utils as well.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-04 14:31:34 +01:00
Hans-Kristian Arntzen 47337d5e0b vkd3d: Add VKD3D_CONFIG flags for various pipeline library logging.
Additionally, add option to ignore cached SPIR-V.
Will be useful for debugging, and also required for VKD3D_SHADER_OVERRIDE.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-04 14:31:34 +01:00
Hans-Kristian Arntzen f03940ef4b vkd3d: Add global_pipeline_cache option.
Avoids saving out pipeline cache blobs which are likely going to be
cached by on-disk cache anyways.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-04 14:31:34 +01:00
Hans-Kristian Arntzen e5e662ce22 vkd3d: Record root signature compatibility hashes.
For pipeline libraries and DXR to some extent later, we'll need an easy
way to compare root signature objects.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-04 14:31:34 +01:00
Hans-Kristian Arntzen bc3b25fb0e tests: Extend unbound RTV rendering test to cover invalidation of PSO.
Similar issue with this as with NULL DSV rendering test. We did not test
the scenario where RTV is bound, then it is not bound anymore with same
PSO.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-04 13:10:16 +01:00
Hans-Kristian Arntzen 05a5d366d5 tests: Test rendering to non-NULL DSV, then NULL DSV.
Uncovered CPU crash where we did not invalidate pipeline/render pass
properly.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-04 13:10:16 +01:00
Hans-Kristian Arntzen 1d39c25a59 vkd3d: Properly invalidate pipeline when binding NULL DSV.
We did not test the scenario where we first render with depth enabled,
and then bind a NULL DSV with the same pipeline.
Also fix issues if we bind NULL RTVs with same pipeline bound.

Fixes crash in Guardians of the Galaxy.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-04 13:10:16 +01:00
Hans-Kristian Arntzen 5e526d506b vkd3d: Remove warning for setting NULL index buffer.
This is benign and easily gets spammed a TON.
We will warn if an indexed draw is actually made like this.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-03 18:16:36 +01:00
Hans-Kristian Arntzen 91ca2ed8ba tests: Mark COLOR -> STENCIL copy test as TODO.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-03 15:43:41 +01:00
Hans-Kristian Arntzen 2ca7ce62da tests: Add test for color <-> stencil copies.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-03 15:43:41 +01:00
Hans-Kristian Arntzen 907acce30c tests: Fix D3D12 validation error in copy_texture test.
Copy out of bounds now seems to trigger device lost.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-03 15:43:41 +01:00
Hans-Kristian Arntzen 8b92d8e0bc tests: Add test for copying single aspects between DS images.
Also fixes test bug where texture was sampled as float, despite having
uint aspect.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-03 15:43:41 +01:00
Hans-Kristian Arntzen 81a215d0bf vkd3d: Implement COLOR -> STENCIL copy if stencil export is supported.
Fallback is a bit more involved. Cleans up the FIXME to not report
benign issues.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-03 15:43:41 +01:00
Hans-Kristian Arntzen 29d956c6c4 vkd3d: Fix memory leak of D3D12 device singleton.
Fairly trivial, caught by ASAN.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-02 13:56:36 +01:00
Hans-Kristian Arntzen 49d0eb37e3 vkd3d: Properly align d3d12_command_list allocations.
UBSAN found a bug here: since we store RTV descriptors inline, the
compiler can assume the pointer is 64-byte aligned.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-02 13:56:36 +01:00
Hans-Kristian Arntzen 1da9ad900c hashmap: Avoid redundant copy of entry data.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-02 13:56:36 +01:00
Philip Rebohle 8f81aaa710 vkd3d: Fix reporting of WriteBufferImmediateSupportFlags.
Oversight from when we added bundles.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-02-01 16:21:43 +01:00
Philip Rebohle 91976b2edd tests: Add mesh and amplification shader tests.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-01-28 17:06:30 +01:00
Philip Rebohle 6aa73b3d53 tests: Move pipeline stream structs to common header.
We'll need to use the CreatePipelineState API for more tests.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-01-28 17:06:30 +01:00
Krzysztof Bogacki ab47aaf36d build: Add workflow for MSVC builds
Signed-off-by: Krzysztof Bogacki <krzysztof.bogacki@leancode.pl>
2022-01-25 16:19:11 +00:00
Hans-Kristian Arntzen 833f56154c cache: Store shader interface key in pipeline library as well.
If we're going to create different SPIR-V files from what the
VkPipelineCache represents, it's meaningless to load it.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-25 14:07:07 +01:00
Hans-Kristian Arntzen 86f8f41490 vkd3d: Compute a global shader interface key for a D3D12 device.
This key represents the variations of SPIR-V which would be generated
from otherwise identical inputs like DXBC blobs and root signatures.

Typically, changing VKD3D_CONFIG flags or enabled extensions will affect
this key. This ensures that we will not attempt to use a cached SPIR-V
file unless we can trust that the SPIR-V interface will match.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-25 14:07:07 +01:00
Hans-Kristian Arntzen a3f1a0e3cd vkd3d-shader: Add mechanism to get vkd3d-shader implementation revision.
Not immediately useful, might be nuked later in development.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-25 14:07:07 +01:00
Hans-Kristian Arntzen e90b573896 vkd3d-shader: Use flag for vkd3d_shader_meta bools.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-25 14:07:07 +01:00
Hans-Kristian Arntzen 8196b85408 vkd3d-shader: Make vkd3d_shader_hash public.
Prepare for meta struct to be serialized to a cache.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-25 14:07:07 +01:00
Hans-Kristian Arntzen a2c1527acd vkd3d-shader: Reuse hashmap.h hasher for shader hash.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-25 14:07:07 +01:00
Hans-Kristian Arntzen 3839144848 vkd3d: Add FNV-1a hash util.
To be used for pipeline library hashing.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-25 14:07:07 +01:00
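For reference, FNV-1a is small enough to sketch in full. The version below is the standard 64-bit variant with the published offset basis and prime. The name `fnv1a64_sketch` is hypothetical, and the commit does not state here whether its utility uses the 32-bit or the 64-bit variant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard 64-bit FNV-1a parameters. */
#define FNV1A64_OFFSET_BASIS UINT64_C(0xcbf29ce484222325)
#define FNV1A64_PRIME        UINT64_C(0x100000001b3)

/* Hypothetical helper name; not taken from the vkd3d sources. */
static uint64_t fnv1a64_sketch(const void *data, size_t size)
{
    const uint8_t *bytes = data;
    uint64_t hash = FNV1A64_OFFSET_BASIS;
    size_t i;

    assert(data || size == 0);

    for (i = 0; i < size; i++)
    {
        /* FNV-1a: XOR the byte in first, then multiply by the prime. */
        hash ^= bytes[i];
        hash *= FNV1A64_PRIME;
    }
    return hash;
}
```

The hash is cheap and incremental, which suits hashing serialized pipeline descriptions, but it is not collision resistant in any cryptographic sense.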
Hans-Kristian Arntzen 6e697a54b6 vkd3d: Add d3d12_cached_pipeline_state.
Wraps the D3D12 struct with a pipeline library handle.
This is needed if the blob contains references to external data,
which then needs to be resolved.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-25 14:07:07 +01:00
Hans-Kristian Arntzen 41c977d616 cache: Move cache implementation over to reader-writer locks.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-25 14:07:07 +01:00
Hans-Kristian Arntzen 7da708ea69 vkd3d: Add an RW lock wrapper.
For longer-lived locks where spinlock is bad form. To be used for
pipeline library.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-25 14:07:07 +01:00
Georg Lehmann 2c76840ff8 meta: Update COPYING year.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2022-01-25 11:11:51 +01:00
Georg Lehmann 182ebd7e00 meta: Update AUTHORS.
git log --pretty="%aN" | sort | uniq > AUTHORS

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2022-01-25 11:11:51 +01:00
Georg Lehmann c69b73ffcf meta: Create a .mailmap file.
Replaces some of the github account name authors with their real name from
the Signed-off-by.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2022-01-25 11:11:51 +01:00
Hans-Kristian Arntzen 1409ebab1f vkd3d: Consider sparse buffers to alias any other buffer.
Technically cannot alias committed buffers, but 🤷 ...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-20 15:14:27 +01:00
Hans-Kristian Arntzen 7d0743345a vkd3d: Remove useless buffer barrier tracking.
This copy is to a scratch buffer, which needs no tracking.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-20 15:14:27 +01:00
Hans-Kristian Arntzen 2b0a161a0d tests: Sanitize test_hull_shader_vertex_input_patch_constant_phase.
Was using w = 0.0, causing weird issues.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-19 17:10:19 +01:00
Philip Rebohle 1af62abfe7 vkd3d: Enable quirk for further UE4 shaders.
Fixes artifacts in The Ascent.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-01-19 16:49:42 +01:00
Hans-Kristian Arntzen 338157eb04 tests: Add test for overlapped buffer copies.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-19 14:44:33 +01:00
Hans-Kristian Arntzen 5c492e9e6c vkd3d: Handle overlapped transfer writes.
D3D12 expects drivers to implicitly synchronize transfer operations,
since there is no TRANSFER barrier ala UAV barriers.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-19 14:44:33 +01:00
Hans-Kristian Arntzen 68ce4b4116 vkd3d: MSVC build fix.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-19 14:21:09 +01:00
Hans-Kristian Arntzen 0f46a8a7d5 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-14 16:09:16 +01:00
Hans-Kristian Arntzen 6cba8b9945 vkd3d: Workaround broken barriers in DEATHLOOP.
In DEATHLOOP, there is a render pass which renders out a simple image,
which is then directly followed by a compute dispatch, reading that
image. The image is still in RENDER_TARGET state, and color buffers are
*not* flushed properly on at least RADV, manifesting as a very
distracting glitch pattern. This is a game bug, but for the time being,
we have to work around it, *sigh*.

For a simple workaround, we can detect patterns where we see these
events in succession:

- Color RT is started
- StateBefore == RENDER_TARGET is not observed
- Dispatch()

In particular, when entering the options menu, highly distracting
glitches are observed in the background.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-12 12:20:03 +01:00
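The detection pattern listed in the commit above can be sketched as a one-flag state tracker. All names here (`struct rt_hazard_tracker` and the `tracker_on_*` functions) are hypothetical, and the sketch tracks a single flag per command list rather than per-resource state, which is a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-command-list tracking state. */
struct rt_hazard_tracker
{
    bool color_rt_pending; /* A color RT pass started and was not yet transitioned. */
};

/* Called when rendering to a color render target begins. */
static void tracker_on_color_rt_begin(struct rt_hazard_tracker *t)
{
    assert(t);
    t->color_rt_pending = true;
}

/* Called when a barrier with StateBefore == RENDER_TARGET is observed. */
static void tracker_on_render_target_barrier(struct rt_hazard_tracker *t)
{
    assert(t);
    t->color_rt_pending = false;
}

/* Called on Dispatch(). Returns true if the app skipped the barrier and a
 * workaround barrier should be injected before the dispatch. */
static bool tracker_on_dispatch(struct rt_hazard_tracker *t)
{
    bool need_workaround;

    assert(t);
    need_workaround = t->color_rt_pending;
    t->color_rt_pending = false;
    return need_workaround;
}
```

A correctly behaving application always clears the flag through its own barrier, so the extra synchronization is only paid when the suspicious sequence actually occurs.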
Hans-Kristian Arntzen e5efa8594e tests: Remove RADV bugs which have been fixed.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-12 12:18:56 +01:00
Hans-Kristian Arntzen 39c1f9d07a tests: Add test for invalid (?) alias barrier behavior.
Verifies that aliasing barriers on their own do not trigger image layout
transitions.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-12 12:16:52 +01:00
Robin Kertels 35be1329ed vkd3d: Don't do layout transition in aliasing barrier.
HZD issues an aliasing barrier for an alias of a resource that it
still needs.
Because D3D12 requires you to call DiscardResource or a full resource
clear/copy, we can just rely on those to do the actual image layout
transition and treat the aliasing barrier as a pure sync + flush.

This behavior is also observed in a test case where D3D12 drivers
do not seem to discard / fast-clear anything in an aliasing barrier.

Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Co-authored-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-12 12:16:52 +01:00
Samuel Pitoiset f6a4e0fb71 vkd3d: Use VK_KHR_copy_commands2
Mesa RADV translates these legacy entrypoints to the 2 variants. Using
them directly will cost slightly fewer CPU cycles.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2022-01-12 12:06:06 +01:00
Samuel Pitoiset f6fe3e0183 vkd3d: Require VK_KHR_copy_commands2
This extension is trivial to implement for vendors and should be
widely supported.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2022-01-12 12:06:06 +01:00
Samuel Pitoiset 870dda927d vkd3d: Use VK_KHR_bind_memory2
Mesa RADV translates these legacy entrypoints to the 2 variants. Using
them directly will cost slightly fewer CPU cycles.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2022-01-12 12:06:06 +01:00
Samuel Pitoiset b42a7193fc vkd3d: Require VK_KHR_bind_memory2
This extension is trivial to implement for vendors and should be
widely supported.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2022-01-12 12:06:06 +01:00
Hans-Kristian Arntzen db943f2341 tests: Add DXIL test for FP32 -> FP16 conversions.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-12 12:04:49 +01:00
Hans-Kristian Arntzen 9162e82fb3 tests: Add DXBC test for f32tof16 behavior.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-12 12:04:49 +01:00
Hans-Kristian Arntzen d13424bf22 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-11 15:21:06 +01:00
Philip Rebohle 5923c53111 vkd3d: Only use VK_IMAGE_CREATE_EXTENDED_USAGE_BIT if necessary.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-01-11 15:14:30 +01:00
Joshua Ashton bd2be76132 build: Update arch-mingw-github-action to v6 for test builds
2022-01-10 11:49:34 +00:00
Joshua Ashton d94fdd1ca9 build: Update arch-mingw-github-action to v6
2022-01-10 11:46:01 +00:00
Philip Rebohle 1354ecabb4 vkd3d: Consider query pool when merging query ranges.
Otherwise, we accidentally merge ranges from different pools if
the indices happen to align.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-01-06 14:27:36 +01:00
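The bug fixed in the commit above comes down to a missing equality check when merging. A minimal sketch, assuming a hypothetical `struct query_range` in which `pool_id` stands in for a `VkQueryPool` handle:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical range record; pool_id stands in for a VkQueryPool handle. */
struct query_range
{
    uint64_t pool_id;
    uint32_t first;
    uint32_t count;
};

/* Try to append 'next' onto 'range'. Succeeds only if both ranges are in the
 * same pool and 'next' begins exactly where 'range' ends. Returns 1 on merge. */
static int query_range_try_merge(struct query_range *range, const struct query_range *next)
{
    assert(range && next);

    if (range->pool_id != next->pool_id)
        return 0; /* Adjacent indices in different pools must not merge. */
    if (range->first + range->count != next->first)
        return 0;

    range->count += next->count;
    return 1;
}
```

Without the first check, a range ending at index 4 in one pool and a range starting at index 4 in another would be folded into one, which is the accidental merge the commit message describes.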
Hans-Kristian Arntzen c0a3fa8adc vkd3d: Attempt to create linear image without EXTENDED_USAGE.
NVIDIA drivers apparently cannot support EXTENDED_USAGE linear
images for whatever reason, so attempt to create these images without
the creation flag.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-12-03 12:47:09 +01:00
Hans-Kristian Arntzen 459cae5673 vkd3d: Fix redundant return from void.
Fix MSVC warning.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-12-02 22:48:48 +01:00
Hans-Kristian Arntzen 7502b4c4c8 vkd3d: Fix MSVC build.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-12-02 22:48:48 +01:00
Hans-Kristian Arntzen 18b31a73ec tests: Add additional test cases to minLOD test.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-12-02 22:40:44 +01:00
Hans-Kristian Arntzen fffd6e935c vkd3d: Add R64_UINT to format compatibility list when needed.
For 64-bit image atomics, we should at the very least add 64-bit format
to compatibility list to avoid potential problems.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-12-02 22:40:32 +01:00
Hans-Kristian Arntzen 72f26c5699 vkd3d: Remove misleading FIXME.
We can bind texel buffers at scalar alignment now.
The warning is misleading for placed resources, since 64k never aligns
with a float3.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-12-02 22:40:21 +01:00
Arkadiusz Hiler 93d105adae vkd3d: Retry to create Vk device without NVX extensions.
The creation with those extensions may fail in a few cases:
 * older 32-bit drivers
 * missing or inaccessible /dev/nvidia-uvm

There's also a mysterious crash that some Debian users experience with
64-bit titles and a correct /dev/nvidia-uvm.

Signed-off-by: Arkadiusz Hiler <ahiler@codeweavers.com>
2021-12-02 12:44:37 +01:00
Hans-Kristian Arntzen 9c3549360d tests: Add more TODO for map_texture_validation.
NV really doesn't like linear images, huh ...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-26 20:02:14 +01:00
Hans-Kristian Arntzen d2fd3de7c1 vkd3d: Handle somewhat common VkResult.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-26 20:02:14 +01:00
Hans-Kristian Arntzen d9636d5c67 vkd3d: Fix check for vkBindImageMemory.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-26 20:02:14 +01:00
Hans-Kristian Arntzen 2c80431003 tests: Remove TODO for MinLod test.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-26 16:51:18 +01:00
Hans-Kristian Arntzen 9a59ded1c4 vkd3d: Simplify MinLod setup.
Only bother if we actually need to clamp LOD.
Simplifies some clamping logic as well.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-26 16:51:18 +01:00
Philip Rebohle f5a6d49e87 tests: Add test for clearing BGRA8 UAVs.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-26 16:51:01 +01:00
Philip Rebohle a99914b6ea vkd3d: Fix clear color swizzle for various UAV formats.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-26 16:51:01 +01:00
Philip Rebohle 4000397570 vkd3d: Remove legacy format compatibility info.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-26 16:51:01 +01:00
Philip Rebohle 0de25ac3cd vkd3d: Do not use vkd3d_find_uint_format in ClearUAV.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-26 16:51:01 +01:00
Philip Rebohle ab111dcdbe vkd3d: Don't use vkd3d_get_typeless_format to determine shader copy usage.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-26 16:51:01 +01:00
Philip Rebohle 99d949f5fb vkd3d: Fix enablement of MUTABLE_FORMAT_BIT and EXTENDED_USAGE_BIT.
We previously did not take into account the new relaxed format compatibility
rules that we allow with CastingFullyTypedFormatSupported being supported.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-26 16:51:01 +01:00
Philip Rebohle 9624102dcb vkd3d: Rework format compatibility lists.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-26 16:51:01 +01:00
Philip Rebohle 42b8fc3338 vkd3d: Introduce new format compatibility table.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-26 16:51:01 +01:00
Hans-Kristian Arntzen 6a7eee33b5 tests: Remove obsolete format feature check.
BGRA8 UAV is allowed now.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-26 16:32:05 +01:00
Hans-Kristian Arntzen 8305ddec92 tests: Add test for various clear patterns with fully typed cast.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-26 15:56:43 +01:00
Hans-Kristian Arntzen 3c9b8cb040 tests: Add detailed meta-test for CastFullyTypedFormat.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-26 15:56:43 +01:00
Georg Lehmann 4240ab7559 vkd3d: Allow B8G8R8A8 UAVs.
This is now allowed according to
https://microsoft.github.io/DirectX-Specs/d3d/RelaxedCasting.html

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-11-24 15:15:14 +01:00
Hans-Kristian Arntzen 7391e38602 vkd3d: Fix some type errors after idl update.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-22 16:25:13 +01:00
Philip Rebohle 9185edb42a vkd3d: Implement ID3D12GraphicsCommandList6.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-19 14:57:51 +01:00
Philip Rebohle b03c1fcb5f vkd3d: Implement ID3D12Device9.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-19 14:57:51 +01:00
Philip Rebohle 3b6a4ab988 vkd3d: Implement ID3D12Device8 and ID3D12Resource2.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-19 14:57:51 +01:00
Philip Rebohle d61f562a3e vkd3d: Implement ID3D12Device7.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-19 14:57:51 +01:00
Philip Rebohle 930e7cb251 idl: Add new interfaces up to ID3D12Device9.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-19 14:57:51 +01:00
Hans-Kristian Arntzen 6ad67bdecd dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-18 14:05:20 +01:00
Joshua Ashton 046524f2a1 vkd3d: Implement MinLODClamp using VK_EXT_image_view_min_lod
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-11-17 20:51:20 +01:00
Joshua Ashton 7241164e2d khronos: Update Vulkan headers.
Update to v1.2.199.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-11-17 20:51:20 +01:00
Hans-Kristian Arntzen 99e067d681 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-16 20:50:46 +01:00
Georg Lehmann 344f8d1ed4 tests: Fix various alignment warnings on 32bit clang.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-11-16 20:29:18 +01:00
Robin Kertels 19a1dce393 vkd3d: Set GetCopyableFootprints total_bytes late.
Halo Infinite uses &desc->Width for total_bytes.
We can't set total_bytes early because code after this relies on desc->Width.

Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
2021-11-16 11:53:18 +01:00
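The aliasing hazard described in 19a1dce393 can be illustrated with a minimal sketch. The names `fake_resource_desc` and `compute_footprint` are hypothetical stand-ins and not vkd3d-proton's actual code, and the 256-byte row pitch is only an illustrative alignment. The point is that the output pointer may alias an input field, so all reads of the description happen before the write to `total_bytes`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a resource description; only Width and Height
 * matter for this illustration. */
struct fake_resource_desc
{
    uint64_t Width;
    uint32_t Height;
};

/* Computes a row pitch and a total size from desc. total_bytes may alias
 * &desc->Width, so every read of desc happens before the single write to
 * *total_bytes at the very end. */
static void compute_footprint(const struct fake_resource_desc *desc,
        uint64_t *row_pitch, uint64_t *total_bytes)
{
    uint64_t pitch, total;

    assert(desc);

    /* Illustrative 256-byte row pitch alignment. */
    pitch = (desc->Width + 255u) & ~(uint64_t)255u;
    total = pitch * desc->Height;

    if (row_pitch)
        *row_pitch = pitch;

    /* Written last: after this store desc->Width may have been clobbered. */
    if (total_bytes)
        *total_bytes = total;
}
```

Calling `compute_footprint(&desc, &pitch, &desc.Width)` is then safe, because nothing reads `desc->Width` after it has been overwritten.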
Hans-Kristian Arntzen 3fefc540c8 vkd3d: Handle 64KB_UNDEFINED_SWIZZLE.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-12 10:32:13 +01:00
Hans-Kristian Arntzen 16d8bae263 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-11 17:13:16 +01:00
Hans-Kristian Arntzen 0251b4045c dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-10 15:17:36 +01:00
Hans-Kristian Arntzen 54da1dc9b2 tests: Only test FP64 if device supports it.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-09 15:57:48 +01:00
Hans-Kristian Arntzen a0eb938c7f tests: Only check lower 24-bit when testing D24 copies.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-09 15:57:16 +01:00
Hans-Kristian Arntzen 2da535fbbf tests: Remove TODO from test_depth_stencil_test_no_dsv.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-09 15:56:45 +01:00
Hans-Kristian Arntzen 3937e1a298 vkd3d: Handle illegal rendering to NULL DSV.
Guardians of the Galaxy hits this case. Fallback is to disable depth
attachment entirely in a fallback pipeline.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-09 15:56:45 +01:00
Hans-Kristian Arntzen 45ae742526 vkd3d: Pretend that SSBO alignment on NV is 4 bytes.
The 16-byte requirement is kind of a lie. The real requirement is tied
to how vectorized load-store instructions are emitted in the shader
itself since I guess it allows compiler to assume something about
alignment of the base pointer.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-09 14:47:10 +01:00
Hans-Kristian Arntzen b53a4a98a6 vkd3d: Enable per component robustness on AMD.
Tested and verified to work as expected, not so much on NV.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-09 14:47:10 +01:00
Hans-Kristian Arntzen 3210832ad9 vkd3d: Enable VK_EXT_scalar_block_layout.
dxil-spirv can take advantage of this now.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-09 14:47:10 +01:00
Hans-Kristian Arntzen 58aab78a5b vkd3d-shader: Add PER_COMPONENT_ROBUSTNESS shader extension.
Signals that we can use vectorized vec3 byte address buffers.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-09 14:47:10 +01:00
Hans-Kristian Arntzen e605d19ef7 vkd3d-shader: Add shader extension for scalar block layout.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-09 14:47:10 +01:00
Hans-Kristian Arntzen 7986e241f3 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-09 14:47:10 +01:00
Hans-Kristian Arntzen 164273521f tests: Add test for vectorized byte address buffers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-09 14:47:10 +01:00
Hans-Kristian Arntzen db89d403d6 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-02 17:51:39 +01:00
Hans-Kristian Arntzen 35d2f1e87f vkd3d: Correctly check for SM 6.6 required features.
Remove the experimental flag and unconditionally enable SM 6.6 if
available.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-01 14:20:38 +01:00
Hans-Kristian Arntzen 2b11c70129 vkd3d: Hook up WaveSize implementation.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-01 14:20:38 +01:00
Hans-Kristian Arntzen 6966cd2f33 vkd3d-shader: Reflect CS WaveSize.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-01 14:20:38 +01:00
Hans-Kristian Arntzen 7cc435c0bc vkd3d: Enable feature bits for 64-bit atomics.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-01 14:20:38 +01:00
Hans-Kristian Arntzen de64ebd1d1 vkd3d: Expose Int64 feature.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-01 14:20:38 +01:00
Hans-Kristian Arntzen 23ad0247e3 vkd3d: Enable 64-bit atomics extensions.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-01 14:20:38 +01:00
Hans-Kristian Arntzen a392e82d1c tests: Add test for SM 6.6 packed intrinsics.
To get any performance out of these, we require Int8, which is
fortunately widely supported.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-01 14:20:38 +01:00
Hans-Kristian Arntzen 570ecd5f79 tests: Add SM 6.6 WaveSize test.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-01 14:20:38 +01:00
Hans-Kristian Arntzen 1d99a80f22 tests: Add test for SM 6.6 IsHelperLane().
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-01 14:20:38 +01:00
Hans-Kristian Arntzen b47282e78a tests: Add test for SM 6.6 64-bit atomics.
Tests all major scenarios:
- Root descriptor
- Table
- Typed
- Groupshared

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-01 14:20:38 +01:00
Hans-Kristian Arntzen cd2218e9c3 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-01 14:20:38 +01:00
Hans-Kristian Arntzen 6255eaec32 vkd3d: Stub out the more recent FEATURE_DATA structs.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-01 14:20:38 +01:00
Hans-Kristian Arntzen daa96ba879 idl: Add new OPTIONS feature structs.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-01 14:20:38 +01:00
David Gow 2a8b5471ca vkd3d: Handle D3D12_APPEND_ALIGNED_ELEMENT for <4 byte wide elements
In d3d12, input element alignment needs to be the _minimum_ of 4 and the size of
the type. See the D3D11 spec, section 4.4.6, which behaves similarly:
https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#4.4.6%20Element%20Alignment

This is correctly taken into account when generating, e.g., the
vertex_buffer_stride_align_mask used for validation, but is not taken
into account when D3D12_APPEND_ALIGNED_ELEMENT is used to automatically
place input elements. Currently, vkd3d always assumes the alignment is
4.

This means that, for example, bytes or shorts should be packed tightly
together when D3D12_APPEND_ALIGNED_ELEMENT is used, but are instead
padded to 4 bytes.

Fixing this makes units appear in Age of Empires IV (see vkd3d-proton
issue #880 for examples.)

Signed-off-by: David Gow <david@ingeniumdigital.com>
2021-11-01 13:30:04 +01:00
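The alignment rule from 2a8b5471ca, where an appended element is aligned to the minimum of 4 and its own size, can be sketched as follows. The function names are hypothetical and this is not the code from the commit; it only shows how tightly bytes and shorts pack under that rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Alignment for an input element: min(4, element size), as described in
 * the commit message. */
static uint32_t element_alignment(uint32_t element_size)
{
    assert(element_size > 0);
    return element_size < 4 ? element_size : 4;
}

/* Assigns offsets as if every element were placed in "append aligned" mode.
 * sizes[] holds each element's byte size, offsets[] receives the result.
 * Returns the total packed size in bytes. */
static uint32_t place_appended_elements(const uint32_t *sizes, size_t count,
        uint32_t *offsets)
{
    uint32_t cursor = 0;
    size_t i;

    for (i = 0; i < count; i++)
    {
        uint32_t align = element_alignment(sizes[i]);

        /* Round the cursor up to the element's alignment. */
        cursor = (cursor + align - 1) / align * align;
        offsets[i] = cursor;
        cursor += sizes[i];
    }

    return cursor;
}
```

With sizes 1, 1, 2 and 4 the offsets come out as 0, 1, 2 and 4. Always aligning to 4 would instead give 0, 4, 8 and 12, which is the padding the commit message describes.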
Robin Kertels 430c77d3b3 vkd3d: Don't add xfb struct to rasterization state when NumEntries is 0.
Wine VKD3D version of my original commit.

Co-authored-by: Conor McCarthy <cmccarthy@codeweavers.com>

Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
2021-10-29 18:06:31 +02:00
Hans-Kristian Arntzen c20852435d tests: Add tests for SM 6.6 compute derivatives.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-27 17:00:33 +02:00
Hans-Kristian Arntzen cd04aa63e6 tests: Test semantics for quad ops in SM 6.6.
Depending on the shader model used, quads are assigned to lanes differently.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-27 17:00:33 +02:00
Hans-Kristian Arntzen 85c75a042f vkd3d: Enable VK_NV_compute_shader_derivatives.
Supported on more implementations too :)

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-27 17:00:33 +02:00
Hans-Kristian Arntzen 30436436cd dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-27 17:00:33 +02:00
Georg Lehmann eb48213bfa vkd3d: Follow the new shaderStorageImage{Read, Write}WithoutFormat rules.
The Vulkan spec update 1.2.195 restricted these features to a very limited
format subset, and somehow this is supposed to not be an API break?
Anyway, let's follow the new rules.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-10-27 17:00:21 +02:00
Georg Lehmann fd690e3831 vkd3d: Enable typed uav loads based on KHR_format_feature_flags2.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-10-27 17:00:21 +02:00
Georg Lehmann 07d53a82cc vkd3d: Init shader extensions later.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-10-27 17:00:21 +02:00
Georg Lehmann 4c37b4c341 vkd3d: Use vkGetPhysicalDeviceFormatProperties2.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-10-27 17:00:21 +02:00
Georg Lehmann c8d633cb51 vkd3d: Enable VK_KHR_format_feature_flags2.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-10-27 17:00:21 +02:00
Hans-Kristian Arntzen 8ff91b23d6 vkd3d-shader: Hook up global descriptor heap for DXIL.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-26 15:32:36 +02:00
Hans-Kristian Arntzen aadccb66cf vkd3d: Add more root signature flags to the list of flags we recognize.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-26 15:32:36 +02:00
Hans-Kristian Arntzen 8977eaef88 vkd3d: Initialize global heap bindings for SM 6.6.
Refactor code which emits SRV/UAV bindings to common code.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-26 15:32:36 +02:00
Hans-Kristian Arntzen cbef48f90a vkd3d: Refactor out how binding counts are parsed.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-26 15:32:36 +02:00
Hans-Kristian Arntzen 6548e4fd00 vkd3d: Add VKD3D_CONFIG for experimentally enabling SM 6.6.
To be used for bringup and removed when we complete the support.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-26 15:32:36 +02:00
Hans-Kristian Arntzen ae185271ff tests: Add SM 6.6 bindless heap test.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-26 15:32:36 +02:00
Hans-Kristian Arntzen 1a57aa841a idl: Add new SM 6.6 root signature flags.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-26 15:32:36 +02:00
Hans-Kristian Arntzen e74213c576 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-26 15:32:36 +02:00
Danylo Piliaiev f6c61a3eae tests: Use 4 samples in test_shader_get_render_target_sample_count
Spec for CheckMultisampleQualityLevels says:
 "FEATURE_LEVEL_11_0 devices are required to support 4x MSAA for all
  render target formats, and 8x MSAA for all render target formats
  except R32G32B32A32 formats."

Test uses R32G32B32A32_FLOAT and since we don't check if this format
supports 8x MSAA, reduce MSAA to the minimum required by spec.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
2021-10-26 10:35:30 +02:00
Hans-Kristian Arntzen 5657f79974 tests: Test that buffer -> DS copies RowPitch is handled correctly.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-20 15:48:12 +02:00
Hans-Kristian Arntzen a0a29bae43 vkd3d: Use correct formats for image -> buffer copies.
Need to use placed format explicitly if we're copying planar resources.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-20 15:48:12 +02:00
ifedorov 0abe8a21dd Fixed row length calculation in CopyTextureRegion()
Signed-off-by: Ivan Fedorov <ifedorov@nvidia.com>
2021-10-20 14:28:35 +02:00
Hans-Kristian Arntzen 9a1b7ab002 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-20 14:12:50 +02:00
Hans-Kristian Arntzen 55e16539db meta: Update Meson build version to 2.5.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-18 17:31:03 +02:00
Philip Rebohle 9477d4af3d meta: Add Anno fix to change log.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-10-18 17:29:29 +02:00
Hans-Kristian Arntzen d4dfccece9 meta: Update CHANGELOG for 2.5.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-18 17:25:45 +02:00
Philip Rebohle 890ba87a7c vkd3d-shader: Merge i/o variables using the same location.
Fixes a number of issues observed in tessellation shaders,
and potentially geometry shaders, when inputs and/or outputs
are array variables.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-10-18 17:25:18 +02:00
Hans-Kristian Arntzen 740e23ea8a vkd3d: Add VKD3D_CONFIG to force non-invariant position.
It's common enough that new games break on RDNA2 because of this that we
should enable this by default. This matches DXVK behavior.

SOTTR gets a special weird exception, just like DXVK. The shaders are
broken enough that the proper fix is actually precise, not invariant.
This will be addressed at some later point.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-18 15:51:20 +02:00
Hans-Kristian Arntzen be8d6ec7ad vkd3d: Make global quirks info struct a value.
Allows us to fiddle with it after the fact.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-18 15:51:20 +02:00
Hans-Kristian Arntzen 26bd08bbde vkd3d-shader: Add global quirks for vkd3d_shader_quirk_info.
Will be used for VKD3D_CONFIG overrides.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-18 15:51:20 +02:00
Hans-Kristian Arntzen 32c5abf496 vkd3d-shader: Add INVARIANT_POSITION quirk.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-18 15:51:20 +02:00
Hans-Kristian Arntzen 2152500014 vkd3d-shader: Refactor out quirk selection.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-18 15:51:20 +02:00
Hans-Kristian Arntzen 4a774f872c dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-18 15:51:20 +02:00
Hans-Kristian Arntzen 3b415dbc89 vkd3d: Don't spam error if ReleaseSemaphore fails.
This function fails if the counter overflows.
CP77 hits this case a lot and we should just warn about the specific
failure instead of a random error.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-18 14:58:49 +02:00
Hans-Kristian Arntzen dda02faf89 vkd3d: Pad reserved resources to 64k alignment.
Fix GPU crashes when attempting to bind non-aligned reserved resource.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-18 14:58:34 +02:00
Hans-Kristian Arntzen c3a92a0dad tests: Test more weird GetResourceTiling edge cases.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-15 15:12:41 +02:00
Hans-Kristian Arntzen 8beb7dde89 vkd3d: Handle NULL pointers in GetResourceTiling in more places.
DEATHLOOP uses all NULL at some point ...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-15 15:12:41 +02:00
Philip Rebohle dd23492348 vkd3d: Reduce memset overhead for query map.
Potentially reduces the size of the query map, and makes each entry
versioned so that we no longer have to clear the entire map for multiple
dispatches even if it is sparsely populated.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-10-14 17:48:13 +02:00
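The idea of versioned entries in dd23492348 can be sketched generically. This is a common technique for avoiding a full clear and not necessarily the exact scheme the commit uses; the types, names and the map size below are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define QUERY_MAP_SIZE 64u /* illustrative size */

/* An entry is only valid if its version matches the map's current version. */
struct versioned_entry
{
    uint32_t version;
    uint32_t value;
};

struct versioned_map
{
    struct versioned_entry entries[QUERY_MAP_SIZE];
    uint32_t current_version;
};

/* Full clear; only needed once up front and on version wraparound. */
static void versioned_map_init(struct versioned_map *map)
{
    uint32_t i;

    for (i = 0; i < QUERY_MAP_SIZE; i++)
        map->entries[i].version = 0;
    map->current_version = 1; /* version 0 never matches */
}

/* O(1) "clear": bumping the version invalidates every entry at once. */
static void versioned_map_reset(struct versioned_map *map)
{
    if (++map->current_version == 0)
        versioned_map_init(map);
}

static void versioned_map_set(struct versioned_map *map, uint32_t index,
        uint32_t value)
{
    assert(index < QUERY_MAP_SIZE);
    map->entries[index].version = map->current_version;
    map->entries[index].value = value;
}

static bool versioned_map_get(const struct versioned_map *map, uint32_t index,
        uint32_t *value)
{
    assert(index < QUERY_MAP_SIZE);
    if (map->entries[index].version != map->current_version)
        return false;
    *value = map->entries[index].value;
    return true;
}
```

A sparsely populated map then costs one increment per reset instead of a memset over every entry.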
Hans-Kristian Arntzen 0c60791bb1 vkd3d: Pass down PrimitiveCulling extension to vkd3d-shader.
DXR 1.1 only feature.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-12 16:04:30 +02:00
Hans-Kristian Arntzen f98702603d vkd3d-shader: Add SPIR-V extension for PrimitiveCulling.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-12 16:04:30 +02:00
Hans-Kristian Arntzen 1417eb6244 tests: Add test for RayQuery.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-12 16:04:30 +02:00
Hans-Kristian Arntzen ae204143d5 tests: Add test for RTPSO side TRI/AABB culling.
DXR 1.1 feature.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-12 16:04:30 +02:00
Hans-Kristian Arntzen 3b0a430975 tests: Test various interaction with TraceRay flags.
Also test DXR 1.1 SKIP_TRIANGLES/AABB.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-12 16:04:30 +02:00
Hans-Kristian Arntzen e522053954 tests: Test more advanced RT geometry and shaders.
Add basic test for intersection + anyhit + AABB primitives.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-12 16:04:30 +02:00
Hans-Kristian Arntzen 1c0b760b7d tests: Add tier parameter to RT context creation.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-12 16:04:30 +02:00
Hans-Kristian Arntzen 6866b45637 vkd3d: Add CONFIG flag for enabling DXR 1.1.
We cannot support ExecuteIndirect with TraceRays() for the time being.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-12 16:04:30 +02:00
Hans-Kristian Arntzen e6836c6255 vkd3d: Support RTPSO CONFIG1 flags.
DXR 1.1 feature; requires the PrimitiveCullingFlags feature.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-12 16:04:30 +02:00
Hans-Kristian Arntzen 105882466b vkd3d: Validate that we cannot mix and match geom types in BLAS.
Runtime will error out and return 0 size.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-12 16:04:30 +02:00
Hans-Kristian Arntzen a3202444c8 vkd3d: Fix stack deduction for anyhit shaders.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-12 16:04:30 +02:00
Hans-Kristian Arntzen a36b987bf1 vkd3d: Add static pipeline variant flag to pipeline key.
If we need to fall back in both VRS and non-VRS scenarios, we need to key
on it. Fixes segfault in DIRT5 when toggling VRS.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-12 12:35:58 +02:00
Hans-Kristian Arntzen 3182882e21 d3d12: Do not export ordinals for most symbols.
The ordinals except for D3D12CreateDevice and GetDebugInterface are not
part of the ABI apparently.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-12 11:09:16 +01:00
Hans-Kristian Arntzen 99365bcaec vkd3d: Enable VK_NV_fragment_shader_barycentric.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-11 13:53:19 +01:00
Hans-Kristian Arntzen 158deeff22 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-11 13:29:49 +02:00
Hans-Kristian Arntzen 1ca9ec7284 tests: Add test for local root signature static samplers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-08 11:51:47 +02:00
Hans-Kristian Arntzen 08a7d7a165 vkd3d: Bind local root signature static set.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-08 11:51:47 +02:00
Hans-Kristian Arntzen d83ce4392b vkd3d: Check root signature associations in hit groups as well.
If we don't find a clear association to an entry point,
we can also find it in the hit group.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-08 11:51:47 +02:00
Hans-Kristian Arntzen c672429c70 vkd3d: Fix demangling of RT entry points.
Digits are of course also valid in identifiers.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-08 11:51:47 +02:00
Hans-Kristian Arntzen 235541ace5 vkd3d: Build local static sampler set/pipeline layouts and allocate set.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-08 11:51:47 +02:00
Hans-Kristian Arntzen f605b88e90 vkd3d: Make some RS related functions non-static.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-08 11:51:47 +02:00
Hans-Kristian Arntzen 90d52abe94 vkd3d: Parse local RS static samplers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-08 11:51:47 +02:00
Hans-Kristian Arntzen 74f62784e4 vkd3d: Cleanup redundant parameter_count assignment.
parameter_count == NumParameters for local RS since
hoisting is explicitly ignored for those.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-08 11:51:47 +02:00
Hans-Kristian Arntzen 393ef6261b vkd3d: Add local root signature objects to RTPSO.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-08 11:51:47 +02:00
Hans-Kristian Arntzen 6802d9e5a3 vkd3d: Add helper to create augmented pipeline layout.
For local root signature static samplers, this is handy.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-08 11:51:47 +02:00
Hans-Kristian Arntzen 67be905421 vkd3d: Bump max number of descriptor sets.
Need one potentially for local root signature static samplers.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-08 11:51:47 +02:00
Hans-Kristian Arntzen b661c9b8ba vkd3d: Store set layout array in root signature.
With RTPSOs we might have to create static sampler sets for local root
signatures. In this case we will have to create a compatible pipeline
layout which is equal to global pipeline layout, except for an extra
set.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-08 11:51:47 +02:00
Hans-Kristian Arntzen ac9d98b2b4 tests: Verify that we can use UPDATE mode in PrebuildInfo.
As expected, the flag is ignored unless we're actually building.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-08 07:21:47 +01:00
Hans-Kristian Arntzen 1e42acf492 vkd3d: Allow BUILD_MODE_UPDATE in PrebuildInfo check.
Metro Exodus Enhanced Edition hits this a lot.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-08 07:21:47 +01:00
Hans-Kristian Arntzen 4244441aca tests: Test that we can pass in NULL to ppData in CreatePipelineLibrary.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-07 17:55:57 +02:00
Hans-Kristian Arntzen 0f2e448659 vkd3d: Handle CreatePipelineLibrary with NULL ppData.
Supposed to return S_FALSE.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-07 17:55:57 +02:00
Hans-Kristian Arntzen c58edfabe1 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-07 16:03:16 +02:00
Hans-Kristian Arntzen ab4e847e74 renderdoc: Add global capture support.
Useful for test suite since a test can be comprised of several smaller
submissions, and it's easier to debug if we have one trace.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-07 15:33:06 +02:00
Hans-Kristian Arntzen 385c3dc012 vkd3d: Add bug reference for split fallback types.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-07 15:32:54 +02:00
Hans-Kristian Arntzen d74cfe1883 tests: Add stress test for allocating RT/DS heaps.
Without a specific workaround, we will fail this test on NV.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-07 15:32:54 +02:00
Hans-Kristian Arntzen a2f350117f tests: Add simple stress test for UPLOAD allocation.
Try to allocate a lot of memory at once. Useful for seeing if fallbacks
work as intended.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-07 15:32:54 +02:00
Hans-Kristian Arntzen 26dc9e7da5 vkd3d: Allow CreateHeap to fail in certain fallback situations.
If we deduce that fallback heap allocation is impossible, we will accept
this, and defer allocation to CreatePlacedResource() instead where we make a committed resource.
This breaks aliasing, but in practice, this situation will only arise for render
targets, and it's not like we have a choice in the matter here on NV :\

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-07 15:32:54 +02:00
Hans-Kristian Arntzen 7ee8eac818 vkd3d: Add allocation flag for DEDICATED.
When allocating dedicated memory, ignore heap_flag requirements we
deduce from memory info. Any memory type is allowed. This is important
on NV when allocating fallback render targets.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-07 15:32:54 +02:00
Hans-Kristian Arntzen cddb98acc6 vkd3d: Consider that we might attempt to free NULL memory.
For deferred heaps, we will accept NULL allocations.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-07 15:32:54 +02:00
Hans-Kristian Arntzen 4075809a91 vkd3d: Make error message more precise when failing to allocate memory.
There are situations where we cannot fall back to system memory, so don't
log that we're going to do so.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-07 15:32:54 +02:00
Hans-Kristian Arntzen 9065f312d5 vkd3d: Refactor out validation of CUSTOM heap types.
Don't attempt to enter memory allocation when we can invalidate a heap
allocation up front. Avoids some dumb edge cases later.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-07 15:32:54 +02:00
Hans-Kristian Arntzen 9415191111 vkd3d: Add LOG_MEMORY_BUDGET logging for non-budget as well.
Useful to be able to debug which allocations happen.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-07 15:32:54 +02:00
Joshua Ashton c9ff20d4ac vkd3d: Make a generic UE4 shader quirk collection
Many UE4 games have this broken bloom shader that samples a texture with implicit lod in divergent control flow.

Fixes Bus Simulator 21

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-10-07 10:18:47 +01:00
Joshua Ashton 7a66669e92 vkd3d: Add empty element to shader quirks
If we ever remove these, we need this for MSVC.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-10-07 10:18:47 +01:00
Joshua Ashton d91d47d827 vkd3d: Use vkd3d_string_compare for shader quirks
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-10-07 10:18:47 +01:00
Joshua Ashton 70ee02bce0 vkd3d: Use vkd3d_string_compare for application overrides
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-10-07 10:18:47 +01:00
Joshua Ashton 4c959c8a77 vkd3d: Add vkd3d_string_compare helper
Compares a string with a given comparison mode.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-10-07 10:18:47 +01:00
Joshua Ashton 6dbb4f6dfe vkd3d: Add vkd3d_string_ends_with helper
Checks if a string ends with another string.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-10-07 10:18:47 +01:00
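A suffix check of the kind 6dbb4f6dfe describes might look like the sketch below. The name `example_string_ends_with` and its signature are assumptions made for illustration; the actual `vkd3d_string_ends_with` helper may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if str ends with suffix. An empty suffix always matches. */
static bool example_string_ends_with(const char *str, const char *suffix)
{
    size_t str_len, suffix_len;

    assert(str && suffix);

    str_len = strlen(str);
    suffix_len = strlen(suffix);

    /* A suffix longer than the string can never match. */
    if (suffix_len > str_len)
        return false;

    /* Compare only the tail of str against the suffix. */
    return memcmp(str + (str_len - suffix_len), suffix, suffix_len) == 0;
}
```

Matching on a suffix is handy when many executables share a common name ending, which is presumably why the generic UE4 quirk collection in this series relies on it.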
Hans-Kristian Arntzen 0f802b151e vkd3d-shader: Avoid undefined result for Ibfe/Ubfe/Bfi.
Width + offset must not overflow in SPIR-V. SM 5+ is well-defined here.
It's enough to just clamp the width against 32 - offset in all cases.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-05 15:45:02 +02:00
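The clamp described in 0f802b151e can be sketched on the CPU for the unsigned case. This is an illustration in plain C, not the SPIR-V emitted by vkd3d-shader, and the function name is hypothetical. It assumes `offset` is already in the range 0 to 31.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned bitfield extract with the width clamped to 32 - offset, so that
 * offset + width never exceeds 32. */
static uint32_t bitfield_extract_clamped(uint32_t value, uint32_t offset,
        uint32_t width)
{
    uint32_t max_width;

    assert(offset < 32);

    max_width = 32 - offset;
    if (width > max_width)
        width = max_width;

    if (width == 0)
        return 0;
    if (width == 32)
        return value; /* offset is 0 here; shifting by 32 would be UB in C */

    return (value >> offset) & ((UINT32_C(1) << width) - 1);
}
```

For example, extracting 8 bits at offset 28 is clamped to the 4 bits that actually exist above bit 28.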
Hans-Kristian Arntzen cd3d759b95 vkd3d: Enable VK_KHR_shader_integer_dot_product.
Accelerates SM 6.4 packed ops if present.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-05 15:38:59 +02:00
Hans-Kristian Arntzen 50d41d8f02 khronos: Update SPIR-V and Vulkan headers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-05 15:38:59 +02:00
Hans-Kristian Arntzen f58b23e8e7 tests: Verify that integer dot products do not saturate.
SPIR-V has a saturating and non-saturating variant, so to be sure ...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-05 15:38:59 +02:00
Hans-Kristian Arntzen 807232ceff dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-05 14:49:49 +02:00
Danylo Piliaiev 77c67e2bf5 vkd3d: Use 64bit atomics on all 64bit platforms
The previous check was not exhaustive.

Closes: #830

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
2021-10-05 13:22:25 +02:00
Hans-Kristian Arntzen 4ff1166230 tests: Remove obsolete check for lack of wait-before-signal support.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-05 11:44:42 +02:00
Hans-Kristian Arntzen d9cd18b1ca vkd3d-shader: Handle vectorized FIRSTBIT_HI.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-01 16:23:52 +02:00
Hans-Kristian Arntzen 7b4423eee5 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-01 16:14:37 +02:00
Hans-Kristian Arntzen 4edd76d8bb tests: Fix validation error in test_null_rtv.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-30 16:50:02 +02:00
Hans-Kristian Arntzen 4f7e4ee753 tests: Add test for rendering to unbound RTV.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-30 16:50:02 +02:00
Hans-Kristian Arntzen af822939fb vkd3d: Implement support for rendering to NULL/unbound RTV.
Need to use fallback pipeline system here.
Keep track of active masks for PSO and current render target.
The intersection of those sets is the set of attachments which should be
active in the render pass.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-30 16:50:02 +02:00
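
Note on af822939fb: the mask intersection described in that commit can be
sketched roughly as below. This is a minimal illustration, not the actual
vkd3d-proton code; the function names and MAX_RENDER_TARGETS are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit and names; not the actual vkd3d-proton structures. */
#define MAX_RENDER_TARGETS 8u

/* Build a bitmask with bit i set when RTV slot i has a non-NULL view bound. */
static uint32_t bound_rtv_mask(const void *const *rtvs, unsigned int count)
{
    uint32_t mask = 0;
    unsigned int i;

    assert(count <= MAX_RENDER_TARGETS);
    for (i = 0; i < count; i++)
    {
        if (rtvs[i])
            mask |= 1u << i;
    }
    return mask;
}

/* Attachments active in the render pass: those the PSO writes to
 * AND that currently have a render target bound. */
static uint32_t active_attachment_mask(uint32_t pso_rtv_mask, uint32_t bound_mask)
{
    return pso_rtv_mask & bound_mask;
}
```

When the resulting mask differs from what the PSO was compiled against, a
different (fallback) pipeline would be needed, which is presumably why the
commit mentions the fallback pipeline system.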
Hans-Kristian Arntzen b0f3512b8b tests: Add test for discarding UAVs in compute list.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-29 14:17:31 +02:00
Hans-Kristian Arntzen 173b565ccf vkd3d: Optimize DiscardResource when all subresources are discarded.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-29 14:17:31 +02:00
Hans-Kristian Arntzen 0b11fad67c vkd3d: Allow discarding UAV resources.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-29 14:17:31 +02:00
Hans-Kristian Arntzen 6f0677eb2e vkd3d: Refactor out queue flags -> stages conversion.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-29 14:17:31 +02:00
Hans-Kristian Arntzen 0c2ddb89cd vkd3d: Add CONFIG for forced CACHED memory.
Very useful for capturing. Speeds up a ton.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-27 14:48:26 +02:00
Hans-Kristian Arntzen 6863f1c6a8 vkd3d: Fix test suite regression on NV.
Fix failure in test_create_heap where a TIER_2 host visible heap was
attempted, but failed due to recent DEATHLOOP fixes.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-24 16:48:34 +02:00
Joshua Ashton bde3ad8e01 vkd3d: Move ID3D12StateObject impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton cabc31fc4c vkd3d: Move ID3D12Device impl_froms to header
Basic casts should not be function calls.
2021-09-23 12:12:13 +02:00
Joshua Ashton bfaf72386f vkd3d: Move ID3D12CommandSignature impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton b84c3ff163 vkd3d: Move ID3D12PipelineState impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton 7c993ae1a6 vkd3d: Move ID3D12RootSignature impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton 875fbe5f50 vkd3d: Move ID3D12QueryHeap impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton 2334c136e3 vkd3d: Move ID3D12DescriptorHeap impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton 8d5308c9a1 vkd3d: Move ID3D12Resource impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton 27e66b5c4a vkd3d: Move ID3D12Heap impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton 26d8011b06 vkd3d: Move ID3D12Fence impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton e597adb83a vkd3d: Move d3d12_query_heap_type_get_data_size to header
This should be inlined.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton 3b3bd37f93 vkd3d: Avoid tracking + ending render passes when calling ResolveQueryData with 0 queries
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Conor McCarthy da8daa860b tests: Add test for SampleDesc.Count == 0 in test_create_committed_resource().
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
2021-09-23 11:00:04 +01:00
Conor McCarthy 446c7423ce vkd3d: Return E_INVALIDARG for texture creation if SampleDesc.Count == 0.
Windows returns E_INVALIDARG at least on AMD and Intel.
Psychonauts 2 seems to use this as a de facto "do not create"
value, and reasonable vram usage depends on the call failing.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
2021-09-23 11:00:04 +01:00
Conor McCarthy d366ba47ac Revert "vkd3d: Support SAMPLE_DESC.Count of 0"
Windows returns E_INVALIDARG in this case.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
2021-09-23 11:00:04 +01:00
Georg Lehmann cf4fb44629 vkd3d: Remove almost unused variable.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-09-21 11:22:34 +01:00
Georg Lehmann edeb0658b7 vkd3d: Fix memory leak on failure.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-09-21 11:22:34 +01:00
Georg Lehmann 0afa6732ad vkd3d: Cleanup weird assignment.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-09-21 11:22:34 +01:00
Georg Lehmann 1946e42367 vkd3d-shader: Fix use-after-free on failure.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-09-21 11:22:34 +01:00
David McCloskey a19619ccbf vkd3d: Fixing compile errors on Windows.
2021-09-18 21:40:30 +01:00
Hans-Kristian Arntzen 173b8ecef0 vkd3d: Add workaround for DEATHLOOP.
Game attempts to create a host visible resource with
ALLOW_RENDER_TARGET flag. We cannot make this work on NVIDIA, but the
game never seems to actually create an RTV, so as a workaround, nop out
the flag, which does make it work after all :3

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-17 14:21:09 +02:00
Hans-Kristian Arntzen fa4d2182b1 vkd3d: Copy all aspects in CopyResource.
Just like we're promoting layer count, also promote aspect mask.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-17 14:21:09 +02:00
Hans-Kristian Arntzen 2b13d06f82 tests: Add test for how blending on integer RTVs is validated.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:57:28 +02:00
Hans-Kristian Arntzen e687d489ab vkd3d: Validate blend state against output signature.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:57:28 +02:00
Hans-Kristian Arntzen a4b082a828 vkd3d-shader: Add helper to parse output signature.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:57:28 +02:00
Hans-Kristian Arntzen 1d51818d8f vkd3d: Fix compile error introduced by bad rebase.
Somehow the rebase got really screwed up :\

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:42:30 +02:00
Hans-Kristian Arntzen a8f623e60d vkd3d: Negate upload_hvv config.
Enable resizable BAR style allocations by default, and add option to
disable it.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
Hans-Kristian Arntzen 12066a2b67 vkd3d: Add debug config to log resizable BAR allocations.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
Hans-Kristian Arntzen 710fa98918 vkd3d: Setup resizable bar budget.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
Hans-Kristian Arntzen cec741706d vkd3d: Refactor out memory topology queries.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
Hans-Kristian Arntzen abdaeb136d vkd3d: Add a memory budget per memory type.
For resizable BAR, we don't want to endlessly promote UPLOAD heaps to
BAR since VRAM is precious. The aim is to set a fixed budget where we
can keep allocating until full, at which point we fall back to plain HOST.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
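
Note on abdaeb136d: the "allocate until full, then fall back to plain HOST"
policy could look something like the sketch below. All names here
(memory_budget, pick_upload_heap, HEAP_BAR, HEAP_HOST) are hypothetical, and
a real implementation would also need locking around the counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-memory-type budget; invariant: used <= budget. */
struct memory_budget
{
    uint64_t budget; /* maximum bytes allowed in this memory type */
    uint64_t used;   /* bytes currently reserved */
};

enum heap_choice { HEAP_BAR, HEAP_HOST };

/* Reserve size bytes if they still fit within the budget. */
static bool budget_try_reserve(struct memory_budget *b, uint64_t size)
{
    assert(b->used <= b->budget);
    if (size > b->budget - b->used)
        return false;
    b->used += size;
    return true;
}

/* Return previously reserved bytes to the budget. */
static void budget_release(struct memory_budget *b, uint64_t size)
{
    assert(size <= b->used);
    b->used -= size;
}

/* UPLOAD heaps go to BAR while the budget lasts, then to plain HOST. */
static enum heap_choice pick_upload_heap(struct memory_budget *bar, uint64_t size)
{
    return budget_try_reserve(bar, size) ? HEAP_BAR : HEAP_HOST;
}
```

Releasing a BAR allocation makes room for later UPLOAD heaps to be promoted
again, so the budget behaves as a cap rather than a one-time quota.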
Hans-Kristian Arntzen e0451bb541 vkd3d: Handle fallbacks properly in suballocator.
With BAR budgets, what will happen is that
- Small allocation is requested
- A new chunk is requested
- try_suballocate_memory will end up calling allocate_memory, which
  allocates a fallback memory type
- Subsequent small allocations will always end up allocating a new
  fallback memory block, never reusing existing blocks.
- System memory is rapidly exhausted once apps start hitting against
  budget.

The fix is to add flags which explicitly do not attempt to fallback
allocate. This makes it possible to handle fallbacks at the appropriate
level in try_suballocate_memory instead.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
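
Note on e0451bb541: a simplified model of the fix, assuming a two-type
setup (BAR preferred, HOST as fallback). The flag name ALLOC_NO_FALLBACK,
the chunk layout and the sizes are hypothetical; only the function names
allocate_memory and try_suballocate_memory come from the commit message,
and their signatures here are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum mem_type { MEM_BAR, MEM_HOST };

#define ALLOC_NO_FALLBACK 0x1u /* hypothetical flag */
#define CHUNK_SIZE 64u
#define MAX_CHUNKS 16u

struct chunk { enum mem_type type; uint32_t free_bytes; };

struct allocator
{
    struct chunk chunks[MAX_CHUNKS];
    size_t chunk_count;
    uint32_t bar_budget; /* bytes that may still be placed in BAR */
};

/* Block-level allocation. Without ALLOC_NO_FALLBACK, a BAR request that
 * exceeds the budget silently becomes a HOST allocation. */
static bool allocate_memory(struct allocator *a, enum mem_type type,
        uint32_t size, unsigned int flags, enum mem_type *out_type)
{
    if (type == MEM_BAR)
    {
        if (a->bar_budget >= size)
        {
            a->bar_budget -= size;
            *out_type = MEM_BAR;
            return true;
        }
        if (flags & ALLOC_NO_FALLBACK)
            return false;
    }
    *out_type = MEM_HOST;
    return true;
}

/* Carve size bytes out of an existing chunk of the given type, if any fits. */
static bool suballocate_from_existing(struct allocator *a, enum mem_type type, uint32_t size)
{
    size_t i;
    for (i = 0; i < a->chunk_count; i++)
    {
        if (a->chunks[i].type == type && a->chunks[i].free_bytes >= size)
        {
            a->chunks[i].free_bytes -= size;
            return true;
        }
    }
    return false;
}

/* Create a new chunk of exactly the requested type, or fail. */
static bool add_chunk(struct allocator *a, enum mem_type type)
{
    enum mem_type actual;
    if (a->chunk_count == MAX_CHUNKS)
        return false;
    if (!allocate_memory(a, type, CHUNK_SIZE, ALLOC_NO_FALLBACK, &actual))
        return false;
    a->chunks[a->chunk_count].type = actual;
    a->chunks[a->chunk_count].free_bytes = CHUNK_SIZE;
    a->chunk_count++;
    return true;
}

/* The fallback is handled here, per memory type, so existing fallback
 * chunks are reused before a new one is created. */
static bool try_suballocate_memory(struct allocator *a, uint32_t size, enum mem_type *out_type)
{
    static const enum mem_type order[] = { MEM_BAR, MEM_HOST };
    size_t i;

    assert(size <= CHUNK_SIZE);
    for (i = 0; i < sizeof(order) / sizeof(order[0]); i++)
    {
        if (suballocate_from_existing(a, order[i], size)
                || (add_chunk(a, order[i]) && suballocate_from_existing(a, order[i], size)))
        {
            *out_type = order[i];
            return true;
        }
    }
    return false;
}
```

In this model, once the BAR budget is exhausted the second small HOST
allocation lands in the already existing HOST chunk instead of creating a
fresh block every time, which is the behavior the commit message describes
as the goal.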
Hans-Kristian Arntzen cb94cfd10c vkd3d: Fix silly typo in global mask.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
Hans-Kristian Arntzen 426cdc9218 vkd3d: Destroy GLOBAL_BUFFER for some early error out paths.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
Hans-Kristian Arntzen 69d4f55219 vkd3d: Refactor VkDeviceMemory allocation to keep track of type/size.
We will need to consider some form of budgeting, so make sure that all
allocation and freeing is done in a central place.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
Hans-Kristian Arntzen a590db2508 tests: Add test for host visible render target.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 15:35:57 +02:00
Hans-Kristian Arntzen d402255349 tests: Add reduced test for ReadWriteSubresource to 2D images.
3D linear images are not well supported.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 15:35:57 +02:00
Hans-Kristian Arntzen b4521ebbd8 tests: Add tests for various ways to map 2D textures.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 15:35:57 +02:00
Hans-Kristian Arntzen 8d49d3e9ae vkd3d: Add extra validation for mapping textures.
D3D12 validation layers complain if you try to map mipmapped 3D volumes
for ... some reason. The error is very explicit, so I assume it's
intentional :)

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 15:35:57 +02:00
Hans-Kristian Arntzen 9fd422a0fd vkd3d: Fix default layout check when using LINEAR tiled images.
Match behavior of d3d12_resource_pick_layout.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 15:35:57 +02:00
Hans-Kristian Arntzen 41295eff6c vkd3d: Consider CPU availability when selecting memory types.
Need to consider that based on host visibility requirements, we need to
select either LINEAR or OPTIMAL image types, and those tiling modes can
have different memory requirements.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 15:35:57 +02:00
Hans-Kristian Arntzen 132638be67 vkd3d: Add more logging when linear image allocation fails.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 15:35:57 +02:00
Hans-Kristian Arntzen 50f2c35b44 vkd3d: Add stricter ROW_MAJOR texture validation.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 15:35:57 +02:00
Hans-Kristian Arntzen 961fef84de vkd3d: Allow map of texture as long as ppData is NULL.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 15:35:57 +02:00
Joshua Ashton 9c0fa91ca5 vkd3d: Add shader quirks for Psychonauts 2
Works around a game bug. It uses texture() inside divergent control flow.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-15 11:52:39 +02:00
Hans-Kristian Arntzen 3081887757 vkd3d: Add 12_2 to list of valid feature levels.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-14 21:18:29 +02:00
Hans-Kristian Arntzen 0e216b2b10 vkd3d: Narrow workaround for global pipeline cache.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-08 18:37:55 +02:00
Hans-Kristian Arntzen 11086a94e0 vkd3d: Add macros to parse/build NV driver versions.
The bit offsets are a bit different from Vulkan API.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-08 18:37:55 +02:00
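
Note on 11086a94e0: a sketch of what such macros might look like, assuming
the commonly reported NVIDIA driverVersion layout (10 bits major, 8 bits
minor, 8 bits patch, 6 low bits for a further sub-version). The layout is
an assumption, not something stated in the commit, and the macro and
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed NVIDIA layout: major << 22 | minor << 14 | patch << 6 | extra.
 * Compare with VK_MAKE_VERSION: major << 22 | minor << 12 | patch. */
#define NV_VERSION_MAKE(major, minor, patch) \
    (((uint32_t)(major) << 22) | ((uint32_t)(minor) << 14) | ((uint32_t)(patch) << 6))
#define NV_VERSION_MAJOR(v) ((uint32_t)(v) >> 22)
#define NV_VERSION_MINOR(v) (((uint32_t)(v) >> 14) & 0xffu)
#define NV_VERSION_PATCH(v) (((uint32_t)(v) >> 6) & 0xffu)

/* Checked variant that rejects components which do not fit their fields. */
static uint32_t nv_version_make_checked(uint32_t major, uint32_t minor, uint32_t patch)
{
    assert(major < (1u << 10));
    assert(minor < (1u << 8));
    assert(patch < (1u << 8));
    return NV_VERSION_MAKE(major, minor, patch);
}
```

Because the fields are ordered from most to least significant, two encoded
versions can be compared with a plain integer comparison, which is handy
for narrowing a workaround to a driver range.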
Hans-Kristian Arntzen d2b3238b2d tests: Add tests for creating DS formats without ALLOW_DEPTH_STENCIL.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-07 13:31:28 +02:00
Hans-Kristian Arntzen fcaeca8d27 vkd3d: Allow typeless depth-stencil formats without ALLOW_DEPTH_STENCIL.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-07 13:31:28 +02:00
Hans-Kristian Arntzen 403d1f9743 vkd3d: Workaround huge memory overhead for individual VkPipelineCaches.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-07 13:21:54 +02:00
Hans-Kristian Arntzen b8f0cd6eb6 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-07 12:10:06 +02:00
Hans-Kristian Arntzen 1d5acef691 tests: Add test for footprint -> depth-stencil copy.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-06 17:00:51 +02:00
Hans-Kristian Arntzen a3267ba8e5 vkd3d: Fix copies between footprint and DS aspects.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-06 17:00:51 +02:00
Hans-Kristian Arntzen fa1d82e141 vkd3d: Fix regressions when introducing null-copy elision.
Need to initialize the set mask so that copies happen properly
on default-initialized descriptors. Also, move the current_null_type to
metadata so that it's properly copied on descriptor copy.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-03 12:24:26 +02:00
Rodrigo Locatti b4cb5a37f8 vkd3d: Optimize repeated null descriptor updates
There are titles clearing the same descriptors constantly.
This leads to unnecessary updates that can become costly.

This commit introduces a new flag to track when D3D12 descriptors are
not null, and skips clearing them if they are already null.
Descriptors are assumed to be null by default.

This fixes a performance regression introduced by
9983a1720f

Signed-off-by: Rodrigo Locatti <rlocatti@nvidia.com>
2021-09-02 21:21:34 +02:00
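
Note on b4cb5a37f8: the null-elision idea can be sketched as a single
tracking bit per descriptor. The struct and function names below are
hypothetical and the "payload" stands in for whatever the real descriptor
write touches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor slot. Zero-initialized means "null", matching
 * the commit's "descriptors are assumed to be null by default". */
struct descriptor
{
    bool non_null;    /* set once a real view has been written */
    uint32_t payload; /* stand-in for the expensive-to-update state */
};

/* Write a real view into the slot. */
static void descriptor_write_view(struct descriptor *d, uint32_t payload)
{
    assert(payload != 0);
    d->payload = payload;
    d->non_null = true;
}

/* Clear the slot. Returns true if an update was actually performed,
 * false if the slot was already null and the write was skipped. */
static bool descriptor_write_null(struct descriptor *d)
{
    if (!d->non_null)
        return false;
    d->payload = 0;
    d->non_null = false;
    return true;
}
```

Any path that copies descriptors would also have to copy the tracking bit,
otherwise a later clear could be skipped incorrectly.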
Philip Rebohle 7fea3527ed vkd3d: Remove deferred clears.
Emitting render pass clears while we're in the process of starting
a render pass overrides dsv layout tracking info.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-09-02 17:11:35 +02:00
Hans-Kristian Arntzen b05145b421 tests: Add test for depth testing against null DSV.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-02 17:10:47 +02:00
Hans-Kristian Arntzen ff74ad0ec5 vkd3d: Skip draw call if doing depth test on null DSV.
D3D12 validation layer errors out, so unless we can prove that specific
behavior is relied upon, we should be okay to just ignore.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-02 17:10:47 +02:00
Hans-Kristian Arntzen b54a1a6c2b vkd3d: Fix MSVC build.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-02 16:56:39 +02:00
Hans-Kristian Arntzen 00e4397467 vkd3d: Ignore depth/stencil test if DSVFormat does not have that aspect.
Fix some validation errors in F1 2021.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-02 16:25:27 +02:00
Hans-Kristian Arntzen 6f8ebaae7e tests: Add test for planar footprints.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-02 12:21:22 +02:00
Hans-Kristian Arntzen bc9bd9c482 vkd3d: Fix member types in vkd3d_format.
No need to use size_t.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-02 12:21:22 +02:00
Hans-Kristian Arntzen 7b67de7d0e vkd3d: Generalize get_plane_footprints.
Get information directly from vkd3d_format and allow for subsampled
formats in the future.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-02 12:21:22 +02:00
Hans-Kristian Arntzen 3d5010555e vkd3d: Add d3d12_resource_desc_get_sub_resource_count.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-02 12:21:22 +02:00
Hans-Kristian Arntzen 5c2376faf5 vkd3d: Handle multiplanar formats in GetCopyableFootprints.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-02 12:21:22 +02:00
Hans-Kristian Arntzen b8881ff693 vkd3d-common: Log TID in Wine's format.
Allows us to stay sane when correlating logs.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-01 15:41:59 +02:00
Hans-Kristian Arntzen d9bdd515a4 tests: Check for Native16Bit support before testing dot2add.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-30 23:06:26 +02:00
Hans-Kristian Arntzen 566cf1ed78 tests: Rename get_rt_lib() to something that allows for more libs.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-28 12:16:42 +02:00
Hans-Kristian Arntzen 9a92d62465 tests: Use RT factory for main RT PSO as well.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-28 12:16:42 +02:00
Hans-Kristian Arntzen fb8d8616b7 tests: Introduce a factory for building RTPSO subobject lists.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-28 12:16:42 +02:00
Hans-Kristian Arntzen 6aecbe2482 tests: Refactor out RT collection creation.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-28 12:16:42 +02:00
Hans-Kristian Arntzen dda18f0fcd tests: Hoist out helper function to create a complete RTAS.
Have a single helper function to create RTASes with X * Y quads.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-28 12:16:42 +02:00
Hans-Kristian Arntzen c1f848ed3b vkd3d: Only look at SourceRTAS when updating.
Be more robust against garbage inputs.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-28 12:16:42 +02:00
Hans-Kristian Arntzen af2d41f6f8 tests: Use helper functions to build top-level acceleration structures.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-28 12:16:42 +02:00
Hans-Kristian Arntzen 17d5984c2c tests: Add helpers for creating and copying RTASes.
Move bottom RTAS building over to new helpers.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-28 12:16:42 +02:00
Hans-Kristian Arntzen 82db981b26 tests: Refactor out transform buffer creation.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-28 12:16:42 +02:00
Hans-Kristian Arntzen 99d2e39dfa tests: Refactor out test geometry allocation.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-28 12:16:42 +02:00
Hans-Kristian Arntzen 830b9ef4e3 tests: Refactor out RT DXIL library declaration.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-28 12:16:42 +02:00
Hans-Kristian Arntzen d44d359a18 tests: Refactor out RT context creation.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-28 12:16:42 +02:00
rochaudhari 0828aec4f6 vkd3d: Implement new interfaces required for DX12 DLSS support.
Adds ID3D12GraphicsCommandListExt and ID3D12DeviceExt interfaces.

Signed-off-by: Roshan Chaudhari <rochaudhari@nvidia.com>
2021-08-27 11:37:15 +02:00
Joshua Ashton e9f04e8e0e vkd3d: Support SAMPLE_DESC.Count of 0
Psychonauts 2 uses a SAMPLE_DESC.Count of 0 for some things, which
previously was forcing it down the MSAA alignment placement path.

Found from playing a native D3D12 apitrace back and seeing
the log spam.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-08-26 14:23:37 +02:00
Philip Rebohle 715eca1b95 vkd3d: Reimplement frame latency event as a semaphore.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-08-26 14:21:38 +02:00
Philip Rebohle fef30f5037 vkd3d: Support releasing semaphores from a D3D12 fence.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-08-26 14:21:38 +02:00
Joshua Ashton 68a035c605 vkd3d-shader: Fix vkd3d-compiler crash
Since we added validation here for FH4, this crashes now as vkd3d-compiler passes a NULL shader_interface_info.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-08-26 14:06:47 +02:00
Joshua Ashton 56e12d88ce vkd3d-shader: Fix multiple constant buffers with RAW_VA
Consider we have declarations of CB0 of size 36 and CB1 of size 153.
Previously we'd just return the struct of CB0 when accessing CB1 because it came first as we didn't consider the size.

Psychonauts 2 indexes into CB1 by constant values above 36.
There is no reason a compiler could not eliminate these reads as it is technically out of bounds for the underlying array type.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-08-26 14:05:52 +02:00
Hans-Kristian Arntzen 5ef3d4bff9 tests: Move test implementations to appropriate files.
Avoids crippling 50+ ksloc files which are impossible to navigate
efficiently. IDEs tend to give up on files this large and editors start
to chug hard.

This commit is essentially pure cut 'n paste, which is why it's all in
one large commit. There is little to no reason to attempt to split this
up into multiple smaller commits.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-26 14:05:23 +02:00
Hans-Kristian Arntzen c2473fb873 tests: Improve log quality.
Set test name equal to the test that is actually running, not a global
"d3d12" which isn't very useful for a case with multiple files since
line number alone isn't enough to know where to look.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-26 14:05:23 +02:00
Hans-Kristian Arntzen 1a7ea5e0a6 tests: Make tests extern.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-26 14:05:23 +02:00
Hans-Kristian Arntzen 195e1a3447 tests: Declare test prototypes in separate header.
Allows for moving test implementations to their own translation units.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-26 14:05:23 +02:00
Hans-Kristian Arntzen 4a6fba9f56 tests: Make some statics in headers extern.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-26 14:05:23 +02:00
Hans-Kristian Arntzen 9d5cf16fc3 tests: Move common test code to its own file.
Some ifdef jank required since the various headers declare the main function.
Some additional jank with INITGUID, otherwise we get multiple
declaration errors.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-26 14:05:23 +02:00
Hans-Kristian Arntzen f589462ab5 tests: Move MATH defines to d3d12_crosstest.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-26 14:05:23 +02:00
Hans-Kristian Arntzen aaaac271bd tests: Make other entry points extern.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-26 14:05:23 +02:00
Hans-Kristian Arntzen 8c89dacf76 tests: Make common test functionality extern.
Prepares for a situation where tests can be spread across multiple
translation units.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-26 14:05:23 +02:00
Hans-Kristian Arntzen 7ff3ef2654 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-24 12:52:56 +02:00
Hans-Kristian Arntzen f3fd2bf70b vkd3d: Use BAR memory type for descriptor heap helpers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-23 13:24:43 +02:00
Hans-Kristian Arntzen 7e165238e6 vkd3d: Allow all memory types if UPLOAD_HVV is used.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-23 13:24:43 +02:00
Joshua Ashton 1b957a1f74 vkd3d: Add config to use host-visible vram for UPLOAD heap
Adds the "upload_hvv" config flag, which will make D3D12_HEAP_TYPE_UPLOAD attempt to use host-visible VRAM for allocations.

This takes advantage of large or resizable BAR if available.

I see a perf delta of 83-84 -> 92-94 (~12%) when using this in Horizon Zero Dawn.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-08-23 13:24:43 +02:00
Hans-Kristian Arntzen 05e31bfba9 vkd3d: Ensure we do not fallback device allocations to BAR.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-23 13:24:43 +02:00
Robin Kertels 76f37c3cbf vkd3d: Only disable raster based on SO stream if SO is used.
Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
2021-08-23 13:10:14 +02:00
Hans-Kristian Arntzen b2c99b035a vkd3d: Allow SM 6.2 on NV.
FloatControlProperties struct appears to be broken, and it does seem to
work just fine.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-30 15:19:35 +00:00
Hans-Kristian Arntzen 41d54e19f4 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-30 13:03:48 +02:00
Hans-Kristian Arntzen 093a8c49f3 vkd3d: Expose shader model 6.5.
WaveMatch and WaveMultiPrefix are implemented and pass test.
Other features are gated behind feature bits.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-29 20:42:32 +02:00
Hans-Kristian Arntzen 3c350ec0f5 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-29 20:42:32 +02:00
David McCloskey a2a7d78c27 vkd3d: Fixing CopyTextureRegion going out of bounds when src_box is null.
Signed-off-by: David McCloskey <davmcclo@gmail.com>
2021-07-29 17:28:52 +02:00
David McCloskey 155195ef99 tests: Adding test for crash caused by CopyTextureRegion with null source box from larger texture to smaller.
Signed-off-by: David McCloskey <davmcclo@gmail.com>
2021-07-29 17:28:52 +02:00
Hans-Kristian Arntzen 3f3162ab5f tests: Add test for fence signal with NULL event.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-29 17:21:20 +02:00
Hans-Kristian Arntzen e1bb5f3b77 vkd3d: Handle NULL event handles in ID3D12Fence::SetEvent*().
We need to block here for whatever reason.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-29 17:21:20 +02:00
Hans-Kristian Arntzen 455f00fe26 vkd3d: Log failures when signaling external events.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-29 17:21:20 +02:00
Hans-Kristian Arntzen 88978ab059 tests: Add SM 6.5 wave intrinsics test.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-29 16:58:07 +02:00
Hans-Kristian Arntzen a7e77fa777 tests: Add test for SM 6.4 packed arithmetic instructions.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-28 15:28:19 +02:00
Hans-Kristian Arntzen 4d97efc9d4 tests: Add test for SM 6.2 FP32 denorm attribute.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-28 15:28:19 +02:00
Hans-Kristian Arntzen 435a087047 vkd3d: Rework how shader model versions are exposed.
From native testing, we can expose higher shader models even if the
features behind cap bits are not supported. E.g. Polaris exposes SM 6.5, even
when 16-bit and barycentrics are not supported.

With latest dxil-spirv updates we can support the required SM 6.4
features.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-28 15:28:19 +02:00
Hans-Kristian Arntzen 760e8e1565 tests: Add tests for FP16 and how features are handled.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-28 15:28:19 +02:00
Hans-Kristian Arntzen 5b013d0b02 vkd3d: Validate shader meta against features.
We're supposed to validate and fail compilation if certain features are
not supported.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-28 15:28:19 +02:00
Hans-Kristian Arntzen 5df4a5c083 vkd3d-shader: Add 16-bit feature usage to meta.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-28 15:28:19 +02:00
Hans-Kristian Arntzen ab9e99cbfa vkd3d: Check for Int16 capability as well as extended subgroup types when exposing 16-bit ops.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-28 15:28:19 +02:00
Hans-Kristian Arntzen 27e0ca9bc1 dxil-spirv: Update submodule.
Add support for SM 6.2 denorm, SM 6.4 packed arith intrinsics.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-28 14:53:11 +02:00
Hans-Kristian Arntzen 6fd564db91 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-21 16:10:49 +02:00
Hans-Kristian Arntzen 229db9008a tests: Add test for SV_Barycentrics.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-21 14:34:29 +02:00
Hans-Kristian Arntzen cafe99e223 meta: Update version to 2.4.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-08 15:06:39 +00:00
Joshua Ashton 1d23bdbab7 vkd3d: Don't store pointer to QA info when not building with QA
This is entirely unnecessary and a waste of space as it will never be used.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-07-08 16:52:58 +02:00
Joshua Ashton a53a7f8d7c vkd3d-shader: Restrict descriptor-qa extras and logic to VKD3D_ENABLE_DESCRIPTOR_QA
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-07-08 16:52:58 +02:00
Joshua Ashton 309fc817e8 vkd3d: Fix RT local root signature interface flags
This was passing through the flags of the root signature, not its shader interface flags.

Need to get the shader interface flags of the root signature instead.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-07-08 16:52:58 +02:00
Hans-Kristian Arntzen 9197625fbf meta: Update CHANGELOG for 2.4.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-08 14:50:33 +02:00
Hans-Kristian Arntzen 4ed8931401 tests: Add test for ResolveSubresourceRegion.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-08 13:54:05 +02:00
Hans-Kristian Arntzen 29a9ccd356 vkd3d: Basic implementation of ResolveSubresourceRegion.
Used by DIRT5.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-08 13:54:05 +02:00
Hans-Kristian Arntzen f3c3e53f7a vkd3d: Add resolve mode argument to resolve helper.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-08 13:54:05 +02:00
Hans-Kristian Arntzen 591d47a6c5 vkd3d: Refactor out ResolveSubresource.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-08 13:54:05 +02:00
Hans-Kristian Arntzen 732d1dd234 vkd3d-shader: Reflect patch vertex count for DXIL HS.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:58:45 +02:00
Hans-Kristian Arntzen 4f3b4d1f79 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:58:45 +02:00
Hans-Kristian Arntzen 37e8f42f4a vkd3d: Move patch vertex count to meta struct.
Will make it easier to implement for DXIL.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:58:45 +02:00
Hans-Kristian Arntzen d19821ba84 vkd3d-shader: Change cs_workgroup_size type.
DXIL C API takes unsigned* not uint32_t*, avoid potential warnings.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:58:45 +02:00
Hans-Kristian Arntzen 3915090c12 vkd3d: Track depth-stencil image layouts over a command buffer.
Goal here is to avoid unnecessary image layout transitions when render
passes toggle depth-stencil PSO states. Since we cannot know which
states a resource is in, we have to be conservative, and assume that
shader reads *could* happen.

The best effort we can do is to detect when writes happen to a DSV
resource. In this scenario, we can deduce that the aspect cannot be
read, since DEPTH_WRITE | RESOURCE state is not allowed.

To make the tracking somewhat sane, we only promote to OPTIMAL if an
entire image's worth of subresources for a given aspect is transitioned.
The common case for depth-stencil images is 1 mip / 1 layer anyways.

Some other changes are required here:
- Instead of common_layout for the depth image, we need to consult the
  command list, which might promote the layout to optimal.
- We make use of render pass compatibility rules which state that we can
  change attachment reference layouts as well as initial/finalLayout.
  To make this change, a pipeline will fill in a
  vkd3d_render_pass_compat struct.
- A command list has a dsv_plane_optimal_mask which keeps track
  of the plane aspects we have promoted to OPTIMAL, and we know cannot
  be read by shaders.
  The desired optimal mask is (existing optimal | PSO write).
  The initial existing optimal is inherited from the command list's
  tracker.
- RTV/DSV/views no longer keep track of VkImageLayout. This is
  unnecessary since we always deduce image layout based on context.

Overall, this shows a massive gain in HZD benchmark (RADV, 1440p ultimate, ~16% FPS on RX 6800).

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:45:46 +02:00
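The mask rule stated in the commit above, "desired optimal mask is (existing optimal | PSO write)", can be sketched in a few lines of C. This is only an illustration: the plane bit names and the helper are hypothetical and are not taken from the vkd3d-proton sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical plane bits, for illustration only. */
enum
{
    PLANE_DEPTH = 1u << 0,
    PLANE_STENCIL = 1u << 1,
    PLANE_ALL = PLANE_DEPTH | PLANE_STENCIL,
};

/* Planes that may be treated as OPTIMAL for the next render pass:
 * planes the command list has already promoted, plus planes the PSO
 * writes (a written plane cannot be read by shaders at the same time). */
static uint32_t desired_optimal_mask(uint32_t existing_optimal, uint32_t pso_write)
{
    assert((existing_optimal & ~(uint32_t)PLANE_ALL) == 0);
    assert((pso_write & ~(uint32_t)PLANE_ALL) == 0);
    return existing_optimal | pso_write;
}
```

A plane only ever gets added to the mask here; dropping back out of OPTIMAL would be a separate step that this sketch does not model.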
Hans-Kristian Arntzen 515ed7fbd1 vkd3d: Make sure memory is available before changing image layout.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:45:46 +02:00
Hans-Kristian Arntzen 8f05ac298c vkd3d: Add implementation for plane optimal tracker.
The idea is to keep track of scenarios where a resource's aspect is
known to be in an OPTIMAL state. Based on this, we can override the image
layout from the common_layout in order to avoid unnecessary full
barriers.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:45:46 +02:00
Hans-Kristian Arntzen 1288d0f9b1 vkd3d: Remove obsolete all_aspect parameter.
For copies, we can always use the intended aspects, since we have
separate DS layouts now.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:31:52 +02:00
Hans-Kristian Arntzen c29a2d1fa8 tests: Add test for COLOR -> DEPTH copies.
Only had DEPTH -> COLOR.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:31:52 +02:00
Hans-Kristian Arntzen 68ce7bd324 vkd3d: Handle separate DS layout for destination copies.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:31:52 +02:00
Hans-Kristian Arntzen fbe6f4a210 tests: Make sure that we exercise separate DS clears in test suite.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:31:52 +02:00
Hans-Kristian Arntzen 81d472242b vkd3d: Clear single depth-stencil aspect correctly.
When clearing a DSV, we must get aliasing guarantees, so we must
transition away from UNDEFINED. This is only possible when using
separate_ds_layouts and for render pass clears we need to use
renderpass2 mechanisms to do this.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:31:52 +02:00
Hans-Kristian Arntzen a87d086a39 tests: Update min_lod test with TODOs which reflect existing impl.
An extension should be able to remove all the TODOs.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 12:50:23 +02:00
Hans-Kristian Arntzen 35c555c479 vkd3d: Use more correct fallback path for minLODClamp.
The clamp is absolute, not relative to baseMip. Also avoids validation
error and potential crash when LODClamp > numLevels.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 12:50:23 +02:00
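One way to read "the clamp is absolute, not relative to baseMip" is sketched below. It is an assumption about how such a fallback could narrow a view's mip range, not the code from this commit; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of the mip range a view ends up covering. */
struct mip_range
{
    uint32_t base_level;
    uint32_t level_count;
};

/* min_lod_clamp is an absolute mip index of the resource, so it is
 * compared against most_detailed_mip directly instead of being added
 * to it. The result is kept inside the view, which also covers the
 * case where the clamp exceeds the number of levels. */
static struct mip_range apply_min_lod_clamp(uint32_t most_detailed_mip,
        uint32_t view_levels, float min_lod_clamp)
{
    struct mip_range range = { most_detailed_mip, view_levels };
    uint32_t last_level, clamp_level;

    assert(view_levels > 0);
    last_level = most_detailed_mip + view_levels - 1;

    /* Clamp at or below the base mip (or NaN): nothing to do. */
    if (!(min_lod_clamp > (float)most_detailed_mip))
        return range;

    /* The fractional part is dropped; a view cannot express it. */
    if (min_lod_clamp >= (float)last_level)
        clamp_level = last_level;
    else
        clamp_level = (uint32_t)min_lod_clamp;

    range.base_level = clamp_level;
    range.level_count = last_level - clamp_level + 1;
    return range;
}
```

Dropping the fractional part is the reason a fallback like this can only approximate the D3D12 behavior.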
Joshua Ashton a361bcb0f8 tests: Add a test for MinLODClamp
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 12:20:24 +02:00
Joshua Ashton 61ccdb9037 vkd3d: Make invalid RTV for attachment FIXME_ONCE
This spams constantly in Dirt 5.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-07-07 11:49:18 +02:00
Hans-Kristian Arntzen cf632186fd vkd3d: Add workaround for MinLODClamp.
Not correct, will need spec additions to handle it properly.
Fixes ground rendering in DIRT 5.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-06 16:45:19 +02:00
Hans-Kristian Arntzen 1e4628376f tests: Add test for VBV stride edge cases.
Verifies that AMD native driver behaves oddly with stride < offset
cases.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-06 15:29:51 +02:00
Hans-Kristian Arntzen 55bbea5d29 tests: Test depth-stencil discards as well.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-02 15:18:16 +02:00
Hans-Kristian Arntzen 3090ae01c1 vkd3d: Support discarding single aspects as required.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-02 15:18:16 +02:00
Hans-Kristian Arntzen 398724cd6e vkd3d: Require VK_KHR_separate_depth_stencil_layouts.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-02 15:18:16 +02:00
Hans-Kristian Arntzen 419790ac77 vkd3d: Add wave size workaround for GravityMark.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-02 15:15:42 +02:00
Hans-Kristian Arntzen 92c4f861e7 vkd3d-shader: Report CS workgroup size metadata.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-02 15:15:42 +02:00
Hans-Kristian Arntzen 17fd01a2c8 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-02 15:15:42 +02:00
Hans-Kristian Arntzen 7a00e56792 vkd3d: Handle multiple planes in d3d12_resource_get_subresource_count.
Separate out an explicit per_plane query for the cases where we need it.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-02 14:16:18 +02:00
rochaudhari be2362268c vkd3d: Return format2 information for d3d12_device_CheckFeatureSupport
Currently only format1 information is being returned for D3D12_FORMAT_SUPPORT.

Signed-off-by: Roshan Chaudhari <rochaudhari@nvidia.com>
2021-07-02 14:07:39 +02:00
Hans-Kristian Arntzen 33edd1b926 tests: Ensure we hit viewport count 0 case in test suite.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-01 13:53:19 +02:00
Hans-Kristian Arntzen 3ea20a91ad vkd3d: Handle zero viewports.
This can be used for rasterizer discard, just bind dummy viewport and
scissor.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-01 13:53:19 +02:00
Hans-Kristian Arntzen cb5283b6fb vkd3d: Allow dynamic vertex stride == 0 to go through.
Eliminates all late pipeline compiles in Scarlet Nexus DX12 (and several
other games).

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-29 16:00:33 +02:00
Hans-Kristian Arntzen c1860a1ead vkd3d: Add VKD3D_CONFIG flags for forcing EXCLUSIVE queue modes.
Helps in some cases, but we cannot do this by default :(

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-29 12:24:24 +02:00
Joshua Ashton 5e3ec4337b vkd3d: Fix top-most handling when restoring from fullscreen
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-06-25 17:28:35 +02:00
Hans-Kristian Arntzen ba7c2b7c5f swapchain: Log window rects for leaving and entering fullscreen.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-25 08:01:26 -07:00
Paul Gofman ca2ae195fb swapchain: Update original_window_rect in d3d12_swapchain_SetFullscreenState().
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-25 08:01:26 -07:00
Hans-Kristian Arntzen 84f4b893ee swapchain: Use VK_CALL macro.
There's a mix and match of vk_procs-> and CALL conventions. Harmonize
this.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-25 15:18:27 +02:00
Hans-Kristian Arntzen b5023bab32 swapchain: Synchronize before resetting blit command buffer.
Randomly appears in GravityMark, odd that validation didn't find this in
other cases.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-25 15:18:27 +02:00
Hans-Kristian Arntzen 7c80c92304 vkd3d: Use ALLOW_VARYING_SUBGROUP_SIZE flag as appropriate.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-25 15:08:53 +02:00
Hans-Kristian Arntzen 12e0aa2a46 vkd3d-shader: Query if subgroup size is used.
Lets calling code know if it should use ALLOW_VARYING_SUBGROUP_SIZE.
To avoid too much churn on pipeline caches, only add the flag when
needed.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-25 15:08:53 +02:00
Hans-Kristian Arntzen d1286f5ae1 dxil-spirv: Add support for querying subgroup size usage.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-25 15:08:53 +02:00
Hans-Kristian Arntzen 8a82b718e4 tests: Remove TODO in stencil_export test.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-24 16:22:24 +02:00
Hans-Kristian Arntzen cc324cadd1 dxil-spirv: Support SV_StencilRef.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-24 16:22:24 +02:00
Hans-Kristian Arntzen 27fdc39e67 vkd3d: Be more robust with out of bounds clear/discard rects.
GravityMark ends up using ClearView with too large dimensions.
This is a validation error in Vulkan, so just clamp the extents.

To make full rect detection a bit more robust, do a range check instead
of memcmp().

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-24 16:18:38 +02:00
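The two ideas in the commit above, clamping an oversized rect and detecting a full-resource rect with a range check rather than memcmp(), might look roughly like this. The struct and helper names are hypothetical and only illustrate the approach.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical rect type, laid out like a D3D12_RECT. */
struct rect
{
    int32_t left, top, right, bottom;
};

static int32_t clamp_i32(int32_t v, int32_t lo, int32_t hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Clamp a clear/discard rect to the subresource dimensions so that the
 * resulting extent never exceeds the image. */
static struct rect clamp_rect(struct rect r, int32_t width, int32_t height)
{
    assert(width >= 0 && height >= 0);
    r.left = clamp_i32(r.left, 0, width);
    r.right = clamp_i32(r.right, 0, width);
    r.top = clamp_i32(r.top, 0, height);
    r.bottom = clamp_i32(r.bottom, 0, height);
    return r;
}

/* Range check: a rect that extends past the image still counts as
 * covering it, which an exact memcmp() against the full rect would miss. */
static bool rect_covers_full(const struct rect *r, int32_t width, int32_t height)
{
    return r->left <= 0 && r->top <= 0 && r->right >= width && r->bottom >= height;
}
```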
Hans-Kristian Arntzen 0a732a3b27 tests: Add DXIL path to test_atomic_instructions.
It exercises root descriptor atomics, so it's useful.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-24 15:23:52 +02:00
Hans-Kristian Arntzen d0dd116bea dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-24 15:23:52 +02:00
Georg Lehmann a7922a7c85 vkd3d: Introduce vkd3d_internal_get_vk_format.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-06-24 12:55:17 +02:00
Georg Lehmann 0d9c7bc3ad vkd3d: Index formats by format.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-06-24 12:55:17 +02:00
Georg Lehmann c915f237e3 vkd3d: Index depth stencil formats by format.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-06-24 12:55:17 +02:00
Georg Lehmann 1af017c284 include: Add some new dxgi formats.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-06-24 12:55:17 +02:00
Hans-Kristian Arntzen c108bec58f vkd3d: Fix trivial indentation nit.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:41:09 +02:00
Hans-Kristian Arntzen 9900301886 vkd3d: Use read-write lock for fallback pipeline cache.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:41:09 +02:00
Hans-Kristian Arntzen bb723e859b vkd3d: Use read-write locks for render pass cache.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:41:09 +02:00
Hans-Kristian Arntzen 5fe135f3fb vkd3d: Ensure shader visibility happens for DEPTH_READ | RESOURCE scenarios.
If we're doing a layout transition of depth-stencil aspects, we need to ensure all potential
accesses are made visible.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:32:48 +02:00
Hans-Kristian Arntzen 8225edc726 vkd3d: Rewrite resource state implementation.
- Honor resource barriers for resource states which cannot automatically
  decay or promote. This includes COLOR_ATTACHMENT, UNORDERED_ACCESS and
  VRS image. If SIMULTANEOUS_ACCESS is used, we can still promote, and
  we handle that by setting common layout to GENERAL for these resources.

- Avoid redundant barriers in render passes since normal resource
  barriers will always make sure we are already in
  COLOR_ATTACHMENT_OPTIMAL.

- Do not force GENERAL layout if resource has UNORDERED_ACCESS flag set.
  As this is not a promotable state, we have to explicitly transition
  into it. I tested this on validation layers, where even COMMON state
  refuses to promote to UAV state. The exception here of course is
  SIMULTANEOUS_ACCESS, but we handle that properly now.

- Verify that UAV or SIMULTANEOUS access is not used together with DSV
  state. This is explicitly banned in the API docs.

- Actually emit image barriers. Batch the image transitions as that's
  what D3D12 docs encourage app developers to do, and it also expects
  that drivers can optimize this. Ensure that we respect the in-order
  resource barrier rules by splitting batches if there are overlaps in
  the transitions.

- Ensure that correct image layout is used when clearing a suspended
  render pass attachment.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:32:48 +02:00
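The batching rule described in the rewrite above (collect image transitions, but split the batch when a new transition overlaps one already queued so that in-order semantics hold) can be sketched as follows. All types, names and the batch capacity are hypothetical; the real implementation records Vulkan image barriers rather than counting flushes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BATCH 16 /* arbitrary capacity for this sketch */

/* Hypothetical record of one subresource range being transitioned. */
struct transition
{
    uint32_t resource_id;
    uint32_t first_subresource;
    uint32_t subresource_count;
};

struct barrier_batch
{
    struct transition items[MAX_BATCH];
    size_t count;
    unsigned int flushes; /* stands in for emitting one barrier command */
};

static bool transitions_overlap(const struct transition *a, const struct transition *b)
{
    return a->resource_id == b->resource_id
            && a->first_subresource < b->first_subresource + b->subresource_count
            && b->first_subresource < a->first_subresource + a->subresource_count;
}

static void batch_flush(struct barrier_batch *batch)
{
    if (batch->count)
    {
        batch->flushes++;
        batch->count = 0;
    }
}

/* Queue a transition. If it touches a range that is already queued, the
 * pending batch is flushed first so the two transitions stay ordered. */
static void batch_add(struct barrier_batch *batch, const struct transition *t)
{
    size_t i;

    for (i = 0; i < batch->count; i++)
    {
        if (transitions_overlap(&batch->items[i], t))
        {
            batch_flush(batch);
            break;
        }
    }

    if (batch->count == MAX_BATCH)
        batch_flush(batch);

    assert(batch->count < MAX_BATCH);
    batch->items[batch->count++] = *t;
}
```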
Hans-Kristian Arntzen 177679a766 vkd3d: Add VKD3D_RESOURCE_SIMULTANEOUS_ACCESS.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:32:48 +02:00
Hans-Kristian Arntzen 02398c4eef vkd3d: Normalize depth-stencil layouts if only one aspect is used.
Avoid using the separate layouts if we're only using formats with one
aspect. This makes it more likely to match layouts with common layout,
and we can avoid awkward transition barriers.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:32:48 +02:00
Philip Rebohle 014a3c0b94 vkd3d: Handle plane slice index in descriptor creation.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-06-21 21:23:03 +02:00
Samuel Pitoiset bf04b324c6 vkd3d: remove a few occurrences of RADV/ACO
We recently dropped this from Mesa because ACO is the default
compiler since August 2020, so it's implicit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2021-06-18 16:11:26 +02:00
Samuel Pitoiset 72d9b322b8 vkd3d: reject creating a resource that is placed if the heap is too small
The spec is pretty clear that it's invalid usage. Return E_INVALIDARG
like native drivers.

This is a workaround for the inventory GPU hang with Cyberpunk 2077
which is actually a game bug. Luckily the game handles this error
properly.

The problem is that the game always assumes that an image with 2 mips
is smaller than the same image but with 6 mips. This is not always
true if the swizzle mode is different and a recent Mesa update changed
that. Then the game creates a D3D12 heap that is too small and this
triggered a memory violation and then a GPU hang with RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2021-06-17 16:42:23 +02:00
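The validation described in the commit above amounts to a bounds check of the placed resource against its heap. A minimal sketch, assuming the resource size has already been queried from the driver; the helper name is hypothetical, and a caller would map a false result to E_INVALIDARG.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a resource of resource_size bytes placed at heap_offset
 * fits entirely inside a heap of heap_size bytes. The comparison is
 * written so that heap_offset + resource_size cannot overflow. */
static bool placed_resource_fits(uint64_t heap_size, uint64_t heap_offset,
        uint64_t resource_size)
{
    return heap_offset <= heap_size && resource_size <= heap_size - heap_offset;
}
```

In the Cyberpunk 2077 case from the commit message, the heap was sized for the 6-mip image while the 2-mip image needed more memory, so a check of this kind fails and the game takes its error path instead of hanging the GPU.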
Hans-Kristian Arntzen 1ea31701c5 vkd3d: Move F1 2020 workaround over to quirks system.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-17 16:42:14 +02:00
Hans-Kristian Arntzen 28c8a595fa vkd3d: Pass down shader quirks for Necromunda.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-17 16:42:14 +02:00
Hans-Kristian Arntzen cb61a4c83a vkd3d-shader: Implement sample explicit LOD override.
In control flow, we can force LOD 0.0 to avoid undefined result when
games sample with implicit LOD in non-quad uniform control flow.

Behavior on different implementations is:
- Helper lanes come to life and interpolate shader input.
- LOD is clamped to 0.0 in divergent control flow.

This hack is not safe in general, since we force 0.0 even when the
control flow is quad uniform.

This is the most practical solution for the problem for now.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-17 16:42:14 +02:00
Hans-Kristian Arntzen a08e493a3a vkd3d-shader: Add interface for shader workarounds.
Don't really have much of a choice for the short term. :\

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-17 16:42:14 +02:00
Hans-Kristian Arntzen 4c101a4e81 vkd3d-shader: Keep track of early returns.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-17 16:42:14 +02:00
Hans-Kristian Arntzen 9207d4f019 vkd3d: Ignore BlendEnable if write mask is 0.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-17 16:42:14 +02:00
Hans-Kristian Arntzen 8589a425fe vkd3d-shader: Emit NoContraction for MAD/DFMA.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-17 16:01:46 +02:00
Hans-Kristian Arntzen 5c971f216e vkd3d: Invalidate binding state on query resolve.
Fixes random broken AO in Necromunda on RADV.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-17 15:59:05 +02:00
Hans-Kristian Arntzen 7ab0846242 tests: Add test for placed resource runtime validation.
Runtime validates resource size.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-17 15:53:35 +02:00
Philip Rebohle 6d1d60e898 tests: Test tile mappings for 3D textures.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-06-14 15:53:33 +02:00
Philip Rebohle 14617a7bb2 tests: Test resource tiling for 3D textures.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-06-14 15:53:33 +02:00
Philip Rebohle b97a012787 vkd3d: Enable tiled resources tier 3.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-06-14 15:53:33 +02:00
Hans-Kristian Arntzen 42fb018d85 vkd3d: Fix leak of command pools on device destruction.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-11 15:17:45 +02:00
Hans-Kristian Arntzen d7843fa012 vkd3d: Fix potential deadlock in debug ring.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-11 11:04:38 +02:00
Hans-Kristian Arntzen 58854b0a9c vkd3d: Fix potential deadlock in descriptor QA checks.
If we destroy device right after creating it, we risk a deadlock.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-11 11:04:38 +02:00
Hans-Kristian Arntzen 76a8914d6b vkd3d: Add validation error workaround.
Our internal copy shaders are fine, but we get benign errors about
sample count being wrong since we alias descriptors.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-10 14:19:04 +02:00
Hans-Kristian Arntzen abe0995e88 vkd3d: Use correct allocation size for memory block.
We cannot use the memory requirement output, since we will zero-clear
memory with a size that might be larger than the VkBuffer size.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-10 14:16:01 +02:00
Hans-Kristian Arntzen fda8cba2b8 tests: Add missing resource barrier to some tests.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-10 13:57:32 +02:00
Hans-Kristian Arntzen 8056a71415 tests: Fix wrong resource state in test_bufinfo_instruction.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-10 13:57:32 +02:00
Hans-Kristian Arntzen 3c6174cafc tests: Fix type mismatch in test_draw_uav_only.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-10 13:57:32 +02:00
Hans-Kristian Arntzen b922292852 vkd3d: Fix view object leak when creating fallback UAV clear view.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-10 13:50:54 +02:00
rochaudhari 1699743c37 vkd3d: Enable binary import and image view handle extensions
Signed-off-by: Roshan Chaudhari <rochaudhari@nvidia.com>

Reviewed-by: Liam Middlebrook <lmiddlebrook@nvidia.com>
2021-06-10 11:26:34 +02:00
rochaudhari ba997f0736 vk-headers: Update subprojects/Vulkan-Headers to 1.2.180
This is needed for VK_NVX_binary_import and VK_NVX_image_view_handle.

Signed-off-by: Roshan Chaudhari <rochaudhari@nvidia.com>

Reviewed-by: Liam Middlebrook <lmiddlebrook@nvidia.com>
2021-06-10 11:26:34 +02:00
conor42 3b1f34217c vkd3d-shader: Fix a bug in constant double vector handling.
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
2021-06-09 20:54:02 -07:00
conor42 2ad16f89d3 tests: Modify dadd test to use a double2 vector.
Tests a codepath in vkd3d_dxbc_compiler_get_constant() where
component_count != 1.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
2021-06-09 20:54:02 -07:00
Hans-Kristian Arntzen 20a96cab57 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-08 15:35:20 +02:00
Hans-Kristian Arntzen a09819250f tests: Add null descriptor mismatch type test.
Verifies that we splat null descriptors appropriately.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-07 13:21:31 +02:00
Hans-Kristian Arntzen 47f978fec3 tests: Test clearing a NULL UAV.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-07 13:21:31 +02:00
Hans-Kristian Arntzen 9983a1720f vkd3d: Splat null descriptors to all sets.
Some games end up writing the wrong descriptor type when using null
descriptors, and to be robust against that, we have to clear out
all descriptors when creating null descriptors.

If we copy a null descriptor, we will also have to copy from all sets.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-07 13:21:31 +02:00
Hans-Kristian Arntzen 969776c1f8 vkd3d: Ignore NULL descriptor ClearUAV.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-07 13:21:31 +02:00
Hans-Kristian Arntzen c7c17d05ed vkd3d: Fix descriptor QA checks for CBV_AS_SSBO.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-07 13:21:06 +02:00
Hans-Kristian Arntzen ec5b4ccecf vkd3d: Ensure that swapchain is eventually recreated.
Latch SUBOPTIMAL state.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-02 19:46:05 +02:00
Joshua Ashton efa0eccc59 vkd3d: Low latency presentation and acquire semaphores
In cases where acquire image is blocking, we should call that after
presentation to avoid latency when the app calls present.

This avoids weird inverse frame cadences with Mesa WSI right now,
as acquiring an image is always a blocking call until it is complete.

In cases when we aren't blocking, this kicks off the acquisition so
it can be waited upon by the next present blit pass.

Use another set of semaphores to wait for the image acquisition on the
GPU.

In the non-blocking vkAcquireNextImageKHR case, this means that a
potential bubble of time between waiting on the fence and submitting
the blit + presentation is eliminated.

Runaway presentation in this setup is avoided by frame latency objects
and normal frame latency which is always 3 according to documentation.

Be careful about handling SUBOPTIMAL. Semaphores will be signaled, but
we might want to tear down the swapchain. In these cases, we need to
wait for the semaphore to be signaled first, which can only be done by
submitting a wait, since QueueWaitIdle or DeviceWaitIdle don't cover
WSI.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Co-authored-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-02 19:46:05 +02:00
Joshua Ashton 92ed98ccea vkd3d: Handle frame latency without WAITABLE_OBJECT
Documentation says that this should always be 3 without WAITABLE_OBJECT
unlike in D3D11 where it will use the DXGI device's frame latency.

This stops runaway presentations in the non-blocking acquire image case
with the new semaphore setup.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-06-02 19:46:05 +02:00
Hans-Kristian Arntzen 6f5f55c84a vkd3d: Avoid oldSwapchain.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-02 19:46:05 +02:00
Hans-Kristian Arntzen 582138b063 tests: Fix Clear UAV test constant.
Was using 0x8000 / 0xffff instead of 0x200 / 0x3ff, rounded differently
on NV.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-28 17:42:00 +02:00
Hans-Kristian Arntzen 616538aa47 tests: Add missing UAV barrier in bindless counter test.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-28 15:56:18 +02:00
Hans-Kristian Arntzen fee18f1820 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-28 15:41:49 +02:00
Hans-Kristian Arntzen a83c99ba77 vkd3d-shader: Don't apply offset buffers for non-bindless resources.
Fixes root descriptors when BDA support is disabled.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-27 23:30:51 +02:00
Hans-Kristian Arntzen 1a7b470681 tests: Add clear UAV test for RGB10A2 format.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-27 15:34:53 +02:00
Hans-Kristian Arntzen 32a2bd65f9 tests: Remove TODOs in ClearUAV.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-27 15:34:53 +02:00
Hans-Kristian Arntzen fa471962dc vkd3d: Mask clear color in ClearUAVUint.
Fixes test TODOs. Apparently Vulkan drivers can saturate here, which
caused the TODO to appear, at least on AMD Windows.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-27 15:34:53 +02:00
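Masking a uint clear color to each component's bit width, as opposed to letting the driver saturate it, could be sketched like this. The helper names and the per-component bit table are assumptions for illustration; RGB10A2 is used as the example format because it is the one covered by the test added above.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low `bits` bits of a component value. */
static uint32_t mask_component(uint32_t value, unsigned int bits)
{
    assert(bits >= 1 && bits <= 32);
    return bits == 32 ? value : value & ((1u << bits) - 1u);
}

/* Mask all four components of a clear color in place.
 * bits[] holds the bit width of each component, e.g. {10, 10, 10, 2}
 * for an RGB10A2 format. */
static void mask_clear_color(uint32_t color[4], const unsigned int bits[4])
{
    unsigned int i;

    for (i = 0; i < 4; i++)
        color[i] = mask_component(color[i], bits[i]);
}
```

With masking, a value such as 0x400 in a 10-bit channel becomes 0, whereas saturation would have produced 0x3ff.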
Hans-Kristian Arntzen 3c7f188863 vkd3d: Nuke code paths for !nullDescriptor.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-27 10:39:22 +02:00
Hans-Kristian Arntzen 7bf93b844d vkd3d: Require VK_EXT_robustness2.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-27 10:39:22 +02:00
Hans-Kristian Arntzen 0b8490a6b9 meta: DXBC descriptor QA is supported.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00
Hans-Kristian Arntzen a3fb2f1cd6 vkd3d-shader: Opt-in to early fragment tests with descriptor QA.
Since we introduce side effects, avoid full late-Z for everything, which
is slow, and not necessarily correct either.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00
Hans-Kristian Arntzen 077740f15c vkd3d-shader: Implement descriptor QA for DXBC as well.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00
Hans-Kristian Arntzen e60fab591b vkd3d: Add more enums/name LUTs for descriptor QA.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00
Hans-Kristian Arntzen 3470feceb4 meta: Add descriptor_qa_checks to README.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00
Hans-Kristian Arntzen a256a9266e vkd3d: Rewrite descriptor QA.
Adds support for GPU-assisted validation of descriptor usage in the
CBV_SRV_UAV heap.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00
Hans-Kristian Arntzen 815277e392 vkd3d: Add data structures for descriptor QA.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00
Hans-Kristian Arntzen 5e67d30883 vkd3d: Add config option for descriptor QA.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00
Hans-Kristian Arntzen 0d5f1d7784 vkd3d-shader: Add way to pass down descriptor QA buffers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00
Hans-Kristian Arntzen b49df76367 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00
Hans-Kristian Arntzen c7d9faedea vkd3d: Add atomic OR support.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00
Hans-Kristian Arntzen 96a84e2633 vkd3d: Fix build with DESCRIPTOR_QA.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00
Joshua Ashton 925a930d1e vkd3d: Fix missing trace arg in SetPipelineStackSize
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-05-20 11:36:21 +02:00
Joshua Ashton 2a82358c3f build: Don't strip binaries when doing a --dev-build
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-05-19 17:29:13 +02:00
Hans-Kristian Arntzen 9d405f0366 vkd3d: Don't try to use fallback SRV aux heap.
DXR requires buffer_device_address, so it's meaningless to attempt a
fallback.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-13 08:25:10 +01:00
David McCloskey 1744896142 vkd3d: Fix for freeing memory created with aligned_malloc
Signed-off-by: David McCloskey <davmcclo@gmail.com>
2021-05-07 06:42:12 +01:00
David McCloskey 217ffc27d2 vkd3d: Type error fix for d3d12_device_get_query_pool.
Signed-off-by: David McCloskey <davmcclo@gmail.com>
2021-05-07 06:41:59 +01:00
David McCloskey 09f5366941 build: clang-cl support for native Windows builds.
Signed-off-by: David McCloskey <davmcclo@gmail.com>
2021-05-07 06:41:39 +01:00
Hans-Kristian Arntzen 8734589e92 dxil-spirv: Update submodules.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-05 14:47:16 +02:00
Hans-Kristian Arntzen 47cae1095e tests: Test copying depth-stencil to color.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-05 00:11:10 +02:00
Hans-Kristian Arntzen 43bf0ed8c1 vkd3d: Ensure SAMPLED | COLOR_ATTACHMENT for R8_TYPELESS.
Needed for stencil -> color copies potentially.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-05 00:11:10 +02:00
Hans-Kristian Arntzen 4f0872152a meta: Add fs_copy_uint path.
For stencil -> color copies.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-05 00:11:10 +02:00
Hans-Kristian Arntzen ef5ad082a0 vkd3d: More precise logging for fallback copy fixmes.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-05 00:11:10 +02:00
Hans-Kristian Arntzen 0e93af9700 vkd3d: Handle multiple planes in subresource conversion for copies.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-05 00:11:10 +02:00
Hans-Kristian Arntzen e02031220a dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-04 15:41:37 +02:00
Georg Lehmann a411256c7f vkd3d: Enable and require shaderDrawParameters.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-04-29 06:48:37 +01:00
Joshua Ashton 3ed3526332 meson: Update to version 2.3.1
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-27 15:03:38 +01:00
Joshua Ashton 7d123c4774 meta: Update CHANGELOG for 2.3.1
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-27 15:03:38 +01:00
Joshua Ashton 1267b2a985 build: Fix installing vkd3d-proton when Wine is built without vkd3d
This would fail previously as Wine does not have d3d12.dll for us to make .old

Closes: #559
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-27 12:54:44 +01:00
Joshua Ashton 68d5510bdf build: Avoid Wine Mono and Gecko installs in the setup script
Co-authored-by: Alexis Peypelut <iroalexis@outlook.fr>
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-27 12:54:44 +01:00
Hans-Kristian Arntzen c7890219e7 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-27 12:21:21 +02:00
Georg Lehmann b858f8a478 vkd3d: Don't error out if vkGetPhysicalDeviceFragmentShadingRatesKHR isn't found.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-04-24 18:23:46 +01:00
Hans-Kristian Arntzen 26584b4d7c meson: Update to version 2.3.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-22 17:28:44 +02:00
Hans-Kristian Arntzen f4afcabed8 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-22 17:28:35 +02:00
Hans-Kristian Arntzen 99a180f7a1 vkd3d-utils: Fix .def version.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-22 15:30:58 +01:00
Joshua Ashton 364402c5ac meta: Add some extra stuff to the 2.3 CHANGELOG
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-21 16:30:26 +02:00
Hans-Kristian Arntzen be1b941e06 vkd3d: Workaround buggy NV driver in sparse update.
test_update_tile_mappings fails if we don't do this.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-21 16:29:05 +02:00
Hans-Kristian Arntzen 701ea350e1 meta: Update CHANGELOG for 2.3.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-21 13:04:19 +02:00
Georg Lehmann 0d727274f9 build: Conditionally enable --quiet for glslang.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-04-21 13:01:17 +02:00
Joshua Ashton 3d0913dc19 d3d12: Initialize optional extensions
This was missed and somehow magically worked fine when running the test suite for me.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-21 00:48:43 +02:00
Joshua Ashton 0e7e6e9520 build: Make package-release version independent
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-20 18:25:41 +02:00
Joshua Ashton 3118696706 vkd3d-utils: Bump SONAME version to 3.0.0
We made breaking ABI changes.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-20 18:25:41 +02:00
Joshua Ashton 911a202bd1 vkd3d: Bump SONAME version to 3.0.0
We made breaking ABI changes.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-20 18:25:41 +02:00
Joshua Ashton 1970051e7a tests: Use vkGetPhysicalDeviceProperties2 in d3d12_crosstest
We require Vulkan 1.1.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-20 18:25:41 +02:00
Joshua Ashton a3ad7cae90 vkd3d-shader: Remove type/next from interface structures
This was never really used for anything useful.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-20 18:25:41 +02:00
Joshua Ashton 0c8349cb8e vkd3d-shader: Remove vkd3d_shader_domain_shader_compile_arguments
This is never used by anything, and all the info is in the shader anyway.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-20 18:25:41 +02:00
Joshua Ashton 220e1146ee vkd3d-shader: Make vkd3d_shader_transform_feedback_info a member
Moves it into vkd3d_shader_interface_info, this doesn't need to be
a pNext.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-20 18:25:41 +02:00
Joshua Ashton 3e4a8b1504 vkd3d: Remove type/next from vkd3d device/instance structures
There's really no reason to overcomplicate adding optional extensions this way.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-20 18:25:41 +02:00
Joshua Ashton bd988f2b74 vkd3d: Remove vkd3d_optional_device_extensions_info
Roll this into vkd3d_device_create_info, no need for this to be a pNext thing.

Additionally, fix some memory leaks on device creation failure.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-20 18:25:41 +02:00
Joshua Ashton 59148c1932 vkd3d: Remove vkd3d_optional_instance_extensions_info
Roll this into vkd3d_instance_create_info, no need for this to be a pNext thing.

Additionally, fix some memory leaks on instance creation failure.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-20 18:25:41 +02:00
Philip Rebohle f06f94bfb4 vkd3d: Enable multi_queue by default.
And replace option with a single_queue flag to do the opposite.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-04-19 16:40:49 +02:00
Hans-Kristian Arntzen 91dc8249f2 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-19 14:01:33 +02:00
Hans-Kristian Arntzen afb2067d72 tests: Test that we can safely read ClipDistance in DS.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-19 12:38:37 +01:00
Joshua Ashton 1761cf3aa1 tests: Add a SV_ClipDistance in HULL shader test
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-19 13:19:44 +02:00
Joshua Ashton 07e801192f vkd3d-shader: Resolve arguments to variable before passing to epilogue
Otherwise we pass in a pointer which is bad, or a local value which is also illegal for some reason.

It has to be a "memory object declaration".

Found via spirv-val.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-19 13:19:44 +02:00
Joshua Ashton 4470ec63cc vkd3d-shader: Don't emit builtin clip/cull arrays for hull shaders
There are no output built-ins here, just per-vertex stuff passed directly to DS to deal with there.

Closes: #227

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-19 13:19:44 +02:00
Joshua Ashton 94a9719557 vkd3d-shader: Rename vkd3d_dxbc_compiler_emit_shader_signature_outputs to vkd3d_dxbc_compiler_emit_clip_cull_outputs
This only ever emits these.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-19 13:19:44 +02:00
Joshua Ashton 000407d74c vkd3d-shader: Enable Clip/Cull distance capabilities
Found via spirv-val.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-19 13:19:44 +02:00
Georg Lehmann 21dabb315d vkd3d: Unify _mm_pause detection.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-04-19 12:05:12 +02:00
Hans-Kristian Arntzen c7eb6fdf61 vkd3d: Add some tracing to help narrow down compiler crashes.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-15 16:24:05 +02:00
Hans-Kristian Arntzen 6292078433 vkd3d-shader: Return INVALID_ARGUMENT instead of SHADER.
For invalid bindings, we expect E_INVALIDARG in D3D12.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-15 16:24:05 +02:00
Hans-Kristian Arntzen 744497274c vkd3d-shader: Verify that we compile expected shader stage.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-15 16:24:05 +02:00
Hans-Kristian Arntzen 8f17fdd1fa vkd3d: Don't leak pipeline cache if we fail compile.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-15 16:24:05 +02:00
Hans-Kristian Arntzen 4925495e0a tests: Verifies behavior if we pass mismatching stages.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-15 16:24:05 +02:00
Hans-Kristian Arntzen 70f3f769a5 tests: Add test which verifies what happens with missing RS bindings.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-15 16:24:05 +02:00
Hans-Kristian Arntzen e7b6cf4089 vkd3d-shader: Report error if binding is not found in root signature.
Error out early.

Fixes some crashes when we keep going after having seen completely
broken bindings.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-15 16:24:05 +02:00
Georg Lehmann 2c3988e6df tests: Add env var to exclude tests.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-04-15 15:10:25 +02:00
Philip Rebohle 48536b2222 tests: Test command allocator reset behaviour with bundles.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-04-14 16:24:55 +02:00
Philip Rebohle 3fbce3c450 tests: Do not skip test_bundle_state_inheritance test.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-04-14 16:24:55 +02:00
Philip Rebohle 62cbf3d78a vkd3d: Remove unused unsafe_impl_from_ID3D12CommandAllocator.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-04-14 16:24:55 +02:00
Philip Rebohle 4f9ca6c3df vkd3d: Create bundles and bundle allocators as necessary.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-04-14 16:24:55 +02:00
Philip Rebohle 1bbbabcb94 vkd3d: Implement ExecuteBundle.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-04-14 16:24:55 +02:00
Philip Rebohle 728ce6c370 vkd3d: Validate command list type in ExecuteCommandLists.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-04-14 16:24:55 +02:00
Philip Rebohle 1990270bbb vkd3d: Implement CreateCommandList on top of CreateCommandList1.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-04-14 16:24:55 +02:00
Philip Rebohle 2ca62ecd12 vkd3d: Add bundle allocator and command list implementation.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-04-14 16:24:55 +02:00
Joshua Ashton 2860b0a548 vkd3d: Enable force_tgsm_barriers for F1 2020
Signed-off-by: Joshua Ashton <joshua@froggi.es>

Closes: #611
2021-04-12 16:29:57 +02:00
Joshua Ashton 043fd304f8 vkd3d-shader: Add force_tgsm_barriers config flag
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 16:29:57 +02:00
Joshua Ashton 7cfe17d2f5 vkd3d-shader: Passthrough vkd3d_config_flags
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 16:29:57 +02:00
Joshua Ashton 41df41305e include: Move vkd3d_config_flags to public header
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 16:29:57 +02:00
Joshua Ashton bc87d60ad8 tests: Add a test for RSSetShadingRateImage
Passes on D3D12 and VKD3D-Proton.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 13:22:01 +02:00
Joshua Ashton 3284f062de tests: Fix comparisons in test_vrs
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 13:22:01 +02:00
Joshua Ashton 82a1dc22a2 tests: Add a SV_ShadingRate test
Tests both VS primitive rate and the PS input.

Fails currently on Windows as both vendors have broken combiner
logic in their D3D12 drivers right now.
NV: Fails right now on Min/Max when mixing 2x1 and 1x2.
AMD: Everything is broken. Did they even test this?

Tests pass with Vulkan/vkd3d-proton and the expected values
are based on the D3D12 spec/docs around VRS.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 13:22:01 +02:00
Joshua Ashton 14cef6cf3f tests: Enable D3D12ExperimentalShaderModels
We need these for VRS shader tests and probably also for RT.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 13:22:01 +02:00
Joshua Ashton 4e20fd2f58 include: Define D3D12EnableExperimentalFeatures
Including associated UUIDs.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 13:22:01 +02:00
Joshua Ashton 5978f5958e vkd3d: Expose TIER_2 Variable Rate Shading
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 13:22:01 +02:00
Joshua Ashton 9fb624a429 vkd3d: Implement RSSetShadingRateImage
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 13:22:01 +02:00
Joshua Ashton 5d17f71441 vkd3d: Handle usage and implicit views for VRS capable resources
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 13:22:01 +02:00
Joshua Ashton 135c7332e4 vkd3d: Implement D3D12_RESOURCE_STATE_SHADING_RATE_SOURCE
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 13:22:01 +02:00
Joshua Ashton 601357c7c5 vkd3d: Implement a static pipeline variant system
Needed so we can switch between having a VRS and non-VRS attachment on the fly.
Extensible enough for this to work for other things down the line also.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 13:22:01 +02:00
Fabian Bornschein eb4909ea67 build: Switch to the portable shebang in scripts
Signed-off-by: Fabian Bornschein <fabiscafe@mailbox.org>
2021-04-12 11:26:39 +01:00
Philip Rebohle 4e777b9182 vkd3d: Use depth attachment when depth bounds test is enabled.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-04-12 11:23:51 +02:00
Hans-Kristian Arntzen 7dc2a5cad7 vkd3d: Enable VK_KHR_sampler_mirror_clamp_to_edge.
CP77 requires it now.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-07 21:57:50 +02:00
Philip Rebohle a0a04f9488 tests: Test root signature priority.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-04-06 11:13:35 +02:00
Philip Rebohle 2f1b23ece6 vkd3d: Enable conservative rasterization tier 3.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-25 18:00:59 +01:00
Philip Rebohle 6476fabb0b vkd3d-shader: Implement support for SV_InnerCoverage.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-25 18:00:59 +01:00
Philip Rebohle 698279ec90 vkd3d: Enable conservative rasterization state as requested.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-25 18:00:59 +01:00
Philip Rebohle 8a61128152 vkd3d: Enable VK_EXT_conservative_rasterization if available.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-25 18:00:59 +01:00
Philip Rebohle 9c8377c2d4 tests: Add test for conservative rasterization.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-25 18:00:59 +01:00
Philip Rebohle fdf4df18a4 vkd3d: Add Feature Level 12_2 detection.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-25 18:00:59 +01:00
Hans-Kristian Arntzen 0a8b5bca4e dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-25 18:00:41 +01:00
Hans-Kristian Arntzen 2f60a3bf66 vkd3d: Fix broken debug_vk_memory_{property,heap}_flags.
C is fun, yo. Returned data from dead stack variable, also triggered
overflow in some cases.

Uncalled in release mode, but can crash debug builds.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-25 17:58:18 +01:00
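The failure mode described in 2f60a3bf66 (returning data that lives in a dead stack variable, and overflowing it) can be illustrated with a minimal sketch. The function name, flag names, and flag values below are hypothetical and are not taken from vkd3d; the only point is that the caller owns the buffer and every write is bounded.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical flag bits, for illustration only. */
#define DEMO_FLAG_DEVICE_LOCAL  0x1u
#define DEMO_FLAG_HOST_VISIBLE  0x2u
#define DEMO_FLAG_HOST_COHERENT 0x4u

/* Formats the set flag names into a caller-owned buffer and returns it.
 * Nothing is returned from this function's own stack frame, and output
 * that does not fit is truncated instead of overflowing. */
const char *demo_debug_flags(unsigned int flags, char *buffer, size_t size)
{
    static const struct { unsigned int bit; const char *name; } names[] =
    {
        { DEMO_FLAG_DEVICE_LOCAL,  "DEVICE_LOCAL" },
        { DEMO_FLAG_HOST_VISIBLE,  "HOST_VISIBLE" },
        { DEMO_FLAG_HOST_COHERENT, "HOST_COHERENT" },
    };
    size_t offset = 0;
    size_t i;

    if (!size)
        return "";
    buffer[0] = '\0';

    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
    {
        int written;

        if (!(flags & names[i].bit))
            continue;

        written = snprintf(buffer + offset, size - offset, "%s%s",
                offset ? " | " : "", names[i].name);
        /* snprintf always NUL-terminates; stop once the output is truncated. */
        if (written < 0 || (size_t)written >= size - offset)
            break;
        offset += (size_t)written;
    }
    return buffer;
}
```

A caller would typically pass a small array on its own stack, so the string stays valid for as long as the caller needs it.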
Joshua Ashton fe28436c34 vkd3d: Refactor vkd3d_render_pass_key to use flags
We're going to need more state in this key for VRS TIER_2 and we need to keep this aligned.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-24 15:20:10 +01:00
Joshua Ashton f812442199 meta: Add VK_KHR_create_renderpass2 to README
This is required now.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-24 15:20:10 +01:00
Joshua Ashton 65b13f6cd6 vkd3d: Use VK_KHR_create_renderpass2
We need this before implementing TIER_2 variable rate shading.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-24 15:20:10 +01:00
Hans-Kristian Arntzen e89dd8cf87 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-23 19:15:36 +01:00
Hans-Kristian Arntzen 93d042f9ce dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-23 18:35:35 +01:00
Hans-Kristian Arntzen 5197edb03b vkd3d: Enable 16-bit storage features.
Don't need extension, since VK_KHR_16bit_storage is core in Vulkan 1.1.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-23 18:35:35 +01:00
Hans-Kristian Arntzen 4afd4d355b vkd3d: Handle more DXR cases.
Found in Ghostrunner, still not working ...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-23 18:35:35 +01:00
Hans-Kristian Arntzen e0374d735d vkd3d-shader: Add shader replacement support for DXR as well.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-23 18:35:35 +01:00
Hans-Kristian Arntzen 5abc4b9af2 vkd3d: Add all relevant RT stages to push constant layout.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-23 18:35:35 +01:00
Hans-Kristian Arntzen 9d3603c336 vkd3d: Fix root descriptor RTAS.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-23 18:35:35 +01:00
Hans-Kristian Arntzen 9fa668867e vkd3d: Hold private reference to collection objects.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-23 18:35:35 +01:00
Hans-Kristian Arntzen bd16d1a88d vkd3d: Support RTPSO object collections.
This is quite complicated, but we can use VK_KHR_pipeline_library
to implement this functionality.
2021-03-23 18:35:35 +01:00
Hans-Kristian Arntzen 89679cbff1 tests: Test local definition exports.
Attempts to create a hit group out of shaders found in collection objects.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-23 18:35:35 +01:00
Hans-Kristian Arntzen b306d605f3 tests: Add basic RT collection test.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-23 18:35:35 +01:00
Joshua Ashton bc1b18dc02 vkd3d: Add some missing flags in debug_vk helpers
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-22 14:29:45 +01:00
Joshua Ashton d97683a8a4 d3d12: Rename d3d12_get_physical_device to d3d12_find_physical_device
A more accurate description of what's going on here.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-22 14:29:45 +01:00
Joshua Ashton 9f778bc871 d3d12: Use vkGetPhysicalDeviceProperties2
This is core in Vulkan 1.1.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-22 14:29:45 +01:00
Joshua Ashton 2fa97aa0fb vkd3d: Move API versions to public header
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-22 14:29:45 +01:00
Hans-Kristian Arntzen b7dfa99e57 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-19 13:42:47 +01:00
Hans-Kristian Arntzen cf39639f5b dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-18 20:58:58 +01:00
Joshua Ashton b71bc5ef6b tests: Don't crash if WRITE_WATCH is broken
This can happen under Wine.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-18 16:56:36 +01:00
Joshua Ashton 4b6a1ef40d tests: Add a WRITE_WATCH test
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-18 16:10:05 +01:00
Joshua Ashton 258173a0a7 vkd3d: Fix return value when WRITE_WATCH is forbidden
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-18 16:10:05 +01:00
Joshua Ashton aa12817ccf vkd3d: Implement D3D12_HEAP_TYPE_WRITE_WATCH
Needed for D3D12 APITrace

Closes: #373
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-18 14:41:46 +01:00
Hans-Kristian Arntzen 52a9c85bf2 vkd3d: Implement ClearState.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-18 10:50:21 +01:00
Joshua Ashton 4e31f5d54d vkd3d: Align d3d12_rtv_desc to D3D12_DESC_ALIGNMENT
Otherwise we can do an aligned_malloc with a non-aligned size, as the descriptor size of a d3d12_rtv_desc is 48 without this.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-16 21:45:28 +01:00
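As a rough illustration of the size problem in 4e31f5d54d: aligned allocators commonly expect the size to be a multiple of the alignment, and a 48-byte structure is not a multiple of a larger power-of-two alignment. The sketch below assumes an alignment of 64 purely for demonstration; the actual value of D3D12_DESC_ALIGNMENT is not stated in the log, and the struct and helper names are hypothetical.

```c
#include <stddef.h>

/* Assumed alignment for this sketch; the real D3D12_DESC_ALIGNMENT may differ. */
#define DEMO_DESC_ALIGNMENT 64

/* Hypothetical descriptor with a 48-byte payload. The alignment request
 * makes the compiler pad the structure up to a multiple of the alignment. */
struct demo_rtv_desc
{
    _Alignas(DEMO_DESC_ALIGNMENT) unsigned char payload[48];
};

/* Rounds size up to the next multiple of alignment (must be a power of two). */
size_t demo_align_up(size_t size, size_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}
```

Aligning the type itself keeps sizeof a multiple of the alignment, so call sites do not each have to round the size up.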
Joshua Ashton 5b5293ec93 vkd3d: Fix out of range in UpdateTileMappings
Previously this incremented and indexed before the loop checked this.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-16 21:45:13 +01:00
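The class of bug fixed in 5b5293ec93 (advancing an index and reading through it before the loop condition has had a chance to reject it) can be sketched as follows. This is not the UpdateTileMappings code; the function and parameter names are hypothetical, and the sketch only shows one way to guarantee that the bound is re-checked before every index.

```c
#include <stddef.h>

/* Assigns each tile to the range that covers it. range_sizes[i] is the
 * number of tiles in range i. Returns the number of tiles assigned.
 * The range index is only used after the loop condition has checked it. */
size_t demo_assign_ranges(const unsigned int *range_sizes, size_t range_count,
        size_t *range_of_tile, size_t tile_count)
{
    size_t range_idx = 0;
    size_t tile = 0;
    unsigned int used = 0;

    while (tile < tile_count && range_idx < range_count)
    {
        if (used == range_sizes[range_idx])
        {
            /* Range exhausted: advance, then go back to the loop condition
             * so the new index is validated before it is dereferenced. */
            range_idx++;
            used = 0;
            continue;
        }
        range_of_tile[tile++] = range_idx;
        used++;
    }
    return tile;
}
```

The `continue` is the important part: incrementing `range_idx` and reading `range_sizes[range_idx]` in the same iteration would read one element past the end once the last range is consumed.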
Joshua Ashton 43e7316591 tests: Default VKD3D_TEST_DEBUG to 1
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-16 21:43:01 +01:00
Philip Rebohle dadace33b1 vkd3d: Fix potential hang in d3d12_command_queue_Release.
This can happen if the fence thread starts with a delay and
the queue gets destroyed shortly after being created.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-16 21:42:39 +01:00
Hans-Kristian Arntzen 34a09967d5 vkd3d: Prefer compute queues for TRANSFER.
TRANSFER + CONCURRENT is generally death for compression.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-16 21:41:37 +01:00
Hans-Kristian Arntzen 95fe4b61a6 vkd3d: Do not drop pending signals when signaling fence on CPU.
There isn't much of a reason why we should have to do this. The original
implementation was more of a hack if anything.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-16 21:41:37 +01:00
Hans-Kristian Arntzen e7672c3233 vkd3d: Refactor where max pending timeline value is computed.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-16 21:41:37 +01:00
Hans-Kristian Arntzen dbdbf94083 vkd3d: Ensure that virtual timeline values are updated in-order.
Increment physical value one by one, find the exact timeline value we're
supposed to signal and perform the update.

Select lowest physical timeline value correctly.
Array can be reordered now, so lowest value isn't necessarily first.

Fixes some super weird hangs in Control DXR.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-16 21:41:37 +01:00
Philip Rebohle eab288bb4e vkd3d: Simplify fence worker implementation.
Avoids potential busy-waiting on the driver with WAIT_ANY_BIT.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-16 12:43:24 +01:00
Philip Rebohle 93a80d5eaa vkd3d: Create one fence worker per command queue.
Rather than one per device. This solves issues with D3D12 fences
being signalled too late because the fence worker is waiting on
a different set of semaphores while the fence is being enqueued.

Greatly increases performance in Horizon Zero Dawn and Death
Stranding with multi-queue mode enabled.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-16 12:43:24 +01:00
Philip Rebohle 34bca90a9c vkd3d: Implement internal reference counting for d3d12_fence.
This will be necessary once we introduce fence workers per
command queue, since we cannot reliably store pointers to
queues.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-16 12:43:24 +01:00
Hans-Kristian Arntzen 102ea2211b vkd3d: Ignore IASetVertexBuffers for NULL pViews.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-15 14:55:53 +00:00
Hans-Kristian Arntzen 5b2cc545e8 vkd3d: Convert RTAS geometry flags.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-15 14:55:03 +00:00
Hans-Kristian Arntzen c425343f41 vkd3d: Remove FIXME spam for pResourceAfter = NULL cases.
2021-03-15 14:10:27 +01:00
Philip Rebohle 0e4ef88d18 vkd3d: Don't broadcast semaphore waits when zeroing memory.
Instead, let queues wait on demand.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-15 12:52:00 +01:00
Philip Rebohle 7185e9776d vkd3d: Introduce vkd3d_queue_add_wait.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-15 12:52:00 +01:00
Philip Rebohle 724257c0d8 vkd3d: Add multi_queue config flag.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-15 12:52:00 +01:00
Philip Rebohle 859aa3fd5a vkd3d: Use VK_SHARING_MODE_CONCURRENT if multi-queue is enabled.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-15 12:52:00 +01:00
Philip Rebohle 1e3c91579e vkd3d: Create one vkd3d queue per Vulkan device queue.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-15 12:52:00 +01:00
Philip Rebohle 3cd93781ff vkd3d: Create multiple queues per queue family if possible.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-15 12:52:00 +01:00
Philip Rebohle 6967b1e92b vkd3d: Wait for queue idle before destroying vkd3d queue.
Fixes a potential issue where we may destroy objects that
are still in use by the GPU.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-15 12:52:00 +01:00
Philip Rebohle b0309f6f90 vkd3d: Introduce d3d12_device_allocate_vkd3d_queue.
Replaces d3d12_device_get_vkd3d_queue when mapping D3D12
command queues to Vulkan device queues.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-15 12:52:00 +01:00
Philip Rebohle 7359686609 vkd3d: Introduce d3d12_device_get_vkd3d_queue_family.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-15 12:52:00 +01:00
Philip Rebohle 4c0a0b0467 vkd3d: Introduce vkd3d_queue_family_info.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-15 12:52:00 +01:00
Hans-Kristian Arntzen b4f48bf2d6 meta: Update README with new VKD3D_CONFIG flags.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-13 06:26:27 +00:00
Hans-Kristian Arntzen b44bfa7066 vkd3d: Remove obsolete comment.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-13 06:26:27 +00:00
Hans-Kristian Arntzen 43370c6426 vkd3d: Only enable DXR if requested.
The implementation is not complete enough to safely enable it, since
some games will try to create RTPSOs by default, leading to crashes.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-12 12:20:10 +01:00
Hans-Kristian Arntzen 58615cd5dc vkd3d: Allow devices with recursion of 1 to be accepted.
We can fail RTPSOs later if they for whatever reason use recursion.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-12 12:20:10 +01:00
Hans-Kristian Arntzen d9be9b57f2 vkd3d: Actually use RGBA16 formats for RT VBO.
It's really supposed to load 4 components and ignore. RGB16 is not
mandatory, so just use the "expected" formats after all.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-12 12:20:10 +01:00
Hans-Kristian Arntzen 2b6658da67 vkd3d: Enable RT tier 1.0 if possible.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-12 12:20:10 +01:00
Hans-Kristian Arntzen 3adc385167 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-12 12:20:10 +01:00
Hans-Kristian Arntzen c5c45b851f vkd3d-shader: Add missing stage conversion for RT.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-12 12:20:10 +01:00
Hans-Kristian Arntzen 4f2776ff93 vkd3d-shader: Dump RT export SPIR-V.
Need one unique blob per export.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-12 12:20:10 +01:00
Hans-Kristian Arntzen 3358fca922 vkd3d: Implement local root signature association.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-12 12:20:10 +01:00
Hans-Kristian Arntzen 028b87ab61 vkd3d: Fix some trivial bugs with local root signatures.
Did not properly allocate bindings.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-12 12:20:10 +01:00
Hans-Kristian Arntzen 4f40a5a6d2 tests: Tests multiple local root signatures.
Runtime behavior here is extremely weird and contradicts spec wording in
many ways:

- Default local root signatures can override explicit ones.
- Runtime silently fails if the associated subobject is not part of the
  PSO array.
- Order of default local root signatures doesn't appear to matter at
  all.

All in all, very confusing, and there is zero help from validation
layer, so we'll have to deduce this from whatever applications want.
Hopefully they are somewhat sane, and don't try to rely on very awkward
matching rules.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-11 13:57:55 +01:00
Philip Rebohle 85f15916c4 vkd3d: Optimize unmapping adjacent resource regions.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-11 13:57:45 +01:00
Philip Rebohle 2ef8106136 vkd3d: Optimize sparse binding for buffers and full subresources.
Compacts ranges and only issues one bind for buffer ranges and
full subresource updates, rather than one bind per tile.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-10 13:18:44 +01:00
Philip Rebohle ead9f2d620 vkd3d: Store subresource index in d3d12_sparse_image_region.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-10 13:18:44 +01:00
Hans-Kristian Arntzen ff78b2df1c vkd3d: Dump DXIL when parsing entry points as well.
Parsing can fail, and it is useful to debug that.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 13:08:25 +01:00
Hans-Kristian Arntzen 73d55ec65a dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 13:08:25 +01:00
Hans-Kristian Arntzen cd876284e0 vkd3d: Fix some const warnings on MSVC.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 10:48:01 +00:00
Hans-Kristian Arntzen ce62d3d700 vkd3d: Fix MSVC build errors.
args... is a GNU extension.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 10:48:01 +00:00
Hans-Kristian Arntzen 369f48e499 tests: Add test for RTAS update.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 11:46:22 +01:00
Hans-Kristian Arntzen 0bf3a1d441 vkd3d-shader: Recognize recent descriptor range flag.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 11:46:05 +01:00
Hans-Kristian Arntzen 56e7cbec80 test: Test CBV table hoisting.
Adds some spicy edge cases with array size of 1 w/ nonuniform access, etc.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 11:46:05 +01:00
Hans-Kristian Arntzen 13d132f1c4 vkd3d: Add support for hoisting CBV descriptors to push descriptors.
Bindless CBV is *pretty* bad on NVIDIA, so add a code path which can
promote descriptor table CBVs into push descriptors.

We can safely do this with Root Signature 1.1 STATIC or
the somewhat obscure STATIC_KEEPING_BUFFER_BOUNDS_CHECKS.

With VOLATILE, which basically all titles are using,
we can still force this behavior through a config flag,
but this is an incorrect speed hack. It works in most
titles however, since bindless CBV is exceptionally rare.

We only hoist descriptors when the root signature range has 1 descriptor
anyway, so we should avoid any reasonable bindless scenario.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 11:46:05 +01:00
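The hoisting rule described in 13d132f1c4 can be read as a small predicate. The sketch below is an interpretation of the commit message only, not the vkd3d implementation: the enum, its values, and the function name are hypothetical stand-ins for the Root Signature 1.1 range flags and the config flag mentioned above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the Root Signature 1.1 range flags. */
enum demo_range_flags
{
    DEMO_RANGE_VOLATILE                     = 0x1,
    DEMO_RANGE_STATIC                       = 0x2,
    DEMO_RANGE_STATIC_KEEPING_BOUNDS_CHECKS = 0x4,
};

/* Returns true if a descriptor table CBV range may be promoted to a push
 * descriptor. force_hoist models the config flag speed hack. */
bool demo_can_hoist_cbv(unsigned int descriptor_count, unsigned int flags,
        bool force_hoist)
{
    /* Only single-descriptor ranges are considered, which avoids any
     * reasonable bindless scenario. */
    if (descriptor_count != 1)
        return false;

    /* Safe cases: the application promised static behavior. */
    if (flags & (DEMO_RANGE_STATIC | DEMO_RANGE_STATIC_KEEPING_BOUNDS_CHECKS))
        return true;

    /* VOLATILE: only when explicitly forced, which is technically incorrect. */
    return force_hoist;
}
```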
Hans-Kristian Arntzen d758a6e296 vkd3d: Convert Root Signatures to 1.1.
We will be able to make use of the STATIC vs VOLATILE flags.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 11:46:05 +01:00
Hans-Kristian Arntzen c409d0f30a vkd3d: Optimize R32UI texel buffer creation.
There is no need to scan through the Vulkan format list,
especially since texel buffer creation happens in the hot path
in cases where we know we need to create R32UI texel buffer views.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 11:46:05 +01:00
Hans-Kristian Arntzen 3e876c2857 vkd3d: Log VKD3D_CONFIG with INFO.
Useful to make sure we actually passed it correctly ...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 11:46:05 +01:00
Hans-Kristian Arntzen c351dfc8d3 vkd3d: Remove dead code from d3d12_command_list.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-05 15:49:28 +01:00
Joshua Ashton ce9ae01c79 build: Warn about VLA usage
Using consts for array sizes is a C++-ism, and in GCC in C-mode it won't fold literal constants, and will instead prefer to make a VLA.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-04 15:50:28 +00:00
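To illustrate the point made in ce9ae01c79: in C, a const-qualified variable is not an integer constant expression, so using one as an array size at block scope declares a variable length array. GCC's -Wvla option reports such declarations. The names below are hypothetical, and the sketch assumes a compiler that supports VLAs (GCC or Clang in C mode).

```c
#include <stddef.h>

/* An enumerator is an integer constant expression in C. */
enum { DEMO_TILE_COUNT = 8 };

size_t demo_fixed_array_bytes(void)
{
    int tiles[DEMO_TILE_COUNT];     /* ordinary fixed-size array */
    return sizeof(tiles);
}

size_t demo_vla_bytes(void)
{
    const int tile_count = 8;       /* a const object, not a constant expression */
    int tiles[tile_count];          /* VLA in C; -Wvla flags this line */
    return sizeof(tiles);           /* evaluated at run time for a VLA */
}
```

Both functions return the same size, which is why the VLA can go unnoticed without the warning; replacing the const variable with an enumerator or a #define removes it.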
Hans-Kristian Arntzen 38bb845800 tests: Refactor raytracing test to be a bit more extensible.
Fixes a lot of the worst hardcoding, test different IBO / VBO formats.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-04 16:30:29 +01:00
Hans-Kristian Arntzen f2c5a6561c tests: Test RTAS clone and compact.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-04 16:30:29 +01:00
Hans-Kristian Arntzen b34af6a7fa vkd3d: Convert RT vertex format correctly.
Context sensitive formats, oh boy!

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-04 16:30:29 +01:00
Hans-Kristian Arntzen 686a3efc08 vkd3d: Verify VBO RTAS support when checking RT tier.
Format conventions for RT are ... special.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-04 16:30:29 +01:00
Hans-Kristian Arntzen b5d433baaa vkd3d: Implement RTAS clone and compact copy operations.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-04 16:30:29 +01:00
Philip Rebohle 39513d6503 vkd3d: Silence log spam around Min LOD Clamp.
This seriously hurts performance in AC:Valhalla.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-04 13:37:05 +01:00
Philip Rebohle 5e94183975 vkd3d-shader: Do not insert branch to loop header if outside of block.
Fixes invalid SPIR-V in case there is an unconditional break right
before the loop ends.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-04 13:36:31 +01:00
Philip Rebohle ba8e306452 vkd3d-shader: Ignore break instructions if there is no active block.
This can happen if a continue statement is immediately followed
by a break instruction in a switch case.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-04 13:36:31 +01:00
Joshua Ashton 96888b0663 build: Use --file-alignment=4096 with MinGW
Avoids a copy in the Wine loader as well as enables debug symbols to work in perf.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-03 19:22:44 +01:00
Joshua Ashton 47606f4339 build: Rename vkd3d_msvc and vkd3d_clang
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-03 19:22:44 +01:00
Hans-Kristian Arntzen 031ad9e139 vkd3d: Track dynamic pipeline stack size
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-03 19:19:47 +01:00
Hans-Kristian Arntzen 600a296ca7 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-03 19:19:47 +01:00
Hans-Kristian Arntzen 9588ec082e vkd3d: Fix warnings when AS is used without support.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-03 19:19:47 +01:00
Hans-Kristian Arntzen fcd00f0559 vkd3d: Implement DispatchRays.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-03 19:19:47 +01:00
Hans-Kristian Arntzen b162e5ec72 vkd3d: Refactor descriptor updates.
We might have to emit to a different bind point than our binding entry
suggests due to DXR, so pass down information explicitly to leaf
functions.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-03 19:19:47 +01:00
Hans-Kristian Arntzen eeaca4a500 vkd3d: Pass down raygen pipeline layout to command list.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-03 19:19:47 +01:00
Hans-Kristian Arntzen 0b161f5693 vkd3d: Implement SetPipelineState1.
Refactor push constant invalidation to SetPipelineState,
it is technically more correct to only invalidate when actually pushing
constants, but we need to do full state invalidation when transitioning
between RT pipelines and non-RT pipelines due to bind point aliasing
shenanigans in D3D12, so it makes more sense to invalidate state based
on active bind point there.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-03 19:19:47 +01:00
Hans-Kristian Arntzen 77089065cd vkd3d: Compute default pipeline stack size.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-03 19:19:47 +01:00
Hans-Kristian Arntzen 9ffa3bf351 vkd3d: Support CreateSRV with RTAS.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-03 19:19:47 +01:00
Hans-Kristian Arntzen 9ff7b82235 vkd3d: Rename VKD3D_DESCRIPTOR_FLAG_UAV_COUNTER to RAW_VA_BUFFER.
We're going to place acceleration structures here.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-03 19:19:47 +01:00
Joshua Ashton 3224688295 vkd3d: Enable VK_EXT_debug_utils conditionally
Enabling VK_EXT_debug_utils comes at some overhead in Wine due to the object tracking required. There is likely a non-zero overhead in some native implementations as well.

By enabling this conditionally, we can also avoid additional overhead from apps that set debug labels on both the Vulkan and front-end side.

The default condition is to enable it when building with Renderdoc integration or in debug builds.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-02 11:09:29 +01:00
Joshua Ashton 91fc472601 vkd3d: Add mechanism for conditional extensions
Add a way to enable an extension via config flags.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-02 11:09:29 +01:00
Joshua Ashton 4c6f5375a6 vkd3d: Refactor config_flags to be global rather than instance state
Makes it so we can access it in code where we have no concept of a device/instance.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-02 11:09:29 +01:00
Joshua Ashton 64a6bae1a0 vkd3d: Remove vkd3d_application_info structure
This thing has no right to exist.

We don't get this information in D3D12 and it's getting in the way of me refactoring config flags.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-02 11:09:29 +01:00
Joshua Ashton 78b5b347b8 build: Disable TRACE calls in release builds
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-02 11:09:29 +01:00
Joshua Ashton 4da76cb51b vkd3d: Simplify properties/features tracing
Simplifies this to make it easier to add new properties/features
so we don't have a bunch of pointers to things that are just a child
of the device info structure.

Fixes warnings when compiling without traces too.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-02 11:09:29 +01:00
Joshua Ashton 615b2d714f build: Minor meson formatting change
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-02 11:09:29 +01:00
Georg Lehmann 7d518ea78f meta: Remove .gitlab-ci.
Github Actions replaced the frogs.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-02-27 13:57:53 +00:00
Hans-Kristian Arntzen 91fad86e4d tests: Test that root parameters are correctly invalidated.
When emitting push constants for graphics, these should invalidate push
constants for compute and vice versa. In Vulkan, vkCmdPushConstants is
not tied to a bind point.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-26 17:06:18 +01:00
Hans-Kristian Arntzen 89fbe334df vkd3d: Redirect push constants to their bind point stages.
Gives a massive boost on NVIDIA for some reason.
RADV defers push constant update, so ALL_STAGES doesn't have
that much of a perf hit.

~20% uplift in RE2, ~5% uplift in CP77 from some quick and dirty testing.
Seems to be heavily content dependent either way.

Also a bug fix, since we would clobber graphics push constants from
compute and vice versa if both graphics and compute used the same root
signature.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-26 17:06:18 +01:00
Hans-Kristian Arntzen 3839f5e17c vkd3d: Ignore known useless validation warnings.
These only clutter up validation in testing.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-26 15:04:11 +01:00
Joshua Ashton 29b410928b tests: Add a suite of tests for SetName
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-25 21:51:43 +01:00
Joshua Ashton 8c9527cdf7 vkd3d: Refactor SetName implementation
As per MSDN, SetName is just a wrapper around SetPrivateData and a specific GUID.

Some apps and tools will use this to retrieve their name back.

So instead, just forward the name to Vulkan in the SetPrivateData call.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-25 21:51:43 +01:00
Joshua Ashton 04b86b80b6 include: Define WKPID_D3DDebugObjectName and friends
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-25 21:51:43 +01:00
Joshua Ashton a76daad03f vkd3d-common: Add vkd3d_strdup_n
There is no strndup on Windows.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-25 21:51:43 +01:00
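A minimal sketch of what a bounded string duplicate can look like on a platform without strndup. The name example_strdup_n is hypothetical and this is not the vkd3d-common implementation; it only illustrates the idea of copying at most n characters and always NUL-terminating the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a strndup-like helper; not the actual
 * vkd3d_strdup_n. Copies at most n characters of str into a freshly
 * allocated, NUL-terminated buffer. Returns NULL on allocation failure. */
char *example_strdup_n(const char *str, size_t n)
{
    size_t len = 0;
    char *copy;

    assert(str != NULL);

    /* Length of the source, capped at n, without reading past a NUL. */
    while (len < n && str[len] != '\0')
        len++;

    copy = malloc(len + 1);
    if (!copy)
        return NULL;

    memcpy(copy, str, len);
    copy[len] = '\0';
    return copy;
}
```

The caller owns the returned buffer and releases it with free().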
Philip Rebohle 26f5745ea1 vkd3d: Don't use SHADER_STAGE_ALL for push constants.
Instead, infer the required stages from the D3D12 shader visibility
field from all root parameters that we map to push constants.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-25 20:28:07 +01:00
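The stage inference described in the commit above could be sketched roughly as follows. All names are hypothetical stand-ins: the enum values mirror the ordering of D3D12_SHADER_VISIBILITY_* and the bit values of VkShaderStageFlagBits, but the treatment of the "all" visibility (every stage, including compute) is an assumption made for this sketch, not necessarily what vkd3d does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for D3D12_SHADER_VISIBILITY_* (same ordering). */
enum example_visibility
{
    EXAMPLE_VISIBILITY_ALL = 0,
    EXAMPLE_VISIBILITY_VERTEX,
    EXAMPLE_VISIBILITY_HULL,
    EXAMPLE_VISIBILITY_DOMAIN,
    EXAMPLE_VISIBILITY_GEOMETRY,
    EXAMPLE_VISIBILITY_PIXEL
};

/* Stand-ins for VkShaderStageFlagBits (same bit values). */
enum
{
    EXAMPLE_STAGE_VERTEX       = 0x01,
    EXAMPLE_STAGE_TESS_CONTROL = 0x02,
    EXAMPLE_STAGE_TESS_EVAL    = 0x04,
    EXAMPLE_STAGE_GEOMETRY     = 0x08,
    EXAMPLE_STAGE_FRAGMENT     = 0x10,
    EXAMPLE_STAGE_COMPUTE      = 0x20,
    EXAMPLE_STAGE_EVERYTHING   = 0x3f
};

/* ORs together the stages needed by every root parameter that is
 * mapped to push constants. */
uint32_t example_push_constant_stages(const enum example_visibility *params, size_t count)
{
    uint32_t stages = 0;
    size_t i;

    assert(params != NULL || count == 0);

    for (i = 0; i < count; i++)
    {
        switch (params[i])
        {
            case EXAMPLE_VISIBILITY_VERTEX:   stages |= EXAMPLE_STAGE_VERTEX; break;
            case EXAMPLE_VISIBILITY_HULL:     stages |= EXAMPLE_STAGE_TESS_CONTROL; break;
            case EXAMPLE_VISIBILITY_DOMAIN:   stages |= EXAMPLE_STAGE_TESS_EVAL; break;
            case EXAMPLE_VISIBILITY_GEOMETRY: stages |= EXAMPLE_STAGE_GEOMETRY; break;
            case EXAMPLE_VISIBILITY_PIXEL:    stages |= EXAMPLE_STAGE_FRAGMENT; break;
            /* Assumption: "all" visibility needs every stage. */
            default:                          stages |= EXAMPLE_STAGE_EVERYTHING; break;
        }
    }
    return stages;
}
```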
Philip Rebohle c37e705761 vkd3d: Use push constant stage mask from root signature.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-25 20:28:07 +01:00
Hans-Kristian Arntzen 96b44fddbc tests: Remove some todo/is_bug()s for RADV.
Some tests are now passing.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 18:28:06 +01:00
Hans-Kristian Arntzen 4fe5b9388d vkd3d: Do not disable robustness, ever.
There are pragmatic reasons for not following spec 100% here.
The only known case where UpdateAfterBind robustness is not exposed
seems to be somewhat bogus, and we cannot run D3D12 correctly without
robustness either way.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 17:53:48 +01:00
Hans-Kristian Arntzen be9c376fde vkd3d: Implement postbuild info queries.
Can only support a subset in Vulkan without extra heroics. The DXR API
lets you query things that you technically should know a priori in the
application. We might need to allocate some side-channel buffers on
demand, but let's defer that until actually needed ... :\

DXR is also very awkward in that we have a query which is resolved in
UNORDERED_ACCESS state instead of COPY_DEST state, so we'll have to
ping-pong through some barriers redundantly.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen 4365f9962f vkd3d: Allocate query pools based on type index instead of D3D12 type.
Postbuild info is a query in Vulkan, but not so in D3D12.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen d88ce7cdea tests: Test post-build info output.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen a5aac500bc vkd3d: Basic implementation of GraphicsCommandList::BuildRTAS().
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen 3353ed14de vkd3d: Implement RTAS object creation.
When building acceleration structures, we need to have a
VkAccelerationStructureKHR object, but the D3D12 API just uses a plain
VA = ID3D12Resource::GetGPUVA() + offset.

For this to work, we need to resolve the VA back to VkBuffer + offset.
The only VkBuffer we can look up is the original backing memory
allocation in the VA map, and that allocation itself must own a view
map, since we cannot tie the VA to any specific ID3D12Resource.

Since creating an RTAS is not the common path, we allocate the view map
on-demand with CAS.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
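The "allocate on demand with CAS" pattern mentioned in the commit above can be sketched with C11 atomics. The struct and function names are hypothetical, and the real code uses its own atomic wrappers and view map type; this only shows the lazy-initialization race being resolved with a compare-and-swap.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical placeholder types, not the vkd3d structures. */
struct example_view_map
{
    int placeholder;
};

struct example_allocation
{
    _Atomic(struct example_view_map *) view_map;
};

/* Returns the allocation's view map, creating it on first use.
 * If two threads race, exactly one map is published and the loser
 * frees its own copy. Returns NULL only if allocation fails. */
struct example_view_map *example_get_view_map(struct example_allocation *alloc)
{
    struct example_view_map *expected = NULL;
    struct example_view_map *map;

    assert(alloc != NULL);

    /* Fast path: the map already exists. */
    map = atomic_load_explicit(&alloc->view_map, memory_order_acquire);
    if (map)
        return map;

    map = calloc(1, sizeof(*map));
    if (!map)
        return NULL;

    /* Publish our map only if nobody else did so in the meantime. */
    if (atomic_compare_exchange_strong_explicit(&alloc->view_map, &expected, map,
            memory_order_acq_rel, memory_order_acquire))
        return map;

    /* Lost the race: expected now holds the winner's pointer. */
    free(map);
    return expected;
}
```

Because the common path never creates an RTAS, the only cost paid by ordinary allocations in this sketch is one pointer-sized field.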
Hans-Kristian Arntzen 0fc80d9067 vkd3d: Emit RT barriers as required.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen 221a658884 vkd3d: Mark resources as being RTAS depending on initial resource state.
RTAS must stay in this resource state forever. The only way to
synchronize them is UAV barriers.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen 86f7fdfe7a vkd3d: Add RTAS buffer usage flags.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen 83861cceed vkd3d: Allow RTAS initial resource state.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen 2afe25c0c8 vkd3d: Implement GetRaytracingAccelerationStructurePrebuildInfo.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen d773e67fff vkd3d: Add helper query to check if RT should be used.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen a90ed938b4 vkd3d-shader: Pass down SBT descriptor size to dxil-spirv.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-23 12:28:19 +01:00
Hans-Kristian Arntzen 15e36a0430 vkd3d: Use virtual VAs for descriptor heap GPU VAs.
Allows local root signatures to work correctly and is also a good
optimization since we no longer need to dereference memory (potentially
cold cache lines) to figure out heap offset in command buffer.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-23 12:28:19 +01:00
Hans-Kristian Arntzen 1af3f9c65f vkd3d: Use calloc for d3d12_device instead of manual memset.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-23 12:28:19 +01:00
Hans-Kristian Arntzen 1586a75ada vkd3d: Align d3d12_desc to 64 bytes.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-23 12:28:19 +01:00
Hans-Kristian Arntzen 3442d44649 vkd3d: Add aligned allocation helpers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-23 12:28:19 +01:00
Hans-Kristian Arntzen 0c94e07ab2 vkd3d: Elide timeline semaphore waits which can be satisfied implicitly.
If we're signalling and waiting on the same physical queue (always true for
current SINGLE_QUEUE define), we can rely on submission boundary
synchronization which doesn't require any extra submissions to resolve.

Avoids awkward GPU driver bubbles with back to back signal -> wait pairs
with timeline.

Observed 2% GPU uplift on RE2 on AMD.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-22 13:00:22 +01:00
Hans-Kristian Arntzen dc246a70fc meson: Bump version to 2.2.
2021-02-19 20:23:10 +01:00
Philip Rebohle 1d7e424c44 vkd3d: Mask certain heap flags when suballocating memory.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-19 20:18:24 +01:00
Philip Rebohle f6c6a76735 vkd3d: Store original heap flags in d3d12_resource again.
Otherwise, when suballocating memory, GetHeapProperties may
not return the exact same set of flags if we ignore flags
when looking up suitable chunks.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-19 20:18:24 +01:00
Philip Rebohle be080edc7f vkd3d: Remove vkd3d_allocate_resource_memory.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-19 19:51:44 +01:00
Philip Rebohle a1e5b78bc4 vkd3d: Suballocate committed images if possible and if supported by the driver.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-19 19:51:44 +01:00
Philip Rebohle a1ffea1800 vkd3d: Fix integer underflow when checking for suitable free ranges.
The difference between a range's offset and the aligned
offset may be greater than the size of that range.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-19 18:11:36 +01:00
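The underflow described in the commit above is the classic unsigned subtraction trap. A minimal sketch, with hypothetical names and a simplified free-range struct, of a fit check that tests the alignment padding before subtracting:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified free range; not the vkd3d structure. */
struct example_free_range
{
    uint64_t offset;
    uint64_t length;
};

/* Rounds value up to a power-of-two alignment. */
static uint64_t example_align(uint64_t value, uint64_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Returns true if an allocation of the given size and alignment fits
 * inside the range. */
bool example_range_fits(const struct example_free_range *range,
        uint64_t size, uint64_t alignment)
{
    uint64_t aligned_offset, padding;

    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    aligned_offset = example_align(range->offset, alignment);
    padding = aligned_offset - range->offset;

    /* The padding can exceed the range length. Without this check,
     * range->length - padding would wrap around to a huge value and
     * the range would wrongly be reported as suitable. */
    if (padding > range->length)
        return false;

    return range->length - padding >= size;
}
```

For example, a range at offset 16 with length 8 and a 64-byte alignment needs 48 bytes of padding, so it must be rejected no matter how small the request is.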
Hans-Kristian Arntzen 0fdf69ff46 changelog: Update for 2.2.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-19 13:49:25 +01:00
Hans-Kristian Arntzen d6d8e70955 tests: Add image placement alignment test.
Validates that we can create RTVs at 64k alignment without issues.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-19 13:19:51 +01:00
Joshua Ashton bb3e5f6cad vkd3d: Account for front buffer in swapchain image count
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-19 13:17:15 +01:00
Philip Rebohle be4391b972 vkd3d: Align images manually to meet Vulkan requirements if necessary.
Allows us to not allocate device memory for certain render targets on
Polaris GPUs.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 18:25:23 +01:00
Philip Rebohle d6a4826099 vkd3d: Remove heap_offset member from d3d12_resource.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 56ff4622b6 vkd3d: Remove cookie member from d3d12_resource.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 6e81621b82 vkd3d: Remove gpu_address member from d3d12_resource.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 833d7e207c vkd3d: Remove vk_buffer/vk_image union from d3d12_resource.
Use the unique_resource struct instead.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 5a0a5ef44b vkd3d: Remove unused resource flags and rename SPARSE -> RESERVED.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 6a34d3d204 vkd3d: Remove _2 suffix from memory allocation functions.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 53f6a9c78a vkd3d: Rename _2 suffix from resource creation functions.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle a2e14d7d1d vkd3d: Remove _2 suffix from d3d12_heap_2 and related functions.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 6f8bb2a4c0 vkd3d: Use vkd3d_allocate_device_memory_2 for sparse metadata.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 12f0c11c7f vkd3d: Simplify vkd3d_allocate_image_memory helper.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle ab2c190da5 vkd3d: Simplify vkd3d_allocate_buffer_memory helper.
This is still useful as a low-level memory allocation function when
we don't want to bother with buffer offsets or D3D12 validation.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle db2e0c7587 vkd3d: Remove vkd3d_gpu_va_allocator.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 8826f3c5bc vkd3d: Remove d3d12_heap and old resource creation functions.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 9792b02b26 vkd3d: Use vkd3d_memory_allocation for scratch buffers.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle db1b425d2a vkd3d: Use new resource and heap implementations.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Hans-Kristian Arntzen 8437eea2c0 vkd3d: Remove clamping assumption in RTPSO stack size.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-18 14:16:31 +01:00
Hans-Kristian Arntzen e228367e98 tests: Allow SetPipelineStackSize to propagate properly.
AMD and NV driver behaviors don't agree here, choose NV behavior as it
makes more sense.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-18 14:16:31 +01:00
Hans-Kristian Arntzen 3a48b97dd1 tests: Clean up some manual WCHAR strings.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-18 14:16:31 +01:00
Hans-Kristian Arntzen 20c4dfc685 tests: Add multithreaded suballocation test.
Also stresses VA mapping.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-18 14:16:31 +01:00
Joshua Ashton f01935d69e vkd3d: Fix SetName for inline query types
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-18 02:41:45 +01:00
Philip Rebohle 6fc8b67576 vkd3d: Fix incorrect chunk assignment for chunk allocations.
Our clear code assumes that this is NULL for allocations owned
by a chunk, so we should actually do it that way. Fixes some
issues where we do not wait for clears to complete if a chunk
gets destroyed.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-17 16:38:47 +01:00
Philip Rebohle e12afd31d9 vkd3d: Actually use VKD3D_VA_BLOCK_COUNT.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-17 16:38:47 +01:00
Philip Rebohle 35f90c4b2f vkd3d: Only print some swapchain FIXMEs once.
Silences a whole bunch of log spam in Control.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-17 13:37:41 +01:00
Hans-Kristian Arntzen ea088ceecf vkd3d: Use UINT64* instead of uint64_t* in 64-bit CAS.
Avoids alignment warnings on 32-bit.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-16 16:14:14 +00:00
Hans-Kristian Arntzen 7051bf76f7 vkd3d: Fix validation errors with KHR_fragment_shading_rate.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-16 16:07:55 +00:00
Philip Rebohle a39bab95a1 vkd3d: Clear suballocated memory to zero.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-16 16:06:26 +01:00
Philip Rebohle 668a4e1f2c vkd3d: Do not suballocate small image-only heaps.
We have no way to manually reset these.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-16 16:06:26 +01:00
Philip Rebohle 4d68130be7 vkd3d: Add functionality to clear newly allocated memory.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-16 16:06:26 +01:00
Philip Rebohle 78713062fe vkd3d: Introduce unique_queue_mask.
Has one bit set for each vkd3d_queue_family that points to a
unique queue. This can be used to iterate over device queues
without having to check for duplicates manually.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-16 16:06:26 +01:00
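A rough sketch of the idea behind such a mask, using hypothetical names and opaque queue pointers instead of the real vkd3d_queue_family array: a bit is set for the first family index that refers to each distinct queue, so later loops can walk the set bits instead of comparing pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Builds a mask with one bit per family index whose queue pointer has
 * not appeared at a lower index. Hypothetical helper for illustration. */
uint32_t example_build_unique_queue_mask(void *const queues[], unsigned int count)
{
    uint32_t mask = 0;
    unsigned int i, j;

    assert(count <= 32);

    for (i = 0; i < count; i++)
    {
        int duplicate = 0;

        for (j = 0; j < i; j++)
        {
            if (queues[j] == queues[i])
                duplicate = 1;
        }

        if (!duplicate)
            mask |= UINT32_C(1) << i;
    }
    return mask;
}

/* Walks the set bits of the mask, visiting each unique queue once.
 * Here the "visit" just counts them. */
unsigned int example_count_unique_queues(uint32_t mask)
{
    unsigned int count = 0;

    while (mask)
    {
        mask &= mask - 1; /* Clear the lowest set bit. */
        count++;
    }
    return count;
}
```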
Philip Rebohle 812c82f8ac vkd3d: Introduce VKD3D_QUEUE_FAMILY_INTERNAL_COMPUTE.
This needs a rework when we re-enable multi-queue support.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-16 16:06:26 +01:00
Hans-Kristian Arntzen da06323b87 tests: Add test which stresses suballocation implementation.
Designed to stress internal implementation details for memory rewrite.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-16 14:27:06 +01:00
Hans-Kristian Arntzen dc1b4b56ed tests: Fix build with vkd3d-utils Windows test suite.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-16 14:27:06 +01:00
Joshua Ashton bf2aa9ab99 build: Link against libatomic on x86 when using Clang
Needed for 64-bit atomics on 32-bit architectures on Clang.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-16 10:18:26 +01:00
Joshua Ashton a0f9891b11 tests: Fix -Wincompatible-pointer-types warnings
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-15 17:54:54 +01:00
Joshua Ashton b168a9278b tests: Fix missing hresult check in RTV descriptor copy test
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-15 17:54:54 +01:00
Joshua Ashton 485399ff81 tests: Fix -Wenum-conversion warnings
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-15 17:54:54 +01:00
Joshua Ashton 3bd5ba0681 tests: Fix -Wunused-variable warnings
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-15 17:54:54 +01:00
Joshua Ashton 2e1a5e75ac tests: Fix -Wabsolute-value warnings
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-15 17:54:54 +01:00
Joshua Ashton b91568b717 tests: Fix -Wdeclaration-after-statement warnings
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-15 17:54:54 +01:00
Joshua Ashton b6444b4728 tests: Fix -Wunused-function warnings
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-15 17:54:54 +01:00
Joshua Ashton 9b2841b50f tests: Fix -Wsign-compare warnings
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-15 17:54:54 +01:00
Joshua Ashton f32a2d5c70 tests: Fix -Wmissing-brace warnings
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-15 17:54:54 +01:00
Joshua Ashton 9953928379 meta: Remove autotools elements from gitignore
Additionally, fixes detecting packages and other build directories.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-15 17:54:54 +01:00
Philip Rebohle ba632148d7 vkd3d: Add new functions to create and destroy resources.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 17:04:52 +01:00
Philip Rebohle fee47ef695 vkd3d: Introduce d3d12_resource_validate_create_info.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 17:04:52 +01:00
Philip Rebohle 22f61611d1 vkd3d: Add d3d12_heap_2.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 17:04:52 +01:00
Philip Rebohle 229273fb3b vkd3d: Add memory allocator instance to device.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 17:04:52 +01:00
Philip Rebohle 6e1867b001 vkd3d: Add some more debug output to memory allocation functions.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 17:04:52 +01:00
Philip Rebohle 5e54c1fc5d vkd3d: Register allocation cookie for descriptor debugging.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 17:04:52 +01:00
Philip Rebohle 8f6e94dc30 vkd3d: Suballocate small allocations from larger chunks.
This is necessary to keep the amount of allocated memory manageable
in games that allocate a lot of small heaps or committed resources.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 16:38:16 +01:00
Georg Lehmann eaab2388b1 vkd3d: Fix warning with vkd3d_atomic_ptr*.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-02-15 15:47:17 +01:00
Philip Rebohle d65363b6b6 vkd3d: Add VA map to memory allocator.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 15:19:11 +01:00
Philip Rebohle 7c017c1dba vkd3d: Add VA->resource map and new VA allocator.
This is designed to work with actual device addresses if supported by
the Vulkan implementation.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 15:19:11 +01:00
Philip Rebohle f536daaacb vkd3d: Introduce new memory allocation functions.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 15:19:11 +01:00
Philip Rebohle 417b3b746e vkd3d: Introduce vkd3d_allocate_cookie.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 14:04:16 +01:00
Joshua Ashton 344f75aafd build: Enable --quiet on glslangValidator
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-15 11:25:51 +01:00
Joshua Ashton 00c8d1df9d vkd3d: Refactor vkd3d_physical_device_info_init
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-15 11:25:21 +01:00
Hans-Kristian Arntzen 22f052c366 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-12 14:28:49 +01:00
Joshua Ashton fb024f493f build: Add GCC problem matcher to build tests
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-12 13:41:34 +01:00
Joshua Ashton 7bb8346553 tests: Add test for Variable Rate Shading TIER_1
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-12 13:39:05 +01:00
Joshua Ashton c0d4ead8ca vkd3d: Implement TIER_1 variable rate shading
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-12 13:39:05 +01:00
Joshua Ashton fdf3d30792 build: Add Github Actions workflows
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-12 10:57:57 +01:00
Joshua Ashton 8e64da0eee build: Add debug build option to package-release.sh
Co-authored-by: David McCloskey <davmcclo@gmail.com>
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-11 16:42:02 +00:00
Philip Rebohle 7549d70fbf vkd3d: Fix compiler errors when using vkd3d_atomic_ptr_store_explicit.
Atomic stores do not return anything, so we cannot cast to void* here.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-11 15:15:59 +00:00
Joshua Ashton fccbd3b5e2 vkd3d: Eliminate wchar_size, use UTF-16 string literals
Achieves this with C standard stuff alone, and no compiler hacks.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-09 11:26:28 +01:00
Joshua Ashton 38d2de9f4c vkd3d: Fix warning in query logging
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-08 16:36:15 +01:00
Hans-Kristian Arntzen 6e9bd28481 tests: Test more raytracing PSO details.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen c558c8f423 vkd3d: Implement Get*StackSize().
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen 13b737214b vkd3d: Remove owned root signatures.
Apparently the docs are lying and RTPSO does not hold references to the
root signatures after all.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen 13af141e84 common: Add truncated wide export strcmp.
Needed for GetShaderStackSize().

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen b41d01d580 tests: Verify refcount semantics for ID3D12StateObjectProperties.
The refcount is shared.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen bfe9a39c3b vkd3d: Implement the basics of RTPSO.
Implement enough that the test case compiles correctly.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen 859066cd9b vkd3d-shader: Add ray-tracing pipeline support to DXIL.
Also updates relevant submodules.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen d8d1e82024 vkd3d-shader: Refactor DXIL resource remapping.
Prepare for local root signatures.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen 1784351dcf vkd3d-shader: Move root parameter structs to vkd3d-shader.
Need it here since local root signatures need to know
the physical layout of the record buffer up front.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen fdcf583cbc vkd3d: Rename COUNTER buffer to AUX_BUFFER.
We will use the same pointer buffer to handle acceleration structures,
so unify this buffer under a new name. Simplifies some of the binding
code since SRV path and UAV path look more similar now.

Only difference is that UAV path uses BDA -> uint32_t,
and SRV uses BDA -> RTAccelerationStructure.

RT requires BDA, so the fallback descriptor set (storage texel buffer) is never used for RT.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen f3becc21a4 vkd3d: Implement local root signatures.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen 9b856ed124 vkd3d: Add entry points for VK_KHR_ray_tracing_pipeline.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen 86da5d9bad common: Add string utilities for dealing with entry point conventions.
Used across both vkd3d-shader and vkd3d, so makes sense to move this to
common code.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen 4957d561dc vkd3d: Add dummy entry to app overrides.
Empty array declaration is not legal C.
Fixes compilation error on MSVC.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
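The dummy-entry trick from the commit above can be illustrated as follows. The struct, table, and function names are hypothetical; the point is only that a sentinel element keeps the initializer list non-empty (an empty one is not valid in standard C before C23) while also serving as a loop terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical override entry; not the vkd3d structure. */
struct example_app_override
{
    const char *name;
    unsigned int flags;
};

static const struct example_app_override example_overrides[] =
{
    /* Real entries would go here. The sentinel below means the array
     * always has at least one element, even when there are none. */
    { NULL, 0 },
};

/* Returns the flags for app_name, or 0 if there is no override. */
unsigned int example_lookup_override(const char *app_name)
{
    size_t i;

    assert(app_name != NULL);

    for (i = 0; example_overrides[i].name; i++)
    {
        if (!strcmp(example_overrides[i].name, app_name))
            return example_overrides[i].flags;
    }
    return 0;
}
```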
Hans-Kristian Arntzen 547867d505 tests: Make raytracing test robust against stubbed implementation.
Don't crash if some things are not implemented fully.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Joshua Ashton 51bf939743 vkd3d: Implement DXGI_FORMAT_B4G4R4A4_UNORM
Uses VK_EXT_4444_formats.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-04 12:04:10 +01:00
Philip Rebohle 00872471eb vkd3d: Set WriteBufferImmediateSupportFlags properly.
We do not support bundles, but advertising WriteBufferImmediate
support for bundles is required for Feature Level 12_2.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-01 11:09:56 +01:00
Philip Rebohle 2560c76861 vkd3d: Disable accelerationStructureCaptureReplay feature.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-28 18:55:52 +01:00
Philip Rebohle b4bc92714a vkd3d: Always align scratch buffer for query data to 8 bytes.
Fixes a validation error. With VK_QUERY_RESULT_64_BIT we need
to use 8-byte alignment, but ssbo_alignment may be less.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-26 21:04:11 +01:00
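In other words, the scratch alignment has to be the larger of the storage buffer alignment and 8 bytes. A small sketch with hypothetical helper names, assuming power-of-two alignments:

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit query results need 8-byte alignment, so never go below that
 * even if the device reports a smaller storage buffer alignment. */
uint64_t example_query_scratch_alignment(uint64_t ssbo_alignment)
{
    const uint64_t min_alignment = sizeof(uint64_t);

    return ssbo_alignment > min_alignment ? ssbo_alignment : min_alignment;
}

/* Rounds offset up to a power-of-two alignment. */
uint64_t example_align_up(uint64_t offset, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    return (offset + alignment - 1) & ~(alignment - 1);
}
```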
Hans-Kristian Arntzen 2bc9dc7909 vkd3d: Add FL override for 12.2 (DX12 Ultimate).
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-26 15:00:15 +01:00
Hans-Kristian Arntzen dd2a963ae7 idl: Add D3D_FEATURE_LEVEL_12_2.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-26 15:00:15 +01:00
Hans-Kristian Arntzen 9893b7f52c vkd3d: Enable SM 6.3.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-26 15:00:15 +01:00
Hans-Kristian Arntzen 31fa512512 vkd3d: Add checks for RayTracing tier.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-26 15:00:15 +01:00
Hans-Kristian Arntzen c8f8b24674 vkd3d: Enable ray tracing extensions.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-26 15:00:15 +01:00
Hans-Kristian Arntzen e89c286075 vkd3d: Report OPTIONS7 features.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-26 15:00:15 +01:00
Georg Lehmann c76f37d41c vkd3d: Introduce VKD3D_FILTER_DEVICE_NAME.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-01-25 15:29:34 +01:00
Hans-Kristian Arntzen 326d1cde60 vkd3d-shader: Remove DXIL being optional.
We always build with DXIL, not using autotools anymore.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-25 14:03:37 +01:00
Philip Rebohle c5958d36bc tests: Add test to stress-test virtual query implementation.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Philip Rebohle c6095e740d vkd3d: Do not create query pool for inline query types.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Philip Rebohle 7b524590ab vkd3d: Introduce d3d12_query_heap_type_is_inline.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Philip Rebohle 098ad5c071 vkd3d: Remove disable_query_optimization workaround.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Philip Rebohle 7ea11ededb vkd3d: Use virtual queries for transform feedback queries as well.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Philip Rebohle c9525cf5ca vkd3d: Allocate new virtual query for active queries as necessary.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Philip Rebohle b995780de1 vkd3d: Reimplement binary occlusion query resolve.
No longer requires BDA support since it's easier now to work
around buffer alignment issues.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Philip Rebohle 5c550b5cda vkd3d: Rewrite binary occlusion query resolve shader.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Philip Rebohle 32f7ba6630 vkd3d: Use virtual queries for inline query types.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Philip Rebohle 4a558ce501 vkd3d: Compute query stride from heap type rather than query type.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Philip Rebohle abc204cea4 vkd3d: Create buffer for query heap as necessary.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Philip Rebohle 39c7f8f32d vkd3d: Introduce pending query list.
This will store the list of queries to resolve.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Philip Rebohle 6e3a7d37cc vkd3d: Store more information in active query list.
Allows us to map D3D12 queries to virtual queries and vice versa.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Philip Rebohle 59acbfeb41 vkd3d: Add query resolve pipelines to meta ops.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Philip Rebohle cc8fb3ae1c vkd3d: Add query resolve shader.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Philip Rebohle 16f5cff061 vkd3d: Implement virtual query allocation.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00
Hans-Kristian Arntzen 634d8fd0fa dxil-spirv: Update submodule.
Fixes HZD SSR regression.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-15 13:47:04 +01:00
Hans-Kristian Arntzen 6e50aaf11f tests: Modify typed_as_untyped test to test copies.
Verifies that copying multiple descriptors works as expected.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-14 15:34:20 +01:00
Hans-Kristian Arntzen a531ee5fd4 vkd3d: Remove force_bindless_texel_buffer workaround.
Obsolete now that we fully split typed and untyped buffer descriptors.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-14 15:34:20 +01:00
Hans-Kristian Arntzen 57f2124721 tests: Remove todo on typed_as_untyped test.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-14 15:34:20 +01:00
Hans-Kristian Arntzen 97e0d8e751 vkd3d: Move bindless SSBO out of MUTABLE set and fill both descriptors.
We will need separate descriptor sets to be able to handle typed vs
untyped buffer workarounds.

Also writes multiple descriptors for buffer views to make sure MUTABLE
and SSBO sets are filled (or TEXEL_BUFFER + SSBO for non-mutable).

Applications often get this wrong and use raw buffer in shader where
typed view was written and vice versa.
To mitigate this, just write a typed and untyped view together.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-14 15:34:20 +01:00
Philip Rebohle 6bddcb4352 vkd3d: Store both byte range and element range in offset buffer.
The first range will store the byte offset, the second one will
be the typed buffer range. Typed descriptors should write both.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Co-authored-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-14 15:34:20 +01:00
Hans-Kristian Arntzen dbbde3c6f1 vkd3d: Remove VKD3D_DESCRIPTOR_FLAG_DEFINED.
This is redundant now since this information is carried by set_info_mask.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-14 15:34:20 +01:00
Hans-Kristian Arntzen 1bddaa0fff vkd3d: Allow a heap binding to cover multiple descriptors.
This begins the refactor toward letting us use both texel buffer and
SSBO descriptors for typed buffers, which is a better workaround than
force_bindless_texel_buffers.

In this new approach, we store a mask in metadata instead of
set/binding.

When copying a descriptor, we will iterate over the masks and look up
binding directly from device->bindless_state.set_info[].

The mask is represented in terms of info index rather than set index to
avoid needless lookups. Add some new helpers to make this process
easier.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-14 15:34:20 +01:00
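
The mask-based lookup described in this commit can be modelled roughly as follows. This is a minimal sketch and not the vkd3d-proton code: `struct set_info`, `MAX_SET_INFOS` and `collect_bindings` are hypothetical names. Only the idea of walking a mask expressed in info indices and indexing a `set_info[]` table comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SET_INFOS 8u /* hypothetical table size, chosen for the example */

/* Hypothetical stand-in for one entry of bindless_state.set_info[]. */
struct set_info
{
    uint32_t set_index;
    uint32_t binding_index;
};

/* Visits every info index whose bit is set in mask and copies the matching
 * table entry to out. Returns the number of entries written. */
static size_t collect_bindings(const struct set_info *infos, uint32_t mask,
        struct set_info *out)
{
    size_t count = 0;
    uint32_t info_index;

    assert((mask >> MAX_SET_INFOS) == 0); /* no bits beyond the table */

    for (info_index = 0; info_index < MAX_SET_INFOS; info_index++)
    {
        if (mask & (1u << info_index))
            out[count++] = infos[info_index];
    }
    return count;
}
```

Because the mask is indexed by info index, each set bit maps to a table entry with a single array access and no search over set indices.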
Philip Rebohle f25df5b453 vkd3d: Reset inline queries in BeginQuery.
We currently never reset occlusion queries. For some reason,
validation layers do not report this.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-14 13:38:21 +01:00
Henri Verbeet c42f4d11e2 vkd3d-shader: Decorate "precise" arithmetic instructions with SpvDecorationNoContraction.
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
2021-01-12 15:22:11 +01:00
Philip Rebohle 29e3d292ae tests: Mark sparse depth image test as TODO on RADV.
Currently, RADV does not support sparse depth images.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-11 14:46:07 +01:00
Philip Rebohle 037efbdcda vkd3d: Add mapping for PACK16 formats.
Dirt 5 fails with an error message otherwise.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-08 18:37:26 +01:00
Hans-Kristian Arntzen d003424bc8 meta: Bump Meson build version to 2.1.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-14 12:12:12 +01:00
Hans-Kristian Arntzen 793fce068e meta: Slight modification to CHANGELOG.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-12 16:16:59 +01:00
Philip Rebohle a3d21494f7 vkd3d: Enable query workaround for AC:Valhalla.
Fixes #458.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-12-12 13:01:52 +01:00
Philip Rebohle b8c96d9b30 vkd3d: Add workaround to disable occlusion query optimization.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-12-12 13:01:52 +01:00
Hans-Kristian Arntzen e99a2c9da7 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-12 12:19:10 +01:00
Hans-Kristian Arntzen 49ed5beb63 meta: Add Cyberpunk 2077 to supported list with (huge) caveats.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-11 11:22:45 +01:00
Hans-Kristian Arntzen 9cbd1b2a0d vkd3d: Add Cyberpunk2077.exe to workaround detection.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-11 11:22:45 +01:00
Hans-Kristian Arntzen c2f1596b3e tests: Add test for reading typed R32 buffer as untyped.
Invokes undefined behavior that many games rely on by accident.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-11 11:20:55 +01:00
Philip Rebohle 946bcd7922 vkd3d: Do not store counter address in descriptor.
Unnecessary because the UAV counter buffer is a host memory
allocation anyway in case of host-only descriptor heaps, so
we will not read from uncached memory.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-12-10 18:14:16 +01:00
Hans-Kristian Arntzen 0cc374e0f8 meta: Add changelog for 2.1.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-09 14:56:24 +01:00
Hans-Kristian Arntzen 8797e15ddd meta: Add 2.0 change log as a file.
Makes it possible to review change logs with Git going forward.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-09 14:56:24 +01:00
Hans-Kristian Arntzen 193abc395b README: Document how to use VKD3D_DESCRIPTOR_QA_LOG.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-09 14:51:53 +01:00
Hans-Kristian Arntzen 22a907e11a vkd3d: Add descriptor QA logging.
When reading GPU hang dumps, we can figure out what happened to
descriptor types along the way.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-09 14:51:53 +01:00
Philip Rebohle 1d9f28b25f vkd3d: Add fast path for mutable descriptor copies.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-12-09 14:31:22 +01:00
Philip Rebohle 7d40d8a22e vkd3d: Rework descriptor copies to copy ranges.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-12-09 14:31:22 +01:00
Hans-Kristian Arntzen e2185df7de tests: Remove is_bug for MSAA clear test.
Fixed on Mesa master now (FMASK bug).

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-07 20:56:48 +01:00
Hans-Kristian Arntzen a888d81422 vkd3d: Fix embarrassing enum bug.
Caused crash when using a driver that did not support
mutable_descriptor_type.
Was using the wrong enum bitfields ... Sigh, type safe enums would be nice.
Regression caused during refactor in review most likely.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-07 20:55:56 +01:00
Hans-Kristian Arntzen 051ba691be vkd3d: Clarify comment about not using MEMORY_READ/WRITE.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-07 20:45:02 +01:00
Philip Rebohle c057e881dc vkd3d: Do not interrupt render pass for occlusion queries.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-12-07 15:18:12 +01:00
Hans-Kristian Arntzen 7711b9ba1a README: Mention VK_VALVE_mutable_descriptor_type as a key extension.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-07 15:17:08 +01:00
Hans-Kristian Arntzen aa21d2d03d vkd3d: Add support for VK_VALVE_mutable_descriptor_type.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-07 15:17:08 +01:00
Hans-Kristian Arntzen 76a7eb7c57 vulkan-headers: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-07 15:17:08 +01:00
Hans-Kristian Arntzen 4fa24bb4ee tests: Remove old is_bug for conditional rendering.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-07 14:10:51 +01:00
Hans-Kristian Arntzen 8fb88855e5 vkd3d: Hash buffers and views based on format, not vk_format.
The creation infos use the format, which potentially contains other
information as well.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-05 15:09:39 +01:00
Hans-Kristian Arntzen 6b363e53d2 vkd3d: Actually compare against hashmap entry and not against itself.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-05 15:09:39 +01:00
Hans-Kristian Arntzen e6961afca6 vkd3d-shader: Emit typed format for UAVs which use atomics.
Mesa will assert if not, and the format must be known here.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-04 16:08:26 +01:00
Philip Rebohle c4fbe47106 vkd3d: Do not interrupt render pass for timestamp queries.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-12-03 16:54:35 +01:00
Philip Rebohle e13d69ad27 vkd3d: Batch query pool reset commands if possible.
By resetting query pools in advance, we can reduce the number of
stalls between draw calls in passes with occlusion queries, which
is currently causing serious performance issues in some games.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-12-03 16:21:43 +01:00
Philip Rebohle 648e41716b vkd3d: Add additional command buffer to batch initialization commands.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-12-03 16:21:43 +01:00
Philip Rebohle d0fc57413e vkd3d: Merge adjacent query ranges on insertion.
Since we'll be inserting lots of single queries, we want to
avoid having to resize the range array since that is an O(n)
operation at worst.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-12-03 16:21:43 +01:00
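
As an illustration of merging on insertion, the sketch below extends an existing range whenever a new range touches it, so a run of single-query inserts stays a single entry. It is an assumption-laden model, not the code from this commit: `struct query_range`, `struct query_range_list`, `MAX_QUERY_RANGES` and `query_range_list_add` are hypothetical, and a fixed-size array stands in for the resizable one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_QUERY_RANGES 16u /* hypothetical fixed capacity for the example */

struct query_range
{
    uint32_t index; /* first query in the range */
    uint32_t count; /* number of consecutive queries */
};

struct query_range_list
{
    struct query_range ranges[MAX_QUERY_RANGES];
    size_t count;
};

/* Adds [index, index + count) to the list. If the new range directly follows
 * or precedes an existing one, that entry is grown instead of appending.
 * Two existing entries bridged by the new range are not fused here.
 * Returns 0 on success, -1 if the list is full. */
static int query_range_list_add(struct query_range_list *list,
        uint32_t index, uint32_t count)
{
    size_t i;

    assert(count > 0);

    for (i = 0; i < list->count; i++)
    {
        struct query_range *range = &list->ranges[i];

        if (range->index + range->count == index)
        {
            range->count += count; /* new range continues this one */
            return 0;
        }
        if (index + count == range->index)
        {
            range->index = index; /* new range ends where this one starts */
            range->count += count;
            return 0;
        }
    }

    if (list->count == MAX_QUERY_RANGES)
        return -1;

    list->ranges[list->count].index = index;
    list->ranges[list->count].count = count;
    list->count++;
    return 0;
}
```

With this scheme, inserting queries 0, 1 and 2 one at a time leaves one range of three queries, which is the kind of range a single `vkCmdResetQueryPool` call can cover.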
Philip Rebohle 81e6449f67 vkd3d: Add code to track query ranges used within a command list.
Useful to batch vkCmdResetQueryPool calls.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-12-03 16:21:43 +01:00
Hans-Kristian Arntzen ee4508ba97 vkd3d: Fix signed vs unsigned compare warning.
UINT16 promotes to int rather than UINT here.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-03 15:05:13 +01:00
Hans-Kristian Arntzen f67f55827e vkd3d: Parse patch version of PACKAGE_NAME as well.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-03 15:05:13 +01:00
Hans-Kristian Arntzen adf0be5bf1 vkd3d: Lower contention when spinlocking writers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-02 13:12:56 +01:00
Hans-Kristian Arntzen b85a345d48 vkd3d: Fix const-ness warning on MSVC.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-02 13:12:56 +01:00
Hans-Kristian Arntzen 5f8659f4bb vkd3d: Use reader-writer spinlock in view map.
The common case is that we find an entry, so taking a writer lock should
be the rare case. We need to optimize for the case where the application
hammers the view map with e.g. buffers.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-02 13:12:56 +01:00
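
A reader-writer spinlock built only from atomic add, sub, and, and compare-exchange can look like the sketch below. It uses C11 `<stdatomic.h>` rather than vkd3d-proton's own atomic wrappers, and every name in it (`struct rw_spinlock`, `rw_spinlock_*`, the bit constants) is hypothetical. It shows the general technique, not the implementation added in this series.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RW_WRITER_BIT 1u /* set while a writer holds the lock */
#define RW_READER_INC 2u /* each reader adds this to the state */

struct rw_spinlock
{
    atomic_uint state;
};

static void rw_spinlock_init(struct rw_spinlock *lock)
{
    atomic_init(&lock->state, 0);
}

static void rw_spinlock_acquire_read(struct rw_spinlock *lock)
{
    /* Register as a reader first, then wait for any active writer to leave.
     * In the common uncontended case this is a single atomic add. */
    uint32_t state = atomic_fetch_add_explicit(&lock->state, RW_READER_INC,
            memory_order_acquire);
    while (state & RW_WRITER_BIT)
        state = atomic_load_explicit(&lock->state, memory_order_acquire);
}

static void rw_spinlock_release_read(struct rw_spinlock *lock)
{
    atomic_fetch_sub_explicit(&lock->state, RW_READER_INC, memory_order_release);
}

static void rw_spinlock_acquire_write(struct rw_spinlock *lock)
{
    /* A writer may only enter when there are no readers and no writer. */
    unsigned int expected = 0;
    while (!atomic_compare_exchange_weak_explicit(&lock->state, &expected,
            RW_WRITER_BIT, memory_order_acquire, memory_order_relaxed))
        expected = 0;
}

static void rw_spinlock_release_write(struct rw_spinlock *lock)
{
    /* Clear only the writer bit; readers that are spinning keep their count. */
    atomic_fetch_and_explicit(&lock->state, ~RW_WRITER_BIT, memory_order_release);
}
```

This design favours readers, which matches a lookup-heavy map: a steady stream of readers can starve a writer, so it is only a reasonable choice when insertions are rare.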
Hans-Kristian Arntzen b3024365d0 vkd3d: Add a reader-writer spinlock.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-02 13:12:56 +01:00
Hans-Kristian Arntzen c2c674194d vkd3d: Add Add/Sub/And atomic u32 intrinsics.
Will be used for reader-writer spinlocks.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-02 13:12:56 +01:00
Hans-Kristian Arntzen f96e60b6ac vkd3d: Make hashmap compatible with reader-writer locks.
Yield insertion when there is a match.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-02 13:12:56 +01:00
Hans-Kristian Arntzen e0382cc451 vkd3d: Add extra typeless copy usage flags after clearing them.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-27 16:27:39 +01:00
Hans-Kristian Arntzen f46756ed85 vkd3d: Report if RTV/DSV resource does not set render target usage.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-27 16:27:39 +01:00
Hans-Kristian Arntzen c38fd9bfc3 vkd3d: Bind WHOLE_SIZE when using null SSBO descriptor.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-27 13:13:39 +01:00
Philip Rebohle 8c0958824a tests: Remove todo from 64-bit predicate test.
This is supported properly now as long as the device supports
buffer_device_address.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-27 12:43:39 +01:00
Philip Rebohle 2ec68af1d5 vkd3d: Add fallback path for predication using indirect draws.
Official AMD drivers do not support VK_EXT_conditional_rendering,
so we'll use indirect draws instead to emulate the feature.

This also handles 64-bit predicates in combination with the
Vulkan extension, which was not possible previously.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-27 12:43:39 +01:00
Philip Rebohle 82d9ba1ebf vkd3d: Add meta shader to generate predicated draw/dispatch commands.
The idea is to use indirect draws and dispatches to implement
predication. For predicated indirect draws, we'll use indirect
count.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-27 12:43:39 +01:00
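
The idea of predicating through indirect arguments can be modelled on the CPU as below. This is a sketch under stated assumptions: the real work is done by a meta shader whose details are not shown in the log, and `struct draw_args`, `apply_predicate` and the `skip_when_zero` flag are hypothetical names. The assumption illustrated is that a skipped draw is expressed by writing zeroed counts into the indirect argument buffer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout of the arguments for one non-indexed indirect draw. */
struct draw_args
{
    uint32_t vertex_count;
    uint32_t instance_count;
    uint32_t first_vertex;
    uint32_t first_instance;
};

/* Returns the arguments to place in the indirect buffer. The full 64-bit
 * predicate value is tested; if the draw must be skipped, the counts are
 * zeroed so the indirect draw becomes a no-op. */
static struct draw_args apply_predicate(struct draw_args args,
        uint64_t predicate, bool skip_when_zero)
{
    bool is_zero = predicate == 0;

    if (is_zero == skip_when_zero)
    {
        args.vertex_count = 0;
        args.instance_count = 0;
    }
    return args;
}
```

Testing the whole 64-bit value in a helper like this is what lets the same path cover predicates wider than 32 bits.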
Joshua Ashton e27a153a22 vkd3d-shader: Fix saturates of fp64 types
Closes: #419

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2020-11-27 11:11:59 +01:00
Hans-Kristian Arntzen 9cd082da69 dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-25 20:22:37 +01:00
Joshua Ashton 22794c67a4 include: Add missing enum flag operator definitions
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2020-11-25 15:58:45 +01:00
Joshua Ashton 3e5a3c835a include: Fix definition of D3D12_RESOURCE_ALIASING_BARRIER
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2020-11-25 15:58:45 +01:00
Joshua Ashton 6b9f7b7339 include: Fix typo in d3d12 header
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2020-11-25 15:58:45 +01:00
Joshua Ashton fcb4764228 include: Update D3D12 headers
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2020-11-25 15:58:45 +01:00
Hans-Kristian Arntzen 1ce5ea8073 vkd3d: Fix segfault when freeing pipeline library.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-25 10:26:36 +01:00
Hans-Kristian Arntzen 8a102d6a1c dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-24 18:58:06 +01:00
Philip Rebohle 2c9bacd760 vkd3d: Perform binary occlusion query fixup on scratch buffer.
Potentially avoids some unnecessary host memory access. Use BDA for
the compute shader so that we can ignore alignment restrictions on
some GPU architectures.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-24 16:45:55 +01:00
Philip Rebohle 78076a9a84 vkd3d: Introduce d3d12_resource_get_va.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-24 16:45:55 +01:00
Philip Rebohle afb85c79cd vkd3d: Add code to create, destroy and recycle scratch buffers.
Command lists may need to allocate temporary device memory for
certain operations. In order to avoid frequent alloc/free calls,
we'll recycle these scratch buffers until a certain threshold.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-24 16:45:55 +01:00
Hans-Kristian Arntzen c0b34fdb7b tests: Add unaligned VBO read test.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-24 15:07:29 +01:00
Hans-Kristian Arntzen 19193bf932 vkd3d: Sanitize VBO strides and VBO offsets.
Realign VBO strides and offsets if we have to, for sake of
robustness. Violating these rules is against D3D12 spec, but it does not
cause crashes on native drivers. On RDNA we can hit hangs with unaligned
vertex attributes. It appears that native drivers apply some kind of
fixup here to avoid the crash, even if the result is not what we expect.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-24 15:07:29 +01:00
Hans-Kristian Arntzen 10b503c893 vkd3d: Fallback to NULL VA when binding non-existent VBO.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-24 15:07:29 +01:00
Philip Rebohle 9d57489225 vkd3d-shader: Correctly handle infinity in f32tof16.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-23 15:46:55 +01:00
Philip Rebohle ced72326be tests: Test f32tof16 behaviour with infinity and high numbers.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-23 15:46:55 +01:00
Philip Rebohle 8cbecfb9f6 vkd3d: Fix offset for predicate buffer.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-20 11:23:06 +01:00
Philip Rebohle 35f6aa22c7 tests: Remove todo for binary occlusion query test.
This test passes correctly now.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-19 22:08:42 +01:00
Philip Rebohle fb6f078ba9 vkd3d: Fix up binary occlusion query results.
In D3D12, these return 1 rather than an actual sample count.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-19 22:08:42 +01:00
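
The fixup amounts to clamping each result to 0 or 1. A CPU-side model of that step, with the hypothetical helper name `resolve_binary_occlusion` (the commit itself does this on the GPU), might look like:

```c
#include <stddef.h>
#include <stdint.h>

/* Converts raw sample counts into binary occlusion results:
 * 1 if any sample passed, 0 otherwise. */
static void resolve_binary_occlusion(const uint64_t *sample_counts,
        uint64_t *results, size_t query_count)
{
    size_t i;

    for (i = 0; i < query_count; i++)
        results[i] = sample_counts[i] != 0 ? 1 : 0;
}
```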
Philip Rebohle 89aea3304c vkd3d: Always add STORAGE_BUFFER_BIT to readback buffers.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-19 22:08:42 +01:00
Philip Rebohle fdd0dbafe4 vkd3d: Add meta compute shader to resolve binary occlusion queries.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-19 22:08:42 +01:00
Philip Rebohle 6886bb7f11 vkd3d: Handle empty viewports.
Assassin's Creed Valhalla relies on this.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-19 14:31:39 +01:00
Philip Rebohle ecc504922e vkd3d: Consider mip level for 3D UAV slice check.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-19 14:20:28 +01:00
Hans-Kristian Arntzen ffc1fa646c vkd3d: Mask out attachments which cannot safely be written to.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-19 14:13:59 +01:00
Hans-Kristian Arntzen 0dc0d75967 vkd3d: Use VK_IMAGE_LAYOUT_UNDEFINED for unused attachments.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-19 14:13:59 +01:00
Georg Lehmann 11bdc76aa0 vkd3d: Use static init for device map.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2020-11-18 18:29:48 +00:00
Georg Lehmann 24100cac07 vkd3d: Add Win32 PTHREAD_MUTEX_INITIALIZER.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2020-11-18 18:29:48 +00:00
Hans-Kristian Arntzen d0328e8760 vkd3d: Fix uninitialized variable in initial WSI transition.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-18 16:40:17 +01:00
Hans-Kristian Arntzen 9617a0f598 vkd3d: Disable RAW_VA root CBVs on NVIDIA.
BDA cannot map to their hardware, and we observe a large performance
loss in games which use root CBVs. For this reason, fall back to push
descriptors here.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-18 15:49:31 +01:00
Hans-Kristian Arntzen 52ee2edc3d vkd3d: Separate root VA use for CBV and SRV/UAV.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-18 15:49:31 +01:00
Philip Rebohle 5a288b7d0f tests: Adjust todos in some query tests.
Query init changes broke unissued timestamp queries, but
test_resolve_query_data_in_reordered_command_list passes.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-18 15:45:28 +01:00
Philip Rebohle 215989f6d5 vkd3d: Rework query pool initialization.
Ensures that queries are always available and initialized
in the correct order on the GPU timeline.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-18 15:45:28 +01:00
Philip Rebohle bb9d0f2741 vkd3d: Rework initial transitions to allow for different types.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-18 15:45:28 +01:00
Philip Rebohle 10e82fa7a0 tests: Add missing UAV barriers in test_cs_uav_store.
Fixes some random test failures.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-18 15:05:29 +01:00
Hans-Kristian Arntzen 27f91b99b0 vkd3d-shader: Add debug log callback to DXIL.
Allows us to capture dxil compiler messages in log.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-18 14:37:58 +01:00
Joshua Ashton a950191008 vkd3d: Implement singleton devices.
Matches D3D12 behaviour.

Co-authored-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-18 12:39:14 +01:00
Philip Rebohle 1563b80852 include: Fix various issues with atomic CAS.
- fail/success memory orders exist for a reason, we can't
  e.g. do release on fail since it's a read-only operation
- silence some warnings about pointer->integer casts
- fix linker errors on mingw by marking functions as static

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-18 12:39:14 +01:00
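
The point about separate success and failure orders can be shown with C11 atomics, where the failure order of a compare-exchange applies to a plain load and therefore must not be `memory_order_release` or `memory_order_acq_rel`. The sketch below uses `<stdatomic.h>` instead of the project's own header, and `try_claim_slot` is a hypothetical function made up for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Tries to change *slot from 0 (free) to owner. Declared static so that the
 * definition can live in a header without causing duplicate symbols. */
static inline bool try_claim_slot(atomic_uint *slot, uint32_t owner)
{
    unsigned int expected = 0;

    /* Success performs a read-modify-write, so acq_rel is allowed.
     * Failure only reads the current value, so it uses acquire. */
    return atomic_compare_exchange_strong_explicit(slot, &expected, owner,
            memory_order_acq_rel, memory_order_acquire);
}
```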
Hans-Kristian Arntzen f54ac3b9c5 vkd3d: Add app detection for buggy game: ds.exe.
Game renders the map with wrong descriptor type, which means we must
implement everything as texel buffers to make this work.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-18 12:27:19 +01:00
Hans-Kristian Arntzen 6f8ae20015 vkd3d: Add VKD3D_CONFIG option to disable bindless SSBO.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-18 12:27:19 +01:00
Hans-Kristian Arntzen d947c17fc2 meta: Add missing VKD3D_DEBUG level to README.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-18 12:27:19 +01:00
Philip Rebohle bab9b0af92 vkd3d: Support offset buffers for raw/structured texel buffers.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Co-authored-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-18 12:27:19 +01:00
Hans-Kristian Arntzen 3e15a3f06a vkd3d: Remove manual tracking of host barriers.
Just emit host barrier on submit unconditionally.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-17 16:34:03 +01:00
Hans-Kristian Arntzen 0f25b827e0 vkd3d: Use pipeline barrier command buffers for queue serialization.
We have observed a lot of large GPU bubbles when using back-to-back
timeline semaphores to synchronize GPU submissions. Use prebaked
pipeline barrier command buffers instead.

For sparse queue serialization, use two binary semaphore pairs. There is
no need to use timeline semaphores in this case.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-17 16:34:03 +01:00
Philip Rebohle 8fe83f5e9c vkd3d-shader: Correctly handle bit shifts greater than 31 bits.
This is undefined behaviour in SPIR-V, but well-defined in
DXBC, so we should explicitly 'and' the shift amount with 31.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-17 15:26:36 +01:00
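
The masking rule is easy to state in C, where shifting a 32-bit value by 32 or more is likewise undefined. The helpers below are an illustrative model of the DXBC semantics described in the commit (only the low 5 bits of the shift amount are used). The names `shl_dxbc` and `ushr_dxbc` are made up for this sketch and do not appear in vkd3d-shader.

```c
#include <stdint.h>

/* Left shift that uses only the low 5 bits of the shift amount. */
static uint32_t shl_dxbc(uint32_t value, uint32_t shift)
{
    return value << (shift & 31u);
}

/* Logical right shift with the same masking of the shift amount. */
static uint32_t ushr_dxbc(uint32_t value, uint32_t shift)
{
    return value >> (shift & 31u);
}
```

Under this model a shift by 32 behaves like a shift by 0, and a shift by 33 like a shift by 1.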
Hans-Kristian Arntzen 0749f46d8e vkd3d: Re-enable wave ops.
dxil-spirv update fixed the issue for me.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-17 10:49:40 +01:00
Hans-Kristian Arntzen de4293f990 vkd3d: Use SHADER_READ for CBV visibility when using ROOT_VA CBV.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-16 17:01:58 +01:00
Hans-Kristian Arntzen 30c417bdbf dxil-spirv: Update submodule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-16 16:55:55 +01:00
Joshua Ashton 4d95cafe10 vkd3d: Implement compare exchange atomics
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2020-11-16 09:33:26 +01:00
Joshua Ashton 1e810e8f9e vkd3d: Use consistent comment style in atomic header
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2020-11-16 09:33:26 +01:00
Joshua Ashton 093f0eb053 vkd3d: Implement 64-bit and pointer atomics
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2020-11-16 09:33:26 +01:00
Joshua Ashton 71328b9be7 vkd3d: Handle reserved resources in host barrier code
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2020-11-16 09:25:50 +01:00
Joshua Ashton 08135f7746 vkd3d: Fix validation spam for null descriptor buffers
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2020-11-16 09:25:17 +01:00
Hans-Kristian Arntzen 412ec7ac2f vkd3d: Enable root descriptor BDA support.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-13 17:10:48 +01:00
Hans-Kristian Arntzen a1d851e717 vkd3d-shader: Do not require Int64 to use root descriptors.
Can just use uvec2. Also improves performance on ACO since ACO cannot
promote uint64_t to SGPR yet, u32x2 however, works fine and can be
bitcast to pointer as well.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-13 17:10:48 +01:00
Hans-Kristian Arntzen 009b3a69e0 vkd3d-shader: Update dxil-spirv with BDA root descriptor support.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-13 17:10:48 +01:00
Hans-Kristian Arntzen 74a654e273 vkd3d: Disable waveops for time being.
The fix which enabled waveops detection broke HZD, since we never tested
with that feature enabled.

Keep it disabled until we can figure out what is going on.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-13 12:58:22 +01:00
Philip Rebohle 3da44beb5d vkd3d: Change USE_PUSH_DESCRIPTORS to USE_ROOT_DESCRIPTOR_SET for clarity.
USE_PUSH_DESCRIPTORS may be misleading since it would be set even when
we're not using push descriptors at all due to root descriptors being
passed in via VAs. Instead, make the flag represent whether or not we
use a regular descriptor set for root parameters.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-12 15:21:56 +01:00
Philip Rebohle baf265c666 vkd3d: Update root descriptor VAs as necessary.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-12 15:21:56 +01:00
Philip Rebohle 8999093c54 vkd3d: Add new field to store root descriptor VA.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-12 15:21:56 +01:00
Philip Rebohle 677422993e vkd3d: Add root descriptor VAs to push constant range.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-12 15:21:56 +01:00
Philip Rebohle cd01371756 vkd3d: Always enable BUFFER_DEVICE_ADDRESS usage for buffers.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-12 15:21:56 +01:00
Philip Rebohle c11b58418a vkd3d-shader: Support physical storage buffer root SRVs/UAVs.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-12 15:21:56 +01:00
Philip Rebohle 4313eaa59c vkd3d-shader: Support physical storage buffer root CBVs.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-12 15:21:56 +01:00
Philip Rebohle 5d2b0e6632 vkd3d-shader: Add loadv/storev helpers for aligned memory access.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-12 15:21:56 +01:00
Philip Rebohle 6c9d0cea69 vkd3d-shader: Rename descriptor_table_var_id -> root_parameter_var_id.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-12 15:21:56 +01:00
Philip Rebohle 4b3cec53fc vkd3d-shader: Declare push constants for root descriptor VAs.
We'll always place them at the beginning of the push constant
buffer in order to avoid potential alignment issues.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-12 15:21:56 +01:00
Philip Rebohle 2689c9e0a3 vkd3d-shader: Enable Int64 capability as necessary.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-12 15:21:56 +01:00
Philip Rebohle f69564c6c1 vkd3d-shader: Implement buffer reference type declarations.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-12 15:21:56 +01:00
Philip Rebohle b536723f5a vkd3d: Fix shader model-related feature detection.
We need to know the supported shader model to detect support
for certain features like wave ops correctly.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-11-11 10:41:11 +01:00
Joshua Ashton d4d14dfca0 vkd3d: Ignore DXGI_PRESENT_ALLOW_TEARING
Fixes warning spam in Horizon Zero Dawn.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2020-11-09 15:34:08 +01:00
Joshua Ashton 536ed0427a vkd3d: Create user buffers for degenerate surfaces
Previously this would make the user buffer count == 0, which obviously makes apps and assertions not happy.

Fixes a crash in Horizon Zero Dawn when minimized (therefore having a degenerate surface region)

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2020-11-09 15:34:08 +01:00
Joshua Ashton c77428ba44 vkd3d: Implement DXGI_PRESENT_TEST
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2020-11-09 15:34:08 +01:00
Alexander Gabello da4a96a25b vkd3d: Free string after PIX decoding
Signed-off-by: Alexander Gabello <alexandergabello@mail.weber.edu>
2020-11-09 10:55:05 +01:00
142 changed files with 108333 additions and 56886 deletions

31
.github/workflows/artifacts.yml vendored Normal file

@@ -0,0 +1,31 @@
name: Artifacts (Package)

on: [push, pull_request, workflow_dispatch]

jobs:
  build-artifacts:
    runs-on: ubuntu-20.04

    steps:
    - name: Checkout code
      id: checkout-code
      uses: actions/checkout@v2
      with:
        submodules: recursive

    - name: Build release
      id: build-release
      uses: Joshua-Ashton/arch-mingw-github-action@v8
      with:
        command: |
          export VERSION_NAME="${GITHUB_REF##*/}-${GITHUB_SHA##*/}"
          ./package-release.sh ${VERSION_NAME} build --no-package
          echo "VERSION_NAME=${VERSION_NAME}" >> $GITHUB_ENV

    - name: Upload artifacts
      id: upload-artifacts
      uses: actions/upload-artifact@v2
      with:
        name: vkd3d-proton-${{ env.VERSION_NAME }}
        path: build/vkd3d-proton-${{ env.VERSION_NAME }}
        if-no-files-found: error

75
.github/workflows/test-build-linux.yml vendored Normal file

@@ -0,0 +1,75 @@
name: Test Builds on Linux

on: [push, pull_request, workflow_dispatch]

jobs:
  build-set-linux:
    runs-on: ubuntu-20.04

    steps:
    - name: Checkout code
      id: checkout-code
      uses: actions/checkout@v2
      with:
        submodules: recursive

    - name: Setup problem matcher
      uses: Joshua-Ashton/gcc-problem-matcher@v1

    - name: Build MinGW x86
      id: build-mingw-x86
      uses: Joshua-Ashton/arch-mingw-github-action@v8
      with:
        command: |
          meson -Denable_tests=True -Denable_extras=True --cross-file=build-win32.txt --buildtype release build-mingw-x86
          ninja -C build-mingw-x86

    - name: Build MinGW x64
      id: build-mingw-x64
      uses: Joshua-Ashton/arch-mingw-github-action@v8
      with:
        command: |
          meson -Denable_tests=True -Denable_extras=True --cross-file=build-win64.txt --buildtype release build-mingw-x64
          ninja -C build-mingw-x64

    - name: Build Native GCC x86
      id: build-native-gcc-x86
      uses: Joshua-Ashton/arch-mingw-github-action@v8
      with:
        command: |
          export CC="gcc -m32"
          export CXX="g++ -m32"
          export PKG_CONFIG_PATH="/usr/lib32/pkgconfig:/usr/lib/i386-linux-gnu/pkgconfig:/usr/lib/pkgconfig"
          meson -Denable_tests=True -Denable_extras=True --buildtype release build-native-gcc-x86
          ninja -C build-native-gcc-x86

    - name: Build Native GCC x64
      id: build-native-gcc-x64
      uses: Joshua-Ashton/arch-mingw-github-action@v8
      with:
        command: |
          export CC="gcc"
          export CXX="g++"
          meson -Denable_tests=True -Denable_extras=True --buildtype release build-native-gcc-x64
          ninja -C build-native-gcc-x64

    - name: Build Native Clang x86
      id: build-native-clang-x86
      uses: Joshua-Ashton/arch-mingw-github-action@v8
      with:
        command: |
          export CC="clang -m32"
          export CXX="clang++ -m32"
          export PKG_CONFIG_PATH="/usr/lib32/pkgconfig:/usr/lib/i386-linux-gnu/pkgconfig:/usr/lib/pkgconfig"
          meson -Denable_tests=True -Denable_extras=True --buildtype release build-native-clang-x86
          ninja -C build-native-clang-x86

    - name: Build Native Clang x64
      id: build-native-clang-x64
      uses: Joshua-Ashton/arch-mingw-github-action@v8
      with:
        command: |
          export CC="clang"
          export CXX="clang++"
          meson -Denable_tests=True -Denable_extras=True --buildtype release build-native-clang-x64
          ninja -C build-native-clang-x64


@ -0,0 +1,53 @@
name: Test Builds on Windows
on: [push, pull_request, workflow_dispatch]
jobs:
build-set-windows:
runs-on: windows-2022
steps:
- name: Checkout code
id: checkout-code
uses: actions/checkout@v2
with:
submodules: recursive
- name: Setup widl and glslangValidator
shell: pwsh
run: |
choco install strawberryperl vulkan-sdk -y
Write-Output "C:\Strawberry\c\bin" | Out-File -FilePath "${Env:GITHUB_PATH}" -Append
Write-Output "$([System.Environment]::GetEnvironmentVariable('VULKAN_SDK', 'Machine'))\Bin" `
| Out-File -FilePath "${Env:GITHUB_PATH}" -Append
- name: Setup Meson
shell: pwsh
run: pip install meson
- name: Find Visual Studio
shell: pwsh
run: |
$installationPath = Get-VSSetupInstance `
| Select-VSSetupInstance -Require Microsoft.VisualStudio.Workload.NativeDesktop -Latest `
| Select-Object -ExpandProperty InstallationPath
Write-Output "VSDEVCMD=${installationPath}\Common7\Tools\VsDevCmd.bat" `
| Out-File -FilePath "${Env:GITHUB_ENV}" -Append
- name: Build MSVC x86
shell: pwsh
run: |
& "${Env:COMSPEC}" /s /c "`"${Env:VSDEVCMD}`" -arch=x86 -host_arch=x64 -no_logo && set" `
| % { , ($_ -Split '=', 2) } `
| % { [System.Environment]::SetEnvironmentVariable($_[0], $_[1]) }
meson -Denable_tests=True -Denable_extras=True --buildtype release --backend vs2022 build-msvc-x86
msbuild -m build-msvc-x86/vkd3d-proton.sln
- name: Build MSVC x64
shell: pwsh
run: |
& "${Env:COMSPEC}" /s /c "`"${Env:VSDEVCMD}`" -arch=x64 -host_arch=x64 -no_logo && set" `
| % { , ($_ -Split '=', 2) } `
| % { [System.Environment]::SetEnvironmentVariable($_[0], $_[1]) }
meson -Denable_tests=True -Denable_extras=True --buildtype release --backend vs2022 build-msvc-x64
msbuild -m build-msvc-x64/vkd3d-proton.sln

31
.gitignore vendored

@ -1,28 +1,5 @@
aclocal.m4
autom4te.cache
config.log
config.status
configure
libtool
Makefile
Makefile.in
test-suite.log
vkd3d-compiler
vkd3d-*.tar.xz
*.exe
*.la
*.lo
*.log
*.o
*.pc
*.trs
*~
.deps
.dirstamp
.libs
build
build.native
build.cross
build.*
vkd3d-proton-*.tar.zst
vkd3d-proton-*/


@ -1,9 +0,0 @@
vkd3d:
variables:
GIT_SUBMODULE_STRATEGY: recursive
script:
- ./package-release.sh release build --no-package
artifacts:
name: "vkd3d-${CI_COMMIT_REF_NAME}.${CI_COMMIT_SHA}"
paths:
- build/vkd3d-release

4
.mailmap Normal file

@ -0,0 +1,4 @@
Conor McCarthy <cmccarthy@codeweavers.com>
Ivan Fedorov <ifedorov@nvidia.com>
James Beddek <telans@protonmail.com>
Roshan Chaudhari <rochaudhari@nvidia.com>

35
AUTHORS

@ -1,7 +1,34 @@
Alexander Gabello
Alexandre Julliard
Andrew Eikum
Arkadiusz Hiler
Biswapriyo Nath
Chip Davis
Henri Verbeet
Józef Kucia
Sven Hesse
Conor McCarthy
Danylo Piliaiev
David Gow
David McCloskey
Derek Lesho
Fabian Bornschein
Georg Lehmann
Hans-Kristian Arntzen
Philip Rebohle
Henri Verbeet
Ivan Fedorov
Jactry Zeng
James Beddek
Jens Peters
Joshua Ashton
Józef Kucia
Juuso Alasuutari
Krzysztof Bogacki
Paul Gofman
Philip Rebohle
Rémi Bernon
Robin Kertels
Rodrigo Locatti
Roshan Chaudhari
Samuel Pitoiset
Sveinar Søpler
Sven Hesse
Thomas Crider
Zhiyi Zhang

388
CHANGELOG.md Normal file

@ -0,0 +1,388 @@
# Change Log
## 2.6
It has been a long while since 2.5, and this release rolls up a lot of fixes, features and optimizations.
### Fixes
- Fix black screen rendering bug in Horizon Zero Dawn after latest game updates.
- Fix crashes on startup in Final Fantasy VII: Remake and Warframe.
- Fix crashes in Guardians of the Galaxy when interacting with certain game objects.
- Fix hang on game shutdown in Elden Ring.
- Fix broken geometry rendering in Age of Empires: IV.
### Optimization
- Improve generated shader code for vectorized load-store operations in DXIL.
- Greatly reduce CPU overhead for descriptor copy operations,
which is a key contributor to CPU overhead in D3D12.
### Features
#### Pipeline library rewrite
Support D3D12 pipeline libraries better where we can now also cache
generated SPIR-V from DXBC/DXIL.
Massively reduces subsequent load times in Monster Hunter: Rise,
and helps other titles like Guardians of the Galaxy and Elden Ring.
Also lays the groundwork for internal driver caches down the line for games which do not use this API.
Also, deduplicates binary blobs for reduced disk size requirements.
#### Shader models
Shader model 6.6 is now fully implemented. This includes support for:
- ResourceDescriptorHeap[] direct access
- 64-bit atomics
- IsHelperLane()
- Compute shader derivatives
- WaveSize attribute
- Packed math intrinsics
#### Minor features
- Handle API feature MinResourceLODClamp correctly if `VK_EXT_image_view_min_lod` is supported.
- Expose CastFullyTypedFormat feature.
- Expose some advanced shader features on Intel related to UAV formats (`VK_KHR_format_feature_flags2`).
- Support COLOR -> STENCIL copies.
### Workarounds
- Workaround DEATHLOOP not emitting synchronization commands correctly. Fixes menu flicker on RADV.
- Workaround quirky API usage in Elden Ring. Removes many kinds of stutter and chug when traversing the scenery.
- Workaround certain environments failing to create Vulkan device if some `VK_NVX_*` extensions are enabled.
- Workaround glitched foliage rendering in Horizon Zero Dawn after latest game updates.
- Workaround some questionable UE4 shaders causing glitched rendering on RADV.
### Note on future Vulkan driver requirements
2.6 is expected to be the last vkd3d-proton release before we require some newer Vulkan extensions.
`VK_KHR_dynamic_rendering` and `VK_EXT_extended_dynamic_state`
(and likely `dynamic_state_2` as well) will be required.
`VK_KHR_dynamic_rendering` in particular requires up-to-date drivers and the legacy render pass path
will be abandoned in favor of it. Supporting both paths at the same time is not practical.
Moving to `VK_KHR_dynamic_rendering` allows us to fix some critical flaws with the legacy API
which caused potential shader compilation stutters and extra CPU overhead.
## 2.5
This is a release with a little bit of everything!
### Features
#### DXR progress
DXR has seen significant work in the background.
- DXR 1.1 is now experimentally exposed. It can be enabled with `VKD3D_CONFIG=dxr11`.
Note that DXR 1.1 cannot be fully implemented in `VK_KHR_ray_tracing`'s current form, in particular
DispatchRays() indirect is not compatible yet,
although we have not observed a game which requires this API feature.
- DXR 1.1 inline raytracing support is fully implemented.
- DXR 1.0 support is more or less feature complete.
Some weird edge cases remain, but will likely not be implemented unless required by a game.
`VKD3D_CONFIG=dxr` will eventually be dropped when it matures.
Some new DXR games are starting to come alive, especially with DXR 1.1 enabled,
but there are significant bugs as well that we currently cannot easily debug.
Some experimental results on NVIDIA:
- **Control** - already worked
- **DEATHLOOP** - appears to work correctly
- **Cyberpunk 2077** - DXR can be enabled, but GPU timeouts
- **World of Warcraft** - according to a user, it works, but we have not confirmed ourselves
- **Metro Exodus: Enhanced Edition** -
gets ingame and appears to work? Not sure if it looks correct.
Heavy CPU stutter for some reason ...
- **Metro Exodus** (original release) - GPU timeouts when enabling DXR
- **Resident Evil: Village** - Appears to work, but the visual difference is subtle.
It's worth experimenting with these and others.
DXR is incredibly complicated, so expect bugs.
From here, DXR support is mostly a case of stamping out issues one by one.
#### NVIDIA DLSS
NVIDIA contributed integration APIs in vkd3d-proton which enables DLSS support in D3D12 titles in Proton.
See Proton documentation for how to enable NvAPI support.
#### Shader models
A fair bit of work went into DXIL translation support to catch up with native drivers.
- Shader model 6.5 is exposed.
Shader model 6.6 should be straightforward once that becomes relevant.
- Shader model 6.4 implementation takes advantage of `VK_KHR_shader_integer_dot_product` when supported.
- Proper fallback for FP16 math on GPUs which do not expose native FP16 support (Polaris, Pascal).
Notably fixes AMD FSR shaders in Resident Evil: Village (and others).
- Shader model 6.1 SV_Barycentric support implemented (NVIDIA only for now).
- Support shader model 6.2 FP32 denorm control.
### Performance
Resizable BAR can improve GPU performance by about 10-15% in the best case, depending heavily on the game.
Horizon Zero Dawn and Death Stranding in particular improve massively with this change.
By default, vkd3d-proton will now take advantage of PCI-e BAR memory types through heuristics
as D3D12 does not expose direct support for resizable BAR, and native D3D12 drivers are known to use heuristics as well.
Without resizable BAR enabled in BIOS/vBIOS, we only get 256 MiB which can help performance,
but many games will improve performance even more
when we are allowed to use more than that.
There is an upper limit for how much VRAM is dedicated to this purpose.
We also added `VKD3D_CONFIG=no_upload_hvv` to disable all uses of PCI-e BAR memory.
Other performance improvements:
- Avoid redundant descriptor update work in certain scenarios (NVIDIA contribution).
- Minor tweaks here and there to reduce CPU overhead.
### Fixes and workarounds
- Fix behavior for swap chain presentation latency HANDLE. Fixes spurious deadlocks in some cases.
- Fix many issues related to depth-stencil handling, which fixed various issues in DEATHLOOP, F1 2021, WRC 10.
- Fix DIRT 5 rendering issues and crashes. Should be fully playable now.
- Fix some Diablo II Resurrected rendering issues.
- Workaround shader bugs in Psychonauts 2.
- Workaround some Unreal Engine 4 shader bugs which multiple titles trigger.
- Fix some stability issues when VRAM is exhausted on NVIDIA.
- Fix CPU crash in boot-up sequence of Far Cry 6 (game is still kinda buggy though, but gets in-game).
- Fix various bugs with host visible images. Fixes DEATHLOOP.
- Fix various DXIL conversion bugs.
- Add Invariant geometry workarounds for specific games which require it.
- Fix how d3d12.dll exports symbols to be more in line with MSVC.
- Fix some edge cases in bitfield instructions.
- Work around extreme CPU memory bloat on the specific NVIDIA driver versions which had this bug.
- Fix regression in Evil Genius 2: World Domination.
- Fix crashes in Hitman 3.
- Fix terrain rendering in Anno 1800.
- Various correctness and crash fixes.
## 2.4
This is a release which focuses on performance and bug-fixes.
### Performance
- Improve swapchain latency and frame pacing by up to one frame.
- Optimize lookup of format info.
- Avoid potential pipeline compilation stutter in certain scenarios.
- Rewrite how we handle image layouts for color and depth-stencil targets.
Allows us to remove a lot of dumb
barriers giving significant GPU-bound performance improvements.
~15%-20% GPU bound uplift in Horizon Zero Dawn,
~10% in Death Stranding,
and 5%-10% improvements in many other titles.
### Features
- Enable support for sparse 3D textures (tiled resources tier 3).
### Bug fixes and workarounds
- Various bug fixes in DXIL.
- Fix weird bug where sun would pop through walls in RE: Village.
- Workaround game bug in Cyberpunk 2077 where certain locales would render a black screen.
- Fix various bugs (in benchmark and in vkd3d-proton) allowing GravityMark to run.
- Improve robustness against certain app bugs related to NULL descriptors.
- Fix bug with constant FP64 vector handling in DXBC.
- Fix bug where Cyberpunk 2077 inventory screen could spuriously hang GPU on RADV.
- Add workaround for Necromunda: Hired Gun where character models would render random garbage on RADV.
- Fix bug in Necromunda: Hired Gun causing random screen flicker.
- Fix windowed mode tracking when leaving fullscreen. Fix Alt-Tab handling in Horizon Zero Dawn.
- Temporary workaround for SRV ResourceMinLODClamp. Fix black ground rendering in DIRT 5.
The overbright HDR rendering in DIRT 5 sadly persists however :(
- Implement fallback maximum swapchain latency correctly.
### Development features
Various features which are useful for developers were added to aid debugging.
- Descriptor QA can instrument shaders in runtime for GPU-assisted validation.
Performance is good enough (> 40 FPS) that games are actually playable in this mode.
See README for details.
- Allow forcing off CONCURRENT queue, and using EXCLUSIVE queue.
Not valid, but can be useful as a speed hack on Polaris when `single_queue` is not an option
and for testing driver behavior differences.
## 2.3.1
This is a minor bugfix release to address some issues solved shortly after the last release.
### Fixes
- Improved support for older Wine and Vulkan Loader versions.
- Fix blocky shadows in Horizon Zero Dawn.
- Fix the install script failing on Wine installs not built with upstream vkd3d.
- Fix minor dxil translation issues.
## 2.3
This release adds support for more D3D12 features and greatly improves GPU bound performance
in many scenarios.
### Features
#### Early DXR 1.0 support
`VK_KHR_raytracing` is used to enable cross-vendor ray-tracing support.
The implementation is WIP, but it is good enough to run some real content.
As of writing, only the NVIDIA driver works correctly.
It is expected AMD RDNA2 GPUs will work when working drivers are available
(amdgpu-pro 21.10 is known to not work).
Games which are expected to work include:
- Control (appears to be fully working)
- Ghostrunner (seems to work, not exhaustively tested)
To enable DXR support, `VKD3D_CONFIG=dxr %command%` should be used when launching the game.
Certain games may be unstable if DXR is enabled by default.
#### Conservative rasterization
Full support (tier 3) for conservative rasterization was added.
#### Variable rate shading
Full support (tier 2) for variable rate shading was added.
#### Command list bundles
Allows Kingdom Hearts remaster to get past the errors, unsure if game fully works yet.
#### Write Watch and APITrace
Support for `D3D12_HEAP_FLAG_ALLOW_WRITE_WATCH` has been added.
This means [APITraces](https://github.com/Joshua-Ashton/apitrace/releases) of titles can now be captured.
### Performance
- Improve GPU bound performance in RE2 by up to 20% on NVIDIA.
- Enable async compute queues. Greatly improves GPU performance and frame pacing in many titles.
Horizon Zero Dawn and Death Stranding see exceptional gains with this fix,
due to how the engines work. GPU utilization should now reach ~100%.
For best results, AMD Navi+ GPUs are recommended, but Polaris and earlier still
see great results. It is possible to disable this path, if for whatever reason
multiple queues are causing issues. See README.
- Optimize bindless constant buffer GPU-bound performance on NVIDIA if certain API code paths are used.
- Optimize sparse binding CPU overhead.
- `TRACE` logging calls are disabled by default on release builds.
### Fixes and workarounds
- Fix various DXIL bugs.
- Be more robust against broken pipeline creation API calls.
Avoids driver crashes in Forza Horizon 4.
- Workaround some buggy shaders in F1 2020.
- Fix bugs if depth bounds test is used in certain ways.
- Fix a read out-of-bounds in `UpdateTileMappings`.
- Fix `SV_ClipDistance` and `SV_CullDistance` in Hull Shaders.
## 2.2
This release is mostly a maintenance release which fixes bugs and regressions.
It also unblocks significant future feature development.
### Workaround removals
- Replace old `force_bindless_texel_buffer` workaround with
a more correct and performant implementation.
Death Stranding and Cyberpunk 2077 (and probably other games as well) do the right thing by default without the hack now.
- Remove old workaround `disable_query_optimization` for occlusion queries which was enabled for AC: Valhalla,
and is now replaced by a correct and efficient implementation.
#### Cyberpunk 2077 status
From recent testing on our end, it is unknown at this time if `VK_VALVE_mutable_descriptor_type` is still required for
Cyberpunk 2077. Manual testing hasn't been able to trigger a GPU hang.
The memory allocation rewrite in 2.2 can plausibly work around some of the bugs that `VK_VALVE_mutable_descriptor_type` fixed by accident.
The bugs in question could also have been fixed since release day, but we cannot prove this since the bug is completely random in nature.
### Regression fixes
- Fix regression in Horizon Zero Dawn for screen space reflections on water surfaces.
### Stability fixes
- Greatly improve stability on Polaris or older cards for certain titles.
Crashes which used to happen in Horizon Zero Dawn and Death Stranding seem to have disappeared
after the memory allocation rewrite.
GPU memory usage should decrease on these cards as well.
- DIRT 5 can get in-game now due to DXIL fixes, but is not yet playable.
### New features
- Add support for Variable Rate Shading tier 1.
### Future development
DXR is not yet supported, but has seen a fair bit of background work.
- Basic DXR pipelines can be created successfully.
- Memory allocation rewrite in 2.2 unblocks further DXR development.
## 2.1
This release fixes various bugs (mostly workarounds) and improves GPU-bound performance.
New games added to "expected to work" list:
- The Division (was working already in 2.0, but missing from list)
- AC: Valhalla (*)
(*): Game requires full D3D12 sparse texture support to work.
Currently only works on NVIDIA drivers.
RADV status remains unknown until support for this feature lands in Mesa.
New games added to "kinda works, but expect a lot of jank" list:
- Cyberpunk 2077 (**)
(**): Currently only runs correctly on AMD hardware with RADV and `VK_VALVE_mutable_descriptor_type`.
As of game version 1.03, this requires the latest Mesa Git build.
The game has some fatal bugs where it relies on undefined behavior with descriptor management
which this extension works around by accident.
The game will start and run on NVIDIA, but just like what happens without the extension on AMD,
the GPU will randomly hang, making the game effectively unplayable.
A game update to fix this bug would likely make the game playable on NVIDIA as well.
Game version 1.04 changed some behavior, and support for this game will likely fluctuate over time as future patches come in.
Bug fixes and workarounds:
- Fix various implementation bugs which caused AC: Valhalla to not work.
- Work around game bug in Death Stranding where accessing map could cause corrupt rendering.
(Several games appear to have the same kind of application bug.)
- Fix corrupt textures in Horizon Zero Dawn benchmark.
- Fix SM 6.0 wave-op detection for Horizon Zero Dawn and DIRT 5.
- Work around GPU hangs in certain situations where games do not use D3D12 correctly,
but native D3D12 drivers just render wrong results rather than hang the system.
- Fix invalid SPIR-V generated by FP64 code.
- Fix crash with minimized windows in certain cases.
Performance:
- ~15% GPU-bound uplift in Ghostrunner. Might help UE4 titles in general.
- Slightly improve GPU bound performance when fully GPU bound on both AMD and NVIDIA.
- Slightly improve GPU bound performance on RADV in various titles.
- Reduce multi-threaded CPU overhead for certain D3D12 API usage patterns.
- Add support for `VK_VALVE_mutable_descriptor_type` which
improves CPU overhead, memory bloat, and avoids potential memory management thrashing on RADV.
Also avoids GPU hangs in certain situations where games misuse the D3D12 API.
Misc:
- Implement `DXGI_PRESENT_TEST`.
- Fix log spam when `DXGI_PRESENT_ALLOW_TEARING` is used.
## 2.0
This initial release supports D3D12 Feature Level 12.0 and Shader Model 6.0 (DXIL).
Games expected to work include:
- Control
- Death Stranding
- Devil May Cry 5
- Ghostrunner
- Horizon Zero Dawn
- Metro Exodus
- Monster Hunter World
- Resident Evil 2 / 3
Please refer to the README for supported driver versions.


@ -1,4 +1,4 @@
Copyright 2016-2020 the vkd3d-proton project authors (see the file AUTHORS for a
Copyright 2016-2022 the vkd3d-proton project authors (see the file AUTHORS for a
complete list)
vkd3d-proton is free software; you can redistribute it and/or modify it under

249
README.md

@ -22,31 +22,36 @@ There are some hard requirements on drivers to be able to implement D3D12 in a r
- `VK_EXT_descriptor_indexing` with at least 1000000 UpdateAfterBind descriptors for all types except UniformBuffer.
Essentially all features in `VkPhysicalDeviceDescriptorIndexingFeatures` must be supported.
- `VK_KHR_timeline_semaphore`
- `VK_KHR_sampler_mirror_clamp_to_edge`
- `VK_EXT_robustness2`
- `VK_KHR_separate_depth_stencil_layouts`
- `VK_KHR_bind_memory2`
- `VK_KHR_copy_commands2`
- `VK_KHR_dynamic_rendering`
- `VK_EXT_extended_dynamic_state`
- `VK_EXT_extended_dynamic_state2`
Some notable extensions that **should** be supported for optimal or correct behavior.
These extensions will likely become mandatory later.
- `VK_EXT_robustness2`
- `VK_KHR_buffer_device_address`
- `VK_EXT_extended_dynamic_state`
- `VK_EXT_image_view_min_lod`
### AMD (RADV / ACO)
`VK_VALVE_mutable_descriptor_type` is also highly recommended, but not mandatory.
### AMD (RADV)
For AMD, RADV is the recommended driver and the one that sees most testing on AMD GPUs.
The recommendation here is to use a driver built from Git.
The minimum requirement at the moment is Mesa 22.0 since it supports `VK_KHR_dynamic_rendering`.
NOTE: For older Mesa versions, use the v2.6 release.
### NVIDIA
The [Vulkan beta drivers](https://developer.nvidia.com/vulkan-driver) generally contain the latest
driver fixes that we identify while getting games to work.
At least Linux 455.26.01 (2020-10-20) is recommended as it contains fixes for:
> Reduce host memory consumption for descriptor memory when VkDescriptorSetVariableDescriptorCountAllocateInfo is used.
> Fixed a bug in a barrier optimization that allowed some back-to-back copies to run unordered
These fixes should find their way into stable drivers eventually, but if you're having issues, test the latest development drivers,
as that is what we test against.
The latest drivers (stable, beta or Vulkan beta tracks) are always preferred.
If you're having problems, always try the latest drivers.
### Intel
@ -143,14 +148,27 @@ Some of debug variables are lists of elements. Elements must be separated by
commas or semicolons.
- `VKD3D_CONFIG` - a list of options that change the behavior of vkd3d-proton.
- vk_debug - enables Vulkan debug extensions and loads validation layer.
- `vk_debug` - enables Vulkan debug extensions and loads validation layer.
- `skip_application_workarounds` - Skips all application workarounds.
For debugging purposes.
- `dxr` - Enables DXR support if supported by device.
- `dxr11` - Enables DXR tier 1.1 support if supported by device.
- `force_static_cbv` - Unsafe speed hack on NVIDIA. May or may not give a significant performance uplift.
- `single_queue` - Do not use asynchronous compute or transfer queues.
- `no_upload_hvv` - Blocks any attempt to use host-visible VRAM (large/resizable BAR) for the UPLOAD heap.
May free up vital VRAM in certain critical situations, at cost of lower GPU performance.
A fraction of VRAM is reserved for resizable BAR allocations either way,
so it should not be a real issue even on lower VRAM cards.
- `force_host_cached` - Forces all host visible allocations to be CACHED, which greatly accelerates captures.
- `no_invariant_position` - Avoids workarounds for invariant position. The workaround is enabled by default.
- `VKD3D_DEBUG` - controls the debug level for log messages produced by
vkd3d-proton. Accepts the following values: none, err, fixme, warn, trace.
vkd3d-proton. Accepts the following values: none, err, info, fixme, warn, trace.
- `VKD3D_SHADER_DEBUG` - controls the debug level for log messages produced by
the shader compilers. See `VKD3D_DEBUG` for accepted values.
- `VKD3D_LOG_FILE` - If set, redirects `VKD3D_DEBUG` logging output to a file instead.
- `VKD3D_VULKAN_DEVICE` - a zero-based device index. Use to force the selected
Vulkan device.
- `VKD3D_FILTER_DEVICE_NAME` - skips devices that don't include this substring.
- `VKD3D_DISABLE_EXTENSIONS` - a list of Vulkan extensions that vkd3d-proton should
not use even if available.
- `VKD3D_TEST_DEBUG` - enables additional debug messages in tests. Set to 0, 1
@ -158,6 +176,8 @@ commas or semicolons.
- `VKD3D_TEST_FILTER` - a filter string. Only the tests whose names match the
filter string will be run, e.g. `VKD3D_TEST_FILTER=clear_render_target`.
Useful for debugging or developing new tests.
- `VKD3D_TEST_EXCLUDE` - excludes tests of which the name is included in the string,
e.g. `VKD3D_TEST_EXCLUDE=test_root_signature_priority,test_conservative_rasterization_dxil`.
- `VKD3D_TEST_PLATFORM` - can be set to "wine", "windows" or "other". The test
platform controls the behavior of todo(), todo_if(), bug_if() and broken()
conditions in tests.
@ -165,6 +185,39 @@ commas or semicolons.
- `VKD3D_PROFILE_PATH` - If profiling is enabled in the build, a profiling block is
emitted to `${VKD3D_PROFILE_PATH}.${pid}`.
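As a minimal sketch of the list syntax described above (elements separated by commas), a launcher script could assemble a list-valued variable such as `VKD3D_CONFIG` from individual options. `join_config` is a hypothetical helper name used only for illustration, and the two options are just examples taken from the list above.

```shell
#!/bin/sh
# Hypothetical helper: join all arguments with commas.
# The function body runs in a subshell, so the IFS change does not leak out.
join_config() (
    IFS=,
    printf '%s\n' "$*"
)

# Example: enable DXR 1.1 and keep the UPLOAD heap out of host-visible VRAM.
VKD3D_CONFIG="$(join_config dxr11 no_upload_hvv)"
export VKD3D_CONFIG
```

Semicolons are accepted as separators as well, so the choice of commas here is arbitrary.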
## Shader cache
By default, vkd3d-proton manages its own driver cache.
This cache is intended to cache DXBC/DXIL -> SPIR-V conversion.
This reduces stutter (when pipelines are created last minute and app relies on hot driver cache)
and load times (when applications do the right thing of loading PSOs up front).
Behavior is designed to be close to DXVK state cache.
#### Default behavior
`vkd3d-proton.cache` (and `vkd3d-proton.cache.write`) are placed in the current working directory.
Generally, this is the game install folder when running in Steam.
#### Custom directory
`VKD3D_SHADER_CACHE_PATH=/path/to/directory` overrides the directory where `vkd3d-proton.cache` is placed.
#### Disable cache
`VKD3D_SHADER_CACHE_PATH=0` disables the internal cache, and any caching would have to be explicitly managed
by application.
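The three cases above can be summarized in a small shell sketch. `cache_location` is a hypothetical helper, not part of vkd3d-proton: it only mirrors the documented rules (unset means the current working directory, `0` disables the cache, anything else is treated as a directory) and makes no claim about how vkd3d-proton handles other values.

```shell
#!/bin/sh
# Hypothetical helper: print where vkd3d-proton.cache is expected to land.
# Call with no argument to model an unset VKD3D_SHADER_CACHE_PATH,
# or pass the value the variable would have.
cache_location() {
    if [ "$#" -eq 0 ]; then
        # Default behavior: current working directory.
        printf '%s\n' "$PWD/vkd3d-proton.cache"
    elif [ "$1" = "0" ]; then
        # Internal cache disabled.
        printf '%s\n' "disabled"
    else
        # Custom directory.
        printf '%s\n' "$1/vkd3d-proton.cache"
    fi
}
```

For example, `cache_location /path/to/directory` prints `/path/to/directory/vkd3d-proton.cache`.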
### Behavior of ID3D12PipelineLibrary
When explicit shader cache is used, the need for application managed pipeline libraries is greatly diminished,
and the cache applications interact with is a dummy cache.
If the vkd3d-proton shader cache is disabled, ID3D12PipelineLibrary stores everything relevant for a full cache,
i.e. SPIR-V and PSO driver cache blob.
`VKD3D_CONFIG=pipeline_library_app_cache` is an alternative to `VKD3D_SHADER_CACHE_PATH=0`.
In the future, it may be enabled automatically through app-profiles for applications that manage their caches better
than vkd3d-proton can do automagically.
## CPU profiling (development)
Pass `-Denable_profiling=true` to Meson to enable a profiled build. With a profiled build, use `VKD3D_PROFILE_PATH` environment variable.
@ -186,12 +239,26 @@ pass `-Denable_renderdoc=true` to Meson.
vkd3d-proton will automatically make a capture when a specific shader is encountered.
- `VKD3D_AUTO_CAPTURE_COUNTS` - A comma-separated list of indices. This can be used to control which queue submissions to capture.
E.g., use `VKD3D_AUTO_CAPTURE_COUNTS=0,4,10` to capture the 0th (first submission), 4th and 10th submissions which are candidates for capturing.
If `VKD3D_AUTO_CAPTURE_COUNTS` is `-1`, the entire app runtime can be turned into one big capture.
This is only intended to be used when capturing something like the test suite,
or tiny applications with a finite runtime to make it easier to debug cross submission work.
If only `VKD3D_AUTO_CAPTURE_COUNTS` is set, any queue submission is considered for capturing.
If only `VKD3D_AUTO_CAPTURE_SHADER` is set, `VKD3D_AUTO_CAPTURE_COUNTS` is considered to be equal to `"0"`, i.e. a capture is only
made on first encounter with the target shader.
If both are set, the capture counter is only incremented and considered when a submission contains the use of the target shader.
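To illustrate how a comma-separated index list such as `0,4,10` selects candidate submissions, here is a minimal shell sketch. `capture_index_listed` is a hypothetical helper that only models list membership. It does not reproduce vkd3d-proton's actual parsing, the special `-1` value, or the shader-dependent counting described above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if index $2 appears in the comma-separated list $1.
# Wrapping both sides in commas avoids partial matches (e.g. 1 inside 10).
capture_index_listed() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: with VKD3D_AUTO_CAPTURE_COUNTS=0,4,10 the 4th candidate is listed.
if capture_index_listed "0,4,10" 4; then
    echo "candidate 4 would be captured"
fi
```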
### Breadcrumbs debugging
For debugging GPU hangs, it's useful to know where crashes happen.
If the build has trace enabled (non-release builds), breadcrumbs support is also enabled.
`VKD3D_CONFIG=breadcrumbs` will instrument command lists with `VK_AMD_buffer_marker` or `VK_NV_device_checkpoints`.
On GPU device lost or timeout, crash dumps are written to the log.
For best results on RADV, use `RADV_DEBUG=syncshaders`. The logs will print a digested form of the command lists
which were executing at the time, and attempt to narrow down the possible range of commands which could
have caused a crash.
### Shader logging
It is possible to log the output of replaced shaders, essentially a custom shader printf. To enable this feature, `VK_KHR_buffer_device_address` must be supported.
@ -203,8 +270,11 @@ and avoids any possible accidental hiding of bugs by introducing validation laye
Using `debugPrintEXT` is also possible if that fits better with your debugging scenario.
With this shader replacement scheme, we're able to add shader logging as unintrusive as possible.
Replaced shaders will need to include `debug_channel.h` from `include/shader-debug`.
Use `glslc -I/path/to/vkd3d-proton/include/shader-debug --target-env=vulkan1.1` when compiling replaced shaders.
```
# Inside folder full of override shaders, build everything with:
make -C /path/to/include/shader-debug M=$PWD
```
The shader can then include `#include "debug_channel.h"` and use various functions below.
```
void DEBUG_CHANNEL_INIT(uvec3 ID);
@ -228,3 +298,150 @@ void DEBUG_CHANNEL_MSG(float v0, float v1, ...); // Up to 4 components, ...
```
These functions log, formatting is `#%x` for uint, `%d` for int and `%f` for float type.
## Descriptor debugging
If `-Denable_descriptor_qa=true` is enabled in build, you can set the `VKD3D_DESCRIPTOR_QA_LOG` env-var to a file.
All descriptor updates and copies are logged so that it's possible to correlate descriptors with
GPU crash dumps. `enable_descriptor_qa` is not enabled by default,
since it adds some flat overhead in an extremely hot code path.
### GPU-assisted debugging
If `VKD3D_CONFIG=descriptor_qa_checks` is set with a build which enables `-Denable_descriptor_qa=true`,
all shaders will be instrumented to check for invalid access. In the log, you will see this to
make sure the feature is enabled.
```
932:info:vkd3d_descriptor_debug_init_once: Enabling descriptor QA checks!
```
The main motivation is the tight integration and high performance.
GPU-assisted debugging can be run at well over playable speeds.
#### Descriptor heap index out of bounds
```
============
Fault type: HEAP_OUT_OF_RANGE
Fault type: MISMATCH_DESCRIPTOR_TYPE
CBV_SRV_UAV heap cookie: 1800
Shader hash and instruction: edbaf1b5ed344467 (1)
Accessed resource/view cookie: 0
Shader desired descriptor type: 8 (STORAGE_BUFFER)
Found descriptor type in heap: 0 (NONE)
Failed heap index: 1024000
==========
```
The instruction `(1)` is reported as well,
and a disassembly of the shader in question can be used to pinpoint exactly where
things are going wrong.
Dump all shaders with `VKD3D_SHADER_DUMP_PATH=/my/folder`,
and run `spirv-cross -V /my/folder/edbaf1b5ed344467.spv`.
(NOTE: clear out the folder before dumping, existing files are not overwritten).
The faulting instruction can be identified by looking at the last argument, e.g.:
```
uint fixup_index = descriptor_qa_check(heap_index, descriptor_type, 1u /* instruction ID */);
```
#### Mismatch descriptor type
```
============
Fault type: MISMATCH_DESCRIPTOR_TYPE
CBV_SRV_UAV heap cookie: 1800 // Refer to VKD3D_DESCRIPTOR_QA_LOG
Shader hash and instruction: edbaf1b5ed344467 (1)
Accessed resource/view cookie: 1802 // Refer to VKD3D_DESCRIPTOR_QA_LOG
Shader desired descriptor type: 8 (STORAGE_BUFFER)
Found descriptor type in heap: 1 (SAMPLED_IMAGE)
Failed heap index: 1025
==========
```
#### Accessing destroyed resource
```
============
Fault type: DESTROYED_RESOURCE
CBV_SRV_UAV heap cookie: 1800
Shader hash and instruction: edbaf1b5ed344467 (2)
Accessed resource/view cookie: 1806
Shader desired descriptor type: 1 (SAMPLED_IMAGE)
Found descriptor type in heap: 1 (SAMPLED_IMAGE)
Failed heap index: 1029
==========
```
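The numeric descriptor types in the fault reports above (`8 (STORAGE_BUFFER)`, `1 (SAMPLED_IMAGE)`, `0 (NONE)`) line up with the bit values of `enum vkd3d_descriptor_qa_flag_bits` in `vkd3d_descriptor_qa_data.h`. The following is a minimal sketch of decoding such a value by hand when reading a log. The helper `qa_type_mask_to_string` and the name table are hypothetical and written for illustration only; they are not part of vkd3d-proton.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Names indexed by bit position, following enum vkd3d_descriptor_qa_flag_bits
 * (SAMPLED_IMAGE = 1 << 0 ... RAW_VA = 1 << 8). The table itself is illustrative. */
static const char *const qa_type_names[] = {
    "SAMPLED_IMAGE", "STORAGE_IMAGE", "UNIFORM_BUFFER", "STORAGE_BUFFER",
    "UNIFORM_TEXEL_BUFFER", "STORAGE_TEXEL_BUFFER", "RT_ACCELERATION_STRUCTURE",
    "SAMPLER", "RAW_VA",
};

/* Hypothetical helper: writes a '|'-separated list of the set type bits to buf.
 * A mask of 0 is reported as NONE. Output is truncated if buf is too small. */
static void qa_type_mask_to_string(uint32_t mask, char *buf, size_t size)
{
    size_t i;

    assert(buf && size > 0);
    buf[0] = '\0';

    if (!mask)
    {
        snprintf(buf, size, "NONE");
        return;
    }

    for (i = 0; i < sizeof(qa_type_names) / sizeof(qa_type_names[0]); i++)
    {
        if (mask & (1u << i))
        {
            size_t len = strlen(buf);
            snprintf(buf + len, size - len, "%s%s", len ? "|" : "", qa_type_names[i]);
        }
    }
}
```

Passing `8` yields `STORAGE_BUFFER` and passing `1` yields `SAMPLED_IMAGE`, matching the "desired" and "found" types in the mismatch report.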
### Debugging descriptor crashes with RADV dumps (hardcore ultra nightmare mode)
For when you're absolutely desperate, there is a way to debug GPU hangs.
First, install [umr](https://gitlab.freedesktop.org/tomstdenis/umr) and make the binary setuid.
`ACO_DEBUG=force-waitcnt RADV_DEBUG=hang VKD3D_DESCRIPTOR_QA_LOG=/somewhere/desc.txt %command%`
It is possible to use `RADV_DEBUG=hang,umr` as well, but from within Wine,
UMR dumps do not always succeed.
Instead, it is possible to invoke umr manually from an SSH shell when the GPU hangs.
```
#!/bin/bash
mkdir -p "$HOME/umr-dump"
# For Navi, older GPUs might have different rings. See RADV source.
umr -R gfx_0.0.0 > "$HOME/umr-dump/ring.txt" 2>&1
umr -O halt_waves -wa gfx_0.0.0 > "$HOME/umr-dump/halt-waves-1.txt" 2>&1
umr -O bits,halt_waves -wa gfx_0.0.0 > "$HOME/umr-dump/halt-waves-2.txt" 2>&1
```
A folder is placed in `~/radv_dumps*` by RADV, and the UMR script will place wave dumps in `~/umr-dump`.
First, we can study the wave dumps to see where things crash, e.g.:
```
pgm[6@0x800120e26c00 + 0x584 ] = 0xf0001108 image_load v47, v[4:5], s[48:55] dmask:0x1 dim:SQ_RSRC_IMG_2D unorm
pgm[6@0x800120e26c00 + 0x588 ] = 0x000c2f04 ;;
pgm[6@0x800120e26c00 + 0x58c ] = 0xbf8c3f70 s_waitcnt vmcnt(0)
* pgm[6@0x800120e26c00 + 0x590 ] = 0x930118c0 s_mul_i32 s1, 64, s24
pgm[6@0x800120e26c00 + 0x594 ] = 0xf40c0c09 s_load_dwordx8 s[48:55], s[18:19], s1
pgm[6@0x800120e26c00 + 0x598 ] = 0x02000000 ;;
```
`excp: 256` indicates a memory error (at least on a 5700 XT).
```
TRAPSTS[50000100]:
excp: 256 | illegal_inst: 0 | buffer_oob: 0 | excp_cycle: 0 |
excp_wave64hi: 0 | xnack_error: 1 | dp_rate: 2 | excp_group_mask: 0 |
```
We can inspect all VGPRs and all SGPRs, here for the image descriptor.
```
[ 48.. 51] = { 0130a000, c0500080, 810dc1df, 93b00204 }
[ 52.. 55] = { 00000000, 00400000, 002b0000, 800130c8 }
```
Decode the VA and study `bo_history.log`. There is a script in RADV which lets you query history for a VA.
This lets us verify that the VA in question was freed at some point.
At the time of writing, there is no easy way to decode raw descriptor blobs, but when you're desperate enough you can do it by hand :|
In `pipeline.log` we have the full SPIR-V (with OpSource reference to the source DXIL/DXBC)
and disassembly of the crashed pipeline. Here we can study the code to figure out which descriptor was read.
```
// s7 is the descriptor heap index, s1 is the offset (64 bytes per image descriptor),
// s[18:19] is the descriptor heap.
s_mul_i32 s1, 64, s7 ; 930107c0
s_load_dwordx8 s[48:55], s[18:19], s1 ; f40c0c09 02000000
s_waitcnt lgkmcnt(0) ; bf8cc07f
image_load v47, v[4:5], s[48:55] dmask:0x1 dim:SQ_RSRC_IMG_2D unorm ; f0001108 000c2f04
```
```
[ 4.. 7] = { 03200020, ffff8000, 0000002b, 00000103 }
```
Here s7 is `0x103`, which is descriptor index #259. Based on this, we can inspect the descriptor QA log and verify that the application
did indeed do something invalid, which caused the GPU hang.
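The arithmetic in the disassembly above can be reproduced by hand. The following is a minimal sketch, assuming the SGPR value is copied as bare hex from the wave dump and that image descriptors are 64 bytes apart, as the `s_mul_i32 s1, 64, s7` instruction suggests. The helper names `parse_sgpr_hex` and `image_descriptor_offset` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* 64 bytes per image descriptor, taken from the s_mul_i32 in the disassembly. */
#define IMAGE_DESCRIPTOR_STRIDE 64u

/* Hypothetical helper: parse an SGPR value as printed in a wave dump (bare hex, no 0x prefix). */
static uint32_t parse_sgpr_hex(const char *text)
{
    assert(text);
    return (uint32_t)strtoul(text, NULL, 16);
}

/* Hypothetical helper: byte offset into the descriptor heap for a given heap index. */
static uint32_t image_descriptor_offset(uint32_t heap_index)
{
    return heap_index * IMAGE_DESCRIPTOR_STRIDE;
}
```

For the dump above, `parse_sgpr_hex("00000103")` gives heap index 259, and `image_descriptor_offset(259)` gives 16576 (`0x40c0`), the value expected in s1 when the descriptor is loaded.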


@ -20,7 +20,7 @@
#include <stdbool.h>
#include <stdio.h>
#define DEMO_WINDOW_CLASS_NAME L"demo_wc"
#define DEMO_WINDOW_CLASS_NAME u"demo_wc"
struct demo
{


@ -493,8 +493,6 @@ static inline struct demo_swapchain *demo_swapchain_create(ID3D12CommandQueue *c
WaitForFences(vk_device, 1, &vk_fence, VK_TRUE, UINT64_MAX);
ResetFences(vk_device, 1, &vk_fence);
resource_create_info.type = VKD3D_STRUCTURE_TYPE_IMAGE_RESOURCE_CREATE_INFO;
resource_create_info.next = NULL;
resource_create_info.desc.Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D;
resource_create_info.desc.Alignment = 0;
resource_create_info.desc.Width = desc->width;


@ -456,13 +456,8 @@ static void cxg_mesh_create(ID3D12Device *device, float inner_radius, float oute
float r0, r1, r2;
float angle, da;
if (!(vertices = calloc(tooth_count, 12 * sizeof(*vertices))))
return;
if (!(faces = calloc(tooth_count, 20 * sizeof(*faces))))
{
free(vertices);
return;
}
vertices = calloc(tooth_count, 12 * sizeof(*vertices));
faces = calloc(tooth_count, 20 * sizeof(*faces));
r0 = inner_radius;
r1 = outer_radius - tooth_depth / 2.0f;


@ -10,6 +10,8 @@ vkd3d_idl = [
'vkd3d_dxgiformat.idl',
'vkd3d_dxgitype.idl',
'vkd3d_swapchain_factory.idl',
'vkd3d_command_list_vkd3d_ext.idl',
'vkd3d_device_vkd3d_ext.idl'
]
vkd3d_header_files = idl_generator.process(vkd3d_idl)


@ -165,18 +165,23 @@ static inline struct hash_map_entry *hash_map_insert(struct hash_map *hash_map,
struct hash_map_entry *current = hash_map_get_entry(hash_map, entry_idx);
if (!(current->flags & HASH_MAP_ENTRY_OCCUPIED) ||
(current->hash_value == hash_value && hash_map->compare_func(key, entry)))
(current->hash_value == hash_value && hash_map->compare_func(key, current)))
target = current;
entry_idx = hash_map_next_entry_idx(hash_map, entry_idx);
else
entry_idx = hash_map_next_entry_idx(hash_map, entry_idx);
}
if (!(target->flags & HASH_MAP_ENTRY_OCCUPIED))
{
hash_map->used_count += 1;
target->flags = HASH_MAP_ENTRY_OCCUPIED;
target->hash_value = hash_value;
memcpy(target + 1, entry + 1, hash_map->entry_size - sizeof(*entry));
}
/* If target is occupied, we already have an entry in the hashmap.
* Return old one, caller is responsible for cleaning up the node we attempted to add. */
memcpy(target, entry, hash_map->entry_size);
target->flags = HASH_MAP_ENTRY_OCCUPIED;
target->hash_value = hash_value;
return target;
}
@ -188,6 +193,7 @@ static inline void hash_map_init(struct hash_map *hash_map, pfn_hash_func hash_f
hash_map->entry_size = entry_size;
hash_map->entry_count = 0;
hash_map->used_count = 0;
assert(entry_size > sizeof(struct hash_map_entry));
}
static inline void hash_map_clear(struct hash_map *hash_map)
@ -207,4 +213,43 @@ static inline uint32_t hash_uint64(uint64_t n)
return hash_combine((uint32_t)n, (uint32_t)(n >> 32));
}
/* A somewhat stronger hash when we're meant to store the hash (pipeline caches, etc). Based on FNV-1a. */
static inline uint64_t hash_fnv1_init()
{
return 0xcbf29ce484222325ull;
}
static inline uint64_t hash_fnv1_iterate_u8(uint64_t h, uint8_t value)
{
return (h * 0x100000001b3ull) ^ value;
}
static inline uint64_t hash_fnv1_iterate_u32(uint64_t h, uint32_t value)
{
return (h * 0x100000001b3ull) ^ value;
}
static inline uint64_t hash_fnv1_iterate_f32(uint64_t h, float value)
{
union u { float f32; uint32_t u32; } v;
v.f32 = value;
return hash_fnv1_iterate_u32(h, v.u32);
}
static inline uint64_t hash_fnv1_iterate_u64(uint64_t h, uint64_t value)
{
h = hash_fnv1_iterate_u32(h, value & UINT32_MAX);
h = hash_fnv1_iterate_u32(h, value >> 32);
return h;
}
static inline uint64_t hash_fnv1_iterate_string(uint64_t h, const char *str)
{
if (str)
while (*str)
h = hash_fnv1_iterate_u8(h, *str++);
h = hash_fnv1_iterate_u8(h, 0);
return h;
}
#endif /* __VKD3D_HASHMAP_H */


@ -53,18 +53,19 @@ FORCEINLINE void vkd3d_atomic_load_barrier(vkd3d_memory_order order)
}
}
// Redefinitions for invalid memory orders...
#define InterlockedExchangeRelease InterlockedExchange
/* Redefinitions for invalid memory orders */
#define InterlockedExchangeRelease InterlockedExchange
#define InterlockedExchangeRelease64 InterlockedExchange64
#define vkd3d_atomic_choose_intrinsic(order, result, intrinsic, ...) \
switch (order) \
{ \
case vkd3d_memory_order_relaxed: result = intrinsic##NoFence (__VA_ARGS__); break; \
case vkd3d_memory_order_consume: \
case vkd3d_memory_order_acquire: result = intrinsic##Acquire (__VA_ARGS__); break; \
case vkd3d_memory_order_release: result = intrinsic##Release (__VA_ARGS__); break; \
case vkd3d_memory_order_acq_rel: \
case vkd3d_memory_order_seq_cst: result = intrinsic (__VA_ARGS__); break; \
#define vkd3d_atomic_choose_intrinsic(order, result, intrinsic, suffix, ...) \
switch (order) \
{ \
case vkd3d_memory_order_relaxed: result = intrinsic##NoFence##suffix (__VA_ARGS__); break; \
case vkd3d_memory_order_consume: \
case vkd3d_memory_order_acquire: result = intrinsic##Acquire##suffix (__VA_ARGS__); break; \
case vkd3d_memory_order_release: result = intrinsic##Release##suffix (__VA_ARGS__); break; \
case vkd3d_memory_order_acq_rel: \
case vkd3d_memory_order_seq_cst: result = intrinsic##suffix (__VA_ARGS__); break; \
}
FORCEINLINE uint32_t vkd3d_atomic_uint32_load_explicit(uint32_t *target, vkd3d_memory_order order)
@ -78,7 +79,7 @@ FORCEINLINE void vkd3d_atomic_uint32_store_explicit(uint32_t *target, uint32_t v
{
switch (order)
{
case vkd3d_memory_order_release: vkd3d_atomic_rw_barrier(); // fallthrough...
case vkd3d_memory_order_release: vkd3d_atomic_rw_barrier(); /* fallthrough */
case vkd3d_memory_order_relaxed: *((volatile uint32_t*)target) = value; break;
default:
case vkd3d_memory_order_seq_cst:
@ -89,43 +90,170 @@ FORCEINLINE void vkd3d_atomic_uint32_store_explicit(uint32_t *target, uint32_t v
FORCEINLINE uint32_t vkd3d_atomic_uint32_exchange_explicit(uint32_t *target, uint32_t value, vkd3d_memory_order order)
{
uint32_t result;
vkd3d_atomic_choose_intrinsic(order, result, InterlockedExchange, (LONG*)target, value);
vkd3d_atomic_choose_intrinsic(order, result, InterlockedExchange, /* no suffix */,(LONG*)target, value);
return result;
}
FORCEINLINE uint32_t vkd3d_atomic_uint32_increment(uint32_t *target, vkd3d_memory_order order)
{
uint32_t result;
vkd3d_atomic_choose_intrinsic(order, result, InterlockedIncrement, (LONG*)target);
vkd3d_atomic_choose_intrinsic(order, result, InterlockedIncrement, /* no suffix */,(LONG*)target);
return result;
}
FORCEINLINE uint32_t vkd3d_atomic_uint32_decrement(uint32_t *target, vkd3d_memory_order order)
{
uint32_t result;
vkd3d_atomic_choose_intrinsic(order, result, InterlockedDecrement, (LONG*)target);
vkd3d_atomic_choose_intrinsic(order, result, InterlockedDecrement, /* no suffix */,(LONG*)target);
return result;
}
FORCEINLINE uint32_t vkd3d_atomic_uint32_add(uint32_t *target, uint32_t value, vkd3d_memory_order order)
{
uint32_t result;
vkd3d_atomic_choose_intrinsic(order, result, InterlockedAdd, /* no suffix */,(LONG*)target, value);
return result;
}
FORCEINLINE uint32_t vkd3d_atomic_uint32_sub(uint32_t *target, uint32_t value, vkd3d_memory_order order)
{
uint32_t result;
vkd3d_atomic_choose_intrinsic(order, result, InterlockedAdd, /* no suffix */,(LONG*)target, (uint32_t)(-(int32_t)value));
return result;
}
FORCEINLINE uint32_t vkd3d_atomic_uint32_and(uint32_t *target, uint32_t value, vkd3d_memory_order order)
{
uint32_t result;
vkd3d_atomic_choose_intrinsic(order, result, InterlockedAnd, /* no suffix */,(LONG*)target, value);
return result;
}
FORCEINLINE uint32_t vkd3d_atomic_uint32_or(uint32_t *target, uint32_t value, vkd3d_memory_order order)
{
uint32_t result;
vkd3d_atomic_choose_intrinsic(order, result, InterlockedOr, /* no suffix */,(LONG*)target, value);
return result;
}
FORCEINLINE uint32_t vkd3d_atomic_uint32_compare_exchange(uint32_t* target, uint32_t expected, uint32_t desired,
vkd3d_memory_order success_order, vkd3d_memory_order fail_order)
{
uint32_t result;
/* InterlockedCompareExchange has desired (ExChange) first, then expected (Comperand) */
vkd3d_atomic_choose_intrinsic(success_order, result, InterlockedCompareExchange, /* no suffix */, (LONG*)target, desired, expected);
return result;
}
FORCEINLINE uint64_t vkd3d_atomic_uint64_load_explicit(uint64_t *target, vkd3d_memory_order order)
{
uint64_t value = *((volatile uint64_t*)target);
vkd3d_atomic_load_barrier(order);
return value;
}
FORCEINLINE void vkd3d_atomic_uint64_store_explicit(uint64_t *target, uint64_t value, vkd3d_memory_order order)
{
switch (order)
{
case vkd3d_memory_order_release: vkd3d_atomic_rw_barrier(); /* fallthrough */
case vkd3d_memory_order_relaxed: *((volatile uint64_t*)target) = value; break;
default:
case vkd3d_memory_order_seq_cst:
(void) InterlockedExchange64((LONG64*) target, value);
}
}
FORCEINLINE uint64_t vkd3d_atomic_uint64_exchange_explicit(uint64_t *target, uint64_t value, vkd3d_memory_order order)
{
uint64_t result;
vkd3d_atomic_choose_intrinsic(order, result, InterlockedExchange, 64, (LONG64*)target, value);
return result;
}
FORCEINLINE uint64_t vkd3d_atomic_uint64_increment(uint64_t *target, vkd3d_memory_order order)
{
uint64_t result;
vkd3d_atomic_choose_intrinsic(order, result, InterlockedIncrement, 64, (LONG64*)target);
return result;
}
FORCEINLINE uint64_t vkd3d_atomic_uint64_decrement(uint64_t *target, vkd3d_memory_order order)
{
uint64_t result;
vkd3d_atomic_choose_intrinsic(order, result, InterlockedDecrement, 64, (LONG64*)target);
return result;
}
FORCEINLINE uint64_t vkd3d_atomic_uint64_compare_exchange(UINT64* target, uint64_t expected, uint64_t desired,
vkd3d_memory_order success_order, vkd3d_memory_order fail_order)
{
uint64_t result;
/* InterlockedCompareExchange has desired (ExChange) first, then expected (Comperand). Use UINT64 to mark 8-byte alignment. */
vkd3d_atomic_choose_intrinsic(success_order, result, InterlockedCompareExchange, 64, (LONG64*)target, desired, expected);
return result;
}
#elif defined(__GNUC__) || defined(__clang__)
#define vkd3d_memory_order_relaxed __ATOMIC_RELAXED
#define vkd3d_memory_order_consume __ATOMIC_CONSUME
#define vkd3d_memory_order_acquire __ATOMIC_ACQUIRE
#define vkd3d_memory_order_release __ATOMIC_RELEASE
#define vkd3d_memory_order_acq_rel __ATOMIC_ACQ_REL
#define vkd3d_memory_order_seq_cst __ATOMIC_SEQ_CST
typedef enum
{
vkd3d_memory_order_relaxed = __ATOMIC_RELAXED,
vkd3d_memory_order_consume = __ATOMIC_CONSUME,
vkd3d_memory_order_acquire = __ATOMIC_ACQUIRE,
vkd3d_memory_order_release = __ATOMIC_RELEASE,
vkd3d_memory_order_acq_rel = __ATOMIC_ACQ_REL,
vkd3d_memory_order_seq_cst = __ATOMIC_SEQ_CST,
} vkd3d_memory_order;
# define vkd3d_atomic_uint32_load_explicit(target, order) __atomic_load_n(target, order)
# define vkd3d_atomic_uint32_store_explicit(target, value, order) __atomic_store_n(target, value, order)
# define vkd3d_atomic_uint32_exchange_explicit(target, value, order) __atomic_exchange_n(target, value, order)
# define vkd3d_atomic_uint32_increment(target, order) __atomic_add_fetch(target, 1, order)
# define vkd3d_atomic_uint32_decrement(target, order) __atomic_sub_fetch(target, 1, order)
# define vkd3d_atomic_generic_load_explicit(target, order) __atomic_load_n(target, order)
# define vkd3d_atomic_generic_store_explicit(target, value, order) __atomic_store_n(target, value, order)
# define vkd3d_atomic_generic_exchange_explicit(target, value, order) __atomic_exchange_n(target, value, order)
# define vkd3d_atomic_generic_increment(target, order) __atomic_add_fetch(target, 1, order)
# define vkd3d_atomic_generic_decrement(target, order) __atomic_sub_fetch(target, 1, order)
# define vkd3d_atomic_generic_add(target, value, order) __atomic_add_fetch(target, value, order)
# define vkd3d_atomic_generic_sub(target, value, order) __atomic_sub_fetch(target, value, order)
# define vkd3d_atomic_generic_and(target, value, order) __atomic_and_fetch(target, value, order)
# define vkd3d_atomic_generic_or(target, value, order) __atomic_or_fetch(target, value, order)
# define vkd3d_atomic_uint32_load_explicit(target, order) vkd3d_atomic_generic_load_explicit(target, order)
# define vkd3d_atomic_uint32_store_explicit(target, value, order) vkd3d_atomic_generic_store_explicit(target, value, order)
# define vkd3d_atomic_uint32_exchange_explicit(target, value, order) vkd3d_atomic_generic_exchange_explicit(target, value, order)
# define vkd3d_atomic_uint32_increment(target, order) vkd3d_atomic_generic_increment(target, order)
# define vkd3d_atomic_uint32_decrement(target, order) vkd3d_atomic_generic_decrement(target, order)
# define vkd3d_atomic_uint32_add(target, value, order) vkd3d_atomic_generic_add(target, value, order)
# define vkd3d_atomic_uint32_sub(target, value, order) vkd3d_atomic_generic_sub(target, value, order)
# define vkd3d_atomic_uint32_and(target, value, order) vkd3d_atomic_generic_and(target, value, order)
# define vkd3d_atomic_uint32_or(target, value, order) vkd3d_atomic_generic_or(target, value, order)
static inline uint32_t vkd3d_atomic_uint32_compare_exchange(uint32_t* target, uint32_t expected, uint32_t desired,
vkd3d_memory_order success_order, vkd3d_memory_order fail_order)
{
/* Expected is written to with the old value in the case that *target != expected */
__atomic_compare_exchange_n(target, &expected, desired, 0, success_order, fail_order);
return expected;
}
# define vkd3d_atomic_uint64_load_explicit(target, order) vkd3d_atomic_generic_load_explicit(target, order)
# define vkd3d_atomic_uint64_store_explicit(target, value, order) vkd3d_atomic_generic_store_explicit(target, value, order)
# define vkd3d_atomic_uint64_exchange_explicit(target, value, order) vkd3d_atomic_generic_exchange_explicit(target, value, order)
# define vkd3d_atomic_uint64_increment(target, order) vkd3d_atomic_generic_increment(target, order)
# define vkd3d_atomic_uint64_decrement(target, order) vkd3d_atomic_generic_decrement(target, order)
static inline uint64_t vkd3d_atomic_uint64_compare_exchange(UINT64* target, uint64_t expected, uint64_t desired,
vkd3d_memory_order success_order, vkd3d_memory_order fail_order)
{
/* Expected is written to with the old value in the case that *target != expected. Use UINT64 to mark 8-byte alignment. */
__atomic_compare_exchange_n(target, &expected, desired, 0, success_order, fail_order);
return expected;
}
# ifndef __MINGW32__
# define InterlockedIncrement(target) vkd3d_atomic_uint32_increment(target, vkd3d_memory_order_seq_cst)
# define InterlockedDecrement(target) vkd3d_atomic_uint32_decrement(target, vkd3d_memory_order_seq_cst)
# define InterlockedIncrement64(target) __atomic_add_fetch(target, 1, vkd3d_memory_order_seq_cst)
# define InterlockedIncrement(target) vkd3d_atomic_uint32_increment(target, vkd3d_memory_order_seq_cst)
# define InterlockedDecrement(target) vkd3d_atomic_uint32_decrement(target, vkd3d_memory_order_seq_cst)
# define InterlockedCompareExchange(target, desired, expected) vkd3d_atomic_uint32_compare_exchange(target, expected, desired, vkd3d_memory_order_seq_cst, vkd3d_memory_order_acquire)
# define InterlockedIncrement64(target) vkd3d_atomic_uint64_increment(target, vkd3d_memory_order_seq_cst)
# define InterlockedDecrement64(target) vkd3d_atomic_uint64_decrement(target, vkd3d_memory_order_seq_cst)
# define InterlockedCompareExchange64(target, desired, expected) vkd3d_atomic_uint64_compare_exchange(target, expected, desired, vkd3d_memory_order_seq_cst, vkd3d_memory_order_acquire)
# endif
#else
@ -134,4 +262,22 @@ FORCEINLINE uint32_t vkd3d_atomic_uint32_decrement(uint32_t *target, vkd3d_memor
#endif
#if INTPTR_MAX == INT64_MAX
# define vkd3d_atomic_ptr_load_explicit(target, order) ((void *)vkd3d_atomic_uint64_load_explicit((uint64_t *)target, order))
# define vkd3d_atomic_ptr_store_explicit(target, value, order) (vkd3d_atomic_uint64_store_explicit((uint64_t *)target, (uint64_t)value, order))
# define vkd3d_atomic_ptr_exchange_explicit(target, value, order) ((void *)vkd3d_atomic_uint64_exchange_explicit((uint64_t *)target, (uint64_t)value, order))
# define vkd3d_atomic_ptr_increment(target, order) ((void *)vkd3d_atomic_uint64_increment((uint64_t *)target, order))
# define vkd3d_atomic_ptr_decrement(target, order) ((void *)vkd3d_atomic_uint64_decrement((uint64_t *)target, order))
# define vkd3d_atomic_ptr_compare_exchange(target, expected, desired, success_order, fail_order) \
((void *)vkd3d_atomic_uint64_compare_exchange((UINT64 *)target, (uint64_t)expected, (uint64_t)desired, success_order, fail_order))
#else
# define vkd3d_atomic_ptr_load_explicit(target, order) ((void *)vkd3d_atomic_uint32_load_explicit((uint32_t *)target, order))
# define vkd3d_atomic_ptr_store_explicit(target, value, order) (vkd3d_atomic_uint32_store_explicit((uint32_t *)target, (uint32_t)value, order))
# define vkd3d_atomic_ptr_exchange_explicit(target, value, order) ((void *)vkd3d_atomic_uint32_exchange_explicit((uint32_t *)target, (uint32_t)value, order))
# define vkd3d_atomic_ptr_increment(target, order) ((void *)vkd3d_atomic_uint32_increment((uint32_t *)target, order))
# define vkd3d_atomic_ptr_decrement(target, order) ((void *)vkd3d_atomic_uint32_decrement((uint32_t *)target, order))
# define vkd3d_atomic_ptr_compare_exchange(target, expected, desired, success_order, fail_order) \
((void *)vkd3d_atomic_uint32_compare_exchange((uint32_t *)target, (uint32_t)expected, (uint32_t)desired, success_order, fail_order))
#endif
#endif


@ -27,9 +27,12 @@
#include <stdint.h>
#include <limits.h>
#include <stdbool.h>
#include <assert.h>
#ifdef _MSC_VER
#include <intrin.h>
#else
#include <time.h>
#endif
#ifndef ARRAY_SIZE
@ -42,8 +45,15 @@
#define MEMBER_SIZE(t, m) sizeof(((t *)0)->m)
static inline uint64_t align64(uint64_t addr, uint64_t alignment)
{
assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
return (addr + (alignment - 1)) & ~(alignment - 1);
}
static inline size_t align(size_t addr, size_t alignment)
{
assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
return (addr + (alignment - 1)) & ~(alignment - 1);
}
@ -113,8 +123,7 @@ static inline unsigned int vkd3d_bitmask_tzcnt32(uint32_t mask)
{
#ifdef _MSC_VER
unsigned long result;
_BitScanForward(&result, mask) ? result : 32;
return result;
return _BitScanForward(&result, mask) ? result : 32;
#elif defined(__GNUC__) || defined(__clang__)
return mask ? __builtin_ctz(mask) : 32;
#else
@ -203,6 +212,14 @@ static inline unsigned int vkd3d_log2i(unsigned int x)
#endif
}
static inline unsigned int vkd3d_log2i_ceil(unsigned int x)
{
if (x == 1)
return 0;
else
return vkd3d_log2i(x - 1) + 1;
}
static inline int ascii_isupper(int c)
{
return 'A' <= c && c <= 'Z';
@ -231,16 +248,19 @@ static inline bool is_power_of_two(unsigned int x)
return x && !(x & (x -1));
}
static inline void vkd3d_parse_version(const char *version, int *major, int *minor)
static inline void vkd3d_parse_version(const char *version, int *major, int *minor, int *patch)
{
*major = atoi(version);
char *end;
while (isdigit(*version))
++version;
*major = strtol(version, &end, 10);
version = end;
if (*version == '.')
++version;
*minor = atoi(version);
*minor = strtol(version, &end, 10);
version = end;
if (*version == '.')
++version;
*patch = strtol(version, NULL, 10);
}
static inline uint32_t float_bits_to_uint32(float f)
@ -250,21 +270,13 @@ static inline uint32_t float_bits_to_uint32(float f)
return u;
}
static inline size_t vkd3d_wcslen(const WCHAR *wstr, size_t wchar_size)
static inline size_t vkd3d_wcslen(const WCHAR *wstr)
{
const uint16_t *data_16 = (const uint16_t*)wstr;
const uint32_t *data_32 = (const uint32_t*)wstr;
uint32_t curr_char;
size_t length = 0;
while (true)
{
if (wchar_size == sizeof(uint16_t))
curr_char = data_16[length];
else /* if (wchar_size == sizeof(uint32_t)) */
curr_char = data_32[length];
if (!curr_char)
if (!wstr[length])
return length;
length += 1;
@ -276,4 +288,42 @@ static inline void *void_ptr_offset(void *ptr, size_t offset)
return ((char*)ptr) + offset;
}
#ifdef _MSC_VER
#define VKD3D_THREAD_LOCAL __declspec(thread)
#else
#define VKD3D_THREAD_LOCAL __thread
#endif
static inline uint64_t vkd3d_get_current_time_ns(void)
{
#ifdef _WIN32
LARGE_INTEGER li, lf;
uint64_t whole, part;
QueryPerformanceCounter(&li);
QueryPerformanceFrequency(&lf);
whole = (li.QuadPart / lf.QuadPart) * 1000000000;
part = ((li.QuadPart % lf.QuadPart) * 1000000000) / lf.QuadPart;
return whole + part;
#else
struct timespec ts;
clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
return ts.tv_sec * 1000000000ll + ts.tv_nsec;
#endif
}
#ifdef _MSC_VER
#pragma intrinsic(__rdtsc)
#endif
static inline uint64_t vkd3d_get_current_time_ticks(void)
{
#ifdef _MSC_VER
return __rdtsc();
#elif defined(__i386__) || defined(__x86_64__)
return __builtin_ia32_rdtsc();
#else
return vkd3d_get_current_time_ns();
#endif
}
#endif /* __VKD3D_COMMON_H */


@ -26,13 +26,13 @@
#include <stdint.h>
#ifdef VKD3D_NO_TRACE_MESSAGES
#define TRACE(args...) do { } while (0)
#define TRACE(...) do { } while (0)
#define TRACE_ON() (false)
#endif
#ifdef VKD3D_NO_DEBUG_MESSAGES
#define WARN(args...) do { } while (0)
#define FIXME(args...) do { } while (0)
#define WARN(...) do { } while (0)
#define FIXME(...) do { } while (0)
#endif
enum vkd3d_dbg_level
@ -65,7 +65,7 @@ void vkd3d_dbg_printf(enum vkd3d_dbg_channel channel, enum vkd3d_dbg_level level
const char *vkd3d_dbg_sprintf(const char *fmt, ...) VKD3D_PRINTF_FUNC(1, 2);
const char *vkd3d_dbg_vsprintf(const char *fmt, va_list args);
const char *debugstr_a(const char *str);
const char *debugstr_w(const WCHAR *wstr, size_t wchar_size);
const char *debugstr_w(const WCHAR *wstr);
#define VKD3D_DBG_LOG(level) \
do { \


@ -0,0 +1,119 @@
/*
* Copyright 2021 Hans-Kristian Arntzen for Valve Corporation
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
#ifndef __VKD3D_DESCRIPTOR_QA_DATA_H
#define __VKD3D_DESCRIPTOR_QA_DATA_H
#include <stdint.h>
/* Data types which are used by shader backends when emitting code. */
enum vkd3d_descriptor_qa_flag_bits
{
VKD3D_DESCRIPTOR_QA_TYPE_NONE_BIT = 0,
VKD3D_DESCRIPTOR_QA_TYPE_SAMPLED_IMAGE_BIT = 1 << 0,
VKD3D_DESCRIPTOR_QA_TYPE_STORAGE_IMAGE_BIT = 1 << 1,
VKD3D_DESCRIPTOR_QA_TYPE_UNIFORM_BUFFER_BIT = 1 << 2,
VKD3D_DESCRIPTOR_QA_TYPE_STORAGE_BUFFER_BIT = 1 << 3,
VKD3D_DESCRIPTOR_QA_TYPE_UNIFORM_TEXEL_BUFFER_BIT = 1 << 4,
VKD3D_DESCRIPTOR_QA_TYPE_STORAGE_TEXEL_BUFFER_BIT = 1 << 5,
VKD3D_DESCRIPTOR_QA_TYPE_RT_ACCELERATION_STRUCTURE_BIT = 1 << 6,
VKD3D_DESCRIPTOR_QA_TYPE_SAMPLER_BIT = 1 << 7,
VKD3D_DESCRIPTOR_QA_TYPE_RAW_VA_BIT = 1 << 8
};
typedef uint32_t vkd3d_descriptor_qa_flags;
struct vkd3d_descriptor_qa_cookie_descriptor
{
uint32_t cookie;
uint32_t descriptor_type;
};
enum vkd3d_descriptor_debug_fault_type
{
VKD3D_DESCRIPTOR_FAULT_TYPE_HEAP_OF_OF_RANGE = 1 << 0,
VKD3D_DESCRIPTOR_FAULT_TYPE_MISMATCH_DESCRIPTOR_TYPE = 1 << 1,
VKD3D_DESCRIPTOR_FAULT_TYPE_DESTROYED_RESOURCE = 1 << 2
};
/* Physical layout of QA buffer. */
struct vkd3d_descriptor_qa_global_buffer_data
{
uint64_t failed_hash;
uint32_t failed_offset;
uint32_t failed_heap;
uint32_t failed_cookie;
uint32_t fault_atomic;
uint32_t failed_instruction;
uint32_t failed_descriptor_type_mask;
uint32_t actual_descriptor_type_mask;
uint32_t fault_type;
uint32_t live_status_table[];
};
/* Physical layout of QA heap buffer. */
struct vkd3d_descriptor_qa_heap_buffer_data
{
uint32_t num_descriptors;
uint32_t heap_index;
struct vkd3d_descriptor_qa_cookie_descriptor desc[];
};
enum vkd3d_descriptor_qa_heap_buffer_data_member
{
VKD3D_DESCRIPTOR_QA_HEAP_MEMBER_NUM_DESCRIPTORS = 0,
VKD3D_DESCRIPTOR_QA_HEAP_MEMBER_HEAP_INDEX,
VKD3D_DESCRIPTOR_QA_HEAP_MEMBER_DESC,
VKD3D_DESCRIPTOR_QA_HEAP_MEMBER_COUNT
};
VKD3D_UNUSED static const char *vkd3d_descriptor_qa_heap_data_names[VKD3D_DESCRIPTOR_QA_HEAP_MEMBER_COUNT] = {
"num_descriptors",
"heap_index",
"desc",
};
enum vkd3d_descriptor_qa_global_buffer_data_member
{
VKD3D_DESCRIPTOR_QA_GLOBAL_BUFFER_DATA_MEMBER_FAILED_HASH = 0,
VKD3D_DESCRIPTOR_QA_GLOBAL_BUFFER_DATA_MEMBER_FAILED_OFFSET,
VKD3D_DESCRIPTOR_QA_GLOBAL_BUFFER_DATA_MEMBER_FAILED_HEAP,
VKD3D_DESCRIPTOR_QA_GLOBAL_BUFFER_DATA_MEMBER_FAILED_COOKIE,
VKD3D_DESCRIPTOR_QA_GLOBAL_BUFFER_DATA_MEMBER_FAULT_ATOMIC,
VKD3D_DESCRIPTOR_QA_GLOBAL_BUFFER_DATA_MEMBER_FAILED_INSTRUCTION,
VKD3D_DESCRIPTOR_QA_GLOBAL_BUFFER_DATA_MEMBER_FAILED_DESCRIPTOR_TYPE_MASK,
VKD3D_DESCRIPTOR_QA_GLOBAL_BUFFER_DATA_MEMBER_ACTUAL_DESCRIPTOR_TYPE_MASK,
VKD3D_DESCRIPTOR_QA_GLOBAL_BUFFER_DATA_MEMBER_FAULT_TYPE,
VKD3D_DESCRIPTOR_QA_GLOBAL_BUFFER_DATA_MEMBER_LIVE_STATUS_TABLE,
VKD3D_DESCRIPTOR_QA_GLOBAL_BUFFER_DATA_MEMBER_COUNT
};
VKD3D_UNUSED static const char *vkd3d_descriptor_qa_global_buffer_data_names[VKD3D_DESCRIPTOR_QA_GLOBAL_BUFFER_DATA_MEMBER_COUNT] = {
"failed_hash",
"failed_offset",
"failed_heap",
"failed_cookie",
"fault_atomic",
"failed_instruction",
"failed_descriptor_type_mask",
"actual_descriptor_type_mask",
"fault_type",
"live_status_table",
};
#endif


@ -0,0 +1,42 @@
/*
* Copyright 2022 Hans-Kristian Arntzen for Valve Corporation
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
#ifndef __VKD3D_FILE_UTILS_H
#define __VKD3D_FILE_UTILS_H
#include <stddef.h>
#include <stdio.h>
#include <stdbool.h>
struct vkd3d_memory_mapped_file
{
void *mapped;
size_t mapped_size;
};
/* On failure, ensures the struct is cleared to zero.
* A reference to the file is kept through the memory mapping. */
bool vkd3d_file_map_read_only(const char *path, struct vkd3d_memory_mapped_file *file);
/* Clears out file on unmap. */
void vkd3d_file_unmap(struct vkd3d_memory_mapped_file *file);
bool vkd3d_file_rename_overwrite(const char *from_path, const char *to_path);
bool vkd3d_file_rename_no_replace(const char *from_path, const char *to_path);
bool vkd3d_file_delete(const char *path);
FILE *vkd3d_file_open_exclusive_write(const char *path);
#endif


@ -23,6 +23,7 @@
#include <stdbool.h>
#include <stdlib.h>
#include "vkd3d_common.h"
#include "vkd3d_debug.h"
static inline void *vkd3d_malloc(size_t size)
@ -57,4 +58,22 @@ static inline void vkd3d_free(void *ptr)
bool vkd3d_array_reserve(void **elements, size_t *capacity,
size_t element_count, size_t element_size);
static inline void *vkd3d_malloc_aligned(size_t size, size_t alignment)
{
#ifdef _WIN32
return _aligned_malloc(size, alignment);
#else
return aligned_alloc(alignment, align(size, alignment));
#endif
}
static inline void vkd3d_free_aligned(void *ptr)
{
#ifdef _WIN32
_aligned_free(ptr);
#else
free(ptr);
#endif
}
#endif /* __VKD3D_MEMORY_H */


@ -37,6 +37,8 @@ int vkd3d_dlclose(vkd3d_module_t handle);
const char *vkd3d_dlerror(void);
bool vkd3d_get_env_var(const char *name, char *value, size_t value_size);
bool vkd3d_get_program_name(char program_name[VKD3D_PATH_MAX]);
#endif


@ -21,39 +21,15 @@
#include "vkd3d_windows.h"
#include "vkd3d_spinlock.h"
#include <stdint.h>
#include "vkd3d_common.h"
#ifdef VKD3D_ENABLE_PROFILING
#ifdef _WIN32
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#else
#include <time.h>
#endif
void vkd3d_init_profiling(void);
bool vkd3d_uses_profiling(void);
unsigned int vkd3d_profiling_register_region(const char *name, spinlock_t *lock, uint32_t *latch);
void vkd3d_profiling_notify_work(unsigned int index, uint64_t start_ticks, uint64_t end_ticks, unsigned int iteration_count);
static inline uint64_t vkd3d_profiling_get_tick_count(void)
{
#ifdef _WIN32
LARGE_INTEGER li, lf;
uint64_t whole, part;
QueryPerformanceCounter(&li);
QueryPerformanceFrequency(&lf);
whole = (li.QuadPart / lf.QuadPart) * 1000000000;
part = ((li.QuadPart % lf.QuadPart) * 1000000000) / lf.QuadPart;
return whole + part;
#else
struct timespec ts;
clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
return ts.tv_sec * 1000000000ll + ts.tv_nsec;
#endif
}
#define VKD3D_REGION_DECL(name) \
static uint32_t _vkd3d_region_latch_##name; \
static spinlock_t _vkd3d_region_lock_##name; \
@ -65,12 +41,12 @@ static inline uint64_t vkd3d_profiling_get_tick_count(void)
do { \
if (!(_vkd3d_region_index_##name = vkd3d_atomic_uint32_load_explicit(&_vkd3d_region_latch_##name, vkd3d_memory_order_acquire))) \
_vkd3d_region_index_##name = vkd3d_profiling_register_region(#name, &_vkd3d_region_lock_##name, &_vkd3d_region_latch_##name); \
_vkd3d_region_begin_tick_##name = vkd3d_profiling_get_tick_count(); \
_vkd3d_region_begin_tick_##name = vkd3d_get_current_time_ticks(); \
} while(0)
#define VKD3D_REGION_END_ITERATIONS(name, iter) \
do { \
_vkd3d_region_end_tick_##name = vkd3d_profiling_get_tick_count(); \
_vkd3d_region_end_tick_##name = vkd3d_get_current_time_ticks(); \
vkd3d_profiling_notify_work(_vkd3d_region_index_##name, _vkd3d_region_begin_tick_##name, _vkd3d_region_end_tick_##name, iter); \
} while(0)
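
The Win32 branch of `vkd3d_profiling_get_tick_count` converts performance-counter ticks to nanoseconds by handling whole seconds and the remainder separately. A small sketch of why that split matters is given below; `demo_ticks_to_ns` and `demo_ticks_to_ns_naive` are hypothetical names, and the frequency is passed in as a parameter rather than queried from the OS.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a tick count to nanoseconds given the counter frequency in Hz.
 * Splitting into whole seconds and a remainder keeps the intermediate
 * product (remainder * 1e9) within 64 bits as long as the frequency stays
 * below roughly 18 GHz. */
static uint64_t demo_ticks_to_ns(uint64_t ticks, uint64_t frequency)
{
    uint64_t whole, part;

    assert(frequency != 0);
    whole = (ticks / frequency) * UINT64_C(1000000000);
    part = ((ticks % frequency) * UINT64_C(1000000000)) / frequency;
    return whole + part;
}

/* Naive variant for comparison: ticks * 1e9 wraps once ticks exceeds about 1.8e10. */
static uint64_t demo_ticks_to_ns_naive(uint64_t ticks, uint64_t frequency)
{
    assert(frequency != 0);
    return (ticks * UINT64_C(1000000000)) / frequency;
}
```

With a 10 MHz counter, the naive form starts returning wrong values after about half an hour of uptime, while the split form stays exact for any realistic tick count.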

@ -0,0 +1,59 @@
/*
* Copyright 2020 Hans-Kristian Arntzen for Valve Corporation
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
#ifndef __VKD3D_RW_SPINLOCK_H
#define __VKD3D_RW_SPINLOCK_H
#include "vkd3d_spinlock.h"
#define VKD3D_RW_SPINLOCK_WRITE 1u
#define VKD3D_RW_SPINLOCK_READ 2u
#define VKD3D_RW_SPINLOCK_IDLE 0u
static inline void rw_spinlock_acquire_read(spinlock_t *spinlock)
{
uint32_t count = vkd3d_atomic_uint32_add(spinlock, VKD3D_RW_SPINLOCK_READ, vkd3d_memory_order_acquire);
while (count & VKD3D_RW_SPINLOCK_WRITE)
{
vkd3d_pause();
count = vkd3d_atomic_uint32_load_explicit(spinlock, vkd3d_memory_order_acquire);
}
}
static inline void rw_spinlock_release_read(spinlock_t *spinlock)
{
vkd3d_atomic_uint32_sub(spinlock, VKD3D_RW_SPINLOCK_READ, vkd3d_memory_order_release);
}
static inline void rw_spinlock_acquire_write(spinlock_t *spinlock)
{
while (vkd3d_atomic_uint32_load_explicit(spinlock, vkd3d_memory_order_relaxed) != VKD3D_RW_SPINLOCK_IDLE ||
vkd3d_atomic_uint32_compare_exchange(spinlock,
VKD3D_RW_SPINLOCK_IDLE, VKD3D_RW_SPINLOCK_WRITE,
vkd3d_memory_order_acquire, vkd3d_memory_order_relaxed) != VKD3D_RW_SPINLOCK_IDLE)
{
vkd3d_pause();
}
}
static inline void rw_spinlock_release_write(spinlock_t *spinlock)
{
vkd3d_atomic_uint32_and(spinlock, ~VKD3D_RW_SPINLOCK_WRITE, vkd3d_memory_order_release);
}
#endif
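
The reader/writer spinlock above packs its state into one 32-bit word: bit 0 marks an active writer and every reader adds 2. The following sketch restates that encoding with C11 `<stdatomic.h>` instead of the `vkd3d_atomic_*` wrappers. All `demo_` names are hypothetical, and the write path is shown as a single non-blocking attempt so it can be exercised from one thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RW_WRITE 1u /* bit 0: a writer holds the lock */
#define DEMO_RW_READ  2u /* each reader adds 2, so bit 0 stays free */
#define DEMO_RW_IDLE  0u

typedef _Atomic uint32_t demo_rwlock_t;

static void demo_acquire_read(demo_rwlock_t *lock)
{
    /* Announce the reader first, then wait for any active writer to leave. */
    uint32_t count = atomic_fetch_add_explicit(lock, DEMO_RW_READ, memory_order_acquire);
    while (count & DEMO_RW_WRITE)
        count = atomic_load_explicit(lock, memory_order_acquire);
}

static void demo_release_read(demo_rwlock_t *lock)
{
    uint32_t old = atomic_fetch_sub_explicit(lock, DEMO_RW_READ, memory_order_release);
    assert(old >= DEMO_RW_READ);
    (void)old;
}

/* Single attempt; the source loops on the same check-then-CAS until it succeeds. */
static bool demo_try_acquire_write(demo_rwlock_t *lock)
{
    uint32_t expected = DEMO_RW_IDLE;

    if (atomic_load_explicit(lock, memory_order_relaxed) != DEMO_RW_IDLE)
        return false;
    return atomic_compare_exchange_strong_explicit(lock, &expected, DEMO_RW_WRITE,
            memory_order_acquire, memory_order_relaxed);
}

static void demo_release_write(demo_rwlock_t *lock)
{
    /* Clear only the writer bit; readers that queued up meanwhile keep their counts. */
    atomic_fetch_and_explicit(lock, ~DEMO_RW_WRITE, memory_order_release);
}
```

Releasing the writer with an AND rather than a store of zero matters because a reader may already have added its 2 while spinning on the writer bit.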

@ -28,6 +28,13 @@
#include <emmintrin.h>
#endif
static inline void vkd3d_pause(void)
{
#ifdef __SSE2__
_mm_pause();
#endif
}
#define vkd3d_spinlock_try_lock(lock) \
(!vkd3d_atomic_uint32_load_explicit(lock, vkd3d_memory_order_relaxed) && \
!vkd3d_atomic_uint32_exchange_explicit(lock, 1u, vkd3d_memory_order_acquire))
@ -49,11 +56,7 @@ static inline bool spinlock_try_acquire(spinlock_t *lock)
static inline void spinlock_acquire(spinlock_t *lock)
{
while (!spinlock_try_acquire(lock))
#ifdef __SSE2__
_mm_pause();
#else
continue;
#endif
vkd3d_pause();
}
static inline void spinlock_release(spinlock_t *lock)

@ -0,0 +1,82 @@
/*
* Copyright 2021 Hans-Kristian Arntzen for Valve Corporation
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
#ifndef __VKD3D_STRING_H
#define __VKD3D_STRING_H
#include "vkd3d_common.h"
#include <stddef.h>
#include <string.h>
/* Various string utilities. */
WCHAR *vkd3d_dup_entry_point(const char *str);
WCHAR *vkd3d_dup_entry_point_n(const char *str, size_t len);
WCHAR *vkd3d_dup_demangled_entry_point(const char *str);
char *vkd3d_dup_demangled_entry_point_ascii(const char *str);
bool vkd3d_export_strequal(const WCHAR *a, const WCHAR *b);
bool vkd3d_export_strequal_mixed(const WCHAR *a, const char *b);
bool vkd3d_export_strequal_substr(const WCHAR *a, size_t n, const WCHAR *b);
char *vkd3d_strdup(const char *str);
char *vkd3d_strdup_n(const char *str, size_t n);
WCHAR *vkd3d_wstrdup(const WCHAR *str);
WCHAR *vkd3d_wstrdup_n(const WCHAR *str, size_t n);
static inline bool vkd3d_string_ends_with_n(const char *str, size_t str_len, const char *ending, size_t ending_len)
{
return str_len >= ending_len && !strncmp(str + (str_len - ending_len), ending, ending_len);
}
static inline bool vkd3d_string_ends_with(const char *str, const char *ending)
{
return vkd3d_string_ends_with_n(str, strlen(str), ending, strlen(ending));
}
enum vkd3d_string_compare_mode
{
VKD3D_STRING_COMPARE_NEVER,
VKD3D_STRING_COMPARE_ALWAYS,
VKD3D_STRING_COMPARE_EXACT,
VKD3D_STRING_COMPARE_STARTS_WITH,
VKD3D_STRING_COMPARE_ENDS_WITH,
VKD3D_STRING_COMPARE_CONTAINS,
};
static inline bool vkd3d_string_compare(enum vkd3d_string_compare_mode mode, const char *string, const char *comparator)
{
switch (mode)
{
default:
case VKD3D_STRING_COMPARE_NEVER:
return false;
case VKD3D_STRING_COMPARE_ALWAYS:
return true;
case VKD3D_STRING_COMPARE_EXACT:
return !strcmp(string, comparator);
case VKD3D_STRING_COMPARE_STARTS_WITH:
return !strncmp(string, comparator, strlen(comparator));
case VKD3D_STRING_COMPARE_ENDS_WITH:
return vkd3d_string_ends_with(string, comparator);
case VKD3D_STRING_COMPARE_CONTAINS:
return strstr(string, comparator) != NULL;
}
}
#endif /* __VKD3D_STRING_H */

@ -20,6 +20,7 @@
#define __VKD3D_TEST_H
#include "vkd3d_common.h"
#include "vkd3d_debug.h"
#include <assert.h>
#include <inttypes.h>
#include <stdarg.h>
@ -28,16 +29,19 @@
#include <stdlib.h>
#include <string.h>
#ifdef VKD3D_TEST_DECLARE_MAIN
static void vkd3d_test_main(int argc, char **argv);
static const char *vkd3d_test_name;
static const char *vkd3d_test_platform = "other";
#endif
extern const char *vkd3d_test_name;
extern const char *vkd3d_test_platform;
static void vkd3d_test_start_todo(bool is_todo);
static int vkd3d_test_loop_todo(void);
static void vkd3d_test_end_todo(void);
#define START_TEST(name) \
static const char *vkd3d_test_name = #name; \
const char *vkd3d_test_name = #name; \
static void vkd3d_test_main(int argc, char **argv)
/*
@ -100,7 +104,7 @@ static void vkd3d_test_end_todo(void);
#define todo todo_if(true)
static struct
struct vkd3d_test_state_context
{
LONG success_count;
LONG failure_count;
@ -119,8 +123,10 @@ static struct
bool bug_enabled;
const char *test_name_filter;
const char *test_exclude_list;
char context[1024];
} vkd3d_test_state;
};
extern struct vkd3d_test_state_context vkd3d_test_state;
static bool
vkd3d_test_platform_is_windows(void)
@ -245,7 +251,7 @@ vkd3d_test_debug(const char *fmt, ...)
int size;
size = snprintf(buffer, sizeof(buffer), "%s: ", vkd3d_test_name);
if (0 < size && size < sizeof(buffer))
if (0 < size && size < (int)sizeof(buffer))
{
va_start(args, fmt);
vsnprintf(buffer + size, sizeof(buffer) - size, fmt, args);
@ -264,17 +270,20 @@ vkd3d_test_debug(const char *fmt, ...)
}
}
#ifdef VKD3D_TEST_DECLARE_MAIN
int main(int argc, char **argv)
{
const char *exclude_list = getenv("VKD3D_TEST_EXCLUDE");
const char *test_filter = getenv("VKD3D_TEST_FILTER");
const char *debug_level = getenv("VKD3D_TEST_DEBUG");
char *test_platform = getenv("VKD3D_TEST_PLATFORM");
const char *bug = getenv("VKD3D_TEST_BUG");
memset(&vkd3d_test_state, 0, sizeof(vkd3d_test_state));
vkd3d_test_state.debug_level = debug_level ? atoi(debug_level) : 0;
vkd3d_test_state.debug_level = debug_level ? atoi(debug_level) : 1;
vkd3d_test_state.bug_enabled = bug ? atoi(bug) : true;
vkd3d_test_state.test_name_filter = test_filter;
vkd3d_test_state.test_exclude_list = exclude_list;
if (test_platform)
{
@ -351,16 +360,27 @@ int wmain(int argc, WCHAR **wargv)
return ret;
}
#endif /* _WIN32 */
#endif /* VKD3D_TEST_DECLARE_MAIN */
typedef void (*vkd3d_test_pfn)(void);
static inline void vkd3d_run_test(const char *name, vkd3d_test_pfn test_pfn)
{
const char *old_test_name;
if (vkd3d_test_state.test_name_filter && !strstr(name, vkd3d_test_state.test_name_filter))
return;
vkd3d_test_debug("%s", name);
if (vkd3d_test_state.test_exclude_list
&& vkd3d_debug_list_has_member(vkd3d_test_state.test_exclude_list, name))
return;
old_test_name = vkd3d_test_name;
vkd3d_test_debug("======== %s begin ========", name);
vkd3d_test_name = name;
test_pfn();
vkd3d_test_name = old_test_name;
vkd3d_test_debug("======== %s end ==========", name);
}
static inline void vkd3d_test_start_todo(bool is_todo)

@ -43,12 +43,16 @@ typedef struct pthread_mutex
SRWLOCK lock;
} pthread_mutex_t;
#define PTHREAD_MUTEX_INITIALIZER {SRWLOCK_INIT}
/* pthread_cond_t is not copyable, so embed CV inline. */
typedef struct pthread_cond
{
CONDITION_VARIABLE cond;
} pthread_cond_t;
typedef pthread_cond_t condvar_reltime_t;
static DWORD WINAPI win32_thread_wrapper_routine(void *arg)
{
pthread_t thread = arg;
@ -112,6 +116,48 @@ static inline int pthread_mutex_destroy(pthread_mutex_t *lock)
return 0;
}
/* SRWLocks distinguish between write and read unlocks, but pthread interface does not,
* so make a trivial wrapper type instead to avoid any possible API conflicts. */
typedef struct rwlock
{
SRWLOCK rwlock;
} rwlock_t;
static inline int rwlock_init(rwlock_t *lock)
{
InitializeSRWLock(&lock->rwlock);
return 0;
}
static inline int rwlock_lock_write(rwlock_t *lock)
{
AcquireSRWLockExclusive(&lock->rwlock);
return 0;
}
static inline int rwlock_lock_read(rwlock_t *lock)
{
AcquireSRWLockShared(&lock->rwlock);
return 0;
}
static inline int rwlock_unlock_write(rwlock_t *lock)
{
ReleaseSRWLockExclusive(&lock->rwlock);
return 0;
}
static inline int rwlock_unlock_read(rwlock_t *lock)
{
ReleaseSRWLockShared(&lock->rwlock);
return 0;
}
static inline int rwlock_destroy(rwlock_t *lock)
{
return 0;
}
static inline int pthread_cond_init(pthread_cond_t *cond, void *attr)
{
(void)attr;
@ -143,6 +189,32 @@ static inline int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *lock)
return ret ? 0 : -1;
}
static inline int condvar_reltime_init(condvar_reltime_t *cond)
{
return pthread_cond_init(cond, NULL);
}
static inline int condvar_reltime_destroy(condvar_reltime_t *cond)
{
return pthread_cond_destroy(cond);
}
static inline int condvar_reltime_signal(condvar_reltime_t *cond)
{
return pthread_cond_signal(cond);
}
static inline int condvar_reltime_wait_timeout_seconds(condvar_reltime_t *cond, pthread_mutex_t *lock, unsigned int seconds)
{
BOOL ret = SleepConditionVariableSRW(&cond->cond, &lock->lock, seconds * 1000, 0);
if (ret)
return 0;
else if (GetLastError() == ERROR_TIMEOUT)
return 1;
else
return -1;
}
static inline void vkd3d_set_thread_name(const char *name)
{
(void)name;
@ -166,10 +238,96 @@ static inline void pthread_once(pthread_once_t *once, void (*func)(void))
}
#else
#include <pthread.h>
#include <errno.h>
#include <time.h>
static inline void vkd3d_set_thread_name(const char *name)
{
pthread_setname_np(pthread_self(), name);
}
typedef struct rwlock
{
pthread_rwlock_t rwlock;
} rwlock_t;
static inline int rwlock_init(rwlock_t *lock)
{
return pthread_rwlock_init(&lock->rwlock, NULL);
}
static inline int rwlock_lock_write(rwlock_t *lock)
{
return pthread_rwlock_wrlock(&lock->rwlock);
}
static inline int rwlock_lock_read(rwlock_t *lock)
{
return pthread_rwlock_rdlock(&lock->rwlock);
}
static inline int rwlock_unlock_write(rwlock_t *lock)
{
return pthread_rwlock_unlock(&lock->rwlock);
}
static inline int rwlock_unlock_read(rwlock_t *lock)
{
return pthread_rwlock_unlock(&lock->rwlock);
}
static inline int rwlock_destroy(rwlock_t *lock)
{
return pthread_rwlock_destroy(&lock->rwlock);
}
typedef struct condvar_reltime
{
pthread_cond_t cond;
} condvar_reltime_t;
static inline int condvar_reltime_init(condvar_reltime_t *cond)
{
pthread_condattr_t attr;
int rc;
pthread_condattr_init(&attr);
pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
rc = pthread_cond_init(&cond->cond, &attr);
pthread_condattr_destroy(&attr);
return rc;
}
static inline int condvar_reltime_destroy(condvar_reltime_t *cond)
{
return pthread_cond_destroy(&cond->cond);
}
static inline int condvar_reltime_signal(condvar_reltime_t *cond)
{
return pthread_cond_signal(&cond->cond);
}
static inline int condvar_reltime_wait_timeout_seconds(condvar_reltime_t *cond, pthread_mutex_t *lock, unsigned int seconds)
{
struct timespec ts;
int rc;
clock_gettime(CLOCK_MONOTONIC, &ts);
ts.tv_sec += seconds;
/* pthread_cond_timedwait takes an absolute deadline, measured here on CLOCK_MONOTONIC. */
rc = pthread_cond_timedwait(&cond->cond, lock, &ts);
if (rc == ETIMEDOUT)
return 1;
else if (rc == 0)
return 0;
else
return -1;
}
#define PTHREAD_ONCE_CALLBACK
#endif
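
The POSIX branch above ties its condition variable to `CLOCK_MONOTONIC` and then waits against an absolute deadline computed from that same clock. A stand-alone sketch of the pattern follows, using a millisecond timeout so it runs quickly. `demo_cond_init_monotonic` and `demo_cond_wait_timeout_ms` are hypothetical names; the return convention (0 signalled, 1 timed out, -1 error) mirrors the source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Creates a condition variable whose timed waits are measured on CLOCK_MONOTONIC. */
static int demo_cond_init_monotonic(pthread_cond_t *cond)
{
    pthread_condattr_t attr;
    int rc;

    assert(cond);
    if ((rc = pthread_condattr_init(&attr)))
        return rc;
    rc = pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
    if (!rc)
        rc = pthread_cond_init(cond, &attr);
    pthread_condattr_destroy(&attr);
    return rc;
}

/* Waits up to timeout_ms. The caller must hold lock.
 * Returns 0 when woken, 1 on timeout, -1 on any other error. */
static int demo_cond_wait_timeout_ms(pthread_cond_t *cond, pthread_mutex_t *lock, unsigned int timeout_ms)
{
    struct timespec ts;
    int rc;

    assert(cond && lock);
    clock_gettime(CLOCK_MONOTONIC, &ts);
    ts.tv_sec += timeout_ms / 1000;
    ts.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L)
    {
        ts.tv_sec += 1;
        ts.tv_nsec -= 1000000000L;
    }

    /* The deadline is absolute and interpreted on the condvar's clock. */
    rc = pthread_cond_timedwait(cond, lock, &ts);
    if (rc == ETIMEDOUT)
        return 1;
    return rc == 0 ? 0 : -1;
}
```

Using the monotonic clock for both the attribute and the deadline keeps the timeout stable if the wall clock is adjusted while a thread is waiting. A return of 0 can also be a spurious wakeup, so callers normally re-check their predicate in a loop.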

@ -23,6 +23,6 @@
/* If max_elements is 0, the string ends at the nul terminator.
* Otherwise it ends at the nul terminator or after max_elements elements, whichever comes first. */
char *vkd3d_strdup_w_utf8(const WCHAR *wstr, size_t wchar_size, size_t max_elements);
char *vkd3d_strdup_w_utf8(const WCHAR *wstr, size_t max_elements);
#endif /* __VKD3D_UTF8_H */

@ -0,0 +1,6 @@
#ifndef __VULKAN_PRIVATE_EXTENSIONS_H__
#define __VULKAN_PRIVATE_EXTENSIONS_H__
/* Nothing here at the moment. Add hacks here! */
#endif

@ -0,0 +1,71 @@
INCLUDE_DIR := $(CURDIR)
VERT_SOURCES := $(wildcard $(M)/*.vert)
FRAG_SOURCES := $(wildcard $(M)/*.frag)
COMP_SOURCES := $(wildcard $(M)/*.comp)
TESC_SOURCES := $(wildcard $(M)/*.tesc)
TESE_SOURCES := $(wildcard $(M)/*.tese)
GEOM_SOURCES := $(wildcard $(M)/*.geom)
RGEN_SOURCES := $(wildcard $(M)/*.rgen)
RINT_SOURCES := $(wildcard $(M)/*.rint)
RAHIT_SOURCES := $(wildcard $(M)/*.rahit)
RCHIT_SOURCES := $(wildcard $(M)/*.rchit)
RMISS_SOURCES := $(wildcard $(M)/*.rmiss)
RCALL_SOURCES := $(wildcard $(M)/*.rcall)
SPV_OBJECTS := \
$(VERT_SOURCES:.vert=.spv) \
$(FRAG_SOURCES:.frag=.spv) \
$(COMP_SOURCES:.comp=.spv) \
$(TESC_SOURCES:.tesc=.spv) \
$(TESE_SOURCES:.tese=.spv) \
$(GEOM_SOURCES:.geom=.spv) \
$(RGEN_SOURCES:.rgen=.spv) \
$(RINT_SOURCES:.rint=.spv) \
$(RAHIT_SOURCES:.rahit=.spv) \
$(RCHIT_SOURCES:.rchit=.spv) \
$(RMISS_SOURCES:.rmiss=.spv) \
$(RCALL_SOURCES:.rcall=.spv)
%.spv: %.vert
glslc -o $@ $< -I$(INCLUDE_DIR) --target-env=vulkan1.1 $(GLSLC_FLAGS)
%.spv: %.frag
glslc -o $@ $< -I$(INCLUDE_DIR) --target-env=vulkan1.1 -DDEBUG_CHANNEL_HELPER_LANES $(GLSLC_FLAGS)
%.spv: %.comp
glslc -o $@ $< -I$(INCLUDE_DIR) --target-env=vulkan1.1 $(GLSLC_FLAGS)
%.spv: %.geom
glslc -o $@ $< -I$(INCLUDE_DIR) --target-env=vulkan1.1 $(GLSLC_FLAGS)
%.spv: %.tesc
glslc -o $@ $< -I$(INCLUDE_DIR) --target-env=vulkan1.1 $(GLSLC_FLAGS)
%.spv: %.tese
glslc -o $@ $< -I$(INCLUDE_DIR) --target-env=vulkan1.1 $(GLSLC_FLAGS)
%.spv: %.rgen
glslc -o $@ $< -I$(INCLUDE_DIR) --target-env=vulkan1.1 --target-spv=spv1.4 $(GLSLC_FLAGS)
%.spv: %.rint
glslc -o $@ $< -I$(INCLUDE_DIR) --target-env=vulkan1.1 --target-spv=spv1.4 $(GLSLC_FLAGS)
%.spv: %.rahit
glslc -o $@ $< -I$(INCLUDE_DIR) --target-env=vulkan1.1 --target-spv=spv1.4 $(GLSLC_FLAGS)
%.spv: %.rchit
glslc -o $@ $< -I$(INCLUDE_DIR) --target-env=vulkan1.1 --target-spv=spv1.4 $(GLSLC_FLAGS)
%.spv: %.rmiss
glslc -o $@ $< -I$(INCLUDE_DIR) --target-env=vulkan1.1 --target-spv=spv1.4 $(GLSLC_FLAGS)
%.spv: %.rcall
glslc -o $@ $< -I$(INCLUDE_DIR) --target-env=vulkan1.1 --target-spv=spv1.4 $(GLSLC_FLAGS)
all: $(SPV_OBJECTS)
clean:
rm -f $(SPV_OBJECTS)
.PHONY: clean

@ -23,14 +23,17 @@
#extension GL_ARB_gpu_shader_int64 : require
#extension GL_KHR_shader_subgroup_basic : require
#extension GL_KHR_shader_subgroup_ballot : require
#ifdef DEBUG_CHANNEL_HELPER_LANES
#extension GL_EXT_demote_to_helper_invocation : require
#endif
layout(buffer_reference, std430, buffer_reference_align = 4) buffer ControlBlock
layout(buffer_reference, std430, buffer_reference_align = 4) coherent buffer ControlBlock
{
uint message_counter;
uint instance_counter;
};
layout(buffer_reference, std430, buffer_reference_align = 4) buffer RingBuffer
layout(buffer_reference, std430, buffer_reference_align = 4) coherent buffer RingBuffer
{
uint data[];
};
@ -48,24 +51,73 @@ const uint DEBUG_CHANNEL_FMT_F32 = 2;
const uint DEBUG_CHANNEL_FMT_HEX_ALL = DEBUG_CHANNEL_FMT_HEX * 0x55555555u;
const uint DEBUG_CHANNEL_FMT_I32_ALL = DEBUG_CHANNEL_FMT_I32 * 0x55555555u;
const uint DEBUG_CHANNEL_FMT_F32_ALL = DEBUG_CHANNEL_FMT_F32 * 0x55555555u;
const uint DEBUG_CHANNEL_WORD_COOKIE = 0xdeadca70u; /* Let host fish for this cookie in device lost scenarios. */
uint DEBUG_CHANNEL_INSTANCE_COUNTER;
uvec3 DEBUG_CHANNEL_ID;
/* Need to make sure the elected subgroup can have side effects. */
#ifdef DEBUG_CHANNEL_HELPER_LANES
bool DEBUG_CHANNEL_ELECT()
{
bool elected = false;
if (!helperInvocationEXT())
elected = subgroupElect();
return elected;
}
#else
bool DEBUG_CHANNEL_ELECT()
{
return subgroupElect();
}
#endif
void DEBUG_CHANNEL_INIT(uvec3 id)
{
if (!DEBUG_SHADER_RING_ACTIVE)
return;
DEBUG_CHANNEL_ID = id;
uint inst;
if (subgroupElect())
#ifdef DEBUG_CHANNEL_HELPER_LANES
if (!helperInvocationEXT())
{
/* Elect and broadcast must happen without helper lanes here.
* We must perform the instance increment with side effects,
* and broadcast first must pick the elected lane. */
if (subgroupElect())
inst = atomicAdd(ControlBlock(DEBUG_SHADER_ATOMIC_BDA).instance_counter, 1u);
DEBUG_CHANNEL_INSTANCE_COUNTER = subgroupBroadcastFirst(inst);
}
/* Helper lanes cannot write debug messages, since they cannot have side effects.
* Leave it undefined, and we should ensure SGPR propagation either way ... */
#else
if (DEBUG_CHANNEL_ELECT())
inst = atomicAdd(ControlBlock(DEBUG_SHADER_ATOMIC_BDA).instance_counter, 1u);
DEBUG_CHANNEL_INSTANCE_COUNTER = subgroupBroadcastFirst(inst);
#endif
}
void DEBUG_CHANNEL_WRITE_HEADER(RingBuffer buf, uint offset, uint num_words, uint fmt)
void DEBUG_CHANNEL_INIT_IMPLICIT_INSTANCE(uvec3 id, uint inst)
{
if (!DEBUG_SHADER_RING_ACTIVE)
return;
DEBUG_CHANNEL_ID = id;
DEBUG_CHANNEL_INSTANCE_COUNTER = inst;
}
void DEBUG_CHANNEL_UNLOCK_MESSAGE(RingBuffer buf, uint offset, uint num_words)
{
memoryBarrierBuffer();
/* Make sure this word is made visible last. This way the ring thread can avoid reading bogus messages.
* If the host thread observed a num_word of 0, we know a message was allocated, but we don't necessarily
* have a complete write yet.
* In a device lost scenario, we can try to fish for valid messages. */
buf.data[(offset + 0) & DEBUG_SHADER_RING_MASK] = num_words | DEBUG_CHANNEL_WORD_COOKIE;
memoryBarrierBuffer();
}
void DEBUG_CHANNEL_WRITE_HEADER(RingBuffer buf, uint offset, uint fmt)
{
buf.data[(offset + 0) & DEBUG_SHADER_RING_MASK] = num_words;
buf.data[(offset + 1) & DEBUG_SHADER_RING_MASK] = uint(DEBUG_SHADER_HASH);
buf.data[(offset + 2) & DEBUG_SHADER_RING_MASK] = uint(DEBUG_SHADER_HASH >> 32);
buf.data[(offset + 3) & DEBUG_SHADER_RING_MASK] = DEBUG_CHANNEL_INSTANCE_COUNTER;
@ -87,7 +139,9 @@ void DEBUG_CHANNEL_MSG_()
return;
uint words = 8;
uint offset = DEBUG_CHANNEL_ALLOCATE(words);
DEBUG_CHANNEL_WRITE_HEADER(RingBuffer(DEBUG_SHADER_RING_BDA), offset, words, 0);
RingBuffer buf = RingBuffer(DEBUG_SHADER_RING_BDA);
DEBUG_CHANNEL_WRITE_HEADER(buf, offset, 0);
DEBUG_CHANNEL_UNLOCK_MESSAGE(buf, offset, words);
}
void DEBUG_CHANNEL_MSG_(uint fmt, uint v0)
@ -97,8 +151,9 @@ void DEBUG_CHANNEL_MSG_(uint fmt, uint v0)
RingBuffer buf = RingBuffer(DEBUG_SHADER_RING_BDA);
uint words = 9;
uint offset = DEBUG_CHANNEL_ALLOCATE(words);
DEBUG_CHANNEL_WRITE_HEADER(buf, offset, words, fmt);
DEBUG_CHANNEL_WRITE_HEADER(buf, offset, fmt);
buf.data[(offset + 8) & DEBUG_SHADER_RING_MASK] = v0;
DEBUG_CHANNEL_UNLOCK_MESSAGE(buf, offset, words);
}
void DEBUG_CHANNEL_MSG_(uint fmt, uint v0, uint v1)
@ -108,9 +163,10 @@ void DEBUG_CHANNEL_MSG_(uint fmt, uint v0, uint v1)
RingBuffer buf = RingBuffer(DEBUG_SHADER_RING_BDA);
uint words = 10;
uint offset = DEBUG_CHANNEL_ALLOCATE(words);
DEBUG_CHANNEL_WRITE_HEADER(buf, offset, words, fmt);
DEBUG_CHANNEL_WRITE_HEADER(buf, offset, fmt);
buf.data[(offset + 8) & DEBUG_SHADER_RING_MASK] = v0;
buf.data[(offset + 9) & DEBUG_SHADER_RING_MASK] = v1;
DEBUG_CHANNEL_UNLOCK_MESSAGE(buf, offset, words);
}
void DEBUG_CHANNEL_MSG_(uint fmt, uint v0, uint v1, uint v2)
@ -120,10 +176,11 @@ void DEBUG_CHANNEL_MSG_(uint fmt, uint v0, uint v1, uint v2)
RingBuffer buf = RingBuffer(DEBUG_SHADER_RING_BDA);
uint words = 11;
uint offset = DEBUG_CHANNEL_ALLOCATE(words);
DEBUG_CHANNEL_WRITE_HEADER(buf, offset, words, fmt);
DEBUG_CHANNEL_WRITE_HEADER(buf, offset, fmt);
buf.data[(offset + 8) & DEBUG_SHADER_RING_MASK] = v0;
buf.data[(offset + 9) & DEBUG_SHADER_RING_MASK] = v1;
buf.data[(offset + 10) & DEBUG_SHADER_RING_MASK] = v2;
DEBUG_CHANNEL_UNLOCK_MESSAGE(buf, offset, words);
}
void DEBUG_CHANNEL_MSG_(uint fmt, uint v0, uint v1, uint v2, uint v3)
@ -133,11 +190,12 @@ void DEBUG_CHANNEL_MSG_(uint fmt, uint v0, uint v1, uint v2, uint v3)
RingBuffer buf = RingBuffer(DEBUG_SHADER_RING_BDA);
uint words = 12;
uint offset = DEBUG_CHANNEL_ALLOCATE(words);
DEBUG_CHANNEL_WRITE_HEADER(buf, offset, words, fmt);
DEBUG_CHANNEL_WRITE_HEADER(buf, offset, fmt);
buf.data[(offset + 8) & DEBUG_SHADER_RING_MASK] = v0;
buf.data[(offset + 9) & DEBUG_SHADER_RING_MASK] = v1;
buf.data[(offset + 10) & DEBUG_SHADER_RING_MASK] = v2;
buf.data[(offset + 11) & DEBUG_SHADER_RING_MASK] = v3;
DEBUG_CHANNEL_UNLOCK_MESSAGE(buf, offset, words);
}
void DEBUG_CHANNEL_MSG()
@ -205,4 +263,76 @@ void DEBUG_CHANNEL_MSG(float v0, float v1, float v2, float v3)
DEBUG_CHANNEL_MSG_(DEBUG_CHANNEL_FMT_F32_ALL, floatBitsToUint(v0), floatBitsToUint(v1), floatBitsToUint(v2), floatBitsToUint(v3));
}
void DEBUG_CHANNEL_MSG_UNIFORM(uint v0)
{
if (DEBUG_CHANNEL_ELECT())
DEBUG_CHANNEL_MSG(v0);
}
void DEBUG_CHANNEL_MSG_UNIFORM(uint v0, uint v1)
{
if (DEBUG_CHANNEL_ELECT())
DEBUG_CHANNEL_MSG(v0, v1);
}
void DEBUG_CHANNEL_MSG_UNIFORM(uint v0, uint v1, uint v2)
{
if (DEBUG_CHANNEL_ELECT())
DEBUG_CHANNEL_MSG(v0, v1, v2);
}
void DEBUG_CHANNEL_MSG_UNIFORM(uint v0, uint v1, uint v2, uint v3)
{
if (DEBUG_CHANNEL_ELECT())
DEBUG_CHANNEL_MSG(v0, v1, v2, v3);
}
void DEBUG_CHANNEL_MSG_UNIFORM(int v0)
{
if (DEBUG_CHANNEL_ELECT())
DEBUG_CHANNEL_MSG(v0);
}
void DEBUG_CHANNEL_MSG_UNIFORM(int v0, int v1)
{
if (DEBUG_CHANNEL_ELECT())
DEBUG_CHANNEL_MSG(v0, v1);
}
void DEBUG_CHANNEL_MSG_UNIFORM(int v0, int v1, int v2)
{
if (DEBUG_CHANNEL_ELECT())
DEBUG_CHANNEL_MSG(v0, v1, v2);
}
void DEBUG_CHANNEL_MSG_UNIFORM(int v0, int v1, int v2, int v3)
{
if (DEBUG_CHANNEL_ELECT())
DEBUG_CHANNEL_MSG(v0, v1, v2, v3);
}
void DEBUG_CHANNEL_MSG_UNIFORM(float v0)
{
if (DEBUG_CHANNEL_ELECT())
DEBUG_CHANNEL_MSG(v0);
}
void DEBUG_CHANNEL_MSG_UNIFORM(float v0, float v1)
{
if (DEBUG_CHANNEL_ELECT())
DEBUG_CHANNEL_MSG(v0, v1);
}
void DEBUG_CHANNEL_MSG_UNIFORM(float v0, float v1, float v2)
{
if (DEBUG_CHANNEL_ELECT())
DEBUG_CHANNEL_MSG(v0, v1, v2);
}
void DEBUG_CHANNEL_MSG_UNIFORM(float v0, float v1, float v2, float v3)
{
if (DEBUG_CHANNEL_ELECT())
DEBUG_CHANNEL_MSG(v0, v1, v2, v3);
}
#endif
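
`DEBUG_CHANNEL_UNLOCK_MESSAGE` publishes a message by writing word 0 last, tagged with `DEBUG_CHANNEL_WORD_COOKIE`, so a reader never treats a half-written message as complete. The host-side C sketch below models that ordering with C11 atomics. Every `demo_` name is hypothetical, the ring size is arbitrary, and the way the reader splits the cookie from the word count (low 4 bits) is an assumption of this sketch, not something the shader source specifies.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_RING_WORDS 64u          /* must be a power of two */
#define DEMO_RING_MASK (DEMO_RING_WORDS - 1u)
#define DEMO_WORD_COOKIE 0xdeadca70u /* same value as DEBUG_CHANNEL_WORD_COOKIE */
#define DEMO_HEADER_WORDS 8u

static _Atomic uint32_t demo_ring[DEMO_RING_WORDS];

/* Writer: fill everything except word 0, then publish word 0 last. */
static void demo_write_message(uint32_t offset, uint32_t num_words, const uint32_t *payload)
{
    uint32_t i;

    assert(num_words >= DEMO_HEADER_WORDS && num_words < 16u);
    /* Header words 1..7 are zeroed here; the shader stores hash, instance and id data. */
    for (i = 1; i < DEMO_HEADER_WORDS; i++)
        atomic_store_explicit(&demo_ring[(offset + i) & DEMO_RING_MASK], 0u, memory_order_relaxed);
    for (i = DEMO_HEADER_WORDS; i < num_words; i++)
        atomic_store_explicit(&demo_ring[(offset + i) & DEMO_RING_MASK],
                payload[i - DEMO_HEADER_WORDS], memory_order_relaxed);

    /* Release ordering roughly plays the role of memoryBarrierBuffer() before the final store. */
    atomic_store_explicit(&demo_ring[offset & DEMO_RING_MASK],
            num_words | DEMO_WORD_COOKIE, memory_order_release);
}

/* Reader: returns the word count if a complete message is visible, 0 otherwise.
 * Assumption: the cookie's low 4 bits are zero, so they can carry the count. */
static uint32_t demo_poll_message(uint32_t offset)
{
    uint32_t word = atomic_load_explicit(&demo_ring[offset & DEMO_RING_MASK], memory_order_acquire);

    if ((word & ~0xfu) != DEMO_WORD_COOKIE)
        return 0;
    return word & 0xfu;
}
```

The cookie also gives a post-mortem tool something to scan for: any word matching the pattern marks the start of a message that was fully written before the device was lost.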

@ -31,8 +31,12 @@
# define VK_USE_PLATFORM_WIN32_KHR
# endif
# include <vulkan/vulkan.h>
# include "private/vulkan_private_extensions.h"
#endif /* VKD3D_NO_VULKAN_H */
#define VKD3D_MIN_API_VERSION VK_API_VERSION_1_1
#define VKD3D_MAX_API_VERSION VK_API_VERSION_1_1
#if defined(__GNUC__)
# define DECLSPEC_VISIBLE __attribute__((visibility("default")))
#else
@ -55,22 +59,39 @@
extern "C" {
#endif /* __cplusplus */
enum vkd3d_structure_type
{
/* 1.0 */
VKD3D_STRUCTURE_TYPE_INSTANCE_CREATE_INFO,
VKD3D_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
VKD3D_STRUCTURE_TYPE_IMAGE_RESOURCE_CREATE_INFO,
/* 1.1 */
VKD3D_STRUCTURE_TYPE_OPTIONAL_INSTANCE_EXTENSIONS_INFO,
/* 1.2 */
VKD3D_STRUCTURE_TYPE_OPTIONAL_DEVICE_EXTENSIONS_INFO,
VKD3D_STRUCTURE_TYPE_APPLICATION_INFO,
VKD3D_FORCE_32_BIT_ENUM(VKD3D_STRUCTURE_TYPE),
};
#define VKD3D_CONFIG_FLAG_VULKAN_DEBUG (1ull << 0)
#define VKD3D_CONFIG_FLAG_SKIP_APPLICATION_WORKAROUNDS (1ull << 1)
#define VKD3D_CONFIG_FLAG_DEBUG_UTILS (1ull << 2)
#define VKD3D_CONFIG_FLAG_FORCE_STATIC_CBV (1ull << 3)
#define VKD3D_CONFIG_FLAG_DXR (1ull << 4)
#define VKD3D_CONFIG_FLAG_SINGLE_QUEUE (1ull << 5)
#define VKD3D_CONFIG_FLAG_DESCRIPTOR_QA_CHECKS (1ull << 6)
#define VKD3D_CONFIG_FLAG_FORCE_RTV_EXCLUSIVE_QUEUE (1ull << 7)
#define VKD3D_CONFIG_FLAG_FORCE_DSV_EXCLUSIVE_QUEUE (1ull << 8)
#define VKD3D_CONFIG_FLAG_FORCE_MINIMUM_SUBGROUP_SIZE (1ull << 9)
#define VKD3D_CONFIG_FLAG_NO_UPLOAD_HVV (1ull << 10)
#define VKD3D_CONFIG_FLAG_LOG_MEMORY_BUDGET (1ull << 11)
#define VKD3D_CONFIG_FLAG_IGNORE_RTV_HOST_VISIBLE (1ull << 12)
#define VKD3D_CONFIG_FLAG_FORCE_HOST_CACHED (1ull << 13)
#define VKD3D_CONFIG_FLAG_DXR11 (1ull << 14)
#define VKD3D_CONFIG_FLAG_FORCE_NO_INVARIANT_POSITION (1ull << 15)
#define VKD3D_CONFIG_FLAG_GLOBAL_PIPELINE_CACHE (1ull << 16)
#define VKD3D_CONFIG_FLAG_PIPELINE_LIBRARY_NO_SERIALIZE_SPIRV (1ull << 17)
#define VKD3D_CONFIG_FLAG_PIPELINE_LIBRARY_SANITIZE_SPIRV (1ull << 18)
#define VKD3D_CONFIG_FLAG_PIPELINE_LIBRARY_LOG (1ull << 19)
#define VKD3D_CONFIG_FLAG_PIPELINE_LIBRARY_IGNORE_SPIRV (1ull << 20)
#define VKD3D_CONFIG_FLAG_MUTABLE_SINGLE_SET (1ull << 21)
#define VKD3D_CONFIG_FLAG_MEMORY_ALLOCATOR_SKIP_CLEAR (1ull << 22)
#define VKD3D_CONFIG_FLAG_RECYCLE_COMMAND_POOLS (1ull << 23)
#define VKD3D_CONFIG_FLAG_PIPELINE_LIBRARY_IGNORE_MISMATCH_DRIVER (1ull << 24)
#define VKD3D_CONFIG_FLAG_BREADCRUMBS (1ull << 25)
#define VKD3D_CONFIG_FLAG_PIPELINE_LIBRARY_APP_CACHE_ONLY (1ull << 26)
#define VKD3D_CONFIG_FLAG_SHADER_CACHE_SYNC (1ull << 27)
#define VKD3D_CONFIG_FLAG_FORCE_RAW_VA_CBV (1ull << 28)
#define VKD3D_CONFIG_FLAG_ZERO_MEMORY_WORKAROUNDS_COMMITTED_BUFFER_UAV (1ull << 29)
#define VKD3D_CONFIG_FLAG_ALLOW_SBT_COLLECTION (1ull << 30)
#define VKD3D_CONFIG_FLAG_FORCE_NATIVE_FP16 (1ull << 31)
#define VKD3D_CONFIG_FLAG_USE_HOST_IMPORT_FALLBACK (1ull << 32)
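
The flag list now reaches bit 32, which is why each value is written as `1ull << n` and why any variable holding the mask needs to be 64 bits wide. A short sketch of the failure mode when the mask is narrowed is shown below. The `DEMO_` macros copy two values from the list, while `demo_has_flag` and `demo_truncate` are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the list above; the DEMO_ prefix marks them as local to this sketch. */
#define DEMO_CONFIG_FLAG_DXR (1ull << 4)
#define DEMO_CONFIG_FLAG_USE_HOST_IMPORT_FALLBACK (1ull << 32)

/* Tests a single flag bit in a 64-bit mask. */
static bool demo_has_flag(uint64_t flags, uint64_t flag)
{
    assert(flag && !(flag & (flag - 1))); /* exactly one bit set */
    return (flags & flag) != 0;
}

/* Storing the mask in a 32-bit variable silently drops every flag at bit 32 or above. */
static uint32_t demo_truncate(uint64_t flags)
{
    return (uint32_t)flags;
}
```

Writing `1u << 32` instead would be undefined behaviour in C on platforms with a 32-bit `unsigned int`, so the `ull` suffix is required from that bit onward.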
typedef HRESULT (*PFN_vkd3d_signal_event)(HANDLE event);
@ -83,49 +104,22 @@ struct vkd3d_instance;
struct vkd3d_instance_create_info
{
enum vkd3d_structure_type type;
const void *next;
PFN_vkd3d_signal_event pfn_signal_event;
PFN_vkd3d_create_thread pfn_create_thread;
PFN_vkd3d_join_thread pfn_join_thread;
size_t wchar_size;
/* If set to NULL, libvkd3d loads libvulkan. */
PFN_vkGetInstanceProcAddr pfn_vkGetInstanceProcAddr;
const char * const *instance_extensions;
uint32_t instance_extension_count;
};
/* Extends vkd3d_instance_create_info. Available since 1.1. */
struct vkd3d_optional_instance_extensions_info
{
enum vkd3d_structure_type type;
const void *next;
const char * const *extensions;
uint32_t extension_count;
};
/* Extends vkd3d_instance_create_info. Available since 1.2. */
struct vkd3d_application_info
{
enum vkd3d_structure_type type;
const void *next;
const char *application_name;
uint32_t application_version;
const char *engine_name; /* "vkd3d" if NULL */
uint32_t engine_version; /* vkd3d version if engine_name is NULL */
const char * const *optional_instance_extensions;
uint32_t optional_instance_extension_count;
};
struct vkd3d_device_create_info
{
enum vkd3d_structure_type type;
const void *next;
D3D_FEATURE_LEVEL minimum_feature_level;
struct vkd3d_instance *instance;
@ -136,25 +130,15 @@ struct vkd3d_device_create_info
const char * const *device_extensions;
uint32_t device_extension_count;
const char * const *optional_device_extensions;
uint32_t optional_device_extension_count;
IUnknown *parent;
LUID adapter_luid;
};
/* Extends vkd3d_device_create_info. Available since 1.2. */
struct vkd3d_optional_device_extensions_info
{
enum vkd3d_structure_type type;
const void *next;
const char * const *extensions;
uint32_t extension_count;
};
struct vkd3d_image_resource_create_info
{
enum vkd3d_structure_type type;
const void *next;
VkImage vk_image;
D3D12_RESOURCE_DESC desc;
unsigned int flags;

@ -0,0 +1,32 @@
/*
* Copyright 2021 NVIDIA Corporation
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
import "vkd3d_d3d12.idl";
import "vkd3d_vk_includes.h";
[
uuid(77a86b09-2bea-4801-b89a-37648e104af1),
object,
local,
pointer_default(unique)
]
interface ID3D12GraphicsCommandListExt : IUnknown
{
HRESULT GetVulkanHandle(VkCommandBuffer *pVkCommandBuffer);
HRESULT LaunchCubinShader(D3D12_CUBIN_DATA_HANDLE *handle, UINT32 block_x, UINT32 block_y, UINT32 block_z, const void *params, UINT32 param_size);
}

@ -26,11 +26,11 @@ cpp_quote("#ifndef _D3D12_CONSTANTS")
cpp_quote("#define _D3D12_CONSTANTS")
cpp_quote("#ifndef D3D12_ERROR_ADAPTER_NOT_FOUND")
cpp_quote("#define D3D12_ERROR_ADAPTER_NOT_FOUND 0x887e0001")
cpp_quote("#define D3D12_ERROR_ADAPTER_NOT_FOUND ((HRESULT)0x887e0001)")
cpp_quote("#endif")
cpp_quote("#ifndef D3D12_ERROR_DRIVER_VERSION_MISMATCH")
cpp_quote("#define D3D12_ERROR_DRIVER_VERSION_MISMATCH 0x887e0002")
cpp_quote("#define D3D12_ERROR_DRIVER_VERSION_MISMATCH ((HRESULT)0x887e0002)")
cpp_quote("#endif")
const UINT D3D12_CS_TGSM_REGISTER_COUNT = 8192;
@ -109,6 +109,7 @@ typedef enum D3D12_SHADER_MIN_PRECISION_SUPPORT
D3D12_SHADER_MIN_PRECISION_SUPPORT_10_BIT = 0x1,
D3D12_SHADER_MIN_PRECISION_SUPPORT_16_BIT = 0x2,
} D3D12_SHADER_MIN_PRECISION_SUPPORT;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_SHADER_MIN_PRECISION_SUPPORT);")
typedef enum D3D12_TILED_RESOURCES_TIER
{
@ -168,6 +169,7 @@ typedef enum D3D12_SHARED_RESOURCE_COMPATIBILITY_TIER
{
D3D12_SHARED_RESOURCE_COMPATIBILITY_TIER_0 = 0,
D3D12_SHARED_RESOURCE_COMPATIBILITY_TIER_1 = 1,
D3D12_SHARED_RESOURCE_COMPATIBILITY_TIER_2 = 2,
} D3D12_SHARED_RESOURCE_COMPATIBILITY_TIER;
typedef enum D3D12_HEAP_SERIALIZATION_TIER
@ -187,6 +189,7 @@ typedef enum D3D12_RAYTRACING_TIER
{
D3D12_RAYTRACING_TIER_NOT_SUPPORTED = 0,
D3D12_RAYTRACING_TIER_1_0 = 10,
D3D12_RAYTRACING_TIER_1_1 = 11,
} D3D12_RAYTRACING_TIER;
typedef enum D3D12_VARIABLE_SHADING_RATE_TIER
@ -196,6 +199,19 @@ typedef enum D3D12_VARIABLE_SHADING_RATE_TIER
D3D12_VARIABLE_SHADING_RATE_TIER_2 = 2,
} D3D12_VARIABLE_SHADING_RATE_TIER;
typedef enum D3D12_MESH_SHADER_TIER
{
D3D12_MESH_SHADER_TIER_NOT_SUPPORTED = 0,
D3D12_MESH_SHADER_TIER_1 = 10,
} D3D12_MESH_SHADER_TIER;
typedef enum D3D12_SAMPLER_FEEDBACK_TIER
{
D3D12_SAMPLER_FEEDBACK_TIER_NOT_SUPPORTED = 0,
D3D12_SAMPLER_FEEDBACK_TIER_0_9 = 90,
D3D12_SAMPLER_FEEDBACK_TIER_1_0 = 100,
} D3D12_SAMPLER_FEEDBACK_TIER;
typedef enum D3D12_COMMAND_LIST_SUPPORT_FLAGS
{
D3D12_COMMAND_LIST_SUPPORT_FLAG_NONE = 0x0,
@ -205,7 +221,9 @@ typedef enum D3D12_COMMAND_LIST_SUPPORT_FLAGS
D3D12_COMMAND_LIST_SUPPORT_FLAG_COPY = 0x8,
D3D12_COMMAND_LIST_SUPPORT_FLAG_VIDEO_DECODE = 0x10,
D3D12_COMMAND_LIST_SUPPORT_FLAG_VIDEO_PROCESS = 0x20,
D3D12_COMMAND_LIST_SUPPORT_FLAG_VIDEO_ENCODE = 0x40,
} D3D12_COMMAND_LIST_SUPPORT_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_COMMAND_LIST_SUPPORT_FLAGS);")
typedef enum D3D12_FORMAT_SUPPORT1
{
@ -240,6 +258,7 @@ typedef enum D3D12_FORMAT_SUPPORT1
D3D12_FORMAT_SUPPORT1_VIDEO_PROCESSOR_INPUT = 0x20000000,
D3D12_FORMAT_SUPPORT1_VIDEO_ENCODER = 0x40000000,
} D3D12_FORMAT_SUPPORT1;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_FORMAT_SUPPORT1);")
typedef enum D3D12_FORMAT_SUPPORT2
{
@ -255,7 +274,9 @@ typedef enum D3D12_FORMAT_SUPPORT2
D3D12_FORMAT_SUPPORT2_OUTPUT_MERGER_LOGIC_OP = 0x00000100,
D3D12_FORMAT_SUPPORT2_TILED = 0x00000200,
D3D12_FORMAT_SUPPORT2_MULTIPLANE_OVERLAY = 0x00004000,
D3D12_FORMAT_SUPPORT2_SAMPLER_FEEDBACK = 0x00008000,
} D3D12_FORMAT_SUPPORT2;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_FORMAT_SUPPORT2);")
typedef enum D3D12_WRITEBUFFERIMMEDIATE_MODE
{
@ -264,6 +285,12 @@ typedef enum D3D12_WRITEBUFFERIMMEDIATE_MODE
D3D12_WRITEBUFFERIMMEDIATE_MODE_MARKER_OUT = 0x2,
} D3D12_WRITEBUFFERIMMEDIATE_MODE;
typedef enum D3D12_WAVE_MMA_TIER
{
D3D12_WAVE_MMA_TIER_NOT_SUPPORTED = 0,
D3D12_WAVE_MMA_TIER_1_0 = 10,
} D3D12_WAVE_MMA_TIER;
interface ID3D12Fence;
interface ID3D12RootSignature;
interface ID3D12Heap;
@ -315,6 +342,13 @@ typedef struct D3D12_SUBRESOURCE_RANGE_UINT64
D3D12_RANGE_UINT64 Range;
} D3D12_SUBRESOURCE_RANGE_UINT64;
typedef struct D3D12_SUBRESOURCE_INFO
{
UINT64 Offset;
UINT RowPitch;
UINT DepthPitch;
} D3D12_SUBRESOURCE_INFO;
typedef struct D3D12_RESOURCE_ALLOCATION_INFO
{
UINT64 SizeInBytes;
@ -419,6 +453,38 @@ typedef struct D3D12_FEATURE_DATA_D3D12_OPTIONS6
BOOL BackgroundProcessingSupported;
} D3D12_FEATURE_DATA_D3D12_OPTIONS6;
typedef struct D3D12_FEATURE_DATA_D3D12_OPTIONS7
{
D3D12_MESH_SHADER_TIER MeshShaderTier;
D3D12_SAMPLER_FEEDBACK_TIER SamplerFeedbackTier;
} D3D12_FEATURE_DATA_D3D12_OPTIONS7;
typedef struct D3D12_FEATURE_DATA_D3D12_OPTIONS8
{
BOOL UnalignedBlockTexturesSupported;
} D3D12_FEATURE_DATA_D3D12_OPTIONS8;
typedef struct D3D12_FEATURE_DATA_D3D12_OPTIONS9
{
BOOL MeshShaderPipelineStatsSupported;
BOOL MeshShaderSupportsFullRangeRenderTargetArrayIndex;
BOOL AtomicInt64OnTypedResourceSupported;
BOOL AtomicInt64OnGroupSharedSupported;
BOOL DerivativesInMeshAndAmplificationShadersSupported;
D3D12_WAVE_MMA_TIER WaveMMATier;
} D3D12_FEATURE_DATA_D3D12_OPTIONS9;
typedef struct D3D12_FEATURE_DATA_D3D12_OPTIONS10
{
BOOL VariableRateShadingSumCombinerSupported;
BOOL MeshShaderPerPrimitiveShadingRateSupported;
} D3D12_FEATURE_DATA_D3D12_OPTIONS10;
typedef struct D3D12_FEATURE_DATA_D3D12_OPTIONS11
{
BOOL AtomicInt64OnDescriptorHeapResourceSupported;
} D3D12_FEATURE_DATA_D3D12_OPTIONS11;
typedef struct D3D12_FEATURE_DATA_FORMAT_SUPPORT
{
DXGI_FORMAT Format;
@ -431,6 +497,7 @@ typedef enum D3D12_MULTISAMPLE_QUALITY_LEVEL_FLAGS
D3D12_MULTISAMPLE_QUALITY_LEVELS_FLAG_NONE = 0x00000000,
D3D12_MULTISAMPLE_QUALITY_LEVELS_FLAG_TILED_RESOURCE = 0x00000001,
} D3D12_MULTISAMPLE_QUALITY_LEVEL_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_MULTISAMPLE_QUALITY_LEVEL_FLAGS);")
typedef struct D3D12_FEATURE_DATA_MULTISAMPLE_QUALITY_LEVELS
{
@ -508,11 +575,17 @@ typedef enum D3D12_HEAP_FLAGS
D3D12_HEAP_FLAG_SHARED_CROSS_ADAPTER = 0x20,
D3D12_HEAP_FLAG_DENY_RT_DS_TEXTURES = 0x40,
D3D12_HEAP_FLAG_DENY_NON_RT_DS_TEXTURES = 0x80,
D3D12_HEAP_FLAG_HARDWARE_PROTECTED = 0x100,
D3D12_HEAP_FLAG_ALLOW_WRITE_WATCH = 0x200,
D3D12_HEAP_FLAG_ALLOW_SHADER_ATOMICS = 0x400,
D3D12_HEAP_FLAG_CREATE_NOT_RESIDENT = 0x800,
D3D12_HEAP_FLAG_CREATE_NOT_ZEROED = 0x1000,
D3D12_HEAP_FLAG_ALLOW_ALL_BUFFERS_AND_TEXTURES = 0x00,
D3D12_HEAP_FLAG_ALLOW_ONLY_BUFFERS = 0xc0,
D3D12_HEAP_FLAG_ALLOW_ONLY_NON_RT_DS_TEXTURES = 0x44,
D3D12_HEAP_FLAG_ALLOW_ONLY_RT_DS_TEXTURES = 0x84,
} D3D12_HEAP_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_HEAP_FLAGS);")
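Note on the `DEFINE_ENUM_FLAG_OPERATORS` lines added throughout this diff: in C++, OR-ing two enumerators yields an `int`, which does not convert back to the enum type implicitly. The sketch below shows the idea with a reduced, hypothetical macro (`SKETCH_ENUM_FLAG_OPERATORS`, covering only `|` and `&`) and a stand-in enum that reuses four values from `D3D12_HEAP_FLAGS`. It is an illustration, not the Windows SDK definition.

```cpp
#include <cassert>

// Stand-in enum reusing values shown for D3D12_HEAP_FLAGS in the diff.
enum HeapFlags
{
    HEAP_FLAG_NONE = 0x0,
    HEAP_FLAG_DENY_RT_DS_TEXTURES = 0x40,
    HEAP_FLAG_DENY_NON_RT_DS_TEXTURES = 0x80,
    HEAP_FLAG_ALLOW_ONLY_BUFFERS = 0xc0,
};

// Hypothetical reduced macro: defines only | and & for the given enum type.
#define SKETCH_ENUM_FLAG_OPERATORS(T) \
    inline T operator|(T a, T b) \
    { return static_cast<T>(static_cast<int>(a) | static_cast<int>(b)); } \
    inline T operator&(T a, T b) \
    { return static_cast<T>(static_cast<int>(a) & static_cast<int>(b)); }

SKETCH_ENUM_FLAG_OPERATORS(HeapFlags)

inline HeapFlags buffers_only()
{
    // Without the operators above, this expression would be an int and the
    // return statement would need an explicit cast in C++.
    return HEAP_FLAG_DENY_RT_DS_TEXTURES | HEAP_FLAG_DENY_NON_RT_DS_TEXTURES;
}
```

The two deny bits combine to `0xc0`, which matches the `ALLOW_ONLY_BUFFERS` value listed in the enum.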
typedef struct D3D12_HEAP_DESC
{
@ -628,6 +701,7 @@ typedef enum D3D12_RESOURCE_BARRIER_FLAGS
D3D12_RESOURCE_BARRIER_FLAG_BEGIN_ONLY = 0x1,
D3D12_RESOURCE_BARRIER_FLAG_END_ONLY = 0x2,
} D3D12_RESOURCE_BARRIER_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_RESOURCE_BARRIER_FLAGS);")
typedef struct D3D12_RESOURCE_TRANSITION_BARRIER
{
@ -637,7 +711,7 @@ typedef struct D3D12_RESOURCE_TRANSITION_BARRIER
D3D12_RESOURCE_STATES StateAfter;
} D3D12_RESOURCE_TRANSITION_BARRIER;
typedef struct D3D12_RESOURCE_ALIASING_BARRIER_ALIASING
typedef struct D3D12_RESOURCE_ALIASING_BARRIER
{
ID3D12Resource *pResourceBefore;
ID3D12Resource *pResourceAfter;
@ -704,12 +778,36 @@ typedef struct D3D12_RESOURCE_DESC
D3D12_RESOURCE_FLAGS Flags;
} D3D12_RESOURCE_DESC;
typedef struct D3D12_MIP_REGION
{
UINT Width;
UINT Height;
UINT Depth;
} D3D12_MIP_REGION;
typedef struct D3D12_RESOURCE_DESC1
{
D3D12_RESOURCE_DIMENSION Dimension;
UINT64 Alignment;
UINT64 Width;
UINT Height;
UINT16 DepthOrArraySize;
UINT16 MipLevels;
DXGI_FORMAT Format;
DXGI_SAMPLE_DESC SampleDesc;
D3D12_TEXTURE_LAYOUT Layout;
D3D12_RESOURCE_FLAGS Flags;
D3D12_MIP_REGION SamplerFeedbackMipRegion;
} D3D12_RESOURCE_DESC1;
typedef enum D3D12_RESOLVE_MODE
{
D3D12_RESOLVE_MODE_DECOMPRESS = 0,
D3D12_RESOLVE_MODE_MIN = 1,
D3D12_RESOLVE_MODE_MAX = 2,
D3D12_RESOLVE_MODE_AVERAGE = 3,
D3D12_RESOLVE_MODE_ENCODE_SAMPLER_FEEDBACK = 4,
D3D12_RESOLVE_MODE_DECODE_SAMPLER_FEEDBACK = 5,
} D3D12_RESOLVE_MODE;
typedef struct D3D12_SAMPLE_POSITION
@ -723,6 +821,7 @@ typedef enum D3D12_VIEW_INSTANCING_FLAGS
D3D12_VIEW_INSTANCING_FLAG_NONE = 0,
D3D12_VIEW_INSTANCING_FLAG_ENABLE_VIEW_INSTANCE_MASKING = 0x1,
} D3D12_VIEW_INSTANCING_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_VIEW_INSTANCING_FLAGS);")
typedef struct D3D12_VIEW_INSTANCE_LOCATION
{
@ -795,6 +894,7 @@ typedef enum D3D12_DESCRIPTOR_RANGE_FLAGS
D3D12_DESCRIPTOR_RANGE_FLAG_DATA_STATIC = 0x8,
D3D12_DESCRIPTOR_RANGE_FLAG_DESCRIPTORS_STATIC_KEEPING_BUFFER_BOUNDS_CHECKS = 0x10000,
} D3D12_DESCRIPTOR_RANGE_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_DESCRIPTOR_RANGE_FLAGS);")
typedef struct D3D12_DESCRIPTOR_RANGE1
{
@ -838,6 +938,7 @@ typedef enum D3D12_ROOT_DESCRIPTOR_FLAGS
D3D12_ROOT_DESCRIPTOR_FLAG_DATA_STATIC_WHILE_SET_AT_EXECUTE = 0x4,
D3D12_ROOT_DESCRIPTOR_FLAG_DATA_STATIC = 0x8,
} D3D12_ROOT_DESCRIPTOR_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_ROOT_DESCRIPTOR_FLAGS);")
typedef struct D3D12_ROOT_DESCRIPTOR1
{
@ -863,6 +964,8 @@ typedef enum D3D12_SHADER_VISIBILITY
D3D12_SHADER_VISIBILITY_DOMAIN = 3,
D3D12_SHADER_VISIBILITY_GEOMETRY = 4,
D3D12_SHADER_VISIBILITY_PIXEL = 5,
D3D12_SHADER_VISIBILITY_AMPLIFICATION = 6,
D3D12_SHADER_VISIBILITY_MESH = 7,
} D3D12_SHADER_VISIBILITY;
typedef struct D3D12_ROOT_PARAMETER
@ -1031,7 +1134,12 @@ typedef enum D3D12_ROOT_SIGNATURE_FLAGS
D3D12_ROOT_SIGNATURE_FLAG_DENY_PIXEL_SHADER_ROOT_ACCESS = 0x20,
D3D12_ROOT_SIGNATURE_FLAG_ALLOW_STREAM_OUTPUT = 0x40,
D3D12_ROOT_SIGNATURE_FLAG_LOCAL_ROOT_SIGNATURE = 0x80,
D3D12_ROOT_SIGNATURE_FLAG_DENY_AMPLIFICATION_SHADER_ROOT_ACCESS = 0x100,
D3D12_ROOT_SIGNATURE_FLAG_DENY_MESH_SHADER_ROOT_ACCESS = 0x200,
D3D12_ROOT_SIGNATURE_FLAG_CBV_SRV_UAV_HEAP_DIRECTLY_INDEXED = 0x400,
D3D12_ROOT_SIGNATURE_FLAG_SAMPLER_HEAP_DIRECTLY_INDEXED = 0x800,
} D3D12_ROOT_SIGNATURE_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_ROOT_SIGNATURE_FLAGS);")
typedef struct D3D12_ROOT_SIGNATURE_DESC
{
@ -1082,6 +1190,7 @@ typedef enum D3D12_DESCRIPTOR_HEAP_FLAGS
D3D12_DESCRIPTOR_HEAP_FLAG_NONE = 0x0,
D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE = 0x1,
} D3D12_DESCRIPTOR_HEAP_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_DESCRIPTOR_HEAP_FLAGS);")
typedef struct D3D12_DESCRIPTOR_HEAP_DESC
{
@ -1120,6 +1229,7 @@ typedef enum D3D12_BUFFER_SRV_FLAGS
D3D12_BUFFER_SRV_FLAG_NONE = 0x0,
D3D12_BUFFER_SRV_FLAG_RAW = 0x1,
} D3D12_BUFFER_SRV_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_BUFFER_SRV_FLAGS);")
typedef enum D3D12_SHADER_COMPONENT_MAPPING
{
@ -1261,6 +1371,7 @@ typedef enum D3D12_BUFFER_UAV_FLAGS
D3D12_BUFFER_UAV_FLAG_NONE = 0x0,
D3D12_BUFFER_UAV_FLAG_RAW = 0x1,
} D3D12_BUFFER_UAV_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_BUFFER_UAV_FLAGS);")
typedef struct D3D12_BUFFER_UAV
{
@ -1731,6 +1842,7 @@ typedef enum D3D12_PIPELINE_STATE_FLAGS
D3D12_PIPELINE_STATE_FLAG_NONE = 0x0,
D3D12_PIPELINE_STATE_FLAG_DEBUG = 0x1,
} D3D12_PIPELINE_STATE_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_PIPELINE_STATE_FLAGS);")
typedef struct D3D12_GRAPHICS_PIPELINE_STATE_DESC
{
@ -1779,7 +1891,8 @@ typedef enum D3D12_COMMAND_LIST_TYPE
D3D12_COMMAND_LIST_TYPE_COMPUTE = 2,
D3D12_COMMAND_LIST_TYPE_COPY = 3,
D3D12_COMMAND_LIST_TYPE_VIDEO_DECODE = 4,
D3D12_COMMAND_LIST_TYPE_VIDEO_PROVESS = 5,
D3D12_COMMAND_LIST_TYPE_VIDEO_PROCESS = 5,
D3D12_COMMAND_LIST_TYPE_VIDEO_ENCODE = 6,
} D3D12_COMMAND_LIST_TYPE;
typedef enum D3D12_COMMAND_QUEUE_PRIORITY
@ -1794,6 +1907,7 @@ typedef enum D3D12_COMMAND_QUEUE_FLAGS
D3D12_COMMAND_QUEUE_FLAG_NONE = 0x0,
D3D12_COMMAND_QUEUE_FLAG_DISABLE_GPU_TIMEOUT = 0x1,
} D3D12_COMMAND_QUEUE_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_COMMAND_QUEUE_FLAGS);")
typedef enum D3D12_SHADER_CACHE_SUPPORT_FLAGS
{
@ -1803,6 +1917,7 @@ typedef enum D3D12_SHADER_CACHE_SUPPORT_FLAGS
D3D12_SHADER_CACHE_SUPPORT_AUTOMATIC_INPROC_CACHE = 0x4,
D3D12_SHADER_CACHE_SUPPORT_AUTOMATIC_DISK_CACHE = 0x8,
} D3D12_SHADER_CACHE_SUPPORT_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_SHADER_CACHE_SUPPORT_FLAGS);")
typedef struct D3D12_COMMAND_QUEUE_DESC
{
@ -1873,6 +1988,8 @@ typedef enum D3D_SHADER_MODEL
D3D_SHADER_MODEL_6_2 = 0x62,
D3D_SHADER_MODEL_6_3 = 0x63,
D3D_SHADER_MODEL_6_4 = 0x64,
D3D_SHADER_MODEL_6_5 = 0x65,
D3D_SHADER_MODEL_6_6 = 0x66,
} D3D_SHADER_MODEL;
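Note on the shader model values: the enumerators shown here follow a pattern where the high nibble holds the major version and the low nibble the minor version (`0x65` for 6.5, `0x66` for 6.6). A small sketch of packing and unpacking under that assumption; the enum and function names are hypothetical stand-ins.

```cpp
#include <cassert>

// Stand-ins for the two enumerators added in the diff.
enum ShaderModel
{
    SHADER_MODEL_6_5 = 0x65,
    SHADER_MODEL_6_6 = 0x66,
};

// Major version lives in the high nibble, minor version in the low nibble.
constexpr unsigned shader_model_major(unsigned sm) { return sm >> 4; }
constexpr unsigned shader_model_minor(unsigned sm) { return sm & 0xf; }

// Inverse operation: pack a major/minor pair into the enum's encoding.
constexpr unsigned make_shader_model(unsigned major, unsigned minor)
{
    return (major << 4) | (minor & 0xf);
}
```

Because the encoding is monotonic in (major, minor), a plain `>=` comparison on the raw values orders shader models as expected.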
typedef struct D3D12_FEATURE_DATA_SHADER_MODEL
@ -1905,6 +2022,13 @@ typedef enum D3D12_FEATURE
D3D12_FEATURE_D3D12_OPTIONS5 = 27,
D3D12_FEATURE_D3D12_OPTIONS6 = 30,
D3D12_FEATURE_QUERY_META_COMMAND = 31,
D3D12_FEATURE_D3D12_OPTIONS7 = 32,
D3D12_FEATURE_PROTECTED_RESOURCE_SESSION_TYPE_COUNT = 33,
D3D12_FEATURE_PROTECTED_RESOURCE_SESSION_TYPES = 34,
D3D12_FEATURE_D3D12_OPTIONS8 = 36,
D3D12_FEATURE_D3D12_OPTIONS9 = 37,
D3D12_FEATURE_D3D12_OPTIONS10 = 39,
D3D12_FEATURE_D3D12_OPTIONS11 = 40,
} D3D12_FEATURE;
typedef struct D3D12_MEMCPY_DEST
@ -2034,6 +2158,15 @@ interface ID3D12Resource1 : ID3D12Resource
{
HRESULT GetProtectedResourceSession(REFIID riid, void **protected_session);
}
[
uuid(be36ec3b-ea85-4aeb-a45a-e9d76404a495),
object,
local,
pointer_default(unique)
]
interface ID3D12Resource2 : ID3D12Resource1 {
D3D12_RESOURCE_DESC1 GetDesc1();
}
[
uuid(7116d91c-e7e4-47ce-b8c6-ec8168f437e5),
@ -2053,6 +2186,7 @@ typedef enum D3D12_TILE_COPY_FLAGS
D3D12_TILE_COPY_FLAG_LINEAR_BUFFER_TO_SWIZZLED_TILED_RESOURCE = 0x2,
D3D12_TILE_COPY_FLAG_SWIZZLED_TILED_RESOURCE_TO_LINEAR_BUFFER = 0x4,
} D3D12_TILE_COPY_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_TILE_COPY_FLAGS);")
typedef struct D3D12_INDEX_BUFFER_VIEW
{
@ -2169,11 +2303,13 @@ typedef enum D3D12_PROTECTED_RESOURCE_SESSION_SUPPORT_FLAGS
D3D12_PROTECTED_RESOURCE_SESSION_SUPPORT_FLAG_NONE = 0,
D3D12_PROTECTED_RESOURCE_SESSION_SUPPORT_FLAG_SUPPORTED = 0x1,
} D3D12_PROTECTED_RESOURCE_SESSION_SUPPORT_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_PROTECTED_RESOURCE_SESSION_SUPPORT_FLAGS);")
typedef enum D3D12_PROTECTED_RESOURCE_SESSION_FLAGS
{
D3D12_PROTECTED_RESOURCE_SESSION_FLAG_NONE = 0,
} D3D12_PROTECTED_RESOURCE_SESSION_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_PROTECTED_RESOURCE_SESSION_FLAGS);")
typedef enum D3D12_PROTECTED_SESSION_STATUS
{
@ -2217,6 +2353,37 @@ interface ID3D12ProtectedResourceSession : ID3D12ProtectedSession
D3D12_PROTECTED_RESOURCE_SESSION_DESC GetDesc();
}
typedef struct D3D12_FEATURE_DATA_PROTECTED_RESOURCE_SESSION_TYPE_COUNT
{
UINT NodeIndex;
UINT Count;
} D3D12_FEATURE_DATA_PROTECTED_RESOURCE_SESSION_TYPE_COUNT;
typedef struct D3D12_FEATURE_DATA_PROTECTED_RESOURCE_SESSION_TYPES
{
UINT NodeIndex;
UINT Count;
GUID *pTypes;
} D3D12_FEATURE_DATA_PROTECTED_RESOURCE_SESSION_TYPES;
typedef struct D3D12_PROTECTED_RESOURCE_SESSION_DESC1
{
UINT NodeMask;
D3D12_PROTECTED_RESOURCE_SESSION_FLAGS Flags;
GUID ProtectionType;
} D3D12_PROTECTED_RESOURCE_SESSION_DESC1;
[
uuid(D6F12DD6-76FB-406E-8961-4296EEFC0409),
object,
local,
pointer_default(unique)
]
interface ID3D12ProtectedResourceSession1 : ID3D12ProtectedResourceSession
{
D3D12_PROTECTED_RESOURCE_SESSION_DESC1 GetDesc1();
}
typedef enum D3D12_PIPELINE_STATE_SUBOBJECT_TYPE
{
D3D12_PIPELINE_STATE_SUBOBJECT_TYPE_ROOT_SIGNATURE = 0,
@ -2242,7 +2409,9 @@ typedef enum D3D12_PIPELINE_STATE_SUBOBJECT_TYPE
D3D12_PIPELINE_STATE_SUBOBJECT_TYPE_FLAGS = 20,
D3D12_PIPELINE_STATE_SUBOBJECT_TYPE_DEPTH_STENCIL1 = 21,
D3D12_PIPELINE_STATE_SUBOBJECT_TYPE_VIEW_INSTANCING = 22,
D3D12_PIPELINE_STATE_SUBOBJECT_TYPE_MAX_VALID = 23,
D3D12_PIPELINE_STATE_SUBOBJECT_TYPE_AS = 24,
D3D12_PIPELINE_STATE_SUBOBJECT_TYPE_MS = 25,
D3D12_PIPELINE_STATE_SUBOBJECT_TYPE_MAX_VALID = 26,
} D3D12_PIPELINE_STATE_SUBOBJECT_TYPE;
typedef struct D3D12_PIPELINE_STATE_STREAM_DESC
@ -2311,6 +2480,7 @@ typedef enum D3D12_META_COMMAND_PARAMETER_FLAGS
D3D12_META_COMMAND_PARAMETER_FLAG_INPUT = 0x1,
D3D12_META_COMMAND_PARAMETER_FLAG_OUTPUT = 0x2,
} D3D12_META_COMMAND_PARAMETER_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_META_COMMAND_PARAMETER_FLAGS);")
typedef enum D3D12_META_COMMAND_PARAMETER_STAGE
{
@ -2349,6 +2519,7 @@ typedef enum D3D12_GRAPHICS_STATES
D3D12_GRAPHICS_STATE_SAMPLE_POSITIONS = 0x8000,
D3D12_GRAPHICS_STATE_VIEW_INSTANCE_MASK = 0x10000,
} D3D12_GRAPHICS_STATES;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_GRAPHICS_STATES);")
typedef struct D3D12_META_COMMAND_DESC
{
@ -2399,6 +2570,8 @@ typedef enum D3D12_STATE_SUBOBJECT_TYPE
D3D12_STATE_SUBOBJECT_TYPE_RAYTRACING_SHADER_CONFIG = 9,
D3D12_STATE_SUBOBJECT_TYPE_RAYTRACING_PIPELINE_CONFIG = 10,
D3D12_STATE_SUBOBJECT_TYPE_HIT_GROUP = 11,
D3D12_STATE_SUBOBJECT_TYPE_RAYTRACING_PIPELINE_CONFIG1 = 12,
D3D12_STATE_SUBOBJECT_TYPE_MAX_VALID = 13,
} D3D12_STATE_SUBOBJECT_TYPE;
typedef struct D3D12_STATE_SUBOBJECT
@ -2412,7 +2585,9 @@ typedef enum D3D12_STATE_OBJECT_FLAGS
D3D12_STATE_OBJECT_FLAG_NONE = 0,
D3D12_STATE_OBJECT_FLAG_ALLOW_LOCAL_DEPENDENCIES_ON_EXTERNAL_DEFINITIONS = 0x1,
D3D12_STATE_OBJECT_FLAG_ALLOW_EXTERNAL_DEPENDENCIES_ON_LOCAL_DEFINITIONS = 0x2,
D3D12_STATE_OBJECT_FLAG_ALLOW_STATE_OBJECT_ADDITIONS = 0x4,
} D3D12_STATE_OBJECT_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_STATE_OBJECT_FLAGS);")
typedef struct D3D12_STATE_OBJECT_CONFIG
{
@ -2438,6 +2613,7 @@ typedef enum D3D12_EXPORT_FLAGS
{
D3D12_EXPORT_FLAG_NONE = 0,
} D3D12_EXPORT_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_EXPORT_FLAGS);")
typedef struct D3D12_EXPORT_DESC
{
@ -2467,6 +2643,13 @@ typedef struct D3D12_SUBOBJECT_TO_EXPORTS_ASSOCIATION
LPCWSTR *pExports;
} D3D12_SUBOBJECT_TO_EXPORTS_ASSOCIATION;
typedef struct D3D12_DXIL_SUBOBJECT_TO_EXPORTS_ASSOCIATION
{
LPCWSTR SubobjectToAssociate;
UINT NumExports;
LPCWSTR *pExports;
} D3D12_DXIL_SUBOBJECT_TO_EXPORTS_ASSOCIATION;
typedef enum D3D12_HIT_GROUP_TYPE
{
D3D12_HIT_GROUP_TYPE_TRIANGLES = 0,
@ -2488,6 +2671,20 @@ typedef struct D3D12_RAYTRACING_SHADER_CONFIG
UINT MaxAttributeSizeInBytes;
} D3D12_RAYTRACING_SHADER_CONFIG;
typedef enum D3D12_RAYTRACING_PIPELINE_FLAGS
{
D3D12_RAYTRACING_PIPELINE_FLAG_NONE = 0x0,
D3D12_RAYTRACING_PIPELINE_FLAG_SKIP_TRIANGLES = 0x100,
D3D12_RAYTRACING_PIPELINE_FLAG_SKIP_PROCEDURAL_PRIMITIVES = 0x200,
} D3D12_RAYTRACING_PIPELINE_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_RAYTRACING_PIPELINE_FLAGS);")
typedef struct D3D12_RAYTRACING_PIPELINE_CONFIG1
{
UINT MaxTraceRecursionDepth;
D3D12_RAYTRACING_PIPELINE_FLAGS Flags;
} D3D12_RAYTRACING_PIPELINE_CONFIG1;
typedef struct D3D12_RAYTRACING_PIPELINE_CONFIG
{
UINT MaxTraceRecursionDepth;
@ -2512,6 +2709,7 @@ typedef enum D3D12_RAYTRACING_GEOMETRY_FLAGS
D3D12_RAYTRACING_GEOMETRY_FLAG_OPAQUE = 0x1,
D3D12_RAYTRACING_GEOMETRY_FLAG_NO_DUPLICATE_ANYHIT_INVOCATION = 0x2,
} D3D12_RAYTRACING_GEOMETRY_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_RAYTRACING_GEOMETRY_FLAGS);")
typedef enum D3D12_RAYTRACING_GEOMETRY_TYPE
{
@ -2527,6 +2725,7 @@ typedef enum D3D12_RAYTRACING_INSTANCE_FLAGS
D3D12_RAYTRACING_INSTANCE_FLAG_FORCE_OPAQUE = 0x4,
D3D12_RAYTRACING_INSTANCE_FLAG_FORCE_NON_OPAQUE = 0x8,
} D3D12_RAYTRACING_INSTANCE_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_RAYTRACING_INSTANCE_FLAGS);")
typedef struct D3D12_GPU_VIRTUAL_ADDRESS_AND_STRIDE
{
@ -2584,6 +2783,7 @@ typedef enum D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAGS
D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_MINIMIZE_MEMORY = 0x10,
D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PERFORM_UPDATE = 0x20,
} D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAGS);")
typedef enum D3D12_RAYTRACING_ACCELERATION_STRUCTURE_COPY_MODE
{
@ -2736,7 +2936,10 @@ typedef enum D3D12_RAY_FLAGS
D3D12_RAY_FLAG_CULL_FRONT_FACING_TRIANGLES = 0x20,
D3D12_RAY_FLAG_CULL_OPAQUE = 0x40,
D3D12_RAY_FLAG_CULL_NON_OPAQUE = 0x80,
D3D12_RAY_FLAG_SKIP_TRIANGLES = 0x100,
D3D12_RAY_FLAG_SKIP_PROCEDURAL_PRIMITIVES = 0x200,
} D3D12_RAY_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_RAY_FLAGS);")
typedef enum D3D12_HIT_KIND
{
@ -2759,16 +2962,19 @@ typedef enum D3D12_COMMAND_LIST_FLAGS
{
D3D12_COMMAND_LIST_FLAG_NONE = 0,
} D3D12_COMMAND_LIST_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_COMMAND_LIST_FLAGS);")
typedef enum D3D12_COMMAND_POOL_FLAGS
{
D3D12_COMMAND_POOL_FLAG_NONE = 0,
} D3D12_COMMAND_POOL_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_COMMAND_POOL_FLAGS);")
typedef enum D3D12_COMMAND_RECORDER_FLAGS
{
D3D12_COMMAND_RECORDER_FLAG_NONE = 0,
} D3D12_COMMAND_RECORDER_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_COMMAND_RECORDER_FLAGS);")
typedef enum D3D12_AUTO_BREADCRUMB_OP
{
@ -2812,14 +3018,17 @@ typedef enum D3D12_AUTO_BREADCRUMB_OP
D3D12_AUTO_BREADCRUMB_OP_ESTIMATEMOTION = 37,
D3D12_AUTO_BREADCRUMB_OP_RESOLVEMOTIONVECTORHEAP = 38,
D3D12_AUTO_BREADCRUMB_OP_SETPIPELINESTATE1 = 39,
D3D12_AUTO_BREADCRUMB_OP_INITIALIZEEXTENSIONCOMMAND = 40,
D3D12_AUTO_BREADCRUMB_OP_EXECUTEEXTENSIONCOMMAND = 41,
D3D12_AUTO_BREADCRUMB_OP_DISPATCHMESH = 42,
} D3D12_AUTO_BREADCRUMB_OP;
typedef struct D3D12_AUTO_BREADCRUMB_NODE
{
const char *pCommandListDebugNameA;
const wchar_t *pCommandListDebugNameW;
const WCHAR *pCommandListDebugNameW;
const char *pCommandQueueDebugNameA;
const wchar_t *pCommandQueueDebugNameW;
const WCHAR *pCommandQueueDebugNameW;
ID3D12GraphicsCommandList *pCommandList;
ID3D12CommandQueue *pCommandQueue;
UINT32 BreadcrumbCount;
@ -2828,9 +3037,33 @@ typedef struct D3D12_AUTO_BREADCRUMB_NODE
struct D3D12_AUTO_BREADCRUMB_NODE *pNext;
} D3D12_AUTO_BREADCRUMB_NODE;
typedef struct D3D12_DRED_BREADCRUMB_CONTEXT
{
UINT BreadcrumbIndex;
const WCHAR *pContextString;
} D3D12_DRED_BREADCRUMB_CONTEXT;
typedef struct D3D12_AUTO_BREADCRUMB_NODE1
{
const char *pCommandListDebugNameA;
const WCHAR *pCommandListDebugNameW;
const char *pCommandQueueDebugNameA;
const WCHAR *pCommandQueueDebugNameW;
ID3D12GraphicsCommandList *pCommandList;
ID3D12CommandQueue *pCommandQueue;
UINT BreadcrumbCount;
const UINT *pLastBreadcrumbValue;
const D3D12_AUTO_BREADCRUMB_OP *pCommandHistory;
const struct D3D12_AUTO_BREADCRUMB_NODE1 *pNext;
UINT BreadcrumbContextsCount;
D3D12_DRED_BREADCRUMB_CONTEXT *pBreadcrumbContexts;
} D3D12_AUTO_BREADCRUMB_NODE1;
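Note on consuming the breadcrumb nodes: the structure is a singly linked list via `pNext`, with `pCommandHistory` holding `BreadcrumbCount` ops. The sketch below walks such a list and picks the first node that was only partially executed. It uses reduced stand-in types (`BreadcrumbNode`, `BreadcrumbOp`) rather than the real headers, and it assumes `*pLastBreadcrumbValue` is the number of ops the GPU completed in that node; treat that reading as an assumption to check against the DRED documentation.

```cpp
#include <cassert>
#include <cstdint>

// Reduced stand-ins; the op values 39 and 42 are taken from the diff.
enum BreadcrumbOp { OP_SETPIPELINESTATE1 = 39, OP_DISPATCHMESH = 42 };

struct BreadcrumbNode
{
    std::uint32_t BreadcrumbCount;
    const std::uint32_t *pLastBreadcrumbValue;
    const BreadcrumbOp *pCommandHistory;
    const BreadcrumbNode *pNext;
};

// Returns the first node whose completed-op count is strictly between 0 and
// BreadcrumbCount, or nullptr if no node is in that state.
inline const BreadcrumbNode *find_partially_executed(const BreadcrumbNode *head)
{
    for (const BreadcrumbNode *node = head; node != nullptr; node = node->pNext)
    {
        const std::uint32_t done =
            node->pLastBreadcrumbValue ? *node->pLastBreadcrumbValue : 0;
        if (done > 0 && done < node->BreadcrumbCount)
            return node;
    }
    return nullptr;
}

// Builds a two-node list and returns the op at the stall point, or -1.
inline int demo_breadcrumb_walk()
{
    static const BreadcrumbOp history[] = { OP_SETPIPELINESTATE1, OP_DISPATCHMESH };
    static const std::uint32_t finished_all = 2, finished_one = 1;
    const BreadcrumbNode second = { 2, &finished_one, history, nullptr };
    const BreadcrumbNode first = { 2, &finished_all, history, &second };
    const BreadcrumbNode *hit = find_partially_executed(&first);
    return hit ? static_cast<int>(hit->pCommandHistory[*hit->pLastBreadcrumbValue]) : -1;
}
```

In the demo, the first node completed both ops and the second completed one of two, so the walk stops at the second node and reports its second op.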
typedef enum D3D12_DRED_VERSION
{
D3D12_DRED_VERSION_1_0 = 1,
D3D12_DRED_VERSION_1_0 = 1,
D3D12_DRED_VERSION_1_1 = 2,
D3D12_DRED_VERSION_1_2 = 3,
} D3D12_DRED_VERSION;
typedef enum D3D12_DRED_FLAGS
@ -2839,6 +3072,14 @@ typedef enum D3D12_DRED_FLAGS
D3D12_DRED_FLAG_FORCE_ENABLE = 0x1,
D3D12_DRED_FLAG_AUTOBREADCRUMBS = 0x2,
} D3D12_DRED_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_DRED_FLAGS);")
typedef enum D3D12_DRED_ENABLEMENT
{
D3D12_DRED_ENABLEMENT_SYSTEM_CONTROLLED = 0,
D3D12_DRED_ENABLEMENT_FORCED_OFF = 1,
D3D12_DRED_ENABLEMENT_FORCED_ON = 2,
} D3D12_DRED_ENABLEMENT;
typedef struct D3D12_DEVICE_REMOVED_EXTENDED_DATA
{
@ -2846,6 +3087,92 @@ typedef struct D3D12_DEVICE_REMOVED_EXTENDED_DATA
D3D12_AUTO_BREADCRUMB_NODE *pHeadAutoBreadcrumbNode;
} D3D12_DEVICE_REMOVED_EXTENDED_DATA;
typedef enum D3D12_DRED_ALLOCATION_TYPE
{
D3D12_DRED_ALLOCATION_TYPE_COMMAND_QUEUE = 19,
D3D12_DRED_ALLOCATION_TYPE_COMMAND_ALLOCATOR = 20,
D3D12_DRED_ALLOCATION_TYPE_PIPELINE_STATE = 21,
D3D12_DRED_ALLOCATION_TYPE_COMMAND_LIST = 22,
D3D12_DRED_ALLOCATION_TYPE_FENCE = 23,
D3D12_DRED_ALLOCATION_TYPE_DESCRIPTOR_HEAP = 24,
D3D12_DRED_ALLOCATION_TYPE_HEAP = 25,
D3D12_DRED_ALLOCATION_TYPE_QUERY_HEAP = 27,
D3D12_DRED_ALLOCATION_TYPE_COMMAND_SIGNATURE = 28,
D3D12_DRED_ALLOCATION_TYPE_PIPELINE_LIBRARY = 29,
D3D12_DRED_ALLOCATION_TYPE_VIDEO_DECODER = 30,
D3D12_DRED_ALLOCATION_TYPE_VIDEO_PROCESSOR = 32,
D3D12_DRED_ALLOCATION_TYPE_RESOURCE = 34,
D3D12_DRED_ALLOCATION_TYPE_PASS = 35,
D3D12_DRED_ALLOCATION_TYPE_CRYPTOSESSION = 36,
D3D12_DRED_ALLOCATION_TYPE_CRYPTOSESSIONPOLICY = 37,
D3D12_DRED_ALLOCATION_TYPE_PROTECTEDRESOURCESESSION = 38,
D3D12_DRED_ALLOCATION_TYPE_VIDEO_DECODER_HEAP = 39,
D3D12_DRED_ALLOCATION_TYPE_COMMAND_POOL = 40,
D3D12_DRED_ALLOCATION_TYPE_COMMAND_RECORDER = 41,
D3D12_DRED_ALLOCATION_TYPE_STATE_OBJECT = 42,
D3D12_DRED_ALLOCATION_TYPE_METACOMMAND = 43,
D3D12_DRED_ALLOCATION_TYPE_SCHEDULINGGROUP = 44,
D3D12_DRED_ALLOCATION_TYPE_VIDEO_MOTION_ESTIMATOR = 45,
D3D12_DRED_ALLOCATION_TYPE_VIDEO_MOTION_VECTOR_HEAP = 46,
D3D12_DRED_ALLOCATION_TYPE_VIDEO_EXTENSION_COMMAND = 47,
D3D12_DRED_ALLOCATION_TYPE_INVALID = 0xffffffff,
} D3D12_DRED_ALLOCATION_TYPE;
typedef struct D3D12_DRED_ALLOCATION_NODE
{
const char *ObjectNameA;
const WCHAR *ObjectNameW;
D3D12_DRED_ALLOCATION_TYPE AllocationType;
const struct D3D12_DRED_ALLOCATION_NODE *pNext;
} D3D12_DRED_ALLOCATION_NODE;
typedef struct D3D12_DRED_ALLOCATION_NODE1
{
const char *ObjectNameA;
const WCHAR *ObjectNameW;
D3D12_DRED_ALLOCATION_TYPE AllocationType;
const struct D3D12_DRED_ALLOCATION_NODE1 *pNext;
const IUnknown *pObject;
} D3D12_DRED_ALLOCATION_NODE1;
typedef struct D3D12_DRED_AUTO_BREADCRUMBS_OUTPUT
{
const D3D12_AUTO_BREADCRUMB_NODE *pHeadAutoBreadcrumbNode;
} D3D12_DRED_AUTO_BREADCRUMBS_OUTPUT;
typedef struct D3D12_DRED_AUTO_BREADCRUMBS_OUTPUT1
{
const D3D12_AUTO_BREADCRUMB_NODE1 *pHeadAutoBreadcrumbNode;
} D3D12_DRED_AUTO_BREADCRUMBS_OUTPUT1;
typedef struct D3D12_DRED_PAGE_FAULT_OUTPUT
{
D3D12_GPU_VIRTUAL_ADDRESS PageFaultVA;
const D3D12_DRED_ALLOCATION_NODE *pHeadExistingAllocationNode;
const D3D12_DRED_ALLOCATION_NODE *pHeadRecentFreedAllocationNode;
} D3D12_DRED_PAGE_FAULT_OUTPUT;
typedef struct D3D12_DRED_PAGE_FAULT_OUTPUT1
{
D3D12_GPU_VIRTUAL_ADDRESS PageFaultVA;
const D3D12_DRED_ALLOCATION_NODE1 *pHeadExistingAllocationNode;
const D3D12_DRED_ALLOCATION_NODE1 *pHeadRecentFreedAllocationNode;
} D3D12_DRED_PAGE_FAULT_OUTPUT1;
typedef struct D3D12_DEVICE_REMOVED_EXTENDED_DATA1
{
HRESULT DeviceRemovedReason;
D3D12_DRED_AUTO_BREADCRUMBS_OUTPUT AutoBreadcrumbsOutput;
D3D12_DRED_PAGE_FAULT_OUTPUT PageFaultOutput;
} D3D12_DEVICE_REMOVED_EXTENDED_DATA1;
typedef struct D3D12_DEVICE_REMOVED_EXTENDED_DATA2
{
HRESULT DeviceRemovedReason;
D3D12_DRED_AUTO_BREADCRUMBS_OUTPUT1 AutoBreadcrumbsOutput;
D3D12_DRED_PAGE_FAULT_OUTPUT1 PageFaultOutput;
} D3D12_DEVICE_REMOVED_EXTENDED_DATA2;
typedef struct D3D12_VERSIONED_DEVICE_REMOVED_EXTENDED_DATA
{
D3D12_DRED_VERSION Version;
@ -2878,6 +3205,7 @@ typedef enum D3D12_RENDER_PASS_FLAGS
D3D12_RENDER_PASS_FLAG_SUSPENDING_PASS = 0x2,
D3D12_RENDER_PASS_FLAG_RESUMING_PASS = 0x4,
} D3D12_RENDER_PASS_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_RENDER_PASS_FLAGS);")
typedef enum D3D12_AXIS_SHADING_RATE
{
@ -2987,6 +3315,52 @@ typedef struct D3D12_RENDER_PASS_DEPTH_STENCIL_DESC
D3D12_RENDER_PASS_ENDING_ACCESS StencilEndingAccess;
} D3D12_RENDER_PASS_DEPTH_STENCIL_DESC;
typedef struct D3D12_DISPATCH_MESH_ARGUMENTS
{
UINT ThreadGroupCountX;
UINT ThreadGroupCountY;
UINT ThreadGroupCountZ;
} D3D12_DISPATCH_MESH_ARGUMENTS;
typedef enum D3D12_SHADER_CACHE_MODE
{
D3D12_SHADER_CACHE_MODE_MEMORY = 0,
D3D12_SHADER_CACHE_MODE_DISK = 1,
} D3D12_SHADER_CACHE_MODE;
typedef enum D3D12_SHADER_CACHE_FLAGS
{
D3D12_SHADER_CACHE_FLAG_NONE = 0,
D3D12_SHADER_CACHE_FLAG_DRIVER_VERSIONED = 0x1,
D3D12_SHADER_CACHE_FLAG_USE_WORKING_DIR = 0x2,
} D3D12_SHADER_CACHE_FLAGS;
typedef struct D3D12_SHADER_CACHE_SESSION_DESC
{
GUID Identifier;
D3D12_SHADER_CACHE_MODE Mode;
D3D12_SHADER_CACHE_FLAGS Flags;
UINT MaximumInMemoryCacheSizeBytes;
UINT MaximumInMemoryCacheEntries;
UINT MaximumValueFileSizeBytes;
UINT64 Version;
} D3D12_SHADER_CACHE_SESSION_DESC;
typedef enum D3D12_SHADER_CACHE_KIND_FLAGS
{
D3D12_SHADER_CACHE_KIND_FLAG_IMPLICIT_D3D_CACHE_FOR_DRIVER = 0x1,
D3D12_SHADER_CACHE_KIND_FLAG_IMPLICIT_D3D_CONVERSIONS = 0x2,
D3D12_SHADER_CACHE_KIND_FLAG_IMPLICIT_DRIVER_MANAGED = 0x4,
D3D12_SHADER_CACHE_KIND_FLAG_APPLICATION_MANAGED = 0x8,
} D3D12_SHADER_CACHE_KIND_FLAGS;
typedef enum D3D12_SHADER_CACHE_CONTROL_FLAGS
{
D3D12_SHADER_CACHE_CONTROL_FLAG_DISABLE = 0x1,
D3D12_SHADER_CACHE_CONTROL_FLAG_ENABLE = 0x2,
D3D12_SHADER_CACHE_CONTROL_FLAG_CLEAR = 0x4,
} D3D12_SHADER_CACHE_CONTROL_FLAGS;
[
uuid(dbb84c27-36ce-4fc9-b801-f048c46ac570),
object,
@ -3024,7 +3398,7 @@ interface ID3D12GraphicsCommandList : ID3D12CommandList
HRESULT Reset(ID3D12CommandAllocator *allocator, ID3D12PipelineState *initial_state);
HRESULT ClearState(ID3D12PipelineState *pipeline_state);
void ClearState(ID3D12PipelineState *pipeline_state);
void DrawInstanced(UINT vertex_count_per_instance, UINT instance_count,
UINT start_vertex_location, UINT start_instance_location);
@ -3230,6 +3604,17 @@ interface ID3D12GraphicsCommandList5 : ID3D12GraphicsCommandList4
void RSSetShadingRateImage(ID3D12Resource *image);
}
[
uuid(c3827890-e548-4cfa-96cf-5689a9370f80),
object,
local,
pointer_default(unique)
]
interface ID3D12GraphicsCommandList6 : ID3D12GraphicsCommandList5
{
void DispatchMesh(UINT x, UINT y, UINT z);
}
typedef enum D3D12_TILE_RANGE_FLAGS
{
D3D12_TILE_RANGE_FLAG_NONE = 0x0,
@ -3243,6 +3628,7 @@ typedef enum D3D12_TILE_MAPPING_FLAGS
D3D12_TILE_MAPPING_FLAG_NONE = 0x0,
D3D12_TILE_MAPPING_FLAG_NO_HAZARD = 0x1,
} D3D12_TILE_MAPPING_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_TILE_MAPPING_FLAGS);")
[
uuid(0ec870a6-5d7e-4c22-8cfc-5baae07616ed),
@ -3258,8 +3644,8 @@ interface ID3D12CommandQueue : ID3D12Pageable
ID3D12Heap *heap,
UINT range_count,
const D3D12_TILE_RANGE_FLAGS *range_flags,
UINT *heap_range_offsets,
UINT *range_tile_counts,
const UINT *heap_range_offsets,
const UINT *range_tile_counts,
D3D12_TILE_MAPPING_FLAGS flags);
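Note on the `const UINT *` change above: with non-const pointer parameters, a C++ caller holding a read-only table cannot pass it without a `const_cast`. A minimal sketch of the difference, using a hypothetical helper (`total_tiles`) that takes its array the same way the updated `range_tile_counts` parameter does; `UINT` is a stand-in typedef.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint32_t UINT;

// Hypothetical helper: reads the array only, so the pointer is const.
inline UINT total_tiles(UINT range_count, const UINT *range_tile_counts)
{
    UINT total = 0;
    for (UINT i = 0; i < range_count; ++i)
        total += range_tile_counts[i];
    return total;
}

inline UINT demo_total_tiles()
{
    // A const table can be passed directly. If the parameter were declared
    // as a plain UINT*, this call would not compile without a const_cast.
    static const UINT counts[] = { 4, 8, 16 };
    return total_tiles(3, counts);
}
```

The const qualifier also documents that the callee does not write through the pointer.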
void CopyTileMappings(ID3D12Resource *dst_resource,
@ -3290,7 +3676,9 @@ typedef enum D3D12_FENCE_FLAGS
D3D12_FENCE_FLAG_NONE = 0x0,
D3D12_FENCE_FLAG_SHARED = 0x1,
D3D12_FENCE_FLAG_SHARED_CROSS_ADAPTER = 0x2,
D3D12_FENCE_FLAG_NON_MONITORED = 0x4,
} D3D12_FENCE_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_FENCE_FLAGS);")
typedef enum D3D12_QUERY_HEAP_TYPE
{
@ -3307,6 +3695,7 @@ typedef enum D3D12_RESIDENCY_FLAGS
D3D12_RESIDENCY_FLAG_NONE = 0,
D3D12_RESIDENCY_FLAG_DENY_OVERBUDGET = 0x1,
} D3D12_RESIDENCY_FLAGS;
cpp_quote("DEFINE_ENUM_FLAG_OPERATORS(D3D12_RESIDENCY_FLAGS);")
typedef struct D3D12_QUERY_HEAP_DESC
{
@ -3326,6 +3715,8 @@ typedef enum D3D12_INDIRECT_ARGUMENT_TYPE
D3D12_INDIRECT_ARGUMENT_TYPE_CONSTANT_BUFFER_VIEW,
D3D12_INDIRECT_ARGUMENT_TYPE_SHADER_RESOURCE_VIEW,
D3D12_INDIRECT_ARGUMENT_TYPE_UNORDERED_ACCESS_VIEW,
D3D12_INDIRECT_ARGUMENT_TYPE_DISPATCH_RAYS,
D3D12_INDIRECT_ARGUMENT_TYPE_DISPATCH_MESH,
} D3D12_INDIRECT_ARGUMENT_TYPE;
typedef struct D3D12_INDIRECT_ARGUMENT_DESC
@ -3711,6 +4102,67 @@ interface ID3D12Device6 : ID3D12Device5
D3D12_MEASUREMENTS_ACTION action, HANDLE event, BOOL further_measurements);
}
[
uuid(5c014b53-68a1-4b9b-8bd1-dd6046b9358b),
object,
local,
pointer_default(unique)
]
interface ID3D12Device7 : ID3D12Device6
{
HRESULT AddToStateObject(const D3D12_STATE_OBJECT_DESC *addition,
ID3D12StateObject *state_object, REFIID riid, void **new_state_object);
HRESULT CreateProtectedResourceSession1(
const D3D12_PROTECTED_RESOURCE_SESSION_DESC1 *desc,
REFIID riid, void **session);
}
[
uuid(9218e6bb-f944-4f7e-a75c-b1b2c7b701f3),
object,
local,
pointer_default(unique)
]
interface ID3D12Device8 : ID3D12Device7
{
D3D12_RESOURCE_ALLOCATION_INFO GetResourceAllocationInfo2(UINT visible_mask,
UINT resource_desc_count, const D3D12_RESOURCE_DESC1 *resource_descs,
D3D12_RESOURCE_ALLOCATION_INFO1 *resource_allocation_infos);
HRESULT CreateCommittedResource2(const D3D12_HEAP_PROPERTIES *heap_properties,
D3D12_HEAP_FLAGS heap_flags, const D3D12_RESOURCE_DESC1 *resource_desc,
D3D12_RESOURCE_STATES initial_state, const D3D12_CLEAR_VALUE *optimized_clear_value,
ID3D12ProtectedResourceSession *protected_session, REFIID riid, void **resource);
HRESULT CreatePlacedResource1(ID3D12Heap *heap, UINT64 heap_offset,
const D3D12_RESOURCE_DESC1 *resource_desc, D3D12_RESOURCE_STATES initial_state,
const D3D12_CLEAR_VALUE *optimized_clear_value, REFIID riid, void **resource);
void CreateSamplerFeedbackUnorderedAccessView(ID3D12Resource *target_resource,
ID3D12Resource *feedback_resource, D3D12_CPU_DESCRIPTOR_HANDLE descriptor);
void GetCopyableFootprints1(const D3D12_RESOURCE_DESC1 *resource_desc,
UINT first_sub_resource, UINT sub_resource_count, UINT64 base_offset,
D3D12_PLACED_SUBRESOURCE_FOOTPRINT *layouts, UINT *row_count,
UINT64 *row_size, UINT64 *total_bytes);
}
[
uuid(4c80e962-f032-4f60-bc9e-ebc2cfa1d83c),
object,
local,
pointer_default(unique)
]
interface ID3D12Device9 : ID3D12Device8
{
HRESULT CreateShaderCacheSession(const D3D12_SHADER_CACHE_SESSION_DESC *desc,
REFIID riid, void **session);
HRESULT ShaderCacheControl(D3D12_SHADER_CACHE_KIND_FLAGS kinds,
D3D12_SHADER_CACHE_CONTROL_FLAGS control);
HRESULT CreateCommandQueue1(const D3D12_COMMAND_QUEUE_DESC *desc,
REFIID creator_id, REFIID riid, void **command_queue);
}
[
uuid(34ab647b-3cc8-46ac-841b-c0965645c046),
object,
@ -3765,3 +4217,13 @@ typedef HRESULT (__stdcall *PFN_D3D12_CREATE_DEVICE)(IUnknown *adapter,
typedef HRESULT (__stdcall *PFN_D3D12_GET_DEBUG_INTERFACE)(REFIID iid, void **debug);
[local] HRESULT __stdcall D3D12GetDebugInterface(REFIID iid, void **debug);
cpp_quote("DEFINE_GUID(D3D12ExperimentalShaderModels, 0x76f5573e, 0xf13a, 0x40f5, 0xb2, 0x97, 0x81, 0xce, 0x9e, 0x18, 0x93, 0x3f );")
cpp_quote("DEFINE_GUID(D3D12TiledResourceTier4, 0xc9c4725f, 0xa81a, 0x4f56, 0x8c, 0x5b, 0xc5, 0x10, 0x39, 0xd6, 0x94, 0xfb );")
cpp_quote("DEFINE_GUID(D3D12MetaCommand, 0xc734c97e, 0x8077, 0x48c8, 0x9f, 0xdc, 0xd9, 0xd1, 0xdd, 0x31, 0xdd, 0x77 );")
typedef HRESULT (__stdcall *PFN_D3D12_ENABLE_EXPERIMENTAL_FEATURES)(UINT num_features,
const IID *iids, void *config_structs, UINT *config_struct_sizes);
[local] HRESULT __stdcall D3D12EnableExperimentalFeatures(UINT num_features,
const IID *iids, void *config_structs, UINT *config_struct_sizes);


@ -77,6 +77,7 @@ typedef enum D3D_FEATURE_LEVEL
D3D_FEATURE_LEVEL_11_1 = 0xb100,
D3D_FEATURE_LEVEL_12_0 = 0xc000,
D3D_FEATURE_LEVEL_12_1 = 0xc100,
D3D_FEATURE_LEVEL_12_2 = 0xc200,
} D3D_FEATURE_LEVEL;
[
@ -93,3 +94,7 @@ interface ID3D10Blob : IUnknown
typedef ID3D10Blob ID3DBlob;
cpp_quote("#define IID_ID3DBlob IID_ID3D10Blob")
cpp_quote("DEFINE_GUID(WKPDID_D3DDebugObjectName,0x429b8c22,0x9188,0x4b0c,0x87,0x42,0xac,0xb0,0xbf,0x85,0xc2,0x00);")
cpp_quote("DEFINE_GUID(WKPDID_D3DDebugObjectNameW,0x4cca5fd8,0x921f,0x42c8,0x85,0x66,0x70,0xca,0xf2,0xa9,0xb7,0x41);")
cpp_quote("DEFINE_GUID(WKPDID_CommentStringW,0xd0149dc0,0x90e8,0x4ec8,0x81,0x44,0xe9,0x00,0xad,0x26,0x6b,0xb2);")

View File

@ -0,0 +1,37 @@
/*
* * Copyright 2021 NVIDIA Corporation
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
import "vkd3d_d3d12.idl";
import "vkd3d_vk_includes.h";
[
uuid(11ea7a1a-0f6a-49bf-b612-3e30f8e201dd),
object,
local,
pointer_default(unique)
]
interface ID3D12DeviceExt : IUnknown
{
HRESULT GetVulkanHandles(VkInstance *vk_instance, VkPhysicalDevice *vk_physical_device, VkDevice *vk_device);
BOOL GetExtensionSupport(D3D12_VK_EXTENSION extension);
HRESULT CreateCubinComputeShaderWithName(const void *cubin_data, UINT32 cubin_size, UINT32 block_x, UINT32 block_y, UINT32 block_z, const char *shader_name, D3D12_CUBIN_DATA_HANDLE **handle);
HRESULT DestroyCubinComputeShader(D3D12_CUBIN_DATA_HANDLE *handle);
HRESULT GetCudaTextureObject(D3D12_CPU_DESCRIPTOR_HANDLE srv_handle, D3D12_CPU_DESCRIPTOR_HANDLE sampler_handle, UINT32 *cuda_texture_handle);
HRESULT GetCudaSurfaceObject(D3D12_CPU_DESCRIPTOR_HANDLE uav_handle, UINT32 *cuda_surface_handle);
HRESULT CaptureUAVInfo(D3D12_UAV_INFO *uav_info);
}

View File

@ -135,5 +135,12 @@ typedef enum DXGI_FORMAT
DXGI_FORMAT_A8P8 = 0x72,
DXGI_FORMAT_B4G4R4A4_UNORM = 0x73,
DXGI_FORMAT_P208 = 0x82,
DXGI_FORMAT_V208 = 0x83,
DXGI_FORMAT_V408 = 0x84,
DXGI_FORMAT_SAMPLER_FEEDBACK_MIN_MIP_OPAQUE = 0xbd,
DXGI_FORMAT_SAMPLER_FEEDBACK_MIP_REGION_USED_OPAQUE = 0xbe,
DXGI_FORMAT_FORCE_UINT = 0xffffffff,
} DXGI_FORMAT;

View File

@ -24,23 +24,13 @@
#include <stddef.h>
#include <hashmap.h>
#include <vkd3d_types.h>
#include <vkd3d_d3d12.h>
#include <vkd3d.h>
#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */
enum vkd3d_shader_structure_type
{
/* 1.2 */
VKD3D_SHADER_STRUCTURE_TYPE_SHADER_INTERFACE_INFO,
VKD3D_SHADER_STRUCTURE_TYPE_COMPILE_ARGUMENTS,
VKD3D_SHADER_STRUCTURE_TYPE_SCAN_INFO,
VKD3D_SHADER_STRUCTURE_TYPE_TRANSFORM_FEEDBACK_INFO,
VKD3D_SHADER_STRUCTURE_TYPE_DOMAIN_SHADER_COMPILE_ARGUMENTS,
VKD3D_FORCE_32_BIT_ENUM(VKD3D_SHADER_STRUCTURE_TYPE),
};
enum vkd3d_shader_compiler_option
{
VKD3D_SHADER_STRIP_DEBUG = 0x00000001,
@ -64,11 +54,22 @@ enum vkd3d_shader_visibility
typedef uint64_t vkd3d_shader_hash_t;
enum vkd3d_shader_meta_flags
{
VKD3D_SHADER_META_FLAG_REPLACED = 1 << 0,
VKD3D_SHADER_META_FLAG_USES_SUBGROUP_SIZE = 1 << 1,
VKD3D_SHADER_META_FLAG_USES_NATIVE_16BIT_OPERATIONS = 1 << 2,
};
struct vkd3d_shader_meta
{
vkd3d_shader_hash_t hash;
bool replaced;
unsigned int cs_workgroup_size[3]; /* Only contains valid data if VKD3D_SHADER_META_FLAG_USES_SUBGROUP_SIZE is set in flags. */
unsigned int patch_vertex_count; /* Relevant for HS. May be 0, in which case the patch vertex count is not known. */
unsigned int cs_required_wave_size; /* If non-zero, force a specific CS subgroup size. */
uint32_t flags; /* vkd3d_shader_meta_flags */
};
STATIC_ASSERT(sizeof(struct vkd3d_shader_meta) == 32);
struct vkd3d_shader_code
{
@ -77,6 +78,8 @@ struct vkd3d_shader_code
struct vkd3d_shader_meta meta;
};
vkd3d_shader_hash_t vkd3d_shader_hash(const struct vkd3d_shader_code *shader);
enum vkd3d_shader_descriptor_type
{
VKD3D_SHADER_DESCRIPTOR_TYPE_UNKNOWN,
@ -96,12 +99,12 @@ struct vkd3d_shader_descriptor_binding
enum vkd3d_shader_binding_flag
{
VKD3D_SHADER_BINDING_FLAG_BUFFER = 0x00000001,
VKD3D_SHADER_BINDING_FLAG_IMAGE = 0x00000002,
VKD3D_SHADER_BINDING_FLAG_COUNTER = 0x00000004,
VKD3D_SHADER_BINDING_FLAG_BINDLESS = 0x00000008,
VKD3D_SHADER_BINDING_FLAG_RAW_VA = 0x00000010,
VKD3D_SHADER_BINDING_FLAG_RAW_SSBO = 0x00000020,
VKD3D_SHADER_BINDING_FLAG_BUFFER = 0x00000001,
VKD3D_SHADER_BINDING_FLAG_IMAGE = 0x00000002,
VKD3D_SHADER_BINDING_FLAG_AUX_BUFFER = 0x00000004,
VKD3D_SHADER_BINDING_FLAG_BINDLESS = 0x00000008,
VKD3D_SHADER_BINDING_FLAG_RAW_VA = 0x00000010,
VKD3D_SHADER_BINDING_FLAG_RAW_SSBO = 0x00000020,
VKD3D_FORCE_32_BIT_ENUM(VKD3D_SHADER_BINDING_FLAG),
};
@ -190,12 +193,11 @@ enum vkd3d_shader_interface_flag
VKD3D_SHADER_INTERFACE_BINDLESS_CBV_AS_STORAGE_BUFFER = 0x00000002u,
VKD3D_SHADER_INTERFACE_SSBO_OFFSET_BUFFER = 0x00000004u,
VKD3D_SHADER_INTERFACE_TYPED_OFFSET_BUFFER = 0x00000008u,
VKD3D_SHADER_INTERFACE_DESCRIPTOR_QA_BUFFER = 0x00000010u
};
struct vkd3d_shader_interface_info
{
enum vkd3d_shader_structure_type type;
const void *next;
unsigned int flags; /* enum vkd3d_shader_interface_flag */
unsigned int min_ssbo_alignment;
@ -210,6 +212,58 @@ struct vkd3d_shader_interface_info
const struct vkd3d_shader_descriptor_binding *push_constant_ubo_binding;
/* Ignored unless VKD3D_SHADER_INTERFACE_SSBO_OFFSET_BUFFER or TYPED_OFFSET_BUFFER is set */
const struct vkd3d_shader_descriptor_binding *offset_buffer_binding;
#ifdef VKD3D_ENABLE_DESCRIPTOR_QA
/* Ignored unless VKD3D_SHADER_INTERFACE_DESCRIPTOR_QA_BUFFER is set. */
const struct vkd3d_shader_descriptor_binding *descriptor_qa_global_binding;
/* Ignored unless VKD3D_SHADER_INTERFACE_DESCRIPTOR_QA_BUFFER is set. */
const struct vkd3d_shader_descriptor_binding *descriptor_qa_heap_binding;
#endif
VkShaderStageFlagBits stage;
const struct vkd3d_shader_transform_feedback_info *xfb_info;
};
struct vkd3d_shader_descriptor_table
{
uint32_t table_index;
uint32_t binding_count;
struct vkd3d_shader_resource_binding *first_binding;
};
struct vkd3d_shader_root_constant
{
uint32_t constant_index;
uint32_t constant_count;
};
struct vkd3d_shader_root_descriptor
{
struct vkd3d_shader_resource_binding *binding;
uint32_t raw_va_root_descriptor_index;
};
struct vkd3d_shader_root_parameter
{
D3D12_ROOT_PARAMETER_TYPE parameter_type;
union
{
struct vkd3d_shader_root_constant constant;
struct vkd3d_shader_root_descriptor descriptor;
struct vkd3d_shader_descriptor_table descriptor_table;
};
};
struct vkd3d_shader_interface_local_info
{
const struct vkd3d_shader_root_parameter *local_root_parameters;
unsigned int local_root_parameter_count;
const struct vkd3d_shader_push_constant_buffer *shader_record_constant_buffers;
unsigned int shader_record_buffer_count;
const struct vkd3d_shader_resource_binding *bindings;
unsigned int binding_count;
uint32_t descriptor_size;
};
struct vkd3d_shader_transform_feedback_element
@ -222,12 +276,8 @@ struct vkd3d_shader_transform_feedback_element
uint8_t output_slot;
};
/* Extends vkd3d_shader_interface_info. */
struct vkd3d_shader_transform_feedback_info
{
enum vkd3d_shader_structure_type type;
const void *next;
const struct vkd3d_shader_transform_feedback_element *elements;
unsigned int element_count;
const unsigned int *buffer_strides;
@ -247,14 +297,63 @@ enum vkd3d_shader_target_extension
VKD3D_SHADER_TARGET_EXTENSION_NONE,
VKD3D_SHADER_TARGET_EXTENSION_SPV_EXT_DEMOTE_TO_HELPER_INVOCATION,
VKD3D_SHADER_TARGET_EXTENSION_READ_STORAGE_IMAGE_WITHOUT_FORMAT
VKD3D_SHADER_TARGET_EXTENSION_READ_STORAGE_IMAGE_WITHOUT_FORMAT,
VKD3D_SHADER_TARGET_EXTENSION_SPV_KHR_INTEGER_DOT_PRODUCT,
VKD3D_SHADER_TARGET_EXTENSION_RAY_TRACING_PRIMITIVE_CULLING,
VKD3D_SHADER_TARGET_EXTENSION_SCALAR_BLOCK_LAYOUT,
/* When using scalar block layout with a vec3 array on a byte address buffer,
* there is diverging behavior across hardware.
* On AMD, robustness is checked per component, which means we can implement ByteAddressBuffer
* without further hackery. On NVIDIA, robustness does not seem to work this way, so it's either
* all in range, or all out of range. We can implement structured buffer vectorization of vec3,
* but not byte address buffer. */
VKD3D_SHADER_TARGET_EXTENSION_ASSUME_PER_COMPONENT_SSBO_ROBUSTNESS,
VKD3D_SHADER_TARGET_EXTENSION_BARYCENTRIC_KHR,
VKD3D_SHADER_TARGET_EXTENSION_MIN_PRECISION_IS_NATIVE_16BIT,
VKD3D_SHADER_TARGET_EXTENSION_COUNT,
};
enum vkd3d_shader_quirk
{
/* If sample or sample_b is used in control flow, force LOD 0.0 (which the game should expect anyway).
* Works around specific, questionable shaders which rely on this to give sensible results,
* since LOD can become garbage on certain implementations, and even on native drivers
* the result is implementation defined.
* Outside of making this edge case well-defined in Vulkan or hacking driver compilers,
* this is the pragmatic solution.
* Hoisting gradients is not possible in all cases,
* and would not be worth it until it's a widespread problem. */
VKD3D_SHADER_QUIRK_FORCE_EXPLICIT_LOD_IN_CONTROL_FLOW = (1 << 0),
/* After every write to group shared memory, force a memory barrier.
* This works around buggy games which forget to use barrier(). */
VKD3D_SHADER_QUIRK_FORCE_TGSM_BARRIERS = (1 << 1),
/* For Position builtins in Output storage class, emit Invariant decoration.
* Normally, games have to emit Precise math for position, but if they forget ... */
VKD3D_SHADER_QUIRK_INVARIANT_POSITION = (1 << 2),
};
struct vkd3d_shader_quirk_hash
{
vkd3d_shader_hash_t shader_hash;
uint32_t quirks;
};
struct vkd3d_shader_quirk_info
{
const struct vkd3d_shader_quirk_hash *hashes;
unsigned int num_hashes;
uint32_t default_quirks;
/* Quirks which are ORed in with the other masks (including default_quirks).
* Used mostly for additional overrides from VKD3D_CONFIG. */
uint32_t global_quirks;
};
struct vkd3d_shader_compile_arguments
{
enum vkd3d_shader_structure_type type;
const void *next;
enum vkd3d_shader_target target;
unsigned int target_extension_count;
@ -266,6 +365,8 @@ struct vkd3d_shader_compile_arguments
bool dual_source_blending;
const unsigned int *output_swizzles;
unsigned int output_swizzle_count;
const struct vkd3d_shader_quirk_info *quirks;
};
enum vkd3d_tessellator_output_primitive
@ -284,16 +385,6 @@ enum vkd3d_tessellator_partitioning
VKD3D_TESSELLATOR_PARTITIONING_FRACTIONAL_EVEN = 4,
};
/* Extends vkd3d_shader_compile_arguments. */
struct vkd3d_shader_domain_shader_compile_arguments
{
enum vkd3d_shader_structure_type type;
const void *next;
enum vkd3d_tessellator_output_primitive output_primitive;
enum vkd3d_tessellator_partitioning partitioning;
};
/* root signature 1.0 */
enum vkd3d_filter
{
@ -488,6 +579,7 @@ enum vkd3d_descriptor_range_flags
VKD3D_DESCRIPTOR_RANGE_FLAG_DATA_VOLATILE = 0x2,
VKD3D_DESCRIPTOR_RANGE_FLAG_DATA_STATIC_WHILE_SET_AT_EXECUTE = 0x4,
VKD3D_DESCRIPTOR_RANGE_FLAG_DATA_STATIC = 0x8,
VKD3D_DESCRIPTOR_RANGE_FLAG_DESCRIPTORS_STATIC_KEEPING_BUFFER_BOUNDS_CHECKS = 0x10000
};
struct vkd3d_descriptor_range1
@ -556,12 +648,20 @@ enum vkd3d_shader_uav_flag
{
VKD3D_SHADER_UAV_FLAG_READ_ACCESS = 0x00000001,
VKD3D_SHADER_UAV_FLAG_ATOMIC_COUNTER = 0x00000002,
VKD3D_SHADER_UAV_FLAG_ATOMIC_ACCESS = 0x00000004,
};
struct vkd3d_shader_scan_info
{
struct hash_map register_map;
bool use_vocp;
bool early_fragment_tests;
bool has_side_effects;
bool needs_late_zs;
bool discards;
bool has_uav_counter;
unsigned int patch_vertex_count;
};
enum vkd3d_component_type
@ -654,7 +754,11 @@ int vkd3d_shader_compile_dxbc(const struct vkd3d_shader_code *dxbc,
void vkd3d_shader_free_shader_code(struct vkd3d_shader_code *code);
int vkd3d_shader_parse_root_signature(const struct vkd3d_shader_code *dxbc,
struct vkd3d_versioned_root_signature_desc *root_signature);
struct vkd3d_versioned_root_signature_desc *root_signature,
vkd3d_shader_hash_t *compatibility_hash);
int vkd3d_shader_parse_root_signature_raw(const char *data, unsigned int data_size,
struct vkd3d_versioned_root_signature_desc *desc,
vkd3d_shader_hash_t *compatibility_hash);
void vkd3d_shader_free_root_signature(struct vkd3d_versioned_root_signature_desc *root_signature);
/* FIXME: Add support for returning error messages (ID3DBlob). */
@ -667,18 +771,90 @@ int vkd3d_shader_convert_root_signature(struct vkd3d_versioned_root_signature_de
int vkd3d_shader_scan_dxbc(const struct vkd3d_shader_code *dxbc,
struct vkd3d_shader_scan_info *scan_info);
/* If value cannot be determined, *patch_vertex_count returns 0. */
int vkd3d_shader_scan_patch_vertex_count(const struct vkd3d_shader_code *dxbc,
unsigned int *patch_vertex_count);
int vkd3d_shader_parse_input_signature(const struct vkd3d_shader_code *dxbc,
struct vkd3d_shader_signature *signature);
int vkd3d_shader_parse_output_signature(const struct vkd3d_shader_code *dxbc,
struct vkd3d_shader_signature *signature);
struct vkd3d_shader_signature_element *vkd3d_shader_find_signature_element(
const struct vkd3d_shader_signature *signature, const char *semantic_name,
unsigned int semantic_index, unsigned int stream_index);
void vkd3d_shader_free_shader_signature(struct vkd3d_shader_signature *signature);
int vkd3d_shader_supports_dxil(void);
/* For DXR, use special purpose entry points since there's a lot of special purpose reflection required. */
struct vkd3d_shader_library_entry_point
{
unsigned int identifier;
VkShaderStageFlagBits stage;
WCHAR *mangled_entry_point;
WCHAR *plain_entry_point;
char *real_entry_point;
};
enum vkd3d_shader_subobject_kind
{
/* Matches DXIL for simplicity. */
VKD3D_SHADER_SUBOBJECT_KIND_STATE_OBJECT_CONFIG = 0,
VKD3D_SHADER_SUBOBJECT_KIND_GLOBAL_ROOT_SIGNATURE = 1,
VKD3D_SHADER_SUBOBJECT_KIND_LOCAL_ROOT_SIGNATURE = 2,
VKD3D_SHADER_SUBOBJECT_KIND_SUBOBJECT_TO_EXPORTS_ASSOCIATION = 8,
VKD3D_SHADER_SUBOBJECT_KIND_RAYTRACING_SHADER_CONFIG = 9,
VKD3D_SHADER_SUBOBJECT_KIND_RAYTRACING_PIPELINE_CONFIG = 10,
VKD3D_SHADER_SUBOBJECT_KIND_HIT_GROUP = 11,
VKD3D_SHADER_SUBOBJECT_KIND_RAYTRACING_PIPELINE_CONFIG1 = 12,
};
struct vkd3d_shader_library_subobject
{
enum vkd3d_shader_subobject_kind kind;
unsigned int dxil_identifier;
/* All const pointers here point directly to the DXBC blob,
* so they do not need to be freed.
* Fortunately for us, the C strings are zero-terminated in the blob itself. */
/* In the blob, ASCII is used for identifiers, whereas the API uses wide strings, sigh ... */
const char *name;
union
{
D3D12_RAYTRACING_PIPELINE_CONFIG1 pipeline_config;
D3D12_RAYTRACING_SHADER_CONFIG shader_config;
D3D12_STATE_OBJECT_CONFIG object_config;
/* Duped strings because API wants wide strings for no good reason. */
D3D12_HIT_GROUP_DESC hit_group;
D3D12_DXIL_SUBOBJECT_TO_EXPORTS_ASSOCIATION association;
struct
{
const void *data;
size_t size;
} payload;
} data;
};
int vkd3d_shader_dxil_append_library_entry_points_and_subobjects(
const D3D12_DXIL_LIBRARY_DESC *library_desc,
unsigned int identifier,
struct vkd3d_shader_library_entry_point **entry_points,
size_t *entry_point_size, size_t *entry_point_count,
struct vkd3d_shader_library_subobject **subobjects,
size_t *subobjects_size, size_t *subobjects_count);
void vkd3d_shader_dxil_free_library_entry_points(struct vkd3d_shader_library_entry_point *entry_points, size_t count);
void vkd3d_shader_dxil_free_library_subobjects(struct vkd3d_shader_library_subobject *subobjects, size_t count);
int vkd3d_shader_compile_dxil_export(const struct vkd3d_shader_code *dxil,
const char *export,
struct vkd3d_shader_code *spirv,
const struct vkd3d_shader_interface_info *shader_interface_info,
const struct vkd3d_shader_interface_local_info *shader_interface_local_info,
const struct vkd3d_shader_compile_arguments *compiler_args);
uint32_t vkd3d_shader_compile_arguments_select_quirks(
const struct vkd3d_shader_compile_arguments *args, vkd3d_shader_hash_t hash);
uint64_t vkd3d_shader_get_revision(void);
#endif /* VKD3D_SHADER_NO_PROTOTYPES */
@ -692,7 +868,8 @@ typedef int (*PFN_vkd3d_shader_compile_dxbc)(const struct vkd3d_shader_code *dxb
typedef void (*PFN_vkd3d_shader_free_shader_code)(struct vkd3d_shader_code *code);
typedef int (*PFN_vkd3d_shader_parse_root_signature)(const struct vkd3d_shader_code *dxbc,
struct vkd3d_versioned_root_signature_desc *root_signature);
struct vkd3d_versioned_root_signature_desc *root_signature,
vkd3d_shader_hash_t *compatibility_hash);
typedef void (*PFN_vkd3d_shader_free_root_signature)(struct vkd3d_versioned_root_signature_desc *root_signature);
typedef int (*PFN_vkd3d_shader_serialize_root_signature)(
@ -703,8 +880,6 @@ typedef int (*PFN_vkd3d_shader_convert_root_signature)(struct vkd3d_versioned_ro
typedef int (*PFN_vkd3d_shader_scan_dxbc)(const struct vkd3d_shader_code *dxbc,
struct vkd3d_shader_scan_info *scan_info);
typedef int (*PFN_vkd3d_shader_scan_patch_vertex_count)(const struct vkd3d_shader_code *dxbc,
unsigned int *patch_vertex_count);
typedef int (*PFN_vkd3d_shader_parse_input_signature)(const struct vkd3d_shader_code *dxbc,
struct vkd3d_shader_signature *signature);
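
The comment on `global_quirks` in this header says it is ORed in with the other masks, including `default_quirks`. The sketch below shows one way a per-shader lookup could combine the three masks. It is only an illustration: the `quirk_hash`, `quirk_info` and `select_quirks` names are hypothetical stand-ins for the structures in the diff, and the policy that a per-hash entry replaces `default_quirks` is an assumption, not something this diff confirms about `vkd3d_shader_compile_arguments_select_quirks`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins that mirror the field names used in the header. */
typedef uint64_t shader_hash_t;

struct quirk_hash
{
    shader_hash_t shader_hash;
    uint32_t quirks;
};

struct quirk_info
{
    const struct quirk_hash *hashes;
    unsigned int num_hashes;
    uint32_t default_quirks;
    uint32_t global_quirks;
};

/* Assumed policy: a matching per-hash entry replaces default_quirks,
 * and global_quirks is ORed in on both paths. */
static uint32_t select_quirks(const struct quirk_info *info, shader_hash_t hash)
{
    unsigned int i;

    if (!info)
        return 0;

    for (i = 0; i < info->num_hashes; i++)
        if (info->hashes[i].shader_hash == hash)
            return info->hashes[i].quirks | info->global_quirks;

    return info->default_quirks | info->global_quirks;
}
```

Keeping `global_quirks` as a separate mask means an override coming from configuration applies to every shader without rewriting the per-hash table.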

View File

@ -0,0 +1,58 @@
/*
* * Copyright 2021 NVIDIA Corporation
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
#ifndef __VKD3D_VK_INCLUDES_H
#define __VKD3D_VK_INCLUDES_H
#if defined(__LP64__) || defined(_WIN64) || (defined(__x86_64__) && !defined(__ILP32__) ) || defined(_M_X64) || defined(__ia64) || defined (_M_IA64) || defined(__aarch64__) || defined(__powerpc64__)
typedef struct VkCuFunctionNVX_T *VkCuFunctionNVX;
typedef struct VkCuModuleNVX_T *VkCuModuleNVX;
#else
typedef UINT64 VkCuFunctionNVX;
typedef UINT64 VkCuModuleNVX;
#endif
typedef struct VkPhysicalDevice_T *VkPhysicalDevice;
typedef struct VkCommandBuffer_T *VkCommandBuffer;
typedef struct VkInstance_T *VkInstance;
typedef struct VkDevice_T *VkDevice;
typedef enum D3D12_VK_EXTENSION
{
D3D12_VK_NVX_BINARY_IMPORT = 0x1,
D3D12_VK_NVX_IMAGE_VIEW_HANDLE = 0x2
} D3D12_VK_EXTENSION;
typedef struct D3D12_CUBIN_DATA_HANDLE
{
VkCuFunctionNVX vkCuFunction;
VkCuModuleNVX vkCuModule;
UINT32 blockX;
UINT32 blockY;
UINT32 blockZ;
} D3D12_CUBIN_DATA_HANDLE;
typedef struct D3D12_UAV_INFO
{
UINT32 version;
UINT32 surfaceHandle;
UINT64 gpuVAStart;
UINT64 gpuVASize;
} D3D12_UAV_INFO;
#endif // __VKD3D_VK_INCLUDES_H

View File

@ -42,8 +42,20 @@
#define WIDL_C_INLINE_WRAPPERS
#include <vkd3d_windows.h>
/* Vulkan headers include static const declarations. Enable static keyword for
* them.
*/
#ifdef __MINGW32__
# undef static
#endif
#define VK_USE_PLATFORM_WIN32_KHR
#include <vulkan/vulkan.h>
#include "private/vulkan_private_extensions.h"
#ifdef __MINGW32__
# define static
#endif
#include <dxgi1_6.h>
@ -57,6 +69,8 @@
#define __vkd3d_dxgi1_4_h__
#include <vkd3d_swapchain_factory.h>
#include <vkd3d_command_list_vkd3d_ext.h>
#include <vkd3d_device_vkd3d_ext.h>
#include <vkd3d_d3d12.h>
#include <vkd3d_d3d12sdklayers.h>

View File

@ -88,6 +88,9 @@ typedef void *HANDLE;
typedef const WCHAR* LPCWSTR;
#define _fseeki64(a, b, c) fseeko64(a, b, c)
#define _ftelli64(a) ftello64(a)
/* GUID */
# ifdef __WIDL__
typedef struct

View File

@ -3,9 +3,9 @@ LIBRARY d3d12.dll
EXPORTS
D3D12CreateDevice @101
D3D12GetDebugInterface @102
D3D12CreateRootSignatureDeserializer @107
D3D12CreateVersionedRootSignatureDeserializer @108
D3D12CreateRootSignatureDeserializer
D3D12CreateVersionedRootSignatureDeserializer
D3D12EnableExperimentalFeatures @110
D3D12SerializeRootSignature @115
D3D12SerializeVersionedRootSignature @116
D3D12EnableExperimentalFeatures
D3D12SerializeRootSignature
D3D12SerializeVersionedRootSignature

View File

@ -159,44 +159,10 @@ done:
return hr;
}
static BOOL check_vk_instance_extension(VkInstance vk_instance,
PFN_vkGetInstanceProcAddr pfn_vkGetInstanceProcAddr, const char *name)
{
PFN_vkEnumerateInstanceExtensionProperties pfn_vkEnumerateInstanceExtensionProperties;
VkExtensionProperties *properties;
BOOL ret = FALSE;
unsigned int i;
uint32_t count;
pfn_vkEnumerateInstanceExtensionProperties
= (void *)pfn_vkGetInstanceProcAddr(vk_instance, "vkEnumerateInstanceExtensionProperties");
if (pfn_vkEnumerateInstanceExtensionProperties(NULL, &count, NULL) < 0)
return FALSE;
if (!(properties = calloc(count, sizeof(*properties))))
return FALSE;
if (pfn_vkEnumerateInstanceExtensionProperties(NULL, &count, properties) >= 0)
{
for (i = 0; i < count; ++i)
{
if (!strcmp(properties[i].extensionName, name))
{
ret = TRUE;
break;
}
}
}
free(properties);
return ret;
}
static VkPhysicalDevice d3d12_get_vk_physical_device(struct vkd3d_instance *instance,
static VkPhysicalDevice d3d12_find_physical_device(struct vkd3d_instance *instance,
PFN_vkGetInstanceProcAddr pfn_vkGetInstanceProcAddr, struct DXGI_ADAPTER_DESC *adapter_desc)
{
PFN_vkGetPhysicalDeviceProperties2 pfn_vkGetPhysicalDeviceProperties2 = NULL;
PFN_vkGetPhysicalDeviceProperties2 pfn_vkGetPhysicalDeviceProperties2;
PFN_vkGetPhysicalDeviceProperties pfn_vkGetPhysicalDeviceProperties;
PFN_vkEnumeratePhysicalDevices pfn_vkEnumeratePhysicalDevices;
VkPhysicalDevice vk_physical_device = VK_NULL_HANDLE;
@ -211,10 +177,8 @@ static VkPhysicalDevice d3d12_get_vk_physical_device(struct vkd3d_instance *inst
vk_instance = vkd3d_instance_get_vk_instance(instance);
pfn_vkEnumeratePhysicalDevices = (void *)pfn_vkGetInstanceProcAddr(vk_instance, "vkEnumeratePhysicalDevices");
pfn_vkGetPhysicalDeviceProperties = (void *)pfn_vkGetInstanceProcAddr(vk_instance, "vkGetPhysicalDeviceProperties");
if (check_vk_instance_extension(vk_instance, pfn_vkGetInstanceProcAddr, VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2_EXTENSION_NAME))
pfn_vkGetPhysicalDeviceProperties2 = (void *)pfn_vkGetInstanceProcAddr(vk_instance, "vkGetPhysicalDeviceProperties2KHR");
pfn_vkGetPhysicalDeviceProperties2 = (void *)pfn_vkGetInstanceProcAddr(vk_instance, "vkGetPhysicalDeviceProperties2");
if ((vr = pfn_vkEnumeratePhysicalDevices(vk_instance, &count, NULL)) < 0)
{
@ -233,42 +197,51 @@ static VkPhysicalDevice d3d12_get_vk_physical_device(struct vkd3d_instance *inst
if ((vr = pfn_vkEnumeratePhysicalDevices(vk_instance, &count, vk_physical_devices)) < 0)
goto done;
if (pfn_vkGetPhysicalDeviceProperties2)
{
TRACE("Matching adapters by LUIDs.\n");
for (i = 0; i < count; ++i)
{
memset(&id_properties, 0, sizeof(id_properties));
id_properties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ID_PROPERTIES;
properties2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2;
properties2.pNext = &id_properties;
pfn_vkGetPhysicalDeviceProperties2(vk_physical_devices[i], &properties2);
if (!memcmp(id_properties.deviceLUID, &adapter_desc->AdapterLuid, VK_LUID_SIZE))
{
vk_physical_device = vk_physical_devices[i];
break;
}
}
}
TRACE("Matching adapters by PCI IDs.\n");
TRACE("Matching adapters by LUIDs.\n");
for (i = 0; i < count; ++i)
{
pfn_vkGetPhysicalDeviceProperties(vk_physical_devices[i], &properties2.properties);
if (properties2.properties.deviceID == adapter_desc->DeviceId &&
properties2.properties.vendorID == adapter_desc->VendorId)
/* Skip over physical devices below our minimum API version */
if (properties2.properties.apiVersion < VKD3D_MIN_API_VERSION)
{
WARN("Skipped adapter %s as it is below our minimum API version.\n", properties2.properties.deviceName);
continue;
}
id_properties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ID_PROPERTIES;
id_properties.pNext = NULL;
properties2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2;
properties2.pNext = &id_properties;
pfn_vkGetPhysicalDeviceProperties2(vk_physical_devices[i], &properties2);
if (id_properties.deviceLUIDValid && !memcmp(id_properties.deviceLUID, &adapter_desc->AdapterLuid, VK_LUID_SIZE))
{
vk_physical_device = vk_physical_devices[i];
break;
}
}
if (!vk_physical_device)
{
TRACE("Matching adapters by PCI IDs.\n");
for (i = 0; i < count; ++i)
{
pfn_vkGetPhysicalDeviceProperties(vk_physical_devices[i], &properties2.properties);
if (properties2.properties.deviceID == adapter_desc->DeviceId &&
properties2.properties.vendorID == adapter_desc->VendorId)
{
vk_physical_device = vk_physical_devices[i];
break;
}
}
}
if (!vk_physical_device)
{
FIXME("Could not find Vulkan physical device for DXGI adapter.\n");
@ -284,7 +257,6 @@ done:
HRESULT WINAPI DLLEXPORT D3D12CreateDevice(IUnknown *adapter, D3D_FEATURE_LEVEL minimum_feature_level,
REFIID iid, void **device)
{
struct vkd3d_optional_instance_extensions_info optional_extensions_info;
struct vkd3d_instance_create_info instance_create_info;
PFN_vkGetInstanceProcAddr pfn_vkGetInstanceProcAddr;
struct vkd3d_device_create_info device_create_info;
@ -298,10 +270,6 @@ HRESULT WINAPI DLLEXPORT D3D12CreateDevice(IUnknown *adapter, D3D_FEATURE_LEVEL
VK_KHR_SURFACE_EXTENSION_NAME,
VK_KHR_WIN32_SURFACE_EXTENSION_NAME,
};
static const char * const optional_instance_extensions[] =
{
VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2_EXTENSION_NAME,
};
static const char * const device_extensions[] =
{
VK_KHR_SWAPCHAIN_EXTENSION_NAME,
@ -325,20 +293,14 @@ HRESULT WINAPI DLLEXPORT D3D12CreateDevice(IUnknown *adapter, D3D_FEATURE_LEVEL
goto done;
}
optional_extensions_info.type = VKD3D_STRUCTURE_TYPE_OPTIONAL_INSTANCE_EXTENSIONS_INFO;
optional_extensions_info.next = NULL;
optional_extensions_info.extensions = optional_instance_extensions;
optional_extensions_info.extension_count = ARRAYSIZE(optional_instance_extensions);
instance_create_info.type = VKD3D_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
instance_create_info.next = &optional_extensions_info;
instance_create_info.pfn_signal_event = d3d12_signal_event;
instance_create_info.pfn_create_thread = d3d12_create_thread;
instance_create_info.pfn_join_thread = d3d12_join_thread;
instance_create_info.wchar_size = sizeof(WCHAR);
instance_create_info.pfn_vkGetInstanceProcAddr = pfn_vkGetInstanceProcAddr;
instance_create_info.instance_extensions = instance_extensions;
instance_create_info.instance_extension_count = ARRAYSIZE(instance_extensions);
instance_create_info.optional_instance_extensions = NULL;
instance_create_info.optional_instance_extension_count = 0;
if (FAILED(hr = vkd3d_create_instance(&instance_create_info, &instance)))
{
@ -346,14 +308,14 @@ HRESULT WINAPI DLLEXPORT D3D12CreateDevice(IUnknown *adapter, D3D_FEATURE_LEVEL
goto done;
}
device_create_info.type = VKD3D_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
device_create_info.next = NULL;
device_create_info.minimum_feature_level = minimum_feature_level;
device_create_info.instance = instance;
device_create_info.instance_create_info = NULL;
device_create_info.vk_physical_device = d3d12_get_vk_physical_device(instance, pfn_vkGetInstanceProcAddr, &adapter_desc);
device_create_info.vk_physical_device = d3d12_find_physical_device(instance, pfn_vkGetInstanceProcAddr, &adapter_desc);
device_create_info.device_extensions = device_extensions;
device_create_info.device_extension_count = ARRAYSIZE(device_extensions);
device_create_info.optional_device_extensions = NULL;
device_create_info.optional_device_extension_count = 0;
device_create_info.parent = (IUnknown *)dxgi_adapter;
memcpy(&device_create_info.adapter_luid, &adapter_desc.AdapterLuid, VK_LUID_SIZE);
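
The reworked `d3d12_find_physical_device` first tries to match the DXGI adapter by LUID, and only falls back to PCI vendor/device IDs if no LUID matched. The two-pass idea can be sketched without any Vulkan dependency as below. The `candidate`, `adapter_key` and `find_adapter` names are hypothetical, and the sketch leaves out the minimum-API-version filter that the real loop applies.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LUID_SIZE 8 /* same value as VK_LUID_SIZE */

/* Hypothetical, simplified view of what each physical device reports. */
struct candidate
{
    bool luid_valid;
    uint8_t luid[LUID_SIZE];
    uint32_t vendor_id;
    uint32_t device_id;
};

/* Hypothetical, simplified view of the DXGI adapter description. */
struct adapter_key
{
    uint8_t luid[LUID_SIZE];
    uint32_t vendor_id;
    uint32_t device_id;
};

/* Returns the index of the matching device, or -1 if none matches. */
static int find_adapter(const struct candidate *devices, unsigned int count,
        const struct adapter_key *key)
{
    unsigned int i;

    /* Pass 1: LUIDs, but only where the device reports a valid one. */
    for (i = 0; i < count; i++)
        if (devices[i].luid_valid && !memcmp(devices[i].luid, key->luid, LUID_SIZE))
            return (int)i;

    /* Pass 2: fall back to PCI vendor/device IDs. */
    for (i = 0; i < count; i++)
        if (devices[i].vendor_id == key->vendor_id && devices[i].device_id == key->device_id)
            return (int)i;

    return -1;
}
```

Checking the validity flag before comparing matters because a device that does not report a LUID may leave those bytes in an unspecified state.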

View File

@ -7,7 +7,7 @@ d3d12_lib = shared_library('d3d12', d3d12_src,
dependencies : [ vkd3d_dep, lib_dxgi ],
include_directories : vkd3d_private_includes,
install : true,
objects : not vkd3d_msvc ? 'd3d12.def' : [],
objects : not vkd3d_is_msvc ? 'd3d12.def' : [],
vs_module_defs : 'd3d12.def',
override_options : [ 'c_std='+vkd3d_c_std ])

View File

@ -20,6 +20,8 @@
#include "vkd3d_debug.h"
#include "vkd3d_threads.h"
#include "vkd3d_platform.h"
#include <assert.h>
#include <ctype.h>
#include <errno.h>
@ -58,13 +60,13 @@ static FILE *vkd3d_log_file;
static void vkd3d_dbg_init_once(void)
{
const char *vkd3d_debug;
char vkd3d_debug[VKD3D_PATH_MAX];
unsigned int channel, i;
for (channel = 0; channel < VKD3D_DBG_CHANNEL_COUNT; channel++)
{
if (!(vkd3d_debug = getenv(env_for_channel[channel])))
vkd3d_debug = "";
if (!vkd3d_get_env_var(env_for_channel[channel], vkd3d_debug, sizeof(vkd3d_debug)))
strncpy(vkd3d_debug, "", VKD3D_PATH_MAX);
for (i = 1; i < ARRAY_SIZE(debug_level_names); ++i)
if (!strcmp(debug_level_names[i], vkd3d_debug))
@ -75,7 +77,7 @@ static void vkd3d_dbg_init_once(void)
vkd3d_dbg_level[channel] = VKD3D_DBG_LEVEL_FIXME;
}
if ((vkd3d_debug = getenv("VKD3D_LOG_FILE")))
if (vkd3d_get_env_var("VKD3D_LOG_FILE", vkd3d_debug, sizeof(vkd3d_debug)))
{
vkd3d_log_file = fopen(vkd3d_debug, "w");
if (!vkd3d_log_file)
@ -121,7 +123,7 @@ void vkd3d_dbg_printf(enum vkd3d_dbg_channel channel, enum vkd3d_dbg_level level
va_start(args, fmt);
spinlock_acquire(&spin);
fprintf(log_file, "%u:%s:%s: ", tid, debug_level_names[level], function);
fprintf(log_file, "%04x:%s:%s: ", tid, debug_level_names[level], function);
vfprintf(log_file, fmt, args);
spinlock_release(&spin);
va_end(args);
@ -219,10 +221,10 @@ const char *debugstr_a(const char *str)
return buffer;
}
static const char *debugstr_w16(const uint16_t *wstr)
const char *debugstr_w(const WCHAR *wstr)
{
char *buffer, *ptr;
uint16_t c;
WCHAR c;
if (!wstr)
return "(null)";
@ -279,80 +281,13 @@ static const char *debugstr_w16(const uint16_t *wstr)
return buffer;
}
static const char *debugstr_w32(const uint32_t *wstr)
{
char *buffer, *ptr;
uint32_t c;
if (!wstr)
return "(null)";
ptr = buffer = get_buffer();
*ptr++ = '"';
while ((c = *wstr++) && ptr <= buffer + VKD3D_DEBUG_BUFFER_SIZE - 10)
{
int escape_char;
switch (c)
{
case '"':
case '\\':
case '\n':
case '\r':
case '\t':
escape_char = c;
break;
default:
escape_char = 0;
break;
}
if (escape_char)
{
*ptr++ = '\\';
*ptr++ = escape_char;
continue;
}
if (isprint(c))
{
*ptr++ = c;
}
else
{
*ptr++ = '\\';
sprintf(ptr, "%04x", c);
ptr += 4;
}
}
*ptr++ = '"';
if (c)
{
*ptr++ = '.';
*ptr++ = '.';
*ptr++ = '.';
}
*ptr = '\0';
return buffer;
}
const char *debugstr_w(const WCHAR *wstr, size_t wchar_size)
{
if (wchar_size == 2)
return debugstr_w16((const uint16_t *)wstr);
return debugstr_w32((const uint32_t *)wstr);
}
unsigned int vkd3d_env_var_as_uint(const char *name, unsigned int default_value)
{
const char *value = getenv(name);
char value[VKD3D_PATH_MAX];
unsigned long r;
char *end_ptr;
if (value)
if (vkd3d_get_env_var(name, value, sizeof(value)) && strlen(value) > 0)
{
errno = 0;
r = strtoul(value, &end_ptr, 0);
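
The hunk above ends at the `strtoul` call, so the rest of `vkd3d_env_var_as_uint` is not visible here. As a rough sketch of the usual pattern for this kind of helper, the function below parses an already-fetched string and returns a default on any problem. The name `parse_uint_or_default` is hypothetical, and rejecting trailing characters and clamping to `UINT_MAX` are assumptions about reasonable behavior, not a claim about what the real function does.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse value as an unsigned integer
 * (decimal, 0x-prefixed hex, or 0-prefixed octal). */
static unsigned int parse_uint_or_default(const char *value, unsigned int default_value)
{
    unsigned long r;
    char *end_ptr;

    /* An unset or empty variable yields the default. */
    if (!value || !*value)
        return default_value;

    errno = 0;
    r = strtoul(value, &end_ptr, 0);

    /* Assumption: overflow or trailing characters count as a parse failure. */
    if (errno || *end_ptr != '\0')
        return default_value;

    /* Assumption: values wider than unsigned int are clamped. */
    return r > UINT_MAX ? UINT_MAX : (unsigned int)r;
}
```

The empty-string check corresponds to the `strlen(value) > 0` condition added in the diff: a variable that is set but empty is treated the same as one that is not set.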

View File

@ -0,0 +1,188 @@
/*
* Copyright 2022 Hans-Kristian Arntzen for Valve Corporation
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
#define VKD3D_DBG_CHANNEL VKD3D_DBG_CHANNEL_API
#include "vkd3d_file_utils.h"
#include "vkd3d_debug.h"
/* For disk cache. */
#ifdef _WIN32
#include <windows.h>
#include <io.h>
#else
#include <unistd.h>
#include <sys/mman.h>
#include <errno.h>
#endif
#include <fcntl.h>
#include <sys/stat.h>
#include <stdio.h>
bool vkd3d_file_rename_overwrite(const char *from_path, const char *to_path)
{
#ifdef _WIN32
DWORD code = ERROR_SUCCESS;
if (!MoveFileA(from_path, to_path))
{
code = GetLastError();
if (code == ERROR_ALREADY_EXISTS)
{
code = ERROR_SUCCESS;
if (!ReplaceFileA(to_path, from_path, NULL, 0, NULL, NULL))
code = GetLastError();
}
}
return code == ERROR_SUCCESS;
#else
return rename(from_path, to_path) == 0;
#endif
}
bool vkd3d_file_rename_no_replace(const char *from_path, const char *to_path)
{
#ifdef _WIN32
DWORD code = ERROR_SUCCESS;
if (!MoveFileA(from_path, to_path))
code = GetLastError();
return code == ERROR_SUCCESS;
#else
return renameat2(AT_FDCWD, from_path, AT_FDCWD, to_path, RENAME_NOREPLACE) == 0;
#endif
}
bool vkd3d_file_delete(const char *path)
{
#ifdef _WIN32
DWORD code = ERROR_SUCCESS;
if (!DeleteFileA(path))
code = GetLastError();
return code == ERROR_SUCCESS;
#else
return unlink(path) == 0;
#endif
}
FILE *vkd3d_file_open_exclusive_write(const char *path)
{
#ifdef _WIN32
/* From Fossilize. AFAIK, there is no direct way to make this work with FILE interface, so have to roundtrip
* through jank POSIX layer.
* wbx kinda works, but Wine warns about it, despite it working anyways.
* Older MSVC runtimes do not support wbx. */
FILE *file = NULL;
int fd;
fd = _open(path, _O_BINARY | _O_WRONLY | _O_CREAT | _O_EXCL | _O_TRUNC | _O_SEQUENTIAL,
_S_IWRITE | _S_IREAD);
if (fd >= 0)
{
file = _fdopen(fd, "wb");
/* _fdopen takes ownership. */
if (!file)
_close(fd);
}
return file;
#else
return fopen(path, "wbx");
#endif
}
void vkd3d_file_unmap(struct vkd3d_memory_mapped_file *file)
{
if (file->mapped)
{
#ifdef _WIN32
UnmapViewOfFile(file->mapped);
#else
munmap(file->mapped, file->mapped_size);
#endif
}
memset(file, 0, sizeof(*file));
}
bool vkd3d_file_map_read_only(const char *path, struct vkd3d_memory_mapped_file *file)
{
#ifdef _WIN32
DWORD size_hi, size_lo;
HANDLE file_mapping;
HANDLE handle;
#else
struct stat stat_buf;
int fd;
#endif
file->mapped = NULL;
file->mapped_size = 0;
#ifdef _WIN32
handle = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ | FILE_SHARE_DELETE, NULL,
OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL | FILE_FLAG_SEQUENTIAL_SCAN,
INVALID_HANDLE_VALUE);
if (handle == INVALID_HANDLE_VALUE)
goto out;
size_lo = GetFileSize(handle, &size_hi);
file->mapped_size = size_lo | (((uint64_t)size_hi) << 32);
file_mapping = CreateFileMappingA(handle, NULL, PAGE_READONLY, 0, 0, NULL);
/* CreateFileMappingA reports failure with NULL, not INVALID_HANDLE_VALUE. */
if (!file_mapping)
goto out;
file->mapped = MapViewOfFile(file_mapping, FILE_MAP_READ, 0, 0, file->mapped_size);
CloseHandle(file_mapping);
file_mapping = INVALID_HANDLE_VALUE;
if (!file->mapped)
{
ERR("Failed to MapViewOfFile for %s.\n", path);
goto out;
}
out:
if (handle != INVALID_HANDLE_VALUE)
CloseHandle(handle);
#else
fd = open(path, O_RDONLY);
if (fd < 0)
goto out;
if (fstat(fd, &stat_buf) < 0)
{
ERR("Failed to fstat pipeline cache.\n");
goto out;
}
/* Map private to make sure we get CoW behavior in case someone clobbers
* the cache while in flight. We need to read data directly out of the cache. */
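file->mapped = mmap(NULL, stat_buf.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
if (file->mapped != MAP_FAILED)
file->mapped_size = stat_buf.st_size;
else
{
/* mmap reports failure with MAP_FAILED, not NULL. Reset the pointer so the
* NULL checks below report the failure to the caller. */
file->mapped = NULL;
goto out;
}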
out:
if (fd >= 0)
close(fd);
#endif
if (!file->mapped)
file->mapped_size = 0;
return file->mapped != NULL;
}

View File

@ -2,7 +2,10 @@ vkd3d_common_src = [
'debug.c',
'memory.c',
'utf8.c',
'profiling.c'
'profiling.c',
'string.c',
'file_utils.c',
'platform.c',
]
vkd3d_common_lib = static_library('vkd3d_common', vkd3d_common_src, vkd3d_header_files,

View File

@ -18,6 +18,9 @@
#include "vkd3d_platform.h"
#include <assert.h>
#include <stdio.h>
#if defined(__linux__)
# include <dlfcn.h>
@ -153,3 +156,43 @@ bool vkd3d_get_program_name(char program_name[VKD3D_PATH_MAX])
}
#endif
#if defined(_WIN32)
bool vkd3d_get_env_var(const char *name, char *value, size_t value_size)
{
DWORD len;
assert(value);
assert(value_size > 0);
len = GetEnvironmentVariableA(name, value, value_size);
if (len > 0 && len <= value_size)
{
return true;
}
value[0] = '\0';
return false;
}
#else
bool vkd3d_get_env_var(const char *name, char *value, size_t value_size)
{
const char *env_value;
assert(value);
assert(value_size > 0);
if ((env_value = getenv(name)))
{
snprintf(value, value_size, "%s", env_value);
return true;
}
value[0] = '\0';
return false;
}
#endif
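The non-Windows variant above copies the environment value into a caller-owned buffer instead of handing out the pointer returned by getenv(). A small standalone sketch of that pattern follows. The example_ names are hypothetical, and the "set but empty counts as unset" check is an assumption modelled on the strlen(value) > 0 tests that callers in this diff perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the non-Windows helper shown in the diff. */
static bool example_get_env_var(const char *name, char *value, size_t value_size)
{
    const char *env_value;

    assert(value);
    assert(value_size > 0);

    if ((env_value = getenv(name)))
    {
        /* snprintf always NUL-terminates; values longer than the buffer are truncated. */
        snprintf(value, value_size, "%s", env_value);
        return true;
    }

    value[0] = '\0';
    return false;
}

/* Assumed caller pattern: treat "unset" and "set to an empty string" alike. */
static bool example_env_var_is_set(const char *name)
{
    char value[256];

    return example_get_env_var(name, value, sizeof(value)) && value[0] != '\0';
}
```

Because the buffer is always left NUL-terminated, a caller may ignore the return value and test the string length instead, as the profiling path change does.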

View File

@ -21,6 +21,7 @@
#define VKD3D_DBG_CHANNEL VKD3D_DBG_CHANNEL_API
#include "vkd3d_profiling.h"
#include "vkd3d_platform.h"
#include "vkd3d_threads.h"
#include "vkd3d_debug.h"
#include <stdlib.h>
@ -124,8 +125,10 @@ static void vkd3d_init_profiling_path(const char *path)
static void vkd3d_init_profiling_once(void)
{
const char *path = getenv("VKD3D_PROFILE_PATH");
if (path)
char path[VKD3D_PATH_MAX];
vkd3d_get_env_var("VKD3D_PROFILE_PATH", path, sizeof(path));
if (strlen(path) > 0)
vkd3d_init_profiling_path(path);
}

176
libs/vkd3d-common/string.c Normal file
View File

@ -0,0 +1,176 @@
/*
* Copyright 2021 Hans-Kristian Arntzen for Valve Corporation
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
#define VKD3D_DBG_CHANNEL VKD3D_DBG_CHANNEL_API
#include "vkd3d_string.h"
#include "vkd3d_memory.h"
STATIC_ASSERT(sizeof(WCHAR) == sizeof(uint16_t));
char *vkd3d_strdup(const char *str)
{
/* strdup() is actually not standard. */
char *duped;
size_t len;
len = strlen(str) + 1;
duped = vkd3d_malloc(len);
if (duped)
memcpy(duped, str, len);
return duped;
}
char *vkd3d_strdup_n(const char *str, size_t n)
{
char *duped;
size_t len;
len = strnlen(str, n);
duped = vkd3d_malloc(len + 1);
if (duped)
{
memcpy(duped, str, len);
duped[len] = '\0';
}
return duped;
}
WCHAR *vkd3d_wstrdup(const WCHAR *str)
{
WCHAR *duped;
size_t len;
len = vkd3d_wcslen(str) + 1;
duped = vkd3d_malloc(len * sizeof(WCHAR));
if (duped)
memcpy(duped, str, len * sizeof(WCHAR));
return duped;
}
bool vkd3d_export_strequal(const WCHAR *a, const WCHAR *b)
{
if (!a || !b)
return false;
while (*a != '\0' && *b != '\0')
{
if (*a != *b)
return false;
a++;
b++;
}
return *a == *b;
}
bool vkd3d_export_strequal_mixed(const WCHAR *a, const char *b)
{
if (!a || !b)
return false;
while (*a != '\0' && *b != '\0')
{
if (*a != *b)
return false;
a++;
b++;
}
return *a == *b;
}
bool vkd3d_export_strequal_substr(const WCHAR *a, size_t expected_n, const WCHAR *b)
{
size_t n = 0;
if (!a || !b)
return false;
while (*a != '\0' && *b != '\0' && n < expected_n)
{
if (*a != *b)
return false;
a++;
b++;
n++;
}
return n == expected_n && *b == '\0';
}
WCHAR *vkd3d_dup_entry_point(const char *str)
{
return vkd3d_dup_entry_point_n(str, strlen(str));
}
WCHAR *vkd3d_dup_entry_point_n(const char *str, size_t len)
{
WCHAR *duped;
size_t i;
duped = vkd3d_malloc((len + 1) * sizeof(WCHAR));
if (!duped)
return NULL;
for (i = 0; i < len; i++)
duped[i] = (unsigned char)str[i];
duped[len] = 0;
return duped;
}
static bool is_valid_identifier_character(char v)
{
return (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z') || v == '_' || (v >= '0' && v <= '9');
}
static const char *vkd3d_manged_entry_point_scan(const char *entry, const char **out_end_entry)
{
const char *end_entry;
while (*entry != '\0' && !is_valid_identifier_character(*entry))
entry++;
end_entry = entry;
while (*end_entry != '\0' && is_valid_identifier_character(*end_entry))
end_entry++;
if (entry == end_entry)
return NULL;
*out_end_entry = end_entry;
return entry;
}
WCHAR *vkd3d_dup_demangled_entry_point(const char *entry)
{
const char *end_entry;
if (!(entry = vkd3d_manged_entry_point_scan(entry, &end_entry)))
return NULL;
return vkd3d_dup_entry_point_n(entry, end_entry - entry);
}
char *vkd3d_dup_demangled_entry_point_ascii(const char *entry)
{
const char *end_entry;
if (!(entry = vkd3d_manged_entry_point_scan(entry, &end_entry)))
return NULL;
return vkd3d_strdup_n(entry, end_entry - entry);
}
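The demangling helpers above do not parse the mangling scheme; they skip leading non-identifier characters and keep the first run of identifier characters. A standalone sketch of that scan, assuming plain malloc() in place of vkd3d_malloc() and hypothetical example_ names, could look like this. The sample input in the usage note is an assumed MSVC-style mangled export name, not one taken from the diff.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static bool example_is_identifier_char(char v)
{
    return (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z') || v == '_' || (v >= '0' && v <= '9');
}

/* Returns a malloc()ed copy of the first identifier run in entry,
 * or NULL if there is none or the allocation fails. */
static char *example_demangle_entry_point(const char *entry)
{
    const char *end;
    size_t len;
    char *out;

    /* Skip prefix characters such as '\x01' and '?'. */
    while (*entry != '\0' && !example_is_identifier_char(*entry))
        entry++;

    /* Extend over the identifier itself. */
    end = entry;
    while (*end != '\0' && example_is_identifier_char(*end))
        end++;

    if (entry == end)
        return NULL;

    len = (size_t)(end - entry);
    if (!(out = malloc(len + 1)))
        return NULL;
    memcpy(out, entry, len);
    out[len] = '\0';
    return out;
}
```

With an input such as "\x01?RayGen@@YAXXZ" the scan stops at the first '@', so only RayGen is copied. An unmangled name like main passes through unchanged.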

View File

@ -84,9 +84,9 @@ static void vkd3d_utf8_append(char **dst, uint32_t c)
*dst += 4;
}
static uint32_t vkd3d_utf16_read(const uint16_t **src)
static uint32_t vkd3d_utf16_read(const WCHAR **src)
{
const uint16_t *s = *src;
const WCHAR *s = *src;
if (s[0] < 0xd800 || s[0] > 0xdfff) /* Not a surrogate pair. */
{
@ -105,21 +105,15 @@ static uint32_t vkd3d_utf16_read(const uint16_t **src)
return 0x10000 + ((s[0] & 0x3ff) << 10) + (s[1] & 0x3ff);
}
static inline bool vkd3d_string_should_loop_u16(ptrdiff_t max_elements, const uint16_t* src, const uint16_t* wstr)
static inline bool vkd3d_string_should_loop_u16(ptrdiff_t max_elements, const WCHAR* src, const WCHAR* wstr)
{
ptrdiff_t cursor_pos = src - wstr;
return (!max_elements || cursor_pos < max_elements) && *src;
}
static inline bool vkd3d_string_should_loop_u32(ptrdiff_t max_elements, const uint32_t* src, const uint32_t* wstr)
char *vkd3d_strdup_w_utf8(const WCHAR *wstr, size_t max_elements)
{
ptrdiff_t cursor_pos = src - wstr;
return (!max_elements || cursor_pos < max_elements) && *src;
}
static char *vkd3d_strdup_w16_utf8(const uint16_t *wstr, size_t max_elements)
{
const uint16_t *src = wstr;
const WCHAR *src = wstr;
size_t dst_size = 0;
char *dst, *utf8;
uint32_t c;
@ -143,36 +137,7 @@ static char *vkd3d_strdup_w16_utf8(const uint16_t *wstr, size_t max_elements)
continue;
vkd3d_utf8_append(&utf8, c);
}
*utf8 = 0;
*utf8 = '\0';
return dst;
}
static char *vkd3d_strdup_w32_utf8(const uint32_t *wstr, size_t max_elements)
{
const uint32_t *src = wstr;
size_t dst_size = 0;
char *dst, *utf8;
while (vkd3d_string_should_loop_u32(max_elements, src, wstr))
dst_size += vkd3d_utf8_len(*src++);
++dst_size;
if (!(dst = vkd3d_malloc(dst_size)))
return NULL;
utf8 = dst;
src = wstr;
while (vkd3d_string_should_loop_u32(max_elements, src, wstr))
vkd3d_utf8_append(&utf8, *src++);
*utf8 = 0;
return dst;
}
char *vkd3d_strdup_w_utf8(const WCHAR *wstr, size_t wchar_size, size_t max_elements)
{
if (wchar_size == 2)
return vkd3d_strdup_w16_utf8((const uint16_t *)wstr, max_elements);
return vkd3d_strdup_w32_utf8((const uint32_t *)wstr, max_elements);
}

View File

@ -1346,6 +1346,7 @@ static const enum vkd3d_shader_register_type register_type_table[] =
/* VKD3D_SM5_RT_DEPTHOUT_LESS_EQUAL */ VKD3DSPR_DEPTHOUTLE,
/* UNKNOWN */ ~0u,
/* VKD3D_SM5_RT_STENCILREFOUT */ VKD3DSPR_STENCILREFOUT,
/* VKD3D_SM5_RT_INNERCOVERAGE */ VKD3DSPR_INNERCOVERAGE,
};
static const struct vkd3d_sm4_opcode_info *get_opcode_info(enum vkd3d_sm4_opcode opcode)
@ -2248,6 +2249,21 @@ static int isgn_handler(const char *data, DWORD data_size, DWORD tag, void *ctx)
return shader_parse_signature(tag, data, data_size, is);
}
static int osgn_handler(const char *data, DWORD data_size, DWORD tag, void *ctx)
{
struct vkd3d_shader_signature *is = ctx;
if (tag != TAG_OSGN && tag != TAG_OSG1)
return VKD3D_OK;
if (is->elements)
{
FIXME("Multiple output signatures.\n");
vkd3d_shader_free_shader_signature(is);
}
return shader_parse_signature(tag, data, data_size, is);
}
int shader_parse_input_signature(const void *dxbc, size_t dxbc_length,
struct vkd3d_shader_signature *signature)
{
@ -2259,6 +2275,17 @@ int shader_parse_input_signature(const void *dxbc, size_t dxbc_length,
return ret;
}
int shader_parse_output_signature(const void *dxbc, size_t dxbc_length,
struct vkd3d_shader_signature *signature)
{
int ret;
memset(signature, 0, sizeof(*signature));
if ((ret = parse_dxbc(dxbc, dxbc_length, osgn_handler, signature)) < 0)
ERR("Failed to parse output signature.\n");
return ret;
}
static int dxil_handler(const char *data, DWORD data_size, DWORD tag, void *context)
{
switch (tag)
@ -2426,7 +2453,8 @@ static void shader_validate_descriptor_range1(const struct vkd3d_descriptor_rang
| VKD3D_DESCRIPTOR_RANGE_FLAG_DESCRIPTORS_VOLATILE
| VKD3D_DESCRIPTOR_RANGE_FLAG_DATA_VOLATILE
| VKD3D_DESCRIPTOR_RANGE_FLAG_DATA_STATIC_WHILE_SET_AT_EXECUTE
| VKD3D_DESCRIPTOR_RANGE_FLAG_DATA_STATIC);
| VKD3D_DESCRIPTOR_RANGE_FLAG_DATA_STATIC
| VKD3D_DESCRIPTOR_RANGE_FLAG_DESCRIPTORS_STATIC_KEEPING_BUFFER_BOUNDS_CHECKS);
if (unknown_flags)
FIXME("Unknown descriptor range flags %#x.\n", unknown_flags);
@ -2727,8 +2755,9 @@ static int shader_parse_static_samplers(struct root_signature_parser_context *co
return VKD3D_OK;
}
static int shader_parse_root_signature(const char *data, unsigned int data_size,
struct vkd3d_versioned_root_signature_desc *desc)
int vkd3d_shader_parse_root_signature_raw(const char *data, unsigned int data_size,
struct vkd3d_versioned_root_signature_desc *desc,
vkd3d_shader_hash_t *compatibility_hash)
{
struct vkd3d_root_signature_desc *v_1_0 = &desc->v_1_0;
struct root_signature_parser_context context;
@ -2736,6 +2765,8 @@ static int shader_parse_root_signature(const char *data, unsigned int data_size,
const char *ptr = data;
int ret;
memset(desc, 0, sizeof(*desc));
context.data = data;
context.data_size = data_size;
@ -2807,28 +2838,46 @@ static int shader_parse_root_signature(const char *data, unsigned int data_size,
read_uint32(&ptr, &v_1_0->flags);
TRACE("Flags %#x.\n", v_1_0->flags);
if (compatibility_hash)
{
struct vkd3d_shader_code code = { data, data_size };
*compatibility_hash = vkd3d_shader_hash(&code);
}
return VKD3D_OK;
}
static int rts0_handler(const char *data, DWORD data_size, DWORD tag, void *context)
{
struct vkd3d_versioned_root_signature_desc *desc = context;
struct vkd3d_shader_code *payload = context;
if (tag != TAG_RTS0)
return VKD3D_OK;
return shader_parse_root_signature(data, data_size, desc);
payload->code = data;
payload->size = data_size;
return VKD3D_OK;
}
int vkd3d_shader_parse_root_signature(const struct vkd3d_shader_code *dxbc,
struct vkd3d_versioned_root_signature_desc *root_signature)
struct vkd3d_versioned_root_signature_desc *root_signature,
vkd3d_shader_hash_t *compatibility_hash)
{
struct vkd3d_shader_code raw_payload;
int ret;
TRACE("dxbc {%p, %zu}, root_signature %p.\n", dxbc->code, dxbc->size, root_signature);
memset(root_signature, 0, sizeof(*root_signature));
if ((ret = parse_dxbc(dxbc->code, dxbc->size, rts0_handler, root_signature)) < 0)
memset(&raw_payload, 0, sizeof(raw_payload));
if ((ret = parse_dxbc(dxbc->code, dxbc->size, rts0_handler, &raw_payload)) < 0)
return ret;
if (!raw_payload.code)
return VKD3D_ERROR;
if ((ret = vkd3d_shader_parse_root_signature_raw(raw_payload.code, raw_payload.size,
root_signature, compatibility_hash)) < 0)
{
vkd3d_shader_free_root_signature(root_signature);
return ret;

File diff suppressed because it is too large.

File diff suppressed because it is too large.

View File

@ -20,6 +20,8 @@
#include "vkd3d_shader_private.h"
#include "vkd3d_platform.h"
#include <stdio.h>
#include <inttypes.h>
@ -30,6 +32,8 @@ static void vkd3d_shader_dump_blob(const char *path, vkd3d_shader_hash_t hash, c
snprintf(filename, ARRAY_SIZE(filename), "%s/%016"PRIx64".%s", path, hash, ext);
INFO("Dumping blob to %s.\n", filename);
/* Exclusive open keeps multiple threads from writing out the same shader module and avoids the resulting race condition. */
if ((f = fopen(filename, "wbx")))
{
@ -40,25 +44,12 @@ static void vkd3d_shader_dump_blob(const char *path, vkd3d_shader_hash_t hash, c
}
}
bool vkd3d_shader_replace(vkd3d_shader_hash_t hash, const void **data, size_t *size)
static bool vkd3d_shader_replace_path(const char *filename, vkd3d_shader_hash_t hash, const void **data, size_t *size)
{
static bool enabled = true;
char filename[1024];
void *buffer = NULL;
const char *path;
FILE *f = NULL;
size_t len;
if (!enabled)
return false;
if (!(path = getenv("VKD3D_SHADER_OVERRIDE")))
{
enabled = false;
return false;
}
snprintf(filename, ARRAY_SIZE(filename), "%s/%016"PRIx64".spv", path, hash);
if ((f = fopen(filename, "rb")))
{
if (fseek(f, 0, SEEK_END) < 0)
@ -78,7 +69,7 @@ bool vkd3d_shader_replace(vkd3d_shader_hash_t hash, const void **data, size_t *s
*data = buffer;
*size = len;
WARN("Overriding shader hash %016"PRIx64" with alternative SPIR-V module!\n", hash);
INFO("Overriding shader hash %016"PRIx64" with alternative SPIR-V module from %s!\n", hash, filename);
fclose(f);
return true;
@ -89,15 +80,53 @@ err:
return false;
}
bool vkd3d_shader_replace(vkd3d_shader_hash_t hash, const void **data, size_t *size)
{
static bool enabled = true;
char path[VKD3D_PATH_MAX];
char filename[1024];
if (!enabled)
return false;
if (!vkd3d_get_env_var("VKD3D_SHADER_OVERRIDE", path, sizeof(path)))
{
enabled = false;
return false;
}
snprintf(filename, ARRAY_SIZE(filename), "%s/%016"PRIx64".spv", path, hash);
return vkd3d_shader_replace_path(filename, hash, data, size);
}
bool vkd3d_shader_replace_export(vkd3d_shader_hash_t hash, const void **data, size_t *size, const char *export)
{
static bool enabled = true;
char path[VKD3D_PATH_MAX];
char filename[1024];
if (!enabled)
return false;
if (!vkd3d_get_env_var("VKD3D_SHADER_OVERRIDE", path, sizeof(path)))
{
enabled = false;
return false;
}
snprintf(filename, ARRAY_SIZE(filename), "%s/%016"PRIx64".lib.%s.spv", path, hash, export);
return vkd3d_shader_replace_path(filename, hash, data, size);
}
void vkd3d_shader_dump_shader(vkd3d_shader_hash_t hash, const struct vkd3d_shader_code *shader, const char *ext)
{
static bool enabled = true;
const char *path;
char path[VKD3D_PATH_MAX];
if (!enabled)
return;
if (!(path = getenv("VKD3D_SHADER_DUMP_PATH")))
if (!vkd3d_get_env_var("VKD3D_SHADER_DUMP_PATH", path, sizeof(path)))
{
enabled = false;
return;
@ -109,12 +138,12 @@ void vkd3d_shader_dump_shader(vkd3d_shader_hash_t hash, const struct vkd3d_shade
void vkd3d_shader_dump_spirv_shader(vkd3d_shader_hash_t hash, const struct vkd3d_shader_code *shader)
{
static bool enabled = true;
const char *path;
char path[VKD3D_PATH_MAX];
if (!enabled)
return;
if (!(path = getenv("VKD3D_SHADER_DUMP_PATH")))
if (!vkd3d_get_env_var("VKD3D_SHADER_DUMP_PATH", path, sizeof(path)))
{
enabled = false;
return;
@ -123,6 +152,26 @@ void vkd3d_shader_dump_spirv_shader(vkd3d_shader_hash_t hash, const struct vkd3d
vkd3d_shader_dump_blob(path, hash, shader->code, shader->size, "spv");
}
void vkd3d_shader_dump_spirv_shader_export(vkd3d_shader_hash_t hash, const struct vkd3d_shader_code *shader,
const char *export)
{
static bool enabled = true;
char path[VKD3D_PATH_MAX];
char tag[1024];
if (!enabled)
return;
if (!vkd3d_get_env_var("VKD3D_SHADER_DUMP_PATH", path, sizeof(path)))
{
enabled = false;
return;
}
snprintf(tag, sizeof(tag), "lib.%s.spv", export);
vkd3d_shader_dump_blob(path, hash, shader->code, shader->size, tag);
}
struct vkd3d_shader_parser
{
struct vkd3d_shader_desc shader_desc;
@ -166,12 +215,6 @@ static int vkd3d_shader_validate_compile_args(const struct vkd3d_shader_compile_
if (!compile_args)
return VKD3D_OK;
if (compile_args->type != VKD3D_SHADER_STRUCTURE_TYPE_COMPILE_ARGUMENTS)
{
WARN("Invalid structure type %#x.\n", compile_args->type);
return VKD3D_ERROR_INVALID_ARGUMENT;
}
switch (compile_args->target)
{
case VKD3D_SHADER_TARGET_SPIRV_VULKAN_1_0:
@ -255,6 +298,29 @@ static void vkd3d_shader_scan_destroy(struct vkd3d_shader_scan_info *scan_info)
hash_map_clear(&scan_info->register_map);
}
static int vkd3d_shader_validate_shader_type(enum vkd3d_shader_type type, VkShaderStageFlagBits stages)
{
static const VkShaderStageFlagBits table[VKD3D_SHADER_TYPE_COUNT] = {
VK_SHADER_STAGE_FRAGMENT_BIT,
VK_SHADER_STAGE_VERTEX_BIT,
VK_SHADER_STAGE_GEOMETRY_BIT,
VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT,
VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT,
VK_SHADER_STAGE_COMPUTE_BIT,
};
if (type >= VKD3D_SHADER_TYPE_COUNT)
return VKD3D_ERROR_INVALID_ARGUMENT;
if (table[type] != stages)
{
ERR("Expected VkShaderStage %#x, but got VkShaderStage %#x.\n", stages, table[type]);
return VKD3D_ERROR_INVALID_ARGUMENT;
}
return 0;
}
int vkd3d_shader_compile_dxbc(const struct vkd3d_shader_code *dxbc,
struct vkd3d_shader_code *spirv, unsigned int compiler_options,
const struct vkd3d_shader_interface_info *shader_interface_info,
@ -270,32 +336,22 @@ int vkd3d_shader_compile_dxbc(const struct vkd3d_shader_code *dxbc,
TRACE("dxbc {%p, %zu}, spirv %p, compiler_options %#x, shader_interface_info %p, compile_args %p.\n",
dxbc->code, dxbc->size, spirv, compiler_options, shader_interface_info, compile_args);
if (shader_interface_info && shader_interface_info->type != VKD3D_SHADER_STRUCTURE_TYPE_SHADER_INTERFACE_INFO)
{
WARN("Invalid structure type %#x.\n", shader_interface_info->type);
return VKD3D_ERROR_INVALID_ARGUMENT;
}
if ((ret = vkd3d_shader_validate_compile_args(compile_args)) < 0)
return ret;
/* DXIL is handled externally through dxil-spirv. */
if (shader_is_dxil(dxbc->code, dxbc->size))
{
#ifdef HAVE_DXIL_SPV
return vkd3d_shader_compile_dxil(dxbc, spirv, shader_interface_info, compile_args);
#else
ERR("DXIL shader found, but DXIL support is not enabled in vkd3d.\n");
return VKD3D_ERROR_INVALID_SHADER;
#endif
}
memset(&spirv->meta, 0, sizeof(spirv->meta));
hash = vkd3d_shader_hash(dxbc);
spirv->meta.replaced = false;
spirv->meta.hash = hash;
if (vkd3d_shader_replace(hash, &spirv->code, &spirv->size))
{
spirv->meta.replaced = true;
spirv->meta.flags |= VKD3D_SHADER_META_FLAG_REPLACED;
return VKD3D_OK;
}
@ -307,19 +363,31 @@ int vkd3d_shader_compile_dxbc(const struct vkd3d_shader_code *dxbc,
return ret;
}
spirv->meta.patch_vertex_count = scan_info.patch_vertex_count;
if ((ret = vkd3d_shader_parser_init(&parser, dxbc)) < 0)
{
vkd3d_shader_scan_destroy(&scan_info);
return ret;
}
if (shader_interface_info)
{
if ((ret = vkd3d_shader_validate_shader_type(parser.shader_version.type, shader_interface_info->stage)) < 0)
{
vkd3d_shader_scan_destroy(&scan_info);
return ret;
}
}
vkd3d_shader_dump_shader(hash, dxbc, "dxbc");
if (TRACE_ON())
vkd3d_shader_trace(parser.data);
if (!(spirv_compiler = vkd3d_dxbc_compiler_create(&parser.shader_version,
&parser.shader_desc, compiler_options, shader_interface_info, compile_args, &scan_info)))
&parser.shader_desc, compiler_options, shader_interface_info, compile_args, &scan_info,
spirv->meta.hash)))
{
ERR("Failed to create DXBC compiler.\n");
vkd3d_shader_scan_destroy(&scan_info);
@ -366,6 +434,24 @@ static bool vkd3d_shader_instruction_is_uav_read(const struct vkd3d_shader_instr
|| ((handler_idx == VKD3DSIH_LD_STRUCTURED || handler_idx == VKD3DSIH_LD_STRUCTURED_FEEDBACK) && instruction->src[2].reg.type == VKD3DSPR_UAV);
}
static bool vkd3d_shader_instruction_is_uav_write(const struct vkd3d_shader_instruction *instruction)
{
enum VKD3D_SHADER_INSTRUCTION_HANDLER handler_idx = instruction->handler_idx;
return (VKD3DSIH_ATOMIC_AND <= handler_idx && handler_idx <= VKD3DSIH_ATOMIC_XOR)
|| (VKD3DSIH_IMM_ATOMIC_ALLOC <= handler_idx && handler_idx <= VKD3DSIH_IMM_ATOMIC_XOR)
|| handler_idx == VKD3DSIH_STORE_UAV_TYPED
|| handler_idx == VKD3DSIH_STORE_RAW
|| handler_idx == VKD3DSIH_STORE_STRUCTURED;
}
static bool vkd3d_shader_instruction_is_uav_atomic(const struct vkd3d_shader_instruction *instruction)
{
enum VKD3D_SHADER_INSTRUCTION_HANDLER handler_idx = instruction->handler_idx;
return ((VKD3DSIH_ATOMIC_AND <= handler_idx && handler_idx <= VKD3DSIH_ATOMIC_XOR) ||
(VKD3DSIH_IMM_ATOMIC_AND <= handler_idx && handler_idx <= VKD3DSIH_IMM_ATOMIC_XOR)) &&
handler_idx != VKD3DSIH_IMM_ATOMIC_CONSUME;
}
static void vkd3d_shader_scan_record_uav_read(struct vkd3d_shader_scan_info *scan_info,
const struct vkd3d_shader_register *reg)
{
@ -373,6 +459,13 @@ static void vkd3d_shader_scan_record_uav_read(struct vkd3d_shader_scan_info *sca
reg->idx[0].offset, VKD3D_SHADER_UAV_FLAG_READ_ACCESS);
}
static void vkd3d_shader_scan_record_uav_atomic(struct vkd3d_shader_scan_info *scan_info,
const struct vkd3d_shader_register *reg)
{
vkd3d_shader_scan_set_register_flags(scan_info, VKD3DSPR_UAV,
reg->idx[0].offset, VKD3D_SHADER_UAV_FLAG_ATOMIC_ACCESS);
}
static bool vkd3d_shader_instruction_is_uav_counter(const struct vkd3d_shader_instruction *instruction)
{
enum VKD3D_SHADER_INSTRUCTION_HANDLER handler_idx = instruction->handler_idx;
@ -383,6 +476,8 @@ static bool vkd3d_shader_instruction_is_uav_counter(const struct vkd3d_shader_in
static void vkd3d_shader_scan_record_uav_counter(struct vkd3d_shader_scan_info *scan_info,
const struct vkd3d_shader_register *reg)
{
scan_info->has_side_effects = true;
scan_info->has_uav_counter = true;
vkd3d_shader_scan_set_register_flags(scan_info, VKD3DSPR_UAV,
reg->idx[0].offset, VKD3D_SHADER_UAV_FLAG_ATOMIC_COUNTER);
}
@ -396,81 +491,83 @@ static void vkd3d_shader_scan_input_declaration(struct vkd3d_shader_scan_info *s
scan_info->use_vocp = true;
}
static void vkd3d_shader_scan_output_declaration(struct vkd3d_shader_scan_info *scan_info,
const struct vkd3d_shader_instruction *instruction)
{
switch (instruction->declaration.dst.reg.type)
{
case VKD3DSPR_DEPTHOUT:
case VKD3DSPR_DEPTHOUTLE:
case VKD3DSPR_DEPTHOUTGE:
case VKD3DSPR_STENCILREFOUT:
case VKD3DSPR_SAMPLEMASK:
scan_info->needs_late_zs = true;
break;
default:
break;
}
}
static void vkd3d_shader_scan_instruction(struct vkd3d_shader_scan_info *scan_info,
const struct vkd3d_shader_instruction *instruction)
{
unsigned int i;
bool is_atomic;
switch (instruction->handler_idx)
{
case VKD3DSIH_DCL_INPUT:
vkd3d_shader_scan_input_declaration(scan_info, instruction);
break;
case VKD3DSIH_DCL_OUTPUT:
vkd3d_shader_scan_output_declaration(scan_info, instruction);
break;
case VKD3DSIH_DISCARD:
scan_info->discards = true;
break;
case VKD3DSIH_DCL_GLOBAL_FLAGS:
if (instruction->flags & VKD3DSGF_FORCE_EARLY_DEPTH_STENCIL)
scan_info->early_fragment_tests = true;
break;
case VKD3DSIH_DCL_INPUT_CONTROL_POINT_COUNT:
scan_info->patch_vertex_count = instruction->declaration.count;
break;
default:
break;
}
if (vkd3d_shader_instruction_is_uav_read(instruction))
{
is_atomic = vkd3d_shader_instruction_is_uav_atomic(instruction);
for (i = 0; i < instruction->dst_count; ++i)
{
if (instruction->dst[i].reg.type == VKD3DSPR_UAV)
{
vkd3d_shader_scan_record_uav_read(scan_info, &instruction->dst[i].reg);
if (is_atomic)
vkd3d_shader_scan_record_uav_atomic(scan_info, &instruction->dst[i].reg);
}
}
for (i = 0; i < instruction->src_count; ++i)
{
if (instruction->src[i].reg.type == VKD3DSPR_UAV)
{
vkd3d_shader_scan_record_uav_read(scan_info, &instruction->src[i].reg);
if (is_atomic)
vkd3d_shader_scan_record_uav_atomic(scan_info, &instruction->src[i].reg);
}
}
}
if (vkd3d_shader_instruction_is_uav_write(instruction))
scan_info->has_side_effects = true;
if (vkd3d_shader_instruction_is_uav_counter(instruction))
vkd3d_shader_scan_record_uav_counter(scan_info, &instruction->src[0].reg);
}
int vkd3d_shader_scan_patch_vertex_count(const struct vkd3d_shader_code *dxbc,
unsigned int *patch_vertex_count)
{
struct vkd3d_shader_instruction instruction;
struct vkd3d_shader_parser parser;
int ret;
if (shader_is_dxil(dxbc->code, dxbc->size))
{
/* TODO */
*patch_vertex_count = 0;
return VKD3D_OK;
}
else
{
if ((ret = vkd3d_shader_parser_init(&parser, dxbc)) < 0)
return ret;
*patch_vertex_count = 0;
while (!shader_sm4_is_end(parser.data, &parser.ptr))
{
shader_sm4_read_instruction(parser.data, &parser.ptr, &instruction);
if (instruction.handler_idx == VKD3DSIH_INVALID)
{
WARN("Encountered unrecognized or invalid instruction.\n");
vkd3d_shader_parser_destroy(&parser);
return VKD3D_ERROR_INVALID_ARGUMENT;
}
if (instruction.handler_idx == VKD3DSIH_DCL_INPUT_CONTROL_POINT_COUNT)
{
*patch_vertex_count = instruction.declaration.count;
break;
}
}
vkd3d_shader_parser_destroy(&parser);
return VKD3D_OK;
}
}
int vkd3d_shader_scan_dxbc(const struct vkd3d_shader_code *dxbc,
struct vkd3d_shader_scan_info *scan_info)
{
@ -578,6 +675,14 @@ int vkd3d_shader_parse_input_signature(const struct vkd3d_shader_code *dxbc,
return shader_parse_input_signature(dxbc->code, dxbc->size, signature);
}
int vkd3d_shader_parse_output_signature(const struct vkd3d_shader_code *dxbc,
struct vkd3d_shader_signature *signature)
{
TRACE("dxbc {%p, %zu}, signature %p.\n", dxbc->code, dxbc->size, signature);
return shader_parse_output_signature(dxbc->code, dxbc->size, signature);
}
struct vkd3d_shader_signature_element *vkd3d_shader_find_signature_element(
const struct vkd3d_shader_signature *signature, const char *semantic_name,
unsigned int semantic_index, unsigned int stream_index)
@ -608,23 +713,38 @@ void vkd3d_shader_free_shader_signature(struct vkd3d_shader_signature *signature
signature->elements = NULL;
}
int vkd3d_shader_supports_dxil(void)
{
#ifdef HAVE_DXIL_SPV
return 1;
#else
return 0;
#endif
}
vkd3d_shader_hash_t vkd3d_shader_hash(const struct vkd3d_shader_code *shader)
{
vkd3d_shader_hash_t h = 0xcbf29ce484222325ull;
vkd3d_shader_hash_t h = hash_fnv1_init();
const uint8_t *code = shader->code;
size_t i, n;
for (i = 0, n = shader->size; i < n; i++)
h = (h * 0x100000001b3ull) ^ code[i];
h = hash_fnv1_iterate_u8(h, code[i]);
return h;
}
uint32_t vkd3d_shader_compile_arguments_select_quirks(
const struct vkd3d_shader_compile_arguments *compile_args, vkd3d_shader_hash_t shader_hash)
{
unsigned int i;
if (compile_args && compile_args->quirks)
{
for (i = 0; i < compile_args->quirks->num_hashes; i++)
if (compile_args->quirks->hashes[i].shader_hash == shader_hash)
return compile_args->quirks->hashes[i].quirks | compile_args->quirks->global_quirks;
return compile_args->quirks->default_quirks | compile_args->quirks->global_quirks;
}
else
return 0;
}
uint64_t vkd3d_shader_get_revision(void)
{
/* This is meant to be bumped every time a change is made to the shader compiler.
* Might get nuked later ...
* It's not immediately useful for invalidating pipeline caches, since that would mostly be covered
* by vkd3d-proton Git hash. */
return 1;
}
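The vkd3d_shader_hash change above swaps the inline arithmetic for hash_fnv1_init() and hash_fnv1_iterate_u8(), whose bodies are not part of this diff. The sketch below assumes they keep the removed expressions as they were: the 64-bit FNV offset basis as the seed, then multiply by the FNV prime and XOR in each byte (FNV-1 order, multiply before XOR). The example_ names are hypothetical stand-ins.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed equivalent of hash_fnv1_init(): the 64-bit FNV offset basis. */
static uint64_t example_fnv1_init(void)
{
    return 0xcbf29ce484222325ull;
}

/* Assumed equivalent of hash_fnv1_iterate_u8(): multiply first, then XOR the byte. */
static uint64_t example_fnv1_iterate_u8(uint64_t h, uint8_t value)
{
    return (h * 0x100000001b3ull) ^ value;
}

/* Hashes a byte range the same way the loop in vkd3d_shader_hash() does. */
static uint64_t example_hash_bytes(const void *data, size_t size)
{
    const uint8_t *bytes = data;
    uint64_t h = example_fnv1_init();
    size_t i;

    for (i = 0; i < size; i++)
        h = example_fnv1_iterate_u8(h, bytes[i]);
    return h;
}
```

If the helpers do preserve this arithmetic, hashes computed before and after the refactor match, so existing shader dump and override file names, which embed the hash, would stay valid.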

View File

@ -342,6 +342,7 @@ enum vkd3d_shader_register_type
VKD3DSPR_DEPTHOUTLE,
VKD3DSPR_RASTERIZER,
VKD3DSPR_STENCILREFOUT,
VKD3DSPR_INNERCOVERAGE,
VKD3DSPR_INVALID = ~0u,
};
@ -780,6 +781,8 @@ void free_shader_desc(struct vkd3d_shader_desc *desc);
int shader_parse_input_signature(const void *dxbc, size_t dxbc_length,
struct vkd3d_shader_signature *signature);
int shader_parse_output_signature(const void *dxbc, size_t dxbc_length,
struct vkd3d_shader_signature *signature);
struct vkd3d_dxbc_compiler;
@ -787,7 +790,8 @@ struct vkd3d_dxbc_compiler *vkd3d_dxbc_compiler_create(const struct vkd3d_shader
const struct vkd3d_shader_desc *shader_desc, uint32_t compiler_options,
const struct vkd3d_shader_interface_info *shader_interface_info,
const struct vkd3d_shader_compile_arguments *compile_args,
const struct vkd3d_shader_scan_info *scan_info);
const struct vkd3d_shader_scan_info *scan_info,
vkd3d_shader_hash_t shader_hash);
int vkd3d_dxbc_compiler_handle_instruction(struct vkd3d_dxbc_compiler *compiler,
const struct vkd3d_shader_instruction *instruction);
int vkd3d_dxbc_compiler_generate_spirv(struct vkd3d_dxbc_compiler *compiler,
@ -797,8 +801,11 @@ void vkd3d_dxbc_compiler_destroy(struct vkd3d_dxbc_compiler *compiler);
void vkd3d_compute_dxbc_checksum(const void *dxbc, size_t size, uint32_t checksum[4]);
void vkd3d_shader_dump_spirv_shader(vkd3d_shader_hash_t hash, const struct vkd3d_shader_code *shader);
void vkd3d_shader_dump_spirv_shader_export(vkd3d_shader_hash_t hash, const struct vkd3d_shader_code *shader,
const char *export);
void vkd3d_shader_dump_shader(vkd3d_shader_hash_t hash, const struct vkd3d_shader_code *shader, const char *ext);
bool vkd3d_shader_replace(vkd3d_shader_hash_t hash, const void **data, size_t *size);
bool vkd3d_shader_replace_export(vkd3d_shader_hash_t hash, const void **data, size_t *size, const char *export);
static inline enum vkd3d_component_type vkd3d_component_type_from_data_type(
enum vkd3d_data_type data_type)
@ -900,27 +907,6 @@ static inline unsigned int vkd3d_compact_swizzle(unsigned int swizzle, unsigned
return compacted_swizzle;
}
struct vkd3d_struct
{
enum vkd3d_shader_structure_type type;
const void *next;
};
#define vkd3d_find_struct(c, t) vkd3d_find_struct_(c, VKD3D_SHADER_STRUCTURE_TYPE_##t)
static inline const void *vkd3d_find_struct_(const struct vkd3d_struct *chain,
enum vkd3d_shader_structure_type type)
{
while (chain)
{
if (chain->type == type)
return chain;
chain = chain->next;
}
return NULL;
}
#define VKD3D_DXBC_MAX_SOURCE_COUNT 6
#define VKD3D_DXBC_HEADER_SIZE (8 * sizeof(uint32_t))
@ -933,6 +919,4 @@ int vkd3d_shader_compile_dxil(const struct vkd3d_shader_code *dxbc,
const struct vkd3d_shader_interface_info *shader_interface_info,
const struct vkd3d_shader_compile_arguments *compiler_args);
vkd3d_shader_hash_t vkd3d_shader_hash(const struct vkd3d_shader_code *shader);
#endif /* __VKD3D_SHADER_PRIVATE_H */


@ -6,11 +6,11 @@ vkd3d_utils_lib = shared_library('vkd3d-proton-utils', vkd3d_utils_src,
dependencies : vkd3d_dep,
include_directories : vkd3d_private_includes,
install : true,
objects : not vkd3d_msvc and vkd3d_platform == 'windows'
objects : not vkd3d_is_msvc and vkd3d_platform == 'windows'
? 'vkd3d-proton-utils.def'
: [],
vs_module_defs : 'vkd3d-proton-utils.def',
version : '2.0.0',
version : '3.0.0',
c_args : '-DVKD3D_UTILS_EXPORTS',
override_options : [ 'c_std='+vkd3d_c_std ])


@ -1,14 +1,14 @@
LIBRARY vkd3d-proton-utils-2.dll
LIBRARY vkd3d-proton-utils-3.dll
EXPORTS
D3D12CreateDevice @101
D3D12GetDebugInterface @102
D3D12CreateRootSignatureDeserializer @107
D3D12CreateVersionedRootSignatureDeserializer @108
D3D12CreateRootSignatureDeserializer
D3D12CreateVersionedRootSignatureDeserializer
D3D12EnableExperimentalFeatures @110
D3D12SerializeRootSignature @115
D3D12SerializeVersionedRootSignature @116
D3D12EnableExperimentalFeatures
D3D12SerializeRootSignature
D3D12SerializeVersionedRootSignature
vkd3d_create_event
vkd3d_wait_event


@ -31,7 +31,6 @@ VKD3D_UTILS_EXPORT HRESULT WINAPI D3D12GetDebugInterface(REFIID iid, void **debu
VKD3D_UTILS_EXPORT HRESULT WINAPI D3D12CreateDevice(IUnknown *adapter,
D3D_FEATURE_LEVEL minimum_feature_level, REFIID iid, void **device)
{
struct vkd3d_optional_instance_extensions_info optional_extensions_info;
struct vkd3d_instance_create_info instance_create_info;
struct vkd3d_device_create_info device_create_info;
@ -55,22 +54,14 @@ VKD3D_UTILS_EXPORT HRESULT WINAPI D3D12CreateDevice(IUnknown *adapter,
if (adapter)
FIXME("Ignoring adapter %p.\n", adapter);
memset(&optional_extensions_info, 0, sizeof(optional_extensions_info));
optional_extensions_info.type = VKD3D_STRUCTURE_TYPE_OPTIONAL_INSTANCE_EXTENSIONS_INFO;
optional_extensions_info.extensions = optional_instance_extensions;
optional_extensions_info.extension_count = ARRAY_SIZE(optional_instance_extensions);
memset(&instance_create_info, 0, sizeof(instance_create_info));
instance_create_info.type = VKD3D_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
instance_create_info.next = &optional_extensions_info;
instance_create_info.pfn_signal_event = vkd3d_signal_event;
instance_create_info.wchar_size = sizeof(WCHAR);
instance_create_info.instance_extensions = instance_extensions;
instance_create_info.instance_extension_count = ARRAY_SIZE(instance_extensions);
instance_create_info.optional_instance_extensions = optional_instance_extensions;
instance_create_info.optional_instance_extension_count = ARRAY_SIZE(optional_instance_extensions);
memset(&device_create_info, 0, sizeof(device_create_info));
device_create_info.type = VKD3D_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
device_create_info.next = NULL;
device_create_info.minimum_feature_level = minimum_feature_level;
device_create_info.instance_create_info = &instance_create_info;
device_create_info.device_extensions = device_extensions;


@ -0,0 +1,498 @@
/*
* Copyright 2021 Hans-Kristian Arntzen for Valve Corporation
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
#define VKD3D_DBG_CHANNEL VKD3D_DBG_CHANNEL_API
#include "vkd3d_private.h"
#define RT_TRACE TRACE
void vkd3d_acceleration_structure_build_info_cleanup(
struct vkd3d_acceleration_structure_build_info *info)
{
if (info->primitive_counts != info->primitive_counts_stack)
vkd3d_free(info->primitive_counts);
if (info->geometries != info->geometries_stack)
vkd3d_free(info->geometries);
if (info->build_range_ptrs != info->build_range_ptr_stack)
vkd3d_free((void *)info->build_range_ptrs);
if (info->build_ranges != info->build_range_stack)
vkd3d_free(info->build_ranges);
}
static VkBuildAccelerationStructureFlagsKHR d3d12_build_flags_to_vk(
D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAGS flags)
{
VkBuildAccelerationStructureFlagsKHR vk_flags = 0;
if (flags & D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_COMPACTION)
vk_flags |= VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_COMPACTION_BIT_KHR;
if (flags & D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_UPDATE)
vk_flags |= VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_UPDATE_BIT_KHR;
if (flags & D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_MINIMIZE_MEMORY)
vk_flags |= VK_BUILD_ACCELERATION_STRUCTURE_LOW_MEMORY_BIT_KHR;
if (flags & D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PREFER_FAST_BUILD)
vk_flags |= VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_BUILD_BIT_KHR;
if (flags & D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PREFER_FAST_TRACE)
vk_flags |= VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_TRACE_BIT_KHR;
return vk_flags;
}
static VkGeometryFlagsKHR d3d12_geometry_flags_to_vk(D3D12_RAYTRACING_GEOMETRY_FLAGS flags)
{
VkGeometryFlagsKHR vk_flags = 0;
if (flags & D3D12_RAYTRACING_GEOMETRY_FLAG_OPAQUE)
vk_flags |= VK_GEOMETRY_OPAQUE_BIT_KHR;
if (flags & D3D12_RAYTRACING_GEOMETRY_FLAG_NO_DUPLICATE_ANYHIT_INVOCATION)
vk_flags |= VK_GEOMETRY_NO_DUPLICATE_ANY_HIT_INVOCATION_BIT_KHR;
return vk_flags;
}
bool vkd3d_acceleration_structure_convert_inputs(const struct d3d12_device *device,
struct vkd3d_acceleration_structure_build_info *info,
const D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_INPUTS *desc)
{
VkAccelerationStructureGeometryTrianglesDataKHR *triangles;
VkAccelerationStructureBuildGeometryInfoKHR *build_info;
VkAccelerationStructureGeometryAabbsDataKHR *aabbs;
const D3D12_RAYTRACING_GEOMETRY_DESC *geom_desc;
bool have_triangles, have_aabbs;
unsigned int i;
RT_TRACE("Converting inputs.\n");
RT_TRACE("=====================\n");
build_info = &info->build_info;
memset(build_info, 0, sizeof(*build_info));
build_info->sType = VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_BUILD_GEOMETRY_INFO_KHR;
if (desc->Type == D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_TOP_LEVEL)
{
build_info->type = VK_ACCELERATION_STRUCTURE_TYPE_TOP_LEVEL_KHR;
RT_TRACE("Top level build.\n");
}
else
{
build_info->type = VK_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL_KHR;
RT_TRACE("Bottom level build.\n");
}
build_info->flags = d3d12_build_flags_to_vk(desc->Flags);
if (desc->Flags & D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PERFORM_UPDATE)
{
RT_TRACE("BUILD_FLAG_PERFORM_UPDATE.\n");
build_info->mode = VK_BUILD_ACCELERATION_STRUCTURE_MODE_UPDATE_KHR;
}
else
build_info->mode = VK_BUILD_ACCELERATION_STRUCTURE_MODE_BUILD_KHR;
info->geometries = info->geometries_stack;
info->primitive_counts = info->primitive_counts_stack;
info->build_ranges = info->build_range_stack;
info->build_range_ptrs = info->build_range_ptr_stack;
if (desc->Type == D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_TOP_LEVEL)
{
memset(info->geometries, 0, sizeof(*info->geometries));
info->geometries[0].sType = VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_GEOMETRY_KHR;
info->geometries[0].geometryType = VK_GEOMETRY_TYPE_INSTANCES_KHR;
info->geometries[0].geometry.instances.sType =
VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_GEOMETRY_INSTANCES_DATA_KHR;
info->geometries[0].geometry.instances.arrayOfPointers =
desc->DescsLayout == D3D12_ELEMENTS_LAYOUT_ARRAY_OF_POINTERS ? VK_TRUE : VK_FALSE;
info->geometries[0].geometry.instances.data.deviceAddress = desc->InstanceDescs;
info->primitive_counts = info->primitive_counts_stack;
info->primitive_counts[0] = desc->NumDescs;
build_info->geometryCount = 1;
RT_TRACE(" ArrayOfPointers: %u.\n",
desc->DescsLayout == D3D12_ELEMENTS_LAYOUT_ARRAY_OF_POINTERS ? 1 : 0);
RT_TRACE(" NumDescs: %u.\n", info->primitive_counts[0]);
}
else
{
have_triangles = false;
have_aabbs = false;
if (desc->NumDescs <= VKD3D_BUILD_INFO_STACK_COUNT)
{
memset(info->geometries, 0, sizeof(*info->geometries) * desc->NumDescs);
memset(info->primitive_counts, 0, sizeof(*info->primitive_counts) * desc->NumDescs);
}
else
{
info->geometries = vkd3d_calloc(desc->NumDescs, sizeof(*info->geometries));
info->primitive_counts = vkd3d_calloc(desc->NumDescs, sizeof(*info->primitive_counts));
info->build_ranges = vkd3d_malloc(desc->NumDescs * sizeof(*info->build_ranges));
info->build_range_ptrs = vkd3d_malloc(desc->NumDescs * sizeof(*info->build_range_ptrs));
}
build_info->geometryCount = desc->NumDescs;
for (i = 0; i < desc->NumDescs; i++)
{
info->geometries[i].sType = VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_GEOMETRY_KHR;
RT_TRACE(" Geom %u:\n", i);
if (desc->DescsLayout == D3D12_ELEMENTS_LAYOUT_ARRAY_OF_POINTERS)
{
geom_desc = desc->ppGeometryDescs[i];
RT_TRACE(" ArrayOfPointers\n");
}
else
{
geom_desc = &desc->pGeometryDescs[i];
RT_TRACE(" PointerToArray\n");
}
info->geometries[i].flags = d3d12_geometry_flags_to_vk(geom_desc->Flags);
RT_TRACE(" Flags = #%x\n", geom_desc->Flags);
switch (geom_desc->Type)
{
case D3D12_RAYTRACING_GEOMETRY_TYPE_TRIANGLES:
/* Runtime validates this. */
if (have_aabbs)
{
ERR("Cannot mix and match geometry types in a BLAS.\n");
return false;
}
have_triangles = true;
info->geometries[i].geometryType = VK_GEOMETRY_TYPE_TRIANGLES_KHR;
triangles = &info->geometries[i].geometry.triangles;
triangles->sType = VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_GEOMETRY_TRIANGLES_DATA_KHR;
triangles->indexData.deviceAddress = geom_desc->Triangles.IndexBuffer;
if (geom_desc->Triangles.IndexFormat != DXGI_FORMAT_UNKNOWN)
{
if (!geom_desc->Triangles.IndexBuffer)
WARN("Application is using IndexBuffer = 0 and IndexFormat != UNKNOWN. Likely application bug.\n");
triangles->indexType =
geom_desc->Triangles.IndexFormat == DXGI_FORMAT_R16_UINT ?
VK_INDEX_TYPE_UINT16 : VK_INDEX_TYPE_UINT32;
info->primitive_counts[i] = geom_desc->Triangles.IndexCount / 3;
RT_TRACE(" Indexed : Index count = %u (%u bits)\n",
geom_desc->Triangles.IndexCount,
triangles->indexType == VK_INDEX_TYPE_UINT16 ? 16 : 32);
RT_TRACE(" Vertex count: %u\n", geom_desc->Triangles.VertexCount);
RT_TRACE(" IBO VA: %"PRIx64".\n", geom_desc->Triangles.IndexBuffer);
}
else
{
info->primitive_counts[i] = geom_desc->Triangles.VertexCount / 3;
triangles->indexType = VK_INDEX_TYPE_NONE_KHR;
RT_TRACE(" Triangle list : Vertex count: %u\n", geom_desc->Triangles.VertexCount);
}
triangles->maxVertex = max(1, geom_desc->Triangles.VertexCount) - 1;
triangles->vertexStride = geom_desc->Triangles.VertexBuffer.StrideInBytes;
triangles->vertexFormat = vkd3d_internal_get_vk_format(device, geom_desc->Triangles.VertexFormat);
triangles->vertexData.deviceAddress = geom_desc->Triangles.VertexBuffer.StartAddress;
triangles->transformData.deviceAddress = geom_desc->Triangles.Transform3x4;
RT_TRACE(" Transform3x4: %s\n", geom_desc->Triangles.Transform3x4 ? "on" : "off");
RT_TRACE(" Vertex format: %s\n", debug_dxgi_format(geom_desc->Triangles.VertexFormat));
RT_TRACE(" VBO VA: %"PRIx64"\n", geom_desc->Triangles.VertexBuffer.StartAddress);
RT_TRACE(" Vertex stride: %"PRIu64" bytes\n", geom_desc->Triangles.VertexBuffer.StrideInBytes);
break;
case D3D12_RAYTRACING_GEOMETRY_TYPE_PROCEDURAL_PRIMITIVE_AABBS:
/* Runtime validates this. */
if (have_triangles)
{
ERR("Cannot mix and match geometry types in a BLAS.\n");
return false;
}
have_aabbs = true;
info->geometries[i].geometryType = VK_GEOMETRY_TYPE_AABBS_KHR;
aabbs = &info->geometries[i].geometry.aabbs;
aabbs->sType = VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_GEOMETRY_AABBS_DATA_KHR;
aabbs->stride = geom_desc->AABBs.AABBs.StrideInBytes;
aabbs->data.deviceAddress = geom_desc->AABBs.AABBs.StartAddress;
info->primitive_counts[i] = geom_desc->AABBs.AABBCount;
RT_TRACE(" AABB stride: %"PRIu64" bytes\n", geom_desc->AABBs.AABBs.StrideInBytes);
break;
default:
FIXME("Unsupported geometry type %u.\n", geom_desc->Type);
return false;
}
RT_TRACE(" Primitive count %u.\n", info->primitive_counts[i]);
}
}
for (i = 0; i < build_info->geometryCount; i++)
{
info->build_range_ptrs[i] = &info->build_ranges[i];
info->build_ranges[i].primitiveCount = info->primitive_counts[i];
info->build_ranges[i].firstVertex = 0;
info->build_ranges[i].primitiveOffset = 0;
info->build_ranges[i].transformOffset = 0;
}
build_info->pGeometries = info->geometries;
RT_TRACE("=====================\n");
return true;
}
static void vkd3d_acceleration_structure_end_barrier(struct d3d12_command_list *list)
{
/* We resolve the query in TRANSFER, but DXR expects UNORDERED_ACCESS. */
const struct vkd3d_vk_device_procs *vk_procs = &list->device->vk_procs;
VkMemoryBarrier barrier;
barrier.sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER;
barrier.pNext = NULL;
barrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
barrier.dstAccessMask = 0;
VK_CALL(vkCmdPipelineBarrier(list->vk_command_buffer,
VK_PIPELINE_STAGE_TRANSFER_BIT,
VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, 0,
1, &barrier, 0, NULL, 0, NULL));
}
static void vkd3d_acceleration_structure_write_postbuild_info(
struct d3d12_command_list *list,
const D3D12_RAYTRACING_ACCELERATION_STRUCTURE_POSTBUILD_INFO_DESC *desc,
VkDeviceSize desc_offset,
VkAccelerationStructureKHR vk_acceleration_structure)
{
const struct vkd3d_vk_device_procs *vk_procs = &list->device->vk_procs;
const struct vkd3d_unique_resource *resource;
VkQueryPool vk_query_pool;
VkQueryType vk_query_type;
uint32_t vk_query_index;
VkDeviceSize stride;
uint32_t type_index;
VkBuffer vk_buffer;
uint32_t offset;
resource = vkd3d_va_map_deref(&list->device->memory_allocator.va_map, desc->DestBuffer);
if (!resource)
{
ERR("Invalid resource.\n");
return;
}
vk_buffer = resource->vk_buffer;
offset = desc->DestBuffer - resource->va;
offset += desc_offset;
if (desc->InfoType == D3D12_RAYTRACING_ACCELERATION_STRUCTURE_POSTBUILD_INFO_COMPACTED_SIZE)
{
vk_query_type = VK_QUERY_TYPE_ACCELERATION_STRUCTURE_COMPACTED_SIZE_KHR;
type_index = VKD3D_QUERY_TYPE_INDEX_RT_COMPACTED_SIZE;
stride = sizeof(uint64_t);
}
else if (desc->InfoType == D3D12_RAYTRACING_ACCELERATION_STRUCTURE_POSTBUILD_INFO_CURRENT_SIZE &&
list->device->device_info.ray_tracing_maintenance1_features.rayTracingMaintenance1)
{
vk_query_type = VK_QUERY_TYPE_ACCELERATION_STRUCTURE_SIZE_KHR;
type_index = VKD3D_QUERY_TYPE_INDEX_RT_CURRENT_SIZE;
stride = sizeof(uint64_t);
}
else if (desc->InfoType == D3D12_RAYTRACING_ACCELERATION_STRUCTURE_POSTBUILD_INFO_SERIALIZATION)
{
vk_query_type = VK_QUERY_TYPE_ACCELERATION_STRUCTURE_SERIALIZATION_SIZE_KHR;
type_index = VKD3D_QUERY_TYPE_INDEX_RT_SERIALIZE_SIZE;
stride = sizeof(uint64_t);
}
else
{
FIXME("Unsupported InfoType %u.\n", desc->InfoType);
/* TODO: Without VK_KHR_ray_tracing_maintenance1, CURRENT_SIZE is something
* we cannot query in Vulkan, so we'll need to keep around a buffer to handle this.
* For now, just clear to 0. */
VK_CALL(vkCmdFillBuffer(list->vk_command_buffer, vk_buffer, offset,
sizeof(uint64_t), 0));
return;
}
if (!d3d12_command_allocator_allocate_query_from_type_index(list->allocator,
type_index, &vk_query_pool, &vk_query_index))
{
ERR("Failed to allocate query.\n");
return;
}
d3d12_command_list_reset_query(list, vk_query_pool, vk_query_index);
VK_CALL(vkCmdWriteAccelerationStructuresPropertiesKHR(list->vk_command_buffer,
1, &vk_acceleration_structure, vk_query_type, vk_query_pool, vk_query_index));
VK_CALL(vkCmdCopyQueryPoolResults(list->vk_command_buffer,
vk_query_pool, vk_query_index, 1,
vk_buffer, offset, stride,
VK_QUERY_RESULT_64_BIT | VK_QUERY_RESULT_WAIT_BIT));
if (desc->InfoType == D3D12_RAYTRACING_ACCELERATION_STRUCTURE_POSTBUILD_INFO_SERIALIZATION)
{
if (list->device->device_info.ray_tracing_maintenance1_features.rayTracingMaintenance1)
{
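type_index = VKD3D_QUERY_TYPE_INDEX_RT_SERIALIZE_SIZE_BOTTOM_LEVEL_POINTERS;
/* The second query must use the bottom-level-pointers query type,
* not the serialization-size type that is still held in vk_query_type. */
vk_query_type = VK_QUERY_TYPE_ACCELERATION_STRUCTURE_SERIALIZATION_BOTTOM_LEVEL_POINTERS_KHR;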
if (!d3d12_command_allocator_allocate_query_from_type_index(list->allocator,
type_index, &vk_query_pool, &vk_query_index))
{
ERR("Failed to allocate query.\n");
return;
}
d3d12_command_list_reset_query(list, vk_query_pool, vk_query_index);
VK_CALL(vkCmdWriteAccelerationStructuresPropertiesKHR(list->vk_command_buffer,
1, &vk_acceleration_structure, vk_query_type, vk_query_pool, vk_query_index));
VK_CALL(vkCmdCopyQueryPoolResults(list->vk_command_buffer,
vk_query_pool, vk_query_index, 1,
vk_buffer, offset + sizeof(uint64_t), stride,
VK_QUERY_RESULT_64_BIT | VK_QUERY_RESULT_WAIT_BIT));
}
else
{
FIXME("NumBottomLevelPointers will always return 0.\n");
VK_CALL(vkCmdFillBuffer(list->vk_command_buffer, vk_buffer, offset + sizeof(uint64_t),
sizeof(uint64_t), 0));
}
}
}
void vkd3d_acceleration_structure_emit_postbuild_info(
struct d3d12_command_list *list,
const D3D12_RAYTRACING_ACCELERATION_STRUCTURE_POSTBUILD_INFO_DESC *desc,
uint32_t count,
const D3D12_GPU_VIRTUAL_ADDRESS *addresses)
{
const struct vkd3d_vk_device_procs *vk_procs = &list->device->vk_procs;
VkAccelerationStructureKHR vk_acceleration_structure;
VkMemoryBarrier barrier;
VkDeviceSize stride;
uint32_t i;
barrier.sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER;
barrier.pNext = NULL;
barrier.srcAccessMask = 0;
barrier.dstAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
/* We resolve the query in TRANSFER, but DXR expects UNORDERED_ACCESS. */
VK_CALL(vkCmdPipelineBarrier(list->vk_command_buffer,
VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
VK_PIPELINE_STAGE_TRANSFER_BIT, 0,
1, &barrier, 0, NULL, 0, NULL));
stride = desc->InfoType == D3D12_RAYTRACING_ACCELERATION_STRUCTURE_POSTBUILD_INFO_SERIALIZATION ?
2 * sizeof(uint64_t) : sizeof(uint64_t);
for (i = 0; i < count; i++)
{
vk_acceleration_structure = vkd3d_va_map_place_acceleration_structure(
&list->device->memory_allocator.va_map, list->device, addresses[i]);
if (vk_acceleration_structure)
vkd3d_acceleration_structure_write_postbuild_info(list, desc, i * stride, vk_acceleration_structure);
else
ERR("Failed to query acceleration structure for VA 0x%"PRIx64".\n", addresses[i]);
}
vkd3d_acceleration_structure_end_barrier(list);
}
void vkd3d_acceleration_structure_emit_immediate_postbuild_info(
struct d3d12_command_list *list, uint32_t count,
const D3D12_RAYTRACING_ACCELERATION_STRUCTURE_POSTBUILD_INFO_DESC *desc,
VkAccelerationStructureKHR vk_acceleration_structure)
{
/* In D3D12 we are supposed to be able to emit without an explicit barrier,
* but we need to emit them for Vulkan. */
const struct vkd3d_vk_device_procs *vk_procs = &list->device->vk_procs;
VkMemoryBarrier barrier;
uint32_t i;
barrier.sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER;
barrier.pNext = NULL;
barrier.srcAccessMask = VK_ACCESS_ACCELERATION_STRUCTURE_WRITE_BIT_KHR;
/* The query accesses STRUCTURE_READ_BIT in BUILD_BIT stage. */
barrier.dstAccessMask = VK_ACCESS_ACCELERATION_STRUCTURE_READ_BIT_KHR | VK_ACCESS_TRANSFER_WRITE_BIT;
/* Writing to the result buffer is supposed to happen in UNORDERED_ACCESS on DXR for
* some bizarre reason, so we have to satisfy a transfer barrier.
* Have to basically do a full stall to make this work ... */
VK_CALL(vkCmdPipelineBarrier(list->vk_command_buffer,
VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_KHR | VK_PIPELINE_STAGE_TRANSFER_BIT, 0,
1, &barrier, 0, NULL, 0, NULL));
/* Could optimize a bit by batching more aggressively, but no idea if it's going to help in practice. */
for (i = 0; i < count; i++)
vkd3d_acceleration_structure_write_postbuild_info(list, &desc[i], 0, vk_acceleration_structure);
vkd3d_acceleration_structure_end_barrier(list);
}
static bool convert_copy_mode(
D3D12_RAYTRACING_ACCELERATION_STRUCTURE_COPY_MODE mode,
VkCopyAccelerationStructureModeKHR *vk_mode)
{
switch (mode)
{
case D3D12_RAYTRACING_ACCELERATION_STRUCTURE_COPY_MODE_CLONE:
*vk_mode = VK_COPY_ACCELERATION_STRUCTURE_MODE_CLONE_KHR;
return true;
case D3D12_RAYTRACING_ACCELERATION_STRUCTURE_COPY_MODE_COMPACT:
*vk_mode = VK_COPY_ACCELERATION_STRUCTURE_MODE_COMPACT_KHR;
return true;
default:
FIXME("Unsupported RTAS copy mode #%x.\n", mode);
return false;
}
}
void vkd3d_acceleration_structure_copy(
struct d3d12_command_list *list,
D3D12_GPU_VIRTUAL_ADDRESS dst, D3D12_GPU_VIRTUAL_ADDRESS src,
D3D12_RAYTRACING_ACCELERATION_STRUCTURE_COPY_MODE mode)
{
const struct vkd3d_vk_device_procs *vk_procs = &list->device->vk_procs;
VkAccelerationStructureKHR dst_as, src_as;
VkCopyAccelerationStructureInfoKHR info;
dst_as = vkd3d_va_map_place_acceleration_structure(&list->device->memory_allocator.va_map, list->device, dst);
if (dst_as == VK_NULL_HANDLE)
{
ERR("Invalid dst address #%"PRIx64" for RTAS copy.\n", dst);
return;
}
src_as = vkd3d_va_map_place_acceleration_structure(&list->device->memory_allocator.va_map, list->device, src);
if (src_as == VK_NULL_HANDLE)
{
ERR("Invalid src address #%"PRIx64" for RTAS copy.\n", src);
return;
}
info.sType = VK_STRUCTURE_TYPE_COPY_ACCELERATION_STRUCTURE_INFO_KHR;
info.pNext = NULL;
info.dst = dst_as;
info.src = src_as;
if (convert_copy_mode(mode, &info.mode))
VK_CALL(vkCmdCopyAccelerationStructureKHR(list->vk_command_buffer, &info));
}

655
libs/vkd3d/breadcrumbs.c Normal file

@ -0,0 +1,655 @@
/*
* Copyright 2022 Hans-Kristian Arntzen for Valve Corporation
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
#define VKD3D_DBG_CHANNEL VKD3D_DBG_CHANNEL_API
#include "vkd3d_private.h"
#include "vkd3d_debug.h"
#include "vkd3d_common.h"
#include <assert.h>
#include <stdio.h>
/* Just allocate everything up front. This only consumes host memory anyways. */
#define MAX_COMMAND_LISTS (32 * 1024)
/* Questionable on 32-bit (the context takes the low 15 bits, leaving only 17 bits for the counter),
* but we don't really care. */
#define NV_ENCODE_CHECKPOINT(context, counter) ((void*) ((uintptr_t)(context) + (uintptr_t)MAX_COMMAND_LISTS * (counter)))
#define NV_CHECKPOINT_CONTEXT(ptr) ((uint32_t)((uintptr_t)(ptr) % MAX_COMMAND_LISTS))
#define NV_CHECKPOINT_COUNTER(ptr) ((uint32_t)((uintptr_t)(ptr) / MAX_COMMAND_LISTS))
static const char *vkd3d_breadcrumb_command_type_to_str(enum vkd3d_breadcrumb_command_type type)
{
switch (type)
{
case VKD3D_BREADCRUMB_COMMAND_SET_TOP_MARKER:
return "top_marker";
case VKD3D_BREADCRUMB_COMMAND_SET_BOTTOM_MARKER:
return "bottom_marker";
case VKD3D_BREADCRUMB_COMMAND_SET_SHADER_HASH:
return "set_shader_hash";
case VKD3D_BREADCRUMB_COMMAND_DRAW:
return "draw";
case VKD3D_BREADCRUMB_COMMAND_DRAW_INDEXED:
return "draw_indexed";
case VKD3D_BREADCRUMB_COMMAND_DISPATCH:
return "dispatch";
case VKD3D_BREADCRUMB_COMMAND_EXECUTE_INDIRECT:
return "execute_indirect";
case VKD3D_BREADCRUMB_COMMAND_EXECUTE_INDIRECT_TEMPLATE:
return "execute_indirect_template";
case VKD3D_BREADCRUMB_COMMAND_COPY:
return "copy";
case VKD3D_BREADCRUMB_COMMAND_RESOLVE:
return "resolve";
case VKD3D_BREADCRUMB_COMMAND_WBI:
return "wbi";
case VKD3D_BREADCRUMB_COMMAND_RESOLVE_QUERY:
return "resolve_query";
case VKD3D_BREADCRUMB_COMMAND_GATHER_VIRTUAL_QUERY:
return "gather_virtual_query";
case VKD3D_BREADCRUMB_COMMAND_BUILD_RTAS:
return "build_rtas";
case VKD3D_BREADCRUMB_COMMAND_COPY_RTAS:
return "copy_rtas";
case VKD3D_BREADCRUMB_COMMAND_EMIT_RTAS_POSTBUILD:
return "emit_rtas_postbuild";
case VKD3D_BREADCRUMB_COMMAND_TRACE_RAYS:
return "trace_rays";
case VKD3D_BREADCRUMB_COMMAND_BARRIER:
return "barrier";
case VKD3D_BREADCRUMB_COMMAND_AUX32:
return "aux32";
case VKD3D_BREADCRUMB_COMMAND_AUX64:
return "aux64";
case VKD3D_BREADCRUMB_COMMAND_VBO:
return "vbo";
case VKD3D_BREADCRUMB_COMMAND_IBO:
return "ibo";
case VKD3D_BREADCRUMB_COMMAND_ROOT_DESC:
return "root_desc";
case VKD3D_BREADCRUMB_COMMAND_ROOT_CONST:
return "root_const";
case VKD3D_BREADCRUMB_COMMAND_TAG:
return "tag";
default:
return "?";
}
}
HRESULT vkd3d_breadcrumb_tracer_init(struct vkd3d_breadcrumb_tracer *tracer, struct d3d12_device *device)
{
const struct vkd3d_vk_device_procs *vk_procs = &device->vk_procs;
D3D12_HEAP_PROPERTIES heap_properties;
D3D12_RESOURCE_DESC1 resource_desc;
VkMemoryPropertyFlags memory_props;
HRESULT hr;
int rc;
memset(tracer, 0, sizeof(*tracer));
if ((rc = pthread_mutex_init(&tracer->lock, NULL)))
return hresult_from_errno(rc);
if (device->vk_info.AMD_buffer_marker)
{
INFO("Enabling AMD_buffer_marker breadcrumbs.\n");
memset(&resource_desc, 0, sizeof(resource_desc));
resource_desc.Width = MAX_COMMAND_LISTS * sizeof(struct vkd3d_breadcrumb_counter);
resource_desc.Height = 1;
resource_desc.DepthOrArraySize = 1;
resource_desc.MipLevels = 1;
resource_desc.Format = DXGI_FORMAT_UNKNOWN;
resource_desc.SampleDesc.Count = 1;
resource_desc.SampleDesc.Quality = 0;
resource_desc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;
resource_desc.Flags = D3D12_RESOURCE_FLAG_NONE;
if (FAILED(hr = vkd3d_create_buffer(device, &heap_properties, D3D12_HEAP_FLAG_ALLOW_ONLY_BUFFERS,
&resource_desc, &tracer->host_buffer)))
{
goto err;
}
memory_props = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT |
VK_MEMORY_PROPERTY_HOST_CACHED_BIT;
/* If device faults in the middle of execution we will never get the chance to flush device caches.
* Make sure that breadcrumbs are always written directly out.
* This is the primary use case for the device coherent/uncached extension after all ...
* Don't make this a hard requirement since buffer markers might be implicitly coherent on some
* implementations (Turnip?). */
if (device->device_info.device_coherent_memory_features_amd.deviceCoherentMemory)
{
memory_props |= VK_MEMORY_PROPERTY_DEVICE_COHERENT_BIT_AMD |
VK_MEMORY_PROPERTY_DEVICE_UNCACHED_BIT_AMD;
}
if (FAILED(hr = vkd3d_allocate_buffer_memory(device, tracer->host_buffer,
memory_props, &tracer->host_buffer_memory)))
{
goto err;
}
if (VK_CALL(vkMapMemory(device->vk_device, tracer->host_buffer_memory.vk_memory,
0, VK_WHOLE_SIZE,
0, (void**)&tracer->mapped)) != VK_SUCCESS)
{
hr = E_OUTOFMEMORY;
goto err;
}
memset(tracer->mapped, 0, sizeof(*tracer->mapped) * MAX_COMMAND_LISTS);
}
else if (device->vk_info.NV_device_diagnostic_checkpoints)
{
INFO("Enabling NV_device_diagnostic_checkpoints breadcrumbs.\n");
}
else
{
ERR("Breadcrumbs require support for either AMD_buffer_marker or NV_device_diagnostic_checkpoints.\n");
hr = E_FAIL;
goto err;
}
tracer->trace_contexts = vkd3d_calloc(MAX_COMMAND_LISTS, sizeof(*tracer->trace_contexts));
tracer->trace_context_index = 0;
return S_OK;
err:
vkd3d_breadcrumb_tracer_cleanup(tracer, device);
return hr;
}
void vkd3d_breadcrumb_tracer_cleanup(struct vkd3d_breadcrumb_tracer *tracer, struct d3d12_device *device)
{
const struct vkd3d_vk_device_procs *vk_procs = &device->vk_procs;
if (device->vk_info.AMD_buffer_marker)
{
VK_CALL(vkDestroyBuffer(device->vk_device, tracer->host_buffer, NULL));
vkd3d_free_device_memory(device, &tracer->host_buffer_memory);
}
vkd3d_free(tracer->trace_contexts);
pthread_mutex_destroy(&tracer->lock);
}
unsigned int vkd3d_breadcrumb_tracer_allocate_command_list(struct vkd3d_breadcrumb_tracer *tracer,
struct d3d12_command_list *list, struct d3d12_command_allocator *allocator)
{
unsigned int index = UINT32_MAX;
unsigned int iteration_count;
int rc;
if ((rc = pthread_mutex_lock(&tracer->lock)))
{
ERR("Failed to lock mutex, rc %d.\n", rc);
return UINT32_MAX;
}
/* Since this is a ring, this is extremely likely to succeed on first attempt. */
for (iteration_count = 0; iteration_count < MAX_COMMAND_LISTS; iteration_count++)
{
tracer->trace_context_index = (tracer->trace_context_index + 1) % MAX_COMMAND_LISTS;
if (!tracer->trace_contexts[tracer->trace_context_index].locked)
{
tracer->trace_contexts[tracer->trace_context_index].locked = 1;
index = tracer->trace_context_index;
break;
}
}
pthread_mutex_unlock(&tracer->lock);
if (index == UINT32_MAX)
{
ERR("Failed to allocate new index for command list.\n");
return index;
}
TRACE("Allocating breadcrumb context %u for list %p.\n", index, list);
list->breadcrumb_context_index = index;
/* Need to clear this on a fresh allocation rather than release, since we can end up releasing a command list
* before we observe the device lost. */
tracer->trace_contexts[index].command_count = 0;
tracer->trace_contexts[index].counter = 0;
if (list->device->vk_info.AMD_buffer_marker)
memset(&tracer->mapped[index], 0, sizeof(tracer->mapped[index]));
vkd3d_array_reserve((void**)&allocator->breadcrumb_context_indices, &allocator->breadcrumb_context_index_size,
allocator->breadcrumb_context_index_count + 1,
sizeof(*allocator->breadcrumb_context_indices));
allocator->breadcrumb_context_indices[allocator->breadcrumb_context_index_count++] = index;
return index;
}
/* Command allocator keeps a list of allocated breadcrumb command lists. */
void vkd3d_breadcrumb_tracer_release_command_lists(struct vkd3d_breadcrumb_tracer *tracer,
const unsigned int *indices, size_t indices_count)
{
unsigned int index;
size_t i;
int rc;
if (!indices_count)
return;
if ((rc = pthread_mutex_lock(&tracer->lock)))
{
ERR("Failed to lock mutex, rc %d.\n", rc);
return;
}
for (i = 0; i < indices_count; i++)
{
index = indices[i];
if (index != UINT32_MAX)
tracer->trace_contexts[index].locked = 0;
TRACE("Releasing breadcrumb context %u.\n", index);
}
pthread_mutex_unlock(&tracer->lock);
}
static void vkd3d_breadcrumb_tracer_report_command_list(
const struct vkd3d_breadcrumb_command_list_trace_context *context,
uint32_t begin_marker,
uint32_t end_marker)
{
const struct vkd3d_breadcrumb_command *cmd;
bool observed_begin_cmd = false;
bool observed_end_cmd = false;
unsigned int i;
if (end_marker == 0)
{
ERR(" ===== Potential crash region BEGIN (make sure RADV_DEBUG=syncshaders is used for maximum accuracy) =====\n");
observed_begin_cmd = true;
}
/* We can assume that possible culprit commands lie between end_marker
* (the last BOTTOM_OF_PIPE marker observed) and begin_marker
* (the last TOP_OF_PIPE marker observed). */
for (i = 0; i < context->command_count; i++)
{
cmd = &context->commands[i];
/* If there is a command which sets TOP_OF_PIPE, but we haven't observed the marker yet,
* the command processor hasn't gotten there yet (most likely ...), so that should be the
* natural end-point. */
if (!observed_end_cmd &&
cmd->type == VKD3D_BREADCRUMB_COMMAND_SET_TOP_MARKER &&
cmd->count > begin_marker)
{
observed_end_cmd = true;
ERR(" ===== Potential crash region END =====\n");
}
if (cmd->type == VKD3D_BREADCRUMB_COMMAND_AUX32)
{
ERR(" Set arg: %u (#%x)\n", cmd->word_32bit, cmd->word_32bit);
}
else if (cmd->type == VKD3D_BREADCRUMB_COMMAND_AUX64)
{
ERR(" Set arg: %"PRIu64" (#%"PRIx64")\n", cmd->word_64bit, cmd->word_64bit);
}
else if (cmd->type == VKD3D_BREADCRUMB_COMMAND_TAG)
{
ERR(" Tag: %s\n", cmd->tag);
}
else
{
ERR(" Command: %s\n", vkd3d_breadcrumb_command_type_to_str(cmd->type));
switch (cmd->type)
{
case VKD3D_BREADCRUMB_COMMAND_SET_TOP_MARKER:
case VKD3D_BREADCRUMB_COMMAND_SET_BOTTOM_MARKER:
ERR(" marker: %u\n", cmd->count);
break;
case VKD3D_BREADCRUMB_COMMAND_SET_SHADER_HASH:
ERR(" hash: %016"PRIx64", stage: %x\n", cmd->shader.hash, cmd->shader.stage);
break;
default:
break;
}
}
/* We have observed the BOTTOM_OF_PIPE marker for this command, so it is known to have completed.
* Some command after this signal is at fault. */
if (!observed_begin_cmd &&
cmd->type == VKD3D_BREADCRUMB_COMMAND_SET_BOTTOM_MARKER &&
cmd->count == end_marker)
{
observed_begin_cmd = true;
ERR(" ===== Potential crash region BEGIN (make sure RADV_DEBUG=syncshaders is used for maximum accuracy) =====\n");
}
}
}
static void vkd3d_breadcrumb_tracer_report_command_list_amd(struct vkd3d_breadcrumb_tracer *tracer,
unsigned int context_index)
{
const struct vkd3d_breadcrumb_command_list_trace_context *context;
uint32_t begin_marker;
uint32_t end_marker;
context = &tracer->trace_contexts[context_index];
/* Unused, cannot be the cause. */
if (context->counter == 0)
return;
begin_marker = tracer->mapped[context_index].begin_marker;
end_marker = tracer->mapped[context_index].end_marker;
/* Never executed, cannot be the cause. */
if (begin_marker == 0 && end_marker == 0)
return;
/* Successfully retired, cannot be the cause. */
if (begin_marker == UINT32_MAX && end_marker == UINT32_MAX)
return;
/* Edge case if we re-submitted a command list,
* but it ends up crashing before we hit any BOTTOM_OF_PIPE
* marker. Normalize the inputs such that end_marker <= begin_marker. */
if (begin_marker > 0 && end_marker == UINT32_MAX)
end_marker = 0;
ERR("Found pending command list context %u in executable state, TOP_OF_PIPE marker %u, BOTTOM_OF_PIPE marker %u.\n",
context_index, begin_marker, end_marker);
vkd3d_breadcrumb_tracer_report_command_list(context, begin_marker, end_marker);
ERR("Done analyzing command list.\n");
}
static void vkd3d_breadcrumb_tracer_report_queue_nv(struct vkd3d_breadcrumb_tracer *tracer,
struct d3d12_device *device,
VkQueue vk_queue)
{
const struct vkd3d_vk_device_procs *vk_procs = &device->vk_procs;
uint32_t begin_marker, end_marker;
uint32_t checkpoint_context_index;
VkCheckpointDataNV *checkpoints;
uint32_t checkpoint_marker;
uint32_t checkpoint_count;
uint32_t context_index;
uint32_t i;
VK_CALL(vkGetQueueCheckpointDataNV(vk_queue, &checkpoint_count, NULL));
if (checkpoint_count == 0)
return;
checkpoints = vkd3d_calloc(checkpoint_count, sizeof(VkCheckpointDataNV));
for (i = 0; i < checkpoint_count; i++)
checkpoints[i].sType = VK_STRUCTURE_TYPE_CHECKPOINT_DATA_NV;
VK_CALL(vkGetQueueCheckpointDataNV(vk_queue, &checkpoint_count, checkpoints));
context_index = UINT32_MAX;
begin_marker = 0;
end_marker = 0;
for (i = 0; i < checkpoint_count; i++)
{
checkpoint_context_index = NV_CHECKPOINT_CONTEXT(checkpoints[i].pCheckpointMarker);
checkpoint_marker = NV_CHECKPOINT_COUNTER(checkpoints[i].pCheckpointMarker);
if (context_index != checkpoint_context_index && context_index != UINT32_MAX)
{
FIXME("Markers have different contexts. Execution is likely split across multiple command buffers?\n");
context_index = UINT32_MAX;
break;
}
context_index = checkpoint_context_index;
if (checkpoints[i].stage == VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT && checkpoint_marker > begin_marker)
{
/* We want to find the latest TOP_OF_PIPE_BIT. Then we prove that command processor got to that point. */
begin_marker = checkpoint_marker;
}
else if (checkpoints[i].stage == VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT && checkpoint_marker > end_marker)
{
/* We want to find the latest BOTTOM_OF_PIPE_BIT. Then we prove that we got that far. */
end_marker = checkpoint_marker;
}
else if (checkpoints[i].stage != VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT &&
checkpoints[i].stage != VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT)
{
FIXME("Unexpected checkpoint pipeline stage. #%x\n", checkpoints[i].stage);
context_index = UINT32_MAX;
break;
}
}
if (context_index != UINT32_MAX && begin_marker != 0 && end_marker != 0 && end_marker != UINT32_MAX)
{
ERR("Found pending command list context %u in executable state, TOP_OF_PIPE marker %u, BOTTOM_OF_PIPE marker %u.\n",
context_index, begin_marker, end_marker);
vkd3d_breadcrumb_tracer_report_command_list(&tracer->trace_contexts[context_index], begin_marker, end_marker);
ERR("Done analyzing command list.\n");
}
vkd3d_free(checkpoints);
}
void vkd3d_breadcrumb_tracer_report_device_lost(struct vkd3d_breadcrumb_tracer *tracer,
struct d3d12_device *device)
{
struct vkd3d_queue_family_info *queue_family_info;
VkQueue vk_queue;
unsigned int i;
ERR("Device lost observed, analyzing breadcrumbs ...\n");
if (device->vk_info.AMD_buffer_marker)
{
/* AMD path, buffer marker. */
for (i = 0; i < MAX_COMMAND_LISTS; i++)
vkd3d_breadcrumb_tracer_report_command_list_amd(tracer, i);
}
else if (device->vk_info.NV_device_diagnostic_checkpoints)
{
/* vkGetQueueCheckpointDataNV does not require us to synchronize access to the queue. */
queue_family_info = d3d12_device_get_vkd3d_queue_family(device, D3D12_COMMAND_LIST_TYPE_DIRECT);
for (i = 0; i < queue_family_info->queue_count; i++)
{
vk_queue = queue_family_info->queues[i]->vk_queue;
vkd3d_breadcrumb_tracer_report_queue_nv(tracer, device, vk_queue);
}
queue_family_info = d3d12_device_get_vkd3d_queue_family(device, D3D12_COMMAND_LIST_TYPE_COMPUTE);
for (i = 0; i < queue_family_info->queue_count; i++)
{
vk_queue = queue_family_info->queues[i]->vk_queue;
vkd3d_breadcrumb_tracer_report_queue_nv(tracer, device, vk_queue);
}
queue_family_info = d3d12_device_get_vkd3d_queue_family(device, D3D12_COMMAND_LIST_TYPE_COPY);
for (i = 0; i < queue_family_info->queue_count; i++)
{
vk_queue = queue_family_info->queues[i]->vk_queue;
vkd3d_breadcrumb_tracer_report_queue_nv(tracer, device, vk_queue);
}
}
ERR("Done analyzing breadcrumbs ...\n");
}
void vkd3d_breadcrumb_tracer_begin_command_list(struct d3d12_command_list *list)
{
struct vkd3d_breadcrumb_tracer *breadcrumb_tracer = &list->device->breadcrumb_tracer;
const struct vkd3d_vk_device_procs *vk_procs = &list->device->vk_procs;
struct vkd3d_breadcrumb_command_list_trace_context *trace;
unsigned int context = list->breadcrumb_context_index;
struct vkd3d_breadcrumb_command cmd;
if (context == UINT32_MAX)
return;
trace = &breadcrumb_tracer->trace_contexts[context];
trace->counter++;
cmd.count = trace->counter;
cmd.type = VKD3D_BREADCRUMB_COMMAND_SET_TOP_MARKER;
vkd3d_breadcrumb_tracer_add_command(list, &cmd);
if (list->device->vk_info.AMD_buffer_marker)
{
VK_CALL(vkCmdWriteBufferMarkerAMD(list->vk_command_buffer,
VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
breadcrumb_tracer->host_buffer,
context * sizeof(struct vkd3d_breadcrumb_counter) +
offsetof(struct vkd3d_breadcrumb_counter, begin_marker),
trace->counter));
}
else if (list->device->vk_info.NV_device_diagnostic_checkpoints)
{
/* A checkpoint is implicitly a top and bottom marker. */
cmd.count = trace->counter;
cmd.type = VKD3D_BREADCRUMB_COMMAND_SET_BOTTOM_MARKER;
vkd3d_breadcrumb_tracer_add_command(list, &cmd);
VK_CALL(vkCmdSetCheckpointNV(list->vk_command_buffer, NV_ENCODE_CHECKPOINT(context, trace->counter)));
}
}
void vkd3d_breadcrumb_tracer_add_command(struct d3d12_command_list *list,
const struct vkd3d_breadcrumb_command *command)
{
struct vkd3d_breadcrumb_tracer *breadcrumb_tracer = &list->device->breadcrumb_tracer;
struct vkd3d_breadcrumb_command_list_trace_context *trace;
unsigned int context = list->breadcrumb_context_index;
if (context == UINT32_MAX)
return;
trace = &breadcrumb_tracer->trace_contexts[context];
TRACE("Adding command (%s) to context %u.\n",
vkd3d_breadcrumb_command_type_to_str(command->type), context);
vkd3d_array_reserve((void**)&trace->commands, &trace->command_size,
trace->command_count + 1, sizeof(*trace->commands));
trace->commands[trace->command_count++] = *command;
}
void vkd3d_breadcrumb_tracer_signal(struct d3d12_command_list *list)
{
struct vkd3d_breadcrumb_tracer *breadcrumb_tracer = &list->device->breadcrumb_tracer;
const struct vkd3d_vk_device_procs *vk_procs = &list->device->vk_procs;
struct vkd3d_breadcrumb_command_list_trace_context *trace;
unsigned int context = list->breadcrumb_context_index;
struct vkd3d_breadcrumb_command cmd;
if (context == UINT32_MAX)
return;
trace = &breadcrumb_tracer->trace_contexts[context];
if (list->device->vk_info.AMD_buffer_marker)
{
cmd.type = VKD3D_BREADCRUMB_COMMAND_SET_BOTTOM_MARKER;
cmd.count = trace->counter;
vkd3d_breadcrumb_tracer_add_command(list, &cmd);
TRACE("Breadcrumb signal bottom-of-pipe context %u -> %u\n", context, cmd.count);
VK_CALL(vkCmdWriteBufferMarkerAMD(list->vk_command_buffer,
VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
breadcrumb_tracer->host_buffer,
context * sizeof(struct vkd3d_breadcrumb_counter) +
offsetof(struct vkd3d_breadcrumb_counter, end_marker),
trace->counter));
trace->counter++;
cmd.type = VKD3D_BREADCRUMB_COMMAND_SET_TOP_MARKER;
cmd.count = trace->counter;
vkd3d_breadcrumb_tracer_add_command(list, &cmd);
TRACE("Breadcrumb signal top-of-pipe context %u -> %u\n", context, cmd.count);
VK_CALL(vkCmdWriteBufferMarkerAMD(list->vk_command_buffer,
VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
breadcrumb_tracer->host_buffer,
context * sizeof(struct vkd3d_breadcrumb_counter) +
offsetof(struct vkd3d_breadcrumb_counter, begin_marker),
trace->counter));
}
else if (list->device->vk_info.NV_device_diagnostic_checkpoints)
{
trace->counter++;
cmd.type = VKD3D_BREADCRUMB_COMMAND_SET_TOP_MARKER;
cmd.count = trace->counter;
vkd3d_breadcrumb_tracer_add_command(list, &cmd);
TRACE("Breadcrumb signal top-of-pipe context %u -> %u\n", context, cmd.count);
cmd.type = VKD3D_BREADCRUMB_COMMAND_SET_BOTTOM_MARKER;
cmd.count = trace->counter;
vkd3d_breadcrumb_tracer_add_command(list, &cmd);
TRACE("Breadcrumb signal bottom-of-pipe context %u -> %u\n", context, cmd.count);
VK_CALL(vkCmdSetCheckpointNV(list->vk_command_buffer, NV_ENCODE_CHECKPOINT(context, trace->counter)));
}
}
void vkd3d_breadcrumb_tracer_end_command_list(struct d3d12_command_list *list)
{
struct vkd3d_breadcrumb_tracer *breadcrumb_tracer = &list->device->breadcrumb_tracer;
const struct vkd3d_vk_device_procs *vk_procs = &list->device->vk_procs;
struct vkd3d_breadcrumb_command_list_trace_context *trace;
unsigned int context = list->breadcrumb_context_index;
struct vkd3d_breadcrumb_command cmd;
if (context == UINT32_MAX)
return;
trace = &breadcrumb_tracer->trace_contexts[context];
trace->counter = UINT32_MAX;
if (list->device->vk_info.AMD_buffer_marker)
{
VK_CALL(vkCmdWriteBufferMarkerAMD(list->vk_command_buffer,
VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
breadcrumb_tracer->host_buffer,
context * sizeof(struct vkd3d_breadcrumb_counter) +
offsetof(struct vkd3d_breadcrumb_counter, begin_marker),
trace->counter));
VK_CALL(vkCmdWriteBufferMarkerAMD(list->vk_command_buffer,
VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
breadcrumb_tracer->host_buffer,
context * sizeof(struct vkd3d_breadcrumb_counter) +
offsetof(struct vkd3d_breadcrumb_counter, end_marker),
trace->counter));
}
else if (list->device->vk_info.NV_device_diagnostic_checkpoints)
{
VK_CALL(vkCmdSetCheckpointNV(list->vk_command_buffer, NV_ENCODE_CHECKPOINT(context, trace->counter)));
}
cmd.count = trace->counter;
cmd.type = VKD3D_BREADCRUMB_COMMAND_SET_TOP_MARKER;
vkd3d_breadcrumb_tracer_add_command(list, &cmd);
cmd.type = VKD3D_BREADCRUMB_COMMAND_SET_BOTTOM_MARKER;
vkd3d_breadcrumb_tracer_add_command(list, &cmd);
}
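
The AMD report path above decides whether a context is worth reporting purely from the two marker words read back from the host buffer. As a minimal sketch of that decision, assuming hypothetical names (`breadcrumb_state`, `breadcrumb_classify`) that do not exist in vkd3d-proton, the checks can be isolated as below. The `context->counter == 0` early-out for unused contexts is left out of the sketch.

```c
#include <stdint.h>

/* Hypothetical enum for illustration only; not part of vkd3d-proton. */
enum breadcrumb_state
{
    BREADCRUMB_STATE_NEVER_EXECUTED,
    BREADCRUMB_STATE_RETIRED,
    BREADCRUMB_STATE_PENDING,
};

/* Classify one context from the marker values read back from the host buffer.
 * When PENDING is returned, *end_marker has been normalized in the same way
 * as in vkd3d_breadcrumb_tracer_report_command_list_amd(). */
static enum breadcrumb_state breadcrumb_classify(uint32_t begin_marker, uint32_t *end_marker)
{
    /* Neither marker was ever written: the list never started executing. */
    if (begin_marker == 0 && *end_marker == 0)
        return BREADCRUMB_STATE_NEVER_EXECUTED;

    /* Both markers hold the end-of-list sentinel: the list retired cleanly. */
    if (begin_marker == UINT32_MAX && *end_marker == UINT32_MAX)
        return BREADCRUMB_STATE_RETIRED;

    /* Resubmitted list that stopped before its first BOTTOM_OF_PIPE write:
     * end_marker still holds the sentinel from the previous run. */
    if (begin_marker > 0 && *end_marker == UINT32_MAX)
        *end_marker = 0;

    return BREADCRUMB_STATE_PENDING;
}
```

For example, a resubmitted list that hangs before its first BOTTOM_OF_PIPE write reads back as begin_marker 3 and end_marker UINT32_MAX; the sketch reports it as pending with end_marker normalized to 0, which is the case where the report starts the crash region at the very first command.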

1787
libs/vkd3d/bundle.c Normal file

File diff suppressed because it is too large


@ -459,7 +459,12 @@ static void STDMETHODCALLTYPE d3d12_command_list_RSSetShadingRateImage_profiled(
COMMAND_LIST_PROFILED_CALL(RSSetShadingRateImage, iface, image);
}
static CONST_VTBL struct ID3D12GraphicsCommandList5Vtbl d3d12_command_list_vtbl_profiled =
static void STDMETHODCALLTYPE d3d12_command_list_DispatchMesh_profiled(d3d12_command_list_iface *iface, UINT x, UINT y, UINT z)
{
COMMAND_LIST_PROFILED_CALL(DispatchMesh, iface, x, y, z);
}
static CONST_VTBL struct ID3D12GraphicsCommandList6Vtbl d3d12_command_list_vtbl_profiled =
{
/* IUnknown methods */
d3d12_command_list_QueryInterface,
@ -469,7 +474,7 @@ static CONST_VTBL struct ID3D12GraphicsCommandList5Vtbl d3d12_command_list_vtbl_
d3d12_command_list_GetPrivateData,
d3d12_command_list_SetPrivateData,
d3d12_command_list_SetPrivateDataInterface,
d3d12_command_list_SetName,
(void *)d3d12_object_SetName,
/* ID3D12DeviceChild methods */
d3d12_command_list_GetDevice,
/* ID3D12CommandList methods */
@ -550,6 +555,8 @@ static CONST_VTBL struct ID3D12GraphicsCommandList5Vtbl d3d12_command_list_vtbl_
/* ID3D12GraphicsCommandList5 methods */
d3d12_command_list_RSSetShadingRate_profiled,
d3d12_command_list_RSSetShadingRateImage_profiled,
/* ID3D12GraphicsCommandList6 methods */
d3d12_command_list_DispatchMesh_profiled,
};
#endif


@ -0,0 +1,116 @@
/*
* Copyright 2021 NVIDIA Corporation
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
#define VKD3D_DBG_CHANNEL VKD3D_DBG_CHANNEL_API
#include "vkd3d_private.h"
static inline struct d3d12_command_list *d3d12_command_list_from_ID3D12GraphicsCommandListExt(ID3D12GraphicsCommandListExt *iface)
{
return CONTAINING_RECORD(iface, struct d3d12_command_list, ID3D12GraphicsCommandListExt_iface);
}
extern ULONG STDMETHODCALLTYPE d3d12_command_list_AddRef(d3d12_command_list_iface *iface);
ULONG STDMETHODCALLTYPE d3d12_command_list_vkd3d_ext_AddRef(ID3D12GraphicsCommandListExt *iface)
{
struct d3d12_command_list *command_list = d3d12_command_list_from_ID3D12GraphicsCommandListExt(iface);
return d3d12_command_list_AddRef(&command_list->ID3D12GraphicsCommandList_iface);
}
extern ULONG STDMETHODCALLTYPE d3d12_command_list_Release(d3d12_command_list_iface *iface);
static ULONG STDMETHODCALLTYPE d3d12_command_list_vkd3d_ext_Release(ID3D12GraphicsCommandListExt *iface)
{
struct d3d12_command_list *command_list = d3d12_command_list_from_ID3D12GraphicsCommandListExt(iface);
return d3d12_command_list_Release(&command_list->ID3D12GraphicsCommandList_iface);
}
extern HRESULT STDMETHODCALLTYPE d3d12_command_list_QueryInterface(d3d12_command_list_iface *iface,
REFIID iid, void **object);
static HRESULT STDMETHODCALLTYPE d3d12_command_list_vkd3d_ext_QueryInterface(ID3D12GraphicsCommandListExt *iface,
REFIID iid, void **out)
{
struct d3d12_command_list *command_list = d3d12_command_list_from_ID3D12GraphicsCommandListExt(iface);
TRACE("iface %p, iid %s, out %p.\n", iface, debugstr_guid(iid), out);
return d3d12_command_list_QueryInterface(&command_list->ID3D12GraphicsCommandList_iface, iid, out);
}
static HRESULT STDMETHODCALLTYPE d3d12_command_list_vkd3d_ext_GetVulkanHandle(ID3D12GraphicsCommandListExt *iface,
VkCommandBuffer *pVkCommandBuffer)
{
struct d3d12_command_list *command_list = d3d12_command_list_from_ID3D12GraphicsCommandListExt(iface);
TRACE("iface %p, pVkCommandBuffer %p.\n", iface, pVkCommandBuffer);
if (!pVkCommandBuffer)
return E_INVALIDARG;
*pVkCommandBuffer = command_list->vk_command_buffer;
return S_OK;
}
#define CU_LAUNCH_PARAM_BUFFER_POINTER (const void*)0x01
#define CU_LAUNCH_PARAM_BUFFER_SIZE (const void*)0x02
#define CU_LAUNCH_PARAM_END (const void*)0x00
static HRESULT STDMETHODCALLTYPE d3d12_command_list_vkd3d_ext_LaunchCubinShader(ID3D12GraphicsCommandListExt *iface, D3D12_CUBIN_DATA_HANDLE *handle, UINT32 block_x, UINT32 block_y, UINT32 block_z, const void *params, UINT32 param_size)
{
VkCuLaunchInfoNVX launchInfo = { VK_STRUCTURE_TYPE_CU_LAUNCH_INFO_NVX };
const struct vkd3d_vk_device_procs *vk_procs;
const void *config[] = {
CU_LAUNCH_PARAM_BUFFER_POINTER, params,
CU_LAUNCH_PARAM_BUFFER_SIZE, &param_size,
CU_LAUNCH_PARAM_END
};
struct d3d12_command_list *command_list = d3d12_command_list_from_ID3D12GraphicsCommandListExt(iface);
TRACE("iface %p, handle %p, block_x %u, block_y %u, block_z %u, params %p, param_size %u.\n", iface, handle, block_x, block_y, block_z, params, param_size);
if (!handle || !block_x || !block_y || !block_z || !params || !param_size)
return E_INVALIDARG;
launchInfo.function = handle->vkCuFunction;
launchInfo.gridDimX = block_x;
launchInfo.gridDimY = block_y;
launchInfo.gridDimZ = block_z;
launchInfo.blockDimX = handle->blockX;
launchInfo.blockDimY = handle->blockY;
launchInfo.blockDimZ = handle->blockZ;
launchInfo.sharedMemBytes = 0;
launchInfo.paramCount = 0;
launchInfo.pParams = NULL;
launchInfo.extraCount = 1;
launchInfo.pExtras = config;
vk_procs = &command_list->device->vk_procs;
VK_CALL(vkCmdCuLaunchKernelNVX(command_list->vk_command_buffer, &launchInfo));
return S_OK;
}
CONST_VTBL struct ID3D12GraphicsCommandListExtVtbl d3d12_command_list_vkd3d_ext_vtbl =
{
/* IUnknown methods */
d3d12_command_list_vkd3d_ext_QueryInterface,
d3d12_command_list_vkd3d_ext_AddRef,
d3d12_command_list_vkd3d_ext_Release,
/* ID3D12GraphicsCommandListExt methods */
d3d12_command_list_vkd3d_ext_GetVulkanHandle,
d3d12_command_list_vkd3d_ext_LaunchCubinShader
};
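
LaunchCubinShader above passes its arguments through a key/value array: a tag, then the value for that tag, terminated by an END tag. As a rough sketch of how a consumer could walk such an array, assuming hypothetical names (`launch_params`, `parse_launch_extras`, the `LAUNCH_PARAM_*` macros) and assuming the size entry points at a 32-bit value as it does in the caller above, the layout can be read back like this. How a real driver interprets the array is not shown in this diff.

```c
#include <stddef.h>
#include <stdint.h>

/* Tag values mirror the CU_LAUNCH_PARAM_* defines in the diff; the names here are illustrative. */
#define LAUNCH_PARAM_END            ((const void *)0x00)
#define LAUNCH_PARAM_BUFFER_POINTER ((const void *)0x01)
#define LAUNCH_PARAM_BUFFER_SIZE    ((const void *)0x02)

/* Hypothetical result struct for illustration only. */
struct launch_params
{
    const void *buffer;
    uint32_t size;
};

/* Walk a tag/value array terminated by LAUNCH_PARAM_END.
 * Returns 0 on success, -1 if an unknown tag is found. */
static int parse_launch_extras(const void *const *extras, struct launch_params *out)
{
    size_t i = 0;

    out->buffer = NULL;
    out->size = 0;

    while (extras[i] != LAUNCH_PARAM_END)
    {
        if (extras[i] == LAUNCH_PARAM_BUFFER_POINTER)
            out->buffer = extras[i + 1];
        else if (extras[i] == LAUNCH_PARAM_BUFFER_SIZE)
            out->size = *(const uint32_t *)extras[i + 1]; /* Assumption: 32-bit size, as passed by the caller. */
        else
            return -1;
        i += 2; /* Skip the tag and its value. */
    }
    return 0;
}
```

The array stores a pointer to the size rather than the size itself, so the size variable has to stay alive until the array has been consumed.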


@ -21,6 +21,7 @@
#include "vkd3d_private.h"
#include "vkd3d_debug.h"
#include "vkd3d_common.h"
#include "vkd3d_platform.h"
#include <stdio.h>
void vkd3d_shader_debug_ring_init_spec_constant(struct d3d12_device *device,
@ -53,22 +54,199 @@ void vkd3d_shader_debug_ring_init_spec_constant(struct d3d12_device *device,
info->map_entries[3].size = sizeof(uint32_t);
}
#define READ_RING_WORD(off) ring->mapped_ring[(off) & ((ring->ring_size / sizeof(uint32_t)) - 1)]
#define READ_RING_WORD_ACQUIRE(off) \
vkd3d_atomic_uint32_load_explicit(&ring->mapped_ring[(off) & ((ring->ring_size / sizeof(uint32_t)) - 1)], \
vkd3d_memory_order_acquire)
#define DEBUG_CHANNEL_WORD_COOKIE 0xdeadca70u
#define DEBUG_CHANNEL_WORD_MASK 0xfffffff0u
static const char *vkd3d_patch_command_token_str(enum vkd3d_patch_command_token token)
{
switch (token)
{
case VKD3D_PATCH_COMMAND_TOKEN_COPY_CONST_U32: return "RootConst";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_IBO_VA_LO: return "IBO VA LO";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_IBO_VA_HI: return "IBO VA HI";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_IBO_SIZE: return "IBO Size";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_INDEX_FORMAT: return "IBO Type";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_VBO_VA_LO: return "VBO VA LO";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_VBO_VA_HI: return "VBO VA HI";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_VBO_SIZE: return "VBO Size";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_VBO_STRIDE: return "VBO Stride";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_ROOT_VA_LO: return "ROOT VA LO";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_ROOT_VA_HI: return "ROOT VA HI";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_VERTEX_COUNT: return "Vertex Count";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_INDEX_COUNT: return "Index Count";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_INSTANCE_COUNT: return "Instance Count";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_FIRST_INDEX: return "First Index";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_FIRST_VERTEX: return "First Vertex";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_FIRST_INSTANCE: return "First Instance";
case VKD3D_PATCH_COMMAND_TOKEN_COPY_VERTEX_OFFSET: return "Vertex Offset";
default: return "???";
}
}
static bool vkd3d_patch_command_token_is_hex(enum vkd3d_patch_command_token token)
{
switch (token)
{
case VKD3D_PATCH_COMMAND_TOKEN_COPY_IBO_VA_LO:
case VKD3D_PATCH_COMMAND_TOKEN_COPY_IBO_VA_HI:
case VKD3D_PATCH_COMMAND_TOKEN_COPY_VBO_VA_LO:
case VKD3D_PATCH_COMMAND_TOKEN_COPY_VBO_VA_HI:
case VKD3D_PATCH_COMMAND_TOKEN_COPY_ROOT_VA_LO:
case VKD3D_PATCH_COMMAND_TOKEN_COPY_ROOT_VA_HI:
return true;
default:
return false;
}
}
static bool vkd3d_shader_debug_ring_print_message(struct vkd3d_shader_debug_ring *ring,
uint32_t word_offset, uint32_t message_word_count)
{
uint32_t i, debug_instance, debug_thread_id[3], fmt;
char message_buffer[4096];
uint64_t shader_hash;
size_t len, avail;
if (message_word_count < 8)
{
ERR("Message word count %u is invalid.\n", message_word_count);
return false;
}
shader_hash = (uint64_t)READ_RING_WORD(word_offset + 1) | ((uint64_t)READ_RING_WORD(word_offset + 2) << 32);
debug_instance = READ_RING_WORD(word_offset + 3);
for (i = 0; i < 3; i++)
debug_thread_id[i] = READ_RING_WORD(word_offset + 4 + i);
fmt = READ_RING_WORD(word_offset + 7);
word_offset += 8;
message_word_count -= 8;
if (shader_hash == 0)
{
/* We got this from our internal debug shaders. Pretty-print.
* Make sure the log is sortable for easier debug.
* TODO: Might consider a callback system that listeners from different subsystems can listen to and print their own messages,
* but that is overengineering at this time ... */
snprintf(message_buffer, sizeof(message_buffer), "ExecuteIndirect: GlobalCommandIndex %010u, Debug tag %010u, DrawID %04u (ThreadID %04u): ",
debug_instance, debug_thread_id[0], debug_thread_id[1], debug_thread_id[2]);
if (message_word_count == 2)
{
len = strlen(message_buffer);
avail = sizeof(message_buffer) - len;
snprintf(message_buffer + len, avail, "DrawCount %u, MaxDrawCount %u",
READ_RING_WORD(word_offset + 0),
READ_RING_WORD(word_offset + 1));
}
else if (message_word_count == 4)
{
union { uint32_t u32; float f32; int32_t s32; } value;
enum vkd3d_patch_command_token token;
uint32_t dst_offset;
uint32_t src_offset;
len = strlen(message_buffer);
avail = sizeof(message_buffer) - len;
token = READ_RING_WORD(word_offset + 0);
dst_offset = READ_RING_WORD(word_offset + 1);
src_offset = READ_RING_WORD(word_offset + 2);
value.u32 = READ_RING_WORD(word_offset + 3);
if (vkd3d_patch_command_token_is_hex(token))
{
snprintf(message_buffer + len, avail, "%s <- #%08x",
vkd3d_patch_command_token_str(token), value.u32);
}
else if (token == VKD3D_PATCH_COMMAND_TOKEN_COPY_CONST_U32)
{
snprintf(message_buffer + len, avail, "%s <- {hex #%08x, s32 %d, f32 %f}",
vkd3d_patch_command_token_str(token), value.u32, value.s32, value.f32);
}
else
{
snprintf(message_buffer + len, avail, "%s <- %d",
vkd3d_patch_command_token_str(token), value.s32);
}
len = strlen(message_buffer);
avail = sizeof(message_buffer) - len;
snprintf(message_buffer + len, avail, " (dst offset %u, src offset %u)", dst_offset, src_offset);
}
}
else
{
snprintf(message_buffer, sizeof(message_buffer), "Shader: %"PRIx64": Instance %010u, ID (%u, %u, %u):",
shader_hash, debug_instance,
debug_thread_id[0], debug_thread_id[1], debug_thread_id[2]);
for (i = 0; i < message_word_count; i++)
{
union
{
float f32;
uint32_t u32;
int32_t i32;
} u;
const char *delim;
u.u32 = READ_RING_WORD(word_offset + i);
len = strlen(message_buffer);
if (len + 1 >= sizeof(message_buffer))
break;
avail = sizeof(message_buffer) - len;
delim = i == 0 ? " " : ", ";
#define VKD3D_DEBUG_CHANNEL_FMT_HEX 0u
#define VKD3D_DEBUG_CHANNEL_FMT_I32 1u
#define VKD3D_DEBUG_CHANNEL_FMT_F32 2u
switch ((fmt >> (2u * i)) & 3u)
{
case VKD3D_DEBUG_CHANNEL_FMT_HEX:
snprintf(message_buffer + len, avail, "%s#%x", delim, u.u32);
break;
case VKD3D_DEBUG_CHANNEL_FMT_I32:
snprintf(message_buffer + len, avail, "%s%d", delim, u.i32);
break;
case VKD3D_DEBUG_CHANNEL_FMT_F32:
snprintf(message_buffer + len, avail, "%s%f", delim, u.f32);
break;
default:
snprintf(message_buffer + len, avail, "%s????", delim);
break;
}
}
}
INFO("%s\n", message_buffer);
return true;
}
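
The generic shader path in vkd3d_shader_debug_ring_print_message() packs one 2-bit format code per payload word into `fmt` and reinterprets each word through a union. A minimal standalone sketch of that decoding, assuming a hypothetical helper name (`format_payload`), shortened `FMT_*` macro names, and an empty delimiter before the first word since there is no message header here:

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Same values as VKD3D_DEBUG_CHANNEL_FMT_* in the diff; names shortened for the sketch. */
#define FMT_HEX 0u
#define FMT_I32 1u
#define FMT_F32 2u

/* Append each payload word to buf, formatted according to its 2-bit code in fmt.
 * Hypothetical helper, not part of vkd3d-proton. */
static void format_payload(char *buf, size_t size, uint32_t fmt,
        const uint32_t *words, uint32_t word_count)
{
    uint32_t i;

    if (size == 0)
        return;
    buf[0] = '\0';

    /* 32 bits of fmt hold at most 16 codes; ignore anything beyond that. */
    if (word_count > 16)
        word_count = 16;

    for (i = 0; i < word_count; i++)
    {
        union { float f32; uint32_t u32; int32_t i32; } u;
        const char *delim = i == 0 ? "" : ", ";
        size_t len = strlen(buf);

        /* Stop once the buffer is full. */
        if (len + 1 >= size)
            break;

        u.u32 = words[i];
        switch ((fmt >> (2u * i)) & 3u)
        {
            case FMT_HEX:
                snprintf(buf + len, size - len, "%s#%" PRIx32, delim, u.u32);
                break;
            case FMT_I32:
                snprintf(buf + len, size - len, "%s%" PRId32, delim, u.i32);
                break;
            case FMT_F32:
                snprintf(buf + len, size - len, "%s%f", delim, (double)u.f32);
                break;
            default:
                snprintf(buf + len, size - len, "%s????", delim);
                break;
        }
    }
}
```

With words {0x10, 0xfffffffb, 0x3f800000} and codes HEX, I32, F32 in that order, the buffer ends up as "#10, -5, 1.000000".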
void *vkd3d_shader_debug_ring_thread_main(void *arg)
{
uint32_t last_counter, new_counter, count, i, j, message_word_count, debug_instance, debug_thread_id[3], fmt;
uint32_t last_counter, new_counter, count, i, cookie_word_count;
volatile const uint32_t *ring_counter; /* Atomic updated by the GPU. */
struct vkd3d_shader_debug_ring *ring;
struct d3d12_device *device = arg;
const uint32_t *ring_counter;
const uint32_t *ring_base;
char message_buffer[4096];
bool is_active = true;
uint64_t shader_hash;
uint32_t *ring_base;
uint32_t word_count;
size_t ring_mask;
ring = &device->debug_ring;
ring_mask = ring->ring_size - 1;
ring_counter = ring->mapped;
ring_base = ring_counter + (ring->ring_offset / sizeof(uint32_t));
ring_mask = (ring->ring_size / sizeof(uint32_t)) - 1;
ring_counter = ring->mapped_control_block;
ring_base = ring->mapped_ring;
last_counter = 0;
vkd3d_set_thread_name("debug-ring");
@ -76,93 +254,99 @@ void *vkd3d_shader_debug_ring_thread_main(void *arg)
while (is_active)
{
pthread_mutex_lock(&ring->ring_lock);
pthread_cond_wait(&ring->ring_cond, &ring->ring_lock);
if (ring->active)
pthread_cond_wait(&ring->ring_cond, &ring->ring_lock);
is_active = ring->active;
pthread_mutex_unlock(&ring->ring_lock);
new_counter = *ring_counter;
if (last_counter != new_counter)
{
count = (new_counter - last_counter) & ring_mask;
/* Assume that each iteration can safely use 1/4th of the buffer to avoid WAR hazards. */
if ((new_counter - last_counter) > (ring->ring_size / 16))
if (count > (ring->ring_size / 16))
{
ERR("Debug ring is probably too small (%u new words this iteration), increase size to avoid risk of dropping messages.\n",
new_counter - last_counter);
count);
}
for (i = 0; i < count; )
{
#define READ_RING_WORD(off) ring_base[((off) + i + last_counter) & ring_mask]
message_word_count = READ_RING_WORD(0);
if (i + message_word_count > count)
break;
if (message_word_count < 8 || message_word_count > 16 + 8)
break;
/* The debug ring shader has "release" semantics for the word count write,
* so just make sure the reads don't get reordered here. */
cookie_word_count = READ_RING_WORD_ACQUIRE(last_counter + i);
word_count = cookie_word_count & ~DEBUG_CHANNEL_WORD_MASK;
shader_hash = (uint64_t)READ_RING_WORD(1) | ((uint64_t)READ_RING_WORD(2) << 32);
debug_instance = READ_RING_WORD(3);
for (j = 0; j < 3; j++)
debug_thread_id[j] = READ_RING_WORD(4 + j);
fmt = READ_RING_WORD(7);
snprintf(message_buffer, sizeof(message_buffer), "Shader: %"PRIx64": Instance %u, ID (%u, %u, %u):",
shader_hash, debug_instance,
debug_thread_id[0], debug_thread_id[1], debug_thread_id[2]);
i += 8;
message_word_count -= 8;
for (j = 0; j < message_word_count; j++)
if (cookie_word_count == 0)
{
union
{
float f32;
uint32_t u32;
int32_t i32;
} u;
const char *delim;
size_t len, avail;
u.u32 = READ_RING_WORD(j);
len = strlen(message_buffer);
if (len + 1 >= sizeof(message_buffer))
break;
avail = sizeof(message_buffer) - len;
delim = j == 0 ? " " : ", ";
#define VKD3D_DEBUG_CHANNEL_FMT_HEX 0u
#define VKD3D_DEBUG_CHANNEL_FMT_I32 1u
#define VKD3D_DEBUG_CHANNEL_FMT_F32 2u
switch ((fmt >> (2u * j)) & 3u)
{
case VKD3D_DEBUG_CHANNEL_FMT_HEX:
snprintf(message_buffer + len, avail, "%s#%x", delim, u.u32);
break;
case VKD3D_DEBUG_CHANNEL_FMT_I32:
snprintf(message_buffer + len, avail, "%s%d", delim, u.i32);
break;
case VKD3D_DEBUG_CHANNEL_FMT_F32:
snprintf(message_buffer + len, avail, "%s%f", delim, u.f32);
break;
default:
snprintf(message_buffer + len, avail, "%s????", delim);
break;
}
ERR("Message was allocated, but write did not complete. last_counter = %u, rewrite new_counter = %u -> %u\n",
last_counter, new_counter, last_counter + i);
/* Rewind the counter, and try again later. */
new_counter = last_counter + i;
break;
}
INFO("%s\n", message_buffer);
/* If something is written here, it must be a cookie. */
if ((cookie_word_count & DEBUG_CHANNEL_WORD_MASK) != DEBUG_CHANNEL_WORD_COOKIE)
{
ERR("Invalid message word cookie detected, 0x%x.\n", cookie_word_count);
break;
}
#undef READ_RING_WORD
i += message_word_count;
if (i + word_count > count)
{
ERR("Message word count %u is out of bounds (i = %u, count = %u).\n",
word_count, i, count);
break;
}
if (!vkd3d_shader_debug_ring_print_message(ring, last_counter + i, word_count))
break;
i += word_count;
}
}
last_counter = new_counter;
/* Make sure to clear out any messages we read so that when the ring gets around to
* this point again, we can detect unwritten memory.
* This relies on having a ring that is large enough, but in practice, if we just make the ring
* large enough, there is nothing to worry about. */
while (last_counter != new_counter)
{
ring_base[last_counter & ring_mask] = 0;
last_counter++;
}
}
if (ring->device_lost)
{
INFO("Device lost detected, attempting to fish for clues.\n");
new_counter = *ring_counter;
if (last_counter != new_counter)
{
count = (new_counter - last_counter) & ring_mask;
for (i = 0; i < count; )
{
cookie_word_count = READ_RING_WORD_ACQUIRE(last_counter + i);
word_count = cookie_word_count & ~DEBUG_CHANNEL_WORD_MASK;
/* This is considered a message if it has the marker and a word count that is in-range. */
if ((cookie_word_count & DEBUG_CHANNEL_WORD_MASK) == DEBUG_CHANNEL_WORD_COOKIE &&
i + word_count <= count &&
vkd3d_shader_debug_ring_print_message(ring, last_counter + i, word_count))
{
i += word_count;
}
else
{
/* Keep going. */
i++;
}
}
}
INFO("Done fishing for clues ...\n");
}
return NULL;
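
The reader loop in this hunk treats the first word of every message as a combined cookie and word count: the upper 28 bits must match DEBUG_CHANNEL_WORD_COOKIE, the low 4 bits carry the length, and a zero word means the slot was allocated but not yet written. A small sketch of that header check, assuming hypothetical names (`header_status`, `decode_header`) and reusing the two constants from the diff:

```c
#include <stdint.h>

#define DEBUG_CHANNEL_WORD_COOKIE 0xdeadca70u
#define DEBUG_CHANNEL_WORD_MASK   0xfffffff0u

/* Hypothetical status codes for illustration only. */
enum header_status
{
    HEADER_UNWRITTEN,     /* Slot was allocated, but the GPU write has not landed yet. */
    HEADER_INVALID,       /* Non-zero word without the expected cookie. */
    HEADER_OUT_OF_BOUNDS, /* Length runs past the words available this iteration. */
    HEADER_OK,
};

/* Decode the header word found at offset i out of count new words.
 * *word_count receives the low 4 bits regardless of the outcome. */
static enum header_status decode_header(uint32_t cookie_word_count,
        uint32_t i, uint32_t count, uint32_t *word_count)
{
    *word_count = cookie_word_count & ~DEBUG_CHANNEL_WORD_MASK;

    if (cookie_word_count == 0)
        return HEADER_UNWRITTEN;
    if ((cookie_word_count & DEBUG_CHANNEL_WORD_MASK) != DEBUG_CHANNEL_WORD_COOKIE)
        return HEADER_INVALID;
    if (i + *word_count > count)
        return HEADER_OUT_OF_BOUNDS;
    return HEADER_OK;
}
```

In the normal path the loop rewinds and retries on an unwritten header and stops on the other failures; in the device-lost path it simply advances by one word and keeps scanning for the next valid cookie.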
@ -173,20 +357,21 @@ HRESULT vkd3d_shader_debug_ring_init(struct vkd3d_shader_debug_ring *ring,
{
const struct vkd3d_vk_device_procs *vk_procs = &device->vk_procs;
D3D12_HEAP_PROPERTIES heap_properties;
D3D12_RESOURCE_DESC resource_desc;
const char *env;
D3D12_RESOURCE_DESC1 resource_desc;
VkMemoryPropertyFlags memory_props;
char env[VKD3D_PATH_MAX];
memset(ring, 0, sizeof(*ring));
if (!(env = getenv("VKD3D_SHADER_DEBUG_RING_SIZE_LOG2")))
if (!vkd3d_get_env_var("VKD3D_SHADER_DEBUG_RING_SIZE_LOG2", env, sizeof(env)))
return S_OK;
ring->active = true;
ring->ring_size = (size_t)1 << strtoul(env, NULL, 0);
// Reserve 4k to be used as a control block of some sort.
ring->ring_offset = 4096;
ring->control_block_size = 4096;
WARN("Enabling shader debug ring of size: %zu.\n", ring->ring_size);
INFO("Enabling shader debug ring of size: %zu.\n", ring->ring_size);
if (!device->device_info.buffer_device_address_features.bufferDeviceAddress)
{
@ -200,7 +385,7 @@ HRESULT vkd3d_shader_debug_ring_init(struct vkd3d_shader_debug_ring *ring,
heap_properties.MemoryPoolPreference = D3D12_MEMORY_POOL_L0;
memset(&resource_desc, 0, sizeof(resource_desc));
resource_desc.Width = ring->ring_offset + ring->ring_size;
resource_desc.Width = ring->ring_size;
resource_desc.Height = 1;
resource_desc.DepthOrArraySize = 1;
resource_desc.MipLevels = 1;
@ -211,32 +396,71 @@ HRESULT vkd3d_shader_debug_ring_init(struct vkd3d_shader_debug_ring *ring,
resource_desc.Flags = D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS;
if (FAILED(vkd3d_create_buffer(device, &heap_properties, D3D12_HEAP_FLAG_ALLOW_ONLY_BUFFERS,
&resource_desc, &ring->host_buffer)))
&resource_desc, &ring->host_buffer)))
goto err_free_buffers;
if (FAILED(vkd3d_allocate_buffer_memory(device, ring->host_buffer, NULL,
&heap_properties, D3D12_HEAP_FLAG_ALLOW_ONLY_BUFFERS, &ring->host_buffer_memory, NULL, NULL)))
memory_props = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_CACHED_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT;
/* If we're doing breadcrumb debugging, we also need to be able to read debug ring messages
* from a crash, so we cannot rely on being able to copy the device payload back to host.
* Use PCI-e BAR + UNCACHED + DEVICE_COHERENT if we must. */
if (vkd3d_config_flags & VKD3D_CONFIG_FLAG_BREADCRUMBS)
{
INFO("Using debug ring with breadcrumbs, opting in to device uncached payload buffer.\n");
/* We use coherent in the debug_channel.h header, but that is not necessarily guaranteed to be
* coherent with host reads, so make extra sure. */
if (device->device_info.device_coherent_memory_features_amd.deviceCoherentMemory)
{
memory_props |= VK_MEMORY_PROPERTY_DEVICE_UNCACHED_BIT_AMD | VK_MEMORY_PROPERTY_DEVICE_COHERENT_BIT_AMD;
INFO("Enabling uncached device memory for debug ring.\n");
}
}
if (FAILED(vkd3d_allocate_buffer_memory(device, ring->host_buffer,
memory_props, &ring->host_buffer_memory)))
goto err_free_buffers;
ring->ring_device_address = vkd3d_get_buffer_device_address(device, ring->host_buffer) + ring->ring_offset;
resource_desc.Width = ring->ring_offset;
resource_desc.Width = ring->control_block_size;
memset(&heap_properties, 0, sizeof(heap_properties));
heap_properties.Type = D3D12_HEAP_TYPE_DEFAULT;
if (FAILED(vkd3d_create_buffer(device, &heap_properties, D3D12_HEAP_FLAG_ALLOW_ONLY_BUFFERS,
&resource_desc, &ring->device_atomic_buffer)))
&resource_desc, &ring->device_atomic_buffer)))
goto err_free_buffers;
if (FAILED(vkd3d_allocate_buffer_memory(device, ring->device_atomic_buffer, NULL,
&heap_properties, D3D12_HEAP_FLAG_ALLOW_ONLY_BUFFERS, &ring->device_atomic_buffer_memory, NULL, NULL)))
memory_props = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT;
if (vkd3d_config_flags & VKD3D_CONFIG_FLAG_BREADCRUMBS)
{
/* Expect crashes since we won't have time to flush caches.
* We use coherent in the debug_channel.h header, but that is not necessarily guaranteed to be coherent with
* host reads, so make extra sure. */
if (device->device_info.device_coherent_memory_features_amd.deviceCoherentMemory)
memory_props |= VK_MEMORY_PROPERTY_DEVICE_UNCACHED_BIT_AMD | VK_MEMORY_PROPERTY_DEVICE_COHERENT_BIT_AMD;
}
if (FAILED(vkd3d_allocate_buffer_memory(device, ring->device_atomic_buffer,
memory_props, &ring->device_atomic_buffer_memory)))
goto err_free_buffers;
if (VK_CALL(vkMapMemory(device->vk_device, ring->host_buffer_memory, 0, VK_WHOLE_SIZE, 0, &ring->mapped)) != VK_SUCCESS)
if (VK_CALL(vkMapMemory(device->vk_device, ring->host_buffer_memory.vk_memory,
0, VK_WHOLE_SIZE, 0, (void**)&ring->mapped_ring)) != VK_SUCCESS)
goto err_free_buffers;
if (VK_CALL(vkMapMemory(device->vk_device, ring->device_atomic_buffer_memory.vk_memory,
0, VK_WHOLE_SIZE, 0, (void**)&ring->mapped_control_block)) != VK_SUCCESS)
goto err_free_buffers;
ring->ring_device_address = vkd3d_get_buffer_device_address(device, ring->host_buffer);
ring->atomic_device_address = vkd3d_get_buffer_device_address(device, ring->device_atomic_buffer);
memset(ring->mapped_control_block, 0, ring->control_block_size);
memset(ring->mapped_ring, 0, ring->ring_size);
if (pthread_mutex_init(&ring->ring_lock, NULL) != 0)
goto err_free_buffers;
if (pthread_cond_init(&ring->ring_cond, NULL) != 0)
@@ -257,8 +481,8 @@ err_destroy_cond:
err_free_buffers:
VK_CALL(vkDestroyBuffer(device->vk_device, ring->host_buffer, NULL));
VK_CALL(vkDestroyBuffer(device->vk_device, ring->device_atomic_buffer, NULL));
VK_CALL(vkFreeMemory(device->vk_device, ring->host_buffer_memory, NULL));
VK_CALL(vkFreeMemory(device->vk_device, ring->device_atomic_buffer_memory, NULL));
vkd3d_free_device_memory(device, &ring->host_buffer_memory);
vkd3d_free_device_memory(device, &ring->device_atomic_buffer_memory);
memset(ring, 0, sizeof(*ring));
return E_OUTOFMEMORY;
}
@@ -280,42 +504,28 @@ void vkd3d_shader_debug_ring_cleanup(struct vkd3d_shader_debug_ring *ring,
VK_CALL(vkDestroyBuffer(device->vk_device, ring->host_buffer, NULL));
VK_CALL(vkDestroyBuffer(device->vk_device, ring->device_atomic_buffer, NULL));
VK_CALL(vkFreeMemory(device->vk_device, ring->host_buffer_memory, NULL));
VK_CALL(vkFreeMemory(device->vk_device, ring->device_atomic_buffer_memory, NULL));
vkd3d_free_device_memory(device, &ring->host_buffer_memory);
vkd3d_free_device_memory(device, &ring->device_atomic_buffer_memory);
}
void vkd3d_shader_debug_ring_end_command_buffer(struct d3d12_command_list *list)
static pthread_mutex_t debug_ring_teardown_lock = PTHREAD_MUTEX_INITIALIZER;
void vkd3d_shader_debug_ring_kick(struct vkd3d_shader_debug_ring *ring, struct d3d12_device *device, bool device_lost)
{
const struct vkd3d_vk_device_procs *vk_procs = &list->device->vk_procs;
VkBufferCopy buffer_copy;
VkMemoryBarrier barrier;
if (list->device->debug_ring.active &&
list->has_replaced_shaders &&
(list->type == D3D12_COMMAND_LIST_TYPE_DIRECT || list->type == D3D12_COMMAND_LIST_TYPE_COMPUTE))
if (device_lost)
{
barrier.sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER;
barrier.pNext = NULL;
barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
barrier.dstAccessMask = VK_ACCESS_TRANSFER_READ_BIT;
VK_CALL(vkCmdPipelineBarrier(list->vk_command_buffer,
VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, 0,
1, &barrier, 0, NULL, 0, NULL));
buffer_copy.size = list->device->debug_ring.ring_offset;
buffer_copy.dstOffset = 0;
buffer_copy.srcOffset = 0;
VK_CALL(vkCmdCopyBuffer(list->vk_command_buffer,
list->device->debug_ring.device_atomic_buffer,
list->device->debug_ring.host_buffer,
1, &buffer_copy));
barrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
barrier.dstAccessMask = VK_ACCESS_HOST_READ_BIT;
VK_CALL(vkCmdPipelineBarrier(list->vk_command_buffer,
VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_HOST_BIT, 0,
1, &barrier, 0, NULL, 0, NULL));
/* Need a global lock here since multiple threads can observe device lost at the same time. */
pthread_mutex_lock(&debug_ring_teardown_lock);
{
ring->device_lost = true;
/* We're going to die or hang after this most likely, so make sure we get to see all messages the
* GPU had to write. Just clean up now. */
vkd3d_shader_debug_ring_cleanup(ring, device);
}
pthread_mutex_unlock(&debug_ring_teardown_lock);
}
else
{
pthread_cond_signal(&ring->ring_cond);
}
}
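
The device-lost branch above takes a global lock because several threads can observe the loss at the same time. A minimal sketch of that idea follows, assuming the cleanup routine is idempotent (the full body of `vkd3d_shader_debug_ring_cleanup` is not shown in this diff). All `demo_*` names are hypothetical and exist only for illustration.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the debug ring; not the real vkd3d struct. */
struct demo_ring
{
    bool active;
    bool device_lost;
    int cleanup_count;
};

/* One process-wide lock, mirroring debug_ring_teardown_lock in the diff. */
static pthread_mutex_t demo_teardown_lock = PTHREAD_MUTEX_INITIALIZER;

/* Assumed to be idempotent: a second call is a no-op. */
static void demo_ring_cleanup(struct demo_ring *ring)
{
    if (!ring->active)
        return;
    ring->active = false;
    ring->cleanup_count++;
}

static void demo_ring_kick(struct demo_ring *ring, bool device_lost)
{
    if (!device_lost)
        return; /* The normal path would signal the worker's condition variable. */

    /* Serialize teardown so concurrent observers cannot race inside cleanup. */
    pthread_mutex_lock(&demo_teardown_lock);
    ring->device_lost = true;
    demo_ring_cleanup(ring);
    pthread_mutex_unlock(&demo_teardown_lock);
}
```

With this shape, calling `demo_ring_kick(&ring, true)` from any number of threads runs the teardown work once, and later callers only pay for the lock.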

@@ -0,0 +1,530 @@
/*
* Copyright 2020 Hans-Kristian Arntzen for Valve Corporation
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
#define VKD3D_DBG_CHANNEL VKD3D_DBG_CHANNEL_API
#include "vkd3d_descriptor_debug.h"
#include "vkd3d_threads.h"
#include <stdio.h>
#include <string.h>
#include <inttypes.h>
static pthread_once_t debug_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t debug_lock = PTHREAD_MUTEX_INITIALIZER;
static bool descriptor_debug_active_qa_checks;
static bool descriptor_debug_active_log;
static FILE *descriptor_debug_file;
struct vkd3d_descriptor_qa_global_info
{
struct vkd3d_descriptor_qa_global_buffer_data *data;
VkDescriptorBufferInfo descriptor;
VkBuffer vk_buffer;
struct vkd3d_device_memory_allocation device_allocation;
unsigned int num_cookies;
pthread_t ring_thread;
pthread_mutex_t ring_lock;
pthread_cond_t ring_cond;
bool active;
};
static const char *debug_descriptor_type(vkd3d_descriptor_qa_flags type_flags)
{
bool has_raw_va = !!(type_flags & VKD3D_DESCRIPTOR_QA_TYPE_RAW_VA_BIT);
switch (type_flags & ~VKD3D_DESCRIPTOR_QA_TYPE_RAW_VA_BIT)
{
case VKD3D_DESCRIPTOR_QA_TYPE_SAMPLER_BIT: return "SAMPLER";
case VKD3D_DESCRIPTOR_QA_TYPE_SAMPLED_IMAGE_BIT: return "SAMPLED_IMAGE";
case VKD3D_DESCRIPTOR_QA_TYPE_STORAGE_IMAGE_BIT: return "STORAGE_IMAGE";
case VKD3D_DESCRIPTOR_QA_TYPE_UNIFORM_BUFFER_BIT: return "UNIFORM_BUFFER";
case VKD3D_DESCRIPTOR_QA_TYPE_STORAGE_BUFFER_BIT: return "STORAGE_BUFFER";
case VKD3D_DESCRIPTOR_QA_TYPE_UNIFORM_TEXEL_BUFFER_BIT: return "UNIFORM_TEXEL_BUFFER";
case VKD3D_DESCRIPTOR_QA_TYPE_STORAGE_TEXEL_BUFFER_BIT: return "STORAGE_TEXEL_BUFFER";
case VKD3D_DESCRIPTOR_QA_TYPE_STORAGE_TEXEL_BUFFER_BIT | VKD3D_DESCRIPTOR_QA_TYPE_STORAGE_BUFFER_BIT:
return has_raw_va ? "STORAGE_TEXEL_BUFFER / STORAGE_BUFFER (w/ counter)" : "STORAGE_TEXEL_BUFFER / STORAGE_BUFFER";
case VKD3D_DESCRIPTOR_QA_TYPE_UNIFORM_TEXEL_BUFFER_BIT | VKD3D_DESCRIPTOR_QA_TYPE_STORAGE_BUFFER_BIT:
return has_raw_va ? "UNIFORM_TEXEL_BUFFER / STORAGE_BUFFER (w/ counter)" : "UNIFORM_TEXEL_BUFFER / STORAGE_BUFFER";
case VKD3D_DESCRIPTOR_QA_TYPE_RT_ACCELERATION_STRUCTURE_BIT:
return "RTAS";
case 0:
return "NONE";
default: return "?";
}
}
static void vkd3d_descriptor_debug_init_once(void)
{
char env[VKD3D_PATH_MAX];
vkd3d_get_env_var("VKD3D_DESCRIPTOR_QA_LOG", env, sizeof(env));
if (strlen(env) > 0)
{
INFO("Enabling VKD3D_DESCRIPTOR_QA_LOG\n");
descriptor_debug_file = fopen(env, "w");
if (!descriptor_debug_file)
ERR("Failed to open file: %s.\n", env);
else
descriptor_debug_active_log = true;
}
if (vkd3d_config_flags & VKD3D_CONFIG_FLAG_DESCRIPTOR_QA_CHECKS)
{
INFO("Enabling descriptor QA checks!\n");
descriptor_debug_active_qa_checks = true;
}
}
void vkd3d_descriptor_debug_init(void)
{
pthread_once(&debug_once, vkd3d_descriptor_debug_init_once);
}
bool vkd3d_descriptor_debug_active_log(void)
{
return descriptor_debug_active_log;
}
bool vkd3d_descriptor_debug_active_qa_checks(void)
{
return descriptor_debug_active_qa_checks;
}
VkDeviceSize vkd3d_descriptor_debug_heap_info_size(unsigned int num_descriptors)
{
return offsetof(struct vkd3d_descriptor_qa_heap_buffer_data, desc) + num_descriptors *
sizeof(struct vkd3d_descriptor_qa_cookie_descriptor);
}
static void vkd3d_descriptor_debug_set_live_status_bit(
struct vkd3d_descriptor_qa_global_info *global_info, uint64_t cookie)
{
if (!global_info || !global_info->active || !global_info->data)
return;
if (cookie < global_info->num_cookies)
{
vkd3d_atomic_uint32_or(&global_info->data->live_status_table[cookie / 32],
1u << (cookie & 31), vkd3d_memory_order_relaxed);
}
else
INFO("Cookie index %"PRIu64" is out of range, cannot be tracked.\n", cookie);
}
static void vkd3d_descriptor_debug_unset_live_status_bit(
struct vkd3d_descriptor_qa_global_info *global_info, uint64_t cookie)
{
if (!global_info || !global_info->active || !global_info->data)
return;
if (cookie < global_info->num_cookies)
{
vkd3d_atomic_uint32_and(&global_info->data->live_status_table[cookie / 32],
~(1u << (cookie & 31)), vkd3d_memory_order_relaxed);
}
}
static void vkd3d_descriptor_debug_qa_check_report_fault(
struct vkd3d_descriptor_qa_global_info *global_info);
static void *vkd3d_descriptor_debug_qa_check_entry(void *userdata)
{
struct vkd3d_descriptor_qa_global_info *global_info = userdata;
bool active = true;
while (active)
{
/* Don't spin endlessly, this thread is kicked after a successful fence wait. */
pthread_mutex_lock(&global_info->ring_lock);
if (global_info->active)
pthread_cond_wait(&global_info->ring_cond, &global_info->ring_lock);
active = global_info->active;
pthread_mutex_unlock(&global_info->ring_lock);
if (global_info->data->fault_type != 0)
{
vkd3d_descriptor_debug_qa_check_report_fault(global_info);
ERR("Num failed checks: %u\n", global_info->data->fault_atomic);
/* Reset the latch so we can get more reports. */
vkd3d_atomic_uint32_store_explicit(&global_info->data->fault_type, 0, vkd3d_memory_order_relaxed);
vkd3d_atomic_uint32_store_explicit(&global_info->data->fault_atomic, 0, vkd3d_memory_order_release);
}
}
return NULL;
}
void vkd3d_descriptor_debug_kick_qa_check(struct vkd3d_descriptor_qa_global_info *global_info)
{
if (global_info && global_info->active)
pthread_cond_signal(&global_info->ring_cond);
}
const VkDescriptorBufferInfo *vkd3d_descriptor_debug_get_global_info_descriptor(
struct vkd3d_descriptor_qa_global_info *global_info)
{
if (global_info)
return &global_info->descriptor;
else
return NULL;
}
HRESULT vkd3d_descriptor_debug_alloc_global_info(
struct vkd3d_descriptor_qa_global_info **out_global_info, unsigned int num_cookies,
struct d3d12_device *device)
{
const struct vkd3d_vk_device_procs *vk_procs = &device->vk_procs;
struct vkd3d_descriptor_qa_global_info *global_info;
D3D12_RESOURCE_DESC1 buffer_desc;
D3D12_HEAP_PROPERTIES heap_info;
D3D12_HEAP_FLAGS heap_flags;
VkResult vr;
HRESULT hr;
global_info = vkd3d_calloc(1, sizeof(*global_info));
if (!global_info)
return E_OUTOFMEMORY;
memset(&buffer_desc, 0, sizeof(buffer_desc));
buffer_desc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
buffer_desc.Width = sizeof(uint32_t) * ((num_cookies + 31) / 32) +
offsetof(struct vkd3d_descriptor_qa_global_buffer_data, live_status_table);
buffer_desc.Height = 1;
buffer_desc.DepthOrArraySize = 1;
buffer_desc.MipLevels = 1;
buffer_desc.SampleDesc.Count = 1;
buffer_desc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;
buffer_desc.Flags = D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS;
/* host-visible device memory */
memset(&heap_info, 0, sizeof(heap_info));
heap_info.Type = D3D12_HEAP_TYPE_UPLOAD;
heap_flags = D3D12_HEAP_FLAG_ALLOW_ONLY_BUFFERS;
if (FAILED(hr = vkd3d_create_buffer(device, &heap_info, heap_flags, &buffer_desc, &global_info->vk_buffer)))
{
vkd3d_descriptor_debug_free_global_info(global_info, device);
return hr;
}
if (FAILED(hr = vkd3d_allocate_buffer_memory(device, global_info->vk_buffer,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
&global_info->device_allocation)))
{
vkd3d_descriptor_debug_free_global_info(global_info, device);
return hr;
}
if ((vr = VK_CALL(vkMapMemory(device->vk_device, global_info->device_allocation.vk_memory,
0, VK_WHOLE_SIZE, 0, (void**)&global_info->data))))
{
ERR("Failed to map buffer, vr %d.\n", vr);
vkd3d_descriptor_debug_free_global_info(global_info, device);
return hresult_from_vk_result(vr);
}
memset(global_info->data, 0, buffer_desc.Width);
/* The NULL descriptor has cookie 0, and is always considered live. */
global_info->data->live_status_table[0] = 1u << 0;
global_info->descriptor.buffer = global_info->vk_buffer;
global_info->descriptor.offset = 0;
global_info->descriptor.range = buffer_desc.Width;
global_info->num_cookies = num_cookies;
pthread_mutex_init(&global_info->ring_lock, NULL);
pthread_cond_init(&global_info->ring_cond, NULL);
global_info->active = true;
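if (pthread_create(&global_info->ring_thread, NULL, vkd3d_descriptor_debug_qa_check_entry, global_info) != 0)
{
/* The thread never started, so clear active to keep free_global_info() from joining it. */
global_info->active = false;
pthread_mutex_destroy(&global_info->ring_lock);
pthread_cond_destroy(&global_info->ring_cond);
vkd3d_descriptor_debug_free_global_info(global_info, device);
return E_OUTOFMEMORY;
}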
*out_global_info = global_info;
return S_OK;
}
void vkd3d_descriptor_debug_free_global_info(
struct vkd3d_descriptor_qa_global_info *global_info,
struct d3d12_device *device)
{
const struct vkd3d_vk_device_procs *vk_procs = &device->vk_procs;
if (!global_info)
return;
if (global_info->active)
{
pthread_mutex_lock(&global_info->ring_lock);
global_info->active = false;
pthread_cond_signal(&global_info->ring_cond);
pthread_mutex_unlock(&global_info->ring_lock);
pthread_join(global_info->ring_thread, NULL);
pthread_mutex_destroy(&global_info->ring_lock);
pthread_cond_destroy(&global_info->ring_cond);
}
vkd3d_free_device_memory(device, &global_info->device_allocation);
VK_CALL(vkDestroyBuffer(device->vk_device, global_info->vk_buffer, NULL));
vkd3d_free(global_info);
}
#define DECL_BUFFER() \
char buffer[4096]; \
char *ptr; \
ptr = buffer; \
*ptr = '\0'
#define FLUSH_BUFFER() do { \
pthread_mutex_lock(&debug_lock); \
fprintf(descriptor_debug_file, "%s\n", buffer); \
pthread_mutex_unlock(&debug_lock); \
fflush(descriptor_debug_file); \
} while (0)
#define APPEND_SNPRINTF(...) do { ptr += strlen(ptr); snprintf(ptr, (buffer + ARRAY_SIZE(buffer)) - ptr, __VA_ARGS__); } while(0)
static void vkd3d_descriptor_debug_qa_check_report_fault(
struct vkd3d_descriptor_qa_global_info *global_info)
{
DECL_BUFFER();
if (global_info->data->fault_type & VKD3D_DESCRIPTOR_FAULT_TYPE_HEAP_OF_OF_RANGE)
APPEND_SNPRINTF("Fault type: HEAP_OUT_OF_RANGE\n");
if (global_info->data->fault_type & VKD3D_DESCRIPTOR_FAULT_TYPE_MISMATCH_DESCRIPTOR_TYPE)
APPEND_SNPRINTF("Fault type: MISMATCH_DESCRIPTOR_TYPE\n");
if (global_info->data->fault_type & VKD3D_DESCRIPTOR_FAULT_TYPE_DESTROYED_RESOURCE)
APPEND_SNPRINTF("Fault type: DESTROYED_RESOURCE\n");
APPEND_SNPRINTF("CBV_SRV_UAV heap cookie: %u\n", global_info->data->failed_heap);
APPEND_SNPRINTF("Shader hash and instruction: %"PRIx64" (%u)\n",
global_info->data->failed_hash, global_info->data->failed_instruction);
APPEND_SNPRINTF("Accessed resource/view cookie: %u\n", global_info->data->failed_cookie);
APPEND_SNPRINTF("Shader desired descriptor type: %u (%s)\n",
global_info->data->failed_descriptor_type_mask,
debug_descriptor_type(global_info->data->failed_descriptor_type_mask));
APPEND_SNPRINTF("Found descriptor type in heap: %u (%s)\n",
global_info->data->actual_descriptor_type_mask,
debug_descriptor_type(global_info->data->actual_descriptor_type_mask));
APPEND_SNPRINTF("Failed heap index: %u\n", global_info->data->failed_offset);
ERR("\n============\n%s==========\n", buffer);
if (!vkd3d_descriptor_debug_active_log())
return;
FLUSH_BUFFER();
}
void vkd3d_descriptor_debug_register_heap(
struct vkd3d_descriptor_qa_heap_buffer_data *heap, uint64_t cookie,
const D3D12_DESCRIPTOR_HEAP_DESC *desc)
{
DECL_BUFFER();
if (heap)
{
heap->num_descriptors = desc->NumDescriptors;
heap->heap_index = cookie <= UINT32_MAX ? (uint32_t)cookie : 0u;
memset(heap->desc, 0, desc->NumDescriptors * sizeof(*heap->desc));
}
if (!vkd3d_descriptor_debug_active_log())
return;
APPEND_SNPRINTF("REGISTER HEAP %"PRIu64" || COUNT = %u", cookie, desc->NumDescriptors);
if (desc->Flags & D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE)
APPEND_SNPRINTF(" || SHADER");
switch (desc->Type)
{
case D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV:
APPEND_SNPRINTF(" || CBV_SRV_UAV");
break;
case D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER:
APPEND_SNPRINTF(" || SAMPLER");
break;
case D3D12_DESCRIPTOR_HEAP_TYPE_RTV:
APPEND_SNPRINTF(" || RTV");
break;
case D3D12_DESCRIPTOR_HEAP_TYPE_DSV:
APPEND_SNPRINTF(" || DSV");
break;
default:
APPEND_SNPRINTF(" || ?");
break;
}
FLUSH_BUFFER();
}
void vkd3d_descriptor_debug_unregister_heap(uint64_t cookie)
{
DECL_BUFFER();
if (!vkd3d_descriptor_debug_active_log())
return;
APPEND_SNPRINTF("DESTROY HEAP %"PRIu64, cookie);
FLUSH_BUFFER();
}
void vkd3d_descriptor_debug_register_resource_cookie(struct vkd3d_descriptor_qa_global_info *global_info,
uint64_t cookie, const D3D12_RESOURCE_DESC1 *desc)
{
const char *fmt;
DECL_BUFFER();
vkd3d_descriptor_debug_set_live_status_bit(global_info, cookie);
if (!vkd3d_descriptor_debug_active_log())
return;
APPEND_SNPRINTF("RESOURCE CREATE #%"PRIu64" || ", cookie);
fmt = debug_dxgi_format(desc->Format);
switch (desc->Dimension)
{
case D3D12_RESOURCE_DIMENSION_BUFFER:
APPEND_SNPRINTF("Buffer");
APPEND_SNPRINTF(" || Size = 0x%"PRIx64" bytes", desc->Width);
break;
case D3D12_RESOURCE_DIMENSION_TEXTURE1D:
APPEND_SNPRINTF("Tex1D");
APPEND_SNPRINTF(" || Format = %s || Levels = %u || Layers = %u || Width = %"PRIu64,
fmt, desc->MipLevels, desc->DepthOrArraySize, desc->Width);
break;
case D3D12_RESOURCE_DIMENSION_TEXTURE2D:
APPEND_SNPRINTF("Tex2D");
APPEND_SNPRINTF(" || Format = %s || Levels = %u || Layers = %u || Width = %"PRIu64" || Height = %u",
fmt, desc->MipLevels, desc->DepthOrArraySize, desc->Width, desc->Height);
break;
case D3D12_RESOURCE_DIMENSION_TEXTURE3D:
APPEND_SNPRINTF("Tex3D");
APPEND_SNPRINTF(" || Format = %s || Levels = %u || Width = %"PRIu64" || Height = %u || Depth = %u",
fmt, desc->MipLevels, desc->Width, desc->Height, desc->DepthOrArraySize);
break;
default:
APPEND_SNPRINTF("Unknown dimension");
break;
}
if (desc->Flags & D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS)
APPEND_SNPRINTF(" || UAV");
if (desc->Flags & D3D12_RESOURCE_FLAG_ALLOW_RENDER_TARGET)
APPEND_SNPRINTF(" || RTV");
if (desc->Flags & D3D12_RESOURCE_FLAG_ALLOW_DEPTH_STENCIL)
APPEND_SNPRINTF(" || DSV");
FLUSH_BUFFER();
}
void vkd3d_descriptor_debug_register_allocation_cookie(
struct vkd3d_descriptor_qa_global_info *global_info,
uint64_t cookie, const struct vkd3d_allocate_memory_info *info)
{
D3D12_RESOURCE_DESC1 desc;
memset(&desc, 0, sizeof(desc));
desc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
desc.Width = info->memory_requirements.size;
vkd3d_descriptor_debug_register_resource_cookie(global_info, cookie, &desc);
}
void vkd3d_descriptor_debug_register_view_cookie(
struct vkd3d_descriptor_qa_global_info *global_info,
uint64_t cookie, uint64_t resource_cookie)
{
DECL_BUFFER();
vkd3d_descriptor_debug_set_live_status_bit(global_info, cookie);
if (!vkd3d_descriptor_debug_active_log())
return;
APPEND_SNPRINTF("VIEW CREATE #%"PRIu64" <- RESOURCE #%"PRIu64, cookie, resource_cookie);
FLUSH_BUFFER();
}
void vkd3d_descriptor_debug_unregister_cookie(
struct vkd3d_descriptor_qa_global_info *global_info,
uint64_t cookie)
{
DECL_BUFFER();
/* Don't unset the null descriptor by mistake. */
if (cookie != 0)
vkd3d_descriptor_debug_unset_live_status_bit(global_info, cookie);
if (!vkd3d_descriptor_debug_active_log())
return;
APPEND_SNPRINTF("COOKIE DESTROY #%"PRIu64, cookie);
FLUSH_BUFFER();
}
void vkd3d_descriptor_debug_write_descriptor(struct vkd3d_descriptor_qa_heap_buffer_data *heap, uint64_t heap_cookie,
uint32_t offset, vkd3d_descriptor_qa_flags type_flags, uint64_t cookie)
{
DECL_BUFFER();
if (heap && offset < heap->num_descriptors)
{
/* Should never overflow here except if game is literally spamming allocations every frame and we
* wait around for hours/days.
* This case will trigger warnings either way. */
heap->desc[offset].cookie = cookie <= UINT32_MAX ? (uint32_t)cookie : 0u;
heap->desc[offset].descriptor_type = type_flags;
}
if (!vkd3d_descriptor_debug_active_log())
return;
APPEND_SNPRINTF("WRITE HEAP %"PRIu64" || OFFSET = %u || TYPE = %s || COOKIE = #%"PRIu64,
heap_cookie, offset, debug_descriptor_type(type_flags), cookie);
FLUSH_BUFFER();
}
void vkd3d_descriptor_debug_copy_descriptor(
struct vkd3d_descriptor_qa_heap_buffer_data *dst_heap, uint64_t dst_heap_cookie, uint32_t dst_offset,
struct vkd3d_descriptor_qa_heap_buffer_data *src_heap, uint64_t src_heap_cookie, uint32_t src_offset,
uint64_t cookie)
{
DECL_BUFFER();
if (dst_heap && src_heap && dst_offset < dst_heap->num_descriptors && src_offset < src_heap->num_descriptors)
dst_heap->desc[dst_offset] = src_heap->desc[src_offset];
if (!vkd3d_descriptor_debug_active_log())
return;
APPEND_SNPRINTF("COPY DST HEAP %"PRIu64" || DST OFFSET = %u || COOKIE = #%"PRIu64" || SRC HEAP %"PRIu64" || SRC OFFSET = %u",
dst_heap_cookie, dst_offset, cookie, src_heap_cookie, src_offset);
FLUSH_BUFFER();
}
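
The live-status table used by the set and unset helpers earlier in this file stores one bit per cookie, packed into 32-bit words: word index `cookie / 32`, bit index `cookie & 31`. The sketch below shows that indexing with C11 atomics instead of the `vkd3d_atomic_uint32_*` wrappers. The `live_table_*` names are hypothetical, and keeping cookie 0 permanently set inside the clear helper is a simplification of the check made by the caller in the diff.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Number of 32-bit words needed to hold one bit per cookie. */
static inline uint32_t live_table_word_count(uint32_t num_cookies)
{
    return (num_cookies + 31u) / 32u;
}

/* Mark a cookie live. Returns false if the cookie is out of range. */
static inline bool live_table_set(_Atomic uint32_t *table, uint32_t num_cookies, uint64_t cookie)
{
    assert(table != NULL);
    if (cookie >= num_cookies)
        return false;
    atomic_fetch_or_explicit(&table[cookie / 32], 1u << (cookie & 31), memory_order_relaxed);
    return true;
}

/* Mark a cookie dead. Cookie 0 (the NULL descriptor) always stays live. */
static inline void live_table_clear(_Atomic uint32_t *table, uint32_t num_cookies, uint64_t cookie)
{
    assert(table != NULL);
    if (cookie == 0 || cookie >= num_cookies)
        return;
    atomic_fetch_and_explicit(&table[cookie / 32], ~(1u << (cookie & 31)), memory_order_relaxed);
}

/* Query a cookie's live bit. Out-of-range cookies read as dead. */
static inline bool live_table_test(_Atomic uint32_t *table, uint32_t num_cookies, uint64_t cookie)
{
    assert(table != NULL);
    if (cookie >= num_cookies)
        return false;
    return (atomic_load_explicit(&table[cookie / 32], memory_order_relaxed) >> (cookie & 31)) & 1u;
}
```

Relaxed ordering is enough here because each update touches a single word and nothing else is published through it; whether the host and GPU see the update in time is a separate question that the coherent mapping has to answer.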

File diff suppressed because it is too large.

@@ -215,7 +215,31 @@ static HRESULT STDMETHODCALLTYPE d3d12_device_CreatePipelineState_profiled(d3d12
DEVICE_PROFILED_CALL_HRESULT(CreatePipelineState, iface, desc, riid, pipeline_state);
}
static CONST_VTBL struct ID3D12Device6Vtbl d3d12_device_vtbl_profiled =
static HRESULT STDMETHODCALLTYPE d3d12_device_CreateCommittedResource2_profiled(d3d12_device_iface *iface,
const D3D12_HEAP_PROPERTIES *heap_properties, D3D12_HEAP_FLAGS heap_flags, const D3D12_RESOURCE_DESC1 *desc,
D3D12_RESOURCE_STATES initial_state, const D3D12_CLEAR_VALUE *optimized_clear_value,
ID3D12ProtectedResourceSession *protected_session, REFIID iid, void **resource)
{
DEVICE_PROFILED_CALL_HRESULT(CreateCommittedResource2, iface, heap_properties, heap_flags,
desc, initial_state, optimized_clear_value, protected_session, iid, resource);
}
static HRESULT STDMETHODCALLTYPE d3d12_device_CreatePlacedResource1_profiled(d3d12_device_iface *iface,
ID3D12Heap *heap, UINT64 heap_offset, const D3D12_RESOURCE_DESC1 *desc,
D3D12_RESOURCE_STATES initial_state, const D3D12_CLEAR_VALUE *optimized_clear_value,
REFIID iid, void **resource)
{
DEVICE_PROFILED_CALL_HRESULT(CreatePlacedResource1, iface, heap, heap_offset,
desc, initial_state, optimized_clear_value, iid, resource);
}
static void STDMETHODCALLTYPE d3d12_device_CreateSamplerFeedbackUnorderedAccessView_profiled(d3d12_device_iface *iface,
ID3D12Resource *target_resource, ID3D12Resource *feedback_resource, D3D12_CPU_DESCRIPTOR_HANDLE descriptor)
{
DEVICE_PROFILED_CALL(CreateSamplerFeedbackUnorderedAccessView, iface, target_resource, feedback_resource, descriptor);
}
CONST_VTBL struct ID3D12Device9Vtbl d3d12_device_vtbl_profiled =
{
/* IUnknown methods */
d3d12_device_QueryInterface,
@@ -225,7 +249,7 @@ static CONST_VTBL struct ID3D12Device6Vtbl d3d12_device_vtbl_profiled =
d3d12_device_GetPrivateData,
d3d12_device_SetPrivateData,
d3d12_device_SetPrivateDataInterface,
d3d12_device_SetName,
(void *)d3d12_object_SetName,
/* ID3D12Device methods */
d3d12_device_GetNodeCount,
d3d12_device_CreateCommandQueue,
@@ -292,6 +316,19 @@ static CONST_VTBL struct ID3D12Device6Vtbl d3d12_device_vtbl_profiled =
d3d12_device_CheckDriverMatchingIdentifier,
/* ID3D12Device6 methods */
d3d12_device_SetBackgroundProcessingMode,
/* ID3D12Device7 methods */
d3d12_device_AddToStateObject,
d3d12_device_CreateProtectedResourceSession1,
/* ID3D12Device8 methods */
d3d12_device_GetResourceAllocationInfo2,
d3d12_device_CreateCommittedResource2_profiled,
d3d12_device_CreatePlacedResource1_profiled,
d3d12_device_CreateSamplerFeedbackUnorderedAccessView_profiled,
d3d12_device_GetCopyableFootprints1,
/* ID3D12Device9 methods */
d3d12_device_CreateShaderCacheSession,
d3d12_device_ShaderCacheControl,
d3d12_device_CreateCommandQueue1,
};
#endif

@@ -0,0 +1,234 @@
/*
* Copyright 2021 NVIDIA Corporation
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
#define VKD3D_DBG_CHANNEL VKD3D_DBG_CHANNEL_API
#include "vkd3d_private.h"
static inline struct d3d12_device *d3d12_device_from_ID3D12DeviceExt(ID3D12DeviceExt *iface)
{
return CONTAINING_RECORD(iface, struct d3d12_device, ID3D12DeviceExt_iface);
}
ULONG STDMETHODCALLTYPE d3d12_device_vkd3d_ext_AddRef(ID3D12DeviceExt *iface)
{
struct d3d12_device *device = d3d12_device_from_ID3D12DeviceExt(iface);
return d3d12_device_add_ref(device);
}
static ULONG STDMETHODCALLTYPE d3d12_device_vkd3d_ext_Release(ID3D12DeviceExt *iface)
{
struct d3d12_device *device = d3d12_device_from_ID3D12DeviceExt(iface);
return d3d12_device_release(device);
}
extern HRESULT STDMETHODCALLTYPE d3d12_device_QueryInterface(d3d12_device_iface *iface,
REFIID riid, void **object);
static HRESULT STDMETHODCALLTYPE d3d12_device_vkd3d_ext_QueryInterface(ID3D12DeviceExt *iface,
REFIID iid, void **out)
{
struct d3d12_device *device = d3d12_device_from_ID3D12DeviceExt(iface);
TRACE("iface %p, iid %s, out %p.\n", iface, debugstr_guid(iid), out);
return d3d12_device_QueryInterface(&device->ID3D12Device_iface, iid, out);
}
static HRESULT STDMETHODCALLTYPE d3d12_device_vkd3d_ext_GetVulkanHandles(ID3D12DeviceExt *iface, VkInstance *vk_instance, VkPhysicalDevice *vk_physical_device, VkDevice *vk_device)
{
struct d3d12_device *device = d3d12_device_from_ID3D12DeviceExt(iface);
TRACE("iface %p, vk_instance %p, vk_physical_device %p, vk_device %p.\n", iface, vk_instance, vk_physical_device, vk_device);
if (!vk_device || !vk_instance || !vk_physical_device)
return E_INVALIDARG;
*vk_instance = device->vkd3d_instance->vk_instance;
*vk_physical_device = device->vk_physical_device;
*vk_device = device->vk_device;
return S_OK;
}
static BOOL STDMETHODCALLTYPE d3d12_device_vkd3d_ext_GetExtensionSupport(ID3D12DeviceExt *iface, D3D12_VK_EXTENSION extension)
{
const struct d3d12_device *device = d3d12_device_from_ID3D12DeviceExt(iface);
bool ret_val = false;
TRACE("iface %p, extension %u \n", iface, extension);
switch (extension)
{
case D3D12_VK_NVX_BINARY_IMPORT:
ret_val = device->vk_info.NVX_binary_import;
break;
case D3D12_VK_NVX_IMAGE_VIEW_HANDLE:
ret_val = device->vk_info.NVX_image_view_handle;
break;
default:
WARN("Invalid extension %x\n", extension);
}
return ret_val;
}
static HRESULT STDMETHODCALLTYPE d3d12_device_vkd3d_ext_CreateCubinComputeShaderWithName(ID3D12DeviceExt *iface, const void *cubin_data,
UINT32 cubin_size, UINT32 block_x, UINT32 block_y, UINT32 block_z, const char *shader_name, D3D12_CUBIN_DATA_HANDLE **out_handle)
{
VkCuFunctionCreateInfoNVX functionCreateInfo = { VK_STRUCTURE_TYPE_CU_FUNCTION_CREATE_INFO_NVX };
VkCuModuleCreateInfoNVX moduleCreateInfo = { VK_STRUCTURE_TYPE_CU_MODULE_CREATE_INFO_NVX };
const struct vkd3d_vk_device_procs *vk_procs;
D3D12_CUBIN_DATA_HANDLE *handle;
struct d3d12_device *device;
VkDevice vk_device;
VkResult vr;
TRACE("iface %p, cubin_data %p, cubin_size %u, shader_name %s \n", iface, cubin_data, cubin_size, shader_name);
if (!cubin_data || !cubin_size || !shader_name || !out_handle)
return E_INVALIDARG;
device = d3d12_device_from_ID3D12DeviceExt(iface);
vk_device = device->vk_device;
handle = vkd3d_calloc(1, sizeof(*handle));
if (!handle)
return E_OUTOFMEMORY;
handle->blockX = block_x;
handle->blockY = block_y;
handle->blockZ = block_z;
moduleCreateInfo.pData = cubin_data;
moduleCreateInfo.dataSize = cubin_size;
vk_procs = &device->vk_procs;
if ((vr = VK_CALL(vkCreateCuModuleNVX(vk_device, &moduleCreateInfo, NULL, &handle->vkCuModule))) < 0)
{
ERR("Failed to create cubin shader, vr %d.\n", vr);
vkd3d_free(handle);
return hresult_from_vk_result(vr);
}
functionCreateInfo.module = handle->vkCuModule;
functionCreateInfo.pName = shader_name;
if ((vr = VK_CALL(vkCreateCuFunctionNVX(vk_device, &functionCreateInfo, NULL, &handle->vkCuFunction))) < 0)
{
ERR("Failed to create cubin function module, vr %d.\n", vr);
VK_CALL(vkDestroyCuModuleNVX(vk_device, handle->vkCuModule, NULL));
vkd3d_free(handle);
return hresult_from_vk_result(vr);
}
*out_handle = handle;
return S_OK;
}
static HRESULT STDMETHODCALLTYPE d3d12_device_vkd3d_ext_DestroyCubinComputeShader(ID3D12DeviceExt *iface, D3D12_CUBIN_DATA_HANDLE *handle)
{
const struct vkd3d_vk_device_procs *vk_procs;
struct d3d12_device *device;
VkDevice vk_device;
TRACE("iface %p, handle %p \n", iface, handle);
if (!iface || !handle)
return E_INVALIDARG;
device = d3d12_device_from_ID3D12DeviceExt(iface);
vk_device = device->vk_device;
vk_procs = &device->vk_procs;
VK_CALL(vkDestroyCuFunctionNVX(vk_device, handle->vkCuFunction, NULL));
VK_CALL(vkDestroyCuModuleNVX(vk_device, handle->vkCuModule, NULL));
vkd3d_free(handle);
return S_OK;
}
static HRESULT STDMETHODCALLTYPE d3d12_device_vkd3d_ext_GetCudaTextureObject(ID3D12DeviceExt *iface, D3D12_CPU_DESCRIPTOR_HANDLE srv_handle,
D3D12_CPU_DESCRIPTOR_HANDLE sampler_handle, UINT32 *cuda_texture_handle)
{
VkImageViewHandleInfoNVX imageViewHandleInfo = { VK_STRUCTURE_TYPE_IMAGE_VIEW_HANDLE_INFO_NVX };
const struct vkd3d_vk_device_procs *vk_procs;
struct d3d12_desc_split sampler_desc;
struct d3d12_desc_split srv_desc;
struct d3d12_device *device;
TRACE("iface %p, srv_handle %zu, sampler_handle %zu, cuda_texture_handle %p.\n",
iface, srv_handle.ptr, sampler_handle.ptr, cuda_texture_handle);
if (!cuda_texture_handle)
return E_INVALIDARG;
device = d3d12_device_from_ID3D12DeviceExt(iface);
srv_desc = d3d12_desc_decode_va(srv_handle.ptr);
sampler_desc = d3d12_desc_decode_va(sampler_handle.ptr);
imageViewHandleInfo.imageView = srv_desc.view->info.view->vk_image_view;
imageViewHandleInfo.sampler = sampler_desc.view->info.view->vk_sampler;
imageViewHandleInfo.descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
vk_procs = &device->vk_procs;
*cuda_texture_handle = VK_CALL(vkGetImageViewHandleNVX(device->vk_device, &imageViewHandleInfo));
return S_OK;
}
static HRESULT STDMETHODCALLTYPE d3d12_device_vkd3d_ext_GetCudaSurfaceObject(ID3D12DeviceExt *iface, D3D12_CPU_DESCRIPTOR_HANDLE uav_handle,
UINT32 *cuda_surface_handle)
{
VkImageViewHandleInfoNVX imageViewHandleInfo = { VK_STRUCTURE_TYPE_IMAGE_VIEW_HANDLE_INFO_NVX };
const struct vkd3d_vk_device_procs *vk_procs;
struct d3d12_desc_split uav_desc;
struct d3d12_device *device;
TRACE("iface %p, uav_handle %zu, cuda_surface_handle %p.\n", iface, uav_handle.ptr, cuda_surface_handle);
if (!cuda_surface_handle)
return E_INVALIDARG;
device = d3d12_device_from_ID3D12DeviceExt(iface);
uav_desc = d3d12_desc_decode_va(uav_handle.ptr);
imageViewHandleInfo.imageView = uav_desc.view->info.view->vk_image_view;
imageViewHandleInfo.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE;
vk_procs = &device->vk_procs;
*cuda_surface_handle = VK_CALL(vkGetImageViewHandleNVX(device->vk_device, &imageViewHandleInfo));
return S_OK;
}
extern VKD3D_THREAD_LOCAL struct D3D12_UAV_INFO *d3d12_uav_info;
static HRESULT STDMETHODCALLTYPE d3d12_device_vkd3d_ext_CaptureUAVInfo(ID3D12DeviceExt *iface, D3D12_UAV_INFO *uav_info)
{
if (!uav_info)
return E_INVALIDARG;
TRACE("iface %p, uav_info %p.\n", iface, uav_info);
/* CaptureUAVInfo() is supposed to capture the information from the next CreateUnorderedAccessView() call on the same thread.
We use the d3d12_uav_info pointer to update the information in CreateUnorderedAccessView(). */
d3d12_uav_info = uav_info;
return S_OK;
}
CONST_VTBL struct ID3D12DeviceExtVtbl d3d12_device_vkd3d_ext_vtbl =
{
/* IUnknown methods */
d3d12_device_vkd3d_ext_QueryInterface,
d3d12_device_vkd3d_ext_AddRef,
d3d12_device_vkd3d_ext_Release,
/* ID3D12DeviceExt methods */
d3d12_device_vkd3d_ext_GetVulkanHandles,
d3d12_device_vkd3d_ext_GetExtensionSupport,
d3d12_device_vkd3d_ext_CreateCubinComputeShaderWithName,
d3d12_device_vkd3d_ext_DestroyCubinComputeShader,
d3d12_device_vkd3d_ext_GetCudaTextureObject,
d3d12_device_vkd3d_ext_GetCudaSurfaceObject,
d3d12_device_vkd3d_ext_CaptureUAVInfo
};
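
The capture mechanism above relies on a thread-local pointer that is armed by one call and consumed by a later call on the same thread. The following is a minimal sketch of that pattern, not the project's implementation: the `sketch_` names and the struct fields are hypothetical, and clearing the pointer after use is an assumption that the excerpt does not confirm.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for D3D12_UAV_INFO; the field names are illustrative only. */
struct sketch_uav_info
{
    uint64_t gpu_va_start;
    uint64_t gpu_va_size;
};

/* One pending capture request per thread, in the spirit of the thread-local pointer above. */
static _Thread_local struct sketch_uav_info *sketch_pending_capture;

/* Arms the capture: the next UAV created on this thread reports into *info. */
static int sketch_capture_uav_info(struct sketch_uav_info *info)
{
    if (!info)
        return -1;
    sketch_pending_capture = info;
    return 0;
}

/* Stand-in for the UAV creation path. Resetting the pointer here is an assumption. */
static void sketch_create_uav(uint64_t gpu_va_start, uint64_t gpu_va_size)
{
    if (!sketch_pending_capture)
        return;
    sketch_pending_capture->gpu_va_start = gpu_va_start;
    sketch_pending_capture->gpu_va_size = gpu_va_size;
    sketch_pending_capture = NULL;
}
```

Because the pointer is thread-local, a capture armed on one thread cannot be consumed by a UAV created on another thread, which matches the "same thread" wording in the comment.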

278
libs/vkd3d/heap.c Normal file

@ -0,0 +1,278 @@
/*
* Copyright 2016 Józef Kucia for CodeWeavers
* Copyright 2019 Conor McCarthy for CodeWeavers
* Copyright 2021 Philip Rebohle for Valve Corporation
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
#define VKD3D_DBG_CHANNEL VKD3D_DBG_CHANNEL_API
#include "vkd3d_private.h"
/* ID3D12Heap */
static HRESULT STDMETHODCALLTYPE d3d12_heap_QueryInterface(d3d12_heap_iface *iface,
REFIID iid, void **object)
{
TRACE("iface %p, iid %s, object %p.\n", iface, debugstr_guid(iid), object);
if (IsEqualGUID(iid, &IID_ID3D12Heap)
|| IsEqualGUID(iid, &IID_ID3D12Heap1)
|| IsEqualGUID(iid, &IID_ID3D12Pageable)
|| IsEqualGUID(iid, &IID_ID3D12DeviceChild)
|| IsEqualGUID(iid, &IID_ID3D12Object)
|| IsEqualGUID(iid, &IID_IUnknown))
{
ID3D12Heap_AddRef(iface);
*object = iface;
return S_OK;
}
WARN("%s not implemented, returning E_NOINTERFACE.\n", debugstr_guid(iid));
*object = NULL;
return E_NOINTERFACE;
}
static ULONG STDMETHODCALLTYPE d3d12_heap_AddRef(d3d12_heap_iface *iface)
{
struct d3d12_heap *heap = impl_from_ID3D12Heap1(iface);
ULONG refcount = InterlockedIncrement(&heap->refcount);
TRACE("%p increasing refcount to %u.\n", heap, refcount);
return refcount;
}
static void d3d12_heap_destroy(struct d3d12_heap *heap)
{
TRACE("Destroying heap %p.\n", heap);
vkd3d_free_memory(heap->device, &heap->device->memory_allocator, &heap->allocation);
vkd3d_private_store_destroy(&heap->private_store);
d3d12_device_release(heap->device);
vkd3d_free(heap);
}
static void d3d12_heap_set_name(struct d3d12_heap *heap, const char *name)
{
if (!heap->allocation.chunk)
vkd3d_set_vk_object_name(heap->device, (uint64_t)heap->allocation.device_allocation.vk_memory,
VK_OBJECT_TYPE_DEVICE_MEMORY, name);
}
static ULONG STDMETHODCALLTYPE d3d12_heap_Release(d3d12_heap_iface *iface)
{
struct d3d12_heap *heap = impl_from_ID3D12Heap1(iface);
ULONG refcount = InterlockedDecrement(&heap->refcount);
TRACE("%p decreasing refcount to %u.\n", heap, refcount);
if (!refcount)
d3d12_heap_destroy(heap);
return refcount;
}
static HRESULT STDMETHODCALLTYPE d3d12_heap_GetPrivateData(d3d12_heap_iface *iface,
REFGUID guid, UINT *data_size, void *data)
{
struct d3d12_heap *heap = impl_from_ID3D12Heap1(iface);
TRACE("iface %p, guid %s, data_size %p, data %p.\n", iface, debugstr_guid(guid), data_size, data);
return vkd3d_get_private_data(&heap->private_store, guid, data_size, data);
}
static HRESULT STDMETHODCALLTYPE d3d12_heap_SetPrivateData(d3d12_heap_iface *iface,
REFGUID guid, UINT data_size, const void *data)
{
struct d3d12_heap *heap = impl_from_ID3D12Heap1(iface);
TRACE("iface %p, guid %s, data_size %u, data %p.\n", iface, debugstr_guid(guid), data_size, data);
return vkd3d_set_private_data(&heap->private_store, guid, data_size, data,
(vkd3d_set_name_callback) d3d12_heap_set_name, heap);
}
static HRESULT STDMETHODCALLTYPE d3d12_heap_SetPrivateDataInterface(d3d12_heap_iface *iface,
REFGUID guid, const IUnknown *data)
{
struct d3d12_heap *heap = impl_from_ID3D12Heap1(iface);
TRACE("iface %p, guid %s, data %p.\n", iface, debugstr_guid(guid), data);
return vkd3d_set_private_data_interface(&heap->private_store, guid, data,
(vkd3d_set_name_callback) d3d12_heap_set_name, heap);
}
static HRESULT STDMETHODCALLTYPE d3d12_heap_GetDevice(d3d12_heap_iface *iface, REFIID iid, void **device)
{
struct d3d12_heap *heap = impl_from_ID3D12Heap1(iface);
TRACE("iface %p, iid %s, device %p.\n", iface, debugstr_guid(iid), device);
return d3d12_device_query_interface(heap->device, iid, device);
}
static D3D12_HEAP_DESC * STDMETHODCALLTYPE d3d12_heap_GetDesc(d3d12_heap_iface *iface,
D3D12_HEAP_DESC *desc)
{
struct d3d12_heap *heap = impl_from_ID3D12Heap1(iface);
TRACE("iface %p, desc %p.\n", iface, desc);
*desc = heap->desc;
return desc;
}
static HRESULT STDMETHODCALLTYPE d3d12_heap_GetProtectedResourceSession(d3d12_heap_iface *iface,
REFIID iid, void **protected_session)
{
FIXME("iface %p, iid %s, protected_session %p stub!\n", iface, debugstr_guid(iid), protected_session);
return E_NOTIMPL;
}
CONST_VTBL struct ID3D12Heap1Vtbl d3d12_heap_vtbl =
{
/* IUnknown methods */
d3d12_heap_QueryInterface,
d3d12_heap_AddRef,
d3d12_heap_Release,
/* ID3D12Object methods */
d3d12_heap_GetPrivateData,
d3d12_heap_SetPrivateData,
d3d12_heap_SetPrivateDataInterface,
(void *)d3d12_object_SetName,
/* ID3D12DeviceChild methods */
d3d12_heap_GetDevice,
/* ID3D12Heap methods */
d3d12_heap_GetDesc,
/* ID3D12Heap1 methods */
d3d12_heap_GetProtectedResourceSession,
};
HRESULT d3d12_device_validate_custom_heap_type(struct d3d12_device *device,
const D3D12_HEAP_PROPERTIES *heap_properties)
{
if (heap_properties->Type != D3D12_HEAP_TYPE_CUSTOM)
return S_OK;
if (heap_properties->MemoryPoolPreference == D3D12_MEMORY_POOL_UNKNOWN
|| (heap_properties->MemoryPoolPreference == D3D12_MEMORY_POOL_L1
&& (is_cpu_accessible_heap(heap_properties) || d3d12_device_is_uma(device, NULL))))
{
WARN("Invalid memory pool preference.\n");
return E_INVALIDARG;
}
if (heap_properties->CPUPageProperty == D3D12_CPU_PAGE_PROPERTY_UNKNOWN)
{
WARN("Must have explicit CPU page property for CUSTOM heap type.\n");
return E_INVALIDARG;
}
return S_OK;
}
static HRESULT validate_heap_desc(struct d3d12_device *device, const D3D12_HEAP_DESC *desc)
{
HRESULT hr;
if (!desc->SizeInBytes)
{
WARN("Invalid size %"PRIu64".\n", desc->SizeInBytes);
return E_INVALIDARG;
}
if (desc->Alignment != D3D12_DEFAULT_RESOURCE_PLACEMENT_ALIGNMENT
&& desc->Alignment != D3D12_DEFAULT_MSAA_RESOURCE_PLACEMENT_ALIGNMENT)
{
WARN("Invalid alignment %"PRIu64".\n", desc->Alignment);
return E_INVALIDARG;
}
if (desc->Flags & D3D12_HEAP_FLAG_ALLOW_DISPLAY)
{
WARN("D3D12_HEAP_FLAG_ALLOW_DISPLAY is only for committed resources.\n");
return E_INVALIDARG;
}
if (FAILED(hr = d3d12_device_validate_custom_heap_type(device, &desc->Properties)))
return hr;
return S_OK;
}
static HRESULT d3d12_heap_init(struct d3d12_heap *heap, struct d3d12_device *device,
const D3D12_HEAP_DESC *desc, void* host_address)
{
struct vkd3d_allocate_heap_memory_info alloc_info;
HRESULT hr;
memset(heap, 0, sizeof(*heap));
heap->ID3D12Heap_iface.lpVtbl = &d3d12_heap_vtbl;
heap->refcount = 1;
heap->desc = *desc;
heap->device = device;
if (!heap->desc.Properties.CreationNodeMask)
heap->desc.Properties.CreationNodeMask = 1;
if (!heap->desc.Properties.VisibleNodeMask)
heap->desc.Properties.VisibleNodeMask = 1;
if (!heap->desc.Alignment)
heap->desc.Alignment = D3D12_DEFAULT_RESOURCE_PLACEMENT_ALIGNMENT;
if (FAILED(hr = validate_heap_desc(device, &heap->desc)))
return hr;
alloc_info.heap_desc = heap->desc;
alloc_info.host_ptr = host_address;
alloc_info.extra_allocation_flags = 0;
if (FAILED(hr = vkd3d_private_store_init(&heap->private_store)))
return hr;
if (FAILED(hr = vkd3d_allocate_heap_memory(device,
&device->memory_allocator, &alloc_info, &heap->allocation)))
{
vkd3d_private_store_destroy(&heap->private_store);
return hr;
}
d3d12_device_add_ref(heap->device);
return S_OK;
}
HRESULT d3d12_heap_create(struct d3d12_device *device, const D3D12_HEAP_DESC *desc,
void* host_address, struct d3d12_heap **heap)
{
struct d3d12_heap *object;
HRESULT hr;
if (!(object = vkd3d_malloc(sizeof(*object))))
return E_OUTOFMEMORY;
if (FAILED(hr = d3d12_heap_init(object, device, desc, host_address)))
{
vkd3d_free(object);
return hr;
}
TRACE("Created heap %p.\n", object);
*heap = object;
return S_OK;
}
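
The order of operations in heap.c is: fill in defaults first, then validate the normalized descriptor. A standalone sketch of that flow, assuming a reduced descriptor with only the fields that the size, alignment and flag checks touch, might look like the following. All `sketch_`/`SKETCH_` names are hypothetical, and the flag bit is a stand-in rather than a statement about the real D3D12 header.

```c
#include <stdbool.h>
#include <stdint.h>

/* The two placement alignments accepted by the check: 64 KiB and 4 MiB. */
#define SKETCH_DEFAULT_PLACEMENT_ALIGNMENT      UINT64_C(65536)
#define SKETCH_DEFAULT_MSAA_PLACEMENT_ALIGNMENT UINT64_C(4194304)
/* Stand-in bit for a flag that is only legal on committed resources. */
#define SKETCH_HEAP_FLAG_ALLOW_DISPLAY          0x8u

/* Reduced, hypothetical descriptor: only what the checks below need. */
struct sketch_heap_desc
{
    uint64_t size_in_bytes;
    uint64_t alignment;
    uint32_t flags;
};

/* Step 1: replace a zero alignment with the default before validating. */
static void sketch_heap_desc_normalize(struct sketch_heap_desc *desc)
{
    if (!desc->alignment)
        desc->alignment = SKETCH_DEFAULT_PLACEMENT_ALIGNMENT;
}

/* Step 2: reject empty heaps, unknown alignments and the display flag. */
static bool sketch_heap_desc_is_valid(const struct sketch_heap_desc *desc)
{
    if (!desc->size_in_bytes)
        return false;
    if (desc->alignment != SKETCH_DEFAULT_PLACEMENT_ALIGNMENT
            && desc->alignment != SKETCH_DEFAULT_MSAA_PLACEMENT_ALIGNMENT)
        return false;
    if (desc->flags & SKETCH_HEAP_FLAG_ALLOW_DISPLAY)
        return false;
    return true;
}
```

Normalizing before validating means a caller that passes an alignment of 0 is accepted, while any other non-standard alignment is still rejected.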

1595
libs/vkd3d/memory.c Normal file

File diff suppressed because it is too large


@ -12,8 +12,14 @@ vkd3d_shaders =[
'shaders/cs_clear_uav_image_2d_uint.comp',
'shaders/cs_clear_uav_image_3d_float.comp',
'shaders/cs_clear_uav_image_3d_uint.comp',
'shaders/cs_predicate_command.comp',
'shaders/cs_resolve_binary_queries.comp',
'shaders/cs_resolve_predicate.comp',
'shaders/cs_resolve_query.comp',
'shaders/fs_copy_image_float.frag',
'shaders/fs_copy_image_uint.frag',
'shaders/fs_copy_image_stencil.frag',
'shaders/gs_fullscreen.geom',
'shaders/vs_fullscreen.vert',
@ -21,19 +27,28 @@ vkd3d_shaders =[
'shaders/vs_swapchain_fullscreen.vert',
'shaders/fs_swapchain_fullscreen.frag',
'shaders/cs_execute_indirect_patch.comp',
'shaders/cs_execute_indirect_patch_debug_ring.comp',
]
vkd3d_src = [
'bundle.c',
'cache.c',
'command.c',
'command_list_vkd3d_ext.c',
'device.c',
'device_vkd3d_ext.c',
'heap.c',
'memory.c',
'meta.c',
'platform.c',
'resource.c',
'state.c',
'utils.c',
'debug_ring.c',
'va_map.c',
'vkd3d_main.c',
'raytracing_pipeline.c',
'acceleration_structure.c'
]
if enable_d3d12
@ -44,12 +59,24 @@ if enable_renderdoc
vkd3d_src += ['renderdoc.c']
endif
if enable_descriptor_qa
vkd3d_src += ['descriptor_debug.c']
endif
if enable_breadcrumbs
vkd3d_src += ['breadcrumbs.c']
endif
if vkd3d_platform == 'windows'
vkd3d_src += ['shared_metadata.c']
endif
if not enable_d3d12
vkd3d_lib = shared_library('vkd3d-proton', vkd3d_src, glsl_generator.process(vkd3d_shaders), vkd3d_build, vkd3d_version,
dependencies : [ vkd3d_common_dep, vkd3d_shader_dep ] + vkd3d_extra_libs,
include_directories : vkd3d_private_includes,
install : true,
version : '2.0.0',
version : '3.0.0',
c_args : '-DVKD3D_EXPORTS',
override_options : [ 'c_std='+vkd3d_c_std ])
else

View File

@ -137,68 +137,8 @@ static VkResult vkd3d_meta_create_compute_pipeline(struct d3d12_device *device,
return vr;
}
static VkResult vkd3d_meta_create_render_pass(struct d3d12_device *device, VkSampleCountFlagBits samples,
const struct vkd3d_format *format, VkRenderPass *vk_render_pass)
{
const struct vkd3d_vk_device_procs *vk_procs = &device->vk_procs;
VkAttachmentDescription attachment_info;
VkAttachmentReference attachment_ref;
VkSubpassDescription subpass_info;
VkRenderPassCreateInfo pass_info;
bool has_depth_target;
VkImageLayout layout;
VkResult vr;
assert(format);
has_depth_target = (format->vk_aspect_mask & (VK_IMAGE_ASPECT_DEPTH_BIT | VK_IMAGE_ASPECT_STENCIL_BIT)) != 0;
layout = has_depth_target
? VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL
: VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
attachment_info.flags = 0;
attachment_info.format = format->vk_format;
attachment_info.samples = samples;
attachment_info.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
attachment_info.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
attachment_info.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
attachment_info.stencilStoreOp = VK_ATTACHMENT_STORE_OP_STORE;
attachment_info.initialLayout = layout;
attachment_info.finalLayout = layout;
attachment_ref.attachment = 0;
attachment_ref.layout = layout;
subpass_info.flags = 0;
subpass_info.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
subpass_info.inputAttachmentCount = 0;
subpass_info.pInputAttachments = NULL;
subpass_info.colorAttachmentCount = has_depth_target ? 0 : 1;
subpass_info.pColorAttachments = has_depth_target ? NULL : &attachment_ref;
subpass_info.pResolveAttachments = NULL;
subpass_info.pDepthStencilAttachment = has_depth_target ? &attachment_ref : NULL;
subpass_info.preserveAttachmentCount = 0;
subpass_info.pPreserveAttachments = NULL;
pass_info.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
pass_info.pNext = NULL;
pass_info.flags = 0;
pass_info.attachmentCount = 1;
pass_info.pAttachments = &attachment_info;
pass_info.subpassCount = 1;
pass_info.pSubpasses = &subpass_info;
pass_info.dependencyCount = 0;
pass_info.pDependencies = NULL;
if ((vr = VK_CALL(vkCreateRenderPass(device->vk_device, &pass_info, NULL, vk_render_pass))) < 0)
ERR("Failed to create render pass, vr %d.\n", vr);
return vr;
}
static VkResult vkd3d_meta_create_graphics_pipeline(struct vkd3d_meta_ops *meta_ops,
VkPipelineLayout layout, VkRenderPass render_pass,
VkPipelineLayout layout, VkFormat color_format, VkFormat ds_format, VkImageAspectFlags vk_aspect_mask,
VkShaderModule vs_module, VkShaderModule fs_module,
VkSampleCountFlagBits samples, const VkPipelineDepthStencilStateCreateInfo *ds_state,
const VkPipelineColorBlendStateCreateInfo *cb_state, const VkSpecializationInfo *spec_info,
@ -208,6 +148,7 @@ static VkResult vkd3d_meta_create_graphics_pipeline(struct vkd3d_meta_ops *meta_
VkPipelineShaderStageCreateInfo shader_stages[3];
VkPipelineInputAssemblyStateCreateInfo ia_state;
VkPipelineRasterizationStateCreateInfo rs_state;
VkPipelineRenderingCreateInfoKHR rendering_info;
VkPipelineVertexInputStateCreateInfo vi_state;
VkPipelineMultisampleStateCreateInfo ms_state;
VkPipelineViewportStateCreateInfo vp_state;
@ -274,8 +215,16 @@ static VkResult vkd3d_meta_create_graphics_pipeline(struct vkd3d_meta_ops *meta_
dyn_state.dynamicStateCount = ARRAY_SIZE(dynamic_states);
dyn_state.pDynamicStates = dynamic_states;
rendering_info.sType = VK_STRUCTURE_TYPE_PIPELINE_RENDERING_CREATE_INFO_KHR;
rendering_info.pNext = NULL;
rendering_info.viewMask = 0;
rendering_info.colorAttachmentCount = color_format && (vk_aspect_mask & VK_IMAGE_ASPECT_COLOR_BIT) ? 1 : 0;
rendering_info.pColorAttachmentFormats = color_format ? &color_format : NULL;
rendering_info.depthAttachmentFormat = (vk_aspect_mask & VK_IMAGE_ASPECT_DEPTH_BIT) ? ds_format : VK_FORMAT_UNDEFINED;
rendering_info.stencilAttachmentFormat = (vk_aspect_mask & VK_IMAGE_ASPECT_STENCIL_BIT) ? ds_format : VK_FORMAT_UNDEFINED;
pipeline_info.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO;
pipeline_info.pNext = NULL;
pipeline_info.pNext = &rendering_info;
pipeline_info.flags = 0;
pipeline_info.stageCount = 0;
pipeline_info.pStages = shader_stages;
@ -289,7 +238,7 @@ static VkResult vkd3d_meta_create_graphics_pipeline(struct vkd3d_meta_ops *meta_
pipeline_info.pColorBlendState = cb_state;
pipeline_info.pDynamicState = &dyn_state;
pipeline_info.layout = layout;
pipeline_info.renderPass = render_pass;
pipeline_info.renderPass = VK_NULL_HANDLE;
pipeline_info.subpass = 0;
pipeline_info.basePipelineHandle = VK_NULL_HANDLE;
pipeline_info.basePipelineIndex = -1;
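
With the render pass gone, the hunk above derives the attachment formats for dynamic rendering from the image aspect mask. The selection rule can be sketched without the Vulkan headers as follows. The `SKETCH_` constants and the result struct are hypothetical stand-ins, and unlike the hunk (which keeps a pointer to the color format whenever it is non-zero) this sketch simply reports an undefined color format when no color attachment is used.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the aspect bits and the "undefined" format. */
#define SKETCH_ASPECT_COLOR     0x1u
#define SKETCH_ASPECT_DEPTH     0x2u
#define SKETCH_ASPECT_STENCIL   0x4u
#define SKETCH_FORMAT_UNDEFINED 0u

/* Reduced view of the fields that the rendering info carries. */
struct sketch_rendering_formats
{
    uint32_t color_attachment_count;
    uint32_t color_format;
    uint32_t depth_format;
    uint32_t stencil_format;
};

/* A color attachment is used only if a color format is given and the color
 * aspect is set. Depth and stencil each take ds_format only if their aspect
 * bit is set, otherwise they stay undefined. */
static struct sketch_rendering_formats sketch_select_formats(uint32_t color_format,
        uint32_t ds_format, uint32_t aspect_mask)
{
    struct sketch_rendering_formats formats;

    formats.color_attachment_count =
            (color_format != SKETCH_FORMAT_UNDEFINED && (aspect_mask & SKETCH_ASPECT_COLOR)) ? 1 : 0;
    formats.color_format = formats.color_attachment_count ? color_format : SKETCH_FORMAT_UNDEFINED;
    formats.depth_format = (aspect_mask & SKETCH_ASPECT_DEPTH) ? ds_format : SKETCH_FORMAT_UNDEFINED;
    formats.stencil_format = (aspect_mask & SKETCH_ASPECT_STENCIL) ? ds_format : SKETCH_FORMAT_UNDEFINED;
    return formats;
}
```

A depth-only format therefore leaves the stencil format undefined, which is what lets one helper serve color, depth, stencil and combined depth-stencil targets.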
@ -587,12 +536,30 @@ HRESULT vkd3d_copy_image_ops_init(struct vkd3d_copy_image_ops *meta_copy_image_o
goto fail;
}
if ((vr = vkd3d_meta_create_shader_module(device, SPIRV_CODE(fs_copy_image_float), &meta_copy_image_ops->vk_fs_module)) < 0)
if ((vr = vkd3d_meta_create_shader_module(device, SPIRV_CODE(fs_copy_image_float),
&meta_copy_image_ops->vk_fs_float_module)) < 0)
{
ERR("Failed to create shader modules, vr %d.\n", vr);
goto fail;
}
if ((vr = vkd3d_meta_create_shader_module(device, SPIRV_CODE(fs_copy_image_uint),
&meta_copy_image_ops->vk_fs_uint_module)) < 0)
{
ERR("Failed to create shader modules, vr %d.\n", vr);
goto fail;
}
if (device->vk_info.EXT_shader_stencil_export)
{
if ((vr = vkd3d_meta_create_shader_module(device, SPIRV_CODE(fs_copy_image_stencil),
&meta_copy_image_ops->vk_fs_stencil_module)) < 0)
{
ERR("Failed to create shader modules, vr %d.\n", vr);
goto fail;
}
}
return S_OK;
fail:
@ -610,83 +577,28 @@ void vkd3d_copy_image_ops_cleanup(struct vkd3d_copy_image_ops *meta_copy_image_o
{
struct vkd3d_copy_image_pipeline *pipeline = &meta_copy_image_ops->pipelines[i];
VK_CALL(vkDestroyRenderPass(device->vk_device, pipeline->vk_render_pass, NULL));
VK_CALL(vkDestroyPipeline(device->vk_device, pipeline->vk_pipeline, NULL));
}
VK_CALL(vkDestroyDescriptorSetLayout(device->vk_device, meta_copy_image_ops->vk_set_layout, NULL));
VK_CALL(vkDestroyPipelineLayout(device->vk_device, meta_copy_image_ops->vk_pipeline_layout, NULL));
VK_CALL(vkDestroyShaderModule(device->vk_device, meta_copy_image_ops->vk_fs_module, NULL));
VK_CALL(vkDestroyShaderModule(device->vk_device, meta_copy_image_ops->vk_fs_float_module, NULL));
VK_CALL(vkDestroyShaderModule(device->vk_device, meta_copy_image_ops->vk_fs_uint_module, NULL));
VK_CALL(vkDestroyShaderModule(device->vk_device, meta_copy_image_ops->vk_fs_stencil_module, NULL));
pthread_mutex_destroy(&meta_copy_image_ops->mutex);
vkd3d_free(meta_copy_image_ops->pipelines);
}
static VkResult vkd3d_meta_create_swapchain_render_pass(struct d3d12_device *device,
const struct vkd3d_swapchain_pipeline_key *key, VkRenderPass *render_pass)
{
const struct vkd3d_vk_device_procs *vk_procs = &device->vk_procs;
VkRenderPassCreateInfo render_pass_info;
VkAttachmentDescription attachment_desc;
VkAttachmentReference attachment_ref;
VkSubpassDescription subpass_desc;
VkSubpassDependency subpass_dep;
render_pass_info.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
render_pass_info.pNext = NULL;
render_pass_info.flags = 0;
render_pass_info.attachmentCount = 1;
render_pass_info.pAttachments = &attachment_desc;
render_pass_info.subpassCount = 1;
render_pass_info.pSubpasses = &subpass_desc;
render_pass_info.dependencyCount = 1;
render_pass_info.pDependencies = &subpass_dep;
attachment_desc.loadOp = key->load_op;
attachment_desc.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
attachment_desc.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
attachment_desc.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
attachment_desc.format = key->format;
attachment_desc.samples = VK_SAMPLE_COUNT_1_BIT;
attachment_desc.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
attachment_desc.finalLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR;
attachment_desc.flags = 0;
attachment_ref.attachment = 0;
attachment_ref.layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
memset(&subpass_desc, 0, sizeof(subpass_desc));
subpass_desc.colorAttachmentCount = 1;
subpass_desc.pColorAttachments = &attachment_ref;
subpass_desc.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
subpass_dep.srcSubpass = VK_SUBPASS_EXTERNAL;
subpass_dep.dstSubpass = 0;
subpass_dep.srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
subpass_dep.dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
subpass_dep.srcAccessMask = 0;
subpass_dep.dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
subpass_dep.dependencyFlags = 0;
return VK_CALL(vkCreateRenderPass(device->vk_device, &render_pass_info, NULL, render_pass));
}
static HRESULT vkd3d_meta_create_swapchain_pipeline(struct vkd3d_meta_ops *meta_ops,
const struct vkd3d_swapchain_pipeline_key *key, struct vkd3d_swapchain_pipeline *pipeline)
{
const struct vkd3d_vk_device_procs *vk_procs = &meta_ops->device->vk_procs;
struct vkd3d_swapchain_ops *meta_swapchain_ops = &meta_ops->swapchain;
VkPipelineColorBlendAttachmentState blend_att;
VkPipelineColorBlendStateCreateInfo cb_state;
VkResult vr;
if ((vr = vkd3d_meta_create_swapchain_render_pass(meta_ops->device, key, &pipeline->vk_render_pass)))
{
ERR("Failed to create render pass, vr %d.\n", vr);
return hresult_from_vk_result(vr);
}
memset(&cb_state, 0, sizeof(cb_state));
memset(&blend_att, 0, sizeof(blend_att));
cb_state.sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO;
@ -699,14 +611,11 @@ static HRESULT vkd3d_meta_create_swapchain_pipeline(struct vkd3d_meta_ops *meta_
VK_COLOR_COMPONENT_A_BIT;
if ((vr = vkd3d_meta_create_graphics_pipeline(meta_ops,
meta_swapchain_ops->vk_pipeline_layouts[key->filter], pipeline->vk_render_pass,
meta_swapchain_ops->vk_pipeline_layouts[key->filter], key->format, VK_FORMAT_UNDEFINED, VK_IMAGE_ASPECT_COLOR_BIT,
meta_swapchain_ops->vk_vs_module, meta_swapchain_ops->vk_fs_module, 1,
NULL, &cb_state,
NULL, &pipeline->vk_pipeline)) < 0)
{
VK_CALL(vkDestroyRenderPass(meta_ops->device->vk_device, pipeline->vk_render_pass, NULL));
return hresult_from_vk_result(vr);
}
pipeline->key = *key;
return S_OK;
@ -715,12 +624,12 @@ static HRESULT vkd3d_meta_create_swapchain_pipeline(struct vkd3d_meta_ops *meta_
static HRESULT vkd3d_meta_create_copy_image_pipeline(struct vkd3d_meta_ops *meta_ops,
const struct vkd3d_copy_image_pipeline_key *key, struct vkd3d_copy_image_pipeline *pipeline)
{
const struct vkd3d_vk_device_procs *vk_procs = &meta_ops->device->vk_procs;
struct vkd3d_copy_image_ops *meta_copy_image_ops = &meta_ops->copy_image;
VkPipelineColorBlendAttachmentState blend_attachment;
VkPipelineDepthStencilStateCreateInfo ds_state;
VkPipelineColorBlendStateCreateInfo cb_state;
VkSpecializationInfo spec_info;
VkShaderModule vk_module;
bool has_depth_target;
VkResult vr;
@ -759,13 +668,30 @@ static HRESULT vkd3d_meta_create_copy_image_pipeline(struct vkd3d_meta_ops *meta
ds_state.sType = VK_STRUCTURE_TYPE_PIPELINE_DEPTH_STENCIL_STATE_CREATE_INFO;
ds_state.pNext = NULL;
ds_state.flags = 0;
ds_state.depthTestEnable = VK_TRUE;
ds_state.depthWriteEnable = VK_TRUE;
ds_state.depthTestEnable = (key->dst_aspect_mask & VK_IMAGE_ASPECT_DEPTH_BIT) ? VK_TRUE : VK_FALSE;
ds_state.depthWriteEnable = ds_state.depthTestEnable;
ds_state.depthCompareOp = VK_COMPARE_OP_ALWAYS;
ds_state.depthBoundsTestEnable = VK_FALSE;
ds_state.stencilTestEnable = VK_FALSE;
memset(&ds_state.front, 0, sizeof(ds_state.front));
memset(&ds_state.back, 0, sizeof(ds_state.back));
if (key->dst_aspect_mask & VK_IMAGE_ASPECT_STENCIL_BIT)
{
ds_state.stencilTestEnable = VK_TRUE;
ds_state.front.reference = 0;
ds_state.front.writeMask = 0xff;
ds_state.front.compareMask = 0xff;
ds_state.front.passOp = VK_STENCIL_OP_REPLACE;
ds_state.front.failOp = VK_STENCIL_OP_KEEP;
ds_state.front.depthFailOp = VK_STENCIL_OP_KEEP;
ds_state.front.compareOp = VK_COMPARE_OP_ALWAYS;
ds_state.back = ds_state.front;
}
else
{
ds_state.stencilTestEnable = VK_FALSE;
memset(&ds_state.front, 0, sizeof(ds_state.front));
memset(&ds_state.back, 0, sizeof(ds_state.back));
}
ds_state.minDepthBounds = 0.0f;
ds_state.maxDepthBounds = 1.0f;
@ -784,19 +710,32 @@ static HRESULT vkd3d_meta_create_copy_image_pipeline(struct vkd3d_meta_ops *meta
cb_state.pAttachments = &blend_attachment;
memset(&cb_state.blendConstants, 0, sizeof(cb_state.blendConstants));
if ((vr = vkd3d_meta_create_render_pass(meta_ops->device,
key->sample_count, key->format, &pipeline->vk_render_pass)) < 0)
return hresult_from_vk_result(vr);
/* Special path when copying stencil -> color. */
if (key->format->vk_format == VK_FORMAT_R8_UINT)
{
/* Special path when copying stencil -> color. */
vk_module = meta_copy_image_ops->vk_fs_uint_module;
}
else if (key->dst_aspect_mask == VK_IMAGE_ASPECT_STENCIL_BIT)
{
/* FragStencilRef path. */
vk_module = meta_copy_image_ops->vk_fs_stencil_module;
}
else
{
/* Depth or float color path. */
vk_module = meta_copy_image_ops->vk_fs_float_module;
}
if ((vr = vkd3d_meta_create_graphics_pipeline(meta_ops,
meta_copy_image_ops->vk_pipeline_layout, pipeline->vk_render_pass,
VK_NULL_HANDLE, meta_copy_image_ops->vk_fs_module, key->sample_count,
meta_copy_image_ops->vk_pipeline_layout,
has_depth_target ? VK_FORMAT_UNDEFINED : key->format->vk_format,
has_depth_target ? key->format->vk_format : VK_FORMAT_UNDEFINED,
key->format->vk_aspect_mask,
VK_NULL_HANDLE, vk_module, key->sample_count,
has_depth_target ? &ds_state : NULL, has_depth_target ? NULL : &cb_state,
&spec_info, &pipeline->vk_pipeline)) < 0)
{
VK_CALL(vkDestroyRenderPass(meta_ops->device->vk_device, pipeline->vk_render_pass, NULL));
return hresult_from_vk_result(vr);
}
pipeline->key = *key;
return S_OK;
@ -826,7 +765,6 @@ HRESULT vkd3d_meta_get_copy_image_pipeline(struct vkd3d_meta_ops *meta_ops,
if (!memcmp(key, &pipeline->key, sizeof(*key)))
{
info->vk_render_pass = pipeline->vk_render_pass;
info->vk_pipeline = pipeline->vk_pipeline;
pthread_mutex_unlock(&meta_copy_image_ops->mutex);
return S_OK;
@ -848,7 +786,6 @@ HRESULT vkd3d_meta_get_copy_image_pipeline(struct vkd3d_meta_ops *meta_ops,
return hr;
}
info->vk_render_pass = pipeline->vk_render_pass;
info->vk_pipeline = pipeline->vk_pipeline;
pthread_mutex_unlock(&meta_copy_image_ops->mutex);
@ -870,23 +807,32 @@ VkImageViewType vkd3d_meta_get_copy_image_view_type(D3D12_RESOURCE_DIMENSION dim
}
const struct vkd3d_format *vkd3d_meta_get_copy_image_attachment_format(struct vkd3d_meta_ops *meta_ops,
const struct vkd3d_format *dst_format, const struct vkd3d_format *src_format)
const struct vkd3d_format *dst_format, const struct vkd3d_format *src_format,
VkImageAspectFlags dst_aspect, VkImageAspectFlags src_aspect)
{
DXGI_FORMAT dxgi_format = DXGI_FORMAT_UNKNOWN;
if (dst_format->vk_aspect_mask & VK_IMAGE_ASPECT_DEPTH_BIT)
if (dst_aspect & (VK_IMAGE_ASPECT_DEPTH_BIT | VK_IMAGE_ASPECT_STENCIL_BIT))
return dst_format;
assert(src_format->vk_aspect_mask & VK_IMAGE_ASPECT_DEPTH_BIT);
assert(src_aspect & (VK_IMAGE_ASPECT_DEPTH_BIT | VK_IMAGE_ASPECT_STENCIL_BIT));
switch (src_format->vk_format)
{
case VK_FORMAT_D16_UNORM:
dxgi_format = DXGI_FORMAT_R16_UNORM;
break;
case VK_FORMAT_D16_UNORM_S8_UINT:
dxgi_format = (src_aspect & VK_IMAGE_ASPECT_DEPTH_BIT) ?
DXGI_FORMAT_R16_UNORM : DXGI_FORMAT_R8_UINT;
break;
case VK_FORMAT_D32_SFLOAT:
dxgi_format = DXGI_FORMAT_R32_FLOAT;
break;
case VK_FORMAT_D32_SFLOAT_S8_UINT:
dxgi_format = (src_aspect & VK_IMAGE_ASPECT_DEPTH_BIT) ?
DXGI_FORMAT_R32_FLOAT : DXGI_FORMAT_R8_UINT;
break;
default:
ERR("Unhandled format %u.\n", src_format->vk_format);
return NULL;
@ -1000,7 +946,6 @@ void vkd3d_swapchain_ops_cleanup(struct vkd3d_swapchain_ops *meta_swapchain_ops,
{
struct vkd3d_swapchain_pipeline *pipeline = &meta_swapchain_ops->pipelines[i];
VK_CALL(vkDestroyRenderPass(device->vk_device, pipeline->vk_render_pass, NULL));
VK_CALL(vkDestroyPipeline(device->vk_device, pipeline->vk_pipeline, NULL));
}
@ -1041,7 +986,6 @@ HRESULT vkd3d_meta_get_swapchain_pipeline(struct vkd3d_meta_ops *meta_ops,
if (!memcmp(key, &pipeline->key, sizeof(*key)))
{
info->vk_render_pass = pipeline->vk_render_pass;
info->vk_pipeline = pipeline->vk_pipeline;
pthread_mutex_unlock(&meta_swapchain_ops->mutex);
return S_OK;
@ -1063,13 +1007,354 @@ HRESULT vkd3d_meta_get_swapchain_pipeline(struct vkd3d_meta_ops *meta_ops,
return hr;
}
info->vk_render_pass = pipeline->vk_render_pass;
info->vk_pipeline = pipeline->vk_pipeline;
pthread_mutex_unlock(&meta_swapchain_ops->mutex);
return S_OK;
}
HRESULT vkd3d_query_ops_init(struct vkd3d_query_ops *meta_query_ops,
struct d3d12_device *device)
{
VkPushConstantRange push_constant_range;
VkSpecializationInfo spec_info;
uint32_t field_count;
VkResult vr;
static const VkDescriptorSetLayoutBinding gather_bindings[] =
{
{ 0, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, 1, VK_SHADER_STAGE_COMPUTE_BIT },
{ 1, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, 1, VK_SHADER_STAGE_COMPUTE_BIT },
{ 2, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, 1, VK_SHADER_STAGE_COMPUTE_BIT },
};
static const VkDescriptorSetLayoutBinding resolve_bindings[] =
{
{ 0, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, 1, VK_SHADER_STAGE_COMPUTE_BIT },
{ 1, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, 1, VK_SHADER_STAGE_COMPUTE_BIT },
};
static const VkSpecializationMapEntry spec_map = { 0, 0, sizeof(uint32_t) };
if ((vr = vkd3d_meta_create_descriptor_set_layout(device,
ARRAY_SIZE(gather_bindings), gather_bindings,
&meta_query_ops->vk_gather_set_layout)) < 0)
goto fail;
push_constant_range.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
push_constant_range.offset = 0;
push_constant_range.size = sizeof(struct vkd3d_query_gather_args);
if ((vr = vkd3d_meta_create_pipeline_layout(device, 1, &meta_query_ops->vk_gather_set_layout,
1, &push_constant_range, &meta_query_ops->vk_gather_pipeline_layout)) < 0)
goto fail;
spec_info.mapEntryCount = 1;
spec_info.pMapEntries = &spec_map;
spec_info.dataSize = sizeof(field_count);
spec_info.pData = &field_count;
field_count = 1;
if ((vr = vkd3d_meta_create_compute_pipeline(device, sizeof(cs_resolve_query), cs_resolve_query,
meta_query_ops->vk_gather_pipeline_layout, &spec_info, &meta_query_ops->vk_gather_occlusion_pipeline)) < 0)
goto fail;
field_count = 2;
if ((vr = vkd3d_meta_create_compute_pipeline(device, sizeof(cs_resolve_query), cs_resolve_query,
meta_query_ops->vk_gather_pipeline_layout, &spec_info, &meta_query_ops->vk_gather_so_statistics_pipeline)) < 0)
goto fail;
push_constant_range.size = sizeof(struct vkd3d_query_resolve_args);
if ((vr = vkd3d_meta_create_descriptor_set_layout(device,
ARRAY_SIZE(resolve_bindings), resolve_bindings,
&meta_query_ops->vk_resolve_set_layout)) < 0)
goto fail;
if ((vr = vkd3d_meta_create_pipeline_layout(device, 1, &meta_query_ops->vk_resolve_set_layout,
1, &push_constant_range, &meta_query_ops->vk_resolve_pipeline_layout)) < 0)
goto fail;
if ((vr = vkd3d_meta_create_compute_pipeline(device, sizeof(cs_resolve_binary_queries), cs_resolve_binary_queries,
meta_query_ops->vk_resolve_pipeline_layout, NULL, &meta_query_ops->vk_resolve_binary_pipeline)) < 0)
goto fail;
return S_OK;
fail:
vkd3d_query_ops_cleanup(meta_query_ops, device);
return hresult_from_vk_result(vr);
}
void vkd3d_query_ops_cleanup(struct vkd3d_query_ops *meta_query_ops,
struct d3d12_device *device)
{
const struct vkd3d_vk_device_procs *vk_procs = &device->vk_procs;
VK_CALL(vkDestroyPipeline(device->vk_device, meta_query_ops->vk_gather_occlusion_pipeline, NULL));
VK_CALL(vkDestroyPipeline(device->vk_device, meta_query_ops->vk_gather_so_statistics_pipeline, NULL));
VK_CALL(vkDestroyPipelineLayout(device->vk_device, meta_query_ops->vk_gather_pipeline_layout, NULL));
VK_CALL(vkDestroyDescriptorSetLayout(device->vk_device, meta_query_ops->vk_gather_set_layout, NULL));
VK_CALL(vkDestroyDescriptorSetLayout(device->vk_device, meta_query_ops->vk_resolve_set_layout, NULL));
VK_CALL(vkDestroyPipelineLayout(device->vk_device, meta_query_ops->vk_resolve_pipeline_layout, NULL));
VK_CALL(vkDestroyPipeline(device->vk_device, meta_query_ops->vk_resolve_binary_pipeline, NULL));
}
bool vkd3d_meta_get_query_gather_pipeline(struct vkd3d_meta_ops *meta_ops,
D3D12_QUERY_HEAP_TYPE heap_type, struct vkd3d_query_gather_info *info)
{
const struct vkd3d_query_ops *query_ops = &meta_ops->query;
info->vk_set_layout = query_ops->vk_gather_set_layout;
info->vk_pipeline_layout = query_ops->vk_gather_pipeline_layout;
switch (heap_type)
{
case D3D12_QUERY_HEAP_TYPE_OCCLUSION:
info->vk_pipeline = query_ops->vk_gather_occlusion_pipeline;
return true;
case D3D12_QUERY_HEAP_TYPE_SO_STATISTICS:
info->vk_pipeline = query_ops->vk_gather_so_statistics_pipeline;
return true;
default:
ERR("No pipeline for query heap type %u.\n", heap_type);
return false;
}
}
HRESULT vkd3d_predicate_ops_init(struct vkd3d_predicate_ops *meta_predicate_ops,
struct d3d12_device *device)
{
VkPushConstantRange push_constant_range;
VkSpecializationInfo spec_info;
VkResult vr;
size_t i;
static const struct spec_data
{
uint32_t arg_count;
VkBool32 arg_indirect;
}
spec_data[] =
{
{ 4, VK_FALSE }, /* VKD3D_PREDICATE_OP_DRAW */
{ 5, VK_FALSE }, /* VKD3D_PREDICATE_OP_DRAW_INDEXED */
{ 1, VK_FALSE }, /* VKD3D_PREDICATE_OP_DRAW_INDIRECT */
{ 1, VK_TRUE }, /* VKD3D_PREDICATE_OP_DRAW_INDIRECT_COUNT */
{ 3, VK_FALSE }, /* VKD3D_PREDICATE_OP_DISPATCH */
{ 3, VK_TRUE }, /* VKD3D_PREDICATE_OP_DISPATCH_INDIRECT */
};
static const VkSpecializationMapEntry spec_map[] =
{
{ 0, offsetof(struct spec_data, arg_count), sizeof(uint32_t) },
{ 1, offsetof(struct spec_data, arg_indirect), sizeof(VkBool32) },
};
memset(meta_predicate_ops, 0, sizeof(*meta_predicate_ops));
push_constant_range.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
push_constant_range.offset = 0;
push_constant_range.size = sizeof(struct vkd3d_predicate_command_args);
if ((vr = vkd3d_meta_create_pipeline_layout(device, 0, NULL, 1,
&push_constant_range, &meta_predicate_ops->vk_command_pipeline_layout)) < 0)
return hresult_from_vk_result(vr);
push_constant_range.size = sizeof(struct vkd3d_predicate_resolve_args);
if ((vr = vkd3d_meta_create_pipeline_layout(device, 0, NULL, 1,
&push_constant_range, &meta_predicate_ops->vk_resolve_pipeline_layout)) < 0)
return hresult_from_vk_result(vr);
spec_info.mapEntryCount = ARRAY_SIZE(spec_map);
spec_info.pMapEntries = spec_map;
spec_info.dataSize = sizeof(struct spec_data);
for (i = 0; i < ARRAY_SIZE(spec_data); i++)
{
spec_info.pData = &spec_data[i];
if ((vr = vkd3d_meta_create_compute_pipeline(device, sizeof(cs_predicate_command), cs_predicate_command,
meta_predicate_ops->vk_command_pipeline_layout, &spec_info, &meta_predicate_ops->vk_command_pipelines[i])) < 0)
goto fail;
meta_predicate_ops->data_sizes[i] = spec_data[i].arg_count * sizeof(uint32_t);
}
if ((vr = vkd3d_meta_create_compute_pipeline(device, sizeof(cs_resolve_predicate), cs_resolve_predicate,
meta_predicate_ops->vk_resolve_pipeline_layout, &spec_info, &meta_predicate_ops->vk_resolve_pipeline)) < 0)
goto fail;
return S_OK;
fail:
vkd3d_predicate_ops_cleanup(meta_predicate_ops, device);
return hresult_from_vk_result(vr);
}
void vkd3d_predicate_ops_cleanup(struct vkd3d_predicate_ops *meta_predicate_ops,
struct d3d12_device *device)
{
const struct vkd3d_vk_device_procs *vk_procs = &device->vk_procs;
size_t i;
for (i = 0; i < VKD3D_PREDICATE_COMMAND_COUNT; i++)
VK_CALL(vkDestroyPipeline(device->vk_device, meta_predicate_ops->vk_command_pipelines[i], NULL));
VK_CALL(vkDestroyPipeline(device->vk_device, meta_predicate_ops->vk_resolve_pipeline, NULL));
VK_CALL(vkDestroyPipelineLayout(device->vk_device, meta_predicate_ops->vk_command_pipeline_layout, NULL));
VK_CALL(vkDestroyPipelineLayout(device->vk_device, meta_predicate_ops->vk_resolve_pipeline_layout, NULL));
}
void vkd3d_meta_get_predicate_pipeline(struct vkd3d_meta_ops *meta_ops,
enum vkd3d_predicate_command_type command_type, struct vkd3d_predicate_command_info *info)
{
const struct vkd3d_predicate_ops *predicate_ops = &meta_ops->predicate;
info->vk_pipeline_layout = predicate_ops->vk_command_pipeline_layout;
info->vk_pipeline = predicate_ops->vk_command_pipelines[command_type];
info->data_size = predicate_ops->data_sizes[command_type];
}
HRESULT vkd3d_execute_indirect_ops_init(struct vkd3d_execute_indirect_ops *meta_indirect_ops,
struct d3d12_device *device)
{
VkPushConstantRange push_constant_range;
VkResult vr;
int rc;
if ((rc = pthread_mutex_init(&meta_indirect_ops->mutex, NULL)))
return hresult_from_errno(rc);
push_constant_range.offset = 0;
push_constant_range.size = sizeof(struct vkd3d_execute_indirect_args);
push_constant_range.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
if ((vr = vkd3d_meta_create_pipeline_layout(device, 0, NULL, 1,
&push_constant_range, &meta_indirect_ops->vk_pipeline_layout)) < 0)
{
pthread_mutex_destroy(&meta_indirect_ops->mutex);
return hresult_from_vk_result(vr);
}
meta_indirect_ops->pipelines_count = 0;
meta_indirect_ops->pipelines_size = 0;
meta_indirect_ops->pipelines = NULL;
return S_OK;
}
struct vkd3d_meta_execute_indirect_spec_constant_data
{
struct vkd3d_shader_debug_ring_spec_constants constants;
uint32_t workgroup_size_x;
};
HRESULT vkd3d_meta_get_execute_indirect_pipeline(struct vkd3d_meta_ops *meta_ops,
uint32_t patch_command_count, struct vkd3d_execute_indirect_info *info)
{
struct vkd3d_meta_execute_indirect_spec_constant_data execute_indirect_spec_constants;
VkSpecializationMapEntry map_entry[VKD3D_SHADER_DEBUG_RING_SPEC_INFO_MAP_ENTRIES + 1];
struct vkd3d_execute_indirect_ops *meta_indirect_ops = &meta_ops->execute_indirect;
struct vkd3d_shader_debug_ring_spec_info debug_ring_info;
VkSpecializationInfo spec;
HRESULT hr = S_OK;
VkResult vr;
bool debug;
size_t i;
int rc;
if ((rc = pthread_mutex_lock(&meta_indirect_ops->mutex)))
{
ERR("Failed to lock mutex, error %d.\n", rc);
return hresult_from_errno(rc);
}
for (i = 0; i < meta_indirect_ops->pipelines_count; i++)
{
if (meta_indirect_ops->pipelines[i].workgroup_size_x == patch_command_count)
{
info->vk_pipeline_layout = meta_indirect_ops->vk_pipeline_layout;
info->vk_pipeline = meta_indirect_ops->pipelines[i].vk_pipeline;
goto out;
}
}
debug = meta_ops->device->debug_ring.active;
/* If we have debug ring, we can dump indirect command buffer data to the ring as well.
* Vital for debugging broken execute indirect data with templates. */
if (debug)
{
vkd3d_shader_debug_ring_init_spec_constant(meta_ops->device, &debug_ring_info,
0 /* Reserve this hash for internal debug streams. */);
memset(&execute_indirect_spec_constants, 0, sizeof(execute_indirect_spec_constants));
execute_indirect_spec_constants.constants = debug_ring_info.constants;
execute_indirect_spec_constants.workgroup_size_x = patch_command_count;
memcpy(map_entry, debug_ring_info.map_entries, sizeof(debug_ring_info.map_entries));
map_entry[VKD3D_SHADER_DEBUG_RING_SPEC_INFO_MAP_ENTRIES].constantID = 4;
map_entry[VKD3D_SHADER_DEBUG_RING_SPEC_INFO_MAP_ENTRIES].offset =
offsetof(struct vkd3d_meta_execute_indirect_spec_constant_data, workgroup_size_x);
map_entry[VKD3D_SHADER_DEBUG_RING_SPEC_INFO_MAP_ENTRIES].size = sizeof(patch_command_count);
spec.pMapEntries = map_entry;
spec.pData = &execute_indirect_spec_constants;
spec.mapEntryCount = ARRAY_SIZE(map_entry);
spec.dataSize = sizeof(execute_indirect_spec_constants);
}
else
{
map_entry[0].constantID = 0;
map_entry[0].offset = 0;
map_entry[0].size = sizeof(patch_command_count);
spec.pMapEntries = map_entry;
spec.pData = &patch_command_count;
spec.mapEntryCount = 1;
spec.dataSize = sizeof(patch_command_count);
}
vkd3d_array_reserve((void**)&meta_indirect_ops->pipelines, &meta_indirect_ops->pipelines_size,
meta_indirect_ops->pipelines_count + 1, sizeof(*meta_indirect_ops->pipelines));
meta_indirect_ops->pipelines[meta_indirect_ops->pipelines_count].workgroup_size_x = patch_command_count;
vr = vkd3d_meta_create_compute_pipeline(meta_ops->device,
debug ? sizeof(cs_execute_indirect_patch_debug_ring) : sizeof(cs_execute_indirect_patch),
debug ? cs_execute_indirect_patch_debug_ring : cs_execute_indirect_patch,
meta_indirect_ops->vk_pipeline_layout, &spec,
&meta_indirect_ops->pipelines[meta_indirect_ops->pipelines_count].vk_pipeline);
if (vr)
{
hr = hresult_from_vk_result(vr);
goto out;
}
info->vk_pipeline_layout = meta_indirect_ops->vk_pipeline_layout;
info->vk_pipeline = meta_indirect_ops->pipelines[meta_indirect_ops->pipelines_count].vk_pipeline;
meta_indirect_ops->pipelines_count++;
out:
pthread_mutex_unlock(&meta_indirect_ops->mutex);
return hr;
}
void vkd3d_execute_indirect_ops_cleanup(struct vkd3d_execute_indirect_ops *meta_indirect_ops,
struct d3d12_device *device)
{
const struct vkd3d_vk_device_procs *vk_procs = &device->vk_procs;
size_t i;
for (i = 0; i < meta_indirect_ops->pipelines_count; i++)
VK_CALL(vkDestroyPipeline(device->vk_device, meta_indirect_ops->pipelines[i].vk_pipeline, NULL));
VK_CALL(vkDestroyPipelineLayout(device->vk_device, meta_indirect_ops->vk_pipeline_layout, NULL));
pthread_mutex_destroy(&meta_indirect_ops->mutex);
}
HRESULT vkd3d_meta_ops_init(struct vkd3d_meta_ops *meta_ops, struct d3d12_device *device)
{
HRESULT hr;
@ -1089,8 +1374,23 @@ HRESULT vkd3d_meta_ops_init(struct vkd3d_meta_ops *meta_ops, struct d3d12_device
if (FAILED(hr = vkd3d_swapchain_ops_init(&meta_ops->swapchain, device)))
goto fail_swapchain_ops;
if (FAILED(hr = vkd3d_query_ops_init(&meta_ops->query, device)))
goto fail_query_ops;
if (FAILED(hr = vkd3d_predicate_ops_init(&meta_ops->predicate, device)))
goto fail_predicate_ops;
if (FAILED(hr = vkd3d_execute_indirect_ops_init(&meta_ops->execute_indirect, device)))
goto fail_execute_indirect_ops;
return S_OK;
fail_execute_indirect_ops:
vkd3d_predicate_ops_cleanup(&meta_ops->predicate, device);
fail_predicate_ops:
vkd3d_query_ops_cleanup(&meta_ops->query, device);
fail_query_ops:
vkd3d_swapchain_ops_cleanup(&meta_ops->swapchain, device);
fail_swapchain_ops:
vkd3d_copy_image_ops_cleanup(&meta_ops->copy_image, device);
fail_copy_image_ops:
@ -1103,6 +1403,9 @@ fail_common:
HRESULT vkd3d_meta_ops_cleanup(struct vkd3d_meta_ops *meta_ops, struct d3d12_device *device)
{
vkd3d_execute_indirect_ops_cleanup(&meta_ops->execute_indirect, device);
vkd3d_predicate_ops_cleanup(&meta_ops->predicate, device);
vkd3d_query_ops_cleanup(&meta_ops->query, device);
vkd3d_swapchain_ops_cleanup(&meta_ops->swapchain, device);
vkd3d_copy_image_ops_cleanup(&meta_ops->copy_image, device);
vkd3d_clear_uav_ops_cleanup(&meta_ops->clear_uav, device);

File diff suppressed because it is too large.

@ -42,6 +42,7 @@ static vkd3d_shader_hash_t renderdoc_capture_shader_hash;
static uint32_t *renderdoc_capture_counts;
static size_t renderdoc_capture_counts_count;
static bool vkd3d_renderdoc_is_active;
static bool vkd3d_renderdoc_global_capture;
static void vkd3d_renderdoc_init_capture_count_list(const char *env)
{
@ -49,6 +50,13 @@ static void vkd3d_renderdoc_init_capture_count_list(const char *env)
uint32_t count;
char *endp;
if (strcmp(env, "-1") == 0)
{
INFO("Doing one big capture of the entire lifetime of a device.\n");
vkd3d_renderdoc_global_capture = true;
return;
}
while (*env != '\0')
{
errno = 0;
@ -92,9 +100,9 @@ static bool vkd3d_renderdoc_enable_submit_counter(uint32_t counter)
static void vkd3d_renderdoc_init_once(void)
{
char counts[VKD3D_PATH_MAX];
pRENDERDOC_GetAPI get_api;
const char *counts;
const char *env;
char env[VKD3D_PATH_MAX];
#ifdef _WIN32
HMODULE renderdoc;
@ -104,19 +112,19 @@ static void vkd3d_renderdoc_init_once(void)
void *fn_ptr;
#endif
env = getenv("VKD3D_AUTO_CAPTURE_SHADER");
counts = getenv("VKD3D_AUTO_CAPTURE_COUNTS");
vkd3d_get_env_var("VKD3D_AUTO_CAPTURE_SHADER", env, sizeof(env));
vkd3d_get_env_var("VKD3D_AUTO_CAPTURE_COUNTS", counts, sizeof(counts));
if (!env && !counts)
if (strlen(env) == 0 && strlen(counts) == 0)
{
WARN("VKD3D_AUTO_CAPTURE_SHADER or VKD3D_AUTO_CAPTURE_COUNTS is not set, RenderDoc auto capture will not be enabled.\n");
return;
}
if (!counts)
if (strlen(counts) == 0)
WARN("VKD3D_AUTO_CAPTURE_COUNTS is not set, will assume that only the first submission is captured.\n");
if (env)
if (strlen(env) > 0)
renderdoc_capture_shader_hash = strtoull(env, NULL, 16);
if (renderdoc_capture_shader_hash)
@ -124,7 +132,7 @@ static void vkd3d_renderdoc_init_once(void)
else
INFO("Enabling RenderDoc capture for all shaders.\n");
if (counts)
if (strlen(counts) > 0)
vkd3d_renderdoc_init_capture_count_list(counts);
else
{
@ -180,6 +188,11 @@ bool vkd3d_renderdoc_active(void)
return vkd3d_renderdoc_is_active;
}
bool vkd3d_renderdoc_global_capture_enabled(void)
{
return vkd3d_renderdoc_global_capture;
}
bool vkd3d_renderdoc_should_capture_shader_hash(vkd3d_shader_hash_t hash)
{
return (renderdoc_capture_shader_hash == hash) || (renderdoc_capture_shader_hash == 0);
@ -190,9 +203,12 @@ bool vkd3d_renderdoc_begin_capture(void *instance)
static uint32_t overall_counter;
uint32_t counter;
counter = vkd3d_atomic_uint32_increment(&overall_counter, vkd3d_memory_order_relaxed) - 1;
if (!vkd3d_renderdoc_enable_submit_counter(counter))
return false;
if (!vkd3d_renderdoc_global_capture)
{
counter = vkd3d_atomic_uint32_increment(&overall_counter, vkd3d_memory_order_relaxed) - 1;
if (!vkd3d_renderdoc_enable_submit_counter(counter))
return false;
}
if (renderdoc_api)
renderdoc_api->StartFrameCapture(RENDERDOC_DEVICEPOINTER_FROM_VKINSTANCE(instance), NULL);
@ -215,11 +231,14 @@ void vkd3d_renderdoc_command_list_check_capture(struct d3d12_command_list *list,
{
unsigned int i;
if (vkd3d_renderdoc_global_capture_enabled())
return;
if (vkd3d_renderdoc_active() && state)
{
if (state->vk_bind_point == VK_PIPELINE_BIND_POINT_COMPUTE)
{
if (vkd3d_renderdoc_should_capture_shader_hash(state->compute.meta.hash))
if (vkd3d_renderdoc_should_capture_shader_hash(state->compute.code.meta.hash))
{
WARN("Triggering RenderDoc capture for this command list.\n");
list->debug_capture = true;
@ -229,7 +248,7 @@ void vkd3d_renderdoc_command_list_check_capture(struct d3d12_command_list *list,
{
for (i = 0; i < state->graphics.stage_count; i++)
{
if (vkd3d_renderdoc_should_capture_shader_hash(state->graphics.stage_meta[i].hash))
if (vkd3d_renderdoc_should_capture_shader_hash(state->graphics.code[i].meta.hash))
{
WARN("Triggering RenderDoc capture for this command list.\n");
list->debug_capture = true;
@ -246,6 +265,9 @@ bool vkd3d_renderdoc_command_queue_begin_capture(struct d3d12_command_queue *com
VkDebugUtilsLabelEXT capture_label;
bool debug_capture;
if (vkd3d_renderdoc_global_capture_enabled())
return false;
debug_capture = vkd3d_renderdoc_begin_capture(command_queue->device->vkd3d_instance->vk_instance);
if (debug_capture && !vkd3d_renderdoc_loaded_api())
{
@ -273,6 +295,9 @@ void vkd3d_renderdoc_command_queue_end_capture(struct d3d12_command_queue *comma
const struct vkd3d_vk_device_procs *vk_procs = &command_queue->device->vk_procs;
VkDebugUtilsLabelEXT capture_label;
if (vkd3d_renderdoc_global_capture_enabled())
return;
if (!vkd3d_renderdoc_loaded_api())
{
/* Magic fallback which lets us bridge the Wine barrier over to Linux RenderDoc. */

File diff suppressed because it is too large.

@ -0,0 +1,67 @@
#version 450
#extension GL_EXT_buffer_reference : require
#extension GL_EXT_buffer_reference_uvec2 : require
layout(local_size_x_id = 0) in;
struct Command
{
uint type;
uint src_offset;
uint dst_offset;
};
layout(buffer_reference, std430, buffer_reference_align = 4) readonly buffer Commands
{
Command commands[];
};
layout(buffer_reference, std430, buffer_reference_align = 4) readonly buffer SrcBuffer {
uint values[];
};
layout(buffer_reference, std430, buffer_reference_align = 4) writeonly buffer DstBuffer {
uint values[];
};
layout(buffer_reference, std430, buffer_reference_align = 4) readonly buffer IndirectCount {
uint count;
};
layout(buffer_reference, std430, buffer_reference_align = 4) writeonly buffer IndirectCountWrite {
uint count;
};
layout(push_constant) uniform Registers
{
Commands commands_va;
SrcBuffer src_buffer_va;
DstBuffer dst_buffer_va;
uvec2 indirect_count_va;
IndirectCountWrite dst_indirect_count_va;
uint src_stride;
uint dst_stride;
};
void main()
{
Command cmd = commands_va.commands[gl_LocalInvocationIndex];
uint draw_id = gl_WorkGroupID.x;
uint max_draws = gl_NumWorkGroups.x;
if (any(notEqual(indirect_count_va, uvec2(0))))
{
max_draws = min(max_draws, IndirectCount(indirect_count_va).count);
if (gl_WorkGroupID.x == 0u)
dst_indirect_count_va.count = max_draws;
}
if (draw_id < max_draws)
{
uint src_offset = src_stride * draw_id + cmd.src_offset;
uint dst_offset = dst_stride * draw_id + cmd.dst_offset;
uint src_value = src_buffer_va.values[src_offset];
dst_buffer_va.values[dst_offset] = src_value;
}
}

@ -0,0 +1,83 @@
#version 450
#extension GL_EXT_buffer_reference : require
#extension GL_EXT_buffer_reference_uvec2 : require
#extension GL_GOOGLE_include_directive : require
#include "../../../include/shader-debug/debug_channel.h"
layout(local_size_x_id = 4) in;
struct Command
{
uint type;
uint src_offset;
uint dst_offset;
};
layout(buffer_reference, std430, buffer_reference_align = 4) readonly buffer Commands
{
Command commands[];
};
layout(buffer_reference, std430, buffer_reference_align = 4) readonly buffer SrcBuffer {
uint values[];
};
layout(buffer_reference, std430, buffer_reference_align = 4) writeonly buffer DstBuffer {
uint values[];
};
layout(buffer_reference, std430, buffer_reference_align = 4) readonly buffer IndirectCount {
uint count;
};
layout(buffer_reference, std430, buffer_reference_align = 4) writeonly buffer IndirectCountWrite {
uint count;
};
layout(push_constant) uniform Registers
{
Commands commands_va;
SrcBuffer src_buffer_va;
DstBuffer dst_buffer_va;
uvec2 indirect_count_va;
IndirectCountWrite dst_indirect_count_va;
uint src_stride;
uint dst_stride;
// Debug metadata here
uint debug_tag;
uint implicit_instance;
};
void main()
{
if (debug_tag != 0u)
DEBUG_CHANNEL_INIT_IMPLICIT_INSTANCE(uvec3(debug_tag, gl_WorkGroupID.x, gl_LocalInvocationIndex), implicit_instance);
Command cmd = commands_va.commands[gl_LocalInvocationIndex];
uint draw_id = gl_WorkGroupID.x;
uint max_draws = gl_NumWorkGroups.x;
if (any(notEqual(indirect_count_va, uvec2(0))))
{
max_draws = min(max_draws, IndirectCount(indirect_count_va).count);
if (gl_WorkGroupID.x == 0u)
dst_indirect_count_va.count = max_draws;
}
if (debug_tag != 0u && gl_WorkGroupID.x == 0)
DEBUG_CHANNEL_MSG_UNIFORM(int(max_draws), int(gl_NumWorkGroups.x));
if (draw_id < max_draws)
{
uint src_offset = src_stride * draw_id + cmd.src_offset;
uint dst_offset = dst_stride * draw_id + cmd.dst_offset;
uint src_value = src_buffer_va.values[src_offset];
if (debug_tag != 0u)
DEBUG_CHANNEL_MSG(cmd.type, dst_offset, src_offset, src_value);
dst_buffer_va.values[dst_offset] = src_value;
}
}

@ -0,0 +1,40 @@
#version 450
#extension GL_EXT_buffer_reference : require
layout(local_size_x = 1) in;
layout(constant_id = 0) const uint c_arg_count = 0;
layout(constant_id = 1) const bool c_arg_indirect = false;
layout(std430, buffer_reference, buffer_reference_align = 4)
readonly buffer predicate_t {
uint data;
};
layout(std430, buffer_reference, buffer_reference_align = 4)
readonly buffer src_args_t {
uint data[];
};
layout(std430, buffer_reference, buffer_reference_align = 4)
writeonly buffer dst_args_t {
uint data[];
};
layout(push_constant)
uniform u_info_t {
predicate_t predicate;
src_args_t src_args;
dst_args_t dst_args;
uint cmd_args[5];
};
void main() {
bool do_exec = predicate.data != 0;
for (uint i = 0; i < c_arg_count; i++) {
uint arg = c_arg_indirect ? src_args.data[i] : cmd_args[i];
dst_args.data[i] = do_exec ? arg : 0u;
}
}
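
The predicate patch shader above can be read as the following CPU-side model. This is only a sketch for illustration: `predicate_patch_args` is a hypothetical helper that does not exist in vkd3d-proton, and plain pointers stand in for the buffer device addresses passed through push constants.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU model of the predicate patch shader (not vkd3d-proton code).
 * arg_count and arg_indirect play the role of the two specialization constants. */
static void predicate_patch_args(uint32_t predicate, bool arg_indirect,
        const uint32_t *src_args, const uint32_t *cmd_args,
        uint32_t *dst_args, size_t arg_count)
{
    bool do_exec = predicate != 0;
    size_t i;

    for (i = 0; i < arg_count; i++)
    {
        /* Indirect commands read their arguments from a buffer,
         * direct commands carry them in the push constant block. */
        uint32_t arg = arg_indirect ? src_args[i] : cmd_args[i];

        /* A failed predicate zeroes every argument, turning the command into a no-op. */
        dst_args[i] = do_exec ? arg : 0u;
    }
}
```

With a predicate of 0, a direct draw's four arguments all become 0, so the patched indirect draw renders nothing.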

@ -0,0 +1,29 @@
#version 450
#extension GL_ARB_gpu_shader_int64 : require
layout(local_size_x = 64) in;
layout(std430, binding = 0)
writeonly buffer dst_buffer_t {
uint64_t dst_queries[];
};
layout(std430, binding = 1)
readonly buffer src_buffer_t {
uint64_t src_queries[];
};
layout(push_constant)
uniform u_info_t {
uint dst_index;
uint src_index;
uint query_count;
};
void main() {
uint thread_id = gl_GlobalInvocationID.x;
if (thread_id < query_count)
dst_queries[dst_index + thread_id] = min(src_queries[src_index + thread_id], uint64_t(1u));
}

@ -0,0 +1,26 @@
#version 450
#extension GL_EXT_buffer_reference : require
layout(local_size_x = 1) in;
layout(std430, buffer_reference, buffer_reference_align = 8)
readonly buffer src_predicate_t {
uvec2 data;
};
layout(std430, buffer_reference, buffer_reference_align = 4)
writeonly buffer dst_predicate_t {
uint data;
};
layout(push_constant)
uniform u_info_t {
src_predicate_t src;
dst_predicate_t dst;
bool invert;
};
void main() {
dst.data = (all(equal(src.data, 0u.xx)) != invert) ? 0u : 1u;
}
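
The one-line resolve above is easy to misread, so here is a minimal C sketch of the same truth table. `resolve_predicate` is a hypothetical name used only for illustration, and the two 32-bit halves of the source predicate are treated as a single 64-bit value, which is equivalent for the all-zero test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CPU model of the predicate resolve shader (not vkd3d-proton code).
 * Returns the 32-bit value the shader would store in the destination. */
static uint32_t resolve_predicate(uint64_t src, bool invert)
{
    bool is_zero = src == 0;

    /* Mirrors (all(equal(src.data, 0u.xx)) != invert) ? 0u : 1u */
    return (is_zero != invert) ? 0u : 1u;
}
```

Without `invert`, a zero source resolves to 0 and any non-zero source resolves to 1; with `invert` set, the two outcomes swap.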

@ -0,0 +1,62 @@
#version 450
#extension GL_ARB_gpu_shader_int64 : require
layout(local_size_x = 64) in;
layout(constant_id = 0) const uint c_field_count = 1;
layout(std430, binding = 0)
buffer dst_queries_t {
uint64_t dst_queries[];
};
layout(std430, binding = 1)
readonly buffer src_queries_t {
uint64_t src_queries[];
};
struct query_map_entry_t {
uint dst_index;
uint src_index;
uint next;
};
layout(std430, binding = 2)
readonly buffer query_map_t {
query_map_entry_t entries[];
} map;
layout(push_constant)
uniform u_info_t {
uint query_count;
uint entry_offset;
};
void main() {
uint thread_id = gl_GlobalInvocationID.x;
if (thread_id >= query_count)
return;
// The query map is an array of linked lists, with the
// first query_count entries guaranteed to be list heads
query_map_entry_t entry = map.entries[thread_id + entry_offset];
uint64_t dst_data[c_field_count];
// By copying the first query we get the reset for free
for (uint i = 0; i < c_field_count; i++)
dst_data[i] = src_queries[c_field_count * entry.src_index + i];
// Accumulate data from additional queries
while (entry.next != ~0u) {
entry = map.entries[entry.next + entry_offset];
for (uint i = 0; i < c_field_count; i++)
dst_data[i] += src_queries[c_field_count * entry.src_index + i];
}
// dst_index has the same value for all entries in the list
for (uint i = 0; i < c_field_count; i++)
dst_queries[c_field_count * entry.dst_index + i] = dst_data[i];
}
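
The linked-list walk in the query resolve shader can be sketched on the CPU as follows. This is an illustrative model under stated assumptions, not vkd3d-proton code: `resolve_query`, `struct query_map_entry` and the `MAX_FIELDS` cap are hypothetical names, and one call stands in for one shader invocation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of query_map_entry_t from the shader. */
struct query_map_entry
{
    uint32_t dst_index;
    uint32_t src_index;
    uint32_t next; /* ~0u terminates the list */
};

/* Assumed upper bound for this sketch; the shader sizes its array from c_field_count. */
#define MAX_FIELDS 8

/* Models one invocation: thread_id selects a list head, field_count must be <= MAX_FIELDS. */
static void resolve_query(const struct query_map_entry *entries, uint32_t entry_offset,
        uint32_t thread_id, uint32_t field_count,
        const uint64_t *src_queries, uint64_t *dst_queries)
{
    struct query_map_entry entry = entries[thread_id + entry_offset];
    uint64_t acc[MAX_FIELDS];
    uint32_t i;

    /* Copying the first query overwrites any stale value, so no separate reset is needed. */
    for (i = 0; i < field_count; i++)
        acc[i] = src_queries[field_count * entry.src_index + i];

    /* Accumulate the remaining queries in the list. */
    while (entry.next != ~0u)
    {
        entry = entries[entry.next + entry_offset];
        for (i = 0; i < field_count; i++)
            acc[i] += src_queries[field_count * entry.src_index + i];
    }

    /* Every entry in a list shares the same dst_index, so the last one read is fine. */
    for (i = 0; i < field_count; i++)
        dst_queries[field_count * entry.dst_index + i] = acc[i];
}
```

The first `query_count` entries act as list heads, so calling this once per `thread_id` in `[0, query_count)` resolves every destination query exactly once.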

@ -0,0 +1,28 @@
#version 450
#extension GL_EXT_samplerless_texture_functions : enable
#extension GL_ARB_shader_stencil_export : enable
#define MODE_1D 0
#define MODE_2D 1
#define MODE_MS 2
layout(constant_id = 0) const uint c_mode = MODE_2D;
layout(binding = 0) uniform utexture1DArray tex_1d;
layout(binding = 0) uniform utexture2DArray tex_2d;
layout(binding = 0) uniform utexture2DMSArray tex_ms;
layout(push_constant)
uniform u_info_t {
ivec2 offset;
} u_info;
void main() {
ivec3 coord = ivec3(u_info.offset + ivec2(gl_FragCoord.xy), gl_Layer);
uint value;
if (c_mode == MODE_1D) value = texelFetch(tex_1d, coord.xz, 0).r;
if (c_mode == MODE_2D) value = texelFetch(tex_2d, coord, 0).r;
if (c_mode == MODE_MS) value = texelFetch(tex_ms, coord, gl_SampleID).r;
gl_FragStencilRefARB = int(value);
}

@ -0,0 +1,29 @@
#version 450
#extension GL_EXT_samplerless_texture_functions : enable
#define MODE_1D 0
#define MODE_2D 1
#define MODE_MS 2
layout(constant_id = 0) const uint c_mode = MODE_2D;
layout(binding = 0) uniform utexture1DArray tex_1d;
layout(binding = 0) uniform utexture2DArray tex_2d;
layout(binding = 0) uniform utexture2DMSArray tex_ms;
layout(location = 0) out uint o_color;
layout(push_constant)
uniform u_info_t {
ivec2 offset;
} u_info;
void main() {
ivec3 coord = ivec3(u_info.offset + ivec2(gl_FragCoord.xy), gl_Layer);
uint value;
if (c_mode == MODE_1D) value = texelFetch(tex_1d, coord.xz, 0).r;
if (c_mode == MODE_2D) value = texelFetch(tex_2d, coord, 0).r;
if (c_mode == MODE_MS) value = texelFetch(tex_ms, coord, gl_SampleID).r;
o_color = value;
}

@ -0,0 +1,68 @@
/*
* Copyright 2021 Derek Lesho for Codeweavers
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
#define VKD3D_DBG_CHANNEL VKD3D_DBG_CHANNEL_API
#include "vkd3d_private.h"
#include "winioctl.h"
#define IOCTL_SHARED_GPU_RESOURCE_SET_METADATA CTL_CODE(FILE_DEVICE_VIDEO, 4, METHOD_BUFFERED, FILE_WRITE_ACCESS)
#define IOCTL_SHARED_GPU_RESOURCE_GET_METADATA CTL_CODE(FILE_DEVICE_VIDEO, 5, METHOD_BUFFERED, FILE_READ_ACCESS)
#define IOCTL_SHARED_GPU_RESOURCE_OPEN CTL_CODE(FILE_DEVICE_VIDEO, 1, METHOD_BUFFERED, FILE_WRITE_ACCESS)
bool vkd3d_set_shared_metadata(HANDLE handle, void *buf, uint32_t buf_size)
{
DWORD ret_size;
return DeviceIoControl(handle, IOCTL_SHARED_GPU_RESOURCE_SET_METADATA, buf, buf_size, NULL, 0, &ret_size, NULL);
}
bool vkd3d_get_shared_metadata(HANDLE handle, void *buf, uint32_t buf_size, uint32_t *metadata_size)
{
DWORD ret_size;
bool ret = DeviceIoControl(handle, IOCTL_SHARED_GPU_RESOURCE_GET_METADATA, NULL, 0, buf, buf_size, &ret_size, NULL);
if (metadata_size)
*metadata_size = ret_size;
return ret;
}
HANDLE vkd3d_open_kmt_handle(HANDLE kmt_handle)
{
struct
{
unsigned int kmt_handle;
/* Placeholder for a variable-length string: the struct is allocated dynamically with a larger name when opening an object by name. */
WCHAR name[1];
} shared_resource_open;
HANDLE nt_handle = CreateFileA("\\\\.\\SharedGpuResource", GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (nt_handle == INVALID_HANDLE_VALUE)
return nt_handle;
shared_resource_open.kmt_handle = (ULONG_PTR)kmt_handle;
shared_resource_open.name[0] = 0;
if (!DeviceIoControl(nt_handle, IOCTL_SHARED_GPU_RESOURCE_OPEN, &shared_resource_open, sizeof(shared_resource_open), NULL, 0, NULL, NULL))
{
CloseHandle(nt_handle);
return INVALID_HANDLE_VALUE;
}
return nt_handle;
}

File diff suppressed because it is too large.

File diff suppressed because it is too large.

@ -22,6 +22,8 @@
#include <errno.h>
#define VKD3D_MAX_DXGI_FORMAT DXGI_FORMAT_B4G4R4A4_UNORM
#define COLOR (VK_IMAGE_ASPECT_COLOR_BIT)
#define DEPTH (VK_IMAGE_ASPECT_DEPTH_BIT)
#define STENCIL (VK_IMAGE_ASPECT_STENCIL_BIT)
@ -95,6 +97,8 @@ static const struct vkd3d_format vkd3d_formats[] =
{DXGI_FORMAT_B8G8R8X8_TYPELESS, VK_FORMAT_B8G8R8A8_UNORM, 4, 1, 1, 1, COLOR, 1},
{DXGI_FORMAT_B8G8R8X8_UNORM_SRGB, VK_FORMAT_B8G8R8A8_SRGB, 4, 1, 1, 1, COLOR, 1},
{DXGI_FORMAT_R9G9B9E5_SHAREDEXP, VK_FORMAT_E5B9G9R9_UFLOAT_PACK32, 4, 1, 1, 1, COLOR, 1},
{DXGI_FORMAT_B5G6R5_UNORM, VK_FORMAT_R5G6B5_UNORM_PACK16, 2, 1, 1, 1, COLOR, 1},
{DXGI_FORMAT_B5G5R5A1_UNORM, VK_FORMAT_A1R5G5B5_UNORM_PACK16, 2, 1, 1, 1, COLOR, 1},
{DXGI_FORMAT_BC1_TYPELESS, VK_FORMAT_BC1_RGBA_UNORM_BLOCK, 1, 4, 4, 8, COLOR, 1, TYPELESS},
{DXGI_FORMAT_BC1_UNORM, VK_FORMAT_BC1_RGBA_UNORM_BLOCK, 1, 4, 4, 8, COLOR, 1},
{DXGI_FORMAT_BC1_UNORM_SRGB, VK_FORMAT_BC1_RGBA_SRGB_BLOCK, 1, 4, 4, 8, COLOR, 1},
@ -116,19 +120,26 @@ static const struct vkd3d_format vkd3d_formats[] =
{DXGI_FORMAT_BC7_TYPELESS, VK_FORMAT_BC7_UNORM_BLOCK, 1, 4, 4, 16, COLOR, 1, TYPELESS},
{DXGI_FORMAT_BC7_UNORM, VK_FORMAT_BC7_UNORM_BLOCK, 1, 4, 4, 16, COLOR, 1},
{DXGI_FORMAT_BC7_UNORM_SRGB, VK_FORMAT_BC7_SRGB_BLOCK, 1, 4, 4, 16, COLOR, 1},
{DXGI_FORMAT_B4G4R4A4_UNORM, VK_FORMAT_A4R4G4B4_UNORM_PACK16_EXT,2, 1, 1, 1, COLOR, 1},
};
static const struct vkd3d_format_footprint depth_stencil_copy_footprints[] =
{
{ DXGI_FORMAT_R32_TYPELESS, 1, 1, 4, 0, 0 },
{ DXGI_FORMAT_R8_TYPELESS, 1, 1, 1, 0, 0 },
};
/* Each depth/stencil format is only compatible with itself in Vulkan. */
static const struct vkd3d_format vkd3d_depth_stencil_formats[] =
{
{DXGI_FORMAT_R32G8X24_TYPELESS, VK_FORMAT_D32_SFLOAT_S8_UINT, 8, 1, 1, 1, DEPTH_STENCIL, 2, TYPELESS},
{DXGI_FORMAT_D32_FLOAT_S8X24_UINT, VK_FORMAT_D32_SFLOAT_S8_UINT, 8, 1, 1, 1, DEPTH_STENCIL, 2},
{DXGI_FORMAT_R32G8X24_TYPELESS, VK_FORMAT_D32_SFLOAT_S8_UINT, 8, 1, 1, 1, DEPTH_STENCIL, 2, TYPELESS, false, depth_stencil_copy_footprints},
{DXGI_FORMAT_D32_FLOAT_S8X24_UINT, VK_FORMAT_D32_SFLOAT_S8_UINT, 8, 1, 1, 1, DEPTH_STENCIL, 2, 0, false, depth_stencil_copy_footprints},
{DXGI_FORMAT_R32_FLOAT_X8X24_TYPELESS, VK_FORMAT_D32_SFLOAT_S8_UINT, 8, 1, 1, 1, DEPTH, 2},
{DXGI_FORMAT_X32_TYPELESS_G8X24_UINT, VK_FORMAT_D32_SFLOAT_S8_UINT, 8, 1, 1, 1, STENCIL, 2},
{DXGI_FORMAT_R32_TYPELESS, VK_FORMAT_D32_SFLOAT, 4, 1, 1, 1, DEPTH, 1, TYPELESS},
{DXGI_FORMAT_R32_FLOAT, VK_FORMAT_D32_SFLOAT, 4, 1, 1, 1, DEPTH, 1},
{DXGI_FORMAT_R24G8_TYPELESS, VK_FORMAT_D24_UNORM_S8_UINT, 4, 1, 1, 1, DEPTH_STENCIL, 2, TYPELESS},
{DXGI_FORMAT_D24_UNORM_S8_UINT, VK_FORMAT_D24_UNORM_S8_UINT, 4, 1, 1, 1, DEPTH_STENCIL, 2},
{DXGI_FORMAT_R24G8_TYPELESS, VK_FORMAT_D24_UNORM_S8_UINT, 4, 1, 1, 1, DEPTH_STENCIL, 2, TYPELESS, false, depth_stencil_copy_footprints},
{DXGI_FORMAT_D24_UNORM_S8_UINT, VK_FORMAT_D24_UNORM_S8_UINT, 4, 1, 1, 1, DEPTH_STENCIL, 2, 0, false, depth_stencil_copy_footprints},
{DXGI_FORMAT_R24_UNORM_X8_TYPELESS, VK_FORMAT_D24_UNORM_S8_UINT, 4, 1, 1, 1, DEPTH, 2},
{DXGI_FORMAT_X24_TYPELESS_G8_UINT, VK_FORMAT_D24_UNORM_S8_UINT, 4, 1, 1, 1, STENCIL, 2},
{DXGI_FORMAT_R16_TYPELESS, VK_FORMAT_D16_UNORM, 2, 1, 1, 1, DEPTH, 1, TYPELESS},
@ -142,133 +153,258 @@ static const struct vkd3d_format vkd3d_depth_stencil_formats[] =
#undef SINT
#undef UINT
static const struct vkd3d_format_compatibility_info
static const struct dxgi_format_compatibility_list
{
DXGI_FORMAT format;
DXGI_FORMAT typeless_format;
DXGI_FORMAT image_format;
DXGI_FORMAT view_formats[VKD3D_MAX_COMPATIBLE_FORMAT_COUNT];
DXGI_FORMAT uint_format; /* for ClearUAVUint */
}
vkd3d_format_compatibility_info[] =
dxgi_format_compatibility_list[] =
{
/* DXGI_FORMAT_R32G32B32A32_TYPELESS */
{DXGI_FORMAT_R32G32B32A32_UINT, DXGI_FORMAT_R32G32B32A32_TYPELESS},
{DXGI_FORMAT_R32G32B32A32_SINT, DXGI_FORMAT_R32G32B32A32_TYPELESS},
{DXGI_FORMAT_R32G32B32A32_FLOAT, DXGI_FORMAT_R32G32B32A32_TYPELESS},
/* DXGI_FORMAT_R32G32B32_TYPELESS */
{DXGI_FORMAT_R32G32B32_UINT, DXGI_FORMAT_R32G32B32_TYPELESS},
{DXGI_FORMAT_R32G32B32_SINT, DXGI_FORMAT_R32G32B32_TYPELESS},
{DXGI_FORMAT_R32G32B32_FLOAT, DXGI_FORMAT_R32G32B32_TYPELESS},
/* DXGI_FORMAT_R16G16B16A16_TYPELESS */
{DXGI_FORMAT_R16G16B16A16_UNORM, DXGI_FORMAT_R16G16B16A16_TYPELESS},
{DXGI_FORMAT_R16G16B16A16_SNORM, DXGI_FORMAT_R16G16B16A16_TYPELESS},
{DXGI_FORMAT_R16G16B16A16_UINT, DXGI_FORMAT_R16G16B16A16_TYPELESS},
{DXGI_FORMAT_R16G16B16A16_SINT, DXGI_FORMAT_R16G16B16A16_TYPELESS},
{DXGI_FORMAT_R16G16B16A16_FLOAT, DXGI_FORMAT_R16G16B16A16_TYPELESS},
/* DXGI_FORMAT_R32G32_TYPELESS */
{DXGI_FORMAT_R32G32_UINT, DXGI_FORMAT_R32G32_TYPELESS},
{DXGI_FORMAT_R32G32_SINT, DXGI_FORMAT_R32G32_TYPELESS},
{DXGI_FORMAT_R32G32_FLOAT, DXGI_FORMAT_R32G32_TYPELESS},
/* DXGI_FORMAT_R32G8X24_TYPELESS */
{DXGI_FORMAT_R32_FLOAT_X8X24_TYPELESS, DXGI_FORMAT_R32G8X24_TYPELESS},
{DXGI_FORMAT_X32_TYPELESS_G8X24_UINT, DXGI_FORMAT_R32G8X24_TYPELESS},
{DXGI_FORMAT_D32_FLOAT_S8X24_UINT, DXGI_FORMAT_R32G8X24_TYPELESS},
/* DXGI_FORMAT_R10G10B10A2_TYPELESS */
{DXGI_FORMAT_R10G10B10A2_UINT, DXGI_FORMAT_R10G10B10A2_TYPELESS},
{DXGI_FORMAT_R10G10B10A2_UNORM, DXGI_FORMAT_R10G10B10A2_TYPELESS},
/* DXGI_FORMAT_R8G8B8A8_TYPELESS */
{DXGI_FORMAT_R8G8B8A8_UINT, DXGI_FORMAT_R8G8B8A8_TYPELESS},
{DXGI_FORMAT_R8G8B8A8_SINT, DXGI_FORMAT_R8G8B8A8_TYPELESS},
{DXGI_FORMAT_R8G8B8A8_SNORM, DXGI_FORMAT_R8G8B8A8_TYPELESS},
{DXGI_FORMAT_R8G8B8A8_UNORM_SRGB, DXGI_FORMAT_R8G8B8A8_TYPELESS},
{DXGI_FORMAT_R8G8B8A8_UNORM, DXGI_FORMAT_R8G8B8A8_TYPELESS},
/* DXGI_FORMAT_R16G16_TYPELESS */
{DXGI_FORMAT_R16G16_UNORM, DXGI_FORMAT_R16G16_TYPELESS},
{DXGI_FORMAT_R16G16_SNORM, DXGI_FORMAT_R16G16_TYPELESS},
{DXGI_FORMAT_R16G16_UINT, DXGI_FORMAT_R16G16_TYPELESS},
{DXGI_FORMAT_R16G16_SINT, DXGI_FORMAT_R16G16_TYPELESS},
{DXGI_FORMAT_R16G16_FLOAT, DXGI_FORMAT_R16G16_TYPELESS},
/* DXGI_FORMAT_R32_TYPELESS */
{DXGI_FORMAT_D32_FLOAT, DXGI_FORMAT_R32_TYPELESS},
{DXGI_FORMAT_R32_FLOAT, DXGI_FORMAT_R32_TYPELESS},
{DXGI_FORMAT_R32_UINT, DXGI_FORMAT_R32_TYPELESS},
{DXGI_FORMAT_R32_SINT, DXGI_FORMAT_R32_TYPELESS},
/* DXGI_FORMAT_R24G8_TYPELESS */
{DXGI_FORMAT_R24_UNORM_X8_TYPELESS, DXGI_FORMAT_R24G8_TYPELESS},
{DXGI_FORMAT_X24_TYPELESS_G8_UINT, DXGI_FORMAT_R24G8_TYPELESS},
{DXGI_FORMAT_D24_UNORM_S8_UINT, DXGI_FORMAT_R24G8_TYPELESS},
/* DXGI_FORMAT_R8G8_TYPELESS */
{DXGI_FORMAT_R8G8_SNORM, DXGI_FORMAT_R8G8_TYPELESS},
{DXGI_FORMAT_R8G8_UNORM, DXGI_FORMAT_R8G8_TYPELESS},
{DXGI_FORMAT_R8G8_UINT, DXGI_FORMAT_R8G8_TYPELESS},
{DXGI_FORMAT_R8G8_SINT, DXGI_FORMAT_R8G8_TYPELESS},
/* DXGI_FORMAT_R16_TYPELESS */
{DXGI_FORMAT_D16_UNORM, DXGI_FORMAT_R16_TYPELESS},
{DXGI_FORMAT_R16_UNORM, DXGI_FORMAT_R16_TYPELESS},
{DXGI_FORMAT_R16_SNORM, DXGI_FORMAT_R16_TYPELESS},
{DXGI_FORMAT_R16_UINT, DXGI_FORMAT_R16_TYPELESS},
{DXGI_FORMAT_R16_SINT, DXGI_FORMAT_R16_TYPELESS},
{DXGI_FORMAT_R16_FLOAT, DXGI_FORMAT_R16_TYPELESS},
/* DXGI_FORMAT_R8_TYPELESS */
{DXGI_FORMAT_R8_UNORM, DXGI_FORMAT_R8_TYPELESS},
{DXGI_FORMAT_R8_SNORM, DXGI_FORMAT_R8_TYPELESS},
{DXGI_FORMAT_R8_UINT, DXGI_FORMAT_R8_TYPELESS},
{DXGI_FORMAT_R8_SINT, DXGI_FORMAT_R8_TYPELESS},
/* DXGI_FORMAT_BC1_TYPELESS */
{DXGI_FORMAT_BC1_UNORM_SRGB, DXGI_FORMAT_BC1_TYPELESS},
{DXGI_FORMAT_BC1_UNORM, DXGI_FORMAT_BC1_TYPELESS},
/* DXGI_FORMAT_BC2_TYPELESS */
{DXGI_FORMAT_BC2_UNORM_SRGB, DXGI_FORMAT_BC2_TYPELESS},
{DXGI_FORMAT_BC2_UNORM, DXGI_FORMAT_BC2_TYPELESS},
/* DXGI_FORMAT_BC3_TYPELESS */
{DXGI_FORMAT_BC3_UNORM_SRGB, DXGI_FORMAT_BC3_TYPELESS},
{DXGI_FORMAT_BC3_UNORM, DXGI_FORMAT_BC3_TYPELESS},
/* DXGI_FORMAT_BC4_TYPELESS */
{DXGI_FORMAT_BC4_UNORM, DXGI_FORMAT_BC4_TYPELESS},
{DXGI_FORMAT_BC4_SNORM, DXGI_FORMAT_BC4_TYPELESS},
/* DXGI_FORMAT_BC5_TYPELESS */
{DXGI_FORMAT_BC5_UNORM, DXGI_FORMAT_BC5_TYPELESS},
{DXGI_FORMAT_BC5_SNORM, DXGI_FORMAT_BC5_TYPELESS},
/* DXGI_FORMAT_BC6H_TYPELESS */
{DXGI_FORMAT_BC6H_UF16, DXGI_FORMAT_BC6H_TYPELESS},
{DXGI_FORMAT_BC6H_SF16, DXGI_FORMAT_BC6H_TYPELESS},
/* DXGI_FORMAT_BC7_TYPELESS */
{DXGI_FORMAT_BC7_UNORM_SRGB, DXGI_FORMAT_BC7_TYPELESS},
{DXGI_FORMAT_BC7_UNORM, DXGI_FORMAT_BC7_TYPELESS},
/* DXGI_FORMAT_B8G8R8A8_TYPELESS */
{DXGI_FORMAT_B8G8R8A8_UNORM_SRGB, DXGI_FORMAT_B8G8R8A8_TYPELESS},
{DXGI_FORMAT_B8G8R8A8_UNORM, DXGI_FORMAT_B8G8R8A8_TYPELESS},
/* DXGI_FORMAT_B8G8R8X8_TYPELESS */
{DXGI_FORMAT_B8G8R8X8_UNORM_SRGB, DXGI_FORMAT_B8G8R8X8_TYPELESS},
{DXGI_FORMAT_B8G8R8X8_UNORM, DXGI_FORMAT_B8G8R8X8_TYPELESS},
{DXGI_FORMAT_R32G32B32A32_TYPELESS,
{DXGI_FORMAT_R32G32B32A32_FLOAT, DXGI_FORMAT_R32G32B32A32_UINT, DXGI_FORMAT_R32G32B32A32_SINT},
DXGI_FORMAT_R32G32B32A32_UINT},
{DXGI_FORMAT_R32G32B32A32_FLOAT, {DXGI_FORMAT_UNKNOWN},
DXGI_FORMAT_R32G32B32A32_UINT},
{DXGI_FORMAT_R32G32B32A32_UINT,
{DXGI_FORMAT_R32G32B32A32_SINT},
DXGI_FORMAT_R32G32B32A32_UINT},
{DXGI_FORMAT_R32G32B32A32_SINT,
{DXGI_FORMAT_R32G32B32A32_UINT},
DXGI_FORMAT_R32G32B32A32_UINT},
{DXGI_FORMAT_R32G32B32_TYPELESS,
{DXGI_FORMAT_R32G32B32_FLOAT, DXGI_FORMAT_R32G32B32_UINT, DXGI_FORMAT_R32G32B32_SINT},
DXGI_FORMAT_R32G32B32_UINT},
{DXGI_FORMAT_R32G32B32_FLOAT, {DXGI_FORMAT_UNKNOWN},
DXGI_FORMAT_R32G32B32_UINT},
{DXGI_FORMAT_R32G32B32_UINT,
{DXGI_FORMAT_R32G32B32_SINT},
DXGI_FORMAT_R32G32B32_UINT},
{DXGI_FORMAT_R32G32B32_SINT,
{DXGI_FORMAT_R32G32B32_UINT},
DXGI_FORMAT_R32G32B32_UINT},
{DXGI_FORMAT_R16G16B16A16_TYPELESS,
{DXGI_FORMAT_R16G16B16A16_FLOAT, DXGI_FORMAT_R16G16B16A16_UNORM, DXGI_FORMAT_R16G16B16A16_SNORM, DXGI_FORMAT_R16G16B16A16_UINT, DXGI_FORMAT_R16G16B16A16_SINT},
DXGI_FORMAT_R16G16B16A16_UINT},
{DXGI_FORMAT_R16G16B16A16_FLOAT, {DXGI_FORMAT_UNKNOWN},
DXGI_FORMAT_R16G16B16A16_UINT},
{DXGI_FORMAT_R16G16B16A16_UINT,
{DXGI_FORMAT_R16G16B16A16_SINT, DXGI_FORMAT_R16G16B16A16_UNORM, DXGI_FORMAT_R16G16B16A16_SNORM},
DXGI_FORMAT_R16G16B16A16_UINT},
{DXGI_FORMAT_R16G16B16A16_SINT,
{DXGI_FORMAT_R16G16B16A16_UINT, DXGI_FORMAT_R16G16B16A16_UNORM, DXGI_FORMAT_R16G16B16A16_SNORM},
DXGI_FORMAT_R16G16B16A16_UINT},
{DXGI_FORMAT_R16G16B16A16_UNORM,
{DXGI_FORMAT_R16G16B16A16_UINT, DXGI_FORMAT_R16G16B16A16_SINT},
DXGI_FORMAT_R16G16B16A16_UINT},
{DXGI_FORMAT_R16G16B16A16_SNORM,
{DXGI_FORMAT_R16G16B16A16_UINT, DXGI_FORMAT_R16G16B16A16_SINT},
DXGI_FORMAT_R16G16B16A16_UINT},
{DXGI_FORMAT_R32G32_TYPELESS,
{DXGI_FORMAT_R32G32_FLOAT, DXGI_FORMAT_R32G32_UINT, DXGI_FORMAT_R32G32_SINT},
DXGI_FORMAT_R32G32_UINT},
{DXGI_FORMAT_R32G32_FLOAT, {DXGI_FORMAT_UNKNOWN},
DXGI_FORMAT_R32G32_UINT},
{DXGI_FORMAT_R32G32_UINT,
{DXGI_FORMAT_R32G32_SINT},
DXGI_FORMAT_R32G32_UINT},
{DXGI_FORMAT_R32G32_SINT,
{DXGI_FORMAT_R32G32_UINT},
DXGI_FORMAT_R32G32_UINT},
{DXGI_FORMAT_R10G10B10A2_TYPELESS,
{DXGI_FORMAT_R10G10B10A2_UNORM, DXGI_FORMAT_R10G10B10A2_UINT},
DXGI_FORMAT_R10G10B10A2_UINT},
{DXGI_FORMAT_R10G10B10A2_UINT,
{DXGI_FORMAT_R10G10B10A2_UNORM},
DXGI_FORMAT_R10G10B10A2_UINT},
{DXGI_FORMAT_R10G10B10A2_UNORM,
{DXGI_FORMAT_R10G10B10A2_UINT},
DXGI_FORMAT_R10G10B10A2_UINT},
{DXGI_FORMAT_R11G11B10_FLOAT, {DXGI_FORMAT_UNKNOWN},
DXGI_FORMAT_R32_UINT},
{DXGI_FORMAT_R8G8_TYPELESS,
{DXGI_FORMAT_R8G8_UINT, DXGI_FORMAT_R8G8_SINT, DXGI_FORMAT_R8G8_UNORM, DXGI_FORMAT_R8G8_SNORM},
DXGI_FORMAT_R8G8_UINT},
{DXGI_FORMAT_R8G8_UINT,
{DXGI_FORMAT_R8G8_SINT, DXGI_FORMAT_R8G8_UNORM, DXGI_FORMAT_R8G8_SNORM},
DXGI_FORMAT_R8G8_UINT},
{DXGI_FORMAT_R8G8_SINT,
{DXGI_FORMAT_R8G8_UINT, DXGI_FORMAT_R8G8_UNORM, DXGI_FORMAT_R8G8_SNORM},
DXGI_FORMAT_R8G8_UINT},
{DXGI_FORMAT_R8G8_UNORM,
{DXGI_FORMAT_R8G8_UINT, DXGI_FORMAT_R8G8_SINT},
DXGI_FORMAT_R8G8_UINT},
{DXGI_FORMAT_R8G8_SNORM,
{DXGI_FORMAT_R8G8_UINT, DXGI_FORMAT_R8G8_SINT},
DXGI_FORMAT_R8G8_UINT},
{DXGI_FORMAT_R8G8B8A8_TYPELESS,
{DXGI_FORMAT_R8G8B8A8_UINT, DXGI_FORMAT_R8G8B8A8_SINT, DXGI_FORMAT_R8G8B8A8_UNORM, DXGI_FORMAT_R8G8B8A8_UNORM_SRGB, DXGI_FORMAT_R8G8B8A8_SNORM},
DXGI_FORMAT_R8G8B8A8_UINT},
{DXGI_FORMAT_R8G8B8A8_UINT,
{DXGI_FORMAT_R8G8B8A8_SINT, DXGI_FORMAT_R8G8B8A8_UNORM, DXGI_FORMAT_R8G8B8A8_UNORM_SRGB, DXGI_FORMAT_R8G8B8A8_SNORM},
DXGI_FORMAT_R8G8B8A8_UINT},
{DXGI_FORMAT_R8G8B8A8_SINT,
{DXGI_FORMAT_R8G8B8A8_UINT, DXGI_FORMAT_R8G8B8A8_UNORM, DXGI_FORMAT_R8G8B8A8_UNORM_SRGB, DXGI_FORMAT_R8G8B8A8_SNORM},
DXGI_FORMAT_R8G8B8A8_UINT},
{DXGI_FORMAT_R8G8B8A8_UNORM_SRGB,
{DXGI_FORMAT_R8G8B8A8_UINT, DXGI_FORMAT_R8G8B8A8_SINT, DXGI_FORMAT_R8G8B8A8_UNORM},
DXGI_FORMAT_R8G8B8A8_UINT},
{DXGI_FORMAT_R8G8B8A8_UNORM,
{DXGI_FORMAT_R8G8B8A8_UINT, DXGI_FORMAT_R8G8B8A8_SINT, DXGI_FORMAT_R8G8B8A8_UNORM_SRGB},
DXGI_FORMAT_R8G8B8A8_UINT},
{DXGI_FORMAT_R8G8B8A8_SNORM,
{DXGI_FORMAT_R8G8B8A8_UINT, DXGI_FORMAT_R8G8B8A8_SINT},
DXGI_FORMAT_R8G8B8A8_UINT},
{DXGI_FORMAT_R16G16_TYPELESS,
{DXGI_FORMAT_R16G16_FLOAT, DXGI_FORMAT_R16G16_UINT, DXGI_FORMAT_R16G16_SINT, DXGI_FORMAT_R16G16_UNORM, DXGI_FORMAT_R16G16_SNORM},
DXGI_FORMAT_R16G16_UINT},
{DXGI_FORMAT_R16G16_FLOAT, {DXGI_FORMAT_UNKNOWN},
DXGI_FORMAT_R16G16_UINT},
{DXGI_FORMAT_R16G16_UINT,
{DXGI_FORMAT_R16G16_SINT, DXGI_FORMAT_R16G16_UNORM, DXGI_FORMAT_R16G16_SNORM},
DXGI_FORMAT_R16G16_UINT},
{DXGI_FORMAT_R16G16_SINT,
{DXGI_FORMAT_R16G16_UINT, DXGI_FORMAT_R16G16_UNORM, DXGI_FORMAT_R16G16_SNORM},
DXGI_FORMAT_R16G16_UINT},
{DXGI_FORMAT_R16G16_UNORM,
{DXGI_FORMAT_R16G16_UINT, DXGI_FORMAT_R16G16_SINT},
DXGI_FORMAT_R16G16_UINT},
{DXGI_FORMAT_R16G16_SNORM,
{DXGI_FORMAT_R16G16_UINT, DXGI_FORMAT_R16G16_SINT},
DXGI_FORMAT_R16G16_UINT},
{DXGI_FORMAT_R32_TYPELESS,
{DXGI_FORMAT_R32_FLOAT, DXGI_FORMAT_R32_UINT, DXGI_FORMAT_R32_SINT},
DXGI_FORMAT_R32_UINT},
{DXGI_FORMAT_R32_FLOAT, {DXGI_FORMAT_UNKNOWN},
DXGI_FORMAT_R32_UINT},
{DXGI_FORMAT_R32_UINT,
{DXGI_FORMAT_R32_SINT},
DXGI_FORMAT_R32_UINT},
{DXGI_FORMAT_R32_SINT,
{DXGI_FORMAT_R32_UINT},
DXGI_FORMAT_R32_UINT},
{DXGI_FORMAT_R16_TYPELESS,
{DXGI_FORMAT_R16_FLOAT, DXGI_FORMAT_R16_UINT, DXGI_FORMAT_R16_SINT, DXGI_FORMAT_R16_UNORM, DXGI_FORMAT_R16_SNORM},
DXGI_FORMAT_R16_UINT},
{DXGI_FORMAT_R16_FLOAT, {DXGI_FORMAT_UNKNOWN},
DXGI_FORMAT_R16_UINT},
{DXGI_FORMAT_R16_UINT,
{DXGI_FORMAT_R16_SINT, DXGI_FORMAT_R16_UNORM, DXGI_FORMAT_R16_SNORM},
DXGI_FORMAT_R16_UINT},
{DXGI_FORMAT_R16_SINT,
{DXGI_FORMAT_R16_UINT, DXGI_FORMAT_R16_UNORM, DXGI_FORMAT_R16_SNORM},
DXGI_FORMAT_R16_UINT},
{DXGI_FORMAT_R16_UNORM,
{DXGI_FORMAT_R16_UINT, DXGI_FORMAT_R16_SINT},
DXGI_FORMAT_R16_UINT},
{DXGI_FORMAT_R16_SNORM,
{DXGI_FORMAT_R16_UINT, DXGI_FORMAT_R16_SINT},
DXGI_FORMAT_R16_UINT},
{DXGI_FORMAT_R8_TYPELESS,
{DXGI_FORMAT_R8_UINT, DXGI_FORMAT_R8_SINT, DXGI_FORMAT_R8_UNORM, DXGI_FORMAT_R8_SNORM, DXGI_FORMAT_A8_UNORM},
DXGI_FORMAT_R8_UINT},
{DXGI_FORMAT_R8_UINT,
{DXGI_FORMAT_R8_SINT, DXGI_FORMAT_R8_UNORM, DXGI_FORMAT_R8_SNORM, DXGI_FORMAT_A8_UNORM},
DXGI_FORMAT_R8_UINT},
{DXGI_FORMAT_R8_SINT,
{DXGI_FORMAT_R8_UINT, DXGI_FORMAT_R8_UNORM, DXGI_FORMAT_R8_SNORM, DXGI_FORMAT_A8_UNORM},
DXGI_FORMAT_R8_UINT},
{DXGI_FORMAT_R8_UNORM,
{DXGI_FORMAT_R8_UINT, DXGI_FORMAT_R8_SINT, DXGI_FORMAT_A8_UNORM},
DXGI_FORMAT_R8_UINT},
{DXGI_FORMAT_R8_SNORM,
{DXGI_FORMAT_R8_UINT, DXGI_FORMAT_R8_SINT},
DXGI_FORMAT_R8_UINT},
{DXGI_FORMAT_A8_UNORM,
{DXGI_FORMAT_R8_UINT, DXGI_FORMAT_R8_SINT, DXGI_FORMAT_R8_UNORM},
DXGI_FORMAT_R8_UINT},
{DXGI_FORMAT_B8G8R8A8_TYPELESS,
{DXGI_FORMAT_B8G8R8A8_UNORM, DXGI_FORMAT_B8G8R8A8_UNORM_SRGB},
DXGI_FORMAT_R8G8B8A8_UINT},
{DXGI_FORMAT_B8G8R8A8_UNORM,
{DXGI_FORMAT_B8G8R8A8_UNORM_SRGB},
DXGI_FORMAT_R8G8B8A8_UINT},
{DXGI_FORMAT_B8G8R8A8_UNORM_SRGB,
{DXGI_FORMAT_B8G8R8A8_UNORM},
DXGI_FORMAT_R8G8B8A8_UINT},
{DXGI_FORMAT_B8G8R8X8_TYPELESS,
{DXGI_FORMAT_B8G8R8X8_UNORM, DXGI_FORMAT_B8G8R8X8_UNORM_SRGB},
DXGI_FORMAT_R8G8B8A8_UINT},
{DXGI_FORMAT_B8G8R8X8_UNORM,
{DXGI_FORMAT_B8G8R8X8_UNORM_SRGB},
DXGI_FORMAT_R8G8B8A8_UINT},
{DXGI_FORMAT_B8G8R8X8_UNORM_SRGB,
{DXGI_FORMAT_B8G8R8X8_UNORM},
DXGI_FORMAT_R8G8B8A8_UINT},
{DXGI_FORMAT_BC1_TYPELESS,
{DXGI_FORMAT_BC1_UNORM, DXGI_FORMAT_BC1_UNORM_SRGB}},
{DXGI_FORMAT_BC1_UNORM,
{DXGI_FORMAT_BC1_UNORM_SRGB}},
{DXGI_FORMAT_BC1_UNORM_SRGB,
{DXGI_FORMAT_BC1_UNORM}},
{DXGI_FORMAT_BC2_TYPELESS,
{DXGI_FORMAT_BC2_UNORM, DXGI_FORMAT_BC2_UNORM_SRGB}},
{DXGI_FORMAT_BC2_UNORM,
{DXGI_FORMAT_BC2_UNORM_SRGB}},
{DXGI_FORMAT_BC2_UNORM_SRGB,
{DXGI_FORMAT_BC2_UNORM}},
{DXGI_FORMAT_BC3_TYPELESS,
{DXGI_FORMAT_BC3_UNORM, DXGI_FORMAT_BC3_UNORM_SRGB}},
{DXGI_FORMAT_BC3_UNORM,
{DXGI_FORMAT_BC3_UNORM_SRGB}},
{DXGI_FORMAT_BC3_UNORM_SRGB,
{DXGI_FORMAT_BC3_UNORM}},
{DXGI_FORMAT_BC4_TYPELESS,
{DXGI_FORMAT_BC4_UNORM, DXGI_FORMAT_BC4_SNORM}},
{DXGI_FORMAT_BC5_TYPELESS,
{DXGI_FORMAT_BC5_UNORM, DXGI_FORMAT_BC5_SNORM}},
{DXGI_FORMAT_BC6H_TYPELESS,
{DXGI_FORMAT_BC6H_UF16, DXGI_FORMAT_BC6H_SF16}},
{DXGI_FORMAT_BC7_TYPELESS,
{DXGI_FORMAT_BC7_UNORM, DXGI_FORMAT_BC7_UNORM_SRGB}},
{DXGI_FORMAT_BC7_UNORM,
{DXGI_FORMAT_BC7_UNORM_SRGB}},
{DXGI_FORMAT_BC7_UNORM_SRGB,
{DXGI_FORMAT_BC7_UNORM}},
};
static bool dxgi_format_is_depth_stencil(DXGI_FORMAT dxgi_format)
void vkd3d_format_compatibility_list_add_format(struct vkd3d_format_compatibility_list *list, VkFormat vk_format)
{
unsigned int i;
bool found = false;
for (i = 0; i < ARRAY_SIZE(vkd3d_formats); ++i)
for (i = 0; i < list->format_count && !found; i++)
found = list->vk_formats[i] == vk_format;
if (!found)
{
const struct vkd3d_format *current = &vkd3d_formats[i];
if (current->dxgi_format == dxgi_format)
return current->vk_aspect_mask & (VK_IMAGE_ASPECT_DEPTH_BIT | VK_IMAGE_ASPECT_STENCIL_BIT);
assert(list->format_count < ARRAY_SIZE(list->vk_formats));
list->vk_formats[list->format_count++] = vk_format;
}
for (i = 0; i < ARRAY_SIZE(vkd3d_depth_stencil_formats); ++i)
{
if (vkd3d_depth_stencil_formats[i].dxgi_format == dxgi_format)
return true;
}
return false;
}
/* FIXME: This table should be generated at compile-time. */
static HRESULT vkd3d_init_format_compatibility_lists(struct d3d12_device *device)
{
struct vkd3d_format_compatibility_list *lists, *current_list;
const struct vkd3d_format_compatibility_info *current;
DXGI_FORMAT dxgi_format;
VkFormat vk_format;
struct vkd3d_format_compatibility_list *lists, *dst;
const struct dxgi_format_compatibility_list *src;
unsigned int count;
unsigned int i, j;
@ -278,62 +414,25 @@ static HRESULT vkd3d_init_format_compatibility_lists(struct d3d12_device *device
if (!device->vk_info.KHR_image_format_list)
return S_OK;
count = 1;
dxgi_format = vkd3d_format_compatibility_info[0].typeless_format;
for (i = 0; i < ARRAY_SIZE(vkd3d_format_compatibility_info); ++i)
{
DXGI_FORMAT typeless_format = vkd3d_format_compatibility_info[i].typeless_format;
if (dxgi_format != typeless_format)
{
++count;
dxgi_format = typeless_format;
}
}
count = 0;
for (i = 0; i < ARRAY_SIZE(dxgi_format_compatibility_list); ++i)
count = max(count, dxgi_format_compatibility_list[i].image_format + 1);
if (!(lists = vkd3d_calloc(count, sizeof(*lists))))
return E_OUTOFMEMORY;
count = 0;
current_list = lists;
current_list->typeless_format = vkd3d_format_compatibility_info[0].typeless_format;
for (i = 0; i < ARRAY_SIZE(vkd3d_format_compatibility_info); ++i)
for (i = 0; i < ARRAY_SIZE(dxgi_format_compatibility_list); ++i)
{
current = &vkd3d_format_compatibility_info[i];
src = &dxgi_format_compatibility_list[i];
dst = &lists[src->image_format];
if (current_list->typeless_format != current->typeless_format)
{
/* Avoid empty format lists. */
if (current_list->format_count)
{
++current_list;
++count;
}
dst->uint_format = src->uint_format;
dst->vk_formats[dst->format_count++] = vkd3d_get_vk_format(src->image_format);
current_list->typeless_format = current->typeless_format;
}
/* In Vulkan, each depth-stencil format is only compatible with itself. */
if (dxgi_format_is_depth_stencil(current->format))
continue;
if (!(vk_format = vkd3d_get_vk_format(current->format)))
continue;
for (j = 0; j < current_list->format_count; ++j)
{
if (current_list->vk_formats[j] == vk_format)
break;
}
if (j >= current_list->format_count)
{
assert(current_list->format_count < VKD3D_MAX_COMPATIBLE_FORMAT_COUNT);
current_list->vk_formats[current_list->format_count++] = vk_format;
}
for (j = 0; j < ARRAY_SIZE(src->view_formats) && src->view_formats[j]; j++)
vkd3d_format_compatibility_list_add_format(dst, vkd3d_get_vk_format(src->view_formats[j]));
}
if (current_list->format_count)
++count;
device->format_compatibility_list_count = count;
device->format_compatibility_lists = lists;
@ -350,51 +449,74 @@ static void vkd3d_cleanup_format_compatibility_lists(struct d3d12_device *device
static HRESULT vkd3d_init_depth_stencil_formats(struct d3d12_device *device)
{
const unsigned int count = ARRAY_SIZE(vkd3d_depth_stencil_formats);
const struct vkd3d_vk_device_procs *vk_procs = &device->vk_procs;
struct vkd3d_format *formats, *format;
VkFormatProperties properties;
struct vkd3d_format *formats;
unsigned int i;
if (!(formats = vkd3d_calloc(VKD3D_MAX_DXGI_FORMAT + 1, sizeof(*formats))))
return E_OUTOFMEMORY;
VK_CALL(vkGetPhysicalDeviceFormatProperties(device->vk_physical_device,
VK_FORMAT_D24_UNORM_S8_UINT, &properties));
if (properties.optimalTilingFeatures & VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT)
{
device->depth_stencil_formats = vkd3d_depth_stencil_formats;
}
else
if (!(properties.optimalTilingFeatures & VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT))
{
/* AMD doesn't support VK_FORMAT_D24_UNORM_S8_UINT. */
WARN("Mapping VK_FORMAT_D24_UNORM_S8_UINT to VK_FORMAT_D32_SFLOAT_S8_UINT.\n");
if (!(formats = vkd3d_calloc(count, sizeof(*formats))))
return E_OUTOFMEMORY;
memcpy(formats, vkd3d_depth_stencil_formats, sizeof(vkd3d_depth_stencil_formats));
for (i = 0; i < count; ++i)
{
if (formats[i].vk_format == VK_FORMAT_D24_UNORM_S8_UINT)
{
formats[i].vk_format = VK_FORMAT_D32_SFLOAT_S8_UINT;
formats[i].is_emulated = true;
}
}
device->depth_stencil_formats = formats;
}
for (i = 0; i < ARRAY_SIZE(vkd3d_depth_stencil_formats); ++i)
{
assert(vkd3d_depth_stencil_formats[i].dxgi_format <= VKD3D_MAX_DXGI_FORMAT);
format = &formats[vkd3d_depth_stencil_formats[i].dxgi_format];
*format = vkd3d_depth_stencil_formats[i];
if (format->vk_format == VK_FORMAT_D24_UNORM_S8_UINT &&
!(properties.optimalTilingFeatures & VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT))
{
format->vk_format = VK_FORMAT_D32_SFLOAT_S8_UINT;
format->is_emulated = true;
}
}
device->depth_stencil_formats = formats;
return S_OK;
}
static void vkd3d_cleanup_depth_stencil_formats(struct d3d12_device *device)
{
if (vkd3d_depth_stencil_formats != device->depth_stencil_formats)
vkd3d_free((void *)device->depth_stencil_formats);
vkd3d_free((void *)device->depth_stencil_formats);
device->depth_stencil_formats = NULL;
}
static HRESULT vkd3d_init_formats(struct d3d12_device *device)
{
struct vkd3d_format *formats;
unsigned int i;
if (!(formats = vkd3d_calloc(VKD3D_MAX_DXGI_FORMAT + 1, sizeof(*formats))))
return E_OUTOFMEMORY;
for (i = 0; i < ARRAY_SIZE(vkd3d_formats); ++i)
{
assert(vkd3d_formats[i].dxgi_format <= VKD3D_MAX_DXGI_FORMAT);
formats[vkd3d_formats[i].dxgi_format] = vkd3d_formats[i];
}
device->formats = formats;
return S_OK;
}
static void vkd3d_cleanup_formats(struct d3d12_device *device)
{
vkd3d_free((void *)device->formats);
device->formats = NULL;
}
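The `vkd3d_init_formats` hunk above replaces linear scans over `vkd3d_formats` with a table indexed directly by `DXGI_FORMAT`, where a zero `dxgi_format` marks an empty slot. A minimal standalone sketch of that idea follows. The enum, the struct, and names such as `MAX_FORMAT` and `lookup_format` are hypothetical stand-ins for illustration, not the vkd3d definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for DXGI_FORMAT and struct vkd3d_format. */
enum sketch_format { FORMAT_UNKNOWN = 0, FORMAT_A = 2, FORMAT_B = 5, MAX_FORMAT = 7 };

struct format_info
{
    enum sketch_format format; /* 0 (FORMAT_UNKNOWN) marks an unused slot. */
    unsigned int byte_count;
};

/* Sparse source list, analogous to a static format array. */
static const struct format_info sparse_formats[] =
{
    {FORMAT_A, 4},
    {FORMAT_B, 16},
};

/* Expand the sparse list into a dense, zero-initialized table indexed by format value. */
static struct format_info *build_format_table(void)
{
    struct format_info *table;
    size_t i;

    if (!(table = calloc(MAX_FORMAT + 1, sizeof(*table))))
        return NULL;

    for (i = 0; i < sizeof(sparse_formats) / sizeof(sparse_formats[0]); ++i)
    {
        assert(sparse_formats[i].format <= MAX_FORMAT);
        table[sparse_formats[i].format] = sparse_formats[i];
    }
    return table;
}

/* O(1) lookup; returns NULL for out-of-range or unpopulated formats. */
static const struct format_info *lookup_format(const struct format_info *table, unsigned int format)
{
    if (format > MAX_FORMAT)
        return NULL;
    return table[format].format ? &table[format] : NULL;
}
```

The trade-off in this sketch is a small amount of memory for unused slots in exchange for constant-time lookups. The table is owned by the caller and released with `free`.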
HRESULT vkd3d_init_format_info(struct d3d12_device *device)
{
HRESULT hr;
@ -402,8 +524,17 @@ HRESULT vkd3d_init_format_info(struct d3d12_device *device)
if (FAILED(hr = vkd3d_init_depth_stencil_formats(device)))
return hr;
if FAILED(hr = vkd3d_init_format_compatibility_lists(device))
if (FAILED(hr = vkd3d_init_format_compatibility_lists(device)))
{
vkd3d_cleanup_depth_stencil_formats(device);
return hr;
}
if (FAILED(hr = vkd3d_init_formats(device)))
{
vkd3d_cleanup_depth_stencil_formats(device);
vkd3d_cleanup_format_compatibility_lists(device);
}
return hr;
}
@ -412,6 +543,7 @@ void vkd3d_cleanup_format_info(struct d3d12_device *device)
{
vkd3d_cleanup_depth_stencil_formats(device);
vkd3d_cleanup_format_compatibility_lists(device);
vkd3d_cleanup_formats(device);
}
/* We use overrides for depth/stencil formats. This is required in order to
@ -421,79 +553,64 @@ void vkd3d_cleanup_format_info(struct d3d12_device *device)
static const struct vkd3d_format *vkd3d_get_depth_stencil_format(const struct d3d12_device *device,
DXGI_FORMAT dxgi_format)
{
const struct vkd3d_format *formats;
unsigned int i;
const struct vkd3d_format *format;
assert(device);
formats = device->depth_stencil_formats;
format = &device->depth_stencil_formats[dxgi_format];
for (i = 0; i < ARRAY_SIZE(vkd3d_depth_stencil_formats); ++i)
{
if (formats[i].dxgi_format == dxgi_format)
return &formats[i];
}
return NULL;
return format->dxgi_format ? format : NULL;
}
const struct vkd3d_format *vkd3d_get_format(const struct d3d12_device *device,
DXGI_FORMAT dxgi_format, bool depth_stencil)
{
const struct vkd3d_format *format;
unsigned int i;
if (depth_stencil && (format = vkd3d_get_depth_stencil_format(device, dxgi_format)))
return format;
for (i = 0; i < ARRAY_SIZE(vkd3d_formats); ++i)
{
if (vkd3d_formats[i].dxgi_format == dxgi_format)
return &vkd3d_formats[i];
}
return NULL;
}
DXGI_FORMAT vkd3d_get_typeless_format(const struct d3d12_device *device, DXGI_FORMAT dxgi_format)
{
const struct vkd3d_format *format = vkd3d_get_format(device, dxgi_format, true);
unsigned int i;
if (!format)
return DXGI_FORMAT_UNKNOWN;
if (format->type == VKD3D_FORMAT_TYPE_TYPELESS)
return dxgi_format;
for (i = 0; i < ARRAY_SIZE(vkd3d_format_compatibility_info); ++i)
{
if (vkd3d_format_compatibility_info[i].format == dxgi_format)
return vkd3d_format_compatibility_info[i].typeless_format;
}
return DXGI_FORMAT_UNKNOWN;
}
const struct vkd3d_format *vkd3d_find_uint_format(const struct d3d12_device *device, DXGI_FORMAT dxgi_format)
{
DXGI_FORMAT typeless_format = DXGI_FORMAT_UNKNOWN;
const struct vkd3d_format *vkd3d_format;
unsigned int i;
if (!(typeless_format = vkd3d_get_typeless_format(device, dxgi_format)))
if (dxgi_format > VKD3D_MAX_DXGI_FORMAT)
return NULL;
for (i = 0; i < ARRAY_SIZE(vkd3d_format_compatibility_info); ++i)
/* If a planar depth-stencil format (or its typeless variant) is requested,
* there is no ambiguity about which format to select: we must choose a depth-stencil format.
* For single-aspect formats, we sometimes need to choose between the COLOR and DEPTH
* aspect variants based on the depth_stencil argument,
* but no such ambiguity exists for DEPTH_STENCIL types.
* This fixes issues where e.g. the R24_UNORM_X8_TYPELESS format is used without ALLOW_DEPTH_STENCIL. */
format = vkd3d_get_depth_stencil_format(device, dxgi_format);
if (format && (depth_stencil || format->plane_count > 1))
return format;
format = &device->formats[dxgi_format];
return format->dxgi_format ? format : NULL;
}
struct vkd3d_format_footprint vkd3d_format_footprint_for_plane(const struct vkd3d_format *format, unsigned int plane_idx)
{
if (format->plane_footprints)
{
if (vkd3d_format_compatibility_info[i].typeless_format != typeless_format)
continue;
vkd3d_format = vkd3d_get_format(device, vkd3d_format_compatibility_info[i].format, false);
if (vkd3d_format->type == VKD3D_FORMAT_TYPE_UINT)
return vkd3d_format;
return format->plane_footprints[plane_idx];
}
else
{
struct vkd3d_format_footprint footprint;
footprint.dxgi_format = format->dxgi_format;
footprint.block_width = format->block_width;
footprint.block_height = format->block_height;
footprint.subsample_x_log2 = 0;
footprint.subsample_y_log2 = 0;
footprint.block_byte_count = format->byte_count * format->block_byte_count;
return footprint;
}
}
return NULL;
VkFormat vkd3d_internal_get_vk_format(const struct d3d12_device *device, DXGI_FORMAT dxgi_format)
{
const struct vkd3d_format *format;
if ((format = vkd3d_get_format(device, dxgi_format, false)))
return format->vk_format;
return VK_FORMAT_UNDEFINED;
}
void vkd3d_format_copy_data(const struct vkd3d_format *format, const uint8_t *src,
@ -522,12 +639,15 @@ void vkd3d_format_copy_data(const struct vkd3d_format *format, const uint8_t *sr
VKD3D_EXPORT VkFormat vkd3d_get_vk_format(DXGI_FORMAT format)
{
const struct vkd3d_format *vkd3d_format;
unsigned int i;
if (!(vkd3d_format = vkd3d_get_format(NULL, format, false)))
return VK_FORMAT_UNDEFINED;
for (i = 0; i < ARRAY_SIZE(vkd3d_formats); ++i)
{
if (vkd3d_formats[i].dxgi_format == format)
return vkd3d_formats[i].vk_format;
}
return vkd3d_format->vk_format;
return VK_FORMAT_UNDEFINED;
}
VKD3D_EXPORT DXGI_FORMAT vkd3d_get_dxgi_format(VkFormat format)
@ -552,6 +672,7 @@ bool is_valid_feature_level(D3D_FEATURE_LEVEL feature_level)
{
static const D3D_FEATURE_LEVEL valid_feature_levels[] =
{
D3D_FEATURE_LEVEL_12_2,
D3D_FEATURE_LEVEL_12_1,
D3D_FEATURE_LEVEL_12_0,
D3D_FEATURE_LEVEL_11_1,
@ -603,7 +724,9 @@ bool is_valid_resource_state(D3D12_RESOURCE_STATES state)
D3D12_RESOURCE_STATE_RESOLVE_SOURCE |
D3D12_RESOURCE_STATE_GENERIC_READ |
D3D12_RESOURCE_STATE_PRESENT |
D3D12_RESOURCE_STATE_PREDICATION;
D3D12_RESOURCE_STATE_PREDICATION |
D3D12_RESOURCE_STATE_RAYTRACING_ACCELERATION_STRUCTURE |
D3D12_RESOURCE_STATE_SHADING_RATE_SOURCE;
if (state & ~valid_states)
{
@ -761,6 +884,11 @@ const char *debug_dxgi_format(DXGI_FORMAT format)
ENUM_NAME(DXGI_FORMAT_P8)
ENUM_NAME(DXGI_FORMAT_A8P8)
ENUM_NAME(DXGI_FORMAT_B4G4R4A4_UNORM)
ENUM_NAME(DXGI_FORMAT_P208)
ENUM_NAME(DXGI_FORMAT_V208)
ENUM_NAME(DXGI_FORMAT_V408)
ENUM_NAME(DXGI_FORMAT_SAMPLER_FEEDBACK_MIN_MIP_OPAQUE)
ENUM_NAME(DXGI_FORMAT_SAMPLER_FEEDBACK_MIP_REGION_USED_OPAQUE)
ENUM_NAME(DXGI_FORMAT_FORCE_UINT)
}
#undef ENUM_NAME
@ -818,16 +946,15 @@ const char *debug_vk_extent_3d(VkExtent3D extent)
(unsigned int)extent.depth);
}
const char *debug_vk_queue_flags(VkQueueFlags flags)
const char *debug_vk_queue_flags(VkQueueFlags flags, char buffer[VKD3D_DEBUG_FLAGS_BUFFER_SIZE])
{
char buffer[120];
buffer[0] = '\0';
#define FLAG_TO_STR(f) if (flags & f) { strcat(buffer, " | "#f); flags &= ~f; }
FLAG_TO_STR(VK_QUEUE_GRAPHICS_BIT)
FLAG_TO_STR(VK_QUEUE_COMPUTE_BIT)
FLAG_TO_STR(VK_QUEUE_TRANSFER_BIT)
FLAG_TO_STR(VK_QUEUE_SPARSE_BINDING_BIT)
FLAG_TO_STR(VK_QUEUE_PROTECTED_BIT)
#undef FLAG_TO_STR
if (flags)
FIXME("Unrecognized flag(s) %#x.\n", flags);
@ -837,13 +964,12 @@ const char *debug_vk_queue_flags(VkQueueFlags flags)
return vkd3d_dbg_sprintf("%s", &buffer[3]);
}
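The revised `debug_vk_*_flags` helpers take a caller-provided buffer instead of a fixed-size local array and build the string by appending `" | NAME"` for each set bit, then skipping the leading separator. A self-contained sketch of that pattern is shown below. The flag names and `SKETCH_FLAGS_BUFFER_SIZE` are hypothetical, the `"0"` result for an empty mask is an assumption about the elided part of the hunk, and the sketch returns a pointer into the buffer rather than going through `vkd3d_dbg_sprintf`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical size; stands in for VKD3D_DEBUG_FLAGS_BUFFER_SIZE. */
#define SKETCH_FLAGS_BUFFER_SIZE 128

/* Hypothetical flag bits used only for this sketch. */
enum { SKETCH_GRAPHICS_BIT = 0x1, SKETCH_COMPUTE_BIT = 0x2, SKETCH_TRANSFER_BIT = 0x4 };

static const char *sketch_flags_to_str(unsigned int flags, char buffer[SKETCH_FLAGS_BUFFER_SIZE])
{
    buffer[0] = '\0';
    /* Append " | NAME" for each known bit and clear it from the mask. */
#define FLAG_TO_STR(f) if (flags & f) { strcat(buffer, " | "#f); flags &= ~f; }
    FLAG_TO_STR(SKETCH_GRAPHICS_BIT)
    FLAG_TO_STR(SKETCH_COMPUTE_BIT)
    FLAG_TO_STR(SKETCH_TRANSFER_BIT)
#undef FLAG_TO_STR
    /* Any bits still set were not recognized. */
    if (flags)
        fprintf(stderr, "Unrecognized flag(s) %#x.\n", flags);

    /* Assumption: an empty mask is printed as "0". */
    if (!buffer[0])
        return "0";
    /* Skip the leading " | " separator. */
    return &buffer[3];
}
```

The buffer must be large enough for every known flag name plus separators. In this sketch the longest possible string is well under 128 bytes.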
const char *debug_vk_memory_heap_flags(VkMemoryHeapFlags flags)
const char *debug_vk_memory_heap_flags(VkMemoryHeapFlags flags, char buffer[VKD3D_DEBUG_FLAGS_BUFFER_SIZE])
{
char buffer[50];
buffer[0] = '\0';
#define FLAG_TO_STR(f) if (flags & f) { strcat(buffer, " | "#f); flags &= ~f; }
FLAG_TO_STR(VK_MEMORY_HEAP_DEVICE_LOCAL_BIT)
FLAG_TO_STR(VK_MEMORY_HEAP_MULTI_INSTANCE_BIT)
#undef FLAG_TO_STR
if (flags)
FIXME("Unrecognized flag(s) %#x.\n", flags);
@ -853,10 +979,8 @@ const char *debug_vk_memory_heap_flags(VkMemoryHeapFlags flags)
return vkd3d_dbg_sprintf("%s", &buffer[3]);
}
const char *debug_vk_memory_property_flags(VkMemoryPropertyFlags flags)
const char *debug_vk_memory_property_flags(VkMemoryPropertyFlags flags, char buffer[VKD3D_DEBUG_FLAGS_BUFFER_SIZE])
{
char buffer[200];
buffer[0] = '\0';
#define FLAG_TO_STR(f) if (flags & f) { strcat(buffer, " | "#f); flags &= ~f; }
FLAG_TO_STR(VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT)
@ -864,6 +988,9 @@ const char *debug_vk_memory_property_flags(VkMemoryPropertyFlags flags)
FLAG_TO_STR(VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)
FLAG_TO_STR(VK_MEMORY_PROPERTY_HOST_CACHED_BIT)
FLAG_TO_STR(VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT)
FLAG_TO_STR(VK_MEMORY_PROPERTY_PROTECTED_BIT)
FLAG_TO_STR(VK_MEMORY_PROPERTY_DEVICE_COHERENT_BIT_AMD)
FLAG_TO_STR(VK_MEMORY_PROPERTY_DEVICE_UNCACHED_BIT_AMD)
#undef FLAG_TO_STR
if (flags)
FIXME("Unrecognized flag(s) %#x.\n", flags);
@ -891,6 +1018,16 @@ HRESULT hresult_from_errno(int rc)
HRESULT hresult_from_vk_result(VkResult vr)
{
/* Wine tends to dispatch Vulkan calls to its own syscall stack.
* Crashes there are captured and reported as this magic VkResult
* (0xC0000005, STATUS_ACCESS_VIOLATION, read as a signed 32-bit value).
* Report it explicitly here so it's easier to debug when it happens. */
if (vr == -1073741819)
{
ERR("Detected segfault in Wine syscall handler.\n");
/* HACK: For ad-hoc debugging, backtrace printing can also be triggered here. */
return E_POINTER;
}
switch (vr)
{
case VK_SUCCESS:
@ -900,6 +1037,9 @@ HRESULT hresult_from_vk_result(VkResult vr)
/* fall-through */
case VK_ERROR_OUT_OF_HOST_MEMORY:
return E_OUTOFMEMORY;
case VK_ERROR_VALIDATION_FAILED_EXT:
/* NV driver sometimes returns this on invalid API usage. */
return E_INVALIDARG;
default:
FIXME("Unhandled VkResult %d.\n", vr);
/* fall-through */
@ -1018,7 +1158,7 @@ static struct vkd3d_private_data *vkd3d_private_store_get_private_data(
return NULL;
}
static HRESULT vkd3d_private_store_set_private_data(struct vkd3d_private_store *store,
HRESULT vkd3d_private_store_set_private_data(struct vkd3d_private_store *store,
const GUID *tag, const void *data, unsigned int data_size, bool is_object)
{
struct vkd3d_private_data *d, *old_data;
@ -1063,18 +1203,14 @@ HRESULT vkd3d_get_private_data(struct vkd3d_private_store *store,
const GUID *tag, unsigned int *out_size, void *out)
{
const struct vkd3d_private_data *data;
HRESULT hr = S_OK;
unsigned int size;
int rc;
HRESULT hr;
if (!out_size)
return E_INVALIDARG;
if ((rc = pthread_mutex_lock(&store->mutex)))
{
ERR("Failed to lock mutex, error %d.\n", rc);
return hresult_from_errno(rc);
}
if (FAILED(hr = vkd3d_private_data_lock(store)))
return hr;
if (!(data = vkd3d_private_store_get_private_data(store, tag)))
{
@ -1099,52 +1235,28 @@ HRESULT vkd3d_get_private_data(struct vkd3d_private_store *store,
memcpy(out, data->data, data->size);
done:
pthread_mutex_unlock(&store->mutex);
vkd3d_private_data_unlock(store);
return hr;
}
HRESULT vkd3d_set_private_data(struct vkd3d_private_store *store,
const GUID *tag, unsigned int data_size, const void *data)
HRESULT STDMETHODCALLTYPE d3d12_object_SetName(ID3D12Object *iface, const WCHAR *name)
{
HRESULT hr;
int rc;
size_t size = 0;
if ((rc = pthread_mutex_lock(&store->mutex)))
{
ERR("Failed to lock mutex, error %d.\n", rc);
return hresult_from_errno(rc);
}
TRACE("iface %p, name %s.\n", iface, debugstr_w(name));
hr = vkd3d_private_store_set_private_data(store, tag, data, data_size, false);
if (name)
size = sizeof(WCHAR) * (vkd3d_wcslen(name) + 1);
pthread_mutex_unlock(&store->mutex);
return hr;
return ID3D12Object_SetPrivateData(iface, &WKPDID_D3DDebugObjectNameW, size, name);
}
HRESULT vkd3d_set_private_data_interface(struct vkd3d_private_store *store,
const GUID *tag, const IUnknown *object)
{
const void *data = object ? object : (void *)&object;
HRESULT hr;
int rc;
if ((rc = pthread_mutex_lock(&store->mutex)))
{
ERR("Failed to lock mutex, error %d.\n", rc);
return hresult_from_errno(rc);
}
hr = vkd3d_private_store_set_private_data(store, tag, data, sizeof(object), !!object);
pthread_mutex_unlock(&store->mutex);
return hr;
}
VkResult vkd3d_set_vk_object_name_utf8(struct d3d12_device *device, uint64_t vk_object,
HRESULT vkd3d_set_vk_object_name(struct d3d12_device *device, uint64_t vk_object,
VkObjectType vk_object_type, const char *name)
{
const struct vkd3d_vk_device_procs *vk_procs = &device->vk_procs;
VkDebugUtilsObjectNameInfoEXT info;
VkResult vr;
if (!device->vk_info.EXT_debug_utils)
return VK_SUCCESS;
@ -1154,28 +1266,8 @@ VkResult vkd3d_set_vk_object_name_utf8(struct d3d12_device *device, uint64_t vk_
info.objectType = vk_object_type;
info.objectHandle = vk_object;
info.pObjectName = name;
return VK_CALL(vkSetDebugUtilsObjectNameEXT(device->vk_device, &info));
}
HRESULT vkd3d_set_vk_object_name(struct d3d12_device *device, uint64_t vk_object,
VkObjectType vk_object_type, const WCHAR *name)
{
char *name_utf8;
VkResult vr;
if (!name)
return E_INVALIDARG;
if (!device->vk_info.EXT_debug_utils)
return S_OK;
if (!(name_utf8 = vkd3d_strdup_w_utf8(name, device->wchar_size, 0)))
return E_OUTOFMEMORY;
vr = vkd3d_set_vk_object_name_utf8(device, vk_object, vk_object_type, name_utf8);
vkd3d_free(name_utf8);
vr = VK_CALL(vkSetDebugUtilsObjectNameEXT(device->vk_device, &info));
return hresult_from_vk_result(vr);
}

442
libs/vkd3d/va_map.c Normal file

@ -0,0 +1,442 @@
/*
* Copyright 2021 Philip Rebohle for Valve Software
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
*/
#define VKD3D_DBG_CHANNEL VKD3D_DBG_CHANNEL_API
#include "vkd3d_private.h"
static inline VkDeviceAddress vkd3d_va_map_get_next_address(VkDeviceAddress va)
{
return va >> (VKD3D_VA_BLOCK_SIZE_BITS + VKD3D_VA_BLOCK_BITS);
}
static inline VkDeviceAddress vkd3d_va_map_get_block_address(VkDeviceAddress va)
{
return (va >> VKD3D_VA_BLOCK_SIZE_BITS) & VKD3D_VA_BLOCK_MASK;
}
static struct vkd3d_va_block *vkd3d_va_map_find_block(struct vkd3d_va_map *va_map, VkDeviceAddress va)
{
VkDeviceAddress next_address = vkd3d_va_map_get_next_address(va);
struct vkd3d_va_tree *tree = &va_map->va_tree;
while (next_address && tree)
{
tree = vkd3d_atomic_ptr_load_explicit(&tree->next[next_address & VKD3D_VA_NEXT_MASK], vkd3d_memory_order_acquire);
next_address >>= VKD3D_VA_NEXT_BITS;
}
if (!tree)
return NULL;
return &tree->blocks[vkd3d_va_map_get_block_address(va)];
}
static struct vkd3d_va_block *vkd3d_va_map_get_block(struct vkd3d_va_map *va_map, VkDeviceAddress va)
{
VkDeviceAddress next_address = vkd3d_va_map_get_next_address(va);
struct vkd3d_va_tree *tree, **tree_ptr;
tree = &va_map->va_tree;
while (next_address)
{
tree_ptr = &tree->next[next_address & VKD3D_VA_NEXT_MASK];
tree = vkd3d_atomic_ptr_load_explicit(tree_ptr, vkd3d_memory_order_acquire);
if (!tree)
{
void *orig;
tree = vkd3d_calloc(1, sizeof(*tree));
orig = vkd3d_atomic_ptr_compare_exchange(tree_ptr, NULL, tree, vkd3d_memory_order_release, vkd3d_memory_order_acquire);
if (orig)
{
vkd3d_free(tree);
tree = orig;
}
}
next_address >>= VKD3D_VA_NEXT_BITS;
}
return &tree->blocks[vkd3d_va_map_get_block_address(va)];
}
static void vkd3d_va_map_cleanup_tree(struct vkd3d_va_tree *tree)
{
unsigned int i;
for (i = 0; i < ARRAY_SIZE(tree->next); i++)
{
if (tree->next[i])
{
vkd3d_va_map_cleanup_tree(tree->next[i]);
vkd3d_free(tree->next[i]);
}
}
}
static struct vkd3d_unique_resource *vkd3d_va_map_find_small_entry(struct vkd3d_va_map *va_map,
VkDeviceAddress va, size_t *index)
{
struct vkd3d_unique_resource *resource = NULL;
size_t hi = va_map->small_entries_count;
size_t lo = 0;
while (lo < hi)
{
struct vkd3d_unique_resource *r;
size_t i = lo + (hi - lo) / 2;
r = va_map->small_entries[i];
if (va < r->va)
hi = i;
else if (va >= r->va + r->size)
lo = i + 1;
else
{
lo = hi = i;
resource = r;
}
}
if (index)
*index = lo;
return resource;
}
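`vkd3d_va_map_find_small_entry` is a binary search over entries sorted by address that does two jobs: it returns the entry whose `[va, va + size)` range contains the address, and it reports the index where a new entry would be inserted when nothing matches. The sketch below reproduces that logic in isolation, assuming sorted, non-overlapping ranges. `struct range` and `find_range` are hypothetical names, not vkd3d types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct vkd3d_unique_resource. */
struct range
{
    uint64_t va;
    uint64_t size;
};

/* entries must be sorted by va and must not overlap.
 * Returns the range containing va, or NULL if there is none.
 * If index is non-NULL, it receives the position of the match,
 * or the position at which a range starting at va would be inserted. */
static const struct range *find_range(const struct range **entries, size_t count,
        uint64_t va, size_t *index)
{
    const struct range *found = NULL;
    size_t lo = 0, hi = count;

    while (lo < hi)
    {
        size_t i = lo + (hi - lo) / 2;
        const struct range *r = entries[i];

        if (va < r->va)
            hi = i;
        else if (va >= r->va + r->size)
            lo = i + 1;
        else
        {
            /* va lies inside r; collapse the interval to end the loop. */
            lo = hi = i;
            found = r;
        }
    }

    if (index)
        *index = lo;
    return found;
}
```

Returning the insertion index from the same search lets an insert path keep the array sorted with a single `memmove`, without a second pass.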

void vkd3d_va_map_insert(struct vkd3d_va_map *va_map, struct vkd3d_unique_resource *resource)
{
    VkDeviceAddress block_va, min_va, max_va;
    struct vkd3d_va_block *block;
    size_t index;

    if (resource->size >= VKD3D_VA_BLOCK_SIZE)
    {
        min_va = resource->va;
        max_va = resource->va + resource->size;
        block_va = min_va & ~VKD3D_VA_LO_MASK;

        while (block_va < max_va)
        {
            block = vkd3d_va_map_get_block(va_map, block_va);

            if (block_va < min_va)
            {
                vkd3d_atomic_uint64_store_explicit(&block->r.va, min_va, vkd3d_memory_order_relaxed);
                vkd3d_atomic_ptr_store_explicit(&block->r.resource, resource, vkd3d_memory_order_relaxed);
            }
            else
            {
                vkd3d_atomic_uint64_store_explicit(&block->l.va, max_va, vkd3d_memory_order_relaxed);
                vkd3d_atomic_ptr_store_explicit(&block->l.resource, resource, vkd3d_memory_order_relaxed);
            }

            block_va += VKD3D_VA_BLOCK_SIZE;
        }
    }
    else
    {
        pthread_mutex_lock(&va_map->mutex);

        if (!vkd3d_va_map_find_small_entry(va_map, resource->va, &index))
        {
            vkd3d_array_reserve((void**)&va_map->small_entries, &va_map->small_entries_size,
                    va_map->small_entries_count + 1, sizeof(*va_map->small_entries));

            memmove(&va_map->small_entries[index + 1], &va_map->small_entries[index],
                    sizeof(*va_map->small_entries) * (va_map->small_entries_count - index));

            va_map->small_entries[index] = resource;
            va_map->small_entries_count += 1;
        }

        pthread_mutex_unlock(&va_map->mutex);
    }
}

void vkd3d_va_map_remove(struct vkd3d_va_map *va_map, const struct vkd3d_unique_resource *resource)
{
    VkDeviceAddress block_va, min_va, max_va;
    struct vkd3d_va_block *block;
    size_t index;

    if (resource->size >= VKD3D_VA_BLOCK_SIZE)
    {
        min_va = resource->va;
        max_va = resource->va + resource->size;
        block_va = min_va & ~VKD3D_VA_LO_MASK;

        while (block_va < max_va)
        {
            block = vkd3d_va_map_get_block(va_map, block_va);

            if (vkd3d_atomic_ptr_load_explicit(&block->l.resource, vkd3d_memory_order_relaxed) == resource)
            {
                vkd3d_atomic_uint64_store_explicit(&block->l.va, 0, vkd3d_memory_order_relaxed);
                vkd3d_atomic_ptr_store_explicit(&block->l.resource, NULL, vkd3d_memory_order_relaxed);
            }
            else if (vkd3d_atomic_ptr_load_explicit(&block->r.resource, vkd3d_memory_order_relaxed) == resource)
            {
                vkd3d_atomic_uint64_store_explicit(&block->r.va, 0, vkd3d_memory_order_relaxed);
                vkd3d_atomic_ptr_store_explicit(&block->r.resource, NULL, vkd3d_memory_order_relaxed);
            }

            block_va += VKD3D_VA_BLOCK_SIZE;
        }
    }
    else
    {
        pthread_mutex_lock(&va_map->mutex);

        if (vkd3d_va_map_find_small_entry(va_map, resource->va, &index) == resource)
        {
            va_map->small_entries_count -= 1;

            memmove(&va_map->small_entries[index], &va_map->small_entries[index + 1],
                    sizeof(*va_map->small_entries) * (va_map->small_entries_count - index));
        }

        pthread_mutex_unlock(&va_map->mutex);
    }
}

static struct vkd3d_unique_resource *vkd3d_va_map_deref_mutable(struct vkd3d_va_map *va_map, VkDeviceAddress va)
{
    struct vkd3d_va_block *block = vkd3d_va_map_find_block(va_map, va);
    struct vkd3d_unique_resource *resource = NULL;

    if (block)
    {
        if (va < vkd3d_atomic_uint64_load_explicit(&block->l.va, vkd3d_memory_order_relaxed))
            resource = vkd3d_atomic_ptr_load_explicit(&block->l.resource, vkd3d_memory_order_relaxed);
        else if (va >= vkd3d_atomic_uint64_load_explicit(&block->r.va, vkd3d_memory_order_relaxed))
            resource = vkd3d_atomic_ptr_load_explicit(&block->r.resource, vkd3d_memory_order_relaxed);
    }

    if (!resource)
    {
        pthread_mutex_lock(&va_map->mutex);
        resource = vkd3d_va_map_find_small_entry(va_map, va, NULL);
        pthread_mutex_unlock(&va_map->mutex);
    }

    return resource;
}

const struct vkd3d_unique_resource *vkd3d_va_map_deref(struct vkd3d_va_map *va_map, VkDeviceAddress va)
{
    return vkd3d_va_map_deref_mutable(va_map, va);
}

VkAccelerationStructureKHR vkd3d_va_map_place_acceleration_structure(struct vkd3d_va_map *va_map,
        struct d3d12_device *device,
        VkDeviceAddress va)
{
    struct vkd3d_unique_resource *resource;
    struct vkd3d_view_map *old_view_map;
    struct vkd3d_view_map *view_map;
    const struct vkd3d_view *view;
    struct vkd3d_view_key key;

    resource = vkd3d_va_map_deref_mutable(va_map, va);
    if (!resource || !resource->va)
        return VK_NULL_HANDLE;

    view_map = vkd3d_atomic_ptr_load_explicit(&resource->view_map, vkd3d_memory_order_acquire);
    if (!view_map)
    {
        /* This is the first time we attempt to place an AS on top of this allocation, so
         * CAS in a pointer. */
        view_map = vkd3d_malloc(sizeof(*view_map));
        if (!view_map)
            return VK_NULL_HANDLE;

        if (FAILED(vkd3d_view_map_init(view_map)))
        {
            vkd3d_free(view_map);
            return VK_NULL_HANDLE;
        }

        /* Need to release in case other RTASes are placed at the same time, so they observe
         * the initialized view map, and need to acquire if some other thread placed it. */
        old_view_map = vkd3d_atomic_ptr_compare_exchange(&resource->view_map, NULL, view_map,
                vkd3d_memory_order_release, vkd3d_memory_order_acquire);
        if (old_view_map)
        {
            vkd3d_view_map_destroy(view_map, device);
            vkd3d_free(view_map);
            view_map = old_view_map;
        }
    }

    key.view_type = VKD3D_VIEW_TYPE_ACCELERATION_STRUCTURE;
    key.u.buffer.buffer = resource->vk_buffer;
    key.u.buffer.offset = va - resource->va;
    key.u.buffer.size = resource->size - key.u.buffer.offset;
    key.u.buffer.format = NULL;

    view = vkd3d_view_map_create_view(view_map, device, &key);
    if (!view)
        return VK_NULL_HANDLE;

    return view->vk_acceleration_structure;
}

#define VKD3D_FAKE_VA_ALIGNMENT (65536)

VkDeviceAddress vkd3d_va_map_alloc_fake_va(struct vkd3d_va_map *va_map, VkDeviceSize size)
{
    struct vkd3d_va_allocator *allocator = &va_map->va_allocator;
    struct vkd3d_va_range range;
    VkDeviceAddress va;
    size_t i;
    int rc;

    if ((rc = pthread_mutex_lock(&allocator->mutex)))
    {
        ERR("Failed to lock mutex, rc %d.\n", rc);
        return 0;
    }

    memset(&range, 0, sizeof(range));
    size = align(size, VKD3D_FAKE_VA_ALIGNMENT);

    /* The free list is ordered in such a way that the largest range
     * is always first, so we don't have to iterate over the list */
    if (allocator->free_range_count)
        range = allocator->free_ranges[0];

    if (range.size >= size)
    {
        va = range.base;
        range.base += size;
        range.size -= size;

        for (i = 0; i < allocator->free_range_count - 1; i++)
        {
            if (allocator->free_ranges[i + 1].size > range.size)
                allocator->free_ranges[i] = allocator->free_ranges[i + 1];
            else
                break;
        }

        if (range.size)
            allocator->free_ranges[i] = range;
        else
            allocator->free_range_count--;
    }
    else
    {
        va = allocator->next_va;
        allocator->next_va += size;
    }

    pthread_mutex_unlock(&allocator->mutex);
    return va;
}

void vkd3d_va_map_free_fake_va(struct vkd3d_va_map *va_map, VkDeviceAddress va, VkDeviceSize size)
{
    struct vkd3d_va_allocator *allocator = &va_map->va_allocator;
    size_t range_idx, range_shift, i;
    struct vkd3d_va_range new_range;
    int rc;

    if ((rc = pthread_mutex_lock(&allocator->mutex)))
    {
        ERR("Failed to lock mutex, rc %d.\n", rc);
        return;
    }

    new_range.base = va;
    new_range.size = align(size, VKD3D_FAKE_VA_ALIGNMENT);

    range_idx = allocator->free_range_count;
    range_shift = 0;

    /* Find and effectively delete any free range adjacent to new_range */
    for (i = 0; i < allocator->free_range_count; i++)
    {
        const struct vkd3d_va_range *cur_range = &allocator->free_ranges[i];

        if (range_shift)
            allocator->free_ranges[i - range_shift] = *cur_range;

        if (cur_range->base == new_range.base + new_range.size || cur_range->base + cur_range->size == new_range.base)
        {
            if (range_idx == allocator->free_range_count)
                range_idx = i;
            else
                range_shift++;

            new_range.base = min(new_range.base, cur_range->base);
            new_range.size += cur_range->size;
        }
    }

    if (range_idx == allocator->free_range_count)
    {
        /* range_idx will be valid and point to the last element afterwards */
        if (!(vkd3d_array_reserve((void **)&allocator->free_ranges, &allocator->free_ranges_size,
                allocator->free_range_count + 1, sizeof(*allocator->free_ranges))))
        {
            ERR("Failed to add free range.\n");
            pthread_mutex_unlock(&allocator->mutex);
            return;
        }

        allocator->free_range_count += 1;
    }
    else
        allocator->free_range_count -= range_shift;

    /* Move ranges smaller than our new free range back to keep the list ordered */
    while (range_idx && allocator->free_ranges[range_idx - 1].size < new_range.size)
    {
        allocator->free_ranges[range_idx] = allocator->free_ranges[range_idx - 1];
        range_idx--;
    }

    allocator->free_ranges[range_idx] = new_range;
    pthread_mutex_unlock(&allocator->mutex);
}

void vkd3d_va_map_init(struct vkd3d_va_map *va_map)
{
    memset(va_map, 0, sizeof(*va_map));
    pthread_mutex_init(&va_map->mutex, NULL);
    pthread_mutex_init(&va_map->va_allocator.mutex, NULL);

    /* Make sure we never return 0 as a valid VA */
    va_map->va_allocator.next_va = VKD3D_VA_BLOCK_SIZE;
}

void vkd3d_va_map_cleanup(struct vkd3d_va_map *va_map)
{
    vkd3d_va_map_cleanup_tree(&va_map->va_tree);

    pthread_mutex_destroy(&va_map->va_allocator.mutex);
    pthread_mutex_destroy(&va_map->mutex);

    vkd3d_free(va_map->va_allocator.free_ranges);
    vkd3d_free(va_map->small_entries);
}
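
The small-entry lookup in the file above is a binary search over an array of non-overlapping address ranges sorted by base address. On a miss it still reports the index at which a new range would have to be inserted, which is what lets the insert path reuse the same routine. The following is a minimal standalone sketch of that idea; `struct range` and `find_range` are hypothetical names standing in for `struct vkd3d_unique_resource` and `vkd3d_va_map_find_small_entry`, and the sketch leaves out the mutex that guards the real array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct vkd3d_unique_resource: only the two
 * fields the lookup needs. */
struct range
{
    uint64_t va;
    uint64_t size;
};

/* Binary search over an array sorted by va whose ranges do not overlap.
 * Returns the range containing va, or NULL if there is none. If index is
 * non-NULL it receives the position of the hit, or on a miss the position
 * where a range starting at va would be inserted to keep the array sorted. */
static const struct range *find_range(const struct range *ranges, size_t count,
        uint64_t va, size_t *index)
{
    const struct range *found = NULL;
    size_t lo = 0, hi = count;

    while (lo < hi)
    {
        size_t i = lo + (hi - lo) / 2;
        const struct range *r = &ranges[i];

        if (va < r->va)
            hi = i;
        else if (va >= r->va + r->size)
            lo = i + 1;
        else
        {
            /* Hit: collapsing the interval ends the loop with lo == i. */
            lo = hi = i;
            found = r;
        }
    }

    if (index)
        *index = lo;
    return found;
}
```

Because the miss case yields a valid insertion index, a caller can shift the tail of the array up by one slot with `memmove` and store the new entry at that index, as the insert function above does.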

@@ -0,0 +1,94 @@
/*
 * Copyright 2020 Hans-Kristian Arntzen for Valve Corporation
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
 */
#ifndef __VKD3D_DESCRIPTOR_DEBUG_H
#define __VKD3D_DESCRIPTOR_DEBUG_H
#include "vkd3d_private.h"
#include "vkd3d_descriptor_qa_data.h"
/* Cost is 1 bit per cookie, so the default pool of 2 billion cookies takes about 250 MB of host memory.
 * That is a reasonable cost, and overflowing this pool should never happen. */
#define VKD3D_DESCRIPTOR_DEBUG_DEFAULT_NUM_COOKIES (2 * 1000 * 1000 * 1000)
#define VKD3D_DESCRIPTOR_DEBUG_NUM_PAD_DESCRIPTORS 1
#ifdef VKD3D_ENABLE_DESCRIPTOR_QA
HRESULT vkd3d_descriptor_debug_alloc_global_info(
        struct vkd3d_descriptor_qa_global_info **global_info,
        unsigned int num_cookies,
        struct d3d12_device *device);
void vkd3d_descriptor_debug_free_global_info(
        struct vkd3d_descriptor_qa_global_info *global_info,
        struct d3d12_device *device);

void vkd3d_descriptor_debug_kick_qa_check(struct vkd3d_descriptor_qa_global_info *global_info);

const VkDescriptorBufferInfo *vkd3d_descriptor_debug_get_global_info_descriptor(
        struct vkd3d_descriptor_qa_global_info *global_info);

void vkd3d_descriptor_debug_init(void);
bool vkd3d_descriptor_debug_active_log(void);
bool vkd3d_descriptor_debug_active_qa_checks(void);

void vkd3d_descriptor_debug_register_heap(
        struct vkd3d_descriptor_qa_heap_buffer_data *heap, uint64_t cookie,
        const D3D12_DESCRIPTOR_HEAP_DESC *desc);
void vkd3d_descriptor_debug_unregister_heap(uint64_t cookie);

void vkd3d_descriptor_debug_register_resource_cookie(
        struct vkd3d_descriptor_qa_global_info *global_info,
        uint64_t cookie, const D3D12_RESOURCE_DESC1 *desc);
void vkd3d_descriptor_debug_register_allocation_cookie(
        struct vkd3d_descriptor_qa_global_info *global_info,
        uint64_t cookie, const struct vkd3d_allocate_memory_info *info);
void vkd3d_descriptor_debug_register_view_cookie(
        struct vkd3d_descriptor_qa_global_info *global_info,
        uint64_t cookie, uint64_t resource_cookie);
void vkd3d_descriptor_debug_unregister_cookie(
        struct vkd3d_descriptor_qa_global_info *global_info,
        uint64_t cookie);

void vkd3d_descriptor_debug_write_descriptor(
        struct vkd3d_descriptor_qa_heap_buffer_data *heap, uint64_t heap_cookie, uint32_t offset,
        vkd3d_descriptor_qa_flags type_flags, uint64_t cookie);
void vkd3d_descriptor_debug_copy_descriptor(
        struct vkd3d_descriptor_qa_heap_buffer_data *dst_heap, uint64_t dst_heap_cookie, uint32_t dst_offset,
        struct vkd3d_descriptor_qa_heap_buffer_data *src_heap, uint64_t src_heap_cookie, uint32_t src_offset,
        uint64_t cookie);

VkDeviceSize vkd3d_descriptor_debug_heap_info_size(unsigned int num_descriptors);
#else
#define vkd3d_descriptor_debug_alloc_global_info(global_info, num_cookies, device) (S_OK)
#define vkd3d_descriptor_debug_free_global_info(global_info, device) ((void)0)
#define vkd3d_descriptor_debug_kick_qa_check(global_info) ((void)0)
#define vkd3d_descriptor_debug_get_global_info_descriptor(global_info) ((const VkDescriptorBufferInfo *)NULL)
#define vkd3d_descriptor_debug_init() ((void)0)
#define vkd3d_descriptor_debug_active_log() (false)
#define vkd3d_descriptor_debug_active_qa_checks() (false)
#define vkd3d_descriptor_debug_register_heap(heap, cookie, desc) ((void)0)
#define vkd3d_descriptor_debug_unregister_heap(cookie) ((void)0)
#define vkd3d_descriptor_debug_register_resource_cookie(global_info, cookie, desc) ((void)0)
#define vkd3d_descriptor_debug_register_allocation_cookie(global_info, cookie, info) ((void)0)
#define vkd3d_descriptor_debug_register_view_cookie(global_info, cookie, resource_cookie) ((void)0)
#define vkd3d_descriptor_debug_unregister_cookie(global_info, cookie) ((void)0)
#define vkd3d_descriptor_debug_write_descriptor(heap, heap_cookie, offset, type_flags, cookie) ((void)0)
#define vkd3d_descriptor_debug_copy_descriptor(dst_heap, dst_heap_cookie, dst_offset, src_heap, src_heap_cookie, src_offset, cookie) ((void)0)
#define vkd3d_descriptor_debug_heap_info_size(num_descriptors) 0
#endif
#endif
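
The header above follows a common pattern: when the feature switch is off, every entry point is replaced by a macro that compiles to nothing, so call sites need no `#ifdef` of their own. A minimal sketch of the pattern follows. All `example_` and `EXAMPLE_` names are hypothetical, and `EXAMPLE_ENABLE_QA` is deliberately left undefined so that the stub branch is the one compiled. The point worth noting is that a stub for a value-returning function has to expand to an expression of a compatible type, otherwise a call such as `if (example_qa_active())` does not compile.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* EXAMPLE_ENABLE_QA is a hypothetical switch playing the role of
 * VKD3D_ENABLE_DESCRIPTOR_QA. It is not defined in this sketch. */
#ifdef EXAMPLE_ENABLE_QA
bool example_qa_active(void);
void example_qa_register_cookie(uint64_t cookie);
#else
/* Stub for a bool-returning function: expands to a bool expression. */
#define example_qa_active() (false)
/* Stub for a void function: evaluates nothing, still needs a semicolon. */
#define example_qa_register_cookie(cookie) ((void)0)
#endif

/* The caller is written once and is identical in both configurations.
 * Returns how many QA checks were run for this cookie. */
static unsigned int example_track(uint64_t cookie)
{
    unsigned int checks_run = 0;

    example_qa_register_cookie(cookie);
    if (cookie && example_qa_active())
        checks_run++;

    return checks_run;
}
```

One consequence of macro stubs is that their arguments are never evaluated, so call sites should not rely on side effects inside the argument list.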

@@ -32,11 +32,6 @@ VKD3D_EXPORT HRESULT vkd3d_create_device(const struct vkd3d_device_create_info *
if (!create_info)
return E_INVALIDARG;
if (create_info->type != VKD3D_STRUCTURE_TYPE_DEVICE_CREATE_INFO)
{
WARN("Invalid structure type %#x.\n", create_info->type);
return E_INVALIDARG;
}
if (!create_info->instance && !create_info->instance_create_info)
{
ERR("Instance or instance create info is required.\n");
@@ -168,31 +163,43 @@ static CONST_VTBL struct ID3D12RootSignatureDeserializerVtbl d3d12_root_signatur
d3d12_root_signature_deserializer_GetRootSignatureDesc,
};
int vkd3d_parse_root_signature_v_1_0(const struct vkd3d_shader_code *dxbc,
struct vkd3d_versioned_root_signature_desc *out_desc)
static int vkd3d_parse_root_signature_for_version(const struct vkd3d_shader_code *dxbc,
struct vkd3d_versioned_root_signature_desc *out_desc,
enum vkd3d_root_signature_version target_version,
bool raw_payload,
vkd3d_shader_hash_t *compatibility_hash)
{
struct vkd3d_versioned_root_signature_desc desc, converted_desc;
int ret;
if ((ret = vkd3d_shader_parse_root_signature(dxbc, &desc)) < 0)
if (raw_payload)
{
WARN("Failed to parse root signature, vkd3d result %d.\n", ret);
return ret;
if ((ret = vkd3d_shader_parse_root_signature_raw(dxbc->code, dxbc->size, &desc, compatibility_hash)) < 0)
{
WARN("Failed to parse root signature, vkd3d result %d.\n", ret);
return ret;
}
}
else
{
if ((ret = vkd3d_shader_parse_root_signature(dxbc, &desc, compatibility_hash)) < 0)
{
WARN("Failed to parse root signature, vkd3d result %d.\n", ret);
return ret;
}
}
if (desc.version == VKD3D_ROOT_SIGNATURE_VERSION_1_0)
if (desc.version == target_version)
{
*out_desc = desc;
}
else
{
enum vkd3d_root_signature_version version = desc.version;
ret = vkd3d_shader_convert_root_signature(&converted_desc, VKD3D_ROOT_SIGNATURE_VERSION_1_0, &desc);
ret = vkd3d_shader_convert_root_signature(&converted_desc, target_version, &desc);
vkd3d_shader_free_root_signature(&desc);
if (ret < 0)
{
WARN("Failed to convert from version %#x, vkd3d result %d.\n", version, ret);
WARN("Failed to convert from version %#x, vkd3d result %d.\n", desc.version, ret);
return ret;
}
@@ -202,6 +209,30 @@ int vkd3d_parse_root_signature_v_1_0(const struct vkd3d_shader_code *dxbc,
return ret;
}
int vkd3d_parse_root_signature_v_1_0(const struct vkd3d_shader_code *dxbc,
struct vkd3d_versioned_root_signature_desc *out_desc,
vkd3d_shader_hash_t *compatibility_hash)
{
return vkd3d_parse_root_signature_for_version(dxbc, out_desc, VKD3D_ROOT_SIGNATURE_VERSION_1_0, false,
compatibility_hash);
}
int vkd3d_parse_root_signature_v_1_1(const struct vkd3d_shader_code *dxbc,
struct vkd3d_versioned_root_signature_desc *out_desc,
vkd3d_shader_hash_t *compatibility_hash)
{
return vkd3d_parse_root_signature_for_version(dxbc, out_desc, VKD3D_ROOT_SIGNATURE_VERSION_1_1, false,
compatibility_hash);
}
int vkd3d_parse_root_signature_v_1_1_from_raw_payload(const struct vkd3d_shader_code *dxbc,
struct vkd3d_versioned_root_signature_desc *out_desc,
vkd3d_shader_hash_t *compatibility_hash)
{
return vkd3d_parse_root_signature_for_version(dxbc, out_desc, VKD3D_ROOT_SIGNATURE_VERSION_1_1, true,
compatibility_hash);
}
static HRESULT d3d12_root_signature_deserializer_init(struct d3d12_root_signature_deserializer *deserializer,
const struct vkd3d_shader_code *dxbc)
{
@@ -210,7 +241,7 @@ static HRESULT d3d12_root_signature_deserializer_init(struct d3d12_root_signatur
deserializer->ID3D12RootSignatureDeserializer_iface.lpVtbl = &d3d12_root_signature_deserializer_vtbl;
deserializer->refcount = 1;
if ((ret = vkd3d_parse_root_signature_v_1_0(dxbc, &deserializer->desc.vkd3d)) < 0)
if ((ret = vkd3d_parse_root_signature_v_1_0(dxbc, &deserializer->desc.vkd3d, NULL)) < 0)
return hresult_from_vkd3d_result(ret);
return S_OK;
@@ -388,7 +419,7 @@ static HRESULT d3d12_versioned_root_signature_deserializer_init(struct d3d12_ver
deserializer->ID3D12VersionedRootSignatureDeserializer_iface.lpVtbl = &d3d12_versioned_root_signature_deserializer_vtbl;
deserializer->refcount = 1;
if ((ret = vkd3d_shader_parse_root_signature(dxbc, &deserializer->desc.vkd3d)) < 0)
if ((ret = vkd3d_shader_parse_root_signature(dxbc, &deserializer->desc.vkd3d, NULL)) < 0)
{
WARN("Failed to parse root signature, vkd3d result %d.\n", ret);
return hresult_from_vkd3d_result(ret);

File diff suppressed because it is too large

@@ -26,6 +26,7 @@
bool vkd3d_renderdoc_active(void);
bool vkd3d_renderdoc_loaded_api(void);
bool vkd3d_renderdoc_should_capture_shader_hash(vkd3d_shader_hash_t hash);
bool vkd3d_renderdoc_global_capture_enabled(void);
bool vkd3d_renderdoc_begin_capture(void *instance);
void vkd3d_renderdoc_end_capture(void *instance);

@@ -41,10 +41,18 @@ enum vkd3d_meta_copy_mode
#include <cs_clear_uav_image_2d_uint.h>
#include <cs_clear_uav_image_3d_float.h>
#include <cs_clear_uav_image_3d_uint.h>
#include <cs_predicate_command.h>
#include <cs_resolve_binary_queries.h>
#include <cs_resolve_predicate.h>
#include <cs_resolve_query.h>
#include <cs_execute_indirect_patch.h>
#include <cs_execute_indirect_patch_debug_ring.h>
#include <vs_fullscreen_layer.h>
#include <vs_fullscreen.h>
#include <gs_fullscreen.h>
#include <fs_copy_image_float.h>
#include <fs_copy_image_uint.h>
#include <fs_copy_image_stencil.h>
#include <vs_swapchain_fullscreen.h>
#include <fs_swapchain_fullscreen.h>

@@ -41,6 +41,7 @@ VK_INSTANCE_PFN(vkEnumeratePhysicalDevices)
VK_INSTANCE_PFN(vkGetDeviceProcAddr)
VK_INSTANCE_PFN(vkGetPhysicalDeviceFeatures)
VK_INSTANCE_PFN(vkGetPhysicalDeviceFormatProperties)
VK_INSTANCE_PFN(vkGetPhysicalDeviceFormatProperties2)
VK_INSTANCE_PFN(vkGetPhysicalDeviceImageFormatProperties)
VK_INSTANCE_PFN(vkGetPhysicalDeviceMemoryProperties)
VK_INSTANCE_PFN(vkGetPhysicalDeviceProperties)
@@ -48,6 +49,7 @@ VK_INSTANCE_PFN(vkGetPhysicalDeviceQueueFamilyProperties)
VK_INSTANCE_PFN(vkGetPhysicalDeviceSparseImageFormatProperties)
VK_INSTANCE_PFN(vkGetPhysicalDeviceFeatures2)
VK_INSTANCE_PFN(vkGetPhysicalDeviceProperties2)
VK_INSTANCE_PFN(vkGetPhysicalDeviceExternalSemaphoreProperties)
/* VK_EXT_debug_utils */
VK_INSTANCE_EXT_PFN(vkCreateDebugUtilsMessengerEXT)
@@ -60,22 +62,14 @@ VK_DEVICE_PFN(vkAllocateCommandBuffers)
VK_DEVICE_PFN(vkAllocateDescriptorSets)
VK_DEVICE_PFN(vkAllocateMemory)
VK_DEVICE_PFN(vkBeginCommandBuffer)
VK_DEVICE_PFN(vkBindBufferMemory)
VK_DEVICE_PFN(vkBindImageMemory)
VK_DEVICE_PFN(vkCmdBeginQuery)
VK_DEVICE_PFN(vkCmdBeginRenderPass)
VK_DEVICE_PFN(vkCmdBindDescriptorSets)
VK_DEVICE_PFN(vkCmdBindIndexBuffer)
VK_DEVICE_PFN(vkCmdBindPipeline)
VK_DEVICE_PFN(vkCmdBindVertexBuffers)
VK_DEVICE_PFN(vkCmdBlitImage)
VK_DEVICE_PFN(vkCmdClearAttachments)
VK_DEVICE_PFN(vkCmdClearColorImage)
VK_DEVICE_PFN(vkCmdClearDepthStencilImage)
VK_DEVICE_PFN(vkCmdCopyBuffer)
VK_DEVICE_PFN(vkCmdCopyBufferToImage)
VK_DEVICE_PFN(vkCmdCopyImage)
VK_DEVICE_PFN(vkCmdCopyImageToBuffer)
VK_DEVICE_PFN(vkCmdCopyQueryPoolResults)
VK_DEVICE_PFN(vkCmdDispatch)
VK_DEVICE_PFN(vkCmdDispatchIndirect)
@@ -84,7 +78,6 @@ VK_DEVICE_PFN(vkCmdDrawIndexed)
VK_DEVICE_PFN(vkCmdDrawIndexedIndirect)
VK_DEVICE_PFN(vkCmdDrawIndirect)
VK_DEVICE_PFN(vkCmdEndQuery)
VK_DEVICE_PFN(vkCmdEndRenderPass)
VK_DEVICE_PFN(vkCmdExecuteCommands)
VK_DEVICE_PFN(vkCmdFillBuffer)
VK_DEVICE_PFN(vkCmdNextSubpass)
@@ -92,7 +85,6 @@ VK_DEVICE_PFN(vkCmdPipelineBarrier)
VK_DEVICE_PFN(vkCmdPushConstants)
VK_DEVICE_PFN(vkCmdResetEvent)
VK_DEVICE_PFN(vkCmdResetQueryPool)
VK_DEVICE_PFN(vkCmdResolveImage)
VK_DEVICE_PFN(vkCmdSetBlendConstants)
VK_DEVICE_PFN(vkCmdSetDepthBias)
VK_DEVICE_PFN(vkCmdSetDepthBounds)
@@ -121,7 +113,6 @@ VK_DEVICE_PFN(vkCreateImageView)
VK_DEVICE_PFN(vkCreatePipelineCache)
VK_DEVICE_PFN(vkCreatePipelineLayout)
VK_DEVICE_PFN(vkCreateQueryPool)
VK_DEVICE_PFN(vkCreateRenderPass)
VK_DEVICE_PFN(vkCreateSampler)
VK_DEVICE_PFN(vkCreateSemaphore)
VK_DEVICE_PFN(vkCreateShaderModule)
@@ -139,7 +130,6 @@ VK_DEVICE_PFN(vkDestroyPipeline)
VK_DEVICE_PFN(vkDestroyPipelineCache)
VK_DEVICE_PFN(vkDestroyPipelineLayout)
VK_DEVICE_PFN(vkDestroyQueryPool)
VK_DEVICE_PFN(vkDestroyRenderPass)
VK_DEVICE_PFN(vkDestroySampler)
VK_DEVICE_PFN(vkDestroySemaphore)
VK_DEVICE_PFN(vkDestroyShaderModule)
@@ -163,7 +153,6 @@ VK_DEVICE_PFN(vkGetImageSparseMemoryRequirements2)
VK_DEVICE_PFN(vkGetImageSubresourceLayout)
VK_DEVICE_PFN(vkGetPipelineCacheData)
VK_DEVICE_PFN(vkGetQueryPoolResults)
VK_DEVICE_PFN(vkGetRenderAreaGranularity)
VK_DEVICE_PFN(vkInvalidateMappedMemoryRanges)
VK_DEVICE_PFN(vkMapMemory)
VK_DEVICE_PFN(vkMergePipelineCaches)
@@ -197,6 +186,56 @@ VK_DEVICE_EXT_PFN(vkCmdDrawIndexedIndirectCountKHR)
/* VK_KHR_push_descriptor */
VK_DEVICE_EXT_PFN(vkCmdPushDescriptorSetKHR)
/* VK_KHR_ray_tracing_pipeline */
VK_DEVICE_EXT_PFN(vkCreateRayTracingPipelinesKHR)
VK_DEVICE_EXT_PFN(vkGetRayTracingShaderGroupHandlesKHR)
VK_DEVICE_EXT_PFN(vkGetRayTracingShaderGroupStackSizeKHR)
VK_DEVICE_EXT_PFN(vkCmdSetRayTracingPipelineStackSizeKHR)
VK_DEVICE_EXT_PFN(vkCmdTraceRaysKHR)
VK_DEVICE_EXT_PFN(vkCmdTraceRaysIndirectKHR)
/* VK_KHR_acceleration_structure */
VK_DEVICE_EXT_PFN(vkGetAccelerationStructureBuildSizesKHR)
VK_DEVICE_EXT_PFN(vkCreateAccelerationStructureKHR)
VK_DEVICE_EXT_PFN(vkDestroyAccelerationStructureKHR)
VK_DEVICE_EXT_PFN(vkGetAccelerationStructureDeviceAddressKHR)
VK_DEVICE_EXT_PFN(vkCmdBuildAccelerationStructuresKHR)
VK_DEVICE_EXT_PFN(vkCmdWriteAccelerationStructuresPropertiesKHR)
VK_DEVICE_EXT_PFN(vkCmdCopyAccelerationStructureKHR)
/* VK_KHR_fragment_shading_rate */
VK_INSTANCE_EXT_PFN(vkGetPhysicalDeviceFragmentShadingRatesKHR)
VK_DEVICE_EXT_PFN(vkCmdSetFragmentShadingRateKHR)
/* VK_KHR_bind_memory2 */
VK_DEVICE_EXT_PFN(vkBindBufferMemory2KHR)
VK_DEVICE_EXT_PFN(vkBindImageMemory2KHR)
/* VK_KHR_copy_commands2 */
VK_DEVICE_EXT_PFN(vkCmdBlitImage2KHR)
VK_DEVICE_EXT_PFN(vkCmdCopyBuffer2KHR)
VK_DEVICE_EXT_PFN(vkCmdCopyBufferToImage2KHR)
VK_DEVICE_EXT_PFN(vkCmdCopyImage2KHR)
VK_DEVICE_EXT_PFN(vkCmdCopyImageToBuffer2KHR)
VK_DEVICE_EXT_PFN(vkCmdResolveImage2KHR)
/* VK_KHR_maintenance4 */
VK_DEVICE_EXT_PFN(vkGetDeviceBufferMemoryRequirementsKHR)
VK_DEVICE_EXT_PFN(vkGetDeviceImageMemoryRequirementsKHR)
VK_DEVICE_EXT_PFN(vkGetDeviceImageSparseMemoryRequirementsKHR)
#ifdef VK_KHR_external_memory_win32
/* VK_KHR_external_memory_win32 */
VK_DEVICE_EXT_PFN(vkGetMemoryWin32HandleKHR)
VK_DEVICE_EXT_PFN(vkGetMemoryWin32HandlePropertiesKHR)
#endif
#ifdef VK_KHR_external_semaphore_win32
/* VK_KHR_external_semaphore_win32 */
VK_DEVICE_EXT_PFN(vkGetSemaphoreWin32HandleKHR)
VK_DEVICE_EXT_PFN(vkImportSemaphoreWin32HandleKHR)
#endif
/* VK_EXT_calibrated_timestamps */
VK_DEVICE_EXT_PFN(vkGetCalibratedTimestampsEXT)
VK_INSTANCE_EXT_PFN(vkGetPhysicalDeviceCalibrateableTimeDomainsEXT)
@@ -224,6 +263,9 @@ VK_DEVICE_EXT_PFN(vkCmdSetPrimitiveTopologyEXT)
VK_DEVICE_EXT_PFN(vkCmdSetScissorWithCountEXT)
VK_DEVICE_EXT_PFN(vkCmdSetViewportWithCountEXT)
/* VK_EXT_extended_dynamic_state2 */
VK_DEVICE_EXT_PFN(vkCmdSetPrimitiveRestartEnableEXT)
/* VK_EXT_external_memory_host */
VK_DEVICE_EXT_PFN(vkGetMemoryHostPointerPropertiesEXT)
@@ -247,9 +289,41 @@ VK_DEVICE_EXT_PFN(vkGetSwapchainImagesKHR)
VK_DEVICE_EXT_PFN(vkAcquireNextImageKHR)
VK_DEVICE_EXT_PFN(vkQueuePresentKHR)
/* VK_KHR_dynamic_rendering */
VK_DEVICE_EXT_PFN(vkCmdBeginRenderingKHR)
VK_DEVICE_EXT_PFN(vkCmdEndRenderingKHR)
/* VK_KHR_ray_tracing_maintenance1 */
VK_DEVICE_EXT_PFN(vkCmdTraceRaysIndirect2KHR)
/* VK_AMD_buffer_marker */
VK_DEVICE_EXT_PFN(vkCmdWriteBufferMarkerAMD)
/* VK_NV_device_diagnostic_checkpoints */
VK_DEVICE_EXT_PFN(vkCmdSetCheckpointNV)
VK_DEVICE_EXT_PFN(vkGetQueueCheckpointDataNV)
/* VK_NVX_binary_import */
VK_DEVICE_EXT_PFN(vkCreateCuModuleNVX)
VK_DEVICE_EXT_PFN(vkCreateCuFunctionNVX)
VK_DEVICE_EXT_PFN(vkDestroyCuModuleNVX)
VK_DEVICE_EXT_PFN(vkDestroyCuFunctionNVX)
VK_DEVICE_EXT_PFN(vkCmdCuLaunchKernelNVX)
/* VK_NVX_image_view_handle */
VK_DEVICE_EXT_PFN(vkGetImageViewHandleNVX)
VK_DEVICE_EXT_PFN(vkGetImageViewAddressNVX)
/* VK_VALVE_descriptor_set_host_mapping */
VK_DEVICE_EXT_PFN(vkGetDescriptorSetLayoutHostMappingInfoVALVE)
VK_DEVICE_EXT_PFN(vkGetDescriptorSetHostMappingVALVE)
/* VK_NV_device_generated_commands */
VK_DEVICE_EXT_PFN(vkCreateIndirectCommandsLayoutNV)
VK_DEVICE_EXT_PFN(vkDestroyIndirectCommandsLayoutNV)
VK_DEVICE_EXT_PFN(vkGetGeneratedCommandsMemoryRequirementsNV)
VK_DEVICE_EXT_PFN(vkCmdExecuteGeneratedCommandsNV)
#undef VK_INSTANCE_PFN
#undef VK_INSTANCE_EXT_PFN
#undef VK_DEVICE_PFN
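
The list above, with its `VK_*_PFN(...)` entries and trailing `#undef` lines, has the shape of an X-macro: one list of names that is expanded several times under different macro definitions. The sketch below shows the technique in a single file. It is an illustration only: the `example_` names and the `EXAMPLE_PROCS` list are hypothetical, and the real header is presumably included by other files that define the `VK_*_PFN` macros before the include, which is not shown in this diff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry list. In the file above this role is played by the
 * VK_*_PFN(...) lines. */
#define EXAMPLE_PROCS(X) \
    X(create_thing) \
    X(destroy_thing) \
    X(query_thing)

typedef void (*example_proc)(void);

/* Expansion 1: one function-pointer member per entry. */
struct example_procs
{
#define DECLARE_MEMBER(name) example_proc name;
    EXAMPLE_PROCS(DECLARE_MEMBER)
#undef DECLARE_MEMBER
};

/* Expansion 2: a table of entry names in the same order, for example to
 * look each entry point up by string when loading it. */
static const char *const example_proc_names[] =
{
#define DECLARE_NAME(name) #name,
    EXAMPLE_PROCS(DECLARE_NAME)
#undef DECLARE_NAME
};

#define EXAMPLE_PROC_COUNT (sizeof(example_proc_names) / sizeof(example_proc_names[0]))
```

Adding or removing an entry point then means touching one line of the list, and the struct and the name table stay in sync automatically. That is why a diff against such a file consists almost entirely of single-line additions and removals.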

@@ -1,31 +1,42 @@
project('vkd3d-proton', ['c'], version : '2.0', meson_version : '>= 0.49', default_options : [
project('vkd3d-proton', ['c'], version : '2.6', meson_version : '>= 0.49', default_options : [
'warning_level=2',
])
cpu_family = target_machine.cpu_family()
vkd3d_compiler = meson.get_compiler('c')
vkd3d_msvc = vkd3d_compiler.get_id() == 'msvc'
vkd3d_c_std = 'c99'
vkd3d_is_msvc = vkd3d_compiler.get_id() == 'msvc' or vkd3d_compiler.get_id() == 'clang-cl'
vkd3d_is_clang = vkd3d_compiler.get_id() == 'clang'
vkd3d_c_std = 'c11'
vkd3d_platform = target_machine.system()
vkd3d_buildtype = get_option('buildtype')
vkd3d_debug = vkd3d_buildtype == 'debug' or vkd3d_buildtype == 'debugoptimized'
enable_tests = get_option('enable_tests')
enable_extras = get_option('enable_extras')
enable_d3d12 = get_option('enable_d3d12')
enable_profiling = get_option('enable_profiling')
enable_renderdoc = get_option('enable_renderdoc')
enable_descriptor_qa = get_option('enable_descriptor_qa')
enable_trace = get_option('enable_trace')
if enable_d3d12 == 'auto'
enable_d3d12 = (vkd3d_platform == 'windows')
enable_d3d12 = vkd3d_platform == 'windows'
else
enable_d3d12 = (enable_d3d12 == 'true')
enable_d3d12 = enable_d3d12 == 'true'
endif
if enable_trace == 'auto'
enable_trace = vkd3d_debug
else
enable_trace = enable_trace == 'true'
endif
if vkd3d_platform != 'windows' and enable_d3d12
error('Standalone D3D12 is only supported on Windows.')
endif
add_project_arguments('-DHAVE_DXIL_SPV', language : 'c')
add_project_arguments('-D_GNU_SOURCE', language : 'c')
add_project_arguments('-DPACKAGE_VERSION="' + meson.project_version() + '"', language : 'c')
@@ -45,6 +56,19 @@ if enable_renderdoc
add_project_arguments('-DVKD3D_ENABLE_RENDERDOC', language : 'c')
endif
if enable_descriptor_qa
add_project_arguments('-DVKD3D_ENABLE_DESCRIPTOR_QA', language : 'c')
endif
if not enable_trace
add_project_arguments('-DVKD3D_NO_TRACE_MESSAGES', language : 'c')
endif
enable_breadcrumbs = enable_trace
if enable_breadcrumbs
add_project_arguments('-DVKD3D_ENABLE_BREADCRUMBS', language : 'c')
endif
vkd3d_external_includes = [ './subprojects/Vulkan-Headers/include', './subprojects/SPIRV-Headers/include' ]
vkd3d_public_includes = [ './include' ] + vkd3d_external_includes
vkd3d_private_includes = [ './include/private' ] + vkd3d_public_includes
@@ -59,9 +83,13 @@ idl_generator = generator(idl_compiler,
arguments : [ '-h', '-o', '@OUTPUT@', '@INPUT@' ])
glsl_compiler = find_program('glslangValidator')
glsl_args = [ '-V', '--target-env', 'vulkan1.1', '--vn', '@BASENAME@', '@INPUT@', '-o', '@OUTPUT@' ]
if run_command(glsl_compiler, [ '--quiet', '--version' ], check : false).returncode() == 0
glsl_args += [ '--quiet' ]
endif
glsl_generator = generator(glsl_compiler,
output : [ '@BASENAME@.h' ],
arguments : [ '-V', '--vn', '@BASENAME@', '@INPUT@', '-o', '@OUTPUT@' ])
arguments : glsl_args)
threads_dep = dependency('threads')
lib_d3d12 = vkd3d_compiler.find_library('d3d12', required : false)
@@ -78,6 +106,9 @@ endif
add_project_arguments(vkd3d_compiler.get_supported_arguments([
'-fvisibility=hidden',
# -Wvla is not implied by -Wall, -Wextra or -Wpedantic.
# Enable it explicitly so VLAs are not used by accident, e.g. through const-sized arrays.
'-Wvla',
'-Wno-format',
'-Wno-missing-field-initializers',
'-Wno-unused-parameter',
@@ -97,6 +128,21 @@ if cpu_family == 'x86'
'-Wl,--add-stdcall-alias',
'-Wl,--enable-stdcall-fixup']),
language : [ 'c', 'cpp' ])
# Need to link against libatomic for 64-bit atomic emulation on x86
# when using clang.
if vkd3d_is_clang
lib_atomic = vkd3d_compiler.find_library('atomic')
vkd3d_extra_libs += lib_atomic
endif
endif
if not vkd3d_is_msvc
# We need to set the section alignment for debug symbols to
# work properly as well as avoiding a memcpy from the Wine loader.
if vkd3d_compiler.has_link_argument('-Wl,--file-alignment=4096')
add_global_link_arguments('-Wl,--file-alignment=4096', language : [ 'c', 'cpp' ])
endif
endif
vkd3d_build = vcs_tag(

@@ -3,3 +3,5 @@ option('enable_extras', type : 'boolean', value : false)
option('enable_d3d12', type : 'combo', value : 'auto', choices : ['false', 'true', 'auto'])
option('enable_profiling', type : 'boolean', value : false)
option('enable_renderdoc', type : 'boolean', value : false)
option('enable_descriptor_qa', type : 'boolean', value : false)
option('enable_trace', type : 'combo', value : 'auto', choices : ['false', 'true', 'auto'])

Some files were not shown because too many files have changed in this diff