Commit Graph

4098 Commits

Author SHA1 Message Date
Hans-Kristian Arntzen fe707989fe vkd3d: Clamp command count in execute indirect path.
Shouldn't be required, but take no chances.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:59:00 +02:00
Hans-Kristian Arntzen 6d3c5d53b0 vkd3d: Add debug ring path for execute indirect template patches.
Somehow inspect draw parameters this way.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:59:00 +02:00
Hans-Kristian Arntzen f93a581dae vkd3d: Trace breadcrumbs for execute indirect templates.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:59:00 +02:00
Hans-Kristian Arntzen b7bbdcabd4 tests: Test that we can deal with local samplers in COLLECTIONS.
We cannot handle all scenarios if COLLECTIONS are incompatible,
but test the easier cases.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:58:19 +02:00
Hans-Kristian Arntzen a28e4b6e11 tests: Add test for querying identifiers from COLLECTION objects.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:58:19 +02:00
Hans-Kristian Arntzen eda0b2fab2 vkd3d: Do a best effort in handling COLLECTION local static samplers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:58:19 +02:00
Hans-Kristian Arntzen 7f5dbcfc40 vkd3d: Add workaround to allow identifiers to be queried from library.
CP77 relies on this to work somehow ...
The DXR spec seems to suggest this is allowed, but there is no direct
concept for this in Vulkan.

This seems to work on NVIDIA at least, but we're on very shaky ground
here ...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:34:34 +02:00
Hans-Kristian Arntzen d333159c86 vkd3d: Disallow querying identifiers from COLLECTION objects.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:34:34 +02:00
Hans-Kristian Arntzen 74eb676cfb vkd3d-shader: Normalize root signature compatibility hashing.
The hash should only depend on the raw byte stream, not the entire DXBC
blob. Useful now since we can declare root signatures either through
DXBC blob or as RDAT object (which is raw).

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:34:34 +02:00
Hans-Kristian Arntzen 5033904e10 debug: Add GLSLC_FLAGS to debug shader build.
When building ray query shaders, need --target-env=spv1.4.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:23:38 +02:00
Hans-Kristian Arntzen b34931eb17 vkd3d: Log how shader identifiers are queried.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:23:38 +02:00
Hans-Kristian Arntzen 7410f53912 vkd3d: Add debug ring support to raytracing shaders.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:23:38 +02:00
Hans-Kristian Arntzen 089d2c6cb7 debug: Add shader override build for ray tracing as well.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:23:38 +02:00
Hans-Kristian Arntzen 03fdbac59e vkd3d: Dump TraceRays parameters to breadcrumbs.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:04:38 +02:00
Hans-Kristian Arntzen 7832eeb60d vkd3d: Add detailed tracing for RTPSO creation.
So much state floating around ...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:04:38 +02:00
Hans-Kristian Arntzen 8a94c3ce0e vkd3d: Add more detailed breadcrumb logging for TraceRays.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:04:38 +02:00
Hans-Kristian Arntzen ddb425c5cb vkd3d: Add support for tag logging in breadcrumbs.
To keep things simple, outer code is responsible for keeping string
alive. Intended to be used for RTPSO entry point name debugging.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:04:38 +02:00
Hans-Kristian Arntzen ad7459551d vkd3d: Trivially ensure tighter packing of entry point struct.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 14:04:38 +02:00
Hans-Kristian Arntzen e3c36a47dd tests: Add test for default association tiebreak rules.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 13:41:06 +02:00
Hans-Kristian Arntzen ee8b8374b4 tests: Add test for how we handle DXIL embedded subobjects.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 13:41:06 +02:00
Hans-Kristian Arntzen ce00c9322d tests: Add some basic RTPSO validation rules tests.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 13:41:06 +02:00
Hans-Kristian Arntzen b88b04e4f1 vkd3d: Rewrite how submodules are associated with exports.
Handle embedded DXIL subobjects and fix various issues exposed by the
upcoming new tests.

Associating with global root signatures, shader config and pipeline
config needs to be rewritten so that we validate uniqueness late.

The strategy here is to look at all exports we care about and find an
association.

There are many priority levels which are implied by how I understand the
DXR docs. State objects in the API win over embedded DXIL state objects.
Any DXIL state object wins over a collection.

Hit group associations can trump an entry point. It's not entirely clear
how this works, but we let it win if it has higher priority, i.e.
an explicit association directed at the hit group.

There's also cases where explicit assignment trumps explicit default
assignment, which then trumps just declaring a state object.

Collection state is inherited in some cases like AddToStateObject() even
if this seems to be undocumented behavior.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 13:41:06 +02:00
Hans-Kristian Arntzen 4a121b9aaa vkd3d-shader: Forward RDAT subobjects.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:37:34 +02:00
Hans-Kristian Arntzen 0ef6a8b798 vkd3d: Expose utility for creating root signature from raw blob.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:37:34 +02:00
Hans-Kristian Arntzen 49b6e67e7d vkd3d-shader: Expose entry point for raw root signature parsing.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:37:34 +02:00
Hans-Kristian Arntzen 2ef3fd469c vkd3d-common: Add strequal_mixed between WCHAR and ASCII.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:37:34 +02:00
Hans-Kristian Arntzen 22778b99be vkd3d: Handle default global root signature in RTPSO.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:37:34 +02:00
Hans-Kristian Arntzen b8b2a93aa6 tests: Add test coverage for two stages of AddToStateObject().
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:11:27 +02:00
Hans-Kristian Arntzen 14470d5456 tests: Add test for AddToStateObject.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:11:27 +02:00
Hans-Kristian Arntzen 3aad4edf6e tests: Add default NODE_MASK state object to RTPSO tests.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:11:27 +02:00
Hans-Kristian Arntzen 3c92b3a1bc vkd3d: Implement AddToStateObject().
This is barely implementable, and relies on implementations to do kinda
what we want.

To make this work in practice, we need to allow two pipelines per state
object. One that is created with LIBRARY and one that can be bound. When
incrementing the PSO, we use the LIBRARY one.

It seems to be allowed to create a new library from an old library.
It is more convenient for us if we're allowed to do this, so do this
until we're forced to do otherwise.

DXR 1.1 requires that shader identifiers remain invariant for child
pipelines if the parent pipeline also have them.
Vulkan has no such guarantee, but we can speculate that it works and
validate that identifiers remain invariant. This seems to work fine on
NVIDIA at least ... It probably makes sense that it works for
implementations where pipeline libraries are compiled at that time.

The basic implementation of AddToStateObject() is to consider
the parent pipeline as a COLLECTION pipeline. This composes well and
avoids a lot of extra implementation cruft.

Also adds validation to ensure that COLLECTION global state matches with
other COLLECTION objects and the parent. We will also inherit global
state like root signatures, pipeline config, shader configs etc when
using AddToStateObject().

The tests pass on NVIDIA at least.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 12:11:27 +02:00
Hans-Kristian Arntzen 8473355a98 vkd3d: Hold private ownership over global root signature.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 11:49:44 +02:00
Hans-Kristian Arntzen 1438ff5637 vkd3d: Allow different but compatible global root signature objects.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 11:49:44 +02:00
Hans-Kristian Arntzen c3ee963d2f vkd3d: Ignore NODE_MASK subobjects.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-11 11:49:44 +02:00
Hans-Kristian Arntzen 684e41fabe vkd3d: Do not perform initial layout transition for placed RTV / DSV.
Docs explicitly specify that placed RTV / DSV resource must be properly
initialized before use, either on first use or after aliasing barriers,
so there should be no need to perform initial layout transition.

Fixes spurious GPU hangs in Hitman III where application aliases
an indirect buffer and a DSV. The DSV is cleared after the indirect
buffer is consumed, but the initial_layout_transition is triggered and
HTILE init clobbered the buffer.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-30 15:06:59 +02:00
Philip Rebohle 1d869e3e21 vkd3d: Do not execute indirect commands if count buffer is unsupported.
Also be a bit more uniform with using break/return on fail conditions.

Otherwise, the indirect command will read data from the count buffer
instead, which may lead to bugs or GPU hangs.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-06-28 14:57:11 +02:00
Tatsuyuki Ishi 02c7ec404c vkd3d: Fix transfer batch clobbering state in begin_render_pass.
Transfer batch can clobber graphics pipeline for e.g. depth->color copies.
Hence, flushing the batches after applying the graphics pipeline set by the
app can cause correctness issues.

To prevent that, do the transfer batch flush first before we apply any
render-related states.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-06-28 13:53:03 +02:00
Hans-Kristian Arntzen 9b5f3bfc26 vkd3d-shader: Fix GRAD sample on cubes.
offset_component_count was set to 0 for cubes, but GRAD path also
uses the variable to check how many components to use for GRAD.
OFFSET is not supported for cubes, so that's likely why it was bugged.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-28 12:14:41 +02:00
Hans-Kristian Arntzen b4ab6c3f08 cache: Unmap files before attempting to delete.
Native Win32 does not like it.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-28 12:13:03 +02:00
Hans-Kristian Arntzen 707af8152e vkd3d: Add workaround for forced clearing of certain buffers.
If game uses NOT_ZEROED, it might still rely on buffers being properly
cleared to 0.
Enable this and FORCE_RAW_VA_CBV for Halo Infinite.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-24 15:11:19 +02:00
Hans-Kristian Arntzen bc759be2af vkd3d: Optimize ExecuteIndirect() if no INDIRECT transitions happened.
The D3D12 docs outline this as an implementation detail explicitly, so
we should do the same thing.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-24 14:55:39 +02:00
Hans-Kristian Arntzen 18f1d1c72e vkd3d: Implement ExecuteIndirect with state update.
Implements the most basic iteration where we don't try to take advantage
of index LUT, hoisting CS patching or attempting to reuse application
indirect buffer directly.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-24 14:55:39 +02:00
Hans-Kristian Arntzen 1b704287e5 vkd3d: Enable NV_device_generated_commands extension.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-24 14:55:39 +02:00
Hans-Kristian Arntzen f975f09bb1 meta: Add ExecuteIndirect patch meta shader.
Currently we are translating the index type. This will be changed in a
follow up commit where we move over to index LUT.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 14:39:22 +02:00
Hans-Kristian Arntzen 619a54810d vkd3d: Pass down required memory types to scratch allocators.
Separate scratch pools by their intended usage.
Allows e.g. preprocess buffers to be
allocated differently from normal buffers, which is necessary on
implementations that use special memory types to implement preprocess
buffers.

Potentially can also allow for separate pools for
host visible scratch memory down the line.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 14:39:22 +02:00
Hans-Kristian Arntzen cecb8d6ebc vkd3d: Don't suballocate scratch buffers.
Scratch buffers are 1 MiB blocks which will end
up being suballocated. This was not intended and a fallout from the
earlier change where VA_SIZE was bumped to 2 MiB for Elden Ring.

Introduce a memory allocation flag INTERNAL_SCRATCH which disables
suballocation and VA map insert.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 14:39:22 +02:00
Hans-Kristian Arntzen 8ae391e675 vkd3d: Add more stringent validation for CreateCommandSignature.
The runtime is specified to validate certain things.
Also, be more robust against unsupported command signatures, since we
might need to draw/dispatch at an offset. Avoids hard GPU crashes.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00
Hans-Kristian Arntzen a30205589f common: Assert that alignment is > 0 and POT.
Found bug when allocating device generated commands.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00
Hans-Kristian Arntzen abdef77695 vkd3d: Add helper to invalidate all state.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00
Hans-Kristian Arntzen c132073df8 vkd3d: Refactor index buffer state to be flushed late.
With ExecuteIndirect state we'll need to modify or refresh index buffer
state.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00