4033 Commits

Author SHA1 Message Date
Hans-Kristian Arntzen fc69f469d5 vkd3d: Prototype implementation of shader module identifier.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:37:38 +02:00
Hans-Kristian Arntzen 4d708bd7fe vkd3d: Enable prototype extension VK_EXT_shader_module_identifier.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:36:58 +02:00
Hans-Kristian Arntzen d9dc4b862a vkd3d: Add helper for late compilation of DXBC -> SPIR-V.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen b42caa0bff vkd3d: Use rwlock instead of spinlock in PSO fallback cache.
If we defer SPIR-V compilation we risk holding the lock for quite a long
time.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen b0a706cb4e cache: Explicitly do not serialize SPIR-V code for cached PSOs.
With upcoming refactor, we might have to compile code on the fly.
To avoid any race conditions on fallback compile storing code[i] <-> StorePipeline reading code[i],
explicitly mark that code[] should be ignored.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen f510e92f6e vkd3d: Separate compilation to SPIR-V and creation of VkShaderModule.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen 0123e5fe5c vkd3d: Stub out DXBC code duplication for later.
When we have the ability to load PSO from identifiers only, we need to
retain DXBC blobs for later.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen 131ff90ca3 vkd3d: Separate out the different stages of graphics PSO creation.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen 7f758e5904 vkd3d: Refactor stages of obtaining SPIR-V modules.
- Try to load SPIR-V from cache
- Fallback compile to SPIR-V if necessary
- Parse PSO metadata obtained from either compilation or cache lookup

Also moves SPIR-V compilation to end of PSO init.
Prepares for refactor where we completely decouple PSO creation info
setup and SPIR-V compilation.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
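The three stages listed in 7f758e5904 (cache lookup, fallback compile, metadata parse) can be sketched roughly as below. This is only an illustration: every type and function name is hypothetical, and the "cache" and "compiler" are stand-ins, not the vkd3d-proton implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types; the real structures are not shown in this log. */
struct spirv_blob { const uint32_t *code; size_t word_count; bool from_cache; };
struct pso_meta { bool parsed; bool from_cache; };

/* Placeholder module containing only the SPIR-V magic number. */
static const uint32_t fake_spirv[] = { 0x07230203u };

/* Stage 1: try the cache. For illustration, only even hashes are "cached". */
static bool cache_lookup(uint64_t hash, struct spirv_blob *out)
{
    if (hash & 1)
        return false;
    out->code = fake_spirv;
    out->word_count = 1;
    out->from_cache = true;
    return true;
}

/* Stage 2: fallback compile (DXBC -> SPIR-V in the real code). */
static bool compile_fallback(uint64_t hash, struct spirv_blob *out)
{
    (void)hash;
    out->code = fake_spirv;
    out->word_count = 1;
    out->from_cache = false;
    return true;
}

/* Stage 3: parse metadata from whichever source produced the blob. */
static bool parse_metadata(const struct spirv_blob *blob, struct pso_meta *meta)
{
    if (!blob->code || blob->word_count == 0)
        return false;
    meta->parsed = true;
    meta->from_cache = blob->from_cache;
    return true;
}

static bool obtain_spirv(uint64_t hash, struct spirv_blob *blob, struct pso_meta *meta)
{
    assert(blob && meta);
    if (!cache_lookup(hash, blob) && !compile_fallback(hash, blob))
        return false;
    return parse_metadata(blob, meta);
}
```

Keeping the metadata parse as a common final step means the caller does not need to care whether the blob came from the cache or from a fresh compile.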
Hans-Kristian Arntzen 4384b708d7 vkd3d: Prepare for system where we can retain DXBC blobs in pipeline.
Simplifies the code somewhat. Only iterate over the shader_stages LUT
once.

Adds concept of duped DXBC blobs as well.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen f16875d195 vkd3d: Add FIXME for dubious use of dsv_plane_optimal_mask.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen ef7924ce86 vkd3d: Hoist out pipeline cache creation.
Not super useful to create a local pipeline cache if we're not going to
compile early, but it's super rare, and cleans up the code either way.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen a098cce48a vkd3d: Streamline vkd3d_create_compute_pipeline.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen 73fa8b9588 vkd3d: Sink shader interface struct build to where we need it.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen dc45142b93 vkd3d: Refactor out how XFB info is stored.
For deferred compilation, we need to dupe the structs.
XFB is kinda rare, so it's okay to eat allocations here.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen f816eeb60e vkd3d: Ensure shader interface is set up per vkd3d_create_shader_stage.
Prepares for a situation where we can move this code into
vkd3d_create_shader_stage itself.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen b387def67c vkd3d: Refactor how we set compiler options.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen 1495ead2c4 vkd3d: Refactor out shader interface struct plumbing.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen e438c42da0 vkd3d: Unify how we hold on to root signatures in PSO state.
Make use of private references to hold on to the root signature object.
This is important in situations where we end up compiling pipelines
late.

With private references like this, there is no longer a need to
distinguish a "private_root_signature", so just rename.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-07-01 12:33:40 +02:00
Hans-Kristian Arntzen 684e41fabe vkd3d: Do not perform initial layout transition for placed RTV / DSV.
Docs explicitly specify that placed RTV / DSV resources must be properly
initialized before use, either on first use or after aliasing barriers,
so there should be no need to perform initial layout transition.

Fixes spurious GPU hangs in Hitman III where application aliases
an indirect buffer and a DSV. The DSV is cleared after the indirect
buffer is consumed, but the initial_layout_transition is triggered and
HTILE init clobbered the buffer.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-30 15:06:59 +02:00
Philip Rebohle 1d869e3e21 vkd3d: Do not execute indirect commands if count buffer is unsupported.
Also be a bit more uniform with using break/return on fail conditions.

Otherwise, the indirect command will read data from the count buffer
instead, which may lead to bugs or GPU hangs.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-06-28 14:57:11 +02:00
Tatsuyuki Ishi 02c7ec404c vkd3d: Fix transfer batch clobbering state in begin_render_pass.
Transfer batch can clobber graphics pipeline for e.g. depth->color copies.
Hence, flushing the batches after applying the graphics pipeline set by the
app can cause correctness issues.

To prevent that, do the transfer batch flush first before we apply any
render-related states.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-06-28 13:53:03 +02:00
Hans-Kristian Arntzen 9b5f3bfc26 vkd3d-shader: Fix GRAD sample on cubes.
offset_component_count was set to 0 for cubes, but GRAD path also
uses the variable to check how many components to use for GRAD.
OFFSET is not supported for cubes, so that's likely why it was bugged.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-28 12:14:41 +02:00
Hans-Kristian Arntzen b4ab6c3f08 cache: Unmap files before attempting to delete.
Native Win32 does not like it.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-28 12:13:03 +02:00
Hans-Kristian Arntzen 707af8152e vkd3d: Add workaround for forced clearing of certain buffers.
If game uses NOT_ZEROED, it might still rely on buffers being properly
cleared to 0.
Enable this and FORCE_RAW_VA_CBV for Halo Infinite.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-24 15:11:19 +02:00
Hans-Kristian Arntzen bc759be2af vkd3d: Optimize ExecuteIndirect() if no INDIRECT transitions happened.
The D3D12 docs outline this as an implementation detail explicitly, so
we should do the same thing.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-24 14:55:39 +02:00
Hans-Kristian Arntzen 18f1d1c72e vkd3d: Implement ExecuteIndirect with state update.
Implements the most basic iteration where we don't try to take advantage
of index LUT, hoisting CS patching or attempting to reuse application
indirect buffer directly.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-24 14:55:39 +02:00
Hans-Kristian Arntzen 1b704287e5 vkd3d: Enable NV_device_generated_commands extension.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-24 14:55:39 +02:00
Hans-Kristian Arntzen f975f09bb1 meta: Add ExecuteIndirect patch meta shader.
Currently we are translating the index type. This will be changed in a
follow up commit where we move over to index LUT.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 14:39:22 +02:00
Hans-Kristian Arntzen 619a54810d vkd3d: Pass down required memory types to scratch allocators.
Separate scratch pools by their intended usage.
Allows e.g. preprocess buffers to be
allocated differently from normal buffers, which is necessary on
implementations that use special memory types to implement preprocess
buffers.

Potentially can also allow for separate pools for
host visible scratch memory down the line.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 14:39:22 +02:00
Hans-Kristian Arntzen cecb8d6ebc vkd3d: Don't suballocate scratch buffers.
Scratch buffers are 1 MiB blocks which will end
up being suballocated. This was not intended and is fallout from the
earlier change where VA_SIZE was bumped to 2 MiB for Elden Ring.

Introduce a memory allocation flag INTERNAL_SCRATCH which disables
suballocation and VA map insert.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 14:39:22 +02:00
Hans-Kristian Arntzen 8ae391e675 vkd3d: Add more stringent validation for CreateCommandSignature.
The runtime is specified to validate certain things.
Also, be more robust against unsupported command signatures, since we
might need to draw/dispatch at an offset. Avoids hard GPU crashes.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00
Hans-Kristian Arntzen a30205589f common: Assert that alignment is > 0 and POT.
Found bug when allocating device generated commands.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00
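The check added in a30205589f (alignment must be > 0 and a power of two) can be illustrated as follows. The helper names are hypothetical and this is not the project's actual code; it only shows why the assertion matters for a mask-based align-up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: a power of two has exactly one bit set; zero has none. */
static inline int is_power_of_two(size_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Rounds value up to the next multiple of alignment.
 * The mask trick is only valid for power-of-two alignments,
 * so an alignment of 0 or e.g. 3 would silently give wrong results. */
static inline size_t align_up(size_t value, size_t alignment)
{
    assert(is_power_of_two(alignment));
    return (value + alignment - 1) & ~(alignment - 1);
}
```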
Hans-Kristian Arntzen abdef77695 vkd3d: Add helper to invalidate all state.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00
Hans-Kristian Arntzen c132073df8 vkd3d: Refactor index buffer state to be flushed late.
With ExecuteIndirect state we'll need to modify or refresh index buffer
state.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00
Hans-Kristian Arntzen 128852200a vkd3d: Store the raw VA index in root signature for root descriptors.
Needed when building device generated commands later.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00
Hans-Kristian Arntzen 717026f903 vkd3d: Add VKD3D_CONFIG option to force raw VA CBV descriptors.
For certain ExecuteIndirect() uses, we're forced to use this path
since we have no way to update push descriptors indirectly yet.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-23 12:52:29 +02:00
Hans-Kristian Arntzen b849bd4256 vkd3d: Enable F1 2020 quirks on 2019 as well.
Same game bug.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-20 14:53:16 +02:00
Georg Lehmann d8905afd5d demos: Don't pretend to handle allocation failure.
This function doesn't indicate failure and the possibility of a return
causes -Wmaybe-uninitialized warnings.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2022-06-20 11:36:17 +02:00
Hans-Kristian Arntzen de5b751468 vkd3d: Enable VK_KHR_depth_stencil_resolve.
Required by KHR_dynamic_rendering. Caught by updated validation layers.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-17 11:54:31 +02:00
Hans-Kristian Arntzen 219d9698b3 tests: Fix compiler warnings in various tests.
Mostly related to casting vec4 struct to float where array[4] is expected.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-17 11:54:07 +02:00
Hans-Kristian Arntzen acef5429c5 vkd3d-shader: Workaround trivial compiler warning.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-17 11:54:07 +02:00
Hans-Kristian Arntzen 135aff4685 vkd3d: Remove the global VkPipelineCache.
Just use VK_NULL_HANDLE. We rely on the disk cache to exist anyways
here. We never serialize the global pipeline cache, so it might just
confuse drivers into disabling the disk cache, if anything.

Also reduce memory bloat.

Also gets rid of very old NV driver workaround where we forced global
pipeline cache.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-17 11:53:46 +02:00
Hans-Kristian Arntzen 2f6a9e0d55 vkd3d: Do not attempt to clear dedicated memory allocations.
We rely on zerovram behavior in drivers. Opt-in to this path where we
know implementation does what we want (backed up by testing).

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-17 11:53:28 +02:00
Hans-Kristian Arntzen 3a19dea7c7 tests: Ensure we try to allocate some larger buffers as well.
The suballocation test should also try to allocate >= 2 MiB buffers so
we can verify VRAM clear behavior for dedicated allocations as well.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-17 11:53:28 +02:00
Tatsuyuki Ishi 39d07dea2c vkd3d: Check for alias and batch barriers in CopyTextureRegion batches.
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-06-16 11:54:26 +02:00
Tatsuyuki Ishi 3577ca3144 vkd3d: Introduce transfer batches.
Transfer batches buffer CopyTextureRegion calls for batching.

The flushes need to happen in a few places:
1. ResourceBarrier: This is where the transition from COPY_DEST to another state
   might happen, at which point the writes must be visible. This might
   also transition away from COPY_SRC which invalidates the
   precondition.
2. Copy operations. Copies to the same resource are implicitly ordered.
3. Draws and dispatches. These are not strictly necessary, but we don't
   want too much command reordering so flushing here seems good.
4. Close. So that we don't throw commands into the void.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-06-16 11:54:26 +02:00
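The flush points listed in 3577ca3144 can be sketched as a small batch structure. All names, the fixed capacity and the integer resource IDs are hypothetical, and the overlap rule used for copies is a conservative assumption rather than the rule the actual implementation applies.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BATCHED_COPIES 16 /* arbitrary capacity for the sketch */

struct copy_op { int dst_resource; int src_resource; };

struct transfer_batch
{
    struct copy_op ops[MAX_BATCHED_COPIES];
    size_t count;    /* copies currently buffered */
    size_t flushes;  /* number of non-empty flushes, for illustration */
    size_t executed; /* total copies "executed" so far */
};

/* Executes everything buffered so far. A real implementation would emit
 * the pre-barriers, the copies and the post-barriers here. */
static void batch_flush(struct transfer_batch *b)
{
    if (!b->count)
        return;
    b->executed += b->count;
    b->count = 0;
    b->flushes++;
}

/* Point 2: copies touching a resource the batch already uses must stay ordered. */
static void batch_add_copy(struct transfer_batch *b, int dst, int src)
{
    size_t i;

    for (i = 0; i < b->count; i++)
    {
        if (b->ops[i].dst_resource == dst || b->ops[i].dst_resource == src ||
                b->ops[i].src_resource == dst)
        {
            batch_flush(b);
            break;
        }
    }
    if (b->count == MAX_BATCHED_COPIES)
        batch_flush(b);

    assert(b->count < MAX_BATCHED_COPIES);
    b->ops[b->count].dst_resource = dst;
    b->ops[b->count].src_resource = src;
    b->count++;
}

/* Points 1, 3 and 4: barriers, draws/dispatches and Close all flush. */
static void on_resource_barrier(struct transfer_batch *b) { batch_flush(b); }
static void on_draw_or_dispatch(struct transfer_batch *b) { batch_flush(b); }
static void on_close(struct transfer_batch *b) { batch_flush(b); }
```

Flushing on Close guarantees that no buffered copy is dropped when the command list ends.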
Tatsuyuki Ishi 829ac72e3d vkd3d: Break up CopyTextureRegion into three stages.
A parameter preparation stage, a pre-execution barrier stage, then finally
the execution and post-execution barrier stage.

Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
2022-06-13 14:40:23 +02:00
Hans-Kristian Arntzen c64916686d vkd3d: Clear SUSPENDED flag properly.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-13 13:46:49 +02:00
Hans-Kristian Arntzen c4b00bbe1e tests: Avoid tripping out of spec UAV casts.
Section 5.3.9.5 of the D3D11 spec explicitly outlines when we can
cast to R32{U,I,F}. The D3D12 validation layers
seem to have missed this.

Fixes assertions in RADV when running test under debug.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-08 17:09:40 +02:00