569 Commits

Author SHA1 Message Date
Dean Beeler 063ce7e4bd Use Windows specific environment calls for better Windows compatibility.
Signed-off-by: David McCloskey <davmcclo@gmail.com>
2022-04-22 17:40:21 +02:00
Philip Rebohle beb58f8472 vkd3d: Enable and require VK_KHR_maintenance4.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-04-22 11:36:02 +02:00
Hans-Kristian Arntzen 358f95aff2 vkd3d: Ignore cached SPIR-V if we're dumping SPIR-V.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-22 11:29:27 +02:00
Hans-Kristian Arntzen 6c8542f7d6 vkd3d: Make use of internal pipeline library if we're asked to.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen 2dcb1e2efc cache: Implement an on-disk pipeline library.
With VKD3D_SHADER_CACHE_PATH, we can add automatic serialization of pipeline
blobs to disk, even for games which do not make any use of GetCachedBlob
or ID3D12PipelineLibrary interfaces. Most applications expect drivers to
have some kind of internal caching.

This is implemented as a system where a disk
thread will manage a private ID3D12PipelineLibrary, and new PSOs are
automatically committed to this library. PSO creation will also consult
this internal pipeline library if applications do not provide their own
blob.

The strategy for updating the cache is based on a read-only cache which
is mmaped from disk, with an exclusive write-only portion for new blobs,
which ensures some degree of safety if there are multiple
concurrent processes using the same cache.

The memory layout of the disk cache is optimized to be very efficient
for appending new blobs, just simple fwrites + fflush.
The format is also robust against sliced files, which solves the problem
where applications tear down without destroying the D3D12 device
properly.

This structure is very similar to Fossilize, and in fact the idea is to
move towards actually using the Fossilize format directly later.
This implementation prepares us for this scenario where e.g. Steam could
potentially manage the vkd3d-proton cache.

The main complication in this implementation is that we have to merge
the read-only and write caches.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-06 16:36:26 +02:00
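The append-only layout described in commit 2dcb1e2efc can be illustrated with a minimal sketch. This is not the vkd3d-proton on-disk format: `record_header`, `cache_append` and `cache_count_valid` are hypothetical names, and the record layout is an assumption made for illustration. It only shows why "fwrite + fflush" appends combined with a length-prefixed record let a reader ignore a sliced tail.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record header; NOT the real vkd3d-proton layout. */
struct record_header
{
    uint64_t hash; /* key of the blob */
    uint32_t size; /* payload size in bytes */
};

/* Append one blob: simple fwrite calls followed by fflush. */
static int cache_append(FILE *f, uint64_t hash, const void *data, uint32_t size)
{
    struct record_header h;

    assert(f != NULL);
    memset(&h, 0, sizeof(h)); /* keep struct padding deterministic */
    h.hash = hash;
    h.size = size;

    if (fwrite(&h, sizeof(h), 1, f) != 1)
        return -1;
    if (size && fwrite(data, size, 1, f) != 1)
        return -1;
    return fflush(f) == 0 ? 0 : -1;
}

/* Count complete records, stopping at the first truncated (sliced) one. */
static unsigned int cache_count_valid(FILE *f)
{
    struct record_header h;
    unsigned int count = 0;
    long end, pos;

    fseek(f, 0, SEEK_END);
    end = ftell(f);
    fseek(f, 0, SEEK_SET);

    while (fread(&h, sizeof(h), 1, f) == 1)
    {
        pos = ftell(f);
        if ((unsigned long)(end - pos) < h.size)
            break; /* payload was cut off: ignore the tail */
        fseek(f, (long)h.size, SEEK_CUR);
        count++;
    }
    return count;
}
```

A process that dies mid-write leaves at most one incomplete record at the end of the file, and the reader above simply stops before it.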
Hans-Kristian Arntzen 3095ed84d3 cache: Add concept of internal pipeline libraries.
For internal pipeline libraries, we want a somewhat different strategy.

- PSOs are keyed by hash instead of user key.
- We want the option to conditionally store SPIR-V and PSO blobs.
  For internal caches, there isn't much of a reason to store PSO blobs
  since the disk cache is going to be primed anyways.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-05 14:12:20 +02:00
Denis Barkar 8dda6df729 vkd3d: Force non-invariant position for Serious Sam 4.
Signed-off-by: Denis Barkar <dbarkar@nvidia.com>
2022-04-01 15:34:52 +02:00
Hans-Kristian Arntzen 241078d7e8 vkd3d: Add scalar UBO layout requirement for SM 6.0.
Needed to support SM 6.0 CBufferLoad.
This path is mostly unused since it's opt-in in DXC and horribly broken
...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-30 20:13:32 +02:00
Hans-Kristian Arntzen 6f43f450c8 vkd3d: Disable primitive restart when using non-compatible topologies.
Primitive restart is only used for strip primitive types, and must be
ignored for lists. Use and require extended_dynamic_state2 for this
purpose.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-30 16:12:16 +02:00
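The rule stated in commit 6f43f450c8 (restart applies to strips, never to lists) can be sketched as a small predicate. The `topology` enum and function names below are hypothetical stand-ins, not the D3D12 or Vulkan enums, and the sketch does not show how the result is fed into dynamic state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical topology enum for illustration only. */
enum topology
{
    TOPO_POINT_LIST,
    TOPO_LINE_LIST,
    TOPO_LINE_STRIP,
    TOPO_TRIANGLE_LIST,
    TOPO_TRIANGLE_STRIP
};

static bool topology_is_strip(enum topology t)
{
    return t == TOPO_LINE_STRIP || t == TOPO_TRIANGLE_STRIP;
}

/* Primitive restart is honored only when the pipeline asked for it
 * AND the topology bound at draw time is a strip type. */
static bool primitive_restart_enabled(enum topology t, bool restart_requested)
{
    assert(t >= TOPO_POINT_LIST && t <= TOPO_TRIANGLE_STRIP);
    return restart_requested && topology_is_strip(t);
}
```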
Hans-Kristian Arntzen 63530501a5 vkd3d: Require VK_EXT_extended_dynamic_state.
This is basically required to avoid horrible stutter and poor
performance, and is widely supported.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-16 17:48:21 +01:00
Robin Kertels a6ea442819 vkd3d: Enable VK_NV_device_diagnostic_checkpoints.
Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen 365dd05557 vkd3d: Add breadcrumbs support.
This commit adds the AMD path.
The idea is that we can automatically instrument markers with command
list information that we can make some sense of in vkd3d-proton.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen 5017b3723c vkd3d: Enable VK_AMD_device_coherent_memory.
For breadcrumbs support, along with buffer marker.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen f9da3bf564 vkd3d: Add VK_KHR_driver_properties.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-10 15:14:55 +01:00
Hans-Kristian Arntzen 422f6804fb vkd3d: Enable VK_KHR_create_renderpass2.
Required extension by VK_KHR_fragment_shading_rate and
VK_KHR_separate_depth_stencil_layouts, but we don't care about enabling
any features or use it directly.

Needed to silence validation errors.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-09 16:35:05 +01:00
Georg Lehmann 14a06680d9 vkd3d: Remove unused renderpass remains.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2022-03-08 18:34:18 +01:00
Philip Rebohle 9a408367dc vkd3d: Remove render pass cache.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle 2c92ab7d1e vkd3d: Enable and require VK_KHR_dynamic_rendering.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Hans-Kristian Arntzen ce45297695 vkd3d: Enable debug_utils if vk_debug is enabled.
Allows debug callbacks to go through in Wine.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-08 16:40:51 +01:00
Hans-Kristian Arntzen 9a63df07b8 vkd3d: Add punchthrough path for descriptor copies.
Proves out the viability of this style of implementation. Ideally we'd
have a more officially sanctioned way of doing similar things later :)

Unfortunately, the overhead removal is too great to ignore on target
platform. Makes use of a private (reserved) extension for now ...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-04 13:34:18 +01:00
Hans-Kristian Arntzen dc622fc715 vkd3d: Recycle command pools in Elden Ring.
Very churny.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 18:40:52 +01:00
Hans-Kristian Arntzen 9817c52d24 vkd3d: Add workaround to ignore mismatch driver/device in PSO library.
Elden Ring does not detect the proper error code and create a new
pipeline library. Instead, create a fresh new library, which works
around the issue.

The game has a pattern of LoadPipeline -> if fail -> CreatePSO ->
StorePipeline. Sometimes, in the same process it will LoadLibrary from
its own cache (could explain some stutters),
so it's very useful to have this either way.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 14:50:57 +01:00
Hans-Kristian Arntzen f39ece9a7c vkd3d: Enable performance workarounds for Elden Ring.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 13:59:08 +01:00
Hans-Kristian Arntzen c19eaac376 vkd3d: Add VKD3D_CONFIG option for command pool recycling.
Well-behaved apps should not benefit from any of this.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 13:59:08 +01:00
Hans-Kristian Arntzen 54fbadcc94 vkd3d: Recycle command pools.
Elden Ring in particular repeatedly frees and allocates command pools
despite this being a very bad idea.

Add a simple 8-entry cache which seems to take care of it.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 13:59:08 +01:00
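A fixed 8-entry recycle cache of the kind commit 54fbadcc94 mentions could look roughly like this. It is a sketch under assumptions: `fake_pool` stands in for a real command pool handle, the function names are hypothetical, and the real code would also need to reset pools and handle locking.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_CACHE_SIZE 8

/* Stand-in for a command pool handle; hypothetical. */
typedef struct fake_pool { int id; } fake_pool;

struct pool_cache
{
    fake_pool *entries[POOL_CACHE_SIZE];
    size_t count;
};

/* Returns 1 if the pool was kept for reuse, 0 if the cache is full
 * and the caller has to destroy the pool itself. */
static int pool_cache_release(struct pool_cache *cache, fake_pool *pool)
{
    assert(pool != NULL);
    if (cache->count == POOL_CACHE_SIZE)
        return 0;
    cache->entries[cache->count++] = pool;
    return 1;
}

/* Returns a recycled pool, or NULL if the caller has to create one. */
static fake_pool *pool_cache_acquire(struct pool_cache *cache)
{
    if (cache->count == 0)
        return NULL;
    return cache->entries[--cache->count];
}
```

The cache is deliberately small: it only needs to absorb the free/allocate churn between frames, not hold every pool the application ever created.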
Hans-Kristian Arntzen 84d632f194 vkd3d: Rewrite memory layout for resource descriptors.
Tune memory layout so that we can deduce various information without
making a single pointer dereference:

- d3d12_descriptor_heap*
- heap offset
- Pointer to various side data structures we need to keep around.

Instead of having one big 64 byte data structure with tons of padding,
tune it down to 32 + 8 bytes per descriptor of extra dummy data.

To make all of this work, use a somewhat clever encoding scheme for CPU
VA where lower bits store number of active bits used to encode
descriptor offset. From there, we can mask away bits to recover
d3d12_descriptor_heap. Metadata is stored inline in one big allocation,
and we can just offset from there based on extracted log2i_ceil(descriptor count).

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 13:04:43 +01:00
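The CPU VA encoding described in commit 84d632f194 can be approximated as follows. This is a sketch, not the actual vkd3d-proton layout: it assumes 32-byte descriptors (so the low 5 bits of a descriptor address are free), a heap base aligned to the full encoded range, and hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_SHIFT 5u /* assumed 32-byte descriptors: low 5 bits are free */
#define DESC_LOW_MASK ((UINT64_C(1) << DESC_SHIFT) - 1)

/* Smallest number of bits that can index `count` descriptors. */
static unsigned int log2_ceil(uint32_t count)
{
    unsigned int bits = 0;
    while ((UINT64_C(1) << bits) < count)
        bits++;
    return bits;
}

/* The low bits store how many bits above DESC_SHIFT encode the offset. */
static uint64_t va_encode(uint64_t heap_base, unsigned int count_bits, uint32_t index)
{
    assert(count_bits <= DESC_LOW_MASK);
    assert((heap_base & ((UINT64_C(1) << (DESC_SHIFT + count_bits)) - 1)) == 0);
    assert(index < (UINT64_C(1) << count_bits));
    return heap_base | ((uint64_t)index << DESC_SHIFT) | count_bits;
}

static unsigned int va_count_bits(uint64_t va)
{
    return (unsigned int)(va & DESC_LOW_MASK);
}

/* Descriptor offset within the heap, recovered without a dereference. */
static uint32_t va_index(uint64_t va)
{
    unsigned int bits = va_count_bits(va);
    return (uint32_t)((va >> DESC_SHIFT) & ((UINT64_C(1) << bits) - 1));
}

/* Mask away the offset and tag bits to recover the heap base. */
static uint64_t va_heap_base(uint64_t va)
{
    unsigned int bits = va_count_bits(va);
    return va & ~((UINT64_C(1) << (DESC_SHIFT + bits)) - 1);
}
```

Everything is derived from the handle value with shifts and masks, which is the point of the scheme: no pointer has to be followed to find the heap or the offset.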
Hans-Kristian Arntzen b309913b6d vkd3d: Use unsafe_impl in CopyDescriptorsSimple.
This is an ultra-hot path and seems to show up somehow on profile.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-25 13:04:43 +01:00
Hans-Kristian Arntzen c29d005ef4 vkd3d: Don't enable fast descriptor copy path for descriptor QA.
The hooks are in the generic function.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-24 16:42:00 +01:00
Hans-Kristian Arntzen 8a46c21254 vkd3d: Add VKD3D_CONFIG to skip memory allocator clears.
For cases where games spam committed allocations and don't use
NOT_ZEROED. We still rely on zerovram behavior for initial backing which
should be enough in most cases.

Strictly speaking however, we are forced to clear the allocations every
time if application does not use the flag correctly.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-24 12:52:05 +01:00
Hans-Kristian Arntzen edbf49aad4 vkd3d: Support opt-in to single MUTABLE set.
Useful for Intel since Intel hardware cannot support more than 1M
descriptors in general, and opting in to correct behavior should improve
CPU overhead as well when copying descriptors.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-21 17:08:25 +01:00
Hans-Kristian Arntzen 15704b2419 vkd3d: Optimize descriptor copies for common code paths.
The common path that we really need to optimize for is CBV_SRV_UAV +
Simple + 1 descriptor.

Descriptor benchmark shows an almost 50% reduction in overhead now.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-21 16:35:36 +01:00
Hans-Kristian Arntzen 2f6a91e772 vkd3d: De-virtualize query for descriptor size.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-21 16:35:36 +01:00
Hans-Kristian Arntzen 33f17cc74d vkd3d: Add VK_EXT_pipeline_creation_feedback.
Useful when used together with pipeline library logging. Confirms that
we can load pipeline caches as expected.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-04 14:31:34 +01:00
Hans-Kristian Arntzen 47337d5e0b vkd3d: Add VKD3D_CONFIG flags for various pipeline library logging.
Additionally, add option to ignore cached SPIR-V.
Will be useful for debugging, and also required for VKD3D_SHADER_OVERRIDE.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-04 14:31:34 +01:00
Hans-Kristian Arntzen f03940ef4b vkd3d: Add global_pipeline_cache option.
Avoids saving out pipeline cache blobs which are likely going to be
cached by on-disk cache anyways.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-04 14:31:34 +01:00
Hans-Kristian Arntzen 29d956c6c4 vkd3d: Fix memory leak of D3D12 device singleton.
Fairly trivial, caught by ASAN.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-02-02 13:56:36 +01:00
Philip Rebohle 8f81aaa710 vkd3d: Fix reporting of WriteBufferImmediateSupportFlags.
Oversight from when we added bundles.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-02-01 16:21:43 +01:00
Hans-Kristian Arntzen 86f8f41490 vkd3d: Compute a global shader interface key for a D3D12 device.
This key represents the variations of SPIR-V which would be generated
from otherwise identical inputs like DXBC blobs and root signatures.

Typically, changing VKD3D_CONFIG flags or enabled extensions will affect
this key. This ensures that we will not attempt to use a cached SPIR-V
file unless we can trust that the SPIR-V interface will match.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-25 14:07:07 +01:00
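One way to picture the shader interface key from commit 86f8f41490 is a hash over everything that can change generated SPIR-V. The sketch below uses FNV-1a, which is a swapped-in choice for illustration, and reduces the inputs to two hypothetical bitmasks (`config_flags`, `extension_mask`); the real key covers more state and may be computed differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a; chosen for illustration only. */
static uint64_t fnv1a_update(uint64_t h, const void *data, size_t size)
{
    const unsigned char *p = data;
    size_t i;

    assert(data != NULL || size == 0);
    for (i = 0; i < size; i++)
    {
        h ^= p[i];
        h *= UINT64_C(0x100000001b3);
    }
    return h;
}

/* Hypothetical inputs: a bitmask of config flags and a bitmask of
 * enabled extensions that affect SPIR-V code generation. */
static uint64_t interface_key(uint64_t config_flags, uint64_t extension_mask)
{
    uint64_t h = UINT64_C(0xcbf29ce484222325);
    h = fnv1a_update(h, &config_flags, sizeof(config_flags));
    h = fnv1a_update(h, &extension_mask, sizeof(extension_mask));
    return h;
}

/* Cached SPIR-V is only trusted when it was built under the same key. */
static int cached_spirv_usable(uint64_t stored_key, uint64_t device_key)
{
    return stored_key == device_key;
}
```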
Hans-Kristian Arntzen e90b573896 vkd3d-shader: Use flag for vkd3d_shader_meta bools.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-25 14:07:07 +01:00
Philip Rebohle 1af62abfe7 vkd3d: Enable quirk for further UE4 shaders.
Fixes artifacts in The Ascent.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-01-19 16:49:42 +01:00
Hans-Kristian Arntzen 6cba8b9945 vkd3d: Workaround broken barriers in DEATHLOOP.
In DEATHLOOP, there is a render pass which renders out a simple image,
which is then directly followed by a compute dispatch, reading that
image. The image is still in RENDER_TARGET state, and color buffers are
*not* flushed properly on at least RADV, manifesting as a very
distracting glitch pattern. This is a game bug, but for the time being,
we have to work around it, *sigh*.

For a simple workaround, we can detect patterns where we see these
events in succession:

- Color RT is started
- StateBefore == RENDER_TARGET is not observed
- Dispatch()

In particular, when entering the options menu, highly distracting
glitches are observed in the background.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-01-12 12:20:03 +01:00
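The three-step pattern listed in commit 6cba8b9945 maps naturally onto a one-flag tracker. The following is a sketch with hypothetical event and type names; it only shows the detection logic, not the barrier that the real workaround would emit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical events observed while recording a command list. */
enum cmd_event
{
    EVENT_BEGIN_COLOR_RT,             /* color render target rendering starts */
    EVENT_BARRIER_FROM_RENDER_TARGET, /* barrier with StateBefore == RENDER_TARGET */
    EVENT_DISPATCH                    /* compute dispatch */
};

struct rt_tracker
{
    bool color_rt_unflushed;
};

/* Returns true when a workaround barrier should be injected before the
 * event, i.e. a dispatch follows color RT work with no proper barrier. */
static bool rt_tracker_observe(struct rt_tracker *t, enum cmd_event e)
{
    assert(t);
    switch (e)
    {
        case EVENT_BEGIN_COLOR_RT:
            t->color_rt_unflushed = true;
            return false;
        case EVENT_BARRIER_FROM_RENDER_TARGET:
            t->color_rt_unflushed = false;
            return false;
        case EVENT_DISPATCH:
            if (t->color_rt_unflushed)
            {
                t->color_rt_unflushed = false;
                return true;
            }
            return false;
    }
    return false;
}
```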
Samuel Pitoiset f6fe3e0183 vkd3d: Require VK_KHR_copy_commands2
This extension is trivial to implement for vendors and should be
widely supported.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2022-01-12 12:06:06 +01:00
Samuel Pitoiset b42a7193fc vkd3d: Require VK_KHR_bind_memory2
This extension is trivial to implement for vendors and should be
widely supported.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2022-01-12 12:06:06 +01:00
Hans-Kristian Arntzen 459cae5673 vkd3d: Fix redundant return from void.
Fix MSVC warning.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-12-02 22:48:48 +01:00
Arkadiusz Hiler 93d105adae vkd3d: Retry to create Vk device without NVX extensions.
The creation with those extensions may fail in a few cases:
 * older 32-bit drivers
 * missing or inaccessible /dev/nvidia-uvm

There's also a mysterious crash that some Debian users experience with
64-bit titles and a correct /dev/nvidia-uvm.

Signed-off-by: Arkadiusz Hiler <ahiler@codeweavers.com>
2021-12-02 12:44:37 +01:00
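The retry strategy from commit 93d105adae can be sketched without any Vulkan calls. Here `create_device_fn` is a hypothetical callback standing in for device creation, and `fake_create` is a test stand-in that fails whenever an NVX extension is requested; neither is part of vkd3d-proton.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_EXTS 64

/* Hypothetical device creation callback: returns 0 on success. */
typedef int (*create_device_fn)(const char *const *exts, size_t count);

/* Copy every extension name that does not start with "VK_NVX_". */
static size_t strip_nvx(const char *const *in, size_t count, const char **out)
{
    size_t i, n = 0;
    for (i = 0; i < count; i++)
        if (strncmp(in[i], "VK_NVX_", 7) != 0)
            out[n++] = in[i];
    return n;
}

/* Try with the full list first, then retry once without NVX extensions. */
static int create_with_fallback(create_device_fn create,
        const char *const *exts, size_t count)
{
    const char *filtered[MAX_EXTS];
    size_t n;

    assert(count <= MAX_EXTS);
    if (create(exts, count) == 0)
        return 0;

    n = strip_nvx(exts, count, filtered);
    if (n == count)
        return -1; /* nothing to strip, so a retry would not help */
    return create(filtered, n);
}

/* Stand-in that mimics a driver rejecting NVX extensions. */
static int fake_create(const char *const *exts, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        if (strncmp(exts[i], "VK_NVX_", 7) == 0)
            return -1;
    return 0;
}
```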
Georg Lehmann 4240ab7559 vkd3d: Allow B8G8R8A8 UAVs.
This is now allowed according to
https://microsoft.github.io/DirectX-Specs/d3d/RelaxedCasting.html

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-11-24 15:15:14 +01:00
Philip Rebohle b03c1fcb5f vkd3d: Implement ID3D12Device9.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-19 14:57:51 +01:00
Philip Rebohle 3b6a4ab988 vkd3d: Implement ID3D12Device8 and ID3D12Resource2.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-19 14:57:51 +01:00
Philip Rebohle d61f562a3e vkd3d: Implement ID3D12Device7.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-19 14:57:51 +01:00
Joshua Ashton 046524f2a1 vkd3d: Implement MinLODClamp using VK_EXT_image_view_min_lod
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-11-17 20:51:20 +01:00