Commit Graph

838 Commits

Author SHA1 Message Date
Hans-Kristian Arntzen b77091ba6b vkd3d: Don't suballocate scratch buffers.
Scratch buffers are 1 MiB blocks which will end
up being suballocated. This was not intended and a fallout from the
earlier change where VA_SIZE was bumped to 2 MiB for Elden Ring.

Introduce a memory allocation flag INTERNAL_SCRATCH which disables
suballocation and VA map insert.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-08 13:19:29 +02:00
Hans-Kristian Arntzen eb1e3ae656 debug: Add concept of implicit instance index to debug ring.
For internal debug shaders, it is helpful to ensure in-order logs when
sorted for later inspection.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:53:25 +02:00
Hans-Kristian Arntzen 8140b26c93 vkd3d: Encode in detail which commands we're emitting in template.
Feed this back to debug ring for less cryptic logs.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:53:25 +02:00
Hans-Kristian Arntzen 4aeca16468 vkd3d: Refactor out patch command token enum.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:53:25 +02:00
Hans-Kristian Arntzen ebbf4b5338 vkd3d: Add debug ring path for execute indirect template patches.
Somehow inspect draw parameters this way.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:53:25 +02:00
Hans-Kristian Arntzen 458391e794 vkd3d: Trace breadcrumbs for execute indirect templates.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:53:25 +02:00
Hans-Kristian Arntzen 186b45a61f vkd3d: Pass down required memory types to scratch allocators.
Separate scratch pools by their intended usage. Allows e.g. preprocess buffers to be
allocated differently from normal buffers. Potentially can also allow
for separate pools for host visible scratch memory etc down the line.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:53:25 +02:00
Hans-Kristian Arntzen 124768c1d6 vkd3d: Optimize ExecuteIndirect() if no INDIRECT transitions happened.
The D3D12 docs outline this as an implementation detail explicitly, so
we should do the same thing.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:53:25 +02:00
Hans-Kristian Arntzen 59b75b5b1d vkd3d: Implement some advanced use cases of ExecuteIndirect.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:53:25 +02:00
Hans-Kristian Arntzen e72fd1414f vkd3d: Enable NV_device_generated_commands extension.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:53:25 +02:00
Hans-Kristian Arntzen 4ade0d37b8 meta: Add ExecuteIndirect patch meta shader.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:53:25 +02:00
Hans-Kristian Arntzen f46d175935 vkd3d: Refactor index buffer state to be flushed late.
With ExecuteIndirect state we'll need to modify or refresh index buffer
state.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:53:25 +02:00
Hans-Kristian Arntzen 102e2dac3a vkd3d: Add more stringent validation for CreateCommandSignature.
The runtime is specified to validate certain things.
Also, be more robust against unsupported command signatures, since we
might need to draw/dispatch at an offset. Avoids hard GPU crashes.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-06-01 15:53:25 +02:00
Hans-Kristian Arntzen 7916d2a6d8 vkd3d: Enable and use VK_KHR_fragment_shader_barycentric.
For now, just keep the NV path as well. It's the exact same extension
basically as the KHR one.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-31 15:59:49 +02:00
Hans-Kristian Arntzen 467db76f90 vkd3d: Remove obsolete COLOR -> COMPUTE workaround for Deathloop.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-31 15:59:35 +02:00
Hans-Kristian Arntzen f964532619 vkd3d: Implement extended DXR queries.
Requires ray_tracing_maintenance1.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-05-30 20:26:50 +02:00
Robin Kertels 8ac7aaca99 vkd3d: Enable VK_KHR_ray_tracing_maintenance1.
Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
2022-05-11 19:11:01 +02:00
Philip Rebohle beb58f8472 vkd3d: Enable and require VK_KHR_maintenance4.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-04-22 11:36:02 +02:00
Philip Rebohle e7a6af4971 vkd3d: Use texel buffer views for UAV clears with buffer to image copy.
Allows this to more easily work with more formats.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-04-21 13:51:58 +02:00
Hans-Kristian Arntzen ae0dafa3a1 cache: Attempt to use disk cache instead when appropriate.
When the disk cache is used, the cache we give back to applications is a
dummy. Therefore, try to use the disk cache blob if we detect a useless
application blob.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen 6c8542f7d6 vkd3d: Make use of internal pipeline library if we're asked to.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen 2dcb1e2efc cache: Implement an on-disk pipeline library.
With VKD3D_SHADER_CACHE_PATH, we can add automatic serialization of pipeline
blobs to disk, even for games which do not make any use of GetCachedBlob
of ID3D12PipelineLibrary interfaces. Most applications expect drivers to
have some kind of internal caching.

This is implemented as a system where a disk
thread will manage a private ID3D12PipelineLibrary, and new PSOs are
automatically committed to this library. PSO creation will also consult
this internal pipeline library if applications do not provide their own
blob.

The strategy for updating the cache is based on a read-only cache which
is mmaped from disk, with an exclusive write-only portion for new blobs,
which ensures some degree of safety if there are multiple
concurrent processes using the same cache.

The memory layout of the disk cache is optimized to be very efficient
for appending new blobs, just simple fwrites + fflush.
The format is also robust against sliced files, which solves the problem
where applications tear down without destroying the D3D12 device
properly.

This structure is very similar to Fossilize, and in fact the idea is to
move towards actually using the Fossilize format directly later.
This implementation prepares us for this scenario where e.g. Steam could
potentially manage the vkd3d-proton cache.

The main complication in this implementation is that we have to merge
the read-only and write caches.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen 3095ed84d3 cache: Add concept of internal pipeline libraries.
For internal pipeline libraries, we want a somewhat different strategy.

- PSOs are keyed by hash instead of user key.
- We want the option to conditionally store SPIR-V and PSO blobs.
  For internal caches, there isn't much of a reason to store PSO blobs
  since the disk cache is going to be primed anyways.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-05 14:12:20 +02:00
Hans-Kristian Arntzen 637834dc75 vkd3d: Make private_root_signatures actually private.
Makes sure that we drop private root signature device references when
public pipeline state refcount hits 0.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-04-05 14:12:20 +02:00
Joshua Ashton 2ed513b99a vkd3d: Remove VKD3D_MAX_DYNAMIC_STATE_COUNT
This was off by one, at some point, which could cause a stack buffer overrun which is naughty.

Replace this with just an ARRAY_SIZE on the dynamic_state_list for the array size.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2022-04-01 15:19:18 +02:00
Hans-Kristian Arntzen 241078d7e8 vkd3d: Add scalar UBO layout requirement for SM 6.0.
Needed to support SM 6.0 CBufferLoad.
This path is mostly unused since it's opt-in in DXC and horribly broken
...

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-30 20:13:32 +02:00
Hans-Kristian Arntzen 6f43f450c8 vkd3d: Disable primitive restart when using non-compatible topologies.
Primitive restart is only used for strip primitive types, and must be
ignored for lists. Use and require extended_dynamic_state2 for this
purpose.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-30 16:12:16 +02:00
Hans-Kristian Arntzen 1b5f7e8fc3 vkd3d: Use VkImageViewCreateInfo correctly.
For EXTENDED_USAGE, we still need to restrict image usage when creating
concrete views.
Use VkImageViewUsageCreateInfo to restrict usage flags to the kind of
view we're creating.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-24 17:55:32 +01:00
Hans-Kristian Arntzen cf65a78570 vkd3d: Rename DSV UNKNOWN workaround query.
Make it more obvious what it's really trying to check.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-23 22:36:00 +01:00
Hans-Kristian Arntzen 6e915dd2c0 vkd3d: Use rt_count as basis for binding RTVs.
Found some validation errors where rt_count != rtv_active_mask,
and blending used rt_count instead of rtv_active_mask. If shader renders
to a NULL attachment, we must make sure that it's part of the PSO
interface.

Also, use rt_count rather than active mask when beginning render pass.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-23 14:29:51 +01:00
Philip Rebohle 34f5fc6a31 vkd3d: Do not create pipeline variants for NULL RTVs.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-22 13:06:00 +01:00
Hans-Kristian Arntzen 63530501a5 vkd3d: Require VK_EXT_extended_dynamic_state.
This is basically required for not horrible stutter and performance and
is widely supported.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-16 17:48:21 +01:00
Hans-Kristian Arntzen e61cc0234a vkd3d: Allow debug ring to know about device lost scenarios.
For this case, we want to block and teardown the debug ring thread.
It's okay to fish for dead messages in the ring, since we know there
won't be more GPU work submitted.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen a6700d3d85 vkd3d: Make debug ring aware of potential crash scenarios.
If we expect device losts (breadcrumb debug), we need to use DEVICE uncached/coherent,
since we might not be able to flush GPU caches properly.

We also need to remove the idea of being able to copy out the control
block back to host. This is too brittle and we should instead just place
the control block in PCI-e BAR instead. Rethink how we pass messages
from GPU to CPU to make it more robust.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:26:27 +01:00
Robin Kertels a6ea442819 vkd3d: Enable VK_NV_device_diagnostic_checkpoints.
Signed-off-by: Robin Kertels <robin.kertels@gmail.com>
2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen 365dd05557 vkd3d: Add breadcrumbs support.
AMD path for this commit.
Idea is that we can automatically instrument markers with command list
information we can make some sense of in vkd3d-proton.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen 5017b3723c vkd3d: Enable VK_AMD_device_coherent_memory.
For breadcrumbs support, along with buffer marker.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen 6a4f2842cb cache: Move d3d12_pipeline_library to internal references.
Allow us to hold internal magic pipeline libraries without creating
cycles.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 12:29:32 +01:00
Hans-Kristian Arntzen 18a5315db4 cache: Refactor lock strategy of internal hashmaps.
Rather than having to take writer lock on serialize calls from the
outside, we should just take locks when accessing the internal hashmaps
instead.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-11 12:29:32 +01:00
Hans-Kristian Arntzen 17b1ffb41a vkd3d: Add path to use GENERAL depth-stencil images.
On some implementations, it doesn't matter for performance what we use,
and we can avoid a lot of ugly barriers this way.

Opt-in to use this extensions on GPUs we know handles it well,
otherwise, keep using the tracking paths.

With VK_KHR_dynamic_rendering, this is now feasible to do since we no longer
have to deal with shenanigans related to VkRenderPass layouts and
complicated compatibility rules.

To make this work with the existing framework, just need to consider
that GENERAL can be a common layout alongside DEPTH_STENCIL_OPTIMAL,
which are both common layouts that do not need to be tracked at all.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-10 15:14:55 +01:00
Hans-Kristian Arntzen f9da3bf564 vkd3d: Add VK_KHR_driver_properties.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-10 15:14:55 +01:00
Hans-Kristian Arntzen c6149b47cd cache: Handle ref-count rules for multiple LoadPipeline/StorePipeline.
In pipeline libraries, the library holds on to private references of the
libraries so that they can be rapidly loaded on-demand.

This behavior is verifed by API tests.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-09 18:35:09 +01:00
Hans-Kristian Arntzen cc08339624 vkd3d: Use internal_refcounts for pipeline state.
When we store pipeline state in libraries we have to manage lifetime a
bit differently, which requires internal refcounts of some sort.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-09 18:35:09 +01:00
Hans-Kristian Arntzen 422f6804fb vkd3d: Enable VK_KHR_create_renderpass2.
Required extension by VK_KHR_fragment_shading_rate and
VK_KHR_separate_depth_stencil_layouts, but we don't care about enabling
any features or use it directly.

Needed to silence validation errors.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2022-03-09 16:35:05 +01:00
Georg Lehmann 14a06680d9 vkd3d: Remove unused renderpass remains.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2022-03-08 18:34:18 +01:00
Philip Rebohle 9a408367dc vkd3d: Remove render pass cache.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle 51e6b2bbbe vkd3d: Remove render pass from command list state.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle 94f82d1085 vkd3d: Get rid of pipeline variant flags.
These only existed for VRS attachment, which is no longer
necessary with VK_KHR_dynamic_rendering.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle 1a68267962 vkd3d: Remove framebuffer list from d3d12_command_allocator.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00
Philip Rebohle c4f88951fc vkd3d: Use dynamic rendering for regular draw calls.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2022-03-08 17:44:47 +01:00