Commit Graph

347 Commits

Author SHA1 Message Date
Hans-Kristian Arntzen 5b013d0b02 vkd3d: Validate shader meta against features.
We're supposed to validate and fail compilation if certain features are
not supported.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-28 15:28:19 +02:00
Hans-Kristian Arntzen 37e8f42f4a vkd3d: Move patch vertex count to meta struct.
Will make it easier to implement for DXIL.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:58:45 +02:00
Hans-Kristian Arntzen 3915090c12 vkd3d: Track depth-stencil image layouts over a command buffer.
Goal here is to avoid unnecessary image layout transitions when render
passes toggle depth-stencil PSO states. Since we cannot know which
states a resource is in, we have to be conservative, and assume that
shader reads *could* happen.

The best effort we can do is to detect when writes happen to a DSV
resource. In this scenario, we can deduce that the aspect cannot be
read, since DEPTH_WRITE | RESOURCE state is not allowed.

To make the tracking somewhat sane, we only promote to OPTIMAL if an
entire image's worth of subresources for a given aspect is transitioned.
The common case for depth-stencil images is 1 mip / 1 layer anyways.

Some other changes are required here:
- Instead of common_layout for the depth image, we need to consult the
  command list, which might promote the layout to optimal.
- We make use of render pass compatibility rules which state that we can
  change attachment reference layouts as well as initial/finalLayout.
  To make this change, a pipeline will fill in a
  vkd3d_render_pass_compat struct.
- A command list has a dsv_plane_optimal_mask which keeps track
  of the plane aspects we have promoted to OPTIMAL, and we know cannot
  be read by shaders.
  The desired optimal mask is (existing optimal | PSO write).
  The initial existing optimal is inherited from the command list's
  tracker.
- RTV/DSV/views no longer keep track of VkImageLayout. This is
  unnecessary since we always deduce image layout based on context.
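
The mask bookkeeping described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual vkd3d-proton code: the plane bit constants, the struct and both function names are hypothetical, and only dsv_plane_optimal_mask is a name taken from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical plane bits; the real code uses its own flags. */
#define PLANE_DEPTH_BIT   0x1u
#define PLANE_STENCIL_BIT 0x2u

/* Hypothetical stand-in for the per-command-list tracking state. */
struct cmd_list_sketch
{
    /* Planes promoted to OPTIMAL, which shaders are known not to read. */
    uint32_t dsv_plane_optimal_mask;
};

/* The desired optimal mask is (existing optimal | PSO write). */
static uint32_t desired_optimal_mask(const struct cmd_list_sketch *list,
        uint32_t pso_write_mask)
{
    return list->dsv_plane_optimal_mask | pso_write_mask;
}

/* Assumption: promotion is only recorded when the whole image's worth of
 * subresources for the aspect is covered, to keep the tracking simple. */
static void promote_planes(struct cmd_list_sketch *list,
        uint32_t pso_write_mask, bool covers_all_subresources)
{
    if (covers_all_subresources)
        list->dsv_plane_optimal_mask = desired_optimal_mask(list, pso_write_mask);
}
```

Keeping the state as a small bit mask means the inherited value from the command list's tracker and the PSO write mask combine with a single OR.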

Overall, this shows a massive gain in the HZD benchmark (RADV, 1440p ultimate, ~16% higher FPS on RX 6800).

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:45:46 +02:00
Hans-Kristian Arntzen 419790ac77 vkd3d: Add wave size workaround for GravityMark.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-02 15:15:42 +02:00
Hans-Kristian Arntzen cb5283b6fb vkd3d: Allow dynamic vertex stride == 0 to go through.
Eliminates all late pipeline compiles in Scarlet Nexus DX12 (and several
other games).

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-29 16:00:33 +02:00
Hans-Kristian Arntzen 7c80c92304 vkd3d: Use ALLOW_VARYING_SUBGROUP_SIZE flag as appropriate.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-25 15:08:53 +02:00
Hans-Kristian Arntzen c108bec58f vkd3d: Fix trivial indentation nit.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:41:09 +02:00
Hans-Kristian Arntzen 9900301886 vkd3d: Use read-write lock for fallback pipeline cache.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:41:09 +02:00
Hans-Kristian Arntzen bb723e859b vkd3d: Use read-write locks for render pass cache.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:41:09 +02:00
Hans-Kristian Arntzen 02398c4eef vkd3d: Normalize depth-stencil layouts if only one aspect is used.
Avoid using the separate layouts if we're only using formats with one
aspect. This makes it more likely to match layouts with common layout,
and we can avoid awkward transition barriers.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:32:48 +02:00
Hans-Kristian Arntzen 1ea31701c5 vkd3d: Move F1 2020 workaround over to quirks system.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-17 16:42:14 +02:00
Hans-Kristian Arntzen 28c8a595fa vkd3d: Pass down shader quirks for Necromunda.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-17 16:42:14 +02:00
Hans-Kristian Arntzen 9207d4f019 vkd3d: Ignore BlendEnable if write mask is 0.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-17 16:42:14 +02:00
Hans-Kristian Arntzen a256a9266e vkd3d: Rewrite descriptor QA.
Adds support for GPU-assisted validation of descriptor usage in the
CBV_SRV_UAV heap.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00
Hans-Kristian Arntzen 9d405f0366 vkd3d: Don't try to use fallback SRV aux heap.
DXR requires buffer_device_address, so it's meaningless to attempt a
fallback.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-13 08:25:10 +01:00
Joshua Ashton a3ad7cae90 vkd3d-shader: Remove type/next from interface structures
This was never really used for anything useful.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-20 18:25:41 +02:00
Joshua Ashton 220e1146ee vkd3d-shader: Make vkd3d_shader_transform_feedback_info a member
Moves it into vkd3d_shader_interface_info, this doesn't need to be
a pNext.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-20 18:25:41 +02:00
Hans-Kristian Arntzen c7eb6fdf61 vkd3d: Add some tracing to help narrow down compiler crashes.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-15 16:24:05 +02:00
Hans-Kristian Arntzen 744497274c vkd3d-shader: Verify that we compile expected shader stage.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-15 16:24:05 +02:00
Hans-Kristian Arntzen 8f17fdd1fa vkd3d: Don't leak pipeline cache if we fail compile.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-15 16:24:05 +02:00
Joshua Ashton 7cfe17d2f5 vkd3d-shader: Passthrough vkd3d_config_flags
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 16:29:57 +02:00
Joshua Ashton 9fb624a429 vkd3d: Implement RSSetShadingRateImage
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 13:22:01 +02:00
Joshua Ashton 601357c7c5 vkd3d: Implement a static pipeline variant system
Needed so we can switch between having a VRS and non-VRS attachment on the fly.
Extensible enough for this to work for other things down the line also.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 13:22:01 +02:00
Philip Rebohle 4e777b9182 vkd3d: Use depth attachment when depth bounds test is enabled.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-04-12 11:23:51 +02:00
Philip Rebohle 698279ec90 vkd3d: Enable conservative rasterization state as requested.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-25 18:00:59 +01:00
Joshua Ashton fe28436c34 vkd3d: Refactor vkd3d_render_pass_key to use flags
We're going to need more state in this key for VRS TIER_2 and we need to keep this aligned.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-24 15:20:10 +01:00
Joshua Ashton 65b13f6cd6 vkd3d: Use VK_KHR_create_renderpass2
We need this before implementing TIER_2 variable rate shading.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-24 15:20:10 +01:00
Hans-Kristian Arntzen 5abc4b9af2 vkd3d: Add all relevant RT stages to push constant layout.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-23 18:35:35 +01:00
Hans-Kristian Arntzen bd16d1a88d vkd3d: Support RTPSO object collections.
This is quite complicated, but we can use VK_KHR_pipeline_library
to implement this functionality.
2021-03-23 18:35:35 +01:00
Hans-Kristian Arntzen 028b87ab61 vkd3d: Fix some trivial bugs with local root signatures.
Did not properly allocate bindings.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-12 12:20:10 +01:00
Hans-Kristian Arntzen 13d132f1c4 vkd3d: Add support for hoisting CBV descriptors to push descriptors.
Bindless CBV is *pretty* bad on NVIDIA, so add a code path which can
promote descriptor table CBVs into push descriptors.

We can safely do this with Root Signature 1.1 STATIC or
the somewhat obscure STATIC_KEEPING_BUFFER_BOUNDS_CHECKS.

With VOLATILE, which basically all titles are using,
we can still force this behavior through a config flag,
but this is an incorrect speed hack. It works in most
titles however, since bindless CBV is exceptionally rare.

We only hoist descriptors when the root signature range has 1 descriptor
anyway, so we should avoid any reasonable bindless scenario.
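
The hoisting decision described above could be expressed along these lines. This is a sketch under assumptions: the enum, its values and the function name are hypothetical stand-ins for the Root Signature 1.1 range flags and the config flag mentioned in the message, not the real vkd3d-proton identifiers.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the Root Signature 1.1 descriptor range flags. */
enum range_volatility_sketch
{
    RANGE_VOLATILE,
    RANGE_STATIC,
    RANGE_STATIC_KEEPING_BUFFER_BOUNDS_CHECKS
};

/* Decide whether a descriptor table CBV range may become a push descriptor.
 * force_hoist models the config flag, which is an incorrect speed hack
 * when the range is VOLATILE. */
static bool can_hoist_cbv(enum range_volatility_sketch volatility,
        unsigned int descriptor_count, bool force_hoist)
{
    /* Only ranges with exactly one descriptor are considered,
     * which avoids any reasonable bindless scenario. */
    if (descriptor_count != 1)
        return false;

    if (volatility == RANGE_STATIC ||
            volatility == RANGE_STATIC_KEEPING_BUFFER_BOUNDS_CHECKS)
        return true;

    /* VOLATILE: only when explicitly forced. */
    return force_hoist;
}
```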

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 11:46:05 +01:00
Hans-Kristian Arntzen d758a6e296 vkd3d: Convert Root Signatures to 1.1.
We will be able to make use of the STATIC vs VOLATILE flags.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 11:46:05 +01:00
Hans-Kristian Arntzen 89fbe334df vkd3d: Redirect push constants to their bind point stages.
Gives a massive boost on NVIDIA for some reason.
RADV defers push constant update, so ALL_STAGES doesn't have
that much of a perf hit.

~20% uplift in RE2, ~5% uplift in CP77 from some quick and dirty testing.
Seems to be heavily content dependent either way.

Also a bug fix, since we would clobber graphics push constants from
compute and vice versa if both graphics and compute used the same root
signature.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-26 17:06:18 +01:00
Joshua Ashton 8c9527cdf7 vkd3d: Refactor SetName implementation
As per MSDN, SetName is just a wrapper around SetPrivateData and a specific GUID.

Some apps and tools will use this to retrieve their name back.

So instead, just forward the name to Vulkan in the SetPrivateData call.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-25 21:51:43 +01:00
Philip Rebohle 26f5745ea1 vkd3d: Don't use SHADER_STAGE_ALL for push constants.
Instead, infer the required stages from the D3D12 shader visibility
field from all root parameters that we map to push constants.
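
A minimal sketch of this inference, assuming a graphics root signature. All names here are hypothetical: the enum stands in for the D3D12 shader visibility values and the STAGE_* bits stand in for Vulkan shader stage flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the D3D12 shader visibility values. */
enum visibility_sketch
{
    VIS_ALL,
    VIS_VERTEX,
    VIS_HULL,
    VIS_DOMAIN,
    VIS_GEOMETRY,
    VIS_PIXEL
};

/* Hypothetical stage bits standing in for Vulkan shader stage flags. */
#define STAGE_VERTEX       0x01u
#define STAGE_HULL         0x02u
#define STAGE_DOMAIN       0x04u
#define STAGE_GEOMETRY     0x08u
#define STAGE_PIXEL        0x10u
#define STAGE_ALL_GRAPHICS 0x1fu

struct root_param_sketch
{
    enum visibility_sketch visibility;
    int maps_to_push_constants; /* non-zero if backed by push constants */
};

static uint32_t stages_for_visibility(enum visibility_sketch visibility)
{
    switch (visibility)
    {
        case VIS_VERTEX:   return STAGE_VERTEX;
        case VIS_HULL:     return STAGE_HULL;
        case VIS_DOMAIN:   return STAGE_DOMAIN;
        case VIS_GEOMETRY: return STAGE_GEOMETRY;
        case VIS_PIXEL:    return STAGE_PIXEL;
        default:           return STAGE_ALL_GRAPHICS; /* VIS_ALL */
    }
}

/* OR together the stages of every root parameter mapped to push constants. */
static uint32_t push_constant_stages(const struct root_param_sketch *params,
        size_t count)
{
    uint32_t stages = 0;
    size_t i;

    for (i = 0; i < count; i++)
    {
        if (params[i].maps_to_push_constants)
            stages |= stages_for_visibility(params[i].visibility);
    }
    return stages;
}
```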

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-25 20:28:07 +01:00
Joshua Ashton c0d4ead8ca vkd3d: Implement TIER_1 variable rate shading
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-12 13:39:05 +01:00
Joshua Ashton fccbd3b5e2 vkd3d: Eliminate wchar_size, use UTF-16 string literals
Achieves this with C standard stuff alone, and no compiler hacks.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-09 11:26:28 +01:00
Hans-Kristian Arntzen bfe9a39c3b vkd3d: Implement the basics of RTPSO.
Implement enough that the test case compiles correctly.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen 1784351dcf vkd3d-shader: Move root parameter structs to vkd3d-shader.
Need it here since local root signatures need to know
the physical layout of the record buffer up front.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen fdcf583cbc vkd3d: Rename COUNTER buffer to AUX_BUFFER.
We will use the same pointer buffer to handle acceleration structures,
so unify this buffer under a new name. Simplifies some of the binding
code since the SRV path and UAV path look more similar now.

Only difference is that UAV path uses BDA -> uint32_t,
and SRV uses BDA -> RTAccelerationStructure.

RT requires BDA, so the fallback descriptor set (storage texel buffer) is never used for RT.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen f3becc21a4 vkd3d: Implement local root signatures.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen a531ee5fd4 vkd3d: Remove force_bindless_texel_buffer workaround.
Obsolete now that we fully split typed and untyped buffer descriptors.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-14 15:34:20 +01:00
Hans-Kristian Arntzen 97e0d8e751 vkd3d: Move bindless SSBO out of MUTABLE set and fill both descriptors.
We will need separate descriptor sets to be able to handle typed vs
untyped buffer workarounds.

Also writes multiple descriptors for buffer views to make sure MUTABLE
and SSBO sets are filled (or TEXEL_BUFFER + SSBO for non-mutable).

Applications often get this wrong and use raw buffer in shader where
typed view was written and vice versa.
To mitigate this, just write a typed and untyped view together.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-14 15:34:20 +01:00
Hans-Kristian Arntzen 1bddaa0fff vkd3d: Allow a heap binding to cover multiple descriptors.
This begins the refactor toward letting us use both texel buffer and
SSBO descriptors for typed buffers, which is a better workaround than
force_bindless_texel_buffers.

In this new approach, we store a mask in metadata instead of
set/binding.

When copying a descriptor, we will iterate over the masks and look up
binding directly from device->bindless_state.set_info[].

The mask is represented in terms of info index rather than set index to
avoid needless lookups. Add some new helpers to make this process
easier.
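
The mask walk described above might look something like the following. This is a sketch only: the struct layout and function name are hypothetical, and the set_info parameter merely models the device->bindless_state.set_info[] array named in the message.

```c
#include <stdint.h>

/* Hypothetical per-info entry; models one element of set_info[]. */
struct set_info_sketch
{
    uint32_t set_index;
    uint32_t binding_index;
};

/* Walk the bits of a descriptor's info-index mask and collect the binding
 * for each set bit. Bit N of the mask refers to set_info[N] directly, so
 * no set index to info index translation is needed.
 * Returns the number of bindings written to out_bindings. */
static unsigned int collect_bindings(uint32_t set_info_mask,
        const struct set_info_sketch *set_info, uint32_t *out_bindings)
{
    unsigned int count = 0;
    unsigned int info_index = 0;

    while (set_info_mask)
    {
        if (set_info_mask & 1u)
            out_bindings[count++] = set_info[info_index].binding_index;
        set_info_mask >>= 1;
        info_index++;
    }
    return count;
}
```

Because the mask can have more than one bit set, a single heap binding can cover several descriptors, such as a texel buffer and an SSBO for the same typed buffer view.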

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-14 15:34:20 +01:00
Philip Rebohle 1d9f28b25f vkd3d: Add fast path for mutable descriptor copies.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2020-12-09 14:31:22 +01:00
Hans-Kristian Arntzen aa21d2d03d vkd3d: Add support for VK_VALVE_mutable_descriptor_type.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-12-07 15:17:08 +01:00
Hans-Kristian Arntzen 19193bf932 vkd3d: Sanitize VBO strides and VBO offsets.
Realign VBO strides and offsets if we have to, for sake of
robustness. Violating these rules is against D3D12 spec, but it does not
cause crashes on native drivers. On RDNA we can hit hangs with unaligned
vertex attributes. It appears that native drivers apply some kind of
fixup here to avoid the crash, even if the result is not what we expect.
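
As a rough illustration of such a fixup, the sketch below realigns an offset and a stride. The names are hypothetical, and both the rounding direction (down) and the power-of-two alignment are assumptions made for the example; the commit message does not state which rule the real code applies.

```c
#include <stdint.h>

/* Round value down to a multiple of alignment.
 * Assumption: alignment is a non-zero power of two. */
static uint64_t align_down(uint64_t value, uint64_t alignment)
{
    return value & ~(alignment - 1);
}

/* Hypothetical vertex buffer binding. */
struct vbo_binding_sketch
{
    uint64_t offset;
    uint32_t stride;
};

/* Return a copy of the binding with offset and stride realigned.
 * Aligned inputs pass through unchanged. */
static struct vbo_binding_sketch sanitize_vbo(struct vbo_binding_sketch in,
        uint32_t alignment)
{
    struct vbo_binding_sketch out;

    out.offset = align_down(in.offset, alignment);
    out.stride = (uint32_t)align_down(in.stride, alignment);
    return out;
}
```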

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-24 15:07:29 +01:00
Hans-Kristian Arntzen ffc1fa646c vkd3d: Mask out attachments which cannot safely be written to.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-19 14:13:59 +01:00
Hans-Kristian Arntzen 0dc0d75967 vkd3d: Use VK_IMAGE_LAYOUT_UNDEFINED for unused attachments.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-19 14:13:59 +01:00
Hans-Kristian Arntzen 9617a0f598 vkd3d: Disable RAW_VA root CBVs on NVIDIA.
BDA cannot map to their hardware, and we observe a large performance
loss in games which use root CBVs. For this reason, fall back to push
descriptors here.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2020-11-18 15:49:31 +01:00