Commit Graph

835 Commits

Author SHA1 Message Date
Hans-Kristian Arntzen cd3d759b95 vkd3d: Enable VK_KHR_shader_integer_dot_product.
Accelerates SM 6.4 packed ops if present.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-05 15:38:59 +02:00
Hans-Kristian Arntzen af822939fb vkd3d: Implement support for rendering to NULL/unbound RTV.
Need to use fallback pipeline system here.
Keep track of active masks for PSO and current render target.
The intersection of those sets are the attachments which should be
active in the render pass.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-30 16:50:02 +02:00
Joshua Ashton bde3ad8e01 vkd3d: Move ID3D12StateObject impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton cabc31fc4c vkd3d: Move ID3D12Device impl_froms to header
Basic casts should not be function calls.
2021-09-23 12:12:13 +02:00
Joshua Ashton bfaf72386f vkd3d: Move ID3D12CommandSignature impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton b84c3ff163 vkd3d: Move ID3D12PipelineState impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton 7c993ae1a6 vkd3d: Move ID3D12RootSignature impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton 875fbe5f50 vkd3d: Move ID3D12QueryHeap impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton 2334c136e3 vkd3d: Move ID3D12DescriptorHeap impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton 8d5308c9a1 vkd3d: Move ID3D12Resource impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton 27e66b5c4a vkd3d: Move ID3D12Heap impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton 26d8011b06 vkd3d: Move ID3D12Fence impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton e597adb83a vkd3d: Move d3d12_query_heap_type_get_data_size to header
This should be inlined.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Hans-Kristian Arntzen 1d51818d8f vkd3d: Fix compile error introduced by bad rebase.
Somehow the rebase got really screwed up :\

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:42:30 +02:00
Hans-Kristian Arntzen abdaeb136d vkd3d: Add a memory budget per memory type.
For resizable BAR, we don't want to endlessly promote UPLOAD heaps to
BAR since VRAM is precious. The aim is to set a fixed budget where we
can keep allocating until full, at which point we fall back to plain HOST.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
Hans-Kristian Arntzen e0451bb541 vkd3d: Handle fallbacks properly in suballocator.
With BAR budgets, what will happen is that
- Small allocation is requested
- A new chunk is requested
- try_suballocate_memory will end up calling allocate_memory, which
  allocates a fallback memory type
- Subsequent small allocators will always end up allocating a new
  fallback memory block, never reusing existing blocks.
- System memory is rapidly exhausted once apps start hitting against
  budget.

The fix is to add flags which explicitly do not attempt to fallback
allocate. This makes it possible to handle fallbacks at the appropriate
level in try_suballocate_memory instead.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
Hans-Kristian Arntzen 69d4f55219 vkd3d: Refactor VkDeviceMemory allocation to keep track of type/size.
We will need to consider some form of budgeting, so make sure that all
allocation and freeing is done in a central place.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
Hans-Kristian Arntzen 41295eff6c vkd3d: Consider CPU availibility when selecting memory types.
Need to consider that based on host visibility requirements, we need to
select either LINEAR or OPTIMAL image types, and those tiling modes can
have different memory requirements.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 15:35:57 +02:00
Hans-Kristian Arntzen 11086a94e0 vkd3d: Add macros to parse/build NV driver versions.
The bit offsets are a bit different from Vulkan API.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-08 18:37:55 +02:00
Hans-Kristian Arntzen 403d1f9743 vkd3d: Workaround huge memory overhead for individual VkPipelineCaches.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-07 13:21:54 +02:00
Hans-Kristian Arntzen fa1d82e141 vkd3d: Fix regressions when introducing null-copy elision.
Need to initialize the set mask so that copies happen properly
on default-initialized descriptors. Also, move the current_null_type to
metadata so that it's properly copied on descriptor copy.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-03 12:24:26 +02:00
Rodrigo Locatti b4cb5a37f8 vkd3d: Optimize repeated null descriptor updates
There are titles clearing the same descriptors constantly.
This leads to unnecessary updates that can become costly.

This commit introduces a new flag to track when D3D12 descriptors are
not null, and skips clearing them if they are already null.
Descriptors are assumed to be null by default.

This fixes a performance regression introduced by
9983a1720f

Signed-off-by: Rodrigo Locatti <rlocatti@nvidia.com>
2021-09-02 21:21:34 +02:00
Philip Rebohle 7fea3527ed vkd3d: Remove deferred clears.
Emitting render pass clears while we're in the process of starting
a render pass overrides dsv layout tracking info.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-09-02 17:11:35 +02:00
Hans-Kristian Arntzen bc9bd9c482 vkd3d: Fix member types in vkd3d_format.
No need to use size_t.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-02 12:21:22 +02:00
Hans-Kristian Arntzen 7b67de7d0e vkd3d: Generalize get_plane_footprints.
Get information directly from vkd3d_format and allow for subsampled
formats in the future.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-02 12:21:22 +02:00
Hans-Kristian Arntzen 3d5010555e vkd3d: Add d3d12_resource_desc_get_sub_resource_count.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-02 12:21:22 +02:00
Hans-Kristian Arntzen 5c2376faf5 vkd3d: Handle multiplanar formats in GetCopyableFootprints.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-02 12:21:22 +02:00
rochaudhari 0828aec4f6 vkd3d: Implement new interfaces required for DX12 DLSS support.
Adds ID3D12GraphicsCommandListExt and ID3D12DeviceExt interfaces.

Signed-off-by: Roshan Chaudhari <rochaudhari@nvidia.com>
2021-08-27 11:37:15 +02:00
Philip Rebohle 715eca1b95 vkd3d: Reimplement frame latency event as a semaphore.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-08-26 14:21:38 +02:00
Philip Rebohle fef30f5037 vkd3d: Support releasing semaphores from a D3D12 fence.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-08-26 14:21:38 +02:00
Hans-Kristian Arntzen e1bb5f3b77 vkd3d: Handle NULL event handles in ID3D12Fence::SetEvent*().
We need to block here for whatever reason.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-29 17:21:20 +02:00
Hans-Kristian Arntzen 5b013d0b02 vkd3d: Validate shader meta against features.
We're supposed to validate and fail compilation if certain features are
not supported.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-28 15:28:19 +02:00
Joshua Ashton 1d23bdbab7 vkd3d: Don't store pointer to QA info when not building with QA
This is entirely unnecessary and a waste of space as it will never be used.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-07-08 16:52:58 +02:00
Hans-Kristian Arntzen 3915090c12 vkd3d: Track depth-stencil image layouts over a command buffer.
Goal here is to avoid unnecessary image layout transitions when render
passes toggle depth-stencil PSO states. Since we cannot know which
states a resource is in, we have to be conservative, and assume that
shader reads *could* happen.

The best effort we can do is to detect when writes happen to a DSV
resource. In this scenario, we can deduce that the aspect cannot be
read, since DEPTH_WRITE | RESOURCE state is not allowed.

To make the tracking somewhat sane, we only promote to OPTIMAL if an
entire image's worth of subresources for a given aspect is transitioned.
The common case for depth-stencil images is 1 mip / 1 layer anyways.

Some other changes are required here:
- Instead of common_layout for the depth image, we need to consult the
  command list, which might promote the layout to optimal.
- We make use of render pass compatibility rules which state that we can
  change attachment reference layouts as well as initial/finalLayout.
  To make this change, a pipeline will fill in a
  vkd3d_render_pass_compat struct.
- A command list has a dsv_plane_optimal_mask which keeps track
  of the plane aspects we have promoted to OPTIMAL, and we know cannot
  be read by shaders.
  The desired optimal mask is (existing optimal | PSO write).
  The initial existing optimal is inherited from the command list's
  tracker.
- RTV/DSV/views no longer keep track of VkImageLayout. This is
  unnecessary since we always deduce image layout based on context.

Overall, this shows a massive gain in HZD benchmark (RADV, 1440p ultimate, ~16% FPS on RX 6800).

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:45:46 +02:00
Hans-Kristian Arntzen 8f05ac298c vkd3d: Add implementation for plane optimal tracker.
Idea is to keep track of scenarios where we know a resource's aspect is
known to be in a OPTIMAL state. Based on this, we can override the image
layout from the common_layout in order to avoid unnecessary full
barriers.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:45:46 +02:00
Hans-Kristian Arntzen 68ce7bd324 vkd3d: Handle separate DS layout for destination copies.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:31:52 +02:00
Hans-Kristian Arntzen cf632186fd vkd3d: Add workaround for MinLODClamp.
Not correct, will need spec additions to handle it properly.
Fixes ground rendering in DIRT 5.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-06 16:45:19 +02:00
Hans-Kristian Arntzen 398724cd6e vkd3d: Require VK_KHR_separate_depth_stencil_layouts.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-02 15:18:16 +02:00
Hans-Kristian Arntzen 7a00e56792 vkd3d: Handle multiple planes in d3d12_resource_get_subresource_count.
Separate out an explicit per_plane query for the cases where we need it.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-02 14:16:18 +02:00
Hans-Kristian Arntzen 7c80c92304 vkd3d: Use ALLOW_VARYING_SUBGROUP_SIZE flag as appropriate.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-25 15:08:53 +02:00
Georg Lehmann a7922a7c85 vkd3d: Introduce vkd3d_internal_get_vk_format.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-06-24 12:55:17 +02:00
Georg Lehmann 0d9c7bc3ad vkd3d: Index formats by format.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-06-24 12:55:17 +02:00
Hans-Kristian Arntzen 9900301886 vkd3d: Use read-write lock for fallback pipeline cache.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:41:09 +02:00
Hans-Kristian Arntzen bb723e859b vkd3d: Use read-write locks for render pass cache.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:41:09 +02:00
Hans-Kristian Arntzen 8225edc726 vkd3d: Rewrite resource state implementation.
- Honor resource barriers for resource states which cannot automatically
  decay or promote. This includes COLOR_ATTACHMENT, UNORDERED_ACCESS and
  VRS image. If SIMULTANEOUS_ACCESS is used, we can still promote, and
  we handle that by setting common layout to GENERAL for these resources.

- Avoid redundant barriers in render passes since normal resource
  barriers will always make sure we are already in
  COLOR_ATTACHMENT_OPTIMAL.

- Do not force GENERAL layout if resource has UNORDERED_ACCESS flag set.
  As this is not a promotable state, we have to explicitly transition
  into it. I tested this on validation layers, where even COMMON state
  refuses to promote to UAV state. The exception here of course is
  SIMULTANOUS_ACCESS, but we handle that properly now.

- Verify that UAV or SIMULTANEOUS access is not used together with DSV
  state. This is explicitly banned in the API docs.

- Actually emit image barriers. Batch the image transitions as that's
  what D3D12 docs encourage app developers to do, and it also expects
  that drivers can optimize this. Ensure that we respect the in-order
  resource barrier rules by splitting batches if there are overlaps in
  the transitions.

- Ensure that correct image layout is used when clearing a suspended
  render pass attachment.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:32:48 +02:00
Hans-Kristian Arntzen 177679a766 vkd3d: Add VKD3D_RESOURCE_SIMULTANEOUS_ACCESS.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:32:48 +02:00
Hans-Kristian Arntzen 02398c4eef vkd3d: Normalize depth-stencil layouts if only one aspect is used.
Avoid using the separate layouts if we're only using formats with one
aspects. This makes it more likely to match layouts with common layout,
and we can avoid awkward transition barriers.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:32:48 +02:00
Philip Rebohle 014a3c0b94 vkd3d: Handle plane slice index in descriptor creation.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-06-21 21:23:03 +02:00
Hans-Kristian Arntzen 28c8a595fa vkd3d: Pass down shader quirks for Necromunda.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-17 16:42:14 +02:00
rochaudhari 1699743c37 vkd3d: Enable binary import and image view handle extensions
Signed-off-by: Roshan Chaudhari <rochaudhari@nvidia.com>

Reviewed-by: Liam Middlebrook <lmiddlebrook@nvidia.com>
2021-06-10 11:26:34 +02:00
Hans-Kristian Arntzen 9983a1720f vkd3d: Splat null descriptors to all sets.
Some games end up writing the wrong descriptor type when using null
descriptors, and to be robust against that, we have to clear out
all descriptors when creating null descriptors.

If we copy a null descriptor, we will also have to copy from all sets.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-07 13:21:31 +02:00
Hans-Kristian Arntzen 3c7f188863 vkd3d: Nuke code paths for !nullDescriptor.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-27 10:39:22 +02:00
Hans-Kristian Arntzen a256a9266e vkd3d: Rewrite descriptor QA.
Adds support for GPU-assisted validation of descriptor usage in the
CBV_SRV_UAV heap.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00
David McCloskey 217ffc27d2 vkd3d: Type error fix for d3d12_device_get_query_pool.
Signed-off-by: David McCloskey <davmcclo@gmail.com>
2021-05-07 06:41:59 +01:00
Hans-Kristian Arntzen 4f0872152a meta: Add fs_copy_uint path.
For stencil -> color copies.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-05 00:11:10 +02:00
Hans-Kristian Arntzen 0e93af9700 vkd3d: Handle multiple planes in subresource conversion for copies.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-05 00:11:10 +02:00
Georg Lehmann a411256c7f vkd3d: Enable and require shaderDrawParameters.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-04-29 06:48:37 +01:00
Philip Rebohle 62cbf3d78a vkd3d: Remove unused unsafe_impl_from_ID3D12CommandAllocator.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-04-14 16:24:55 +02:00
Philip Rebohle 1990270bbb vkd3d: Implement CreateCommandList on top of CreateCommandList1.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-04-14 16:24:55 +02:00
Philip Rebohle 2ca62ecd12 vkd3d: Add bundle allocator and command list implementation.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-04-14 16:24:55 +02:00
Joshua Ashton 41df41305e include: Move vkd3d_config_flags to public header
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 16:29:57 +02:00
Joshua Ashton 9fb624a429 vkd3d: Implement RSSetShadingRateImage
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 13:22:01 +02:00
Joshua Ashton 5d17f71441 vkd3d: Handle usage and implicit views for VRS capable resources
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 13:22:01 +02:00
Joshua Ashton 601357c7c5 vkd3d: Implement a static pipeline variant system
Needed so we can switch between having a VRS and non-VRS attachment on the fly.
Extensible enough for this to work for other things down the line also.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-04-12 13:22:01 +02:00
Hans-Kristian Arntzen 7dc2a5cad7 vkd3d: Enable VK_KHR_sampler_mirror_clamp_to_edge.
CP77 requires it now.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-04-07 21:57:50 +02:00
Philip Rebohle 698279ec90 vkd3d: Enable conservative rasterization state as requested.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-25 18:00:59 +01:00
Philip Rebohle 8a61128152 vkd3d: Enable VK_EXT_conservative_rasterization if available.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-25 18:00:59 +01:00
Hans-Kristian Arntzen 2f60a3bf66 vkd3d: Fix broken debug_vk_memory_{property,heap}_flags.
C is fun, yo. Returned data from dead stack variable, also triggered
overflow in some cases.

Uncalled in release mode, but can crash debug builds.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-25 17:58:18 +01:00
Joshua Ashton fe28436c34 vkd3d: Refactor vkd3d_render_pass_key to use flags
We're going to need more state in this key for VRS TIER_2 and we need to keep this aligned.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-24 15:20:10 +01:00
Hans-Kristian Arntzen 5197edb03b vkd3d: Enable 16-bit storage features.
Don't need extension, since VK_KHR_16bit_storage is core in Vulkan 1.1.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-23 18:35:35 +01:00
Hans-Kristian Arntzen 9fa668867e vkd3d: Hold private reference to collection objects.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-23 18:35:35 +01:00
Hans-Kristian Arntzen bd16d1a88d vkd3d: Support RTPSO object collections.
This is quite complicated, but we can use VK_KHR_pipeline_library
to implement this functionality.
2021-03-23 18:35:35 +01:00
Joshua Ashton 2fa97aa0fb vkd3d: Move API versions to public header
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-22 14:29:45 +01:00
Joshua Ashton aa12817ccf vkd3d: Implement D3D12_HEAP_TYPE_WRITE_WATCH
Needed for D3D12 APITrace

Closes: #373
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-18 14:41:46 +01:00
Joshua Ashton 4e31f5d54d vkd3d: Align d3d12_rtv_desc to D3D12_DESC_ALIGNMENT
Otherwise we can do an alligned_malloc with a non-aligned size as the descriptor size is 48 for a d3d12_rtv_desc otherwise.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-16 21:45:28 +01:00
Hans-Kristian Arntzen dbdbf94083 vkd3d: Ensure that virtual timeline values are updated in-order.
Increment physical value one by one, find the exact timeline value we're
supposed to signal and perform the update.

Select lowest physical timeline value correctly.
Array can be reordered now, so lowest value isn't necessarily first.

Fixes some super weird hangs in Control DXR.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-16 21:41:37 +01:00
Philip Rebohle eab288bb4e vkd3d: Simplify fence worker implementation.
Avoids potential busy-waiting on the driver with WAIT_ANY_BIT.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-16 12:43:24 +01:00
Philip Rebohle 93a80d5eaa vkd3d: Create one fence worker per command queue.
Rather than one per device. This solves issues with D3D12 fences
being signalled too late because the fence worker is waiting on
a different set of semaphores while the fence is being enqueued.

Greatly increases performance in Horizon Zero Dawn and Death
Stranding with multi-queue mode enabled.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-16 12:43:24 +01:00
Philip Rebohle 34bca90a9c vkd3d: Implement internal reference counting for d3d12_fence.
This will be necessary once we introduce fence workers per
command queue, since we cannot reliably store pointers to
queues.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-16 12:43:24 +01:00
Philip Rebohle 7185e9776d vkd3d: Introduce vkd3d_queue_add_wait.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-15 12:52:00 +01:00
Philip Rebohle 724257c0d8 vkd3d: Add multi_queue config flag.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-15 12:52:00 +01:00
Philip Rebohle 1e3c91579e vkd3d: Create one vkd3d queue per Vulkan device queue.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-15 12:52:00 +01:00
Philip Rebohle b0309f6f90 vkd3d: Introduce d3d12_device_allocate_vkd3d_queue.
Replaces d3d12_device_get_vkd3d_queue when mapping D3D12
command queues to Vulkan device queues.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-15 12:52:00 +01:00
Philip Rebohle 4c0a0b0467 vkd3d: Introduce vkd3d_queue_family_info.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-15 12:52:00 +01:00
Hans-Kristian Arntzen 43370c6426 vkd3d: Only enable DXR if requested.
The implemnentation is not complete enough to safely enable it, since
some games will try to create RTPSOs by default, leading to crashes.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-12 12:20:10 +01:00
Philip Rebohle 2ef8106136 vkd3d: Optimize sparse binding for buffers and full subresources.
Compacts ranges and only issues one bind for buffer ranges and
full subresource updates, rather than one bind per tile.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-10 13:18:44 +01:00
Philip Rebohle ead9f2d620 vkd3d: Store subresource index in d3d12_sparse_image_region.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-03-10 13:18:44 +01:00
Hans-Kristian Arntzen 13d132f1c4 vkd3d: Add support for hoisting CBV descriptors to push descriptors.
Bindless CBV is *pretty* bad on NVIDIA, so add a code path which can
promote descriptor table CBVs into push descriptors.

We can safely do this with Root Signature 1.1 STATIC or
the somewhat obscure STATIC_KEEPING_BUFFER_BOUNDS_CHECKS.

With VOLATILE, which basically all titles are using,
we can still force this behavior through a config flag,
but this is an incorrect speed hack. It works in most
titles however, since bindless CBV is exceptionally rare.

We only hoist descriptors when the root signature range has 1 descriptor
anyway, so we should avoid any reasonable bindless scenario.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 11:46:05 +01:00
Hans-Kristian Arntzen d758a6e296 vkd3d: Convert Root Signatures to 1.1.
We will be able make use of the use STATIC vs VOLATILE flags.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 11:46:05 +01:00
Hans-Kristian Arntzen c409d0f30a vkd3d: Optimize R32UI texel buffer creation.
There is no need to scan through the Vulkan format list,
especially since texel buffer creation happens in the hot path
in cases where we know we need to create R32UI texel buffer views.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-10 11:46:05 +01:00
Hans-Kristian Arntzen c351dfc8d3 vkd3d: Remove dead code from d3d12_command_list.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-05 15:49:28 +01:00
Hans-Kristian Arntzen b5d433baaa vkd3d: Implement RTAS clone and compact copy operations.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-04 16:30:29 +01:00
Hans-Kristian Arntzen 031ad9e139 vkd3d: Track dynamic pipeline stack size
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-03 19:19:47 +01:00
Hans-Kristian Arntzen eeaca4a500 vkd3d: Pass down raygen pipeline layout to command list.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-03 19:19:47 +01:00
Hans-Kristian Arntzen 0b161f5693 vkd3d: Implement SetPipelineState1.
Refactor push constant invalidation to SetPipelineState,
it is technically more correct to only invalidate when actually pushing
constants, but we need to do full state invalidation when transitioning
between RT pipelines and non-RT pipelines due to bind point aliasing
shenanigans in D3D12, so it makes more sense to invalidate state based
on active bind point there.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-03 19:19:47 +01:00
Hans-Kristian Arntzen 9ff7b82235 vkd3d: Rename VKD3D_DESCRIPTOR_FLAG_UAV_COUNTER to RAW_VA_BUFFER.
We're going to place acceleration structures here.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-03-03 19:19:47 +01:00
Joshua Ashton 3224688295 vkd3d: Enable VK_EXT_debug_utils conditionally
Enabling VK_EXT_debug_utils comes at some overhead in Wine due to the object tracking required. There is also likely a non-zero overhead in some native implementations also.

By enabling this conditionally, we can also avoid additional overhead from apps that set debug labels on both the Vulkan and front-end side.

The default condition is to enable it when building with Renderdoc integration or in debug builds.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-02 11:09:29 +01:00
Joshua Ashton 4c6f5375a6 vkd3d: Refactor config_flags to be global rather than instance state
Makes it so we can access it in code where we have no concept of a device/instance.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-03-02 11:09:29 +01:00
Hans-Kristian Arntzen 89fbe334df vkd3d: Redirect push constants to their bind point stages.
Gives a massive boost on NVIDIA for some reason.
RADV defers push constant update, so ALL_STAGES doesn't have
that much of a perf hit.

~20% uplift in RE2, ~5% uplift in CP77 from some quick and dirty testing.
Seems to be heavily content dependent either way.

Also a bug fix, since we would clobber graphics push constants from
compute and vice versa if both graphics and compute used the same root
signature.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-26 17:06:18 +01:00
Joshua Ashton 8c9527cdf7 vkd3d: Refactor SetName implementation
As per MSDN, SetName is just a wrapper around SetPrivateData and a specific GUID.

Some apps and tools will use this to retrieve their name back.

So instead, just forward the name to Vulkan in the SetPrivateData call.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-25 21:51:43 +01:00
Hans-Kristian Arntzen be9c376fde vkd3d: Implement postbuild info queries.
Can only support a subset in Vulkan without extra heroics. The DXR API
lets you query things that you technically should know apriori in the
application. We might need to allocate some side-channel buffers on
demand, but let's defer that until actually needed ... :\

DXR is also very awkward in that we have a query which is resolved in
UNORDERED_ACCESS state instead of COPY_DEST state, so we'll have to
ping-pong through some barriers redundantly.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen 4365f9962f vkd3d: Allocate query pools based on type index instead of D3D12 type.
Postbuild info is a query in Vulkan, but not so in D3D12.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen 3353ed14de vkd3d: Implement RTAS object creation.
When building acceleration structures, we need to have an
VkAccelerationStructureKHR object, but the D3D12 API just uses a plain
VA = ID3D12Resource::GetGPUVA() + offset.

For this to work, we need to resolve the VA back to VkBuffer + offset.
The only VkBuffer we can lookup is the original backing memory
allocation in the VA map, and that allocation itself must own a view
map, since we cannot tie the VA to any specific ID3D12Resource.

Since creating an RTAS is not the common path, we allocate the view map
on-demand with CAS.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen 221a658884 vkd3d: Mark resources as being RTAS depending on initial resource state.
RTAS must stay in this resource state forever. The only way to
synchronize them is UAV barriers.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen 2afe25c0c8 vkd3d: Implement GetRaytracingAccelerationStructurePrebuildInfo.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen d773e67fff vkd3d: Add helper query to check if RT should be used.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-25 16:14:16 +01:00
Hans-Kristian Arntzen 15e36a0430 vkd3d: Use virtual VAs for descriptor heap GPU VAs.
Allows local root signatures to work correctly and is also a good
optimization since we no longer need to dereference memory (potentially
cold cache lines) to figure out heap offset in command buffer.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-23 12:28:19 +01:00
Hans-Kristian Arntzen 1586a75ada vkd3d: Align d3d12_desc to 64 bytes.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-23 12:28:19 +01:00
Hans-Kristian Arntzen 0c94e07ab2 vkd3d: Elide timeline semaphore waits which can be satisfied implicitly.
If we're signalling and waiting on same physical queue (always true for
current SINGLE_QUEUE define), we can rely on submission boundary
synchronization which doesn't require any extra submissions to resolve.

Avoids awkward GPU driver bubbles with back to back signal -> wait pairs
with timeline.

Observed 2% GPU uplift on RE2 on AMD.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-22 13:00:22 +01:00
Philip Rebohle f6c6a76735 vkd3d: Store original heap flags in d3d12_resource again.
Otherwise, when suballocating memory, GetHeapProperties may
not return the exact same set of flags if we ignore flags
when looking up suitable chunks.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-19 20:18:24 +01:00
Philip Rebohle be080edc7f vkd3d: Remove vkd3d_allocate_resource_memory.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-19 19:51:44 +01:00
Philip Rebohle a1e5b78bc4 vkd3d: Suballocate committed images if possible and if supported by the driver.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-19 19:51:44 +01:00
Philip Rebohle d6a4826099 vkd3d: Remove heap_offset member from d3d12_resource.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 56ff4622b6 vkd3d: Remove cookie member from d3d12_resource.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 6e81621b82 vkd3d: Remove gpu_address member from d3d12_resource.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 833d7e207c vkd3d: Remove vk_buffer/vk_image union from d3d12_resource.
Use the unique_resource struct instead.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 5a0a5ef44b vkd3d: Remove unused resource flags and rename SPARSE -> RESERVED.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 6a34d3d204 vkd3d: Remove _2 suffix from memory allocation functions.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 53f6a9c78a vkd3d: Rename _2 suffix from resource creation functions.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle a2e14d7d1d vkd3d: Remove _2 suffix from d3d12_heap_2 and related functions.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 6f8bb2a4c0 vkd3d: Use vkd3d_allocate_device_memory_2 for sparse metadata.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 12f0c11c7f vkd3d: Simplify vkd3d_allocate_image_memory helper.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle ab2c190da5 vkd3d: Simplify vkd3d_allocate_buffer_memory helper.
This is still useful as a low-level memory allocation function when
we don't want to bother with buffer offsets or D3D12 validation.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle db2e0c7587 vkd3d: Remove vkd3d_gpu_va_allocator.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 8826f3c5bc vkd3d: Remove d3d12_heap and old resource creation functions.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle 9792b02b26 vkd3d: Use vkd3d_memory_allocation for scratch buffers.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-18 14:17:22 +01:00
Philip Rebohle e12afd31d9 vkd3d: Actually use VKD3D_VA_BLOCK_COUNT.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-17 16:38:47 +01:00
Hans-Kristian Arntzen 7051bf76f7 vkd3d: Fix validation errors with KHR_fragment_shading_rate.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-16 16:07:55 +00:00
Philip Rebohle 4d68130be7 vkd3d: Add functionality to clear newly allocated memory.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-16 16:06:26 +01:00
Philip Rebohle 78713062fe vkd3d: Introduce unique_queue_mask.
Has one bit set for each vkd3d_queue_family that points to a
unique queue. This can be used to iterate over device queues
without having to check for duplicates manually.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-16 16:06:26 +01:00
Philip Rebohle 812c82f8ac vkd3d: Introduce VKD3D_QUEUE_FAMILY_INTERNAL_COMPUTE.
This needs a rework when we re-enable multi-queue support.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-16 16:06:26 +01:00
Philip Rebohle ba632148d7 vkd3d: Add new functions to create and destroy resources.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 17:04:52 +01:00
Philip Rebohle 22f61611d1 vkd3d: Add d3d12_heap_2.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 17:04:52 +01:00
Philip Rebohle 229273fb3b vkd3d: Add memory allocator instance to device.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 17:04:52 +01:00
Philip Rebohle 8f6e94dc30 vkd3d: Suballocate small allocations from larger chunks.
This is necessary to keep the amount of allocated memory manageable
in games that allocate a lot of small heaps or committed resources.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 16:38:16 +01:00
Philip Rebohle d65363b6b6 vkd3d: Add VA map to memory allocator.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 15:19:11 +01:00
Philip Rebohle 7c017c1dba vkd3d: Add VA->resource map and new VA allocator.
This is designed to work with actual device addresses if supported by
the Vulkan implementation.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 15:19:11 +01:00
Philip Rebohle f536daaacb vkd3d: Introduce new memory allocation functions.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 15:19:11 +01:00
Philip Rebohle 417b3b746e vkd3d: Introduce vkd3d_allocate_cookie.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-02-15 14:04:16 +01:00
Joshua Ashton c0d4ead8ca vkd3d: Implement TIER_1 variable rate shading
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-12 13:39:05 +01:00
Joshua Ashton fccbd3b5e2 vkd3d: Eliminate wchar_size, use UTF-16 string literals
Achieves this with C standard stuff alone, and no compiler hacks.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-09 11:26:28 +01:00
Hans-Kristian Arntzen c558c8f423 vkd3d: Implement Get*StackSize().
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen 13b737214b vkd3d: Remove owned root signatures.
Apparently the docs are lying and RTPSO does not hold references to the
root signatures after all.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen bfe9a39c3b vkd3d: Implement the basics of RTPSO.
Implement enough that the test case compiles correctly.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen 1784351dcf vkd3d-shader: Move root parameter structs to vkd3d-shader.
Need it here since local root signatures need to know
the physical layout of the record buffer up front.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Hans-Kristian Arntzen fdcf583cbc vkd3d: Rename COUNTER buffer to AUX_BUFFER.
We will use the same pointer buffer to handle acceleration structures,
so unify this buffer under a new name. Simplifies some of the binding
code since SRV path and UAV path looks more similar now.

Only difference is that UAV path uses BDA -> uint32_t,
and SRV uses BDA -> RTAccelerationStructure.

RT requires BDA, so the fallback descriptor set (storage texel buffer) is never used for RT.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-02-05 10:05:07 +01:00
Joshua Ashton 51bf939743 vkd3d: Implement DXGI_FORMAT_B4G4R4A4_UNORM
Uses VK_EXT_4444_formats.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-02-04 12:04:10 +01:00
Hans-Kristian Arntzen c8f8b24674 vkd3d: Enable ray tracing extensions.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-26 15:00:15 +01:00
Hans-Kristian Arntzen e89c286075 vkd3d: Report OPTIONS7 features.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-01-26 15:00:15 +01:00
Philip Rebohle 7b524590ab vkd3d: Introduce d3d12_query_heap_type_is_inline.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-01-19 14:11:46 +01:00