477 Commits

Author SHA1 Message Date
Philip Rebohle ab111dcdbe vkd3d: Don't use vkd3d_get_typeless_format to determine shader copy usage.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-26 16:51:01 +01:00
Philip Rebohle 99d949f5fb vkd3d: Fix enablement of MUTABLE_FORMAT_BIT and EXTENDED_USAGE_BIT.
We previously did not take into account the new relaxed format compatibility
rules that we allow with CastingFullyTypedFormatSupported being supported.

Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-26 16:51:01 +01:00
Philip Rebohle 9624102dcb vkd3d: Rework format compatibility lists.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-26 16:51:01 +01:00
Philip Rebohle 3b6a4ab988 vkd3d: Implement ID3D12Device8 and ID3D12Resource2.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-11-19 14:57:51 +01:00
Joshua Ashton 046524f2a1 vkd3d: Implement MinLODClamp using VK_EXT_image_view_min_lod
Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-11-17 20:51:20 +01:00
Hans-Kristian Arntzen 3fefc540c8 vkd3d: Handle 64KB_UNDEFINED_SWIZZLE.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-11-12 10:32:13 +01:00
Hans-Kristian Arntzen dda02faf89 vkd3d: Pad reserved resources to 64k alignment.
Fix GPU crashes when attempting to bind non-aligned reserved resource.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-18 14:58:34 +02:00
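Padding a size up to a 64 KiB boundary, as the commit above describes, can be sketched roughly as follows. The names `align_up` and `RESERVED_RESOURCE_ALIGNMENT` are illustrative assumptions and are not taken from the vkd3d source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant: the 64 KiB alignment mentioned in the commit message. */
#define RESERVED_RESOURCE_ALIGNMENT 0x10000u

/* Round size up to the next multiple of alignment.
 * alignment must be a non-zero power of two. */
static uint64_t align_up(uint64_t size, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}
```

A size that is already aligned is returned unchanged, so applying the padding twice is harmless.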
Hans-Kristian Arntzen 26dc9e7da5 vkd3d: Allow CreateHeap to fail in certain fallback situations.
If we deduce that fallback heap allocation is impossible, we will accept
this, and defer allocation to CreatePlacedResource() instead where we make a committed resource.
This breaks aliasing, but in practice, this situation will only arise for render
targets, and it's not like we have a choice in the matter here on NV :\

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-07 15:32:54 +02:00
Hans-Kristian Arntzen 7ee8eac818 vkd3d: Add allocation flag for DEDICATED.
When allocating dedicated memory, ignore heap_flag requirements we
deduce from memory info. Any memory type is allowed. This is important
on NV when allocating fallback render targets.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-10-07 15:32:54 +02:00
Hans-Kristian Arntzen 0c2ddb89cd vkd3d: Add CONFIG for forced CACHED memory.
Very useful for capturing. Speeds up a ton.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-27 14:48:26 +02:00
Hans-Kristian Arntzen 6863f1c6a8 vkd3d: Fix test suite regression on NV.
Fix failure in test_create_heap where a TIER_2 host visible heap was
attempted, but failed due to recent DEATHLOOP fixes.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-24 16:48:34 +02:00
Joshua Ashton cabc31fc4c vkd3d: Move ID3D12Device impl_froms to header
Basic casts should not be function calls.
2021-09-23 12:12:13 +02:00
Joshua Ashton 875fbe5f50 vkd3d: Move ID3D12QueryHeap impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton 2334c136e3 vkd3d: Move ID3D12DescriptorHeap impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton 8d5308c9a1 vkd3d: Move ID3D12Resource impl_froms to header
Basic casts should not be function calls.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Joshua Ashton e597adb83a vkd3d: Move d3d12_query_heap_type_get_data_size to header
This should be inlined.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-09-23 12:12:13 +02:00
Conor McCarthy 446c7423ce vkd3d: Return E_INVALIDARG for texture creation if SampleDesc.Count == 0.
Windows returns E_INVALIDARG at least on AMD and Intel.
Psychonauts 2 seems to use this as a de facto "do not create"
value, and reasonable vram usage depends on the call failing.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
2021-09-23 11:00:04 +01:00
Conor McCarthy d366ba47ac Revert "vkd3d: Support SAMPLE_DESC.Count of 0"
Windows returns E_INVALIDARG in this case.

Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
2021-09-23 11:00:04 +01:00
Georg Lehmann cf4fb44629 vkd3d: Remove almost unused variable.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
2021-09-21 11:22:34 +01:00
Hans-Kristian Arntzen 173b8ecef0 vkd3d: Add workaround for DEATHLOOP.
Game attempts to create a host visible resource with
ALLOW_RENDER_TARGET flag. We cannot make this work on NVIDIA, but the
game never seems to actually create an RTV, so as a workaround, nop out
the flag, which does make it work after all :3

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-17 14:21:09 +02:00
Hans-Kristian Arntzen a8f623e60d vkd3d: Negate upload_hvv config.
Enable resizable BAR style allocations by default, and add option to
disable it.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
Hans-Kristian Arntzen 710fa98918 vkd3d: Setup resizable bar budget.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
Hans-Kristian Arntzen cec741706d vkd3d: Refactor out memory topology queries.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
Hans-Kristian Arntzen abdaeb136d vkd3d: Add a memory budget per memory type.
For resizable BAR, we don't want to endlessly promote UPLOAD heaps to
BAR since VRAM is precious. The aim is to set a fixed budget where we
can keep allocating until full, at which point we fall back to plain HOST.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
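The budgeting idea in the commit above (keep allocating from BAR until a fixed budget is full, then fall back to plain HOST memory) might look something like this minimal sketch. All type and function names here are hypothetical; the real vkd3d allocator tracks more state than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-memory-type budget tracker. */
struct memory_budget
{
    uint64_t limit; /* fixed budget in bytes */
    uint64_t used;  /* bytes currently charged against the budget */
};

/* Charge size bytes to the budget. Returns false, leaving the budget
 * untouched, if the allocation would exceed the limit. */
static bool memory_budget_try_charge(struct memory_budget *budget, uint64_t size)
{
    if (size > budget->limit - budget->used)
        return false;
    budget->used += size;
    return true;
}

/* Return size bytes to the budget when the allocation is freed. */
static void memory_budget_release(struct memory_budget *budget, uint64_t size)
{
    assert(size <= budget->used);
    budget->used -= size;
}

enum upload_placement
{
    UPLOAD_PLACEMENT_BAR,  /* device-local, host-visible memory */
    UPLOAD_PLACEMENT_HOST, /* plain host memory */
};

/* Promote an UPLOAD allocation to BAR while budget remains, else use HOST. */
static enum upload_placement choose_upload_placement(struct memory_budget *bar_budget, uint64_t size)
{
    return memory_budget_try_charge(bar_budget, size)
            ? UPLOAD_PLACEMENT_BAR : UPLOAD_PLACEMENT_HOST;
}
```

Routing every charge and release through one pair of functions mirrors the earlier refactor that centralizes VkDeviceMemory allocation and freeing.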
Hans-Kristian Arntzen cb94cfd10c vkd3d: Fix silly typo in global mask.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
Hans-Kristian Arntzen 69d4f55219 vkd3d: Refactor VkDeviceMemory allocation to keep track of type/size.
We will need to consider some form of budgeting, so make sure that all
allocation and freeing is done in a central place.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 16:10:57 +02:00
Hans-Kristian Arntzen 8d49d3e9ae vkd3d: Add extra validation for mapping textures.
D3D12 validation layers complain if you try to map mipmapped 3D volumes
for ... some reason. The error is very explicit, so I assume it's
intentional :)

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 15:35:57 +02:00
Hans-Kristian Arntzen 41295eff6c vkd3d: Consider CPU availability when selecting memory types.
Need to consider that based on host visibility requirements, we need to
select either LINEAR or OPTIMAL image types, and those tiling modes can
have different memory requirements.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 15:35:57 +02:00
Hans-Kristian Arntzen 132638be67 vkd3d: Add more logging when linear image allocation fails.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 15:35:57 +02:00
Hans-Kristian Arntzen 50f2c35b44 vkd3d: Add stricter ROW_MAJOR texture validation.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 15:35:57 +02:00
Hans-Kristian Arntzen 961fef84de vkd3d: Allow map of texture as long as ppData is NULL.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-16 15:35:57 +02:00
Hans-Kristian Arntzen fa1d82e141 vkd3d: Fix regressions when introducing null-copy elision.
Need to initialize the set mask so that copies happen properly
on default-initialized descriptors. Also, move the current_null_type to
metadata so that it's properly copied on descriptor copy.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-09-03 12:24:26 +02:00
Rodrigo Locatti b4cb5a37f8 vkd3d: Optimize repeated null descriptor updates
There are titles clearing the same descriptors constantly.
This leads to unnecessary updates that can become costly.

This commit introduces a new flag to track when D3D12 descriptors are
not null, and skips clearing them if they are already null.
Descriptors are assumed to be null by default.

This fixes a performance regression introduced by
9983a1720f

Signed-off-by: Rodrigo Locatti <rlocatti@nvidia.com>
2021-09-02 21:21:34 +02:00
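The flag-based elision described in the commit above can be illustrated with a small sketch. The `descriptor` struct and its functions are hypothetical stand-ins, and the `writes` counter only exists to make the skipped updates observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor with a "not null" tracking flag. */
struct descriptor
{
    bool non_null;       /* false by default: descriptors start out null */
    unsigned int writes; /* counts simulated descriptor updates */
};

/* Writing a real view always updates the descriptor and marks it non-null. */
static void descriptor_write_view(struct descriptor *d)
{
    assert(d != NULL);
    d->writes++;
    d->non_null = true;
}

/* Writing a null descriptor is skipped when the descriptor is already null. */
static void descriptor_write_null(struct descriptor *d)
{
    assert(d != NULL);
    if (!d->non_null)
        return;
    d->writes++;
    d->non_null = false;
}
```

With this scheme a title that clears the same slot every frame pays for the update only the first time after a real view was written.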
Joshua Ashton e9f04e8e0e vkd3d: Support SAMPLE_DESC.Count of 0
Psychonauts 2 uses a SAMPLE_DESC.Count of 0 for some things, which
previously was forcing it down the MSAA alignment placement path.

Found from playing a native D3D12 apitrace back and seeing
the log spam.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
2021-08-26 14:23:37 +02:00
Hans-Kristian Arntzen f3fd2bf70b vkd3d: Use BAR memory type for descriptor heap helpers.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-23 13:24:43 +02:00
Hans-Kristian Arntzen 7e165238e6 vkd3d: Allow all memory types if UPLOAD_HVV is used.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-08-23 13:24:43 +02:00
Hans-Kristian Arntzen 3915090c12 vkd3d: Track depth-stencil image layouts over a command buffer.
Goal here is to avoid unnecessary image layout transitions when render
passes toggle depth-stencil PSO states. Since we cannot know which
states a resource is in, we have to be conservative, and assume that
shader reads *could* happen.

The best effort we can do is to detect when writes happen to a DSV
resource. In this scenario, we can deduce that the aspect cannot be
read, since DEPTH_WRITE | RESOURCE state is not allowed.

To make the tracking somewhat sane, we only promote to OPTIMAL if an
entire image's worth of subresources for a given aspect is transitioned.
The common case for depth-stencil images is 1 mip / 1 layer anyways.

Some other changes are required here:
- Instead of common_layout for the depth image, we need to consult the
  command list, which might promote the layout to optimal.
- We make use of render pass compatibility rules which state that we can
  change attachment reference layouts as well as initial/finalLayout.
  To make this change, a pipeline will fill in a
  vkd3d_render_pass_compat struct.
- A command list has a dsv_plane_optimal_mask which keeps track
  of the plane aspects we have promoted to OPTIMAL, and we know cannot
  be read by shaders.
  The desired optimal mask is (existing optimal | PSO write).
  The initial existing optimal is inherited from the command list's
  tracker.
- RTV/DSV/views no longer keep track of VkImageLayout. This is
  unnecessary since we always deduce image layout based on context.

Overall, this shows a massive gain in HZD benchmark (RADV, 1440p ultimate, ~16% FPS on RX 6800).

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 15:45:46 +02:00
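The mask update stated in the commit above, desired optimal = (existing optimal | PSO write), reduces to a bitwise OR over plane aspects. The sketch below uses made-up bit names; only the formula itself comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical plane aspect bits for a depth-stencil image. */
#define DSV_PLANE_DEPTH   0x1u
#define DSV_PLANE_STENCIL 0x2u
#define DSV_PLANE_ALL     (DSV_PLANE_DEPTH | DSV_PLANE_STENCIL)

/* A plane stays OPTIMAL once promoted, and any plane the PSO writes is
 * promoted as well, since a written aspect cannot also be read by shaders. */
static uint32_t dsv_desired_optimal_mask(uint32_t existing_optimal, uint32_t pso_write_mask)
{
    assert(((existing_optimal | pso_write_mask) & ~DSV_PLANE_ALL) == 0);
    return existing_optimal | pso_write_mask;
}
```

A pipeline that writes only depth leaves the stencil plane in its conservative layout unless an earlier pass already promoted it.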
Hans-Kristian Arntzen 35c555c479 vkd3d: Use more correct fallback path for minLODClamp.
The clamp is absolute, not relative to baseMip. Also avoids validation
error and potential crash when LODClamp > numLevels.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-07 12:50:23 +02:00
Hans-Kristian Arntzen cf632186fd vkd3d: Add workaround for MinLODClamp.
Not correct, will need spec additions to handle it properly.
Fixes ground rendering in DIRT 5.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-06 16:45:19 +02:00
Hans-Kristian Arntzen 7a00e56792 vkd3d: Handle multiple planes in d3d12_resource_get_subresource_count.
Separate out an explicit per_plane query for the cases where we need it.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-07-02 14:16:18 +02:00
Hans-Kristian Arntzen c1860a1ead vkd3d: Add VKD3D_CONFIG flags for forcing EXCLUSIVE queue modes.
Helps in some cases, but we cannot do this by default :(

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-29 12:24:24 +02:00
Hans-Kristian Arntzen 8225edc726 vkd3d: Rewrite resource state implementation.
- Honor resource barriers for resource states which cannot automatically
  decay or promote. This includes COLOR_ATTACHMENT, UNORDERED_ACCESS and
  VRS image. If SIMULTANEOUS_ACCESS is used, we can still promote, and
  we handle that by setting common layout to GENERAL for these resources.

- Avoid redundant barriers in render passes since normal resource
  barriers will always make sure we are already in
  COLOR_ATTACHMENT_OPTIMAL.

- Do not force GENERAL layout if resource has UNORDERED_ACCESS flag set.
  As this is not a promotable state, we have to explicitly transition
  into it. I tested this on validation layers, where even COMMON state
  refuses to promote to UAV state. The exception here of course is
  SIMULTANEOUS_ACCESS, but we handle that properly now.

- Verify that UAV or SIMULTANEOUS access is not used together with DSV
  state. This is explicitly banned in the API docs.

- Actually emit image barriers. Batch the image transitions as that's
  what D3D12 docs encourage app developers to do, and it also expects
  that drivers can optimize this. Ensure that we respect the in-order
  resource barrier rules by splitting batches if there are overlaps in
  the transitions.

- Ensure that correct image layout is used when clearing a suspended
  render pass attachment.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:32:48 +02:00
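The batching rule from the commit above (collect image transitions, but split the batch when a new transition overlaps one already queued, so that in-order barrier semantics hold) could be sketched as below. The structures, the fixed capacity, and the flush counter are assumptions for illustration and do not reflect the actual vkd3d data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BATCHED_TRANSITIONS 16

/* Hypothetical transition covering subresources [first, first + count). */
struct transition
{
    unsigned int image; /* stand-in for an image handle */
    uint32_t first;
    uint32_t count;
};

struct barrier_batch
{
    struct transition transitions[MAX_BATCHED_TRANSITIONS];
    size_t count;
    unsigned int flushes; /* number of barriers that would have been recorded */
};

static bool transitions_overlap(const struct transition *a, const struct transition *b)
{
    return a->image == b->image
            && a->first < b->first + b->count
            && b->first < a->first + a->count;
}

/* Emit everything queued so far as a single barrier. */
static void barrier_batch_flush(struct barrier_batch *batch)
{
    if (!batch->count)
        return;
    /* A real implementation would record one pipeline barrier here. */
    batch->flushes++;
    batch->count = 0;
}

/* Queue a transition, flushing first if it overlaps a queued one or the batch is full. */
static void barrier_batch_add(struct barrier_batch *batch, struct transition t)
{
    size_t i;

    assert(t.count > 0);
    for (i = 0; i < batch->count; i++)
    {
        if (transitions_overlap(&batch->transitions[i], &t))
        {
            barrier_batch_flush(batch);
            break;
        }
    }
    if (batch->count == MAX_BATCHED_TRANSITIONS)
        barrier_batch_flush(batch);
    batch->transitions[batch->count++] = t;
}
```

Transitions on disjoint images or subresources share one barrier, while a second transition of the same subresource forces the earlier batch out first.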
Hans-Kristian Arntzen 177679a766 vkd3d: Add VKD3D_RESOURCE_SIMULTANEOUS_ACCESS.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-22 14:32:48 +02:00
Philip Rebohle 014a3c0b94 vkd3d: Handle plane slice index in descriptor creation.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2021-06-21 21:23:03 +02:00
Samuel Pitoiset 72d9b322b8 vkd3d: reject creating a resource that is placed if the heap is too small
The spec is pretty clear that it's invalid usage. Return E_INVALIDARG
like native drivers.

This is a workaround for the inventory GPU hang with Cyberpunk 2077
which is actually a game bug. Luckily the game handles this error
properly.

The problem is that the game always assumes that an image with 2 mips
is smaller than the same image but with 6 mips. This is not always
true if the swizzle mode is different and a recent Mesa update changed
that. Then the game creates a D3D12 heap that is too small and this
triggered a memory violation and then a GPU hang with RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2021-06-17 16:42:23 +02:00
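The check described in the commit above amounts to verifying that the resource fits inside the heap at the requested offset, with the caller returning E_INVALIDARG when it does not. A minimal, overflow-safe sketch follows; `placed_resource_fits` and its parameters are hypothetical names, not the vkd3d implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if [heap_offset, heap_offset + resource_size) lies within a
 * heap of heap_size bytes. Written to avoid overflow in the addition. */
static bool placed_resource_fits(uint64_t heap_size, uint64_t heap_offset, uint64_t resource_size)
{
    return heap_offset <= heap_size && resource_size <= heap_size - heap_offset;
}

/* Small self-check showing the intended behaviour. */
static void placed_resource_fits_self_check(void)
{
    assert(placed_resource_fits(1024, 512, 512));
    assert(!placed_resource_fits(1024, 512, 513));
    /* offset + size would wrap around with a naive comparison */
    assert(!placed_resource_fits(1024, UINT64_MAX, 2));
}
```

Comparing against `heap_size - heap_offset` instead of adding offset and size keeps the check correct even for very large values.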
Hans-Kristian Arntzen 9983a1720f vkd3d: Splat null descriptors to all sets.
Some games end up writing the wrong descriptor type when using null
descriptors, and to be robust against that, we have to clear out
all descriptors when creating null descriptors.

If we copy a null descriptor, we will also have to copy from all sets.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-07 13:21:31 +02:00
Hans-Kristian Arntzen c7c17d05ed vkd3d: Fix descriptor QA checks for CBV_AS_SSBO.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-06-07 13:21:06 +02:00
Hans-Kristian Arntzen 3c7f188863 vkd3d: Nuke code paths for !nullDescriptor.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-27 10:39:22 +02:00
Hans-Kristian Arntzen a256a9266e vkd3d: Rewrite descriptor QA.
Adds support for GPU-assisted validation of descriptor usage in the
CBV_SRV_UAV heap.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00
Hans-Kristian Arntzen 96a84e2633 vkd3d: Fix build with DESCRIPTOR_QA.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
2021-05-26 17:26:01 +02:00