mirrors/vkd3d-proton

Commit Graph

Author	SHA1	Message	Date
Hans-Kristian Arntzen	33b9166fec	vkd3d: Make device coherency extension optional for breadcrumbs. Some implementation can support marker, but not explicit coherency. Buffer markers are often uncached either way, so should be fine ... Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen	972ce74ac6	vkd3d: When using breadcrumbs, consider that WaitSemaphore can be buggy. Spec says that in device lost, driver must return DEVICE_LOST in finite time, but this does not happen on NV drivers. Use a long timeout instead in this scenario. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:07:56 +01:00
Robin Kertels	5f97d1eb70	vkd3d: Implement NV_checkpoint path for breadcrumbs. Signed-off-by: Robin Kertels <robin.kertels@gmail.com> Co-authored-by: Hans-Kristian Arntzen <post@arntzen-software.no> Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:07:56 +01:00
Robin Kertels	a6ea442819	vkd3d: Enable VK_NV_device_diagnostic_checkpoints. Signed-off-by: Robin Kertels <robin.kertels@gmail.com>	2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen	365dd05557	vkd3d: Add breadcrumbs support. AMD path for this commit. Idea is that we can automatically instrument markers with command list information we can make some sense of in vkd3d-proton. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen	5017b3723c	vkd3d: Enable VK_AMD_device_coherent_memory. For breadcrumbs support, along with buffer marker. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen	6a4f2842cb	cache: Move d3d12_pipeline_library to internal references. Allow us to hold internal magic pipeline libraries without creating cycles. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 12:29:32 +01:00
Hans-Kristian Arntzen	18a5315db4	cache: Refactor lock strategy of internal hashmaps. Rather than having to take writer lock on serialize calls from the outside, we should just take locks when accessing the internal hashmaps instead. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 12:29:32 +01:00
Hans-Kristian Arntzen	7c228139c3	cache: Refactor out pipeline library serialization. If outer code has taken a reader lock, we don't need to lock again. Also allows a reader lock to go GetSerializedSize + Serialize with one reader lock. This will be relevant for magic cache implementation. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 12:29:32 +01:00
Hans-Kristian Arntzen	30b4abcea1	vkd3d: Do not discard images in Clear*View() unless we have to. It's redundant to add an UNDEFINED transition here for committed resources. We need it for sparse and placed resources to handle aliasing rules, but that's it. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-10 15:14:55 +01:00
Hans-Kristian Arntzen	17b1ffb41a	vkd3d: Add path to use GENERAL depth-stencil images. On some implementations, it doesn't matter for performance what we use, and we can avoid a lot of ugly barriers this way. Opt-in to use this extensions on GPUs we know handles it well, otherwise, keep using the tracking paths. With VK_KHR_dynamic_rendering, this is now feasible to do since we no longer have to deal with shenanigans related to VkRenderPass layouts and complicated compatibility rules. To make this work with the existing framework, just need to consider that GENERAL can be a common layout alongside DEPTH_STENCIL_OPTIMAL, which are both common layouts that do not need to be tracked at all. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-10 15:14:55 +01:00
Hans-Kristian Arntzen	f9da3bf564	vkd3d: Add VK_KHR_driver_properties. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-10 15:14:55 +01:00
Hans-Kristian Arntzen	c6149b47cd	cache: Handle ref-count rules for multiple LoadPipeline/StorePipeline. In pipeline libraries, the library holds on to private references of the libraries so that they can be rapidly loaded on-demand. This behavior is verified by API tests. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-09 18:35:09 +01:00
Hans-Kristian Arntzen	cc08339624	vkd3d: Use internal_refcounts for pipeline state. When we store pipeline state in libraries we have to manage lifetime a bit differently, which requires internal refcounts of some sort. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-09 18:35:09 +01:00
Hans-Kristian Arntzen	422f6804fb	vkd3d: Enable VK_KHR_create_renderpass2. Required extension by VK_KHR_fragment_shading_rate and VK_KHR_separate_depth_stencil_layouts, but we don't care about enabling any features or use it directly. Needed to silence validation errors. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-09 16:35:05 +01:00
Georg Lehmann	14a06680d9	vkd3d: Remove unused renderpass remains. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>	2022-03-08 18:34:18 +01:00
Hans-Kristian Arntzen	409dc57645	vkd3d: Properly decay depth-stencil images. When performing a decay of a DSV resource, make sure to transition all subresources, not just the particular aspect being transitioned. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-08 18:11:50 +01:00
Hans-Kristian Arntzen	b330900659	vkd3d: Do not transition all aspects for single subresource. We require separate DS layouts. Fixes validation errors where we transition from read-only, but our neighbor aspect might have been optimal. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-08 18:11:50 +01:00
Philip Rebohle	9a408367dc	vkd3d: Remove render pass cache. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	51e6b2bbbe	vkd3d: Remove render pass from command list state. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	94f82d1085	vkd3d: Get rid of pipeline variant flags. These only existed for VRS attachment, which is no longer necessary with VK_KHR_dynamic_rendering. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	1a68267962	vkd3d: Remove framebuffer list from d3d12_command_allocator. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	c4f88951fc	vkd3d: Use dynamic rendering for regular draw calls. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	9673ac173d	vkd3d: Use dynamic rendering for pipeline creation. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	3783eaf4f7	vkd3d: Implement swap chain blits using dynamic rendering. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	024ef02f9b	vkd3d: Implement meta image copies using dynamic rendering. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	549d4ee63f	vkd3d: Remove render pass list from d3d12_command_allocator. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	6186cc1f0e	vkd3d: Implement clears using dynamic rendering. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	2c92ab7d1e	vkd3d: Enable and require VK_KHR_dynamic_rendering. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Hans-Kristian Arntzen	9fbae668fe	vkd3d: Ensure that all SPIR-V modules are properly cached. When we require inter-stage fixups, we need a solution for partial validity of the cache. Accept the modules all or nothing. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-08 16:43:30 +01:00
Hans-Kristian Arntzen	ce45297695	vkd3d: Enable debug_utils if vk_debug is enabled. Allows debug callbacks to go through in Wine. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-08 16:40:51 +01:00
LemiSt24	c411d0d0c2	vkd3d: Add case for D3D12_STATE_SUBOBJECT_TYPE_GLOBAL_ROOT_SIGNATURE Signed-off-by: LemiSt24 <lennard.strohmeyer@gmail.com>	2022-03-07 16:15:22 +01:00
Hans-Kristian Arntzen	9a63df07b8	vkd3d: Add punchthrough path for descriptor copies. Proves out the viability of this style of implementation. Ideally we'd have a more officially sanctioned way of doing similar things later :) Unfortunately, the overhead removal is too great to ignore on target platform. Makes use of a private (reserved) extension for now ... Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-04 13:34:18 +01:00
Mike Blumenkrantz	1d76803aff	vkd3d: optimize memory access pattern for sampler descriptors. This removes them from the bitscan path. Signed-off-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>	2022-03-01 22:50:45 +01:00
Hans-Kristian Arntzen	dc622fc715	vkd3d: Recycle command pools in Elden Ring. Very churny. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-25 18:40:52 +01:00
Hans-Kristian Arntzen	9817c52d24	vkd3d: Add workaround to ignore mismatch driver/device in PSO library. Elden Ring does not detect the proper error code and create a new pipeline library. Instead, create a fresh new library, which works around the issue. The game has a pattern of LoadPipeline -> if fail -> CreatePSO -> StorePipeline. Sometimes, in the same process it will LoadLibrary from its own cache (could explain some stutters), so it's very useful to have this either way. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-25 14:50:57 +01:00
Hans-Kristian Arntzen	a8229390f9	vkd3d: Add more pipeline_library_log snippets. Hook GetCachedBlob and various attempts to use LoadPipeline. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-25 14:50:57 +01:00
Hans-Kristian Arntzen	12c73ee18a	swapchain: More gracefully handle SURFACE_LOST. Just like handling min/maxImageExtent of 0, we can just fall back to user buffers. Elden Ring hits this case on application teardown. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-25 14:04:06 +01:00
Hans-Kristian Arntzen	f39ece9a7c	vkd3d: Enable performance workarounds for Elden Ring. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-25 13:59:08 +01:00
Hans-Kristian Arntzen	c19eaac376	vkd3d: Add VKD3D_CONFIG option for command pool recycling. Normal behaving apps should not benefit from any of this. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-25 13:59:08 +01:00
Hans-Kristian Arntzen	54fbadcc94	vkd3d: Recycle command pools. Elden Ring in particular spam frees and allocates command pools despite this being a very bad idea. Add a simple 8-entry cache which seems to take care of it. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-25 13:59:08 +01:00
Hans-Kristian Arntzen	4b07535909	vkd3d: Optimize memory access pattern for single descriptor copies. We can mark a descriptor as being SINGLE_DESCRIPTOR, which means we only need one descriptor copy. This way, we can avoid doing somewhat expensive work (every nanosecond counts here): - Bitscan loop - Read deep into d3d12_device guts (often a cache miss). The memory index depends on the bitscan, which causes bubble. When we have a single descriptor, we can just store the binding information inline and avoid this jank. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-25 13:04:43 +01:00
Hans-Kristian Arntzen	84d632f194	vkd3d: Rewrite memory layout for resource descriptors. Tune memory layout so that we can deduce various information without making a single pointer dereference: - d3d12_descriptor_heap* - heap offset - Pointer to various side data structures we need to keep around. Instead of having one big 64 byte data structure with tons of padding, tune it down to 32 + 8 bytes per descriptor of extra dummy data. To make all of this work, use a somewhat clever encoding scheme for CPU VA where lower bits store number of active bits used to encode descriptor offset. From there, we can mask away bits to recover d3d12_descriptor_heap. Metadata is stored inline in one big allocation, and we can just offset from there based on extracted log2i_ceil(descriptor count). Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-25 13:04:43 +01:00
Hans-Kristian Arntzen	b309913b6d	vkd3d: Use unsafe_impl in CopyDescriptorsSimple. This is an ultra-hot path and seems to show up somehow on profile. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-25 13:04:43 +01:00
Hans-Kristian Arntzen	c29d005ef4	vkd3d: Don't enable fast descriptor copy path for descriptor QA. The hooks are in the generic function. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-24 16:42:00 +01:00
Hans-Kristian Arntzen	8a46c21254	vkd3d: Add VKD3D_CONFIG to skip memory allocator clears. For cases where games spam committed allocations and don't use NOT_ZEROED. We still rely on zerovram behavior for initial backing which should be enough in most cases. Strictly speaking however, we are forced to clear the allocations every time if application does not use the flag correctly. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-24 12:52:05 +01:00
Hans-Kristian Arntzen	76ca492a39	vkd3d: Add some debug logging for when clear passes happen. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-24 12:52:05 +01:00
Hans-Kristian Arntzen	83c4e62660	vkd3d: Bump suballocation limit to 2 MiB. This is a more principled limit since that's the huge page size. Avoids some allocation spam. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-24 12:14:22 +01:00
Hans-Kristian Arntzen	4bea653504	vkd3d: Fix CopyTiles for suballocated linear resources. Forgot to offset buffer offset. Fun! Found when bumping VA allocation limit to 2 MiB instead of 1 MiB. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-24 12:14:22 +01:00
Hans-Kristian Arntzen	edbf49aad4	vkd3d: Support opt-in to single MUTABLE set. Useful for Intel since Intel hardware cannot support more than 1M descriptors in general, and opting in to correct behavior should improve CPU overhead as well when copying descriptors. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-21 17:08:25 +01:00
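
Commit 84d632f194 above describes a CPU VA encoding in which the lower bits store the number of bits used for the descriptor offset, so that the owning heap can be recovered by masking alone. The following is a minimal sketch of that idea in C, not the actual vkd3d-proton code. It assumes a 32-byte descriptor stride (leaving 5 free low bits for the tag), a heap base aligned to the power-of-two span it covers, and a 64-bit uintptr_t. All names (encode_va, decode_heap_base, decode_index, log2_ceil) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: each descriptor slot is 32 bytes, so the low 5 bits of any
 * slot address are zero and can carry a small tag. */
#define DESC_SIZE_LOG2 5u
#define TAG_MASK (((uintptr_t)1 << DESC_SIZE_LOG2) - 1)

/* Smallest n such that (1 << n) >= v; hypothetical stand-in for the
 * log2i_ceil mentioned in the commit message. */
static unsigned log2_ceil(uint32_t v)
{
    unsigned bits = 0;
    while (((uint64_t)1 << bits) < v)
        bits++;
    return bits;
}

/* Pack heap base, descriptor index and the index bit count into one VA. */
static uintptr_t encode_va(uintptr_t heap_base, unsigned index_bits, uint32_t index)
{
    uintptr_t span = (uintptr_t)1 << (index_bits + DESC_SIZE_LOG2);
    assert(index_bits <= TAG_MASK);          /* tag must fit in the free bits */
    assert((heap_base & (span - 1)) == 0);   /* base aligned to the full span */
    assert(((uint64_t)index >> index_bits) == 0);
    return heap_base + ((uintptr_t)index << DESC_SIZE_LOG2) + index_bits;
}

/* Recover the heap base without dereferencing anything: read the tag,
 * then mask away the tag bits and the index bits. */
static uintptr_t decode_heap_base(uintptr_t va)
{
    unsigned index_bits = (unsigned)(va & TAG_MASK);
    uintptr_t span = (uintptr_t)1 << (index_bits + DESC_SIZE_LOG2);
    return va & ~(span - 1);
}

/* Recover the descriptor index within the heap. */
static uint32_t decode_index(uintptr_t va)
{
    unsigned index_bits = (unsigned)(va & TAG_MASK);
    uintptr_t span = (uintptr_t)1 << (index_bits + DESC_SIZE_LOG2);
    return (uint32_t)((va & (span - 1)) >> DESC_SIZE_LOG2);
}
```

In this sketch the heap base doubles as the place to find per-heap metadata, in the spirit of the inline metadata in one big allocation that the commit message mentions. The real layout and bit assignments in vkd3d-proton may differ.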

1974 Commits