mirrors/vkd3d-proton

Commit Graph

Author	SHA1	Message	Date
Hans-Kristian Arntzen	5eeca3c69d	vkd3d: Enable and use VK_KHR_fragment_shader_barycentric. For now, just keep the NV path as well. It's the exact same extension basically as the KHR one. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-05-31 14:12:54 +02:00
Hans-Kristian Arntzen	f964532619	vkd3d: Implement extended DXR queries. Requires ray_tracing_maintenance1. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-05-30 20:26:50 +02:00
Robin Kertels	8ac7aaca99	vkd3d: Enable VK_KHR_ray_tracing_maintenance1. Signed-off-by: Robin Kertels <robin.kertels@gmail.com>	2022-05-11 19:11:01 +02:00
Philip Rebohle	beb58f8472	vkd3d: Enable and require VK_KHR_maintenance4. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-04-22 11:36:02 +02:00
Philip Rebohle	e7a6af4971	vkd3d: Use texel buffer views for UAV clears with buffer to image copy. Allows this to more easily work with more formats. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-04-21 13:51:58 +02:00
Hans-Kristian Arntzen	ae0dafa3a1	cache: Attempt to use disk cache instead when appropriate. When the disk cache is used, the cache we give back to applications is a dummy. Therefore, try to use the disk cache blob if we detect a useless application blob. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen	6c8542f7d6	vkd3d: Make use of internal pipeline library if we're asked to. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen	2dcb1e2efc	cache: Implement an on-disk pipeline library. With VKD3D_SHADER_CACHE_PATH, we can add automatic serialization of pipeline blobs to disk, even for games which do not make any use of GetCachedBlob of ID3D12PipelineLibrary interfaces. Most applications expect drivers to have some kind of internal caching. This is implemented as a system where a disk thread will manage a private ID3D12PipelineLibrary, and new PSOs are automatically committed to this library. PSO creation will also consult this internal pipeline library if applications do not provide their own blob. The strategy for updating the cache is based on a read-only cache which is mmaped from disk, with an exclusive write-only portion for new blobs, which ensures some degree of safety if there are multiple concurrent processes using the same cache. The memory layout of the disk cache is optimized to be very efficient for appending new blobs, just simple fwrites + fflush. The format is also robust against sliced files, which solves the problem where applications tear down without destroying the D3D12 device properly. This structure is very similar to Fossilize, and in fact the idea is to move towards actually using the Fossilize format directly later. This implementation prepares us for this scenario where e.g. Steam could potentially manage the vkd3d-proton cache. The main complication in this implementation is that we have to merge the read-only and write caches. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen	3095ed84d3	cache: Add concept of internal pipeline libraries. For internal pipeline libraries, we want a somewhat different strategy. - PSOs are keyed by hash instead of user key. - We want the option to conditionally store SPIR-V and PSO blobs. For internal caches, there isn't much of a reason to store PSO blobs since the disk cache is going to be primed anyways. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-05 14:12:20 +02:00
Hans-Kristian Arntzen	637834dc75	vkd3d: Make private_root_signatures actually private. Makes sure that we drop private root signature device references when public pipeline state refcount hits 0. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-05 14:12:20 +02:00
Joshua Ashton	2ed513b99a	vkd3d: Remove VKD3D_MAX_DYNAMIC_STATE_COUNT This was off by one, at some point, which could cause a stack buffer overrun which is naughty. Replace this with just an ARRAY_SIZE on the dynamic_state_list for the array size. Signed-off-by: Joshua Ashton <joshua@froggi.es>	2022-04-01 15:19:18 +02:00
Hans-Kristian Arntzen	241078d7e8	vkd3d: Add scalar UBO layout requirement for SM 6.0. Needed to support SM 6.0 CBufferLoad. This path is mostly unused since it's opt-in in DXC and horribly broken ... Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-30 20:13:32 +02:00
Hans-Kristian Arntzen	6f43f450c8	vkd3d: Disable primitive restart when using non-compatible topologies. Primitive restart is only used for strip primitive types, and must be ignored for lists. Use and require extended_dynamic_state2 for this purpose. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-30 16:12:16 +02:00
Hans-Kristian Arntzen	1b5f7e8fc3	vkd3d: Use VkImageViewCreateInfo correctly. For EXTENDED_USAGE, we still need to restrict image usage when creating concrete views. Use VkImageViewUsageCreateInfo to restrict usage flags to the kind of view we're creating. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-24 17:55:32 +01:00
Hans-Kristian Arntzen	cf65a78570	vkd3d: Rename DSV UNKNOWN workaround query. Make it more obvious what it's really trying to check. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-23 22:36:00 +01:00
Hans-Kristian Arntzen	6e915dd2c0	vkd3d: Use rt_count as basis for binding RTVs. Found some validation errors where rt_count != rtv_active_mask, and blending used rt_count instead of rtv_active_mask. If shader renders to a NULL attachment, we must make sure that it's part of the PSO interface. Also, use rt_count rather than active mask when beginning render pass. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-23 14:29:51 +01:00
Philip Rebohle	34f5fc6a31	vkd3d: Do not create pipeline variants for NULL RTVs. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-22 13:06:00 +01:00
Hans-Kristian Arntzen	63530501a5	vkd3d: Require VK_EXT_extended_dynamic_state. This is basically required for not horrible stutter and performance and is widely supported. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-16 17:48:21 +01:00
Hans-Kristian Arntzen	e61cc0234a	vkd3d: Allow debug ring to know about device lost scenarios. For this case, we want to block and teardown the debug ring thread. It's okay to fish for dead messages in the ring, since we know there won't be more GPU work submitted. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen	a6700d3d85	vkd3d: Make debug ring aware of potential crash scenarios. If we expect device losts (breadcrumb debug), we need to use DEVICE uncached/coherent, since we might not be able to flush GPU caches properly. We also need to remove the idea of being able to copy out the control block back to host. This is too brittle and we should instead just place the control block in PCI-e BAR instead. Rethink how we pass messages from GPU to CPU to make it more robust. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:26:27 +01:00
Robin Kertels	a6ea442819	vkd3d: Enable VK_NV_device_diagnostic_checkpoints. Signed-off-by: Robin Kertels <robin.kertels@gmail.com>	2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen	365dd05557	vkd3d: Add breadcrumbs support. AMD path for this commit. Idea is that we can automatically instrument markers with command list information we can make some sense of in vkd3d-proton. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen	5017b3723c	vkd3d: Enable VK_AMD_device_coherent_memory. For breadcrumbs support, along with buffer marker. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:07:56 +01:00
Hans-Kristian Arntzen	6a4f2842cb	cache: Move d3d12_pipeline_library to internal references. Allow us to hold internal magic pipeline libraries without creating cycles. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 12:29:32 +01:00
Hans-Kristian Arntzen	18a5315db4	cache: Refactor lock strategy of internal hashmaps. Rather than having to take writer lock on serialize calls from the outside, we should just take locks when accessing the internal hashmaps instead. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 12:29:32 +01:00
Hans-Kristian Arntzen	17b1ffb41a	vkd3d: Add path to use GENERAL depth-stencil images. On some implementations, it doesn't matter for performance what we use, and we can avoid a lot of ugly barriers this way. Opt-in to use this extensions on GPUs we know handles it well, otherwise, keep using the tracking paths. With VK_KHR_dynamic_rendering, this is now feasible to do since we no longer have to deal with shenanigans related to VkRenderPass layouts and complicated compatibility rules. To make this work with the existing framework, just need to consider that GENERAL can be a common layout alongside DEPTH_STENCIL_OPTIMAL, which are both common layouts that do not need to be tracked at all. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-10 15:14:55 +01:00
Hans-Kristian Arntzen	f9da3bf564	vkd3d: Add VK_KHR_driver_properties. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-10 15:14:55 +01:00
Hans-Kristian Arntzen	c6149b47cd	cache: Handle ref-count rules for multiple LoadPipeline/StorePipeline. In pipeline libraries, the library holds on to private references of the libraries so that they can be rapidly loaded on-demand. This behavior is verifed by API tests. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-09 18:35:09 +01:00
Hans-Kristian Arntzen	cc08339624	vkd3d: Use internal_refcounts for pipeline state. When we store pipeline state in libraries we have to manage lifetime a bit differently, which requires internal refcounts of some sort. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-09 18:35:09 +01:00
Hans-Kristian Arntzen	422f6804fb	vkd3d: Enable VK_KHR_create_renderpass2. Required extension by VK_KHR_fragment_shading_rate and VK_KHR_separate_depth_stencil_layouts, but we don't care about enabling any features or use it directly. Needed to silence validation errors. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-09 16:35:05 +01:00
Georg Lehmann	14a06680d9	vkd3d: Remove unused renderpass remains. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>	2022-03-08 18:34:18 +01:00
Philip Rebohle	9a408367dc	vkd3d: Remove render pass cache. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	51e6b2bbbe	vkd3d: Remove render pass from command list state. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	94f82d1085	vkd3d: Get rid of pipeline variant flags. These only existed for VRS attachment, which is no longer necessary with VK_KHR_dynamic_rendering. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	1a68267962	vkd3d: Remove framebuffer list from d3d12_command_allocator. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	c4f88951fc	vkd3d: Use dynamic rendering for regular draw calls. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	9673ac173d	vkd3d: Use dynamic rendering for pipeline creation. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	3783eaf4f7	vkd3d: Implement swap chain blits using dynamic rendering. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	024ef02f9b	vkd3d: Implement meta image copies using dynamic rendering. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	549d4ee63f	vkd3d: Remove render pass list from d3d12_command_allocator. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Philip Rebohle	2c92ab7d1e	vkd3d: Enable and require VK_KHR_dynamic_rendering. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-08 17:44:47 +01:00
Hans-Kristian Arntzen	9a63df07b8	vkd3d: Add punchthrough path for descriptor copies. Proves out the viability of this style of implementation. Ideally we'd have a more officially sanctioned way of doing similar things later :) Unfortunately, the overhead removal is too great to ignore on target platform. Makes use of a private (reserved) extension for now ... Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-04 13:34:18 +01:00
Hans-Kristian Arntzen	54fbadcc94	vkd3d: Recycle command pools. Elden Ring in particular spam frees and allocates command pools despite this being a very bad idea. Add a simple 8-entry cache which seems to take care of it. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-25 13:59:08 +01:00
Hans-Kristian Arntzen	4b07535909	vkd3d: Optimize memory access pattern for single descriptor copies. We can mark a descriptor as being SINGLE_DESCRIPTOR, which means we only need one descriptor copy. This way, we can avoid doing somewhat expensive work (every nanosecond counts here): - Bitscan loop - Read deep into d3d12_device guts (often a cache miss). The memory index depends on the bitscan, which causes bubble. When we have a single descriptor, we can just store the binding information inline and avoid this jank. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-25 13:04:43 +01:00
Hans-Kristian Arntzen	84d632f194	vkd3d: Rewrite memory layout for resource descriptors. Tune memory layout so that we can deduce various information without making a single pointer dereference: - d3d12_descriptor_heap* - heap offset - Pointer to various side data structures we need to keep around. Instead of having one big 64 byte data structure with tons of padding, tune it down to 32 + 8 bytes per descriptor of extra dummy data. To make all of this work, use a somewhat clever encoding scheme for CPU VA where lower bits store number of active bits used to encode descriptor offset. From there, we can mask away bits to recover d3d12_descriptor_heap. Metadata is stored inline in one big allocation, and we can just offset from there based on extracted log2i_ceil(descriptor count). Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-25 13:04:43 +01:00
Hans-Kristian Arntzen	b309913b6d	vkd3d: Use unsafe_impl in CopyDescriptorsSimple. This is an ultra-hot path and seems to show up somehow on profile. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-25 13:04:43 +01:00
Hans-Kristian Arntzen	83c4e62660	vkd3d: Bump suballocation limit to 2 MiB. This is a more principled limit since that's the huge page size. Avoids some allocation spam. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-24 12:14:22 +01:00
Hans-Kristian Arntzen	edbf49aad4	vkd3d: Support opt-in to single MUTABLE set. Useful for Intel since Intel hardware cannot support more than 1M descriptors in general, and opting in to correct behavior should improve CPU overhead as well when copying descriptors. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-21 17:08:25 +01:00
Hans-Kristian Arntzen	15704b2419	vkd3d: Optimize descriptor copies for common code paths. The common path that we really need to optimize for is CBV_SRV_UAV + Simple + 1 descriptor. Descriptor benchmark shows an almost 50% reduction in overhead now. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-21 16:35:36 +01:00
Hans-Kristian Arntzen	c725c29bb6	vkd3d: Inline query for set/binding from set_index. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-02-21 16:35:36 +01:00

1 2 3 4 5 ...

824 Commits