For internal debug shaders, it is helpful to ensure that logs come out
in order when sorted for later inspection.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The hash should only depend on the raw byte stream, not the entire
DXBC blob. This is useful now that we can declare root signatures
either through a DXBC blob or as an RDAT object (which is raw).
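The hashing-only-the-payload idea can be sketched as follows. This is a minimal stand-in, not vkd3d-proton's actual hash function; FNV-1a and the helper name are illustrative assumptions:

```c
#include <stdint.h>
#include <stddef.h>

/* Hash only the raw byte stream of the root signature payload, not the
 * surrounding container. Two containers (DXBC blob vs. RDAT object)
 * carrying the same payload then hash identically.
 * FNV-1a is used here purely as a stand-in hash. */
static uint64_t hash_root_signature_payload(const void *data, size_t size)
{
    const uint8_t *bytes = data;
    uint64_t h = 0xcbf29ce484222325ull;
    size_t i;

    for (i = 0; i < size; i++)
    {
        h ^= bytes[i];
        h *= 0x100000001b3ull;
    }
    return h;
}
```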
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
To keep things simple, outer code is responsible for keeping the
string alive. Intended to be used for RTPSO entry point name
debugging.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
This is barely implementable, and relies on implementations doing
roughly what we want.
To make this work in practice, we need to allow two pipelines per state
object. One that is created with LIBRARY and one that can be bound. When
incrementing the PSO, we use the LIBRARY one.
It seems to be allowed to create a new library from an old library.
It is more convenient for us if we're allowed to do this, so do this
until we're forced to do otherwise.
DXR 1.1 requires that shader identifiers remain invariant for child
pipelines if the parent pipeline also has them.
Vulkan has no such guarantee, but we can speculate that it works and
validate that identifiers remain invariant. This seems to work fine on
NVIDIA at least ... It probably makes sense that it works for
implementations where pipeline libraries are compiled at that time.
The basic implementation of AddToStateObject() is to consider
the parent pipeline as a COLLECTION pipeline. This composes well and
avoids a lot of extra implementation cruft.
Also adds validation to ensure that COLLECTION global state matches
that of other COLLECTION objects and the parent. We also inherit
global state such as root signatures, pipeline config, shader configs,
etc., when using AddToStateObject().
The tests pass on NVIDIA at least.
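The runtime validation of identifier invariance described above could look roughly like this. Handle buffers would come from vkGetRayTracingShaderGroupHandlesKHR on the parent and child pipelines; the helper name is hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/* Compare shader group handles queried from the parent pipeline against
 * the same groups queried from the child created via AddToStateObject().
 * DXR 1.1 requires these to match; Vulkan gives no such guarantee, so we
 * speculate and validate at runtime. */
static bool validate_group_handle_invariance(const uint8_t *parent_handles,
        const uint8_t *child_handles, size_t group_count, size_t handle_size)
{
    size_t i;

    for (i = 0; i < group_count; i++)
    {
        if (memcmp(parent_handles + i * handle_size,
                child_handles + i * handle_size, handle_size) != 0)
            return false;
    }
    return true;
}
```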
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The D3D12 docs outline this as an implementation detail explicitly, so
we should do the same thing.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Implements the most basic iteration, where we don't try to take
advantage of an index LUT, hoist CS patching, or attempt to reuse the
application's indirect buffer directly.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Currently we are translating the index type. This will change in a
follow-up commit where we move over to an index LUT.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Separate scratch pools by their intended usage.
This allows e.g. preprocess buffers to be allocated differently from
normal buffers, which is necessary on implementations that use special
memory types to implement preprocess buffers.
Potentially, this also allows for separate pools for host-visible
scratch memory down the line.
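A rough shape of the separation, with hypothetical enum and helper names:

```c
#include <stdbool.h>

/* Hypothetical pool kinds: preprocess buffers may need special memory
 * types on some implementations, so give them their own pool rather
 * than sharing one generic scratch pool. */
enum scratch_pool_kind
{
    SCRATCH_POOL_KIND_DEVICE_STORAGE = 0,
    SCRATCH_POOL_KIND_INDIRECT_PREPROCESS,
    SCRATCH_POOL_KIND_COUNT
};

/* Select the pool from intended usage instead of using one pool for
 * everything. */
static enum scratch_pool_kind scratch_pool_for_usage(bool is_preprocess)
{
    return is_preprocess ? SCRATCH_POOL_KIND_INDIRECT_PREPROCESS
                         : SCRATCH_POOL_KIND_DEVICE_STORAGE;
}
```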
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Scratch buffers are 1 MiB blocks which end up being suballocated.
This was not intended, and is fallout from the earlier change where
VA_SIZE was bumped to 2 MiB for Elden Ring.
Introduce a memory allocation flag, INTERNAL_SCRATCH, which disables
suballocation and VA map insertion.
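The flag check might look like this. The flag name is taken from the message above; the bit position and surrounding struct are illustrative assumptions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position is hypothetical; the flag name comes from the commit. */
#define VKD3D_ALLOCATION_FLAG_INTERNAL_SCRATCH (1u << 7)

struct allocation_info
{
    uint32_t flags;
};

/* Internal scratch blocks are handed out whole: suballocating them
 * again was an accident of the 2 MiB VA_SIZE bump, so the flag opts
 * out of suballocation (and VA map insertion). */
static bool allocation_allows_suballocation(const struct allocation_info *info)
{
    return !(info->flags & VKD3D_ALLOCATION_FLAG_INTERNAL_SCRATCH);
}
```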
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The runtime is specified to validate certain things.
Also, be more robust against unsupported command signatures, since we
might need to draw/dispatch at an offset. This avoids hard GPU
crashes.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Just use VK_NULL_HANDLE. We rely on the disk cache to exist anyway.
We never serialize the global pipeline cache, so if anything it might
just confuse drivers into disabling their own disk cache.
This also reduces memory bloat.
Also gets rid of a very old NV driver workaround where we forced a
global pipeline cache.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The transfer batcher buffers CopyTextureRegion calls so that they can
be submitted together.
The flushes need to happen in a few places:
1. ResourceBarrier: This is where a transition from COPY_DEST to
   another state might happen, at which point the writes must be
   visible. This might also transition away from COPY_SRC, which
   invalidates the precondition.
2. Copy operations. Copies to the same resource are implicitly ordered.
3. Draws and dispatches. These are not strictly necessary, but we don't
want too much command reordering so flushing here seems good.
4. Close. So that we don't throw commands into the void.
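The batching logic sketched below shows the shape of the mechanism; struct and function names are hypothetical, and the real code would record vkCmdCopyBufferToImage / vkCmdCopyImage where the comment indicates:

```c
#include <stddef.h>

#define MAX_BATCHED_COPIES 64

struct copy_batch
{
    size_t count; /* deferred CopyTextureRegion arguments live here */
};

static size_t flush_count; /* instrumentation for this sketch only */

/* Emit the deferred copies and reset the batch. Called from
 * ResourceBarrier, other copy operations, draws/dispatches and Close,
 * per the four flush points above. */
static void copy_batch_flush(struct copy_batch *batch)
{
    if (!batch->count)
        return;
    /* vkCmdCopyBufferToImage / vkCmdCopyImage would be recorded here. */
    flush_count++;
    batch->count = 0;
}

/* Defer one CopyTextureRegion call, flushing if the batch is full. */
static void copy_batch_add(struct copy_batch *batch)
{
    if (batch->count == MAX_BATCHED_COPIES)
        copy_batch_flush(batch);
    batch->count++;
}
```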
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Execution is split into a parameter preparation stage, a pre-execution
barrier stage, and finally the execution and post-execution barrier
stage.
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
For now, just keep the NV path as well. It's basically the exact same
extension as the KHR one.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
When the disk cache is used, the cache we give back to applications is a
dummy. Therefore, try to use the disk cache blob if we detect a useless
application blob.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
With VKD3D_SHADER_CACHE_PATH, we can add automatic serialization of
pipeline blobs to disk, even for games which make no use of
GetCachedBlob or the ID3D12PipelineLibrary interfaces. Most
applications expect drivers to have some kind of internal caching.
This is implemented as a system where a disk
thread will manage a private ID3D12PipelineLibrary, and new PSOs are
automatically committed to this library. PSO creation will also consult
this internal pipeline library if applications do not provide their own
blob.
The strategy for updating the cache is based on a read-only cache which
is mmaped from disk, with an exclusive write-only portion for new blobs,
which ensures some degree of safety if there are multiple
concurrent processes using the same cache.
The memory layout of the disk cache is optimized to be very efficient
for appending new blobs, just simple fwrites + fflush.
The format is also robust against sliced files, which solves the problem
where applications tear down without destroying the D3D12 device
properly.
This structure is very similar to Fossilize, and in fact the idea is to
move towards actually using the Fossilize format directly later.
This implementation prepares us for this scenario where e.g. Steam could
potentially manage the vkd3d-proton cache.
The main complication in this implementation is that we have to merge
the read-only and write caches.
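The append path and the slice-tolerant read path can be sketched as below. This is a simplified stand-in for the real on-disk format (a bare size prefix, no checksums or Fossilize framing), with hypothetical helper names:

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* Append one blob: size prefix, payload, then fflush, so a crash at
 * worst slices the final entry rather than corrupting the file. */
static bool cache_append_blob(FILE *f, const void *data, uint32_t size)
{
    if (fwrite(&size, sizeof(size), 1, f) != 1)
        return false;
    if (fwrite(data, 1, size, f) != size)
        return false;
    return fflush(f) == 0;
}

/* Scan the file and stop cleanly at a truncated tail, which is how the
 * format stays robust against sliced files from improper teardown.
 * Returns the number of complete blobs. */
static size_t cache_count_complete_blobs(FILE *f)
{
    uint32_t size;
    size_t count = 0;
    long end, pos;

    fseek(f, 0, SEEK_END);
    end = ftell(f);
    rewind(f);
    while (fread(&size, sizeof(size), 1, f) == 1)
    {
        pos = ftell(f);
        if (pos + (long)size > end)
            break; /* sliced tail: header claims more payload than exists */
        fseek(f, (long)size, SEEK_CUR);
        count++;
    }
    return count;
}
```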
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
For internal pipeline libraries, we want a somewhat different strategy.
- PSOs are keyed by hash instead of user key.
- We want the option to conditionally store SPIR-V and PSO blobs.
For internal caches, there isn't much of a reason to store PSO blobs
since the disk cache is going to be primed anyways.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Makes sure that we drop private root signature device references when
public pipeline state refcount hits 0.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
This was off by one, which could cause a stack buffer overrun, which
is naughty. Replace the hardcoded size with ARRAY_SIZE of
dynamic_state_list.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Needed to support SM 6.0 CBufferLoad.
This path is mostly unused since it's opt-in in DXC and horribly
broken ...
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Primitive restart is only used for strip primitive types, and must be
ignored for lists. Use and require extended_dynamic_state2 for this
purpose.
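The restart-enable decision reduces to a topology classification, whose result would feed vkCmdSetPrimitiveRestartEnableEXT from extended_dynamic_state2. A sketch, with the D3D topology enum subset declared locally (values as in d3dcommon.h):

```c
#include <stdbool.h>

/* Subset of D3D_PRIMITIVE_TOPOLOGY, values as in d3dcommon.h. */
enum d3d_primitive_topology
{
    D3D_PRIMITIVE_TOPOLOGY_LINELIST = 2,
    D3D_PRIMITIVE_TOPOLOGY_LINESTRIP = 3,
    D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST = 4,
    D3D_PRIMITIVE_TOPOLOGY_TRIANGLESTRIP = 5,
    D3D_PRIMITIVE_TOPOLOGY_LINESTRIP_ADJ = 11,
    D3D_PRIMITIVE_TOPOLOGY_TRIANGLESTRIP_ADJ = 13
};

/* Primitive restart applies only to strip topologies and must be
 * ignored for lists; the result would be passed to
 * vkCmdSetPrimitiveRestartEnableEXT. */
static bool topology_uses_primitive_restart(enum d3d_primitive_topology t)
{
    switch (t)
    {
        case D3D_PRIMITIVE_TOPOLOGY_LINESTRIP:
        case D3D_PRIMITIVE_TOPOLOGY_TRIANGLESTRIP:
        case D3D_PRIMITIVE_TOPOLOGY_LINESTRIP_ADJ:
        case D3D_PRIMITIVE_TOPOLOGY_TRIANGLESTRIP_ADJ:
            return true;
        default:
            return false;
    }
}
```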
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
For EXTENDED_USAGE, we still need to restrict image usage when creating
concrete views.
Use VkImageViewUsageCreateInfo to restrict usage flags to the kind of
view we're creating.
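The usage restriction itself is a mask computation; the real code would chain the result into VkImageViewCreateInfo via VkImageViewUsageCreateInfo::usage. A sketch with the relevant VkImageUsageFlagBits values declared locally and a hypothetical view-kind enum:

```c
#include <stdint.h>

/* VkImageUsageFlagBits values, declared locally for this sketch. */
#define VK_IMAGE_USAGE_SAMPLED_BIT                  0x00000004u
#define VK_IMAGE_USAGE_STORAGE_BIT                  0x00000008u
#define VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT         0x00000010u
#define VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT 0x00000020u

enum view_kind { VIEW_KIND_SRV, VIEW_KIND_UAV, VIEW_KIND_RTV, VIEW_KIND_DSV };

/* Restrict the image's full (EXTENDED_USAGE) mask to what this
 * particular view actually needs; the result goes into
 * VkImageViewUsageCreateInfo::usage. */
static uint32_t view_usage_for_kind(uint32_t image_usage, enum view_kind kind)
{
    uint32_t mask = 0;

    switch (kind)
    {
        case VIEW_KIND_SRV: mask = VK_IMAGE_USAGE_SAMPLED_BIT; break;
        case VIEW_KIND_UAV: mask = VK_IMAGE_USAGE_STORAGE_BIT; break;
        case VIEW_KIND_RTV: mask = VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT; break;
        case VIEW_KIND_DSV: mask = VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT; break;
    }
    return image_usage & mask;
}
```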
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Found some validation errors where rt_count != rtv_active_mask, and
blending used rt_count instead of rtv_active_mask. If a shader renders
to a NULL attachment, we must make sure that it's still part of the
PSO interface.
Also, use rt_count rather than the active mask when beginning a render
pass.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
This is basically required to avoid horrible stutter and bad
performance, and it is widely supported.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
For this case, we want to block and teardown the debug ring thread.
It's okay to fish for dead messages in the ring, since we know there
won't be more GPU work submitted.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
If we expect device loss (breadcrumb debug), we need to use DEVICE
uncached/coherent memory, since we might not be able to flush GPU
caches properly.
We also need to remove the idea of copying the control block back to
the host. This is too brittle; we should instead just place the
control block in PCI-e BAR memory, and rethink how we pass messages
from GPU to CPU to make it more robust.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
This commit adds the AMD path.
The idea is that we can automatically instrument markers with command
list information that we can make some sense of in vkd3d-proton.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>