mirrors/vkd3d-proton

Commit Graph

Author	SHA1	Message	Date
Hans-Kristian Arntzen	5044975152	vkd3d: Drop redundant validate of PSO state blob from disk cache. If we get an entry, it's implicitly validated. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen	8dc8b72807	cache: Add some performance information for shader cache operations. They can take a long time and it's useful to have some reports here. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen	ae0dafa3a1	cache: Attempt to use disk cache instead when appropriate. When the disk cache is used, the cache we give back to applications is a dummy. Therefore, try to use the disk cache blob if we detect a useless application blob. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen	6c8542f7d6	vkd3d: Make use of internal pipeline library if we're asked to. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen	2dcb1e2efc	cache: Implement an on-disk pipeline library. With VKD3D_SHADER_CACHE_PATH, we can add automatic serialization of pipeline blobs to disk, even for games which do not make any use of GetCachedBlob of ID3D12PipelineLibrary interfaces. Most applications expect drivers to have some kind of internal caching. This is implemented as a system where a disk thread will manage a private ID3D12PipelineLibrary, and new PSOs are automatically committed to this library. PSO creation will also consult this internal pipeline library if applications do not provide their own blob. The strategy for updating the cache is based on a read-only cache which is mmaped from disk, with an exclusive write-only portion for new blobs, which ensures some degree of safety if there are multiple concurrent processes using the same cache. The memory layout of the disk cache is optimized to be very efficient for appending new blobs, just simple fwrites + fflush. The format is also robust against sliced files, which solves the problem where applications tear down without destroying the D3D12 device properly. This structure is very similar to Fossilize, and in fact the idea is to move towards actually using the Fossilize format directly later. This implementation prepares us for this scenario where e.g. Steam could potentially manage the vkd3d-proton cache. The main complication in this implementation is that we have to merge the read-only and write caches. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-06 16:36:26 +02:00
Hans-Kristian Arntzen	3095ed84d3	cache: Add concept of internal pipeline libraries. For internal pipeline libraries, we want a somewhat different strategy. - PSOs are keyed by hash instead of user key. - We want the option to conditionally store SPIR-V and PSO blobs. For internal caches, there isn't much of a reason to store PSO blobs since the disk cache is going to be primed anyways. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-05 14:12:20 +02:00
Hans-Kristian Arntzen	db9b9a13de	cache: Fix misleading comment about chunk alignment. It's 8. Used to be 4 before some other fixes ... Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-05 14:12:20 +02:00
Hans-Kristian Arntzen	637834dc75	vkd3d: Make private_root_signatures actually private. Makes sure that we drop private root signature device references when public pipeline state refcount hits 0. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-05 14:12:20 +02:00
Hans-Kristian Arntzen	93928424a9	common: Move time query to common header. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-05 14:12:20 +02:00
Hans-Kristian Arntzen	c8b143c0bd	common: Add wrapper for _ftelli64/_fseeki64. MSVC doesn't have ftello64/fseeko64, nor off64_t. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-05 14:12:20 +02:00
Hans-Kristian Arntzen	ca0a186a4b	common: Add some file utils. Supports more advanced file operations than we'd normally need. Intended to be used by magic disk cache. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-04-05 14:12:20 +02:00
Philip Rebohle	c9101b8ec3	tests: Add test to clear R11G11B10 UAV to zero. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-04-05 11:52:23 +02:00
Philip Rebohle	829c02bf90	vkd3d: Remove format compatibility info for R11G11B10. Not allowing R32 views may give us compression back in some scenarios. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-04-05 11:52:23 +02:00
Philip Rebohle	e4184830c5	vkd3d: Add ClearUAV path that uses buffer-to-image copies. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-04-05 11:52:23 +02:00
Philip Rebohle	d1425ee4d1	vkd3d: Use VK_ACCESS_MEMORY_{READ,WRITE}_BIT where appropriate. Buggy RADV versions no longer work due to missing extension support. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-04-05 11:52:23 +02:00
Denis Barkar	8dda6df729	vkd3d: Force non-invariant position for Serious Sam 4. Signed-off-by: Denis Barkar <dbarkar@nvidia.com>	2022-04-01 15:34:52 +02:00
Joshua Ashton	2ed513b99a	vkd3d: Remove VKD3D_MAX_DYNAMIC_STATE_COUNT. This was off by one, at some point, which could cause a stack buffer overrun which is naughty. Replace this with just an ARRAY_SIZE on the dynamic_state_list for the array size. Signed-off-by: Joshua Ashton <joshua@froggi.es>	2022-04-01 15:19:18 +02:00
Hans-Kristian Arntzen	19e088cdfc	tests: Add test for weird CBV layouts. CBufferLoad and 16-bit/64-bit tests. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-30 20:13:32 +02:00
Hans-Kristian Arntzen	241078d7e8	vkd3d: Add scalar UBO layout requirement for SM 6.0. Needed to support SM 6.0 CBufferLoad. This path is mostly unused since it's opt-in in DXC and horribly broken ... Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-30 20:13:32 +02:00
Hans-Kristian Arntzen	e01589a33b	dxil-spirv: Update submodule. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-30 20:13:32 +02:00
Hans-Kristian Arntzen	2e704c5a5e	tests: Test primitive restart behavior on list primitives. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-30 16:12:16 +02:00
Hans-Kristian Arntzen	6f43f450c8	vkd3d: Disable primitive restart when using non-compatible topologies. Primitive restart is only used for strip primitive types, and must be ignored for lists. Use and require extended_dynamic_state2 for this purpose. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-30 16:12:16 +02:00
Hans-Kristian Arntzen	cfeaa18b09	vkd3d: Enable MUTABLE_SINGLE_SET for Intel GPUs. There are strict limits on number of descriptors which can be used, and we have to use MUTABLE + single set to make this work. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-30 12:25:20 +02:00
Hans-Kristian Arntzen	da63f0beac	vkd3d: Compute range_end after sparse checks in copy tracking. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-30 12:13:25 +02:00
Hans-Kristian Arntzen	35e777f8a0	meta: Update docs for latest breadcrumbs/debug-ring work. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-30 12:13:16 +02:00
Hans-Kristian Arntzen	095a36cbaf	meta: Update stale notes about driver versions. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-30 12:13:16 +02:00
Philip Rebohle	6378f1b880	vkd3d: Optimize WriteBufferImmediate for consecutive writes. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-30 11:51:10 +02:00
Philip Rebohle	307190e96b	tests: Test WriteBufferImmediate with disjoint ranges. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-30 11:51:10 +02:00
Hans-Kristian Arntzen	2e8fb27182	vkd3d: Correctly handle dynamic depth/stencil attachment infos. {depth,stencil}AttachmentFormat and p{Depth,Stencil}Attachment are only allowed if the format contains that aspect. Check this explicitly. Fixes some validation errors. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-24 17:55:32 +01:00
Hans-Kristian Arntzen	1b5f7e8fc3	vkd3d: Use VkImageViewCreateInfo correctly. For EXTENDED_USAGE, we still need to restrict image usage when creating concrete views. Use VkImageViewUsageCreateInfo to restrict usage flags to the kind of view we're creating. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-24 17:55:32 +01:00
Hans-Kristian Arntzen	cf65a78570	vkd3d: Rename DSV UNKNOWN workaround query. Make it more obvious what it's really trying to check. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-23 22:36:00 +01:00
Philip Rebohle	1d3957fe6d	vkd3d: Do not create pipeline variants for NULL DSV. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-23 22:22:09 +01:00
Philip Rebohle	c9abcfa656	vkd3d: Use d3d12_graphics_pipeline_state_has_unknown_dsv_format more consistently. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-23 22:22:09 +01:00
Hans-Kristian Arntzen	03427c6ee6	vkd3d: Explicitly use NULL RTV mask for dual source blending. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-23 14:29:51 +01:00
Hans-Kristian Arntzen	09682f8417	tests: Extend validation tests for dual source blending. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-23 14:29:51 +01:00
Hans-Kristian Arntzen	6273780e50	vkd3d: Accurately validate dual source blend state. We need to check RTVFormats and IO signature. If both RTVFormat uses non-null format and IO signature has an active entry, we must fail compilation. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-23 14:29:51 +01:00
Hans-Kristian Arntzen	6e915dd2c0	vkd3d: Use rt_count as basis for binding RTVs. Found some validation errors where rt_count != rtv_active_mask, and blending used rt_count instead of rtv_active_mask. If shader renders to a NULL attachment, we must make sure that it's part of the PSO interface. Also, use rt_count rather than active mask when beginning render pass. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-23 14:29:51 +01:00
Philip Rebohle	34f5fc6a31	vkd3d: Do not create pipeline variants for NULL RTVs. Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2022-03-22 13:06:00 +01:00
Hans-Kristian Arntzen	63530501a5	vkd3d: Require VK_EXT_extended_dynamic_state. This is basically required for not horrible stutter and performance and is widely supported. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-16 17:48:21 +01:00
Hans-Kristian Arntzen	dd6534f3f8	vkd3d: Report enabled debug ring size as INFO instead of WARN. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen	09997b4dd8	vkd3d: Fish for message clues on device lost. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen	6d35f98e59	vkd3d: Emit deadca7 cookie for num_words in debug ring. Makes it somewhat feasible to fish for message begin codes in the stream. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen	e61cc0234a	vkd3d: Allow debug ring to know about device lost scenarios. For this case, we want to block and teardown the debug ring thread. It's okay to fish for dead messages in the ring, since we know there won't be more GPU work submitted. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen	c54895b4b7	vkd3d: Fix overflow of ring_size. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen	a6700d3d85	vkd3d: Make debug ring aware of potential crash scenarios. If we expect device losts (breadcrumb debug), we need to use DEVICE uncached/coherent, since we might not be able to flush GPU caches properly. We also need to remove the idea of being able to copy out the control block back to host. This is too brittle and we should instead just place the control block in PCI-e BAR instead. Rethink how we pass messages from GPU to CPU to make it more robust. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen	f0cac9d97c	debug: Make elects helper-lane aware. The elected lane must be able to perform side effects, so make sure helper lanes don't participate. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen	08c0ea209f	debug: Add helper Makefile to easily build shader override modules. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen	64d42c08ee	debug: Add helpers to do wave uniform debug messages. If we know the input is wave uniform (progress markers for example), no need to spam the log. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen	3d8ef2b349	debug: Emit messages more robustly in face of crashes. Attempt to enforce memory order on the num_words to only commit complete messages. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:26:27 +01:00
Hans-Kristian Arntzen	33b9166fec	vkd3d: Make device coherency extension optional for breadcrumbs. Some implementation can support marker, but not explicit coherency. Buffer markers are often uncached either way, so should be fine ... Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>	2022-03-11 13:07:56 +01:00
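
The message for 2dcb1e2efc above describes a disk cache that is cheap to append to (plain writes plus a flush) and robust against sliced files. The following is a minimal sketch of that general idea only, not the actual vkd3d-proton or Fossilize format. The magic value, the record layout (hash, size, CRC32), and the function names are all assumptions made for illustration.

```python
import struct
import zlib

# Hypothetical file magic and record header; NOT the real vkd3d-proton layout.
MAGIC = b"BLOBCACH"
HEADER = struct.Struct("<QII")  # 64-bit hash, payload size, CRC32 of payload


def create_cache(path):
    """Create an empty cache file containing only the magic."""
    with open(path, "wb") as f:
        f.write(MAGIC)


def append_blob(f, key_hash, payload):
    """Append one record to a file opened in 'ab' mode, then flush."""
    f.write(HEADER.pack(key_hash, len(payload), zlib.crc32(payload)))
    f.write(payload)
    f.flush()


def load_blobs(path):
    """Return {hash: payload} for every complete record.

    Parsing stops at the first short or corrupt record, so a file that
    was cut off mid-write still yields all earlier blobs.
    """
    blobs = {}
    with open(path, "rb") as f:
        if f.read(len(MAGIC)) != MAGIC:
            return blobs
        while True:
            header = f.read(HEADER.size)
            if len(header) < HEADER.size:
                break  # clean end of file, or a torn header
            key_hash, size, crc = HEADER.unpack(header)
            payload = f.read(size)
            if len(payload) < size or zlib.crc32(payload) != crc:
                break  # torn or corrupt payload; ignore the tail
            blobs[key_hash] = payload
    return blobs
```

Because each record carries its own size and checksum, a reader needs no index or footer, which is what makes a process that exits without a clean shutdown harmless in this sketch: only the incomplete last record is lost.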


3926 Commits