Fixes an issue where push constants can be invalidated by
shader-based clear/copy commands.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Uses one push constant range with VK_SHADER_STAGE_ALL. This
will allow us to easily add descriptor table offsets as push
constants.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
An upcoming change to the binding model will use these to
initialize descriptors that have the wrong resource type
bound, or were left uninitialized by the application.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
This needs a major rework as the current implementation has bugs,
is hard to reason about, and very hard to maintain as we're about
to make major changes to the binding model as a whole.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Resource index is found in idx[0] in SM 5.0, but idx[1] when using SM
5.1, and register space is encoded separately. An rb_tree keeps track of
the internal resource index idx[0] and can map that to space/binding as
required when emitting SPIR-V.
For this to work, we must also make UAV counters register space aware.
In the earlier implementation, the UAV counter mask was assumed to
correlate 1:1 with register_index, which breaks on SM 5.1.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Greatly reduces the number of VA allocations we have to make, makes the
returned VAs more sensible, and better matches the VAs we see on native
drivers.
D3D12 usage flags for buffers seem generic enough that there is no
obvious benefit in placing smaller VkBuffers on top of VkDeviceMemory.
Ideally, physical_buffer_address is used here, but this works as a good
fallback if that path is added later.
With this patch and previous VA optimization, I'm observing a 2.0-2.5%
FPS uplift on SOTTR when CPU bound.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
ID3D12GraphicsCommandList2 and WriteBufferImmediate() are used by
Hitman 2, but implementing the function on top of an AMD extension has
no effect on game behaviour. It's commonly used to write debug info.
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Addresses the following limitations of the previous implementation:
- Only R32_{UINT,TYPELESS} were supported for buffers.
- Clearing an image UAV did not behave correctly for images with non-UINT formats.
- Due to the use of transfer operations, extra memory barriers were needed.
If necessary, this will create a temporary view with a bit-compatible
UINT format for the resource in order to perform a bit-exact clear.
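
As a rough illustration of why a bit-compatible UINT view gives a
bit-exact clear, the sketch below reinterprets a float as a 32-bit
unsigned integer without any numeric conversion. The helper name
float_bits() is hypothetical and not part of this patch, and the sketch
assumes a 32-bit IEEE 754 float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw bit pattern of a float.
 * memcpy() copies the bytes unchanged, which mirrors writing the clear
 * value through a UINT view instead of converting it numerically. */
static uint32_t float_bits(float value)
{
    uint32_t bits;

    assert(sizeof(bits) == sizeof(value)); /* assumes a 32-bit float */
    memcpy(&bits, &value, sizeof(bits));
    return bits;
}
```

A numeric cast of 1.0f would yield 1, while the raw pattern is
0x3f800000; only the latter reproduces the original value when read
back through the float format.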
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
This also fixes a format specifier warning in an ERR for the 32-bit Linux
build.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Currently, vkd3d_view_destroy_descriptor assumes image views
by default, but we need to be able to attach buffer views to
command allocators for UAV clears.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
The additional data is needed to implement UAV clears.
Moving this out of d3d12_desc also helps make copying and
traversing descriptor arrays more CPU cache-friendly.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Shadow of the Tomb Raider does not re-bind all descriptor tables after
setting a new root signature if tessellation is enabled, which causes
some descriptors to be left undefined.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
The GPU VA allocator was allocating memory in a way where dereferencing
GPU VA required a lock + bsearch() to find the right VA range.
Rather than going this route, we turn the common case into O(1) and
lock-free by creating a slab allocator which allows us to look up a
pointer directly from a GPU VA with (VA - Base) / PageSize.
The number of allocations in the fast path must be limited since we
cannot trivially grow the allocator while remaining lock-free for
dereferences.
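
A minimal sketch of the (VA - Base) / PageSize lookup, for
illustration only. All names and sizes (va_slab, VA_BASE, VA_PAGE_SIZE,
VA_PAGE_COUNT) are assumptions and are not taken from the patch; a real
lock-free implementation would also need atomic loads and stores on the
table entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants; the real values are not stated here. */
#define VA_BASE       0x100000000ull
#define VA_PAGE_SIZE  0x10000ull
#define VA_PAGE_COUNT 1024u

struct va_slab
{
    /* One slot per fixed-size page. The capacity is fixed, so the
     * table never has to be reallocated while readers use it. */
    void *pages[VA_PAGE_COUNT];
};

/* Record which resource backs the page that contains 'va'. */
static void va_slab_insert(struct va_slab *slab, uint64_t va, void *resource)
{
    uint64_t index = (va - VA_BASE) / VA_PAGE_SIZE;

    assert(va >= VA_BASE && index < VA_PAGE_COUNT);
    slab->pages[index] = resource;
}

/* O(1) lookup with no lock and no binary search.
 * Returns NULL for addresses outside the slab range. */
static void *va_slab_deref(const struct va_slab *slab, uint64_t va)
{
    uint64_t index;

    if (va < VA_BASE)
        return NULL;
    index = (va - VA_BASE) / VA_PAGE_SIZE;
    return index < VA_PAGE_COUNT ? slab->pages[index] : NULL;
}
```

Addresses that fall outside the fixed table would have to take a slower
fallback path, which is why the fast path has a bounded size.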
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Shadow of the Tomb Raider overwrites descriptors while they are being
copied in another thread. This patch makes reads and writes atomic for
CBV, SRV, UAV, and sampler descriptors, but not RTV and DSV, for which
copying is not implemented.
Benchmark total frames vs mutex count (the single mutex was locked
only once for copying):
1 mutex: 6480 6489 6503
8 mutexes: 6691 6693 6661
16 mutexes: 6665 6682 6703
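
One way to spread descriptor accesses over several mutexes is sketched
below. How the patch actually selects a mutex is not described here, so
the address-based selection, the struct layout, and every name in this
block are assumptions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_MUTEX_COUNT 8u /* 8 is one of the counts benchmarked above */

/* Hypothetical descriptor layout, for illustration only. */
struct desc
{
    uint32_t magic;
    uint64_t payload;
};

static pthread_mutex_t desc_mutexes[DESC_MUTEX_COUNT];

static void desc_mutexes_init(void)
{
    unsigned int i;

    for (i = 0; i < DESC_MUTEX_COUNT; ++i)
    {
        int rc = pthread_mutex_init(&desc_mutexes[i], NULL);
        assert(rc == 0);
        (void)rc;
    }
}

/* Assumption: pick a mutex from the descriptor's address, so that
 * neighbouring descriptors use different mutexes. */
static size_t desc_mutex_index(const struct desc *d)
{
    return ((uintptr_t)d / sizeof(*d)) % DESC_MUTEX_COUNT;
}

/* Copy through a temporary so that only one mutex is held at a time,
 * which rules out lock-order deadlocks between source and destination. */
static void desc_copy(struct desc *dst, const struct desc *src)
{
    struct desc tmp;
    pthread_mutex_t *mutex;

    mutex = &desc_mutexes[desc_mutex_index(src)];
    pthread_mutex_lock(mutex);
    tmp = *src;
    pthread_mutex_unlock(mutex);

    mutex = &desc_mutexes[desc_mutex_index(dst)];
    pthread_mutex_lock(mutex);
    *dst = tmp;
    pthread_mutex_unlock(mutex);
}
```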
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Enables ReadFromSubresource() to succeed in cases where it would have
failed otherwise.
Signed-off-by: Józef Kucia <jkucia@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
The order of structures doesn't matter, so we can simply prepend
instead of appending.
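
For illustration, prepending to a singly linked structure chain can be
sketched as follows. struct chain_node is a hypothetical stand-in for
the sType/pNext header of Vulkan structures; it is not the code touched
by this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Vulkan-style structure header. */
struct chain_node
{
    int type;
    struct chain_node *next;
};

/* Insert 'node' directly behind 'head'. Unlike appending, this does
 * not need to walk the chain to find its tail. */
static void chain_prepend(struct chain_node *head, struct chain_node *node)
{
    assert(head && node);
    node->next = head->next;
    head->next = node;
}
```

The chain ends up in reverse insertion order, which is harmless as long
as consumers identify structures by type rather than by position.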
Signed-off-by: Józef Kucia <jkucia@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
VkDeviceMemory must be externally synchronized.
Signed-off-by: Józef Kucia <jkucia@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
The condition in d3d12_resource_is_cpu_accessible() is going to be
changed in the following commits.
Signed-off-by: Józef Kucia <jkucia@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Predicate arguments which are only non-zero in bit 32 or higher are not
supported. Predicates will not be applied to clear and copy commands because
Vulkan does not support predication of these command classes.
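
The unsupported case can be illustrated with a small sketch. It assumes
the limitation arises because only the low 32 bits of the 64-bit
predicate value are tested; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Full 64-bit test of the predicate value. */
static bool predicate_nonzero_64(uint64_t value)
{
    return value != 0;
}

/* Assumed behaviour when only the low 32 bits are examined. */
static bool predicate_nonzero_32(uint64_t value)
{
    return (uint32_t)value != 0;
}

/* Self-check: the two tests agree unless the value is non-zero only in
 * bit 32 or higher. */
static void predicate_examples(void)
{
    assert(predicate_nonzero_64(1) == predicate_nonzero_32(1));
    assert(predicate_nonzero_64(0) == predicate_nonzero_32(0));
    assert(predicate_nonzero_64(1ull << 32) != predicate_nonzero_32(1ull << 32));
}
```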
Signed-off-by: Conor McCarthy <cmccarthy@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
We maintain separate arrays for enqueued fences and fences owned by the
fence worker thread.
Signed-off-by: Józef Kucia <jkucia@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
It isn't immediately obvious what "1u << graphics->rt_count" means.
Use dsv_attachment_mask() helper instead.
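
A sketch of what such a helper might look like. struct graphics_state
is a hypothetical stand-in for the real pipeline state structure, and
the sketch assumes that the depth-stencil attachment follows the
rt_count colour attachments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the graphics pipeline state. */
struct graphics_state
{
    unsigned int rt_count;
};

/* Bit for the depth-stencil attachment: colour attachments occupy bits
 * 0..rt_count-1, so the depth-stencil attachment gets bit rt_count. */
static inline uint32_t dsv_attachment_mask(const struct graphics_state *graphics)
{
    assert(graphics->rt_count < 32);
    return 1u << graphics->rt_count;
}
```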
Signed-off-by: Józef Kucia <jkucia@codeweavers.com>
Signed-off-by: Henri Verbeet <hverbeet@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>