It's technically undefined to use NULL UAV counters,
but drivers all implement some form of robust behavior here
when presented with NULL counters, so we'll have to follow suit.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
We have no way of expressing size / alignment requirements to
applications since the API query does not provide us with heap
information. Reuse the fallback path for promoting placed to committed.
Guardians of the Galaxy hits a case where it tries to place 3x
host-visible 3D images in one heap, and they end up overlapping in
memory due to a 16x16x80 3D texture taking up far less space in optimal
tiling compared to linear tiling on AMD.
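
The overlap described above can be illustrated with a small sketch. Every name and number here is an assumption made for illustration: 4 bytes per texel, a 256-byte row pitch alignment for the linear layout, and a tightly packed size standing in for the optimal-tiling footprint. None of this is AMD's actual layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical placement record: byte offset in the heap and the number of
 * bytes the resource really occupies there. */
struct placement
{
    uint64_t offset;
    uint64_t size;
};

/* Round value up to a power-of-two alignment. */
static uint64_t align_up(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Size of a 3D image stored linearly, with an assumed row pitch alignment. */
static uint64_t linear_3d_size(uint32_t width, uint32_t height, uint32_t depth,
        uint32_t bytes_per_texel, uint64_t row_pitch_alignment)
{
    uint64_t row_pitch = align_up((uint64_t)width * bytes_per_texel, row_pitch_alignment);
    return row_pitch * height * depth;
}

/* Half-open ranges [offset, offset + size) overlap if each one starts
 * before the other one ends. */
static bool placements_overlap(const struct placement *a, const struct placement *b)
{
    return a->offset < b->offset + b->size && b->offset < a->offset + a->size;
}
```

If the application spaces its images by the smaller size while each image actually occupies the larger linear size, neighbouring placements overlap, which is why promoting such resources to committed allocations sidesteps the problem.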
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
For internal debug shaders, it is helpful to ensure that logs end up in
order when sorted for later inspection.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
We cannot handle all scenarios if COLLECTIONS are incompatible,
but test the easier cases.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
CP77 relies on this to work somehow ...
The DXR spec seems to suggest this is allowed, but there is no direct
concept for this in Vulkan.
This seems to work on NVIDIA at least, but we're on very shaky ground
here ...
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
The hash should only depend on the raw byte stream, not the entire DXBC
blob. This is useful now since we can declare root signatures either
through a DXBC blob or as an RDAT object (which is raw).
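
A minimal sketch of the idea, assuming a hypothetical `raw_root_signature` view that points at the serialized bytes regardless of how they were delivered. FNV-1a is used purely as a stand-in; it is not claimed to be the hash function used by the actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a, used here only as a stand-in hash function. */
static uint64_t hash_bytes(const void *data, size_t size)
{
    const uint8_t *bytes = data;
    uint64_t h = 0xcbf29ce484222325ull;
    size_t i;

    assert(data || !size);
    for (i = 0; i < size; i++)
    {
        h ^= bytes[i];
        h *= 0x100000001b3ull;
    }
    return h;
}

/* Hypothetical view of a root signature: only the raw serialized bytes,
 * with any container framing already stripped by the caller. */
struct raw_root_signature
{
    const void *data;
    size_t size;
};

/* The hash sees the raw payload only, so both delivery paths agree. */
static uint64_t root_signature_hash(const struct raw_root_signature *rs)
{
    return hash_bytes(rs->data, rs->size);
}
```

With this shape, a payload extracted from a container and the same payload supplied raw produce the same hash, so both can share one cache entry.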
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
To keep things simple, outer code is responsible for keeping the string
alive. Intended to be used for RTPSO entry point name debugging.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Handle embedded DXIL subobjects and fix various issues exposed by the
upcoming new tests.
Associating with global root signatures, shader config and pipeline
config needs to be rewritten so that we validate uniqueness late.
The strategy here is to look at all exports we care about and find an
association.
There are many priority levels which are implied by how I understand the
DXR docs. State objects in the API win over embedded DXIL state objects.
Any DXIL state object wins over a collection.
Hit group associations can trump an entry point. It's not entirely clear
how this works, but we let it win if it has higher priority, i.e.
an explicit association directed at the hit group.
There are also cases where explicit assignment trumps explicit default
assignment, which then trumps just declaring a state object.
Collection state is inherited in some cases like AddToStateObject() even
if this seems to be undocumented behavior.
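
The priority rules described in this message could be sketched roughly as follows. All type and function names are hypothetical. How the two axes combine is an assumption of this sketch (origin dominates, association kind breaks ties); the message itself does not spell that out, and the late uniqueness validation is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Where the subobject came from. Higher values win. */
enum origin
{
    ORIGIN_COLLECTION = 0,
    ORIGIN_DXIL = 1,
    ORIGIN_API = 2,
};

/* How the subobject was attached to the export. Higher values win. */
enum association_kind
{
    ASSOCIATION_DECLARED = 0,
    ASSOCIATION_EXPLICIT_DEFAULT = 1,
    ASSOCIATION_EXPLICIT = 2,
};

struct candidate
{
    enum origin origin;
    enum association_kind kind;
    int subobject_index;
};

static int candidate_priority(const struct candidate *c)
{
    /* Assumption: origin dominates, association kind breaks ties. */
    return (int)c->origin * 3 + (int)c->kind;
}

/* Returns the winning candidate, or NULL if there are none.
 * On equal priority the earliest entry is kept. */
static const struct candidate *pick_association(const struct candidate *candidates, size_t count)
{
    const struct candidate *best = NULL;
    size_t i;

    assert(candidates || !count);
    for (i = 0; i < count; i++)
        if (!best || candidate_priority(&candidates[i]) > candidate_priority(best))
            best = &candidates[i];
    return best;
}
```

A real implementation would also have to treat two equal-priority candidates with different contents as a conflict rather than silently keeping the first one.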
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
This is barely implementable, and relies on implementations to do kinda
what we want.
To make this work in practice, we need to allow two pipelines per state
object. One that is created with LIBRARY and one that can be bound. When
incrementing the PSO, we use the LIBRARY one.
It seems to be allowed to create a new library from an old library.
It is more convenient for us if we're allowed to do this, so do this
until we're forced to do otherwise.
DXR 1.1 requires that shader identifiers remain invariant for child
pipelines if the parent pipeline also has them.
Vulkan has no such guarantee, but we can speculate that it works and
validate that identifiers remain invariant. This seems to work fine on
NVIDIA at least ... It probably makes sense that it works for
implementations where pipeline libraries are compiled at that time.
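
The speculate-then-validate step could look something like the sketch below. The 32-byte size matches D3D12_SHADER_IDENTIFIER_SIZE_IN_BYTES; the struct, the function name, and the assumption that the child lists the parent's exports first and in the same order are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same value as D3D12_SHADER_IDENTIFIER_SIZE_IN_BYTES. */
#define SHADER_IDENTIFIER_SIZE 32

struct shader_identifier
{
    unsigned char data[SHADER_IDENTIFIER_SIZE];
};

/* Assumption: child[i] corresponds to parent[i] for every parent export.
 * Returns false as soon as one inherited identifier has changed. */
static bool identifiers_remain_invariant(const struct shader_identifier *parent,
        const struct shader_identifier *child, size_t parent_count)
{
    size_t i;

    assert((parent && child) || !parent_count);
    for (i = 0; i < parent_count; i++)
        if (memcmp(parent[i].data, child[i].data, SHADER_IDENTIFIER_SIZE) != 0)
            return false;
    return true;
}
```

If the check fails, the child pipeline cannot be exposed as an incremental addition, since shader tables built against the parent would no longer be valid.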
The basic implementation of AddToStateObject() is to consider
the parent pipeline as a COLLECTION pipeline. This composes well and
avoids a lot of extra implementation cruft.
Also adds validation to ensure that COLLECTION global state matches with
other COLLECTION objects and the parent. We will also inherit global
state like root signatures, pipeline config, shader configs etc when
using AddToStateObject().
The tests pass on NVIDIA at least.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Docs explicitly specify that placed RTV / DSV resources must be properly
initialized before use, either on first use or after aliasing barriers,
so there should be no need to perform an initial layout transition.
Fixes spurious GPU hangs in Hitman III where the application aliases
an indirect buffer and a DSV. The DSV is cleared after the indirect
buffer is consumed, but the initial_layout_transition is triggered and
the HTILE init clobbers the buffer.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Also be a bit more uniform with using break/return on fail conditions.
Otherwise, the indirect command will read data from the count buffer
instead, which may lead to bugs or GPU hangs.
Signed-off-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
A transfer batch can clobber the graphics pipeline, e.g. for depth->color
copies. Hence, flushing the batches after applying the graphics pipeline
set by the app can cause correctness issues.
To prevent that, do the transfer batch flush first before we apply any
render-related states.
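
The ordering issue can be modelled with a toy state tracker. This is only a sketch: the struct, enum, and function names are hypothetical and do not correspond to the real command list code.

```c
#include <assert.h>
#include <stdbool.h>

enum bound_pipeline
{
    PIPELINE_NONE,
    PIPELINE_APP_GRAPHICS,
    PIPELINE_INTERNAL_COPY,
};

/* Hypothetical, heavily reduced command list state. */
struct command_list
{
    enum bound_pipeline bound;
    bool transfer_batch_pending;
};

/* Flushing may bind an internal pipeline, e.g. for a depth->color copy. */
static void flush_transfer_batch(struct command_list *list)
{
    if (!list->transfer_batch_pending)
        return;
    list->bound = PIPELINE_INTERNAL_COPY;
    list->transfer_batch_pending = false;
}

/* Binds the pipeline the application asked for. */
static void apply_graphics_state(struct command_list *list)
{
    list->bound = PIPELINE_APP_GRAPHICS;
}

/* Flush first, then apply render state, so the application's pipeline
 * is the one that is bound when the draw is recorded. */
static enum bound_pipeline prepare_draw(struct command_list *list)
{
    flush_transfer_batch(list);
    apply_graphics_state(list);
    assert(!list->transfer_batch_pending);
    return list->bound;
}
```

Calling the two helpers in the opposite order leaves the internal copy pipeline bound at draw time, which is the correctness issue described above.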
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>