This isn't known to fix any current bugs but it does prevent a
regression in a subsequent commit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 02bc4aabb48 ('nir/lower_io_to_vector: allow FS outputs to be vectorized')
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: 3844ed8d44 ("radv: Add adaptive_sync driconfig option and enable it by default.")
Fixes: e260493f2a ("radeonsi: Enable adaptive_sync by default for radeon")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
No option is supported yet, this is just the boilerplate.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
u_endian.h needs to be included, otherwise PIPE_ARCH_BIG_ENDIAN might not
be defined on big-endian architectures and the endian conversion macros
will be incorrect.
I don't think anything is broken because of this, I just noticed this when
looking at the file.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This bit redirects the state cache from the unified/RO sections of the
L3 cache to the "CS command buffer" section of the cache, which would
be set up via TCCNTLREG. The documentation says:
"Additionaly, this redirection should be enabled only if there is a
non-zero allocation for the CS command buffer section."
We don't allocate any cache to the CS command buffer section, so
enabling this redirection effectively disabled the state cache.
The Windows driver only sets up that section when using POSH, which
we do not currently use. So, leave it unallocated and disable the
redirection to get a functional state cache again.
Improves performance in Civilization VI by 18%, Manhattan 3.0 by 6%,
and Car Chase by 2%.
Jason pointed out that the caches likely refer to offsets from dynamic
and surface state base addresses, so when we change those, we need to
invalidate the caches.
Comment borrowed from src/intel/vulkan/genX_cmd_buffer.c.
The driver can't determine PIPE_QUERY_PRIMITIVES_GENERATED or
PIPE_QUERY_PRIMITIVES_EMITTED once we support geometry or
tessellation, since these stages add primitives at runtime. Use the
WRITE_PRIMITIVE_COUNTS event to write back the primitive counts and
implement a hw query for this.
Reviewed-by: Rob Clark <robdclark@gmail.com>
The GPU writes out streamout offsets as it goes to the FLUSH_BASE
pointer. We use that value with CP_MEM_TO_REG when appending to the
stream so that we don't have to track the offsets with the CPU in the
driver. This ensures that streamout continues to work once we enable
geometry and tessellation shader stages that add geometry.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Should fix some issues we're seeing. And use REALLOC instead of realloc.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This doesn't fix anything known but it's correct now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This has lower_io_to_vector try to turn variables into arrays of 4-sized
vectors when possible and fall back to the old approach when that isn't
possible.
This is so that lower_io_to_vector can guarantee that only one variable is
used for each fragment shader output.
v2: handle dual-source blending
v3: don't try to merge structs and non-32-bit types in get_flat_type()
v3: fix per-vertex inputs
v3: fix and cleanup location advancement in get_flat_type() and it's
calling code
v4: prioritize the original mode over the flat mode
v4: don't create flat variables to merge only one variable
v5: don't skip an entire slot when encountering structs in the old mode
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
v2: handle dual-source blending
v3: use a higher MAX_SLOTS
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
It shouldn't matter much because output varyings should have been
compacted during NIR shader linking but it mirrors what the driver
does when emitting NGG GS vertex parameters.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
If the fragment shader needs the layer index, we have to allocate
one more dword in the NGG GS storage. Found by inspection. This
doesn't fix anything known.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Sometimes LAVA jobs will timeout due to transient issues, and the Gitlab
job will fail in that case. Increase the timeouts to reduce the
likeliness of that happening and reduce false positives.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
So repositories don't need to be specially configured with a token to
access LAVA, store this token in a bind volume for a special runner.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Having two different structs is useless.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>