Commit Graph

166889 Commits

Author SHA1 Message Date
Timothy Arceri 2a2d85e254 nir/nir_opt_copy_prop_vars: reuse dynamic arrays
As per the previous commit if we don't reuse these dynamic arrays
we end up needlessly thrashing the memory handling functions.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>
2023-02-16 23:31:59 +00:00
Timothy Arceri ffe0f3fda1 nir/nir_opt_copy_prop_vars: reuse hash tables
Due to how this pass works we can end up thrashing memory if we
do not reuse these hash tables rather than reusing them.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>
2023-02-16 23:31:59 +00:00
Timothy Arceri 731e9fd535 nir/nir_opt_copy_prop_vars: avoid comparison explosion
Previously the pass was comparing every deref to every load/store
causing the pass to slow down more the larger the shader is.

Here we use a hash table so we can simple store everything needed
for comparision of a var separately.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>
2023-02-16 23:31:59 +00:00
Timothy Arceri 8f6f5730f6 nir/nir_opt_copy_prop_vars: remove extra loop
The fix in 947f7b452a introduced an extra loop over the copies
array to find the correct entry in the case it had been moved.

The problem is these loops can be iterated over millions of times
so lets simply update the entry pointer in the case we change its
location in the array.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>
2023-02-16 23:31:59 +00:00
Faith Ekstrand 4e09d37f3b nir/from_ssa: Move the loop bounds check in resolve_parallel_copy
We loop, effectively, over two stacks: ready and to_do and finish only
when both are empty.  In the case where ready is empty, we pull one off
of to_do, add a copy to a temporary, and push it onto the ready stack.
Previously, we assumed that we would never get to the temporary copy
case if to_do has exactly one entry because that would imply that there
was only one copy left which means there can't possibly be a cycle to
break.  This was true until c7fc44f9eb ("nir/from_ssa: Respect and
populate divergence information") which changed things such that
temporary copies sometimes get added in the case where a convergent
value is copied both to convergent and divergent destinations.

This patch adjusts our loop iteration to always attempt to clear the
ready stack before checking if there's anything left on the to_do stack.
I also added an assert to make the exit condition more clear.

Fixes: c7fc44f9eb ("nir/from_ssa: Respect and populate divergence information")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8037
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21315>
2023-02-16 20:23:42 +00:00
Faith Ekstrand 5afba073c6 nir/from_ssa: Only re-locate values that are destinations
There is an optimization in the parallel copy algorithm where, after a
copy has been performed, we can treat the destination as the new source
for future copies of the same source.  In particular, consider the
following parallel copy: A -> B, C -> A, A -> C.  In this case, after we
have done the A -> B copy, we can make note that the value in A is now
in B and emit the sequence: A -> B, C -> A, B -> C.  This allows us to
resolve the swap cycle between A anc C without allocating a temporary
register because we know B is also a copy of A.

When one of the registers involved is convergent and the other is
divergent, this optimization is problematic because, while convergent to
divergent copies are fine, we can't re-use the divergent copy in later
copies if any of those copies are to a convergent variable.  We could,
but it would require a read_first_invocation which would get messy.  In
In c7fc44f9eb ("nir/from_ssa: Respect and populate divergence
information"), we attempted to deal with this by limiting the rename
optimization to the case where the divergence matched.

The problem is that we did the re-name part whenever the divergence
matched but only marked it as ready if the thing being copied was a
destination.  (We actually left two instances of loc[a] = b, one which
always happened and one which only happened if we also wanted to flag
the source as being ready to use as a destination.)  While this
technically doesn't cause any problems, it may result in more inter-mov
dependencies which hurts instruction scheduling.  For example, if we had
the parallel copy A -> B, A -> C, A -> D, we now end up emitting the
sequence A -> B, B -> C, C -> D which has many more data hazards between
instructions caused by the constant shuffling.

This commit restores the original logic in which we only perform the
rename optimization if the rename would free up a register we will later
use as a destination.  This isn't entirely optimal as it still doesn't
prove that there is a cycle involved first, but it should lead to a
reduction in unnecessary dependencies.

No shader-db changes on SKL or DG2

Fixes: c7fc44f9eb ("nir/from_ssa: Respect and populate divergence information")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21315>
2023-02-16 20:23:42 +00:00
Rob Clark 9673502b3b freedreno/drm: Optimize stateobj re-emit
For long-lived stateobjs, it is common to re-emit to the same submit
multiple times.  By giving each submit a unique sequence # we can detect
this case and skip the extra append_bo().

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Rob Clark 6747d30155 freedreno: Add seqno helper
It is a pretty common pattern to allocate a non-zero sequence # for
lightweight checking if an object is the same, changed, for use in cache
keys, etc.  (And also pretty common to forget to handle the rollover
zero case.)  Add a helper for this.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Rob Clark 8f2b22ba66 freedreno: Drop batch lock
Now that we are not tracking cross-context batch dependencies, there is
no scenario where one context could trigger flushing another context's
batch.  So we can drop the batch lock intended to protect against this.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Rob Clark 9a6de00e98 freedreno/batch: Stop tracking cross-context deps
The app is expected to provide suitable cross-context synchronization
(fences, etc), so don't try to do it's job for them.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Rob Clark a4b949fe61 freedreno: Avoid taking screen lock
Avoid taking screen unlock for batch unref.  Instead just split the
destroy fxn into locked and unlocked variants.  That way we only end
up taking the screen lock on final unref but avoid it in the common
case.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Rob Clark 35fc1595b3 freedreno/a6xx: Pre-compute PROG related LRZ state
PROG state mostly just disables various LRZ related flags, which can
be handled as a simple mask.  The exception is ztest mode, which is
either overriden by PROG state, or we use the all 1's value (which
isn't valid from hw standpoint) to signal that it needs to be computed
at draw time, which fortunately fits in with the bitmask approach.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Rob Clark c938101bb5 freedreno: Move FD_MESA_DEBUG cases out of draw_vbo
If the debug options are enabled, just plug in a debug version of
draw_vbo with the additional checks.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Rob Clark 8942f4b734 freedreno: Move blend out of dirty-rsc tracking
This was not doing any actual resource tracking, just updating
gmem_reason.  And furthermore, a6xx+ doesn't care about the bits
it was setting.  So move this to per-gen backend for the gens that
need it, and avoid setting FD_DIRTY_RESOURCE when FD_DIRTY_BLEND
is set.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Rob Clark 67d4bc7be4 freedreno/a6xx: Remove tex-state refcnting
Now that we use a flag to trigger the tex state invalidation coming from
other contexts, we can drop the refcnt'ing.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Rob Clark cfd4721ee0 freedreno/drm: Make rb refcnt non-atomic
Now that the one special case where multiple threads could race to
ref/unref, we can go back to using non-atomic refcnts.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Rob Clark f91bcd2455 freedreno/a6xx: Do tex-state invalidates in same ctx
If a resource invalidate is triggered by a different ctx (potentially on
a different thread) simply flag that the tex state needs invalidation,
but defer handling it to the ctx that owns the tex state.

This will let us remove atomic refcnt'ing on the tex state, and more
importantly atomic refcnt'ing on the fd_ringbuffer (as this was the one
special case where rb's could be accessed from multiple threads).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Rob Clark e7993d68e2 freedreno/a6xx: Multi-draw support
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Rob Clark cc31997f1b freedreno/a6xx: Split out flush_streamout() helper
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Rob Clark 911d67bdad freedreno/a6xx: Drop unused return
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Rob Clark c4e2e821a2 freedreno: Push num_draws down to backend
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Rob Clark 6bfee9e669 freedreno: Account for multi-draw in num_draws
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
2023-02-16 19:57:13 +00:00
Daniel Schürmann f6251b21f9 radv/rt: don't hash maxPipelineRayRecursionDepth
The stack size has no effect on the generated shader anymore.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21159>
2023-02-16 19:37:25 +00:00
Daniel Schürmann 8e718c5b63 radv/rt: use dynamic_callable_stack_base also for static stack_sizes
This patch also removes rt_pipeline->dynamic_stack_size and replaces
it by checking for rt_pipeline->stack_size == -1u.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21159>
2023-02-16 19:37:25 +00:00
Daniel Schürmann 2649a1f272 radv/rt: introduce and set rt_pipeline->stack_size
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21159>
2023-02-16 19:37:25 +00:00
Daniel Schürmann b338d59047 radv: unconditionally enable scratch for RT shaders
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21159>
2023-02-16 19:37:25 +00:00
Daniel Schürmann aa362b4b6f radv: rename shader_info->cs.uses_sbt -> shader_info->cs.is_rt_shader
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21159>
2023-02-16 19:37:25 +00:00
Konstantin Seurer 72d9604db0 radv: Clean up dynamic RT stack allocation
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21159>
2023-02-16 19:37:25 +00:00
Sidney Just fc84c63e17 zink: Add missing features to the profile file
Fixes: 2ea481b2f0 ("Zink: add Zink profiles file")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20920>
2023-02-16 19:11:57 +00:00
Sidney Just 60e0322092 zink: add check for samplerMirrorClampToEdge Vulkan 1.2 feature
This adds a check to advertise PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE when either the extension is present or the Vulkan 1.2 feature is enabled.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20920>
2023-02-16 19:11:57 +00:00
Emma Anholt ed62eec58b hasvk: Fix SPIR-V warning about TF unsupported on gen7.
It's supported now.

Fixes: d82826ad44 ("anv: Implement VK_EXT_transform_feedback on Gen7")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21228>
2023-02-16 18:11:44 +00:00
Emma Anholt 98455470ea hasvk: Silence conformance warning in CI.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21228>
2023-02-16 18:11:44 +00:00
Emma Anholt 570acf5655 ci: Add a manual full and 1/10th hasvk CTS runs.
These are manual since they're on a runner in my basement that sometimes
can go down, but it'll be nice to have this for throwing the rare hasvk MR
at.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21228>
2023-02-16 18:11:44 +00:00
Danylo Piliaiev be976e0aa6 ci/tu: Add 1/200 pass to test for stale reg usage
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21226>
2023-02-16 17:43:10 +00:00
Danylo Piliaiev 86f82d4224 docs/freedreno: Add info about stale reg stomper dbg option
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21226>
2023-02-16 17:43:10 +00:00
Danylo Piliaiev a66d9c815d turnip: Add debug option to find usage of stale reg values
MESA_VK_ABORT_ON_DEVICE_LOSS=1 \
TU_DEBUG_STALE_REGS_RANGE=0x00000c00,0x0000be01 \
TU_DEBUG_STALE_REGS_FLAGS=cmdbuf,renderpass \
./app

To pinpoint the reg causing a failure reducing regs range could be
used for bisection. Some failures may be caused by multi-reg combination,
in such case set 'inverse' flag which would change the meaning of reg
range to "do not stomp these regs".

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21226>
2023-02-16 17:43:10 +00:00
Timur Kristóf 084d10a702 aco: Remove MTBUF zero operand.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21363>
2023-02-16 17:16:34 +00:00
Timur Kristóf afdacf4dcc aco: Don't set scalar offset on buffer load instructions when it's zero.
This helps generate slightly more optimal instructions.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21363>
2023-02-16 17:16:34 +00:00
José Roberto de Souza e050a00b9f intel/common: Move i915 files to i915 folder
Following the organization done in intel/dev and intel/vulkan.

Probably due to some rebase issue we had a duplicated copyright header
in intel_gem_i915.h that is being removed in here too.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21256>
2023-02-16 16:24:36 +00:00
Mike Blumenkrantz 41286f100e vl/dri3: avoid deadlocking when polling deleted windows for events
upcoming xserver releases will emit PresentConfigureNotify with this
flag set when a window is destroyed, ensuring drivers
don't poll infinitely and deadlock

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>

Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21339>
2023-02-16 15:55:47 +00:00
Mike Blumenkrantz 819cbf329a vulkan/wsi: avoid deadlocking dri3 when polling deleted windows for events
upcoming xserver releases will emit PresentConfigureNotify with this
flag set when a window is destroyed, ensuring drivers
don't poll infinitely and deadlock

fixes #6685

cc: mesa-stable

Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21339>
2023-02-16 15:55:47 +00:00
Mike Blumenkrantz 91de576a7f dri3: avoid deadlocking when polling deleted windows for events
upcoming xserver releases will emit PresentConfigureNotify with this
flag set when a window is destroyed, ensuring drivers
don't poll infinitely and deadlock

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/116

cc: mesa-stable

Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>

Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21339>
2023-02-16 15:55:47 +00:00
Timur Kristóf 4621ffdec1 aco: Get rid of redundant load_vmem_mubuf function.
Call emit_load directly from visit_load_buffer instead.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358>
2023-02-16 15:29:37 +00:00
Timur Kristóf 74f1b77046 radv: Move VS input lowering to new file: radv_nir_lower_vs_inputs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358>
2023-02-16 15:29:37 +00:00
Timur Kristóf 450e173de0 ac/llvm: Change ac_build_tbuffer_load to take format and channel type.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358>
2023-02-16 15:29:37 +00:00
Timur Kristóf 0ae778ca59 ac/llvm: Fix ac_build_buffer_load to work with more than 4 channels.
LLVM is unable to select instructions for num_channels > 4, so we
workaround that by manually splitting larger buffer loads.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358>
2023-02-16 15:29:37 +00:00
Timur Kristóf a2755fc203 ac/llvm: Fix buffer_load_amd with larger than 32-bit channel sizes.
LLVM is unable to select instructions for larger than 32-bit channel types.
Workaround by using i32 and casting to the correct type later.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358>
2023-02-16 15:29:37 +00:00
Timur Kristóf b5b0ded4c1 ac/llvm: Remove "structurized" argument and instead check vindex.
Change ac_build_buffer_load_common and ac_build_tbuffer_load so
the use structurized load when the vindex argument is not NULL.
Adjust callers to match the new behaviour.

This fixes the load_buffer_amd intrinsic with index source.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358>
2023-02-16 15:29:37 +00:00
Timur Kristóf 881c52ba19 ac: Port ACO's get_fetch_format to ac_get_safe_fetch_size.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358>
2023-02-16 15:29:36 +00:00
Timur Kristóf 2e9f5aadd0 nir: Clarify comment above load_buffer_amd.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358>
2023-02-16 15:29:36 +00:00