Not really direct perfetto support, but add a way that tracepoints can
be associated with a driver provided callback which can generate
perfetto events using the timestamps collected on the GPU.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9901>
Various places around mesa which might want to register a data-source,
etc, should call util_perfetto_init() first to ensure we connect to the
tracing service.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9901>
A shorter interval lets us have more granularity to see counter changes
per tile pass.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9901>
In radv_get_modifier_flags, we had a special case for multi-planar
formats. However ac_is_modifier_supported should already take care of
rejecting unsupported modifiers for multi-planar buffers.
Some time ago, ac_is_modifier_supported rejected any non-linear modifier
for multi-planar formats. 35e25ea1d0 ("ac/surface: allow non-DCC modifiers
for YUV on GFX9+") changed that to allow non-DCC modifiers with
multi-planar formats on GFX9+. Since then, the radv check has been out
of sync.
A similar patch was applied to radeonsi in 979e138695 ("radeonsi: stop
special-casing YUV formats in si_query_dmabuf_modifiers").
This fixes tiling artifacts with NV12 buffers.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10623>
We relied on mm_slab_new choosing the same bucket the caller used, just
pass it in.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8765>
It seems common for there to be holes.
fossil-db (GFX10.3, robustBufferAccess enabled):
Totals from 33791 (23.10% of 146267) affected shaders:
(no statistics changed)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>
This reverts commit a8a6b9fb2fdcb1bea55707fa0c2b8e96f03c6b5b.
This is no longer necessary now that we fixup the size when creating the
descriptors.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>
In the future, we might have vertex attribute loads from the same binding
but with different descriptors. Since they will be loading from the same
buffer, we should continue grouping them into clauses.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>
The condition for enabling it was incorrect, and was always false.
Therefore, it was never really enabled.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10683>
The GS copy shader is not used with NGG GS.
This fixes a big bug when NGG GS is running in Wave32 mode.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10683>
Fixes issues with upcoming CTS test testing empty structs.
v2: decorate with UNUSED as only used in assert (Timothy)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10681>
according to spec, this is supposed to handle fragment shader fetch
from previous draw output, not color output readback from previous
color output write
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10706>
this is much easier to read and is going to greatly simplify the eventual
multidraw implementation which will be dropped in
also it allows moving conditionals outside of loops to very slightly improve
drawoverhead performance (with multidraw)
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10701>
instead of manually creating the ivci, there's already a util function for
that which will handle everything
also only set layer info if there are multiple layers
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10703>
The current policy is to always favor accumulators if possible, however,
this is not always optimal.
Particularly, accumulators play a crucial role in enabling QPU instruction
merges, since these are limited to both the ADD and the ALU instructions
addressing at most 2 physical registers. For 2-src instructions, this means
that to be able to merge we need them to address at least 2 accumulators.
While favoring accumulators does help the case for instruction merges in
general, it is risky to assign accumulators to variables that have
long life spans. Doing so will make the accumulator unavailable for
any other instructions during that life span, and since we only have a few
accumulators, we can quickly run out and losing our capacity to merge
instructions for large parts of the qpu program.
On the other hand, we also want to avoid the extreme case were we keep
allocating physical registers to the point we run out, even if we have
accumulators available, since accumulators have additional restrictions
and may not be suitable for everything.
This change continues the policy of favoring accumulators, but it only
does so if the life span of the temps is short, to ensure that we can
recycle accumulators often across instructions and avoid running out
for sections of the QPU code, unless we are already running out of
physical registers.
total instructions in shared programs: 13654647 -> 13336921 (-2.33%)
instructions in affected programs: 11015919 -> 10698193 (-2.88%)
helped: 39758
HURT: 17325
Instructions are helped.
total threads in shared programs: 412046 -> 412038 (<.01%)
threads in affected programs: 16 -> 8 (-50.00%)
helped: 0
HURT: 4
Threads are HURT.
total uniforms in shared programs: 3745726 -> 3746003 (<.01%)
uniforms in affected programs: 17296 -> 17573 (1.60%)
helped: 76
HURT: 99
Uniforms are HURT.
total max-temps in shared programs: 2364430 -> 2359942 (-0.19%)
max-temps in affected programs: 109117 -> 104629 (-4.11%)
helped: 2893
HURT: 772
Max-temps are helped.
total spills in shared programs: 5727 -> 5746 (0.33%)
spills in affected programs: 221 -> 240 (8.60%)
helped: 1
HURT: 2
total fills in shared programs: 13121 -> 13139 (0.14%)
fills in affected programs: 466 -> 484 (3.86%)
helped: 1
HURT: 2
total sfu-stalls in shared programs: 33432 -> 34491 (3.17%)
sfu-stalls in affected programs: 18219 -> 19278 (5.81%)
helped: 4459
HURT: 5087
Inconclusive result
total inst-and-stalls in shared programs: 13688079 -> 13371412 (-2.31%)
inst-and-stalls in affected programs: 11030017 -> 10713350 (-2.87%)
helped: 39630
HURT: 17429
Inst-and-stalls are helped.
total nops in shared programs: 335753 -> 333708 (-0.61%)
nops in affected programs: 112659 -> 110614 (-1.82%)
helped: 8726
HURT: 7383
Inconclusive result
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10686>