We need more GS/NGG bits, so we need to add current_gs_state for that.
This simplifies the logic in the draw code.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16885>
When NGG is active, the GS invocation counter is always incremented, even
if there's no explicit GS.
Implementing the counter manually fixes it:
* in emit_gs_epilogue for the legacy path
* in gfx10_ngg_gs_emit_prologue for the ngg path
This fixes piglit's arb_query_buffer_object-qbo test.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15861>
To support PIPE_STAT_QUERY_GS_INVOCATIONS and PIPE_STAT_QUERY_GS_PRIMITIVES
being used at the same time we have to reuse the same buffer.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15861>
Statistics only work in non-NGG mode. If screen->use_ngg is true, we can't
know if the draw will actually use NGG or not, so this commit switch
to a shader based implementation of this counter.
To avoid modifying si_query, the shader implementation behaves like the hw
one: it uses the same buffer size and offset.
The emulation path activation in the shader is controlled by vs_state_bit[31].
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15861>
Currently this just has wait, but in order to get the right answer
for vulkan partial, lavapipe/llvmpipe need to pass a partial flag
through here in the future.
This just changes the API so that's possible.
v2: use an enum (zmike)
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15009>
Performance counters will be used by RADV for VK_KHR_performance_query
and also for adding SPM support.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11140>
This will allow removing the winsys pointer from buffers.
The amdgpu winsys adds dummy_ws to get radeon_winsys because there can be
no radeon_winsys around (e.g. while amdgpu_winsys is being destroyed), but
we still need some way to call buffer functions.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9809>
DCC/CMASK/HTILE clears will not set this. We could do a better job
at not setting this in other cases too
Image copies also don't set this.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9795>
Slabs always allocate the next power of two size from their pools. This
wastes memory if the size is not a power of two.
bo->base.size is overwritten because the default is the allocated power of
two size, but we need the real size to compute the wasted size in
amdgpu_bo_slab_destroy. entry_size is added to the hole in pb_slab_entry
to hold the real entry size.
Like other memory stats, no atomics are used.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8683>
We often do this:
pipe->set_constant_buffer(pipe, shader, slot, &cb);
pipe_resource_reference(&cb->buffer, NULL);
That results in atomic increment in set_constant_buffer followed by
atomic decrement after set_constant_buffer. This new interface
eliminates those atomics.
For the case above, this should be used instead:
pipe->set_constant_buffer(pipe, shader, slot, true, &cb);
cb->buffer = NULL; // if cb is not a local variable, else do nothing
AMD Zen benefits from this. The perf improvement is ~3% for Viewperf13/Catia.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
This decreases the release libgallium_dri.so size without debug symbols
by 16384 bytes. The CPU time spent in si_emit_draw_packets decreased
from 4.5% to 4.1% in viewperf13/catia/plane01.
The previous code did:
cs->current.buf[cs->current.cdw++] = ...;
cs->current.buf[cs->current.cdw++] = ...;
cs->current.buf[cs->current.cdw++] = ...;
cs->current.buf[cs->current.cdw++] = ...;
The new code does:
unsigned num = cs->current.cdw;
uint32_t *buf = cs->current.buf;
buf[num++] = ...;
buf[num++] = ...;
buf[num++] = ...;
buf[num++] = ...;
cs->current.cdw = num;
The code is the same (radeon_emit is redefined as a macro) except that
all set and emit functions must be surrounded by radeon_begin(cs) and
radeon_end().
radeon_packets_added() returns whether there has been any new packets added
since radeon_begin.
radeon_end_update_context_roll(sctx) sets sctx->context_roll = true
if there has been any new packets added since radeon_begin.
For now, the "cs" parameter is intentionally unused in radeon_emit and
radeon_emit_array.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8653>
Replace mesa's slightly different container_of() with one more aligned
to the linux kernel's version which takes a type as the 2nd param. This
avoids warnings like:
freedreno_context.c:396:44: warning: variable 'batch' is uninitialized when used within its own initialization [-Wuninitialized]
At the same time, we can add additional build-time type-checking asserts
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7941>
There are many issues with SDMA across many generations of hardware.
A recent example is that gfx10.3 suffers from random GPU hangs if
userspace uses SDMA.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7908>
It's straightforward except that the amdgpu winsys had to be cleaned up
to allow this.
radeon_cmdbuf is inlined and optionally the winsys can save the pointer
to it. radeon_cmdbuf::priv points to the winsys cs structure.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7907>
This is a relatively minimal change to adjust all the gallium interfaces
to use bool instead of boolean. I tried to avoid making unrelated
changes inside of drivers to flip boolean -> bool to reduce the risk of
regressions (the compiler will much more easily allow "dirty" values
inside a char-based boolean than a C99 _Bool).
This has been build-tested on amd64 with:
Gallium drivers: nouveau r300 r600 radeonsi freedreno swrast etnaviv v3d
vc4 i915 svga virgl swr panfrost iris lima kmsro
Gallium st: mesa xa xvmc xvmc vdpau va
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>