A step towards getting rid of checks for gpu_id sprinkled around.
Checking major generation is ok, but checking for == or >= a specific
gpu_id is going to start getting messy as we add more a6xx.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11790>
This way we can make the tables const. At the same time, for a6xx, this
introduces a "sub-generation template" to reduce the copy/paste for
parameters which are keyed to the sub-generation. It also explicitly
lists every supported GPU, to get rid of duplicate lists of supported
gpus between the device-info and drivers.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11790>
Replace it with a calculation which works for all current GPUs.
Duplicated the calculation in both drivers because freedreno_dev_info isn't
meant for derived parameters (and drivers might want to just calculate on
the fly instead).
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11790>
The larger page alignment is directly related to the 96 tile alignment,
so check for that instead of a specific gpu id.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11790>
- It hurts users with newer firmware who don't need the workaround
- Kernel now rejects older firmware due to security issues, so this will
prevent users from using older firmware anyway.
- Only whitelisting 650 enables the workaround by default for any new GPUs
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11790>
The result is about +5-ish fps in Doom Eternal.
It turns out that the location of position exports matters more
than we thought, and it's actually better to keep them at the bottom
for culling shaders rather than schedule it up to the top.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
This makes culling shaders more efficient because they split the
shader in two parts. It is better to optimize before this split
happens.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
Currently we don't enable it on any chip by default, but
we plan to enable it soon on GFX10.3 when we are comfortable
with its performance.
RADV_PERFTEST=nggc environment variable enables it on GFX10+ GPUs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
These are very straightforward as they just copy data from
the newly added shader arguments.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
Add new shader arguments in RADV for:
- NGG culling settings
- Viewport transform
These will be used by NGG culling shaders.
Additionally, some tweaks are made to some config registers
in order to make culling shaders more efficient.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
We need to emit viewport transform information for culling shaders.
This is used for small primitive culling.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
Culling is traditionally done by the rasterizer, but that
can be a bottleneck when an app creates a large number
of primitives. Eg. a lot of tiny triangles reduce the
rasterziation efficiency.
NGG makes it possible for the shader to check primitives
and delete those that it can prove are not needed.
After this is done, we have to repack the surviving invocations
so they remain compact. This also saves bandwidth, because
some memory loads are only executed by those vertices that
survived the culling.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
The algorithms were originally implemented by Marek Olšák,
hence the copyright to AMD.
This commit just ports the LLVM based implementation to NIR,
using the new intrinsics added earlier.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
The new intrinsics fall into the following categories:
1. New viewport intrinsics:
For missing components that we need.
RADV will emit new SGPR arguments which will contain the
viewport information for culling shaders. These are used to
compute the screen space coordinates for small primitive culling.
2. load_cull_xxx:
Load the culling settings in runtime.
These will be a new SGPR argument in RADV.
3. overwrite_xxx:
These are needed because system values such as vertex and
instance ID are not writeable, but we need to change them
after repacking shader invocations of VS and TES.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
No need to include the same BO multiple times in the long-lived ringbuffer
object's list of relocs to be added to the submit.
Improves non-TC drawoverhead -test 9 (8 tex updates) throughput by 1.4901%
+/- 0.8705% (n=20)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11697>
On drawoverhead -test 9 (8 texture changes), this saves us 172kb of
memory. That's only ~1% of the GEM memory while the test is running, but
more importantly it saves us 29% of the gem BO allocations.
non-TC drawoverhead -test 9 (8 texture change) throughput 0.449019% +/-
0.336296% (n=100), but this gets better as we get better suballocation
density.
Note that this means that all fd_ringbuffer_new_object calls can now
return data aligned to 64 bytes, instead of 4k. We may find that we need
to increase it if some of our objects (tex consts, sampler consts, etc.)
require more alignment than that. But, this may help non-drawoverhead
perf if any of our RB objects have a cache in front of them (indirect
consts?) and we don't have most of our data in the same cache set any
more.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11697>
BO resources imported from a handle may have an offset provided, which
reduces the available size within the BO. Take this in account when
checking that the size is sufficient in lima.
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11076>
Enable radeonsi_no_infinite_interp for Teardown to fix hangs.
Based on comments from #3714.
Tested-by: Joshua Ashton <joshua@froggi.es>
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Acked-by: Martin Peres <martin.peres@mupuf.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11814>
This migration was done with libclang-based automatic tooling, which
performed these replacements:
* Operand(uint8_t) -> Operand::c8
* Operand(uint16_t) -> Operand::c16
* Operand(uint32_t, false) -> Operand::c32
* Operand(uint32_t, bool) -> Operand::c32_or_c64
* Operand(uint64_t) -> Operand::c64
* Operand(0) -> Operand::zero(num_bytes)
Casts that were previously used for constructor selection have automatically
been removed (e.g. Operand((uint16_t)1) -> Operand::c16(1)).
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>
Force the option rather than relying on autodetection -- ARM runners were
apparently finding the necessary deps, but the x86 rootfs (radeonsi, iris)
and x86_test-gl container (i915g) were not.
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11834>
these are the only two defines referenced, so they can be defined to 0
for platforms that don't support modifiers in order to remove a ton of
ifdefs and make the code more readable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11847>
Emit all the state layout config (such as push-const CONSTLEN) first,
before emitting anything that depends on that state. This fixes an
issue that was showing up when FLUT is enabled in ir3 (which results
in higher probability of not having any immediats lowered to push-
consts).
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8705>
these are the only frontends which may be used by gallium drivers in ci,
so stop triggering all driver jobs when other frontends are changed since
those changes can never affect ci
<MrCooper> Not that simple unfortunately. E.g. the llvmpipe-piglit-cl job hits
src/gallium/frontends/clover & possibly src/gallium/targets/opencl,
many jobs hit src/gallium/{frontends,targets}/dri and probably
src/gallium/targets/pipe-loader, lavapipe jobs hit src/gallium/{frontends,targets}/lavapipe.
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11832>