This add support for the tessellation i/o callbacks.
Tessellation requires another level of indirect indexing,
and allows fetches from shader outputs.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3841>
This adds the same tessellator code that swr uses, swr should
move to using this copy, unfortunately that wasn't trivial on my first
look.
The p_tessellator wrapper wraps it in a form that is a useful interface
for draw.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3841>
This is pretty basic (and a bit crappy at the moment). I think we
might want some sort of overlay in the future and also be able to
trigger captures with F12 or whatever.
To record a capture, set RADV_THREAD_TRACE to something greater than
zero (eg. RADV_THREAD_TRACE=100 will capture frame #100). If the
driver didn't crash (or the GPU didn't hang), the capture file
should be stored in /tmp.
To open that capture, use Radeon GPU Profiler and enjoy your
profiling times with RADV! \o/
Note that thread trace support is quite experimental, only GFX9 is
supported at the moment, and a bunch of useful stuff are still missing
(shader ISA, pipelines info, etc). More is comming soon.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
SQTT is also a file format (.rgp extension) that can be consumed
by Radeon GPU Profiler for profiling purposes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
Thread trace markers (also called events in Radeon GPU Profiler)
should be emitted after every draw/dispatch calls to collect data.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
SQTT is a hardware block that collects thread trace data (like
wave occupancy, timings, etc) for every draw/dispatch calls.
It's only supported on GFX9 at the moment but I will add other
generations support soon.
This is the first step towards profiling with RADV!
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
I want to make sure that I don't introduce warnings in turnip where we
have active work going on, and I also want to make sure that the drivers
we care about testing are warnings-clean.
As with the previous -Werror change, this is for CI only and doesn't
affect end-user builds.
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3607>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3607>
The previous instance of this comparision was 1u to avoid the warning, fix
this one too.
Fixes: dba71de5c6 ("aco: only create parallelcopy to restore exec at loop exit if needed")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3607>
let the driver skip or submit an empty draw call.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3990>
They are never used by multi draws and internal draws.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3990>
Midgard has the ability to calculate addresses as part of the load/store
pipeline. We'd like to make use of this to avoid doing this work on the
ALU pipes. To do so, when emitting globals/SSBOs/shareds, we walk the
tree looking for address arithmetic to try to parse out something the
hardware can work with, letting the original instructions be DCE'd
ideally. This analysis is done at the NIR level to properly account for
some messy details of vectorization which we'd rather not poke at the
backend level. (Originally I wrote this as a MIR pass but I'm fairly
sure it was wrong.)
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3978>
The swizzles are as-if they were 32-bit regardless of the bitness of the
operation, but the source sizes can and do change depending on the
flags. Account for this in the analysis.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3978>
Many load/store ops (used for globals, SSBOs, shared memory, etc) have
the ability to compute addresses directly. Mark off which ones behave
like this.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3978>
When mixing 32/64-bit, we need to align the 32-bit registers to get the
required alignment. This isn't quite enough yet, though, since user
swizzles could bypass and will need to be lowered to 32-bit moves
(outstanding todo).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3978>
I'm working on moving the db410c CI from docker to LAVA, which means we
get to boot a custom kernel. To do that, we need to enable ARCH_QCOM in
the kernel, save the dtb around, and include abootimg in our container so
that we can generate combined kernel/dtb/ramdisk images for fastboot.
LAVA's fastboot support is unable to pack the overlay into an abootimg
image, just a cpio rootfs. We could flash the cpio rootfs after overlay
addition, but that takes 2 minutes to do, and causes wear on the devices.
Instead, we'll bring up the network at boot and use wget to fetch the
overlay. We'll want network support anyway, so that we can transfer the
failure xmls back to the gitlab job's artifacts at some point.
Since the msm GPU and realtek network firmware increase our payload by
3MB, add in firmware compression so that it doesn't waste as much RAM on
devices not using it.
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3928>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3928>
No sense building some of these subsystems for just testing the GPU.
Saves some container rebuild time and kernel contents to move around.
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3928>