Test Zink on Venus on Lavapipe.
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27790>
Moves the lowering of VGRFs into FIXED_GRFs from the code generation
to (almost) right after the register allocation.
This will allow: (1) later passes not worry about VGRFs (and what they
mean in a post reg alloc phase) and (2) make easier to add certain
types of validation post reg alloc phase using the backend IR.
Note that a couple of passes still take advantage of seeing "allocated
VGRFs", so perform lowering after they run.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28604>
This commit adds an implementation of removesuffix so we don't
need the 'str' one which was added in 3.9.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28831>
Bspec 57023: RENDER_SURFACE_STATE::Shader Channel Select Red
"For channels not present in the surface format, the corresponding
Surface Channel Select is either SCS_ZERO or SCS_ONE."
This restriction applies to alpha channel as well if an associated
resource is not used as a render target.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28791>
Bspec 57023: RENDER_SURFACE_STATE:: Shader Channel Select Red
"Render Target messages do not support swapping of colors with
alpha. The Red, Green, or Blue Shader Channel Selects do not
support SCS_ALPHA. The Shader Channel Select Alpha does not support
SCS_RED, SCS_GREEN, or SCS_BLUE."
Cc: mesa-stable
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28791>
UVD and VCE are separated engines, and not co-exist with VCNs
Fixes: c34cfc1a3b (ac/gpu_info: update multimedia info)
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28863>
Use predication instead of branching whenever possible and profitable:
all divergent leaf branches are replaced with predication. Non-divergent
branches are kept since for those a branch might be more performant when
it jumps over all instructions. Although it might be possible to support
a limited form of nested predication, this is more difficult to
implement so we only support leaf branches for now.
When translating from NIR to ir3, predication is emitted just like
normal branches except that the branch is replaced with pred[tf] and the
opposite (pred[ft]) is inserted at the end of the then-block. This
pattern is then recognized during legalization at which point the
closing prede is inserted. We don't insert this right away to allow
opt_jump to optimize jumps out of the else-block. Since the branches we
support for predication always have exactly one block in each arm, the
then-block is emitted first, and blocks are never reordered, this way of
emitting predicated branches ensures they have the correct memory
layout.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27982>
To support predt/predf which always read from p0.x, we need to support
precolored sources for the predicates RA.
This patch implements this as follows: whenever a precolored source is
encountered whose def isn't live in the correct register, reload it into
the correct one. To make sure we don't reload too often, two precautions
are made. First, we precolor all defs of precolored sources and try do
use that register when allocating one for a def. Second, since currently
only p0.x is used for precoloring, we try not to allocate it whenever
there are outstanding precolored defs.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27982>
We used to model predt/predf as taking a predicate register source. The
blob disassembler shows them taking a label argument. However, it seems
that both are incorrect: the condition is always taken from p0.x and I
have not been able to construct a test case were the label makes any
difference.
This patch changes predt/predf to not take any arguments and adds
documentation about how predicated execution works.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27982>
split-debug uses C args `--gsplit-dwarf` and linker args `--gdb-index`
to achieve split debug, speed up the CI linking, and allow us to
distribute debug symbols standalone.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28576>
Set the proper bit when adding clipdist load/store.
It also sets the variable name to match with the CLIPDISTn created.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28798>
Following the previous patch and allocating just the number of
iris_bucket_cache that will be used by giving platform.
While at it also adding util_vma_heap_finish() call in the
iris_bufmgr_create() error path.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28864>
It was allocating slabs and cache buckets data structs of lmem heaps
but those will never be used in integrated gpus, so lets avoid waste
cpu time and memory with those.
This will also remove slabs and cache buckets for
IRIS_HEAP_DEVICE_LOCAL_CPU_VISIBLE_SMALL_BAR for
discrete GPUs in systems with resizeble bar enabled.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28864>
Thanks to transform feedback, we don't know which varying components
will be used when compiling the FS. The VS could use additional
components for xfb, and packing the inlocs per-component would result in
overlapping varyings. In order to do this properly, we'd need to create
a variant for the FS when used with xfb.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28626>
I think this isn't necessary, and when we disable packing inlocs we will
start actually using the compmask computed here tests like
KHR-Single-GL46.enhanced_layouts.varying_components on zink will fail
unless we add the extra unused components at the beginning.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28626>
sim_state.perfcnt_total provides the total number of counters
supported by the underlying simulated platform and is what we
use when we create a perform to validate that the counters
requested are valid, so we should use this.
V3D_PERFCNT_NUM is a fixed enum value that is only valid for
V3D 4.2 at present and is not sufficiently large for all the
counters available in V3D 7.x.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28870>
Check how much smaller can the weight+bias buffers be with different
amount of bits to encode runs of zeroes and choose the smallest one.
This reduces the bandwidth considerably, which is at present the
bottleneck with useful models.
On a Libre Computer Alta AML-A311D-CC, I see these improvements:
MobileNetV1: 15.650ms -> 9.991ms
SSDLite MobileDet: 56.149ms -> 32.692ms
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27513>
This way, applications can get to know about performance issues when
they happen, using the debug callback mechanism.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28693>
Now that we only call one of these, the other one is superfluous. So
let's combine them and use the shorter name for the result.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28693>
This allows us to use perf_debug_ctx() instead of perf_debug(), which
will help make things a bit cleaner down the line.
In order to do this, we also need to make sure we always have access to
the context, so let's also pass ctx to panfrost_should_linear_convert
while we're at it.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28693>
All of the NPU related DRM_ETNAVIV_GET_PARAM values, which got introduced in
6.9-rc1 of the kernel got removed before the 6.9 release. Clean-up our code base.
NPU support _NEEDS_ hwdb support and a recent stable kernel.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28837>