Commit Graph

142716 Commits

Author SHA1 Message Date
Timur Kristóf fc1fabbabf ac/nir: Analyze culling shaders to remember which inputs are used when.
These will be useful for some optimizations.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
2021-07-13 23:56:33 +00:00
Timur Kristóf faf766b864 ac/nir: Reuse the repacked output positions of culling shaders.
The position outputs are stored into LDS and reloaded after
repacking, therefore the repacked position values can be
reused in the bottom part of the shader.

Fossil DB results on Sienna Cichlid (with NGG culling on):

Totals from 9016 (7.01% of 128647) affected shaders:
VGPRs: 372472 -> 347560 (-6.69%); split: -6.82%, +0.13%
SpillSGPRs: 437 -> 87 (-80.09%)
CodeSize: 32359340 -> 30441692 (-5.93%); split: -5.93%, +0.00%
MaxWaves: 222030 -> 238970 (+7.63%); split: +7.83%, -0.20%
Instrs: 6207833 -> 5834149 (-6.02%); split: -6.02%, +0.00%
Latency: 27626263 -> 27890632 (+0.96%); split: -5.34%, +6.29%
InvThroughput: 4792958 -> 4361336 (-9.01%); split: -9.01%, +0.00%
VClause: 144385 -> 139586 (-3.32%); split: -9.29%, +5.97%
SClause: 141350 -> 129875 (-8.12%); split: -8.57%, +0.45%
Copies: 580017 -> 568916 (-1.91%); split: -3.60%, +1.68%
Branches: 209067 -> 209154 (+0.04%); split: -0.24%, +0.28%
PreSGPRs: 281320 -> 277814 (-1.25%)
PreVGPRs: 290040 -> 273861 (-5.58%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
2021-07-13 23:56:33 +00:00
Timur Kristóf d18920e03a radv: Run algebraic optimizations before NGG lowering.
This makes culling shaders more efficient because they split the
shader in two parts. It is better to optimize before this split
happens.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
2021-07-13 23:56:33 +00:00
Timur Kristóf f30e4351de radv: Support NGG culling with new perftest environment variable.
Currently we don't enable it on any chip by default, but
we plan to enable it soon on GFX10.3 when we are comfortable
with its performance.

RADV_PERFTEST=nggc environment variable enables it on GFX10+ GPUs.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
2021-07-13 23:56:33 +00:00
Timur Kristóf 182d9b1e60 aco: Implement NGG culling related intrinsics.
These are very straightforward as they just copy data from
the newly added shader arguments.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
2021-07-13 23:56:33 +00:00
Timur Kristóf 9a95f5487f radv: New shader args for NGG culling settings and viewport.
Add new shader arguments in RADV for:
- NGG culling settings
- Viewport transform

These will be used by NGG culling shaders.

Additionally, some tweaks are made to some config registers
in order to make culling shaders more efficient.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
2021-07-13 23:56:33 +00:00
Timur Kristóf ed163a44b6 radv: Expose radv_get_viewport_xform in radv_private.h
We need to emit viewport transform information for culling shaders.
This is used for small primitive culling.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
2021-07-13 23:56:33 +00:00
Timur Kristóf e97f0463a8 ac/nir: Implement NGG deferred attribute culling in NIR.
Culling is traditionally done by the rasterizer, but that
can be a bottleneck when an app creates a large number
of primitives. E.g., a lot of tiny triangles reduce the
rasterization efficiency.

NGG makes it possible for the shader to check primitives
and delete those that it can prove are not needed.

After this is done, we have to repack the surviving invocations
so they remain compact. This also saves bandwidth, because
some memory loads are only executed by those vertices that
survived the culling.
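
For illustration only (this is a CPU model, not the NIR code from this
commit): repacking can be sketched as a prefix count over a survivor
mask, where each surviving invocation moves to the index equal to the
number of survivors below it. The 64-lane mask and the names
count_bits, repacked_index and repack are assumptions made for the
sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Counts the set bits in `v`.
inline unsigned count_bits(uint64_t v)
{
    unsigned n = 0;
    for (; v != 0; v &= v - 1)  // clears the lowest set bit each step
        ++n;
    return n;
}

// Hypothetical model: bit i of `survivors` is set when invocation i
// survived culling. Returns the compacted index of `lane`; the result
// is only meaningful when `lane` itself survived.
inline unsigned repacked_index(uint64_t survivors, unsigned lane)
{
    assert(lane < 64);
    // Mask of all lanes strictly below `lane`.
    const uint64_t below = (lane == 0) ? 0 : (~uint64_t{0} >> (64 - lane));
    return count_bits(survivors & below);
}

// Returns the original lane ids in their repacked (compact) order.
inline std::vector<unsigned> repack(uint64_t survivors)
{
    std::vector<unsigned> packed(count_bits(survivors));
    for (unsigned lane = 0; lane < 64; ++lane) {
        if (survivors & (uint64_t{1} << lane))
            packed[repacked_index(survivors, lane)] = lane;
    }
    return packed;
}
```

With survivors 0b101101 (lanes 0, 2, 3 and 5), lane 3 moves to index 2
and the packed order is 0, 2, 3, 5.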

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
2021-07-13 23:56:33 +00:00
Timur Kristóf 556a690bac ac/nir: Use a ballot that matches the wave size during NGG lowering.
This generates slightly more efficient code in Wave32 mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
2021-07-13 23:56:33 +00:00
Timur Kristóf 651a3da1b5 ac/nir: Add a NIR port of ac_llvm_cull.
The algorithms were originally implemented by Marek Olšák,
hence the copyright to AMD.

This commit just ports the LLVM based implementation to NIR,
using the new intrinsics added earlier.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
2021-07-13 23:56:33 +00:00
Timur Kristóf 48e638ab29 nir: Add AMD specific intrinsics for NGG shader based culling.
The new intrinsics fall into the following categories:

1. New viewport intrinsics:
For missing components that we need.
RADV will emit new SGPR arguments which will contain the
viewport information for culling shaders. These are used to
compute the screen space coordinates for small primitive culling.

2. load_cull_xxx:
Load the culling settings at runtime.
These will be a new SGPR argument in RADV.

3. overwrite_xxx:
These are needed because system values such as vertex and
instance ID are not writeable, but we need to change them
after repacking shader invocations of VS and TES.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
2021-07-13 23:56:33 +00:00
Emma Anholt c071187dbb ci: Enable testing of i915g in the debian -Werror release build.
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11852>
2021-07-13 22:30:45 +00:00
Emma Anholt 3504bccb7c i915g: Fix release build compiler warnings.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11852>
2021-07-13 22:30:45 +00:00
Emma Anholt 10d8e123c5 freedreno: Optimize duplicate obj-obj ring relocs.
No need to include the same BO multiple times in the long-lived ringbuffer
object's list of relocs to be added to the submit.
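
A minimal sketch of the idea, assuming a plain list of BO handles per
ring object. RingObject and add_reloc_bo are hypothetical stand-ins,
not the freedreno types or functions:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a long-lived ringbuffer object.
struct RingObject {
    std::vector<uint32_t> reloc_bos;  // BO handles to add to the submit
};

// Records `bo_handle` at most once; returns true if it was newly added.
inline bool add_reloc_bo(RingObject &ring, uint32_t bo_handle)
{
    auto &bos = ring.reloc_bos;
    if (std::find(bos.begin(), bos.end(), bo_handle) != bos.end())
        return false;  // already tracked, nothing to add
    bos.push_back(bo_handle);
    return true;
}
```

The linear search is an assumption that keeps the sketch short; it says
nothing about how the driver looks up duplicates.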

Improves non-TC drawoverhead -test 9 (8 tex updates) throughput by 1.4901%
+/- 0.8705% (n=20)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11697>
2021-07-13 22:12:56 +00:00
Emma Anholt 5c3ca9cb81 freedreno/a6xx: Allocate just enough memory for SO state, only if we do SO.
Continuing to improve our suballocation packing.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11697>
2021-07-13 22:12:56 +00:00
Emma Anholt 599443febc freedreno/a6xx: Reduce the size of the long-lived texture stateobj.
It's just a few commands to upload the sampler/texconst data.  Improves
the efficiency of suballocation.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11697>
2021-07-13 22:12:56 +00:00
Emma Anholt b53e8831bb freedreno/a6xx: Reduce the size of the config stateobj allocation.
Improves the efficiency of suballocation.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11697>
2021-07-13 22:12:56 +00:00
Emma Anholt 737d4caa83 freedreno: Suballocate our long-lived ring objects.
On drawoverhead -test 9 (8 texture changes), this saves us 172kb of
memory.  That's only ~1% of the GEM memory while the test is running, but
more importantly it saves us 29% of the gem BO allocations.

non-TC drawoverhead -test 9 (8 texture change) throughput 0.449019% +/-
0.336296% (n=100), but this gets better as we get better suballocation
density.

Note that this means that all fd_ringbuffer_new_object calls can now
return data aligned to 64 bytes, instead of 4k.  We may find that we need
to increase it if some of our objects (tex consts, sampler consts, etc.)
require more alignment than that.  But, this may help non-drawoverhead
perf if any of our RB objects have a cache in front of them (indirect
consts?) and we don't have most of our data in the same cache set any
more.
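
To make the alignment point concrete, here is a sketch of a bump
suballocator that hands out 64-byte-aligned offsets from one larger
buffer. SubAllocator, align_up, suballoc and kNoSpace are hypothetical
names, and the real allocation strategy may differ:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bump suballocator over one large backing buffer.
struct SubAllocator {
    std::size_t capacity;  // size of the backing buffer in bytes
    std::size_t next;      // first free byte
};

constexpr std::size_t kNoSpace = static_cast<std::size_t>(-1);

// Rounds `value` up to a multiple of `alignment` (a power of two).
inline std::size_t align_up(std::size_t value, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

// Returns the offset of a new `size`-byte object, or kNoSpace if the
// backing buffer cannot hold it.
inline std::size_t suballoc(SubAllocator &sa, std::size_t size,
                            std::size_t alignment = 64)
{
    const std::size_t offset = align_up(sa.next, alignment);
    if (offset > sa.capacity || size > sa.capacity - offset)
        return kNoSpace;
    sa.next = offset + size;
    return offset;
}
```

Two 100-byte objects land at offsets 0 and 128 with 64-byte alignment,
while 4k alignment would place the second one at offset 4096.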

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11697>
2021-07-13 22:12:56 +00:00
Paul Kocialkowski eefd93c176 lima: Take offset into account when checking BO size
BO resources imported from a handle may have an offset provided, which
reduces the available size within the BO. Take this into account when
checking that the size is sufficient in lima.
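
The check can be sketched as follows. bo_range_fits is a hypothetical
helper written for this note, not the function changed in lima:

```cpp
#include <cstdint>

// Returns true when `needed` bytes fit in a buffer object of `bo_size`
// bytes starting at `offset`. The subtraction form avoids the unsigned
// overflow that `offset + needed` could hit.
inline bool bo_range_fits(uint64_t bo_size, uint64_t offset, uint64_t needed)
{
    if (offset > bo_size)
        return false;
    return needed <= bo_size - offset;
}
```

A 4096-byte BO imported with a 1024-byte offset can hold at most 3072
bytes, so a 4096-byte resource that would pass a size-only check is
rejected here.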

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11076>
2021-07-13 21:26:21 +00:00
Joshua Ashton c880bdeb40 driconf: Add more workarounds for Teardown
Enable radeonsi_no_infinite_interp for Teardown to fix hangs.

Based on comments from #3714.

Tested-by: Joshua Ashton <joshua@froggi.es>
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Acked-by: Martin Peres <martin.peres@mupuf.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11814>
2021-07-13 21:02:06 +00:00
Simon Zeni c8ed5ac206 anv: Implement VK_EXT_acquire_drm_display
Signed-off-by: Simon Zeni <simon@bl4ckb0ne.ca>
Reviewed-by: Simon Ser <contact@emersion.fr>
Tested-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11735>
2021-07-13 20:50:32 +00:00
Tony Wasserka f438cbc23e aco: Remove deprecated Operand constructors
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>
2021-07-13 17:43:26 +00:00
Tony Wasserka cfd866ed42 aco: Clean up unneeded literal casts
These were only needed to select the appropriate Operand constructor before.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>
2021-07-13 17:43:26 +00:00
Tony Wasserka 66e51dc474 aco: Remove use of deprecated Operand constructors
This migration was done with libclang-based automatic tooling, which
performed these replacements:
* Operand(uint8_t) -> Operand::c8
* Operand(uint16_t) -> Operand::c16
* Operand(uint32_t, false) -> Operand::c32
* Operand(uint32_t, bool) -> Operand::c32_or_c64
* Operand(uint64_t) -> Operand::c64
* Operand(0) -> Operand::zero(num_bytes)

Casts that were previously used for constructor selection have automatically
been removed (e.g. Operand((uint16_t)1) -> Operand::c16(1)).
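
The pattern can be illustrated with a heavily simplified stand-in.
ConstOperand below is hypothetical and is not the aco::Operand class;
only the factory names mirror the list above:

```cpp
#include <cstdint>

// Hypothetical, simplified constant operand. The width is chosen by a
// named factory instead of by overload resolution on the argument type.
class ConstOperand {
public:
    static ConstOperand c8(uint8_t v)   { return ConstOperand(v, 1); }
    static ConstOperand c16(uint16_t v) { return ConstOperand(v, 2); }
    static ConstOperand c32(uint32_t v) { return ConstOperand(v, 4); }
    static ConstOperand c64(uint64_t v) { return ConstOperand(v, 8); }
    static ConstOperand zero(unsigned num_bytes) { return ConstOperand(0, num_bytes); }

    uint64_t value() const { return value_; }
    unsigned bytes() const { return bytes_; }

private:
    // Private so callers must state the width through a factory.
    ConstOperand(uint64_t value, unsigned bytes) : value_(value), bytes_(bytes) {}

    uint64_t value_;
    unsigned bytes_;
};
```

With this shape, ConstOperand::c16(1) states the operand width at the
call site, so no cast is needed to steer constructor selection.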

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>
2021-07-13 17:43:26 +00:00
Tony Wasserka 76554419b3 aco: Remove use of deprecated Operand constructors in aco_builder.h
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>
2021-07-13 17:43:26 +00:00
Tony Wasserka 4e33688f23 aco: Remove use of deprecated Operand constructors in test_to_hw_instr.cpp
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>
2021-07-13 17:43:26 +00:00
Tony Wasserka db436a843c aco: Replace Operand literal constructors with factory member functions
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>
2021-07-13 17:43:26 +00:00
Emma Anholt 446bf13e48 ci: Make sure that we build the piglit dmabuf tests.
Force the option rather than relying on autodetection -- ARM runners were
apparently finding the necessary deps, but the x86 rootfs (radeonsi, iris)
and x86_test-gl container (i915g) were not.

Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11834>
2021-07-13 16:31:06 +00:00
Mike Blumenkrantz d29c086fb9 zink: simplify modifier ifdefs
these are the only two defines referenced, so they can be defined to 0
for platforms that don't support modifiers in order to remove a ton of
ifdefs and make the code more readable

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11847>
2021-07-13 14:59:18 +00:00
Rob Clark 7f5a01a47d freedreno/ir3: Add float immed "FLUT" support
We can encode a limited set of float immeds into cat2 instructions,
using hw's float lookup table (FLUT) feature.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/36
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8705>
2021-07-13 14:40:30 +00:00
Rob Clark 4b2afd11cc freedreno/computerator: Add script to probe FLUT values
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8705>
2021-07-13 14:40:30 +00:00
Rob Clark 4e802538e7 turnip: Split tu6_emit_xs()
Emit all the state layout config (such as push-const CONSTLEN) first,
before emitting anything that depends on that state.  This fixes an
issue that was showing up when FLUT is enabled in ir3 (which results
in a higher probability of not having any immediates lowered to push-
consts).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8705>
2021-07-13 14:40:30 +00:00
Rob Clark 71003e3c84 turnip: avoid some UB
Reduce a bit of extra noise that makes diffing cmdstream traces more
annoying.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8705>
2021-07-13 14:40:30 +00:00
Jason Ekstrand 3d934ee03f glsl: Delete lower_texture_projection
This is only used by i965 and we've been getting it through
nir_lower_tex since forever.  Get rid of the GLSL IR pass.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11827>
2021-07-13 14:06:33 +00:00
Mike Blumenkrantz 2de1849a8c ci: only trigger gallium_core_file_list jobs from dri and glx frontend changes
these are the only frontends which may be used by gallium drivers in ci,
so stop triggering all driver jobs when other frontends are changed since
those changes can never affect ci

<MrCooper> Not that simple unfortunately. E.g. the llvmpipe-piglit-cl job hits
           src/gallium/frontends/clover & possibly src/gallium/targets/opencl,
           many jobs hit src/gallium/{frontends,targets}/dri and probably
           src/gallium/targets/pipe-loader, lavapipe jobs hit src/gallium/{frontends,targets}/lavapipe.

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11832>
2021-07-13 13:36:15 +00:00
Mike Blumenkrantz 0b9a2abd49 ci: add vulkan files to lavapipe rules
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11833>
2021-07-13 13:08:19 +00:00
Icecream95 4531de487e pan/bi: Create a nop clause when the shader starts with ATEST
Otherwise there would be no clause with the dependencies needed for
ATEST set, so the GPU would get stuck.

Not needed on v7, where shader_wait_dependency in the RSD will wait
for the dependencies before the shader starts.

Explicitly create a NOP instruction, as it is assumed that clauses
have a non-zero count of instructions in various places.

Fixes GPU timeouts in many applications, such as SuperTuxKart and
GZDoom.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11842>
2021-07-13 12:32:47 +00:00
Icecream95 c689a1dcb3 panfrost: Fix full_threads calculation on v6
Fixes: 8ba2f9f698 ("panfrost: Create a blitter library to replace the existing preload helpers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11842>
2021-07-13 12:32:47 +00:00
Heinrich Fink bff3ac0b26 gbm/dri: Fix leaking bo memory on failure path
In gbm_dri_bo_create, when modifiers are requested but not supported, do
not return NULL immediately, but first go to cleanup section to free
already allocated buffer object.
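
The shape of the fix, sketched with hypothetical names (Bo, create_bo
and the live_allocations counter are illustration only and are not the
gbm code):

```cpp
#include <cstdlib>

// Hypothetical buffer object; not struct gbm_dri_bo.
struct Bo {
    void *image;
};

// Exists only so the sketch can be checked for leaks.
static int live_allocations = 0;

inline Bo *create_bo(bool modifiers_requested, bool modifiers_supported)
{
    Bo *bo = static_cast<Bo *>(std::calloc(1, sizeof(Bo)));
    if (!bo)
        return nullptr;
    ++live_allocations;

    if (modifiers_requested && !modifiers_supported)
        goto failed;  // do not return early: `bo` is already allocated

    return bo;

failed:
    // Single cleanup section shared by every failure after allocation.
    std::free(bo);
    --live_allocations;
    return nullptr;
}
```

Returning nullptr directly from the modifier check would leave the
allocation counted in live_allocations, which is the leak described
above.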

Fixes: cb9ae4273d ("dri: add loader_dri_create_image helper")
Signed-off-by: Heinrich Fink <hfink@snap.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11844>
2021-07-13 11:15:44 +00:00
Antonio Caggiano 7eb7ed8cde pps: Panfrost documentation
Add documentation for the Panfrost Perfetto datasource.

Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10215>
2021-07-13 11:03:55 +00:00
Antonio Caggiano 513d1baaea pps: Panfrost pps driver
Add the Panfrost pps driver.

v2: Human readable names for counter blocks and use `unreachable`.
v3: Use libpanfrost_perf to collect counter values.

Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10215>
2021-07-13 11:03:55 +00:00
Pierre-Eric Pelloux-Prayer bcf8c7910d mesa: clear shader_info::is_lowered in prog_to_nir
This needs to be reset each time prog_to_nir is called because it
turns st_nir_assign_vs_in_locations into a no-op when set.

Fixes: 81d106d6ec ("radeonsi: lower IO intrinsics - complete rewrite of input/output scanning")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5001
Reviewed-by: Isaac Bosompem <mrisaacb@google.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11636>
2021-07-13 10:42:47 +00:00
Iago Toral Quiroga bf89b2f041 v3dv: use defines for push constant offsets used by texel buffer copy shaders
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11843>
2021-07-13 10:20:39 +00:00
Iago Toral Quiroga a89cd7f9bb v3dv: allow batching texel buffer copies for 3D images
For these we only need to check that the depth extent we are
copying is the same across regions in the batch, since we use
that to specify the number of layers in the framebuffer used
for the copy.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11843>
2021-07-13 10:20:39 +00:00
Iago Toral Quiroga 738e7106dd v3dv: implement layered texel buffer copies using a geometry shader
Instead of specifying a separate framebuffer per layer, which is expected
to be much slower.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11843>
2021-07-13 10:20:39 +00:00
Iago Toral Quiroga 8c16b48009 v3dv: fix push constant range for texel buffer copy pipelines
As per get_texel_buffer_copy_fs(), we load 24 bytes of data.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11843>
2021-07-13 10:20:39 +00:00
Erik Faye-Lund 4efbeafa44 zink: remove duplicate format-mapping on little-endian
Doing *both* of these ends up rewriting the previous mapping. Since this
doesn't seem to have led to issues, it seems like the new mapping works
just as well.

Fixes: a22a1c0324 ("zink: Fix VK_FORMAT_A8B8G8R8_SRGB_PACK32 mapping on big-endian")
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11417>
2021-07-13 08:11:33 +00:00
James Jones b2252de03e loader: Handle failure to load DRI driver library
I factored out the chunk of loader code that dlopen()s
libraries from the rest of the DRI driver loader function
in this commit:

  commit bc343154f8
  Author: James Jones <jajones@nvidia.com>
  Date:   Thu Apr 22 23:17:08 2021 -0700

  loader: Factor out driver library loading code

However, I failed to adjust the DRI loader function that
now uses the new helper function to handle the case where
the requested DRI library is not found.

This change restores the prior behavior, and also ensures
loader_open_driver() consistently returns NULL in the
out_driver_handle parameter on failure.

Fixes: bc343154f8 ("loader: Factor out driver library loading code")
Signed-off-by: James Jones <jajones@nvidia.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11807>
2021-07-13 07:36:17 +00:00
Daniel Schürmann b97cd93b35 aco: fix extract_vector optimization
If the allocated_vec map contains a different RegType
for the elements, ensure that the size matches exactly.

Otherwise, it could happen that extracting a dword
element matched with a subdword element.

No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11823>
2021-07-13 09:14:43 +02:00
Daniel Schürmann 98136bda05 aco: fix self-intersecting register swaps
Splitting self-intersecting register swaps into
3 sections was unnecessary and only worked because
the middle section was always empty for full dword
swaps.

No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11823>
2021-07-13 09:14:43 +02:00