Commit Graph

153342 Commits

Author SHA1 Message Date
Samuel Pitoiset 665a671c7d radv: only init acceleration structure if RT is enabled
This is to fix a LLVM crash with 13.0 because ATOMIC_FMAX is only
supported on 14.0+, so RADV_DEBUG=llvm was just completely broken.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16305>
2022-05-04 16:46:25 +00:00
Samuel Pitoiset e53e70fba0 radv/sqtt: fix configuring AUTO_FLUSH_MODE on GFX10.3
The polarity is inverted. Ported from RadeonSI and PAL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16303>
2022-05-04 16:13:49 +00:00
Samuel Pitoiset 4f9ae10296 ac,radeonsi: add has_sqtt_auto_flush_mode_bug
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16303>
2022-05-04 16:13:49 +00:00
Adam Jackson 6f4b5fb675 egl/kopper: Hook up eglSwapInterval
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15800>
2022-05-04 15:06:51 +00:00
Adam Jackson b6ea787903 glx/kopper: Enable GLX_EXT_swap_control etc.
This requires newly tracking the max swap interval since kopper can't do
abs(interval) > 1 yet.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15800>
2022-05-04 15:06:51 +00:00
Adam Jackson 1e90e3325b kopper: Grow a swap interval API
We take a slight liberty here by allowing 0 to mean either MAILBOX or
IMMEDIATE, since Wayland (at least) doesn't have a true IMMEDIATE mode
at least MAILBOX won't throttle to vblank.

This only correctly handles intervals of 0 or 1 at the moment.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15800>
2022-05-04 15:06:51 +00:00
Adam Jackson 44a20baeb8 wsi/x11: Avoid using xcb_wait_for_special_event in FIFO modes
If the window is destroyed from underneath us while we happen to be in
xcb_wait_for_special_event, there's no recovery. The special event will
never match because the XID is no longer valid, and Present doesn't have
an in-band DestroyNotify. We're going to work around this by using the
poll API instead. If we get an event we short-circuit back to the top of
the "wait for available image" loop, so we drain the whole special event
queue before any other logic. Which means if we run out of special
events (and the connection and swapchain are still valid) that we
_don't_ have enough images available, so to hurry along any events that
the X server hasn't flushed out yet we call GetGeometry on the
swapchain's window. As a side effect this verifies that the window is
still alive.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15800>
2022-05-04 15:06:51 +00:00
Samuel Pitoiset 260cd1a18b radv/radix: handle intentional allocation failures properly
This test can intentionally make the alloc callback to return NULL, so
we have to handle object creation failures properly. The driver should
also avoid memleaks because the test checks that.

Fixes crashes with
dEQP-VK.api.device_init.create_instance_device_intentional_alloc_fail.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Martin Roukala <martin.roukala@mupuf.org>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16298>
2022-05-04 14:44:55 +00:00
Konstantin Seurer 428929cf1f radv: Use RADV_RT_STAGE_BITS more often
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16306>
2022-05-04 13:48:24 +00:00
Konstantin Seurer 3438a5ec15 radv: Treat rt stages like compute stages
Fixes dEQP-VK.binding_model.descriptorset_random.sets4.noarray.ubolimitlow.sbolimitlow.sampledimglow.outimgtexlow.noiub.nouab.rgen.noia.0
and probably some other ones.

Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16306>
2022-05-04 13:48:24 +00:00
Konstantin Seurer 0fe2ffeb65 radv: Move RADV_RT_STAGE_BITS to radv_private.h
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16306>
2022-05-04 13:48:24 +00:00
Samuel Pitoiset f5cffbb8df radv: re-emit dynamic line stipple state if the primitive topology changed
The dynamic primitive topology could change from LINE_LIST to
LINE_STRIP for example and the stipple state depends on this.

Found by inspection.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16295>
2022-05-04 13:30:13 +00:00
Alyssa Rosenzweig 0fcddd4d2c pan/bi: Rework varying linking on Valhall
Valhall introduces hardware-allocated varyings. Instead of allocating varying
descriptors on the CPU with a slot based interface, the driver just tells the
hardware how many bytes to allocate per vertex and loads/stores with byte
offsets. This is much nicer!

However, this requires us to rework our linking code to account for separable
shaders. With separable shaders, we can't rely on driver_location matching
between stages, and unlike on Midgard, we can't resolve the differences with
curated command stream descriptors. However, we *can* rely on slots matching. So
we should "just" determine the byte offsets based on the slot, and then
separable shaders work.

For GLES, it really is that easy.

For desktop GL, it's not -- desktop GL brings unpredictable extra varyings like
COL1 and TEX2. Allocating space for all of these unconditionally would hamper
performance. To cope, we key fragment shaders to the set of non-GLES varyings
written by the linked vertex shader. Then we may define an efficient ABI, where
only apps only pay for what they use.

Fixes various tests in dEQP-GLES31.functional.separate_shader.random.* on
Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16310>
2022-05-04 13:07:59 +00:00
Alyssa Rosenzweig 635d8d6bd7 panvk: Don't use VARYING_SLOT_TEX0 internally
This is a legacy varying for desktop GL use. Don't use it in our meta shaders,
as it adds pointless complexity.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16310>
2022-05-04 13:07:59 +00:00
Alyssa Rosenzweig 27a8e4f9d5 panfrost: Don't use VARYING_SLOT_TEX0 internally
This is a legacy varying for desktop GL use. Don't use it in our internal
shaders, as it adds pointless complexity.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16310>
2022-05-04 13:07:59 +00:00
Alyssa Rosenzweig b31527952e panfrost/ci: Smoke test spilling
Spilling is tricky and doesn't get much testing in CI. Run
a subset of dEQP-GLES2.functional.shaders.* with spilling forced to get spilling
tested in CI.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16314>
2022-05-04 12:48:27 +00:00
Alyssa Rosenzweig 6761dbf891 panfrost: Use packed TLS on Valhall
Packed TLS has cache-locality benefits on Valhall, compared to Bifrost's flat
TLS. Valhall does support flat TLS, but requires extra arithmetic in the shader
for correct results. At least until we get to generic pointers (and maybe even
then), we can use packed TLS. So just use packed TLS always for proper spilling.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16314>
2022-05-04 12:48:27 +00:00
Alyssa Rosenzweig 98bdc4a5ff panfrost: Use emit_tls
Instead of rolling our own, so it can pick up a Valhall fix.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16314>
2022-05-04 12:48:27 +00:00
Alyssa Rosenzweig 0e65c6de0e panfrost: Correct XML for TLS
It was never updated for Valhall, from Midgard.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16314>
2022-05-04 12:48:27 +00:00
Alyssa Rosenzweig 045ed4e688 pan/bi: Assert that blend shaders may not spill
The set of blend shaders is closed. They are completely internal. As such, we
know that the registers we reserve for them suffice, and we don't permit
register spilling. Refusing to support spilling in blend shaders simplifies a
number of parts of the compiler. Add a check that we don't try to spill anyway,
which will silently fail.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16314>
2022-05-04 12:48:27 +00:00
Alyssa Rosenzweig 6b6ace5199 pan/bi: Add option to test spilling
BIFROST_MESA_DEBUG=spill now restricts the register file to 1/4 its usual size,
useful for testing register spilling (e.g. running CTS) as well as debugging
spilling on small shaders.

Note blend shaders are exempt, as we don't allow blend shaders to spill.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16314>
2022-05-04 12:48:27 +00:00
Alyssa Rosenzweig 961b18ccbc pan/bi: Align spilled registers on Valhall
Required to support packed addressing correctly. Fixes (with spilling forced):

dEQP-GLES2.functional.shaders.random.trigonometric.vertex.20

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16314>
2022-05-04 12:48:27 +00:00
Alyssa Rosenzweig 040a3ef24e pan/va: Serialize memory stores
We could do better :(

Fixes spilling.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16314>
2022-05-04 12:48:27 +00:00
Alyssa Rosenzweig 5831c44121 panfrost: Relax image check
Shader images on Valhall don't allow nonzero "Minimum level". However,
pan_texture lowers away nonzero minimum levels anyway, so there's nothing to
check. Fixes:

KHR-GLES31.core.shader_image_load_store.advanced-allMips-cs

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16315>
2022-05-04 12:29:55 +00:00
Georg Lehmann bf6372df62 meson: Tell glslang to be quiet.
Without --quiet glslang unconditionally prints the input file name to stdout.
Check if --quiet is supported because some distros only have ancient glslang
versions.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16312>
2022-05-04 11:30:43 +00:00
Rhys Perry 1b639a0ce5 aco/ra: fix vgpr_limit
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: b98a4d4dd7 ("aco: refactor GPR limit calculation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16297>
2022-05-04 11:12:13 +00:00
Georg Lehmann 69cceab718 aco: Remove D16 zero components from image stores.
No foz-db changes.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15179>
2022-05-04 09:58:03 +00:00
Georg Lehmann a9bce05700 radv: Run copy_prop and dce after folding 16bit sampling/load/store.
Totals from 10 (0.01% of 134913) affected shaders:
CodeSize: 53168 -> 54832 (+3.13%); split: -0.17%, +3.30%
Instrs: 9117 -> 9200 (+0.91%); split: -1.74%, +2.65%
Latency: 41595 -> 41787 (+0.46%); split: -0.95%, +1.41%
InvThroughput: 16412 -> 16424 (+0.07%); split: -1.95%, +2.02%
VClause: 107 -> 112 (+4.67%); split: -0.93%, +5.61%
Copies: 199 -> 535 (+168.84%); split: -3.02%, +171.86%
PreVGPRs: 520 -> 502 (-3.46%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15179>
2022-05-04 09:58:03 +00:00
Georg Lehmann 9bca149353 radv: Use nir_fold_16bit_image_load_store_conversions.
Totals from 10 (0.01% of 134913) affected shaders:
CodeSize: 53316 -> 53168 (-0.28%)
Instrs: 9219 -> 9117 (-1.11%)
Latency: 41744 -> 41595 (-0.36%)
InvThroughput: 16616 -> 16412 (-1.23%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15179>
2022-05-04 09:58:03 +00:00
Georg Lehmann 7a6dbe0c77 aco: Implement image_load d16.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15179>
2022-05-04 09:58:03 +00:00
Georg Lehmann 7ffcaf9187 aco: Implement image_store d16.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15179>
2022-05-04 09:58:03 +00:00
Georg Lehmann 5833fab766 nir/lower_mediump: Add a new pass to fold 16bit image load/store.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15179>
2022-05-04 09:58:03 +00:00
Alejandro Piñeiro 1ed2b5e253 v3dv/pipeline: include pipeline layout on the pipeline sha1
Fixes failures on tests like this when the on-disk-cache is enabled:
dEQP-VK.binding_model.descriptor_copy.compute.uniform_buffer_0

We only found them when running full CTS runs. What happens is that we
got a hit from the on-disk shader cache, for several tests using the
same shaders. But some tests seems to be using a uniform buffer, and
others a inline buffer. Right now inline buffers leads to some changes
on the final nir shader, and generated assembly, compared with uniform
buffers. So we got a wrong shader. Fortunately we only got an assert
instead of weird behaviour.

With this commit we include the pipeline layout on the pipeline sha1,
so those two cases would get different sha1. FWIW, this is what other
drivers are already doing.

Surprisingly that didn't cause a problem before.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16313>
2022-05-04 09:21:20 +00:00
Alejandro Piñeiro 502fae57be v3dv/pipeline_cache: add on disk cache hit stats
Useful when debugging/testing on disk cache.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16313>
2022-05-04 09:21:20 +00:00
Alejandro Piñeiro f57a01c5f9 v3dv/pipeline_cache: adds check to skip searching for a entry
If we are calling pipeline_cache_upload_shared_data with
from_disk_cache, that means that we had used the disk-cache to found
that entry. And that should only happens if we didn't find the entry
on the cache. So on that case we can skip to search for it.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16313>
2022-05-04 09:21:20 +00:00
Alejandro Piñeiro 080e14ff61 v3dv/pipeline: fix small comment typo
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16313>
2022-05-04 09:21:20 +00:00
Jonathan Gray 3cc1efee6f intel/dev: sync ADL-S pci ids with linux
sync ADL-S pci ids with linux
c79b846f892d ("drm/i915/adl_s: Update ADL-S PCI IDs")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16317>
2022-05-04 18:01:45 +10:00
Juan A. Suarez Romero f21e396f4c v3d: disable early-Z on odd frame dimensions
The early-Z buffer may load incorrect depth values if the frame has an
od width or height. In this case we need to disable early-Z.

v3:
 - Set job->early_zs_clear only for V3D_VERSION >= 40 (Iago)
 - Check early-Z is disabled if no zsbuf (Iago)

v4:
 - Borrow comments from v3dv around v3d_update_job_ez() (Iago)

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3557
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16166>
2022-05-04 07:44:46 +00:00
Juan A. Suarez Romero 07cfa2bd96 v3d: enable early Z/S clears
This performance optimization can be enabled if we are clearing Z/S
buffer, and not storing or loading it.

v2:
 - Add assertion on depth/stencil job loads (Iago)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16166>
2022-05-04 07:44:46 +00:00
Mike Blumenkrantz 4dec4ba87d wgl: don't auto-load zink before software drivers
as in glx/egl, zink+lavapipe should only load if explicitly selected

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16311>
2022-05-04 01:20:12 +00:00
Mike Blumenkrantz a9451f2599 zink: use VK_EXT_primitives_generated_query when available
the old codepath must be maintained, but runtime will be far simpler

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16274>
2022-05-04 01:09:57 +00:00
Mike Blumenkrantz 406f7a0eb1 zink: add a flag to zink_query to trigger rasterizer discard workaround
make this agnostic of query types; no functional changes

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16274>
2022-05-04 01:09:57 +00:00
Mike Blumenkrantz 5269521cc2 zink: add and use a function to detected emulated primgen queries
no functional changes, just reducing instances of PIPE_QUERY_PRIMITIVES_GENERATED

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16274>
2022-05-04 01:09:57 +00:00
Mike Blumenkrantz ddced9ea6b zink: pass screen param to convert_query_type()
no functional changes

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16274>
2022-05-04 01:09:57 +00:00
Mike Blumenkrantz 563ae3fd69 zink: pass query object to get_num_results()
no functional changes

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16274>
2022-05-04 01:09:57 +00:00
Mike Blumenkrantz 6af4a74f8a zink: pass query object to get_num_query_pools()
no functional changes

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16274>
2022-05-04 01:09:57 +00:00
Mike Blumenkrantz 24626b954d zink: pass query object to get_num_queries()
no functional changes

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16274>
2022-05-04 01:09:57 +00:00
Mike Blumenkrantz 7aeaf695c0 zink: hook up VK_EXT_primitives_generated_query
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16274>
2022-05-04 01:09:57 +00:00
Autumn on Tape e353b89c57 gallivm: use VPERMPS (x86/AVX2) for 32-bit 8-element shuffles
Signed-off-by: Autumn on Tape <autumn@cyfox.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13671>
2022-05-03 23:09:37 +00:00
Autumn on Tape 57e25fc55c gallivm: use shufflevector for shuffles when index is constant data
This checks if index is a ConstantAggregateZero, ConstantDataSequential,
or UndefValue and emits a single shufflevector instead of a loop if so.

Signed-off-by: Autumn on Tape <autumn@cyfox.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13671>
2022-05-03 23:09:37 +00:00