Commit Graph

145887 Commits

Author SHA1 Message Date
Mike Blumenkrantz 7cc85dba71 build: unify vulkan cpp platform args
these were duplicated all over the place, and it's annoying to have to keep
duplicating them any time a new component includes the vulkan header

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13141>
2021-10-06 14:19:35 +00:00
Mike Blumenkrantz 1d574d4860 lavapipe: remove display extension support
lavapipe doesn't actually support these

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13141>
2021-10-06 14:19:35 +00:00
Thomas Wagner fe1a091bd0 lavapipe: enable KHR_external_memory_fd
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Heinrich Fink <hfink@snap.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12345>
2021-10-06 13:49:08 +00:00
Thomas Wagner 9da15aa3aa llvmpipe: enable EXT_memory_object(_fd)
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Heinrich Fink <hfink@snap.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12345>
2021-10-06 13:49:08 +00:00
Thomas Wagner 895d3399f7 lavapipe: add support for KHR_external_memory_fd
Support creating exportable memory. Use memfd file
descriptors and import/export them as opaque fd handles.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Heinrich Fink <hfink@snap.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12345>
2021-10-06 13:49:08 +00:00
Thomas Wagner 1608a815e3 llvmpipe: add support for EXT_memory_object(_fd)
Enable the import of memory via opaque fd handles, which
are based upon memory-fds. The extension is necessary for sharing
images and buffers from Vulkan.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Heinrich Fink <hfink@snap.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12345>
2021-10-06 13:49:08 +00:00
Thomas Wagner 1166ee9caf gallium: add utility and interface for memory fd allocations
Add utility functions to allocate aligned memory backed by
mem_fd objects. Add interface to Gallium for same allocation.
It will be used in later commits for external memory support
in Vulkan/OpenGL.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Heinrich Fink <hfink@snap.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12345>
2021-10-06 13:49:08 +00:00
Connor Abbott 0209311c6e ir3: Use source in ir3_output_conv_src_type()
This was incorrectly converted when splitting the regs array. Noticed by
inspection.

Fixes: d3e08327cf ("ir3/core: Switch to srcs/dsts arrays")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13220>
2021-10-06 13:15:50 +00:00
Roman Stratiienko f1c322c269 meson_options: Bump max value of platform-sdk-version to 31
During building Android-12, the following error appears:
meson.build:21:0: ERROR: New value 31 is more than maximum value 30.

Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13216>
2021-10-06 12:05:22 +00:00
Danylo Piliaiev 6a16b6a74c turnip: fix vbs emission when there are holes in bindings
Otherwise we read garbage for bindings with value above
vertexBindingDescriptionCount.

Fixes vkd3d test "test_append_aligned_element"

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13195>
2021-10-06 10:05:50 +00:00
Alejandro Piñeiro bc5892b7fc v3dv: use NULL for vk_error on initialization failures
This commit fixes two issues:

  * On CreateInstance, we are freeing the instance, and then trying to
    use it when calling vk_error. This could be problematic, so let's
    just use NULL.

  * On CreateDevice, we are getting a unsupported feature error, and
    then trying to call vk_error using the instance. That's is not
    really a instance error, and will assert when the ongoing common
    vk_error lands mesa. Let's use NULL instead, as the object it
    applies, the device, was not created.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13219>
2021-10-06 11:42:28 +02:00
Dave Airlie ab1c888c8d device_select: close dri3 fd after using it.
This can leak and causes crashes in some CTS test groups
dEQP-VK.wsi.xcb.incremental_present*

Fixes: 9bc5b2d169 ("vulkan: add initial device selection layer. (v6.1)")
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13215>
2021-10-06 18:24:54 +10:00
Dave Airlie 028591954a lvp/fence: quick fix to previous commit.
This fixes last of xcb cts issues.

Fixes: 8a294b6f97 ("lavapipe: Fix vkWaitForFences for initially-signalled fences")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13212>
2021-10-06 15:27:33 +10:00
Pavel Asyutchenko b9617bc621 lavapipe: Fix vkWaitForFences for initially-signalled fences
Fences with VK_FENCE_CREATE_SIGNALED_BIT are created with
signalled=true and timeline=0, waiting on them without
submitting first returned VK_TIMEOUT instead of VK_SUCCESS.

Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13128>
2021-10-06 15:11:04 +10:00
Mike Blumenkrantz 96ea718b7e lavapipe: EXT_4444_formats support
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12001>
2021-10-06 04:35:25 +00:00
Dave Airlie 29f4931b52 llvmpipe: fix 4-bit output scaling.
This is overkill, but hey 4-bits per channel is hardly something to
care about.

(Suggestions welcome for a better version).

Fixes:
dEQP-GLES2.functional.fbo.render.*rgba4*

Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12001>
2021-10-06 04:35:25 +00:00
Emma Anholt 22a332f5ac virgl: Add support for NIR shaders when VIRGL_DEBUG=nir.
This will let me incrementally fix nir-to-tgsi against virgl without
having to carry around the whole "remove TGSI from mesa/st" MR.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>
2021-10-06 03:44:18 +00:00
Emma Anholt 4e3e149ffd nir_to_tgsi: Force the TXQ LOD argument to be scalar.
Otherwise, older virglrenderer fails all the texturesize tests.

Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>
2021-10-06 03:44:17 +00:00
Emma Anholt 469f0345ac nir_to_tgsi: Add a workaround for virgl UBO array dynamic indexing.
virgl makes one array of UBOs starting from the first non-CB0 UBO used,
and does dynamic indexing off of that.  It requires that the dynamic
indexing be CONST[ADDR[0]+base], rather than having the base be loaded in
addr0.

If we had a nir_intrinsic_base() on load_ubo, this would be easy.  As we
don't, emit a subtract at address deref time.

Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>
2021-10-06 03:44:17 +00:00
Emma Anholt a292268cd5 nir_to_tgsi: Sort FS output declarations to avoid virglrenderer bugs.
The TGSI debug output is a lot more readable if it's in location order,
anyway.

Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>
2021-10-06 03:44:17 +00:00
Emma Anholt 7dde279db5 nir-to-tgsi: Avoid emitting TXL just for lod 0 on non-vertex shaders.
Prompted by comparing virgl fails and finding that it has issues with
immediate args to TXL/TXB, at least.

Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>
2021-10-06 03:44:17 +00:00
Emma Anholt c3c560089e nir_to_tgsi: Turn GS PRIMID into an input instead of a sysval.
While TGSI can represent it either way, virgl and r600 at least demand an
input.

Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>
2021-10-06 03:44:17 +00:00
Emma Anholt 91a5a18dbf nir_to_tgsi: Add support for nir_intrinsic_load_barycentric_at_sample.
It doesn't have to be a constant sample, so we need to store it at load
time and use the load's dest at interpolate_at time.

Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>
2021-10-06 03:44:17 +00:00
Emma Anholt 15aabcd806 nir_to_tgsi: Add support for load_barycentric_sample.
This is used for var->data.sample inputs, which are already declared to be
TGSI_INTERPOLATE_LOC_SAMPLE, so we can just use the interpolated inputs.

Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>
2021-10-06 03:44:17 +00:00
Emma Anholt 80c007a4dd nir_to_tgsi: Add support for declaring image arrays.
Required for virgl.

Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>
2021-10-06 03:44:17 +00:00
Emma Anholt 8d6f738007 gallium/ureg: Sort the input decls, too.
Just like outputs, virglrenderer needs its inputs sorted.  Should be
harmless for other TGSI producers, and makes the declarations more
readable.

Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>
2021-10-06 03:44:17 +00:00
Emma Anholt 441643b105 nir_to_tgsi: Add support for load_output/load_per_vertex_output.
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>
2021-10-06 03:44:17 +00:00
Emma Anholt 96cf3b3595 nir_to_tgsi: Include txf_ms's sample index.
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>
2021-10-06 03:44:17 +00:00
Emma Anholt ba6368b54d mesa/st: Don't bump locations of patch vars for !PIPE_CAP_TEXCOORD.
There's no need to reserve the bottom 9 VARYING_SLOT_PATCH*, since
VARYING_SLOT_TEXCOORD won't be mapped there.  This helps us match up with
nir_to_tgsi, which wasn't shifting down by 9 for patch.

Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>
2021-10-06 03:44:17 +00:00
Mike Blumenkrantz 2f6debfd6d lavapipe: inherit from vk_image
simple and easy since we don't use much of this anyway

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13146>
2021-10-06 03:10:06 +00:00
Dave Airlie 9392bd89e9 llvmpipe/cs: change submission pattern for threadpool
Recent ncnn benchmarks showed a slowdown, and this change seemed
more likely.

The batching into threads for the main workloads is fine, however
the remainder stuff doesn't get spread out and can bottleneck in
one thread.

Switch to a model where the initial work is batched, but the
remainder is iterated over one by one.

Brings ncnn benchmarks back in line with previously.

Fixes: 69109e0b19 ("llvmpipe/cs: rework thread pool for avoid mtx locking")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13210>
2021-10-06 02:42:20 +00:00
Lionel Landwerlin 3924df9fe7 anv: enable VK_KHR_maintenance4
v2 (Jason Ekstrand):
 - Get maxBufferSize from ISL.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>
2021-10-06 02:18:39 +00:00
Jason Ekstrand 231653ea35 intel/isl: Add a max_buffer_size limit to isl_device
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>
2021-10-06 02:18:39 +00:00
Lionel Landwerlin 9edbd13f81 anv: implement vkGetDeviceImageSparseMemoryRequirementsKHR
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>
2021-10-06 02:18:39 +00:00
Lionel Landwerlin 4075dd16ab anv: implement vkGetDeviceImageMemoryRequirementsKHR
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>
2021-10-06 02:18:39 +00:00
Lionel Landwerlin 9058fd8979 anv: move VkImage object allocation to anv_CreateImage
v2 (Jason Ekstrand):
 - Switch the order of arguments to be device, image, other stuff

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>
2021-10-06 02:18:39 +00:00
Jason Ekstrand 8c2a1ed3da anv: Add an anv_image_get_memory_requirements helper
This is similar to a patch from Lionel except works in terms of aspects
rather than bindings.  This makes it easy to use from the Android code.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>
2021-10-06 02:18:39 +00:00
Lionel Landwerlin 76b1d04e72 anv: remove unused function
Fixes: 49908c602f ("anv/android: Rework our handling of AHardwareBuffer imports")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>
2021-10-06 02:18:39 +00:00
Lionel Landwerlin f2397badc4 anv: implement vkGetDeviceBufferMemoryRequirementsKHR
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>
2021-10-06 02:18:39 +00:00
Lionel Landwerlin 8072cc8f20 anv: move GetBufferMemoryRequirement with other buffer functions
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>
2021-10-06 02:18:39 +00:00
Jason Ekstrand 7677f1d09e vulkan: Update the XML and headers to 1.2.195
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>
2021-10-06 02:18:39 +00:00
Ian Romanick cb28361642 nir/algebraic: Small optimizations for SpvOpFOrdNotEqual and SpvOpFUnordEqual
No shader-db changes on any Intel platform.

Fossil-db results:

All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 144380118 -> 143692823 (-0.5%)
SENDs in all programs: 6920822 -> 6920822 (+0.0%)
Loops in all programs: 38299 -> 38299 (+0.0%)
Cycles in all programs: 8434782176 -> 8423078994 (-0.1%)
Spills in all programs: 206830 -> 204469 (-1.1%)
Fills in all programs: 318737 -> 313660 (-1.6%)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12320>
2021-10-06 01:53:47 +00:00
Ian Romanick 0cf25f559f spirv: Generate shorter code for SpvOpFUnord comparisons
No shader-db or fossil-db changes on any Intel platform.

v2: Keep the flt <-> fge switcharoo local to the SpvOpFUnordLessThan,
etc. handling.  Add a comment explaining why the suboptimal
SpvOpFUnordEqual implementation is used here.  Suggested by Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12320>
2021-10-06 01:53:47 +00:00
Ian Romanick 1ce48ce91d spirv: SpvOpFUnordNotEqual doesn't need special treatment
The NIR fneu opcode already matches the "unordered not equal" semantics
of the SPIR-V opcode.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12320>
2021-10-06 01:53:47 +00:00
Ian Romanick f8148b861f spirv: Minor cleanup in SpvOpFOrdNotEqual
v2: Add a comment explaining why the suboptimal SpvOpFOrdNotEqual
implementation is still used here.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12320>
2021-10-06 01:53:47 +00:00
Ian Romanick 803b754b81 spirv: Silence unused parameter warnings in vtn_alu.c
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12320>
2021-10-06 01:53:47 +00:00
Mike Blumenkrantz c074b2d812 ci: updates
fails are from #4571

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12831>
2021-10-06 01:12:29 +00:00
Mike Blumenkrantz a2fb67209e zink: support 16bit rgbx formats
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12831>
2021-10-06 01:12:29 +00:00
Alyssa Rosenzweig c00e7b729f pan/bi: Optimize abs(derivative)
We implement fine derivatives as:

	broadcast(x, (lane & ~1) + 1) - broadcast(x, lane & ~1)

Most of the complexity is to get the right sign. If we can ignore the
sign, we can generate the simpler code:

	broadcast(x, lane ^ 1) - lane

This is a particular win on v7+ where the broadcast instruction (CLPER)
can do `lane ^ value` for free. However, even on v6 where we lower to an
explicit XOR instruction, it's still a win.

The limiting case is fwidth. The fragment shader

   gl_FragColor = fwidth(vec4_varying);

has the following results on v6, v7, and v9:

G72 (-26% instructions, -43% cycles):
38 inst, 30 tuples, 5 clauses, 1.166667 cycles, 1.166667 arith, 28 quadwords
28 inst, 19 tuples, 4 clauses, 0.666667 cycles, 0.666667 arith, 19 quadwords

G76 (-37% instructions, -54% cycles):
38 inst, 30 tuples, 5 clauses, 1.166667 cycles, 1.166667 arith, 28 quadwords
24 inst, 16 tuples, 4 clauses, 0.541667 cycles, 0.541667 arith, 18 quadwords

G78 (-40% instructions, -56% cycles):
40 inst, 1.125000 cycles, 0.250000 fma, 0.109375 cvt, 1.125000 sfu, 20 quadwords
24 inst, 0.500000 cycles, 0.250000 fma, 0.015625 cvt, 0.500000 sfu, 12 quadwords

shader-db tells a similar story -- most shaders are unaffected, but a
shader that uses fwidth has a 20% reduction in cycle count:

instructions helped:   shaders/tesseract/488.shader_test MESA_SHADER_FRAGMENT: 264 -> 262 (-0.76%)
instructions helped:   shaders/chromeos/109-1.shader_test MESA_SHADER_FRAGMENT: 36 -> 28 (-22.22%)
tuples helped:   shaders/chromeos/109-1.shader_test MESA_SHADER_FRAGMENT: 27 -> 22 (-18.52%)
tuples HURT:   shaders/tesseract/488.shader_test MESA_SHADER_FRAGMENT: 211 -> 212 (0.47%)
clauses HURT:   shaders/tesseract/488.shader_test MESA_SHADER_FRAGMENT: 32 -> 33 (3.12%)
cycles helped:   shaders/chromeos/109-1.shader_test MESA_SHADER_FRAGMENT: 1 -> 0.79 (-20.83%)
arith helped:   shaders/chromeos/109-1.shader_test MESA_SHADER_FRAGMENT: 1 -> 0.79 (-20.83%)
quadwords helped:   shaders/chromeos/109-1.shader_test MESA_SHADER_FRAGMENT: 31 -> 28 (-9.68%)
quadwords HURT:   shaders/tesseract/488.shader_test MESA_SHADER_FRAGMENT: 176 -> 178 (1.14%)

total instructions in shared programs: 148370 -> 148360 (<.01%)
instructions in affected programs: 300 -> 290 (-3.33%)
helped: 2
HURT: 0

total tuples in shared programs: 124188 -> 124184 (<.01%)
tuples in affected programs: 238 -> 234 (-1.68%)
helped: 1
HURT: 1
helped stats (abs) min: 5.0 max: 5.0 x̄: 5.00 x̃: 5
helped stats (rel) min: 18.52% max: 18.52% x̄: 18.52% x̃: 18.52%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.47% max: 0.47% x̄: 0.47% x̃: 0.47%

total clauses in shared programs: 25692 -> 25693 (<.01%)
clauses in affected programs: 32 -> 33 (3.12%)
helped: 0
HURT: 1

total cycles in shared programs: 12132.04 -> 12131.83 (<.01%)
cycles in affected programs: 1 -> 0.79 (-20.83%)
helped: 1
HURT: 0

total arith in shared programs: 4623.75 -> 4623.54 (<.01%)
arith in affected programs: 1 -> 0.79 (-20.83%)
helped: 1
HURT: 0

total quadwords in shared programs: 110386 -> 110385 (<.01%)
quadwords in affected programs: 207 -> 206 (-0.48%)
helped: 1
HURT: 1
helped stats (abs) min: 3.0 max: 3.0 x̄: 3.00 x̃: 3
helped stats (rel) min: 9.68% max: 9.68% x̄: 9.68% x̃: 9.68%
HURT stats (abs)   min: 2.0 max: 2.0 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 1.14% max: 1.14% x̄: 1.14% x̃: 1.14%

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12332>
2021-10-06 00:40:57 +00:00
Alyssa Rosenzweig 3e8f540753 nir: Add Mali-specific derivative opcodes
Add derivative opcodes fddx_must_abs_mali/fddy_must_abs_mali satisfying:

   fabs(fdd*_must_abs_mali(v)) = fabs(fdd*(v))

The sign of their result is undefined.

On Bifrost and Valhall, these unsigned derivatives can be implemented
more efficiently than the correctly-signed counterparts, since the sign
fixup requires extra ALU instructions. On backends where this is the
case, it is useful to optimize fabs(fdd*(v)) to
fabs(fdd*_must_abs_mali(v)). This pattern comes up with the GLSL builtin
`fwidth`.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12332>
2021-10-06 00:40:57 +00:00