mirrors/mesa - Frog Git

Commit Graph

Author	SHA1	Message	Date
Mike Blumenkrantz	7cc85dba71	build: unify vulkan cpp platform args these were duplicated all over the place, and it's annoying to have to keep duplicating them any time a new component includes the vulkan header Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13141>	2021-10-06 14:19:35 +00:00
Mike Blumenkrantz	1d574d4860	lavapipe: remove display extension support lavapipe doesn't actually support these Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13141>	2021-10-06 14:19:35 +00:00
Thomas Wagner	fe1a091bd0	lavapipe: enable KHR_external_memory_fd Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Heinrich Fink <hfink@snap.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12345>	2021-10-06 13:49:08 +00:00
Thomas Wagner	9da15aa3aa	llvmpipe: enable EXT_memory_object(_fd) Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Heinrich Fink <hfink@snap.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12345>	2021-10-06 13:49:08 +00:00
Thomas Wagner	895d3399f7	lavapipe: add support for KHR_external_memory_fd Support creating exportable memory. Use memfd file descriptors and import/export them as opaque fd handles. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Heinrich Fink <hfink@snap.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12345>	2021-10-06 13:49:08 +00:00
Thomas Wagner	1608a815e3	llvmpipe: add support for EXT_memory_object(_fd) Enable the import of memory via opaque fd handles, which are based upon memory-fds. The extension is necessary for sharing images and buffers from Vulkan. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Heinrich Fink <hfink@snap.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12345>	2021-10-06 13:49:08 +00:00
Thomas Wagner	1166ee9caf	gallium: add utility and interface for memory fd allocations Add utility functions to allocate aligned memory backed by mem_fd objects. Add interface to Gallium for same allocation. It will be used in later commits for external memory support in Vulkan/OpenGL. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Heinrich Fink <hfink@snap.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12345>	2021-10-06 13:49:08 +00:00
Connor Abbott	0209311c6e	ir3: Use source in ir3_output_conv_src_type() This was incorrectly converted when splitting the regs array. Noticed by inspection. Fixes: `d3e08327cf` ("ir3/core: Switch to srcs/dsts arrays") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13220>	2021-10-06 13:15:50 +00:00
Roman Stratiienko	f1c322c269	meson_options: Bump max value of platform-sdk-version to 31 During building Android-12, the following error appears: meson.build:21:0: ERROR: New value 31 is more than maximum value 30. Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com> Acked-by: Emil Velikov <emil.l.velikov@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13216>	2021-10-06 12:05:22 +00:00
Danylo Piliaiev	6a16b6a74c	turnip: fix vbs emission when there are holes in bindings Otherwise we read garbage for bindings with value above vertexBindingDescriptionCount. Fixes vkd3d test "test_append_aligned_element" Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13195>	2021-10-06 10:05:50 +00:00
Alejandro Piñeiro	bc5892b7fc	v3dv: use NULL for vk_error on initialization failures This commit fixes two issues: * On CreateInstance, we are freeing the instance, and then trying to use it when calling vk_error. This could be problematic, so let's just use NULL. * On CreateDevice, we are getting an unsupported feature error, and then trying to call vk_error using the instance. That is not really an instance error, and will assert when the ongoing common vk_error lands in mesa. Let's use NULL instead, as the object it applies to, the device, was not created. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13219>	2021-10-06 11:42:28 +02:00
Dave Airlie	ab1c888c8d	device_select: close dri3 fd after using it. This can leak and causes crashes in some CTS test groups dEQP-VK.wsi.xcb.incremental_present* Fixes: `9bc5b2d169` ("vulkan: add initial device selection layer. (v6.1)") Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13215>	2021-10-06 18:24:54 +10:00
Dave Airlie	028591954a	lvp/fence: quick fix to previous commit. This fixes last of xcb cts issues. Fixes: `8a294b6f97` ("lavapipe: Fix vkWaitForFences for initially-signalled fences") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13212>	2021-10-06 15:27:33 +10:00
Pavel Asyutchenko	b9617bc621	lavapipe: Fix vkWaitForFences for initially-signalled fences Fences with VK_FENCE_CREATE_SIGNALED_BIT are created with signalled=true and timeline=0, waiting on them without submitting first returned VK_TIMEOUT instead of VK_SUCCESS. Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13128>	2021-10-06 15:11:04 +10:00
Mike Blumenkrantz	96ea718b7e	lavapipe: EXT_4444_formats support Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12001>	2021-10-06 04:35:25 +00:00
Dave Airlie	29f4931b52	llvmpipe: fix 4-bit output scaling. This is overkill, but hey 4-bits per channel is hardly something to care about. (Suggestions welcome for a better version). Fixes: dEQP-GLES2.functional.fbo.render.rgba4 Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12001>	2021-10-06 04:35:25 +00:00
Emma Anholt	22a332f5ac	virgl: Add support for NIR shaders when VIRGL_DEBUG=nir. This will let me incrementally fix nir-to-tgsi against virgl without having to carry around the whole "remove TGSI from mesa/st" MR. Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>	2021-10-06 03:44:18 +00:00
Emma Anholt	4e3e149ffd	nir_to_tgsi: Force the TXQ LOD argument to be scalar. Otherwise, older virglrenderer fails all the texturesize tests. Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>	2021-10-06 03:44:17 +00:00
Emma Anholt	469f0345ac	nir_to_tgsi: Add a workaround for virgl UBO array dynamic indexing. virgl makes one array of UBOs starting from the first non-CB0 UBO used, and does dynamic indexing off of that. It requires that the dynamic indexing be CONST[ADDR[0]+base], rather than having the base be loaded in addr0. If we had a nir_intrinsic_base() on load_ubo, this would be easy. As we don't, emit a subtract at address deref time. Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>	2021-10-06 03:44:17 +00:00
Emma Anholt	a292268cd5	nir_to_tgsi: Sort FS output declarations to avoid virglrenderer bugs. The TGSI debug output is a lot more readable if it's in location order, anyway. Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>	2021-10-06 03:44:17 +00:00
Emma Anholt	7dde279db5	nir-to-tgsi: Avoid emitting TXL just for lod 0 on non-vertex shaders. Prompted by comparing virgl fails and finding that it has issues with immediate args to TXL/TXB, at least. Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>	2021-10-06 03:44:17 +00:00
Emma Anholt	c3c560089e	nir_to_tgsi: Turn GS PRIMID into an input instead of a sysval. While TGSI can represent it either way, virgl and r600 at least demand an input. Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>	2021-10-06 03:44:17 +00:00
Emma Anholt	91a5a18dbf	nir_to_tgsi: Add support for nir_intrinsic_load_barycentric_at_sample. It doesn't have to be a constant sample, so we need to store it at load time and use the load's dest at interpolate_at time. Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>	2021-10-06 03:44:17 +00:00
Emma Anholt	15aabcd806	nir_to_tgsi: Add support for load_barycentric_sample. This is used for var->data.sample inputs, which are already declared to be TGSI_INTERPOLATE_LOC_SAMPLE, so we can just use the interpolated inputs. Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>	2021-10-06 03:44:17 +00:00
Emma Anholt	80c007a4dd	nir_to_tgsi: Add support for declaring image arrays. Required for virgl. Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>	2021-10-06 03:44:17 +00:00
Emma Anholt	8d6f738007	gallium/ureg: Sort the input decls, too. Just like outputs, virglrenderer needs its inputs sorted. Should be harmless for other TGSI producers, and makes the declarations more readable. Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>	2021-10-06 03:44:17 +00:00
Emma Anholt	441643b105	nir_to_tgsi: Add support for load_output/load_per_vertex_output. Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>	2021-10-06 03:44:17 +00:00
Emma Anholt	96cf3b3595	nir_to_tgsi: Include txf_ms's sample index. Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>	2021-10-06 03:44:17 +00:00
Emma Anholt	ba6368b54d	mesa/st: Don't bump locations of patch vars for !PIPE_CAP_TEXCOORD. There's no need to reserve the bottom 9 VARYING_SLOT_PATCH*, since VARYING_SLOT_TEXCOORD won't be mapped there. This helps us match up with nir_to_tgsi, which wasn't shifting down by 9 for patch. Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12800>	2021-10-06 03:44:17 +00:00
Mike Blumenkrantz	2f6debfd6d	lavapipe: inherit from vk_image simple and easy since we don't use much of this anyway Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13146>	2021-10-06 03:10:06 +00:00
Dave Airlie	9392bd89e9	llvmpipe/cs: change submission pattern for threadpool Recent ncnn benchmarks showed a slowdown, and this change seemed more likely. The batching into threads for the main workloads is fine, however the remainder stuff doesn't get spread out and can bottleneck in one thread. Switch to a model where the initial work is batched, but the remainder is iterated over one by one. Brings ncnn benchmarks back in line with previously. Fixes: `69109e0b19` ("llvmpipe/cs: rework thread pool for avoid mtx locking") Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13210>	2021-10-06 02:42:20 +00:00
Lionel Landwerlin	3924df9fe7	anv: enable VK_KHR_maintenance4 v2 (Jason Ekstrand): - Get maxBufferSize from ISL. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>	2021-10-06 02:18:39 +00:00
Jason Ekstrand	231653ea35	intel/isl: Add a max_buffer_size limit to isl_device Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>	2021-10-06 02:18:39 +00:00
Lionel Landwerlin	9edbd13f81	anv: implement vkGetDeviceImageSparseMemoryRequirementsKHR Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>	2021-10-06 02:18:39 +00:00
Lionel Landwerlin	4075dd16ab	anv: implement vkGetDeviceImageMemoryRequirementsKHR Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>	2021-10-06 02:18:39 +00:00
Lionel Landwerlin	9058fd8979	anv: move VkImage object allocation to anv_CreateImage v2 (Jason Ekstrand): - Switch the order of arguments to be device, image, other stuff Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>	2021-10-06 02:18:39 +00:00
Jason Ekstrand	8c2a1ed3da	anv: Add an anv_image_get_memory_requirements helper This is similar to a patch from Lionel except works in terms of aspects rather than bindings. This makes it easy to use from the Android code. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>	2021-10-06 02:18:39 +00:00
Lionel Landwerlin	76b1d04e72	anv: remove unused function Fixes: `49908c602f` ("anv/android: Rework our handling of AHardwareBuffer imports") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>	2021-10-06 02:18:39 +00:00
Lionel Landwerlin	f2397badc4	anv: implement vkGetDeviceBufferMemoryRequirementsKHR Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>	2021-10-06 02:18:39 +00:00
Lionel Landwerlin	8072cc8f20	anv: move GetBufferMemoryRequirement with other buffer functions Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>	2021-10-06 02:18:39 +00:00
Jason Ekstrand	7677f1d09e	vulkan: Update the XML and headers to 1.2.195 Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>	2021-10-06 02:18:39 +00:00
Ian Romanick	cb28361642	nir/algebraic: Small optimizations for SpvOpFOrdNotEqual and SpvOpFUnordEqual No shader-db changes on any Intel platform. Fossil-db results: All Intel platforms had similar results. (Ice Lake shown) Instructions in all programs: 144380118 -> 143692823 (-0.5%) SENDs in all programs: 6920822 -> 6920822 (+0.0%) Loops in all programs: 38299 -> 38299 (+0.0%) Cycles in all programs: 8434782176 -> 8423078994 (-0.1%) Spills in all programs: 206830 -> 204469 (-1.1%) Fills in all programs: 318737 -> 313660 (-1.6%) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12320>	2021-10-06 01:53:47 +00:00
Ian Romanick	0cf25f559f	spirv: Generate shorter code for SpvOpFUnord comparisons No shader-db or fossil-db changes on any Intel platform. v2: Keep the flt <-> fge switcharoo local to the SpvOpFUnordLessThan, etc. handling. Add a comment explaining why the suboptimal SpvOpFUnordEqual implementation is used here. Suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12320>	2021-10-06 01:53:47 +00:00
Ian Romanick	1ce48ce91d	spirv: SpvOpFUnordNotEqual doesn't need special treatment The NIR fneu opcode already matches the "unordered not equal" semantics of the SPIR-V opcode. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12320>	2021-10-06 01:53:47 +00:00
Ian Romanick	f8148b861f	spirv: Minor cleanup in SpvOpFOrdNotEqual v2: Add a comment explaining why the suboptimal SpvOpFOrdNotEqual implementation is still used here. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12320>	2021-10-06 01:53:47 +00:00
Ian Romanick	803b754b81	spirv: Silence unused parameter warnings in vtn_alu.c Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12320>	2021-10-06 01:53:47 +00:00
Mike Blumenkrantz	c074b2d812	ci: updates fails are from #4571 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12831>	2021-10-06 01:12:29 +00:00
Mike Blumenkrantz	a2fb67209e	zink: support 16bit rgbx formats Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12831>	2021-10-06 01:12:29 +00:00
Alyssa Rosenzweig	c00e7b729f	pan/bi: Optimize abs(derivative) We implement fine derivatives as: broadcast(x, (lane & ~1) + 1) - broadcast(x, lane & ~1) Most of the complexity is to get the right sign. If we can ignore the sign, we can generate the simpler code: broadcast(x, lane ^ 1) - lane This is a particular win on v7+ where the broadcast instruction (CLPER) can do `lane ^ value` for free. However, even on v6 where we lower to an explicit XOR instruction, it's still a win. The limiting case is fwidth. The fragment shader gl_FragColor = fwidth(vec4_varying); has the following results on v6, v7, and v9: G72 (-26% instructions, -43% cycles): 38 inst, 30 tuples, 5 clauses, 1.166667 cycles, 1.166667 arith, 28 quadwords 28 inst, 19 tuples, 4 clauses, 0.666667 cycles, 0.666667 arith, 19 quadwords G76 (-37% instructions, -54% cycles): 38 inst, 30 tuples, 5 clauses, 1.166667 cycles, 1.166667 arith, 28 quadwords 24 inst, 16 tuples, 4 clauses, 0.541667 cycles, 0.541667 arith, 18 quadwords G78 (-40% instructions, -56% cycles): 40 inst, 1.125000 cycles, 0.250000 fma, 0.109375 cvt, 1.125000 sfu, 20 quadwords 24 inst, 0.500000 cycles, 0.250000 fma, 0.015625 cvt, 0.500000 sfu, 12 quadwords shader-db tells a similar story -- most shaders are unaffected, but a shader that uses fwidth has a 20% reduction in cycle count: instructions helped: shaders/tesseract/488.shader_test MESA_SHADER_FRAGMENT: 264 -> 262 (-0.76%) instructions helped: shaders/chromeos/109-1.shader_test MESA_SHADER_FRAGMENT: 36 -> 28 (-22.22%) tuples helped: shaders/chromeos/109-1.shader_test MESA_SHADER_FRAGMENT: 27 -> 22 (-18.52%) tuples HURT: shaders/tesseract/488.shader_test MESA_SHADER_FRAGMENT: 211 -> 212 (0.47%) clauses HURT: shaders/tesseract/488.shader_test MESA_SHADER_FRAGMENT: 32 -> 33 (3.12%) cycles helped: shaders/chromeos/109-1.shader_test MESA_SHADER_FRAGMENT: 1 -> 0.79 (-20.83%) arith helped: shaders/chromeos/109-1.shader_test MESA_SHADER_FRAGMENT: 1 -> 0.79 (-20.83%) quadwords helped: shaders/chromeos/109-1.shader_test MESA_SHADER_FRAGMENT: 31 -> 28 (-9.68%) quadwords HURT: shaders/tesseract/488.shader_test MESA_SHADER_FRAGMENT: 176 -> 178 (1.14%) total instructions in shared programs: 148370 -> 148360 (<.01%) instructions in affected programs: 300 -> 290 (-3.33%) helped: 2 HURT: 0 total tuples in shared programs: 124188 -> 124184 (<.01%) tuples in affected programs: 238 -> 234 (-1.68%) helped: 1 HURT: 1 helped stats (abs) min: 5.0 max: 5.0 x̄: 5.00 x̃: 5 helped stats (rel) min: 18.52% max: 18.52% x̄: 18.52% x̃: 18.52% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.47% max: 0.47% x̄: 0.47% x̃: 0.47% total clauses in shared programs: 25692 -> 25693 (<.01%) clauses in affected programs: 32 -> 33 (3.12%) helped: 0 HURT: 1 total cycles in shared programs: 12132.04 -> 12131.83 (<.01%) cycles in affected programs: 1 -> 0.79 (-20.83%) helped: 1 HURT: 0 total arith in shared programs: 4623.75 -> 4623.54 (<.01%) arith in affected programs: 1 -> 0.79 (-20.83%) helped: 1 HURT: 0 total quadwords in shared programs: 110386 -> 110385 (<.01%) quadwords in affected programs: 207 -> 206 (-0.48%) helped: 1 HURT: 1 helped stats (abs) min: 3.0 max: 3.0 x̄: 3.00 x̃: 3 helped stats (rel) min: 9.68% max: 9.68% x̄: 9.68% x̃: 9.68% HURT stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2 HURT stats (rel) min: 1.14% max: 1.14% x̄: 1.14% x̃: 1.14% Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12332>	2021-10-06 00:40:57 +00:00
Alyssa Rosenzweig	3e8f540753	nir: Add Mali-specific derivative opcodes Add derivative opcodes fddx_must_abs_mali/fddy_must_abs_mali satisfying: fabs(fdd_must_abs_mali(v)) = fabs(fdd(v)) The sign of their result is undefined. On Bifrost and Valhall, these unsigned derivatives can be implemented more efficiently than the correctly-signed counterparts, since the sign fixup requires extra ALU instructions. On backends where this is the case, it is useful to optimize fabs(fdd(v)) to fabs(fdd_must_abs_mali(v)). This pattern comes up with the GLSL builtin `fwidth`. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12332>	2021-10-06 00:40:57 +00:00
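
Note on c00e7b729f and 3e8f540753: the sign-agnostic derivative can be checked with a small simulation. This is a rough Python sketch, not Mesa code. It assumes a 2x2 quad stored as four lanes with horizontal neighbours at lanes (0, 1) and (2, 3), and it reads the second operand of the simplified form as the lane's own value of x (the message writes `- lane`). The function names are hypothetical.

```python
def fine_ddx(x, lane):
    """Correctly signed fine derivative: odd lane of the pair minus even lane."""
    return x[(lane & ~1) + 1] - x[lane & ~1]


def ddx_must_abs(x, lane):
    """Sign-agnostic variant: the pair neighbour (lane ^ 1) minus this lane's value.

    The magnitude matches fine_ddx; the sign is flipped on odd lanes.
    """
    return x[lane ^ 1] - x[lane]


# Example quad of per-lane values (arbitrary illustration data).
quad = [0.25, 1.0, -3.0, 2.5]

for lane in range(4):
    # After taking the absolute value, as fwidth does, both forms agree.
    assert abs(fine_ddx(quad, lane)) == abs(ddx_must_abs(quad, lane))
```

On even lanes the two forms are identical; on odd lanes the simplified form is the exact negation, which is why it is only usable under an absolute value.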

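Note on 0cf25f559f and 1ce48ce91d: the shorter code for unordered comparisons relies on NaN making every ordered comparison false. The sketch below uses Python floats as a stand-in for the IEEE semantics; it does not reproduce the vtn_alu.c code, and the function names are hypothetical.

```python
import math


def unord_less_than_reference(a, b):
    """Unordered less-than spelled out: true if either operand is NaN, or a < b."""
    return math.isnan(a) or math.isnan(b) or a < b


def unord_less_than_short(a, b):
    """Shorter form: negate the ordered >= comparison (the flt <-> fge swap)."""
    return not (a >= b)


def unord_not_equal_reference(a, b):
    """Unordered not-equal spelled out: true if either operand is NaN, or a != b."""
    return math.isnan(a) or math.isnan(b) or a != b


values = [float("nan"), float("-inf"), -1.0, -0.0, 0.0, 1.0, float("inf")]

for a in values:
    for b in values:
        assert unord_less_than_reference(a, b) == unord_less_than_short(a, b)
        # A plain IEEE != is already "unordered not equal", so no NaN checks are needed.
        assert unord_not_equal_reference(a, b) == (a != b)
```

The second assertion mirrors the observation in 1ce48ce91d that an opcode with "unordered not equal" semantics needs no special treatment.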

145887 Commits All Branches Search