Commit Graph

143834 Commits

Author SHA1 Message Date
Matt Turner dcf1d8d7a4 util: Replace recursive DFS with iterative implementation
Doesn't fix, but improves the situation in issue #5163. The
VK.spirv_assembly.instruction.graphics.spirv_ids_abuse.lots_ids_* tests
emit about 160k instructions. ir3_sched blows out the stack after
dag_traverse_bottom_up_node reaches a depth of about 130k frames.

This patch replaces the recursively-implemented post-order traversal
with an iterative implementation using a stack, allowing us to process
DAGs as large as memory can hold.

Definitely makes you appreciate the elegance of recursion...

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12232>
2021-08-17 17:54:09 +00:00
Matt Turner 9261a02028 util: Add unit tests for dag
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12232>
2021-08-17 17:54:09 +00:00
Connor Abbott d9c90eee8b freedreno: Decode a650+ CP_START_BIN/CP_END_BIN packets
The blob uses them for GMEM renderpasses.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12184>
2021-08-17 16:53:40 +00:00
Erik Faye-Lund 726fdf3385 st/mesa: correct point_tri_clip for gles2
The OpenGL ES 2.0 (and later) specifications, section 2.13 (Primitive
Clipping) say the following about point-clipping:

> If the primitive under consideration is a point, then clipping
> discards it if it lies outside the near or far clip plane; otherwise
> it is passed unchanged.

This matches the D3D convention, and we already have a rasterizer-state
bit for it. So let's set that bit when the API is OpenGL ES 2.

We can set this inside the PointSprite-conditional block, because that's
always enabled on GLES 2, and it's undefined to enable this state
without also enabling point_quad_rasterization.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12350>
2021-08-17 15:01:04 +00:00
Erik Faye-Lund 48b1b159ff llvmpipe: take intersection with bbox for non-legacy points
When I updated this code for multi-sampling, I missed one detail; if we
want to be able to support pipe_rasterizer_state::point_tri_clip, we
need to use the intersection of the bbox (clipped to the viewport
rectangle) and the generated primitive, otherwise we won't end up doing
x/y viewport clipping at all.

Because we've adjusted some of the parts of the bbox when adjusting for
inclusiveness/exclusiveness and fill-rule, we also need to reverse the
adjustment.

Fixes: f530e72ea0 ("llvmpipe: do not always use pixel-rounded coordinates for points")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12350>
2021-08-17 15:01:04 +00:00
Jason Ekstrand 782f75cb52 intel/isl: Use uint64_t for computed byte offsets
This is mostly a bit of future-proofing.  We never end up with offsets
that don't fit in 32 bits today because, thanks to driver limitations
caused by relocations, we don't allocate buffers bigger than 2GB today.
However, if we ever did, it's possible to create a surface on modern
platforms that consumes more than 4GB and we would end up with wrapping
in our offset calculations.

Acked-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11765>
2021-08-17 09:36:13 -05:00
Jason Ekstrand eb7c28bf24 intel/isl: Add a missing assert in isl_tiling_get_intratile_offset_sa
Fixes: a4dafe1fad "intel/isl: Make the offset helpers four dimensional"
Acked-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11765>
2021-08-17 09:14:39 -05:00
Jason Ekstrand 7d521bc104 intel/isl: Better document isl_tiling_get_intratile_offset_*
The docs weren't updated when we switched it to 4D.  Also, the new docs
are way better.  While we're here, use the parameter name offset_B to be
more consistent.

Acked-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11765>
2021-08-17 09:14:39 -05:00
Jason Ekstrand 9ab2f7d489 intel/isl: Add units to view dimensions in isl_surf_get_uncompressed_surf
This makes things a bit more clear.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11765>
2021-08-17 09:14:39 -05:00
Jason Ekstrand 3702406154 intel/isl: Explicitly set offset_B = 0 in get_uncomp_surf for arrays
The only user of this case is iris which initializes offset_B to 0 so
there's no real bug here.  However, it is unexpected from an API PoV.

Fixes: 9946120d2b "intel/isl: Add more cases to isl_surf_get_uncompressed_surf"
Acked-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11765>
2021-08-17 09:14:39 -05:00
Mike Blumenkrantz 40fdb3212c zink: add a suballocator
this is an aux/pipebuffer implementation borrowing heavily from the
one in radeonsi. it currently has the following limitations, which
will be resolved in a followup series:
* 32bit address space still explodes
* swapchain images still have separate memory handling

performance in games like Tomb Raider has been observed to increase by
over 1000%

SQUASHED: simplify get_memory_type_index()

now that the heaps are enumerated, this can be reduced to a simple
array index with a fallback

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12146>
2021-08-17 13:21:28 +00:00
Andreas Baierl 5df677e996 lima: CI: Enable GL_R8 and GL_RG8 texture formats
This is fixed in deqp now. See https://github.com/KhronosGroup/VK-GL-CTS/pull/241
Since CI is using deqp version > vulkan-cts-1.2.6.0, this isn't an issue anymore.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12409>
2021-08-17 14:49:51 +02:00
Erico Nunes 574bff9087 ci: enable CI for lima again
Enable CI for lima again on meson-gxl-s805x-libretech-ac boards
with Mali-450.
These boards are managed by a LAVA instance and so follow the LAVA CI
workflow in Mesa.
The goal is to have coverage for deqp-gles2, as lima is a GLES2-only
driver.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11789>
2021-08-17 11:22:59 +00:00
Filip Gawin 46d0126deb radv: improve rounding of zmin
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12388>
2021-08-17 11:03:59 +00:00
Roman Stratiienko 5ec6b6e9bb lima: Implement lima_resource_get_param() callback
Currently stride, offset, modifier is obtained by invoking
lima_resource_get_handle() with WINSYS_HANDLE_TYPE_KMS.

Before commit 47f000c170 this path was working. Obtained handle
was simply ignored by DRI frontend and only requested data used.

After commit 47f000c170 such requests started to fail when
DRI is initialized using KMSRO and resource has no scanout data.

When lima_resource_get_param() is implemented, it will be used in
a first place to obtain resource data.

Fixes: 47f000c170 ("lima: fail in get_handle(TYPE_KMS) without a scanout resource")
Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12362>
2021-08-17 10:50:51 +00:00
Jordan Justen 221871fb6d meson: Search for python3 before python for bin/meson_get_version.py
Most systems have either dropped the python executable, or made it
python3.

But it is still possible to configure a system such that python runs
python2. https://www.python.org/dev/peps/pep-0394/

Or, some developers may still be running older distributions where
python is python2.

Since bin/meson_get_version.py now requires python3, we should search
for python3 before python.

Fixes: f1eae2f8bb ("python: drop python2 support")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12400>
2021-08-17 09:45:54 +00:00
Juan A. Suarez Romero adfd3f8cd4 v3d/ci: add piglit flake
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12401>
2021-08-17 09:29:59 +00:00
Marcin Ślusarz 89bc8ff408 glsl/opt_algebraic: disable invalid optimization
When operators other than eq and ne are involved we can't really
move operands around and negate them because such transformation
may change the value of the whole expression.

Some examples:

For unsigned var:
0 >= 1u + var would eventually become 0xffffffff >= var,
which would always evaluate to true, when original expression
was true only for var == 0xffffffff.

For signed var:
0 >= 1 + var would become -1 >= var, which would evaluate to
false for var == 2147483647, when original expression evaluated
to true (because signed overflow is defined to wrap around in
glsl, 1 + 2147483647 == -2147483648, so 0 >= -2147483648).

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5226
Fixes: 34ec1a24d6 ("glsl: Optimize (x + y cmp 0) into (x cmp -y).")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12359>
2021-08-17 08:17:52 +00:00
Marcin Ślusarz f9e3ae6a01 intel/error-decode: printout INSTDONE_GEOM register for Gfx12.5
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12392>
2021-08-17 08:05:45 +00:00
Marcin Ślusarz 4f4f3b1072 genxml: add INSTDONE_GEOM register for Gfx12.5
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12392>
2021-08-17 08:05:45 +00:00
Lionel Landwerlin c8ff8a66cf intel/error-decode: printout more registers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12392>
2021-08-17 08:05:45 +00:00
Lionel Landwerlin bee7bff48e genxml: add more INSTDONE registers for Gfx12.5
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12392>
2021-08-17 08:05:45 +00:00
Ella-0 123590b88c v3dv: Implement VK_EXT_provoking_vertex
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12382>
2021-08-17 07:49:41 +00:00
Samuel Pitoiset 80e5e059fa radv: fix pre-computing viewport xform when setting new viewports
viewportCount is the number of viewports in pViewports while
firstViewport is the index.

Fixes new CTS dEQP-VK.draw.depth_clamp.*_clamp_four_viewports

Fixes: a2ef92d7a5 ("radv: pre-calculate viewport transforms")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12353>
2021-08-17 08:20:02 +02:00
Timothy Arceri edfcc4f022 nir: fix GCM when GVN enabled
Enabling GVN uncovered a bug where we would crash if the pass
thinking about pushing something into a loop.

Fixes: 6538b3e566 ("nir: add heuristic for instructions in loops with GCM")

Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12242>
2021-08-17 03:15:49 +00:00
Vinson Lee 104aa6b5d1 nv50/ir: Add FlatteningPass constructor.
Fix defect reported by Coverity Scan.

Uninitialized scalar field (UNINIT_CTOR)
member_not_init_in_gen_ctor: The compiler-generated constructor for this
class does not initialize gpr_unit.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12295>
2021-08-17 01:40:14 +00:00
Vinson Lee ac1ddfba35 zink: Remove unnecessary null checks.
Fix defects reported by Coverity Scan.

Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking sa suggests that it may be null,
but it has already been dereferenced on all paths leading to the
check.
check_after_deref: Null-checking sb suggests that it may be null,
but it has already been dereferenced on all paths leading to the
check.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12298>
2021-08-17 01:25:00 +00:00
Filip Gawin e6d996f8ff radeonsi: improve rounding of zmin
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12389>
2021-08-17 01:09:51 +00:00
Dave Airlie 9e6f414766 llvmpipe: init renderer string once to avoid races.
In a multithreads clover run the get_name call would race against
itself and sometimes an empty device name would occur.

Just init it once.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12285>
2021-08-16 23:20:00 +00:00
Dave Airlie ff99270923 gallivm: fix non-32 bit popcounts.
Fixes
OpenCL CTS integer_ops popcount

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12285>
2021-08-16 23:20:00 +00:00
Dave Airlie 9922ea7e66 gallivm: fix idiv/irem for 8/16/64-bit and 32-bit INT_MIN/-1
This fixes integer division for non-32bit but also fixes the
32-bit case where INT_MIN/-1 causes an exception.

Fixes CL CTS
./integer_ops/test_integer_ops quick_long_math

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12285>
2021-08-16 23:20:00 +00:00
Dave Airlie ff2d838c7a llvmpipe/cl: limit kernel input size.
Fixes:
api min_max_parameter_size

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12285>
2021-08-16 23:20:00 +00:00
Dave Airlie c3bede9c96 gallivm: don't lower local invocation index in frontend
The frontend can't handle variable block sizes properly,
so just lower it here in the backend.

Fixes CTS basic local_linear_id + get_linear_ids

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12285>
2021-08-16 23:20:00 +00:00
Emma Anholt baf800b236 i915g: Implement cube/3d texture_subdata() as a series of per-layer maps.
i915 doesn't lay out the images such that one could use a layer_stride to
step between them, and the individual maps should be just as good at
uploading.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12384>
2021-08-16 22:48:54 +00:00
Emma Anholt 9b51204d8f i915g: Fix 3D texture layouts for width != height.
Obvious typo here.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12384>
2021-08-16 22:48:54 +00:00
Ella-0 dad0c16782 v3dv: Implement VK_EXT_pipeline_creation_cache_control
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12381>
2021-08-16 20:41:03 +00:00
Dave Airlie 8f72268fc9 llvmpipe: enable GL compatibility profiles
The two rasterpos fails looks related to GLSL linking, the vertex
shader is linked with the geometry shader which doesn't use any
of it's outputs so they seem to get removed, which stops the rasterpos
from working.

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12374>
2021-08-17 05:34:55 +10:00
Dave Airlie 5b9ca78f47 draw: add vertex color clamping to gs/tes
This refactors out the vertex color clamping from the VS shader,
and adds calls to it for the tes/gs stages. It also conditionalised
they key on having later stages as clamping should only happen in
the last stage.

This is needed for GL compatibility profiles

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12374>
2021-08-17 05:33:33 +10:00
Dave Airlie 1ae55f05c0 draw/tess: add clipvertex support for compatibility
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12374>
2021-08-17 05:33:31 +10:00
Dave Airlie f48fed8e91 draw/gs: add clipvertex support for compatibility
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12374>
2021-08-17 05:33:27 +10:00
Dave Airlie 3b1b1af694 draw: handle primitive ID for quads/quad strips.
In order to enable compat contexts QUADS/QUAD_STRIPS need
to support primitive ID. There are some piglit tests for this.

This adds support to the decomposer to pass quads so the prim
assembler can pick them up and add primitive IDs.

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12374>
2021-08-17 05:33:11 +10:00
Juan A. Suarez Romero 441e490f5a v3dv: initialize CL submission structure
This fixes an issue related with testing this with a kernel with the
performance counters enabled: it introduces a "pad" field that in the CL
submission structure that is not initialized.

Fixes: ca13868098 ("drm-uapi: add v3d performance counters")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12390>
2021-08-16 18:06:35 +00:00
Rhys Perry 795f3b7318 ci: update trace hashes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>
2021-08-16 17:19:45 +00:00
Rhys Perry cfc4433015 nir,glsl_to_nir: use nir_fdot()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>
2021-08-16 17:19:45 +00:00
Rhys Perry f6f9000f84 spirv: create ffma more often
We will not be able to combine instructions into ffma later if they are
exact, so create them from the start. They can be lowered later if they
are unwanted.

fossil-db (GFX10.3):
Totals from 14697 (10.05% of 146267) affected shaders:
VGPRs: 645736 -> 614168 (-4.89%)
CodeSize: 59312768 -> 58735352 (-0.97%); split: -0.97%, +0.00%
MaxWaves: 372900 -> 376666 (+1.01%)
Instrs: 11339280 -> 11120882 (-1.93%); split: -1.93%, +0.00%
Latency: 284874519 -> 285277327 (+0.14%); split: -0.10%, +0.24%
InvThroughput: 68791374 -> 68526739 (-0.38%); split: -0.49%, +0.10%

fossil-db (GFX10):
Totals from 11039 (7.55% of 146267) affected shaders:
CodeSize: 54785444 -> 54785268 (-0.00%); split: -0.00%, +0.00%
Instrs: 10401349 -> 10401396 (+0.00%); split: -0.00%, +0.00%
Latency: 277781803 -> 278572890 (+0.28%); split: -0.00%, +0.29%
InvThroughput: 65035902 -> 65100855 (+0.10%); split: -0.00%, +0.10%

fossil-db (GFX9):
Totals from 24055 (16.43% of 146401) affected shaders:
SGPRs: 1790704 -> 1790640 (-0.00%)
VGPRs: 1105736 -> 1105716 (-0.00%)
CodeSize: 110944732 -> 110948812 (+0.00%); split: -0.00%, +0.01%
Instrs: 21609095 -> 21610227 (+0.01%); split: -0.00%, +0.01%
Latency: 756137596 -> 756145812 (+0.00%); split: -0.02%, +0.02%
InvThroughput: 344103825 -> 344112245 (+0.00%); split: -0.00%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>
2021-08-16 17:19:45 +00:00
Rhys Perry 28acc4120f nir: lower fdot to ffma if lower_ffma=false
fossil-db (GFX10.3):
Totals from 57689 (39.44% of 146267) affected shaders:
VGPRs: 2873712 -> 2873432 (-0.01%); split: -0.01%, +0.00%
CodeSize: 227661100 -> 227583572 (-0.03%); split: -0.08%, +0.04%
MaxWaves: 1289562 -> 1289598 (+0.00%); split: +0.01%, -0.00%
Instrs: 43115433 -> 43083308 (-0.07%); split: -0.12%, +0.05%
Latency: 869947191 -> 870279826 (+0.04%); split: -0.06%, +0.10%
InvThroughput: 199425811 -> 199434448 (+0.00%); split: -0.04%, +0.05%

fossil-db (GFX10):
Totals from 2 (0.00% of 146267) affected shaders:
Latency: 8123 -> 8107 (-0.20%)

fossil-db (GFX9):
Totals from 2 (0.00% of 146401) affected shaders:
(no stat changes)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>
2021-08-16 17:19:45 +00:00
Rhys Perry 174a4f36f9 nir: create ffma from builders more often
We will not be able to combine instructions into ffma later if they are
exact, so create them from the start. They can be lowered later if they
are unwanted.

fossil-db (GFX10.3):
Totals from 16589 (11.34% of 146267) affected shaders:
VGPRs: 938872 -> 938704 (-0.02%)
SpillSGPRs: 11334 -> 10785 (-4.84%)
CodeSize: 96551964 -> 96498040 (-0.06%); split: -0.08%, +0.02%
MaxWaves: 338760 -> 338772 (+0.00%)
Instrs: 18356857 -> 18350486 (-0.03%); split: -0.06%, +0.02%
Latency: 561563310 -> 561414360 (-0.03%); split: -0.08%, +0.05%
InvThroughput: 145629673 -> 145594740 (-0.02%); split: -0.04%, +0.01%

fossil-db (GFX10):
Totals from 16252 (11.11% of 146267) affected shaders:
VGPRs: 893820 -> 893744 (-0.01%)
SpillSGPRs: 11334 -> 10785 (-4.84%)
CodeSize: 95890244 -> 95839124 (-0.05%); split: -0.08%, +0.02%
MaxWaves: 367704 -> 367734 (+0.01%)
Instrs: 18199741 -> 18194437 (-0.03%); split: -0.06%, +0.03%
Latency: 560912971 -> 560854179 (-0.01%); split: -0.07%, +0.06%
InvThroughput: 142899814 -> 142877939 (-0.02%); split: -0.03%, +0.02%

fossil-db (GFX9):
Totals from 16287 (11.12% of 146401) affected shaders:
SGPRs: 1312784 -> 1312768 (-0.00%); split: -0.05%, +0.05%
VGPRs: 931440 -> 931444 (+0.00%); split: -0.00%, +0.00%
SpillSGPRs: 14623 -> 14597 (-0.18%)
CodeSize: 94428788 -> 94344404 (-0.09%); split: -0.10%, +0.01%
MaxWaves: 90105 -> 90109 (+0.00%)
Instrs: 18486905 -> 18473434 (-0.07%); split: -0.08%, +0.01%
Latency: 720947295 -> 720818323 (-0.02%); split: -0.07%, +0.05%
InvThroughput: 365240104 -> 365224659 (-0.00%); split: -0.02%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>
2021-08-16 17:19:45 +00:00
Rhys Perry ed70b256ce nir: add ffma creation helpers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>
2021-08-16 17:19:45 +00:00
Rhys Perry 4ec4d862c2 nir/algebraic: add is_used_once to dot product reassociation optimization
This improves register usage.

fossil-db (Sienna Cichlid, on top of !9805):
Totals from 4317 (2.88% of 149839) affected shaders:
VGPRs: 352592 -> 351704 (-0.25%); split: -1.48%, +1.23%
SpillSGPRs: 182 -> 248 (+36.26%)
CodeSize: 31601192 -> 31587624 (-0.04%); split: -0.09%, +0.04%
MaxWaves: 56964 -> 57298 (+0.59%); split: +2.48%, -1.90%
Instrs: 5973557 -> 5974122 (+0.01%); split: -0.05%, +0.06%
Latency: 72088175 -> 72253033 (+0.23%); split: -0.36%, +0.59%
InvThroughput: 14978160 -> 14798919 (-1.20%); split: -1.29%, +0.09%
VClause: 100994 -> 98645 (-2.33%); split: -3.05%, +0.73%
SClause: 278206 -> 276820 (-0.50%); split: -0.54%, +0.04%
Copies: 200264 -> 199556 (-0.35%); split: -1.17%, +0.82%
Branches: 86410 -> 85930 (-0.56%); split: -0.56%, +0.01%
PreSGPRs: 207355 -> 207759 (+0.19%); split: -0.00%, +0.20%
PreVGPRs: 314646 -> 310911 (-1.19%); split: -1.35%, +0.17%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>
2021-08-16 17:19:45 +00:00
Rhys Perry f95a16be72 nir/algebraic: reassociate add chains for more MAD/FMA-friendly code
fossil-db (GFX10.3):
Totals from 25866 (17.68% of 146267) affected shaders:
VGPRs: 1625456 -> 1644936 (+1.20%); split: -0.05%, +1.24%
SpillSGPRs: 11729 -> 11725 (-0.03%); split: -0.07%, +0.03%
CodeSize: 161604460 -> 161458052 (-0.09%); split: -0.11%, +0.02%
MaxWaves: 454842 -> 452160 (-0.59%); split: +0.04%, -0.63%
Instrs: 30652596 -> 30456446 (-0.64%); split: -0.65%, +0.01%
Latency: 723098749 -> 722084247 (-0.14%); split: -0.21%, +0.07%
InvThroughput: 166023468 -> 165506875 (-0.31%); split: -0.36%, +0.05%

fossil-db (GFX10):
Totals from 25866 (17.68% of 146267) affected shaders:
VGPRs: 1593576 -> 1611976 (+1.15%); split: -0.09%, +1.25%
SpillSGPRs: 11729 -> 11725 (-0.03%); split: -0.07%, +0.03%
CodeSize: 162294468 -> 162154456 (-0.09%); split: -0.11%, +0.02%
MaxWaves: 477448 -> 474166 (-0.69%); split: +0.10%, -0.79%
Instrs: 30820164 -> 30625805 (-0.63%); split: -0.65%, +0.02%
Latency: 723190249 -> 722273445 (-0.13%); split: -0.20%, +0.08%
InvThroughput: 163114872 -> 162582966 (-0.33%); split: -0.37%, +0.04%

fossil-db (GFX9):
Totals from 25866 (17.67% of 146401) affected shaders:
SGPRs: 2167808 -> 2169920 (+0.10%); split: -0.09%, +0.19%
VGPRs: 1649404 -> 1667592 (+1.10%); split: -0.43%, +1.53%
CodeSize: 161273556 -> 161281996 (+0.01%); split: -0.07%, +0.08%
MaxWaves: 114910 -> 113519 (-1.21%); split: +0.10%, -1.31%
Instrs: 31557180 -> 31403708 (-0.49%); split: -0.50%, +0.02%
Latency: 899594793 -> 898786283 (-0.09%); split: -0.19%, +0.10%
InvThroughput: 412265691 -> 411551698 (-0.17%); split: -0.28%, +0.11%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>
2021-08-16 17:19:45 +00:00