Commit Graph

6226 Commits

Author SHA1 Message Date
Rhys Perry 7c63ec70ef nir: document that ACCESS_RESTRICT is not set at intrinsics
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7295>
2021-06-10 13:17:22 +00:00
Rhys Perry 938098c98d nir/opt_load_store_vectorize: only require one variable to be restrict
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7295>
2021-06-10 13:17:22 +00:00
Rhys Perry 865ca3af2b nir/opt_load_store_vectorize: check for restrict at the variable
SPIR-V -> NIR doesn't set ACCESS_RESTRICT at the intrinsic.

fossil-db (GFX10.3):
Totals from 3 (0.00% of 139391) affected shaders:
CodeSize: 12364 -> 12356 (-0.06%)
Instrs: 2493 -> 2494 (+0.04%); split: -0.04%, +0.08%
Cycles: 15279372 -> 15295756 (+0.11%); split: -0.11%, +0.21%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7295>
2021-06-10 13:17:22 +00:00
Rhys Perry 2e7bceb220 nir/load_store_vectorizer: fix check_for_robustness() with indirect loads
fossil-db (GFX10.3, robustness2 enabled):
Totals from 13958 (9.54% of 146267) affected shaders:
VGPRs: 609168 -> 624304 (+2.48%); split: -0.05%, +2.53%
CodeSize: 48229504 -> 48488392 (+0.54%); split: -0.02%, +0.56%
MaxWaves: 354426 -> 349448 (-1.40%); split: +0.00%, -1.41%
Instrs: 9332093 -> 9375053 (+0.46%); split: -0.03%, +0.49%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7295>
2021-06-10 13:17:22 +00:00
Timur Kristóf 1e49018ced amd: Add extra source to the mbcnt_amd NIR intrinsic.
The v_mbcnt instructions can take an extra source that they add to
the result. This is not exposed in SPIR-V but we now expose it in NIR.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
2021-06-09 16:48:51 +00:00
Timur Kristóf 43ce80a58f nir: Add AMD-specific byte and lane permute intrinsics.
These map directly to v_perm_b32 and v_permlane_b32.
Unfortunately there is no corresponding NIR opcode or
intrinsics, and it's too tedious to puzzle these things
together from the existing NIR instructions.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
2021-06-09 16:48:51 +00:00
Timur Kristóf c92dab8e2b nir: Add nir_op_sad_u8x4 which corresponds to AMD's v_sad_u8.
NIR currently doesn't have any intrinsics for a horizontal packed add,
so this one is modeled after AMD's v_sad_u8.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
2021-06-09 16:48:51 +00:00
Caio Marcelo de Oliveira Filho e94c99513a nir/gather_info: Rename per_vertex to is_arrayed
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11252>
2021-06-09 07:35:57 +00:00
Caio Marcelo de Oliveira Filho a59f1d628a nir/lower_io: Rename vertex_index to array_index in helpers
The helpers will be reused for per-primitive variables that are also
arrayed, so use a more general name.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11252>
2021-06-09 07:35:57 +00:00
Alyssa Rosenzweig 95bd6e915f nir/lower_fragcolor: Avoid redundant load_output
At best, this is an extra instruction for NIR to optimize out. At worst,
depending on pass ordering nir_load_output could sneak into the final
NIR, even on drivers that don't support fbfetch.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11255>
2021-06-09 02:58:08 +00:00
Caio Marcelo de Oliveira Filho 8af6766062 nir: Move workgroup_size and workgroup_variable_size into common shader_info
Move it out the "cs" sub-struct, since these will be used for other
shader stages in the future.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11225>
2021-06-08 09:23:55 -07:00
Caio Marcelo de Oliveira Filho b5f6fc442c nir: Move zero_initialize_shared_memory into common shader_info
Move it out the "cs" sub-struct, since the bit will be used for other
shader stages in the future.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11225>
2021-06-08 09:23:55 -07:00
Bas Nieuwenhuizen 6b7ff241f4 nir/lower_returns: Deal with single-arg phis after if.
if we have

   if ... {
      return;
   } else {
      // block X
   }
   // block Y
   phi(X: ...)

then nir_lower_returns tries to move block Y into the else body,
except nir_cf_extract doesn't move the phi. As the return is removed
in the then-body the phi suddenly has the wrong number of arguments
(and the phi doesn't dominate its uses anymore).

In this case we know that the phi has to be single arg, so we can just
rewrite the users of the phis and drop them.

Hit this in my RT adventures, not sure if this is actually reachable
right now, as single arg phis tend to be kind of exceptional outside
of CSSA and we typically call nir_lower_returns pretty early.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11207>
2021-06-08 11:29:53 +00:00
Rhys Perry 1cbcfb8b38 nir, nir/algebraic: add byte/word insertion instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:42 +00:00
Rhys Perry edae3e5623 nir/algebraic: optimize extract of extract
Found in some sottr shaders (originally iand(ishr(a, 16), 0xffff))

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:42 +00:00
Caio Marcelo de Oliveira Filho c8a7bd0dc8 nir: Rename WORK_GROUP (and similar) to WORKGROUP
Be consistent with other usages in Vulkan and SPIR-V, and the recently
added workgroup_size field.

Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>
2021-06-07 22:34:42 +00:00
Caio Marcelo de Oliveira Filho a71a780598 nir: Rename nir_intrinsic_load_local_group_size to nir_intrinsic_load_workgroup_size
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>
2021-06-07 22:34:42 +00:00
Caio Marcelo de Oliveira Filho 43a6a2151b compiler: Rename SYSTEM_VALUE_LOCAL_GROUP_SIZE to SYSTEM_VALUE_WORKGROUP_SIZE
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>
2021-06-07 22:34:42 +00:00
Caio Marcelo de Oliveira Filho 430d2206da compiler: Rename local_size to workgroup_size
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>
2021-06-07 22:34:42 +00:00
Alyssa Rosenzweig c509878971 nir: Add nir_intrinsic_load_back_face_agx
On AGX, the special register for front facing is inverted from its meaning in
APIs. We need to lower load_front_face to inot(load_back_face). Doing this in
the backend is trivial, but then we would miss out on algebraic optimizations
for the inot.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11199>
2021-06-05 20:38:22 +00:00
Hoe Hao Cheng 90a5fef85c nir: define NIR_ALU_MAX_INPUTS
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11172>
2021-06-04 19:33:13 +00:00
Rhys Perry 49add985ff nir/unsigned_upper_bound: don't require dominance metadata
Instead, determine if it's a merge or loop exit phi.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9808>
2021-06-04 14:14:00 +00:00
Mike Blumenkrantz 1199d86b2c compiler/spirv: expand_to_vec4 -> nir_pad_vec4
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10821>
2021-05-31 18:45:24 +00:00
Mike Blumenkrantz f9ecbb1e1d nir/builder: add nir_mask
it's handy to have functions for generating masks

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10620>
2021-05-26 04:06:27 +00:00
Timothy Arceri 8b180ab98b nir/lower_io_to_vector: fix per vertex io handling for arrays
The pass was processing the per vertex index from the wrong end
of the array deref chain.

Fixes: bcd14756ee ("nir/lower_io_to_vector: add flat mode")

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10798>
2021-05-21 02:43:30 +00:00
Ian Romanick 880b00dc59 nir/lower_tex: Add support for lowering YUYV formats
v2: Rebase on bc438c91d9 ("nir/lower_tex: ignore texture_index if
tex_instr has deref src")

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9610>
2021-05-21 01:40:22 +00:00
Ian Romanick 1358d93650 nir/lower_tex: Add support for lowering Y41x formats
These are similar to AYUV, but the channel ordering is different... in
such a way that there's no RGBA format that will make the channels line
up right.

v2: Rebase on bc438c91d9 ("nir/lower_tex: ignore texture_index if
tex_instr has deref src")

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9610>
2021-05-21 01:40:22 +00:00
Alyssa Rosenzweig 4d44d4179e glsl: Fix packing of matrices for XFB
The CAP for packed transform feedback concerns packing of unrelated
variables into the same varying slot. (On Mali, transform feedback is
implemented on a per-slot basis, so different variables need different
slots to be written to different buffers.) However, this requirement is
tangential to the packing of arrays, matrices, and structures inherent
to GLSL. These array-like values need to be packed /within/ their slot,
even though drivers using the CAP (just Panfrost) cannot pack
independent values in the slot. Transform feedback of individual
elements is not independent, after all.

Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10778>
2021-05-20 10:05:39 +00:00
Alyssa Rosenzweig 538ab8c571 glsl: Fix subscripted arrays with no XFB packing
We need to duplicate the subscripted members even if they happen to be
aligned, since the other elements may be passed into the consumer
shader. Fixes on Panfrost:

dEQP-GLES3.functional.transform_feedback.array_element.interleaved.lines.highp_float

Note: the test did pass on main previously due to an elaborate set of
driver hacks. I don't believe the old behaviour was correct regardless.

Only Panfrost is affected by this change and the next, as every other
driver sets PIPE_CAP_PACKED_STREAM_OUTPUT.

Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10778>
2021-05-20 10:05:39 +00:00
Ian Romanick d246c31ec1 nir/algebraic: Add algebraic opt for float comparisons with identical operands.
The flt version could have been added in 56e21647e2, but our
collective understanding of NaN and comparisons was poor in 2015.  The
new "is_a_number" predicate makes the others possible.

All of the helped shaders in shader-db are either from Mad Max or Skia.
Some of the Skia shaders just get decimated by this change:

instructions helped:   shaders/skia/580-4.shader_test FS SIMD8:          81 -> 29 (-64.20%) (scheduled: top-down)

I looked at a couple of those shaders, and they had sequences like:

        vec1 32 ssa_44 = flt32 ssa_32, ssa_32
        vec1 32 ssa_45 = b32csel ssa_44, ssa_43, ssa_0
        vec1 32 ssa_46 = fge32 ssa_32, ssa_32
        vec1 32 ssa_47 = b32csel ssa_46, ssa_0, ssa_45
        vec1 32 ssa_48 = iand ssa_46, ssa_44
        vec1 32 ssa_49 = b32csel ssa_48, ssa_43, ssa_0

ssa_44 is replaced with False.  Then ssa_47 selects between ssa_0 and
ssa_0, so ssa_47 and ssa_46 are eliminated.  ssa_48 is (False && don't
care), so ssa_48 and ssa_49 are eliminated.  After that, many
calculations now involve constants of zero, so they are optimized down
too.  So it continues until there's not much left!

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

All Intel platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 21072238 -> 21071386 (<.01%)
instructions in affected programs: 33722 -> 32870 (-2.53%)
helped: 146
HURT: 1
helped stats (abs) min: 1 max: 62 x̄: 5.84 x̃: 2
helped stats (rel) min: 0.19% max: 62.35% x̄: 4.09% x̃: 1.07%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.20% max: 0.20% x̄: 0.20% x̃: 0.20%
95% mean confidence interval for instructions value: -7.94 -3.65
95% mean confidence interval for instructions %-change: -5.87% -2.25%
Instructions are helped.

total cycles in shared programs: 856203326 -> 856192238 (<.01%)
cycles in affected programs: 749966 -> 738878 (-1.48%)
helped: 148
HURT: 0
helped stats (abs) min: 1 max: 1226 x̄: 74.92 x̃: 18
helped stats (rel) min: 0.07% max: 49.70% x̄: 2.69% x̃: 0.46%
95% mean confidence interval for cycles value: -104.82 -45.02
95% mean confidence interval for cycles %-change: -4.01% -1.37%
Cycles are helped.

LOST:   4
GAINED: 0

Fossil-db results:

Tiger Lake
Instructions in all programs: 160915223 -> 160898354 (-0.0%)
SENDs in all programs: 6812780 -> 6812780 (+0.0%)
Loops in all programs: 38340 -> 38340 (+0.0%)
Cycles in all programs: 7434144207 -> 7433978462 (-0.0%)
Spills in all programs: 192582 -> 192582 (+0.0%)
Fills in all programs: 304537 -> 304537 (+0.0%)

Ice Lake
Instructions in all programs: 145296298 -> 145279531 (-0.0%)
SENDs in all programs: 6863692 -> 6863692 (+0.0%)
Loops in all programs: 38334 -> 38334 (+0.0%)
Cycles in all programs: 8800257014 -> 8800088384 (-0.0%)
Spills in all programs: 216880 -> 216880 (+0.0%)
Fills in all programs: 334248 -> 334248 (+0.0%)

Skylake
Instructions in all programs: 135891664 -> 135874910 (-0.0%)
SENDs in all programs: 6802946 -> 6802946 (+0.0%)
Loops in all programs: 38331 -> 38331 (+0.0%)
Cycles in all programs: 8444273433 -> 8444130932 (-0.0%)
Spills in all programs: 194839 -> 194839 (+0.0%)
Fills in all programs: 301114 -> 301114 (+0.0%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>
2021-05-20 01:39:35 +00:00
Ian Romanick 64bcfc3a17 nir/algebraic: Rearrange some logic-joined comparisons and reduce
On Skylake and Broadwell, a single big compute shader in Dirt Rally has
spills and fills *REALLY* helped.  That same shader is hurt very
slightly for spills and fills on Ice Lake.

v2: Move the patterns earlier to be nearer other patterns that are
similar.  Mark the replacement fmin and fmax exact.  Both suggested by
Rhys.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Tiger Lake
total instructions in shared programs: 21073812 -> 21073041 (<.01%)
instructions in affected programs: 77608 -> 76837 (-0.99%)
helped: 522
HURT: 33
helped stats (abs) min: 1 max: 26 x̄: 1.58 x̃: 1
helped stats (rel) min: 0.22% max: 14.29% x̄: 1.29% x̃: 1.02%
HURT stats (abs)   min: 1 max: 8 x̄: 1.67 x̃: 1
HURT stats (rel)   min: 0.25% max: 3.42% x̄: 1.06% x̃: 0.86%
95% mean confidence interval for instructions value: -1.57 -1.20
95% mean confidence interval for instructions %-change: -1.25% -1.05%
Instructions are helped.

total cycles in shared programs: 856224346 -> 856211096 (<.01%)
cycles in affected programs: 2394231 -> 2380981 (-0.55%)
helped: 603
HURT: 25
helped stats (abs) min: 1 max: 5218 x̄: 59.37 x̃: 28
helped stats (rel) min: 0.06% max: 5.61% x̄: 1.52% x̃: 1.37%
HURT stats (abs)   min: 2 max: 21394 x̄: 901.92 x̃: 10
HURT stats (rel)   min: 0.02% max: 5.90% x̄: 0.95% x̃: 0.59%
95% mean confidence interval for cycles value: -93.61 51.41
95% mean confidence interval for cycles %-change: -1.50% -1.34%
Inconclusive result (value mean confidence interval includes 0).

LOST:   1
GAINED: 1

Ice Lake
total instructions in shared programs: 20025692 -> 20024554 (<.01%)
instructions in affected programs: 104981 -> 103843 (-1.08%)
helped: 738
HURT: 0
helped stats (abs) min: 1 max: 30 x̄: 1.54 x̃: 1
helped stats (rel) min: 0.31% max: 10.53% x̄: 1.20% x̃: 1.06%
95% mean confidence interval for instructions value: -1.66 -1.43
95% mean confidence interval for instructions %-change: -1.26% -1.14%
Instructions are helped.

total cycles in shared programs: 979474407 -> 979422333 (<.01%)
cycles in affected programs: 4136364 -> 4084290 (-1.26%)
helped: 759
HURT: 59
helped stats (abs) min: 2 max: 11010 x̄: 72.78 x̃: 28
helped stats (rel) min: 0.03% max: 6.43% x̄: 1.23% x̃: 1.02%
HURT stats (abs)   min: 1 max: 698 x̄: 53.66 x̃: 8
HURT stats (rel)   min: 0.02% max: 24.05% x̄: 1.64% x̃: 0.33%
95% mean confidence interval for cycles value: -97.08 -30.24
95% mean confidence interval for cycles %-change: -1.14% -0.91%
Cycles are helped.

total spills in shared programs: 10568 -> 10569 (<.01%)
spills in affected programs: 102 -> 103 (0.98%)
helped: 0
HURT: 1

total fills in shared programs: 11347 -> 11349 (0.02%)
fills in affected programs: 277 -> 279 (0.72%)
helped: 0
HURT: 1

LOST:   2
GAINED: 2

Skylake
total instructions in shared programs: 18190419 -> 18188523 (-0.01%)
instructions in affected programs: 102502 -> 100606 (-1.85%)
helped: 791
HURT: 0
helped stats (abs) min: 1 max: 676 x̄: 2.40 x̃: 1
helped stats (rel) min: 0.34% max: 20.23% x̄: 1.41% x̃: 1.23%
95% mean confidence interval for instructions value: -4.07 -0.72
95% mean confidence interval for instructions %-change: -1.47% -1.34%
Instructions are helped.

total cycles in shared programs: 960737969 -> 960498951 (-0.02%)
cycles in affected programs: 4435351 -> 4196333 (-5.39%)
helped: 804
HURT: 67
helped stats (abs) min: 1 max: 198540 x̄: 300.54 x̃: 24
helped stats (rel) min: 0.03% max: 25.41% x̄: 1.21% x̃: 0.92%
HURT stats (abs)   min: 2 max: 680 x̄: 39.06 x̃: 6
HURT stats (rel)   min: 0.05% max: 23.98% x̄: 1.12% x̃: 0.19%
95% mean confidence interval for cycles value: -722.03 173.20
95% mean confidence interval for cycles %-change: -1.15% -0.91%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 9757 -> 9722 (-0.36%)
spills in affected programs: 138 -> 103 (-25.36%)
helped: 1
HURT: 0

total fills in shared programs: 9861 -> 9576 (-2.89%)
fills in affected programs: 564 -> 279 (-50.53%)
helped: 1
HURT: 0

LOST:   5
GAINED: 2

Broadwell
total instructions in shared programs: 17853870 -> 17852414 (<.01%)
instructions in affected programs: 101276 -> 99820 (-1.44%)
helped: 777
HURT: 0
helped stats (abs) min: 1 max: 264 x̄: 1.87 x̃: 1
helped stats (rel) min: 0.34% max: 8.44% x̄: 1.37% x̃: 1.23%
95% mean confidence interval for instructions value: -2.54 -1.21
95% mean confidence interval for instructions %-change: -1.42% -1.32%
Instructions are helped.

total cycles in shared programs: 1029846029 -> 1029725458 (-0.01%)
cycles in affected programs: 4435791 -> 4315220 (-2.72%)
helped: 813
HURT: 43
helped stats (abs) min: 2 max: 68560 x̄: 149.95 x̃: 24
helped stats (rel) min: 0.02% max: 73.73% x̄: 1.43% x̃: 0.92%
HURT stats (abs)   min: 2 max: 726 x̄: 31.12 x̃: 13
HURT stats (rel)   min: 0.01% max: 8.43% x̄: 0.62% x̃: 0.31%
95% mean confidence interval for cycles value: -299.58 17.87
95% mean confidence interval for cycles %-change: -1.63% -1.02%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 20333 -> 20307 (-0.13%)
spills in affected programs: 151 -> 125 (-17.22%)
helped: 1
HURT: 0

total fills in shared programs: 25899 -> 25775 (-0.48%)
fills in affected programs: 573 -> 449 (-21.64%)
helped: 1
HURT: 0

LOST:   5
GAINED: 0

Sandy Bridge, Ivy Bridge, and Haswell had similar results. (Haswell shown)
total instructions in shared programs: 16417658 -> 16416320 (<.01%)
instructions in affected programs: 96495 -> 95157 (-1.39%)
helped: 774
HURT: 0
helped stats (abs) min: 1 max: 18 x̄: 1.73 x̃: 1
helped stats (rel) min: 0.33% max: 9.80% x̄: 1.52% x̃: 1.20%
95% mean confidence interval for instructions value: -1.83 -1.63
95% mean confidence interval for instructions %-change: -1.59% -1.46%
Instructions are helped.

total cycles in shared programs: 1037104346 -> 1037080579 (<.01%)
cycles in affected programs: 3787747 -> 3763980 (-0.63%)
helped: 791
HURT: 53
helped stats (abs) min: 1 max: 5411 x̄: 65.87 x̃: 32
helped stats (rel) min: 0.02% max: 21.17% x̄: 1.44% x̃: 1.18%
HURT stats (abs)   min: 2 max: 14160 x̄: 534.72 x̃: 18
HURT stats (rel)   min: 0.02% max: 15.37% x̄: 5.70% x̃: 0.54%
95% mean confidence interval for cycles value: -69.39 13.07
95% mean confidence interval for cycles %-change: -1.19% -0.80%
Inconclusive result (value mean confidence interval includes 0).

LOST:   12
GAINED: 2

GM45 and Iron Lake had similar results. (Iron Lake shown)
total instructions in shared programs: 8132855 -> 8132703 (<.01%)
instructions in affected programs: 8782 -> 8630 (-1.73%)
helped: 38
HURT: 0
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 1.66% max: 3.23% x̄: 1.77% x̃: 1.72%
95% mean confidence interval for instructions value: -4.00 -4.00
95% mean confidence interval for instructions %-change: -1.88% -1.65%
Instructions are helped.

total cycles in shared programs: 238300850 -> 238298568 (<.01%)
cycles in affected programs: 257202 -> 254920 (-0.89%)
helped: 62
HURT: 2
helped stats (abs) min: 4 max: 58 x̄: 36.90 x̃: 50
helped stats (rel) min: 0.15% max: 1.55% x̄: 0.87% x̃: 1.12%
HURT stats (abs)   min: 2 max: 4 x̄: 3.00 x̃: 3
HURT stats (rel)   min: 0.12% max: 0.22% x̄: 0.17% x̃: 0.17%
95% mean confidence interval for cycles value: -41.34 -29.98
95% mean confidence interval for cycles %-change: -0.95% -0.73%
Cycles are helped.

Fossil-db results:

All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 145296888 -> 145296346 (-0.0%)
SENDs in all programs: 6863696 -> 6863696 (+0.0%)
Loops in all programs: 38334 -> 38334 (+0.0%)
Cycles in all programs: 8800262303 -> 8800258950 (-0.0%)
Spills in all programs: 216880 -> 216880 (+0.0%)
Fills in all programs: 334248 -> 334248 (+0.0%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>
2021-05-20 01:39:35 +00:00
Ian Romanick adc2835646 nir/algebraic: Mark some more logic-joined comparison reductions as exact
If the values are known to be numbers, the the replacements are exact.
This is only applied to the patterns with constants.  Constants should
always be numbers, and shaders with NaN constants should be handled in a
different way.

No shader-db or fossil-db changes on any Intel platform.  The intention
is to make these patterns more future proof.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>
2021-05-20 01:39:35 +00:00
Ian Romanick 23bbf3932b nir/algebraic: Mark some more comparison reductions exact
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

All Haswell and later Intel platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 21049056 -> 21048939 (<.01%)
instructions in affected programs: 4716 -> 4599 (-2.48%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 6 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.99% max: 5.43% x̄: 2.80% x̃: 2.51%
95% mean confidence interval for instructions value: -3.46 -2.54
95% mean confidence interval for instructions %-change: -3.22% -2.38%
Instructions are helped.

total cycles in shared programs: 855141411 -> 855141159 (<.01%)
cycles in affected programs: 54491 -> 54239 (-0.46%)
helped: 28
HURT: 5
helped stats (abs) min: 2 max: 34 x̄: 12.82 x̃: 12
helped stats (rel) min: 0.06% max: 2.73% x̄: 0.94% x̃: 0.75%
HURT stats (abs)   min: 2 max: 52 x̄: 21.40 x̃: 6
HURT stats (rel)   min: 0.11% max: 2.46% x̄: 0.90% x̃: 0.56%
95% mean confidence interval for cycles value: -13.72 -1.55
95% mean confidence interval for cycles %-change: -1.01% -0.31%
Cycles are helped.

Tiger Lake
Instructions in all programs: 160902191 -> 160899554 (-0.0%)
SENDs in all programs: 6812435 -> 6812435 (+0.0%)
Loops in all programs: 38225 -> 38225 (+0.0%)
Cycles in all programs: 7428581420 -> 7428555881 (-0.0%)
Spills in all programs: 192582 -> 192582 (+0.0%)
Fills in all programs: 304539 -> 304539 (+0.0%)

A lot of fragment shaders in Shadow of the Tomb Raider were helped, and
a bunch of vertex shaders in Octopath Traveler were hurt.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>
2021-05-20 01:39:35 +00:00
Ian Romanick 7d85dc4f35 nir/algebraic: Equality comparison inversions require sources be numbers
v2: Update A630 expected image checksum for minetest.trace.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Tiger Lake
total instructions in shared programs: 21036690 -> 21049485 (0.06%)
instructions in affected programs: 852085 -> 864880 (1.50%)
helped: 240
HURT: 2514
helped stats (abs) min: 1 max: 46 x̄: 2.45 x̃: 2
helped stats (rel) min: 0.15% max: 4.30% x̄: 0.79% x̃: 0.55%
HURT stats (abs)   min: 1 max: 198 x̄: 5.32 x̃: 2
HURT stats (rel)   min: 0.06% max: 10.71% x̄: 1.48% x̃: 1.04%
95% mean confidence interval for instructions value: 4.14 5.15
95% mean confidence interval for instructions %-change: 1.23% 1.34%
Instructions are HURT.

total cycles in shared programs: 856045255 -> 855816220 (-0.03%)
cycles in affected programs: 16743786 -> 16514751 (-1.37%)
helped: 790
HURT: 1973
helped stats (abs) min: 1 max: 10766 x̄: 627.97 x̃: 18
helped stats (rel) min: <.01% max: 32.59% x̄: 3.01% x̃: 0.64%
HURT stats (abs)   min: 1 max: 4078 x̄: 135.36 x̃: 18
HURT stats (rel)   min: <.01% max: 54.56% x̄: 2.80% x̃: 0.82%
95% mean confidence interval for cycles value: -131.36 -34.42
95% mean confidence interval for cycles %-change: 0.88% 1.40%
Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).

total spills in shared programs: 9771 -> 9766 (-0.05%)
spills in affected programs: 47 -> 42 (-10.64%)
helped: 1
HURT: 0

total fills in shared programs: 9451 -> 9430 (-0.22%)
fills in affected programs: 91 -> 70 (-23.08%)
helped: 1
HURT: 0

LOST:   16
GAINED: 51

All Intel GPUs from Sandybridge through Ice Lake had similar results. (Ice Lake shown)
total instructions in shared programs: 20024781 -> 20025568 (<.01%)
instructions in affected programs: 103309 -> 104096 (0.76%)
helped: 12
HURT: 389
helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1
helped stats (rel) min: 0.20% max: 2.70% x̄: 1.36% x̃: 1.37%
HURT stats (abs)   min: 1 max: 8 x̄: 2.06 x̃: 1
HURT stats (rel)   min: 0.05% max: 7.14% x̄: 1.25% x̃: 0.95%
95% mean confidence interval for instructions value: 1.78 2.15
95% mean confidence interval for instructions %-change: 1.06% 1.28%
Instructions are HURT.

total cycles in shared programs: 979419070 -> 979439180 (<.01%)
cycles in affected programs: 4968711 -> 4988821 (0.40%)
helped: 60
HURT: 381
helped stats (abs) min: 1 max: 1296 x̄: 96.92 x̃: 26
helped stats (rel) min: <.01% max: 27.10% x̄: 1.64% x̃: 0.65%
HURT stats (abs)   min: 1 max: 7320 x̄: 68.04 x̃: 30
HURT stats (rel)   min: <.01% max: 19.77% x̄: 1.32% x̃: 0.87%
95% mean confidence interval for cycles value: 10.25 80.95
95% mean confidence interval for cycles %-change: 0.69% 1.15%
Cycles are HURT.

LOST:   1
GAINED: 2

GM45 and Iron Lake had similar results. (Iron Lake shown)
total instructions in shared programs: 8128474 -> 8132527 (0.05%)
instructions in affected programs: 642323 -> 646376 (0.63%)
helped: 12
HURT: 1972
helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4
helped stats (rel) min: 0.72% max: 1.72% x̄: 1.09% x̃: 0.83%
HURT stats (abs)   min: 1 max: 16 x̄: 2.07 x̃: 3
HURT stats (rel)   min: 0.12% max: 7.14% x̄: 0.77% x̃: 0.70%
95% mean confidence interval for instructions value: 1.99 2.10
95% mean confidence interval for instructions %-change: 0.74% 0.79%
Instructions are HURT.

total cycles in shared programs: 238280994 -> 238294376 (<.01%)
cycles in affected programs: 8841250 -> 8854632 (0.15%)
helped: 84
HURT: 1192
helped stats (abs) min: 4 max: 64 x̄: 12.50 x̃: 8
helped stats (rel) min: 0.02% max: 1.61% x̄: 0.28% x̃: 0.17%
HURT stats (abs)   min: 2 max: 198 x̄: 12.11 x̃: 12
HURT stats (rel)   min: 0.02% max: 8.03% x̄: 0.28% x̃: 0.14%
95% mean confidence interval for cycles value: 9.65 11.32
95% mean confidence interval for cycles %-change: 0.22% 0.27%
Cycles are HURT.

No fossil-db changes on any Intel platform.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>
2021-05-20 01:39:35 +00:00
Ian Romanick 4246c2869c nir/algebraic: Invert comparisons less often
This fixes the piglit test range_analysis_fsat_of_nan.shader_test.  That
test contains some code like

    o = saturate(X) > 0 ? vec4(1.0, 0.0, 0.0, 1.0)
                        : vec4(0.0, 1.0, 0.0, 1.0);

A clever optimizer will convert this to

    o = vec4(float(saturate(X) > 0),
             float(!(saturate(X) > 0)),
             0, 1);

Due to the ordering of optimizations in the compiler, the `saturate`
operations are removed.  This is safe even in the presense of NaN.

    o = vec4(float(X > 0), float(!(X > 0)), 0, 1);

Since the calculations are not marked precise, an overzealous
optimizer may reduce this to

    o = vec4(float(X > 0), float(X <= 0), 0, 1);

This will result in black being output.  The GLSL spec gives quite a bit
of leeway with respect to NaN, but that seems too far.  The shader
author asked for a result of red or green.  A result of black is still
"undefined behavior," but it's also a little mean.

This also enables CSE to do its job better.

v2: Update A530 expected image checksum for minetest.trace.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4531
Fixes: 0dbda153aa ("nir/algebraic: Flag inexact optimizations")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Tiger Lake
total instructions in shared programs: 21041563 -> 21041789 (<.01%)
instructions in affected programs: 992066 -> 992292 (0.02%)
helped: 526
HURT: 548
helped stats (abs) min: 1 max: 16 x̄: 2.48 x̃: 2
helped stats (rel) min: 0.04% max: 5.56% x̄: 0.74% x̃: 0.49%
HURT stats (abs)   min: 1 max: 27 x̄: 2.80 x̃: 2
HURT stats (rel)   min: 0.04% max: 4.55% x̄: 0.59% x̃: 0.38%
95% mean confidence interval for instructions value: -0.00 0.42
95% mean confidence interval for instructions %-change: -0.12% <.01%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 855885569 -> 856118189 (0.03%)
cycles in affected programs: 343637248 -> 343869868 (0.07%)
helped: 907
HURT: 541
helped stats (abs) min: 1 max: 7724 x̄: 206.45 x̃: 36
helped stats (rel) min: <.01% max: 29.97% x̄: 1.01% x̃: 0.37%
HURT stats (abs)   min: 1 max: 14177 x̄: 776.09 x̃: 31
HURT stats (rel)   min: <.01% max: 29.94% x̄: 1.24% x̃: 0.35%
95% mean confidence interval for cycles value: 84.30 237.00
95% mean confidence interval for cycles %-change: -0.32% -0.01%
Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).

LOST:   3
GAINED: 5

Ice Lake
total instructions in shared programs: 20027107 -> 20025352 (<.01%)
instructions in affected programs: 1068856 -> 1067101 (-0.16%)
helped: 1153
HURT: 273
helped stats (abs) min: 1 max: 14 x̄: 1.83 x̃: 1
helped stats (rel) min: 0.03% max: 5.66% x̄: 0.61% x̃: 0.35%
HURT stats (abs)   min: 1 max: 15 x̄: 1.29 x̃: 1
HURT stats (rel)   min: 0.16% max: 1.30% x̄: 0.58% x̃: 0.60%
95% mean confidence interval for instructions value: -1.33 -1.13
95% mean confidence interval for instructions %-change: -0.43% -0.34%
Instructions are helped.

total cycles in shared programs: 979499227 -> 979448725 (<.01%)
cycles in affected programs: 344261539 -> 344211037 (-0.01%)
helped: 1079
HURT: 441
helped stats (abs) min: 1 max: 9384 x̄: 147.78 x̃: 48
helped stats (rel) min: <.01% max: 31.83% x̄: 0.90% x̃: 0.33%
HURT stats (abs)   min: 1 max: 7220 x̄: 247.07 x̃: 32
HURT stats (rel)   min: <.01% max: 31.30% x̄: 1.52% x̃: 0.53%
95% mean confidence interval for cycles value: -70.01 3.56
95% mean confidence interval for cycles %-change: -0.35% -0.05%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 10564 -> 10568 (0.04%)
spills in affected programs: 143 -> 147 (2.80%)
helped: 0
HURT: 1

total fills in shared programs: 11343 -> 11347 (0.04%)
fills in affected programs: 287 -> 291 (1.39%)
helped: 0
HURT: 1

LOST:   3
GAINED: 2

Skylake
total instructions in shared programs: 18192274 -> 18190128 (-0.01%)
instructions in affected programs: 1000188 -> 998042 (-0.21%)
helped: 1149
HURT: 55
helped stats (abs) min: 1 max: 14 x̄: 1.92 x̃: 1
helped stats (rel) min: 0.04% max: 6.67% x̄: 0.67% x̃: 0.42%
HURT stats (abs)   min: 1 max: 2 x̄: 1.05 x̃: 1
HURT stats (rel)   min: 0.16% max: 0.55% x̄: 0.27% x̃: 0.26%
95% mean confidence interval for instructions value: -1.87 -1.69
95% mean confidence interval for instructions %-change: -0.67% -0.58%
Instructions are helped.

total cycles in shared programs: 960856054 -> 960728040 (-0.01%)
cycles in affected programs: 340840968 -> 340712954 (-0.04%)
helped: 1079
HURT: 233
helped stats (abs) min: 1 max: 7640 x̄: 170.95 x̃: 46
helped stats (rel) min: <.01% max: 30.20% x̄: 0.96% x̃: 0.28%
HURT stats (abs)   min: 1 max: 6864 x̄: 242.23 x̃: 26
HURT stats (rel)   min: <.01% max: 34.64% x̄: 2.10% x̃: 0.22%
95% mean confidence interval for cycles value: -135.62 -59.53
95% mean confidence interval for cycles %-change: -0.59% -0.25%
Cycles are helped.

LOST:   15
GAINED: 1

Broadwell
total instructions in shared programs: 17855624 -> 17853580 (-0.01%)
instructions in affected programs: 1012209 -> 1010165 (-0.20%)
helped: 1105
HURT: 52
helped stats (abs) min: 1 max: 13 x̄: 1.90 x̃: 1
helped stats (rel) min: 0.03% max: 6.67% x̄: 0.67% x̃: 0.36%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.13% max: 0.52% x̄: 0.26% x̃: 0.25%
95% mean confidence interval for instructions value: -1.86 -1.67
95% mean confidence interval for instructions %-change: -0.68% -0.58%
Instructions are helped.

total cycles in shared programs: 1029905447 -> 1029840699 (<.01%)
cycles in affected programs: 347102680 -> 347037932 (-0.02%)
helped: 1007
HURT: 211
helped stats (abs) min: 1 max: 1360 x̄: 89.76 x̃: 48
helped stats (rel) min: <.01% max: 16.26% x̄: 0.69% x̃: 0.25%
HURT stats (abs)   min: 1 max: 1297 x̄: 121.51 x̃: 20
HURT stats (rel)   min: <.01% max: 31.31% x̄: 1.21% x̃: 0.20%
95% mean confidence interval for cycles value: -62.39 -43.92
95% mean confidence interval for cycles %-change: -0.47% -0.25%
Cycles are helped.

total spills in shared programs: 20335 -> 20333 (<.01%)
spills in affected programs: 19 -> 17 (-10.53%)
helped: 2
HURT: 0

total fills in shared programs: 25905 -> 25899 (-0.02%)
fills in affected programs: 23 -> 17 (-26.09%)
helped: 2
HURT: 0

LOST:   9
GAINED: 0

Haswell
total instructions in shared programs: 16418516 -> 16417293 (<.01%)
instructions in affected programs: 223785 -> 222562 (-0.55%)
helped: 590
HURT: 67
helped stats (abs) min: 1 max: 15 x̄: 2.19 x̃: 1
helped stats (rel) min: 0.03% max: 6.52% x̄: 0.87% x̃: 0.60%
HURT stats (abs)   min: 1 max: 2 x̄: 1.04 x̃: 1
HURT stats (rel)   min: 0.04% max: 1.85% x̄: 0.44% x̃: 0.25%
95% mean confidence interval for instructions value: -2.01 -1.71
95% mean confidence interval for instructions %-change: -0.80% -0.67%
Instructions are helped.

total cycles in shared programs: 1037179754 -> 1037084874 (<.01%)
cycles in affected programs: 352541071 -> 352446191 (-0.03%)
helped: 1093
HURT: 182
helped stats (abs) min: 1 max: 888 x̄: 111.03 x̃: 64
helped stats (rel) min: <.01% max: 27.30% x̄: 0.84% x̃: 0.20%
HURT stats (abs)   min: 1 max: 6777 x̄: 145.49 x̃: 21
HURT stats (rel)   min: <.01% max: 24.10% x̄: 1.99% x̃: 0.29%
95% mean confidence interval for cycles value: -88.10 -60.73
95% mean confidence interval for cycles %-change: -0.58% -0.29%
Cycles are helped.

total spills in shared programs: 17457 -> 17456 (<.01%)
spills in affected programs: 12 -> 11 (-8.33%)
helped: 1
HURT: 0

total fills in shared programs: 20387 -> 20385 (<.01%)
fills in affected programs: 15 -> 13 (-13.33%)
helped: 1
HURT: 0

LOST:   6
GAINED: 1

Ivy Bridge and earlier platforms had similar results. (Ivy Bridge shown)
total instructions in shared programs: 15515482 -> 15513998 (<.01%)
instructions in affected programs: 239739 -> 238255 (-0.62%)
helped: 573
HURT: 57
helped stats (abs) min: 1 max: 20 x̄: 2.73 x̃: 2
helped stats (rel) min: 0.03% max: 9.84% x̄: 0.94% x̃: 0.55%
HURT stats (abs)   min: 1 max: 2 x̄: 1.39 x̃: 1
HURT stats (rel)   min: 0.09% max: 1.85% x̄: 0.52% x̃: 0.35%
95% mean confidence interval for instructions value: -2.57 -2.14
95% mean confidence interval for instructions %-change: -0.89% -0.73%
Instructions are helped.

total cycles in shared programs: 584509880 -> 584463152 (<.01%)
cycles in affected programs: 11765280 -> 11718552 (-0.40%)
helped: 661
HURT: 152
helped stats (abs) min: 1 max: 3073 x̄: 101.99 x̃: 32
helped stats (rel) min: <.01% max: 34.38% x̄: 1.46% x̃: 0.50%
HURT stats (abs)   min: 1 max: 6637 x̄: 136.10 x̃: 15
HURT stats (rel)   min: <.01% max: 24.19% x̄: 1.75% x̃: 0.25%
95% mean confidence interval for cycles value: -82.79 -32.16
95% mean confidence interval for cycles %-change: -1.11% -0.61%
Cycles are helped.

LOST:   9
GAINED: 0

Tiger Lake
Instructions in all programs: 160905127 -> 160900949 (-0.0%)
SENDs in all programs: 6812418 -> 6812085 (-0.0%)
Loops in all programs: 38225 -> 38225 (+0.0%)
Cycles in all programs: 7431911114 -> 7433914697 (+0.0%)
Spills in all programs: 192582 -> 192582 (+0.0%)
Fills in all programs: 304539 -> 304537 (-0.0%)

Ice Lake
Instructions in all programs: 145296733 -> 145292370 (-0.0%)
SENDs in all programs: 6863818 -> 6863485 (-0.0%)
Loops in all programs: 38219 -> 38219 (+0.0%)
Cycles in all programs: 8798257570 -> 8800204360 (+0.0%)
Spills in all programs: 216880 -> 216880 (+0.0%)
Fills in all programs: 334250 -> 334248 (-0.0%)

Skylake
Instructions in all programs: 135891485 -> 135887357 (-0.0%)
SENDs in all programs: 6803031 -> 6802698 (-0.0%)
Loops in all programs: 38216 -> 38216 (+0.0%)
Cycles in all programs: 8442221881 -> 8444201959 (+0.0%)
Spills in all programs: 194839 -> 194839 (+0.0%)
Fills in all programs: 301116 -> 301114 (-0.0%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>
2021-05-20 01:39:35 +00:00
Ian Romanick 49177b9e2f nir/algebraic: Tautology replacements require sources be numbers
It seems worth the small amount of damage to give an extra cushion of
not having to debug problems later.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

All Intel platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 21043197 -> 21043359 (<.01%)
instructions in affected programs: 4409 -> 4571 (3.67%)
helped: 0
HURT: 25
HURT stats (abs)   min: 1 max: 16 x̄: 6.48 x̃: 5
HURT stats (rel)   min: 0.39% max: 15.38% x̄: 4.59% x̃: 4.40%
95% mean confidence interval for instructions value: 4.37 8.59
95% mean confidence interval for instructions %-change: 2.93% 6.26%
Instructions are HURT.

total cycles in shared programs: 856175986 -> 856176921 (<.01%)
cycles in affected programs: 58908 -> 59843 (1.59%)
helped: 0
HURT: 25
HURT stats (abs)   min: 7 max: 70 x̄: 37.40 x̃: 38
HURT stats (rel)   min: 0.27% max: 5.63% x̄: 1.87% x̃: 1.39%
95% mean confidence interval for cycles value: 31.11 43.69
95% mean confidence interval for cycles %-change: 1.35% 2.39%
Cycles are HURT.

No fossil-db changes on any Intel platform.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>
2021-05-20 01:39:35 +00:00
Ian Romanick d69ba58644 nir/algebraic: Remove some optimizations of comparisons with fsat
When most of these patterns were created, we believed, incorrectly, that
fsat(NaN) was NaN.  We have since realized that fsat(NaN) is zero.
Originally, this changed the patterns to use is_a_number.  This didn't
help any shaders, so it's easier to just drop the optimizations.

This commit crossed paths with 4c3ad4d065 ("nir/algebraic: mark more
optimization with fsat(NaN) as inexact") and bc123c396a
("nir/algebraic: mark some optimizations with fsat(NaN) as inexact").
Given that these don't impact very many shaders, it seems safer to just
remove them.

As discussed in
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8716, I tried
modifying these patterns to use !(b cmp a).  Unfortunately, on Intel
GPUs, the results were much worse than just removing the patterns
altogether.

Some other related patterns will be addressed in later commits.

There are still a number of patterns that use the identity fsat(1-X) ==
1 - fsat(X).  If X is NaN, the former is zero while the latter is 1.0.
I haven't evaluted these patterns yet.  If changes are needed in these
patterns, it should be a separate commit anyway.

v2: Replace arrow `=>` with `->` in comments because the `=>` looks a
lot like `<=` comparison.  Suggested by Rhys.

Fixes: 92b75c126b ("nir/algebraic: Replace checks that a value is between (or not) [0, 1]")
Fixes: a7f0c57673 ("nir/algebraic: Eliminate useless fsat() on operand of comparison w/value in (0, 1)")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

All Intel hardware had similar results. (Ice Lake shown)
total instructions in shared programs: 20029060 -> 20029670 (<.01%)
instructions in affected programs: 69236 -> 69846 (0.88%)
helped: 0
HURT: 263
HURT stats (abs)   min: 1 max: 20 x̄: 2.32 x̃: 1
HURT stats (rel)   min: 0.30% max: 11.11% x̄: 1.35% x̃: 0.98%
95% mean confidence interval for instructions value: 1.86 2.78
95% mean confidence interval for instructions %-change: 1.18% 1.52%
Instructions are HURT.

total cycles in shared programs: 979821278 -> 979834425 (<.01%)
cycles in affected programs: 1476848 -> 1489995 (0.89%)
helped: 49
HURT: 204
helped stats (abs) min: 1 max: 812 x̄: 102.31 x̃: 20
helped stats (rel) min: 0.01% max: 21.43% x̄: 2.23% x̃: 0.52%
HURT stats (abs)   min: 2 max: 2600 x̄: 89.02 x̃: 16
HURT stats (rel)   min: 0.04% max: 27.27% x̄: 1.49% x̃: 0.72%
95% mean confidence interval for cycles value: 13.18 90.75
95% mean confidence interval for cycles %-change: 0.29% 1.25%
Cycles are HURT.

No fossil-db changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>
2021-05-20 01:39:35 +00:00
Jason Ekstrand b447f5049b nir: Add a discard optimization pass
Many fragment shaders do a discard using relatively little information
but still put the discard fairly far down in the shader for no good
reason.  If the discard is moved higher up, we can possibly avoid doing
some or almost all of the work in the shader.  When this lets us skip
texturing operations, it's an especially high win.

One of the biggest offenders here is DXVK.  The D3D APIs have different
rules for discards than OpenGL and Vulkan.  One effective way (which is
what DXVK uses) to implement DX behavior on top of GL or Vulkan is to
wait until the very end of the shader to discard.  This ends up in the
pessimal case where we always do all of the work before discarding.
This pass helps some DXVK shaders significantly.

v2 (Jason Ekstrand):
 - Fix a couple of typos (Grazvydas, Ian)
 - Use the new nir_instr_move helper
 - Find all movable discards before moving anything so we don't
   accidentally re-order anything and break dependencies

v3 (Pierre-Eric): remove the call to nir_opt_conditional_discard based
on Daniel Schürmann comment.

v4 (Pierre-Eric):
 - handle demote intrinsics and drop derivatives_safe_after_discard
 - add early return if discards/demotes aren't used

v5 (Pierre-Eric):
 - use pass_flags instead of instr set (Daniel Schürmann)

v6 (Daniel Schürmann):
 - cleanup and fix pass_flags handling

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10522>
2021-05-19 18:04:44 +00:00
Jason Ekstrand 3033410b10 nir/gather_info: Expose a nir_intrinsic_writes_external_memory helper
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10522>
2021-05-19 18:04:44 +00:00
Jason Ekstrand f97fb1fa55 nir: Add a nir_instr_move helper
Removes an instruction from one place and inserts it at another while
working around a weird cursor corner-case.

v2: change return value to bool (Daniel Schürmann)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> (v1)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10522>
2021-05-19 18:04:44 +00:00
Bas Nieuwenhuizen 2d6a6469b8 nir: Add bvh64_intersect_ray_amd intrinsic.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10818>
2021-05-18 23:01:47 +02:00
Bas Nieuwenhuizen aa82f91c38 nir: Add load_sbt_amd intrinsic.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9767>
2021-05-18 18:29:36 +00:00
Samuel Pitoiset 1b1c726ca9 nir/opt_access: fix getting variables in presence of similar bindings/desc
It's perfectly legal to declare multiple SSBOs that point to the same
binding/descriptor_set with different access mask. Currently, it will
always get the first one in the list that matches binding/desc_set
regardless of the access mask, but other variables might have different
access mask.

Fix this by being conservative if another variable uses the same
binding/desc_set because we can't get it reliably without adding
a new field to vulkan_resource_index.

This fixes rendering issues in Resident Evil Village with vkd3d-proton.
This bug has been uncovered by ("spirv: Don't remove variables used by
resource indexing intrinsics") because variables are no longer removed

No fossils-db changes.

Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10692>
2021-05-18 06:25:24 +00:00
Connor Abbott a40714abf7 nir/lower_phis_to_scalar: Add "lower_all" option
We don't want to have to deal with vector phis in freedreno, because
vectors are always split/unsplit around vectorized instructions anyways,
and the stated reason for not scalarising them (it hurting coalescing)
won't apply to us because we won't be using nir_from_ssa. Add this
option so that we don't have to do the equivalent thing while
translating from NIR.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10809>
2021-05-17 09:59:45 +00:00
Mike Blumenkrantz 6df187df13 nir/builder: add nir_pad_vector and nir_pad_vec4 util functions
these pad a given value to vec4 or arbitrary number of components

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10630>
2021-05-16 14:15:14 +00:00
Gert Wollny 4c045ad11e nir/linker: add option to ignore the IO precisions for better varying packing
Backends that don't handle IO component precision can pack more varyings
into one slot if the linker ignores the precision. If the IO is vectorized
then this can save IO instructions.

Related: 165a69d2f7
    nir: handle mediump varyings in varying compaction helpers

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10722>
2021-05-15 09:58:27 +02:00
Caio Marcelo de Oliveira Filho 09984fd02f nir: Rename nir_is_per_vertex_io to nir_is_arrayed_io
VS outputs are "per vertex" but not the kind of I/O we want to match
with this helper.  Change to a name that covers the "arrayness"
required by the type.

Name inspired by the GLSL spec definition of arrayed I/O.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10493>
2021-05-14 16:17:45 +00:00
Gert Wollny e418710f8b compiler/nir: check whether var is an input in lower_fragcoord_wtrans
Otherwise the lowering pass might try to lower any other load from
a deref if its data.location value happens to be zero.

Fixes: 418c4c0d7d
  compiler/nir: extend lower_fragcoord_wtrans to support VARYING_SLOT_POS

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10577>
2021-05-14 13:26:13 +00:00
Timothy Arceri 5aabc91273 glsl: add missing support for explicit components in interface blocks
From the ARB_enhanced_layouts spec:

   "As with input layout qualifiers, all shaders except compute shaders
   allow *location* layout qualifiers on output variable declarations,
   output block declarations, and output block member declarations.  Of
   these, variables and block members (but not blocks) additionally
   allow the *component* layout qualifier."

We previously had compile tests in piglit to make sure this was not a
compile error but no execution tests.

Fixes: d99a040bbf ("i965: enable ARB_enhanced_layouts for gen8+")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10763>
2021-05-13 08:07:53 +00:00
Timothy Arceri 1a71d6aa6e glsl: create validate_component_layout_for_type() helper
This will be used in the following patch.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10763>
2021-05-13 08:07:53 +00:00
Timur Kristóf 0d6b6c850f nir: Add AMD specific intrinsics for merged shaders and NGG.
These intrinsics represent what the hardware can actually do.
Lowering our shaders to use these intrinsics will allow us to
deal with mapping the classic VS, TES, GS (and the future MS)
stages to the hardware capabilities using NIR, which makes our
backend compilers simpler.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Timur Kristóf 641707a807 nir: Allow load_primitive_id in VS in nir_divergence_analysis.
The lowered NIR code of NGG VS shaders uses this intrinsic
when the VS has to export the primitive ID.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Timur Kristóf e905e0938a nir: Support upper bound of unsigned bit size conversions.
These allow us to generate slightly better code in some cases,
eg. multiplications in ACO.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Timur Kristóf 9a2ffe1abb nir: Support upper bound of subgroup_id/num_subgroups for non-compute.
These intrinsics will be used when lowering NGG shaders, including
currently supported stages like VS, TES, GS and also by mesh shaders
in the future.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Marcin Ślusarz 2c3e2d69bd nir: handle float atomics in nir_lower_memory_model
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 2adb337256 ("nir,radv/aco: add and use pass to lower make available/visible barriers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10766>
2021-05-12 11:09:07 +00:00
Marcin Ślusarz 27073b59bc nir: handle float atomics in nir_gather_info
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10766>
2021-05-12 11:09:07 +00:00
Tapani Pälli 181beece3c nir: skip assert check with empty structs
Fixes issues with upcoming CTS test testing empty structs.

v2: decorate with UNUSED as only used in assert (Timothy)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10681>
2021-05-10 08:07:29 +03:00
Mauro Rossi 2736ae0454 android: nir: add nir_lower_fragcolor.c to Makefile.sources
Fixes the following building error:

FAILED: out/target/product/x86_64/obj/SHARED_LIBRARIES/gallium_dri_intermediates/LINKED/gallium_dri.so
...
ld.lld: error: undefined symbol: nir_lower_fragcolor
>>> referenced by pan_assemble.c:81 (external/mesa/src/gallium/drivers/panfrost/pan_assemble.c:81)

Cc: 21.0 21.1 <mesa-stable@lists.freedesktop.org>
Fixes: 1fd3563025 ("nir: add lowering pass for fragcolor -> fragdata")
Acked-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10712>
2021-05-09 00:34:46 +02:00
Alyssa Rosenzweig db2f6b87a3 nir/divergence_anlysis: Add intrinsics for Bifrost
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10022>
2021-05-07 18:20:30 +00:00
Alyssa Rosenzweig f3de2bd6c2 nir: Add blend lowering pass
This pass was originally developed for Panfrost, where it passes the
relevant dEQP tests. Upstreaming so it can be extended and then shared
with:

* Asahi, for blending
* Zink, for logic ops
* Lavapipe, for advanced blending

Note that using this with MRT in a fragment shader (as non-panfrost
drivers will) has not yet been tested. Logic ops with integer
framebuffers are probably todo. It's been enough for Panfrost, will
suffice for ES2 on Asahi, and provides an upstream base for kusma's work
on advanced blending, so overall the merge is a net benefit.

v2: Remove bogus assert that the format layout is PLAIN. We need to
render R11G11B10, which Mesa reports as layout OTHER. The code is still
correct.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10601>
2021-05-07 17:25:21 +00:00
Gert Wollny b4600d9352 nir: Add filter callback for lower_to_scalar to the options
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9943>
2021-05-07 12:09:03 +00:00
Mike Blumenkrantz 37545418cd nir: add nir_isub_imm
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10654>
2021-05-06 13:01:03 +00:00
Jesse Natalie d8bac1002c vtn: Use relaxed 24bit opcodes for CL 24bit math
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10549>
2021-05-05 22:06:42 +00:00
Jesse Natalie d7ca0319d7 nir: Add relaxed 24bit opcodes
These are equivalent to the 32bit opcodes if there are no more efficient
24bit opcodes available, but inputs are guaranteed to already be 24bit,
so the 24bit opcodes can be used instead if they exist and are efficient.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10549>
2021-05-05 22:06:42 +00:00
Jason Ekstrand e1edf74dde nir/builder: Move clamp helpers to nir_builder.h
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10631>
2021-05-04 22:51:34 +00:00
Caio Marcelo de Oliveira Filho dd48683cfd nir: Move shared_memory_explicit_layout bit into common shader_info
Move it out of the "cs" sub-struct, since the bit can be used for
other shader stages in the future.

This also removes a subtle issue in spirv_to_nir:
info.cs.shared_memory_explicit_layout was used without checking for
the CS shader stage.  It ended up being "harmless" since the effects
also depended on presence of shared variables.

Fixes: 5de6c5973a ("spirv: Implement SPV_KHR_workgroup_memory_explicit_layout")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10529>
2021-05-04 20:54:58 +00:00
Iago Toral Quiroga aebb47b7d1 compiler/nir: add a divergence analysis option for non-uniform workgroup id
The V3D hardware allows us to pack multiple workgroups together to avoid
wasting execution lanes in shader cores.

For example, if we dispatch 16 workgroups with a local size of 1 element, we
can pack all 16 workgroups in a single 16-wide dispatch where each lane
executes a different workgroup, instead of 16 1-wide dispatches.

When we do this, we don't have a uniform workgroup id any more.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10541>
2021-05-04 15:53:23 +00:00
Caio Marcelo de Oliveira Filho 7cc846788c nir: Remove now unnecessary conditions from emit_load/store helpers
The mode one was used before 0bc5a829dd ("nir: Remove shared support from
lower_io").

The others were used before 5f7c7c9a7f ("nir: add src and dest types
to all IO loads and stores for mediump").

All conditions now are always true, so drop them.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10533>
2021-05-04 06:33:24 -07:00
Gert Wollny a199697642 nir/opt_algebraic: optimizations for add umax/umin with zero
For unsigned comparisons with zero these ops can be eliminated.

v2: Add comparison optimizations with -1 (Rhys Perry)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10583>
2021-05-04 09:33:32 +02:00
Alyssa Rosenzweig a976101da5 nir/opcodes: Reword confusing comment
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10578>
2021-05-03 12:51:47 +00:00
Alyssa Rosenzweig 0ea67e57e5 nir: Add fsin_agx opcode
Used to split up the fsin/fcos lowering for AGX between NIR and the
backend, to permit algebraic optimizations without polluting NIR with
too many hardware details. The backend NIR lowering produces an
fmul/ffma of the input so we can optimize code like sin(2*x).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10582>
2021-05-02 17:41:09 -04:00
Caio Marcelo de Oliveira Filho e763db4a47 spirv: Don't replicate patch bool in vtn_variable
When we originally added patch variable handling to spirv_to_nir, we
were splitting I/O block variables in spirv_to_nir, so we weren't
guaranteed to have a nir_variable early enough in processing.

Since b0c643d8f5 ("spirv: Use NIR per-member splitting"), we've been
using NIR per-member splitting where we have a nir_variable which has
a separate nir_variable_data per member.  With this, we can drop
vtn_variable::patch and use the patch boolean on the nir_variable
instead.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10469>
2021-04-29 06:55:29 +00:00
Rhys Perry 7a7838529a nir/lower_non_uniform: allow lowering with vec2 handles
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9523>
2021-04-27 15:56:07 +00:00
Tapani Pälli d93153a564 glsl: ignore interface precision qualifier on desktop GL
This fixes linking failures with new GL45 linkage tests, no
regressions spotted on existing tests.

v2: add spec reference (Samuel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10373>
2021-04-27 08:25:41 +00:00
Mike Blumenkrantz c8dfed0c12 nir/gl_lower_buffers: set access for ssbo load/store instrs
this is the last place where the information is available, so set the info before
it gets lost

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10074>
2021-04-26 21:31:44 +00:00
Connor Abbott 77fcb01f7f nir/lower_clip_disable: Fix store writemask
We're storing into the array element, not the whole variable.

Fixes: fb2fe80 ("nir: add lowering pass for clip plane enabling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7274>
2021-04-26 17:07:02 +00:00
Jesse Natalie 2775b9139b nir_lower_readonly_images_to_tex: Use nir_shader_lower_instructions
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10356>
2021-04-23 23:16:15 +00:00
Jesse Natalie fa677c8644 nir_lower_readonly_images_to_tex: Support non-CL semantics
For non-CL, intrinsic access isn't set, because the image type doesn't
have access qualifier. Instead, the access qualifier is set on the variable.

So, add a mode to this pass which can chase back to the variable in addition
to the intrinsic access. Also, update the variable type and the deref chain
types so everything is consistent, that the tex is accessing a sampler. Note
we can't do this for CL, because void-typed samplers don't exist.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10356>
2021-04-23 23:16:15 +00:00
Jesse Natalie 29c9731400 nir: Rename nir_lower_cl_images_to_tex, replace 'cl' with 'readonly'
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10356>
2021-04-23 23:16:15 +00:00
Jesse Natalie 1c41f63e26 vtn: Propagate access data from UBO/SSBO/push constant types to variables of that type, not just their pointers
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10356>
2021-04-23 23:16:15 +00:00
Jesse Natalie 9936463ef6 vtn: Propagate access data that's present on all struct members to the struct itself
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10356>
2021-04-23 23:16:15 +00:00
Alyssa Rosenzweig c84804f167 nir/lower_fragcolor: Take max cbufs as argument
One step closer to generalizing this pass to more drivers.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10411>
2021-04-23 17:20:43 +00:00
Alyssa Rosenzweig 73eb497b86 nir/lower_fragcolor: Fix driver_location assignment
Fixes crash in
dEQP-GLES31.functional.shaders.framebuffer_fetch.basic.last_frag_data
when using this pass.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10411>
2021-04-23 17:20:43 +00:00
Alyssa Rosenzweig 0f4ba349e9 nir/lower_fragcolor: Handle fp16 outputs
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10391>
2021-04-21 22:17:28 +00:00
Alyssa Rosenzweig 49c6157b15 nir/lower_fragcolor: Use shader_instructions_pass
While I was in the area.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10391>
2021-04-21 22:17:28 +00:00
Lionel Landwerlin 0bb29c07a4 spirv: fixup pointer_to/from_ssa with acceleration structures
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ed907e5d84 ("spirv: Add support for OpTypeAccelerationStructureKHR")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10357>
2021-04-21 21:51:51 +00:00
Rhys Perry 89b759c4f9 nir/opt_load_store_vectorize: loop internally
To vectorize to vec8/16 or vec4 (without vec3), we can't incrementally add
components to a load/store. This patch loops vectorization so that two new
vec2/4/8 operations can be combined into a larger operation.

fossil-db (GFX10.3):
Totals from 22 (0.02% of 139391) affected shaders:
SpillVGPRs: 1749 -> 1771 (+1.26%)
CodeSize: 901212 -> 892532 (-0.96%); split: -1.19%, +0.22%
Scratch: 178176 -> 184320 (+3.45%)
Instrs: 159358 -> 158027 (-0.84%); split: -0.99%, +0.16%
Cycles: 37046772 -> 36738544 (-0.83%); split: -1.00%, +0.17%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10384>
2021-04-21 20:26:58 +00:00
Rhys Perry 447820d003 nir/opt_load_store_vectorize: ignore load_vulkan_descriptor
These mess with alignment calculation.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10384>
2021-04-21 20:26:58 +00:00
Rhys Perry 6ca11b4a66 nir/opt_load_store_vectorize: improve handling of swizzles
Previously (for simplicity), it could have skipped vectorization if
swizzles were involved.

fossil-db (GFX10.3):
Totals from 498 (0.36% of 139391) affected shaders:
SGPRs: 25328 -> 26608 (+5.05%); split: -1.36%, +6.41%
VGPRs: 9988 -> 9996 (+0.08%)
SpillSGPRs: 40 -> 65 (+62.50%)
CodeSize: 1410188 -> 1385584 (-1.74%); split: -1.76%, +0.02%
Instrs: 257149 -> 250579 (-2.55%); split: -2.57%, +0.01%
Cycles: 1096892 -> 1070600 (-2.40%); split: -2.41%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10384>
2021-04-21 20:26:58 +00:00
Rhys Perry 4df3654c79 nir/load_store_vectorize: assume CAN_REORDER ops don't alias with stores
fossil-db (GFX10.3):
Totals from 20 (0.01% of 139391) affected shaders:
SGPRs: 688 -> 712 (+3.49%); split: -1.16%, +4.65%
CodeSize: 35488 -> 34424 (-3.00%); split: -3.04%, +0.05%
Instrs: 6405 -> 6259 (-2.28%); split: -2.44%, +0.16%
Cycles: 51768 -> 51268 (-0.97%); split: -1.21%, +0.24%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10384>
2021-04-21 20:26:58 +00:00
Mike Blumenkrantz 3ccd0891d3 nir/lower_fragcolor: set outputs_written for fragdata members
normal gather_info stuff

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10080>
2021-04-21 19:36:16 +00:00
Matt Turner 7f8c5844ef compiler/glsl: Always propagate_invariance() last
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10292>
2021-04-20 18:54:57 +00:00
Matt Turner d35f8604c7 compiler/glsl: Propagate invariant/precise when splitting arrays
This fixes the
dEQP-GLES3.functional.shaders.invariance.{low,medium,high}p.loop_4 tests when
run in a VM with virgl on a host with iris. virgl mangles the GLSL shaders and
emits shader code for the host driver that contains vec4 arrays. As such, the
test did not fail when running directly on the host.

The test also did not fail if the host was using i965. Disabling
PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY in iris was sufficient to work around it,
so I believe that i965 didn't show the problem because after arrays were split
by optimize_split_arrays(), even though the invariant/precise qualifiers were
lost, do_common_optimization() would be called again and thus
propagate_invariance() would propagate the qualifiers to the new variables
produced by optimize_split_arrays().

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10292>
2021-04-20 18:54:57 +00:00
Matt Turner 5ef4296cb6 compiler/glsl: Return progress from propagate_invariance()
Doing so allow you to easily tell what the pass did using the existing
infrastructure in the OPT macro.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10292>
2021-04-20 18:54:57 +00:00
Jesse Natalie 0e2566a8a7 shader_enums: Fix MSVC warning C4334 (32bit shift cast to 64bit)
The warning is triggered when assigning into inputs_read, which is 64bit.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-By: Bill Kristiansen <billkris@microsoft.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10331>
2021-04-20 00:28:35 +00:00
Jesse Natalie 09440ce3fb nir: Fix MSVC warning C4334 (32bit shift cast to 64bit)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-By: Bill Kristiansen <billkris@microsoft.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10331>
2021-04-20 00:28:34 +00:00
Lionel Landwerlin 856953b131 spirv: fix uToAccelerationStructure handling
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7f223a2329 ("spirv: Implement SpvOpConvertUToAccelerationStructureKHR")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10324>
2021-04-19 22:02:53 +00:00
Alyssa Rosenzweig 899dd8e60a nir: Update some comments referring to imov
This was renamed when I was in high school. I remember updating the
Midgard compiler while sitting in AP Physics.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10296>
2021-04-19 20:07:35 +00:00
Danylo Piliaiev f17b41ab4f nir: add lowering pass for helperInvocationEXT()
Some hardware doesn't have a way to check if invocation was demoted,
in such case we have to track it ourselves.
OpIsHelperInvocationEXT is specified as:

 "An invocation is currently a helper invocation if it was originally
  invoked as a helper invocation or if it has been demoted to a helper
  invocation by OpDemoteToHelperInvocationEXT."

Therefore we:
- Set gl_IsHelperInvocationEXT = gl_HelperInvocation
- Add "gl_IsHelperInvocationEXT = true" right before each demote
- Add "gl_IsHelperInvocationEXT = gl_IsHelperInvocationEXT || condition"
  right before each demote_if

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9460>
2021-04-19 17:11:36 +00:00
Erik Faye-Lund 7886983835 nir/lower_tex: do not stumble on 16-bit inputs
If a has been lowered to float16 here, then we end up trying to
construct a vector of mixed precision, which the validator asserts
about.

So let's make sure we use the same type for all arguments.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10201>
2021-04-19 14:28:05 +00:00