313 Commits

Author SHA1 Message Date
Marek Olšák b86305bb57 nir/algebraic: collapse conversion opcodes (many patterns)
mediump inserts a lot of conversions. This cleans up the IR.
All other combinations are covered too.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>
2020-09-10 23:35:13 +00:00
Marek Olšák cdd498bbe8 nir: add new mediump opcodes f2[ui]mp, i2fmp, u2fmp
Algebraic optimizations will select them.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>
2020-09-10 23:35:13 +00:00
Marek Olšák 3d3df8dbff nir: remove redundant opcode u2ump
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>
2020-09-10 23:35:13 +00:00
Marek Olšák 26fc5e1f4a nir/algebraic: expand existing 32-bit patterns to all bit sizes using loops
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>
2020-09-10 23:35:13 +00:00
Marek Olšák 3c8934a644 nir/algebraic: add flrp patterns for 16 and 64 bits
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>
2020-09-10 23:35:13 +00:00
Marek Olšák a7ece63de9 nir/algebraic: add 16-bit versions of a few 32-bit patterns
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6599>
2020-09-04 17:06:22 +00:00
Marek Olšák 00b28a50b2 nir/algebraic: trivially enable existing 32-bit patterns for all bit sizes
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6599>
2020-09-04 17:06:22 +00:00
Eric Anholt 479d9c97eb nir: Add simplistic lowering for bany_equal/ball_inequal.
It would be nice if we could do swizzling of an expression on the
replacement side so that we could have a single ieq/ine of the vector
after CSE.  However, if you do want vector operations, nir_opt_vectorize()
does just fine.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>
2020-09-02 09:58:44 -07:00
Samuel Pitoiset bc123c396a nir/algebraic: mark some optimizations with fsat(NaN) as inexact
If a is NaN, fsat(a) is expected to be 0, so some optimizations
should be marked as inexact.

Fixes a GPU hang with Death Stranding and RADV/ACO (RADV/LLVM
isn't affected because it lowers fsat).

No fossils-db change.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3368
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6519>
2020-09-01 11:20:03 +02:00
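To illustrate the NaN concern in bc123c396a, here is a minimal sketch. It assumes fsat clamps to [0, 1] and returns 0 for NaN, as the commit message states. `naive_clamp` is a hypothetical comparison-only replacement chosen for illustration; it is not one of the patterns the commit touches.

```python
import math


def fsat(a: float) -> float:
    """Saturate to [0, 1]; NaN maps to 0.0 (assumed semantics)."""
    if math.isnan(a):
        return 0.0
    return min(max(a, 0.0), 1.0)


def naive_clamp(a: float) -> float:
    """Hypothetical rewrite: every comparison with NaN is false, so NaN leaks through."""
    return min(max(a, 0.0), 1.0)
```

The two functions agree on every non-NaN input and differ only on NaN, which is the kind of difference an inexact marking is meant to flag.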
Jesse Natalie d91f85f16e nir: Remove 32bit restriction for uadd_carry optimization
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6313>
2020-08-27 16:57:42 +00:00
Daniel Schürmann a79dad950b nir,amd: remove trinary_minmax opcodes
These consist of the variations nir_op_{i|u|f}{min|max|med}3 which are either
lowered in the backend (LLVM) anyway or can be recombined by the backend (ACO).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6421>
2020-08-24 20:56:11 +00:00
Erik Faye-Lund 5e841e8b4f nir: add iabs-lowering code
Microsoft's DXIL is based on LLVM, which doesn't have an integer ABS
opcode, but instead needs it lowered to NEG + MAX. We need to do this
with an option, to prevent an already existing optimization rule from
undoing this.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5211>
2020-08-24 10:02:47 +00:00
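The NEG + MAX lowering described in 5e841e8b4f can be sketched as follows, assuming 32-bit two's-complement integers. The helper names are hypothetical and only model the arithmetic, not the actual NIR rule.

```python
def to_i32(x: int) -> int:
    """Wrap a Python int to a signed 32-bit value."""
    return ((x + 2**31) % 2**32) - 2**31


def ineg(a: int) -> int:
    return to_i32(-a)


def imax(a: int, b: int) -> int:
    return max(a, b)


def iabs_lowered(a: int) -> int:
    """iabs(a) expressed as imax(a, ineg(a))."""
    return imax(a, ineg(a))
```

Note that the most negative value maps to itself under this model, because its negation wraps around.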
Karol Herbst e5899c1e88 nir: rename nir_op_fne to nir_op_fneu
It was always fneu, but naming it fne causes confusion from time to time, so
let's rename it. Later we also want to add other unordered comparisons and an
ordered fne; this is a small preparation for that.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>
2020-08-21 17:26:21 +00:00
Boris Brezillon 18e464cfc0 compiler/nir: Add new flags to lower pack/unpack split instructions
And add new rules to do this lowering in nir_opt_algebraic.py.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6309>
2020-08-17 19:46:10 +00:00
Jesse Natalie a1ed83fddd nir: Optimize mask+downcast to just downcast
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6330>
2020-08-17 14:36:18 +00:00
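The idea behind a1ed83fddd can be checked with a small sketch. It assumes the pattern has the shape u2u16(iand(a, 0xffff)), which the log does not spell out; the function names are hypothetical.

```python
def u2u16(a: int) -> int:
    """Truncate an unsigned value to its low 16 bits."""
    return a & 0xFFFF


def mask_then_downcast(a: int) -> int:
    """Mask to 16 bits first, then downcast."""
    return u2u16(a & 0xFFFF)


def downcast_only(a: int) -> int:
    """The mask is redundant: the downcast already discards the high bits."""
    return u2u16(a)
```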
Daniel Schürmann 5f79e4e69a nir/algebraic: fold some nested bcsel
Totals from 14266 (10.62% of 134368) affected shaders (Polaris):
SGPRs: 761756 -> 762732 (+0.13%); split: -0.00%, +0.13%
VGPRs: 430392 -> 430924 (+0.12%); split: -0.05%, +0.17%
SpillSGPRs: 4652 -> 4628 (-0.52%); split: -0.60%, +0.09%
CodeSize: 30133000 -> 29949780 (-0.61%); split: -0.66%, +0.05%
MaxWaves: 102122 -> 102111 (-0.01%); split: +0.00%, -0.01%
Instrs: 5845085 -> 5841668 (-0.06%); split: -0.08%, +0.03%
Cycles: 69033140 -> 68889188 (-0.21%); split: -0.22%, +0.01%
VMEM: 8479021 -> 8474978 (-0.05%); split: +0.03%, -0.08%
SMEM: 831437 -> 830464 (-0.12%); split: +0.06%, -0.18%
VClause: 105411 -> 105410 (-0.00%); split: -0.01%, +0.01%
SClause: 327727 -> 327780 (+0.02%); split: -0.00%, +0.02%
Copies: 372704 -> 373306 (+0.16%); split: -0.16%, +0.32%
Branches: 112260 -> 112269 (+0.01%); split: -0.00%, +0.01%
PreSGPRs: 433308 -> 433631 (+0.07%); split: -0.01%, +0.09%
PreVGPRs: 397888 -> 397905 (+0.00%); split: -0.01%, +0.01%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
2020-07-20 15:56:46 +00:00
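The log does not list which nested bcsel shapes 5f79e4e69a folds. As a hedged illustration, the sketch below brute-forces two plausible identities in which the inner select reuses the outer condition; these are assumptions, not the commit's actual patterns.

```python
from itertools import product


def bcsel(cond: bool, if_true: int, if_false: int) -> int:
    return if_true if cond else if_false


def nested_bcsel_folds_hold() -> bool:
    """Check both folds over all conditions and a small value range."""
    for a in (False, True):
        for b, c, d in product(range(3), repeat=3):
            # Inner select on the false side: c is unreachable.
            if bcsel(a, b, bcsel(a, c, d)) != bcsel(a, b, d):
                return False
            # Inner select on the true side: c is unreachable.
            if bcsel(a, bcsel(a, b, c), d) != bcsel(a, b, d):
                return False
    return True
```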
Daniel Schürmann 27244662f2 nir/algebraic: propagate b2i out of ior/iand
Totals from 761 (0.57% of 134368) affected shaders (Polaris):
SGPRs: 29496 -> 29488 (-0.03%)
SpillSGPRs: 41 -> 43 (+4.88%)
CodeSize: 1922036 -> 1882408 (-2.06%); split: -2.08%, +0.02%
Instrs: 366051 -> 360362 (-1.55%); split: -1.57%, +0.02%
Cycles: 7692516 -> 7661216 (-0.41%); split: -0.41%, +0.01%
VMEM: 365175 -> 365172 (-0.00%)
VClause: 15324 -> 15322 (-0.01%)
SClause: 9825 -> 9824 (-0.01%); split: -0.02%, +0.01%
Copies: 41216 -> 41294 (+0.19%); split: -0.01%, +0.20%
Branches: 7020 -> 7033 (+0.19%)
PreSGPRs: 22103 -> 22106 (+0.01%)
PreVGPRs: 26518 -> 26515 (-0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
2020-07-20 15:56:46 +00:00
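A small sketch of what propagating b2i out of ior/iand means for 27244662f2, assuming b2i maps false to 0 and true to 1. The exact rules in the commit are not shown in the log, so the identities below are an assumed reading of the title.

```python
def b2i(b: bool) -> int:
    return 1 if b else 0


def b2i_propagation_holds() -> bool:
    """ior/iand of two b2i results equals b2i of the boolean or/and."""
    for a in (False, True):
        for b in (False, True):
            if (b2i(a) | b2i(b)) != b2i(a or b):
                return False
            if (b2i(a) & b2i(b)) != b2i(a and b):
                return False
    return True
```

The right-hand sides need one conversion instead of two.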
Daniel Schürmann baee5a9812 nir/algebraic: add distributive rules for ior/iand
Totals from 581 (0.43% of 134368) affected shaders (Polaris):
CodeSize: 1389560 -> 1386488 (-0.22%)
Instrs: 264488 -> 263984 (-0.19%)
Cycles: 1057952 -> 1055936 (-0.19%)
VMEM: 296016 -> 291613 (-1.49%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
2020-07-20 15:56:46 +00:00
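The distributive laws that baee5a9812 relies on can be verified exhaustively on small integers. This sketch only checks the underlying bitwise identities; which direction the commit rewrites in, and under what conditions, is not stated in the log.

```python
from itertools import product


def distributive_laws_hold(bits: int = 4) -> bool:
    """Check both distributive laws for all values of the given width."""
    for a, b, c in product(range(1 << bits), repeat=3):
        # iand distributes over ior.
        if ((a & b) | (a & c)) != (a & (b | c)):
            return False
        # ior distributes over iand.
        if ((a | b) & (a | c)) != (a | (b & c)):
            return False
    return True
```

Each factored form uses two operations where the expanded form uses three.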
Daniel Schürmann 70d3efeb88 nir/algebraic: optimize (a < 0.0) ? -a : a -> fabs(a)
Totals from affected shaders: (VEGA)
SGPRS: 13920 -> 13920 (0.00 %)
VGPRS: 10252 -> 10252 (0.00 %)
Spilled SGPRs: 62 -> 62 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 587648 -> 587224 (-0.07 %) bytes
LDS: 5 -> 5 (0.00 %) blocks
Max Waves: 1489 -> 1489 (0.00 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
2020-07-20 15:56:46 +00:00
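For 70d3efeb88, a minimal sketch comparing the select form with fabs under IEEE 754 double semantics. `select_abs` is a hypothetical name for the left-hand side of the rule.

```python
import math


def select_abs(a: float) -> float:
    """(a < 0.0) ? -a : a"""
    return -a if a < 0.0 else a


def sign_of(a: float) -> float:
    """Return +1.0 or -1.0 according to the sign bit, including for zeros."""
    return math.copysign(1.0, a)
```

The two forms agree on ordinary values. They differ on -0.0: the comparison is false there, so the select form keeps the negative sign while fabs clears it.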
Daniel Schürmann 9d22c5ed71 nir/algebraic: optimize fmul(x, bcsel(c, -1.0, 1.0)) -> bcsel(c, -x, x)
Totals from affected shaders: (VEGA)
SGPRS: 545712 -> 545712 (0.00 %)
VGPRS: 413092 -> 413116 (0.01 %)
Spilled SGPRs: 10616 -> 10616 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 37031684 -> 36984248 (-0.13 %) bytes
LDS: 427 -> 427 (0.00 %) blocks
Max Waves: 54350 -> 54340 (-0.02 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
2020-07-20 15:56:46 +00:00
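The rewrite in 9d22c5ed71 trades a multiply for a negation. A minimal sketch with hypothetical helper names:

```python
def bcsel(cond: bool, if_true: float, if_false: float) -> float:
    return if_true if cond else if_false


def before(x: float, c: bool) -> float:
    """fmul(x, bcsel(c, -1.0, 1.0))"""
    return x * bcsel(c, -1.0, 1.0)


def after(x: float, c: bool) -> float:
    """bcsel(c, -x, x)"""
    return bcsel(c, -x, x)
```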
Daniel Schürmann 56ec814b56 nir/algebraic: add some more unop + bcsel optimizations
Totals from affected shaders: (VEGA)
SGPRS: 284392 -> 284400 (0.00 %)
VGPRS: 261080 -> 261076 (-0.00 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 24698596 -> 24277788 (-1.70 %) bytes
LDS: 196 -> 196 (0.00 %) blocks
Max Waves: 10101 -> 10105 (0.04 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
2020-07-20 15:56:45 +00:00
Daniel Schürmann 2fca183910 nir/algebraic: add optimizations for fsign/isign
This just reverts fsign/isign lowering.

Totals from affected shaders:
SGPRS: 257496 -> 256672 (-0.32 %)
VGPRS: 181800 -> 178864 (-1.61 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 11355852 -> 11141840 (-1.88 %) bytes
LDS: 3789 -> 3789 (0.00 %) blocks
Max Waves: 30453 -> 30951 (1.64 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
2020-07-20 15:56:45 +00:00
Daniel Schürmann 8e1b75b330 nir/algebraic: optimize iand/ior of (n)eq zero
Found in some Detroit: Become Human shaders.

Totals from affected shaders:
SGPRS: 700256 -> 700256 (0.00 %)
VGPRS: 507208 -> 507212 (0.00 %)
Spilled SGPRs: 142531 -> 142531 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 76404616 -> 76301768 (-0.13 %) bytes
LDS: 43 -> 43 (0.00 %) blocks
Max Waves: 21438 -> 21438 (0.00 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
2020-07-20 15:56:45 +00:00
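The log does not show the patterns added by 8e1b75b330. One plausible reading of the title, stated here as an assumption, is that two comparisons against zero can be merged into one by OR-ing the operands first:

```python
from itertools import product


def eq_zero_merges_hold(bits: int = 4) -> bool:
    """Check the assumed identities for all values of the given width."""
    for a, b in product(range(1 << bits), repeat=2):
        # iand(ieq(a, 0), ieq(b, 0)) == ieq(ior(a, b), 0)
        if ((a == 0) and (b == 0)) != ((a | b) == 0):
            return False
        # ior(ine(a, 0), ine(b, 0)) == ine(ior(a, b), 0)
        if ((a != 0) or (b != 0)) != ((a | b) != 0):
            return False
    return True
```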
Daniel Schürmann de0ebaf09d nir/algebraic: optimize bcsel(a, 0, 1) to b2i
This avoids combination with other bcsel operations,
and as b2i is often a no-op (when used for iadd and such),
the resulting pattern is preferable.

Totals from affected shaders: (VEGA)
SGPRS: 598448 -> 598448 (0.00 %)
VGPRS: 457940 -> 457352 (-0.13 %)
Spilled SGPRs: 127154 -> 127154 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 64836352 -> 64802728 (-0.05 %) bytes
LDS: 781 -> 781 (0.00 %) blocks
Max Waves: 22931 -> 22931 (0.00 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
2020-07-20 15:56:45 +00:00
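With the operand order given in the title of de0ebaf09d, the select yields 0 when the condition is true, so an equivalent b2i has to take the inverted condition. The sketch below assumes that reading; the exact replacement is not shown in the log.

```python
def bcsel(cond: bool, if_true: int, if_false: int) -> int:
    return if_true if cond else if_false


def b2i(b: bool) -> int:
    return 1 if b else 0


def select_form(a: bool) -> int:
    """bcsel(a, 0, 1)"""
    return bcsel(a, 0, 1)


def b2i_form(a: bool) -> int:
    """b2i(inot(a)), an assumed equivalent."""
    return b2i(not a)
```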
Ian Romanick 8591adea38 nir/algebraic: Don't distribute absolute-value into dot-products
Dot product is multiplication followed by addition, and absolute value
does not distribute into addition.

Only vec4 platforms are affected by this change as scalar-only platforms
never have any of the fdot_replicated instructions.  In the shader-db
results, below, shaders in MANY different applications are affected.
Trine, Doom3, Enemy Territory: Quake Wars, Counter Strike: Global
Offensive, Mad Max, Metro Last Light, and on and on...  I'm really
shocked that there were no test regressions!

All Haswell and earlier platforms had similar results. (Haswell shown)
total instructions in shared programs: 16219743 -> 16219820 (<.01%)
instructions in affected programs: 12171 -> 12248 (0.63%)
helped: 1
HURT: 78
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.78% max: 0.78% x̄: 0.78% x̃: 0.78%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.35% max: 2.38% x̄: 0.91% x̃: 1.06%
95% mean confidence interval for instructions value: 0.92 1.03
95% mean confidence interval for instructions %-change: 0.78% 1.00%
Instructions are HURT.

total cycles in shared programs: 538481383 -> 538491045 (<.01%)
cycles in affected programs: 470796 -> 480458 (2.05%)
helped: 149
HURT: 142
helped stats (abs) min: 1 max: 1338 x̄: 71.13 x̃: 4
helped stats (rel) min: 0.06% max: 40.99% x̄: 2.76% x̃: 0.67%
HURT stats (abs)   min: 1 max: 2092 x̄: 142.68 x̃: 12
HURT stats (rel)   min: 0.07% max: 55.38% x̄: 5.07% x̃: 1.07%
95% mean confidence interval for cycles value: -5.28 71.69
95% mean confidence interval for cycles %-change: -0.07% 2.19%
Inconclusive result (value mean confidence interval includes 0).

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: 62795475e8 ("nir/algebraic: Distribute source modifiers into instructions")
Closes: #3129
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5581>
2020-07-02 14:05:33 -07:00
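A small counterexample makes the reasoning in 8591adea38 concrete. Absolute value distributes over a single multiplication but not over the sum inside a dot product. The helper names are hypothetical.

```python
def fdot(a, b):
    """Dot product: component-wise multiply, then add."""
    return sum(x * y for x, y in zip(a, b))


def fabs_vec(v):
    return tuple(abs(x) for x in v)


a = (1.0, -1.0)
b = (1.0, 1.0)

abs_of_dot = abs(fdot(a, b))                  # |1*1 + (-1)*1|
dot_of_abs = fdot(fabs_vec(a), fabs_vec(b))   # |1|*|1| + |-1|*|1|
```

The two terms of the dot product cancel before the absolute value is taken, which cannot happen once every component has been made non-negative.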
Alyssa Rosenzweig 54d7907c27 nir: Propagate *2*16 conversions into vectors
If we have code like:

   ('f2f16', ('vec2', ('f2f32', 'a@16'), '#b@32'))

We would like to eliminate the conversions, but the existing rules can't
see into the (heterogeneous) vector. So instead of trying to
eliminate in one pass, we add opts to propagate the f2f16 into the
vector. Even if nothing further happens, this is often a win since then
the created vector is smaller (half2 instead of float2). Hence the above
gets transformed to

   ('vec2', ('f2f16', ('f2f32', 'a@16')), ('f2f16', '#b@32'))

Then the existing f2f16(f2f32) rule will kick in for the first component
and constant folding will for the second and we'll be left with

   ('vec2', 'a@16', '#b@16')

...eliminating all conversions.

v2: Predicate on !options->vectorize_vec2_16bit. As discussed, this
optimization helps greatly on true vector architectures (like Midgard)
but wreaks havoc on more modern SIMD-within-a-register architectures
(like Bifrost and modern AMD). So let's predicate on that.

v3: Extend for integers as well and add a comment explaining the
transforms.

Results on Midgard (unfortunately a true SIMD architecture):

total instructions in shared programs: 51359 -> 50963 (-0.77%)
instructions in affected programs: 4523 -> 4127 (-8.76%)
helped: 53
HURT: 0
helped stats (abs) min: 1 max: 86 x̄: 7.47 x̃: 6
helped stats (rel) min: 1.71% max: 28.00% x̄: 9.66% x̃: 7.34%
95% mean confidence interval for instructions value: -10.58 -4.36
95% mean confidence interval for instructions %-change: -11.45% -7.88%
Instructions are helped.

total bundles in shared programs: 25825 -> 25670 (-0.60%)
bundles in affected programs: 2057 -> 1902 (-7.54%)
helped: 53
HURT: 0
helped stats (abs) min: 1 max: 26 x̄: 2.92 x̃: 2
helped stats (rel) min: 2.86% max: 30.00% x̄: 8.64% x̃: 8.33%
95% mean confidence interval for bundles value: -3.93 -1.92
95% mean confidence interval for bundles %-change: -10.69% -6.59%
Bundles are helped.

total quadwords in shared programs: 41359 -> 41055 (-0.74%)
quadwords in affected programs: 3801 -> 3497 (-8.00%)
helped: 57
HURT: 0
helped stats (abs) min: 1 max: 57 x̄: 5.33 x̃: 4
helped stats (rel) min: 1.92% max: 21.05% x̄: 8.22% x̃: 6.67%
95% mean confidence interval for quadwords value: -7.35 -3.32
95% mean confidence interval for quadwords %-change: -9.54% -6.90%
Quadwords are helped.

total registers in shared programs: 3849 -> 3807 (-1.09%)
registers in affected programs: 167 -> 125 (-25.15%)
helped: 32
HURT: 1
helped stats (abs) min: 1 max: 3 x̄: 1.34 x̃: 1
helped stats (rel) min: 20.00% max: 50.00% x̄: 26.35% x̃: 20.00%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 16.67% max: 16.67% x̄: 16.67% x̃: 16.67%
95% mean confidence interval for registers value: -1.54 -1.00
95% mean confidence interval for registers %-change: -29.41% -20.69%
Registers are helped.

total threads in shared programs: 2471 -> 2520 (1.98%)
threads in affected programs: 49 -> 98 (100.00%)
helped: 25
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.96 x̃: 2
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for threads value: 1.88 2.04
95% mean confidence interval for threads %-change: 100.00% 100.00%
Threads are [helped].

total spills in shared programs: 168 -> 168 (0.00%)
spills in affected programs: 0 -> 0
helped: 0
HURT: 0

total fills in shared programs: 186 -> 186 (0.00%)
fills in affected programs: 0 -> 0
helped: 0
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4999>
2020-06-30 16:21:33 +00:00
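The first transformation step in 54d7907c27 can be mimicked with a toy rewriter over the same tuple notation. This is only a sketch: `propagate_f2f16` is a hypothetical function, and it does not reflect how the real pattern matcher works or the vectorize_vec2_16bit predicate.

```python
def propagate_f2f16(expr):
    """Rewrite ('f2f16', ('vecN', x, y, ...)) to ('vecN', ('f2f16', x), ...).

    Any other expression is returned unchanged.
    """
    if isinstance(expr, tuple) and len(expr) == 2 and expr[0] == 'f2f16':
        inner = expr[1]
        if isinstance(inner, tuple) and inner[0].startswith('vec'):
            # Push the conversion onto each vector component.
            return (inner[0],) + tuple(('f2f16', c) for c in inner[1:])
    return expr


before = ('f2f16', ('vec2', ('f2f32', 'a@16'), '#b@32'))
after = propagate_f2f16(before)
```

The result matches the intermediate form quoted in the commit message, which the existing f2f16(f2f32) rule and constant folding can then simplify further.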
Boris Brezillon cff418cc4c nir: Add new rules to optimize NOOP pack/unpack pairs
nir_load_store_vectorize_test.ssbo_load_adjacent_32_32_64_64 expectations
need to be fixed accordingly.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5589>
2020-06-29 09:18:26 +02:00
Marek Olšák f798513f91 nir: add i2imp and u2ump opcodes for conversions to mediump
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5002>
2020-06-02 20:01:18 +00:00
Alyssa Rosenzweig f3310cb3e1 nir: Fold f2f16(b2f32(x)) to b2f16(x)
By definition.

This reduces register pressure on freedreno so that the noubo expected
failure goes away.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5002>
2020-06-02 20:01:18 +00:00
Ian Romanick 412e29c277 nir/algebraic: Eliminate useless extract before unpack
The shader helped for spills and fills is the big compute shader in Dirt
Showdown.  One of the shaders hurt for spills and fills on Broadwell is
the big compute shader in Bioshock Infinite, but combined with the
previous commit, it's still an improvement.

Tiger Lake
total instructions in shared programs: 21833218 -> 21832449 (<.01%)
instructions in affected programs: 66104 -> 65335 (-1.16%)
helped: 106
HURT: 14
helped stats (abs) min: 1 max: 67 x̄: 7.87 x̃: 5
helped stats (rel) min: 0.19% max: 5.76% x̄: 1.27% x̃: 0.95%
HURT stats (abs)   min: 1 max: 14 x̄: 4.64 x̃: 1
HURT stats (rel)   min: 0.19% max: 4.12% x̄: 1.41% x̃: 0.19%
95% mean confidence interval for instructions value: -8.51 -4.30
95% mean confidence interval for instructions %-change: -1.23% -0.69%
Instructions are helped.

total cycles in shared programs: 506180109 -> 506196314 (<.01%)
cycles in affected programs: 1671429 -> 1687634 (0.97%)
helped: 37
HURT: 84
helped stats (abs) min: 1 max: 490 x̄: 73.27 x̃: 24
helped stats (rel) min: 0.02% max: 7.98% x̄: 1.25% x̃: 0.41%
HURT stats (abs)   min: 1 max: 5000 x̄: 225.19 x̃: 8
HURT stats (rel)   min: 0.03% max: 10.22% x̄: 1.22% x̃: 0.42%
95% mean confidence interval for cycles value: 2.85 265.00
95% mean confidence interval for cycles %-change: 0.04% 0.88%
Cycles are HURT.

Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 19961317 -> 19960543 (<.01%)
instructions in affected programs: 30268 -> 29494 (-2.56%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 142 x̄: 19.85 x̃: 7
helped stats (rel) min: 0.19% max: 7.87% x̄: 2.33% x̃: 2.31%
95% mean confidence interval for instructions value: -29.46 -10.23
95% mean confidence interval for instructions %-change: -2.95% -1.71%
Instructions are helped.

total cycles in shared programs: 498863755 -> 498865843 (<.01%)
cycles in affected programs: 1831136 -> 1833224 (0.11%)
helped: 57
HURT: 65
helped stats (abs) min: 1 max: 1400 x̄: 128.93 x̃: 25
helped stats (rel) min: 0.05% max: 3.49% x̄: 0.89% x̃: 0.71%
HURT stats (abs)   min: 1 max: 1887 x̄: 145.18 x̃: 15
HURT stats (rel)   min: 0.02% max: 9.88% x̄: 1.83% x̃: 0.73%
95% mean confidence interval for cycles value: -58.30 92.53
95% mean confidence interval for cycles %-change: 0.16% 0.97%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 8774 -> 8773 (-0.01%)
spills in affected programs: 20 -> 19 (-5.00%)
helped: 1
HURT: 0

total fills in shared programs: 9496 -> 9494 (-0.02%)
fills in affected programs: 40 -> 38 (-5.00%)
helped: 1
HURT: 0

Broadwell
total instructions in shared programs: 17859373 -> 17858548 (<.01%)
instructions in affected programs: 38452 -> 37627 (-2.15%)
helped: 31
HURT: 0
helped stats (abs) min: 1 max: 143 x̄: 26.61 x̃: 10
helped stats (rel) min: 0.19% max: 7.87% x̄: 2.57% x̃: 2.69%
95% mean confidence interval for instructions value: -39.79 -13.44
95% mean confidence interval for instructions %-change: -3.25% -1.89%
Instructions are helped.

total cycles in shared programs: 525858109 -> 525869236 (<.01%)
cycles in affected programs: 2058597 -> 2069724 (0.54%)
helped: 44
HURT: 75
helped stats (abs) min: 2 max: 1330 x̄: 187.84 x̃: 23
helped stats (rel) min: 0.04% max: 31.31% x̄: 2.13% x̃: 0.85%
HURT stats (abs)   min: 1 max: 3915 x̄: 258.56 x̃: 47
HURT stats (rel)   min: 0.02% max: 10.53% x̄: 2.81% x̃: 2.21%
95% mean confidence interval for cycles value: -26.06 213.07
95% mean confidence interval for cycles %-change: 0.19% 1.78%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 25744 -> 25730 (-0.05%)
spills in affected programs: 1578 -> 1564 (-0.89%)
helped: 4
HURT: 2

total fills in shared programs: 31710 -> 31689 (-0.07%)
fills in affected programs: 4346 -> 4325 (-0.48%)
helped: 3
HURT: 3

Haswell
total instructions in shared programs: 16228399 -> 16227783 (<.01%)
instructions in affected programs: 22201 -> 21585 (-2.77%)
helped: 27
HURT: 0
helped stats (abs) min: 1 max: 68 x̄: 22.81 x̃: 11
helped stats (rel) min: 0.19% max: 7.87% x̄: 2.92% x̃: 2.86%
95% mean confidence interval for instructions value: -31.96 -13.66
95% mean confidence interval for instructions %-change: -3.68% -2.15%
Instructions are helped.

total cycles in shared programs: 538613967 -> 538701354 (0.02%)
cycles in affected programs: 1653044 -> 1740431 (5.29%)
helped: 36
HURT: 81
helped stats (abs) min: 2 max: 708 x̄: 104.50 x̃: 17
helped stats (rel) min: <.01% max: 15.01% x̄: 1.67% x̃: 0.65%
HURT stats (abs)   min: 1 max: 30100 x̄: 1125.30 x̃: 304
HURT stats (rel)   min: 0.02% max: 16.21% x̄: 8.98% x̃: 11.60%
95% mean confidence interval for cycles value: 23.78 1470.01
95% mean confidence interval for cycles %-change: 4.29% 7.12%
Cycles are HURT.

total spills in shared programs: 23418 -> 23409 (-0.04%)
spills in affected programs: 177 -> 168 (-5.08%)
helped: 2
HURT: 0

total fills in shared programs: 25919 -> 25896 (-0.09%)
fills in affected programs: 568 -> 545 (-4.05%)
helped: 3
HURT: 0

Ivy Bridge
total instructions in shared programs: 15265983 -> 15265759 (<.01%)
instructions in affected programs: 8418 -> 8194 (-2.66%)
helped: 5
HURT: 0
helped stats (abs) min: 18 max: 99 x̄: 44.80 x̃: 26
helped stats (rel) min: 1.74% max: 4.26% x̄: 3.12% x̃: 3.00%
95% mean confidence interval for instructions value: -86.29 -3.31
95% mean confidence interval for instructions %-change: -4.43% -1.81%
Instructions are helped.

total cycles in shared programs: 422930336 -> 422929589 (<.01%)
cycles in affected programs: 59347 -> 58600 (-1.26%)
helped: 3
HURT: 2
helped stats (abs) min: 72 max: 1060 x̄: 433.33 x̃: 168
helped stats (rel) min: 1.14% max: 3.48% x̄: 2.23% x̃: 2.06%
HURT stats (abs)   min: 265 max: 288 x̄: 276.50 x̃: 276
HURT stats (rel)   min: 4.79% max: 5.64% x̄: 5.22% x̃: 5.22%
95% mean confidence interval for cycles value: -829.08 530.28
95% mean confidence interval for cycles %-change: -4.43% 5.93%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 4953 -> 4946 (-0.14%)
spills in affected programs: 344 -> 337 (-2.03%)
helped: 2
HURT: 0

total fills in shared programs: 5548 -> 5521 (-0.49%)
fills in affected programs: 838 -> 811 (-3.22%)
helped: 2
HURT: 0

No shader-db changes on any earlier Intel platform.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
2020-05-11 12:07:01 -07:00
Ian Romanick bc0bbb8f0b nir/algebraic: Add some half packing optimizations for pack_half_2x16_split
Like 1f72857739 ("nir/algebraic: add some half packing optimizations"),
but for the pack_half_2x16_split variant.

The shader helped for spills and fills is the big compute shader in
Bioshock Infinite.

Tiger Lake
total instructions in shared programs: 21834539 -> 21833218 (<.01%)
instructions in affected programs: 60119 -> 58798 (-2.20%)
helped: 105
HURT: 0
helped stats (abs) min: 5 max: 50 x̄: 12.58 x̃: 9
helped stats (rel) min: 0.86% max: 26.46% x̄: 2.58% x̃: 1.70%
95% mean confidence interval for instructions value: -14.35 -10.81
95% mean confidence interval for instructions %-change: -3.20% -1.97%
Instructions are helped.

total cycles in shared programs: 506215169 -> 506180109 (<.01%)
cycles in affected programs: 1445088 -> 1410028 (-2.43%)
helped: 97
HURT: 8
helped stats (abs) min: 1 max: 16882 x̄: 387.76 x̃: 26
helped stats (rel) min: 0.05% max: 18.31% x̄: 1.77% x̃: 1.34%
HURT stats (abs)   min: 21 max: 635 x̄: 319.12 x̃: 212
HURT stats (rel)   min: 0.39% max: 20.08% x̄: 8.96% x̃: 4.46%
95% mean confidence interval for cycles value: -782.96 115.15
95% mean confidence interval for cycles %-change: -1.74% -0.16%
Inconclusive result (value mean confidence interval includes 0).

Ice Lake, Skylake, and Broadwell had similar results. (Ice Lake shown)
total instructions in shared programs: 19962974 -> 19961317 (<.01%)
instructions in affected programs: 63471 -> 61814 (-2.61%)
helped: 105
HURT: 0
helped stats (abs) min: 6 max: 82 x̄: 15.78 x̃: 11
helped stats (rel) min: 1.11% max: 28.65% x̄: 3.17% x̃: 2.16%
95% mean confidence interval for instructions value: -18.38 -13.18
95% mean confidence interval for instructions %-change: -3.86% -2.48%
Instructions are helped.

total cycles in shared programs: 498908953 -> 498863755 (<.01%)
cycles in affected programs: 1566998 -> 1521800 (-2.88%)
helped: 89
HURT: 15
helped stats (abs) min: 2 max: 17502 x̄: 532.19 x̃: 69
helped stats (rel) min: 0.07% max: 18.54% x̄: 4.71% x̃: 3.12%
HURT stats (abs)   min: 3 max: 661 x̄: 144.47 x̃: 16
HURT stats (rel)   min: 0.14% max: 20.57% x̄: 4.29% x̃: 0.30%
95% mean confidence interval for cycles value: -903.93 34.74
95% mean confidence interval for cycles %-change: -4.50% -2.32%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 8776 -> 8774 (-0.02%)
spills in affected programs: 25 -> 23 (-8.00%)
helped: 1
HURT: 0

total fills in shared programs: 9500 -> 9496 (-0.04%)
fills in affected programs: 46 -> 42 (-8.70%)
helped: 1
HURT: 0

Haswell
total instructions in shared programs: 16229912 -> 16228399 (<.01%)
instructions in affected programs: 61257 -> 59744 (-2.47%)
helped: 105
HURT: 0
helped stats (abs) min: 6 max: 51 x̄: 14.41 x̃: 11
helped stats (rel) min: 0.77% max: 28.65% x̄: 3.08% x̃: 2.15%
95% mean confidence interval for instructions value: -16.14 -12.68
95% mean confidence interval for instructions %-change: -3.77% -2.40%
Instructions are helped.

total cycles in shared programs: 538654481 -> 538613967 (<.01%)
cycles in affected programs: 1448966 -> 1408452 (-2.80%)
helped: 58
HURT: 47
helped stats (abs) min: 9 max: 22604 x̄: 957.00 x̃: 74
helped stats (rel) min: 0.40% max: 18.81% x̄: 6.22% x̃: 3.03%
HURT stats (abs)   min: 5 max: 3720 x̄: 318.98 x̃: 49
HURT stats (rel)   min: 0.20% max: 34.50% x̄: 5.05% x̃: 2.12%
95% mean confidence interval for cycles value: -999.84 228.14
95% mean confidence interval for cycles %-change: -2.86% 0.51%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 15266086 -> 15265983 (<.01%)
instructions in affected programs: 7272 -> 7169 (-1.42%)
helped: 3
HURT: 0
helped stats (abs) min: 21 max: 41 x̄: 34.33 x̃: 41
helped stats (rel) min: 0.66% max: 5.43% x̄: 2.44% x̃: 1.23%

total cycles in shared programs: 422930883 -> 422930336 (<.01%)
cycles in affected programs: 49259 -> 48712 (-1.11%)
helped: 3
HURT: 0
helped stats (abs) min: 106 max: 221 x̄: 182.33 x̃: 220
helped stats (rel) min: 0.71% max: 5.95% x̄: 2.46% x̃: 0.72%

No changes on any earlier Intel platforms.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
2020-05-11 12:07:01 -07:00
Ian Romanick a2bf41ec65 nir/algebraic: Optimize ushr of pack_half, not ishr
When a = -1.0, pack_half_2x16(vec2(0x0000, 0xBC00)) will produce
0xBC000000.  The ishr will produce 0xFFFFBC00.  The replacement
pack_half_2x16(vec2(0xBC00, 0x0000)) will produce 0x0000BC00.

Fixes: 1f72857739 ("nir/algebraic: add some half packing optimizations")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
2020-05-11 12:07:01 -07:00
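The numbers in a2bf41ec65 can be reproduced with a short sketch of 32-bit arithmetic and logical right shifts. The helper names are hypothetical; 0xBC00 is the half-float encoding of -1.0 used in the commit message.

```python
MASK32 = 0xFFFFFFFF


def ushr32(value: int, shift: int) -> int:
    """Logical right shift: zeros are shifted in from the top."""
    return (value & MASK32) >> shift


def ishr32(value: int, shift: int) -> int:
    """Arithmetic right shift: the sign bit is replicated."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32  # reinterpret as a negative signed value
    return (value >> shift) & MASK32


packed = 0xBC000000  # 0xBC00 in the upper half, 0x0000 in the lower half
```

Only the logical shift matches the replacement value 0x0000BC00; the arithmetic shift drags the sign bit of the upper half into the result.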
Ian Romanick 3b6449d453 nir/algebraic: Optimize some bfe patterns
v2: Use -x instead of 32-x in shift counts.

Tiger Lake
total instructions in shared programs: 17597691 -> 17597405 (<.01%)
instructions in affected programs: 224557 -> 224271 (-0.13%)
helped: 74
HURT: 17
helped stats (abs) min: 1 max: 71 x̄: 14.36 x̃: 7
helped stats (rel) min: 0.08% max: 1.80% x̄: 0.50% x̃: 0.37%
HURT stats (abs)   min: 1 max: 141 x̄: 45.71 x̃: 40
HURT stats (rel)   min: 0.03% max: 3.55% x̄: 1.20% x̃: 1.14%
95% mean confidence interval for instructions value: -10.53 4.24
95% mean confidence interval for instructions %-change: -0.38% 0.01%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 333595656 -> 333180770 (-0.12%)
cycles in affected programs: 70056467 -> 69641581 (-0.59%)
helped: 91
HURT: 4
helped stats (abs) min: 1 max: 25174 x̄: 4571.40 x̃: 400
helped stats (rel) min: <.01% max: 2.23% x̄: 0.40% x̃: 0.21%
HURT stats (abs)   min: 1 max: 370 x̄: 277.75 x̃: 370
HURT stats (rel)   min: 0.01% max: 0.04% x̄: 0.04% x̃: 0.04%
95% mean confidence interval for cycles value: -5981.55 -2752.89
95% mean confidence interval for cycles %-change: -0.48% -0.29%
Cycles are helped.

Ice Lake, Skylake, Broadwell, and Haswell had similar results. (Ice Lake shown)
total instructions in shared programs: 16117204 -> 16116723 (<.01%)
instructions in affected programs: 207109 -> 206628 (-0.23%)
helped: 100
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 4.81 x̃: 7
helped stats (rel) min: 0.10% max: 1.58% x̄: 0.23% x̃: 0.20%
95% mean confidence interval for instructions value: -5.51 -4.11
95% mean confidence interval for instructions %-change: -0.27% -0.19%
Instructions are helped.

total cycles in shared programs: 330487341 -> 330082421 (-0.12%)
cycles in affected programs: 68037050 -> 67632130 (-0.60%)
helped: 89
HURT: 7
helped stats (abs) min: 2 max: 24610 x̄: 4567.07 x̃: 400
helped stats (rel) min: <.01% max: 1.52% x̄: 0.39% x̃: 0.22%
HURT stats (abs)   min: 1 max: 370 x̄: 221.29 x̃: 170
HURT stats (rel)   min: 0.01% max: 1.66% x̄: 0.58% x̃: 0.04%
95% mean confidence interval for cycles value: -5780.79 -2655.05
95% mean confidence interval for cycles %-change: -0.42% -0.22%
Cycles are helped.

Ivy Bridge
total instructions in shared programs: 11873641 -> 11873137 (<.01%)
instructions in affected programs: 147464 -> 146960 (-0.34%)
helped: 54
HURT: 0
helped stats (abs) min: 9 max: 10 x̄: 9.33 x̃: 9
helped stats (rel) min: 0.29% max: 0.41% x̄: 0.34% x̃: 0.34%
95% mean confidence interval for instructions value: -9.46 -9.20
95% mean confidence interval for instructions %-change: -0.35% -0.33%
Instructions are helped.

total cycles in shared programs: 175769085 -> 175549519 (-0.12%)
cycles in affected programs: 60770592 -> 60551026 (-0.36%)
helped: 54
HURT: 0
helped stats (abs) min: 252 max: 13434 x̄: 4066.04 x̃: 1290
helped stats (rel) min: 0.02% max: 0.74% x̄: 0.34% x̃: 0.26%
95% mean confidence interval for cycles value: -5323.59 -2808.48
95% mean confidence interval for cycles %-change: -0.41% -0.27%
Cycles are helped.

No changes on any earlier Intel platforms.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4156>
2020-05-07 10:55:50 -07:00
Ian Romanick f46eabf84e nir/algebraic: Split ibfe and ubfe with two constant sources
I also tried splitting ubfe instructions with one or zero constants,
and zero shaders in shader-db were affected.

The "lost" shader is a compute shader that was promoted from SIMD8 to
SIMD16, so is also counted as the gained shader.

v2: Further restrict bfe splitting.  bfe with multiple constants is
better on at least some Radeon GPUs.  Use -x instead of 32-x in shift
counts.

v3: Fix the outer shift count for ibfe lowering.  Add c=0 optimizations
to prevent bad lowering.  Both suggested by Rhys.  Add shift by -32
optimizations.
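
The shift-pair idea behind the splitting can be sketched as follows. This is not the pattern text from the commit; the function names and the reference semantics are assumptions made for illustration, valid for offset + bits <= 32.

```python
MASK32 = 0xFFFFFFFF

def ishl(value, count):
    """32-bit left shift; only the low five bits of count are used."""
    return (value << (count & 31)) & MASK32

def ushr(value, count):
    """32-bit logical right shift."""
    return (value & MASK32) >> (count & 31)

def ishr(value, count):
    """32-bit arithmetic right shift."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32
    return (value >> (count & 31)) & MASK32

def ubfe_ref(value, offset, bits):
    """Reference unsigned bitfield extract (zero-extended)."""
    if bits == 0:
        return 0
    return (value >> offset) & ((1 << bits) - 1)

def ibfe_ref(value, offset, bits):
    """Reference signed bitfield extract (sign-extended to 32 bits)."""
    field = ubfe_ref(value, offset, bits)
    if bits and field & (1 << (bits - 1)):
        field -= 1 << bits
    return field & MASK32

def ubfe_split(value, offset, bits):
    """Shift the field to the top, then shift it back down with zero fill."""
    return ushr(ishl(value, -(offset + bits)), -bits)

def ibfe_split(value, offset, bits):
    """Same as ubfe_split, but the outer shift fills with the sign bit."""
    return ishr(ishl(value, -(offset + bits)), -bits)

print(hex(ubfe_split(0xDEADBEEF, 8, 12)))
print(hex(ibfe_split(0xDEADBEEF, 8, 12)))
# With bits == 0 the outer shift count becomes 0 and the split form no
# longer yields 0, which is presumably why c=0 needs its own handling.
print(hex(ubfe_split(0xDEADBEEF, 8, 0)))
```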

Tiger Lake
total instructions in shared programs: 17608764 -> 17596316 (-0.07%)
instructions in affected programs: 303765 -> 291317 (-4.10%)
helped: 113
HURT: 46
helped stats (abs) min: 1 max: 458 x̄: 120.67 x̃: 21
helped stats (rel) min: 0.09% max: 11.23% x̄: 3.47% x̃: 1.39%
HURT stats (abs)   min: 1 max: 201 x̄: 25.83 x̃: 6
HURT stats (rel)   min: 0.23% max: 5.18% x̄: 1.53% x̃: 1.11%
95% mean confidence interval for instructions value: -101.13 -55.45
95% mean confidence interval for instructions %-change: -2.61% -1.44%
Instructions are helped.

total cycles in shared programs: 338390770 -> 333530868 (-1.44%)
cycles in affected programs: 79438330 -> 74578428 (-6.12%)
helped: 112
HURT: 64
helped stats (abs) min: 2 max: 268955 x̄: 44261.93 x̃: 1452
helped stats (rel) min: <.01% max: 29.51% x̄: 4.72% x̃: 2.23%
HURT stats (abs)   min: 2 max: 17618 x̄: 1522.41 x̃: 84
HURT stats (rel)   min: <.01% max: 7.34% x̄: 1.35% x̃: 0.34%
95% mean confidence interval for cycles value: -37232.47 -17993.69
95% mean confidence interval for cycles %-change: -3.37% -1.65%
Cycles are helped.

total spills in shared programs: 8944 -> 8138 (-9.01%)
spills in affected programs: 3240 -> 2434 (-24.88%)
helped: 67
HURT: 0

total fills in shared programs: 9373 -> 7842 (-16.33%)
fills in affected programs: 4736 -> 3205 (-32.33%)
helped: 67
HURT: 0

LOST:   1
GAINED: 2

Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 16123288 -> 16116876 (-0.04%)
instructions in affected programs: 241155 -> 234743 (-2.66%)
helped: 126
HURT: 2
helped stats (abs) min: 1 max: 209 x̄: 50.90 x̃: 7
helped stats (rel) min: 0.07% max: 5.94% x̄: 1.76% x̃: 0.65%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.05% max: 0.24% x̄: 0.15% x̃: 0.15%
95% mean confidence interval for instructions value: -61.29 -38.89
95% mean confidence interval for instructions %-change: -2.05% -1.42%
Instructions are helped.

total cycles in shared programs: 335419163 -> 330438819 (-1.48%)
cycles in affected programs: 77515502 -> 72535158 (-6.42%)
helped: 139
HURT: 37
helped stats (abs) min: 2 max: 269140 x̄: 36374.19 x̃: 597
helped stats (rel) min: <.01% max: 28.60% x̄: 3.67% x̃: 1.31%
HURT stats (abs)   min: 4 max: 17618 x̄: 2045.08 x̃: 174
HURT stats (rel)   min: 0.02% max: 8.32% x̄: 2.61% x̃: 0.62%
95% mean confidence interval for cycles value: -37799.30 -18795.51
95% mean confidence interval for cycles %-change: -3.13% -1.57%
Cycles are helped.

total spills in shared programs: 8065 -> 7306 (-9.41%)
spills in affected programs: 3153 -> 2394 (-24.07%)
helped: 67
HURT: 0

total fills in shared programs: 8710 -> 7412 (-14.90%)
fills in affected programs: 4466 -> 3168 (-29.06%)
helped: 67
HURT: 0

LOST:   1
GAINED: 1

Broadwell
total instructions in shared programs: 14970538 -> 14965967 (-0.03%)
instructions in affected programs: 227040 -> 222469 (-2.01%)
helped: 126
HURT: 2
helped stats (abs) min: 1 max: 136 x̄: 36.29 x̃: 8
helped stats (rel) min: 0.07% max: 6.02% x̄: 1.47% x̃: 0.89%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.05% max: 0.24% x̄: 0.14% x̃: 0.14%
95% mean confidence interval for instructions value: -43.05 -28.37
95% mean confidence interval for instructions %-change: -1.69% -1.19%
Instructions are helped.

total cycles in shared programs: 336237662 -> 333035960 (-0.95%)
cycles in affected programs: 72066394 -> 68864692 (-4.44%)
helped: 134
HURT: 42
helped stats (abs) min: 4 max: 122640 x̄: 24344.54 x̃: 1833
helped stats (rel) min: <.01% max: 26.93% x̄: 4.02% x̃: 2.38%
HURT stats (abs)   min: 1 max: 17205 x̄: 1439.69 x̃: 92
HURT stats (rel)   min: <.01% max: 7.12% x̄: 1.34% x̃: 0.62%
95% mean confidence interval for cycles value: -23753.58 -12629.40
95% mean confidence interval for cycles %-change: -3.50% -1.98%
Cycles are helped.

total spills in shared programs: 21122 -> 20204 (-4.35%)
spills in affected programs: 3644 -> 2726 (-25.19%)
helped: 67
HURT: 0

total fills in shared programs: 24879 -> 23460 (-5.70%)
fills in affected programs: 4883 -> 3464 (-29.06%)
helped: 67
HURT: 0

Haswell
total instructions in shared programs: 13148269 -> 13145444 (-0.02%)
instructions in affected programs: 137046 -> 134221 (-2.06%)
helped: 97
HURT: 3
helped stats (abs) min: 1 max: 137 x̄: 30.58 x̃: 3
helped stats (rel) min: 0.14% max: 4.38% x̄: 1.38% x̃: 0.44%
HURT stats (abs)   min: 1 max: 70 x̄: 47.00 x̃: 70
HURT stats (rel)   min: 0.05% max: 5.82% x̄: 3.90% x̃: 5.82%
95% mean confidence interval for instructions value: -37.15 -19.35
95% mean confidence interval for instructions %-change: -1.56% -0.89%
Instructions are helped.

total cycles in shared programs: 321221834 -> 318333159 (-0.90%)
cycles in affected programs: 54932349 -> 52043674 (-5.26%)
helped: 95
HURT: 53
helped stats (abs) min: 4 max: 123390 x̄: 30648.39 x̃: 702
helped stats (rel) min: <.01% max: 28.87% x̄: 4.27% x̃: 2.87%
HURT stats (abs)   min: 4 max: 2357 x̄: 432.49 x̃: 113
HURT stats (rel)   min: <.01% max: 3.44% x̄: 1.03% x̃: 0.54%
95% mean confidence interval for cycles value: -26154.16 -12881.99
95% mean confidence interval for cycles %-change: -3.20% -1.55%
Cycles are helped.

total spills in shared programs: 19878 -> 19293 (-2.94%)
spills in affected programs: 3020 -> 2435 (-19.37%)
helped: 41
HURT: 2

total fills in shared programs: 20918 -> 19875 (-4.99%)
fills in affected programs: 3968 -> 2925 (-26.29%)
helped: 41
HURT: 2

LOST:   0
GAINED: 1

Ivy Bridge
total instructions in shared programs: 11875585 -> 11873641 (-0.02%)
instructions in affected programs: 78065 -> 76121 (-2.49%)
helped: 27
HURT: 0
helped stats (abs) min: 8 max: 134 x̄: 72.00 x̃: 72
helped stats (rel) min: 0.36% max: 4.23% x̄: 2.42% x̃: 2.42%
95% mean confidence interval for instructions value: -83.68 -60.32
95% mean confidence interval for instructions %-change: -2.78% -2.07%
Instructions are helped.

total cycles in shared programs: 178232734 -> 175769085 (-1.38%)
cycles in affected programs: 50018707 -> 47555058 (-4.93%)
helped: 27
HURT: 0
helped stats (abs) min: 82035 max: 99953 x̄: 91246.26 x̃: 92278
helped stats (rel) min: 4.40% max: 5.69% x̄: 4.93% x̃: 4.95%
95% mean confidence interval for cycles value: -93674.20 -88818.32
95% mean confidence interval for cycles %-change: -5.09% -4.78%
Cycles are helped.

total spills in shared programs: 4182 -> 3739 (-10.59%)
spills in affected programs: 1089 -> 646 (-40.68%)
helped: 27
HURT: 0

total fills in shared programs: 5216 -> 4345 (-16.70%)
fills in affected programs: 1874 -> 1003 (-46.48%)
helped: 27
HURT: 0

No changes on any earlier Intel platforms.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4156>
2020-05-07 10:55:50 -07:00
Ian Romanick 0d605a8bbf nir/algebraic: Recognize open-coded byte or word extract from bfe
v2: Move word-extract patterns up near the byte-extract patterns.
Suggested by Rhys.
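
As an illustration of what "open-coded byte or word extract" means here, the sketch below compares a generic unsigned bitfield extract with dedicated byte and word extracts. The helper semantics are assumptions written for this note, not copied from NIR.

```python
def ubfe(value, offset, bits):
    """Unsigned bitfield extract of `bits` bits starting at `offset`."""
    return (value >> offset) & ((1 << bits) - 1)

def extract_u8(value, index):
    """Byte `index` of a 32-bit value, zero-extended."""
    return (value >> (8 * index)) & 0xFF

def extract_u16(value, index):
    """16-bit word `index` of a 32-bit value, zero-extended."""
    return (value >> (16 * index)) & 0xFFFF

value = 0xA1B2C3D4
# An 8-bit field at a byte-aligned offset is just a byte extract.
print(hex(ubfe(value, 8, 8)), hex(extract_u8(value, 1)))
# A 16-bit field at a word-aligned offset is just a word extract.
print(hex(ubfe(value, 16, 16)), hex(extract_u16(value, 1)))
```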

Tiger Lake
total instructions in shared programs: 21369236 -> 21368712 (<.01%)
instructions in affected programs: 913104 -> 912580 (-0.06%)
helped: 209
HURT: 165
helped stats (abs) min: 1 max: 30 x̄: 5.35 x̃: 3
helped stats (rel) min: 0.03% max: 6.92% x̄: 0.28% x̃: 0.12%
HURT stats (abs)   min: 1 max: 18 x̄: 3.61 x̃: 3
HURT stats (rel)   min: 0.04% max: 0.87% x̄: 0.16% x̃: 0.12%
95% mean confidence interval for instructions value: -2.04 -0.76
95% mean confidence interval for instructions %-change: -0.14% -0.04%
Instructions are helped.

total cycles in shared programs: 490161481 -> 490175959 (<.01%)
cycles in affected programs: 72557244 -> 72571722 (0.02%)
helped: 193
HURT: 189
helped stats (abs) min: 1 max: 14240 x̄: 509.16 x̃: 71
helped stats (rel) min: <.01% max: 13.71% x̄: 0.44% x̃: 0.05%
HURT stats (abs)   min: 2 max: 4210 x̄: 596.53 x̃: 173
HURT stats (rel)   min: <.01% max: 5.59% x̄: 0.54% x̃: 0.14%
95% mean confidence interval for cycles value: -96.33 172.13
95% mean confidence interval for cycles %-change: -0.07% 0.16%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 10780 -> 10782 (0.02%)
spills in affected programs: 18 -> 20 (11.11%)
helped: 0
HURT: 1

total fills in shared programs: 10396 -> 10370 (-0.25%)
fills in affected programs: 2292 -> 2266 (-1.13%)
helped: 27
HURT: 1

Ice Lake
total instructions in shared programs: 19556356 -> 19555446 (<.01%)
instructions in affected programs: 833336 -> 832426 (-0.11%)
helped: 400
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 2.27 x̃: 2
helped stats (rel) min: 0.07% max: 4.42% x̄: 0.14% x̃: 0.10%
95% mean confidence interval for instructions value: -2.42 -2.13
95% mean confidence interval for instructions %-change: -0.18% -0.11%
Instructions are helped.

total cycles in shared programs: 488026481 -> 488008714 (<.01%)
cycles in affected programs: 81581708 -> 81563941 (-0.02%)
helped: 193
HURT: 206
helped stats (abs) min: 1 max: 3615 x̄: 576.35 x̃: 131
helped stats (rel) min: <.01% max: 4.50% x̄: 0.49% x̃: 0.22%
HURT stats (abs)   min: 1 max: 2244 x̄: 453.73 x̃: 170
HURT stats (rel)   min: <.01% max: 5.71% x̄: 0.36% x̃: 0.14%
95% mean confidence interval for cycles value: -127.23 38.17
95% mean confidence interval for cycles %-change: -0.12% 0.03%
Inconclusive result (value mean confidence interval includes 0).

total fills in shared programs: 9935 -> 9908 (-0.27%)
fills in affected programs: 2208 -> 2181 (-1.22%)
helped: 27
HURT: 0

Skylake
total instructions in shared programs: 17766078 -> 17765186 (<.01%)
instructions in affected programs: 822017 -> 821125 (-0.11%)
helped: 399
HURT: 1
helped stats (abs) min: 1 max: 20 x̄: 2.27 x̃: 2
helped stats (rel) min: 0.07% max: 4.46% x̄: 0.15% x̃: 0.10%
HURT stats (abs)   min: 12 max: 12 x̄: 12.00 x̃: 12
HURT stats (rel)   min: 0.50% max: 0.50% x̄: 0.50% x̃: 0.50%
95% mean confidence interval for instructions value: -2.39 -2.07
95% mean confidence interval for instructions %-change: -0.18% -0.11%
Instructions are helped.

total cycles in shared programs: 470905548 -> 470907497 (<.01%)
cycles in affected programs: 78598491 -> 78600440 (<.01%)
helped: 202
HURT: 192
helped stats (abs) min: 1 max: 3690 x̄: 228.98 x̃: 60
helped stats (rel) min: <.01% max: 4.51% x̄: 0.24% x̃: 0.03%
HURT stats (abs)   min: 1 max: 2260 x̄: 251.05 x̃: 77
HURT stats (rel)   min: <.01% max: 5.31% x̄: 0.24% x̃: 0.06%
95% mean confidence interval for cycles value: -45.01 54.90
95% mean confidence interval for cycles %-change: -0.07% 0.05%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 9941 -> 9943 (0.02%)
spills in affected programs: 26 -> 28 (7.69%)
helped: 0
HURT: 1

total fills in shared programs: 10293 -> 10268 (-0.24%)
fills in affected programs: 2391 -> 2366 (-1.05%)
helped: 27
HURT: 1

Broadwell
total instructions in shared programs: 17463211 -> 17462366 (<.01%)
instructions in affected programs: 861444 -> 860599 (-0.10%)
helped: 399
HURT: 1
helped stats (abs) min: 1 max: 20 x̄: 2.14 x̃: 2
helped stats (rel) min: 0.03% max: 4.46% x̄: 0.14% x̃: 0.09%
HURT stats (abs)   min: 7 max: 7 x̄: 7.00 x̃: 7
HURT stats (rel)   min: 0.33% max: 0.33% x̄: 0.33% x̃: 0.33%
95% mean confidence interval for instructions value: -2.26 -1.97
95% mean confidence interval for instructions %-change: -0.17% -0.10%
Instructions are helped.

total cycles in shared programs: 507048912 -> 506898243 (-0.03%)
cycles in affected programs: 79806433 -> 79655764 (-0.19%)
helped: 248
HURT: 136
helped stats (abs) min: 1 max: 8450 x̄: 1124.18 x̃: 64
helped stats (rel) min: <.01% max: 5.91% x̄: 0.83% x̃: 0.05%
HURT stats (abs)   min: 2 max: 7632 x̄: 942.12 x̃: 103
HURT stats (rel)   min: <.01% max: 5.62% x̄: 0.71% x̃: 0.08%
95% mean confidence interval for cycles value: -647.01 -137.73
95% mean confidence interval for cycles %-change: -0.47% -0.10%
Cycles are helped.

total spills in shared programs: 22996 -> 22998 (<.01%)
spills in affected programs: 31 -> 33 (6.45%)
helped: 0
HURT: 1

total fills in shared programs: 25951 -> 25923 (-0.11%)
fills in affected programs: 2444 -> 2416 (-1.15%)
helped: 29
HURT: 1

Haswell
total instructions in shared programs: 15841325 -> 15840554 (<.01%)
instructions in affected programs: 869679 -> 868908 (-0.09%)
helped: 394
HURT: 6
helped stats (abs) min: 1 max: 20 x̄: 2.15 x̃: 2
helped stats (rel) min: 0.06% max: 4.46% x̄: 0.14% x̃: 0.09%
HURT stats (abs)   min: 7 max: 18 x̄: 12.83 x̃: 13
HURT stats (rel)   min: 0.32% max: 0.82% x̄: 0.59% x̃: 0.61%
95% mean confidence interval for instructions value: -2.16 -1.69
95% mean confidence interval for instructions %-change: -0.16% -0.09%
Instructions are helped.

total cycles in shared programs: 520417167 -> 520279766 (-0.03%)
cycles in affected programs: 80949963 -> 80812562 (-0.17%)
helped: 246
HURT: 139
helped stats (abs) min: 1 max: 8152 x̄: 790.08 x̃: 129
helped stats (rel) min: <.01% max: 11.46% x̄: 0.70% x̃: 0.09%
HURT stats (abs)   min: 1 max: 7085 x̄: 409.78 x̃: 80
HURT stats (rel)   min: <.01% max: 5.25% x̄: 0.31% x̃: 0.06%
95% mean confidence interval for cycles value: -526.34 -187.43
95% mean confidence interval for cycles %-change: -0.49% -0.18%
Cycles are helped.

total spills in shared programs: 21714 -> 21729 (0.07%)
spills in affected programs: 174 -> 189 (8.62%)
helped: 0
HURT: 6

total fills in shared programs: 22136 -> 22132 (-0.02%)
fills in affected programs: 2848 -> 2844 (-0.14%)
helped: 31
HURT: 6

Ivy Bridge
total instructions in shared programs: 15177059 -> 15177003 (<.01%)
instructions in affected programs: 79370 -> 79314 (-0.07%)
helped: 29
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.93 x̃: 2
helped stats (rel) min: 0.06% max: 0.16% x̄: 0.08% x̃: 0.07%
95% mean confidence interval for instructions value: -2.03 -1.83
95% mean confidence interval for instructions %-change: -0.09% -0.07%
Instructions are helped.

total cycles in shared programs: 420424359 -> 420417254 (<.01%)
cycles in affected programs: 29562648 -> 29555543 (-0.02%)
helped: 23
HURT: 6
helped stats (abs) min: 2 max: 2741 x̄: 432.57 x̃: 142
helped stats (rel) min: <.01% max: 0.26% x̄: 0.04% x̃: 0.02%
HURT stats (abs)   min: 4 max: 1184 x̄: 474.00 x̃: 226
HURT stats (rel)   min: <.01% max: 0.11% x̄: 0.05% x̃: 0.05%
95% mean confidence interval for cycles value: -553.48 63.48
95% mean confidence interval for cycles %-change: -0.05% <.01%
Inconclusive result (value mean confidence interval includes 0).

total fills in shared programs: 6420 -> 6393 (-0.42%)
fills in affected programs: 1901 -> 1874 (-1.42%)
helped: 27
HURT: 0

No changes on any earlier Intel platforms.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4156>
2020-05-07 10:55:50 -07:00
Rhys Perry abc4a82857 nir: make fsat return 0.0 with NaN instead of passing it through
This is how lower_fsat and ACO implement fsat, and it is a more useful
definition since it can be created exactly from fmin(fmax(a, 0.0), 1.0).
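
A minimal sketch of that definition, assuming fmin and fmax return the non-NaN operand when exactly one input is NaN (an assumption about the min/max semantics, not stated in the message):

```python
import math

def fmax(a, b):
    """Maximum that ignores a single NaN operand (assumed semantics)."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)

def fmin(a, b):
    """Minimum that ignores a single NaN operand (assumed semantics)."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)

def fsat(a):
    """Clamp to [0, 1]; NaN collapses to 0.0 through the inner fmax."""
    return fmin(fmax(a, 0.0), 1.0)

print(fsat(float("nan")), fsat(-3.0), fsat(0.25), fsat(2.5))
```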

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3716>
2020-05-07 10:39:19 +00:00
Ian Romanick 7b869710a1 nir/algebraic: Require operands to iand be 32-bit
With the mask value 0x80000000, the other operand must be 32-bit.  This
fixes failures in
dEQP-VK.subgroups.ballot_mask.ext_shader_subgroup_ballot.*.gl_subgroupgemaskarb_*
tests from Vulkan 1.2.2 CTS.

Checking one of the tests, it appears that the tests are doing 64-bit
iand with 0x0000000080000000, then comparing the result with zero.
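
The failure mode can be sketched as below. The exact pattern from the referenced commit is not reproduced; the sketch only assumes that a mask test against 0x80000000 was being treated as a sign test.

```python
SIGN_MASK_32 = 0x80000000

def mask_test(value):
    """The open-coded test: is bit 31 set?"""
    return (value & SIGN_MASK_32) != 0

def is_negative(value, bit_size):
    """True if the top bit of a bit_size-wide integer is set."""
    value &= (1 << bit_size) - 1
    return bool(value >> (bit_size - 1))

# For 32-bit operands the two tests agree.
print(mask_test(0x80000000), is_negative(0x80000000, 32))
# For a 64-bit operand, bit 31 is not the sign bit, so they disagree.
print(mask_test(0x0000000080000000), is_negative(0x0000000080000000, 64))
```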

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2834
Fixes: 88eb8f190b ("nir/algebraic: Simplify logic to detect sign of an integer")
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4770>
2020-04-28 20:33:56 +00:00
Jonathan Marek 42093bb694 nir: add pack_32_2x16_split/unpack_32_2x16_split lowering
The new option replaces the two other _split lowering options, since
there's no need for separate options.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4738>
2020-04-27 18:40:03 +00:00
Gert Wollny 49ce749d0e nir: Add umad24 and umul24 opcodes
So far only the signed versions are defined.

v2: Make umad24 and umul24 non-driver specific (Eric Anholt)

v3: Take care of nir_builder and automatic lowering of the
    opcodes if they are not supported by the backend.
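
A rough sketch of what such opcodes compute, assuming the result is the low 32 bits of the product of the low 24 bits of each operand. That reading, and the mask-then-multiply form of the fallback, are assumptions for illustration rather than text from the commit.

```python
MASK24 = 0x00FFFFFF
MASK32 = 0xFFFFFFFF

def umul24(a, b):
    """Multiply the low 24 bits of a and b, keeping the low 32 bits."""
    return ((a & MASK24) * (b & MASK24)) & MASK32

def umad24(a, b, c):
    """umul24(a, b) plus a 32-bit addend, wrapped to 32 bits."""
    return (umul24(a, b) + c) & MASK32

# The top eight bits of each multiplicand are ignored.
print(umul24(0xFF000003, 0xFF000005))
print(umad24(3, 5, 7))
```

A backend without these opcodes could get the same result from a mask followed by an ordinary 32-bit multiply and add, which is what the two functions spell out.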

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4610>
2020-04-23 18:23:04 +00:00
Rhys Perry 32d871b48f nir/algebraic: don't undo lowering of 8/16-bit comparisons to 32-bit
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4387>
2020-04-23 10:57:38 +00:00
Samuel Pitoiset 59427b6d1d nir/opt_algebraic: lower 64-bit fmin3/fmax3/fmed3
This unconditionally lowers 64-bit fmin3/fmax3/fmed3 because
AMD hardware doesn't have native instructions, and no driver
except RADV uses these instructions.
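
One way such a lowering can be expressed with two-operand min and max is sketched below. The formulas are a common formulation chosen for illustration and are not quoted from the commit; NaN handling is ignored.

```python
from itertools import permutations

def fmin3(a, b, c):
    """Three-input minimum built from two-input minimums."""
    return min(a, min(b, c))

def fmax3(a, b, c):
    """Three-input maximum built from two-input maximums."""
    return max(a, max(b, c))

def fmed3(a, b, c):
    """Median of three built from two-input min and max."""
    return max(min(max(a, b), c), min(a, b))

for a, b, c in permutations((1.0, 2.0, 3.0)):
    print(fmin3(a, b, c), fmed3(a, b, c), fmax3(a, b, c))
```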

Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.*.f64.*
with ACO.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4570>
2020-04-20 06:59:47 +00:00
Ian Romanick b097e326b8 nir/algebraic: Remove a redundant fabs pattern
Made redundant by 5544b2cbbd ("nir/algebraic: Use value range analysis
to eliminate useless unary ops").

No shader-db changes on any Intel platform.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Ian Romanick af1bc7e0c7 nir/algebraic: Use value range analysis to convert fmax to fsat
This is conceptually similar to the 1-fsat(a) <=> fsat(1-a) rearrangement
done in:

3b74790941 ("nir/algebraic: Recognize open-coded flrp(a, b, fsat(c))")

2d259713b7 ("nir/algebraic: Commute 1-fsat(a) to fsat(1-a) for all
non-fmul instructions").
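
The patterns themselves are not quoted in this message. As a sketch of the general idea, assume range analysis has proven a <= 1.0; then clamping below at zero is already a full saturate:

```python
def fsat(a):
    """Clamp a to the range [0, 1]."""
    return min(max(a, 0.0), 1.0)

def fmax_zero(a):
    """The open-coded form: clamp below at zero only."""
    return max(a, 0.0)

# Values assumed to be proven <= 1.0 by range analysis.
bounded = [-5.0, 0.0, 0.25, 1.0]
print(all(fmax_zero(a) == fsat(a) for a in bounded))

# Without the bound the rewrite would change the result.
print(fmax_zero(2.0), fsat(2.0))
```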

Note: this helps the Aztec Ruins shader that was hurt for spills and
fills on Broadwell in the previous commit, but it does not fix the
spills or fills. :(

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14528985 -> 14526116 (-0.02%)
instructions in affected programs: 477300 -> 474431 (-0.60%)
helped: 2332
HURT: 0
helped stats (abs) min: 1 max: 18 x̄: 1.23 x̃: 1
helped stats (rel) min: 0.07% max: 8.89% x̄: 0.88% x̃: 0.64%
95% mean confidence interval for instructions value: -1.27 -1.19
95% mean confidence interval for instructions %-change: -0.92% -0.85%
Instructions are helped.

total cycles in shared programs: 203723684 -> 203692984 (-0.02%)
cycles in affected programs: 4878847 -> 4848147 (-0.63%)
helped: 1764
HURT: 324
helped stats (abs) min: 1 max: 706 x̄: 22.94 x̃: 17
helped stats (rel) min: <.01% max: 17.75% x̄: 1.94% x̃: 1.66%
HURT stats (abs)   min: 1 max: 400 x̄: 30.15 x̃: 10
HURT stats (rel)   min: <.01% max: 17.76% x̄: 1.91% x̃: 0.69%
95% mean confidence interval for cycles value: -16.55 -12.86
95% mean confidence interval for cycles %-change: -1.44% -1.24%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Ian Romanick 62795475e8 nir/algebraic: Distribute source modifiers into instructions
There are three main classes of cases that are helped by this change:

1. When the negation is applied to a value being type converted (e.g.,
   float(-x)).  This could possibly also be handled with more clever
   code generation.

2. When the negation is applied to a phi node source (e.g., x = -(...);
   at the end of a basic block).  This was the original case that caught
   my attention while looking at shader-db dumps.

3. When the negation is applied to the source of an instruction that
   cannot have source modifiers.  This includes texture instructions and
   math box instructions on pre-Gen7 platforms (see more details below).

In many of these cases the negation can be propagated into the instructions
that generate the value (e.g., -(a*b) = (-a)*b).
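
The -(a*b) = (-a)*b identity can be checked bit-for-bit on ordinary floating-point values. The sketch below uses Python doubles as a stand-in and deliberately avoids NaN-producing inputs; it says nothing about how the patterns are written in the patch.

```python
import struct
from itertools import product

def bits(value):
    """Raw 64-bit encoding of a double, so that -0.0 and 0.0 differ."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]

samples = [0.0, -0.0, 1.5, -2.25, 1e-300, 3.0e10]

# Negating the product and negating one factor give identical encodings,
# because multiplication computes the sign independently of the magnitude.
identical = all(
    bits(-(a * b)) == bits((-a) * b) for a, b in product(samples, repeat=2)
)
print(identical)
```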

In addition to the operations implemented in this patch, I also tried:

 - frcp - Helped 6 or fewer shaders on Gen7+, and hurt just as many on
   pre-Gen7.  On Gen6 and earlier, frcp is a math box instruction, and
   math box instructions cannot have source modifiers.

   I suspect this is why so many more shaders are helped on Gen6 than on
   Gen5 or Gen7.  Gen6 supports OpenGL 3.3, so a lot more shaders
   compile on it.  A lot of these shaders may have things like cos(-x)
   or rcp(-x) that could result in an explicit negation instruction.

 - bcsel - Hurt a few shaders with none helped.  bcsel operates on
   integer sources, so the fabs or fneg cannot be a source modifier in
   the bcsel itself.

 - Integer instructions - No changes on any Intel platform.

Some notes about the shader-db results below.

 - On Tiger Lake, a single Deus Ex fragment shader is hurt for both
   spills and fills.

 - On Haswell, a different Deus Ex fragment shader is hurt for both
   spills and fills.

 - On GM45, the "LOST: 1" and "GAINED: 1" is a single Left4Dead 2
   (very high graphics settings, lol) fragment shader that upgrades
   from SIMD8 to SIMD16.

v2: Add support for fsign.  Add some patterns that remove redundant
negations and redundant absolute value rather than trying to push them
down the tree.

Tiger Lake
total instructions in shared programs: 17611333 -> 17586465 (-0.14%)
instructions in affected programs: 3033734 -> 3008866 (-0.82%)
helped: 10310
HURT: 632
helped stats (abs) min: 1 max: 35 x̄: 2.61 x̃: 1
helped stats (rel) min: 0.04% max: 16.67% x̄: 1.43% x̃: 1.01%
HURT stats (abs)   min: 1 max: 47 x̄: 3.21 x̃: 2
HURT stats (rel)   min: 0.04% max: 5.08% x̄: 0.88% x̃: 0.63%
95% mean confidence interval for instructions value: -2.33 -2.21
95% mean confidence interval for instructions %-change: -1.32% -1.27%
Instructions are helped.

total cycles in shared programs: 338365223 -> 338262252 (-0.03%)
cycles in affected programs: 125291811 -> 125188840 (-0.08%)
helped: 5224
HURT: 2031
helped stats (abs) min: 1 max: 5670 x̄: 46.73 x̃: 12
helped stats (rel) min: <.01% max: 34.78% x̄: 1.91% x̃: 0.97%
HURT stats (abs)   min: 1 max: 2882 x̄: 69.50 x̃: 14
HURT stats (rel)   min: <.01% max: 44.93% x̄: 2.35% x̃: 0.74%
95% mean confidence interval for cycles value: -18.71 -9.68
95% mean confidence interval for cycles %-change: -0.80% -0.63%
Cycles are helped.

total spills in shared programs: 8942 -> 8946 (0.04%)
spills in affected programs: 8 -> 12 (50.00%)
helped: 0
HURT: 1

total fills in shared programs: 9399 -> 9401 (0.02%)
fills in affected programs: 21 -> 23 (9.52%)
helped: 0
HURT: 1

Ice Lake
total instructions in shared programs: 16124348 -> 16102258 (-0.14%)
instructions in affected programs: 2830928 -> 2808838 (-0.78%)
helped: 11294
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.96 x̃: 1
helped stats (rel) min: 0.07% max: 17.65% x̄: 1.32% x̃: 0.93%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 3.45% max: 4.00% x̄: 3.72% x̃: 3.72%
95% mean confidence interval for instructions value: -1.99 -1.93
95% mean confidence interval for instructions %-change: -1.34% -1.29%
Instructions are helped.

total cycles in shared programs: 335393932 -> 335325794 (-0.02%)
cycles in affected programs: 123834609 -> 123766471 (-0.06%)
helped: 5034
HURT: 2128
helped stats (abs) min: 1 max: 3256 x̄: 43.39 x̃: 11
helped stats (rel) min: <.01% max: 35.79% x̄: 1.98% x̃: 1.00%
HURT stats (abs)   min: 1 max: 2634 x̄: 70.63 x̃: 16
HURT stats (rel)   min: <.01% max: 49.49% x̄: 2.73% x̃: 0.62%
95% mean confidence interval for cycles value: -13.66 -5.37
95% mean confidence interval for cycles %-change: -0.69% -0.48%
Cycles are helped.

LOST:   0
GAINED: 2

Skylake
total instructions in shared programs: 14949240 -> 14927930 (-0.14%)
instructions in affected programs: 2594756 -> 2573446 (-0.82%)
helped: 11000
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.94 x̃: 1
helped stats (rel) min: 0.07% max: 18.75% x̄: 1.39% x̃: 0.94%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 4.76% max: 4.76% x̄: 4.76% x̃: 4.76%
95% mean confidence interval for instructions value: -1.97 -1.91
95% mean confidence interval for instructions %-change: -1.42% -1.37%
Instructions are helped.

total cycles in shared programs: 324829346 -> 324821596 (<.01%)
cycles in affected programs: 121566087 -> 121558337 (<.01%)
helped: 4611
HURT: 2147
helped stats (abs) min: 1 max: 3715 x̄: 33.29 x̃: 10
helped stats (rel) min: <.01% max: 36.08% x̄: 1.94% x̃: 1.00%
HURT stats (abs)   min: 1 max: 2551 x̄: 67.88 x̃: 16
HURT stats (rel)   min: <.01% max: 53.79% x̄: 3.69% x̃: 0.89%
95% mean confidence interval for cycles value: -4.25 1.96
95% mean confidence interval for cycles %-change: -0.28% -0.02%
Inconclusive result (value mean confidence interval includes 0).

Broadwell
total instructions in shared programs: 14971203 -> 14949957 (-0.14%)
instructions in affected programs: 2635699 -> 2614453 (-0.81%)
helped: 10982
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.93 x̃: 1
helped stats (rel) min: 0.07% max: 18.75% x̄: 1.39% x̃: 0.94%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 4.76% max: 4.76% x̄: 4.76% x̃: 4.76%
95% mean confidence interval for instructions value: -1.97 -1.90
95% mean confidence interval for instructions %-change: -1.42% -1.37%
Instructions are helped.

total cycles in shared programs: 336215033 -> 336086458 (-0.04%)
cycles in affected programs: 127383198 -> 127254623 (-0.10%)
helped: 4884
HURT: 1963
helped stats (abs) min: 1 max: 25696 x̄: 51.78 x̃: 12
helped stats (rel) min: <.01% max: 58.28% x̄: 2.00% x̃: 1.05%
HURT stats (abs)   min: 1 max: 3401 x̄: 63.33 x̃: 16
HURT stats (rel)   min: <.01% max: 39.95% x̄: 2.20% x̃: 0.70%
95% mean confidence interval for cycles value: -29.99 -7.57
95% mean confidence interval for cycles %-change: -0.89% -0.71%
Cycles are helped.

total fills in shared programs: 24905 -> 24901 (-0.02%)
fills in affected programs: 117 -> 113 (-3.42%)
helped: 4
HURT: 0

LOST:   0
GAINED: 16

Haswell
total instructions in shared programs: 13148927 -> 13131528 (-0.13%)
instructions in affected programs: 2220941 -> 2203542 (-0.78%)
helped: 8017
HURT: 4
helped stats (abs) min: 1 max: 12 x̄: 2.17 x̃: 1
helped stats (rel) min: 0.07% max: 15.25% x̄: 1.40% x̃: 0.93%
HURT stats (abs)   min: 1 max: 7 x̄: 2.50 x̃: 1
HURT stats (rel)   min: 0.33% max: 4.76% x̄: 2.73% x̃: 2.91%
95% mean confidence interval for instructions value: -2.21 -2.13
95% mean confidence interval for instructions %-change: -1.43% -1.37%
Instructions are helped.

total cycles in shared programs: 321221791 -> 321079870 (-0.04%)
cycles in affected programs: 126886055 -> 126744134 (-0.11%)
helped: 4674
HURT: 1729
helped stats (abs) min: 1 max: 23654 x̄: 56.47 x̃: 16
helped stats (rel) min: <.01% max: 53.22% x̄: 2.13% x̃: 1.05%
HURT stats (abs)   min: 1 max: 3694 x̄: 70.58 x̃: 18
HURT stats (rel)   min: <.01% max: 63.06% x̄: 2.48% x̃: 0.90%
95% mean confidence interval for cycles value: -33.31 -11.02
95% mean confidence interval for cycles %-change: -0.99% -0.78%
Cycles are helped.

total spills in shared programs: 19872 -> 19874 (0.01%)
spills in affected programs: 21 -> 23 (9.52%)
helped: 0
HURT: 1

total fills in shared programs: 20941 -> 20941 (0.00%)
fills in affected programs: 62 -> 62 (0.00%)
helped: 1
HURT: 1

LOST:   0
GAINED: 8

Ivy Bridge
total instructions in shared programs: 11875553 -> 11853839 (-0.18%)
instructions in affected programs: 1553112 -> 1531398 (-1.40%)
helped: 7304
HURT: 3
helped stats (abs) min: 1 max: 16 x̄: 2.97 x̃: 2
helped stats (rel) min: 0.07% max: 15.25% x̄: 1.62% x̃: 1.15%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.05% max: 3.33% x̄: 2.44% x̃: 2.94%
95% mean confidence interval for instructions value: -3.04 -2.90
95% mean confidence interval for instructions %-change: -1.65% -1.59%
Instructions are helped.

total cycles in shared programs: 178246425 -> 178184484 (-0.03%)
cycles in affected programs: 13702146 -> 13640205 (-0.45%)
helped: 4409
HURT: 1566
helped stats (abs) min: 1 max: 531 x̄: 24.52 x̃: 13
helped stats (rel) min: <.01% max: 38.67% x̄: 2.14% x̃: 1.02%
HURT stats (abs)   min: 1 max: 356 x̄: 29.48 x̃: 10
HURT stats (rel)   min: <.01% max: 64.73% x̄: 1.87% x̃: 0.70%
95% mean confidence interval for cycles value: -11.60 -9.14
95% mean confidence interval for cycles %-change: -1.19% -0.99%
Cycles are helped.

LOST:   0
GAINED: 10

Sandy Bridge
total instructions in shared programs: 10695740 -> 10667483 (-0.26%)
instructions in affected programs: 2337607 -> 2309350 (-1.21%)
helped: 10720
HURT: 1
helped stats (abs) min: 1 max: 49 x̄: 2.64 x̃: 2
helped stats (rel) min: 0.07% max: 20.00% x̄: 1.54% x̃: 1.13%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.04% max: 1.04% x̄: 1.04% x̃: 1.04%
95% mean confidence interval for instructions value: -2.69 -2.58
95% mean confidence interval for instructions %-change: -1.57% -1.51%
Instructions are helped.

total cycles in shared programs: 153478839 -> 153416223 (-0.04%)
cycles in affected programs: 22050900 -> 21988284 (-0.28%)
helped: 5342
HURT: 2200
helped stats (abs) min: 1 max: 1020 x̄: 20.34 x̃: 16
helped stats (rel) min: <.01% max: 24.05% x̄: 1.51% x̃: 0.86%
HURT stats (abs)   min: 1 max: 335 x̄: 20.93 x̃: 6
HURT stats (rel)   min: <.01% max: 20.18% x̄: 1.03% x̃: 0.30%
95% mean confidence interval for cycles value: -9.18 -7.42
95% mean confidence interval for cycles %-change: -0.82% -0.71%
Cycles are helped.

Iron Lake
total instructions in shared programs: 8114882 -> 8105574 (-0.11%)
instructions in affected programs: 1232504 -> 1223196 (-0.76%)
helped: 4109
HURT: 2
helped stats (abs) min: 1 max: 6 x̄: 2.27 x̃: 1
helped stats (rel) min: 0.05% max: 8.33% x̄: 0.99% x̃: 0.66%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.94% max: 4.35% x̄: 2.65% x̃: 2.65%
95% mean confidence interval for instructions value: -2.31 -2.21
95% mean confidence interval for instructions %-change: -1.01% -0.96%
Instructions are helped.

total cycles in shared programs: 188504036 -> 188466296 (-0.02%)
cycles in affected programs: 31203798 -> 31166058 (-0.12%)
helped: 3447
HURT: 36
helped stats (abs) min: 2 max: 92 x̄: 11.03 x̃: 8
helped stats (rel) min: <.01% max: 5.41% x̄: 0.21% x̃: 0.13%
HURT stats (abs)   min: 2 max: 30 x̄: 7.33 x̃: 6
HURT stats (rel)   min: 0.01% max: 1.65% x̄: 0.18% x̃: 0.10%
95% mean confidence interval for cycles value: -11.16 -10.51
95% mean confidence interval for cycles %-change: -0.22% -0.20%
Cycles are helped.

LOST:   0
GAINED: 1

GM45
total instructions in shared programs: 4989697 -> 4984531 (-0.10%)
instructions in affected programs: 703952 -> 698786 (-0.73%)
helped: 2493
HURT: 2
helped stats (abs) min: 1 max: 6 x̄: 2.07 x̃: 1
helped stats (rel) min: 0.05% max: 8.33% x̄: 1.03% x̃: 0.66%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.95% max: 4.35% x̄: 2.65% x̃: 2.65%
95% mean confidence interval for instructions value: -2.13 -2.01
95% mean confidence interval for instructions %-change: -1.07% -0.99%
Instructions are helped.

total cycles in shared programs: 128929136 -> 128903886 (-0.02%)
cycles in affected programs: 21583096 -> 21557846 (-0.12%)
helped: 2214
HURT: 17
helped stats (abs) min: 2 max: 92 x̄: 11.44 x̃: 8
helped stats (rel) min: <.01% max: 5.41% x̄: 0.24% x̃: 0.13%
HURT stats (abs)   min: 2 max: 8 x̄: 4.24 x̃: 4
HURT stats (rel)   min: 0.01% max: 1.65% x̄: 0.20% x̃: 0.09%
95% mean confidence interval for cycles value: -11.75 -10.88
95% mean confidence interval for cycles %-change: -0.25% -0.22%
Cycles are helped.

LOST:   1
GAINED: 1

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Jason Ekstrand 84ab61160a nir/algebraic: Add downcast-of-pack opts
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4365>
2020-03-31 00:18:05 +00:00
Jason Ekstrand b2db84153a nir: Add b2b opcodes
These exist to convert between different types of boolean values.  In
particular, we want to use these for uniform and shared memory
operations where we need to convert to a reasonably sized boolean but we
don't care what its format is so we don't want to make the back-end
insert an actual i2b/b2i.  In the case of uniforms, Mesa can tweak the
format of the uniform boolean to whatever the driver wants.  In the case
of shared, every value in a shared variable comes from the shader so
it's already in the right boolean format.

The new boolean conversion opcodes get replaced with mov in
lower_bool_to_int/float32 so the back-end will hopefully never see them.
However, while we're in the middle of optimizing our NIR, they let us
have sensible load_uniform/ubo intrinsics and also have the bit size
conversion.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>
2020-03-30 15:46:19 +00:00
Samuel Pitoiset 3935a729d9 nir/algebraic: add fexp2(fmul(flog2(a), 0.5)) -> fsqrt(a) optimization
Helps some Wolfenstein II and Wolfenstein Youngblood shaders.
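
The rewrite relies on the identity 2^(0.5 * log2(a)) = sqrt(a) for a > 0. A minimal sketch of a numeric check follows. The `pattern` tuple only imitates the search/replace style of nir_opt_algebraic.py, and its exact spelling here is an assumption (the real rule may carry an inexact marker or other conditions). `before` and `after` are hypothetical helper names.

```python
import math

# Search/replace pair in the tuple style of nir_opt_algebraic.py.
# Assumption: illustrative only; the rule in the commit may be spelled differently.
pattern = (('fexp2', ('fmul', ('flog2', 'a'), 0.5)), ('fsqrt', 'a'))

def before(a: float) -> float:
    """Evaluate the matched expression: exp2(log2(a) * 0.5)."""
    return 2.0 ** (math.log2(a) * 0.5)

def after(a: float) -> float:
    """Evaluate the replacement expression: sqrt(a)."""
    return math.sqrt(a)

# The two forms agree up to floating-point rounding for positive inputs.
samples = [0.25, 1.0, 2.0, 9.0, 1e6]
for a in samples:
    assert math.isclose(before(a), after(a), rel_tol=1e-9)
```

The results match only up to rounding, which is why a rewrite of this kind is normally limited to expressions that tolerate inexact floating-point transformations.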

pipeline-db (VEGA10/ACO):
Totals from affected shaders:
SGPRS: 17904 -> 17904 (0.00 %)
VGPRS: 14492 -> 14492 (0.00 %)
Spilled SGPRs: 20 -> 20 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1753152 -> 1749708 (-0.20 %) bytes
Max Waves: 2581 -> 2581 (0.00 %)

pipeline-db (VEGA10/LLVM):
Totals from affected shaders:
SGPRS: 26656 -> 26656 (0.00 %)
VGPRS: 23780 -> 23780 (0.00 %)
Spilled SGPRs: 2112 -> 2112 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 2552712 -> 2549236 (-0.14 %) bytes
Max Waves: 3359 -> 3359 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4353>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4353>
2020-03-30 14:07:43 +00:00
Ian Romanick b421c0466d soft-fp64/flt: Perform checks in a different order
The change to nir_opt_algebraic cleans up a pattern that was never
produced before the rest of this commit was added.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 843005 -> 841666 (-0.16%)
instructions in affected programs: 460655 -> 459316 (-0.29%)
helped: 64
HURT: 17
helped stats (abs) min: 1 max: 72 x̄: 21.72 x̃: 20
helped stats (rel) min: 0.01% max: 28.07% x̄: 12.67% x̃: 16.07%
HURT stats (abs)   min: 1 max: 7 x̄: 3.00 x̃: 2
HURT stats (rel)   min: 0.01% max: 0.04% x̄: 0.02% x̃: 0.02%
95% mean confidence interval for instructions value: -20.87 -12.19
95% mean confidence interval for instructions %-change: -12.35% -7.66%
Instructions are helped.
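
For readers unfamiliar with the report format, the percentage in parentheses is the relative change from the first count to the second. A small sketch reproducing the two instruction lines above (`pct_change` is a hypothetical helper, not part of shader-db):

```python
def pct_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# Figures taken from the Tiger Lake instruction totals above.
shared = pct_change(843005, 841666)    # all shaders in the run
affected = pct_change(460655, 459316)  # only shaders whose count changed

print(f"{shared:.2f}%")    # -0.16%
print(f"{affected:.2f}%")  # -0.29%
```

The absolute saving is the same in both lines (1339 instructions). Only the denominator differs, so the "affected programs" percentage is always at least as large in magnitude.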

total cycles in shared programs: 6944998 -> 6927246 (-0.26%)
cycles in affected programs: 3891872 -> 3874120 (-0.46%)
helped: 71
HURT: 10
helped stats (abs) min: 2 max: 772 x̄: 254.21 x̃: 156
helped stats (rel) min: <.01% max: 66.44% x̄: 21.72% x̃: 18.40%
HURT stats (abs)   min: 18 max: 69 x̄: 29.70 x̃: 20
HURT stats (rel)   min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -270.82 -167.50
95% mean confidence interval for cycles %-change: -24.41% -13.65%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 4e3d69ad07 nir/algebraic: Simplify a contradiction that can occur in __flt64_nonnan
The pattern is added to opt_algebraic because, for example, comparisons
with constant 0.0 will produce (a1 < 0).

Even with a pass that optimized Boolean expressions, I think this would
be very difficult to automatically recognize and optimize.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 933054 -> 929619 (-0.37%)
instructions in affected programs: 784041 -> 780606 (-0.44%)
helped: 59
HURT: 0
helped stats (abs) min: 2 max: 213 x̄: 58.22 x̃: 44
helped stats (rel) min: 0.02% max: 2.51% x̄: 0.72% x̃: 0.46%
95% mean confidence interval for instructions value: -70.80 -45.64
95% mean confidence interval for instructions %-change: -0.92% -0.53%
Instructions are helped.

total cycles in shared programs: 7304712 -> 7280180 (-0.34%)
cycles in affected programs: 7176260 -> 7151728 (-0.34%)
helped: 92
HURT: 0
helped stats (abs) min: 8 max: 1414 x̄: 266.65 x̃: 166
helped stats (rel) min: 0.04% max: 2.34% x̄: 0.43% x̃: 0.22%
95% mean confidence interval for cycles value: -333.05 -200.26
95% mean confidence interval for cycles %-change: -0.54% -0.31%
Cycles are helped.

Regular shader-db changes:

No changes on any Intel platform.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick e0cefc5a23 nir/algebraic: Constant reassociation for bitwise operations too
Like 5886cd79a0, but for iand, ior, and ixor.
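
A minimal sketch of why this is safe, assuming the rules have the shape op(#a, op(b, #c)) -> op(op(a, c), b), where a and c are constants and b is not. That shape is an assumption modelled on the referenced commit, not copied from it. Because iand, ior, and ixor are associative and commutative, the two constants can be grouped and folded at compile time:

```python
import random

MASK32 = 0xFFFFFFFF  # model 32-bit integer values

OPS = {
    'iand': lambda x, y: x & y,
    'ior':  lambda x, y: x | y,
    'ixor': lambda x, y: x ^ y,
}

def original(op, a, b, c):
    """op(const a, op(variable b, const c)): two operations at run time."""
    return op(a, op(b, c))

def reassociated(op, a, b, c):
    """op(op(a, c), b): op(a, c) involves constants only and can be folded."""
    return op(op(a, c), b)

rng = random.Random(0)
for name, op in OPS.items():
    for _ in range(1000):
        a, b, c = (rng.getrandbits(32) for _ in range(3))
        assert original(op, a, b, c) == reassociated(op, a, b, c), name
```

Unlike the floating-point case, the integer bitwise forms are exactly equal, so no inexactness condition is needed.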

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake
total instructions in shared programs: 903108 -> 902830 (-0.03%)
instructions in affected programs: 654910 -> 654632 (-0.04%)
helped: 31
HURT: 5
helped stats (abs) min: 2 max: 31 x̄: 9.58 x̃: 7
helped stats (rel) min: 0.01% max: 0.23% x̄: 0.06% x̃: 0.04%
HURT stats (abs)   min: 1 max: 10 x̄: 3.80 x̃: 3
HURT stats (rel)   min: 0.01% max: 0.10% x̄: 0.03% x̃: 0.02%
95% mean confidence interval for instructions value: -10.55 -4.89
95% mean confidence interval for instructions %-change: -0.07% -0.03%
Instructions are helped.

total cycles in shared programs: 7059681 -> 7058006 (-0.02%)
cycles in affected programs: 5081309 -> 5079634 (-0.03%)
helped: 33
HURT: 12
helped stats (abs) min: 1 max: 444 x̄: 60.91 x̃: 18
helped stats (rel) min: <.01% max: 2.17% x̄: 0.25% x̃: 0.05%
HURT stats (abs)   min: 1 max: 288 x̄: 27.92 x̃: 2
HURT stats (rel)   min: <.01% max: 1.00% x̄: 0.23% x̃: 0.02%
95% mean confidence interval for cycles value: -68.32 -6.12
95% mean confidence interval for cycles %-change: -0.28% 0.03%
Inconclusive result (%-change mean confidence interval includes 0).

Ice Lake
total instructions in shared programs: 895384 -> 895159 (-0.03%)
instructions in affected programs: 658678 -> 658453 (-0.03%)
helped: 37
HURT: 0
helped stats (abs) min: 3 max: 16 x̄: 6.08 x̃: 4
helped stats (rel) min: <.01% max: 0.07% x̄: 0.04% x̃: 0.04%
95% mean confidence interval for instructions value: -7.46 -4.70
95% mean confidence interval for instructions %-change: -0.04% -0.03%
Instructions are helped.

total cycles in shared programs: 7092224 -> 7091195 (-0.01%)
cycles in affected programs: 5221666 -> 5220637 (-0.02%)
helped: 35
HURT: 11
helped stats (abs) min: 1 max: 247 x̄: 43.46 x̃: 12
helped stats (rel) min: <.01% max: 2.17% x̄: 0.23% x̃: 0.05%
HURT stats (abs)   min: 2 max: 432 x̄: 44.73 x̃: 5
HURT stats (rel)   min: <.01% max: 1.00% x̄: 0.25% x̃: 0.02%
95% mean confidence interval for cycles value: -49.00 4.26
95% mean confidence interval for cycles %-change: -0.27% 0.03%
Inconclusive result (value mean confidence interval includes 0).
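
The verdict lines in these reports follow from the two 95% confidence intervals. The sketch below is a rule inferred from the output shown here, not the actual shader-db implementation. `classify` and its fallback messages are hypothetical.

```python
def classify(value_ci, pct_ci, metric):
    """Verdict from the (low, high) confidence intervals of the mean change.

    Assumption: the value interval is checked before the %-change interval,
    which matches the two inconclusive results reported above.
    """
    v_lo, v_hi = value_ci
    p_lo, p_hi = pct_ci
    if v_lo <= 0 <= v_hi:
        return "Inconclusive result (value mean confidence interval includes 0)."
    if p_lo <= 0 <= p_hi:
        return "Inconclusive result (%-change mean confidence interval includes 0)."
    if v_hi < 0 and p_hi < 0:
        return f"{metric} are helped."
    if v_lo > 0 and p_lo > 0:
        return f"{metric} are HURT."
    # Hypothetical wording for intervals that point in opposite directions.
    return "Inconclusive result (value and %-change disagree)."

# Tiger Lake cycles: only the %-change interval straddles zero.
print(classify((-68.32, -6.12), (-0.28, 0.03), "Cycles"))
# Ice Lake cycles: the value interval already straddles zero.
print(classify((-49.00, 4.26), (-0.27, 0.03), "Cycles"))
# Ice Lake instructions: both intervals lie below zero.
print(classify((-7.46, -4.70), (-0.04, -0.03), "Instructions"))
```

An interval that contains zero means the mean change cannot be distinguished from no change at the 95% level, even when the totals moved slightly.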

Regular shader-db results:

All Haswell+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611408 -> 17611398 (<.01%)
instructions in affected programs: 1648 -> 1638 (-0.61%)
helped: 2
HURT: 0

total cycles in shared programs: 338366148 -> 338366124 (<.01%)
cycles in affected programs: 124048 -> 124024 (-0.02%)
helped: 2
HURT: 0

No changes on any earlier Intel platforms.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00