KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Eric Engestrom	f1eae2f8bb	python: drop python2 support Signed-off-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3674>	2021-08-14 21:44:32 +00:00
Ian Romanick	84d2e53789	Revert "nir/algebraic: Convert some f2u to f2i" Per https://gitlab.freedesktop.org/mesa/mesa/-/issues/5178#note_1019666, the assumption fundamental to this optimization is false. Section 2.4.1 (Float to Integer) of Ivy Bridge PRMs describes the situation. The wording of the section is somewhat confusing (because it doesn't clearly delineate between signed and unsigned integers), but the last two rows of the table make it clear that F->UD conversion clamps negative float values to 0. All other hardware mentioned in that thread seems to behave the same way. The real problem is that, with hardware that behaves in this way, converting f2u(2147483648.0) to f2i(2147483648.0) changes the bit pattern that would be produced from 0x80000000 to 0x7fffffff. This reverts commit `ad05920258`. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12297>	2021-08-10 22:16:13 +00:00
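The bit-pattern change described in the revert above (84d2e53789) can be reproduced with a small model. This is only a sketch: it assumes both conversions saturate to the range of their destination type, as the commit message describes for the hardware in question. The helper names `f2u32` and `f2i32` are hypothetical and are not NIR or driver functions.

```python
def f2u32(x: float) -> int:
    """Model a float -> uint32 conversion that clamps to [0, 2**32 - 1].

    NaN inputs are not modelled here.
    """
    return int(min(max(x, 0.0), float(2**32 - 1)))


def f2i32(x: float) -> int:
    """Model a float -> int32 conversion that clamps to [-2**31, 2**31 - 1].

    NaN inputs are not modelled here.
    """
    return int(min(max(x, float(-(2**31))), float(2**31 - 1)))


def bits32(v: int) -> str:
    """Show the 32-bit two's-complement pattern of an integer result."""
    return f"0x{v & 0xFFFFFFFF:08x}"


if __name__ == "__main__":
    x = 2147483648.0  # 2**31, representable as uint32 but not as int32
    print(bits32(f2u32(x)))  # unsigned conversion keeps the value
    print(bits32(f2i32(x)))  # signed conversion saturates at INT_MAX
```

Under these assumptions the unsigned conversion yields 0x80000000 while the signed one saturates to 0x7fffffff, so replacing f2u with f2i is not a bit-exact rewrite.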
Rhys Perry	4e2b94331b	nir/algebraic: improve irem by power-of-two optimization Requires one less instruction. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
Rhys Perry	b009467b81	nir/algebraic: add optimizations for imul(a, INT_MIN) is_pos_power_of_two would catch this, but nir_op_imul has signed sources, so is_neg_power_of_two catches it instead, which creates a useless nir_op_ineg. fossil-db (Sienna Cichlid): Totals from 1014 (0.68% of 150170) affected shaders: CodeSize: 3592296 -> 3592288 (-0.00%); split: -0.00%, +0.00% Instrs: 671211 -> 670426 (-0.12%) Latency: 5268917 -> 5268479 (-0.01%); split: -0.01%, +0.00% InvThroughput: 2187349 -> 2187343 (-0.00%); split: -0.00%, +0.00% VClause: 8634 -> 8636 (+0.02%) Copies: 97585 -> 97604 (+0.02%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
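As a rough illustration of why the negation mentioned in the commit above (b009467b81) is useless, the sketch below models 32-bit wrapping arithmetic in Python. The helpers `imul32`, `ishl32`, and `ineg32` are hypothetical stand-ins, not the NIR opcodes themselves, and the shift-by-31 form is an assumption about what a multiply by INT_MIN reduces to rather than a quote of the actual pattern.

```python
MASK32 = 0xFFFFFFFF
INT_MIN = -(2**31)


def imul32(a: int, b: int) -> int:
    """32-bit wrapping multiply, result as an unsigned bit pattern."""
    return (a * b) & MASK32


def ishl32(a: int, n: int) -> int:
    """32-bit wrapping left shift, result as an unsigned bit pattern."""
    return (a << n) & MASK32


def ineg32(a: int) -> int:
    """32-bit wrapping negation, result as an unsigned bit pattern."""
    return (-a) & MASK32


if __name__ == "__main__":
    for a in (0, 1, 2, 3, -1, -7, 12345):
        shifted = ishl32(a, 31)
        # a * INT_MIN only depends on the low bit of a, so it equals a << 31,
        # and negating that value (0 or 0x80000000) changes nothing.
        print(a, hex(imul32(a, INT_MIN)), hex(shifted), hex(ineg32(shifted)))
```

In this model the product is either 0 or 0x80000000, and both values are their own 32-bit negation, which is why an extra negate adds an instruction without changing the result.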
Rhys Perry	65cd5a0f22	nir/algebraic: don't optimize umod/imod/irem if lower_bitops=true Match the udiv/idiv/imul by power-of-two optimizations. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
Rhys Perry	ec4b425f59	nir/algebraic: fix imod by negative power-of-two If "a" is a multiple of "b", then the result would have been "b" instead of 0. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `0ef5f3552f` ("nir: add strength reduction pattern for imod/irem with pow2 divisor.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
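The bug fixed in the commit above (ec4b425f59) can be sketched in a few lines. This assumes the strength-reduced form of imod by a negative power of two is a bitwise OR of the operands, which is consistent with the commit message (a multiple of "b" yields "b" instead of 0) but is not quoted from the patch. The function names are hypothetical. Python's `%` is used as the reference because, like imod, its result takes the sign of the divisor.

```python
def imod_neg_pow2_naive(a: int, b: int) -> int:
    """Assumed pre-fix reduction: OR the dividend with the negative divisor."""
    return a | b


def imod_neg_pow2_fixed(a: int, b: int) -> int:
    """Same reduction, but map the 'result == b' case back to 0."""
    r = a | b
    return 0 if r == b else r


if __name__ == "__main__":
    b = -4
    for a in (5, 7, 8, -5, -8):
        # Columns: a, reference a % b, naive result, fixed result
        print(a, a % b, imod_neg_pow2_naive(a, b), imod_neg_pow2_fixed(a, b))
```

For a = 8 and b = -4 the naive form returns -4 where the reference gives 0; the extra comparison restores agreement for every multiple of "b".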
Dave Airlie	ad92c2b253	nir: add fisnormal lowering just lower the 32-bit version for now. Thanks to alyssa for this suggested lowering. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12207>	2021-08-06 14:27:48 +10:00
Sagar Ghuge	06ab737686	nir: Add optimizations for iadd3 This patch also adds has_iadd3 bit to give more control if backend supports ternary add instruction or not. v2: - Add patterns in late optimization (Connor Abbott) Suggested-by: Alyssa/Jason Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>	2021-07-16 15:59:56 +00:00
Jason Ekstrand	2e08bae9b3	nir,vc4: Suffix a bunch of unorm 4x8 opcodes _vc4 Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11463>	2021-06-21 09:04:08 -05:00
Rhys Perry	1cbcfb8b38	nir, nir/algebraic: add byte/word insertion instructions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>	2021-06-08 08:57:42 +00:00
Rhys Perry	edae3e5623	nir/algebraic: optimize extract of extract Found in some sottr shaders (originally iand(ishr(a, 16), 0xffff)) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>	2021-06-08 08:57:42 +00:00
Ian Romanick	d246c31ec1	nir/algebraic: Add algebraic opt for float comparisons with identical operands. The flt version could have been added in `56e21647e2`, but our collective understanding of NaN and comparisons was poor in 2015. The new "is_a_number" predicate makes the others possible. All of the helped shaders in shader-db are either from Mad Max or Skia. Some of the Skia shaders just get decimated by this change: instructions helped: shaders/skia/580-4.shader_test FS SIMD8: 81 -> 29 (-64.20%) (scheduled: top-down) I looked at a couple of those shaders, and they had sequences like: vec1 32 ssa_44 = flt32 ssa_32, ssa_32 vec1 32 ssa_45 = b32csel ssa_44, ssa_43, ssa_0 vec1 32 ssa_46 = fge32 ssa_32, ssa_32 vec1 32 ssa_47 = b32csel ssa_46, ssa_0, ssa_45 vec1 32 ssa_48 = iand ssa_46, ssa_44 vec1 32 ssa_49 = b32csel ssa_48, ssa_43, ssa_0 ssa_44 is replaced with False. Then ssa_47 selects between ssa_0 and ssa_0, so ssa_47 and ssa_46 are eliminated. ssa_48 is (False && don't care), so ssa_48 and ssa_49 are eliminated. After that, many calculations now involve constants of zero, so they are optimized down too. So it continues until there's not much left! Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 21072238 -> 21071386 (<.01%) instructions in affected programs: 33722 -> 32870 (-2.53%) helped: 146 HURT: 1 helped stats (abs) min: 1 max: 62 x̄: 5.84 x̃: 2 helped stats (rel) min: 0.19% max: 62.35% x̄: 4.09% x̃: 1.07% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.20% max: 0.20% x̄: 0.20% x̃: 0.20% 95% mean confidence interval for instructions value: -7.94 -3.65 95% mean confidence interval for instructions %-change: -5.87% -2.25% Instructions are helped. 
total cycles in shared programs: 856203326 -> 856192238 (<.01%) cycles in affected programs: 749966 -> 738878 (-1.48%) helped: 148 HURT: 0 helped stats (abs) min: 1 max: 1226 x̄: 74.92 x̃: 18 helped stats (rel) min: 0.07% max: 49.70% x̄: 2.69% x̃: 0.46% 95% mean confidence interval for cycles value: -104.82 -45.02 95% mean confidence interval for cycles %-change: -4.01% -1.37% Cycles are helped. LOST: 4 GAINED: 0 Fossil-db results: Tiger Lake Instructions in all programs: 160915223 -> 160898354 (-0.0%) SENDs in all programs: 6812780 -> 6812780 (+0.0%) Loops in all programs: 38340 -> 38340 (+0.0%) Cycles in all programs: 7434144207 -> 7433978462 (-0.0%) Spills in all programs: 192582 -> 192582 (+0.0%) Fills in all programs: 304537 -> 304537 (+0.0%) Ice Lake Instructions in all programs: 145296298 -> 145279531 (-0.0%) SENDs in all programs: 6863692 -> 6863692 (+0.0%) Loops in all programs: 38334 -> 38334 (+0.0%) Cycles in all programs: 8800257014 -> 8800088384 (-0.0%) Spills in all programs: 216880 -> 216880 (+0.0%) Fills in all programs: 334248 -> 334248 (+0.0%) Skylake Instructions in all programs: 135891664 -> 135874910 (-0.0%) SENDs in all programs: 6802946 -> 6802946 (+0.0%) Loops in all programs: 38331 -> 38331 (+0.0%) Cycles in all programs: 8444273433 -> 8444130932 (-0.0%) Spills in all programs: 194839 -> 194839 (+0.0%) Fills in all programs: 301114 -> 301114 (+0.0%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
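The role of the "is_a_number" predicate in the commit above (d246c31ec1) follows from IEEE 754 comparison rules, which Python floats share. The sketch below is an illustration of those rules, not the actual patterns from the patch; `flt`, `fge`, and `is_a_number` are hypothetical Python stand-ins for the NIR names.

```python
import math


def flt(a: float, b: float) -> bool:
    """Ordered less-than: false whenever either operand is NaN."""
    return a < b


def fge(a: float, b: float) -> bool:
    """Ordered greater-or-equal: false whenever either operand is NaN."""
    return a >= b


def is_a_number(x: float) -> bool:
    """True when x is not NaN (infinities count as numbers)."""
    return not math.isnan(x)


if __name__ == "__main__":
    for x in (0.0, -1.5, math.inf, math.nan):
        # flt(x, x) is False for every input, so it can always fold to False.
        # fge(x, x) is True only when x is a number, so folding it to True
        # is valid only under an is_a_number guarantee.
        print(x, flt(x, x), fge(x, x), is_a_number(x))
```

This matches the example in the commit message, where ssa_44 (flt32 of a value with itself) is replaced with False unconditionally.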
Ian Romanick	64bcfc3a17	nir/algebraic: Rearrange some logic-joined comparisons and reduce On Skylake and Broadwell, a single big compute shader in Dirt Rally has spills and fills REALLY helped. That same shader is hurt very slightly for spills and fills on Ice Lake. v2: Move the patterns earlier to be nearer other patterns that are similar. Mark the replacement fmin and fmax exact. Both suggested by Rhys. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake total instructions in shared programs: 21073812 -> 21073041 (<.01%) instructions in affected programs: 77608 -> 76837 (-0.99%) helped: 522 HURT: 33 helped stats (abs) min: 1 max: 26 x̄: 1.58 x̃: 1 helped stats (rel) min: 0.22% max: 14.29% x̄: 1.29% x̃: 1.02% HURT stats (abs) min: 1 max: 8 x̄: 1.67 x̃: 1 HURT stats (rel) min: 0.25% max: 3.42% x̄: 1.06% x̃: 0.86% 95% mean confidence interval for instructions value: -1.57 -1.20 95% mean confidence interval for instructions %-change: -1.25% -1.05% Instructions are helped. total cycles in shared programs: 856224346 -> 856211096 (<.01%) cycles in affected programs: 2394231 -> 2380981 (-0.55%) helped: 603 HURT: 25 helped stats (abs) min: 1 max: 5218 x̄: 59.37 x̃: 28 helped stats (rel) min: 0.06% max: 5.61% x̄: 1.52% x̃: 1.37% HURT stats (abs) min: 2 max: 21394 x̄: 901.92 x̃: 10 HURT stats (rel) min: 0.02% max: 5.90% x̄: 0.95% x̃: 0.59% 95% mean confidence interval for cycles value: -93.61 51.41 95% mean confidence interval for cycles %-change: -1.50% -1.34% Inconclusive result (value mean confidence interval includes 0). LOST: 1 GAINED: 1 Ice Lake total instructions in shared programs: 20025692 -> 20024554 (<.01%) instructions in affected programs: 104981 -> 103843 (-1.08%) helped: 738 HURT: 0 helped stats (abs) min: 1 max: 30 x̄: 1.54 x̃: 1 helped stats (rel) min: 0.31% max: 10.53% x̄: 1.20% x̃: 1.06% 95% mean confidence interval for instructions value: -1.66 -1.43 95% mean confidence interval for instructions %-change: -1.26% -1.14% Instructions are helped. 
total cycles in shared programs: 979474407 -> 979422333 (<.01%) cycles in affected programs: 4136364 -> 4084290 (-1.26%) helped: 759 HURT: 59 helped stats (abs) min: 2 max: 11010 x̄: 72.78 x̃: 28 helped stats (rel) min: 0.03% max: 6.43% x̄: 1.23% x̃: 1.02% HURT stats (abs) min: 1 max: 698 x̄: 53.66 x̃: 8 HURT stats (rel) min: 0.02% max: 24.05% x̄: 1.64% x̃: 0.33% 95% mean confidence interval for cycles value: -97.08 -30.24 95% mean confidence interval for cycles %-change: -1.14% -0.91% Cycles are helped. total spills in shared programs: 10568 -> 10569 (<.01%) spills in affected programs: 102 -> 103 (0.98%) helped: 0 HURT: 1 total fills in shared programs: 11347 -> 11349 (0.02%) fills in affected programs: 277 -> 279 (0.72%) helped: 0 HURT: 1 LOST: 2 GAINED: 2 Skylake total instructions in shared programs: 18190419 -> 18188523 (-0.01%) instructions in affected programs: 102502 -> 100606 (-1.85%) helped: 791 HURT: 0 helped stats (abs) min: 1 max: 676 x̄: 2.40 x̃: 1 helped stats (rel) min: 0.34% max: 20.23% x̄: 1.41% x̃: 1.23% 95% mean confidence interval for instructions value: -4.07 -0.72 95% mean confidence interval for instructions %-change: -1.47% -1.34% Instructions are helped. total cycles in shared programs: 960737969 -> 960498951 (-0.02%) cycles in affected programs: 4435351 -> 4196333 (-5.39%) helped: 804 HURT: 67 helped stats (abs) min: 1 max: 198540 x̄: 300.54 x̃: 24 helped stats (rel) min: 0.03% max: 25.41% x̄: 1.21% x̃: 0.92% HURT stats (abs) min: 2 max: 680 x̄: 39.06 x̃: 6 HURT stats (rel) min: 0.05% max: 23.98% x̄: 1.12% x̃: 0.19% 95% mean confidence interval for cycles value: -722.03 173.20 95% mean confidence interval for cycles %-change: -1.15% -0.91% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 9757 -> 9722 (-0.36%) spills in affected programs: 138 -> 103 (-25.36%) helped: 1 HURT: 0 total fills in shared programs: 9861 -> 9576 (-2.89%) fills in affected programs: 564 -> 279 (-50.53%) helped: 1 HURT: 0 LOST: 5 GAINED: 2 Broadwell total instructions in shared programs: 17853870 -> 17852414 (<.01%) instructions in affected programs: 101276 -> 99820 (-1.44%) helped: 777 HURT: 0 helped stats (abs) min: 1 max: 264 x̄: 1.87 x̃: 1 helped stats (rel) min: 0.34% max: 8.44% x̄: 1.37% x̃: 1.23% 95% mean confidence interval for instructions value: -2.54 -1.21 95% mean confidence interval for instructions %-change: -1.42% -1.32% Instructions are helped. total cycles in shared programs: 1029846029 -> 1029725458 (-0.01%) cycles in affected programs: 4435791 -> 4315220 (-2.72%) helped: 813 HURT: 43 helped stats (abs) min: 2 max: 68560 x̄: 149.95 x̃: 24 helped stats (rel) min: 0.02% max: 73.73% x̄: 1.43% x̃: 0.92% HURT stats (abs) min: 2 max: 726 x̄: 31.12 x̃: 13 HURT stats (rel) min: 0.01% max: 8.43% x̄: 0.62% x̃: 0.31% 95% mean confidence interval for cycles value: -299.58 17.87 95% mean confidence interval for cycles %-change: -1.63% -1.02% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 20333 -> 20307 (-0.13%) spills in affected programs: 151 -> 125 (-17.22%) helped: 1 HURT: 0 total fills in shared programs: 25899 -> 25775 (-0.48%) fills in affected programs: 573 -> 449 (-21.64%) helped: 1 HURT: 0 LOST: 5 GAINED: 0 Sandy Bridge, Ivy Bridge, and Haswell had similar results. 
(Haswell shown) total instructions in shared programs: 16417658 -> 16416320 (<.01%) instructions in affected programs: 96495 -> 95157 (-1.39%) helped: 774 HURT: 0 helped stats (abs) min: 1 max: 18 x̄: 1.73 x̃: 1 helped stats (rel) min: 0.33% max: 9.80% x̄: 1.52% x̃: 1.20% 95% mean confidence interval for instructions value: -1.83 -1.63 95% mean confidence interval for instructions %-change: -1.59% -1.46% Instructions are helped. total cycles in shared programs: 1037104346 -> 1037080579 (<.01%) cycles in affected programs: 3787747 -> 3763980 (-0.63%) helped: 791 HURT: 53 helped stats (abs) min: 1 max: 5411 x̄: 65.87 x̃: 32 helped stats (rel) min: 0.02% max: 21.17% x̄: 1.44% x̃: 1.18% HURT stats (abs) min: 2 max: 14160 x̄: 534.72 x̃: 18 HURT stats (rel) min: 0.02% max: 15.37% x̄: 5.70% x̃: 0.54% 95% mean confidence interval for cycles value: -69.39 13.07 95% mean confidence interval for cycles %-change: -1.19% -0.80% Inconclusive result (value mean confidence interval includes 0). LOST: 12 GAINED: 2 GM45 and Iron Lake had similar results. (Iron Lake shown) total instructions in shared programs: 8132855 -> 8132703 (<.01%) instructions in affected programs: 8782 -> 8630 (-1.73%) helped: 38 HURT: 0 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 1.66% max: 3.23% x̄: 1.77% x̃: 1.72% 95% mean confidence interval for instructions value: -4.00 -4.00 95% mean confidence interval for instructions %-change: -1.88% -1.65% Instructions are helped. total cycles in shared programs: 238300850 -> 238298568 (<.01%) cycles in affected programs: 257202 -> 254920 (-0.89%) helped: 62 HURT: 2 helped stats (abs) min: 4 max: 58 x̄: 36.90 x̃: 50 helped stats (rel) min: 0.15% max: 1.55% x̄: 0.87% x̃: 1.12% HURT stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3 HURT stats (rel) min: 0.12% max: 0.22% x̄: 0.17% x̃: 0.17% 95% mean confidence interval for cycles value: -41.34 -29.98 95% mean confidence interval for cycles %-change: -0.95% -0.73% Cycles are helped. 
Fossil-db results: All Intel platforms had similar results. (Ice Lake shown) Instructions in all programs: 145296888 -> 145296346 (-0.0%) SENDs in all programs: 6863696 -> 6863696 (+0.0%) Loops in all programs: 38334 -> 38334 (+0.0%) Cycles in all programs: 8800262303 -> 8800258950 (-0.0%) Spills in all programs: 216880 -> 216880 (+0.0%) Fills in all programs: 334248 -> 334248 (+0.0%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Ian Romanick	adc2835646	nir/algebraic: Mark some more logic-joined comparison reductions as exact If the values are known to be numbers, then the replacements are exact. This is only applied to the patterns with constants. Constants should always be numbers, and shaders with NaN constants should be handled in a different way. No shader-db or fossil-db changes on any Intel platform. The intention is to make these patterns more future proof. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Ian Romanick	23bbf3932b	nir/algebraic: Mark some more comparison reductions exact Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> All Haswell and later Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 21049056 -> 21048939 (<.01%) instructions in affected programs: 4716 -> 4599 (-2.48%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 6 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.99% max: 5.43% x̄: 2.80% x̃: 2.51% 95% mean confidence interval for instructions value: -3.46 -2.54 95% mean confidence interval for instructions %-change: -3.22% -2.38% Instructions are helped. total cycles in shared programs: 855141411 -> 855141159 (<.01%) cycles in affected programs: 54491 -> 54239 (-0.46%) helped: 28 HURT: 5 helped stats (abs) min: 2 max: 34 x̄: 12.82 x̃: 12 helped stats (rel) min: 0.06% max: 2.73% x̄: 0.94% x̃: 0.75% HURT stats (abs) min: 2 max: 52 x̄: 21.40 x̃: 6 HURT stats (rel) min: 0.11% max: 2.46% x̄: 0.90% x̃: 0.56% 95% mean confidence interval for cycles value: -13.72 -1.55 95% mean confidence interval for cycles %-change: -1.01% -0.31% Cycles are helped. Tiger Lake Instructions in all programs: 160902191 -> 160899554 (-0.0%) SENDs in all programs: 6812435 -> 6812435 (+0.0%) Loops in all programs: 38225 -> 38225 (+0.0%) Cycles in all programs: 7428581420 -> 7428555881 (-0.0%) Spills in all programs: 192582 -> 192582 (+0.0%) Fills in all programs: 304539 -> 304539 (+0.0%) A lot of fragment shaders in Shadow of the Tomb Raider were helped, and a bunch of vertex shaders in Octopath Traveler were hurt. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Ian Romanick	7d85dc4f35	nir/algebraic: Equality comparison inversions require sources be numbers v2: Update A630 expected image checksum for minetest.trace. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake total instructions in shared programs: 21036690 -> 21049485 (0.06%) instructions in affected programs: 852085 -> 864880 (1.50%) helped: 240 HURT: 2514 helped stats (abs) min: 1 max: 46 x̄: 2.45 x̃: 2 helped stats (rel) min: 0.15% max: 4.30% x̄: 0.79% x̃: 0.55% HURT stats (abs) min: 1 max: 198 x̄: 5.32 x̃: 2 HURT stats (rel) min: 0.06% max: 10.71% x̄: 1.48% x̃: 1.04% 95% mean confidence interval for instructions value: 4.14 5.15 95% mean confidence interval for instructions %-change: 1.23% 1.34% Instructions are HURT. total cycles in shared programs: 856045255 -> 855816220 (-0.03%) cycles in affected programs: 16743786 -> 16514751 (-1.37%) helped: 790 HURT: 1973 helped stats (abs) min: 1 max: 10766 x̄: 627.97 x̃: 18 helped stats (rel) min: <.01% max: 32.59% x̄: 3.01% x̃: 0.64% HURT stats (abs) min: 1 max: 4078 x̄: 135.36 x̃: 18 HURT stats (rel) min: <.01% max: 54.56% x̄: 2.80% x̃: 0.82% 95% mean confidence interval for cycles value: -131.36 -34.42 95% mean confidence interval for cycles %-change: 0.88% 1.40% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). total spills in shared programs: 9771 -> 9766 (-0.05%) spills in affected programs: 47 -> 42 (-10.64%) helped: 1 HURT: 0 total fills in shared programs: 9451 -> 9430 (-0.22%) fills in affected programs: 91 -> 70 (-23.08%) helped: 1 HURT: 0 LOST: 16 GAINED: 51 All Intel GPUs from Sandybridge through Ice Lake had similar results. 
(Ice Lake shown) total instructions in shared programs: 20024781 -> 20025568 (<.01%) instructions in affected programs: 103309 -> 104096 (0.76%) helped: 12 HURT: 389 helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1 helped stats (rel) min: 0.20% max: 2.70% x̄: 1.36% x̃: 1.37% HURT stats (abs) min: 1 max: 8 x̄: 2.06 x̃: 1 HURT stats (rel) min: 0.05% max: 7.14% x̄: 1.25% x̃: 0.95% 95% mean confidence interval for instructions value: 1.78 2.15 95% mean confidence interval for instructions %-change: 1.06% 1.28% Instructions are HURT. total cycles in shared programs: 979419070 -> 979439180 (<.01%) cycles in affected programs: 4968711 -> 4988821 (0.40%) helped: 60 HURT: 381 helped stats (abs) min: 1 max: 1296 x̄: 96.92 x̃: 26 helped stats (rel) min: <.01% max: 27.10% x̄: 1.64% x̃: 0.65% HURT stats (abs) min: 1 max: 7320 x̄: 68.04 x̃: 30 HURT stats (rel) min: <.01% max: 19.77% x̄: 1.32% x̃: 0.87% 95% mean confidence interval for cycles value: 10.25 80.95 95% mean confidence interval for cycles %-change: 0.69% 1.15% Cycles are HURT. LOST: 1 GAINED: 2 GM45 and Iron Lake had similar results. (Iron Lake shown) total instructions in shared programs: 8128474 -> 8132527 (0.05%) instructions in affected programs: 642323 -> 646376 (0.63%) helped: 12 HURT: 1972 helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4 helped stats (rel) min: 0.72% max: 1.72% x̄: 1.09% x̃: 0.83% HURT stats (abs) min: 1 max: 16 x̄: 2.07 x̃: 3 HURT stats (rel) min: 0.12% max: 7.14% x̄: 0.77% x̃: 0.70% 95% mean confidence interval for instructions value: 1.99 2.10 95% mean confidence interval for instructions %-change: 0.74% 0.79% Instructions are HURT. 
total cycles in shared programs: 238280994 -> 238294376 (<.01%) cycles in affected programs: 8841250 -> 8854632 (0.15%) helped: 84 HURT: 1192 helped stats (abs) min: 4 max: 64 x̄: 12.50 x̃: 8 helped stats (rel) min: 0.02% max: 1.61% x̄: 0.28% x̃: 0.17% HURT stats (abs) min: 2 max: 198 x̄: 12.11 x̃: 12 HURT stats (rel) min: 0.02% max: 8.03% x̄: 0.28% x̃: 0.14% 95% mean confidence interval for cycles value: 9.65 11.32 95% mean confidence interval for cycles %-change: 0.22% 0.27% Cycles are HURT. No fossil-db changes on any Intel platform. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Ian Romanick	4246c2869c	nir/algebraic: Invert comparisons less often This fixes the piglit test range_analysis_fsat_of_nan.shader_test. That test contains some code like o = saturate(X) > 0 ? vec4(1.0, 0.0, 0.0, 1.0) : vec4(0.0, 1.0, 0.0, 1.0); A clever optimizer will convert this to o = vec4(float(saturate(X) > 0), float(!(saturate(X) > 0)), 0, 1); Due to the ordering of optimizations in the compiler, the `saturate` operations are removed. This is safe even in the presence of NaN. o = vec4(float(X > 0), float(!(X > 0)), 0, 1); Since the calculations are not marked precise, an overzealous optimizer may reduce this to o = vec4(float(X > 0), float(X <= 0), 0, 1); This will result in black being output. The GLSL spec gives quite a bit of leeway with respect to NaN, but that seems too far. The shader author asked for a result of red or green. A result of black is still "undefined behavior," but it's also a little mean. This also enables CSE to do its job better. v2: Update A530 expected image checksum for minetest.trace. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4531 Fixes: `0dbda153aa` ("nir/algebraic: Flag inexact optimizations") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake total instructions in shared programs: 21041563 -> 21041789 (<.01%) instructions in affected programs: 992066 -> 992292 (0.02%) helped: 526 HURT: 548 helped stats (abs) min: 1 max: 16 x̄: 2.48 x̃: 2 helped stats (rel) min: 0.04% max: 5.56% x̄: 0.74% x̃: 0.49% HURT stats (abs) min: 1 max: 27 x̄: 2.80 x̃: 2 HURT stats (rel) min: 0.04% max: 4.55% x̄: 0.59% x̃: 0.38% 95% mean confidence interval for instructions value: -0.00 0.42 95% mean confidence interval for instructions %-change: -0.12% <.01% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 855885569 -> 856118189 (0.03%) cycles in affected programs: 343637248 -> 343869868 (0.07%) helped: 907 HURT: 541 helped stats (abs) min: 1 max: 7724 x̄: 206.45 x̃: 36 helped stats (rel) min: <.01% max: 29.97% x̄: 1.01% x̃: 0.37% HURT stats (abs) min: 1 max: 14177 x̄: 776.09 x̃: 31 HURT stats (rel) min: <.01% max: 29.94% x̄: 1.24% x̃: 0.35% 95% mean confidence interval for cycles value: 84.30 237.00 95% mean confidence interval for cycles %-change: -0.32% -0.01% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). LOST: 3 GAINED: 5 Ice Lake total instructions in shared programs: 20027107 -> 20025352 (<.01%) instructions in affected programs: 1068856 -> 1067101 (-0.16%) helped: 1153 HURT: 273 helped stats (abs) min: 1 max: 14 x̄: 1.83 x̃: 1 helped stats (rel) min: 0.03% max: 5.66% x̄: 0.61% x̃: 0.35% HURT stats (abs) min: 1 max: 15 x̄: 1.29 x̃: 1 HURT stats (rel) min: 0.16% max: 1.30% x̄: 0.58% x̃: 0.60% 95% mean confidence interval for instructions value: -1.33 -1.13 95% mean confidence interval for instructions %-change: -0.43% -0.34% Instructions are helped. total cycles in shared programs: 979499227 -> 979448725 (<.01%) cycles in affected programs: 344261539 -> 344211037 (-0.01%) helped: 1079 HURT: 441 helped stats (abs) min: 1 max: 9384 x̄: 147.78 x̃: 48 helped stats (rel) min: <.01% max: 31.83% x̄: 0.90% x̃: 0.33% HURT stats (abs) min: 1 max: 7220 x̄: 247.07 x̃: 32 HURT stats (rel) min: <.01% max: 31.30% x̄: 1.52% x̃: 0.53% 95% mean confidence interval for cycles value: -70.01 3.56 95% mean confidence interval for cycles %-change: -0.35% -0.05% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 10564 -> 10568 (0.04%) spills in affected programs: 143 -> 147 (2.80%) helped: 0 HURT: 1 total fills in shared programs: 11343 -> 11347 (0.04%) fills in affected programs: 287 -> 291 (1.39%) helped: 0 HURT: 1 LOST: 3 GAINED: 2 Skylake total instructions in shared programs: 18192274 -> 18190128 (-0.01%) instructions in affected programs: 1000188 -> 998042 (-0.21%) helped: 1149 HURT: 55 helped stats (abs) min: 1 max: 14 x̄: 1.92 x̃: 1 helped stats (rel) min: 0.04% max: 6.67% x̄: 0.67% x̃: 0.42% HURT stats (abs) min: 1 max: 2 x̄: 1.05 x̃: 1 HURT stats (rel) min: 0.16% max: 0.55% x̄: 0.27% x̃: 0.26% 95% mean confidence interval for instructions value: -1.87 -1.69 95% mean confidence interval for instructions %-change: -0.67% -0.58% Instructions are helped. total cycles in shared programs: 960856054 -> 960728040 (-0.01%) cycles in affected programs: 340840968 -> 340712954 (-0.04%) helped: 1079 HURT: 233 helped stats (abs) min: 1 max: 7640 x̄: 170.95 x̃: 46 helped stats (rel) min: <.01% max: 30.20% x̄: 0.96% x̃: 0.28% HURT stats (abs) min: 1 max: 6864 x̄: 242.23 x̃: 26 HURT stats (rel) min: <.01% max: 34.64% x̄: 2.10% x̃: 0.22% 95% mean confidence interval for cycles value: -135.62 -59.53 95% mean confidence interval for cycles %-change: -0.59% -0.25% Cycles are helped. LOST: 15 GAINED: 1 Broadwell total instructions in shared programs: 17855624 -> 17853580 (-0.01%) instructions in affected programs: 1012209 -> 1010165 (-0.20%) helped: 1105 HURT: 52 helped stats (abs) min: 1 max: 13 x̄: 1.90 x̃: 1 helped stats (rel) min: 0.03% max: 6.67% x̄: 0.67% x̃: 0.36% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.13% max: 0.52% x̄: 0.26% x̃: 0.25% 95% mean confidence interval for instructions value: -1.86 -1.67 95% mean confidence interval for instructions %-change: -0.68% -0.58% Instructions are helped. 
total cycles in shared programs: 1029905447 -> 1029840699 (<.01%) cycles in affected programs: 347102680 -> 347037932 (-0.02%) helped: 1007 HURT: 211 helped stats (abs) min: 1 max: 1360 x̄: 89.76 x̃: 48 helped stats (rel) min: <.01% max: 16.26% x̄: 0.69% x̃: 0.25% HURT stats (abs) min: 1 max: 1297 x̄: 121.51 x̃: 20 HURT stats (rel) min: <.01% max: 31.31% x̄: 1.21% x̃: 0.20% 95% mean confidence interval for cycles value: -62.39 -43.92 95% mean confidence interval for cycles %-change: -0.47% -0.25% Cycles are helped. total spills in shared programs: 20335 -> 20333 (<.01%) spills in affected programs: 19 -> 17 (-10.53%) helped: 2 HURT: 0 total fills in shared programs: 25905 -> 25899 (-0.02%) fills in affected programs: 23 -> 17 (-26.09%) helped: 2 HURT: 0 LOST: 9 GAINED: 0 Haswell total instructions in shared programs: 16418516 -> 16417293 (<.01%) instructions in affected programs: 223785 -> 222562 (-0.55%) helped: 590 HURT: 67 helped stats (abs) min: 1 max: 15 x̄: 2.19 x̃: 1 helped stats (rel) min: 0.03% max: 6.52% x̄: 0.87% x̃: 0.60% HURT stats (abs) min: 1 max: 2 x̄: 1.04 x̃: 1 HURT stats (rel) min: 0.04% max: 1.85% x̄: 0.44% x̃: 0.25% 95% mean confidence interval for instructions value: -2.01 -1.71 95% mean confidence interval for instructions %-change: -0.80% -0.67% Instructions are helped. total cycles in shared programs: 1037179754 -> 1037084874 (<.01%) cycles in affected programs: 352541071 -> 352446191 (-0.03%) helped: 1093 HURT: 182 helped stats (abs) min: 1 max: 888 x̄: 111.03 x̃: 64 helped stats (rel) min: <.01% max: 27.30% x̄: 0.84% x̃: 0.20% HURT stats (abs) min: 1 max: 6777 x̄: 145.49 x̃: 21 HURT stats (rel) min: <.01% max: 24.10% x̄: 1.99% x̃: 0.29% 95% mean confidence interval for cycles value: -88.10 -60.73 95% mean confidence interval for cycles %-change: -0.58% -0.29% Cycles are helped. 
total spills in shared programs: 17457 -> 17456 (<.01%) spills in affected programs: 12 -> 11 (-8.33%) helped: 1 HURT: 0 total fills in shared programs: 20387 -> 20385 (<.01%) fills in affected programs: 15 -> 13 (-13.33%) helped: 1 HURT: 0 LOST: 6 GAINED: 1 Ivy Bridge and earlier platforms had similar results. (Ivy Bridge shown) total instructions in shared programs: 15515482 -> 15513998 (<.01%) instructions in affected programs: 239739 -> 238255 (-0.62%) helped: 573 HURT: 57 helped stats (abs) min: 1 max: 20 x̄: 2.73 x̃: 2 helped stats (rel) min: 0.03% max: 9.84% x̄: 0.94% x̃: 0.55% HURT stats (abs) min: 1 max: 2 x̄: 1.39 x̃: 1 HURT stats (rel) min: 0.09% max: 1.85% x̄: 0.52% x̃: 0.35% 95% mean confidence interval for instructions value: -2.57 -2.14 95% mean confidence interval for instructions %-change: -0.89% -0.73% Instructions are helped. total cycles in shared programs: 584509880 -> 584463152 (<.01%) cycles in affected programs: 11765280 -> 11718552 (-0.40%) helped: 661 HURT: 152 helped stats (abs) min: 1 max: 3073 x̄: 101.99 x̃: 32 helped stats (rel) min: <.01% max: 34.38% x̄: 1.46% x̃: 0.50% HURT stats (abs) min: 1 max: 6637 x̄: 136.10 x̃: 15 HURT stats (rel) min: <.01% max: 24.19% x̄: 1.75% x̃: 0.25% 95% mean confidence interval for cycles value: -82.79 -32.16 95% mean confidence interval for cycles %-change: -1.11% -0.61% Cycles are helped. 
LOST: 9 GAINED: 0 Tiger Lake Instructions in all programs: 160905127 -> 160900949 (-0.0%) SENDs in all programs: 6812418 -> 6812085 (-0.0%) Loops in all programs: 38225 -> 38225 (+0.0%) Cycles in all programs: 7431911114 -> 7433914697 (+0.0%) Spills in all programs: 192582 -> 192582 (+0.0%) Fills in all programs: 304539 -> 304537 (-0.0%) Ice Lake Instructions in all programs: 145296733 -> 145292370 (-0.0%) SENDs in all programs: 6863818 -> 6863485 (-0.0%) Loops in all programs: 38219 -> 38219 (+0.0%) Cycles in all programs: 8798257570 -> 8800204360 (+0.0%) Spills in all programs: 216880 -> 216880 (+0.0%) Fills in all programs: 334250 -> 334248 (-0.0%) Skylake Instructions in all programs: 135891485 -> 135887357 (-0.0%) SENDs in all programs: 6803031 -> 6802698 (-0.0%) Loops in all programs: 38216 -> 38216 (+0.0%) Cycles in all programs: 8442221881 -> 8444201959 (+0.0%) Spills in all programs: 194839 -> 194839 (+0.0%) Fills in all programs: 301116 -> 301114 (-0.0%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
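The NaN problem described in the commit above (4246c2869c) can be reproduced directly, since Python floats follow the same comparison rules. This is a sketch of the two expressions quoted in the commit message, with hypothetical function names; it does not model the compiler passes themselves.

```python
import math


def color_with_negated_compare(x: float) -> tuple:
    """vec4(float(X > 0), float(!(X > 0)), 0, 1): always red or green."""
    gt = x > 0
    return (float(gt), float(not gt), 0.0, 1.0)


def color_with_inverted_compare(x: float) -> tuple:
    """vec4(float(X > 0), float(X <= 0), 0, 1): both tests fail for NaN."""
    return (float(x > 0), float(x <= 0), 0.0, 1.0)


if __name__ == "__main__":
    for x in (2.0, -2.0, math.nan):
        print(x, color_with_negated_compare(x), color_with_inverted_compare(x))
```

For ordinary numbers the two forms agree. For NaN, the negated form still yields green (0, 1, 0, 1), while the inverted form yields black (0, 0, 0, 1), because neither `X > 0` nor `X <= 0` holds.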
Ian Romanick	49177b9e2f	nir/algebraic: Tautology replacements require sources be numbers It seems worth the small amount of damage to give an extra cushion of not having to debug problems later. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 21043197 -> 21043359 (<.01%) instructions in affected programs: 4409 -> 4571 (3.67%) helped: 0 HURT: 25 HURT stats (abs) min: 1 max: 16 x̄: 6.48 x̃: 5 HURT stats (rel) min: 0.39% max: 15.38% x̄: 4.59% x̃: 4.40% 95% mean confidence interval for instructions value: 4.37 8.59 95% mean confidence interval for instructions %-change: 2.93% 6.26% Instructions are HURT. total cycles in shared programs: 856175986 -> 856176921 (<.01%) cycles in affected programs: 58908 -> 59843 (1.59%) helped: 0 HURT: 25 HURT stats (abs) min: 7 max: 70 x̄: 37.40 x̃: 38 HURT stats (rel) min: 0.27% max: 5.63% x̄: 1.87% x̃: 1.39% 95% mean confidence interval for cycles value: 31.11 43.69 95% mean confidence interval for cycles %-change: 1.35% 2.39% Cycles are HURT. No fossil-db changes on any Intel platform. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Ian Romanick	d69ba58644	nir/algebraic: Remove some optimizations of comparisons with fsat When most of these patterns were created, we believed, incorrectly, that fsat(NaN) was NaN. We have since realized that fsat(NaN) is zero. Originally, this changed the patterns to use is_a_number. This didn't help any shaders, so it's easier to just drop the optimizations. This commit crossed paths with `4c3ad4d065` ("nir/algebraic: mark more optimization with fsat(NaN) as inexact") and `bc123c396a` ("nir/algebraic: mark some optimizations with fsat(NaN) as inexact"). Given that these don't impact very many shaders, it seems safer to just remove them. As discussed in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8716, I tried modifying these patterns to use !(b cmp a). Unfortunately, on Intel GPUs, the results were much worse than just removing the patterns altogether. Some other related patterns will be addressed in later commits. There are still a number of patterns that use the identity fsat(1-X) == 1 - fsat(X). If X is NaN, the former is zero while the latter is 1.0. I haven't evaluated these patterns yet. If changes are needed in these patterns, it should be a separate commit anyway. v2: Replace arrow `=>` with `->` in comments because the `=>` looks a lot like `<=` comparison. Suggested by Rhys. Fixes: `92b75c126b` ("nir/algebraic: Replace checks that a value is between (or not) [0, 1]") Fixes: `a7f0c57673` ("nir/algebraic: Eliminate useless fsat() on operand of comparison w/value in (0, 1)") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> All Intel hardware had similar results. 
(Ice Lake shown) total instructions in shared programs: 20029060 -> 20029670 (<.01%) instructions in affected programs: 69236 -> 69846 (0.88%) helped: 0 HURT: 263 HURT stats (abs) min: 1 max: 20 x̄: 2.32 x̃: 1 HURT stats (rel) min: 0.30% max: 11.11% x̄: 1.35% x̃: 0.98% 95% mean confidence interval for instructions value: 1.86 2.78 95% mean confidence interval for instructions %-change: 1.18% 1.52% Instructions are HURT. total cycles in shared programs: 979821278 -> 979834425 (<.01%) cycles in affected programs: 1476848 -> 1489995 (0.89%) helped: 49 HURT: 204 helped stats (abs) min: 1 max: 812 x̄: 102.31 x̃: 20 helped stats (rel) min: 0.01% max: 21.43% x̄: 2.23% x̃: 0.52% HURT stats (abs) min: 2 max: 2600 x̄: 89.02 x̃: 16 HURT stats (rel) min: 0.04% max: 27.27% x̄: 1.49% x̃: 0.72% 95% mean confidence interval for cycles value: 13.18 90.75 95% mean confidence interval for cycles %-change: 0.29% 1.25% Cycles are HURT. No fossil-db changes. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Jesse Natalie	d7ca0319d7	nir: Add relaxed 24bit opcodes These are equivalent to the 32bit opcodes if there are no more efficient 24bit opcodes available, but inputs are guaranteed to already be 24bit, so the 24bit opcodes can be used instead if they exist and are efficient. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10549>	2021-05-05 22:06:42 +00:00
Gert Wollny	a199697642	nir/opt_algebraic: optimizations for add umax/umin with zero For unsigned comparisons with zero these ops can be eliminated. v2: Add comparison optimizations with -1 (Rhys Perry) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> (v1) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10583>	2021-05-04 09:33:32 +02:00
Jesse Natalie	3c8bcdc863	nir: Add a new opcode for [un]packing doubles HLSL doesn't support bitcasting a 64bit integer to a double. DXIL doesn't have generic pack/unpack instructions, so we lower those to integer bitwise ops. As a result, NIR generic double pack/unpack would require our backend to emit a bitcast to get a double, but we want to match HLSL semantics and emit MakeDouble/SplitDouble. Adding a dedicated opcode for double pack/unpack allows us to add a pass to emit that instead, which lets our backend emit the right instruction to pack and unpack doubles. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10063>	2021-04-09 01:54:33 +00:00
Gert Wollny	0f5b3c37c5	nir: Add opcodes for fused comp + csel and optimizations Some backends, like r600, support a fused version of int and float compare against zero and csel. Adding these opcodes here makes it possible to optimize this in nir. v2: Add rules for float compare + csel Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9452>	2021-03-22 15:19:46 +01:00
Gert Wollny	a5747f8ab3	nir: add opcodes for find_msb_rev and lowering Some hardware supports a version of find_msb where the bits are counted starting at the high bit, and this needs some lowering to obtain the value that is expected by find_msb Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9452>	2021-03-22 15:19:46 +01:00
Timur Kristóf	132171dc4e	nir: Add a few more algebraic optimizations to help address calculation. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Ian Romanick	2c4fd24c01	nir/algebraic: Apply addition property of equality to the other ordering too Inequality comparison operations are not commutative, so `foo < bar` and `bar < foo` both have to be explicitly listed. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> All Intel GPUs had similar results. (Ice Lake shown) total instructions in shared programs: 20027051 -> 20026899 (<.01%) instructions in affected programs: 37181 -> 37029 (-0.41%) helped: 85 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 1.79 x̃: 1 helped stats (rel) min: 0.05% max: 6.78% x̄: 0.92% x̃: 0.68% 95% mean confidence interval for instructions value: -2.42 -1.15 95% mean confidence interval for instructions %-change: -1.23% -0.61% Instructions are helped. total cycles in shared programs: 979762793 -> 979753527 (<.01%) cycles in affected programs: 2653905 -> 2644639 (-0.35%) helped: 104 HURT: 50 helped stats (abs) min: 1 max: 1048 x̄: 119.99 x̃: 11 helped stats (rel) min: <.01% max: 9.88% x̄: 0.77% x̃: 0.20% HURT stats (abs) min: 1 max: 734 x̄: 64.26 x̃: 8 HURT stats (rel) min: <.01% max: 3.06% x̄: 0.36% x̃: 0.10% 95% mean confidence interval for cycles value: -98.65 -21.68 95% mean confidence interval for cycles %-change: -0.66% -0.15% Cycles are helped. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9374>	2021-03-04 22:50:53 +00:00
Ian Romanick	33031bdab6	nir/algebraic: Apply addition property of equality more conservatively This allows a lot more CSE. Depending on where the addition and the comparison are scheduled, it may also reduce register pressure by reducing the live range of the addends. Across all the platforms, the shaders affected for spills or fills were all fragment shaders from Dirt Rally. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) total instructions in shared programs: 21043103 -> 21038804 (-0.02%) instructions in affected programs: 892878 -> 888579 (-0.48%) helped: 1549 HURT: 724 helped stats (abs) min: 1 max: 225 x̄: 4.14 x̃: 2 helped stats (rel) min: 0.05% max: 11.18% x̄: 1.04% x̃: 0.78% HURT stats (abs) min: 1 max: 71 x̄: 2.93 x̃: 1 HURT stats (rel) min: 0.07% max: 6.90% x̄: 0.80% x̃: 0.56% 95% mean confidence interval for instructions value: -2.33 -1.45 95% mean confidence interval for instructions %-change: -0.50% -0.40% Instructions are helped. total cycles in shared programs: 855054155 -> 855757566 (0.08%) cycles in affected programs: 58275918 -> 58979329 (1.21%) helped: 1213 HURT: 1680 helped stats (abs) min: 1 max: 107405 x̄: 1684.00 x̃: 10 helped stats (rel) min: <.01% max: 38.09% x̄: 1.51% x̃: 0.25% HURT stats (abs) min: 1 max: 126632 x̄: 1634.59 x̃: 12 HURT stats (rel) min: <.01% max: 85.91% x̄: 2.75% x̃: 0.49% 95% mean confidence interval for cycles value: -98.06 584.35 95% mean confidence interval for cycles %-change: 0.71% 1.22% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 9843 -> 9771 (-0.73%) spills in affected programs: 72 -> 0 helped: 5 HURT: 0 total fills in shared programs: 9600 -> 9451 (-1.55%) fills in affected programs: 149 -> 0 helped: 5 HURT: 0 LOST: 14 GAINED: 9 Skylake total instructions in shared programs: 18185074 -> 18183866 (<.01%) instructions in affected programs: 575180 -> 573972 (-0.21%) helped: 1286 HURT: 468 helped stats (abs) min: 1 max: 15 x̄: 1.55 x̃: 1 helped stats (rel) min: 0.03% max: 4.08% x̄: 0.67% x̃: 0.65% HURT stats (abs) min: 1 max: 8 x̄: 1.69 x̃: 1 HURT stats (rel) min: 0.13% max: 7.69% x̄: 0.87% x̃: 0.45% 95% mean confidence interval for instructions value: -0.77 -0.60 95% mean confidence interval for instructions %-change: -0.30% -0.22% Instructions are helped. total cycles in shared programs: 960518105 -> 960608234 (<.01%) cycles in affected programs: 42536073 -> 42626202 (0.21%) helped: 1210 HURT: 1714 helped stats (abs) min: 1 max: 7015 x̄: 123.41 x̃: 10 helped stats (rel) min: <.01% max: 33.76% x̄: 1.32% x̃: 0.26% HURT stats (abs) min: 1 max: 14474 x̄: 139.71 x̃: 14 HURT stats (rel) min: <.01% max: 58.94% x̄: 2.00% x̃: 0.44% 95% mean confidence interval for cycles value: 4.02 57.63 95% mean confidence interval for cycles %-change: 0.43% 0.82% Cycles are HURT. LOST: 16 GAINED: 42 Broadwell total instructions in shared programs: 17856880 -> 17852158 (-0.03%) instructions in affected programs: 564836 -> 560114 (-0.84%) helped: 1243 HURT: 418 helped stats (abs) min: 1 max: 115 x̄: 4.36 x̃: 1 helped stats (rel) min: 0.03% max: 9.67% x̄: 0.90% x̃: 0.67% HURT stats (abs) min: 1 max: 8 x̄: 1.67 x̃: 1 HURT stats (rel) min: 0.14% max: 7.69% x̄: 0.89% x̃: 0.46% 95% mean confidence interval for instructions value: -3.45 -2.23 95% mean confidence interval for instructions %-change: -0.51% -0.38% Instructions are helped. 
total cycles in shared programs: 1031140321 -> 1029856892 (-0.12%) cycles in affected programs: 66986946 -> 65703517 (-1.92%) helped: 1084 HURT: 1653 helped stats (abs) min: 1 max: 415168 x̄: 1835.32 x̃: 10 helped stats (rel) min: <.01% max: 57.16% x̄: 1.19% x̃: 0.28% HURT stats (abs) min: 1 max: 43930 x̄: 427.14 x̃: 12 HURT stats (rel) min: <.01% max: 57.53% x̄: 1.32% x̃: 0.39% 95% mean confidence interval for cycles value: -915.76 -22.07 95% mean confidence interval for cycles %-change: 0.17% 0.47% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). total spills in shared programs: 20891 -> 20335 (-2.66%) spills in affected programs: 1567 -> 1011 (-35.48%) helped: 70 HURT: 0 total fills in shared programs: 27307 -> 25905 (-5.13%) fills in affected programs: 5381 -> 3979 (-26.05%) helped: 71 HURT: 0 LOST: 17 GAINED: 20 Haswell total instructions in shared programs: 16411850 -> 16409414 (-0.01%) instructions in affected programs: 602666 -> 600230 (-0.40%) helped: 1152 HURT: 781 helped stats (abs) min: 1 max: 103 x̄: 3.59 x̃: 1 helped stats (rel) min: 0.03% max: 8.61% x̄: 0.85% x̃: 0.65% HURT stats (abs) min: 1 max: 41 x̄: 2.18 x̃: 1 HURT stats (rel) min: 0.12% max: 7.69% x̄: 0.88% x̃: 0.69% 95% mean confidence interval for instructions value: -1.74 -0.78 95% mean confidence interval for instructions %-change: -0.21% -0.10% Instructions are helped. total cycles in shared programs: 1035338781 -> 1036977801 (0.16%) cycles in affected programs: 68961096 -> 70600116 (2.38%) helped: 1246 HURT: 2206 helped stats (abs) min: 1 max: 392022 x̄: 1040.28 x̃: 14 helped stats (rel) min: <.01% max: 56.44% x̄: 2.32% x̃: 0.38% HURT stats (abs) min: 1 max: 68630 x̄: 1330.56 x̃: 18 HURT stats (rel) min: <.01% max: 69.97% x̄: 3.31% x̃: 0.61% 95% mean confidence interval for cycles value: 90.43 859.17 95% mean confidence interval for cycles %-change: 1.02% 1.54% Cycles are HURT. 
total spills in shared programs: 17805 -> 17457 (-1.95%) spills in affected programs: 1202 -> 854 (-28.95%) helped: 34 HURT: 31 total fills in shared programs: 20939 -> 20387 (-2.64%) fills in affected programs: 2702 -> 2150 (-20.43%) helped: 34 HURT: 31 LOST: 24 GAINED: 45 Ivy Bridge and earlier Intel GPUs had similar results. (Ivy Bridge shown) total instructions in shared programs: 15515912 -> 15516757 (<.01%) instructions in affected programs: 396569 -> 397414 (0.21%) helped: 578 HURT: 858 helped stats (abs) min: 1 max: 9 x̄: 1.32 x̃: 1 helped stats (rel) min: 0.04% max: 3.70% x̄: 0.65% x̃: 0.65% HURT stats (abs) min: 1 max: 11 x̄: 1.87 x̃: 1 HURT stats (rel) min: 0.08% max: 12.90% x̄: 0.95% x̃: 0.53% 95% mean confidence interval for instructions value: 0.47 0.70 95% mean confidence interval for instructions %-change: 0.24% 0.37% Instructions are HURT. total cycles in shared programs: 584395455 -> 584466352 (0.01%) cycles in affected programs: 20346570 -> 20417467 (0.35%) helped: 1192 HURT: 1896 helped stats (abs) min: 1 max: 4108 x̄: 123.27 x̃: 14 helped stats (rel) min: <.01% max: 37.20% x̄: 2.27% x̃: 0.46% HURT stats (abs) min: 1 max: 3698 x̄: 114.89 x̃: 19 HURT stats (rel) min: <.01% max: 70.28% x̄: 3.02% x̃: 0.71% 95% mean confidence interval for cycles value: 10.75 35.16 95% mean confidence interval for cycles %-change: 0.73% 1.23% Cycles are HURT. LOST: 20 GAINED: 12 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9374>	2021-03-04 22:50:53 +00:00
Timothy Arceri	9f474bd4b4	nir: handle negatives in ffma reassociation optimisation shader-db results Iris (BDW): total instructions in shared programs: 16632076 -> 16631057 (<.01%) instructions in affected programs: 48010 -> 46991 (-2.12%) helped: 47 HURT: 6 total cycles in shared programs: 915266726 -> 915263622 (<.01%) cycles in affected programs: 1182283 -> 1179179 (-0.26%) helped: 18 HURT: 27 total loops in shared programs: 4929 -> 4929 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 18834 -> 18801 (-0.18%) spills in affected programs: 525 -> 492 (-6.29%) helped: 3 HURT: 0 total fills in shared programs: 23008 -> 22981 (-0.12%) fills in affected programs: 435 -> 408 (-6.21%) helped: 3 HURT: 0 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8608>	2021-02-22 00:49:13 +00:00
Ian Romanick	3250e04d25	nir/algebraic: Add some max/min optimizations with 3 variables Specifically, ARB assembly shaders with code like SLT r0, r0, c[0].xxxx; ... KIL r0.xyzx; can result in this pattern. The other cases (e.g., 'KIL r0.xxxx' and 'KIL r0.xyxx') are handled by existing patterns. Reviewed-by: Matt Turner <mattst88@gmail.com> All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 21050098 -> 21050065 (<.01%) instructions in affected programs: 2062 -> 2029 (-1.60%) helped: 31 HURT: 1 helped stats (abs) min: 1 max: 3 x̄: 1.10 x̃: 1 helped stats (rel) min: 1.14% max: 4.35% x̄: 1.89% x̃: 1.69% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.65% max: 0.65% x̄: 0.65% x̃: 0.65% 95% mean confidence interval for instructions value: -1.23 -0.84 95% mean confidence interval for instructions %-change: -2.12% -1.50% Instructions are helped. total cycles in shared programs: 855105466 -> 855105055 (<.01%) cycles in affected programs: 50136 -> 49725 (-0.82%) helped: 33 HURT: 0 helped stats (abs) min: 3 max: 22 x̄: 12.45 x̃: 12 helped stats (rel) min: 0.13% max: 1.57% x̄: 0.86% x̃: 0.92% 95% mean confidence interval for cycles value: -13.78 -11.13 95% mean confidence interval for cycles %-change: -0.97% -0.76% Cycles are helped. No fossil-db changes on any Intel platform. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9122>	2021-02-19 17:31:27 -08:00
Ian Romanick	d9b5bce85a	nir/algebraic: Remove some redundant b2f logic-op reduction patterns There are patterns that will re-write the fmin or fmax part into a form that other patterns will gradually convert to the same ior or iand. For example, fmax(b2f(a), b2f(b)) != 0 b2f(a \|\| b) != 0 a \|\| b No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9122>	2021-02-19 17:31:24 -08:00
Ian Romanick	7e127c1fca	nir/algebraic: Fix some min/max of b2f replacements fmin(-A, -B) is -fmax(A, B), and fmax(-A, -B) is -fmin(A, B). Therefore the logic joining A and B should toggle between ior and iand for the negated versions. At the very least, a shader from Euro Truck Simulator 2 in shader-db is affected by this. The KIL instruction in the (ARB assembly) shader ends up with the wrong logic. This is _probably_ the source of https://gitlab.freedesktop.org/mesa/mesa/-/issues/1346. That said, the issue mentions that Mesa 18.0.5 works, but commit `68420d8322` ("nir: Simplify min and max of b2f") was added in 17.3. Moreover, I was not able to reproduce the error in the ETS2 shader from shader-db from any Mesa commit near the time the original fd.o bugzilla was submitted (December 2018). 🤷 In fact, the current error in that shader starts with `9167324a86` ("nir/algebraic: Mark some logic-joined comparison reductions as exact"). That's a bit of a red herring as `9167324a86` just sets off a chain of replacements that eventually leads to the incorrect min/max of b2f patterns fixed by this commit. The other affected shaders in the shader-db results are from Cargo Commander. These are also ARB assembly shaders. I think any ARB assembly shader that uses the pattern SLT r0, ...; ... KIL -r0; will suffer from issues related to this. This change fixes the piglit tests/spec/arb_fragment_program/kil-of-slt.shader_test test added in https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/454. shader-db results: All Gen6+ platforms had similar result. 
(Ice Lake shown) total instructions in shared programs: 20034604 -> 20034486 (<.01%) instructions in affected programs: 3885 -> 3767 (-3.04%) helped: 47 HURT: 2 helped stats (abs) min: 2 max: 4 x̄: 2.64 x̃: 2 helped stats (rel) min: 2.33% max: 8.33% x̄: 3.48% x̃: 3.39% HURT stats (abs) min: 3 max: 3 x̄: 3.00 x̃: 3 HURT stats (rel) min: 13.64% max: 16.67% x̄: 15.15% x̃: 15.15% 95% mean confidence interval for instructions value: -2.83 -1.99 95% mean confidence interval for instructions %-change: -3.84% -1.60% Instructions are helped. total cycles in shared programs: 979881379 -> 979879406 (<.01%) cycles in affected programs: 119873 -> 117900 (-1.65%) helped: 46 HURT: 3 helped stats (abs) min: 10 max: 756 x̄: 45.41 x̃: 26 helped stats (rel) min: 0.53% max: 19.72% x̄: 1.67% x̃: 1.26% HURT stats (abs) min: 28 max: 56 x̄: 38.67 x̃: 32 HURT stats (rel) min: 1.44% max: 3.54% x̄: 2.75% x̃: 3.27% 95% mean confidence interval for cycles value: -70.83 -9.70 95% mean confidence interval for cycles %-change: -2.23% -0.57% Cycles are helped. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8115098 -> 8115076 (<.01%) instructions in affected programs: 2592 -> 2570 (-0.85%) helped: 32 HURT: 2 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.88% max: 2.70% x̄: 1.35% x̃: 1.31% HURT stats (abs) min: 5 max: 5 x̄: 5.00 x̃: 5 HURT stats (rel) min: 17.24% max: 18.52% x̄: 17.88% x̃: 17.88% 95% mean confidence interval for instructions value: -1.15 -0.15 95% mean confidence interval for instructions %-change: -1.83% 1.39% Inconclusive result (%-change mean confidence interval includes 0). 
total cycles in shared programs: 238189718 -> 238189802 (<.01%) cycles in affected programs: 75076 -> 75160 (0.11%) helped: 3 HURT: 31 helped stats (abs) min: 2 max: 130 x̄: 44.67 x̃: 2 helped stats (rel) min: 0.18% max: 5.70% x̄: 2.02% x̃: 0.19% HURT stats (abs) min: 2 max: 70 x̄: 7.03 x̃: 4 HURT stats (rel) min: 0.07% max: 6.41% x̄: 0.53% x̃: 0.15% 95% mean confidence interval for cycles value: -7.27 12.21 95% mean confidence interval for cycles %-change: -0.33% 0.94% Inconclusive result (value mean confidence interval includes 0). No fossil-db changes on any Intel platform. Fixes: `68420d8322` ("nir: Simplify min and max of b2f") Closes: #1346 Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9122>	2021-02-19 17:30:53 -08:00
Jason Ekstrand	2491d5a662	nir/algebraic: Convert up-cast of down-cast to extract on Intel This starts generating extract for bit sizes other than 32 but our back-end handles that just fine. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8872>	2021-02-16 16:36:31 +00:00
Jason Ekstrand	f9b3be09e1	nir/algebraic: Clean up up-cast of down-cast when we can There are a bunch of cases where we can pretty quickly determine that the high bits don't matter. In these cases, delete the casts. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8872>	2021-02-16 16:36:31 +00:00
Ian Romanick	ed138f2861	nir/algebraic: Partially revert `3f782cdd25` I'm not sure what the logic was, but there is no opportunity for anything to flush to zero here. 'a' is a Boolean value, and b2f produces 1.0 or 0.0. This was originally part of https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3765/. Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: Andres Gomez <agomez@igalia.com> Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8910>	2021-02-07 18:31:01 -08:00
Ian Romanick	5923742356	nir/algebraic: add patterns for a >> #b << #b and a << #b >> #b Commit `5476d18183` ("nir/algebraic: add patterns for a >> #b << #b") added the ushr version, but it missed the ishr. A bunch of compute shaders with stores to shared storage generate the ishr pattern. Enabling this optimization also enables the iadd/iand reassociation (right after this hunk), and that enables merging of stores to shared storage. A couple shaders have spills and fills hurt on some platforms. These all occur in shaders that also have SENDs helped. On Gen9 and Gen11, the helped SENDs more than make up for the extra spills and fills. On Gen7 and Gen8, it's not as clear. All of the shaders affected are compute shaders in DiRT Rally 2 or Bioshock Infinite. The most affected Bioshock shader on Broadwell looks like: Before: CS SIMD8 shader: 1335 inst, 0 loops, 22411 cycles, 42:36 spills:fills, 159 sends, scheduled with mode lifo, Promoted 2 constants, compacted 21360 to 16528 bytes. After: CS SIMD8 shader: 1175 inst, 0 loops, 25916 cycles, 96:135 spills:fills, 72 sends, scheduled with mode lifo, Promoted 2 constants, compacted 18800 to 13648 bytes. The results on Haswell and Ivy Bridge are similar. Given that there are only 2 promoted constants, MR !7698 won't have any effect. There were no statistically significant changes on Gen9+ in Bioshock in our performance CI. Gen8 isn't in that CI, and DiRT Showdown 2 is also not included in that CI. It is possible that these shaders aren't used in the settings or demos used in the CI. The other pattern, which switches the order of the shifts, only helps a couple shaders. If I wasn't already adding another pattern, I definitely wouldn't bother with that one. v2: s/ishr/ushr/ in the replacement for the ushr pattern. Noticed by Rhys. 
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake total instructions in shared programs: 21052760 -> 21049269 (-0.02%) instructions in affected programs: 59497 -> 56006 (-5.87%) helped: 46 HURT: 0 helped stats (abs) min: 2 max: 552 x̄: 75.89 x̃: 53 helped stats (rel) min: 0.28% max: 43.43% x̄: 5.87% x̃: 4.10% 95% mean confidence interval for instructions value: -108.96 -42.82 95% mean confidence interval for instructions %-change: -8.38% -3.35% Instructions are helped. total cycles in shared programs: 855229761 -> 855148518 (<.01%) cycles in affected programs: 8491373 -> 8410130 (-0.96%) helped: 33 HURT: 15 helped stats (abs) min: 42 max: 26940 x̄: 6200.70 x̃: 4329 helped stats (rel) min: 0.09% max: 38.78% x̄: 7.97% x̃: 4.29% HURT stats (abs) min: 2 max: 18132 x̄: 8225.33 x̃: 7288 HURT stats (rel) min: <.01% max: 13.37% x̄: 5.72% x̃: 4.53% 95% mean confidence interval for cycles value: -4331.52 946.40 95% mean confidence interval for cycles %-change: -6.78% -0.61% Inconclusive result (value mean confidence interval includes 0). total sends in shared programs: 989947 -> 989694 (-0.03%) sends in affected programs: 523 -> 270 (-48.37%) helped: 5 HURT: 0 helped stats (abs) min: 9 max: 87 x̄: 50.60 x̃: 37 helped stats (rel) min: 25.71% max: 54.72% x̄: 43.49% x̃: 42.53% 95% mean confidence interval for sends value: -93.95 -7.25 95% mean confidence interval for sends %-change: -58.48% -28.50% Sends are helped. Ice Lake and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 20033498 -> 20030552 (-0.01%) instructions in affected programs: 59220 -> 56274 (-4.97%) helped: 48 HURT: 0 helped stats (abs) min: 1 max: 465 x̄: 61.38 x̃: 39 helped stats (rel) min: 0.03% max: 42.27% x̄: 5.19% x̃: 3.90% 95% mean confidence interval for instructions value: -89.57 -33.18 95% mean confidence interval for instructions %-change: -7.49% -2.89% Instructions are helped. 
total cycles in shared programs: 979993675 -> 979840773 (-0.02%) cycles in affected programs: 6738454 -> 6585552 (-2.27%) helped: 46 HURT: 0 helped stats (abs) min: 42 max: 6265 x̄: 3323.96 x̃: 3579 helped stats (rel) min: 0.09% max: 37.38% x̄: 4.34% x̃: 2.39% 95% mean confidence interval for cycles value: -3664.70 -2983.21 95% mean confidence interval for cycles %-change: -6.63% -2.06% Cycles are helped. total spills in shared programs: 10659 -> 10661 (0.02%) spills in affected programs: 36 -> 38 (5.56%) helped: 1 HURT: 1 total fills in shared programs: 11551 -> 11551 (0.00%) fills in affected programs: 70 -> 70 (0.00%) helped: 1 HURT: 1 total sends in shared programs: 1032117 -> 1031785 (-0.03%) sends in affected programs: 711 -> 379 (-46.69%) helped: 5 HURT: 0 helped stats (abs) min: 18 max: 87 x̄: 66.40 x̃: 74 helped stats (rel) min: 27.69% max: 54.72% x̄: 44.49% x̃: 44.31% 95% mean confidence interval for sends value: -101.79 -31.01 95% mean confidence interval for sends %-change: -58.42% -30.55% Sends are helped. Broadwell total instructions in shared programs: 17865005 -> 17862757 (-0.01%) instructions in affected programs: 66438 -> 64190 (-3.38%) helped: 49 HURT: 0 helped stats (abs) min: 1 max: 266 x̄: 45.88 x̃: 39 helped stats (rel) min: 0.03% max: 11.99% x̄: 3.73% x̃: 3.92% 95% mean confidence interval for instructions value: -59.15 -32.61 95% mean confidence interval for instructions %-change: -4.35% -3.12% Instructions are helped. 
total cycles in shared programs: 1031298803 -> 1031219023 (<.01%) cycles in affected programs: 7253602 -> 7173822 (-1.10%) helped: 45 HURT: 2 helped stats (abs) min: 18 max: 7828 x̄: 1928.33 x̃: 1918 helped stats (rel) min: <.01% max: 10.51% x̄: 1.58% x̃: 1.31% HURT stats (abs) min: 3490 max: 3505 x̄: 3497.50 x̃: 3497 HURT stats (rel) min: 15.56% max: 15.64% x̄: 15.60% x̃: 15.60% 95% mean confidence interval for cycles value: -2174.88 -1220.01 95% mean confidence interval for cycles %-change: -2.00% 0.30% Inconclusive result (%-change mean confidence interval includes 0). total spills in shared programs: 20799 -> 20924 (0.60%) spills in affected programs: 843 -> 968 (14.83%) helped: 0 HURT: 4 total fills in shared programs: 27110 -> 27334 (0.83%) fills in affected programs: 1824 -> 2048 (12.28%) helped: 1 HURT: 4 total sends in shared programs: 1017935 -> 1017603 (-0.03%) sends in affected programs: 711 -> 379 (-46.69%) helped: 5 HURT: 0 helped stats (abs) min: 18 max: 87 x̄: 66.40 x̃: 74 helped stats (rel) min: 27.69% max: 54.72% x̄: 44.49% x̃: 44.31% 95% mean confidence interval for sends value: -101.79 -31.01 95% mean confidence interval for sends %-change: -58.42% -30.55% Sends are helped. Haswell and Ivy Bridge had similar results. (Haswell shown) total instructions in shared programs: 16397496 -> 16395411 (-0.01%) instructions in affected programs: 59384 -> 57299 (-3.51%) helped: 49 HURT: 0 helped stats (abs) min: 1 max: 208 x̄: 42.55 x̃: 39 helped stats (rel) min: 0.03% max: 8.18% x̄: 3.74% x̃: 3.91% 95% mean confidence interval for instructions value: -53.59 -31.51 95% mean confidence interval for instructions %-change: -4.24% -3.23% Instructions are helped. 
total cycles in shared programs: 1035483504 -> 1035397592 (<.01%) cycles in affected programs: 9379739 -> 9293827 (-0.92%) helped: 45 HURT: 4 helped stats (abs) min: 10 max: 5600 x̄: 2164.51 x̃: 2350 helped stats (rel) min: <.01% max: 11.61% x̄: 1.93% x̃: 1.56% HURT stats (abs) min: 2 max: 5756 x̄: 2872.75 x̃: 2866 HURT stats (rel) min: <.01% max: 24.65% x̄: 12.29% x̃: 12.26% 95% mean confidence interval for cycles value: -2293.06 -1213.56 95% mean confidence interval for cycles %-change: -2.42% 0.88% Inconclusive result (%-change mean confidence interval includes 0). total spills in shared programs: 17672 -> 17803 (0.74%) spills in affected programs: 364 -> 495 (35.99%) helped: 2 HURT: 2 total fills in shared programs: 20752 -> 20937 (0.89%) fills in affected programs: 656 -> 841 (28.20%) helped: 2 HURT: 2 total sends in shared programs: 1044703 -> 1044450 (-0.02%) sends in affected programs: 523 -> 270 (-48.37%) helped: 5 HURT: 0 helped stats (abs) min: 9 max: 87 x̄: 50.60 x̃: 37 helped stats (rel) min: 25.71% max: 54.72% x̄: 43.49% x̃: 42.53% 95% mean confidence interval for sends value: -93.95 -7.25 95% mean confidence interval for sends %-change: -58.48% -28.50% Sends are helped. No changes on Gen6 or earlier GPUs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8852>	2021-02-08 00:25:22 +00:00
Ian Romanick	6b0443a900	nir/algebraic: Fix a >> #b << #b for sizes other than 32-bit The base mask previously used was 0xffffffff. This is not correct (but should still work) for 16-bit and 8-bit values, but it means the high 32-bits of 64-bit values will get chopped off. Instead of just restricting the pattern to 32-bits (as was done before `00b28a50b2`), this extends the optimization in two ways: 1. Make it correct for other bit sizes. 2. Make it work for arbitrary shift counts. This has the added benefit of reducing the number of patterns actually added (7 previously, 4 now). The "Reassociate for improved CSE" part is just reverted to its pre-00b28a50b2c behavior. I doubt that pattern is likely to have much impact outside 32-bits. This change fixes the piglit tests tests/spec/arb_gpu_shader_int64/fs-shl-of-shr-int64.shader_test and tests/spec/arb_gpu_shader_int64/fs-iand-of-iadd-int64.shader_test. All of the shaders helped in shader-db are vertex shaders on platforms with vector-oriented vertex processing. The shaders contain ((x >> 16) << 16). These platforms set lower_extract_word, so the optimization that transforms (x >> 16) to extract_u16 doesn't trigger. With only ~60 shaders involved, I didn't bother trying to add extract_XYZ versions of these patterns to try to get those cases. Fixes: `00b28a50b2` ("nir/algebraic: trivially enable existing 32-bit patterns for all bit sizes") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Haswell and earlier Intel GPUs had similar results. (Haswell shown) total instructions in shared programs: 16397554 -> 16397496 (<.01%) instructions in affected programs: 7961 -> 7903 (-0.73%) helped: 58 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.36% max: 1.89% x̄: 0.99% x̃: 0.78% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -1.13% -0.85% Instructions are helped. 
total cycles in shared programs: 1035483770 -> 1035483504 (<.01%) cycles in affected programs: 75922 -> 75656 (-0.35%) helped: 44 HURT: 2 helped stats (abs) min: 2 max: 12 x̄: 6.14 x̃: 2 helped stats (rel) min: 0.05% max: 1.67% x̄: 0.87% x̃: 0.72% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.06% max: 0.06% x̄: 0.06% x̃: 0.06% 95% mean confidence interval for cycles value: -7.28 -4.29 95% mean confidence interval for cycles %-change: -1.03% -0.63% Cycles are helped. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8852>	2021-02-08 00:25:22 +00:00
Samuel Pitoiset	4c3ad4d065	nir/algebraic: mark more optimizations with fsat(NaN) as inexact These optimizations are duplicated from the main optimization table to the late one... And I missed some in the original fix. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3368 Fixes: `bc123c396a` ("nir/algebraic: mark some optimizations with fsat(NaN) as inexact") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8716>	2021-01-26 17:06:23 +00:00
Rhys Perry	b729cd58d7	nir/algebraic: eliminate exact a*0.0 if float execution mode allows it fossil-db (GFX10): Totals from 611 (0.44% of 139391) affected shaders: SGPRs: 40528 -> 40288 (-0.59%) VGPRs: 16136 -> 16152 (+0.10%); split: -0.15%, +0.25% CodeSize: 970192 -> 951036 (-1.97%) MaxWaves: 10561 -> 10557 (-0.04%); split: +0.08%, -0.11% Instrs: 174874 -> 172879 (-1.14%); split: -1.18%, +0.04% fossil-db (GFX10.3): Totals from 611 (0.44% of 139391) affected shaders: SGPRs: 40680 -> 40488 (-0.47%) VGPRs: 18368 -> 18276 (-0.50%); split: -0.57%, +0.07% CodeSize: 1050712 -> 1033624 (-1.63%); split: -1.64%, +0.02% MaxWaves: 8658 -> 8674 (+0.18%) Instrs: 205364 -> 201220 (-2.02%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5523>	2021-01-26 11:36:13 +00:00
Rhys Perry	614ab26afd	nir/algebraic: optimize out exact a+0.0 if it's used only as a float fossil-db (GFX10): Totals from 133 (0.10% of 139391) affected shaders: SGPRs: 7864 -> 7856 (-0.10%); split: -0.20%, +0.10% VGPRs: 4884 -> 4836 (-0.98%) CodeSize: 288932 -> 287084 (-0.64%) MaxWaves: 1973 -> 1979 (+0.30%) Instrs: 53899 -> 53550 (-0.65%) fossil-db (GFX10.3): Totals from 133 (0.10% of 139391) affected shaders: SGPRs: 7832 -> 7835 (+0.04%); split: -0.06%, +0.10% VGPRs: 5144 -> 5088 (-1.09%) CodeSize: 318912 -> 316696 (-0.69%) MaxWaves: 1735 -> 1746 (+0.63%) Instrs: 65367 -> 64853 (-0.79%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5523>	2021-01-26 11:36:13 +00:00
Rhys Perry	2849f0b5aa	nir/algebraic: optimize out exact a*1.0 if it's used only as a float fossil-db (GFX10): Totals from 10180 (7.30% of 139391) affected shaders: SGPRs: 549392 -> 549448 (+0.01%); split: -0.00%, +0.01% VGPRs: 243228 -> 243008 (-0.09%); split: -0.11%, +0.02% CodeSize: 12939080 -> 12603996 (-2.59%); split: -2.59%, +0.00% MaxWaves: 186948 -> 186976 (+0.01%) Instrs: 2497266 -> 2414648 (-3.31%) fossil-db (GFX10.3): Totals from 10180 (7.30% of 139391) affected shaders: SGPRs: 549672 -> 549280 (-0.07%); split: -0.23%, +0.16% VGPRs: 289296 -> 283672 (-1.94%); split: -2.83%, +0.88% CodeSize: 13920180 -> 13255560 (-4.77%); split: -4.77%, +0.00% MaxWaves: 151789 -> 153165 (+0.91%) Instrs: 2756978 -> 2671517 (-3.10%); split: -3.10%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5523>	2021-01-26 11:36:13 +00:00
Daniel Schürmann	bd8e84eb8d	nir: replace .lower_sub with .has_fsub and .has_isub This allows a more fine-grained control about whether a backend supports one of these instructions. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>	2021-01-11 19:13:51 +00:00
Daniel Schürmann	b3ce55b445	nir,vc4: Lower fneg to fmul(x, -1.0) This patch also replaces lower_negate with lower_ineg / lower_fneg. The fneg semantics have been clarified as of Version 1.5, Revision 1 of the SPIR-V specification, which means that the previous lowering to fsub is not a viable solution anymore, and is replaced with lowering to fmul(x, -1.0). Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>	2021-01-11 19:13:51 +00:00
Ian Romanick	539c25c2da	nir/algebraic: Move the flrp -> bcsel rule earlier If multiple rules could match, the rule that appears first in the file is used. Only Tiger Lake and Ice Lake are affected. Other platforms either have a LRP instruction or can't run any shaders from shader-db that would benefit. v2: Fix issues created when this commit was rebased on top of `3c8934a644` ("nir/algebraic: add flrp patterns for 16 and 64 bits"). Noticed by Caio. Tiger Lake and Ice Lake had similar results. total instructions in shared programs: 20908672 -> 20908661 (<.01%) instructions in affected programs: 419 -> 408 (-2.63%) helped: 5 HURT: 0 helped stats (abs) min: 1 max: 3 x̄: 2.20 x̃: 3 helped stats (rel) min: 1.85% max: 3.19% x̄: 2.49% x̃: 2.65% 95% mean confidence interval for instructions value: -3.56 -0.84 95% mean confidence interval for instructions %-change: -3.24% -1.73% Instructions are helped. total cycles in shared programs: 473513940 -> 473513793 (<.01%) cycles in affected programs: 7176 -> 7029 (-2.05%) helped: 12 HURT: 0 helped stats (abs) min: 5 max: 22 x̄: 12.25 x̃: 12 helped stats (rel) min: 0.84% max: 3.24% x̄: 2.09% x̃: 1.80% 95% mean confidence interval for cycles value: -15.43 -9.07 95% mean confidence interval for cycles %-change: -2.57% -1.61% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	ec16f935fe	nir/algebraic: Mark comparisons generated from lowered fsign precise This prevents other transformations from converting them to 'a != 0'. For example, both of these transformations can do this: (('~flt', 0.0, ('fabs', a)), ('fne', a, 0.0)), (('~flt', ('fneg', ('fabs', a)), 0.0), ('fne', a, 0.0)), Both fsign(fabs(NaN)) and fsign(fneg(fabs(NaN))) should produce zero, but, since 'NaN != 0.0' is true, cascading these transformations could cause them to generate 1.0 or -1.0 respectively. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	9771af5dde	nir/algebraic: Fix broken NaN and -0.0 behavior No shader-db or fossil-db changes on any Intel platform. v2: Add a coding line to fix SCons build problems caused by the ± character. Fixes: `25bfba3335` ("nir/algebraic: Recognize open-coded copysign(1.0, a)") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	55621c6d1c	nir/algebraic: Add some compare-with-zero optimizations that are exact This prevents some fossil-db regressions in "spir-v: Mark floating point comparisons exact". v2: Note that the patterns and replacements produce the same value when isnan(b). Suggested by Caio. v3: Use C99 isfinite() instead of (obsolete) BSD finite(). Fixes various Windows builds. No fossil-db changes on any Intel platform, Vega, or Polaris10. All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 20908670 -> 20908672 (<.01%) instructions in affected programs: 69 -> 71 (2.90%) helped: 0 HURT: 1 total cycles in shared programs: 473515288 -> 473513940 (<.01%) cycles in affected programs: 4942 -> 3594 (-27.28%) helped: 2 HURT: 0 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	9167324a86	nir/algebraic: Mark some logic-joined comparison reductions as exact This also prevents some fossil-db regressions in "spir-v: Mark floating point comparisons exact". v2: Mark the fmin / fmax in the replacement exact to prevent other optimizations from ruining the NaN-cleansing property of the fmin / fmax. Suggested by Rhys. Don't assume that constants are not NaN because some components of a vector might be NaN while others are numbers. Noticed by Rhys. This causes ~8 more shaders in Age of Wonders III (dxvk) to regress on cycles (not instructions) by less than 1% when "spir-v: Mark floating point comparisons exact" is applied. This difference is too small to care. All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 20908668 -> 20908670 (<.01%) instructions in affected programs: 9196 -> 9198 (0.02%) helped: 10 HURT: 5 helped stats (abs) min: 1 max: 2 x̄: 1.40 x̃: 1 helped stats (rel) min: 0.02% max: 5.41% x̄: 2.20% x̃: 2.16% HURT stats (abs) min: 2 max: 6 x̄: 3.20 x̃: 3 HURT stats (rel) min: 2.44% max: 16.67% x̄: 9.39% x̃: 12.50% 95% mean confidence interval for instructions value: -1.22 1.49 95% mean confidence interval for instructions %-change: -2.08% 5.41% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 473515330 -> 473515288 (<.01%) cycles in affected programs: 67146 -> 67104 (-0.06%) helped: 10 HURT: 7 helped stats (abs) min: 1 max: 36 x̄: 15.90 x̃: 17 helped stats (rel) min: 0.01% max: 1.29% x̄: 0.66% x̃: 0.89% HURT stats (abs) min: 1 max: 48 x̄: 16.71 x̃: 4 HURT stats (rel) min: 0.08% max: 1.94% x̄: 0.87% x̃: 0.19% 95% mean confidence interval for cycles value: -13.88 8.94 95% mean confidence interval for cycles %-change: -0.56% 0.49% Inconclusive result (value mean confidence interval includes 0). 
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	fe3c518277	nir/algebraic: Don't add reordered version of patterns for commutative instructions The reordered versions are automatically considered by nir_algebraic rules for commutative instructions. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	314a40c902	Revert "nir: Replace an odd comparison involving fmin of -b2f" I originally noticed that `3b30814791` ("nir/algebraic: Optimize 1-bit Booleans") caused this pattern to no longer be matched by incorrectly replacing b@32 with b@1. Making that correct had no effect on shader-db. When this pattern originally was added, it only affected 4 shaders, so it's not worth the effort to debug further. This reverts commit `f50400cc80`. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	aec0547838	nir/algebraic: Make some notes about comparison rearrangements versus infinity The original comment was a little terse and a little incorrect. The rearrangements are fine w.r.t. NaN. However, they produce incorrect results if one operand is +Inf and the other is -Inf. A later commit, "nir/algebraic: Add some compare-with-zero optimizations that are exact", will add some more patterns here. It may be reasonable to squash this commit (forward) into that commit. v2: Fix some incorrect comparisons operators in the comment (<= vs >=). Add commentary that subtraction works like addition w.r.t. NaN. Both noticed / suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
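
The infinity caveat described in aec0547838 can be checked with ordinary IEEE-754 doubles. The sketch below is only an illustration: the particular rearrangement, (a + b) >= 0.0 rewritten as a >= -b, and the helper names are assumptions chosen for the demo and are not quoted from nir_opt_algebraic.py. It shows the two forms agreeing when an operand is NaN and disagreeing when one operand is +Inf and the other is -Inf.

```python
import math


def fge_of_sum(a, b):
    """Original form (hypothetical example): compare the sum against zero."""
    return (a + b) >= 0.0


def fge_rearranged(a, b):
    """Rearranged form: move b to the other side of the comparison."""
    return a >= -b


def forms_agree(a, b):
    """True when both forms give the same Boolean result for (a, b)."""
    return fge_of_sum(a, b) == fge_rearranged(a, b)


# Finite inputs and NaN inputs give the same answer either way,
# because any comparison involving NaN is false in both forms.
print(forms_agree(1.0, 2.0))
print(forms_agree(math.nan, 1.0))
print(forms_agree(1.0, math.nan))

# +Inf plus -Inf is NaN, so the sum form is false, while the
# rearranged form compares +Inf >= +Inf and is true.
print(forms_agree(math.inf, -math.inf))
```

This is why such rearrangements are safe with respect to NaN but still need to be treated as inexact when infinities of opposite sign are possible.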

382 Commits
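
The bit-size problem fixed in 6b0443a900 can be reproduced with plain integers. This is a minimal sketch, assuming the optimization replaces (a >> b) << b with a & (mask << b); the function names and the explicit `bits` parameter are hypothetical and only model fixed-width arithmetic in Python, they are not Mesa code.

```python
def shr_then_shl(a, b, bits):
    """Reference result of (a >> b) << b on an unsigned value of width `bits`."""
    full = (1 << bits) - 1
    return (((a & full) >> b) << b) & full


def and_with_mask(a, b, bits, base_mask):
    """Candidate replacement: a & (base_mask << b), truncated to `bits`."""
    full = (1 << bits) - 1
    return a & (base_mask << b) & full


x64 = 0x123456789ABCDEF0
x16 = 0xBEEF

# A base mask as wide as the value reproduces the shift pair exactly.
print(hex(shr_then_shl(x64, 16, 64)))
print(hex(and_with_mask(x64, 16, 64, (1 << 64) - 1)))

# A fixed 0xffffffff base mask drops high bits of a 64-bit value.
print(hex(and_with_mask(x64, 16, 64, 0xFFFFFFFF)))

# For 16-bit values the oversized mask is truncated and still works.
print(hex(shr_then_shl(x16, 4, 16)), hex(and_with_mask(x16, 4, 16, 0xFFFFFFFF)))
```

Deriving the base mask from the operand's bit size, rather than hard-coding 32 bits, is what makes one pattern cover every size and any shift count.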