Commit Graph

121557 Commits

Author SHA1 Message Date
Jason Ekstrand 46187bb54f anv: Swizzle fast-clear values
Starting with Gen12, we can fast-clear a lot more surface formats and we
are suddenly in the position of having to fast-clear surfaces with
formats with an implicit swizzle such as VK_FORMAT_R4G4B4A4_UNORM_PACK16
which is represented as ISL_FORMAT_A4B4G4R4 with a BGRA swizzle.  In
order for blorp to do the fast-clear color conversion for us, it needs
a properly swizzled color.

This fixes the following Vulkan CTS groups on TGL:

 - dEQP-VK.pipeline.blend.format.b4g4r4a4_unorm_pack16.*
 - dEQP-VK.api.image_clearing.core.clear_color_image.*.b4g4r4a4*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4218>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4218>
2020-03-18 21:05:07 +00:00
Jason Ekstrand 3fb8f19481 intel/blorp: Add support for swizzling fast-clear colors
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4218>
2020-03-18 21:05:07 +00:00
Ian Romanick bf2eb3e0ee soft-fp64: Split a block that was missing a cast on a comparison
This function has code like:

   if (0x7FD <= zExp) {
      if ((0x7FD < zExp) ||
         ((zExp == 0x7FD) &&
            (0x001FFFFFu == zFrac0 && 0xFFFFFFFFu == zFrac1) &&
               increment)) {
         ...
	 return ...;
      }
      if (zExp < 0) {

I saw that, and I thought, "Uh... what?  Dead code?"  I thought it was a
bit fishy, so I grabbed the Berkeley SoftFloat Library 3e code, and
there is similar code in softfloat_roundPackToF64
(source/s_roundPackToF64.c), but it has an extra (uint16_t) cast in the
first comparison.  This is basicially a shortcut for

   if (zExp < 0 || zExp >= 0x7FD) {

So, having the nesting kind of makes sense. On a CPU, nesting the flow
control can be an optimization.  On a GPU, it's just fail.  Split the
block so that we don't need the uint16_t cast magic.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 683638 -> 658127 (-3.73%)
instructions in affected programs: 666839 -> 641328 (-3.83%)
helped: 92
HURT: 0
helped stats (abs) min: 26 max: 2456 x̄: 277.29 x̃: 144
helped stats (rel) min: 3.21% max: 4.22% x̄: 3.79% x̃: 3.90%
95% mean confidence interval for instructions value: -345.84 -208.75
95% mean confidence interval for instructions %-change: -3.86% -3.73%
Instructions are helped.

total cycles in shared programs: 5458858 -> 5344600 (-2.09%)
cycles in affected programs: 5360114 -> 5245856 (-2.13%)
helped: 92
HURT: 0
helped stats (abs) min: 126 max: 10300 x̄: 1241.93 x̃: 655
helped stats (rel) min: 1.71% max: 2.37% x̄: 2.12% x̃: 2.17%
95% mean confidence interval for cycles value: -1539.93 -943.94
95% mean confidence interval for cycles %-change: -2.16% -2.08%
Cycles are helped.

Fixes: f111d72596 ("glsl: Add "built-in" functions to do add(fp64, fp64)")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick a8882132f9 soft-fp64/fadd: Common code optimization for differing sign case
This is basically the same ideas from the previous 4 commits applied
to the aSign != bSign part... and all smashed into one commit.

The shader hurt for spill and / or fills is from
KHR-GL46.gpu_shader_fp64.builtin.inverse_dmat4.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake
total instructions in shared programs: 787258 -> 683638 (-13.16%)
instructions in affected programs: 725435 -> 621815 (-14.28%)
helped: 74
HURT: 0
helped stats (abs) min: 152 max: 10261 x̄: 1400.27 x̃: 975
helped stats (rel) min: 11.61% max: 20.92% x̄: 15.40% x̃: 14.86%
95% mean confidence interval for instructions value: -1740.11 -1060.43
95% mean confidence interval for instructions %-change: -16.01% -14.79%
Instructions are helped.

total cycles in shared programs: 6483227 -> 5458858 (-15.80%)
cycles in affected programs: 6051245 -> 5026876 (-16.93%)
helped: 74
HURT: 0
helped stats (abs) min: 1566 max: 95474 x̄: 13842.82 x̃: 9757
helped stats (rel) min: 13.94% max: 23.26% x̄: 17.98% x̃: 17.57%
95% mean confidence interval for cycles value: -17104.25 -10581.40
95% mean confidence interval for cycles %-change: -18.61% -17.35%
Cycles are helped.

total spills in shared programs: 553 -> 445 (-19.53%)
spills in affected programs: 553 -> 445 (-19.53%)
helped: 1
HURT: 0

total fills in shared programs: 1307 -> 1323 (1.22%)
fills in affected programs: 1307 -> 1323 (1.22%)
helped: 0
HURT: 1

Ice Lake
total instructions in shared programs: 781216 -> 678470 (-13.15%)
instructions in affected programs: 720088 -> 617342 (-14.27%)
helped: 74
HURT: 0
helped stats (abs) min: 153 max: 8863 x̄: 1388.46 x̃: 975
helped stats (rel) min: 11.24% max: 21.03% x̄: 15.47% x̃: 15.01%
95% mean confidence interval for instructions value: -1703.57 -1073.35
95% mean confidence interval for instructions %-change: -16.09% -14.85%
Instructions are helped.

total cycles in shared programs: 6464085 -> 5453997 (-15.63%)
cycles in affected programs: 6031771 -> 5021683 (-16.75%)
helped: 74
HURT: 0
helped stats (abs) min: 1552 max: 90317 x̄: 13649.84 x̃: 9650
helped stats (rel) min: 13.84% max: 23.11% x̄: 17.83% x̃: 17.41%
95% mean confidence interval for cycles value: -16802.89 -10496.79
95% mean confidence interval for cycles %-change: -18.46% -17.21%
Cycles are helped.

total spills in shared programs: 279 -> 368 (31.90%)
spills in affected programs: 279 -> 368 (31.90%)
helped: 0
HURT: 1

total fills in shared programs: 973 -> 1155 (18.71%)
fills in affected programs: 973 -> 1155 (18.71%)
helped: 0
HURT: 1

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 2d1216a039 soft-fp64/fadd: Move common code out of both branches of an if-statement
The previous two commits were just setting the scene for this change.

The mix(..., __propagateFloat64NaN(a, b), propagate) statements are not
identical in the two halves, but they are equivalent.  The first clause
of the mix in the else-branch is trivally ±Inf.  The first clause in the
then-branch __packFloat64(aSign, aExp, aFracHi, aFracLo).  The
preceeding conditions prove that aExp=0x7ff, aFracHi=0, and aFracLo=0.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 819560 -> 787258 (-3.94%)
instructions in affected programs: 757737 -> 725435 (-4.26%)
helped: 74
HURT: 0
helped stats (abs) min: 43 max: 3545 x̄: 436.51 x̃: 296
helped stats (rel) min: 3.54% max: 6.16% x̄: 4.52% x̃: 4.36%
95% mean confidence interval for instructions value: -548.42 -324.61
95% mean confidence interval for instructions %-change: -4.68% -4.37%
Instructions are helped.

total cycles in shared programs: 6817254 -> 6483227 (-4.90%)
cycles in affected programs: 6385272 -> 6051245 (-5.23%)
helped: 74
HURT: 0
helped stats (abs) min: 430 max: 33271 x̄: 4513.88 x̃: 3047
helped stats (rel) min: 4.28% max: 7.45% x̄: 5.48% x̃: 5.31%
95% mean confidence interval for cycles value: -5610.46 -3417.30
95% mean confidence interval for cycles %-change: -5.65% -5.32%
Cycles are helped.

total spills in shared programs: 591 -> 553 (-6.43%)
spills in affected programs: 591 -> 553 (-6.43%)
helped: 1
HURT: 0

total fills in shared programs: 1353 -> 1307 (-3.40%)
fills in affected programs: 1353 -> 1307 (-3.40%)
helped: 1
HURT: 0

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 16dfd06472 soft-fp64/fadd: Use absolute value of expDiff
In one branch we know that expDiff is already positive.

In the other branch we know the expDiff is negative.  Previously in that
branch the code was -(expDiff + 1).  This is equvialent to (-expDiff) -
1, and since expDiff is negative, abs(expDiff) - 1.

The main purpose of this commit is to prepare for "soft-fp64/fadd: Move
common code out of both branches of an if-statement".

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 818246 -> 819560 (0.16%)
instructions in affected programs: 756423 -> 757737 (0.17%)
helped: 1
HURT: 73
helped stats (abs) min: 1205 max: 1205 x̄: 1205.00 x̃: 1205
helped stats (rel) min: 1.36% max: 1.36% x̄: 1.36% x̃: 1.36%
HURT stats (abs)   min: 2 max: 149 x̄: 34.51 x̃: 27
HURT stats (rel)   min: 0.14% max: 1.09% x̄: 0.41% x̃: 0.30%
95% mean confidence interval for instructions value: -16.56 52.07
95% mean confidence interval for instructions %-change: 0.30% 0.47%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 6816686 -> 6817254 (<.01%)
cycles in affected programs: 6384704 -> 6385272 (<.01%)
helped: 37
HURT: 37
helped stats (abs) min: 30 max: 5790 x̄: 289.05 x̃: 102
helped stats (rel) min: 0.04% max: 0.86% x̄: 0.29% x̃: 0.31%
HURT stats (abs)   min: 2 max: 1020 x̄: 304.41 x̃: 232
HURT stats (rel)   min: <.01% max: 1.58% x̄: 0.55% x̃: 0.43%
95% mean confidence interval for cycles value: -165.37 180.72
95% mean confidence interval for cycles %-change: <.01% 0.27%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 705 -> 591 (-16.17%)
spills in affected programs: 705 -> 591 (-16.17%)
helped: 1
HURT: 0

total fills in shared programs: 1501 -> 1353 (-9.86%)
fills in affected programs: 1501 -> 1353 (-9.86%)
helped: 1
HURT: 0

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick da3fa01891 soft-fp64/fadd: Rename aFrac and bFrac variables
Exchanging aFracHi / bFracHi and aFracLo / bFracLo should not affect the
result of the later call to __add64.

The main purpose of this commit is to prepare for "soft-fp64/fadd: Move
common code out of both branches of an if-statement".

v2: Fix a typo in a comment.  Noticed by Matt.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 812094 -> 818246 (0.76%)
instructions in affected programs: 750271 -> 756423 (0.82%)
helped: 0
HURT: 74
HURT stats (abs)   min: 7 max: 520 x̄: 83.14 x̃: 59
HURT stats (rel)   min: 0.52% max: 1.48% x̄: 0.89% x̃: 0.84%
95% mean confidence interval for instructions value: 63.96 102.31
95% mean confidence interval for instructions %-change: 0.83% 0.95%
Instructions are HURT.

total cycles in shared programs: 6797157 -> 6816686 (0.29%)
cycles in affected programs: 6365175 -> 6384704 (0.31%)
helped: 0
HURT: 74
HURT stats (abs)   min: 16 max: 1690 x̄: 263.91 x̃: 181
HURT stats (rel)   min: 0.14% max: 0.68% x̄: 0.32% x̃: 0.27%
95% mean confidence interval for cycles value: 199.74 328.07
95% mean confidence interval for cycles %-change: 0.29% 0.36%
Cycles are HURT.

total spills in shared programs: 703 -> 705 (0.28%)
spills in affected programs: 703 -> 705 (0.28%)
helped: 0
HURT: 1

total fills in shared programs: 1499 -> 1501 (0.13%)
fills in affected programs: 1499 -> 1501 (0.13%)
helped: 0
HURT: 1

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 3c9ff97215 soft-fp64/fadd: Combine an if-statement into the preceeding else-clause
The main purpose of this commit is to prepare for "soft-fp64/fadd: Move
common code out of both branches of an if-statement".

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 812590 -> 812094 (-0.06%)
instructions in affected programs: 672135 -> 671639 (-0.07%)
helped: 57
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 8.70 x̃: 7
helped stats (rel) min: <.01% max: 0.49% x̄: 0.12% x̃: 0.09%
95% mean confidence interval for instructions value: -10.46 -6.94
95% mean confidence interval for instructions %-change: -0.15% -0.09%
Instructions are helped.

total cycles in shared programs: 6798039 -> 6797157 (-0.01%)
cycles in affected programs: 5810059 -> 5809177 (-0.02%)
helped: 54
HURT: 2
helped stats (abs) min: 2 max: 68 x̄: 16.44 x̃: 12
helped stats (rel) min: <.01% max: 0.12% x̄: 0.03% x̃: 0.02%
HURT stats (abs)   min: 2 max: 4 x̄: 3.00 x̃: 3
HURT stats (rel)   min: <.01% max: <.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -19.50 -12.00
95% mean confidence interval for cycles %-change: -0.03% -0.02%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 480565812c soft-fp64/fadd: Reformat after previous commit
Convert

   } else if (...) {
      ...
   } else {
      ...
   }

to

   } else {
      if (...) {
         ...
      } else {
         ...
      }
   }

Not doing this reformatting in the previous commit makes the previous
commit easier to review, and doing it before the next commit makes the
next commit easier to review.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 9496a67eec soft-fp64/fadd: Delete a redundant condition check
Previous condition checks already guaranteen that expDiff != 0 and
!(expDiff > 0), so expDiff < 0 is the only option left.

The main purpose of this commit is to prepare for "soft-fp64/fadd: Move
common code out of both branches of an if-statement".

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 815491 -> 812590 (-0.36%)
instructions in affected programs: 753668 -> 750767 (-0.38%)
helped: 74
HURT: 0
helped stats (abs) min: 3 max: 281 x̄: 39.20 x̃: 25
helped stats (rel) min: 0.29% max: 0.73% x̄: 0.42% x̃: 0.40%
95% mean confidence interval for instructions value: -48.50 -29.91
95% mean confidence interval for instructions %-change: -0.45% -0.40%
Instructions are helped.

total cycles in shared programs: 6813681 -> 6798039 (-0.23%)
cycles in affected programs: 6381699 -> 6366057 (-0.25%)
helped: 74
HURT: 0
helped stats (abs) min: 24 max: 1488 x̄: 211.38 x̃: 149
helped stats (rel) min: 0.20% max: 0.44% x̄: 0.26% x̃: 0.25%
95% mean confidence interval for cycles value: -261.68 -161.08
95% mean confidence interval for cycles %-change: -0.28% -0.25%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 7078105592 soft-fp64/fadd: Just let the subtraction happen when the result will be zero
The main purpose of this commit is to prepare for "soft-fp64/fadd: Move
common code out of both branches of an if-statement".

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 815717 -> 815491 (-0.03%)
instructions in affected programs: 735489 -> 735263 (-0.03%)
helped: 39
HURT: 34
helped stats (abs) min: 2 max: 192 x̄: 20.79 x̃: 12
helped stats (rel) min: 0.01% max: 0.46% x̄: 0.26% x̃: 0.28%
HURT stats (abs)   min: 1 max: 65 x̄: 17.21 x̃: 11
HURT stats (rel)   min: <.01% max: 1.11% x̄: 0.35% x̃: 0.19%
95% mean confidence interval for instructions value: -10.40 4.21
95% mean confidence interval for instructions %-change: -0.07% 0.13%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 6820707 -> 6813681 (-0.10%)
cycles in affected programs: 6388725 -> 6381699 (-0.11%)
helped: 51
HURT: 23
helped stats (abs) min: 3 max: 1837 x̄: 184.76 x̃: 120
helped stats (rel) min: <.01% max: 0.48% x̄: 0.25% x̃: 0.25%
HURT stats (abs)   min: 18 max: 216 x̄: 104.22 x̃: 98
HURT stats (rel)   min: 0.06% max: 0.73% x̄: 0.31% x̃: 0.11%
95% mean confidence interval for cycles value: -154.67 -35.22
95% mean confidence interval for cycles %-change: -0.15% <.01%
Inconclusive result (%-change mean confidence interval includes 0).

total spills in shared programs: 702 -> 703 (0.14%)
spills in affected programs: 702 -> 703 (0.14%)
helped: 0
HURT: 1

total fills in shared programs: 1497 -> 1499 (0.13%)
fills in affected programs: 1497 -> 1499 (0.13%)
helped: 0
HURT: 1

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick cae36fa217 soft-fp64/fadd: Pick zero or non-zero result based on subtraction result
The main purpose of this commit is to prepare for "soft-fp64/fadd: Move
common code out of both branches of an if-statement".

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 817327 -> 815717 (-0.20%)
instructions in affected programs: 755504 -> 753894 (-0.21%)
helped: 73
HURT: 1
helped stats (abs) min: 1 max: 159 x̄: 22.12 x̃: 14
helped stats (rel) min: 0.05% max: 0.40% x̄: 0.22% x̃: 0.23%
HURT stats (abs)   min: 5 max: 5 x̄: 5.00 x̃: 5
HURT stats (rel)   min: 0.07% max: 0.07% x̄: 0.07% x̃: 0.07%
95% mean confidence interval for instructions value: -27.27 -16.24
95% mean confidence interval for instructions %-change: -0.24% -0.20%
Instructions are helped.

total cycles in shared programs: 6822826 -> 6820707 (-0.03%)
cycles in affected programs: 6390844 -> 6388725 (-0.03%)
helped: 71
HURT: 3
helped stats (abs) min: 2 max: 537 x̄: 30.72 x̃: 18
helped stats (rel) min: <.01% max: 0.08% x̄: 0.03% x̃: 0.03%
HURT stats (abs)   min: 10 max: 32 x̄: 20.67 x̃: 20
HURT stats (rel)   min: 0.01% max: 0.02% x̄: 0.02% x̃: 0.02%
95% mean confidence interval for cycles value: -43.41 -13.86
95% mean confidence interval for cycles %-change: -0.04% -0.03%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 70be98f17a soft-fp64/fadd: Massively split the live range of zFrac0 and zFrac1
The main purpose of this commit is to prepare for "soft-fp64/fadd: Move
common code out of both branches of an if-statement".

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 822766 -> 817327 (-0.66%)
instructions in affected programs: 760943 -> 755504 (-0.71%)
helped: 74
HURT: 0
helped stats (abs) min: 8 max: 515 x̄: 73.50 x̃: 51
helped stats (rel) min: 0.58% max: 1.10% x̄: 0.77% x̃: 0.73%
95% mean confidence interval for instructions value: -91.17 -55.83
95% mean confidence interval for instructions %-change: -0.81% -0.74%
Instructions are helped.

total cycles in shared programs: 6816791 -> 6822826 (0.09%)
cycles in affected programs: 6384809 -> 6390844 (0.09%)
helped: 0
HURT: 74
HURT stats (abs)   min: 6 max: 1179 x̄: 81.55 x̃: 50
HURT stats (rel)   min: 0.02% max: 0.17% x̄: 0.09% x̃: 0.09%
95% mean confidence interval for cycles value: 48.99 114.12
95% mean confidence interval for cycles %-change: 0.09% 0.10%
Cycles are HURT.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 73fa3a1ca4 soft-fp64/fadd: Instead of tracking "b < a", track sign of the difference
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 824403 -> 822766 (-0.20%)
instructions in affected programs: 756260 -> 754623 (-0.22%)
helped: 68
HURT: 1
helped stats (abs) min: 1 max: 118 x̄: 26.26 x̃: 18
helped stats (rel) min: 0.02% max: 0.97% x̄: 0.31% x̃: 0.23%
HURT stats (abs)   min: 149 max: 149 x̄: 149.00 x̃: 149
HURT stats (rel)   min: 0.17% max: 0.17% x̄: 0.17% x̃: 0.17%
95% mean confidence interval for instructions value: -31.94 -15.51
95% mean confidence interval for instructions %-change: -0.37% -0.23%
Instructions are helped.

total cycles in shared programs: 6828935 -> 6816791 (-0.18%)
cycles in affected programs: 6385191 -> 6373047 (-0.19%)
helped: 73
HURT: 0
helped stats (abs) min: 2 max: 852 x̄: 166.36 x̃: 120
helped stats (rel) min: <.01% max: 0.80% x̄: 0.22% x̃: 0.17%
95% mean confidence interval for cycles value: -210.80 -121.91
95% mean confidence interval for cycles %-change: -0.27% -0.17%
Cycles are helped.

total fills in shared programs: 1442 -> 1497 (3.81%)
fills in affected programs: 1442 -> 1497 (3.81%)
helped: 0
HURT: 1

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 5b07f542e5 soft-fp64: Optimize __fmin64 and __fmax64 by using different evaluation order [v2]
v2: Go to extra effort to avoid flow control inserted to implement
short-circuit evaluation rules.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 797779 -> 796849 (-0.12%)
instructions in affected programs: 3499 -> 2569 (-26.58%)
helped: 21
HURT: 0
helped stats (abs) min: 8 max: 112 x̄: 44.29 x̃: 44
helped stats (rel) min: 16.09% max: 33.15% x̄: 25.72% x̃: 24.62%
95% mean confidence interval for instructions value: -55.94 -32.63
95% mean confidence interval for instructions %-change: -28.14% -23.30%
Instructions are helped.

total cycles in shared programs: 6601355 -> 6588351 (-0.20%)
cycles in affected programs: 25376 -> 12372 (-51.25%)
helped: 21
HURT: 0
helped stats (abs) min: 156 max: 1410 x̄: 619.24 x̃: 526
helped stats (rel) min: 42.39% max: 53.98% x̄: 50.12% x̃: 50.75%
95% mean confidence interval for cycles value: -776.58 -461.89
95% mean confidence interval for cycles %-change: -51.57% -48.67%
Cycles are helped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 617a69107e soft-fp64/ffloor: Simplify the >= 0 comparison
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 797951 -> 797779 (-0.02%)
instructions in affected programs: 126482 -> 126310 (-0.14%)
helped: 15
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 11.47 x̃: 10
helped stats (rel) min: <.01% max: 0.60% x̄: 0.28% x̃: 0.29%
95% mean confidence interval for instructions value: -14.79 -8.14
95% mean confidence interval for instructions %-change: -0.40% -0.16%
Instructions are helped.

total cycles in shared programs: 6601437 -> 6601355 (<.01%)
cycles in affected programs: 1089336 -> 1089254 (<.01%)
helped: 15
HURT: 0
helped stats (abs) min: 2 max: 12 x̄: 5.47 x̃: 6
helped stats (rel) min: <.01% max: 0.04% x̄: 0.01% x̃: 0.01%
95% mean confidence interval for cycles value: -7.06 -3.87
95% mean confidence interval for cycles %-change: -0.02% <.01%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick abf28d6a70 soft-fp64: Relax the way NaN is propagated
Also reassociate a couple expressions to encourage some CSE.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 813599 -> 797951 (-1.92%)
instructions in affected programs: 796110 -> 780462 (-1.97%)
helped: 92
HURT: 0
helped stats (abs) min: 3 max: 5198 x̄: 170.09 x̃: 83
helped stats (rel) min: 0.36% max: 5.50% x̄: 1.57% x̃: 1.40%
95% mean confidence interval for instructions value: -282.42 -57.75
95% mean confidence interval for instructions %-change: -1.71% -1.42%
Instructions are helped.

total cycles in shared programs: 6687128 -> 6601437 (-1.28%)
cycles in affected programs: 6582246 -> 6496555 (-1.30%)
helped: 92
HURT: 0
helped stats (abs) min: 36 max: 14442 x̄: 931.42 x̃: 592
helped stats (rel) min: 0.45% max: 3.16% x̄: 1.44% x̃: 1.23%
95% mean confidence interval for cycles value: -1257.58 -605.27
95% mean confidence interval for cycles %-change: -1.58% -1.30%
Cycles are helped.

total spills in shared programs: 759 -> 702 (-7.51%)
spills in affected programs: 759 -> 702 (-7.51%)
helped: 3
HURT: 0

total fills in shared programs: 2412 -> 1442 (-40.22%)
fills in affected programs: 2412 -> 1442 (-40.22%)
helped: 3
HURT: 0

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 8178fa8876 soft-fp64/fsat: Micro-optimize x >= 1 test
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 841590 -> 841332 (-0.03%)
instructions in affected programs: 121957 -> 121699 (-0.21%)
helped: 7
HURT: 0
helped stats (abs) min: 15 max: 54 x̄: 36.86 x̃: 41
helped stats (rel) min: 0.16% max: 0.33% x̄: 0.23% x̃: 0.18%
95% mean confidence interval for instructions value: -49.73 -23.98
95% mean confidence interval for instructions %-change: -0.29% -0.16%
Instructions are helped.

total cycles in shared programs: 6926828 -> 6923967 (-0.04%)
cycles in affected programs: 1038569 -> 1035708 (-0.28%)
helped: 7
HURT: 0
helped stats (abs) min: 128 max: 616 x̄: 408.71 x̃: 446
helped stats (rel) min: 0.18% max: 0.44% x̄: 0.29% x̃: 0.22%
95% mean confidence interval for cycles value: -571.72 -245.70
95% mean confidence interval for cycles %-change: -0.38% -0.19%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick b6f58b4709 soft-fp64/fsat: Micro-optimize x < 0 test
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 841647 -> 841590 (<.01%)
instructions in affected programs: 122014 -> 121957 (-0.05%)
helped: 7
HURT: 0
helped stats (abs) min: 3 max: 12 x̄: 8.14 x̃: 9
helped stats (rel) min: 0.04% max: 0.07% x̄: 0.05% x̃: 0.04%
95% mean confidence interval for instructions value: -11.23 -5.06
95% mean confidence interval for instructions %-change: -0.06% -0.03%
Instructions are helped.

total cycles in shared programs: 6926904 -> 6926828 (<.01%)
cycles in affected programs: 1038645 -> 1038569 (<.01%)
helped: 7
HURT: 0
helped stats (abs) min: 4 max: 16 x̄: 10.86 x̃: 12
helped stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -14.97 -6.74
95% mean confidence interval for cycles %-change: -0.01% <.01%
Cycles are helped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 7673dcbd21 soft-fp64/fsat: Correctly handle NaN
fsat is defined as min(max(a, 0.0), 1.0), and IEEE defines both min and
max to return the non-NaN value when one value is NaN.  Based on this,
fsat should definitely return 0.0 for NaN.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 841666 -> 841647 (<.01%)
instructions in affected programs: 122033 -> 122014 (-0.02%)
helped: 7
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 2.71 x̃: 3
helped stats (rel) min: 0.01% max: 0.02% x̄: 0.02% x̃: 0.01%
95% mean confidence interval for instructions value: -3.74 -1.69
95% mean confidence interval for instructions %-change: -0.02% -0.01%
Instructions are helped.

total cycles in shared programs: 6927246 -> 6926904 (<.01%)
cycles in affected programs: 1038987 -> 1038645 (-0.03%)
helped: 7
HURT: 0
helped stats (abs) min: 18 max: 72 x̄: 48.86 x̃: 54
helped stats (rel) min: 0.03% max: 0.05% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -67.38 -30.33
95% mean confidence interval for cycles %-change: -0.05% -0.02%
Cycles are helped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: a42163cbbc ("compiler: Add lowering support for 64-bit saturate operations to software")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2585
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick b421c0466d soft-fp64/flt: Perform checks in a different order
The change to nir_opt_algebraic cleans up a pattern that was never
produced before the rest of this commit was added.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 843005 -> 841666 (-0.16%)
instructions in affected programs: 460655 -> 459316 (-0.29%)
helped: 64
HURT: 17
helped stats (abs) min: 1 max: 72 x̄: 21.72 x̃: 20
helped stats (rel) min: 0.01% max: 28.07% x̄: 12.67% x̃: 16.07%
HURT stats (abs)   min: 1 max: 7 x̄: 3.00 x̃: 2
HURT stats (rel)   min: 0.01% max: 0.04% x̄: 0.02% x̃: 0.02%
95% mean confidence interval for instructions value: -20.87 -12.19
95% mean confidence interval for instructions %-change: -12.35% -7.66%
Instructions are helped.

total cycles in shared programs: 6944998 -> 6927246 (-0.26%)
cycles in affected programs: 3891872 -> 3874120 (-0.46%)
helped: 71
HURT: 10
helped stats (abs) min: 2 max: 772 x̄: 254.21 x̃: 156
helped stats (rel) min: <.01% max: 66.44% x̄: 21.72% x̃: 18.40%
HURT stats (abs)   min: 18 max: 69 x̄: 29.70 x̃: 20
HURT stats (rel)   min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -270.82 -167.50
95% mean confidence interval for cycles %-change: -24.41% -13.65%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick f6992bf624 soft-fp64/fneg: Don't treat NaN specially
__fabs64 doesn't do anything special, and the value is still NaN
regardless of the value of the MSB.  In a strict sense, it's possible
that both functions should set the "signal" bit.

lts on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 844558 -> 843005 (-0.18%)
instructions in affected programs: 725975 -> 724422 (-0.21%)
helped: 53
HURT: 4
helped stats (abs) min: 1 max: 313 x̄: 29.87 x̃: 21
helped stats (rel) min: 0.01% max: 0.94% x̄: 0.30% x̃: 0.22%
HURT stats (abs)   min: 4 max: 11 x̄: 7.50 x̃: 7
HURT stats (rel)   min: 0.03% max: 0.09% x̄: 0.05% x̃: 0.04%
95% mean confidence interval for instructions value: -39.02 -15.47
95% mean confidence interval for instructions %-change: -0.34% -0.21%
Instructions are helped.

total cycles in shared programs: 6962024 -> 6944998 (-0.24%)
cycles in affected programs: 6185470 -> 6168444 (-0.28%)
helped: 59
HURT: 0
helped stats (abs) min: 64 max: 2863 x̄: 288.58 x̃: 208
helped stats (rel) min: 0.11% max: 0.87% x̄: 0.33% x̃: 0.27%
95% mean confidence interval for cycles value: -387.15 -190.00
95% mean confidence interval for cycles %-change: -0.38% -0.28%
Cycles are helped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick de4acd8816 soft-fp64: Store sign value as 0 or 0x80000000
...instead of 0 or 1.  Many places the sign bit is extracted, then later
put back in the same position.  This saves some left-shift operations.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 848106 -> 844558 (-0.42%)
instructions in affected programs: 833480 -> 829932 (-0.43%)
helped: 106
HURT: 1
helped stats (abs) min: 1 max: 995 x̄: 33.48 x̃: 12
helped stats (rel) min: 0.15% max: 2.20% x̄: 0.60% x̃: 0.35%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: <.01% max: <.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for instructions value: -51.88 -14.43
95% mean confidence interval for instructions %-change: -0.71% -0.47%
Instructions are helped.

total cycles in shared programs: 6969125 -> 6962024 (-0.10%)
cycles in affected programs: 6717689 -> 6710588 (-0.11%)
helped: 78
HURT: 7
helped stats (abs) min: 2 max: 2083 x̄: 110.27 x̃: 56
helped stats (rel) min: <.01% max: 0.30% x̄: 0.11% x̃: 0.11%
HURT stats (abs)   min: 2 max: 1340 x̄: 214.29 x̃: 4
HURT stats (rel)   min: 0.01% max: 0.71% x̄: 0.13% x̃: 0.02%
95% mean confidence interval for cycles value: -144.02 -23.06
95% mean confidence interval for cycles %-change: -0.12% -0.07%
Cycles are helped.

total spills in shared programs: 814 -> 759 (-6.76%)
spills in affected programs: 814 -> 759 (-6.76%)
helped: 2
HURT: 1

total fills in shared programs: 2488 -> 2412 (-3.05%)
fills in affected programs: 2488 -> 2412 (-3.05%)
helped: 2
HURT: 1

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 598e2fc6a1 soft-fp64: Pick a single idiom for treating sign value as a Boolean
Replace all of the bool(qSign) with qSign != 0u.  Remove unnecessary
parenthesis from around most of the existing qSign != 0u.

This dramatically simplifies the next commit.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 848109 -> 848106 (<.01%)
instructions in affected programs: 53 -> 50 (-5.66%)
helped: 1
HURT: 0

total cycles in shared programs: 6969145 -> 6969125 (<.01%)
cycles in affected programs: 396 -> 376 (-5.05%)
helped: 1
HURT: 0

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 325a21f5eb soft-fp64: Simplify __countLeadingZeros32 function
findMSB returns -1 for an input of zero, so 31 - findMSB(a) is
sufficient on its own.

There's only one user of findMSB in shader-db, and it does not match
this pattern.

TODO: Add a pattern in the backend code generator that emits 31 -
nir_op_ufind_msb(a) as if it were nir_op_uclz.  That should save a couple
instructions.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 859509 -> 848109 (-1.33%)
instructions in affected programs: 841058 -> 829658 (-1.36%)
helped: 97
HURT: 0
helped stats (abs) min: 3 max: 1161 x̄: 117.53 x̃: 72
helped stats (rel) min: 0.98% max: 6.74% x̄: 1.70% x̃: 1.35%
95% mean confidence interval for instructions value: -147.21 -87.84
95% mean confidence interval for instructions %-change: -1.94% -1.46%
Instructions are helped.

total cycles in shared programs: 7072275 -> 6969145 (-1.46%)
cycles in affected programs: 6955767 -> 6852637 (-1.48%)
helped: 97
HURT: 0
helped stats (abs) min: 32 max: 10900 x̄: 1063.20 x̃: 560
helped stats (rel) min: 1.18% max: 7.58% x̄: 1.84% x̃: 1.45%
95% mean confidence interval for cycles value: -1339.43 -786.96
95% mean confidence interval for cycles %-change: -2.11% -1.57%
Cycles are helped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 812230fd94 soft-fp64: Don't open-code umulExtended
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 928859 -> 859509 (-7.47%)
instructions in affected programs: 866293 -> 796943 (-8.01%)
helped: 76
HURT: 0
helped stats (abs) min: 75 max: 8042 x̄: 912.50 x̃: 688
helped stats (rel) min: 5.35% max: 21.02% x̄: 10.35% x̃: 7.58%
95% mean confidence interval for instructions value: -1138.37 -686.63
95% mean confidence interval for instructions %-change: -11.69% -9.00%
Instructions are helped.

total cycles in shared programs: 7272912 -> 7072275 (-2.76%)
cycles in affected programs: 6763486 -> 6562849 (-2.97%)
helped: 76
HURT: 0
helped stats (abs) min: 214 max: 30136 x̄: 2639.96 x̃: 1923
helped stats (rel) min: 1.75% max: 9.20% x̄: 4.04% x̃: 2.41%
95% mean confidence interval for cycles value: -3455.29 -1824.63
95% mean confidence interval for cycles %-change: -4.69% -3.39%
Cycles are helped.

total spills in shared programs: 817 -> 814 (-0.37%)
spills in affected programs: 791 -> 788 (-0.38%)
helped: 2
HURT: 0

total fills in shared programs: 2438 -> 2488 (2.05%)
fills in affected programs: 2392 -> 2442 (2.09%)
helped: 0
HURT: 2

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick d1e0227ef1 soft-fp64/b2f: Reimplement using bitwise logic ops
This doesn't help a lot of shaders, but it helps those few a LOT.

This could also be implemented using bcsel.  That version is very
slightly worse because the generated SEL instruction wants to have two
immediate sources, so one of them usually needs an extra MOV instruction
to load.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 929619 -> 928859 (-0.08%)
instructions in affected programs: 1651 -> 891 (-46.03%)
helped: 8
HURT: 0
helped stats (abs) min: 38 max: 152 x̄: 95.00 x̃: 95
helped stats (rel) min: 42.70% max: 86.36% x̄: 49.88% x̃: 44.66%
95% mean confidence interval for instructions value: -132.97 -57.03
95% mean confidence interval for instructions %-change: -62.28% -37.49%
Instructions are helped.

total cycles in shared programs: 7280180 -> 7272912 (-0.10%)
cycles in affected programs: 12960 -> 5692 (-56.08%)
helped: 8
HURT: 0
helped stats (abs) min: 352 max: 1456 x̄: 908.50 x̃: 910
helped stats (rel) min: 52.45% max: 91.19% x̄: 59.24% x̃: 55.15%
95% mean confidence interval for cycles value: -1274.03 -542.97
95% mean confidence interval for cycles %-change: -70.06% -48.41%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 4e3d69ad07 nir/algebraic: Simplify a contradiction that can occur in __flt64_nonnan
The pattern is added to opt_algebraic because, for example, comparisons
with constant 0.0 will produce (a1 < 0).

Even with a pass that optimized Boolean expressions, I think this would
be very difficult to automatically recognize and optimize.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 933054 -> 929619 (-0.37%)
instructions in affected programs: 784041 -> 780606 (-0.44%)
helped: 59
HURT: 0
helped stats (abs) min: 2 max: 213 x̄: 58.22 x̃: 44
helped stats (rel) min: 0.02% max: 2.51% x̄: 0.72% x̃: 0.46%
95% mean confidence interval for instructions value: -70.80 -45.64
95% mean confidence interval for instructions %-change: -0.92% -0.53%
Instructions are helped.

total cycles in shared programs: 7304712 -> 7280180 (-0.34%)
cycles in affected programs: 7176260 -> 7151728 (-0.34%)
helped: 92
HURT: 0
helped stats (abs) min: 8 max: 1414 x̄: 266.65 x̃: 166
helped stats (rel) min: 0.04% max: 2.34% x̄: 0.43% x̃: 0.22%
95% mean confidence interval for cycles value: -333.05 -200.26
95% mean confidence interval for cycles %-change: -0.54% -0.31%
Cycles are helped.

Regular shader-db changes:

No changes on any Intel platform.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick e0cefc5a23 nir/algebraic: Constant reassociation for bitwise operations too
Like 5886cd79a0, but for iand, ior, and ixor.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake
total instructions in shared programs: 903108 -> 902830 (-0.03%)
instructions in affected programs: 654910 -> 654632 (-0.04%)
helped: 31
HURT: 5
helped stats (abs) min: 2 max: 31 x̄: 9.58 x̃: 7
helped stats (rel) min: 0.01% max: 0.23% x̄: 0.06% x̃: 0.04%
HURT stats (abs)   min: 1 max: 10 x̄: 3.80 x̃: 3
HURT stats (rel)   min: 0.01% max: 0.10% x̄: 0.03% x̃: 0.02%
95% mean confidence interval for instructions value: -10.55 -4.89
95% mean confidence interval for instructions %-change: -0.07% -0.03%
Instructions are helped.

total cycles in shared programs: 7059681 -> 7058006 (-0.02%)
cycles in affected programs: 5081309 -> 5079634 (-0.03%)
helped: 33
HURT: 12
helped stats (abs) min: 1 max: 444 x̄: 60.91 x̃: 18
helped stats (rel) min: <.01% max: 2.17% x̄: 0.25% x̃: 0.05%
HURT stats (abs)   min: 1 max: 288 x̄: 27.92 x̃: 2
HURT stats (rel)   min: <.01% max: 1.00% x̄: 0.23% x̃: 0.02%
95% mean confidence interval for cycles value: -68.32 -6.12
95% mean confidence interval for cycles %-change: -0.28% 0.03%
Inconclusive result (%-change mean confidence interval includes 0).

Ice Lake
total instructions in shared programs: 895384 -> 895159 (-0.03%)
instructions in affected programs: 658678 -> 658453 (-0.03%)
helped: 37
HURT: 0
helped stats (abs) min: 3 max: 16 x̄: 6.08 x̃: 4
helped stats (rel) min: <.01% max: 0.07% x̄: 0.04% x̃: 0.04%
95% mean confidence interval for instructions value: -7.46 -4.70
95% mean confidence interval for instructions %-change: -0.04% -0.03%
Instructions are helped.

total cycles in shared programs: 7092224 -> 7091195 (-0.01%)
cycles in affected programs: 5221666 -> 5220637 (-0.02%)
helped: 35
HURT: 11
helped stats (abs) min: 1 max: 247 x̄: 43.46 x̃: 12
helped stats (rel) min: <.01% max: 2.17% x̄: 0.23% x̃: 0.05%
HURT stats (abs)   min: 2 max: 432 x̄: 44.73 x̃: 5
HURT stats (rel)   min: <.01% max: 1.00% x̄: 0.25% x̃: 0.02%
95% mean confidence interval for cycles value: -49.00 4.26
95% mean confidence interval for cycles %-change: -0.27% 0.03%
Inconclusive result (value mean confidence interval includes 0).

Regular shader-db results:

All Haswell+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611408 -> 17611398 (<.01%)
instructions in affected programs: 1648 -> 1638 (-0.61%)
helped: 2
HURT: 0

total cycles in shared programs: 338366148 -> 338366124 (<.01%)
cycles in affected programs: 124048 -> 124024 (-0.02%)
helped: 2
HURT: 0

No changes on any earlier Intel platforms.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 1d36af9338 nir/algebraic: Generalize some and-of-shift-right patterns [v2]
Generalizes some of the patterns from 76289fbfa8 and 905ff86198.  In
particular, some of the soft-fp64 code generates (a & 0x7fffffff) << 1
when constant 0.0 is compared (flt or feq).

v2: Reduce the set of added patterns to those that actually help
something.  This reduces the size of the state transition tables by
about 29k.  Suggested by Jason.  Remove the existing patterns that this
commit subsumes.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake
total instructions in shared programs: 903171 -> 903108 (<.01%)
instructions in affected programs: 635903 -> 635840 (<.01%)
helped: 25
HURT: 11
helped stats (abs) min: 1 max: 16 x̄: 5.04 x̃: 3
helped stats (rel) min: <.01% max: 0.15% x̄: 0.04% x̃: 0.03%
HURT stats (abs)   min: 2 max: 14 x̄: 5.73 x̃: 5
HURT stats (rel)   min: <.01% max: 0.11% x̄: 0.04% x̃: 0.02%
95% mean confidence interval for instructions value: -3.91 0.41
95% mean confidence interval for instructions %-change: -0.03% <.01%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 7059527 -> 7059681 (<.01%)
cycles in affected programs: 5249401 -> 5249555 (<.01%)
helped: 41
HURT: 9
helped stats (abs) min: 2 max: 76 x̄: 11.90 x̃: 10
helped stats (rel) min: <.01% max: 11.86% x̄: 0.99% x̃: 0.01%
HURT stats (abs)   min: 2 max: 380 x̄: 71.33 x̃: 12
HURT stats (rel)   min: <.01% max: 0.22% x̄: 0.04% x̃: 0.01%
95% mean confidence interval for cycles value: -14.93 21.09
95% mean confidence interval for cycles %-change: -1.40% -0.20%
Inconclusive result (value mean confidence interval includes 0).

Ice Lake
total instructions in shared programs: 895506 -> 895384 (-0.01%)
instructions in affected programs: 658800 -> 658678 (-0.02%)
helped: 37
HURT: 0
helped stats (abs) min: 2 max: 8 x̄: 3.30 x̃: 2
helped stats (rel) min: <.01% max: 0.03% x̄: 0.02% x̃: 0.02%
95% mean confidence interval for instructions value: -4.00 -2.59
95% mean confidence interval for instructions %-change: -0.02% -0.02%
Instructions are helped.

total cycles in shared programs: 7092748 -> 7092224 (<.01%)
cycles in affected programs: 5272008 -> 5271484 (<.01%)
helped: 36
HURT: 14
helped stats (abs) min: 2 max: 440 x̄: 21.67 x̃: 8
helped stats (rel) min: <.01% max: 11.86% x̄: 1.12% x̃: 0.02%
HURT stats (abs)   min: 2 max: 122 x̄: 18.29 x̃: 6
HURT stats (rel)   min: <.01% max: 0.07% x̄: 0.01% x̃: <.01%
95% mean confidence interval for cycles value: -29.24 8.28
95% mean confidence interval for cycles %-change: -1.40% -0.21%
Inconclusive result (value mean confidence interval includes 0).

Regular shader-db results:

All Haswell+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611489 -> 17611408 (<.01%)
instructions in affected programs: 21188 -> 21107 (-0.38%)
helped: 23
HURT: 1
helped stats (abs) min: 1 max: 16 x̄: 3.78 x̃: 3
helped stats (rel) min: 0.03% max: 5.82% x̄: 1.13% x̃: 0.85%
HURT stats (abs)   min: 6 max: 6 x̄: 6.00 x̃: 6
HURT stats (rel)   min: 0.60% max: 0.60% x̄: 0.60% x̃: 0.60%
95% mean confidence interval for instructions value: -5.27 -1.48
95% mean confidence interval for instructions %-change: -1.70% -0.42%
Instructions are helped.

total cycles in shared programs: 338418502 -> 338366148 (-0.02%)
cycles in affected programs: 2289052 -> 2236698 (-2.29%)
helped: 18
HURT: 3
helped stats (abs) min: 4 max: 18000 x̄: 2909.67 x̃: 38
helped stats (rel) min: 0.09% max: 4.07% x̄: 0.96% x̃: 0.43%
HURT stats (abs)   min: 2 max: 14 x̄: 6.67 x̃: 4
HURT stats (rel)   min: 0.22% max: 1.13% x̄: 0.66% x̃: 0.64%
95% mean confidence interval for cycles value: -5204.00 217.91
95% mean confidence interval for cycles %-change: -1.31% -0.14%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 11875617 -> 11875615 (<.01%)
instructions in affected programs: 1339 -> 1337 (-0.15%)
helped: 2
HURT: 0

No changes on any earlier Intel platforms.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick d6d63aec18 nir/algebraic: optimize ior(ine(a, 0), ine(b, 0)) to ine(ior(a, b), 0)
Like 70f9e2589e.  Also scrub the unnecessary size qualifier in both
replacement patterns.

This occurs in a handful of places in the soft-fp64 code, and that is
the primary reason for the change.

Perhaps the patterns that generate umin should be conditioned on
something, but I'm not sure what.  lower_bitops might cover the cases
that matter, but it seems ugly.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 936505 -> 933388 (-0.33%)
instructions in affected programs: 925719 -> 922602 (-0.34%)
helped: 154
HURT: 1
helped stats (abs) min: 1 max: 211 x̄: 35.45 x̃: 16
helped stats (rel) min: 0.34% max: 9.30% x̄: 2.28% x̃: 0.96%
HURT stats (abs)   min: 2342 max: 2342 x̄: 2342.00 x̃: 2342
HURT stats (rel)   min: 2.28% max: 2.28% x̄: 2.28% x̃: 2.28%
95% mean confidence interval for instructions value: -51.21 10.99
95% mean confidence interval for instructions %-change: -2.61% -1.89%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 7323502 -> 7306184 (-0.24%)
cycles in affected programs: 7220376 -> 7203058 (-0.24%)
helped: 126
HURT: 1
helped stats (abs) min: 2 max: 946 x̄: 159.10 x̃: 95
helped stats (rel) min: 0.01% max: 9.62% x̄: 0.80% x̃: 0.37%
HURT stats (abs)   min: 2728 max: 2728 x̄: 2728.00 x̃: 2728
HURT stats (rel)   min: 0.37% max: 0.37% x̄: 0.37% x̃: 0.37%
95% mean confidence interval for cycles value: -192.07 -80.66
95% mean confidence interval for cycles %-change: -1.07% -0.51%
Cycles are helped.

total spills in shared programs: 635 -> 817 (28.66%)
spills in affected programs: 635 -> 817 (28.66%)
helped: 0
HURT: 3

total fills in shared programs: 2065 -> 2438 (18.06%)
fills in affected programs: 2019 -> 2392 (18.47%)
helped: 0
HURT: 2

Regular shader-db results:

All Haswell+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611506 -> 17611489 (<.01%)
instructions in affected programs: 33442 -> 33425 (-0.05%)
helped: 32
HURT: 6
helped stats (abs) min: 1 max: 6 x̄: 1.69 x̃: 1
helped stats (rel) min: 0.08% max: 1.90% x̄: 0.27% x̃: 0.11%
HURT stats (abs)   min: 1 max: 15 x̄: 6.17 x̃: 5
HURT stats (rel)   min: 0.09% max: 1.50% x̄: 0.65% x̃: 0.55%
95% mean confidence interval for instructions value: -1.70 0.80
95% mean confidence interval for instructions %-change: -0.30% 0.05%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 338419218 -> 338418502 (<.01%)
cycles in affected programs: 385795 -> 385079 (-0.19%)
helped: 42
HURT: 3
helped stats (abs) min: 2 max: 192 x̄: 24.57 x̃: 16
helped stats (rel) min: 0.04% max: 2.09% x̄: 0.33% x̃: 0.22%
HURT stats (abs)   min: 64 max: 164 x̄: 105.33 x̃: 88
HURT stats (rel)   min: 0.77% max: 1.58% x̄: 1.09% x̃: 0.93%
95% mean confidence interval for cycles value: -29.76 -2.06
95% mean confidence interval for cycles %-change: -0.40% -0.07%
Cycles are helped.

Ivy Bridge and Sandy Bridge had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11875620 -> 11875617 (<.01%)
instructions in affected programs: 421 -> 418 (-0.71%)
helped: 2
HURT: 0

total cycles in shared programs: 178245336 -> 178245326 (<.01%)
cycles in affected programs: 3425 -> 3415 (-0.29%)
helped: 2
HURT: 0

No changes on Gen4 or Gen5.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick 88eb8f190b nir/algebraic: Simplify logic to detect sign of an integer
This occurs in a handful of places in the soft-fp64 code, and that is
the primary reason for the change.

v2: Fix a typo in a comment.  Noticed by Matt.  Copy the correct fp64
shader-db results to the commit message.  I realized that I used
accidentally used the results from the next commit.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 906235 -> 906149 (<.01%)
instructions in affected programs: 353966 -> 353880 (-0.02%)
helped: 31
HURT: 2
helped stats (abs) min: 1 max: 8 x̄: 3.03 x̃: 3
helped stats (rel) min: 0.01% max: 1.59% x̄: 0.10% x̃: 0.04%
HURT stats (abs)   min: 3 max: 5 x̄: 4.00 x̃: 4
HURT stats (rel)   min: 0.02% max: 0.02% x̄: 0.02% x̃: 0.02%
95% mean confidence interval for instructions value: -3.51 -1.70
95% mean confidence interval for instructions %-change: -0.19% <.01%
Inconclusive result (%-change mean confidence interval includes 0).

total cycles in shared programs: 7076552 -> 7076173 (<.01%)
cycles in affected programs: 2878361 -> 2877982 (-0.01%)
helped: 37
HURT: 2
helped stats (abs) min: 2 max: 48 x̄: 10.81 x̃: 6
helped stats (rel) min: <.01% max: 2.17% x̄: 0.47% x̃: 0.01%
HURT stats (abs)   min: 1 max: 20 x̄: 10.50 x̃: 10
HURT stats (rel)   min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -13.96 -5.48
95% mean confidence interval for cycles %-change: -0.72% -0.16%
Cycles are helped.

total fills in shared programs: 2064 -> 2065 (0.05%)
fills in affected programs: 45 -> 46 (2.22%)
helped: 0
HURT: 1

Regular shader-db results:

All Gen7+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611530 -> 17611506 (<.01%)
instructions in affected programs: 5934 -> 5910 (-0.40%)
helped: 10
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 2.40 x̃: 2
helped stats (rel) min: 0.14% max: 1.24% x̄: 0.47% x̃: 0.34%
95% mean confidence interval for instructions value: -3.53 -1.27
95% mean confidence interval for instructions %-change: -0.78% -0.17%
Instructions are helped.

total cycles in shared programs: 338419178 -> 338419218 (<.01%)
cycles in affected programs: 19244 -> 19284 (0.21%)
helped: 4
HURT: 2
helped stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.05% max: 0.11% x̄: 0.08% x̃: 0.08%
HURT stats (abs)   min: 26 max: 26 x̄: 26.00 x̃: 26
HURT stats (rel)   min: 1.20% max: 1.20% x̄: 1.20% x̃: 1.20%
95% mean confidence interval for cycles value: -9.08 22.41
95% mean confidence interval for cycles %-change: -0.35% 1.04%
Inconclusive result (value mean confidence interval includes 0).

No changes on any earlier Intel platform.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Pierre-Eric Pelloux-Prayer e7f3a8d695 st/mesa: disallow deferred flush if there are multiple contexts
u_threaded can hang in these situation, with one context waiting on a
deferred fence from the other context.
But the other context isn't flushing its pending work (because it's waiting
for more work to pushed) so everything is stuck.

Fixes: d17b35e671 ("gallium: add PIPE_FLUSH_DEFERRED")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1430
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4213>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4213>
2020-03-18 20:01:57 +00:00
Chad Versace 6ee971c882 anv: Use isl_drm_modifier_get_default_aux_state()
Use it in anv_layout_to_aux_state().

Refactor only. No change in behavior.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3881>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3881>
2020-03-18 11:39:33 -07:00
Jason Ekstrand 0905d5a14a intel/isl: Don't align linear images to 64K on Gen12+
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4048>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4048>
2020-03-18 17:33:28 +00:00
Samuel Pitoiset 94e37859a9 radv: fix random depth range unrestricted failures due to a cache issue
The shader module name is used to compute the pipeline key. The
driver used to load the wrong pipelines because the shader names
were similar.

This should fix random failures of
dEQP-VK.pipeline.depth_range_unrestricted.*

Fixes: f11ea22666 ("radv: fix a performance regression with graphics depth/stencil clears")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>
2020-03-18 11:36:24 +00:00
Hyunjun Ko a6625b15a4 turnip: Do gathering xfb info after nir_remove_dead_variables
So we could align stream outputs correctly even if unused in/outs are
removed.

Fixes:
  dEQP-VK.transform_feedback.fuzz.random_vertex.scalar_types.*
  dEQP-VK.transform_feedback.fuzz.random_vertex.vector_types.*

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4207>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4207>
2020-03-18 09:47:04 +00:00
Hyunjun Ko c11a2bc202 turnip: Fix wrong assignment of xfb output's offset.
Should be divided by 4 so we could calculate the offset correctly in
tu6_setup_streamout.

Fixes: 2a1d6b81ed
Related: 374406a7c4

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4207>
2020-03-18 09:47:04 +00:00
Lionel Landwerlin 25a54554b3 intel/decoder: don't consider header fields past dword0
v2: use ULL

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4134>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4134>
2020-03-18 09:19:53 +00:00
Vasily Khoruzhick 0c41937440 lima: decode depth/stencil write bits in RSW
Now that we know the bits that are responsible for enabling depth/stencil
writes in shader we can decode them properly.

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197>
2020-03-18 08:36:17 +00:00
Icenowy Zheng 9205762cae lima: implement zsbuf reload
Fragment shader can write depth and stencil if we set necessary flags
in RSW. In addition to that we need to use special format for Z24S8.
Original format is apparently Z24X8 since we can't sample stencil in GLES2.
This new format also seems to use several components for storing depth
since we saw r != g != b when sampling with this format.

[vasily: - initialize clear->depth to 0xffffff if we reload depth, just
           like blob does. Reloading doesn't work otherwise
         - use single bitmap for reload type]

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197>
2020-03-18 08:36:17 +00:00
Vasily Khoruzhick dbceabed72 lima: disable Z16 format
Unfortunately we don't know how to reload Z16 buffers yet and blob
is using Z24 for dEQP tests that need depth reload.

Reviewed-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197>
2020-03-18 08:36:17 +00:00
Eric Anholt 8b8af6d398 gallium/util: Switch util_float_to_half to _mesa_float_to_half()'s impl.
The util_float_to_half() implementation was much smaller, but when trying
to switch _mesa_float_to_half to it, many testcases
(dEQP-VK.spirv_assembly.instruction.graphics.opquantize.*,
piglit.spec.arb_shading_language_packing.*packhalf2x16) start failing on
Intel.  Replace the broken impl so that people don't have to debug it
later.

Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3699>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3699>
2020-03-17 22:28:12 +00:00
Bas Nieuwenhuizen 8e4e2cedcf amd/llvm: Fix divergent descriptor regressions with radeonsi.
piglit/bin/arb_bindless_texture-limit -auto -fbo:
  Needed to deal with non-NULL dynamic_index without deref in tex instructions.

piglit/bin/shader_runner tests/spec/arb_bindless_texture/execution/images/multiple-resident-images-reading.shader_test -auto:
  Need to deal with non-deref images in enter_waterfall_imae.

Fixes: b83c9aca4a "amd/llvm: Fix divergent descriptor indexing. (v3)"
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>
2020-03-17 22:53:16 +01:00
Dave Airlie 040ce9a1b3 gallium: fix build with latest meson and gcc10
In Fedora 32 build was failing with meson-0.53.2-1.git88e40c7.fc32
and gcc-10.0.1-0.9.fc32.x86_64.

Worked with meson-0.53.1-1 and same gcc.

/usr/bin/ld: src/gallium/state_trackers/dri/libdri.a(dri2.c.o): in function `dri2_interop_export_object':
/home/airlied/devel/mesa/mesa/build/../src/gallium/state_trackers/dri/dri2.c:1813: undefined reference to `st_finalize_texture'
/usr/bin/ld: src/gallium/state_trackers/dri/libdri.a(dri_screen.c.o): in function `dri_init_screen_helper':
/home/airlied/devel/mesa/mesa/build/../src/gallium/state_trackers/dri/dri_screen.c:580: undefined reference to `st_gl_api_create'

Moving this around seems to fix it.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4220>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4220>
2020-03-17 21:14:38 +00:00
Marek Olšák 8dc5e174c7 ac: don't set old denormals flags with LLVM >= 11
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
2020-03-17 20:47:48 +00:00
Marek Olšák 63a5051ea6 ac: set new LLVM denormal flags
See: https://reviews.llvm.org/D71358

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
2020-03-17 20:47:48 +00:00
Marek Olšák 56cc10bd27 ac: unify denorm setting enforcement
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
2020-03-17 20:47:48 +00:00
Marek Olšák e4959add2f gallium/u_vbuf: simplify the first if statement in u_vbuf_upload_buffers
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4153>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4153>
2020-03-17 19:13:57 +00:00
Marek Olšák 99a29a20d2 gallium/u_threaded: don't sync the thread for all unsychronized mappings
This was missing for the READ case. This improves glBegin/End performance.
(vbo maps with WRITE | READ | UNSYCHRONIZED)

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4153>
2020-03-17 19:13:57 +00:00