nir/algebraic: Convert some f2u to f2i

Section 5.4.1 (Conversion and Scalar Constructors) of the GLSL 4.60 spec
says:

    It is undefined to convert a negative floating-point value to an
    uint.

Assuming that (uint)some_float behaves like (uint)(int)some_float allows
some optimizations in the i965 backend to proceed.

This basically undoes the small amount of damage done by "intel/compiler:
Avoid propagating inequality cmods if types are different".

v2: Replicate part of the commit message as a comment in the code.
Suggested by Jason.

shader-db results comparing *before* "intel/compiler: Avoid propagating
inequality cmods if types are different" and after this commit:

Skylake
total cycles in shared programs: 383007996 -> 383007896 (<.01%)
cycles in affected programs: 85208 -> 85108 (-0.12%)
helped: 13
HURT: 8
helped stats (abs) min: 2 max: 26 x̄: 10.77 x̃: 6
helped stats (rel) min: 0.09% max: 0.65% x̄: 0.28% x̃: 0.14%
HURT stats (abs)   min: 2 max: 12 x̄: 5.00 x̃: 3
HURT stats (rel)   min: 0.04% max: 0.32% x̄: 0.12% x̃: 0.07%
95% mean confidence interval for cycles value: -9.31 -0.21
95% mean confidence interval for cycles %-change: -0.24% <.01%
Cycles are helped.

Broadwell
total cycles in shared programs: 415251194 -> 415251370 (<.01%)
cycles in affected programs: 83750 -> 83926 (0.21%)
helped: 7
HURT: 13
helped stats (abs) min: 10 max: 12 x̄: 11.43 x̃: 12
helped stats (rel) min: 0.30% max: 0.30% x̄: 0.30% x̃: 0.30%
HURT stats (abs)   min: 2 max: 36 x̄: 19.69 x̃: 22
HURT stats (rel)   min: 0.05% max: 0.89% x̄: 0.44% x̃: 0.47%
95% mean confidence interval for cycles value: 0.76 16.84
95% mean confidence interval for cycles %-change: <.01% 0.37%
Inconclusive result (%-change mean confidence interval includes 0).

Haswell
total instructions in shared programs: 13823885 -> 13823886 (<.01%)
instructions in affected programs: 2249 -> 2250 (0.04%)
helped: 0
HURT: 1

total cycles in shared programs: 390094243 -> 390094001 (<.01%)
cycles in affected programs: 85640 -> 85398 (-0.28%)
helped: 15
HURT: 6
helped stats (abs) min: 4 max: 26 x̄: 18.53 x̃: 18
helped stats (rel) min: 0.09% max: 0.66% x̄: 0.47% x̃: 0.42%
HURT stats (abs)   min: 2 max: 14 x̄: 6.00 x̃: 2
HURT stats (rel)   min: 0.04% max: 0.37% x̄: 0.15% x̃: 0.04%
95% mean confidence interval for cycles value: -17.36 -5.69
95% mean confidence interval for cycles %-change: -0.44% -0.14%
Cycles are helped.

Ivy Bridge
total cycles in shared programs: 180986448 -> 180986552 (<.01%)
cycles in affected programs: 34835 -> 34939 (0.30%)
helped: 0
HURT: 10
HURT stats (abs)   min: 2 max: 18 x̄: 10.40 x̃: 10
HURT stats (rel)   min: 0.06% max: 0.36% x̄: 0.28% x̃: 0.30%
95% mean confidence interval for cycles value: 4.67 16.13
95% mean confidence interval for cycles %-change: 0.20% 0.35%
Cycles are HURT.

Sandy Bridge
total cycles in shared programs: 154603969 -> 154603970 (<.01%)
cycles in affected programs: 171514 -> 171515 (<.01%)
helped: 25
HURT: 14
helped stats (abs) min: 1 max: 4 x̄: 1.80 x̃: 1
helped stats (rel) min: 0.02% max: 0.10% x̄: 0.04% x̃: 0.04%
HURT stats (abs)   min: 1 max: 8 x̄: 3.29 x̃: 3
HURT stats (rel)   min: 0.03% max: 0.28% x̄: 0.10% x̃: 0.11%
95% mean confidence interval for cycles value: -0.91 0.96
95% mean confidence interval for cycles %-change: -0.02% 0.04%
Inconclusive result (value mean confidence interval includes 0).

No changes on Iron Lake or GM45.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
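
The assumption in the message can be illustrated with a small Python model. This is only a sketch, not Mesa code: `f2u`, `f2i`, `ilt`, and `ige` are hypothetical stand-ins for the NIR opcodes of the same names, integers are modelled as 32-bit patterns, and the check covers only inputs in [0, 2**31), where this model defines both conversions.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(bits):
    """Reinterpret a 32-bit pattern as a two's-complement signed integer."""
    return bits - (1 << 32) if bits & 0x80000000 else bits


def f2u(x):
    """Model of float -> uint: truncate toward zero (assumed domain only)."""
    if not 0.0 <= x < 2.0 ** 32:
        raise ValueError("f2u is modelled only for 0 <= x < 2**32")
    return int(x) & MASK32


def f2i(x):
    """Model of float -> int: truncate toward zero (assumed domain only)."""
    if not -(2.0 ** 31) <= x < 2.0 ** 31:
        raise ValueError("f2i is modelled only for -2**31 <= x < 2**31")
    return int(x) & MASK32


def ilt(a, b):
    """Signed less-than on 32-bit patterns."""
    return to_signed32(a) < to_signed32(b)


def ige(a, b):
    """Signed greater-or-equal on 32-bit patterns."""
    return to_signed32(a) >= to_signed32(b)


def comparisons_agree(x, b):
    """True if all four rewritten comparisons give the same answer."""
    u, i = f2u(x), f2i(x)
    return (ilt(u, b) == ilt(i, b) and ilt(b, u) == ilt(b, i)
            and ige(u, b) == ige(i, b) and ige(b, u) == ige(b, i))


# Non-negative floats below 2**31, compared against assorted bit patterns.
samples = [0.0, 0.5, 1.0, 41.9, 65536.0, 2147483520.0]
bounds = [0, 1, 42, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
all_agree = all(comparisons_agree(x, b) for x in samples for b in bounds)
```

Within that range both conversions produce the same bit pattern, so a signed comparison cannot tell them apart. Negative inputs are exactly the case the spec leaves undefined for `f2u`, which is why the model rejects them instead of picking a result.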
parent ac21dd4aee
commit ad05920258
@@ -565,6 +565,19 @@ optimizations = [
    (('~f2u32', ('i2f', 'a@32')), a),
    (('~f2u32', ('u2f', 'a@32')), a),
 
+   # Section 5.4.1 (Conversion and Scalar Constructors) of the GLSL 4.60 spec
+   # says:
+   #
+   #    It is undefined to convert a negative floating-point value to an
+   #    uint.
+   #
+   # Assuming that (uint)some_float behaves like (uint)(int)some_float allows
+   # some optimizations in the i965 backend to proceed.
+   (('ige', ('f2u', a), b), ('ige', ('f2i', a), b)),
+   (('ige', b, ('f2u', a)), ('ige', b, ('f2i', a))),
+   (('ilt', ('f2u', a), b), ('ilt', ('f2i', a), b)),
+   (('ilt', b, ('f2u', a)), ('ilt', b, ('f2i', a))),
+
    # Packing and then unpacking does nothing
    (('unpack_64_2x32_split_x', ('pack_64_2x32_split', a, b)), a),
    (('unpack_64_2x32_split_y', ('pack_64_2x32_split', a, b)), b),
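
To make the shape of the four new rules concrete, here is a minimal sketch of how (search, replace) tuples like these could be applied to an expression tree. It is a toy matcher written for illustration, not NIR's actual algebraic pass: `match`, `substitute`, and `rewrite` are hypothetical helpers, and variables are plain strings instead of the `a` and `b` names used in the real table.

```python
# (search, replace) pairs mirroring the four rules added by the diff.
RULES = [
    (('ige', ('f2u', 'a'), 'b'), ('ige', ('f2i', 'a'), 'b')),
    (('ige', 'b', ('f2u', 'a')), ('ige', 'b', ('f2i', 'a'))),
    (('ilt', ('f2u', 'a'), 'b'), ('ilt', ('f2i', 'a'), 'b')),
    (('ilt', 'b', ('f2u', 'a')), ('ilt', 'b', ('f2i', 'a'))),
]


def match(pattern, expr, env):
    """Return variable bindings if pattern matches expr, else None."""
    if isinstance(pattern, str):  # a variable: bind it or check the binding
        if pattern in env:
            return env if env[pattern] == expr else None
        return {**env, pattern: expr}
    if (not isinstance(expr, tuple) or len(expr) != len(pattern)
            or expr[0] != pattern[0]):  # opcode or arity mismatch
        return None
    for sub_pattern, sub_expr in zip(pattern[1:], expr[1:]):
        env = match(sub_pattern, sub_expr, env)
        if env is None:
            return None
    return env


def substitute(template, env):
    """Build the replacement expression from the bound variables."""
    if isinstance(template, str):
        return env[template]
    return (template[0],) + tuple(substitute(t, env) for t in template[1:])


def rewrite(expr):
    """Rewrite operands first, then apply rules at this node until none match."""
    if not isinstance(expr, tuple):
        return expr
    expr = (expr[0],) + tuple(rewrite(operand) for operand in expr[1:])
    changed = True
    while changed:
        changed = False
        for search, replace in RULES:
            env = match(search, expr, {})
            if env is not None:
                expr = substitute(replace, env)
                changed = True
                break
    return expr


example = rewrite(('ilt', ('f2u', 'x'), 'y'))
```

Only signed comparisons (`ige`, `ilt`) with an `f2u` operand are rewritten; an expression such as `('ult', ('f2u', 'x'), 'y')` passes through this sketch unchanged, matching the scope of the rules in the diff.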