Commit Graph

121985 Commits

Author SHA1 Message Date
Alyssa Rosenzweig b033189dd7 pan/bit: Wire through I/O
We'd like to wire in attributes and uniforms as inputs and look at the
varying as output for automatic testing on-device, building up a test
framework for us.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4396>
2020-04-01 02:25:05 +00:00
Alyssa Rosenzweig b26214e907 pan/bit: Add `run` mode to the cmdline
This emulates the functionality of shader_runner (built for kbase) using
the bifrost testing infrastructure so it runs on mainline. Ideally this
will let us test shaders from the assembler.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4396>
2020-04-01 02:25:05 +00:00
Jose Fonseca cb56d5d9f8 appveyor: Remove Meson job.
Appveyor Meson fails misteriously some times, e.g.,
- https://ci.appveyor.com/project/mesa3d/mesa/builds/31780753/job/w8b28iahboxq4na2
- https://ci.appveyor.com/project/mesa3d/mesa/builds/31857376
and now that we have msvc coverage on gitlab ci this is no longer necessary.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4392>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4392>
2020-04-01 01:13:21 +00:00
Rob Clark 59754409cc freedreno/log: fix build error
It seems some versions of gcc are less clever about const initializers:

```
../src/gallium/drivers/freedreno/freedreno_log.c:58:33: error: initializer element is not constant
 const unsigned msgs_per_chunk = bo_size / sizeof(uint64_t);
                                 ^~~~~~~
```

See https://gitlab.freedesktop.org/mesa/mesa/-/issues/2713

Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4390>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4390>
2020-04-01 00:51:09 +00:00
Ian Romanick b097e326b8 nir/algebraic: Remove a redundant fabs pattern
Made redundant by 5544b2cbbd ("nir/algebraic: Use value range analysis
to eliminate useless unary ops").

No shader-db changes on any Intel platform.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Ian Romanick af1bc7e0c7 nir/algebraic: Use value range analysis to convert fmax to fsat
This is conceptually similar to the 1-fsat(a) <=> fsat(1-a) rearragement
done in:

3b74790941 ("nir/algebraic: Recognize open-coded flrp(a, b, fsat(c))")

2d259713b7 ("nir/algebraic: Commute 1-fsat(a) to fsat(1-a) for all
non-fmul instructions").

Note: this helps the Aztex Ruins shader that was hurt for spills and
fills on Braodwell in the previous commit, but it does not fix the
spills or fills. :(

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14528985 -> 14526116 (-0.02%)
instructions in affected programs: 477300 -> 474431 (-0.60%)
helped: 2332
HURT: 0
helped stats (abs) min: 1 max: 18 x̄: 1.23 x̃: 1
helped stats (rel) min: 0.07% max: 8.89% x̄: 0.88% x̃: 0.64%
95% mean confidence interval for instructions value: -1.27 -1.19
95% mean confidence interval for instructions %-change: -0.92% -0.85%
Instructions are helped.

total cycles in shared programs: 203723684 -> 203692984 (-0.02%)
cycles in affected programs: 4878847 -> 4848147 (-0.63%)
helped: 1764
HURT: 324
helped stats (abs) min: 1 max: 706 x̄: 22.94 x̃: 17
helped stats (rel) min: <.01% max: 17.75% x̄: 1.94% x̃: 1.66%
HURT stats (abs)   min: 1 max: 400 x̄: 30.15 x̃: 10
HURT stats (rel)   min: <.01% max: 17.76% x̄: 1.91% x̃: 0.69%
95% mean confidence interval for cycles value: -16.55 -12.86
95% mean confidence interval for cycles %-change: -1.44% -1.24%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Ian Romanick 62795475e8 nir/algebraic: Distribute source modifiers into instructions
There are three main classes of cases that are helped by this change:

1. When the negation is applied to a value being type converted (e.g.,
   float(-x)).  This could possibly also be handled with more clever
   code generation.

2. When the negation is applied to a phi node source (e.g., x = -(...);
   at the end of a basic block).  This was the original case that caught
   my attention while looking at shader-db dumps.

3. When the negation is applied to the source of an instruction that
   cannot have source modifiers.  This includes texture instructions and
   math box instructions on pre-Gen7 platforms (see more details below).

In many these cases the negation can be propagated into the instructions
that generate the value (e.g., -(a*b) = (-a)*b).

In addition to the operations implemtned in this patch, I also tried:

 - frcp - Helped 6 or fewer shaders on Gen7+, and hurt just as many on
   pre-Gen7.  On Gen6 and earlier, frcp is a math box instruction, and
   math box instructions cannot have source modifiers.

   I suspect this is why so many more shaders are helped on Gen6 than on
   Gen5 or Gen7.  Gen6 supports OpenGL 3.3, so a lot more shaders
   compile on it.  A lot of these shaders may have things like cos(-x)
   or rcp(-x) that could result in an explicit negation instruction.

 - bcsel - Hurt a few shaders with none helped.  bcsel operates on
   integer sources, so the fabs or fneg cannot be a source modifier in
   the bcsel itself.

 - Integer instructions - No changes on any Intel platform.

Some notes about the shader-db results below.

 - On Tiger Lake, a single Deus Ex fragment shader is hurt for both
   spills and fills.

 - On Haswell, a different Deus Ex fragment shader is hurt for both
   spills and fills.

 - On GM45, the "LOST: 1" and "GAINED: 1" is a single Left4Dead 2
   (very high graphics settings, lol) fragment shader that upgrades
   from SIMD8 to SIMD16.

v2: Add support for fsign.  Add some patterns that remove redundant
negations and redundant absolute value rather than trying to push them
down the tree.

Tiger Lake
total instructions in shared programs: 17611333 -> 17586465 (-0.14%)
instructions in affected programs: 3033734 -> 3008866 (-0.82%)
helped: 10310
HURT: 632
helped stats (abs) min: 1 max: 35 x̄: 2.61 x̃: 1
helped stats (rel) min: 0.04% max: 16.67% x̄: 1.43% x̃: 1.01%
HURT stats (abs)   min: 1 max: 47 x̄: 3.21 x̃: 2
HURT stats (rel)   min: 0.04% max: 5.08% x̄: 0.88% x̃: 0.63%
95% mean confidence interval for instructions value: -2.33 -2.21
95% mean confidence interval for instructions %-change: -1.32% -1.27%
Instructions are helped.

total cycles in shared programs: 338365223 -> 338262252 (-0.03%)
cycles in affected programs: 125291811 -> 125188840 (-0.08%)
helped: 5224
HURT: 2031
helped stats (abs) min: 1 max: 5670 x̄: 46.73 x̃: 12
helped stats (rel) min: <.01% max: 34.78% x̄: 1.91% x̃: 0.97%
HURT stats (abs)   min: 1 max: 2882 x̄: 69.50 x̃: 14
HURT stats (rel)   min: <.01% max: 44.93% x̄: 2.35% x̃: 0.74%
95% mean confidence interval for cycles value: -18.71 -9.68
95% mean confidence interval for cycles %-change: -0.80% -0.63%
Cycles are helped.

total spills in shared programs: 8942 -> 8946 (0.04%)
spills in affected programs: 8 -> 12 (50.00%)
helped: 0
HURT: 1

total fills in shared programs: 9399 -> 9401 (0.02%)
fills in affected programs: 21 -> 23 (9.52%)
helped: 0
HURT: 1

Ice Lake
total instructions in shared programs: 16124348 -> 16102258 (-0.14%)
instructions in affected programs: 2830928 -> 2808838 (-0.78%)
helped: 11294
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.96 x̃: 1
helped stats (rel) min: 0.07% max: 17.65% x̄: 1.32% x̃: 0.93%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 3.45% max: 4.00% x̄: 3.72% x̃: 3.72%
95% mean confidence interval for instructions value: -1.99 -1.93
95% mean confidence interval for instructions %-change: -1.34% -1.29%
Instructions are helped.

total cycles in shared programs: 335393932 -> 335325794 (-0.02%)
cycles in affected programs: 123834609 -> 123766471 (-0.06%)
helped: 5034
HURT: 2128
helped stats (abs) min: 1 max: 3256 x̄: 43.39 x̃: 11
helped stats (rel) min: <.01% max: 35.79% x̄: 1.98% x̃: 1.00%
HURT stats (abs)   min: 1 max: 2634 x̄: 70.63 x̃: 16
HURT stats (rel)   min: <.01% max: 49.49% x̄: 2.73% x̃: 0.62%
95% mean confidence interval for cycles value: -13.66 -5.37
95% mean confidence interval for cycles %-change: -0.69% -0.48%
Cycles are helped.

LOST:   0
GAINED: 2

Skylake
total instructions in shared programs: 14949240 -> 14927930 (-0.14%)
instructions in affected programs: 2594756 -> 2573446 (-0.82%)
helped: 11000
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.94 x̃: 1
helped stats (rel) min: 0.07% max: 18.75% x̄: 1.39% x̃: 0.94%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 4.76% max: 4.76% x̄: 4.76% x̃: 4.76%
95% mean confidence interval for instructions value: -1.97 -1.91
95% mean confidence interval for instructions %-change: -1.42% -1.37%
Instructions are helped.

total cycles in shared programs: 324829346 -> 324821596 (<.01%)
cycles in affected programs: 121566087 -> 121558337 (<.01%)
helped: 4611
HURT: 2147
helped stats (abs) min: 1 max: 3715 x̄: 33.29 x̃: 10
helped stats (rel) min: <.01% max: 36.08% x̄: 1.94% x̃: 1.00%
HURT stats (abs)   min: 1 max: 2551 x̄: 67.88 x̃: 16
HURT stats (rel)   min: <.01% max: 53.79% x̄: 3.69% x̃: 0.89%
95% mean confidence interval for cycles value: -4.25 1.96
95% mean confidence interval for cycles %-change: -0.28% -0.02%
Inconclusive result (value mean confidence interval includes 0).

Broadwell
total instructions in shared programs: 14971203 -> 14949957 (-0.14%)
instructions in affected programs: 2635699 -> 2614453 (-0.81%)
helped: 10982
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.93 x̃: 1
helped stats (rel) min: 0.07% max: 18.75% x̄: 1.39% x̃: 0.94%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 4.76% max: 4.76% x̄: 4.76% x̃: 4.76%
95% mean confidence interval for instructions value: -1.97 -1.90
95% mean confidence interval for instructions %-change: -1.42% -1.37%
Instructions are helped.

total cycles in shared programs: 336215033 -> 336086458 (-0.04%)
cycles in affected programs: 127383198 -> 127254623 (-0.10%)
helped: 4884
HURT: 1963
helped stats (abs) min: 1 max: 25696 x̄: 51.78 x̃: 12
helped stats (rel) min: <.01% max: 58.28% x̄: 2.00% x̃: 1.05%
HURT stats (abs)   min: 1 max: 3401 x̄: 63.33 x̃: 16
HURT stats (rel)   min: <.01% max: 39.95% x̄: 2.20% x̃: 0.70%
95% mean confidence interval for cycles value: -29.99 -7.57
95% mean confidence interval for cycles %-change: -0.89% -0.71%
Cycles are helped.

total fills in shared programs: 24905 -> 24901 (-0.02%)
fills in affected programs: 117 -> 113 (-3.42%)
helped: 4
HURT: 0

LOST:   0
GAINED: 16

Haswell
total instructions in shared programs: 13148927 -> 13131528 (-0.13%)
instructions in affected programs: 2220941 -> 2203542 (-0.78%)
helped: 8017
HURT: 4
helped stats (abs) min: 1 max: 12 x̄: 2.17 x̃: 1
helped stats (rel) min: 0.07% max: 15.25% x̄: 1.40% x̃: 0.93%
HURT stats (abs)   min: 1 max: 7 x̄: 2.50 x̃: 1
HURT stats (rel)   min: 0.33% max: 4.76% x̄: 2.73% x̃: 2.91%
95% mean confidence interval for instructions value: -2.21 -2.13
95% mean confidence interval for instructions %-change: -1.43% -1.37%
Instructions are helped.

total cycles in shared programs: 321221791 -> 321079870 (-0.04%)
cycles in affected programs: 126886055 -> 126744134 (-0.11%)
helped: 4674
HURT: 1729
helped stats (abs) min: 1 max: 23654 x̄: 56.47 x̃: 16
helped stats (rel) min: <.01% max: 53.22% x̄: 2.13% x̃: 1.05%
HURT stats (abs)   min: 1 max: 3694 x̄: 70.58 x̃: 18
HURT stats (rel)   min: <.01% max: 63.06% x̄: 2.48% x̃: 0.90%
95% mean confidence interval for cycles value: -33.31 -11.02
95% mean confidence interval for cycles %-change: -0.99% -0.78%
Cycles are helped.

total spills in shared programs: 19872 -> 19874 (0.01%)
spills in affected programs: 21 -> 23 (9.52%)
helped: 0
HURT: 1

total fills in shared programs: 20941 -> 20941 (0.00%)
fills in affected programs: 62 -> 62 (0.00%)
helped: 1
HURT: 1

LOST:   0
GAINED: 8

Ivy Bridge
total instructions in shared programs: 11875553 -> 11853839 (-0.18%)
instructions in affected programs: 1553112 -> 1531398 (-1.40%)
helped: 7304
HURT: 3
helped stats (abs) min: 1 max: 16 x̄: 2.97 x̃: 2
helped stats (rel) min: 0.07% max: 15.25% x̄: 1.62% x̃: 1.15%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.05% max: 3.33% x̄: 2.44% x̃: 2.94%
95% mean confidence interval for instructions value: -3.04 -2.90
95% mean confidence interval for instructions %-change: -1.65% -1.59%
Instructions are helped.

total cycles in shared programs: 178246425 -> 178184484 (-0.03%)
cycles in affected programs: 13702146 -> 13640205 (-0.45%)
helped: 4409
HURT: 1566
helped stats (abs) min: 1 max: 531 x̄: 24.52 x̃: 13
helped stats (rel) min: <.01% max: 38.67% x̄: 2.14% x̃: 1.02%
HURT stats (abs)   min: 1 max: 356 x̄: 29.48 x̃: 10
HURT stats (rel)   min: <.01% max: 64.73% x̄: 1.87% x̃: 0.70%
95% mean confidence interval for cycles value: -11.60 -9.14
95% mean confidence interval for cycles %-change: -1.19% -0.99%
Cycles are helped.

LOST:   0
GAINED: 10

Sandy Bridge
total instructions in shared programs: 10695740 -> 10667483 (-0.26%)
instructions in affected programs: 2337607 -> 2309350 (-1.21%)
helped: 10720
HURT: 1
helped stats (abs) min: 1 max: 49 x̄: 2.64 x̃: 2
helped stats (rel) min: 0.07% max: 20.00% x̄: 1.54% x̃: 1.13%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.04% max: 1.04% x̄: 1.04% x̃: 1.04%
95% mean confidence interval for instructions value: -2.69 -2.58
95% mean confidence interval for instructions %-change: -1.57% -1.51%
Instructions are helped.

total cycles in shared programs: 153478839 -> 153416223 (-0.04%)
cycles in affected programs: 22050900 -> 21988284 (-0.28%)
helped: 5342
HURT: 2200
helped stats (abs) min: 1 max: 1020 x̄: 20.34 x̃: 16
helped stats (rel) min: <.01% max: 24.05% x̄: 1.51% x̃: 0.86%
HURT stats (abs)   min: 1 max: 335 x̄: 20.93 x̃: 6
HURT stats (rel)   min: <.01% max: 20.18% x̄: 1.03% x̃: 0.30%
95% mean confidence interval for cycles value: -9.18 -7.42
95% mean confidence interval for cycles %-change: -0.82% -0.71%
Cycles are helped.

Iron Lake
total instructions in shared programs: 8114882 -> 8105574 (-0.11%)
instructions in affected programs: 1232504 -> 1223196 (-0.76%)
helped: 4109
HURT: 2
helped stats (abs) min: 1 max: 6 x̄: 2.27 x̃: 1
helped stats (rel) min: 0.05% max: 8.33% x̄: 0.99% x̃: 0.66%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.94% max: 4.35% x̄: 2.65% x̃: 2.65%
95% mean confidence interval for instructions value: -2.31 -2.21
95% mean confidence interval for instructions %-change: -1.01% -0.96%
Instructions are helped.

total cycles in shared programs: 188504036 -> 188466296 (-0.02%)
cycles in affected programs: 31203798 -> 31166058 (-0.12%)
helped: 3447
HURT: 36
helped stats (abs) min: 2 max: 92 x̄: 11.03 x̃: 8
helped stats (rel) min: <.01% max: 5.41% x̄: 0.21% x̃: 0.13%
HURT stats (abs)   min: 2 max: 30 x̄: 7.33 x̃: 6
HURT stats (rel)   min: 0.01% max: 1.65% x̄: 0.18% x̃: 0.10%
95% mean confidence interval for cycles value: -11.16 -10.51
95% mean confidence interval for cycles %-change: -0.22% -0.20%
Cycles are helped.

LOST:   0
GAINED: 1

GM45
total instructions in shared programs: 4989697 -> 4984531 (-0.10%)
instructions in affected programs: 703952 -> 698786 (-0.73%)
helped: 2493
HURT: 2
helped stats (abs) min: 1 max: 6 x̄: 2.07 x̃: 1
helped stats (rel) min: 0.05% max: 8.33% x̄: 1.03% x̃: 0.66%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.95% max: 4.35% x̄: 2.65% x̃: 2.65%
95% mean confidence interval for instructions value: -2.13 -2.01
95% mean confidence interval for instructions %-change: -1.07% -0.99%
Instructions are helped.

total cycles in shared programs: 128929136 -> 128903886 (-0.02%)
cycles in affected programs: 21583096 -> 21557846 (-0.12%)
helped: 2214
HURT: 17
helped stats (abs) min: 2 max: 92 x̄: 11.44 x̃: 8
helped stats (rel) min: <.01% max: 5.41% x̄: 0.24% x̃: 0.13%
HURT stats (abs)   min: 2 max: 8 x̄: 4.24 x̃: 4
HURT stats (rel)   min: 0.01% max: 1.65% x̄: 0.20% x̃: 0.09%
95% mean confidence interval for cycles value: -11.75 -10.88
95% mean confidence interval for cycles %-change: -0.25% -0.22%
Cycles are helped.

LOST:   1
GAINED: 1

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Ian Romanick c0bdf37c91 nir/algebraic: Change the default cursor location when replacing a unary op
If the expression tree that is being replaced has a unary operation at
its root, set the cursor (location where new instructions are inserted)
at the source instruction instead.

This doesn't do much now because there are very few patterns that have a
unary operation as the root.  Almost all of the patterns that do have a
unary operation as the root have inot.  All of the shaders that are
affected by this commit have expression trees with an inot at the root.

This change prevents some significant, spurious caused by the next
commit.  There is further explanation in the large comment added in
the code.

I also considered a couple other options that may still be worth exploring.

1. Add some mark-up to the search pattern to denote where new
   instructions should be added.  I considered using "@" to denote the
   cursor location.  For example,

    (('fneg', ('fadd@', a, b)), ...)

2. To prevent other kinds of unintended code motion, add the ability to
   name expressions in the search pattern so that they can be reused in
   the replacement.  For example,

   (('bcsel', ('ige', ('find_lsb=b', a), 0), ('find_lsb', a), -1), b),

   An alternative would be to add some kind of CSE at the time of
   inserting the replacements.  Create a new instruction, then check to
   see if it already exists.  That option might be better overall.

Over the years I know Matt has heard me complain, "I added a pattern
that just deleted an instruction, but it added a bunch of spills!"  This
was always in large, complex shaders that are very hard to analyze.  I
always blamed these cases on the scheduler being dumb.  I am now very
suspicious that unintended code motion was the real problem.

All Gen4+ Intel platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611405 -> 17611333 (<.01%)
instructions in affected programs: 18613 -> 18541 (-0.39%)
helped: 41
HURT: 13
helped stats (abs) min: 1 max: 18 x̄: 4.46 x̃: 4
helped stats (rel) min: 0.27% max: 5.68% x̄: 1.29% x̃: 1.34%
HURT stats (abs)   min: 1 max: 20 x̄: 8.54 x̃: 7
HURT stats (rel)   min: 0.30% max: 4.20% x̄: 2.15% x̃: 2.38%
95% mean confidence interval for instructions value: -3.29 0.63
95% mean confidence interval for instructions %-change: -0.95% 0.02%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 338366118 -> 338365223 (<.01%)
cycles in affected programs: 257889 -> 256994 (-0.35%)
helped: 42
HURT: 15
helped stats (abs) min: 2 max: 120 x̄: 39.38 x̃: 34
helped stats (rel) min: 0.04% max: 2.55% x̄: 0.86% x̃: 0.76%
HURT stats (abs)   min: 6 max: 204 x̄: 50.60 x̃: 34
HURT stats (rel)   min: 0.11% max: 4.75% x̄: 1.12% x̃: 0.56%
95% mean confidence interval for cycles value: -30.39 -1.02
95% mean confidence interval for cycles %-change: -0.66% -0.02%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Ian Romanick d2b4f3f137 intel/vec4: Allow late copy propagation on vec4
This change incurs a small amount of hurt now, but it enables a lot of
benefit on vec4 shaders on the next commit.  nir_opt_algebraic_late
converts dph, dot3, etc. to dhp_replicated, dot_replicated3, etc.  In
the process, it introduces extra moves.  If the original NIR contained

        vec1 32 ssa_45 = fdot4 ssa_51, ssa_44
        vec1 32 ssa_46 = fneg ssa_45

nir_opt_algebraic_late will produce

        vec4 32 ssa_18 = fdot_replicated4 ssa_1, ssa_15
        vec1 32 ssa_19 = mov ssa_18.x
        vec1 32 ssa_17 = fneg ssa_19

The algebraic pass added in the next commit can't see through the move
to know that the fneg applies to a fdot_replicated4.

Haswell, Ivy Bridge, and Sandybridge had similar results. (Haswell shown)
total cycles in shared programs: 187077604 -> 187079858 (<.01%)
cycles in affected programs: 350132 -> 352386 (0.64%)
helped: 174
HURT: 194
helped stats (abs) min: 2 max: 124 x̄: 23.60 x̃: 16
helped stats (rel) min: 0.12% max: 15.88% x̄: 4.98% x̃: 3.86%
HURT stats (abs)   min: 2 max: 164 x̄: 32.78 x̃: 16
HURT stats (rel)   min: 0.17% max: 22.82% x̄: 6.46% x̃: 0.86%
95% mean confidence interval for cycles value: 2.04 10.21
95% mean confidence interval for cycles %-change: 0.17% 1.93%
Cycles are HURT.

No shader-db changes on any other Intel platform.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Timothy Arceri 0f4a81430e nir: fix crash in varying packing on interface mismatch
For example when the outputs are scalars but the inputs are struct
members.

Fixes: 26aa460940 ("nir: rewrite varying component packing")

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4351>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4351>
2020-03-31 23:43:31 +00:00
Eric Anholt 31011c7a39 freedreno/turnip: Use the NIR info to decide if we need helper invocations.
We had an approximation that was assuming any ddx or tex instruction
needed helper invocations, but that's not true for texelFetch() or
textureSize().  It also meant that we were setting PIXLOD on vertex and
compute shaders doing texturing, which doesn't really make sense.

shader-db (with a hack to log pixlod):
total pixlod in shared programs: 582 -> 573 (-1.55%)

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2681
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4308>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4308>
2020-03-31 22:29:22 +00:00
Eric Anholt 974b9c57c1 freedreno: Drop an unnecessary include marked "this should go away"
It came in with the initial import, and doesn't seem to be necessary any
more.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4289>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4289>
2020-03-31 21:20:11 +00:00
Rob Clark 127fa5d00c freedreno/ir3: fix android build
Fixes: e5339fe4a4 ("Move compiler.h and imports.h/c from src/mesa/main into src/util")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4381>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4381>
2020-03-31 18:46:04 +00:00
Rob Clark ae7da1a017 util: move ALIGN/ROUND_DOWN_TO to u_math.h
These are less mesa specific than the rest of macros.h, and would be
nice to use outside of mesa.  Prep for next patch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4381>
2020-03-31 18:46:04 +00:00
Jason Ekstrand 7a53e67816 spirv: Implement OpCopyObject and OpCopyLogical as blind copies
Because the types etc. are required to logically match, we can just
copy-propagate the guts of the vtn_value.  This was causing issues with
some new CTS tests that are doing an OpCopyObject of a sampler which is
a special-cased type in spirv_to_nir.  Of course, this is only a partial
solution.  Ideally, we've got a bit of work to do to make all the
composite stuff able to handle all types including images, sampler, and
combined image/samplers but this gets some CTS tests passing.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4375>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4375>
2020-03-31 17:55:30 +00:00
Lionel Landwerlin 88c046a6d3 isl: don't warn in physical extent calculation for yuv formats
Those format have correct descriptions already with the exception of
the planar format. In that case we introduce an assert.

This fine because we don't use the planar format in any of our
drivers. There are restrictions on how the addresses of the 2 planes
are relative to one another which make this annoying. The sampler is
also more limited than what we can do with a shader snippet.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2999>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2999>
2020-03-31 15:59:21 +00:00
Lionel Landwerlin 015f08dd43 isl: set bpb for Y8_UNORM
This isn't a format we use in any of the drivers but for consistency
just give it a correct bpb.

We also set the luminance in the G channel. We can't actually use this
format with the 3D sampler (only media).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2999>
2020-03-31 15:59:21 +00:00
Eric Engestrom 5f4d9b419a scons: prune unused Makefile.sources
Fixes: 2e92d33819 ("scons: Prune out unnecessary targets.")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed by: Jose Fonseca <jfonseca@vmware.com>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4373>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4373>
2020-03-31 09:42:07 +00:00
Connor Abbott d63acce5f4 tu: Return the correct alignment for images
The alignment field was never initialized, so we were just returning an
alignment of 0. Return the alignment from fdl, and while we're here
cleanup some leftovers in tu_private.h.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4357>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4357>
2020-03-31 08:22:58 +00:00
Connor Abbott d84c206d85 freedreno/fdl: Add base_align
Tell users what the base address of the image needs to be aligned to.
These values are based on experimentation via passing an offset to
vkBindImageMemory with turnip and seeing if tests still pass. Note that
r8g8 is also special in this regard, however it actually has an
increased alignment (in bytes).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4357>
2020-03-31 08:22:58 +00:00
Jason Ekstrand 896a7c28eb anv/allocator: Use util_dynarray for blocks in anv_state_stream
When we originally wrote a bunch of the allocation data structures, we
re-used the GPU memory for CPU-side data structures.  It's a bit more
memory efficient and usually ok.  However, this has a couple of
problems:

 1. It makes it MUCH more likely that the GPU will accidentlly stomp
    CPU-side data structures and cause nearly impossible to debug
    crashes.

 2. With discrete GPUs, the memory will be mapped somehow and that map
    may be across the BAR so it could have horribly slow CPU access.
    This is bad for our CPU-side data structures.

In the case of anv_state_stream, it also made the data structure
massively more complex than it needed to be.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4336>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4336>
2020-03-31 08:12:07 +00:00
Jason Ekstrand 63bec07e14 anv: Account for the header in anv_state_stream_alloc
If we have an allocation that's exactly the block size, we end up
computing a new block size to allocate that's exactly the block size,
add in the header, and then assert fail.  When computing the block size,
we need to account for the header.

Fixes: 955127db93 "anv/allocator: Add support for large stream..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4336>
2020-03-31 08:12:07 +00:00
Marek Olšák 6e672074dd st/mesa: add environment variable pin_app_thread for faster glthread on AMD Zen
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4369>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4369>
2020-03-30 23:57:52 -04:00
Marek Olšák 4df3c7a207 gallium/u_threaded: call the driver to pin threads to L3 immediately
This is thread-safe and we want it to be done immediately for good L3 cache
usage.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4369>
2020-03-30 23:57:49 -04:00
Qiang Yu 4de35bed42 lima: also check tiled and depth case when import
We missed the tiled and depth case when check buffer alignment.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4362>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4362>
2020-03-31 01:40:29 +00:00
Qiang Yu e46b2ef724 lima: fix buffer import with offset
With EGL_EXT_image_dma_buf_import, user can import dma_buf
with offset.

This is also used by AOSP GLConsumer::updateTexImage
with HAL_PIXEL_FORMAT_YV12 buffer which store YUV planes in
the same buffer with offset. Render sample from it using
GL_OES_EGL_image_external. This should fix some video
display problem when using MediaCodec soft decoding which
generates HAL_PIXEL_FORMAT_YV12 buffer and render it on
screen.

Test program:
https://github.com/yuq/gfx/tree/master/yuv2rgb/dma-buf

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4362>
2020-03-31 01:40:29 +00:00
Alyssa Rosenzweig 02ad147c5c pan/bi: Fix handling of constants with COMBINE
We should never see COMBINE constants explicitly since they'll become
moves anyway, so we can simplify that. On the other hand, we do need the
type information for the lowering to work properly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig bd19e76340 pan/bi: Handle fp16/abs scheduling restriction
See previous commit for the packing side. Here we update the scheduler
to accomodate this. Note we don't actually hit this path yet, but it's
good to be proactive.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig c88f816169 pan/bi: Handle abs packing for fp16/FMA add/min
It's seriously quirky, and all to save a single bit. Alas. It also
introduces an edge case for the scheduler which is a bit annoying.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig ba8e11f0f1 pan/bi: Handle core faddminmax16 packing
This works without the exception of absolute values, which have some...
odd properties to be handled in the next commit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig 12a16f2247 pan/bi: Structify fadd/min/max16
There is some quirky encoding here.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig c12a208d78 pan/bi: Add v2f16 versions of rounding ops
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig f81b67b857 pan/bi: Handle round opcodes in frontend
These correspond to various ops routed through BI_ROUND

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig c7170e9742 pan/bi: Assert out i16 related converts for now
Needs more investigation, and GLSL doesn't use it quite yet sadly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig 2fd8b2e6d4 pan/bi: Add one-source f32->f16 op
This really has a second op for vectorization but we don't handle this
quite yet...

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig 197c6414ea pan/bi: Add bifrost_fma_2src generic
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig 57a8e6e8d0 pan/bi: Handle standard FMA conversions
These are plain old 1-sources so they're easy to start with.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig 499e97b519 pan/bi: Enumerate conversions
There are lots of Bifrost conversion opcodes that can all be emitted
from BI_CONVERT, let's pattern match.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig 902f99a45d pan/bi: Expand out FMA conversion opcodes
There are a *lot* of them, with lots of symmetry we can exploit to
simplify the packing logic (but not entirely). Let's add the
corresponding header structs/defines, although we don't actually poke
the disassembler at this stage.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig 73715124ea pan/bi: Pack outmod and roundmode with FMA
The fields got missed.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig 158f2452f2 pan/bi: Add FMA16 packing
It's like the original FMA packing but with swizzles introduced.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig b5148b6b49 pan/bi: Fix missing type for fmul
We add a zero argument, we want it to align with the size of whatever
the other arguments were for optimization.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig 5eb209a05f pan/bi: Finish FMA structures
There were some missing fields for the 32-bit case, and the 16-bit case
has separate packing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig 375a7d0f32 pan/bi: Ignore swizzle in unwritten component
Otherwise we can trip the assert for no good reason.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig aa77d8128e pan/bi: Handle f2f* opcodes
Just more converts that got missed.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig c2a8ef907b panfrost: Enable PIPE_SHADER_CAP_FP16 on Bifrost
We don't have fp16 implemented on Midgard yet but on Bifrost we can flip
it on now.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig 77e04eb2e2 pan/bi: Enable precision lowering in standalone compiler
..since there's no CAP to guide here.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig 683cd9b6f4 pan/bi: Fix off-by-one in scoreboarding packing
Clauses actually encode the *next* clauses' dependencies.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig f3726a0874 pan/bi: Fix overzealous write barriers
It's possible this triggers an INSTR_INVALID_ENC.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00
Alyssa Rosenzweig 3d7166fa69 pan/bit: Begin generating a vertex job
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4382>
2020-03-31 01:12:26 +00:00