Commit Graph

2538 Commits

Author SHA1 Message Date
Francisco Jerez ba79a90fb5 glsl: Switch ast_type_qualifier to a 128-bit bitset.
This should end the drought of bits in the ast_type_qualifier object.
The bitset_t type works pretty much as a drop-in replacement for the
current uint64_t bitset.

The only catch is that the bitset_t type as defined in the previous
commit doesn't have a trivial constructor (because it has a
user-defined constructor), so it cannot be used as union member
without providing a user-defined constructor for the union (which
causes it in turn to be non-trivially constructible).  This annoyance
could be easily addressed in C++11 by declaring the default
constructor of bitset_t to be the implicitly defined one -- IMO one
more reason to drop support for GCC 4.2-4.3.

The other minor change was required because glsl_parser_extras.cpp was
hard-coding the type of bitset temporaries as uint64_t, which (unlike
would have been the case if the uint64_t had been replaced with
e.g. an __int128) would otherwise have caused a build failure, because
the boolean conversion operator of bitset_t is marked explicit (if
C++11 is available), so the bitset won't be silently truncated down to
1 bit in order to use it to initialize the uint64_t temporaries
(yikes).

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-02-24 15:28:36 -08:00
Marek Olšák 605a7f6db5 mesa: implement ARB_compatibility
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-23 20:50:15 +01:00
Samuel Pitoiset 63fb30c674 nir: lower fexp2(fmul(flog2(a), 2)) to fmul(a, a)
Similar for the 4 case.

Suggested by Bas.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:40:45 +01:00
Samuel Pitoiset b18997876f nir: add is_used_once for fmul(fexp2(a), fexp2(b)) to fexp2(fadd(a, b))
Otherwise the code size increases because the original fexp2()
instructions can't be deleted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:40:43 +01:00
Samuel Pitoiset 3c40be126f spirv: apply memory qualifiers to images
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-22 20:39:53 +01:00
Kenneth Graunke 183ce5e629 glsl: Parse 'layout' as a token with advanced blending or bindless
Both KHR_blend_equation_advanced and ARB_bindless_texture provide
layout qualifiers, and are exposed in compatibility contexts.  We
need to parse the layout qualifier as a token in order for those
to work, but forgot to extend this check.

ARB_shader_image_load_store would need a similar treatment, but we
don't expose that in legacy OpenGL contexts.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105161
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-02-21 17:50:57 -08:00
Timothy Arceri cdeac00267 nir: remove old assert
This was originally intended to make sure the remap location
was not -1. However the code has changed alot since then,
the location is now never set to -1 and we also handle
components meaning this old assert has been doing comparisions
with the pointer to the array of component data.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105183
2018-02-22 09:31:00 +11:00
Eric Anholt 4636ce362d glsl/tests: Fix a compiler warning about signed/unsigned loop comparison.
Fixes: d32956935e ("glsl: Walk a list of ir_dereference_array to mark array elements as accessed")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-20 20:23:57 -08:00
Eric Anholt 1b313eedb5 glsl: Silence warnings in the uniform initializer test about 16-bit types
They should probably get unit tests implemented, but this cleans up a
bunch of warnings in my build for now.

Fixes: 59f458cd87 ("glsl: Add 16-bit types")
Cc: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-20 20:23:57 -08:00
Timothy Arceri 347038baa9 glsl/nir: add pixel_center_integer to shader info
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-19 08:47:48 +11:00
Eric Engestrom a176b053b6 glsl: fix sizeof(pointer) bug
Doesn't really change anything to the test though ¯\_(ツ)_/¯

CID: 1429511
Fixes: e8495646af "glsl/tests: changes to test_disk_cache_create test"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-16 12:04:29 +00:00
Marek Olšák 6b1e26e181 mesa: move STATE_LENGTH to shader_enums.h and use it everywhere
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák e149a0253c mesa,glsl,nir: reduce gl_state_index size to 2 bytes
Let's use the new gl_state_index16 type everywhere and remove
the typecasts.

This helps reduce the size of gl_program_parameter.

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Marek Olšák 65ed98839b mesa: reduce the size of gl_program
gl_program: 1456 -> 976 bytes

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-13 01:00:45 +01:00
Eric Anholt 21670f8208 glsl/tests: Fix strict aliasing warning about int64/double.
Fixes: 4bf9862747 ("glsl/tests: Add UINT64 and INT64 types")
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
2018-02-12 20:48:43 +00:00
Alejandro Piñeiro f32b01ca43 glsl/linker: remove ubo explicit binding handling
This is already handled at link_uniform_blocks, specifically at
process_block_array_leaf.

Additionally, this code was not handling correctly arrays of
arrays. When creating the name of the block to set the binding, it
only took into account the first level, so any attempt to set a
explicit binding on a array of array ubo would trigger an assertion.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-02-09 08:32:42 +01:00
Scott D Phillips 1f4d2433e7 meson: Add build option for tools
Add a build option to control building some of the misc tools we
have. Also set the executables to install, presumably you want
that if you're asking for the build.

v2: set 'install:' to the with_tools value, not true (Jordan)
    handle 'all' in a the comma list (Dylan)
    Add freedreno's tools (Dylan)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-08 11:24:42 -08:00
Tapani Pälli e8495646af glsl/tests: changes to test_disk_cache_create test
Next patch will allow disk_cache instance to be created without
path set for it, modify some test cases that assume disk_cache
creation to fail with invalid path. Creation should succeed but
simple put/get test fail.

v2: leave tests as is but check that both cache struct exists
    and try simple put/get that should fail with invalid path set
    (Emil)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli 83c81b6cce glsl/tests: move utility functions in cache_test
Patch moves functions higher so that we can utilize them from
test_disk_cache_create which is modified by next patch.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Samuel Pitoiset 3488a3f033 radv: run nir_opt_shrink_load
LLVM can't shrink loads.

Polaris10:
Totals from affected shaders:
SGPRS: 62528 -> 59955 (-4.11 %)
VGPRS: 44708 -> 44616 (-0.21 %)
Spilled SGPRs: 16 -> 8 (-50.00 %)
Code Size: 1355504 -> 1355172 (-0.02 %) bytes
Max Waves: 11710 -> 11670 (-0.34 %)

Vega10:
Totals from affected shaders:
SGPRS: 51448 -> 50371 (-2.09 %)
VGPRS: 39140 -> 39048 (-0.24 %)
Spilled SGPRs: 16 -> 16 (0.00 %)
Code Size: 1307188 -> 1304296 (-0.22 %) bytes
Max Waves: 11312 -> 11292 (-0.18 %)

This reduces SGPRs spilling in MadMax, and it also reduces
number of SGPRs in DOW3 and F12017. The number of waves slightly
decreases in F1 but I don't see any performance changes after
benchmarking it. Talos and Serious Sam are not affected because
they don't use any push constants.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-06 23:08:44 +01:00
Samuel Pitoiset e68562b94b nir: add nir_opt_shrink_load pass
This is a very simple pass that just shrinks load_push_constant
intrinsics when some components are unused. For now, it can just
shrink vec4 to vec3, vec3 to vec2 and so on.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-06 23:08:39 +01:00
Iago Toral Quiroga ef439a4fdc spirv: split constant initializers on in/out structs
The SPIR-V parser splits in/out struct variables and creates
a separate variable for each first-level member of the struct.
When the struct variable has an initializer this means that we also
need to split the initializer.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-06 07:50:18 +01:00
Ilia Mirkin 02a6d901ee mesa: add OES_EGL_image_external_essl3 support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-02-06 07:28:11 +02:00
Juan A. Suarez Romero 4195eed961 glsl/linker: check same name is not used in block and outside
According with OpenGL GLSL 3.20 spec, section 4.3.9:

  "It is a link-time error if any particular shader interface
   contains:
     - two different blocks, each having no instance name, and each
       having a member of the same name, or
     - a variable outside a block, and a block with no instance name,
       where the variable has the same name as a member in the block."

This fixes a previous commit 9b894c8 ("glsl/linker: link-error using the
same name in unnamed block and outside") that covered this case, but
did not take in account that precision qualifiers are ignored when
comparing blocks with no instance name.

With this commit, the original tests
KHR-GL*.shaders.uniform_block.common.name_matching keep fixed, and also
dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision
regression is fixed, which was broken by previous commit.

v2: use helper varibles (Matteo Bruni)

Fixes: 9b894c8 ("glsl/linker: link-error using the same name in unnamed block and outside")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104668
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104777
CC: Mark Janes <mark.a.janes@intel.com>
CC: "18.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Matteo Bruni <matteo.mystral@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-02-05 18:10:43 +01:00
Mathias Fröhlich 186f03cfb0 mesa: Put materials at the end of the generic block.
The materials are now moved to the end of the
generic attributes block to the range 4-15.

Before, the way the position and generic 0 attribute
is handled was dependent on the presence and kind of
the currently attached vertex program. With this
change the way the position attribute and the generic 0
attribute is treated only depends on the enabled
flag of those two arrays.
This will later help to untangle the update dependencies
between enabled arrays and shader inputs.

v2: s,VERT_ATTRIB_MAT_OFFSET,VERT_ATTRIB_MAT0,g

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:06 +01:00
Mathias Fröhlich 38b41fd718 mesa: Use defines for the aliased material array attributes.
Instead of just assuming that the material attributes
just overlap with the generic attributes 0-12, give
them symbolic defines so that we can easier move them
to an other range.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-01 22:39:06 +01:00
Ian Romanick ee63933a73 nir: Distribute binary operations with constants into bcsel
This was specifically designed to simplify 1+mix(0, a-1, condition) to
mix(1, a, condition) by pushing the 1+ inside.

Skylake, Broadwell, and Haswell had similar results.  Skylake shown.
total instructions in shared programs: 14521753 -> 14521716 (<.01%)
instructions in affected programs: 10619 -> 10582 (-0.35%)
helped: 51
HURT: 14
helped stats (abs) min: 1 max: 12 x̄: 1.43 x̃: 1
helped stats (rel) min: 0.20% max: 3.58% x̄: 1.01% x̃: 0.95%
HURT stats (abs)   min: 1 max: 11 x̄: 2.57 x̃: 1
HURT stats (rel)   min: 0.22% max: 1.75% x̄: 1.20% x̃: 1.32%
95% mean confidence interval for instructions value: -1.31 0.17
95% mean confidence interval for instructions %-change: -0.80% -0.27%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 533000205 -> 533003533 (<.01%)
cycles in affected programs: 110610 -> 113938 (3.01%)
helped: 43
HURT: 28
helped stats (abs) min: 6 max: 440 x̄: 27.12 x̃: 16
helped stats (rel) min: 0.39% max: 4.84% x̄: 1.60% x̃: 1.67%
HURT stats (abs)   min: 2 max: 3066 x̄: 160.50 x̃: 14
HURT stats (rel)   min: 0.08% max: 77.78% x̄: 5.16% x̃: 0.62%
95% mean confidence interval for cycles value: -43.81 137.56
95% mean confidence interval for cycles %-change: -1.47% 3.60%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 10018840 -> 10018713 (<.01%)
instructions in affected programs: 9431 -> 9304 (-1.35%)
helped: 51
HURT: 3
helped stats (abs) min: 1 max: 80 x̄: 2.76 x̃: 1
helped stats (rel) min: 0.20% max: 16.43% x̄: 1.16% x̃: 0.81%
HURT stats (abs)   min: 1 max: 12 x̄: 4.67 x̃: 1
HURT stats (rel)   min: 0.22% max: 1.33% x̄: 0.59% x̃: 0.22%
95% mean confidence interval for instructions value: -5.36 0.66
95% mean confidence interval for instructions %-change: -1.66% -0.46%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 87571944 -> 87572785 (<.01%)
cycles in affected programs: 117234 -> 118075 (0.72%)
helped: 42
HURT: 23
helped stats (abs) min: 2 max: 114 x̄: 51.90 x̃: 30
helped stats (rel) min: 0.11% max: 11.01% x̄: 4.45% x̃: 2.74%
HURT stats (abs)   min: 1 max: 2341 x̄: 131.35 x̃: 10
HURT stats (rel)   min: 0.06% max: 37.11% x̄: 2.75% x̃: 0.61%
95% mean confidence interval for cycles value: -61.05 86.93
95% mean confidence interval for cycles %-change: -3.47% -0.33%
Inconclusive result (value mean confidence interval includes 0).

Sandy Bridge
total instructions in shared programs: 10542933 -> 10542844 (<.01%)
instructions in affected programs: 11487 -> 11398 (-0.77%)
helped: 52
HURT: 3
helped stats (abs) min: 1 max: 40 x̄: 1.96 x̃: 1
helped stats (rel) min: 0.08% max: 8.16% x̄: 0.90% x̃: 0.72%
HURT stats (abs)   min: 1 max: 11 x̄: 4.33 x̃: 1
HURT stats (rel)   min: 0.22% max: 1.22% x̄: 0.55% x̃: 0.22%
95% mean confidence interval for instructions value: -3.17 -0.07
95% mean confidence interval for instructions %-change: -1.13% -0.52%
Instructions are helped.

total cycles in shared programs: 146098397 -> 146097094 (<.01%)
cycles in affected programs: 128140 -> 126837 (-1.02%)
helped: 47
HURT: 8
helped stats (abs) min: 2 max: 333 x̄: 29.21 x̃: 18
helped stats (rel) min: 0.13% max: 5.04% x̄: 1.18% x̃: 0.95%
HURT stats (abs)   min: 1 max: 16 x̄: 8.75 x̃: 9
HURT stats (rel)   min: 0.08% max: 0.43% x̄: 0.30% x̃: 0.34%
95% mean confidence interval for cycles value: -37.49 -9.90
95% mean confidence interval for cycles %-change: -1.22% -0.71%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886711 -> 7886509 (<.01%)
instructions in affected programs: 10425 -> 10223 (-1.94%)
helped: 50
HURT: 2
helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1
helped stats (rel) min: 0.34% max: 15.38% x̄: 1.12% x̃: 0.54%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.86% max: 0.91% x̄: 0.89% x̃: 0.89%
95% mean confidence interval for instructions value: -8.05 0.28
95% mean confidence interval for instructions %-change: -1.83% -0.26%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 178115324 -> 178114612 (<.01%)
cycles in affected programs: 765726 -> 765014 (-0.09%)
helped: 39
HURT: 1
helped stats (abs) min: 2 max: 276 x̄: 18.31 x̃: 8
helped stats (rel) min: <.01% max: 8.47% x̄: 0.39% x̃: 0.04%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -32.07 -3.53
95% mean confidence interval for cycles %-change: -0.86% 0.10%
Inconclusive result (%-change mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857762 -> 4857661 (<.01%)
instructions in affected programs: 5523 -> 5422 (-1.83%)
helped: 25
HURT: 1
helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1
helped stats (rel) min: 0.34% max: 13.61% x̄: 1.04% x̃: 0.52%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.86% max: 0.86% x̄: 0.86% x̃: 0.86%
95% mean confidence interval for instructions value: -9.99 2.22
95% mean confidence interval for instructions %-change: -2.01% 0.08%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 122179674 -> 122179194 (<.01%)
cycles in affected programs: 530162 -> 529682 (-0.09%)
helped: 22
HURT: 1
helped stats (abs) min: 2 max: 292 x̄: 21.91 x̃: 7
helped stats (rel) min: <.01% max: 8.65% x̄: 0.44% x̃: 0.04%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -46.56 4.82
95% mean confidence interval for cycles %-change: -1.20% 0.36%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:15 -08:00
Ian Romanick 03fb13f646 nir: Rearrange logic op-compounded integer compares
Skylake and Broadwell had similar results.  Skylake shown.
total instructions in shared programs: 14521769 -> 14521753 (<.01%)
instructions in affected programs: 8782 -> 8766 (-0.18%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.12% max: 0.40% x̄: 0.20% x̃: 0.18%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.23% -0.16%
Instructions are helped.

total cycles in shared programs: 533000376 -> 533000205 (<.01%)
cycles in affected programs: 447035 -> 446864 (-0.04%)
helped: 9
HURT: 9
helped stats (abs) min: 2 max: 40 x̄: 35.78 x̃: 40
helped stats (rel) min: 0.02% max: 0.18% x̄: 0.10% x̃: 0.09%
HURT stats (abs)   min: 1 max: 52 x̄: 16.78 x̃: 10
HURT stats (rel)   min: <.01% max: 1.11% x̄: 0.29% x̃: 0.12%
95% mean confidence interval for cycles value: -25.07 6.07
95% mean confidence interval for cycles %-change: -0.08% 0.27%
Inconclusive result (value mean confidence interval includes 0).

No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick 053be9f020 nir: Rearrange and-compounded float compares
If both comparisons are used as sources for instructions other than the
iand, this transformation is detrimental.  If the non-identical value in
both compares is constant, the fmin or fmax will be constant-folded
away, so the transformation is always a win.

It is interesting to me that on Iron Lake only 81 shaders have
instruction counts changed, but 726 shaders have cycle counts changed.

shader-db results:

Skylake
total instructions in shared programs: 14525728 -> 14521017 (-0.03%)
instructions in affected programs: 1164726 -> 1160015 (-0.40%)
helped: 1692
HURT: 5
helped stats (abs) min: 1 max: 637 x̄: 2.79 x̃: 2
helped stats (rel) min: 0.07% max: 16.36% x̄: 0.81% x̃: 0.33%
HURT stats (abs)   min: 1 max: 12 x̄: 3.20 x̃: 1
HURT stats (rel)   min: 0.38% max: 2.86% x̄: 2.36% x̃: 2.86%
95% mean confidence interval for instructions value: -3.52 -2.03
95% mean confidence interval for instructions %-change: -0.86% -0.74%
Instructions are helped.

total cycles in shared programs: 533115449 -> 532991404 (-0.02%)
cycles in affected programs: 119401803 -> 119277758 (-0.10%)
helped: 1145
HURT: 467
helped stats (abs) min: 1 max: 34644 x̄: 145.92 x̃: 18
helped stats (rel) min: <.01% max: 45.33% x̄: 1.58% x̃: 0.42%
HURT stats (abs)   min: 1 max: 1590 x̄: 92.15 x̃: 15
HURT stats (rel)   min: <.01% max: 13.48% x̄: 1.26% x̃: 0.39%
95% mean confidence interval for cycles value: -122.16 -31.74
95% mean confidence interval for cycles %-change: -0.94% -0.57%
Cycles are helped.

total spills in shared programs: 9597 -> 9534 (-0.66%)
spills in affected programs: 403 -> 340 (-15.63%)
helped: 1
HURT: 1

total fills in shared programs: 13904 -> 13790 (-0.82%)
fills in affected programs: 1627 -> 1513 (-7.01%)
helped: 2
HURT: 1

LOST:   0
GAINED: 2

Broadwell
total instructions in shared programs: 14816966 -> 14812590 (-0.03%)
instructions in affected programs: 1499885 -> 1495509 (-0.29%)
helped: 1672
HURT: 15
helped stats (abs) min: 1 max: 455 x̄: 2.70 x̃: 2
helped stats (rel) min: 0.05% max: 16.36% x̄: 0.81% x̃: 0.33%
HURT stats (abs)   min: 1 max: 21 x̄: 9.20 x̃: 8
HURT stats (rel)   min: 0.08% max: 2.86% x̄: 1.06% x̃: 0.53%
95% mean confidence interval for instructions value: -3.14 -2.05
95% mean confidence interval for instructions %-change: -0.85% -0.73%
Instructions are helped.

total cycles in shared programs: 559353622 -> 559345595 (<.01%)
cycles in affected programs: 139893703 -> 139885676 (<.01%)
helped: 921
HURT: 697
helped stats (abs) min: 1 max: 42424 x̄: 143.45 x̃: 18
helped stats (rel) min: <.01% max: 36.23% x̄: 2.02% x̃: 0.87%
HURT stats (abs)   min: 1 max: 2370 x̄: 178.03 x̃: 38
HURT stats (rel)   min: <.01% max: 17.35% x̄: 0.71% x̃: 0.14%
95% mean confidence interval for cycles value: -59.64 49.72
95% mean confidence interval for cycles %-change: -1.02% -0.66%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 78902 -> 78861 (-0.05%)
spills in affected programs: 2418 -> 2377 (-1.70%)
helped: 1
HURT: 11

total fills in shared programs: 83782 -> 83678 (-0.12%)
fills in affected programs: 3515 -> 3411 (-2.96%)
helped: 2
HURT: 11

LOST:   0
GAINED: 5

Haswell and Ivy Bridge had similar results. Haswell shown.
total instructions in shared programs: 9033898 -> 9032010 (-0.02%)
instructions in affected programs: 308064 -> 306176 (-0.61%)
helped: 921
HURT: 4
helped stats (abs) min: 1 max: 20 x̄: 2.05 x̃: 1
helped stats (rel) min: 0.17% max: 17.54% x̄: 0.80% x̃: 0.35%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23%
95% mean confidence interval for instructions value: -2.21 -1.87
95% mean confidence interval for instructions %-change: -0.88% -0.68%
Instructions are helped.

total cycles in shared programs: 84628949 -> 84620520 (<.01%)
cycles in affected programs: 2164913 -> 2156484 (-0.39%)
helped: 518
HURT: 359
helped stats (abs) min: 1 max: 440 x̄: 41.52 x̃: 20
helped stats (rel) min: <.01% max: 17.17% x̄: 1.95% x̃: 1.01%
HURT stats (abs)   min: 1 max: 586 x̄: 36.43 x̃: 8
HURT stats (rel)   min: 0.04% max: 18.65% x̄: 1.47% x̃: 0.40%
95% mean confidence interval for cycles value: -15.17 -4.05
95% mean confidence interval for cycles %-change: -0.77% -0.32%
Cycles are helped.

LOST:   0
GAINED: 4

Sandy Bridge
total instructions in shared programs: 10544860 -> 10542933 (-0.02%)
instructions in affected programs: 360019 -> 358092 (-0.54%)
helped: 931
HURT: 4
helped stats (abs) min: 1 max: 20 x̄: 2.07 x̃: 1
helped stats (rel) min: 0.11% max: 15.52% x̄: 0.68% x̃: 0.30%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 3.33% max: 3.33% x̄: 3.33% x̃: 3.33%
95% mean confidence interval for instructions value: -2.23 -1.89
95% mean confidence interval for instructions %-change: -0.76% -0.58%
Instructions are helped.

total cycles in shared programs: 146106820 -> 146098397 (<.01%)
cycles in affected programs: 3435047 -> 3426624 (-0.25%)
helped: 572
HURT: 329
helped stats (abs) min: 1 max: 1289 x̄: 32.52 x̃: 15
helped stats (rel) min: <.01% max: 26.29% x̄: 0.97% x̃: 0.33%
HURT stats (abs)   min: 1 max: 1714 x̄: 30.93 x̃: 6
HURT stats (rel)   min: 0.02% max: 41.31% x̄: 1.13% x̃: 0.19%
95% mean confidence interval for cycles value: -16.85 -1.85
95% mean confidence interval for cycles %-change: -0.39% -0.01%
Cycles are helped.

LOST:   1
GAINED: 0

Iron Lake
total instructions in shared programs: 7886925 -> 7886711 (<.01%)
instructions in affected programs: 25763 -> 25549 (-0.83%)
helped: 75
HURT: 6
helped stats (abs) min: 1 max: 13 x̄: 3.33 x̃: 1
helped stats (rel) min: 0.35% max: 17.57% x̄: 1.96% x̃: 0.53%
HURT stats (abs)   min: 1 max: 16 x̄: 6.00 x̃: 1
HURT stats (rel)   min: 2.86% max: 4.79% x̄: 3.49% x̃: 2.86%
95% mean confidence interval for instructions value: -3.69 -1.60
95% mean confidence interval for instructions %-change: -2.54% -0.57%
Instructions are helped.

total cycles in shared programs: 178116888 -> 178115324 (<.01%)
cycles in affected programs: 5858790 -> 5857226 (-0.03%)
helped: 484
HURT: 242
helped stats (abs) min: 2 max: 76 x̄: 5.27 x̃: 6
helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06%
HURT stats (abs)   min: 2 max: 76 x̄: 4.07 x̃: 2
HURT stats (rel)   min: 0.01% max: 3.99% x̄: 0.19% x̃: 0.03%
95% mean confidence interval for cycles value: -2.76 -1.55
95% mean confidence interval for cycles %-change: -0.12% 0.01%
Inconclusive result (%-change mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857870 -> 4857762 (<.01%)
instructions in affected programs: 13994 -> 13886 (-0.77%)
helped: 39
HURT: 5
helped stats (abs) min: 1 max: 13 x̄: 3.28 x̃: 2
helped stats (rel) min: 0.33% max: 17.11% x̄: 1.86% x̃: 0.48%
HURT stats (abs)   min: 1 max: 16 x̄: 4.00 x̃: 1
HURT stats (rel)   min: 2.86% max: 4.71% x̄: 3.23% x̃: 2.86%
95% mean confidence interval for instructions value: -3.86 -1.05
95% mean confidence interval for instructions %-change: -2.61% 0.04%
Inconclusive result (%-change mean confidence interval includes 0).

total cycles in shared programs: 122180744 -> 122179674 (<.01%)
cycles in affected programs: 3686646 -> 3685576 (-0.03%)
helped: 273
HURT: 141
helped stats (abs) min: 2 max: 76 x̄: 5.81 x̃: 6
helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06%
HURT stats (abs)   min: 2 max: 76 x̄: 3.66 x̃: 2
HURT stats (rel)   min: 0.01% max: 3.99% x̄: 0.16% x̃: 0.02%
95% mean confidence interval for cycles value: -3.42 -1.75
95% mean confidence interval for cycles %-change: -0.15% 0.03%
Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick 821e7a4d32 nir: Separate a weird compare with zero to two compares with zero
min(a+b, c+d) >= 0 becomes (a+b >= 0 && c+d >= 0).

No shader-db changes, but it does prevent 6 to 12 instruction
regressions in the next patch on all measured Intel platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick 68420d8322 nir: Simplify min and max of b2f
v2: Rebase on almost 2 years.  Require that one of the arguments to fmin
or fmax be used only once.  This prevents some regressions.

shader-db results:

Skylake and Broadwell had similar results.  Skylake shown.
total instructions in shared programs: 14526021 -> 14525913 (<.01%)
instructions in affected programs: 4613 -> 4505 (-2.34%)
helped: 31
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 3.48 x̃: 4
helped stats (rel) min: 0.62% max: 6.67% x̄: 3.31% x̃: 2.42%

total cycles in shared programs: 533118710 -> 533118403 (<.01%)
cycles in affected programs: 34334 -> 34027 (-0.89%)
helped: 24
HURT: 0
helped stats (abs) min: 4 max: 24 x̄: 12.79 x̃: 14
helped stats (rel) min: 0.25% max: 2.40% x̄: 1.08% x̃: 1.03%

No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick d8d18516b0 nir: Undo possible damage caused by rearranging or-compounded float compares
shader-db results:

Skylake and Broadwell had similar results (Skylake shown)
total instructions in shared programs: 14525898 -> 14525836 (<.01%)
instructions in affected programs: 1964 -> 1902 (-3.16%)
helped: 14
HURT: 0
helped stats (abs) min: 1 max: 25 x̄: 4.43 x̃: 1
helped stats (rel) min: 0.68% max: 9.77% x̄: 2.10% x̃: 0.86%
95% mean confidence interval for instructions value: -9.46 0.60
95% mean confidence interval for instructions %-change: -3.97% -0.24%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 533119892 -> 533115756 (<.01%)
cycles in affected programs: 96061 -> 91925 (-4.31%)
helped: 13
HURT: 1
helped stats (abs) min: 60 max: 596 x̄: 318.77 x̃: 300
helped stats (rel) min: 1.15% max: 5.49% x̄: 4.27% x̃: 4.42%
HURT stats (abs)   min: 8 max: 8 x̄: 8.00 x̃: 8
HURT stats (rel)   min: 0.46% max: 0.46% x̄: 0.46% x̃: 0.46%
95% mean confidence interval for cycles value: -379.43 -211.43
95% mean confidence interval for cycles %-change: -4.84% -3.01%
Cycles are helped.

Haswell, Ivy Bridge and Sandy Bridge had similar results (Haswell shown).
total instructions in shared programs: 9033948 -> 9033898 (<.01%)
instructions in affected programs: 535 -> 485 (-9.35%)
helped: 2
HURT: 0

total cycles in shared programs: 84631402 -> 84628949 (<.01%)
cycles in affected programs: 63197 -> 60744 (-3.88%)
helped: 13
HURT: 2
helped stats (abs) min: 1 max: 594 x̄: 189.62 x̃: 140
helped stats (rel) min: 0.07% max: 5.04% x̄: 3.79% x̃: 4.01%
HURT stats (abs)   min: 4 max: 8 x̄: 6.00 x̃: 6
HURT stats (rel)   min: 0.17% max: 0.45% x̄: 0.31% x̃: 0.31%
95% mean confidence interval for cycles value: -253.40 -73.67
95% mean confidence interval for cycles %-change: -4.24% -2.25%
Cycles are helped.

No changes on GM45 or Iron Lake.

v2: Add a couple more tautological compares.  Suggested by Elie.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick 3941cba0f7 nir: Be more conservative about rearranging or-compounded compares
If both comparisons are used as sources for instructions other than the
ior, this transformation is detrimental.  If the non-identical value in
both compares is constant, the fmin or fmax will be constant-folded
away, so the transformation is always a win.

shader-db results:

Skylake
total instructions in shared programs: 14526147 -> 14525898 (<.01%)
instructions in affected programs: 70239 -> 69990 (-0.35%)
helped: 102
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.44 x̃: 1
helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20%
95% mean confidence interval for instructions value: -2.86 -2.02
95% mean confidence interval for instructions %-change: -0.46% -0.31%
Instructions are helped.

total cycles in shared programs: 533120531 -> 533119892 (<.01%)
cycles in affected programs: 994875 -> 994236 (-0.06%)
helped: 76
HURT: 26
helped stats (abs) min: 1 max: 324 x̄: 27.09 x̃: 13
helped stats (rel) min: <.01% max: 4.21% x̄: 0.45% x̃: 0.18%
HURT stats (abs)   min: 1 max: 167 x̄: 54.62 x̃: 26
HURT stats (rel)   min: <.01% max: 4.36% x̄: 1.01% x̃: 0.39%
95% mean confidence interval for cycles value: -19.44 6.91
95% mean confidence interval for cycles %-change: -0.30% 0.15%
Inconclusive result (value mean confidence interval includes 0).

Broadwell
total instructions in shared programs: 14816005 -> 14815787 (<.01%)
instructions in affected programs: 64658 -> 64440 (-0.34%)
helped: 97
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.25 x̃: 1
helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20%
95% mean confidence interval for instructions value: -2.62 -1.87
95% mean confidence interval for instructions %-change: -0.45% -0.30%
Instructions are helped.

total cycles in shared programs: 559340386 -> 559339907 (<.01%)
cycles in affected programs: 1090491 -> 1090012 (-0.04%)
helped: 66
HURT: 28
helped stats (abs) min: 2 max: 198 x̄: 23.83 x̃: 16
helped stats (rel) min: 0.01% max: 4.21% x̄: 0.47% x̃: 0.27%
HURT stats (abs)   min: 2 max: 226 x̄: 39.07 x̃: 11
HURT stats (rel)   min: <.01% max: 4.61% x̄: 0.64% x̃: 0.20%
95% mean confidence interval for cycles value: -15.94 5.75
95% mean confidence interval for cycles %-change: -0.35% 0.07%
Inconclusive result (value mean confidence interval includes 0).

LOST:   0
GAINED: 1

Haswell
total instructions in shared programs: 9034106 -> 9033948 (<.01%)
instructions in affected programs: 24096 -> 23938 (-0.66%)
helped: 38
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 4.16 x̃: 4
helped stats (rel) min: 0.42% max: 2.29% x̄: 0.71% x̃: 0.64%
95% mean confidence interval for instructions value: -4.71 -3.60
95% mean confidence interval for instructions %-change: -0.84% -0.58%
Instructions are helped.

total cycles in shared programs: 84631628 -> 84631402 (<.01%)
cycles in affected programs: 148674 -> 148448 (-0.15%)
helped: 14
HURT: 14
helped stats (abs) min: 1 max: 114 x̄: 22.14 x̃: 12
helped stats (rel) min: 0.02% max: 2.98% x̄: 0.66% x̃: 0.21%
HURT stats (abs)   min: 1 max: 10 x̄: 6.00 x̃: 5
HURT stats (rel)   min: 0.01% max: 0.20% x̄: 0.12% x̃: 0.11%
95% mean confidence interval for cycles value: -19.42 3.28
95% mean confidence interval for cycles %-change: -0.59% 0.05%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 10015456 -> 10015293 (<.01%)
instructions in affected programs: 27701 -> 27538 (-0.59%)
helped: 38
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 4.29 x̃: 4
helped stats (rel) min: 0.33% max: 2.79% x̄: 0.66% x̃: 0.52%
95% mean confidence interval for instructions value: -4.87 -3.71
95% mean confidence interval for instructions %-change: -0.82% -0.51%
Instructions are helped.

total cycles in shared programs: 87524771 -> 87524569 (<.01%)
cycles in affected programs: 112324 -> 112122 (-0.18%)
helped: 6
HURT: 12
helped stats (abs) min: 2 max: 111 x̄: 44.67 x̃: 20
helped stats (rel) min: 0.02% max: 2.94% x̄: 1.45% x̃: 1.26%
HURT stats (abs)   min: 1 max: 16 x̄: 5.50 x̃: 5
HURT stats (rel)   min: <.01% max: 0.16% x̄: 0.08% x̃: 0.08%
95% mean confidence interval for cycles value: -29.14 6.69
95% mean confidence interval for cycles %-change: -0.93% 0.08%
Inconclusive result (value mean confidence interval includes 0).

LOST:   0
GAINED: 2

Sandy Bridge
total instructions in shared programs: 10545655 -> 10545465 (<.01%)
instructions in affected programs: 37198 -> 37008 (-0.51%)
helped: 42
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 4.52 x̃: 4
helped stats (rel) min: 0.31% max: 2.15% x̄: 0.58% x̃: 0.49%
95% mean confidence interval for instructions value: -5.14 -3.91
95% mean confidence interval for instructions %-change: -0.68% -0.47%
Instructions are helped.

total cycles in shared programs: 146113059 -> 146112427 (<.01%)
cycles in affected programs: 423514 -> 422882 (-0.15%)
helped: 32
HURT: 10
helped stats (abs) min: 4 max: 162 x̄: 24.34 x̃: 12
helped stats (rel) min: 0.06% max: 2.74% x̄: 0.37% x̃: 0.11%
HURT stats (abs)   min: 12 max: 19 x̄: 14.70 x̃: 14
HURT stats (rel)   min: 0.10% max: 0.18% x̄: 0.16% x̃: 0.14%
95% mean confidence interval for cycles value: -26.03 -4.07
95% mean confidence interval for cycles %-change: -0.43% -0.05%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886959 -> 7886925 (<.01%)
instructions in affected programs: 1340 -> 1306 (-2.54%)
helped: 4
HURT: 0
helped stats (abs) min: 2 max: 15 x̄: 8.50 x̃: 8
helped stats (rel) min: 0.63% max: 4.30% x̄: 2.45% x̃: 2.43%
95% mean confidence interval for instructions value: -20.44 3.44
95% mean confidence interval for instructions %-change: -5.78% 0.89%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 178116996 -> 178116888 (<.01%)
cycles in affected programs: 6262 -> 6154 (-1.72%)
helped: 2
HURT: 2
helped stats (abs) min: 44 max: 78 x̄: 61.00 x̃: 61
helped stats (rel) min: 3.31% max: 3.94% x̄: 3.62% x̃: 3.62%
HURT stats (abs)   min: 6 max: 8 x̄: 7.00 x̃: 7
HURT stats (rel)   min: 0.34% max: 0.68% x̄: 0.51% x̃: 0.51%
95% mean confidence interval for cycles value: -93.27 39.27
95% mean confidence interval for cycles %-change: -5.38% 2.27%
Inconclusive result (value mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857887 -> 4857870 (<.01%)
instructions in affected programs: 674 -> 657 (-2.52%)
helped: 2
HURT: 0

total cycles in shared programs: 122180816 -> 122180744 (<.01%)
cycles in affected programs: 3764 -> 3692 (-1.91%)
helped: 1
HURT: 1
helped stats (abs) min: 78 max: 78 x̄: 78.00 x̃: 78
helped stats (rel) min: 3.94% max: 3.94% x̄: 3.94% x̃: 3.94%
HURT stats (abs)   min: 6 max: 6 x̄: 6.00 x̃: 6
HURT stats (rel)   min: 0.34% max: 0.34% x̄: 0.34% x̃: 0.34%

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick cfc0d34802 nir: See through an fneg to apply existing optimizations
Doing the same for the existing feq and fne transformations didn't help
anything in shader-db.

shader-db results:

Broadwell and Skylake (Skylake shown)
total instructions in shared programs: 14529463 -> 14526147 (-0.02%)
instructions in affected programs: 402420 -> 399104 (-0.82%)
helped: 2136
HURT: 131
helped stats (abs) min: 1 max: 10 x̄: 1.61 x̃: 1
helped stats (rel) min: 0.03% max: 16.22% x̄: 3.14% x̃: 1.12%
HURT stats (abs)   min: 1 max: 2 x̄: 1.01 x̃: 1
HURT stats (rel)   min: 0.13% max: 7.69% x̄: 0.75% x̃: 0.57%
95% mean confidence interval for instructions value: -1.51 -1.41
95% mean confidence interval for instructions %-change: -3.06% -2.78%
Instructions are helped.

total cycles in shared programs: 533146915 -> 533120531 (<.01%)
cycles in affected programs: 10356261 -> 10329877 (-0.25%)
helped: 1933
HURT: 844
helped stats (abs) min: 1 max: 490 x̄: 29.44 x̃: 16
helped stats (rel) min: <.01% max: 28.57% x̄: 3.43% x̃: 1.88%
HURT stats (abs)   min: 1 max: 423 x̄: 36.17 x̃: 12
HURT stats (rel)   min: <.01% max: 23.75% x̄: 1.90% x̃: 0.59%
95% mean confidence interval for cycles value: -11.78 -7.22
95% mean confidence interval for cycles %-change: -1.98% -1.65%
Cycles are helped.

Haswell
total instructions in shared programs: 9037416 -> 9034106 (-0.04%)
instructions in affected programs: 389831 -> 386521 (-0.85%)
helped: 2184
HURT: 120
helped stats (abs) min: 1 max: 11 x̄: 1.57 x̃: 1
helped stats (rel) min: 0.03% max: 25.00% x̄: 2.73% x̃: 1.02%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.19% max: 7.69% x̄: 0.81% x̃: 0.57%
95% mean confidence interval for instructions value: -1.49 -1.39
95% mean confidence interval for instructions %-change: -2.68% -2.41%
Instructions are helped.

total cycles in shared programs: 84636243 -> 84631628 (<.01%)
cycles in affected programs: 4745058 -> 4740443 (-0.10%)
helped: 1904
HURT: 960
helped stats (abs) min: 1 max: 466 x̄: 30.21 x̃: 18
helped stats (rel) min: 0.02% max: 36.36% x̄: 3.57% x̃: 2.38%
HURT stats (abs)   min: 1 max: 1080 x̄: 55.11 x̃: 14
HURT stats (rel)   min: 0.02% max: 51.33% x̄: 2.77% x̃: 0.81%
95% mean confidence interval for cycles value: -4.51 1.29
95% mean confidence interval for cycles %-change: -1.64% -1.25%
Inconclusive result (value mean confidence interval includes 0).

LOST:   1
GAINED: 0

Sandy Bridge and Ivy Bridge (Ivy Bridge shown)
total instructions in shared programs: 10018873 -> 10015456 (-0.03%)
instructions in affected programs: 512820 -> 509403 (-0.67%)
helped: 2268
HURT: 162
helped stats (abs) min: 1 max: 11 x̄: 1.62 x̃: 1
helped stats (rel) min: 0.03% max: 25.00% x̄: 2.47% x̃: 0.88%
HURT stats (abs)   min: 1 max: 4 x̄: 1.59 x̃: 1
HURT stats (rel)   min: 0.09% max: 7.69% x̄: 0.86% x̃: 0.50%
95% mean confidence interval for instructions value: -1.46 -1.35
95% mean confidence interval for instructions %-change: -2.38% -2.12%
Instructions are helped.

total cycles in shared programs: 87538223 -> 87524771 (-0.02%)
cycles in affected programs: 5435520 -> 5422068 (-0.25%)
helped: 1916
HURT: 946
helped stats (abs) min: 1 max: 1392 x̄: 29.44 x̃: 18
helped stats (rel) min: <.01% max: 34.51% x̄: 3.34% x̃: 1.97%
HURT stats (abs)   min: 1 max: 633 x̄: 45.41 x̃: 11
HURT stats (rel)   min: 0.02% max: 25.95% x̄: 2.41% x̃: 0.62%
95% mean confidence interval for cycles value: -7.34 -2.06
95% mean confidence interval for cycles %-change: -1.62% -1.26%
Cycles are helped.

LOST:   1
GAINED: 0

Iron Lake
total instructions in shared programs: 7888446 -> 7886959 (-0.02%)
instructions in affected programs: 331581 -> 330094 (-0.45%)
helped: 1160
HURT: 97
helped stats (abs) min: 1 max: 10 x̄: 1.37 x̃: 1
helped stats (rel) min: 0.02% max: 9.68% x̄: 0.93% x̃: 0.43%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.17% max: 4.17% x̄: 0.37% x̃: 0.25%
95% mean confidence interval for instructions value: -1.25 -1.12
95% mean confidence interval for instructions %-change: -0.91% -0.75%
Instructions are helped.

total cycles in shared programs: 178130766 -> 178116996 (<.01%)
cycles in affected programs: 12534564 -> 12520794 (-0.11%)
helped: 1856
HURT: 187
helped stats (abs) min: 2 max: 202 x̄: 7.78 x̃: 4
helped stats (rel) min: <.01% max: 6.47% x̄: 0.28% x̃: 0.11%
HURT stats (abs)   min: 2 max: 26 x̄: 3.55 x̃: 2
HURT stats (rel)   min: 0.01% max: 2.14% x̄: 0.08% x̃: 0.02%
95% mean confidence interval for cycles value: -7.41 -6.07
95% mean confidence interval for cycles %-change: -0.28% -0.22%
Cycles are helped.

GM45
total instructions in shared programs: 4858912 -> 4857887 (-0.02%)
instructions in affected programs: 237565 -> 236540 (-0.43%)
helped: 867
HURT: 57
helped stats (abs) min: 1 max: 10 x̄: 1.25 x̃: 1
helped stats (rel) min: 0.02% max: 9.38% x̄: 0.87% x̃: 0.43%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.16% max: 3.85% x̄: 0.34% x̃: 0.22%
95% mean confidence interval for instructions value: -1.18 -1.04
95% mean confidence interval for instructions %-change: -0.88% -0.71%
Instructions are helped.

total cycles in shared programs: 122189118 -> 122180816 (<.01%)
cycles in affected programs: 8776418 -> 8768116 (-0.09%)
helped: 1213
HURT: 166
helped stats (abs) min: 2 max: 202 x̄: 7.30 x̃: 4
helped stats (rel) min: <.01% max: 6.43% x̄: 0.25% x̃: 0.11%
HURT stats (abs)   min: 2 max: 26 x̄: 3.35 x̃: 2
HURT stats (rel)   min: 0.01% max: 2.14% x̄: 0.06% x̃: 0.02%
95% mean confidence interval for cycles value: -6.78 -5.26
95% mean confidence interval for cycles %-change: -0.24% -0.18%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Timothy Arceri 9a2e085680 nir: add lower_all_io_to_temps flag
This will be used for freedreno and vc4 which require all inputs
and outputs to be copied to temps.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Timothy Arceri 3218756262 nir/st_glsl_to_nir: add param to disable splitting of inputs
We need this because we will always copy fs outputs to temps and
split the arrays, but do not want to do either of these with fs
inputs as it is unnessisary and makes handling interpolateAt
builtins difficult.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Timothy Arceri 09cd484d61 nir: partially revert c2acf97fcc
c2acf97fcc changed the use of double_inputs_read to be
inconsitent with its previous meaning. Here we re-enable the
gather info code that was removed as the modified code from
c2acf97fcc now uses the double_inputs member rather than
double_inputs_read.

This change allows us to use double_inputs_read with gallium
drivers without impacting double_inputs which is used by i965.

We also make use of the compiler option vs_inputs_dual_locations
to allow for the difference in behaviour between drivers that handle
vs inputs as taking up two locations for doubles, versus those that
treat them as taking a single location.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri 5b8de4bdff nir: add vs_inputs_dual_locations compiler option
Allows nir drivers to either use a single or dual locations for
vs double inputs.

i965 uses dual locations for both OpenGL and Vulkan drivers, for
now gallium OpenGL drivers only use a single location.

The following patch will also make use of this option when
calling nir_shader_gather_info().

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2018-01-30 09:08:47 +11:00
Timothy Arceri f63e05ae9e compiler: tidy up double_inputs_read uses
First we move double_inputs_read into a vs struct in the union,
double_inputs_read is only used for vs inputs so this will
save space and also allows us to add a new double_inputs field.

We add the new field because c2acf97fcc changed the behaviour
of double_inputs_read, and while it's no longer used to track
actual reads in i965 we do still want to track this for gallium
drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-30 09:08:47 +11:00
Tapani Pälli d0343bef66 nir: mark unused space in packed_tex_data
This change cleans following scary warnings in valgrind output
when disk cache is being written:

   ==6532== Uninitialised byte(s) found during client check request
   ==6532==    at 0x14423FAD: blob_write_bytes (blob.c:152)
   ==6532==    by 0x144240FB: blob_write_uint32 (blob.c:194)
   ==6532==    by 0x144001A5: write_tex (nir_serialize.c:613)

and later (loads of):

   ==6532== Use of uninitialised value of size 8
   ==6532==    at 0x62FCD9E: crc32_z (in /usr/lib64/libz.so.1.2.11)
   ==6532==    by 0x13F65014: util_hash_crc32 (crc32.c:127)
   ==6532==    by 0x13F5DABA: cache_put (disk_cache.c:947)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-29 08:11:22 +02:00
Brian Paul bacf72a18d mesa: change gl_link_status enums to uppercase
follow the convention of other enums.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-01-26 13:52:48 -07:00
Brian Paul aff5d9c256 mesa: change gl_compile_status enums to uppercase
To follow the convention of other enums.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2018-01-26 13:52:48 -07:00
Emil Velikov ac4437b20b automake: small cleanup after the meson.build inclusion
Namely extend the EXTRA_DIST list, instead of re-assigning it and bring
back a file dropped by mistake.

Fixes: 436ed65d38 ("autotools: include meson build files in tarball")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2018-01-25 17:06:29 +00:00
Timothy Arceri 0cbc62a4dd glsl: add image and sampler (un)packing support to glsl to nir
This is needed for ARB_bindless_texture support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 09:44:37 +11:00
Timothy Arceri 549ccbb435 nir: add image and sampler type to glsl_get_bit_size()
These are needed for ARB_bindless_texture support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 09:44:37 +11:00
Dylan Baker 436ed65d38 autotools: include meson build files in tarball
This adds the meson.build, meson_options.txt, and a few scripts that are
used exclusively by the meson build.

v2: - Remove accidentally included changes needed to test make dist with
      LLVM > 3.9

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-19 16:30:51 -08:00
Brian Paul f631a6ae8c glsl: remove unneeded extern "C" {} bracketing around Mesa includes
The two headers already have the right extern "C" annotations.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul 741d423478 glsl: include util/bitscan.h in serialize.cpp
Instead of relying on indirect inclusion of the header.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Juan A. Suarez Romero 9b894c88a6 glsl/linker: link-error using the same name in unnamed block and outside
According with OpenGL GLSL 4.20 spec, section 4.3.9, page 57:

   "It is a link-time error if any particular shader interface
    contains:
      - two different blocks, each having no instance name, and each
        having a member of the same name, or
      - a variable outside a block, and a block with no instance name,
        where the variable has the same name as a member in the block."

This means that it is a link error if for example we have a vertex
shader with the following definition.

  "layout(location=0) uniform Data { float a; float b; };"

and a fragment shader with:

  "uniform float a;"

As in both cases we refer to both uniforms as "a", and thus using
glGetUniformLocation() wouldn't know which one we mean.

This fixes KHR-GL*.shaders.uniform_block.common.name_matching.

v2: add fixed tests (Tapani)

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-01-16 19:42:35 +01:00
Samuel Pitoiset d5e369ff8a nir: add a 'const' qualifier to nir_ssa_def_components_read()
To avoid compilation warnings and because this helper
shouldn't update anything.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-12 12:25:17 +01:00
Dylan Baker 2083a14179 meson: Use dependencies for nir
This creates two new internal dependencies, idep_nir_headers and
idep_nir. The former encapsulates the generation of nir_opcodes.h and
nir_builder_opcodes.h and adding src/compiler/nir as an include path.
This ensures that any target that needs nir headers will have the
includes and that the generated headers will be generated before the
target is build. The second, idep_nir, includes the first and
additionally links to libnir.

This is intended to make it easier to avoid race conditions in the build
when using nir, since the number of consumers for libnir and it's
headers are quite high.

Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-11 15:40:02 -08:00
Dylan Baker 4ccb981673 meson: Use consistent style for tests
Don't use intermediate variables, use consistent whitespace.

Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-11 15:40:02 -08:00
Dylan Baker fbf192a67e meson: Use consistent style
Currently the meosn build has a mix of two styles:
arg : [foo, ...
       bar],

and
arg : [
  foo, ...,
  bar,
]

For consistency let's pick one. I've picked the later style, which I
think is more readable, and is more common in the mesa code base.

v2: - fix commit message

Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
2018-01-11 15:40:02 -08:00
Tapani Pälli f2c0e47d9c glsl: cleanup shader_cache header guard
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-01-11 08:04:56 +02:00
Ian Romanick 336afe7d7a glsl/linker: Safely generate mask of possible locations
If MaxAttribs were ever raised to 32, undefined behavior would occur.
We had already gone to the effort (albeit incorrectly) handle this in
one case, so fix them all.

CID: 1369628
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-10 07:21:12 -08:00
Ian Romanick 0c9df36157 glsl/linker: Mark no locations as invalid instead of marking all locations
If max_index were ever 32, the linker would have marked all 32
locations as invalid instead of marking none of them as invalid.  It's
a good thing the maximum value actually set by any driver for
MaxAttribs is 16.

Found by inspection while investigating CID 1369628.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-10 07:21:11 -08:00
Ian Romanick 702dc43f7e glsl: Don't handle visit_stop in several ::accept methods
All cases where the result could be non-visit_continue would have
already returned.

CID: 401351, 1224465, 1224466
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-10 07:21:11 -08:00
Ian Romanick a170f27958 glsl: Remove unnecessary assignments to type
None of these are necessary because result->type is the only thing used
outside the giant switch-statement.

CID: 1230983, 1230984
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-10 07:21:11 -08:00
Ian Romanick fd2f4f507f nir: Silence unused parameter warnings
In file included from src/compiler/nir/nir_opt_algebraic.c:4:0:
src/compiler/nir/nir_search_helpers.h: In function ‘is_not_const’:
src/compiler/nir/nir_search_helpers.h:118:59: warning: unused parameter
‘num_components’ [-Wunused-parameter]
 is_not_const(nir_alu_instr *instr, unsigned src, unsigned num_components,
                                                           ^~~~~~~~~~~~~~
src/compiler/nir/nir_search_helpers.h:119:29: warning: unused parameter
‘swizzle ’ [-Wunused-parameter]
              const uint8_t *swizzle)
                             ^~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-01-10 07:21:11 -08:00
Iago Toral Quiroga 7e5c81235f glsl: remove Lower{TCS,TES}PatchVerticesIn
Intel was the only user and now NIR can do the lowering.

v2: do not try to handle it as a system value directly for the SPIR-V
    path. In GL we rather handle it as a uniform like we do for the
    GLSL path (Jason).

v3: drop LowerTESPatchVerticesIn as well (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-10 08:21:02 +01:00
Scott D Phillips 161a97c3d5 .gitignore: Ignore new generated files
New generated files from:

  bb1e6ff161 ("spirv: Add a prepass to set types on vtn_values")
  65fc16c974 ("autotools: set XA versions in configure.ac and configure header file")

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-01-08 21:11:11 -08:00
Jason Ekstrand 9e5aaa93cb spirv: Do implicit conversions of uint to bool in OpStore
Technically, the GLSLang bug related to this can also affect SSBO writes
where the bool -> uint conversion is missing.  However, the only known
shipping application with an old enough version of GLSLang to cause
issues with this is the new DOOM game so we keep the workaround as small
as possible.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand 154668e79c spirv: Loosen the validation for load/store type matching
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104338
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand 986303cb92 spirv: Require a storage type for OpStore destinations
This rules out things such as trying to store a pointer to a local
variable.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand 70f588778c spirv: Add a vtn_types_compatible helper
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand 8bad7f33c6 spirv: Store the id of the type in vtn_type
Previously, we were storing a pointer to the vtn_value because we use it
to look up decorations when we create input/output variables.  This
works, but it also may be useful to have the id itself so we may as well
store that instead.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand 53265c8798 spirv: Add a mechanism for dumping failing shaders
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand 819adfdfb4 spirv: Rework asserts in var_decoration_cb
Now that higher levels are enforcing decoration sanity, we don't need
the vtn_asserts here.  This function *should* be safe but we still want
a few well-placed regular asserts in case something goes awry.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand 71ea4dded5 spirv: Rework error checking for decorations
This reworks the error checking on our generic handling of decorations.
The objective is to validate all of the SPIR-V assumptions we make
up-front and convert redundant checks to compiled-out asserts.  The most
important part of this is to ensure that member decorations only occur
on OpTypeStruct and that the member is never out-of-bounds.  This way
later code can assume that the member is sane and not have to worry
about OOB array access due to a misplaced OpMemberDecorate.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand d6a4099303 spirv: Add better type validation to OpTypeImage
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand 03c543d041 spirv: Switch on vtn_base_type in OpComposite(Extract|Insert)
This is a bit simpler since we have fewer enum values in the case.  It's
also a bit more efficient because we're making fewer glsl_get_* calls.
While we're at it, add better type validation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand 936f49268e spirv: Refactor Op[Spec]ConstantComposite and add better validation
Now that vtn_base_type is a real and full base type, we can switch on
that instead of the GLSL base type which is a lot fewer cases in our
switch.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand dabce5061d spirv: Add better validation to Op[Spec]Constant
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand 6cf965751a spirv: Remove a pointless assignment in SpvOpSpecConstant
We re-assign later inside the bit_size switch

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand f13a5cff72 spirv: Unify boolean constants and add better validation
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand 0bb18858fb spirv/info: Add spirv_op_to_string
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand ab85fd02d5 spirv: Make 'info' a local array spirv_info_c.py
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Jason Ekstrand 296046556a spirv: Add better error messages in vtn_value helpers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-01-08 14:57:44 -08:00
Caio Marcelo de Oliveira Filho 22980f941e spirv: Import 1.2 rev 3 headers and grammar from Khronos
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-08 13:22:17 -08:00
Florian Will 7e025def6d glsl: Respect std430 layout in lower_buffer_access
Respect the std430 rules for determining offset and size of struct
members when using a std430 buffer. std140 rules lead to wrong buffer
offsets in that case.

Fixes my test case attached in Bugzilla. No piglit changes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104492
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-01-08 13:42:09 +11:00
Alejandro Piñeiro 2719467eb6 glsl/standalone: set MaxTransformFeedbackBuffers
Using 4, as it is the default value on mesa. See mesa/main/config.h
and the following commit that introduced the value:
15ac66e331

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-05 08:52:22 +01:00
Alejandro Piñeiro 0ba3de2ad7 glsl/standalone: set MaxVertexStreams
ARB_transform_feedback3 sets a minimum of 1, ARB_gpu_shader5 a minimum
of 4. It shouldn't matter too much, so choosing the later.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-05 08:52:22 +01:00
Alejandro Piñeiro 8dcf131f04 glsl/standalone: set MaxUniformBufferBindings
Used to handle how many ubo you can define on the context. Minimimum
defined as 36 on ARB_uniform_buffer_object spec, up to 84 on OpenGL
4.6 (12 per stage at each moment).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-05 08:52:22 +01:00
Alejandro Piñeiro 937b210551 glsl/standalone: point which arguments are mandatory
Every now and then I execute the standalone compiler, get the
non-version error, and need to remember what I'm doing wrong

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-05 08:52:22 +01:00
Eric Anholt deb552ca27 nir: Add a helper to get the uvec4 type.
I needed this in the vc5 compiler.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-03 14:25:23 -08:00
Rob Clark ea0bbe8201 nir: add missing local_group_size intrinsic
For GL_ARB_compute_variable_group_size

Reported-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-30 12:39:07 -05:00
Eero Tamminen 5c5f2eaa08 spirv: consider bitsize when handling OpSwitch cases
This reverts commit 7665383a33 and is
squashed together with https://patchwork.freedesktop.org/patch/194610/
(spirv: avoid infinite loop / freeze in vtn_cfg_walk_blocks()) which
fixes https://bugs.freedesktop.org/show_bug.cgi?id=104359 properly.

Fixes: 9702fac68e (spirv: consider bitsize when handling OpSwitch cases)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104359
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-28 10:38:58 -08:00
Mark Janes 7665383a33 Revert "spirv: consider bitsize when handling OpSwitch cases"
This reverts commit 9702fac68e, which
hangs vulkancts and crucible on all platforms.

The patch is being reverted because it disables continuous integration
testing.  The patch from bug 104359 does not apply to master.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104359
2017-12-21 12:15:40 -08:00
Brian Paul 6e5b882339 glsl: disable vec3 packing/splitting in tfb separate mode
This fixes a varying packing issue when using transform feedback in
GL_SEPARATE_ATTRIBS mode.  By time we get to linking, we already
know that the number of feedback attributes is under the
GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS limit so packing isn't
as critical.  In fact, packing/splitting vec3 attributes can cause
trouble because splitting effectively creates another TFB output
which can exceed device limits.  So, disable vec3 packing when it's
not needed to avoid that issue.

Fixes the Piglit ext_transform_feedback-separate test on VMware
driver.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:17 -07:00
Brian Paul 462df64495 glsl: simply packing class comparison
Handle comparing the packing class using the same method as we do
for var->data.is_xfb_only

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:17 -07:00
Brian Paul 06588a065f glsl: document varying_matches::assign_locations() params and return value
And change *components to components[] as a reminder that it's an array.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:16 -07:00
Brian Paul 544f41ff19 glsl: remove some continue statements
In some cases, I think loop code is easier to read without continue
statements.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:16 -07:00
Brian Paul 76fc24ba8d glsl: use bitwise operators in varying_matches::compute_packing_class()
The mix of bitwise operators with * and + to compute the packing_class
values was a little weird.  Just use bitwise ops instead.

v2: add assertion to make sure interpolation bits fit without collision,
per Timothy.  Basically, rewrite function to be simpler.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:16 -07:00
Brian Paul cd7705de44 glsl: simplify loop in varying_matches::assign_locations()
The use of break/continue was kind of weird/confusing.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:16 -07:00
Brian Paul 47b4183c92 glsl: minor simplification in assign_varying_locations()
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:16 -07:00
Brian Paul a0430bb62c glsl: make varying_matches::is_varying_packing_safe() const
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:16 -07:00
Brian Paul d86c9836d5 glsl: trivial comment fixes in lower_packed_varyings.cpp
Reviewed by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-20 11:23:16 -07:00
Juan A. Suarez Romero 24aac5f81e spirv: Makefile.nir.am: include vtn_gather_types_c.py script in tarball dist
Fixes: bb1e6ff161 ("spirv: Add a prepass to set types on vtn_values")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-12-20 17:44:35 +01:00
Juan A. Suarez Romero 9702fac68e spirv: consider bitsize when handling OpSwitch cases
When walking over all the cases in a OpSwitch, take in account the bitsize
of the literals to avoid getting wrong cases.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-20 10:39:15 +01:00
Dave Airlie 0e8e7ccf9d nir/linking: always set the used_across_stages/outputs_read bits
If we don't remap and output this code would trample the outputs
read bits.

This fixes a regression in
dEQP-VK.tessellation.shader_input_output.barrier

Fixes: 1c9c42d16b (nir: add varying component packing helpers)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-19 06:44:11 +10:00
Jason Ekstrand 3be382cd7c spirv: Relax the validation conditions of OpSelect
The Talos Principle contains shaders with an OpSelect between two
vectors where the condition is a scalar boolean.  This is technically
against the spec bout nir_builder gracefully handles it by splatting
out the condition to all the channels.  So long as the condition is a
boolean, just emit a warning instead of failing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104246
2017-12-18 09:48:58 -08:00
Eric Anholt 0bead224fe nir: Add a new lowering option to lower all txd to txl.
VC5 requires that all txd are lowered in the shader.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-14 14:36:17 -08:00
Eric Anholt b08b628994 nir: Fix interaction of GL_CLAMP lowering with texture offsets.
We want the clamping of the coordinate to apply after the offset, so we
need to do math to lower the offset out of the instruction.  Fixes texwrap
offset cases for GL_CLAMP with GL_NEAREST on vc5.

Note: I moved the get_texture_size() verbatim, so that it was defined
before use.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-14 14:36:17 -08:00
Rob Herring 546633dce2 Android: fix missing generation of vtn_gather_types.c
Commit bb1e6ff161 ("spirv: Add a prepass to set types on vtn_values")
added generation of vtn_gather_types.c, but forgot to add it to the
Android build files.

Fixes: bb1e6ff161 ("spirv: Add a prepass to set types on vtn_values")
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-12-13 16:20:15 -06:00
Brian Paul 0f2bd31baf glsl: trivial whitespace fixes in link_varyings.cpp 2017-12-13 08:38:07 -07:00
Timothy Arceri cab5513b47 nir: fix shift for uint64_t
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-13 13:20:28 +11:00
Jason Ekstrand 9718ce44c2 spirv: Handle image and sampler function parameters 2017-12-12 07:34:46 -08:00
Jason Ekstrand 7d3ebd1286 spirv/cfg: Refactor the function parameter loop a bit 2017-12-12 07:34:46 -08:00
Jason Ekstrand e6ba457c99 spirv/cfg: Be a bit more precise about function parameters
Pointers with no storage type are converted to inout variables but SSA
values and pointers with a storage type (which turns into a uint or
uvec2) are just input variables.
2017-12-12 07:34:46 -08:00
Jason Ekstrand aaeda8d7d4 spirv: Make sampled images a real type
Previously, we just gave them exactly the same type as the respective
image (which already had a sampler2D or similar type).  Now they have
their own base type and a pointer to the vtn_type for the image.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-12-12 07:34:46 -08:00
Jason Ekstrand df657ebb68 spirv: Add support for all bit sizes in OpSwitch
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101560
2017-12-11 22:28:34 -08:00
Jason Ekstrand 58cabae8cc spirv: Restructure the case loop in OpSwitch handling
Instead of calling vtn_add_case for the default case and then looping,
add an is_default variable and do everything inside the loop.  This will
make the next commit easier.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-11 22:28:34 -08:00
Jason Ekstrand 5f572ccc95 spirv: Add better parameter validation for vector and matrix types
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-11 22:28:34 -08:00
Jason Ekstrand a7c2be9944 spirv: Add type validation for OpSelect
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-11 22:28:34 -08:00
Jason Ekstrand 6737b1b859 spirv: Add basic type validation for OpLoad, OpStore, and OpCopyMemory
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-12-11 22:28:34 -08:00
Jason Ekstrand bb1e6ff161 spirv: Add a prepass to set types on vtn_values
This autogenerated pass will automatically find and set the type field
on all vtn_values.  This way we always have the type and can use it for
validation and other checks.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-11 22:28:34 -08:00
Jason Ekstrand 2c84b49ddf spirv: Add a vtn_type field to all vtn_values
At the moment, this just lets us drop the const_type for constants and
unify things a bit.  Eventually, we will use this to store the types of
all SPIR-V SSA values.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-11 22:28:34 -08:00
Jason Ekstrand 24f019fd69 spirv: Allow ignoring decorations for workgroup variables
Since we switched over to lowering SLM access directly in SPIR-V -> NIR,
we no longer have vtn_variables for SLM.  It's all safe as with UBOs and
SSBOs but we need to let it through in the assert.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104213
Fixes: 8761a04d0d
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-11 19:02:47 -08:00
Jason Ekstrand 2bc9123c33 spirv: Set lengths on scalar and vector types
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-11 19:02:47 -08:00
Bas Nieuwenhuizen b926da241a spirv: Fix loading an entire block at once.
There is no chain, so  checking the length ends with a SEGFAULT.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103579
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-10 01:43:26 +01:00
Jordan Justen 7cf1037d5a main, glsl: Add UniformDataDefaults which stores uniform defaults
The ARB_get_program_binary extension requires that uniform values in a
program be restored to their initial value just after linking.

This patch saves off the initial values just after linking. When the
program is restored by glProgramBinary, we can use this to copy the
initial value of uniforms into UniformDataSlots.

V2 (Timothy Arceri):
 - Store UniformDataDefaults only when serializing GLSL as this
   is what we want for both disk cache and ARB_get_program_binary.
   This saves us having to come back later and reset the Uniforms
   on program binary restores.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:44:35 +11:00
Jordan Justen ebd9e789c4 glsl: Split out shader program serialization
This will allow us to use the program serialization to implement
ARB_get_program_binary.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:44:35 +11:00
Alejandro Piñeiro 25e56b2eba mesa/spirv: move and rename nir_spirv_supported_capabilities
To avoid any vulkan driver to include the GL mtypes.h. Renamed as
eventually this could be used by drivers not using nir.

v2: remove compiler/spirv/spirv.h from mtypes (Alejandro)
v3: added the definition at compiler/shader_info.h (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-07 17:15:11 +01:00
Samuel Iglesias Gonsálvez 392638d6b5 spirv: fix bug when OpSpecConstantOp calls a conversion
In that case, nir_eval_const_opcode() will evaluate the conversion
but as it was using destination's bit_size, the resulting
value was just a cast of the source constant value. By passing the
source's bit size, it does the conversion properly.

Fixes:

dEQP-VK.spirv_assembly.instruction.*.opspecconstantop.*convert*

v2:
- Remove invalid conversion op cases.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-07 10:19:34 +01:00
Samuel Iglesias Gonsálvez 67ec314347 spirv: allow specialization constants with bitsize different than 32 bits
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-07 10:19:34 +01:00
James Legg 947470d10b nir/opcodes: Fix constant-folding of bitfield_insert
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104119
CC: <mesa-stable@lists.freedesktop.org>
CC: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-12-07 08:59:36 +00:00
Timothy Arceri 9d53ccccb2 glsl: get correct member type when processing xfb ifc arrays
This fixes a crash in:

KHR-GL45.enhanced_layouts.xfb_block_stride

Fixes: 0822517936 "glsl: add helper to process xfb qualifiers during linking"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-12-07 15:22:23 +11:00
Alejandro Piñeiro 0398b31d1d mesa: define nir_spirv_supported_capabilities
Until now it was part of spirv_to_nir_options. But it will be used on
the implementation of ARB_gl_spirv and ARB_spirv_extensions, and added
to the OpenGL context, as a way to save what SPIR-V capabilities the
current OpenGL implementation supports.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-06 22:25:52 +01:00
Eduardo Lima Mitev 4049c04122 spirv/nir: Add support for SPV_KHR_16bit_storage
v2: Minor changes after rebase against recent master (Alejandro
    Pinheiro)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo e0667a8bd8 spirv: Enable FPRoundingMode decorator to nir operations
SpvOpFConvert now manages the FPRoundingMode decorator for the
returning values enabling the nir_rounding_mode in the conversion
operation to fp16 values.

v2: Fixed breaking of specialization constants. (Jason Ekstrand)

v3: Avoid nir_rounding_mode * casting. (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev 549894a681 spirv/nir: Handle 16-bit types
v2: Added more missing implementations of 16-bit types. (Jason Ekstrand)

v3: Store values in values[0].u16[i] (Jason Ekstrand)
    Include switches based on bitsize for 16-bit types
    (Chema Casanova)
v4: Coding style fixes (Jason Ekstrand)
    Use vtn_u64_literal and u64[0] at 64-bit SpvOpConstant (Jason Ekstrand)

Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Eduardo Lima <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo 1f440d00d2 nir: Handle fp16 rounding modes at nir_type_conversion_op
nir_type_conversion enables new operations to handle rounding modes to
convert to fp16 values. Two new opcodes are enabled nir_op_f2f16_rtne
and nir_op_f2f16_rtz.

The undefined behaviour doesn't has any effect and uses the original
nir_op_f2f16 operation.

v2: Indentation fixed (Jason Ekstrand)

v3: Use explicit case for undefined rounding and assert if
    rounding mode is used for non 16-bit float conversions
    (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev 2af63683bc nir: Populate conversion opcodes to 16-bit types
This will include the following NIR ALU opcodes:
 * nir_op_i2i16
 * nir_op_i2f16
 * nir_op_u2u16
 * nir_op_u2f16
 * nir_op_f2i16
 * nir_op_f2u16
 * nir_op_f2f16

v2: Remove "from" 16-bit in commit subject (Topi Pohjolainen)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo d711445430 nir: Add rounding modes enum
v2: Added comments describing each of the rounding modes. (Jason
    Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev 5165e222d1 nir: Add support for 16-bit types (half float, int16 and uint16)
v2: Renamed glsl_half_float_type() to glsl_float16_t_type().
    (Jason Ekstrand)

Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Eduardo Lima <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev 59f458cd87 glsl: Add 16-bit types
Adds new INT16, UINT16 and FLOAT16 base types.

The corresponding GL types for half floats were reused from the
AMD_gpu_shader_half_float extension. The int16 and uint16 types come from
NV_gpu_shader_5 extension.

This adds the builtins and the lexer support.

To avoid a bunch of warnings due to cases not handled in switch, the
new types have been added to a few places using same behavior as
their 32-bit counterparts, except for a few trivial cases where they are
already handled properly. Subsequent patches in this set will provide
correct 16-bit implementations when needed.

v2: * Use FLOAT16 instead of HALF_FLOAT as name of the base type.
    * Removed float16_t from builtin types.
    * Don't copy 16-bit types as if they were 32-bit values in
      copy_constant_to_storage().
    * Use get_scalar_type() instead of adding a new custom switch
      statement.
    (Jason Ekstrand)
v3: Use GL_FLOAT16_NV instead of GL_HALF_FLOAT for consistency
    (Ilia Mirkin)
v4: Add missing 16-bit base types support in glsl_to_nir (Eduardo Lima).
v5: Fix coding style (Topi Poholainen).

Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-06 08:57:18 +01:00
Jason Ekstrand 93b4cb61eb spirv: Allow OpPtrAccessChain for block indices
The SPIR-V spec is a bit underspecified when it comes to exactly how
you're allowed to use OpPtrAccessChain and what it means in certain edge
cases.  In particular, what if the base pointer of the OpPtrAccessChain
points to the base struct of an SSBO instead of an element in that SSBO.
The original variable pointers implementation in mesa assumed that you
weren't allowed to do an OpPtrAccessChain that adjusted the block index
and asserted such.  However, there are some CTS tests that do this and,
if the CTS does it, someone will do it in the wild so we should probably
handle it.  With this commit, we significantly reduce our assumptions
and should be able to handle more-or-less anything.

The one assumption we still make for correctness is that if we see an
OpPtrAccessChain on a pointer to a struct decorated block that the block
index should be adjusted.  In theory, someone could try to put an array
stride on such a pointer and try to make the SSBO an implicit array of
the base struct and we would not give them what they want.  That said,
any index other than 0 would count as an out-of-bounds access which is
invalid.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 22:01:54 -08:00
Jason Ekstrand cfb81f58a0 nir: Add a vulkan_resource_reindex intrinsic
This is required for being able to handle OpPtrAccessChain in SPIR-V
where the base type of the incoming pointer requires us to add to the
block index instead of the byte offset.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 22:01:54 -08:00
Jason Ekstrand ae54a4f84f spirv: Add support for lowering workgroup access to offsets
Before, we always left workgroup variables as shared nir_variables and
let the driver call nir_lower_io.  This adds an option to do the
lowering directly in spirv_to_nir.  To do this, we implicitly assign the
variables a std430 layout and then treat them like a UBO or SSBO and
immediately lower all the way to an offset.

As a side-effect, the spirv_to_nir pass now handles variable pointers
for workgroup variables.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 22:01:54 -08:00
Jason Ekstrand f6eb5ce39c spirv: Rename get_shared_nir_atomic_op to get_var_nir_atomic_op
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 22:01:54 -08:00
Jason Ekstrand 992aabf239 spirv: Add theoretical support for single component pointers
Up until now, all pointers have been ivec2s.  We're about to add support
for pointers to workgroup storage and those are going to be uints.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 22:01:54 -08:00
Jason Ekstrand 843c192e2b spirv: Use offset_pointer_dereference to instead of get_vulkan_resource_index
There is no good reason why we should have the same logic repeated in
get_vulkan_resource_index and vtn_ssa_offset_pointer_dereference.  If
we're a bit more careful about how we do things, we can just use the one
function and get rid of the other entirely.  This also makes the push
constant special case a lot more clear.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 22:01:53 -08:00
Jason Ekstrand 6dffef6308 spirv: Refactor a couple of pointer query helpers
This commit moves them both into vtn_variables.c towards the top, makes
them take a vtn_builder, and replaces a hand-rolled instance of
is_external_block with a function call.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 20:56:16 -08:00
Jason Ekstrand 93646fb503 spirv: Refactor the base case of offset_pointer_dereference
This makes us key off of !offset instead of !block_index.  It also puts
the guts inside a switch statement so that we can handle more than just
UBOs and SSBOs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 20:56:14 -08:00
Jason Ekstrand 98edf6bca4 spirv: Add a switch statement for the block store opcode
This parallels what we do for vtn_block_load except that we don't yet
support anything except SSBO loads through this path.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 20:55:39 -08:00
Jason Ekstrand 91d91ce3e2 spirv: Use a dereference instead of vtn_variable_resource_index
This is equivalent and means we don't have resource index code scattered
about.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-05 20:55:37 -08:00
Jason Ekstrand d74b1f4809 spirv: Replace unreachable with vtn_fail
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2017-12-04 09:21:09 -08:00
Jason Ekstrand b7ef60d846 spirv: Replace assert with vtn_assert
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2017-12-04 09:21:09 -08:00
Jason Ekstrand 94ca8e04ad spirv: Add vtn_fail and vtn_assert helpers
These helpers are much nicer than just using assert because they don't
kill your process.  Instead, it longjmps back to spirv_to_nir(), cleans
up all the temporary memory, and nicely returns NULL.  While crashing is
completely OK in the Vulkan world, it's not considered to be quite so
nice in GL.  This should help us to make SPIR-V parsing much more
robust.  The one downside here is that vtn_assert is not compiled out in
release builds like assert() is so it isn't free.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2017-12-04 09:21:09 -08:00
Jason Ekstrand 591a07632c spirv: Do something useful with OpSource
We may as well log the source language and file name.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2017-12-04 09:21:09 -08:00
Jason Ekstrand 16dfdeefc8 spirv: Rework logging
This commit reworks the way that logging works in SPIR-V to provide
richer and more detailed logging infrastructure.  This commit contains
several improvements over the old mechanism:

 1) Log messages are now more detailed.  They contain the SPIR-V byte
    offset as well as source language information from OpSource and
    OpLine.

 2) There is now a logging callback mechanism so that errors can get
    propagated to the client through debug callbak extensions.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2017-12-04 09:21:09 -08:00
Jason Ekstrand 11bd753c4e spirv: Re-arrange vtn_builder initialization
This simply moves allocating the vtn_builder and initializing it to the
very beginning before we even parse the header.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2017-12-04 09:21:09 -08:00
Jason Ekstrand d74bec1a54 spirv: Parent the nir_shader to the builder while building
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2017-12-04 09:21:09 -08:00
Samuel Iglesias Gonsálvez fa8c1b92b7 glsl: don't run intrastage array validation when the interface type is not an array
We validate that the interface block array type's definition matches.
However, previously, the function could be called if an non-array
interface block has different type definitions -for example, when the
precision qualifier differs in a GLSL ES shader, we would create two
different types-, and it would return invalid as both definitions are
non-arrays.

We fix this by specifying that at least one definition should be an
array to call the validation.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:32:57 +01:00
Samuel Iglesias Gonsálvez fc6d55952d glsl/es: precision qualifier doesn't need to match in UBOs
They might mismatch due to the two shaders using different GLSL
versions, and that's ok in desktop GL. In ES, precision qualifiers
don't need to match.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:32:57 +01:00
Fabian Bieler 9bdb5457f4 glsl: Match order of gl_LightSourceParameters elements.
spotExponent and spotCosCutoff were swapped in the
gl_builtin_uniform_element struct.
Now the order matches across gl_builtin_uniform_element,
glsl_struct_field and the spec.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-12-03 21:14:14 -07:00
Timothy Arceri d99c7e0ff1 nir: allow builin arrays to be lowered
Galliums nir drivers expect this to be done.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri 2bc49ac3e6 nir: add array lowering function that assumes there are no indirects
The gallium glsl->nir pass currently lowers away all indirects on both inputs
and outputs. This fuction allows us to lower vs inputs and fs outputs and also
lower things one stage at a time as we don't need to worry about indirects
on the other side of the shaders interface.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri 2a35021bc6 nir: fix support for scalar arrays in nir_lower_io_types()
This was just recreating the same vector type we alreay had and
hitting an assert for scalars.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:10:30 +11:00
Timothy Arceri 1c9c42d16b nir: add varying component packing helpers
v2: update shader info input/output masks when pack components
v3: make sure interpolation loc matches, this is required for the
    radeonsi NIR backend.
v4: 33dca36f4f fixed nir_gather_info to update outputs_read
    correct, make sure we also adjust this correctly when
    packing components.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)
2017-12-04 09:10:30 +11:00
Timothy Arceri c797bc6aa7 nir: add varying array splitting pass
V2:
 - fix matrix support, non-array matrices were being skipped in v1

v3:
 - handle lowering of tcs output loads correctly
 - correctly mark indirect locations for either in or out not both
   when processing a stage.
 - use nir_src_copy() when lowering stores.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-04 09:10:30 +11:00
Jason Ekstrand e19c623128 spirv: Convert the supported_extensions struct to spirv_options
This is a bit more general and lets us pass additional options into the
spirv_to_nir pass beyond what capabilities we support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-02 08:09:11 -08:00
Jason Ekstrand 6bd876dcaa spirv: Only emit functions which are actually used
Instead of emitting absolutely everything, just emit the few functions
that are actually referenced in some way by the entrypoint.  This should
save us quite a bit of time when handed large shader modules containing
many entrypoints.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-02 08:07:35 -08:00
Jason Ekstrand f5aad36d2e spirv: Drop the impl field from vtn_builder
We have a nir_builder and it has an impl field.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-02 08:07:35 -08:00
Tapani Pälli faccbaf3fa mesa: add AllowGLSLCrossStageInterpolationMismatch workaround
This fixes issues seen with certain versions of Unreal Engine 4 editor
and games built with that using GLSL 4.30.

v2: add driinfo_gallium change (Emil Velikov)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97852
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103801
Acked-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-30 11:43:10 +02:00
Timothy Arceri a39a3b4b76 mesa: rework _mesa_add_parameter() to only add a single param
This is more inline with what the functions name suggests it should
do, and makes the code much easier to follow.

This will also make adding uniform packing support much simpler.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 21:50:48 +11:00
Eric Engestrom 9d281e1506 compiler: fix typo
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-11-28 10:54:38 +00:00
Eric Engestrom 7b85b9b877 compiler: use NDEBUG to guard asserts
nir_validate.c's #endif already had the correct NDEBUG comment

Fixes: dcb1acdea0 "nir/validate: Only build in debug mode"
Fixes: 9ff71b649b "i965/nir: Validate that NIR passes call nir_metadata_preserve()"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-11-28 10:54:38 +00:00
Timothy Arceri 3e789026ca st/glsl_to_tgsi: make use of driver_cache_blob with the disk cache
driver_cache_blob was introduced with the i965 disk cache, it allows
us to simplify the cache a little and possibly offers some minor
speed improvements since we load the GLSL metadata and TGSI from
disk in one pass.

Using driver_cache_blob should also make it straight forward to
implement binary support for ARB_get_program_binary in gallium.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-28 09:01:44 +11:00
Gwan-gyeong Mun 4cb27047c8 glsl: Fix typo nagivation -> navigation
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-28 08:48:55 +11:00
Dave Airlie 33dca36f4f nir: fill outputs_read field and add patch outputs read (v2)
This is to be used for TCS optimisations on radv.

v2: don't set written on reads (nha)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-27 13:50:03 +10:00
Ilia Mirkin ab336e8b46 nir: allow texture offsets with cube maps
GL doesn't have this, but some hardware supports it. This is convenient
for lowering tg4 to plain texture calls, which is necessary on Adreno
A4xx hardware.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2017-11-25 16:56:30 -05:00
Marek Olšák 78942e7dbf mesa: shrink VERT_ATTRIB bitfields to 32 bits
There are only 32 vertex attribs now.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-11-25 17:18:22 +01:00
Marek Olšák 43abaf2ad0 mesa: remove unused vertex attrib WEIGHT
We don't support ARB_vertex_blend.

Note that the attribute aliasing check for ARB_vertex_program had to be
rewritten.

vbo_context: 20344 -> 20008 bytes
gl_context: 74672 -> 74616 bytes

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-11-25 17:17:52 +01:00
Marek Olšák 2116b97418 mesa: don't assign numbers to vertex attrib enums manually
I plan to remove one of them.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-11-25 17:17:52 +01:00
Iago Toral Quiroga a217cbd7ec nir/gather_info: recognize load_patch_vertices_in as a system value
This intrinsic is produced to load SYSTEM_VALUE_VERTICES_IN, which is
generated to load gl_PatchVerticesIn in the SPIR-V path for both
Vulkan and OpenGL.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-22 08:03:55 +01:00
George Barrett f09c2cefdd glsl: Catch subscripted calls to undeclared subroutines
generate_array_index fails to check whether the target of a subroutine
call exists in the AST, potentially passing around null ir_rvalue
pointers eventuating in abort/segfault.

Fixes: fd01840c0b ("glsl: add AoA support to subroutines")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100438
2017-11-20 11:04:04 +11:00
Brian Paul 92c1290dc5 glsl: s/unsigned/glsl_base_type/ in glsl type code (v2)
Declare glsl_type::sampled_type as glsl_base_type as we do for the
base_type field.  And make base_type a bitfield to save a few bytes.

Update glsl_type constructor to take glsl_base_type instead of unsigned
and pass GLSL_TYPE_VOID instead of zero.

No Piglit regressions with llvmpipe.

v2:
- Declare both base_type and sampled_type as 8-bit fields
- Use the new ASSERT_BITFIELD_SIZE() macro.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-16 20:35:17 -07:00
Alejandro Piñeiro b498172d0e spirv: fix typo on DO NOT EDIT header
Introduced on commit 157c9a1341

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-14 13:07:36 +01:00
Alex Smith 4122d00846 nir/spirv: tg4 requires a sampler
Gather operations in both GLSL and SPIR-V require a sampler. Fixes
gathers returning garbage when using separate texture/samplers (on AMD,
was using an invalid sampler descriptor).

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-13 13:38:18 +00:00
Alex Smith e9eb3c4753 spirv: Use correct type for sampled images
We should use the result type of the OpSampledImage opcode, rather than
the type of the underlying image/samplers.

This resolves an issue when using separate images and shadow samplers
with glslang. Example:

    layout (...) uniform samplerShadow s0;
    layout (...) uniform texture2D res0;
    ...
    float result = textureLod(sampler2DShadow(res0, s0), uv, 0);

For this, for the combined OpSampledImage, the type of the base image
was being used (which does not have the Depth flag set, whereas the
result type does), therefore it was not being recognised as a shadow
sampler. This led to the wrong LLVM intrinsics being emitted by RADV.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-13 13:37:50 +00:00
Alejandro Piñeiro 157c9a1341 spirv: add DO NOT EDIT warning on generated spirv_info.c
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-13 13:28:44 +01:00
Iago Toral Quiroga 456e10944f glsl/linker: use without_array() to retrieve type
This is what we do in the condition too, so it makes sense.

v2: Only compute without_array() once (Ilia).

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-11-13 09:22:26 +01:00
Timothy Arceri 8c9f3f2c46 nir: add streams to nir data
This will be used by gallium drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-12 11:08:26 +11:00
Rob Clark ef4c42fc3a nir: handle get_buffer_size in nir_lower_atomics_to_ssbo
Overlooked initially, be we need to remap the SSBO index for this as
well.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-10 08:57:33 -05:00
Kenneth Graunke 688d695868 glsl: Make #pragma STDGL invariant(all) only modify outputs.
According to the GLSL ES 3.20, GLSL 4.50, and GLSL 1.20 specs:

   "To force all output variables to be invariant, use the pragma

       #pragma STDGL invariant(all)

    before all declarations in a shader."

Notably, this is only supposed to affect output variables.  Furthermore,

   "Only variables output from a shader can be candidates for invariance."

It looks like this has been wrong since we first supported the pragma in
2011 (commit 86b4398cd1).

Fixes dEQP-GLES2.functional.shaders.preprocessor.pragmas.pragma_fragment.

v2: Now that all cases are identical (other than compute shaders, which
    have no output variables anyway), we can drop the switch statement
    entirely.  We also don't need the current_function == NULL check;
    this was a hold over from when we had a single var_mode_out for both
    function parameters and shader varyings, in the bad old days.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-11-08 23:11:48 -08:00
Neil Roberts 4dc8458cd1 glsl: Transform fb buffers are only active if a variable uses them
The GL spec will soon be revised to clarify that a buffer binding for
a transform feedback buffer is only required if a variable is actually
defined to use the buffer binding point. Previously a declaration for
the default transform buffer would make it require a binding even if
nothing was declared to use the default buffer.

Affects:
KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list
KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list_and_api

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-09 05:39:42 +01:00
Ian Romanick 9c53b80ff9 glsl: Minor cleanups after previous commit
I think it's more clear to only call emit_access once.  The only
difference between the two calls is the value of size_mul used for the
offset parameter... but you really have to look at it to be sure.

The s/is_64bit/is_double/ change is because there are no int64_t or
uint64_t matrix types.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-11-08 18:37:29 -08:00
Ian Romanick c18d8c61d6 glsl: Use more link_calculate_matrix_stride in lower_buffer_access
I was going to squash this with the previous commit, but there's a lot
of churn in that commit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-11-08 18:37:29 -08:00
Ian Romanick 1a2beae1b3 glsl: Use link_calculate_matrix_stride in lower_buffer_access and friends
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-11-08 18:37:29 -08:00
Ian Romanick 24e78d99db glsl: Refactor matrix stride calculation into a utility function
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-11-08 18:37:29 -08:00
Ian Romanick 88f5588f77 glsl/linker: Optimize swizzles again after linking
Without this, the SPIR-V generator has to deal with a bunch of junk
like:

    (swiz z (swiz xxx (swiz x (var_ref packed:binormal.z,light_dir))))

It seems better to cull that stuff out than to add code to deal with
it.  The problem is the way swizzles to and from scalars have to be
handled in SPIR-V.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-11-08 18:37:29 -08:00
Ian Romanick ef1ca06ce8 glsl: Combine nop-swizzle optimization with swizzle-swizzle optimization
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: <thomashelland90@gmail.com>
2017-11-08 18:37:29 -08:00
Ian Romanick c858abb14f glsl: Make the swizzle-swizzle optimization greedy
If there is a long sequence of swizzled swizzles, compact all of them
down to a single swizzle.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: <thomashelland90@gmail.com>
2017-11-08 18:37:29 -08:00
Ian Romanick ae1fd09c1d glsl: Remove program_resource_visitor::visit_field(const glsl_struct_field *)
I could not find any remaining users.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-08 18:37:29 -08:00
Ian Romanick 2c7657f62c glsl: Silence unused parameter warning
glsl/lower_shared_reference.cpp: In member function ‘virtual void
{anonymous}::lower_shared_reference_visitor::insert_buffer_access(void*,
ir_dereference*, const glsl_type*, ir_rvalue*, unsigned int, int)’:

glsl/lower_shared_reference.cpp:244:58: warning: unused parameter
‘channel’ [-Wunused-parameter]
                                                      int channel)
                                                          ^~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-08 18:37:29 -08:00
Timothy Arceri 9c33533586 glsl: use the correct parent when allocating program data members
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-09 12:07:48 +11:00
Timothy Arceri cf05bb506a glsl: drop cache_fallback
This turned out to be a dead end, it is much easier and less error
prone to just cache the IR used by the drivers backend e.g. TGSI or
NIR.

Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-09 12:07:48 +11:00
Matt Turner 77a63d190a nir: Don't print swizzles when there are more than 4 components
... as can happen with various types like mat4, or else we'll smash the
stack writing past the end of components_local[].

Fixes: 5a0d3e1129 ("nir: Print the components referenced for split or
                      packed shader in/outs.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-11-08 13:22:26 -08:00
Dylan Baker 34593e978c meson: Add threads dependencies to glsl_compiler executable
Fixes compiling the optional standalone glsl compiler.

Reported-by: DrNick (on irc)
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-08 11:36:02 -08:00
Andreas Boll a6932faae1 glsl: Fix typo fragement -> fragment
Fixes: 94d669b0d2 ("glsl: enforce fragment shader input restrictions in
       GLSL ES 3.10")

Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-08 18:30:48 +00:00
Juan A. Suarez Romero d5a641106b glsl: add varying resources for arrays of complex types
This patch is mostly a patch done by Ilia Mirkin.

It fixes KHR-GL45.enhanced_layouts.varying_structure_locations.

v2: fix locations for TCS/TES/GS inputs and outputs (Ilia)

CC: Ilia Mirkin <imirkin@alum.mit.edu>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-11-08 10:12:07 +01:00
Jason Ekstrand df81b81fb9 compiler/nir_types: Handle vectors in glsl_get_array_element
Most of NIR doesn't allow doing array indexing on a vector (though it
does on a matrix).  However, nir_lower_io handles it just fine and this
behavior is needed for shared variables in Vulkan.  This commit makes
glsl_get_array_element do something sensible for vector types and makes
nir_validate happy with them.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:41:24 -08:00
Jason Ekstrand ad77775809 nir: Validate base types on array dereferences
We were already validating that the parent type goes along with the
child type but we weren't actually validating that the parent type is
reasonable.  This fixes that.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:41:24 -08:00
Jason Ekstrand ab9220edd6 nir,intel/compiler: Use a fixed subgroup size
The GL_ARB_shader_ballot spec says that gl_SubGroupSizeARB is declared
as a uniform.  This means that it cannot change across an invocation
such as a draw call or a compute dispatch.  For compute shaders, we're
ok because we only ever use one dispatch size.  For fragment, however,
the hardware dynamically chooses between SIMD8 and SIMD16 which violates
the spec.  Instead, let's just pick a subgroup size based on the shader
stage.  The fixed size we choose for compute shaders is a bit higher
than strictly needed but there's no real harm in that.  The advantage is
that, if they do anything interesting with the value, NIR will see it as
an immediate and can optimize better.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand a026458020 nir/lower_subgroups: Lower ballot intrinsics to the specified bit size
Ballot intrinsics return a bitfield of subgroups.  In GLSL and some
SPIR-V extensions, they return a uint64_t.  In SPV_KHR_shader_ballot,
they return a uvec4.  Also, some back-ends would rather pass around
32-bit values because it's easier than messing with 64-bit all the time.
To solve this mess, we make nir_lower_subgroups take a new parameter
called ballot_bit_size and it lowers whichever thing it gets in from the
source language (uint64_t or uvec4) to a scalar with the specified
number of bits.  This replaces a chunk of the old lowering code.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand 8c2bf020fd nir/builder: Add a nir_imm_intN_t helper
This lets you easily build integer immediates of arbitrary bit size.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand 9b35faba42 nir/lower_system_values: Lower SUBGROUP_*_MASK based on type
The SUBGROUP_*_MASK system values are uint64_t when coming in from GLSL
but uvec4 when coming in from SPIR-V.  Lowering based on type allows us
to nicely handle both.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand 3ee91ee6ac nir: Make ballot intrinsics variable-size
This way they can return either a uvec4 or a uint64_t.  At the moment,
this is a no-op since we still always return a uint64_t.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand ad127afcfd nir: Add a ssa_dest_init_for_type helper
This would be useful a number of places

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand 28da82f978 nir: Add a new subgroups lowering pass
This commit pulls nir_lower_read_invocations_to_scalar along with most
of the guts of nir_opt_intrinsics (which mostly does subgroup lowering)
into a new nir_lower_subgroups pass.  There are various other bits of
subgroup lowering that we're going to want to do so it makes a bit more
sense to keep it all together in one pass.  We also move it in i965 to
happen after nir_lower_system_values to ensure that because we want to
handle the subgroup mask system value intrinsics here.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand 295605c930 intel/cs: Push subgroup ID instead of base thread ID
We're going to want subgroup ID for SPIR-V subgroups eventually anyway.
We really only want to push one and calculate the other from it.  It
makes a bit more sense to push the subgroup ID because it's simpler to
calculate and because it's a real API thing.  The only advantage to
pushing the base thread ID is to avoid a single SHL in the shader.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand 80ddfab2f5 intel/cs: Rework the way thread local ID is handled
Previously, brw_nir_lower_intrinsics added the param and then emitted a
load_uniform intrinsic to load it directly.  This commit switches things
over to use a specific NIR intrinsic for the thread id.  The one thing I
don't like about this approach is that we have to copy thread_local_id
over to the new visitor in import_uniforms.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Gwan-gyeong Mun fb87c40a58 nir: fix a typo
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-11-06 18:11:24 -08:00
Tomasz Figa 0886be093f glsl: Allow precision mismatch on dead data with GLSL ES 1.00
Commit 259fc50545 added linker error for
mismatching uniform precision, as required by GLES 3.0 specification and
conformance test-suite.

Several Android applications, including Forge of Empires, have shaders
which violate this rule, on a dead varying that will be eliminated.
The problem affects a big number of applications using Cocos2D engine
and other GLES implementations accept this, this poses a serious
application compatibility issue.

Starting from GLSL ES 3.0, declarations with conflicting precision
qualifiers are explicitly prohibited. However GLSL ES 1.00 does not
clearly specify the behavior, except that

  "Uniforms are defined to behave as if they are using the same storage in
  the vertex and fragment processors and may be implemented this way.
  If uniforms are used in both the vertex and fragment shaders, developers
  should be warned if the precisions are different. Conversion of
  precision should never be implicit."

The word "used" is not clear in this context and might refer to
 1) declared (same as GLES 3.x)
 2) referred after post-processing, or
 3) linked after all optimizations are done.

Looking at existing applications, 2) or 3) seems to be widely adopted.
To avoid compatibility issues, turn the error into a warning if GLSL ES
version is lower than 3.0 and the data is dead in at least one of the
shaders.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97532
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-06 15:16:03 -08:00
Nicolai Hähnle ca63a5ed3e glsl: fix interpolateAtXxx(some_vec[idx], ...) with dynamic idx
The dynamic index of a vector (not array!) is lowered to a sequence of
conditional assignments. However, the interpolate_at_* expressions
require that the interpolant is an l-value of a shader input.

So instead of doing conditional assignments of parts of the shader input
and then interpolating that (which is nonsensical), we interpolate the
entire shader input and then do conditional assignments of the interpolated
result.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-03 14:30:08 +01:00
Nicolai Hähnle 4f42450b86 glsl: allow any l-value of an input variable as interpolant in interpolateAt*
The intended rule has been clarified in GLSL 4.60, Section 8.13.2
(Interpolation Functions):

   "For all of the interpolation functions, interpolant must be an l-value
    from an in declaration; this can include a variable, a block or
    structure member, an array element, or some combination of these.
    Component selection operators (e.g., .xy) may be used when specifying
    interpolant."

For members of interface blocks, var->data.must_be_shader_input must be
determined on-the-fly after lowering interface blocks, since we don't want
to disable varying packing for an entire block just because one input in it
is used in interpolateAt*.

v2: keep setting must_be_shader_input in ast_function (Ian)
v3: follow the relaxed rule of GLSL 4.60
v4: only apply the relaxed rules to desktop GL
    (the ES WG decided that the relaxed rules may apply in a future version
     but not retroactively; see also
     dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_centroid.negative.*)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101378
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-03 14:30:08 +01:00
Dave Airlie 57372c5a42 nir/serialize: fix build with gcc 4.4.7
I had to build on RHEL6 today, and noticed this.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-03 15:03:35 +10:00
Timothy Arceri 440d08fe93 nir: skip lowering sampler if there is no dereference
This avoids a crash on the output of nir_lower_bitmap().

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:19:46 +11:00
Timothy Arceri cf5f8f55c3 nir: add tess patch support to nir_remove_unused_varyings()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 08:58:39 +11:00
Jordan Justen e6ecd7d73f glsl/shader_cache: Save fs (BlendSupport) metadata
Fixes many GL 4.5 CTS blend tests, such as:

* GL45-CTS.blend_equation_advanced.extension_directive_enable
* GL45-CTS.blend_equation_advanced.extension_directive_warn
* GL45-CTS.blend_equation_advanced.blend_all.GL_MULTIPLY_KHR_all_qualifier
* GL45-CTS.blend_equation_advanced.blend_specific.GL_COLORBURN_KHR

v2:
 * Directly save the BlendSupport field to avoid potentially including
   a pointer in the future in the structure is updated. (tarceri)

Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 23:36:54 -07:00
Timothy Arceri 15f39e8654 mesa/glsl: add api_enabled flag to gl_transform_feedback_info
This will be used to disable the shader cache when xfb is enabled
via the api as we don't currently allow for it when generating the
sha for the shader.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-31 23:36:54 -07:00
Jordan Justen 4c7a1ec62a blob: Don't set overrun if reading 0 bytes at end of data
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 23:36:54 -07:00
Jordan Justen 6b815e405d glsl/shader_cache: Save and restore serialized nir in gl_program
v3:
 * Rename serialized_nir* to driver_cache_blob*. (Tim)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 23:36:54 -07:00
Jason Ekstrand 54f691311c nir: Add hooks for testing serialization
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-31 23:36:53 -07:00
Connor Abbott 120da00975 nir: add serialization and deserialization
v2 (Jason Ekstrand):
 - Various whitespace cleanups
 - Add helpers for reading/writing objects
 - Rework derefs
 - [de]serialize nir_shader::num_*
 - Fix uses of blob_reserve_bytes
 - Use a bitfield struct for packing tex_instr data

v3:
 - Zero nir_variable struct on deserialization. (Jordan)
 - Allow nir_serialize.h to be included in C++. (Jordan)
 - Handle NULL info.name. (Jason)
 - Set info.name to NULL when name is NULL. (Jordan)

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-31 23:36:53 -07:00
Neil Roberts b697ece10a nir/opt_intrinsics: Fix values for gl_SubGroupG{e,t}MaskARB
Previously the values were calculated by just shifting ~0 by the
invocation ID. This would end up including bits that are higher than
gl_SubGroupSizeARB. The corresponding CTS test effectively requires that
these high bits be zero so it was failing. There is a Piglit test as
well but this appears to checking the wrong values so it passes.

For the two greater-than bitmasks, this patch adds an extra mask with
(~0>>(64-gl_SubGroupSizeARB)) to force these bits to zero.

Fixes: KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102680#c3
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Neil Roberts <nroberts@igalia.com>
2017-10-31 23:28:00 +01:00
Ian Romanick 53c7b8bdca glsl: Fix bad formatting in a comment
Trivial

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2017-10-30 20:08:25 -07:00
Eduardo Lima Mitev f9de7f5596 glsl/linker: Check that re-declared, inter-shader built-in blocks match
>From GLSL 4.5 spec, section "7.1 Built-In Language Variables", page 130 of
the PDF states:

    "If multiple shaders using members of a built-in block belonging to
     the same interface are linked together in the same program, they must
     all redeclare the built-in block in the same way, as described in
     section 4.3.9 “Interface Blocks” for interface-block matching, or a
     link-time error will result."

Fixes:
* GL45-CTS.CommonBugs.CommonBug_PerVertexValidation

v2 (Neil Roberts):
Explicitly look for gl_PerVertex in the symbol tables instead of
waiting to find a variable in the interface.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102677
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>
2017-10-30 18:10:39 +01:00
Eduardo Lima Mitev f5fe99ac85 glsl: Use the utility function to copy symbols between symbol tables
This effectively factorizes a couple of similar routines.

v2 (Neil Roberts): Non-trivial rebase on master

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>
2017-10-30 18:10:39 +01:00
Eduardo Lima Mitev 4c62a270a9 glsl_parser_extra: Add utility to copy symbols between symbol tables
Some symbols gathered in the symbols table during parsing are needed
later for the compile and link stages, so they are moved along the
process. Currently, only functions and non-temporary variables are
copied between symbol tables. However, the built-in gl_PerVertex
interface blocks are also needed during the linking stage (the last
step), to match re-declared blocks of inter-stage shaders.

This patch adds a new utility function that will factorize current code
that copies functions and variables between two symbol tables, and in
addition will copy explicitly declared gl_PerVertex blocks too.

The function will be used in a subsequent patch.

v2 (Neil Roberts):
Allow the src symbol table to be NULL and explicitly copy the
gl_PerVertex symbols in case they are not referenced in the exec_list.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>
2017-10-30 18:10:39 +01:00
Ian Romanick 6403efbe74 glsl: Remove ir_binop_greater and ir_binop_lequal expressions
NIR does not have these instructions.  TGSI and Mesa IR both implement
them using < and >=, repsectively.  Removing them deletes a bunch of
code and means I don't have to add code to the SPIR-V generator for
them.

v2: Rebase on 2+ years of change... and fix a major bug added in the
rebase.

   text	   data	    bss	    dec	    hex	filename
8255291	 268856	 294072	8818219	 868e2b	32-bit i965_dri.so before
8254235	 268856	 294072	8817163	 868a0b	32-bit i965_dri.so after
7815339	 345592	 420592	8581523	 82f193	64-bit i965_dri.so before
7813995	 345560	 420592	8580147	 82ec33	64-bit i965_dri.so after

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-30 09:27:09 -07:00
Ian Romanick 34f7e761bc glsl/parser: Track built-in types using the glsl_type directly
Without the lexer changes, tests/glslparsertest/glsl2/tex_rect-02.frag
fails.  Before this change, the parser would determine that
sampler2DRect is not a valid type because the call to
state->symbols->get_type() in ast_type_specifier::glsl_type() would
return NULL.  Since ast_type_specifier::glsl_type() is now going to
return the glsl_type pointer that it received from the lexer, it doesn't
have an opportunity to generate an error.

   text	   data	    bss	    dec	    hex	filename
8255243	 268856	 294072	8818171	 868dfb	32-bit i965_dri.so before
8255291	 268856	 294072	8818219	 868e2b	32-bit i965_dri.so after
7815195	 345592	 420592	8581379	 82f103	64-bit i965_dri.so before
7815339	 345592	 420592	8581523	 82f193	64-bit i965_dri.so after

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-30 09:27:09 -07:00
Ian Romanick 747c057530 glsl/parser: Return the glsl_type object from the lexer
This allows us to use a single token for every built-in type except void.

   text	   data	    bss	    dec	    hex	filename
8275163	 269336	 294072	8838571	 86ddab	32-bit i965_dri.so before
8255243	 268856	 294072	8818171	 868dfb	32-bit i965_dri.so after
7836963	 346552	 420592	8604107	 8349cb	64-bit i965_dri.so before
7815195	 345592	 420592	8581379	 82f103	64-bit i965_dri.so after

Yes, the 64-bit binary shrinks by 21k.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-30 09:27:09 -07:00
Ian Romanick 4171900cf1 glsl/parser: Allocate identifier inside classify_identifier
Passing YYSTYPE into classify_identifier enables a later patch.

   text	   data	    bss	    dec	    hex	filename
8310339	 269336	 294072	8873747	 876713	32-bit i965_dri.so before
8275163	 269336	 294072	8838571	 86ddab	32-bit i965_dri.so after
7845579	 346552	 420592	8612723	 836b73	64-bit i965_dri.so before
7836963	 346552	 420592	8604107	 8349cb	64-bit i965_dri.so after

Yes, the 64-bit binary shrinks by 8k.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-30 09:27:09 -07:00
Ian Romanick 792acfc44a glsl/parser: Move anonymous struct name handling to the parser
There are two callers of the constructor, and they are right next to
each other.  Move the "#anon_struct" name handling to the parser so that
the conditional can be removed.

I've also deleted part of the comment (about the memory leak) because I
don't think it's quite accurate or relevant.

   text	   data	    bss	    dec	    hex	filename
8310399	 269336	 294072	8873807	 87674f	32-bit i965_dri.so before
8310339	 269336	 294072	8873747	 876713	32-bit i965_dri.so after
7845611	 346552	 420592	8612755	 836b93	64-bit i965_dri.so before
7845579	 346552	 420592	8612723	 836b73	64-bit i965_dri.so after

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-30 09:27:09 -07:00
Ian Romanick fc07ab165b glsl/parser: Silence unused parameter warning
glsl/glsl_parser_extras.cpp: In constructor ‘ast_struct_specifier::ast_struct_specifier(void*, const char*, ast_declarator_list*)’:
glsl/glsl_parser_extras.cpp:1675:50: warning: unused parameter ‘lin_ctx’ [-Wunused-parameter]
 ast_struct_specifier::ast_struct_specifier(void *lin_ctx, const char *identifier,
                                                  ^~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-30 09:27:09 -07:00
Ian Romanick d70e8ef1c1 glsl: Silence unused parameter warnings
glsl/standalone_scaffolding.cpp: In function ‘GLbitfield _mesa_program_state_flags(const gl_state_index*)’:
glsl/standalone_scaffolding.cpp:103:66: warning: unused parameter ‘state’ [-Wunused-parameter]
 _mesa_program_state_flags(const gl_state_index state[STATE_LENGTH])
                                                                  ^
glsl/standalone_scaffolding.cpp: In function ‘char* _mesa_program_state_string(const gl_state_index*)’:
glsl/standalone_scaffolding.cpp:109:67: warning: unused parameter ‘state’ [-Wunused-parameter]
 _mesa_program_state_string(const gl_state_index state[STATE_LENGTH])
                                                                   ^
glsl/standalone_scaffolding.cpp: In function ‘void _mesa_delete_shader(gl_context*, gl_shader*)’:
glsl/standalone_scaffolding.cpp:115:40: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 _mesa_delete_shader(struct gl_context *ctx, struct gl_shader *sh)
                                        ^~~
glsl/standalone_scaffolding.cpp: In function ‘void _mesa_delete_linked_shader(gl_context*, gl_linked_shader*)’:
glsl/standalone_scaffolding.cpp:123:47: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 _mesa_delete_linked_shader(struct gl_context *ctx,
                                               ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-30 09:27:09 -07:00
Emil Velikov fc7816fd4e Revert "foo"
This reverts commit 27d5a7bce0.

I fat fingered it, failing to reset the checkout before applying the
sequential commit.
2017-10-30 15:32:56 +00:00
Emil Velikov 27d5a7bce0 foo
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-10-30 15:22:26 +00:00
Kenneth Graunke 86c68bb886 nir: Make nir_gather_info collect a uses_fddx_fddy flag.
i965 turns fddx/fddy into their coarse/fine variants based on the
ctx->Hint.FragmentShaderDerivative setting.  It needs to know whether
this can impact a shader in order to better guess NOS settings.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-29 20:52:20 -07:00
Jason Ekstrand 8ab9820d34 spirv: Claim support for the simple memory model
It's rather surprising that we've never actually hit this before.
Aparently, Ian's SPIR-V generator currently claims the Simple when you
don't do anything complex.  We really shouldn't assert-fail on it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: mesa-stable@lists.freedesktop.org
2017-10-26 15:24:38 -07:00
Iago Toral Quiroga 13652e7516 glsl/linker: Fix type checks for location aliasing
From the OpenGL 4.6 spec, section 4.4.1 Input Layout Qualifiers, Page 68,
(Location aliasing):

   "Further, when location aliasing, the aliases sharing the location
    must have the same underlying numerical type  (floating-point or
    integer)."

The current implementation is too strict, since it checks that the
the base types are an exact match instead.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga 7276ccf8ed glsl/linker: refactor check_location_aliasing
Mostly, this merges the type checks with all the other checks so
we only have a single loop for this.

Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga e2abb75b0e glsl/linker: validate explicit locations for SSO programs
v2:
- we only need to validate inputs to the first stage and outputs
  from the last stage, everything else has already been validated
  during cross_validate_outputs_to_inputs (Timothy).
- Use MAX_VARYING instead of MAX_VARYINGS_INCL_PATCH (Illia)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga bdaf058978 glsl/linker: generalize validate_explicit_variable_location for SSO
For non-SSO programs, we only need to validate outputs, since
the cross validation of outputs to inputs will ensure that we
produce linker errors for invalid inputs too.

Hoever, for the SSO path there is no output to input validation,
so we need to validate inputs explicitly. Generalize the function
so it can handle this as well.

Also, notice that vertex shader inputs and fragment shader outputs
are already validated in assign_attribute_or_color_locations()
for both SSO and non-SSO paths, so we should not try to validate
that here again (in fact, the function would require explicit
paths to handle these two cases properly).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga e7b7fe314e glsl/linker: create a helper function to validate explicit locations
Currently, we only validate explicit locations for non-SSO programs.
This creates a helper that we can call from both SSO and non-SSO paths
directly, so we can reuse all the logic behind this.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga ab40acb453 glsl/linker: outputs in the same location must share auxiliary storage
From ARB_enhanced_layouts:

"[...]when location aliasing, the aliases sharing the location
  must have the same underlying numerical type (floating-point or
  integer) and the same auxiliary storage and
  interpolation qualification.[...]"

Add code to the linker to validate that aliased locations do
have the same aux storage.

Fixes:
KHR-GL45.enhanced_layouts.varying_location_aliasing_with_mixed_auxiliary_storage

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga 0b565f715d glsl/linker: outputs in the same location must share interpolation
From ARB_enhanced_layouts:

"[...]when location aliasing, the aliases sharing the location
 must have the same underlying numerical type (floating-point or
 integer) and the same auxiliary storage and
 interpolation qualification.[...]"

Add code to the linker to validate that aliased locations do
have the same interpolation.

Fixes:
KHR-GL45.enhanced_layouts.varying_location_aliasing_with_mixed_interpolation

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-26 08:40:14 +02:00
Iago Toral Quiroga c4545676d7 glsl/linker: fix location aliasing checks for interface variables
The existing code was checking the whole interface variable rather
than its members, which is not what we want: we want to check
aliasing for each member in the interface variable.

Surprisingly, there are piglit tests that verify this and were
passing due to a bug in the existing code: when we were computing
the last component used by an interface variable we would use
the 'vector' path and multiply by vector_elements, which is 0 for
interface variables. This made the loop that checks for aliasing
be a no-op and not add the interface variable to the list of outputs
so then we would fail to link when we did not see a matching output
for the same input in the next stage. Since the tests expect a
linker error to happen, they would pass, but not for the right
reason.

Unfortunately, the current implementation uses ir_variable instances
to keep track of explicit locations. Since we don't have
ir_variables instances for individual interface members, we need
to have a custom struct with the data we need. This struct has
the ir_variable (which for interface members is the whole
interface variable), plus the data that we need to validate for
each aliased location, for now only the base type, which for
interface members we will take from the appropriate field inside
the interface variable.

Later patches will expand this custom struct so we can also check
other requirements for location aliasing, specifically that
we have matching interpolation and auxiliary storage, that once
again, we will take from the appropriate field members for the
interface variables.

v2:
 - Use MAX_VARYING instead of MAX_VARYINGS_INCL_PATCH (Illia)

Fixes:
KHR-GL45.enhanced_layouts.varying_block_automatic_member_locations

Fixes (these were passing before but for incorrect reasons):
tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-location-overlap.shader_test
tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-mixed-order-overlap.shader_test

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-26 08:40:14 +02:00