KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Francisco Jerez	ba79a90fb5	glsl: Switch ast_type_qualifier to a 128-bit bitset. This should end the drought of bits in the ast_type_qualifier object. The bitset_t type works pretty much as a drop-in replacement for the current uint64_t bitset. The only catch is that the bitset_t type as defined in the previous commit doesn't have a trivial constructor (because it has a user-defined constructor), so it cannot be used as union member without providing a user-defined constructor for the union (which causes it in turn to be non-trivially constructible). This annoyance could be easily addressed in C++11 by declaring the default constructor of bitset_t to be the implicitly defined one -- IMO one more reason to drop support for GCC 4.2-4.3. The other minor change was required because glsl_parser_extras.cpp was hard-coding the type of bitset temporaries as uint64_t, which (unlike would have been the case if the uint64_t had been replaced with e.g. an __int128) would otherwise have caused a build failure, because the boolean conversion operator of bitset_t is marked explicit (if C++11 is available), so the bitset won't be silently truncated down to 1 bit in order to use it to initialize the uint64_t temporaries (yikes). Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Marek Olšák	605a7f6db5	mesa: implement ARB_compatibility Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:15 +01:00
Samuel Pitoiset	63fb30c674	nir: lower fexp2(fmul(flog2(a), 2)) to fmul(a, a) Similar for the 4 case. Suggested by Bas. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:40:45 +01:00
Samuel Pitoiset	b18997876f	nir: add is_used_once for fmul(fexp2(a), fexp2(b)) to fexp2(fadd(a, b)) Otherwise the code size increases because the original fexp2() instructions can't be deleted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:40:43 +01:00
Samuel Pitoiset	3c40be126f	spirv: apply memory qualifiers to images Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:39:53 +01:00
Kenneth Graunke	183ce5e629	glsl: Parse 'layout' as a token with advanced blending or bindless Both KHR_blend_equation_advanced and ARB_bindless_texture provide layout qualifiers, and are exposed in compatibility contexts. We need to parse the layout qualifier as a token in order for those to work, but forgot to extend this check. ARB_shader_image_load_store would need a similar treatment, but we don't expose that in legacy OpenGL contexts. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105161 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-02-21 17:50:57 -08:00
Timothy Arceri	cdeac00267	nir: remove old assert This was originally intended to make sure the remap location was not -1. However the code has changed alot since then, the location is now never set to -1 and we also handle components meaning this old assert has been doing comparisions with the pointer to the array of component data. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105183	2018-02-22 09:31:00 +11:00
Eric Anholt	4636ce362d	glsl/tests: Fix a compiler warning about signed/unsigned loop comparison. Fixes: `d32956935e` ("glsl: Walk a list of ir_dereference_array to mark array elements as accessed") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-20 20:23:57 -08:00
Eric Anholt	1b313eedb5	glsl: Silence warnings in the uniform initializer test about 16-bit types They should probably get unit tests implemented, but this cleans up a bunch of warnings in my build for now. Fixes: `59f458cd87` ("glsl: Add 16-bit types") Cc: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-20 20:23:57 -08:00
Timothy Arceri	347038baa9	glsl/nir: add pixel_center_integer to shader info Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-19 08:47:48 +11:00
Eric Engestrom	a176b053b6	glsl: fix sizeof(pointer) bug Doesn't really change anything to the test though ¯\_(ツ)_/¯ CID: 1429511 Fixes: `e8495646af` "glsl/tests: changes to test_disk_cache_create test" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-16 12:04:29 +00:00
Marek Olšák	6b1e26e181	mesa: move STATE_LENGTH to shader_enums.h and use it everywhere Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	e149a0253c	mesa,glsl,nir: reduce gl_state_index size to 2 bytes Let's use the new gl_state_index16 type everywhere and remove the typecasts. This helps reduce the size of gl_program_parameter. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	65ed98839b	mesa: reduce the size of gl_program gl_program: 1456 -> 976 bytes Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Eric Anholt	21670f8208	glsl/tests: Fix strict aliasing warning about int64/double. Fixes: `4bf9862747` ("glsl/tests: Add UINT64 and INT64 types") Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2018-02-12 20:48:43 +00:00
Alejandro Piñeiro	f32b01ca43	glsl/linker: remove ubo explicit binding handling This is already handled at link_uniform_blocks, specifically at process_block_array_leaf. Additionally, this code was not handling correctly arrays of arrays. When creating the name of the block to set the binding, it only took into account the first level, so any attempt to set a explicit binding on a array of array ubo would trigger an assertion. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-09 08:32:42 +01:00
Scott D Phillips	1f4d2433e7	meson: Add build option for tools Add a build option to control building some of the misc tools we have. Also set the executables to install, presumably you want that if you're asking for the build. v2: set 'install:' to the with_tools value, not true (Jordan) handle 'all' in a the comma list (Dylan) Add freedreno's tools (Dylan) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-08 11:24:42 -08:00
Tapani Pälli	e8495646af	glsl/tests: changes to test_disk_cache_create test Next patch will allow disk_cache instance to be created without path set for it, modify some test cases that assume disk_cache creation to fail with invalid path. Creation should succeed but simple put/get test fail. v2: leave tests as is but check that both cache struct exists and try simple put/get that should fail with invalid path set (Emil) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	83c81b6cce	glsl/tests: move utility functions in cache_test Patch moves functions higher so that we can utilize them from test_disk_cache_create which is modified by next patch. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Samuel Pitoiset	3488a3f033	radv: run nir_opt_shrink_load LLVM can't shrink loads. Polaris10: Totals from affected shaders: SGPRS: 62528 -> 59955 (-4.11 %) VGPRS: 44708 -> 44616 (-0.21 %) Spilled SGPRs: 16 -> 8 (-50.00 %) Code Size: 1355504 -> 1355172 (-0.02 %) bytes Max Waves: 11710 -> 11670 (-0.34 %) Vega10: Totals from affected shaders: SGPRS: 51448 -> 50371 (-2.09 %) VGPRS: 39140 -> 39048 (-0.24 %) Spilled SGPRs: 16 -> 16 (0.00 %) Code Size: 1307188 -> 1304296 (-0.22 %) bytes Max Waves: 11312 -> 11292 (-0.18 %) This reduces SGPRs spilling in MadMax, and it also reduces number of SGPRs in DOW3 and F12017. The number of waves slightly decreases in F1 but I don't see any performance changes after benchmarking it. Talos and Serious Sam are not affected because they don't use any push constants. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-06 23:08:44 +01:00
Samuel Pitoiset	e68562b94b	nir: add nir_opt_shrink_load pass This is a very simple pass that just shrinks load_push_constant intrinsics when some components are unused. For now, it can just shrink vec4 to vec3, vec3 to vec2 and so on. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-06 23:08:39 +01:00
Iago Toral Quiroga	ef439a4fdc	spirv: split constant initializers on in/out structs The SPIR-V parser splits in/out struct variables and creates a separate variable for each first-level member of the struct. When the struct variable has an initializer this means that we also need to split the initializer. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-06 07:50:18 +01:00
Ilia Mirkin	02a6d901ee	mesa: add OES_EGL_image_external_essl3 support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-06 07:28:11 +02:00
Juan A. Suarez Romero	4195eed961	glsl/linker: check same name is not used in block and outside According with OpenGL GLSL 3.20 spec, section 4.3.9: "It is a link-time error if any particular shader interface contains: - two different blocks, each having no instance name, and each having a member of the same name, or - a variable outside a block, and a block with no instance name, where the variable has the same name as a member in the block." This fixes a previous commit `9b894c8` ("glsl/linker: link-error using the same name in unnamed block and outside") that covered this case, but did not take in account that precision qualifiers are ignored when comparing blocks with no instance name. With this commit, the original tests KHR-GL*.shaders.uniform_block.common.name_matching keep fixed, and also dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision regression is fixed, which was broken by previous commit. v2: use helper varibles (Matteo Bruni) Fixes: `9b894c8` ("glsl/linker: link-error using the same name in unnamed block and outside") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104668 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104777 CC: Mark Janes <mark.a.janes@intel.com> CC: "18.0" <mesa-stable@lists.freedesktop.org> Tested-by: Matteo Bruni <matteo.mystral@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-02-05 18:10:43 +01:00
Mathias Fröhlich	186f03cfb0	mesa: Put materials at the end of the generic block. The materials are now moved to the end of the generic attributes block to the range 4-15. Before, the way the position and generic 0 attribute is handled was dependent on the presence and kind of the currently attached vertex program. With this change the way the position attribute and the generic 0 attribute is treated only depends on the enabled flag of those two arrays. This will later help to untangle the update dependencies between enabled arrays and shader inputs. v2: s,VERT_ATTRIB_MAT_OFFSET,VERT_ATTRIB_MAT0,g Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:06 +01:00
Mathias Fröhlich	38b41fd718	mesa: Use defines for the aliased material array attributes. Instead of just assuming that the material attributes just overlap with the generic attributes 0-12, give them symbolic defines so that we can easier move them to an other range. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:06 +01:00
Ian Romanick	ee63933a73	nir: Distribute binary operations with constants into bcsel This was specifically designed to simplify 1+mix(0, a-1, condition) to mix(1, a, condition) by pushing the 1+ inside. Skylake, Broadwell, and Haswell had similar results. Skylake shown. total instructions in shared programs: 14521753 -> 14521716 (<.01%) instructions in affected programs: 10619 -> 10582 (-0.35%) helped: 51 HURT: 14 helped stats (abs) min: 1 max: 12 x̄: 1.43 x̃: 1 helped stats (rel) min: 0.20% max: 3.58% x̄: 1.01% x̃: 0.95% HURT stats (abs) min: 1 max: 11 x̄: 2.57 x̃: 1 HURT stats (rel) min: 0.22% max: 1.75% x̄: 1.20% x̃: 1.32% 95% mean confidence interval for instructions value: -1.31 0.17 95% mean confidence interval for instructions %-change: -0.80% -0.27% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533000205 -> 533003533 (<.01%) cycles in affected programs: 110610 -> 113938 (3.01%) helped: 43 HURT: 28 helped stats (abs) min: 6 max: 440 x̄: 27.12 x̃: 16 helped stats (rel) min: 0.39% max: 4.84% x̄: 1.60% x̃: 1.67% HURT stats (abs) min: 2 max: 3066 x̄: 160.50 x̃: 14 HURT stats (rel) min: 0.08% max: 77.78% x̄: 5.16% x̃: 0.62% 95% mean confidence interval for cycles value: -43.81 137.56 95% mean confidence interval for cycles %-change: -1.47% 3.60% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 10018840 -> 10018713 (<.01%) instructions in affected programs: 9431 -> 9304 (-1.35%) helped: 51 HURT: 3 helped stats (abs) min: 1 max: 80 x̄: 2.76 x̃: 1 helped stats (rel) min: 0.20% max: 16.43% x̄: 1.16% x̃: 0.81% HURT stats (abs) min: 1 max: 12 x̄: 4.67 x̃: 1 HURT stats (rel) min: 0.22% max: 1.33% x̄: 0.59% x̃: 0.22% 95% mean confidence interval for instructions value: -5.36 0.66 95% mean confidence interval for instructions %-change: -1.66% -0.46% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 87571944 -> 87572785 (<.01%) cycles in affected programs: 117234 -> 118075 (0.72%) helped: 42 HURT: 23 helped stats (abs) min: 2 max: 114 x̄: 51.90 x̃: 30 helped stats (rel) min: 0.11% max: 11.01% x̄: 4.45% x̃: 2.74% HURT stats (abs) min: 1 max: 2341 x̄: 131.35 x̃: 10 HURT stats (rel) min: 0.06% max: 37.11% x̄: 2.75% x̃: 0.61% 95% mean confidence interval for cycles value: -61.05 86.93 95% mean confidence interval for cycles %-change: -3.47% -0.33% Inconclusive result (value mean confidence interval includes 0). Sandy Bridge total instructions in shared programs: 10542933 -> 10542844 (<.01%) instructions in affected programs: 11487 -> 11398 (-0.77%) helped: 52 HURT: 3 helped stats (abs) min: 1 max: 40 x̄: 1.96 x̃: 1 helped stats (rel) min: 0.08% max: 8.16% x̄: 0.90% x̃: 0.72% HURT stats (abs) min: 1 max: 11 x̄: 4.33 x̃: 1 HURT stats (rel) min: 0.22% max: 1.22% x̄: 0.55% x̃: 0.22% 95% mean confidence interval for instructions value: -3.17 -0.07 95% mean confidence interval for instructions %-change: -1.13% -0.52% Instructions are helped. total cycles in shared programs: 146098397 -> 146097094 (<.01%) cycles in affected programs: 128140 -> 126837 (-1.02%) helped: 47 HURT: 8 helped stats (abs) min: 2 max: 333 x̄: 29.21 x̃: 18 helped stats (rel) min: 0.13% max: 5.04% x̄: 1.18% x̃: 0.95% HURT stats (abs) min: 1 max: 16 x̄: 8.75 x̃: 9 HURT stats (rel) min: 0.08% max: 0.43% x̄: 0.30% x̃: 0.34% 95% mean confidence interval for cycles value: -37.49 -9.90 95% mean confidence interval for cycles %-change: -1.22% -0.71% Cycles are helped. Iron Lake total instructions in shared programs: 7886711 -> 7886509 (<.01%) instructions in affected programs: 10425 -> 10223 (-1.94%) helped: 50 HURT: 2 helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1 helped stats (rel) min: 0.34% max: 15.38% x̄: 1.12% x̃: 0.54% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.86% max: 0.91% x̄: 0.89% x̃: 0.89% 95% mean confidence interval for instructions value: -8.05 0.28 95% mean confidence interval for instructions %-change: -1.83% -0.26% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 178115324 -> 178114612 (<.01%) cycles in affected programs: 765726 -> 765014 (-0.09%) helped: 39 HURT: 1 helped stats (abs) min: 2 max: 276 x̄: 18.31 x̃: 8 helped stats (rel) min: <.01% max: 8.47% x̄: 0.39% x̃: 0.04% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -32.07 -3.53 95% mean confidence interval for cycles %-change: -0.86% 0.10% Inconclusive result (%-change mean confidence interval includes 0). GM45 total instructions in shared programs: 4857762 -> 4857661 (<.01%) instructions in affected programs: 5523 -> 5422 (-1.83%) helped: 25 HURT: 1 helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1 helped stats (rel) min: 0.34% max: 13.61% x̄: 1.04% x̃: 0.52% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.86% max: 0.86% x̄: 0.86% x̃: 0.86% 95% mean confidence interval for instructions value: -9.99 2.22 95% mean confidence interval for instructions %-change: -2.01% 0.08% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 122179674 -> 122179194 (<.01%) cycles in affected programs: 530162 -> 529682 (-0.09%) helped: 22 HURT: 1 helped stats (abs) min: 2 max: 292 x̄: 21.91 x̃: 7 helped stats (rel) min: <.01% max: 8.65% x̄: 0.44% x̃: 0.04% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -46.56 4.82 95% mean confidence interval for cycles %-change: -1.20% 0.36% Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:15 -08:00
Ian Romanick	03fb13f646	nir: Rearrange logic op-compounded integer compares Skylake and Broadwell had similar results. Skylake shown. total instructions in shared programs: 14521769 -> 14521753 (<.01%) instructions in affected programs: 8782 -> 8766 (-0.18%) helped: 16 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.12% max: 0.40% x̄: 0.20% x̃: 0.18% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.23% -0.16% Instructions are helped. total cycles in shared programs: 533000376 -> 533000205 (<.01%) cycles in affected programs: 447035 -> 446864 (-0.04%) helped: 9 HURT: 9 helped stats (abs) min: 2 max: 40 x̄: 35.78 x̃: 40 helped stats (rel) min: 0.02% max: 0.18% x̄: 0.10% x̃: 0.09% HURT stats (abs) min: 1 max: 52 x̄: 16.78 x̃: 10 HURT stats (rel) min: <.01% max: 1.11% x̄: 0.29% x̃: 0.12% 95% mean confidence interval for cycles value: -25.07 6.07 95% mean confidence interval for cycles %-change: -0.08% 0.27% Inconclusive result (value mean confidence interval includes 0). No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Ian Romanick	053be9f020	nir: Rearrange and-compounded float compares If both comparisons are used as sources for instructions other than the iand, this transformation is detrimental. If the non-identical value in both compares is constant, the fmin or fmax will be constant-folded away, so the transformation is always a win. It is interesting to me that on Iron Lake only 81 shaders have instruction counts changed, but 726 shaders have cycle counts changed. shader-db results: Skylake total instructions in shared programs: 14525728 -> 14521017 (-0.03%) instructions in affected programs: 1164726 -> 1160015 (-0.40%) helped: 1692 HURT: 5 helped stats (abs) min: 1 max: 637 x̄: 2.79 x̃: 2 helped stats (rel) min: 0.07% max: 16.36% x̄: 0.81% x̃: 0.33% HURT stats (abs) min: 1 max: 12 x̄: 3.20 x̃: 1 HURT stats (rel) min: 0.38% max: 2.86% x̄: 2.36% x̃: 2.86% 95% mean confidence interval for instructions value: -3.52 -2.03 95% mean confidence interval for instructions %-change: -0.86% -0.74% Instructions are helped. total cycles in shared programs: 533115449 -> 532991404 (-0.02%) cycles in affected programs: 119401803 -> 119277758 (-0.10%) helped: 1145 HURT: 467 helped stats (abs) min: 1 max: 34644 x̄: 145.92 x̃: 18 helped stats (rel) min: <.01% max: 45.33% x̄: 1.58% x̃: 0.42% HURT stats (abs) min: 1 max: 1590 x̄: 92.15 x̃: 15 HURT stats (rel) min: <.01% max: 13.48% x̄: 1.26% x̃: 0.39% 95% mean confidence interval for cycles value: -122.16 -31.74 95% mean confidence interval for cycles %-change: -0.94% -0.57% Cycles are helped. total spills in shared programs: 9597 -> 9534 (-0.66%) spills in affected programs: 403 -> 340 (-15.63%) helped: 1 HURT: 1 total fills in shared programs: 13904 -> 13790 (-0.82%) fills in affected programs: 1627 -> 1513 (-7.01%) helped: 2 HURT: 1 LOST: 0 GAINED: 2 Broadwell total instructions in shared programs: 14816966 -> 14812590 (-0.03%) instructions in affected programs: 1499885 -> 1495509 (-0.29%) helped: 1672 HURT: 15 helped stats (abs) min: 1 max: 455 x̄: 2.70 x̃: 2 helped stats (rel) min: 0.05% max: 16.36% x̄: 0.81% x̃: 0.33% HURT stats (abs) min: 1 max: 21 x̄: 9.20 x̃: 8 HURT stats (rel) min: 0.08% max: 2.86% x̄: 1.06% x̃: 0.53% 95% mean confidence interval for instructions value: -3.14 -2.05 95% mean confidence interval for instructions %-change: -0.85% -0.73% Instructions are helped. total cycles in shared programs: 559353622 -> 559345595 (<.01%) cycles in affected programs: 139893703 -> 139885676 (<.01%) helped: 921 HURT: 697 helped stats (abs) min: 1 max: 42424 x̄: 143.45 x̃: 18 helped stats (rel) min: <.01% max: 36.23% x̄: 2.02% x̃: 0.87% HURT stats (abs) min: 1 max: 2370 x̄: 178.03 x̃: 38 HURT stats (rel) min: <.01% max: 17.35% x̄: 0.71% x̃: 0.14% 95% mean confidence interval for cycles value: -59.64 49.72 95% mean confidence interval for cycles %-change: -1.02% -0.66% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 78902 -> 78861 (-0.05%) spills in affected programs: 2418 -> 2377 (-1.70%) helped: 1 HURT: 11 total fills in shared programs: 83782 -> 83678 (-0.12%) fills in affected programs: 3515 -> 3411 (-2.96%) helped: 2 HURT: 11 LOST: 0 GAINED: 5 Haswell and Ivy Bridge had similar results. Haswell shown. total instructions in shared programs: 9033898 -> 9032010 (-0.02%) instructions in affected programs: 308064 -> 306176 (-0.61%) helped: 921 HURT: 4 helped stats (abs) min: 1 max: 20 x̄: 2.05 x̃: 1 helped stats (rel) min: 0.17% max: 17.54% x̄: 0.80% x̃: 0.35% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23% 95% mean confidence interval for instructions value: -2.21 -1.87 95% mean confidence interval for instructions %-change: -0.88% -0.68% Instructions are helped. total cycles in shared programs: 84628949 -> 84620520 (<.01%) cycles in affected programs: 2164913 -> 2156484 (-0.39%) helped: 518 HURT: 359 helped stats (abs) min: 1 max: 440 x̄: 41.52 x̃: 20 helped stats (rel) min: <.01% max: 17.17% x̄: 1.95% x̃: 1.01% HURT stats (abs) min: 1 max: 586 x̄: 36.43 x̃: 8 HURT stats (rel) min: 0.04% max: 18.65% x̄: 1.47% x̃: 0.40% 95% mean confidence interval for cycles value: -15.17 -4.05 95% mean confidence interval for cycles %-change: -0.77% -0.32% Cycles are helped. LOST: 0 GAINED: 4 Sandy Bridge total instructions in shared programs: 10544860 -> 10542933 (-0.02%) instructions in affected programs: 360019 -> 358092 (-0.54%) helped: 931 HURT: 4 helped stats (abs) min: 1 max: 20 x̄: 2.07 x̃: 1 helped stats (rel) min: 0.11% max: 15.52% x̄: 0.68% x̃: 0.30% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 3.33% max: 3.33% x̄: 3.33% x̃: 3.33% 95% mean confidence interval for instructions value: -2.23 -1.89 95% mean confidence interval for instructions %-change: -0.76% -0.58% Instructions are helped. total cycles in shared programs: 146106820 -> 146098397 (<.01%) cycles in affected programs: 3435047 -> 3426624 (-0.25%) helped: 572 HURT: 329 helped stats (abs) min: 1 max: 1289 x̄: 32.52 x̃: 15 helped stats (rel) min: <.01% max: 26.29% x̄: 0.97% x̃: 0.33% HURT stats (abs) min: 1 max: 1714 x̄: 30.93 x̃: 6 HURT stats (rel) min: 0.02% max: 41.31% x̄: 1.13% x̃: 0.19% 95% mean confidence interval for cycles value: -16.85 -1.85 95% mean confidence interval for cycles %-change: -0.39% -0.01% Cycles are helped. LOST: 1 GAINED: 0 Iron Lake total instructions in shared programs: 7886925 -> 7886711 (<.01%) instructions in affected programs: 25763 -> 25549 (-0.83%) helped: 75 HURT: 6 helped stats (abs) min: 1 max: 13 x̄: 3.33 x̃: 1 helped stats (rel) min: 0.35% max: 17.57% x̄: 1.96% x̃: 0.53% HURT stats (abs) min: 1 max: 16 x̄: 6.00 x̃: 1 HURT stats (rel) min: 2.86% max: 4.79% x̄: 3.49% x̃: 2.86% 95% mean confidence interval for instructions value: -3.69 -1.60 95% mean confidence interval for instructions %-change: -2.54% -0.57% Instructions are helped. total cycles in shared programs: 178116888 -> 178115324 (<.01%) cycles in affected programs: 5858790 -> 5857226 (-0.03%) helped: 484 HURT: 242 helped stats (abs) min: 2 max: 76 x̄: 5.27 x̃: 6 helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06% HURT stats (abs) min: 2 max: 76 x̄: 4.07 x̃: 2 HURT stats (rel) min: 0.01% max: 3.99% x̄: 0.19% x̃: 0.03% 95% mean confidence interval for cycles value: -2.76 -1.55 95% mean confidence interval for cycles %-change: -0.12% 0.01% Inconclusive result (%-change mean confidence interval includes 0). GM45 total instructions in shared programs: 4857870 -> 4857762 (<.01%) instructions in affected programs: 13994 -> 13886 (-0.77%) helped: 39 HURT: 5 helped stats (abs) min: 1 max: 13 x̄: 3.28 x̃: 2 helped stats (rel) min: 0.33% max: 17.11% x̄: 1.86% x̃: 0.48% HURT stats (abs) min: 1 max: 16 x̄: 4.00 x̃: 1 HURT stats (rel) min: 2.86% max: 4.71% x̄: 3.23% x̃: 2.86% 95% mean confidence interval for instructions value: -3.86 -1.05 95% mean confidence interval for instructions %-change: -2.61% 0.04% Inconclusive result (%-change mean confidence interval includes 0). total cycles in shared programs: 122180744 -> 122179674 (<.01%) cycles in affected programs: 3686646 -> 3685576 (-0.03%) helped: 273 HURT: 141 helped stats (abs) min: 2 max: 76 x̄: 5.81 x̃: 6 helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06% HURT stats (abs) min: 2 max: 76 x̄: 3.66 x̃: 2 HURT stats (rel) min: 0.01% max: 3.99% x̄: 0.16% x̃: 0.02% 95% mean confidence interval for cycles value: -3.42 -1.75 95% mean confidence interval for cycles %-change: -0.15% 0.03% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Ian Romanick	821e7a4d32	nir: Separate a weird compare with zero to two compares with zero min(a+b, c+d) >= 0 becomes (a+b >= 0 && c+d >= 0). No shader-db changes, but it does prevent 6 to 12 instruction regressions in the next patch on all measured Intel platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Ian Romanick	68420d8322	nir: Simplify min and max of b2f v2: Rebase on almost 2 years. Require that one of the arguments to fmin or fmax be used only once. This prevents some regressions. shader-db results: Skylake and Broadwell had similar results. Skylake shown. total instructions in shared programs: 14526021 -> 14525913 (<.01%) instructions in affected programs: 4613 -> 4505 (-2.34%) helped: 31 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 3.48 x̃: 4 helped stats (rel) min: 0.62% max: 6.67% x̄: 3.31% x̃: 2.42% total cycles in shared programs: 533118710 -> 533118403 (<.01%) cycles in affected programs: 34334 -> 34027 (-0.89%) helped: 24 HURT: 0 helped stats (abs) min: 4 max: 24 x̄: 12.79 x̃: 14 helped stats (rel) min: 0.25% max: 2.40% x̄: 1.08% x̃: 1.03% No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Ian Romanick	d8d18516b0	nir: Undo possible damage caused by rearranging or-compounded float compares shader-db results: Skylake and Broadwell had similar results (Skylake shown) total instructions in shared programs: 14525898 -> 14525836 (<.01%) instructions in affected programs: 1964 -> 1902 (-3.16%) helped: 14 HURT: 0 helped stats (abs) min: 1 max: 25 x̄: 4.43 x̃: 1 helped stats (rel) min: 0.68% max: 9.77% x̄: 2.10% x̃: 0.86% 95% mean confidence interval for instructions value: -9.46 0.60 95% mean confidence interval for instructions %-change: -3.97% -0.24% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533119892 -> 533115756 (<.01%) cycles in affected programs: 96061 -> 91925 (-4.31%) helped: 13 HURT: 1 helped stats (abs) min: 60 max: 596 x̄: 318.77 x̃: 300 helped stats (rel) min: 1.15% max: 5.49% x̄: 4.27% x̃: 4.42% HURT stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8 HURT stats (rel) min: 0.46% max: 0.46% x̄: 0.46% x̃: 0.46% 95% mean confidence interval for cycles value: -379.43 -211.43 95% mean confidence interval for cycles %-change: -4.84% -3.01% Cycles are helped. Haswell, Ivy Bridge and Sandy Bridge had similar results (Haswell shown). total instructions in shared programs: 9033948 -> 9033898 (<.01%) instructions in affected programs: 535 -> 485 (-9.35%) helped: 2 HURT: 0 total cycles in shared programs: 84631402 -> 84628949 (<.01%) cycles in affected programs: 63197 -> 60744 (-3.88%) helped: 13 HURT: 2 helped stats (abs) min: 1 max: 594 x̄: 189.62 x̃: 140 helped stats (rel) min: 0.07% max: 5.04% x̄: 3.79% x̃: 4.01% HURT stats (abs) min: 4 max: 8 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.17% max: 0.45% x̄: 0.31% x̃: 0.31% 95% mean confidence interval for cycles value: -253.40 -73.67 95% mean confidence interval for cycles %-change: -4.24% -2.25% Cycles are helped. No changes on GM45 or Iron Lake. v2: Add a couple more tautological compares. Suggested by Elie. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Ian Romanick	3941cba0f7	nir: Be more conservative about rearranging or-compounded compares If both comparisons are used as sources for instructions other than the ior, this transformation is detrimental. If the non-identical value in both compares is constant, the fmin or fmax will be constant-folded away, so the transformation is always a win. shader-db results: Skylake total instructions in shared programs: 14526147 -> 14525898 (<.01%) instructions in affected programs: 70239 -> 69990 (-0.35%) helped: 102 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.44 x̃: 1 helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20% 95% mean confidence interval for instructions value: -2.86 -2.02 95% mean confidence interval for instructions %-change: -0.46% -0.31% Instructions are helped. total cycles in shared programs: 533120531 -> 533119892 (<.01%) cycles in affected programs: 994875 -> 994236 (-0.06%) helped: 76 HURT: 26 helped stats (abs) min: 1 max: 324 x̄: 27.09 x̃: 13 helped stats (rel) min: <.01% max: 4.21% x̄: 0.45% x̃: 0.18% HURT stats (abs) min: 1 max: 167 x̄: 54.62 x̃: 26 HURT stats (rel) min: <.01% max: 4.36% x̄: 1.01% x̃: 0.39% 95% mean confidence interval for cycles value: -19.44 6.91 95% mean confidence interval for cycles %-change: -0.30% 0.15% Inconclusive result (value mean confidence interval includes 0). Broadwell total instructions in shared programs: 14816005 -> 14815787 (<.01%) instructions in affected programs: 64658 -> 64440 (-0.34%) helped: 97 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.25 x̃: 1 helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20% 95% mean confidence interval for instructions value: -2.62 -1.87 95% mean confidence interval for instructions %-change: -0.45% -0.30% Instructions are helped. total cycles in shared programs: 559340386 -> 559339907 (<.01%) cycles in affected programs: 1090491 -> 1090012 (-0.04%) helped: 66 HURT: 28 helped stats (abs) min: 2 max: 198 x̄: 23.83 x̃: 16 helped stats (rel) min: 0.01% max: 4.21% x̄: 0.47% x̃: 0.27% HURT stats (abs) min: 2 max: 226 x̄: 39.07 x̃: 11 HURT stats (rel) min: <.01% max: 4.61% x̄: 0.64% x̃: 0.20% 95% mean confidence interval for cycles value: -15.94 5.75 95% mean confidence interval for cycles %-change: -0.35% 0.07% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 1 Haswell total instructions in shared programs: 9034106 -> 9033948 (<.01%) instructions in affected programs: 24096 -> 23938 (-0.66%) helped: 38 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 4.16 x̃: 4 helped stats (rel) min: 0.42% max: 2.29% x̄: 0.71% x̃: 0.64% 95% mean confidence interval for instructions value: -4.71 -3.60 95% mean confidence interval for instructions %-change: -0.84% -0.58% Instructions are helped. total cycles in shared programs: 84631628 -> 84631402 (<.01%) cycles in affected programs: 148674 -> 148448 (-0.15%) helped: 14 HURT: 14 helped stats (abs) min: 1 max: 114 x̄: 22.14 x̃: 12 helped stats (rel) min: 0.02% max: 2.98% x̄: 0.66% x̃: 0.21% HURT stats (abs) min: 1 max: 10 x̄: 6.00 x̃: 5 HURT stats (rel) min: 0.01% max: 0.20% x̄: 0.12% x̃: 0.11% 95% mean confidence interval for cycles value: -19.42 3.28 95% mean confidence interval for cycles %-change: -0.59% 0.05% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 10015456 -> 10015293 (<.01%) instructions in affected programs: 27701 -> 27538 (-0.59%) helped: 38 HURT: 0 helped stats (abs) min: 1 max: 9 x̄: 4.29 x̃: 4 helped stats (rel) min: 0.33% max: 2.79% x̄: 0.66% x̃: 0.52% 95% mean confidence interval for instructions value: -4.87 -3.71 95% mean confidence interval for instructions %-change: -0.82% -0.51% Instructions are helped. total cycles in shared programs: 87524771 -> 87524569 (<.01%) cycles in affected programs: 112324 -> 112122 (-0.18%) helped: 6 HURT: 12 helped stats (abs) min: 2 max: 111 x̄: 44.67 x̃: 20 helped stats (rel) min: 0.02% max: 2.94% x̄: 1.45% x̃: 1.26% HURT stats (abs) min: 1 max: 16 x̄: 5.50 x̃: 5 HURT stats (rel) min: <.01% max: 0.16% x̄: 0.08% x̃: 0.08% 95% mean confidence interval for cycles value: -29.14 6.69 95% mean confidence interval for cycles %-change: -0.93% 0.08% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 2 Sandy Bridge total instructions in shared programs: 10545655 -> 10545465 (<.01%) instructions in affected programs: 37198 -> 37008 (-0.51%) helped: 42 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 4.52 x̃: 4 helped stats (rel) min: 0.31% max: 2.15% x̄: 0.58% x̃: 0.49% 95% mean confidence interval for instructions value: -5.14 -3.91 95% mean confidence interval for instructions %-change: -0.68% -0.47% Instructions are helped. total cycles in shared programs: 146113059 -> 146112427 (<.01%) cycles in affected programs: 423514 -> 422882 (-0.15%) helped: 32 HURT: 10 helped stats (abs) min: 4 max: 162 x̄: 24.34 x̃: 12 helped stats (rel) min: 0.06% max: 2.74% x̄: 0.37% x̃: 0.11% HURT stats (abs) min: 12 max: 19 x̄: 14.70 x̃: 14 HURT stats (rel) min: 0.10% max: 0.18% x̄: 0.16% x̃: 0.14% 95% mean confidence interval for cycles value: -26.03 -4.07 95% mean confidence interval for cycles %-change: -0.43% -0.05% Cycles are helped. Iron Lake total instructions in shared programs: 7886959 -> 7886925 (<.01%) instructions in affected programs: 1340 -> 1306 (-2.54%) helped: 4 HURT: 0 helped stats (abs) min: 2 max: 15 x̄: 8.50 x̃: 8 helped stats (rel) min: 0.63% max: 4.30% x̄: 2.45% x̃: 2.43% 95% mean confidence interval for instructions value: -20.44 3.44 95% mean confidence interval for instructions %-change: -5.78% 0.89% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 178116996 -> 178116888 (<.01%) cycles in affected programs: 6262 -> 6154 (-1.72%) helped: 2 HURT: 2 helped stats (abs) min: 44 max: 78 x̄: 61.00 x̃: 61 helped stats (rel) min: 3.31% max: 3.94% x̄: 3.62% x̃: 3.62% HURT stats (abs) min: 6 max: 8 x̄: 7.00 x̃: 7 HURT stats (rel) min: 0.34% max: 0.68% x̄: 0.51% x̃: 0.51% 95% mean confidence interval for cycles value: -93.27 39.27 95% mean confidence interval for cycles %-change: -5.38% 2.27% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857887 -> 4857870 (<.01%) instructions in affected programs: 674 -> 657 (-2.52%) helped: 2 HURT: 0 total cycles in shared programs: 122180816 -> 122180744 (<.01%) cycles in affected programs: 3764 -> 3692 (-1.91%) helped: 1 HURT: 1 helped stats (abs) min: 78 max: 78 x̄: 78.00 x̃: 78 helped stats (rel) min: 3.94% max: 3.94% x̄: 3.94% x̃: 3.94% HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.34% max: 0.34% x̄: 0.34% x̃: 0.34% Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Ian Romanick	cfc0d34802	nir: See through an fneg to apply existing optimizations Doing the same for the existing feq and fne transformations didn't help anything in shader-db. shader-db results: Broadwell and Skylake (Skylake shown) total instructions in shared programs: 14529463 -> 14526147 (-0.02%) instructions in affected programs: 402420 -> 399104 (-0.82%) helped: 2136 HURT: 131 helped stats (abs) min: 1 max: 10 x̄: 1.61 x̃: 1 helped stats (rel) min: 0.03% max: 16.22% x̄: 3.14% x̃: 1.12% HURT stats (abs) min: 1 max: 2 x̄: 1.01 x̃: 1 HURT stats (rel) min: 0.13% max: 7.69% x̄: 0.75% x̃: 0.57% 95% mean confidence interval for instructions value: -1.51 -1.41 95% mean confidence interval for instructions %-change: -3.06% -2.78% Instructions are helped. total cycles in shared programs: 533146915 -> 533120531 (<.01%) cycles in affected programs: 10356261 -> 10329877 (-0.25%) helped: 1933 HURT: 844 helped stats (abs) min: 1 max: 490 x̄: 29.44 x̃: 16 helped stats (rel) min: <.01% max: 28.57% x̄: 3.43% x̃: 1.88% HURT stats (abs) min: 1 max: 423 x̄: 36.17 x̃: 12 HURT stats (rel) min: <.01% max: 23.75% x̄: 1.90% x̃: 0.59% 95% mean confidence interval for cycles value: -11.78 -7.22 95% mean confidence interval for cycles %-change: -1.98% -1.65% Cycles are helped. Haswell total instructions in shared programs: 9037416 -> 9034106 (-0.04%) instructions in affected programs: 389831 -> 386521 (-0.85%) helped: 2184 HURT: 120 helped stats (abs) min: 1 max: 11 x̄: 1.57 x̃: 1 helped stats (rel) min: 0.03% max: 25.00% x̄: 2.73% x̃: 1.02% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.19% max: 7.69% x̄: 0.81% x̃: 0.57% 95% mean confidence interval for instructions value: -1.49 -1.39 95% mean confidence interval for instructions %-change: -2.68% -2.41% Instructions are helped. total cycles in shared programs: 84636243 -> 84631628 (<.01%) cycles in affected programs: 4745058 -> 4740443 (-0.10%) helped: 1904 HURT: 960 helped stats (abs) min: 1 max: 466 x̄: 30.21 x̃: 18 helped stats (rel) min: 0.02% max: 36.36% x̄: 3.57% x̃: 2.38% HURT stats (abs) min: 1 max: 1080 x̄: 55.11 x̃: 14 HURT stats (rel) min: 0.02% max: 51.33% x̄: 2.77% x̃: 0.81% 95% mean confidence interval for cycles value: -4.51 1.29 95% mean confidence interval for cycles %-change: -1.64% -1.25% Inconclusive result (value mean confidence interval includes 0). LOST: 1 GAINED: 0 Sandy Bridge and Ivy Bridge (Ivy Bridge shown) total instructions in shared programs: 10018873 -> 10015456 (-0.03%) instructions in affected programs: 512820 -> 509403 (-0.67%) helped: 2268 HURT: 162 helped stats (abs) min: 1 max: 11 x̄: 1.62 x̃: 1 helped stats (rel) min: 0.03% max: 25.00% x̄: 2.47% x̃: 0.88% HURT stats (abs) min: 1 max: 4 x̄: 1.59 x̃: 1 HURT stats (rel) min: 0.09% max: 7.69% x̄: 0.86% x̃: 0.50% 95% mean confidence interval for instructions value: -1.46 -1.35 95% mean confidence interval for instructions %-change: -2.38% -2.12% Instructions are helped. total cycles in shared programs: 87538223 -> 87524771 (-0.02%) cycles in affected programs: 5435520 -> 5422068 (-0.25%) helped: 1916 HURT: 946 helped stats (abs) min: 1 max: 1392 x̄: 29.44 x̃: 18 helped stats (rel) min: <.01% max: 34.51% x̄: 3.34% x̃: 1.97% HURT stats (abs) min: 1 max: 633 x̄: 45.41 x̃: 11 HURT stats (rel) min: 0.02% max: 25.95% x̄: 2.41% x̃: 0.62% 95% mean confidence interval for cycles value: -7.34 -2.06 95% mean confidence interval for cycles %-change: -1.62% -1.26% Cycles are helped. LOST: 1 GAINED: 0 Iron Lake total instructions in shared programs: 7888446 -> 7886959 (-0.02%) instructions in affected programs: 331581 -> 330094 (-0.45%) helped: 1160 HURT: 97 helped stats (abs) min: 1 max: 10 x̄: 1.37 x̃: 1 helped stats (rel) min: 0.02% max: 9.68% x̄: 0.93% x̃: 0.43% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.17% max: 4.17% x̄: 0.37% x̃: 0.25% 95% mean confidence interval for instructions value: -1.25 -1.12 95% mean confidence interval for instructions %-change: -0.91% -0.75% Instructions are helped. total cycles in shared programs: 178130766 -> 178116996 (<.01%) cycles in affected programs: 12534564 -> 12520794 (-0.11%) helped: 1856 HURT: 187 helped stats (abs) min: 2 max: 202 x̄: 7.78 x̃: 4 helped stats (rel) min: <.01% max: 6.47% x̄: 0.28% x̃: 0.11% HURT stats (abs) min: 2 max: 26 x̄: 3.55 x̃: 2 HURT stats (rel) min: 0.01% max: 2.14% x̄: 0.08% x̃: 0.02% 95% mean confidence interval for cycles value: -7.41 -6.07 95% mean confidence interval for cycles %-change: -0.28% -0.22% Cycles are helped. GM45 total instructions in shared programs: 4858912 -> 4857887 (-0.02%) instructions in affected programs: 237565 -> 236540 (-0.43%) helped: 867 HURT: 57 helped stats (abs) min: 1 max: 10 x̄: 1.25 x̃: 1 helped stats (rel) min: 0.02% max: 9.38% x̄: 0.87% x̃: 0.43% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.16% max: 3.85% x̄: 0.34% x̃: 0.22% 95% mean confidence interval for instructions value: -1.18 -1.04 95% mean confidence interval for instructions %-change: -0.88% -0.71% Instructions are helped. total cycles in shared programs: 122189118 -> 122180816 (<.01%) cycles in affected programs: 8776418 -> 8768116 (-0.09%) helped: 1213 HURT: 166 helped stats (abs) min: 2 max: 202 x̄: 7.30 x̃: 4 helped stats (rel) min: <.01% max: 6.43% x̄: 0.25% x̃: 0.11% HURT stats (abs) min: 2 max: 26 x̄: 3.35 x̃: 2 HURT stats (rel) min: 0.01% max: 2.14% x̄: 0.06% x̃: 0.02% 95% mean confidence interval for cycles value: -6.78 -5.26 95% mean confidence interval for cycles %-change: -0.24% -0.18% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Timothy Arceri	9a2e085680	nir: add lower_all_io_to_temps flag This will be used for freedreno and vc4 which require all inputs and outputs to be copied to temps. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:08 +11:00
Timothy Arceri	3218756262	nir/st_glsl_to_nir: add param to disable splitting of inputs We need this because we will always copy fs outputs to temps and split the arrays, but do not want to do either of these with fs inputs as it is unnessisary and makes handling interpolateAt builtins difficult. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:08 +11:00
Timothy Arceri	09cd484d61	nir: partially revert `c2acf97fcc` `c2acf97fcc` changed the use of double_inputs_read to be inconsitent with its previous meaning. Here we re-enable the gather info code that was removed as the modified code from `c2acf97fcc` now uses the double_inputs member rather than double_inputs_read. This change allows us to use double_inputs_read with gallium drivers without impacting double_inputs which is used by i965. We also make use of the compiler option vs_inputs_dual_locations to allow for the difference in behaviour between drivers that handle vs inputs as taking up two locations for doubles, versus those that treat them as taking a single location. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	5b8de4bdff	nir: add vs_inputs_dual_locations compiler option Allows nir drivers to either use a single or dual locations for vs double inputs. i965 uses dual locations for both OpenGL and Vulkan drivers, for now gallium OpenGL drivers only use a single location. The following patch will also make use of this option when calling nir_shader_gather_info(). Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	f63e05ae9e	compiler: tidy up double_inputs_read uses First we move double_inputs_read into a vs struct in the union, double_inputs_read is only used for vs inputs so this will save space and also allows us to add a new double_inputs field. We add the new field because `c2acf97fcc` changed the behaviour of double_inputs_read, and while it's no longer used to track actual reads in i965 we do still want to track this for gallium drivers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 09:08:47 +11:00
Tapani Pälli	d0343bef66	nir: mark unused space in packed_tex_data This change cleans following scary warnings in valgrind output when disk cache is being written: ==6532== Uninitialised byte(s) found during client check request ==6532== at 0x14423FAD: blob_write_bytes (blob.c:152) ==6532== by 0x144240FB: blob_write_uint32 (blob.c:194) ==6532== by 0x144001A5: write_tex (nir_serialize.c:613) and later (loads of): ==6532== Use of uninitialised value of size 8 ==6532== at 0x62FCD9E: crc32_z (in /usr/lib64/libz.so.1.2.11) ==6532== by 0x13F65014: util_hash_crc32 (crc32.c:127) ==6532== by 0x13F5DABA: cache_put (disk_cache.c:947) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-29 08:11:22 +02:00
Brian Paul	bacf72a18d	mesa: change gl_link_status enums to uppercase follow the convention of other enums. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-01-26 13:52:48 -07:00
Brian Paul	aff5d9c256	mesa: change gl_compile_status enums to uppercase To follow the convention of other enums. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-01-26 13:52:48 -07:00
Emil Velikov	ac4437b20b	automake: small cleanup after the meson.build inclusion Namely extend the EXTRA_DIST list, instead of re-assigning it and bring back a file dropped by mistake. Fixes: `436ed65d38` ("autotools: include meson build files in tarball") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-01-25 17:06:29 +00:00
Timothy Arceri	0cbc62a4dd	glsl: add image and sampler (un)packing support to glsl to nir This is needed for ARB_bindless_texture support. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 09:44:37 +11:00
Timothy Arceri	549ccbb435	nir: add image and sampler type to glsl_get_bit_size() These are needed for ARB_bindless_texture support. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 09:44:37 +11:00
Dylan Baker	436ed65d38	autotools: include meson build files in tarball This adds the meson.build, meson_options.txt, and a few scripts that are used exclusively by the meson build. v2: - Remove accidentally included changes needed to test make dist with LLVM > 3.9 Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-19 16:30:51 -08:00
Brian Paul	f631a6ae8c	glsl: remove unneeded extern "C" {} bracketing around Mesa includes The two headers already have the right extern "C" annotations. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 11:17:56 -07:00
Brian Paul	741d423478	glsl: include util/bitscan.h in serialize.cpp Instead of relying on indirect inclusion of the header. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 11:17:56 -07:00
Juan A. Suarez Romero	9b894c88a6	glsl/linker: link-error using the same name in unnamed block and outside According with OpenGL GLSL 4.20 spec, section 4.3.9, page 57: "It is a link-time error if any particular shader interface contains: - two different blocks, each having no instance name, and each having a member of the same name, or - a variable outside a block, and a block with no instance name, where the variable has the same name as a member in the block." This means that it is a link error if for example we have a vertex shader with the following definition. "layout(location=0) uniform Data { float a; float b; };" and a fragment shader with: "uniform float a;" As in both cases we refer to both uniforms as "a", and thus using glGetUniformLocation() wouldn't know which one we mean. This fixes KHR-GL*.shaders.uniform_block.common.name_matching. v2: add fixed tests (Tapani) Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-16 19:42:35 +01:00
Samuel Pitoiset	d5e369ff8a	nir: add a 'const' qualifier to nir_ssa_def_components_read() To avoid compilation warnings and because this helper shouldn't update anything. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-12 12:25:17 +01:00
Dylan Baker	2083a14179	meson: Use dependencies for nir This creates two new internal dependencies, idep_nir_headers and idep_nir. The former encapsulates the generation of nir_opcodes.h and nir_builder_opcodes.h and adding src/compiler/nir as an include path. This ensures that any target that needs nir headers will have the includes and that the generated headers will be generated before the target is build. The second, idep_nir, includes the first and additionally links to libnir. This is intended to make it easier to avoid race conditions in the build when using nir, since the number of consumers for libnir and it's headers are quite high. Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Dylan Baker	4ccb981673	meson: Use consistent style for tests Don't use intermediate variables, use consistent whitespace. Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Dylan Baker	fbf192a67e	meson: Use consistent style Currently the meosn build has a mix of two styles: arg : [foo, ... bar], and arg : [ foo, ..., bar, ] For consistency let's pick one. I've picked the later style, which I think is more readable, and is more common in the mesa code base. v2: - fix commit message Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Tapani Pälli	f2c0e47d9c	glsl: cleanup shader_cache header guard Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-01-11 08:04:56 +02:00
Ian Romanick	336afe7d7a	glsl/linker: Safely generate mask of possible locations If MaxAttribs were ever raised to 32, undefined behavior would occur. We had already gone to the effort (albeit incorrectly) handle this in one case, so fix them all. CID: 1369628 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-10 07:21:12 -08:00
Ian Romanick	0c9df36157	glsl/linker: Mark no locations as invalid instead of marking all locations If max_index were ever 32, the linker would have marked all 32 locations as invalid instead of marking none of them as invalid. It's a good thing the maximum value actually set by any driver for MaxAttribs is 16. Found by inspection while investigating CID 1369628. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-10 07:21:11 -08:00
Ian Romanick	702dc43f7e	glsl: Don't handle visit_stop in several ::accept methods All cases where the result could be non-visit_continue would have already returned. CID: 401351, 1224465, 1224466 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-10 07:21:11 -08:00
Ian Romanick	a170f27958	glsl: Remove unnecessary assignments to type None of these are necessary because result->type is the only thing used outside the giant switch-statement. CID: 1230983, 1230984 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-10 07:21:11 -08:00
Ian Romanick	fd2f4f507f	nir: Silence unused parameter warnings In file included from src/compiler/nir/nir_opt_algebraic.c:4:0: src/compiler/nir/nir_search_helpers.h: In function ‘is_not_const’: src/compiler/nir/nir_search_helpers.h:118:59: warning: unused parameter ‘num_components’ [-Wunused-parameter] is_not_const(nir_alu_instr instr, unsigned src, unsigned num_components, ^~~~~~~~~~~~~~ src/compiler/nir/nir_search_helpers.h:119:29: warning: unused parameter ‘swizzle ’ [-Wunused-parameter] const uint8_t swizzle) ^~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-10 07:21:11 -08:00
Iago Toral Quiroga	7e5c81235f	glsl: remove Lower{TCS,TES}PatchVerticesIn Intel was the only user and now NIR can do the lowering. v2: do not try to handle it as a system value directly for the SPIR-V path. In GL we rather handle it as a uniform like we do for the GLSL path (Jason). v3: drop LowerTESPatchVerticesIn as well (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-10 08:21:02 +01:00
Scott D Phillips	161a97c3d5	.gitignore: Ignore new generated files New generated files from: `bb1e6ff161` ("spirv: Add a prepass to set types on vtn_values") `65fc16c974` ("autotools: set XA versions in configure.ac and configure header file") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-01-08 21:11:11 -08:00
Jason Ekstrand	9e5aaa93cb	spirv: Do implicit conversions of uint to bool in OpStore Technically, the GLSLang bug related to this can also affect SSBO writes where the bool -> uint conversion is missing. However, the only known shipping application with an old enough version of GLSLang to cause issues with this is the new DOOM game so we keep the workaround as small as possible. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	154668e79c	spirv: Loosen the validation for load/store type matching Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104338 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424 Tested-by: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	986303cb92	spirv: Require a storage type for OpStore destinations This rules out things such as trying to store a pointer to a local variable. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	70f588778c	spirv: Add a vtn_types_compatible helper Tested-by: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	8bad7f33c6	spirv: Store the id of the type in vtn_type Previously, we were storing a pointer to the vtn_value because we use it to look up decorations when we create input/output variables. This works, but it also may be useful to have the id itself so we may as well store that instead. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	53265c8798	spirv: Add a mechanism for dumping failing shaders Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	819adfdfb4	spirv: Rework asserts in var_decoration_cb Now that higher levels are enforcing decoration sanity, we don't need the vtn_asserts here. This function should be safe but we still want a few well-placed regular asserts in case something goes awry. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	71ea4dded5	spirv: Rework error checking for decorations This reworks the error checking on our generic handling of decorations. The objective is to validate all of the SPIR-V assumptions we make up-front and convert redundant checks to compiled-out asserts. The most important part of this is to ensure that member decorations only occur on OpTypeStruct and that the member is never out-of-bounds. This way later code can assume that the member is sane and not have to worry about OOB array access due to a misplaced OpMemberDecorate. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	d6a4099303	spirv: Add better type validation to OpTypeImage Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	03c543d041	spirv: Switch on vtn_base_type in OpComposite(Extract\|Insert) This is a bit simpler since we have fewer enum values in the case. It's also a bit more efficient because we're making fewer glsl_get_* calls. While we're at it, add better type validation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	936f49268e	spirv: Refactor Op[Spec]ConstantComposite and add better validation Now that vtn_base_type is a real and full base type, we can switch on that instead of the GLSL base type which is a lot fewer cases in our switch. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	dabce5061d	spirv: Add better validation to Op[Spec]Constant Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	6cf965751a	spirv: Remove a pointless assignment in SpvOpSpecConstant We re-assign later inside the bit_size switch Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	f13a5cff72	spirv: Unify boolean constants and add better validation Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	0bb18858fb	spirv/info: Add spirv_op_to_string Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	ab85fd02d5	spirv: Make 'info' a local array spirv_info_c.py Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	296046556a	spirv: Add better error messages in vtn_value helpers Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Caio Marcelo de Oliveira Filho	22980f941e	spirv: Import 1.2 rev 3 headers and grammar from Khronos Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-08 13:22:17 -08:00
Florian Will	7e025def6d	glsl: Respect std430 layout in lower_buffer_access Respect the std430 rules for determining offset and size of struct members when using a std430 buffer. std140 rules lead to wrong buffer offsets in that case. Fixes my test case attached in Bugzilla. No piglit changes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104492 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-01-08 13:42:09 +11:00
Alejandro Piñeiro	2719467eb6	glsl/standalone: set MaxTransformFeedbackBuffers Using 4, as it is the default value on mesa. See mesa/main/config.h and the following commit that introduced the value: `15ac66e331` Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-05 08:52:22 +01:00
Alejandro Piñeiro	0ba3de2ad7	glsl/standalone: set MaxVertexStreams ARB_transform_feedback3 sets a minimum of 1, ARB_gpu_shader5 a minimum of 4. It shouldn't matter too much, so choosing the later. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-05 08:52:22 +01:00
Alejandro Piñeiro	8dcf131f04	glsl/standalone: set MaxUniformBufferBindings Used to handle how many ubo you can define on the context. Minimimum defined as 36 on ARB_uniform_buffer_object spec, up to 84 on OpenGL 4.6 (12 per stage at each moment). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-05 08:52:22 +01:00
Alejandro Piñeiro	937b210551	glsl/standalone: point which arguments are mandatory Every now and then I execute the standalone compiler, get the non-version error, and need to remember what I'm doing wrong Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-05 08:52:22 +01:00
Eric Anholt	deb552ca27	nir: Add a helper to get the uvec4 type. I needed this in the vc5 compiler. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-03 14:25:23 -08:00
Rob Clark	ea0bbe8201	nir: add missing local_group_size intrinsic For GL_ARB_compute_variable_group_size Reported-by: Karol Herbst <karolherbst@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-30 12:39:07 -05:00
Eero Tamminen	5c5f2eaa08	spirv: consider bitsize when handling OpSwitch cases This reverts commit `7665383a33` and is squashed together with https://patchwork.freedesktop.org/patch/194610/ (spirv: avoid infinite loop / freeze in vtn_cfg_walk_blocks()) which fixes https://bugs.freedesktop.org/show_bug.cgi?id=104359 properly. Fixes: `9702fac68e` (spirv: consider bitsize when handling OpSwitch cases) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104359 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-28 10:38:58 -08:00
Mark Janes	7665383a33	Revert "spirv: consider bitsize when handling OpSwitch cases" This reverts commit `9702fac68e`, which hangs vulkancts and crucible on all platforms. The patch is being reverted because it disables continuous integration testing. The patch from bug 104359 does not apply to master. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104359	2017-12-21 12:15:40 -08:00
Brian Paul	6e5b882339	glsl: disable vec3 packing/splitting in tfb separate mode This fixes a varying packing issue when using transform feedback in GL_SEPARATE_ATTRIBS mode. By time we get to linking, we already know that the number of feedback attributes is under the GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS limit so packing isn't as critical. In fact, packing/splitting vec3 attributes can cause trouble because splitting effectively creates another TFB output which can exceed device limits. So, disable vec3 packing when it's not needed to avoid that issue. Fixes the Piglit ext_transform_feedback-separate test on VMware driver. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:17 -07:00
Brian Paul	462df64495	glsl: simply packing class comparison Handle comparing the packing class using the same method as we do for var->data.is_xfb_only Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:17 -07:00
Brian Paul	06588a065f	glsl: document varying_matches::assign_locations() params and return value And change *components to components[] as a reminder that it's an array. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:16 -07:00
Brian Paul	544f41ff19	glsl: remove some continue statements In some cases, I think loop code is easier to read without continue statements. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:16 -07:00
Brian Paul	76fc24ba8d	glsl: use bitwise operators in varying_matches::compute_packing_class() The mix of bitwise operators with * and + to compute the packing_class values was a little weird. Just use bitwise ops instead. v2: add assertion to make sure interpolation bits fit without collision, per Timothy. Basically, rewrite function to be simpler. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:16 -07:00
Brian Paul	cd7705de44	glsl: simplify loop in varying_matches::assign_locations() The use of break/continue was kind of weird/confusing. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:16 -07:00
Brian Paul	47b4183c92	glsl: minor simplification in assign_varying_locations() Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:16 -07:00
Brian Paul	a0430bb62c	glsl: make varying_matches::is_varying_packing_safe() const Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:16 -07:00
Brian Paul	d86c9836d5	glsl: trivial comment fixes in lower_packed_varyings.cpp Reviewed by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:16 -07:00
Juan A. Suarez Romero	24aac5f81e	spirv: Makefile.nir.am: include vtn_gather_types_c.py script in tarball dist Fixes: `bb1e6ff161` ("spirv: Add a prepass to set types on vtn_values") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-12-20 17:44:35 +01:00
Juan A. Suarez Romero	9702fac68e	spirv: consider bitsize when handling OpSwitch cases When walking over all the cases in a OpSwitch, take in account the bitsize of the literals to avoid getting wrong cases. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-20 10:39:15 +01:00
Dave Airlie	0e8e7ccf9d	nir/linking: always set the used_across_stages/outputs_read bits If we don't remap and output this code would trample the outputs read bits. This fixes a regression in dEQP-VK.tessellation.shader_input_output.barrier Fixes: `1c9c42d16b` (nir: add varying component packing helpers) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-19 06:44:11 +10:00
Jason Ekstrand	3be382cd7c	spirv: Relax the validation conditions of OpSelect The Talos Principle contains shaders with an OpSelect between two vectors where the condition is a scalar boolean. This is technically against the spec bout nir_builder gracefully handles it by splatting out the condition to all the channels. So long as the condition is a boolean, just emit a warning instead of failing. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104246	2017-12-18 09:48:58 -08:00
Eric Anholt	0bead224fe	nir: Add a new lowering option to lower all txd to txl. VC5 requires that all txd are lowered in the shader. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-14 14:36:17 -08:00
Eric Anholt	b08b628994	nir: Fix interaction of GL_CLAMP lowering with texture offsets. We want the clamping of the coordinate to apply after the offset, so we need to do math to lower the offset out of the instruction. Fixes texwrap offset cases for GL_CLAMP with GL_NEAREST on vc5. Note: I moved the get_texture_size() verbatim, so that it was defined before use. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-14 14:36:17 -08:00
Rob Herring	546633dce2	Android: fix missing generation of vtn_gather_types.c Commit `bb1e6ff161` ("spirv: Add a prepass to set types on vtn_values") added generation of vtn_gather_types.c, but forgot to add it to the Android build files. Fixes: `bb1e6ff161` ("spirv: Add a prepass to set types on vtn_values") Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Rob Herring <robh@kernel.org>	2017-12-13 16:20:15 -06:00
Brian Paul	0f2bd31baf	glsl: trivial whitespace fixes in link_varyings.cpp	2017-12-13 08:38:07 -07:00
Timothy Arceri	cab5513b47	nir: fix shift for uint64_t Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-13 13:20:28 +11:00
Jason Ekstrand	9718ce44c2	spirv: Handle image and sampler function parameters	2017-12-12 07:34:46 -08:00
Jason Ekstrand	7d3ebd1286	spirv/cfg: Refactor the function parameter loop a bit	2017-12-12 07:34:46 -08:00
Jason Ekstrand	e6ba457c99	spirv/cfg: Be a bit more precise about function parameters Pointers with no storage type are converted to inout variables but SSA values and pointers with a storage type (which turns into a uint or uvec2) are just input variables.	2017-12-12 07:34:46 -08:00
Jason Ekstrand	aaeda8d7d4	spirv: Make sampled images a real type Previously, we just gave them exactly the same type as the respective image (which already had a sampler2D or similar type). Now they have their own base type and a pointer to the vtn_type for the image. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-12-12 07:34:46 -08:00
Jason Ekstrand	df657ebb68	spirv: Add support for all bit sizes in OpSwitch Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101560	2017-12-11 22:28:34 -08:00
Jason Ekstrand	58cabae8cc	spirv: Restructure the case loop in OpSwitch handling Instead of calling vtn_add_case for the default case and then looping, add an is_default variable and do everything inside the loop. This will make the next commit easier. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-11 22:28:34 -08:00
Jason Ekstrand	5f572ccc95	spirv: Add better parameter validation for vector and matrix types Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-11 22:28:34 -08:00
Jason Ekstrand	a7c2be9944	spirv: Add type validation for OpSelect Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-11 22:28:34 -08:00
Jason Ekstrand	6737b1b859	spirv: Add basic type validation for OpLoad, OpStore, and OpCopyMemory Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-12-11 22:28:34 -08:00
Jason Ekstrand	bb1e6ff161	spirv: Add a prepass to set types on vtn_values This autogenerated pass will automatically find and set the type field on all vtn_values. This way we always have the type and can use it for validation and other checks. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-11 22:28:34 -08:00
Jason Ekstrand	2c84b49ddf	spirv: Add a vtn_type field to all vtn_values At the moment, this just lets us drop the const_type for constants and unify things a bit. Eventually, we will use this to store the types of all SPIR-V SSA values. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-11 22:28:34 -08:00
Jason Ekstrand	24f019fd69	spirv: Allow ignoring decorations for workgroup variables Since we switched over to lowering SLM access directly in SPIR-V -> NIR, we no longer have vtn_variables for SLM. It's all safe as with UBOs and SSBOs but we need to let it through in the assert. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104213 Fixes: `8761a04d0d` Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-12-11 19:02:47 -08:00
Jason Ekstrand	2bc9123c33	spirv: Set lengths on scalar and vector types Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-11 19:02:47 -08:00
Bas Nieuwenhuizen	b926da241a	spirv: Fix loading an entire block at once. There is no chain, so checking the length ends with a SEGFAULT. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103579 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-10 01:43:26 +01:00
Jordan Justen	7cf1037d5a	main, glsl: Add UniformDataDefaults which stores uniform defaults The ARB_get_program_binary extension requires that uniform values in a program be restored to their initial value just after linking. This patch saves off the initial values just after linking. When the program is restored by glProgramBinary, we can use this to copy the initial value of uniforms into UniformDataSlots. V2 (Timothy Arceri): - Store UniformDataDefaults only when serializing GLSL as this is what we want for both disk cache and ARB_get_program_binary. This saves us having to come back later and reset the Uniforms on program binary restores. Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1) Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-12-08 16:44:35 +11:00
Jordan Justen	ebd9e789c4	glsl: Split out shader program serialization This will allow us to use the program serialization to implement ARB_get_program_binary. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-08 16:44:35 +11:00
Alejandro Piñeiro	25e56b2eba	mesa/spirv: move and rename nir_spirv_supported_capabilities To avoid any vulkan driver to include the GL mtypes.h. Renamed as eventually this could be used by drivers not using nir. v2: remove compiler/spirv/spirv.h from mtypes (Alejandro) v3: added the definition at compiler/shader_info.h (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-07 17:15:11 +01:00
Samuel Iglesias Gonsálvez	392638d6b5	spirv: fix bug when OpSpecConstantOp calls a conversion In that case, nir_eval_const_opcode() will evaluate the conversion but as it was using destination's bit_size, the resulting value was just a cast of the source constant value. By passing the source's bit size, it does the conversion properly. Fixes: dEQP-VK.spirv_assembly.instruction..opspecconstantop.convert* v2: - Remove invalid conversion op cases. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-07 10:19:34 +01:00
Samuel Iglesias Gonsálvez	67ec314347	spirv: allow specialization constants with bitsize different than 32 bits Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-07 10:19:34 +01:00
James Legg	947470d10b	nir/opcodes: Fix constant-folding of bitfield_insert Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104119 CC: <mesa-stable@lists.freedesktop.org> CC: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-12-07 08:59:36 +00:00
Timothy Arceri	9d53ccccb2	glsl: get correct member type when processing xfb ifc arrays This fixes a crash in: KHR-GL45.enhanced_layouts.xfb_block_stride Fixes: `0822517936` "glsl: add helper to process xfb qualifiers during linking" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-12-07 15:22:23 +11:00
Alejandro Piñeiro	0398b31d1d	mesa: define nir_spirv_supported_capabilities Until now it was part of spirv_to_nir_options. But it will be used on the implementation of ARB_gl_spirv and ARB_spirv_extensions, and added to the OpenGL context, as a way to save what SPIR-V capabilities the current OpenGL implementation supports. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-06 22:25:52 +01:00
Eduardo Lima Mitev	4049c04122	spirv/nir: Add support for SPV_KHR_16bit_storage v2: Minor changes after rebase against recent master (Alejandro Pinheiro) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	e0667a8bd8	spirv: Enable FPRoundingMode decorator to nir operations SpvOpFConvert now manages the FPRoundingMode decorator for the returning values enabling the nir_rounding_mode in the conversion operation to fp16 values. v2: Fixed breaking of specialization constants. (Jason Ekstrand) v3: Avoid nir_rounding_mode * casting. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev	549894a681	spirv/nir: Handle 16-bit types v2: Added more missing implementations of 16-bit types. (Jason Ekstrand) v3: Store values in values[0].u16[i] (Jason Ekstrand) Include switches based on bitsize for 16-bit types (Chema Casanova) v4: Coding style fixes (Jason Ekstrand) Use vtn_u64_literal and u64[0] at 64-bit SpvOpConstant (Jason Ekstrand) Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Eduardo Lima <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	1f440d00d2	nir: Handle fp16 rounding modes at nir_type_conversion_op nir_type_conversion enables new operations to handle rounding modes to convert to fp16 values. Two new opcodes are enabled nir_op_f2f16_rtne and nir_op_f2f16_rtz. The undefined behaviour doesn't has any effect and uses the original nir_op_f2f16 operation. v2: Indentation fixed (Jason Ekstrand) v3: Use explicit case for undefined rounding and assert if rounding mode is used for non 16-bit float conversions (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev	2af63683bc	nir: Populate conversion opcodes to 16-bit types This will include the following NIR ALU opcodes: * nir_op_i2i16 * nir_op_i2f16 * nir_op_u2u16 * nir_op_u2f16 * nir_op_f2i16 * nir_op_f2u16 * nir_op_f2f16 v2: Remove "from" 16-bit in commit subject (Topi Pohjolainen) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	d711445430	nir: Add rounding modes enum v2: Added comments describing each of the rounding modes. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev	5165e222d1	nir: Add support for 16-bit types (half float, int16 and uint16) v2: Renamed glsl_half_float_type() to glsl_float16_t_type(). (Jason Ekstrand) Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Eduardo Lima <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev	59f458cd87	glsl: Add 16-bit types Adds new INT16, UINT16 and FLOAT16 base types. The corresponding GL types for half floats were reused from the AMD_gpu_shader_half_float extension. The int16 and uint16 types come from NV_gpu_shader_5 extension. This adds the builtins and the lexer support. To avoid a bunch of warnings due to cases not handled in switch, the new types have been added to a few places using same behavior as their 32-bit counterparts, except for a few trivial cases where they are already handled properly. Subsequent patches in this set will provide correct 16-bit implementations when needed. v2: * Use FLOAT16 instead of HALF_FLOAT as name of the base type. * Removed float16_t from builtin types. * Don't copy 16-bit types as if they were 32-bit values in copy_constant_to_storage(). * Use get_scalar_type() instead of adding a new custom switch statement. (Jason Ekstrand) v3: Use GL_FLOAT16_NV instead of GL_HALF_FLOAT for consistency (Ilia Mirkin) v4: Add missing 16-bit base types support in glsl_to_nir (Eduardo Lima). v5: Fix coding style (Topi Poholainen). Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-06 08:57:18 +01:00
Jason Ekstrand	93b4cb61eb	spirv: Allow OpPtrAccessChain for block indices The SPIR-V spec is a bit underspecified when it comes to exactly how you're allowed to use OpPtrAccessChain and what it means in certain edge cases. In particular, what if the base pointer of the OpPtrAccessChain points to the base struct of an SSBO instead of an element in that SSBO. The original variable pointers implementation in mesa assumed that you weren't allowed to do an OpPtrAccessChain that adjusted the block index and asserted such. However, there are some CTS tests that do this and, if the CTS does it, someone will do it in the wild so we should probably handle it. With this commit, we significantly reduce our assumptions and should be able to handle more-or-less anything. The one assumption we still make for correctness is that if we see an OpPtrAccessChain on a pointer to a struct decorated block that the block index should be adjusted. In theory, someone could try to put an array stride on such a pointer and try to make the SSBO an implicit array of the base struct and we would not give them what they want. That said, any index other than 0 would count as an out-of-bounds access which is invalid. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:54 -08:00
Jason Ekstrand	cfb81f58a0	nir: Add a vulkan_resource_reindex intrinsic This is required for being able to handle OpPtrAccessChain in SPIR-V where the base type of the incoming pointer requires us to add to the block index instead of the byte offset. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:54 -08:00
Jason Ekstrand	ae54a4f84f	spirv: Add support for lowering workgroup access to offsets Before, we always left workgroup variables as shared nir_variables and let the driver call nir_lower_io. This adds an option to do the lowering directly in spirv_to_nir. To do this, we implicitly assign the variables a std430 layout and then treat them like a UBO or SSBO and immediately lower all the way to an offset. As a side-effect, the spirv_to_nir pass now handles variable pointers for workgroup variables. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:54 -08:00
Jason Ekstrand	f6eb5ce39c	spirv: Rename get_shared_nir_atomic_op to get_var_nir_atomic_op Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:54 -08:00
Jason Ekstrand	992aabf239	spirv: Add theoretical support for single component pointers Up until now, all pointers have been ivec2s. We're about to add support for pointers to workgroup storage and those are going to be uints. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:54 -08:00
Jason Ekstrand	843c192e2b	spirv: Use offset_pointer_dereference to instead of get_vulkan_resource_index There is no good reason why we should have the same logic repeated in get_vulkan_resource_index and vtn_ssa_offset_pointer_dereference. If we're a bit more careful about how we do things, we can just use the one function and get rid of the other entirely. This also makes the push constant special case a lot more clear. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:53 -08:00
Jason Ekstrand	6dffef6308	spirv: Refactor a couple of pointer query helpers This commit moves them both into vtn_variables.c towards the top, makes them take a vtn_builder, and replaces a hand-rolled instance of is_external_block with a function call. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 20:56:16 -08:00
Jason Ekstrand	93646fb503	spirv: Refactor the base case of offset_pointer_dereference This makes us key off of !offset instead of !block_index. It also puts the guts inside a switch statement so that we can handle more than just UBOs and SSBOs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 20:56:14 -08:00
Jason Ekstrand	98edf6bca4	spirv: Add a switch statement for the block store opcode This parallels what we do for vtn_block_load except that we don't yet support anything except SSBO loads through this path. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 20:55:39 -08:00
Jason Ekstrand	91d91ce3e2	spirv: Use a dereference instead of vtn_variable_resource_index This is equivalent and means we don't have resource index code scattered about. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 20:55:37 -08:00
Jason Ekstrand	d74b1f4809	spirv: Replace unreachable with vtn_fail Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2017-12-04 09:21:09 -08:00
Jason Ekstrand	b7ef60d846	spirv: Replace assert with vtn_assert Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2017-12-04 09:21:09 -08:00
Jason Ekstrand	94ca8e04ad	spirv: Add vtn_fail and vtn_assert helpers These helpers are much nicer than just using assert because they don't kill your process. Instead, it longjmps back to spirv_to_nir(), cleans up all the temporary memory, and nicely returns NULL. While crashing is completely OK in the Vulkan world, it's not considered to be quite so nice in GL. This should help us to make SPIR-V parsing much more robust. The one downside here is that vtn_assert is not compiled out in release builds like assert() is so it isn't free. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2017-12-04 09:21:09 -08:00
Jason Ekstrand	591a07632c	spirv: Do something useful with OpSource We may as well log the source language and file name. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2017-12-04 09:21:09 -08:00
Jason Ekstrand	16dfdeefc8	spirv: Rework logging This commit reworks the way that logging works in SPIR-V to provide richer and more detailed logging infrastructure. This commit contains several improvements over the old mechanism: 1) Log messages are now more detailed. They contain the SPIR-V byte offset as well as source language information from OpSource and OpLine. 2) There is now a logging callback mechanism so that errors can get propagated to the client through debug callbak extensions. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2017-12-04 09:21:09 -08:00
Jason Ekstrand	11bd753c4e	spirv: Re-arrange vtn_builder initialization This simply moves allocating the vtn_builder and initializing it to the very beginning before we even parse the header. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2017-12-04 09:21:09 -08:00
Jason Ekstrand	d74bec1a54	spirv: Parent the nir_shader to the builder while building Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2017-12-04 09:21:09 -08:00
Samuel Iglesias Gonsálvez	fa8c1b92b7	glsl: don't run intrastage array validation when the interface type is not an array We validate that the interface block array type's definition matches. However, previously, the function could be called if an non-array interface block has different type definitions -for example, when the precision qualifier differs in a GLSL ES shader, we would create two different types-, and it would return invalid as both definitions are non-arrays. We fix this by specifying that at least one definition should be an array to call the validation. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:32:57 +01:00
Samuel Iglesias Gonsálvez	fc6d55952d	glsl/es: precision qualifier doesn't need to match in UBOs They might mismatch due to the two shaders using different GLSL versions, and that's ok in desktop GL. In ES, precision qualifiers don't need to match. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:32:57 +01:00
Fabian Bieler	9bdb5457f4	glsl: Match order of gl_LightSourceParameters elements. spotExponent and spotCosCutoff were swapped in the gl_builtin_uniform_element struct. Now the order matches across gl_builtin_uniform_element, glsl_struct_field and the spec. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-12-03 21:14:14 -07:00
Timothy Arceri	d99c7e0ff1	nir: allow builin arrays to be lowered Galliums nir drivers expect this to be done. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	2bc49ac3e6	nir: add array lowering function that assumes there are no indirects The gallium glsl->nir pass currently lowers away all indirects on both inputs and outputs. This fuction allows us to lower vs inputs and fs outputs and also lower things one stage at a time as we don't need to worry about indirects on the other side of the shaders interface. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	2a35021bc6	nir: fix support for scalar arrays in nir_lower_io_types() This was just recreating the same vector type we alreay had and hitting an assert for scalars. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Timothy Arceri	1c9c42d16b	nir: add varying component packing helpers v2: update shader info input/output masks when pack components v3: make sure interpolation loc matches, this is required for the radeonsi NIR backend. v4: `33dca36f4f` fixed nir_gather_info to update outputs_read correct, make sure we also adjust this correctly when packing components. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)	2017-12-04 09:10:30 +11:00
Timothy Arceri	c797bc6aa7	nir: add varying array splitting pass V2: - fix matrix support, non-array matrices were being skipped in v1 v3: - handle lowering of tcs output loads correctly - correctly mark indirect locations for either in or out not both when processing a stage. - use nir_src_copy() when lowering stores. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Jason Ekstrand	e19c623128	spirv: Convert the supported_extensions struct to spirv_options This is a bit more general and lets us pass additional options into the spirv_to_nir pass beyond what capabilities we support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-02 08:09:11 -08:00
Jason Ekstrand	6bd876dcaa	spirv: Only emit functions which are actually used Instead of emitting absolutely everything, just emit the few functions that are actually referenced in some way by the entrypoint. This should save us quite a bit of time when handed large shader modules containing many entrypoints. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-02 08:07:35 -08:00
Jason Ekstrand	f5aad36d2e	spirv: Drop the impl field from vtn_builder We have a nir_builder and it has an impl field. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-02 08:07:35 -08:00
Tapani Pälli	faccbaf3fa	mesa: add AllowGLSLCrossStageInterpolationMismatch workaround This fixes issues seen with certain versions of Unreal Engine 4 editor and games built with that using GLSL 4.30. v2: add driinfo_gallium change (Emil Velikov) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97852 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103801 Acked-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-30 11:43:10 +02:00
Timothy Arceri	a39a3b4b76	mesa: rework _mesa_add_parameter() to only add a single param This is more inline with what the functions name suggests it should do, and makes the code much easier to follow. This will also make adding uniform packing support much simpler. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 21:50:48 +11:00
Eric Engestrom	9d281e1506	compiler: fix typo Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-11-28 10:54:38 +00:00
Eric Engestrom	7b85b9b877	compiler: use NDEBUG to guard asserts nir_validate.c's #endif already had the correct NDEBUG comment Fixes: `dcb1acdea0` "nir/validate: Only build in debug mode" Fixes: `9ff71b649b` "i965/nir: Validate that NIR passes call nir_metadata_preserve()" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-11-28 10:54:38 +00:00
Timothy Arceri	3e789026ca	st/glsl_to_tgsi: make use of driver_cache_blob with the disk cache driver_cache_blob was introduced with the i965 disk cache, it allows us to simplify the cache a little and possibly offers some minor speed improvements since we load the GLSL metadata and TGSI from disk in one pass. Using driver_cache_blob should also make it straight forward to implement binary support for ARB_get_program_binary in gallium. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:01:44 +11:00
Gwan-gyeong Mun	4cb27047c8	glsl: Fix typo nagivation -> navigation Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-28 08:48:55 +11:00
Dave Airlie	33dca36f4f	nir: fill outputs_read field and add patch outputs read (v2) This is to be used for TCS optimisations on radv. v2: don't set written on reads (nha) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-27 13:50:03 +10:00
Ilia Mirkin	ab336e8b46	nir: allow texture offsets with cube maps GL doesn't have this, but some hardware supports it. This is convenient for lowering tg4 to plain texture calls, which is necessary on Adreno A4xx hardware. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 16:56:30 -05:00
Marek Olšák	78942e7dbf	mesa: shrink VERT_ATTRIB bitfields to 32 bits There are only 32 vertex attribs now. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-11-25 17:18:22 +01:00
Marek Olšák	43abaf2ad0	mesa: remove unused vertex attrib WEIGHT We don't support ARB_vertex_blend. Note that the attribute aliasing check for ARB_vertex_program had to be rewritten. vbo_context: 20344 -> 20008 bytes gl_context: 74672 -> 74616 bytes Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-11-25 17:17:52 +01:00
Marek Olšák	2116b97418	mesa: don't assign numbers to vertex attrib enums manually I plan to remove one of them. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-11-25 17:17:52 +01:00
Iago Toral Quiroga	a217cbd7ec	nir/gather_info: recognize load_patch_vertices_in as a system value This intrinsic is produced to load SYSTEM_VALUE_VERTICES_IN, which is generated to load gl_PatchVerticesIn in the SPIR-V path for both Vulkan and OpenGL. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-22 08:03:55 +01:00
George Barrett	f09c2cefdd	glsl: Catch subscripted calls to undeclared subroutines generate_array_index fails to check whether the target of a subroutine call exists in the AST, potentially passing around null ir_rvalue pointers eventuating in abort/segfault. Fixes: `fd01840c0b` ("glsl: add AoA support to subroutines") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100438	2017-11-20 11:04:04 +11:00
Brian Paul	92c1290dc5	glsl: s/unsigned/glsl_base_type/ in glsl type code (v2) Declare glsl_type::sampled_type as glsl_base_type as we do for the base_type field. And make base_type a bitfield to save a few bytes. Update glsl_type constructor to take glsl_base_type instead of unsigned and pass GLSL_TYPE_VOID instead of zero. No Piglit regressions with llvmpipe. v2: - Declare both base_type and sampled_type as 8-bit fields - Use the new ASSERT_BITFIELD_SIZE() macro. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-16 20:35:17 -07:00
Alejandro Piñeiro	b498172d0e	spirv: fix typo on DO NOT EDIT header Introduced on commit `157c9a1341` Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-14 13:07:36 +01:00
Alex Smith	4122d00846	nir/spirv: tg4 requires a sampler Gather operations in both GLSL and SPIR-V require a sampler. Fixes gathers returning garbage when using separate texture/samplers (on AMD, was using an invalid sampler descriptor). Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-13 13:38:18 +00:00
Alex Smith	e9eb3c4753	spirv: Use correct type for sampled images We should use the result type of the OpSampledImage opcode, rather than the type of the underlying image/samplers. This resolves an issue when using separate images and shadow samplers with glslang. Example: layout (...) uniform samplerShadow s0; layout (...) uniform texture2D res0; ... float result = textureLod(sampler2DShadow(res0, s0), uv, 0); For this, for the combined OpSampledImage, the type of the base image was being used (which does not have the Depth flag set, whereas the result type does), therefore it was not being recognised as a shadow sampler. This led to the wrong LLVM intrinsics being emitted by RADV. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-13 13:37:50 +00:00
Alejandro Piñeiro	157c9a1341	spirv: add DO NOT EDIT warning on generated spirv_info.c Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-13 13:28:44 +01:00
Iago Toral Quiroga	456e10944f	glsl/linker: use without_array() to retrieve type This is what we do in the condition too, so it makes sense. v2: Only compute without_array() once (Ilia). Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-13 09:22:26 +01:00
Timothy Arceri	8c9f3f2c46	nir: add streams to nir data This will be used by gallium drivers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-12 11:08:26 +11:00
Rob Clark	ef4c42fc3a	nir: handle get_buffer_size in nir_lower_atomics_to_ssbo Overlooked initially, be we need to remap the SSBO index for this as well. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-10 08:57:33 -05:00
Kenneth Graunke	688d695868	glsl: Make #pragma STDGL invariant(all) only modify outputs. According to the GLSL ES 3.20, GLSL 4.50, and GLSL 1.20 specs: "To force all output variables to be invariant, use the pragma #pragma STDGL invariant(all) before all declarations in a shader." Notably, this is only supposed to affect output variables. Furthermore, "Only variables output from a shader can be candidates for invariance." It looks like this has been wrong since we first supported the pragma in 2011 (commit `86b4398cd1`). Fixes dEQP-GLES2.functional.shaders.preprocessor.pragmas.pragma_fragment. v2: Now that all cases are identical (other than compute shaders, which have no output variables anyway), we can drop the switch statement entirely. We also don't need the current_function == NULL check; this was a hold over from when we had a single var_mode_out for both function parameters and shader varyings, in the bad old days. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-08 23:11:48 -08:00
Neil Roberts	4dc8458cd1	glsl: Transform fb buffers are only active if a variable uses them The GL spec will soon be revised to clarify that a buffer binding for a transform feedback buffer is only required if a variable is actually defined to use the buffer binding point. Previously a declaration for the default transform buffer would make it require a binding even if nothing was declared to use the default buffer. Affects: KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list_and_api Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-09 05:39:42 +01:00
Ian Romanick	9c53b80ff9	glsl: Minor cleanups after previous commit I think it's more clear to only call emit_access once. The only difference between the two calls is the value of size_mul used for the offset parameter... but you really have to look at it to be sure. The s/is_64bit/is_double/ change is because there are no int64_t or uint64_t matrix types. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	c18d8c61d6	glsl: Use more link_calculate_matrix_stride in lower_buffer_access I was going to squash this with the previous commit, but there's a lot of churn in that commit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	1a2beae1b3	glsl: Use link_calculate_matrix_stride in lower_buffer_access and friends Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	24e78d99db	glsl: Refactor matrix stride calculation into a utility function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	88f5588f77	glsl/linker: Optimize swizzles again after linking Without this, the SPIR-V generator has to deal with a bunch of junk like: (swiz z (swiz xxx (swiz x (var_ref packed:binormal.z,light_dir)))) It seems better to cull that stuff out than to add code to deal with it. The problem is the way swizzles to and from scalars have to be handled in SPIR-V. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	ef1ca06ce8	glsl: Combine nop-swizzle optimization with swizzle-swizzle optimization Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: <thomashelland90@gmail.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	c858abb14f	glsl: Make the swizzle-swizzle optimization greedy If there is a long sequence of swizzled swizzles, compact all of them down to a single swizzle. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: <thomashelland90@gmail.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	ae1fd09c1d	glsl: Remove program_resource_visitor::visit_field(const glsl_struct_field *) I could not find any remaining users. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	2c7657f62c	glsl: Silence unused parameter warning glsl/lower_shared_reference.cpp: In member function ‘virtual void {anonymous}::lower_shared_reference_visitor::insert_buffer_access(void, ir_dereference, const glsl_type, ir_rvalue, unsigned int, int)’: glsl/lower_shared_reference.cpp:244:58: warning: unused parameter ‘channel’ [-Wunused-parameter] int channel) ^~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-08 18:37:29 -08:00
Timothy Arceri	9c33533586	glsl: use the correct parent when allocating program data members Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-09 12:07:48 +11:00
Timothy Arceri	cf05bb506a	glsl: drop cache_fallback This turned out to be a dead end, it is much easier and less error prone to just cache the IR used by the drivers backend e.g. TGSI or NIR. Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-09 12:07:48 +11:00
Matt Turner	77a63d190a	nir: Don't print swizzles when there are more than 4 components ... as can happen with various types like mat4, or else we'll smash the stack writing past the end of components_local[]. Fixes: `5a0d3e1129` ("nir: Print the components referenced for split or packed shader in/outs.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-08 13:22:26 -08:00
Dylan Baker	34593e978c	meson: Add threads dependencies to glsl_compiler executable Fixes compiling the optional standalone glsl compiler. Reported-by: DrNick (on irc) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-08 11:36:02 -08:00
Andreas Boll	a6932faae1	glsl: Fix typo fragement -> fragment Fixes: `94d669b0d2` ("glsl: enforce fragment shader input restrictions in GLSL ES 3.10") Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-08 18:30:48 +00:00
Juan A. Suarez Romero	d5a641106b	glsl: add varying resources for arrays of complex types This patch is mostly a patch done by Ilia Mirkin. It fixes KHR-GL45.enhanced_layouts.varying_structure_locations. v2: fix locations for TCS/TES/GS inputs and outputs (Ilia) CC: Ilia Mirkin <imirkin@alum.mit.edu> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-11-08 10:12:07 +01:00
Jason Ekstrand	df81b81fb9	compiler/nir_types: Handle vectors in glsl_get_array_element Most of NIR doesn't allow doing array indexing on a vector (though it does on a matrix). However, nir_lower_io handles it just fine and this behavior is needed for shared variables in Vulkan. This commit makes glsl_get_array_element do something sensible for vector types and makes nir_validate happy with them. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:41:24 -08:00
Jason Ekstrand	ad77775809	nir: Validate base types on array dereferences We were already validating that the parent type goes along with the child type but we weren't actually validating that the parent type is reasonable. This fixes that. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:41:24 -08:00
Jason Ekstrand	ab9220edd6	nir,intel/compiler: Use a fixed subgroup size The GL_ARB_shader_ballot spec says that gl_SubGroupSizeARB is declared as a uniform. This means that it cannot change across an invocation such as a draw call or a compute dispatch. For compute shaders, we're ok because we only ever use one dispatch size. For fragment, however, the hardware dynamically chooses between SIMD8 and SIMD16 which violates the spec. Instead, let's just pick a subgroup size based on the shader stage. The fixed size we choose for compute shaders is a bit higher than strictly needed but there's no real harm in that. The advantage is that, if they do anything interesting with the value, NIR will see it as an immediate and can optimize better. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	a026458020	nir/lower_subgroups: Lower ballot intrinsics to the specified bit size Ballot intrinsics return a bitfield of subgroups. In GLSL and some SPIR-V extensions, they return a uint64_t. In SPV_KHR_shader_ballot, they return a uvec4. Also, some back-ends would rather pass around 32-bit values because it's easier than messing with 64-bit all the time. To solve this mess, we make nir_lower_subgroups take a new parameter called ballot_bit_size and it lowers whichever thing it gets in from the source language (uint64_t or uvec4) to a scalar with the specified number of bits. This replaces a chunk of the old lowering code. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	8c2bf020fd	nir/builder: Add a nir_imm_intN_t helper This lets you easily build integer immediates of arbitrary bit size. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	9b35faba42	nir/lower_system_values: Lower SUBGROUP__MASK based on type The SUBGROUP__MASK system values are uint64_t when coming in from GLSL but uvec4 when coming in from SPIR-V. Lowering based on type allows us to nicely handle both. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	3ee91ee6ac	nir: Make ballot intrinsics variable-size This way they can return either a uvec4 or a uint64_t. At the moment, this is a no-op since we still always return a uint64_t. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	ad127afcfd	nir: Add a ssa_dest_init_for_type helper This would be useful a number of places Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	28da82f978	nir: Add a new subgroups lowering pass This commit pulls nir_lower_read_invocations_to_scalar along with most of the guts of nir_opt_intrinsics (which mostly does subgroup lowering) into a new nir_lower_subgroups pass. There are various other bits of subgroup lowering that we're going to want to do so it makes a bit more sense to keep it all together in one pass. We also move it in i965 to happen after nir_lower_system_values to ensure that because we want to handle the subgroup mask system value intrinsics here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	295605c930	intel/cs: Push subgroup ID instead of base thread ID We're going to want subgroup ID for SPIR-V subgroups eventually anyway. We really only want to push one and calculate the other from it. It makes a bit more sense to push the subgroup ID because it's simpler to calculate and because it's a real API thing. The only advantage to pushing the base thread ID is to avoid a single SHL in the shader. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	80ddfab2f5	intel/cs: Rework the way thread local ID is handled Previously, brw_nir_lower_intrinsics added the param and then emitted a load_uniform intrinsic to load it directly. This commit switches things over to use a specific NIR intrinsic for the thread id. The one thing I don't like about this approach is that we have to copy thread_local_id over to the new visitor in import_uniforms. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Gwan-gyeong Mun	fb87c40a58	nir: fix a typo Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-06 18:11:24 -08:00
Tomasz Figa	0886be093f	glsl: Allow precision mismatch on dead data with GLSL ES 1.00 Commit `259fc50545` added linker error for mismatching uniform precision, as required by GLES 3.0 specification and conformance test-suite. Several Android applications, including Forge of Empires, have shaders which violate this rule, on a dead varying that will be eliminated. The problem affects a big number of applications using Cocos2D engine and other GLES implementations accept this, this poses a serious application compatibility issue. Starting from GLSL ES 3.0, declarations with conflicting precision qualifiers are explicitly prohibited. However GLSL ES 1.00 does not clearly specify the behavior, except that "Uniforms are defined to behave as if they are using the same storage in the vertex and fragment processors and may be implemented this way. If uniforms are used in both the vertex and fragment shaders, developers should be warned if the precisions are different. Conversion of precision should never be implicit." The word "used" is not clear in this context and might refer to 1) declared (same as GLES 3.x) 2) referred after post-processing, or 3) linked after all optimizations are done. Looking at existing applications, 2) or 3) seems to be widely adopted. To avoid compatibility issues, turn the error into a warning if GLSL ES version is lower than 3.0 and the data is dead in at least one of the shaders. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97532 Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-06 15:16:03 -08:00
Nicolai Hähnle	ca63a5ed3e	glsl: fix interpolateAtXxx(some_vec[idx], ...) with dynamic idx The dynamic index of a vector (not array!) is lowered to a sequence of conditional assignments. However, the interpolate_at_* expressions require that the interpolant is an l-value of a shader input. So instead of doing conditional assignments of parts of the shader input and then interpolating that (which is nonsensical), we interpolate the entire shader input and then do conditional assignments of the interpolated result. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-03 14:30:08 +01:00
Nicolai Hähnle	4f42450b86	glsl: allow any l-value of an input variable as interpolant in interpolateAt* The intended rule has been clarified in GLSL 4.60, Section 8.13.2 (Interpolation Functions): "For all of the interpolation functions, interpolant must be an l-value from an in declaration; this can include a variable, a block or structure member, an array element, or some combination of these. Component selection operators (e.g., .xy) may be used when specifying interpolant." For members of interface blocks, var->data.must_be_shader_input must be determined on-the-fly after lowering interface blocks, since we don't want to disable varying packing for an entire block just because one input in it is used in interpolateAt. v2: keep setting must_be_shader_input in ast_function (Ian) v3: follow the relaxed rule of GLSL 4.60 v4: only apply the relaxed rules to desktop GL (the ES WG decided that the relaxed rules may apply in a future version but not retroactively; see also dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_centroid.negative.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101378 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-03 14:30:08 +01:00
Dave Airlie	57372c5a42	nir/serialize: fix build with gcc 4.4.7 I had to build on RHEL6 today, and noticed this. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-03 15:03:35 +10:00
Timothy Arceri	440d08fe93	nir: skip lowering sampler if there is no dereference This avoids a crash on the output of nir_lower_bitmap(). Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:19:46 +11:00
Timothy Arceri	cf5f8f55c3	nir: add tess patch support to nir_remove_unused_varyings() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 08:58:39 +11:00
Jordan Justen	e6ecd7d73f	glsl/shader_cache: Save fs (BlendSupport) metadata Fixes many GL 4.5 CTS blend tests, such as: * GL45-CTS.blend_equation_advanced.extension_directive_enable * GL45-CTS.blend_equation_advanced.extension_directive_warn * GL45-CTS.blend_equation_advanced.blend_all.GL_MULTIPLY_KHR_all_qualifier * GL45-CTS.blend_equation_advanced.blend_specific.GL_COLORBURN_KHR v2: * Directly save the BlendSupport field to avoid potentially including a pointer in the future in the structure is updated. (tarceri) Cc: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Timothy Arceri	15f39e8654	mesa/glsl: add api_enabled flag to gl_transform_feedback_info This will be used to disable the shader cache when xfb is enabled via the api as we don't currently allow for it when generating the sha for the shader. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Jordan Justen	4c7a1ec62a	blob: Don't set overrun if reading 0 bytes at end of data Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Jordan Justen	6b815e405d	glsl/shader_cache: Save and restore serialized nir in gl_program v3: * Rename serialized_nir* to driver_cache_blob*. (Tim) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Jason Ekstrand	54f691311c	nir: Add hooks for testing serialization Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-31 23:36:53 -07:00
Connor Abbott	120da00975	nir: add serialization and deserialization v2 (Jason Ekstrand): - Various whitespace cleanups - Add helpers for reading/writing objects - Rework derefs - [de]serialize nir_shader::num_* - Fix uses of blob_reserve_bytes - Use a bitfield struct for packing tex_instr data v3: - Zero nir_variable struct on deserialization. (Jordan) - Allow nir_serialize.h to be included in C++. (Jordan) - Handle NULL info.name. (Jason) - Set info.name to NULL when name is NULL. (Jordan) Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:53 -07:00
Neil Roberts	b697ece10a	nir/opt_intrinsics: Fix values for gl_SubGroupG{e,t}MaskARB Previously the values were calculated by just shifting ~0 by the invocation ID. This would end up including bits that are higher than gl_SubGroupSizeARB. The corresponding CTS test effectively requires that these high bits be zero so it was failing. There is a Piglit test as well but this appears to checking the wrong values so it passes. For the two greater-than bitmasks, this patch adds an extra mask with (~0>>(64-gl_SubGroupSizeARB)) to force these bits to zero. Fixes: KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102680#c3 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Neil Roberts <nroberts@igalia.com>	2017-10-31 23:28:00 +01:00
Ian Romanick	53c7b8bdca	glsl: Fix bad formatting in a comment Trivial Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2017-10-30 20:08:25 -07:00
Eduardo Lima Mitev	f9de7f5596	glsl/linker: Check that re-declared, inter-shader built-in blocks match >From GLSL 4.5 spec, section "7.1 Built-In Language Variables", page 130 of the PDF states: "If multiple shaders using members of a built-in block belonging to the same interface are linked together in the same program, they must all redeclare the built-in block in the same way, as described in section 4.3.9 “Interface Blocks” for interface-block matching, or a link-time error will result." Fixes: * GL45-CTS.CommonBugs.CommonBug_PerVertexValidation v2 (Neil Roberts): Explicitly look for gl_PerVertex in the symbol tables instead of waiting to find a variable in the interface. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102677 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com>	2017-10-30 18:10:39 +01:00
Eduardo Lima Mitev	f5fe99ac85	glsl: Use the utility function to copy symbols between symbol tables This effectively factorizes a couple of similar routines. v2 (Neil Roberts): Non-trivial rebase on master Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com>	2017-10-30 18:10:39 +01:00
Eduardo Lima Mitev	4c62a270a9	glsl_parser_extra: Add utility to copy symbols between symbol tables Some symbols gathered in the symbols table during parsing are needed later for the compile and link stages, so they are moved along the process. Currently, only functions and non-temporary variables are copied between symbol tables. However, the built-in gl_PerVertex interface blocks are also needed during the linking stage (the last step), to match re-declared blocks of inter-stage shaders. This patch adds a new utility function that will factorize current code that copies functions and variables between two symbol tables, and in addition will copy explicitly declared gl_PerVertex blocks too. The function will be used in a subsequent patch. v2 (Neil Roberts): Allow the src symbol table to be NULL and explicitly copy the gl_PerVertex symbols in case they are not referenced in the exec_list. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com>	2017-10-30 18:10:39 +01:00
Ian Romanick	6403efbe74	glsl: Remove ir_binop_greater and ir_binop_lequal expressions NIR does not have these instructions. TGSI and Mesa IR both implement them using < and >=, repsectively. Removing them deletes a bunch of code and means I don't have to add code to the SPIR-V generator for them. v2: Rebase on 2+ years of change... and fix a major bug added in the rebase. text data bss dec hex filename 8255291 268856 294072 8818219 868e2b 32-bit i965_dri.so before 8254235 268856 294072 8817163 868a0b 32-bit i965_dri.so after 7815339 345592 420592 8581523 82f193 64-bit i965_dri.so before 7813995 345560 420592 8580147 82ec33 64-bit i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-30 09:27:09 -07:00
Ian Romanick	34f7e761bc	glsl/parser: Track built-in types using the glsl_type directly Without the lexer changes, tests/glslparsertest/glsl2/tex_rect-02.frag fails. Before this change, the parser would determine that sampler2DRect is not a valid type because the call to state->symbols->get_type() in ast_type_specifier::glsl_type() would return NULL. Since ast_type_specifier::glsl_type() is now going to return the glsl_type pointer that it received from the lexer, it doesn't have an opportunity to generate an error. text data bss dec hex filename 8255243 268856 294072 8818171 868dfb 32-bit i965_dri.so before 8255291 268856 294072 8818219 868e2b 32-bit i965_dri.so after 7815195 345592 420592 8581379 82f103 64-bit i965_dri.so before 7815339 345592 420592 8581523 82f193 64-bit i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-30 09:27:09 -07:00
Ian Romanick	747c057530	glsl/parser: Return the glsl_type object from the lexer This allows us to use a single token for every built-in type except void. text data bss dec hex filename 8275163 269336 294072 8838571 86ddab 32-bit i965_dri.so before 8255243 268856 294072 8818171 868dfb 32-bit i965_dri.so after 7836963 346552 420592 8604107 8349cb 64-bit i965_dri.so before 7815195 345592 420592 8581379 82f103 64-bit i965_dri.so after Yes, the 64-bit binary shrinks by 21k. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-30 09:27:09 -07:00
Ian Romanick	4171900cf1	glsl/parser: Allocate identifier inside classify_identifier Passing YYSTYPE into classify_identifier enables a later patch. text data bss dec hex filename 8310339 269336 294072 8873747 876713 32-bit i965_dri.so before 8275163 269336 294072 8838571 86ddab 32-bit i965_dri.so after 7845579 346552 420592 8612723 836b73 64-bit i965_dri.so before 7836963 346552 420592 8604107 8349cb 64-bit i965_dri.so after Yes, the 64-bit binary shrinks by 8k. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-30 09:27:09 -07:00
Ian Romanick	792acfc44a	glsl/parser: Move anonymous struct name handling to the parser There are two callers of the constructor, and they are right next to each other. Move the "#anon_struct" name handling to the parser so that the conditional can be removed. I've also deleted part of the comment (about the memory leak) because I don't think it's quite accurate or relevant. text data bss dec hex filename 8310399 269336 294072 8873807 87674f 32-bit i965_dri.so before 8310339 269336 294072 8873747 876713 32-bit i965_dri.so after 7845611 346552 420592 8612755 836b93 64-bit i965_dri.so before 7845579 346552 420592 8612723 836b73 64-bit i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-30 09:27:09 -07:00
Ian Romanick	fc07ab165b	glsl/parser: Silence unused parameter warning glsl/glsl_parser_extras.cpp: In constructor ‘ast_struct_specifier::ast_struct_specifier(void, const char, ast_declarator_list)’: glsl/glsl_parser_extras.cpp:1675:50: warning: unused parameter ‘lin_ctx’ [-Wunused-parameter] ast_struct_specifier::ast_struct_specifier(void lin_ctx, const char *identifier, ^~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-30 09:27:09 -07:00
Ian Romanick	d70e8ef1c1	glsl: Silence unused parameter warnings glsl/standalone_scaffolding.cpp: In function ‘GLbitfield _mesa_program_state_flags(const gl_state_index)’: glsl/standalone_scaffolding.cpp:103:66: warning: unused parameter ‘state’ [-Wunused-parameter] _mesa_program_state_flags(const gl_state_index state[STATE_LENGTH]) ^ glsl/standalone_scaffolding.cpp: In function ‘char _mesa_program_state_string(const gl_state_index)’: glsl/standalone_scaffolding.cpp:109:67: warning: unused parameter ‘state’ [-Wunused-parameter] _mesa_program_state_string(const gl_state_index state[STATE_LENGTH]) ^ glsl/standalone_scaffolding.cpp: In function ‘void _mesa_delete_shader(gl_context, gl_shader)’: glsl/standalone_scaffolding.cpp:115:40: warning: unused parameter ‘ctx’ [-Wunused-parameter] _mesa_delete_shader(struct gl_context ctx, struct gl_shader sh) ^~~ glsl/standalone_scaffolding.cpp: In function ‘void _mesa_delete_linked_shader(gl_context, gl_linked_shader)’: glsl/standalone_scaffolding.cpp:123:47: warning: unused parameter ‘ctx’ [-Wunused-parameter] _mesa_delete_linked_shader(struct gl_context ctx, ^~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-30 09:27:09 -07:00
Emil Velikov	fc7816fd4e	Revert "foo" This reverts commit `27d5a7bce0`. I fat fingered it, failing to reset the checkout before applying the sequential commit.	2017-10-30 15:32:56 +00:00
Emil Velikov	27d5a7bce0	foo Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-10-30 15:22:26 +00:00
Kenneth Graunke	86c68bb886	nir: Make nir_gather_info collect a uses_fddx_fddy flag. i965 turns fddx/fddy into their coarse/fine variants based on the ctx->Hint.FragmentShaderDerivative setting. It needs to know whether this can impact a shader in order to better guess NOS settings. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-29 20:52:20 -07:00
Jason Ekstrand	8ab9820d34	spirv: Claim support for the simple memory model It's rather surprising that we've never actually hit this before. Aparently, Ian's SPIR-V generator currently claims the Simple when you don't do anything complex. We really shouldn't assert-fail on it. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: mesa-stable@lists.freedesktop.org	2017-10-26 15:24:38 -07:00
Iago Toral Quiroga	13652e7516	glsl/linker: Fix type checks for location aliasing From the OpenGL 4.6 spec, section 4.4.1 Input Layout Qualifiers, Page 68, (Location aliasing): "Further, when location aliasing, the aliases sharing the location must have the same underlying numerical type (floating-point or integer)." The current implementation is too strict, since it checks that the the base types are an exact match instead. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	7276ccf8ed	glsl/linker: refactor check_location_aliasing Mostly, this merges the type checks with all the other checks so we only have a single loop for this. Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	e2abb75b0e	glsl/linker: validate explicit locations for SSO programs v2: - we only need to validate inputs to the first stage and outputs from the last stage, everything else has already been validated during cross_validate_outputs_to_inputs (Timothy). - Use MAX_VARYING instead of MAX_VARYINGS_INCL_PATCH (Illia) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	bdaf058978	glsl/linker: generalize validate_explicit_variable_location for SSO For non-SSO programs, we only need to validate outputs, since the cross validation of outputs to inputs will ensure that we produce linker errors for invalid inputs too. Hoever, for the SSO path there is no output to input validation, so we need to validate inputs explicitly. Generalize the function so it can handle this as well. Also, notice that vertex shader inputs and fragment shader outputs are already validated in assign_attribute_or_color_locations() for both SSO and non-SSO paths, so we should not try to validate that here again (in fact, the function would require explicit paths to handle these two cases properly). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	e7b7fe314e	glsl/linker: create a helper function to validate explicit locations Currently, we only validate explicit locations for non-SSO programs. This creates a helper that we can call from both SSO and non-SSO paths directly, so we can reuse all the logic behind this. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	ab40acb453	glsl/linker: outputs in the same location must share auxiliary storage From ARB_enhanced_layouts: "[...]when location aliasing, the aliases sharing the location must have the same underlying numerical type (floating-point or integer) and the same auxiliary storage and interpolation qualification.[...]" Add code to the linker to validate that aliased locations do have the same aux storage. Fixes: KHR-GL45.enhanced_layouts.varying_location_aliasing_with_mixed_auxiliary_storage Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	0b565f715d	glsl/linker: outputs in the same location must share interpolation From ARB_enhanced_layouts: "[...]when location aliasing, the aliases sharing the location must have the same underlying numerical type (floating-point or integer) and the same auxiliary storage and interpolation qualification.[...]" Add code to the linker to validate that aliased locations do have the same interpolation. Fixes: KHR-GL45.enhanced_layouts.varying_location_aliasing_with_mixed_interpolation Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	c4545676d7	glsl/linker: fix location aliasing checks for interface variables The existing code was checking the whole interface variable rather than its members, which is not what we want: we want to check aliasing for each member in the interface variable. Surprisingly, there are piglit tests that verify this and were passing due to a bug in the existing code: when we were computing the last component used by an interface variable we would use the 'vector' path and multiply by vector_elements, which is 0 for interface variables. This made the loop that checks for aliasing be a no-op and not add the interface variable to the list of outputs so then we would fail to link when we did not see a matching output for the same input in the next stage. Since the tests expect a linker error to happen, they would pass, but not for the right reason. Unfortunately, the current implementation uses ir_variable instances to keep track of explicit locations. Since we don't have ir_variables instances for individual interface members, we need to have a custom struct with the data we need. This struct has the ir_variable (which for interface members is the whole interface variable), plus the data that we need to validate for each aliased location, for now only the base type, which for interface members we will take from the appropriate field inside the interface variable. Later patches will expand this custom struct so we can also check other requirements for location aliasing, specifically that we have matching interpolation and auxiliary storage, that once again, we will take from the appropriate field members for the interface variables. v2: - Use MAX_VARYING instead of MAX_VARYINGS_INCL_PATCH (Illia) Fixes: KHR-GL45.enhanced_layouts.varying_block_automatic_member_locations Fixes (these were passing before but for incorrect reasons): tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-location-overlap.shader_test tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-mixed-order-overlap.shader_test Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-26 08:40:14 +02:00

... 3 4 5 6 7 ...

2538 Commits