KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Jason Ekstrand	63101177f3	nir: Add another index to load_uniform to specify the range read Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	769b5614f8	nir/opt_algebraic: Remove the encoding line This is an unneeded diff between the vulkan and master branches	2016-04-14 10:35:40 -07:00
Jason Ekstrand	c34be07230	spirv: Move to compiler/ While it does rely on NIR, it's not really part of the NIR core. At the moment, it still builds as part of libnir but that can be changed later if desired.	2016-04-14 10:28:47 -07:00
Jason Ekstrand	bfa3a38280	nir: Remove some pointless delta between vulkan and master	2016-04-14 10:24:33 -07:00
Jose Fonseca	feb6732e80	nir: Use _snprintf on Windows. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-14 16:38:37 +01:00
Jose Fonseca	ba0c0e3940	nir: Avoid structure initialization expressions. Not supported by MSVC, and completely unnecessary -- inline functions work just as well. NIR_SRC_INIT/NIR_DEST_INIT could and probably should be replaced by the inline functions. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-14 16:38:37 +01:00
Jose Fonseca	8f96524f13	nir: Remove unistd.h include. It doesn't seem needed, and is not available on MSVC. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-14 16:38:31 +01:00
Jose Fonseca	f8e2f1fba5	nir: Avoid empty {} struct initializer. Not supported by MSVC and consistent through NIR. [Emil Velikov: rebase] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-14 16:33:52 +01:00
Jason Ekstrand	12f88ba32a	Merge remote-tracking branch 'public/master' into vulkan	2016-04-13 20:25:39 -07:00
Jason Ekstrand	b63a98b121	nir/dead_variables: Configurably work with any variable mode The old version of the pass only worked on globals and locals and always left inputs, outputs, uniforms, etc. alone. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-13 15:45:10 -07:00
Jason Ekstrand	4455bfa9a0	nir/algebraic: Add lowering for ldexp The algorithm used is different from both the naive suggestion from the GLSL spec and the one used in GLSL IR today. Unfortunately, the GLSL IR implementation that we have today doesn't handle denormals (for those that care) or the case where the float source is +-inf. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-13 15:44:19 -07:00
Jason Ekstrand	745b3d295e	nir: Add more modulus opcodes These are all needed for SPIR-V Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-13 15:44:00 -07:00
Jason Ekstrand	dd616cab01	nir/lower_io: Allow for a full bitmask of modes Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-13 12:44:10 -07:00
Jason Ekstrand	2caaf0ac5e	nir/lower_indirect: nir_variable_mode is now a bitfield Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-13 12:44:07 -07:00
Jason Ekstrand	ffa0e12e15	nir: Convert nir_variable_mode to a bitfield There are several passes where we need to specify some set of variable modes that the pass needs to operate on. This lets us easily do that. Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-13 12:40:12 -07:00
Jason Ekstrand	8f3b516f2e	nir/clone: Copy bit size when cloning registers Reported-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-12 16:41:58 -07:00
Ian Romanick	193a5cee6a	nir: Fix typo in comment Trivial. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-11 19:24:19 -07:00
Markus Wick	18c8b927e2	nir: Merge redundant integer clamping. Dolphin uses them a lot. Range tracking would be better in the long term, but these two lines work fine for now. Signed-off-by: Markus Wick <markus@selfnet.de> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 18:48:50 -07:00
Kenneth Graunke	808d26c771	nir: Silence unused "options" warning in algebraic passes. Some passes may not refer to options->..., at which point the compiler will warn about an unused variable. Just cast to void unconditionally to shut it up. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-11 18:44:08 -07:00
Kenneth Graunke	5886cd79a0	nir: Do basic constant reassociation. Many shaders contain expression trees of the form: const_1 * (value * const_2) Reorganizing these to (const_1 * const_2) * value will allow constant folding to combine the constants. Sometimes, these constants are 2 and 0.5, so we can remove a multiply altogether. Other times, it can create more immediate constants, which can actually hurt. Finding a good balance here is tricky. While much more could be done, this simple patch seems to have a lot of positive benefit while having a low downside. shader-db results on Broadwell: total instructions in shared programs: 8963768 -> 8961369 (-0.03%) instructions in affected programs: 438318 -> 435919 (-0.55%) helped: 1502 HURT: 245 total cycles in shared programs: 71527354 -> 71421516 (-0.15%) cycles in affected programs: 11541788 -> 11435950 (-0.92%) helped: 3445 HURT: 1224 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-11 18:43:55 -07:00
Jason Ekstrand	a9e6213edd	nir/lower_system_values: Add support for several computed values Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-11 13:53:03 -07:00
Emil Velikov	3d67780b80	compiler: remove {glsl,nir}/Makefile.sources No longer used as of last commit. v2: Rebase. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2016-04-11 19:08:23 +01:00
Jason Ekstrand	3aa1a5ee88	nir/lower_system_values: Simplify the computation of LocalInvocationIndex	2016-04-10 23:43:38 -07:00
Connor Abbott	a89c474157	nir: add a pass for lowering (un)pack_double_2x32 v2: Undo unintended change to the signature of nir_normalize_cubemap_coords (Iago). v3: Move to compiler/nir (Iago) v4: Remove Authors from copyright header (Michael Schellenberger) v5 (Sam): - Use nir_channel() and nir_ssa_for_alu_src() helpers (Jason) - Inline lower_double_pack_instr() code into lower_double_pack_block() (Jason). - Initialize nir_builder at lower_double_pack_impl() (Jason). Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Connor Abbott	663e6421df	nir: add split versions of (un)pack_double_2x32 v2 (Sam): - Use uint64 instead of float64 for sources and destinations. (Connor) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Connor Abbott	b093808d26	nir: don't try to scalarize unpack_double_2x32 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Connor Abbott	9e31e0a21b	nir: add support for (un)pack_double_2x32 v2 (Sam): - Use uint64 instead of float64 for sources and destinations. (Connor) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Iago Toral Quiroga	d5d6260329	nir: add i2d and u2d opcodes v2: - Assert supports_int and don't fallback to nir_fmov (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Iago Toral Quiroga	b16d06252e	nir: add d2i, d2u, d2b opcodes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Connor Abbott	a4bce07dc6	nir: add support for d2f and f2d Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Iago Toral Quiroga	fab5d4cd95	nir/glsl_to_nir: set bit_size on ssbo_load result v2 (Sam): - Add missing bit_size assignment when ssbo_load destination is a boolean. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Samuel Iglesias Gonsálvez	a741378cb5	nir/glsl_to_nir: add bit-size info to add_instr() Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:28:01 +02:00
Connor Abbott	4b37c64f3b	nir/split_var_copies: handle doubles Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Connor Abbott	106a1b5501	nir/instr_set: handle 64-bit bit-sizes v2: Revert spurious change in nir_opt_cse.c (Iago) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Connor Abbott	f2ccb63be1	nir: handle doubles in nir_deref_get_const_initializer_load() v2 (Sam): - Use proper bitsize value when calling to nir_load_const_instr_create() (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Connor Abbott	41c2541fc7	nir/print: add support for printing doubles and bitsize v2: - Squash the printing doubles related patches into one patch (Sam). v3: - Print using PRIx64 format: long is 32-bit on some 32-bit platforms but long long is basically always 64-bit (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Connor Abbott	f5551f8a8b	nir/glsl_to_nir: support doubles v2: - Don't set sized types to the destination of texture related opcodes. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Iago Toral Quiroga	8e69782e3e	nir/lower_load_const_to_scalar: support doubles and multiple bit sizes v2 (Sam): - Add assert to detect bit sizes other than 32 and 64 (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Iago Toral Quiroga	12f628adcb	nir/lower_to_source_mods: Handle different bit sizes v2 (Sam): - Use helper to get base type from nir_alu_type. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Samuel Iglesias Gonsálvez	3663a2397e	nir: add bit_size info to nir_load_const_instr_create() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Connor Abbott	a5b17ae745	nir/lower_vec: adapt to different bit sizes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Samuel Iglesias Gonsálvez	e3edaec739	nir: add bit_size info to nir_ssa_undef_instr_create() v2: - Make the users to give the right bit_sizes as arguments (Jason). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Connor Abbott	41a39e3384	nir/locals_to_regs: adapt to different bit sizes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Connor Abbott	40d1b671a9	nir/from_ssa: adapt to different bit sizes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Jason Ekstrand	7d58cfa366	nir: Add a pass for gathering various bits of shader info Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-10 20:43:47 -07:00
Jason Ekstrand	b8f3909b73	nir/gather_info: Handle discard_if Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:58:36 -07:00
Jason Ekstrand	e26a978773	Merge remote-tracking branch 'public/master' into vulkan	2016-04-07 16:56:34 -07:00
Kenneth Graunke	3babb7b0a4	nir: Use PRIi64 and PRIu64 instead of %ld and %lu. %ld and %lu aren't the right format specifiers for int64_t and uint64_t on 32-bit (x86) systems. They're %zu on Linux and %Iu on Windows. Use the standard C99 macros in hopes that they work everywhere. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-04 14:38:48 -07:00
Jason Ekstrand	eb93d6dec8	nir/search: Don't match inexact expressions with exact subexpressions In the first pass of implementing exact handling, I made a mistake with search-and-replace. In particular, we only really handled exact/inexact on the root of the tree. Instead, we need to check every node in the tree for an exact/inexact match. As an example of this, consider the following GLSL code precise float a = b + c; if (a < 0) { do_stuff(); } In that case, only the add will be declared "exact" and an expression that looks for "b + c < 0" will still match and replace it with "b < -c" which may yield different results. The solution is to simply bail if any of the values are exact when matching an inexact expression. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-04 13:48:10 -07:00
Jason Ekstrand	fe247bbe92	nir: Stop double-printing function arguments	2016-04-04 12:10:20 -07:00
Jason Ekstrand	cc1320220f	nir/gather_info: Add an assert for supported stages	2016-04-01 15:44:43 -07:00
Jason Ekstrand	ebb0bcc11d	nir: Move variable_get_io_mask back into gather_info It used to be in nir_gather_info.c until I moved it out to nir.h so it could be re-used with some linking code that never got merged. We'll move it back out if and when we have real code to share it with.	2016-04-01 15:39:48 -07:00
Jason Ekstrand	95106f6bfb	Merge remote-tracking branch 'public/master' into vulkan	2016-04-01 15:16:21 -07:00
Jason Ekstrand	de60e250f5	nir: Add an opcode for stomping a 32-bit value to 16-bit precision This correlates directly to the SPIR-V opcode OpQuantizeToF16 Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-01 13:52:28 -07:00
Ian Romanick	08ff5f4d1f	nir: Simplify a bcsel to logical-or Oddly, this did not affect the shader where I first noticed the pattern. That particular shader doesn't get its if-statement converted to a bcsel because there are two assignments in the else-statement. This led to me submitting https://bugs.freedesktop.org/show_bug.cgi?id=94747. shader-db results: Sandy Bridge total instructions in shared programs: 8467384 -> 8467069 (-0.00%) instructions in affected programs: 36594 -> 36279 (-0.86%) helped: 46 HURT: 0 total cycles in shared programs: 117573448 -> 117568518 (-0.00%) cycles in affected programs: 339114 -> 334184 (-1.45%) helped: 46 HURT: 0 Ivy Bridge / Haswell / Broadwell / Skylake: total instructions in shared programs: 7774258 -> 7773999 (-0.00%) instructions in affected programs: 30874 -> 30615 (-0.84%) helped: 46 HURT: 0 total cycles in shared programs: 65739190 -> 65734530 (-0.01%) cycles in affected programs: 180380 -> 175720 (-2.58%) helped: 45 HURT: 1 No change on G45 or Ironlake. I also tried these expressions, but none of them affected any shaders in shader-db: (('bcsel', a, 'a@bool', 'b@bool'), ('ior', a, b)), (('bcsel', a, 'b@bool', False), ('iand', a, b)), (('bcsel', a, 'b@bool', 'a@bool'), ('iand', a, b)), Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-31 14:59:36 -07:00
Matt Turner	05ee6627d6	nir: Fix typo from commit `6702f1acde`.	2016-03-30 19:18:35 -07:00
Matt Turner	6702f1acde	nir: Propagate negates up multiplication chains. total instructions in shared programs: 7112159 -> 7088092 (-0.34%) instructions in affected programs: 1374915 -> 1350848 (-1.75%) helped: 7392 HURT: 621 GAINED: 2 LOST: 2	2016-03-30 13:12:34 -07:00
Jason Ekstrand	cf2257069c	nir/spirv: Set a default number of invocations for geometry shaders The SPIR-V spec says geometry shaders are supposed to have one invocation by default. The execution mode is only required if there are multiple invocations.	2016-03-29 20:30:27 -07:00
Jason Ekstrand	35e2e96b30	nir: Add a helper for getting the current block from a cursor Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	be98c47528	nir/lower_out_to_temp: Add an "entrypoint" parameter Previously, the pass assumed that the entrypoint would be whatever function happened to have the name "main". We really shouldn't trust in the function names. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	31a5bec93f	nir/lower_out_to_temp: Steal the output's constant initializer Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	38de85f9a5	nir: Add a helper for getting the unique function in a shader Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	49be812be6	nir/sweep: Sweep function parameters They are no longer in the list of local variables so we need to explicitly sweep them. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	1be4c61c95	nir/builder: Add a helper for creating undefs Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	6a2479d618	nir/builder: Add a helper for storing to variable derefs Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	77e2ac1da7	nir/builder: Add a helper for building fdot instructions Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	da422663a6	nir: Add a variable_foreach_safe helper Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	731870fbe3	nir/Makefile: Fix alphabetization Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	433cf90650	nir/spirv: Remove the NoContraction hack NIR now just handles this for us by not fusing if the multiply is marked as exact.	2016-03-28 13:07:39 -07:00
Jason Ekstrand	035f66025b	nir/search: Don't match inexact expressions with exact subexpressions In the first pass of implementing exact handling, I made a mistake with search-and-replace. In particular, we only really handled exact/inexact on the root of the tree. Instead, we need to check every node in the tree for an exact/inexact match. As an example of this, consider the following GLSL code precise float a = b + c; if (a < 0) { do_stuff(); } In that case, only the add will be declared "exact" and an expression that looks for "b + c < 0" will still match and replace it with "b < -c" which may yield different results. The solution is to simply bail if any of the values are exact when matching an inexact expression.	2016-03-28 13:07:39 -07:00
Jason Ekstrand	fbb9e1f008	spirv/alu: Add support for the NoContraction decoration	2016-03-25 21:35:41 -07:00
Jason Ekstrand	00fa795cd3	spirv/glsl: Add a helper for converting glsl opcodes into nir opcodes This is similar to the way that regular ALU operations are handled.	2016-03-25 21:35:41 -07:00
Jason Ekstrand	98522c1853	nir/spirv: Get rid of the spirv2nir helper binary This was useful once upon a time but now that we have a real Vulkan driver to run our SPIR-V binaries through, there's really no point.	2016-03-25 21:35:41 -07:00
Jason Ekstrand	13bad493b4	nir/algebraic: Get rid of a redundant copy of fdiv lowering	2016-03-25 14:04:05 -07:00
Jason Ekstrand	08fe89864b	nir/algebraic: Add better lowering of ldexp	2016-03-25 14:04:05 -07:00
Jason Ekstrand	b75d770963	nir/builder: Simplify nir_ssa_undef a bit	2016-03-25 14:04:05 -07:00
Jason Ekstrand	ab31951bef	nir/spirv: Use the nir_ssa_undef helper from nir_builder	2016-03-25 14:04:05 -07:00
Jason Ekstrand	d2eee52a65	nir/builder: Add a bit size field to nir_ssa_undef	2016-03-25 14:04:05 -07:00
Jason Ekstrand	b50f7f0011	nir: Add a better comment for INTRINSIC_RANGE	2016-03-25 14:04:05 -07:00
Jason Ekstrand	add8c837b5	nir/glsl: Stop carrying a pointer to the nir_shader in the visitor	2016-03-25 14:04:05 -07:00
Jason Ekstrand	2c3f95d6aa	Merge remote-tracking branch 'public/master' into vulkan	2016-03-24 17:30:14 -07:00
Jason Ekstrand	22b343a8ec	nir: Add a pass to inline functions This commit adds a new NIR pass that lowers all function calls away by inlining the functions. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	debf23ec68	nir/builder: Add helpers for easily inserting copy_var intrinsics Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	79dec93ead	nir: Add return lowering pass This commit adds a NIR pass for lowering away returns in functions. If the return is in a loop, it is lowered to a break. If it is not in a loop, it's lowered away by moving/deleting code as needed. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	8d61d72524	nir: Add a cursor helper for getting a cursor after any phi nodes Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	18b0166749	nir/builder: Add a helper for inserting jump instructions Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	97b663481c	nir/cf: Make extracting or re-inserting nothing a no-op Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	7022a673cd	nir: Add a function for comparing cursors Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	124f229ece	nir/cf: Handle relinking top-level blocks This can happen if a function ends in a return instruction and you remove the return. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	364212f1ed	nir: Add a pass to repair SSA form Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	ea98d415e4	nir/vars_to_ssa: Use the new nir_phi_builder helper The efficiency should be approximately the same. We do a little more work per phi node because we have to sort the predecessors. However, we no longer have to walk the blocks a second time to pop things off the stack. The bigger advantage, however, is that we can now re-use the phi placement and per-block SSA value tracking in other passes. As a side-benefit, the phi builder actually handles unreachable blocks correctly. The original vars_to_ssa code, because of the way it iterated the blocks and added phi sources, didn't add sources corresponding to predecessors of unreachable blocks. The new strategy employed by the phi builder creates a phi source for each predecessor and should correctly handle unreachable blocks by setting those sources to SSA undefs. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	42ddfc611f	nir/dominance: Handle unreachable blocks Previously, nir_dominance.c didn't properly handle unreachable blocks. This can happen if, for instance, you have something like this: loop { if (...) { break; } else { break; } } In this case, the block right after the if statement will be unreachable. This commit makes two changes to handle this. First, it removes an assert and allows block->imm_dom to be null if the block is unreachable. Second, it properly skips unreachable blocks in calc_dom_frontier_cb. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	e4dc82cfcf	nir: Add a phi node placement helper Right now, we have phi placement code in two places and there are other places where it would be nice to be able to do this analysis. Instead of repeating it all over the place, this commit adds a helper for placing all of the needed phi nodes for a value. v2: Add better documentation Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Rob Clark	0bea0e7141	nir: fix dangling ssadef->name ptrs In many places, the convention is to pass an existing ssadef name ptr when constructing/initializing a new nir_ssa_def. But that goes badly (as noticed by garbage in nir_print output) when the original string gets freed. Just use ralloc_strdup() instead, and add ralloc_free() in the two places that would care (not that the strings wouldn't eventually get freed anyways). Also fixup the nir_search code which was directly setting ssadef->name to use the parent instruction as memctx. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-24 08:30:04 -04:00
Jason Ekstrand	a984e44abd	nir/glsl: Propagate invariant into NIR alu ops Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:07 -07:00
Jason Ekstrand	91d6272c2b	nir/alu_to_scalar: Propagate the "exact" bit Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:06 -07:00
Jason Ekstrand	5f39e3e165	nir/cse: Properly handle nir_ssa_def.exact Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:06 -07:00
Jason Ekstrand	0dbda153aa	nir/algebraic: Flag inexact optimizations Many of our optimizations, while great for cutting shaders down to size, aren't really precision-safe. This commit tries to flag all of the inexact floating-point optimizations so they don't get run on values that are flagged "exact". It's a bit conservative and maybe flags some safe optimizations as unsafe but that's better than missing one. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:02 -07:00
Jason Ekstrand	ed3a029e80	nir/algebraic: Fix fmin detection to match the spec The previous transformation got the arguments to fmin backwards. When NaNs are involved, the GLSL min/max aren't commutative so it matters. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:00 -07:00
Jason Ekstrand	89545b1314	nir/algebraic: Get rid of an invalid fxor optimization The fxor opcode is required to return 1.0f or 0.0f but the input variable may not be 1.0f or 0.0f. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:27:58 -07:00
Jason Ekstrand	3a7cb6534c	nir/algebraic: Allow for flagging operations as being inexact Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:27:55 -07:00
Jason Ekstrand	a6f25fa7d7	nir/search: Propagate exactness into newly created expressions Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:27:52 -07:00
Jason Ekstrand	ded3133d47	nir/builder: Add a flag for setting exact Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:26:34 -07:00
Jason Ekstrand	4ff89377d9	nir: Add an "exact" bit to nir_alu_instr Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:26:34 -07:00
Jason Ekstrand	f849f53990	nir/clone: Export nir_variable_clone Reviewed-by: Rob Clark <robclark@gmail.com>	2016-03-23 15:26:11 -07:00
Jason Ekstrand	5fe8959912	nir/clone: Expose nir_constant_clone Reviewed-by: Rob Clark <robclark@gmail.com>	2016-03-23 15:26:08 -07:00
Jason Ekstrand	c4c373f156	nir: Fix whitespace Reviewed-by: Rob Clark <robclark@gmail.com>	2016-03-23 15:25:53 -07:00
Ian Romanick	d7a25a9def	nir: Don't abs slt and friends No shader-db changes, but this is symmetric with the previous commit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:48:02 -07:00
Ian Romanick	2bb006af68	nir: Don't abs the result of b2f or b2i In the results below, 2 SIMD16 shaders in Trine are lost. G4X total instructions in shared programs: 4012279 -> 4011108 (-0.03%) instructions in affected programs: 116776 -> 115605 (-1.00%) helped: 339 HURT: 0 total cycles in shared programs: 84315862 -> 84313584 (-0.00%) cycles in affected programs: 1767232 -> 1764954 (-0.13%) helped: 274 HURT: 81 Ironlake total instructions in shared programs: 6399073 -> 6396998 (-0.03%) instructions in affected programs: 218050 -> 215975 (-0.95%) helped: 600 HURT: 0 total cycles in shared programs: 128892088 -> 128888810 (-0.00%) cycles in affected programs: 2867452 -> 2864174 (-0.11%) helped: 422 HURT: 137 Sandy Bridge total instructions in shared programs: 8462174 -> 8460759 (-0.02%) instructions in affected programs: 178529 -> 177114 (-0.79%) helped: 596 HURT: 0 total cycles in shared programs: 117542276 -> 117534098 (-0.01%) cycles in affected programs: 1239166 -> 1230988 (-0.66%) helped: 369 HURT: 150 Ivy Bridge total instructions in shared programs: 7775131 -> 7773410 (-0.02%) instructions in affected programs: 162903 -> 161182 (-1.06%) helped: 590 HURT: 0 total cycles in shared programs: 65759882 -> 65747268 (-0.02%) cycles in affected programs: 1004354 -> 991740 (-1.26%) helped: 467 HURT: 141 Haswell total instructions in shared programs: 7107786 -> 7106327 (-0.02%) instructions in affected programs: 140954 -> 139495 (-1.04%) helped: 590 HURT: 0 total cycles in shared programs: 64668028 -> 64655322 (-0.02%) cycles in affected programs: 967080 -> 954374 (-1.31%) helped: 452 HURT: 149 LOST: 2 GAINED: 0 Broadwell total instructions in shared programs: 8980029 -> 8978287 (-0.02%) instructions in affected programs: 197232 -> 195490 (-0.88%) helped: 715 HURT: 0 total cycles in shared programs: 70070448 -> 70055970 (-0.02%) cycles in affected programs: 975724 -> 961246 (-1.48%) helped: 471 HURT: 111 LOST: 2 GAINED: 0 Skylake total instructions in shared programs: 9115178 -> 9113436 (-0.02%) instructions in affected programs: 203012 -> 201270 (-0.86%) helped: 715 HURT: 0 total cycles in shared programs: 68848660 -> 68834004 (-0.02%) cycles in affected programs: 993888 -> 979232 (-1.47%) helped: 473 HURT: 116 LOST: 2 GAINED: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:48:02 -07:00
Ian Romanick	348e5a71d8	nir: Simplify 0 < fabs(a) Sandy Bridge / Ivy Bridge / Haswell total instructions in shared programs: 8462180 -> 8462174 (-0.00%) instructions in affected programs: 564 -> 558 (-1.06%) helped: 6 HURT: 0 total cycles in shared programs: 117542462 -> 117542276 (-0.00%) cycles in affected programs: 9768 -> 9582 (-1.90%) helped: 12 HURT: 0 Broadwell / Skylake total instructions in shared programs: 8980833 -> 8980826 (-0.00%) instructions in affected programs: 626 -> 619 (-1.12%) helped: 7 HURT: 0 total cycles in shared programs: 70077900 -> 70077714 (-0.00%) cycles in affected programs: 9378 -> 9192 (-1.98%) helped: 12 HURT: 0 G45 and Ironlake showed no change. v2: Modify the comments to look more like a proof. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:47:56 -07:00
Ian Romanick	564a8b8a26	nir: Simplify 0 >= b2f(a) This also prevented some regressions with other patches in my local tree. Broadwell / Skylake total instructions in shared programs: 8980835 -> 8980833 (-0.00%) instructions in affected programs: 45 -> 43 (-4.44%) helped: 1 HURT: 0 total cycles in shared programs: 70077904 -> 70077900 (-0.00%) cycles in affected programs: 122 -> 118 (-3.28%) helped: 1 HURT: 0 No changes on earlier platforms. v2: Modify the comments to look more like a proof. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:44:57 -07:00
Ian Romanick	bf0d60aa11	nir: Simplify i2b with negated or abs operand This enables removing ssa_201 and ssa_202 in sequences like: vec1 ssa_200 = flt ssa_199, ssa_194 vec1 ssa_201 = b2i ssa_200 vec1 ssa_202 = i2b -ssa_201 shader-db results: Sandy Bridge total instructions in shared programs: 8462257 -> 8462180 (-0.00%) instructions in affected programs: 3846 -> 3769 (-2.00%) helped: 35 HURT: 0 total cycles in shared programs: 117542934 -> 117542462 (-0.00%) cycles in affected programs: 20072 -> 19600 (-2.35%) helped: 20 HURT: 1 Ivy Bridge total instructions in shared programs: 7775252 -> 7775137 (-0.00%) instructions in affected programs: 3645 -> 3530 (-3.16%) helped: 35 HURT: 0 total cycles in shared programs: 65760522 -> 65760068 (-0.00%) cycles in affected programs: 21082 -> 20628 (-2.15%) helped: 25 HURT: 2 Haswell total instructions in shared programs: 7108666 -> 7108589 (-0.00%) instructions in affected programs: 3253 -> 3176 (-2.37%) helped: 35 HURT: 0 total cycles in shared programs: 64675726 -> 64675272 (-0.00%) cycles in affected programs: 21034 -> 20580 (-2.16%) helped: 26 HURT: 1 Broadwell / Skylake total instructions in shared programs: 8980912 -> 8980835 (-0.00%) instructions in affected programs: 3223 -> 3146 (-2.39%) helped: 35 HURT: 0 total cycles in shared programs: 70077926 -> 70077904 (-0.00%) cycles in affected programs: 21886 -> 21864 (-0.10%) helped: 21 HURT: 6 G45 and Ironlake showed no change. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:43:28 -07:00
Ian Romanick	a4079f1cb2	nir: Lower flrp with Boolean interpolator to bcsel On Intel platforms that don't set lower_flrp, using bcsel instead of flrp seems to be a small amount worse. On those platforms, the use of flrp, bcsel, and multiply of b2f is still an active area of research. In review, Matt suggested this is because bcsel turns into CMP+SEL, and because of the flag register we can't schedule instructions well. shader-db results: G4X / Ironlake total instructions in shared programs: 4016538 -> 4012279 (-0.11%) instructions in affected programs: 161556 -> 157297 (-2.64%) helped: 1077 HURT: 1 total cycles in shared programs: 84328296 -> 84315862 (-0.01%) cycles in affected programs: 4174570 -> 4162136 (-0.30%) helped: 926 HURT: 53 Unsurprisingly, no changes on later platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:42:42 -07:00
Connor Abbott	58fe7837b8	nir: propagate bitsize information in nir_search When we replace an expression we have to compute bitsize information for the replacement. We do this in two passes to validate that bitsize information is consistent and correct: first we propagate bitsize from child nodes to parent, then we do it the other way around, starting from the original instruction's destination bitsize. v2 (Iago): - Always use nir_type_bool32 instead of nir_type_bool when generating algebraic optimizations. Before we used nir_type_bool32 with constants and nir_type_bool with variables. - Fix bool comparisons in nir_search.c to account for bitsized types. v3 (Sam): - Unpack the double constant value as unsigned long long (8 bytes) in nir_algebraic.py. v4 (Sam): - Use helpers to get type size and base type from nir_alu_type. Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:54:45 +01:00
Connor Abbott	3124ce699b	nir: add a bit_size parameter to nir_ssa_dest_init v2: Squash multiple commits addressing the new parameter in different files so we don't break the build (Iago) v3: Fix tgsi (Samuel) v4: Fix nir_clone.c (Samuel) v5: Fix vc4 and freedreno (Iago) v6 (Sam) - Fix build errors in nir_lower_indirect_derefs - Use helper to get type size from nir_alu_type. Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:54:45 +01:00
Iago Toral Quiroga	084b24f558	nir: rename nir_const_value fields to include bitsize information Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-17 11:16:33 +01:00
Connor Abbott	9076c4e289	nir: update opcode definitions for different bit sizes Some opcodes need explicit bitsizes, and sometimes we need to use the double version when constant folding. v2: fix output type for u2f (Iago) v3: do not change vecN opcodes to be float. The next commit will add infrastructure to enable 64-bit integer constant folding so this isn't really necessary. Also, that created problems with source modifiers in some cases (Iago) v4 (Jason): - do not change bcsel to work in terms of floats - leave ldexp generic Squashed changes to handle different bit sizes when constant folding since otherwise we would break the build. v2: - Use the bit-size information from the opcode information if defined (Iago) - Use helpers to get type size and base type of nir_alu_type enum (Sam) - Do not fall back to sized types to guess bit-size information. (Jason) Squashed changes in i965 and gallium/nir drivers to support sized types. These functions should only see sized types, but we can't make that change until we make sure that nir uses the sized versions in all the relevant places. A later commit will address this. Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Connor Abbott	6700d7e423	nir: add nir_{src,dest}_bit_size() helpers v2: use a ternary (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Jason Ekstrand	e172dbe5d2	nir: Add a bit_size to nir_register and nir_ssa_def This really hacky commit adds a bit size to registers and SSA values. It also adds rules in the validator to validate that they do the right things. It's still an open question as to whether or not we want a bit_size in nir_alu_instr or if we just want to let it inherit from the destination. I'm inclined to just let it inherit from the destination. A similar question needs to be asked about intrinsics. v2 (Connor): - Relax validation: comparisons have explicit destination sizes and implicit source sizes. v3 (Sam): - Use helpers to get size and base types of nir_alu_type enum. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Jason Ekstrand	78f1919429	nir: Add explicitly sized types v2: Fix size/type mask to properly handle 8-bit types. v3: Add helpers to get the bitsize and base type of a nir_alu_type enum. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Jordan Justen	3fd308a357	Merge remote-tracking branch 'origin/master' into vulkan	2016-03-17 01:44:07 -07:00
Jordan Justen	b1e7cdfdcf	nir: Lower shared var atomics during nir_lower_io Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	e3cbb9d37c	nir: Add support for lowering load/stores of shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	683c359c54	nir: Add atomic operations on variables This allows us to first generate atomic operations for shared variables using these opcodes, and then later we can lower those to the shared atomics intrinsics with nir_lower_io. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	3c807607df	nir: Add compute shader shared variable storage class Previously we were receiving shared variable accesses via a lowered intrinsic function from glsl. This change allows us to send in variables instead. For example, when converting from SPIR-V. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	26f8262698	nir/print: Add space after shader_storage var mode Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jason Ekstrand	7f6a0cb29c	Merge remote-tracking branch 'public/master' into vulkan	2016-03-15 14:09:50 -07:00
Jason Ekstrand	98d58e7320	nir/clone: Add support for cloning a single function_impl Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	036b209484	nir/validate: Better function validation Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	f86f3c90aa	nir/print: Better function argument printing Since we aren't going to put the function parameters or the return variable in the list of locals, it won't get a proper declaration. This changes nir_print to print the type along with each parameter or return variable. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	13969565f9	nir/print: Factor variable name lookup into a helper Otherwise, we have a problem when we go to print functions with arguments because their names get added to the hash table during declaration which happens after we print the prototype. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	e4bebe8a02	nir: Create function parameters in function_impl_create Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	066d3c115e	nir: Add a helper for creating a "bare" nir_function_impl Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	2ef4754a20	nir: Add a new "param" variable mode for parameters and return variables Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	41ae553fda	nir/glsl: Remove dead function parameter handling code NIR has never been used on IR where we haven't already done function inlining so this code has been dead from the beginning. Let's just get rid of it for now. We can always put it back in if we decide to use NIR for function inlining at some point in the future. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	14b18aba89	nir: Add a pass for lowering indirect variable dereferences This new pass lowers load/store_var intrinsics that act on indirect derefs to an if-ladder of direct load/store_var intrinsics. The if-ladders perform a simple binary search on the indirect. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-08 10:41:54 -08:00
Matt Turner	905ff86198	nir: Recognize open-coded extract_u16. No shader-db changes, but does recognize some extract_u16 which enables the next patch to optimize some code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-04 11:52:34 -08:00
Matt Turner	76289fbfa8	nir: Recognize open-coded extract_u8. Two shaders that appear in Unigine benchmarks (Heaven and Valley) unpack three bytes from an integer and convert each into a float: float((val >> 16u) & 0xffu) float((val >> 8u) & 0xffu) float((val >> 0u) & 0xffu) Instead of shifting, masking, and type converting like this: shr(8) g15<1>UD g25<8,8,1>UD 0x00000010UD and(8) g16<1>UD g15<8,8,1>UD 0x000000ffUD mov(8) g17<1>F g16<8,8,1>UD shr(8) g18<1>UD g25<8,8,1>UD 0x00000008UD and(8) g19<1>UD g18<8,8,1>UD 0x000000ffUD mov(8) g20<1>F g19<8,8,1>UD and(8) g21<1>UD g25<8,8,1>UD 0x000000ffUD mov(8) g22<1>F g21<8,8,1>UD i965 can simply extract a byte and convert to float in a single instruction: mov(8) g17<1>F g25.2<32,8,4>UB mov(8) g20<1>F g25.1<32,8,4>UB mov(8) g22<1>F g25.0<32,8,4>UB This patch implements the first step: recognizing byte extraction. A later patch will optimize out the conversion to float. instructions in affected programs: 28568 -> 27450 (-3.91%) helped: 7 cycles in affected programs: 210076 -> 203144 (-3.30%) helped: 7 This patch decreases the number of instructions in the two Unigine programs by: #1721: 4520 -> 4374 instructions (-3.23%) #1706: 3752 -> 3582 instructions (-4.53%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-04 11:52:34 -08:00
Kristian Høgsberg Kristensen	b00b42d99b	nir/spirv: Use the new bare sampler type	2016-02-28 11:24:05 -08:00
Jason Ekstrand	c9564fd598	nir/spirv: Allow but warn for a few capabilities Unfortunately, glslang gives us cull/clip distance and GS streams even if the shader doesn't use it whenever a shader is declared as version 450. This is a glslang bug, but we can easily enough ignore it for now.	2016-02-23 22:07:25 -08:00
Jason Ekstrand	040355b688	nir/spirv: Add more capabilities	2016-02-23 21:01:00 -08:00
Jason Ekstrand	f49ba0f7d8	nir/spirv: Add support for multisampled textures	2016-02-21 22:02:38 -08:00
Jason Ekstrand	79c0781f44	nir/gather_info: Count textures and images	2016-02-18 11:42:36 -08:00
Jason Ekstrand	581e4468f9	nir/spirv: Add some more capabilities	2016-02-17 18:04:39 -08:00
Jason Ekstrand	979732fafc	nir: Add a helper for getting the one function from a shader	2016-02-17 18:04:39 -08:00
Jason Ekstrand	8c05b44bbb	nir: Add a nir_foreach_variable_safe helper	2016-02-17 18:04:39 -08:00
Kristian Høgsberg Kristensen	b8da261dc7	spirv: Fix SpvOpFwidth, SpvOpFwidthFine and SpvOpFwidthCoarse "Result is the same as computing the sum of the absolute values of OpDPdx and OpDPdy on P." We were doing sum of absolute values of OpDPdx of P and OpDPdx of NULL.	2016-02-17 15:28:52 -08:00
Jason Ekstrand	88042b9f10	nir: Get rid of the C++ NIR_SRC/DEST_INIT macros These were originally added to reduce compiler warnings but aren't really needed. Getting rid of them reduces the diff between the Vulkan branch and master, so we might as well.	2016-02-12 21:35:02 -08:00
Jason Ekstrand	3c8dc1afd1	nir/spirv/glsl: Clean up the row-skipping swizzle logic a bit	2016-02-12 10:40:39 -08:00
Jason Ekstrand	4016619931	nir/spirv: Allow the clip distance capability.	2016-02-11 15:14:46 -08:00
Jason Ekstrand	f710f3ca37	Merge remote-tracking branch 'mesa-public/master' into vulkan This also reverts commit `1d65abfa58` because now NIR handles texture offsets in a much more sane way.	2016-02-10 17:12:11 -08:00
Jason Ekstrand	8750299a42	nir: Remove the const_offset from nir_tex_instr When NIR was originally drafted, there was no easy way to determine if something was constant or not. The result was that we had lots of special-casing for constant values such as this. Now that load_const instructions are SSA-only, it's really easy to find constants and this isn't really needed anymore. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robclark@gmail.com>	2016-02-10 16:33:50 -08:00
Jason Ekstrand	70dff4a55e	nir/lower_vec_to_movs: Better report channels handled by insert_mov This fixes two issues. First, we had a use-after-free in the case where the instruction got deleted and we tried to return mov->dest.write_mask. Second, in the case where we are doing a self-mov of a register, we delete those channels that are moved to themselves from the write-mask. This means that those channels aren't reported as being handled even though they are. We now stash off the write-mask before removing unneeded channels so that they still get reported as handled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94073 Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-02-10 16:33:14 -08:00
Jason Ekstrand	9be5a4bc29	nir/spirv: Fix handling of OpGroupMemberDecorate We were pulling the member index from the wrong dword	2016-02-10 15:36:42 -08:00
Jason Ekstrand	ac04c6de2c	nir/spirv: Assert that struct member ids are in-bounds	2016-02-10 15:36:41 -08:00
Mark Janes	8179834030	nir/spirv: fix build_mat_subdet stack smasher The sub-determinant implementation pattern fixed by `6a7e2904e0` has a second instance in the same file. With the previous algorithm, when row and j are both 3, the index overruns the array. This only impacts the stack on 32 bit builds. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-10 14:43:03 -08:00
Jason Ekstrand	09b3e30dc6	anv: Fix up spirv for new texture/sampler split stuff	2016-02-09 16:48:36 -08:00
Jason Ekstrand	b14f4c1fd3	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in the separate texture/sampler stuff from upstream	2016-02-09 16:47:37 -08:00
Jason Ekstrand	e01dd59b73	vtn: Use const_index helpers	2016-02-09 16:32:38 -08:00
Jason Ekstrand	768bd7f272	Merge commit '8b0fb1c152fe191768953aa8c77b89034a377f83' into vulkan This pulls in Rob Clark's const_index changes for NIR	2016-02-09 15:30:39 -08:00
Jason Ekstrand	5ec456375e	nir: Separate texture from sampler in nir_tex_instr This commit adds the capability to NIR to support separate textures and samplers. As it currently stands, glsl_to_nir only sets the texture deref and leaves the sampler deref alone as it did before and nir_lower_samplers assumes this. Backends can still assume that they are combined and only look at the texture index. Or, if they wish, they can assume that they are separate because nir_lower_samplers, tgsi_to_nir, and prog_to_nir all set both texture and sampler index whenever a sampler is required (the two indices are the same in this case). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	ee85014b90	nir/tex_instr: Rename sampler to texture We're about to separate the two concepts. When we do, the sampler will become optional. Doing a rename first makes the separation a bit more safe because drivers that depend on GLSL or TGSI behaviour will be fine to just use the texture index all the time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	3f42184994	nir: Add some braces around loops and ifs	2016-02-09 15:00:17 -08:00
Rob Clark	ced8d3e773	nir: use const_index helpers Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Rob Clark	b6cf98bc82	gtn: use const_index helpers Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Rob Clark	1df3ecc1b8	nir: const_index helpers Direct access to intr->const_index[n], where different slots have different meanings, is somewhat confusing. Instead, let's put some extra info in nir_intrinsic_infos[] about which slots map to what, and add some get/set helpers. The helpers validate that the field being accessed (base/writemask/etc) is applicable for the intrinsic opc, for some extra safety. And nir_print can use this to dump out decoded const_index fields. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Jason Ekstrand	1d65abfa58	nir/spirv: Better handle constant offsets in texture lookups	2016-02-09 10:29:05 -08:00
Jason Ekstrand	209820739b	nir/spirv: Set the vtn_mode and interface type for sampler parameters	2016-02-09 10:29:05 -08:00
Jason Ekstrand	de6c9c5f2e	nir/inline_functions: Don't shadow variables when it isn't needed Previously, in order to get things working, we just always shadowed variables. Now, we rewrite derefs whenever it's safe to do so and only shadow if we have an in or out variable that we write or read to respectively.	2016-02-09 10:29:05 -08:00
Jason Ekstrand	b6c00bfb03	nir: Rework function parameters	2016-02-09 10:29:05 -08:00
Timothy Arceri	1aae5e8ced	nir: remove unused nir_variable fields These are used in GLSL IR to remove unused varyings and match transform feedback variables. There is no need to use these in NIR. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:49:06 +11:00
Matt Turner	371c4b3c48	nir: Recognize open-coded bitfield_reverse. Helps 11 shaders in UnrealEngine4 demos. I seriously hope they would have given us bitfieldReverse() if we exposed GL 4.0 (but we do expose ARB_gpu_shader5, so why not use that anyway?). instructions in affected programs: 4875 -> 4633 (-4.96%) cycles in affected programs: 270516 -> 244516 (-9.61%) I suspect there's a lot of room to improve nir_search/opt_algebraic's handling of this. We'd actually like to match, e.g., step2 by matching step1 once and then doing a pointer comparison for the second instance of step1, but unfortunately we generate an enormous tuple instead. The .text size increases by 6.5% and the .data by 17.5%. text data bss dec hex filename 22957 45224 0 68181 10a55 nir_libnir_la-nir_opt_algebraic.o 24461 53160 0 77621 12f35 nir_libnir_la-nir_opt_algebraic.o I'd be happy to remove this if Unreal4 uses bitfieldReverse() if it is in a GL 4.0 context once we expose GL 4.0. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 21:20:58 -08:00
Matt Turner	2d0d9755da	nir: Handle large unsigned values in opt_algebraic. The next patch adds an algebraic rule that uses the constant 0xff00ff00. Without this change, the build fails with return hex(struct.unpack('I', struct.pack('i', self.value))[0]) struct.error: 'i' format requires -2147483648 <= number <= 2147483647 The hex() function handles integers of any size, and assigning a negative value to an unsigned does what we want in C. The pack/unpack is unnecessary (and as we see, buggy). Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>	2016-02-08 20:38:17 -08:00
Matt Turner	7be8d07732	nir: Do opt_algebraic in reverse order. Walking the SSA definitions in order means that we consider the smallest algebraic optimizations before larger optimizations. So if a smaller rule is part of a larger rule, the smaller one will happen first, preventing the larger one from happening. instructions in affected programs: 32721 -> 32611 (-0.34%) helped: 106 In programs whose nir_optimize loop count changes (129 of them): before: 1164 optimization loops after: 1071 optimization loops Of the 129 affected, 16 programs' optimization loop counts increased. Prevents regressions and annoyances in the next commits. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 20:38:17 -08:00
Matt Turner	a8f0960816	nir: Recognize product of open-coded pow()s. Prevents regressions in the next commit. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 20:38:17 -08:00
Matt Turner	9f02e3ab03	nir: Add opt_algebraic rules for xor with zero. instructions in affected programs: 668 -> 664 (-0.60%) helped: 4 Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 20:38:17 -08:00
Francisco Jerez	cec6fe2ad8	vtn: Clean up acos implementation. Parameterize build_asin() on the fit coefficients so the implementation can be shared while still using different polynomials for asin and acos. Also switch back to implementing acos in terms of asin -- The improvement obtained from cancelling out the pi/2 terms was negligible compared to the approximation error.	2016-02-08 15:23:43 -08:00
Francisco Jerez	f50a651726	nir/spirv: Create integer types of correct signedness. vtn_handle_type() creates a signed type regardless of the value of the signedness flag, which usually doesn't make much of a difference except when the type is used as base sampled type of an image type, which will cause the base type of the NIR image variable to be inconsistent with its format and cause an assertion failure in the back-end (most likely only reproducible on Gen7), and may change the semantics of the image intrinsic subtly (e.g. UMIN may become IMIN).	2016-02-08 15:23:35 -08:00
Jason Ekstrand	9401516113	Merge remote-tracking branch 'mesa-public/master' into vulkan	2016-02-05 15:21:11 -08:00
Jason Ekstrand	741744f691	Merge commit mesa-public/master into vulkan This pulls in the patches that move all of the compiler stuff around	2016-02-05 15:03:44 -08:00
Matt Turner	955d052058	nir: Add lowering support for unpacking opcodes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	9b8786eba9	nir: Add lowering support for packing opcodes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	68f8c5730b	nir: Add opcodes to extract bytes or words. The uint versions zero extend while the int versions sign extend. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	8709dc0713	glsl: Remove 2x16 half-precision pack/unpack opcodes. i965/fs was the only consumer, and we're now doing the lowering in NIR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	9ce901058f	nir: Add lowering of nir_op_unpack_half_2x16. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	140a886c41	nir: Make argument order of unop_convert match binop_convert. Strangely the return and parameter types were reversed. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Emil Velikov	eb63640c1d	glsl: move to compiler/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:08:33 +00:00
Emil Velikov	a39a8fbbaa	nir: move to compiler/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:08:30 +00:00


2288 Commits