KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Jason Ekstrand	b50f7f0011	nir: Add a better comment for INTRINSIC_RANGE	2016-03-25 14:04:05 -07:00
Jason Ekstrand	add8c837b5	nir/glsl: Stop carying a pointer to the nir_shader in the visitor	2016-03-25 14:04:05 -07:00
Jason Ekstrand	2c3f95d6aa	Merge remote-tracking branch 'public/master' into vulkan	2016-03-24 17:30:14 -07:00
Jason Ekstrand	22b343a8ec	nir: Add a pass to inline functions This commit adds a new NIR pass that lowers all function calls away by inlining the functions. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	debf23ec68	nir/builder: Add helpers for easily inserting copy_var intrinsics Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	79dec93ead	nir: Add return lowering pass This commit adds a NIR pass for lowering away returns in functions. If the return is in a loop, it is lowered to a break. If it is not in a loop, it's lowered away by moving/deleting code as needed. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	8d61d72524	nir: Add a cursor helper for getting a cursor after any phi nodes Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	18b0166749	nir/builder: Add a helper for inserting jump instructions Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	97b663481c	nir/cf: Make extracting or re-inserting nothing a no-op Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	7022a673cd	nir: Add a function for comparing cursors Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	124f229ece	nir/cf: Handle relinking top-level blocks This can happen if a function ends in a return instruction and you remove the return. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	364212f1ed	nir: Add a pass to repair SSA form Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	ea98d415e4	nir/vars_to_ssa: Use the new nir_phi_builder helper The efficiency should be approximately the same. We do a little more work per phi node because we have to sort the predecessors. However, we no longer have to walk the blocks a second time to pop things off the stack. The bigger advantage, however, is that we can now re-use the phi placement and per-block SSA value tracking in other passes. As a side-benifit, the phi builder actually handles unreachable blocks correctly. The original vars_to_ssa code, because of the way it iterated the blocks and added phi sources, didn't add sources corresponding to predecessors of unreachable blocks. The new strategy employed by the phi builder creates a phi source for each predecessor and should correctly handle unreachable blocks by setting those sources to SSA undefs. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	42ddfc611f	nir/dominance: Handle unreachable blocks Previously, nir_dominance.c didn't properly handle unreachable blocks. This can happen if, for instance, you have something like this: loop { if (...) { break; } else { break; } } In this case, the block right after the if statement will be unreachable. This commit makes two changes to handle this. First, it removes an assert and allows block->imm_dom to be null if the block is unreachable. Second, it properly skips unreachable blocks in calc_dom_frontier_cb. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	e4dc82cfcf	nir: Add a phi node placement helper Right now, we have phi placement code in two places and there are other places where it would be nice to be able to do this analysis. Instead of repeating it all over the place, this commit adds a helper for placing all of the needed phi nodes for a value. v2: Add better documentation Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Rob Clark	0bea0e7141	nir: fix dangling ssadef->name ptrs In many places, the convention is to pass an existing ssadef name ptr when construction/initializing a new nir_ssa_def. But that goes badly (as noticed by garbage in nir_print output) when the original string gets freed. Just use ralloc_strdup() instead, and add ralloc_free() in the two places that would care (not that the strings wouldn't eventually get freed anyways). Also fixup the nir_search code which was directly setting ssadef->name to use the parent instruction as memctx. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-24 08:30:04 -04:00
Jason Ekstrand	a984e44abd	nir/glsl: Propagate invariant into NIR alu ops Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:07 -07:00
Jason Ekstrand	91d6272c2b	nir/alu_to_scalar: Propagate the "exact" bit Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:06 -07:00
Jason Ekstrand	5f39e3e165	nir/cse: Properly handle nir_ssa_def.exact Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:06 -07:00
Jason Ekstrand	0dbda153aa	nir/algebraic: Flag inexact optimizations Many of our optimizations, while great for cutting shaders down to size, aren't really precision-safe. This commit tries to flag all of the inexact floating-point optimizations so they don't get run on values that are flagged "exact". It's a bit conservative and maybe flags some safe optimizations as unsafe but that's better than missing one. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:02 -07:00
Jason Ekstrand	ed3a029e80	nir/algebraic: Fix fmin detection to match the spec The previous transformation got the arguments to fmin backwards. When NaNs are involved, the GLSL min/max aren't commutative so it matters. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:00 -07:00
Jason Ekstrand	89545b1314	nir/algebraic: Get rid of an invlid fxor optimization The fxor opcode is required to return 1.0f or 0.0f but the input variable may not be 1.0f or 0.0f. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:27:58 -07:00
Jason Ekstrand	3a7cb6534c	nir/algebraic: Allow for flagging operations as being inexact Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:27:55 -07:00
Jason Ekstrand	a6f25fa7d7	nir/search: Propagate exactness into newly created expressions Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:27:52 -07:00
Jason Ekstrand	ded3133d47	nir/builder: Add a flag for setting exact Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:26:34 -07:00
Jason Ekstrand	4ff89377d9	nir: Add an "exact" bit to nir_alu_instr Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:26:34 -07:00
Jason Ekstrand	f849f53990	nir/clone: Export nir_variable_clone Reviewed-by: Rob Clark <robclark@gmail.com>	2016-03-23 15:26:11 -07:00
Jason Ekstrand	5fe8959912	nir/clone: Expose nir_constant_clone Reviewed-by: Rob Clark <robclark@gmail.com>	2016-03-23 15:26:08 -07:00
Jason Ekstrand	c4c373f156	nir: Fix whitespace Reviewed-by: Rob Clark <robclark@gmail.com>	2016-03-23 15:25:53 -07:00
Ian Romanick	d7a25a9def	nir: Don't abs slt and friends No shader-db changes, but this is symmetric with the previous commit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:48:02 -07:00
Ian Romanick	2bb006af68	nir: Don't abs the result of b2f or b2i In the results below, 2 SIMD16 shaders in Trine are lost. G4X total instructions in shared programs: 4012279 -> 4011108 (-0.03%) instructions in affected programs: 116776 -> 115605 (-1.00%) helped: 339 HURT: 0 total cycles in shared programs: 84315862 -> 84313584 (-0.00%) cycles in affected programs: 1767232 -> 1764954 (-0.13%) helped: 274 HURT: 81 Ironlake total instructions in shared programs: 6399073 -> 6396998 (-0.03%) instructions in affected programs: 218050 -> 215975 (-0.95%) helped: 600 HURT: 0 total cycles in shared programs: 128892088 -> 128888810 (-0.00%) cycles in affected programs: 2867452 -> 2864174 (-0.11%) helped: 422 HURT: 137 Sandy Bridge total instructions in shared programs: 8462174 -> 8460759 (-0.02%) instructions in affected programs: 178529 -> 177114 (-0.79%) helped: 596 HURT: 0 total cycles in shared programs: 117542276 -> 117534098 (-0.01%) cycles in affected programs: 1239166 -> 1230988 (-0.66%) helped: 369 HURT: 150 Ivy Bridge total instructions in shared programs: 7775131 -> 7773410 (-0.02%) instructions in affected programs: 162903 -> 161182 (-1.06%) helped: 590 HURT: 0 total cycles in shared programs: 65759882 -> 65747268 (-0.02%) cycles in affected programs: 1004354 -> 991740 (-1.26%) helped: 467 HURT: 141 Haswell total instructions in shared programs: 7107786 -> 7106327 (-0.02%) instructions in affected programs: 140954 -> 139495 (-1.04%) helped: 590 HURT: 0 total cycles in shared programs: 64668028 -> 64655322 (-0.02%) cycles in affected programs: 967080 -> 954374 (-1.31%) helped: 452 HURT: 149 LOST: 2 GAINED: 0 Broadwell total instructions in shared programs: 8980029 -> 8978287 (-0.02%) instructions in affected programs: 197232 -> 195490 (-0.88%) helped: 715 HURT: 0 total cycles in shared programs: 70070448 -> 70055970 (-0.02%) cycles in affected programs: 975724 -> 961246 (-1.48%) helped: 471 HURT: 111 LOST: 2 GAINED: 0 Skylake total instructions in shared programs: 9115178 -> 9113436 (-0.02%) instructions in affected programs: 203012 -> 201270 (-0.86%) helped: 715 HURT: 0 total cycles in shared programs: 68848660 -> 68834004 (-0.02%) cycles in affected programs: 993888 -> 979232 (-1.47%) helped: 473 HURT: 116 LOST: 2 GAINED: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:48:02 -07:00
Ian Romanick	348e5a71d8	nir: Simplify 0 < fabs(a) Sandy Bridge / Ivy Bridge / Haswell total instructions in shared programs: 8462180 -> 8462174 (-0.00%) instructions in affected programs: 564 -> 558 (-1.06%) helped: 6 HURT: 0 total cycles in shared programs: 117542462 -> 117542276 (-0.00%) cycles in affected programs: 9768 -> 9582 (-1.90%) helped: 12 HURT: 0 Broadwell / Skylake total instructions in shared programs: 8980833 -> 8980826 (-0.00%) instructions in affected programs: 626 -> 619 (-1.12%) helped: 7 HURT: 0 total cycles in shared programs: 70077900 -> 70077714 (-0.00%) cycles in affected programs: 9378 -> 9192 (-1.98%) helped: 12 HURT: 0 G45 and Ironlake showed no change. v2: Modify the comments to look more like a proof. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:47:56 -07:00
Ian Romanick	564a8b8a26	nir: Simplify 0 >= b2f(a) This also prevented some regressions with other patches in my local tree. Broadwell / Skylake total instructions in shared programs: 8980835 -> 8980833 (-0.00%) instructions in affected programs: 45 -> 43 (-4.44%) helped: 1 HURT: 0 total cycles in shared programs: 70077904 -> 70077900 (-0.00%) cycles in affected programs: 122 -> 118 (-3.28%) helped: 1 HURT: 0 No changes on earlier platforms. v2: Modify the comments to look more like a proof. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:44:57 -07:00
Ian Romanick	bf0d60aa11	nir: Simplify i2b with negated or abs operand This enables removing ssa_201 and ssa_202 in sequences like: vec1 ssa_200 = flt ssa_199, ssa_194 vec1 ssa_201 = b2i ssa_200 vec1 ssa_202 = i2b -ssa_201 shader-db results: Sandy Bridge total instructions in shared programs: 8462257 -> 8462180 (-0.00%) instructions in affected programs: 3846 -> 3769 (-2.00%) helped: 35 HURT: 0 total cycles in shared programs: 117542934 -> 117542462 (-0.00%) cycles in affected programs: 20072 -> 19600 (-2.35%) helped: 20 HURT: 1 Ivy Bridge total instructions in shared programs: 7775252 -> 7775137 (-0.00%) instructions in affected programs: 3645 -> 3530 (-3.16%) helped: 35 HURT: 0 total cycles in shared programs: 65760522 -> 65760068 (-0.00%) cycles in affected programs: 21082 -> 20628 (-2.15%) helped: 25 HURT: 2 Haswell total instructions in shared programs: 7108666 -> 7108589 (-0.00%) instructions in affected programs: 3253 -> 3176 (-2.37%) helped: 35 HURT: 0 total cycles in shared programs: 64675726 -> 64675272 (-0.00%) cycles in affected programs: 21034 -> 20580 (-2.16%) helped: 26 HURT: 1 Broadwell / Skylake total instructions in shared programs: 8980912 -> 8980835 (-0.00%) instructions in affected programs: 3223 -> 3146 (-2.39%) helped: 35 HURT: 0 total cycles in shared programs: 70077926 -> 70077904 (-0.00%) cycles in affected programs: 21886 -> 21864 (-0.10%) helped: 21 HURT: 6 G45 and Ironlake showed no change. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:43:28 -07:00
Ian Romanick	a4079f1cb2	nir: Lower flrp with Boolean interpolator to bcsel On Intel platforms that don't set lower_flrp, using bcsel instead of flrp seems to be a small amount worse. On those platforms, the use of flrp, bcsel, and multiply of b2f is still an active area of research. In review, Matt suggested this is because bcsel turns into CMP+SEL, and because of the flag register we can't schedule instructions well. shader-db results: G4X / Ironlake total instructions in shared programs: 4016538 -> 4012279 (-0.11%) instructions in affected programs: 161556 -> 157297 (-2.64%) helped: 1077 HURT: 1 total cycles in shared programs: 84328296 -> 84315862 (-0.01%) cycles in affected programs: 4174570 -> 4162136 (-0.30%) helped: 926 HURT: 53 Unsurprisingly, no changes on later platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:42:42 -07:00
Connor Abbott	58fe7837b8	nir: propagate bitsize information in nir_search When we replace an expresion we have to compute bitsize information for the replacement. We do this in two passes to validate that bitsize information is consistent and correct: first we propagate bitsize from child nodes to parent, then we do it the other way around, starting from the original's instruction destination bitsize. v2 (Iago): - Always use nir_type_bool32 instead of nir_type_bool when generating algebraic optimizations. Before we used nir_type_bool32 with constants and nir_type_bool with variables. - Fix bool comparisons in nir_search.c to account for bitsized types. v3 (Sam): - Unpack the double constant value as unsigned long long (8 bytes) in nir_algrebraic.py. v4 (Sam): - Use helpers to get type size and base type from nir_alu_type. Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:54:45 +01:00
Connor Abbott	3124ce699b	nir: add a bit_size parameter to nir_ssa_dest_init v2: Squash multiple commits addressing the new parameter in different files so we don't break the build (Iago) v3: Fix tgsi (Samuel) v4: Fix nir_clone.c (Samuel) v5: Fix vc4 and freedreno (Iago) v6 (Sam) - Fix build errors in nir_lower_indirect_derefs - Use helper to get type size from nir_alu_type. Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:54:45 +01:00
Iago Toral Quiroga	084b24f558	nir: rename nir_const_value fields to include bitsize information Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-17 11:16:33 +01:00
Connor Abbott	9076c4e289	nir: update opcode definitions for different bit sizes Some opcodes need explicit bitsizes, and sometimes we need to use the double version when constant folding. v2: fix output type for u2f (Iago) v3: do not change vecN opcodes to be float. The next commit will add infrastructure to enable 64-bit integer constant folding so this is isn't really necessary. Also, that created problems with source modifiers in some cases (Iago) v4 (Jason): - do not change bcsel to work in terms of floats - leave ldexp generic Squashed changes to handle different bit sizes when constant folding since otherwise we would break the build. v2: - Use the bit-size information from the opcode information if defined (Iago) - Use helpers to get type size and base type of nir_alu_type enum (Sam) - Do not fallback to sized types to guess bit-size information. (Jason) Squashed changes in i965 and gallium/nir drivers to support sized types. These functions should only see sized types, but we can't make that change until we make sure that nir uses the sized versions in all the relevant places. A later commit will address this. Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Connor Abbott	6700d7e423	nir: add nir_{src,dest}_bit_size() helpers v2: use a ternary (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Jason Ekstrand	e172dbe5d2	nir: Add a bit_size to nir_register and nir_ssa_def This really hacky commit adds a bit size to registers and SSA values. It also adds rules in the validator to validate that they do the right things. It's still an open question as to whether or not we want a bit_size in nir_alu_instr or if we just want to let it inherit from the destination. I'm inclined to just let it inherit from the destination. A similar question needs to be asked about intrinsics. v2 (Connor): - Relax validation: comparisons have explicit destination sizes and implicit source sizes. v3 (Sam): - Use helpers to get size and base types of nir_alu_type enum. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Jason Ekstrand	78f1919429	nir: Add explicitly sized types v2: Fix size/type mask to properly handle 8-bit types. v3: Add helpers to get the bitsize and base type of a nir_alu_type enum. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Jordan Justen	3fd308a357	Merge remote-tracking branch 'origin/master' into vulkan	2016-03-17 01:44:07 -07:00
Jordan Justen	b1e7cdfdcf	nir: Lower shared var atomics during nir_lower_io Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	e3cbb9d37c	nir: Add support for lowering load/stores of shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	683c359c54	nir: Add atomic operations on variables This allows us to first generate atomic operations for shared variables using these opcodes, and then later we can lower those to the shared atomics intrinsics with nir_lower_io. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	3c807607df	nir: Add compute shader shared variable storage class Previously we were receiving shared variable accesses via a lowered intrinsic function from glsl. This change allows us to send in variables instead. For example, when converting from SPIR-V. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	26f8262698	nir/print: Add space after shader_storage var mode Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jason Ekstrand	7f6a0cb29c	Merge remote-tracking branch 'public/master' into vulkan	2016-03-15 14:09:50 -07:00
Jason Ekstrand	98d58e7320	nir/clone: Add support for cloning a single function_impl Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	036b209484	nir/validate: Better function validation Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	f86f3c90aa	nir/print: Better function argument printing Since we aren't going to put the function parameters or the return variable in the list of locals, it won't get a proper declaration. This changes nir_print to print the type along with each parameter or return variable. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	13969565f9	nir/print: Factor variable name lookup into a helper Otherwise, we have a problem when we go to print functions with arguments because their names get added to the hash table during declaration which happens after we print the prototype. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	e4bebe8a02	nir: Create function parameters in function_impl_create Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	066d3c115e	nir: Add a helper for creating a "bare" nir_function_impl Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	2ef4754a20	nir: Add a new "param" variable mode for parameters and return variables Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	41ae553fda	nir/glsl: Remove dead function parameter handling code NIR has never been used on IR where we haven't already done function inlining so this code has been dead from the beginning. Let's just get rid of it for now. We can always put it back in if we decide to use NIR for function inlining at some point in the future. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	14b18aba89	nir: Add a pass for lower indirect variable dereferences This new pass lowers load/store_var intrinsics that act on indirect derefs to if-ladder of direct load/store_var intrinsics. The if-ladders perform a simple binary search on the indirect. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-08 10:41:54 -08:00
Matt Turner	905ff86198	nir: Recognize open-coded extract_u16. No shader-db changes, but does recognize some extract_u16 which enables the next patch to optimize some code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-04 11:52:34 -08:00
Matt Turner	76289fbfa8	nir: Recognize open-coded extract_u8. Two shaders that appear in Unigine benchmarks (Heaven and Valley) unpack three bytes from an integer and convert each into a float: float((val >> 16u) & 0xffu) float((val >> 8u) & 0xffu) float((val >> 0u) & 0xffu) Instead of shifting, masking, and type converting like this: shr(8) g15<1>UD g25<8,8,1>UD 0x00000010UD and(8) g16<1>UD g15<8,8,1>UD 0x000000ffUD mov(8) g17<1>F g16<8,8,1>UD shr(8) g18<1>UD g25<8,8,1>UD 0x00000008UD and(8) g19<1>UD g18<8,8,1>UD 0x000000ffUD mov(8) g20<1>F g19<8,8,1>UD and(8) g21<1>UD g25<8,8,1>UD 0x000000ffUD mov(8) g22<1>F g21<8,8,1>UD i965 can simply extract a byte and convert to float in a single instruction: mov(8) g17<1>F g25.2<32,8,4>UB mov(8) g20<1>F g25.1<32,8,4>UB mov(8) g22<1>F g25.0<32,8,4>UB This patch implements the first step: recognizing byte extraction. A later patch will optimize out the conversion to float. instructions in affected programs: 28568 -> 27450 (-3.91%) helped: 7 cycles in affected programs: 210076 -> 203144 (-3.30%) helped: 7 This patch decreases the number of instructions in the two Unigine programs by: #1721: 4520 -> 4374 instructions (-3.23%) #1706: 3752 -> 3582 instructions (-4.53%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-04 11:52:34 -08:00
Kristian Høgsberg Kristensen	b00b42d99b	nir/spirv: Use the new bare sampler type	2016-02-28 11:24:05 -08:00
Jason Ekstrand	c9564fd598	nir/spirv: Allow but warn for a few capabilities Unfortunately, glslang gives us cull/clip distance and GS streams even if the shader doesn't use it whenever a shader is declared as version 450. This is a glslang bug, but we can easily enough ignore it for now.	2016-02-23 22:07:25 -08:00
Jason Ekstrand	040355b688	nir/spirv: Add more capabilities	2016-02-23 21:01:00 -08:00
Jason Ekstrand	f49ba0f7d8	nir/spirv: Add support for multisampled textures	2016-02-21 22:02:38 -08:00
Jason Ekstrand	79c0781f44	nir/gather_info: Count textures and images	2016-02-18 11:42:36 -08:00
Jason Ekstrand	581e4468f9	nir/spirv: Add some more capabilities	2016-02-17 18:04:39 -08:00
Jason Ekstrand	979732fafc	nir: Add a helper for getting the one function from a shader	2016-02-17 18:04:39 -08:00
Jason Ekstrand	8c05b44bbb	nir: Add a nir_foreach_variable_safe helper	2016-02-17 18:04:39 -08:00
Kristian Høgsberg Kristensen	b8da261dc7	spirv: Fix SpvOpFwidth, SpvOpFwidthFine and SpvOpFwidthCoarse "Result is the same as computing the sum of the absolute values of OpDPdx and OpDPdy on P." We were doing sum of absolute values of OpDPdx of P and OpDPdx of NULL.	2016-02-17 15:28:52 -08:00
Jason Ekstrand	88042b9f10	nir: Get rid of the C++ NIR_SRC/DEST_INIT macros These were originally added to reduce compiler warnings but aren't really needed. Getting rid of them reduces the diff between the Vulkan branch and master, so we might as well.	2016-02-12 21:35:02 -08:00
Jason Ekstrand	3c8dc1afd1	nir/spirv/glsl: Clean up the row-skipping swizzle logic a bit	2016-02-12 10:40:39 -08:00
Jason Ekstrand	4016619931	nir/spirv: Allow the clip distance capability.	2016-02-11 15:14:46 -08:00
Jason Ekstrand	f710f3ca37	Merge remote-tracking branch 'mesa-public/master' into vulkan This also reverts commit `1d65abfa58` because now NIR handles texture offsets in a much more sane way.	2016-02-10 17:12:11 -08:00
Jason Ekstrand	8750299a42	nir: Remove the const_offset from nir_tex_instr When NIR was originally drafted, there was no easy way to determine if something was constant or not. The result was that we had lots of special-casing for constant values such as this. Now that load_const instructions are SSA-only, it's really easy to find constants and this isn't really needed anymore. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robclark@gmail.com>	2016-02-10 16:33:50 -08:00
Jason Ekstrand	70dff4a55e	nir/lower_vec_to_movs: Better report channels handled by insert_mov This fixes two issues. First, we had a use-after-free in the case where the instruction got deleted and we tried to return mov->dest.write_mask. Second, in the case where we are doing a self-mov of a register, we delete those channels that are moved to themselves from the write-mask. This means that those channels aren't reported as being handled even though they are. We now stash off the write-mask before remove unneeded channels so that they still get reported as handled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94073 Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-02-10 16:33:14 -08:00
Jason Ekstrand	9be5a4bc29	nir/spirv: Fix handling of OpGroupMemberDecorate We were pulling the member index from the wrong dword	2016-02-10 15:36:42 -08:00
Jason Ekstrand	ac04c6de2c	nir/spirv: Assert that struct member ids are in-bounds	2016-02-10 15:36:41 -08:00
Mark Janes	8179834030	nir/spirv: fix build_mat_subdet stack smasher The sub-determinate implementation pattern fixed by `6a7e2904e0` has a second instance in the same file. With the previous algorithm, when row and j are both 3, the index overruns the array. This only impacts the stack on 32 bit builds. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-10 14:43:03 -08:00
Jason Ekstrand	09b3e30dc6	anv: Fix up spirv for new texture/sampler split stuff	2016-02-09 16:48:36 -08:00
Jason Ekstrand	b14f4c1fd3	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in the separate texture/sampler stuff from upstream	2016-02-09 16:47:37 -08:00
Jason Ekstrand	e01dd59b73	vtn: Use const_index helpers	2016-02-09 16:32:38 -08:00
Jason Ekstrand	768bd7f272	Merge commit '8b0fb1c152fe191768953aa8c77b89034a377f83' into vulkan This pulls in Rob Clark's const_index changes for NIR	2016-02-09 15:30:39 -08:00
Jason Ekstrand	5ec456375e	nir: Separate texture from sampler in nir_tex_instr This commit adds the capability to NIR to support separate textures and samplers. As it currently stands, glsl_to_nir only sets the texture deref and leaves the sampler deref alone as it did before and nir_lower_samplers assumes this. Backends can still assume that they are combined and only look at only at the texture index. Or, if they wish, they can assume that they are separate because nir_lower_samplers, tgsi_to_nir, and prog_to_nir all set both texture and sampler index whenever a sampler is required (the two indices are the same in this case). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	ee85014b90	nir/tex_instr: Rename sampler to texture We're about to separate the two concepts. When we do, the sampler will become optional. Doing a rename first makes the separation a bit more safe because drivers that depend on GLSL or TGSI behaviour will be fine to just use the texture index all the time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	3f42184994	nir: Add some braces around loops and ifs	2016-02-09 15:00:17 -08:00
Rob Clark	ced8d3e773	nir: use const_index helpers Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Rob Clark	b6cf98bc82	gtn: use const_index helpers Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Rob Clark	1df3ecc1b8	nir: const_index helpers Direct access to intr->const_index[n], where different slots have different meanings, is somewhat confusing. Instead, let's put some extra info in nir_intrinsic_infos[] about which slots map to what, and add some get/set helpers. The helpers validate that the field being accessed (base/writemask/etc) is applicable for the intrinsic opc, for some extra safety. And nir_print can use this to dump out decoded const_index fields. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Jason Ekstrand	1d65abfa58	nir/spirv: Better handle constant offsets in texture lookups	2016-02-09 10:29:05 -08:00
Jason Ekstrand	209820739b	nir/spirv: Set the vtn_mode and interface type for sampler parameters	2016-02-09 10:29:05 -08:00
Jason Ekstrand	de6c9c5f2e	nir/inline_functions: Don't shadown variables when it isn't needed Previously, in order to get things working, we just always shadowed variables. Now, we rewrite derefs whenever it's safe to do so and only shadow if we have an in or out variable that we write or read to respectively.	2016-02-09 10:29:05 -08:00
Jason Ekstrand	b6c00bfb03	nir: Rework function parameters	2016-02-09 10:29:05 -08:00
Timothy Arceri	1aae5e8ced	nir: remove unused nir_variable fields These are used in GLSL IR to removed unused varyings and match transform feedback variables. There is no need to use these in NIR. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:49:06 +11:00
Matt Turner	371c4b3c48	nir: Recognize open-coded bitfield_reverse. Helps 11 shaders in UnrealEngine4 demos. I seriously hope they would have given us bitfieldReverse() if we exposed GL 4.0 (but we do expose ARB_gpu_shader5, so why not use that anyway?). instructions in affected programs: 4875 -> 4633 (-4.96%) cycles in affected programs: 270516 -> 244516 (-9.61%) I suspect there's a lot of room to improve nir_search/opt_algebraic's handling of this. We'd actually like to match, e.g., step2 by matching step1 once and then doing a pointer comparison for the second instance of step1, but unfortunately we generate an enormous tuple for instead. The .text size increases by 6.5% and the .data by 17.5%. text data bss dec hex filename 22957 45224 0 68181 10a55 nir_libnir_la-nir_opt_algebraic.o 24461 53160 0 77621 12f35 nir_libnir_la-nir_opt_algebraic.o I'd be happy to remove this if Unreal4 uses bitfieldReverse() if it is in a GL 4.0 context once we expose GL 4.0. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 21:20:58 -08:00
Matt Turner	2d0d9755da	nir: Handle large unsigned values in opt_algebraic. The next patch adds an algebraic rule that uses the constant 0xff00ff00. Without this change, the build fails with return hex(struct.unpack('I', struct.pack('i', self.value))[0]) struct.error: 'i' format requires -2147483648 <= number <= 2147483647 The hex() function handles integers of any size, and assigning a negative value to an unsigned does what we want in C. The pack/unpack is unnecessary (and as we see, buggy). Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>	2016-02-08 20:38:17 -08:00
Matt Turner	7be8d07732	nir: Do opt_algebraic in reverse order. Walking the SSA definitions in order means that we consider the smallest algebraic optimizations before larger optimizations. So if a smaller rule is part of a larger rule, the smaller one will happen first, preventing the larger one from happening. instructions in affected programs: 32721 -> 32611 (-0.34%) helped: 106 In programs whose nir_optimize loop count changes (129 of them): before: 1164 optimization loops after: 1071 optimization loops Of the 129 affected, 16 programs' optimization loop counts increased. Prevents regressions and annoyances in the next commits. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 20:38:17 -08:00
Matt Turner	a8f0960816	nir: Recognize product of open-coded pow()s. Prevents regressions in the next commit. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 20:38:17 -08:00
Matt Turner	9f02e3ab03	nir: Add opt_algebraic rules for xor with zero. instructions in affected programs: 668 -> 664 (-0.60%) helped: 4 Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 20:38:17 -08:00
Francisco Jerez	cec6fe2ad8	vtn: Clean up acos implementation. Parameterize build_asin() on the fit coefficients so the implementation can be shared while still using different polynomials for asin and acos. Also switch back to implementing acos in terms of asin -- The improvement obtained from cancelling out the pi/2 terms was negligible compared to the approximation error.	2016-02-08 15:23:43 -08:00
Francisco Jerez	f50a651726	nir/spirv: Create integer types of correct signedness. vtn_handle_type() creates a signed type regardless of the value of the signedness flag, which usually doesn't make much of a difference except when the type is used as base sampled type of an image type, what will cause the base type of the NIR image variable to be inconsistent with its format and cause an assertion failure in the back-end (most likely only reproducible on Gen7), and may change the semantics of the image intrinsic subtly (e.g. UMIN may become IMIN).	2016-02-08 15:23:35 -08:00
Jason Ekstrand	9401516113	Merge remote-tracking branch 'mesa-public/master' into vulkan	2016-02-05 15:21:11 -08:00
Jason Ekstrand	741744f691	Merge commit mesa-public/master into vulkan This pulls in the patches that move all of the compiler stuff around	2016-02-05 15:03:44 -08:00
Matt Turner	955d052058	nir: Add lowering support for unpacking opcodes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	9b8786eba9	nir: Add lowering support for packing opcodes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	68f8c5730b	nir: Add opcodes to extract bytes or words. The uint versions zero extend while the int versions sign extend. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	8709dc0713	glsl: Remove 2x16 half-precision pack/unpack opcodes. i965/fs was the only consumer, and we're now doing the lowering in NIR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	9ce901058f	nir: Add lowering of nir_op_unpack_half_2x16. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	140a886c41	nir: Make argument order of unop_convert match binop_convert. Strangely the return and parameter types were reversed. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Emil Velikov	eb63640c1d	glsl: move to compiler/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:08:33 +00:00
Emil Velikov	a39a8fbbaa	nir: move to compiler/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:08:30 +00:00

... 27 28 29 30 31

1510 Commits