KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Iago Toral Quiroga	9bb7d9ecf8	nir: Implement __intrinsic_store_ssbo v2 (Connor): - Make the STORE() macro take arguments for the extra sources (and their size) and any extra indices required. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez	003ce30e36	nir: Implement ir_unop_get_buffer_size This is how backends provide the buffer size required to compute the size of unsized arrays in the previous patch Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Kenneth Graunke	542d40d698	nir: Add new GS intrinsics that maintain a count of emitted vertices. This patch also introduces a lowering pass to convert the simple GS intrinsics to the new ones. See the comments above that for the rationale behind the new intrinsics. This should be useful for i965; it's a generic enough mechanism that I could see other drivers potentially using it as well, so I don't feel too bad about putting it in the generic code. v2: - Use nir_after_block_before_jump for the cursor (caught by Jason Ekstrand - I'd mistakenly used nir_after_block when rebasing this code onto the new NIR control flow API). - Remove the old emit_vertex intrinsic at the end, rather than in the middle (requested by Jason). - Use state->... directly rather than locals (requested by Jason). - Report progress from nir_lower_gs_intrinsics() (requested by me). - Remove "Authors:" section from file comment (requested by Michael Schellenberger Costa). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 11:00:00 -07:00
Kenneth Graunke	0a040975ec	nir: Add unit tests for control flow graphs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com> Acked-by: Connor Abbott <cwabbott0@gmail.com>	2015-09-23 11:00:00 -07:00
Kenneth Graunke	fbaa1b19d7	nir/cf: Fix dominance metadata in the dead control flow pass. The NIR control flow modification API churns the block structure, splitting blocks, stitching them back together, and so on. Preserving information about block dominance is hard (and probably not worthwhile). This patch makes nir_cf_extract() throw away all metadata, like we do when adding/removing jumps. We then make the dead control flow pass compute dominance information right before it uses it. This is necessary because earlier work by the pass may have invalidated it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 11:00:00 -07:00
Kenneth Graunke	6560838703	nir/cf: Fix unlink_block_successors to actually unlink the second one. Calling unlink_blocks(block, block->successors[0]) will successfully unlink the first successor, but then will shift block->successors[1] down to block->successors[0]. So the successors[1] != NULL check will always fail. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 11:00:00 -07:00
Kenneth Graunke	024e5ec977	nir/cf: Alter block successors before adding a fake link. Consider the case of "while (...) { break }". Or in NIR: block block_0 (0x7ab640): ... /* succs: block_1 */ loop { block block_1: /* preds: block_0 */ break /* succs: block_2 */ } block block_2: Calling nir_handle_remove_jump(block_1, nir_jump_break) will remove the break. Unfortunately, it would mangle the predecessors and successors. Here, block_2->predecessors->entries == 1, so we would create a fake link, setting block_1->successors[1] = block_2, and adding block_1 to block_2's predecessor set. This is illegal: a block cannot specify the same successor twice. In particular, adding the predecessor would have no effect, as it was already present in the set. We'd then call unlink_block_successors(), which would delete the fake link and remove block_1 from block_2's predecessor set. It would then delete successors[0], and attempt to remove block_1 from block_2's predecessor set a second time...except that it wouldn't be present, triggering an assertion failure. The fix appears to be simple: simply unlink the block's successors and recreate them to point at the correct blocks first. Then, add the fake link. In the above example, removing the break would cause block_1 to have itself as a successor (as it becomes an infinite loop), so adding the fake link won't cause a duplicate successor. v2: Add comments (requested by Connor Abbott) and fix commit message. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 10:59:59 -07:00
Kenneth Graunke	0991b2eb35	nir/cf: Conditionally do block_add_normal_succs() in unlink_jump(); There is a bug where we mess up predecessors/successors due to the ordering of unlinking/recreating edges/adding fake edges. In order to fix that, I need everything in one routine. However, calling block_add_normal_succs() isn't safe from cleanup_cf_node() - it would crash trying to insert phi undefs. So unfortunately I need to add a parameter. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 10:59:59 -07:00
Kenneth Graunke	9674c76c0e	nir/cf: Don't break outer-block successors in split_block_beginning(). Consider the following NIR: block block_0; /* succs: block_1 block_2 */ if (...) { block block_1; ... } else { block block_2; } Calling split_block_beginning() on block_1 would break block_0's successors: link_block() sets both successors of a block, so calling link_block(block_0, new_block, NULL) would throw away the second successor, leaving only /* succ: new_block */. This is invalid: the block before an if statement must have two successors. Changing the call to link_block(pred, new_block, pred->successors[0]) would correctly leave both successors in place, but because unlink_block may shift successors[1] to successors[0], it may not preserve the original order. NIR maintains a convention that successors[0] must point to the "then" block, while successors[1] points to the "else" block, so we need to take care to preserve this ordering. This patch creates a new function that swaps out one successor for another, preserving the ordering. It then uses this to fix the issue. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 10:59:59 -07:00
Kenneth Graunke	e2637db618	nir/cf: Make a helper function for removing a predecessor. I need to do this in a second place, and I'd rather make a helper function than cut and paste the code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 10:59:59 -07:00
Kenneth Graunke	6a67ede6b3	nir: Validate that a block doesn't have two identical successors. This is invalid, and causes disasters if we try to unlink successors: removing the first will work, but removing the second copy will fail because the block isn't in the successor's predecessor set any longer. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 10:59:59 -07:00
Jason Ekstrand	8dcbca5957	nir/lower_vec_to_movs: Don't emit unneeded movs It's possible that, if a vecN operation is involved in a phi node, we could end up moving from a register to itself. If swizzling is involved, we need to emit the move. However, if there is no swizzling, then the mov is a no-op and we might as well not bother emitting it. Shader-db results on Haswell: total instructions in shared programs: 6262536 -> 6259558 (-0.05%) instructions in affected programs: 184780 -> 181802 (-1.61%) helped: 838 HURT: 0 Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-23 10:12:39 -07:00
Jason Ekstrand	65e80ce5b5	nir/lower_vec_to_movs: Properly handle source modifiers on vecN ops I don't know of any piglit tests that are currently broken. However, there is nothing stopping a vecN instruction from getting source modifiers and lower_vec_to_movs is run after we lower to source modifiers. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-23 10:12:39 -07:00
Jason Ekstrand	999ff3c77d	nir/lower_alu_to_scalar: Add support for nir_op_fdph Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-22 20:37:35 -07:00
Jason Ekstrand	e5a9346d00	nir: Add fdph and fdph_replicated opcodes Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-22 20:37:35 -07:00
Jason Ekstrand	0f9bf64770	nir/lower_alu_to_scalar: Return after lower_reduction We don't use any of the code after the switch anyway. Since we check for num_components == 1 and early-return, it doesn't get executed so everything's ok. However, it makes it much clearer what's going on if we simply do an early return. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-22 20:37:35 -07:00
Jason Ekstrand	2b79db2c02	nir/lower_alu_to_scalar: Use the builder Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-22 20:37:35 -07:00
Kenneth Graunke	5cede90f62	nir: Report progress from nir_normalize_cubemap_coords(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:54:34 -07:00
Kenneth Graunke	d7ffd90ecb	nir: Add braces around multi-line loop. This was correct but not our usual style. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:47:01 -07:00
Kenneth Graunke	0a1adaf11d	nir: Report progress from nir_lower_system_values(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:47:00 -07:00
Kenneth Graunke	dc18b9357b	nir: Report progress from nir_split_var_copies(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:46:59 -07:00
Kenneth Graunke	cfae0f8a3a	nir: Report progress from nir_lower_locals_to_regs(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:46:57 -07:00
Kenneth Graunke	1adde5b87e	nir: Report progress from nir_remove_dead_variables(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:46:55 -07:00
Jason Ekstrand	9f5e7ae9d8	nir: Report progress from lower_vec_to_movs(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:46:54 -07:00
Kenneth Graunke	967a5ddb88	nir: Report progress from nir_lower_globals_vars_to_local(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:46:45 -07:00
Jason Ekstrand	46362db4a6	nir/builder: Don't use designated initializers Designated initializers are not allowed in C++ (not even C++11). Since nir_lower_samplers is now using nir_builder, and nir_lower_samplers is in C++, this breaks the build on some compilers. Apparently, GCC 5 allows it to some limited extent because mesa still builds on my system without this patch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92052 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-21 10:41:43 -07:00
Jason Ekstrand	d513388c8a	nir: Move system value -> intrinsic mapping into nir.c This way they're right next to the map going the other direction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-21 09:49:40 -07:00
Emil Velikov	de7ffdb383	nir: rename nir_lower_samplers.c{pp,} With the only C++ function having its own wrapper we can 'demote' this file to a normal C one. This allows us to get rid of extern C { #include <foo.h> } 'hacks'. Plus some of the headers may use C99 initializers, which are not supported by the ISO standard. This may cause build issue on incremental builds. If so run the following: sed -i -e 's|samplers\.cpp|samplers.c|' src/glsl/nir/.deps/nir_lower_samplers.Plo Fixes: ef8eebc6ad5(nir: support indirect indexing samplers in struct arrays) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reported-by: Gottfried Haider <gottfried.haider@gmail.com> Tested-by: Gottfried Haider <gottfried.haider@gmail.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-21 17:02:06 +01:00
Emil Velikov	d130cda453	nir: add C wrapper around glsl_type::record_location_offset This will allow us to convert nir_lower_sampler.cpp to C. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Gottfried Haider <gottfried.haider@gmail.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-21 17:01:56 +01:00
Emil Velikov	bdb1faf44e	nir: move stdio.h inclusion before extern C Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Gottfried Haider <gottfried.haider@gmail.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-21 17:01:32 +01:00
Rob Clark	b65f91dd32	nir/print: fix coverity error Not something actually hit in real life (now state is never non-null, but only case state->syms is null is if nir_print_instr() path). But it was something I overlooked the first time, so might as well fix it. *** CID 1324642: Null pointer dereferences (REVERSE_INULL) /src/glsl/nir/nir_print.c: 299 in print_var_decl() 293 294 fprintf(fp, " (%s, %u)", loc, var->data.driver_location); 295 } 296 297 fprintf(fp, "\n"); 298 >>> CID 1324642: Null pointer dereferences (REVERSE_INULL) >>> Null-checking "state" suggests that it may be null, but it has already been dereferenced on all paths leading to the check. 299 if (state) { 300 _mesa_set_add(state->syms, name); 301 _mesa_hash_table_insert(state->ht, var, name); 302 } 303 } 304 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-20 14:04:06 -04:00
Rob Clark	e13ed3ffb4	nir: add two-sided-color lowering pass Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-09-18 21:07:50 -04:00
Rob Clark	e4dfcdcbec	nir/build: add nir_vec() helper Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-09-18 21:07:50 -04:00
Rob Clark	3745c38425	nir/lower_tex: add support to clamp texture coords Some hardware needs to clamp texture coordinates to [0.0, 1.0] in the shader to emulate GL_CLAMP. This is added to lower_tex_proj since, in the case of projected coords, the clamping needs to happen after projection. v2: comments/suggestions from Ilia and Eric, use txs to get texture size and clamp RECT textures to their dimensions rather than [0.0, 1.0] to avoid having to lower RECT textures to 2D. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-18 21:07:49 -04:00
Rob Clark	1ce8060c25	nir/lower_tex: support for lowering RECT textures v2: comments/suggestions from Ilia and Eric, split out get_texture_size() helper so we can use it in the next commit for clamping RECT textures. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-18 21:07:49 -04:00
Rob Clark	faf5f174dd	nir/lower_tex: support projector lowering per sampler type Some hardware, such as adreno a3xx, supports txp on some but not all sampler types. In this case we want more fine grained control over which texture projectors get lowered. v2: split out nir_lower_tex_options struct to make it easier to add the additional parameters coming in the following patches Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-18 21:07:49 -04:00
Rob Clark	f83ba7bc41	nir/lower_tex: split out project_src() helper Split this out to reduce noise in later patches. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-18 21:07:49 -04:00
Rob Clark	d9b9ff76f1	nir: rename nir_lower_tex_projector Since the following patches will add additional tex-lowering related functionality, which doesn't make sense to split out into a separate pass (as they would require duplication of the projector lowering logic), let's give this pass a more generic name. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-18 21:07:49 -04:00
Rob Clark	2e4ab489b5	nir/builder: fix c++11 compiler warning Fixes: In file included from nir/nir_lower_samplers.cpp:27:0: nir/nir_builder.h: In function 'nir_ssa_def* nir_channel(nir_builder*, nir_ssa_def*, int)': nir/nir_builder.h:222:37: warning: narrowing conversion of 'c' from 'int' to 'unsigned int' inside { } is ill-formed in C++11 [-Wnarrowing] unsigned swizzle[4] = {c, c, c, c}; Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-17 21:08:25 -04:00
Rob Clark	7c72f593ad	nir: really actually fix comment this time Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-17 21:06:11 -04:00
Rob Clark	5305603b9d	nir/print: print variable names Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-17 20:26:12 -04:00
Rob Clark	ba78260b0f	nir: some comment fixups Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-17 20:25:33 -04:00
Rob Clark	509e0c4505	nir: add lowering stage for user-clip-planes / clipdist The vertex shader lowering adds calculation for CLIPDIST, if needed (ie. user-clip-planes), and the frag shader lowering adds conditional kills based on CLIPDIST value (which should be treated as a normal interpolated varying by the driver). Note that this won't quite do the right thing in the face of MSAA plus user-clip-planes, since all the samples would be killed or not (rather than potentially only a portion of them). But it's better than no UCP support at all for drivers that don't have this in hw. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-09-17 19:57:21 -04:00
Rob Clark	53671a3723	nir: add sysval for user-clip-planes For lowering user-clip-planes, we need a way to pass the enabled/used user-clip-planes in to shader. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-09-17 19:55:43 -04:00
Jason Ekstrand	a6c467d6c5	nir: Add a pass to rewrite uses of vecN sources to the vecN destination v2 (Jason Ekstrand): - Handle non-SSA sources and destinations Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-17 08:19:48 -07:00
Jason Ekstrand	ddffe30f40	nir: Add comments to nir_index_instrs and nir_index_ssa_defs The provided indices have the very nice property that if A dominates B then A->index <= B->index. We should document that somewhere. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-17 08:16:01 -07:00
Jason Ekstrand	8ecaef967d	nir: Add a generic instruction index Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-17 08:16:01 -07:00
Timothy Arceri	ef8eebc6ad	nir: support indirect indexing samplers in struct arrays As a bonus we get indirect support for arrays of arrays for free. V5: couple of small clean-ups suggested by Jason. V4: fix struct member location calculation, use nir_ssa_def rather than nir_src for the indirect as suggested by Jason V3: Use nir_instr_rewrite_src() with empty src rather than clearing the use_link list directly for the old indirects as suggested by Jason V2: Fixed validation error in debug build Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-17 11:28:34 +10:00
Timothy Arceri	dcd9cd0383	glsl: store uniform slot id in var location field This will allow us to access the uniform later on without resorting to building a name string and looking it up in UniformHash. V3: remove line wrap change from this patch V2: store slot number for all non-UBO uniforms to make code more consistent, renamed explicit_binding to explicit_location and added comment about what it does. Store the location at every shader stage. Updated data.location comments in ir/nir.h. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-17 11:28:14 +10:00
Rob Clark	aecbc93f2d	nir/print: print symbolic names from shader-enum v2: split out moving of FILE *fp into state structure into its own (more complete) patch to reduce the noise in this one Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-09-16 10:15:35 -04:00
Rob Clark	840df72f93	nir/print: bit of state refactoring Rename print_var_state to print_state, and stuff FILE ptr into the state object. This avoids passing around an extra parameter everywhere. v2: even more extensive conversion.. use state everywhere instead of FILE ptr, and convert nir_print_instr() to use state as well Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-09-16 10:15:17 -04:00
Rob Clark	d9efe40dc9	nir: add lowering for ffract Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-09-16 08:27:36 -04:00
Jason Ekstrand	cb503c3227	nir/builder: Use a normal temporary array in nir_channel C++ gets cranky if we take references of temporaries. This isn't a problem yet in master because nir_builder is never used from C++. However, it will be in the future so we should fix it now. Reviewed-by: Rob Clark <robclark@freedesktop.org>	2015-09-15 14:51:05 -07:00
Jason Ekstrand	29348631fe	nir/lower_vec_to_movs: Coalesce into destinations of fdot instructions Now that we have a replicating fdot instruction, we can actually coalesce into the destinations of vec4 instructions. We couldn't really do this before because, if the destination had to end up in .z, we couldn't reswizzle the instruction. With a replicated destination, the result ends up in all channels so we can just set the writemask and we're done. Shader-db results for vec4 programs on Haswell: total instructions in shared programs: 1747753 -> 1746280 (-0.08%) instructions in affected programs: 143274 -> 141801 (-1.03%) helped: 667 HURT: 0 It turns out that dot-products matter... Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-15 12:38:48 -07:00
Jason Ekstrand	47739c7df4	nir: Add a fdot instruction that replicates the result to a vec4 Fortunately, nir_constant_expr already auto-splats if "dst" never shows up in the constant expression field so we don't need to do anything there. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-15 12:38:48 -07:00
Jason Ekstrand	2458ea95c5	nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible The old pass blindly inserted a bunch of moves into the shader with no concern for whether or not it was really needed. This adds code to try and coalesce into the destination of the instruction providing the value. Shader-db results for vec4 shaders on Haswell: total instructions in shared programs: 1754420 -> 1747753 (-0.38%) instructions in affected programs: 231230 -> 224563 (-2.88%) helped: 1017 HURT: 2 This approach is heavily based on a different patch by Eduardo Lima Mitev <elima@igalia.com>. Eduardo's patch did this in a separate pass as opposed to integrating it into nir_lower_vec_to_movs. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-15 12:38:07 -07:00
Jason Ekstrand	2b2f1f16a0	nir/lower_vec_to_movs: Get rid of start_idx and swizzle compacting Previously, we did this thing with keeping track of a separate start_idx which was different from the iteration variable. I think this was a relic of the way that GLSL IR implements writemasks. In NIR, if a given bit in the writemask is unset then that channel is just "unused", not missing. In particular, a vec4 operation with a writemask of 0xd will use sources 0, 2, and 3 and leave source 1 alone. We can simplify things a good deal (and make them correct) by removing this "compacting" step. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-09-15 11:13:48 -07:00
Jason Ekstrand	c3f8cde964	nir/lower_vec_to_movs: Handle partially SSA shaders v2 (Jason Ekstrand): - Use nir_instr_rewrite_dest - Pass the impl directly into lower_vec_to_movs_block Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-15 11:13:45 -07:00
Jason Ekstrand	b7eeced3c7	nir/lower_vec_to_movs: Pass the shader around directly Previously, we were passing the shader around, we were just calling it "mem_ctx". However, the nir_shader is (and must be for the purposes of mark-and-sweep) the mem_ctx so we might as well pass it around explicitly. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-15 11:13:40 -07:00
Jordan Justen	4f178f0d8b	nir: Add gl_WorkGroupID system variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-13 09:53:16 -07:00
Jordan Justen	62e011d593	nir: Add gl_LocalInvocationID variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-13 09:53:16 -07:00
Rob Clark	b88aeff4f5	nir: add nir_channel() to get at single components of vec's Rather than make yet another copy of channel(), let's move it into nir. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-13 11:08:27 -04:00
Jason Ekstrand	ca11c3c0a4	nir/from_ssa: Use instr_rewrite_dest Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-11 09:21:20 -07:00
Jason Ekstrand	cee29220e3	nir: Add a function for rewriting instruction destinations Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-11 09:21:20 -07:00
Jason Ekstrand	106a3b2cc3	nir: Only unlink sources that are actually valid Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2015-09-11 09:21:20 -07:00
Jason Ekstrand	a4aa25be1e	nir: Remove the mem_ctx parameter from ssa_def_rewrite_uses Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2015-09-11 09:21:20 -07:00
Jason Ekstrand	8c8fc5f833	nir: Fix a bunch of ralloc parenting errors As of `a10d4937`, we would really like things associated with an instruction to be allocated out of that instruction and not out of the shader. In particular, you should be passing the instruction that will ultimately be holding the source into nir_src_copy rather than an arbitrary memory context. We also change the prototypes of nir_dest_copy and nir_alu_src/dest_copy to explicitly take an instruction so we catch this earlier in the future. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2015-09-11 09:21:04 -07:00
Jason Ekstrand	794355e771	nir/lower_outputs_to_temporaries: Reparent the output name We copy the output, make the old output the temporary, and give the temporary a new name. The copy keeps the pointer to the old name. This works just fine up until the point where we lower things to SSA and delete the old variable and, with it, the name. Instead, we should re-parent to the copy. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-11 08:55:51 -07:00
Kenneth Graunke	b811085b79	nir: Store some geometry shader data in nir_shader. This makes it possible for NIR shaders to know the number of output vertices and the number of invocations. Drivers could also access these directly without going through gl_program. We should probably add InputType and OutputType here too, but currently those are stored as GL_* enums, and I wanted to avoid using those in NIR, as I suspect Vulkan/SPIR-V will use different enums. (We should probably make our own.) We could add VerticesIn, but it's easily computable from the input topology, so I'm not sure whether it's worth it. It's also currently not stored in gl_shader (only gl_shader_program), which would require changes to the glsl_to_nir interface or require us to store it there. This is a bit of duplication of data...ideally, we would factor these substructs out of gl_program, gl_shader_program, and nir_shader, creating a gl_geometry_info class...but it would need to go in a new place (in src/glsl?) that isn't mtypes.h nor nir.h. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-11 00:05:09 -07:00
Kenneth Graunke	cb2b118e40	nir/builder: Add nir_load_var() and nir_store_var() helpers. These provide a convenient way to do simple variable loads and stores. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-11 00:04:17 -07:00
Ilia Mirkin	56238305e5	nir: convert glsl imageSamples into a new intrinsic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-10 17:38:52 -04:00
Ilia Mirkin	1807a08e4f	nir: add nir_texop_texture_samples and convert from glsl Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-10 17:38:33 -04:00
Rhys Kidd	32cdb49fe2	glsl: Resolve GCC sign-compare warning. mesa/src/glsl/nir/nir_lower_tex_projector.c: In function 'nir_lower_tex_projector_block': mesa/src/glsl/nir/nir_lower_tex_projector.c:63:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = 0; i < tex->num_srcs; i++) { ^ mesa/src/glsl/nir/nir_lower_tex_projector.c: In function 'nir_lower_tex_projector_block': mesa/src/glsl/nir/nir_lower_tex_projector.c:114:38: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = proj_index + 1; i < tex->num_srcs; i++) { ^ mesa/src/glsl/nir/nir_lower_tex_projector.c: In function 'nir_lower_tex_projector_block': mesa/src/glsl/nir/nir_lower_tex_projector.c:53:39: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (proj_index = 0; proj_index < tex->num_srcs; proj_index++) { ^ mesa/src/glsl/nir/nir_lower_tex_projector.c:57:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (proj_index == tex->num_srcs) ^ mesa/src/glsl/nir/nir_search.c: In function 'match_value': mesa/src/glsl/nir/nir_search.c:84:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = 0; i < num_components; ++i) ^ mesa/src/glsl/nir/nir_search.c: In function 'match_value': mesa/src/glsl/nir/nir_search.c:110:28: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = 0; i < num_components; ++i) { ^ mesa/src/glsl/nir/nir_search.c: In function 'match_value': mesa/src/glsl/nir/nir_search.c:139:19: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (i < num_components) ^ mesa/src/glsl/nir/nir_opt_peephole_ffma.c: In function 'get_mul_for_src': mesa/src/glsl/nir/nir_opt_peephole_ffma.c:130:27: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (unsigned i = 0; i < num_components; i++) ^ Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-10 14:56:41 +01:00
Jason Ekstrand	b828f7a27b	nir/glsl: Use lower_outputs_to_temporaries instead of relying on GLSL IR Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-09 12:29:38 -07:00
Jason Ekstrand	1dbe4af9c9	nir: Add a pass to lower outputs to temporary variables This pass can be used as a helper for NIR producers so they don't have to worry about creating the temporaries themselves. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-09 12:29:21 -07:00
Jason Ekstrand	f5e08ab6b1	nir/cursor: Add a constructor for the end of a block but before the jump Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-09 12:28:51 -07:00
Kenneth Graunke	d5d74d0b86	nir: Add a nir_system_value_from_intrinsic() function. This converts NIR intrinsics that load system values into Mesa's SYSTEM_VALUE_* enumerations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-09-08 18:02:08 -07:00
Iago Toral Quiroga	205ff843ff	nir: UBO loads no longer use const_index[1] Commit `2126c68e5c` killed the array elements parameter on load/store intrinsics that was stored in const_index[1]. It looks like that patch missed to remove this assignment in the UBO path. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-08 09:06:34 +02:00
Connor Abbott	aec6744501	nir/dead_cf: add support for removing useless loops v2: fix detecting if the loop has any phi nodes after it. v2: use nir_foreach_ssa_def() instead of nir_foreach_dest() when checking for values live after the loop to catch const_load instructions. v2: fix handling return instructions v2: add some documentation to loop_is_dead() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-01 00:58:17 -07:00
Connor Abbott	019eea1c4f	nir: add a helper for iterating over blocks in a cf node We were already doing this internally for iterating over a function implementation, so just expose it directly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-01 00:58:17 -07:00
Connor Abbott	89dc0626bd	nir: add nir_block_get_following_loop() helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-01 00:58:17 -07:00
Connor Abbott	f649afc9dd	nir/dead_cf: delete code that's unreachable due to jumps v2: use nir_cf_node_remove_after(). v2: use foreach_list_typed() instead of hardcoding a list walk. v3: update to new control flow modification helpers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-01 00:58:17 -07:00
Connor Abbott	1e6ad4b027	nir: add an optimization for removing dead control flow v2: use nir_cf_node_remove_after() instead of our own broken thing. v3: use the new control flow modification helpers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-01 00:58:17 -07:00
Jason Ekstrand	e16531fbe3	nir/builder: Use nir_after_instr to advance the cursor This should ensure that the cursor gets properly advanced in all cases. We had a problem before where, if the cursor was created using nir_after_cf_node on a non-block cf_node, that would call nir_before_block on the block following the cf node. Instructions would then get inserted in backwards order at the top of the block which is not at all what you would expect from nir_after_cf_node. By just resetting to after_instr, we avoid all these problems. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-31 18:17:07 -07:00
Kenneth Graunke	0a913a9d85	nir: Convert the builder to use the new NIR cursor API. The NIR cursor API is exactly what we want for the builder's insertion point. This simplifies the API, the implementation, and is actually more flexible as well. This required a bit of reworking of TGSI->NIR's if/loop stack handling; we now store cursors instead of cf_node_lists, for better or worse. v2: Actually move the cursor in the after_instr case. v3: Take advantage of nir_instr_insert (suggested by Connor). v4: vc4 build fixes (thanks to Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v1] Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v4] Acked-by: Connor Abbott <cwabbott0@gmail.com> [v4]	2015-08-27 13:36:57 -07:00
Kenneth Graunke	3e3cb77901	nir: Convert the NIR instruction insertion API to use cursors. This patch implements a general nir_instr_insert() function that takes a nir_cursor for the insertion point. It then reworks the existing API to simply be a wrapper around that for compatibility. This largely involves moving the existing code into a new function. Suggested by Connor Abbott. v2: Make the legacy functions static inline in nir.h (requested by Connor Abbott). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Acked-by: Connor Abbott <cwabbott0@gmail.com>	2015-08-27 13:36:57 -07:00
Kenneth Graunke	f90c6b1ce0	nir: Move nir_cursor to nir.h. We want to use this for normal instruction insertion too, not just control flow. Generally these functions are going to be extremely useful when working with NIR, so I want them to be widely available without having to include a separate file. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Acked-by: Connor Abbott <cwabbott0@gmail.com>	2015-08-27 13:36:57 -07:00
Kenneth Graunke	c44d507752	nir: Strengthen "no jumps" assertions in instruction insertion API. Jumps must be the last instruction in a block, so inserting another instruction after a jump is illegal. Previously, we only checked this when the new instruction being inserted was a jump. This is a red herring - inserting any kind of instruction after a jump is illegal. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Acked-by: Connor Abbott <cwabbott0@gmail.com>	2015-08-27 13:36:57 -07:00
Kenneth Graunke	5f14c417c8	nir: Use nir_shader::stage rather than passing it around. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-08-25 11:12:35 -07:00
Kenneth Graunke	d4d5b430a5	nir: Store gl_shader_stage in nir_shader. This makes it easy for NIR passes to inspect what kind of shader they're operating on. Thanks to Michel Dänzer for helping me figure out where TGSI stores the shader stage information. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-08-25 11:12:35 -07:00
Jason Ekstrand	c999a58f50	nir/lower_io: Remove assign_var_locations_direct_first This is no longer used so we might as well get rid of it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-25 10:18:27 -07:00
Jason Ekstrand	ce5e9139aa	nir/lower_io: Separate driver_location and base offset for uniforms Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-25 10:18:27 -07:00
Jason Ekstrand	0db8e87b4a	nir/intrinsics: Add a second const index to load_uniform In the i965 backend, we want to be able to "pull apart" the uniforms and push some of them into the shader through a different path. In order to do this effectively, we need to know which variable is actually being referred to by a given uniform load. Previously, it was completely flattened by nir_lower_io which made things difficult. This adds more information to the intrinsic to make this easier for us. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-25 10:18:27 -07:00
Kenneth Graunke	6c33d6bbf9	nir: Pass a type_size() function pointer into nir_lower_io(). Previously, there were four type_size() functions in play - the i965 compiler backend defined scalar and vec4 type_size() functions, and nir_lower_io contained its own similar functions. In fact, the i965 driver used nir_lower_io() and then looped over the components using its own type_size - meaning both were in play. The two are /basically/ the same, but not exactly in obscure cases like subroutines and images. This patch removes nir_lower_io's functions, and instead makes the driver supply a function pointer. This gives the driver ultimate flexibility in deciding how it wants to count things, reduces code duplication, and improves consistency. v2 (Jason Ekstrand): - One side-effect of passing in a function pointer is that nir_lower_io is now aware of and properly allocates space for image uniforms, allowing us to drop hacks in the backend Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> v2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-25 10:18:27 -07:00
Kenneth Graunke	4f2cdd8497	nir: Use !block_ends_in_jump() in a few places rather than open-coding. Connor introduced this helper recently; we should use it here too. I had to move the function earlier in the file for it to be available. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-08-24 15:10:55 -07:00
Connor Abbott	d7971b41ce	nir/cf: reimplement nir_cf_node_remove() using the new API This gives us some testing of it. Also, the old nir_cf_node_remove() wasn't handling phi nodes correctly and was calling cleanup_cf_node() too late. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	fc7f2d2364	nir/cf: add new control modification API's These will help us do a number of things, including: - Early return elimination. - Dead control flow elimination. - Various optimizations, such as replacing: if (foo) { ... } if (!foo) { ... } with: if (foo) { ... } else { ... } Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	476eb5e4a1	nir/cf: use a cursor for inserting control flow Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	d356f84d4c	nir/cf: add split_block_cursor() This is a helper that will be shared between the new control flow insertion and modification code. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	58a360c6b8	nir/cf: add split_block_before_instr() Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	6e47a34b29	nir/cf: add a cursor structure For now, it allows us to refactor the control flow insertion API's so that there's a single entrypoint (with some wrappers). More importantly, it will allow us to reduce the combinatorial explosion in the extract function. There, we need to specify two points to extract, which may be at the beginning of a block, the end of a block, or in the middle of a block. And then there are various wrappers based off of that (before a control flow node, before a control flow list, etc.). Rather than having 9 different functions, we can have one function and push the actual logic of determining which variant to use down to the split function, which will be shared with nir_cf_node_insert(). In the future, we may want to make the instruction insertion API's as well as the builder use this, but that's a future cleanup. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	6f5c81f86f	nir/cf: fix link_blocks() when there are no successors When we insert a single basic block A into another basic block B, we will split B into C and D, insert A in the middle, and then splice together C, A, and D. When we splice together C and A, we need to move the successors of A into C -- except A has no successors, since it hasn't been inserted yet. So in move_successors(), we need to handle the case where the block whose successors are to be moved doesn't have any successors. Fixing link_blocks() here prevents a segfault and makes it work correctly. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	6d028749ac	nir/cf: clean up jumps when cleaning up CF nodes We may delete a control flow node which contains structured jumps to other parts of the program. We need to remove the jump as a predecessor, as well as remove any phi node sources which reference it. Right now, the same problem exists for blocks that don't end in a jump instruction, but with the new API it shouldn't be an issue, since blocks that don't end in a jump must either point to another block in the same extracted CF list or not point to anything at all. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	211c79515d	nir/cf: remove uses of SSA definitions that are being deleted Unlike calling nir_instr_remove(), calling nir_cf_node_remove() (and later in the series, the nir_cf_list_delete()) implies that you're removing instructions that may still have uses, except those instructions are never executed so any uses will be undefined. When cleaning up a CF node for deletion, we must clean up any uses of the deleted instructions by making them point to undef instructions instead. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	633cbbc068	nir/cf: handle jumps better in stitch_blocks() In particular, handle the case where the earlier block ends in a jump and the later block is empty. In that case, we want to preserve the jump and remove any traces of the later block. Before, we would only hit this case when removing a control flow node after a jump, which wasn't a common occurrence, but we'll need it to handle inserting a control flow list which ends in a jump, which should be more common/useful. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	940873bf22	nir/cf: handle jumps in split_block_end() Before, we would only split a block with a jump at the end if we were inserting something after a block with a jump, which never happened in practice. But now, we want to use this to extract control flow lists which may end in a jump, in which case we really need to do the correct patching up. As a side effect, when removing jumps we now correctly insert undef phi sources in some corner cases, which can't hurt. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	f596e4021c	nir/cf: add block_ends_in_jump() Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	788d45cb47	nir/cf: handle phi nodes better in split_block_beginning() Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	747ddc3cdd	nir/cf: split up and improve nir_handle_remove_jumps() Before, the process of removing a jump and wiring up the remaining block correctly was atomic, but with the new control flow modification it's split into two parts: first, we extract the jump, which creates a new block with re-wired successors as well as a free-floating jump, and then we delete the control flow containing the jump, which removes the entry in the predecessors and any phi node sources. Split up nir_handle_remove_jumps() to accommodate this, and add the missing support for removing phi node sources. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	13482111d0	nir/cf: add remove_phi_src() helper Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:41 -07:00
Connor Abbott	f41e108d8b	nir: add nir_foreach_phi_src_safe() Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:41 -07:00
Connor Abbott	762ae436ea	nir/cf: add insert_phi_undef() helper Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:41 -07:00
Connor Abbott	b49371b8ed	nir: move control flow modification to its own file We want to start reworking and expanding this code, but it'll be a lot easier to do once we disentangle it from the rest of the stuff in nir.c. Unfortunately, there are a few unavoidable dependencies in nir.c on methods we'd rather not expose publicly, since if not used in very specific situations they can cause Bad Things (tm) to happen. Namely, we need to do some magical control flow munging when adding/removing jumps. In the future, we may disallow adding/removing jumps in nir_instr_insert_*() and nir_instr_remove(), and use separate functions that are part of the control flow modification code, but for now we expose them and put them in a separate, private header. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:41 -07:00
Connor Abbott	1c53f89696	nir: make cleanup_cf_node() not use remove_defs_uses() cleanup_cf_node() is part of the control flow modification code, which we're going to split into its own file, but remove_defs_uses() is an internal function used by nir_instr_remove(). Break the dependency by making cleanup_cf_node() use nir_instr_remove() instead, which simply calls remove_defs_uses() and then removes the instruction from the list. nir_instr_remove() does do extra things for jumps, though, so we avoid calling it on jumps which matches the previous behavior (this will be fixed later in the series). Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:41 -07:00
Connor Abbott	9d5944053c	nir: inline block_add_pred() a few places It was being used to initialize function impls and loops, even though it's really a control flow modification helper. It's pretty trivial, so just inline it to avoid the dependency. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:41 -07:00
Connor Abbott	c7df141c71	nir/validate: check successors/predecessors more carefully We should be checking almost everything now. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:41 -07:00
Kenneth Graunke	8e0d4ef341	nir: Delete the nir_function_impl::start_block field. It's simply the first nir_cf_node in the nir_function_impl::body list, which is easy enough to access - we don't need to store a pointer to it explicitly. Removing it means we don't need to maintain the pointer when, say, splitting the start block when modifying control flow. Thanks to Connor Abbott for suggesting this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-08-24 13:31:41 -07:00
Martin Peres	80b1707e26	nir: convert the glsl intrinsic image_size to nir_intrinsic_image_size v2, review from Francisco Jerez: - make the destination variable as large as what the nir intrinsic defines (4) instead of the size of the return variable of glsl. This is still safe for the already existing code because all the intrinsics affected returned the same amount of components as expected by glsl IR. In the case of image_size, it is not possible to do so because the returned number of components depends on the image type and this case is not well handled by nir. v3: - Style fix Signed-off-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-08-20 14:07:46 +03:00
Kenneth Graunke	ab83be590d	nir: Use nir_builder in nir_lower_io's get_io_offset(). Much more readable. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-08-19 19:29:39 -07:00
Kenneth Graunke	ed2afec3fc	nir: Pull nir_lower_io's load_op selection into a helper function. Makes the function a bit smaller. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-08-19 19:29:22 -07:00
Thomas Helland	49d0a36bd6	nir: Simplify feq(fneg(a), a)) -> feq(a, 0.0) The positive and negative value of a float can only be equal to each other if it is -0.0f and 0.0f. This is safe for Nan and Inf, as -Nan != Nan, and -Inf != Inf This gives no changes in my shader-db Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-08-18 11:34:44 -07:00
Thomas Helland	a39167d594	nir: Simplify fne(fneg(a), a) -> fne(a, 0.0) -NaN != NaN, and -Inf != Inf, so this should be safe. Found while working on my VRP pass. Shader-db results on my IVB: total instructions in shared programs: 1698267 -> 1698067 (-0.01%) instructions in affected programs: 15785 -> 15585 (-1.27%) helped: 36 HURT: 0 GAINED: 0 LOST: 0 Some shaders were found to have the following pattern in NIR: vec1 ssa_26 = fneg ssa_21 vec1 ssa_27 = fne ssa_21, ssa_26 Make that: vec1 ssa_27 = fne ssa_21, 0.0f This is found in Dota2 and Brutal Legend. One shader is cut by 8%, from 323 -> 296 instructions in SIMD8 Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-08-18 11:34:44 -07:00
Kenneth Graunke	afccbd7256	nir: Add a glsl_uint_type() wrapper. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-08-16 21:44:19 -07:00
Eric Anholt	a6e75e3cd7	nir: Add support for CSE on textures. NIR instruction count results on i965: total instructions in shared programs: 1261954 -> 1261937 (-0.00%) instructions in affected programs: 455 -> 438 (-3.74%) One in yofrankie, two in tropics. Apparently i965 had also optimized all of these out anyway. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-08-14 11:39:18 -07:00
Eric Anholt	fb2425a641	nir: Zero out texture instructions when creating them. There are so many flags in textures, that the CSE pass would have a hard time referencing the correct set when figuring out if two texture ops are the same. By zeroing, we can avoid that fragility. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-08-14 11:39:18 -07:00
Eric Anholt	d50c182671	nir: Don't try to scalarize unpack ops. Avoids regressions in vc4 when trying to do our blending in NIR. v2: Add the other unpack ops I meant to when writing the original commit message. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-08-14 11:39:18 -07:00
Eric Anholt	9e6dc5b64d	nir: Add a nir_opt_undef() to handle csels with undef. We may find a cause to do more undef optimization in the future, but for now this fixes up things after if flattening. vc4 was handling this internally most of the time, but a GLB2.7 shader that did a conditional discard and assign gl_FragColor in the else was still emitting some extra code. total instructions in shared programs: 100809 -> 100795 (-0.01%) instructions in affected programs: 37 -> 23 (-37.84%) v2: Use nir_instr_rewrite_src() to update def/use on src[0] (by Thomas Helland). v3: Make sure to flag metadata dirties, and copy the swizzle and abs/neg over to src[0], too (by anholt). Reviewed-by: Thomas Helland <thomashelland90@gmail.com> (v2) Tested-by: Thomas Helland <thomashelland90@gmail.com> (v2)	2015-08-14 11:39:18 -07:00
Timothy Arceri	2c61d583f8	nir: add missing type to type_size_vec4() Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-08-05 21:16:45 +10:00
Eric Anholt	6c28ee2041	nir: Add a nir_lower_load_const_to_scalar() pass. This is useful to increase the CSE opportunities for a scalar backend. It avoids regressions when dropping vc4's custom CSE implementation. v2: Cleanups by Matt (decl in the for loop, and unreachable()). Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-08-04 20:03:10 -07:00
Eric Anholt	a70f63ab20	nir: Add algebraic opt for no-op iand. I lazily generated some of these in VC4 NIR lowering. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-08-04 17:19:25 -07:00
Eric Anholt	eae9c3286e	Revert "nir: Use a single bit for the dual-source blend index" This reverts commit `ab5b7a0fe6`. We use more than one bit of value in tgsi_to_nir.	2015-08-04 17:19:01 -07:00
Samuel Iglesias Gonsalvez	418c004f80	nir: Fix output swizzle in get_mul_for_src Avoid copying an overwritten swizzle, use the original values. Example: Former swizzle[] = xyzw src->swizzle[] = zyxx The expected output swizzle = zyxx but if we reuse swizzle in the loop, then output swizzle would be zyzz. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-08-03 09:40:50 -07:00
Iago Toral Quiroga	01f6235020	nir/nir_lower_io: Add vec4 support The current implementation operates in scalar mode only, so add a vec4 mode where types are padded to vec4 sizes. This will be useful in the i965 driver for its vec4 nir backend (and possibly other drivers that have vec4-based shaders). Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-08-03 09:40:47 -07:00
Timothy Arceri	ab5b7a0fe6	nir: Use a single bit for the dual-source blend index The only values allowed are 0 and 1, and the value is checked before assigning. This is a copy of `8eeca7a56c` that seems to have been made to the glsl ir type after it was copied for use in nir but before nir landed. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-08-03 21:36:50 +10:00
Matt Turner	4251ccb47b	nir: Avoid double promotion. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-07-29 09:34:51 -07:00
Matt Turner	5c7fd67045	glsl: Remove MSVC implementations of copysign and isnormal. Non-Gallium parts of Mesa require MSVC 2013 which provides these.	2015-07-29 09:34:51 -07:00
Dave Airlie	80511d176a	i965: add support for ARB_shader_subroutine This just adds some missing pieces to nir/i965, it is lightly tested on my Haswell. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-07-24 10:25:08 +10:00
Dave Airlie	57f24299b7	glsl/types: add new subroutine type (v3.2) This type will be used to store the name of subroutine types as in subroutine void myfunc(void); will store myfunc into a subroutine type. This is required so the parser can identify a subroutine type in a uniform declaration as a valid type, and also for looking up the type later. Also add contains_subroutine method. v2: handle subroutine to int comparisons, needed for lowering pass. v3: do subroutine to int with its own IR operation to avoid hacking on asserts (Kayden) v3.1: fix warnings in this patch, fix nir, fix tgsi v3.2: fixup tests Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Dave Airlie <airlied@redhat.com> tests: fix warnings	2015-07-23 17:25:25 +10:00
Connor Abbott	eaf799ddff	nir: add nir_foreach_instr_safe_reverse() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>	2015-07-17 09:49:53 -07:00
Connor Abbott	8eea091747	nir: add nir_instr_is_first() and nir_instr_is_last() helpers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>	2015-07-17 09:47:22 -07:00
Iago Toral Quiroga	6b09598d63	nir: add nir_var_shader_storage Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-07-14 07:04:03 +02:00
Kenneth Graunke	efb36271a9	nir: Fix comment above nir_convert_from_ssa() prototype. Connor renamed the parameter, inverting the sense. Update the comment accordingly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-07-08 11:28:08 -07:00
Rob Clark	959b47262b	nir/lower_phis_to_scalar: undef is trivially scalarizable Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-07-03 08:56:09 -04:00
Jason Ekstrand	89bd5ee64c	nir: Don't allow copying SSA destinations Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-07-02 15:42:33 -07:00
Connor Abbott	aa7d4cecec	nir: remove parent_instr from nir_register It's no longer used. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-06-30 11:18:27 -07:00
Connor Abbott	f49e51ef44	nir: remove nir_src_get_parent_instr() It's now unused. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-06-30 11:18:27 -07:00
Connor Abbott	2b1a1d8b12	nir/from_ssa: add a flag to not convert everything from SSA We already don't convert constants out of SSA, and in our backend we'd like to have only one way of saying something is still in SSA. The one tricky part about this is that we may now leave some undef instructions around if they aren't part of a phi-web, so we have to be more careful about deleting them. v2: rename and flip meaning of flag (Jason) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-06-30 11:18:27 -07:00
Rob Clark	dc7e6463d3	nir: cleanup open-coded instruction casts Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-06-30 12:13:44 -04:00
Kenneth Graunke	6026f7e8fb	nir: Recognize max(min(a, 1.0), 0.0) as fsat(a). We already recognize min(max(a, 0.0), 1.0) as a saturate, but neglected this variant (which is also handled by the GLSL IR pass). shader-db results on Broadwell: total instructions in shared programs: 7363046 -> 7362788 (-0.00%) instructions in affected programs: 11928 -> 11670 (-2.16%) helped: 64 HURT: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-06-25 02:12:32 -07:00
Kenneth Graunke	147cdb53ec	nir: Use a switch statement for detecting move-like operations. Suggested by Jason Ekstrand. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-06-24 10:35:04 -07:00
Kenneth Graunke	1762568fd3	nir: Allow vec2/vec3/vec4 instructions in the select peephole pass. These are basically just moves, so they should be safe as well. When disabling i965's GLSL IR level scalarizer (channel expressions) pass, I started seeing NIR code like this: if ssa_21 { block block_1: /* preds: block_0 */ vec4 ssa_120 = vec4 ssa_82, ssa_83, ssa_84, ssa_30 /* succs: block_3 */ } else { block block_2: /* preds: block_0 */ /* succs: block_3 */ } block block_3: /* preds: block_1 block_2 */ vec4 ssa_33 = phi block_1: ssa_120, block_2: ssa_2 Previously, the GLSL IR scalarizer pass would break the vec4 into a series of fmovs, which were allowed by the peephole pass. But with the vec4 operation, they were not. We want to keep getting selects. Normal i965 on Broadwell: instructions in affected programs: 200 -> 176 (-12.00%) helped: 4 With brw_fs_channel_expressions() disabled: instructions in affected programs: 1832 -> 1646 (-10.15%) helped: 30 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-06-22 14:08:36 -07:00
Jordan Justen	2867f2e8cd	nir: Add barrier intrinsic function Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-06-12 15:12:40 -07:00
Chris Forbes	e7f628c2fc	glsl: Add ir node for barrier v2: * Changes suggested by mattst88 [jordan.l.justen@intel.com: Add nir support] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-06-12 15:12:39 -07:00
Timothy Arceri	86a74e9b6b	nir: use src for ssa helper Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-06-03 06:50:39 +10:00
Timothy Arceri	5f7b8fa481	nir: remove extra semicolon Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-06-03 06:50:33 +10:00
Eduardo Lima Mitev	5b226a1242	nir: prevent use-after-free condition in should_lower_phi() lower_phis_to_scalar() pass recurses the instruction dependence graph to determine if all the sources of a given instruction are scalarizable. To prevent cycles, it temporarily marks the phi instruction before recursing in, then updates the entry with the resulting value. However, it does not consider that the entry value may have changed after a recursion pass, hence causing a use-after-free situation and a crash. This patch fixes this by reloading the entry corresponding to the 'phi' after recursing and before updating its value. The crash can be reproduced ~20% of times with the dEQP test: dEQP-GLES3.functional.shaders.loops.while_constant_iterations.nested_sequence_fragment Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-06-02 20:21:49 +02:00
Iago Toral Quiroga	2231cf0ba3	nir: Fix output swizzle in get_mul_for_src When we compute the output swizzle we want to consider the number of components in the add operation. So far we were using the writemask of the multiplication for this instead, which is not correct. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-05-28 18:25:37 +02:00
Matt Turner	5614bcc416	nir: Remove sRGB colorspace conversion round-trip. Some shaders in Civilization V and Beyond Earth do pow(pow(x, 2.2), 0.454545) which is converting to and from sRGB colorspace. A more general rule that replaces pow(pow(a, b), c) with pow(a, b * c) actually regresses two shaders in Sun Temple in which the result of the inner pow is used twice, once by another pow and once by another instruction. Also, since 2.2 * 0.454545 isn't exactly one, the more general pattern would have still left us with a pow, and I'm 2.2 * 0.454545 percent sure that's not what they want. instructions in affected programs: 934 -> 886 (-5.14%) helped: 16	2015-05-22 11:26:36 -07:00
Jason Ekstrand	2126c68e5c	nir: Get rid of the array elements parameter on load/store intrinsics Previously, we used intrinsic->const_index[1] to represent "the number of array elements to load" for load/store intrinsics. However, this was set to 1 by every pass that ever creates a load/store intrinsic. Also, while it might make some sense for registers, it makes no sense whatsoever in SSA. On top of that, the i965 backend was the only backend to ever support it; freedreno and vc4 just assert that it's always 1. Let's just delete it. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2015-05-20 09:28:06 -07:00
Francisco Jerez	d91d6b3f03	nir: Translate memory barrier intrinsics from GLSL IR. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-05-12 15:47:57 +03:00
Francisco Jerez	f8f8b31847	nir: Translate image load, store and atomic intrinsics from GLSL IR. v2: Undefine coordinate components not applicable to the target. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-05-12 15:47:57 +03:00
Francisco Jerez	6de78e6b0c	nir: Fix indexing of atomic counter arrays with a constant value. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-05-12 15:47:57 +03:00
Francisco Jerez	f1269a3e01	nir: Add memory barrier intrinsic. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-05-12 15:47:57 +03:00
Francisco Jerez	d9e930997f	nir: Define image load, store and atomic intrinsics. v2: Undefine coordinate components not applicable to the target. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-05-12 15:47:57 +03:00
Tapani Pälli	95774ca258	nir: fix sampler lowering pass for arrays This fixes bugs with special cases where we have arrays of structures containing samplers or arrays of samplers. I've verified that patch results in calculating same index value as returned by _mesa_get_sampler_uniform_value for IR. Patch makes following ES3 conformance test pass: ES3-CTS.shaders.struct.uniform.sampler_array_fragment v2: remove unnecessary comment (Topi) simplify changes and the overall code (Jason) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90114	2015-05-12 14:28:16 +03:00
Kenneth Graunke	d6fb155f30	nir: Fix aggressive typos in nir_from_ssa.c. s/agressive/aggressive/g Trivial.	2015-05-08 19:38:14 -07:00
Jason Ekstrand	fb5f411248	nir/search: Save/restore the variables_seen bitmask when matching Shader-db results on Broadwell: total instructions in shared programs: 7152330 -> 7137006 (-0.21%) instructions in affected programs: 1330548 -> 1315224 (-1.15%) helped: 5797 HURT: 76 GAINED: 0 LOST: 8 Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-05-08 17:29:15 -07:00
Jason Ekstrand	e0cfe59c37	nir/search: Assert that variable id's are in range Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-05-08 17:29:15 -07:00
Jason Ekstrand	13facfbd5b	nir/search: handle explicitly sized sources in match_value Previously, this case was being handled in match_expression prior to calling match_value. However, there is really no good reason for this given that match_value has all of the information it needs. Also, they weren't being handled properly in the commutative case and putting it in match_value gives us that for free. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-05-08 17:29:14 -07:00
Jason Ekstrand	f752effa08	nir/nir: Use a linked list instead of a hash set for use/def sets This commit switches us from the current setup of using hash sets for use/def sets to using linked lists. Doing so should save us quite a bit of memory because we aren't carrying around 3 hash sets per register and 2 per SSA value. It should also save us CPU time because adding/removing things from use/def sets is 4 pointer manipulations instead of a hash lookup. Running shader-db 50 times with USE_NIR=0, NIR, and NIR + use/def lists: GLSL IR Only: 586.4 +/- 1.653833 NIR with hash sets: 675.4 +/- 2.502108 NIR + use/def lists: 641.2 +/- 1.557043 I also ran a memory usage experiment with Ken's patch to delete GLSL IR and keep NIR. This patch cuts an additional 42.9 MiB of ralloc'd memory over and above what we gained by deleting the GLSL IR on the same dota trace. On the code complexity side of things, some things are now much easier and others are a bit harder. One of the operations we perform constantly in optimization passes is to replace one source with another. Due to the fact that an instruction can use the same SSA value multiple times, we had to iterate through the sources of the instruction and determine if the use we were replacing was the only one before removing it from the set of uses. With this patch, uses are per-source not per-instruction so we can just remove it safely. On the other hand, trying to iterate over all of the instructions that use a given value is more difficult. Fortunately, the two places we do that are the ffma peephole where it doesn't matter and GCM where we already gracefully handle duplicate visits to an instruction. Another aspect here is that using linked lists in this way can be tricky to get right. With sets, things were quite forgiving and the worst that happened if you didn't properly remove a use was that it would get caught in the validator. With linked lists, it can lead to linked list corruption which can be harder to track. However, we do just as much validation of the linked lists as we did of the sets so the validator should still catch these problems. While working on this series, the vast majority of the bugs I had to fix were caught by assertions. I don't think the lists are going to be that much worse than the sets. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-05-08 17:16:13 -07:00
Jason Ekstrand	ecc2cfc8b6	nir: Use nir_instr_rewrite_src in copy propagation We were rolling our own rewrite_src variant in copy-propagation. Let's stop doing that and use the ones in core NIR. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-05-08 17:16:13 -07:00
Jason Ekstrand	f72a8d1cf0	nir: Add a function for rewriting the condition of an if statement Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-05-08 17:16:13 -07:00
Jason Ekstrand	300d729436	nir: Add and use initializer #defines for nir_src and nir_dest Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-05-08 17:16:13 -07:00
Jason Ekstrand	6702ebce57	nir: Modernize the out-of-SSA pass The out-of-SSA pass was one of the first passes written when getting SSA up-and-going (for obvious reasons). As such, it came before a lot of the nifty SSA-based helpers were introduced. This commit modernizes it so that we're no longer doing nearly as much manual banging on use/def sets. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-05-08 17:16:13 -07:00
Jason Ekstrand	7ee0216e2d	nir/validate: Validate SSA def parent instructions Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-05-08 17:16:13 -07:00
Ian Romanick	3bdbc1e436	nir: Delete all traces of nir_op_flog Nothing produces it, and nothing can consume it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-05-08 12:12:54 -07:00
Ian Romanick	ad51f9b421	nir: Don't produce nir_op_flog from GLSL IR All paths that produce GLSL IR for NIR lower ir_unop_log. All paths that consume NIR will explode if they get a nir_op_flog. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-05-08 12:12:54 -07:00
Ian Romanick	e0a17f6e31	nir: Delete all traces of nir_op_fexp Nothing produces it, and nothing can consume it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-05-08 12:12:54 -07:00
Ian Romanick	a45d55f17c	nir: Don't produce nir_op_fexp from GLSL IR All paths that produce GLSL IR for NIR lower ir_unop_exp. All paths that consume NIR will explode if they get a nir_op_fexp. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-05-08 12:12:54 -07:00
Matt Turner	8e029105c2	nir: Allow feq/fne/ieq/ine to be optimized with inot. instructions in affected programs: 380 -> 376 (-1.05%) helped: 2 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-05-07 10:51:05 -07:00
Matt Turner	f5cf74d8ba	nir: Recognize (a < c || b < c) as min(a, b) < c. ... and (a >= c) || (b >= c) as max(a, b) >= c. Similar to commit `97e6c1b9`. total instructions in shared programs: 6182276 -> 6182180 (-0.00%) instructions in affected programs: 6400 -> 6304 (-1.50%) helped: 68 HURT: 4 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-05-07 10:51:05 -07:00
Matt Turner	ceb8b739ce	nir: Recognize trivial min/max. No changes, but does prevent some regressions in the next commit. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-05-07 10:51:05 -07:00
Matt Turner	8ae559971a	nir: Recognize i2b(b2i(x)) as x. Helps the same set of programs as the previous commit. instructions in affected programs: 4490 -> 4346 (-3.21%) helped: 8 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-05-07 10:51:05 -07:00
Matt Turner	74697e2844	nir: Recognize imul(b2i(a), b2i(b)) as a logical AND. Four shaders in Unreal 4's Sun Temple are helped, and gain SIMD16 because we avoid an integer multiplication. instructions in affected programs: 2353 -> 2245 (-4.59%) helped: 4 GAINED: 4 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-05-07 10:51:05 -07:00
Zoë Blade	05e7f7f438	Fix a few typos Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-04-27 17:28:29 +03:00
Matt Turner	f251ea393b	nir: Transform pow(x, 4) into (x*x)*(x*x).	2015-04-24 11:39:01 -07:00
Jason Ekstrand	125574d1ef	nir/lower_source_mods: Don't propagate register sources The nir_lower_source_mods pass does a weak form of copy propagation to clean up all of the mov-with-negate's that get generated. However, we weren't properly checking that the sources were SSA and so we could end up moving a register read which is not, in general, valid. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-22 18:10:41 -07:00
Jason Ekstrand	296131f467	nir: Rewrite instr_rewrite_src The old code wasn't correctly handling the case where the new value of the source contains an indirect. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-22 18:10:41 -07:00
Jason Ekstrand	d61bd972d8	nir/locals_to_regs: Handle indirect accesses of length-1 arrays Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-22 18:10:41 -07:00
Jason Ekstrand	06f3c98b9d	nir/locals_to_regs: Initialize registers with constant initializers Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-22 18:10:41 -07:00
Jason Ekstrand	4e9b376594	nir/locals_to_regs: Pass around the nir_shader rather than a void * mem_ctx Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-22 18:10:41 -07:00
Jason Ekstrand	f50f59d3d9	nir: Add a simple growing array data structure Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-22 18:10:41 -07:00
Jason Ekstrand	8b900e7405	nir/types: Make glsl_get_length smarter Previously, this function returned the number of elements for structures and arrays and 0 for everything else. In NIR, this is almost never what you want because we also treat matrices as arrays so you have to special-case constantly. This commit makes glsl_get_length treat matrices as an array of columns by returning the number of columns instead of 0. This also fixes a bug in locals_to_regs caused by not checking for the matrix case in one place. v2: Only special-case for matrices and return a length of 0 for vectors as we did before. This was needed to not break the TGSI-based drivers and doesn't really affect NIR at the moment. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org>	2015-04-22 18:10:40 -07:00
Jason Ekstrand	7e1d21edbf	nir: Move get_const_initializer_load from vars_to_ssa to NIR core Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-22 18:10:40 -07:00
Jason Ekstrand	ba88760202	nir/lower_vars_to_ssa: Pass around the nir_shader instead of a void *mem_ctx Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-22 18:10:40 -07:00
Jason Ekstrand	e79120afdc	nir/print: Print the closing paren on load_const instructions Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-22 18:10:40 -07:00
Jason Ekstrand	02f03fc0f1	nir/tex: Use the correct return size for query_levels and lod Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-22 18:10:40 -07:00
Jason Ekstrand	94669cb534	nir: Refactor tex_instr_dest_size to use a switch statement Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-22 18:10:40 -07:00
Jason Ekstrand	73cc76362d	nir/lower_vars_to_ssa: Actually look for indirects when determining aliasing Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-22 18:10:39 -07:00
Matt Turner	4dacb212fd	nir: Allow abs/neg in select peephole pass. total instructions in shared programs: 4314531 -> 4308949 (-0.13%) instructions in affected programs: 429085 -> 423503 (-1.30%) helped: 1680 HURT: 0 GAINED: 0 LOST: 111 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-17 11:01:34 -07:00

665 Commits