Commit Graph

67559 Commits

Author SHA1 Message Date
Jason Ekstrand 4c99e3ae78 util: Move main/set to util/hash_set
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 13:21:27 -08:00
Jason Ekstrand 8ed5305d28 hash_table: Rename insert_with_hash to insert_pre_hashed
We already have search_pre_hashed.  This makes the APIs match better.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 13:21:27 -08:00
Matt Turner f0aec4ee1e i965: Don't consider null dst instructions as matching non-null dst.
When performing common subexpression elimination on instructions with
non-null destinations we emit a MOV to copy the result to a new
register that must have no other uses. In the case of:

   cmp.g.f0.0(8) null:D, vgrf43:F, 0.500000f
   ...
   cmp.g.f0.0(8) vgrf113:D, vgrf43:F, 0.500000f

we put the first instruction in the AEB and decided that we could reuse
its result when we found the second. Unfortunately, that meant that we'd
emit a MOV from the first's destination, which is null.

Don't do anything if the entry's destination is null and the
instruction's destination is non-null.

Tested-by: Tapani Pälli <tapani.palli@intel.com>
2015-01-15 10:11:42 -08:00
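The rule added in f0aec4ee1e can be sketched in isolation. This is a minimal illustration, assuming a simplified instruction record; `struct inst`, `same_expression`, and `can_reuse_entry` are hypothetical names, not the i965 backend types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction record (not the real backend IR). */
struct inst {
    int opcode;
    int src0, src1;
    bool dst_is_null;   /* true when the destination is the null register */
};

/* Two instructions compute the same value if opcode and sources agree. */
static bool same_expression(const struct inst *a, const struct inst *b)
{
    return a->opcode == b->opcode && a->src0 == b->src0 && a->src1 == b->src1;
}

/* Decide whether an available-expression entry may stand in for inst. */
static bool can_reuse_entry(const struct inst *entry, const struct inst *inst)
{
    assert(entry != NULL && inst != NULL);

    if (!same_expression(entry, inst))
        return false;

    /* A MOV from a null destination would copy nothing useful, so an entry
     * that wrote to null cannot satisfy an instruction that needs a result. */
    if (entry->dst_is_null && !inst->dst_is_null)
        return false;

    return true;
}
```

With the two `cmp` instructions from the message, the null-destination entry is rejected for the `vgrf113` instruction, while the sketch still allows the opposite direction.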
Matt Turner 41d9f232b6 i965/vec4: Make sure that imm writes are to registers in the same file.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87887
2015-01-15 10:11:42 -08:00
Matt Turner 3654b6d43c i965/fs: Emit MADs from (x + abs(y * z)).
Just use the abs source modifier on both of the multiplicand
arguments.

instructions in affected programs:     300 -> 296 (-1.33%)

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-01-15 10:10:44 -08:00
Matt Turner c4fab711ed i965/fs: Emit MADs from (x + -(y * z)).
Just use the negation source modifier on one of the multiplicand
arguments.

total instructions in shared programs: 5889529 -> 5880016 (-0.16%)
instructions in affected programs:     600846 -> 591333 (-1.58%)

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-01-15 10:10:44 -08:00
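Both MAD commits above (3654b6d43c and c4fab711ed) rely on simple identities: abs(y * z) equals abs(y) * abs(z), and -(y * z) equals (-y) * z. The sketch below models a multiply-add with per-source modifiers to show this; `struct src_mod`, `apply_mod`, and `mad` are hypothetical names for illustration, not the i965 API.

```c
#include <assert.h>

/* Hypothetical per-source modifiers, applied before the multiply. */
struct src_mod {
    int abs;     /* take the absolute value first */
    int negate;  /* then flip the sign */
};

static const struct src_mod MOD_NONE = { 0, 0 };
static const struct src_mod MOD_ABS  = { 1, 0 };
static const struct src_mod MOD_NEG  = { 0, 1 };

static float apply_mod(float v, struct src_mod m)
{
    if (m.abs && v < 0.0f)
        v = -v;
    return m.negate ? -v : v;
}

/* x + mod(y) * mod(z): a multiply-add with source modifiers. */
static float mad(float x, float y, struct src_mod my, float z, struct src_mod mz)
{
    return x + apply_mod(y, my) * apply_mod(z, mz);
}

/* x + abs(y * z): abs modifier on both multiplicands. */
static float add_abs_mul(float x, float y, float z)
{
    return mad(x, y, MOD_ABS, z, MOD_ABS);
}

/* x + -(y * z): negate modifier on one multiplicand. */
static float add_neg_mul(float x, float y, float z)
{
    return mad(x, y, MOD_NEG, z, MOD_NONE);
}
```

For example, `add_abs_mul(1.0f, -2.0f, 3.0f)` gives 7.0f and `add_neg_mul(1.0f, 2.0f, 3.0f)` gives -5.0f, matching the unfused expressions.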
Jason Ekstrand 0d05d1226e nir/algebraic: Only replace an instruction once
Without the break, it was possible that an instruction would match multiple
expressions.  If this happened, you could end up trying to replace it
multiple times and get a segfault.  This makes it so that, after a
successful replacement, it moves on to the next instruction.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
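The fix in 0d05d1226e comes down to leaving the pattern loop after the first successful replacement. A minimal model of that control flow, assuming an instruction is just an opcode number and a rule is a (match, replace) pair; `struct rule` and `apply_rules` are hypothetical and much simpler than the real NIR search code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rewrite rule: replace match_op with replace_op. */
struct rule {
    int match_op;
    int replace_op;
};

/* Try each rule in order against *op. Returns how many replacements were
 * made, which is at most one because of the break. */
static int apply_rules(int *op, const struct rule *rules, size_t num_rules)
{
    int replaced = 0;

    assert(op != NULL);
    for (size_t i = 0; i < num_rules; i++) {
        if (rules[i].match_op == *op) {
            *op = rules[i].replace_op;
            replaced++;
            /* The original instruction no longer exists, so stop matching
             * against it and let the caller move to the next instruction. */
            break;
        }
    }
    return replaced;
}
```

Without the `break`, a rule list such as {1 -> 2, 2 -> 3} would rewrite the same slot twice in one pass; in the real pass the second attempt would touch an instruction that had already been replaced.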
Jason Ekstrand c56adc68e2 i965/nir: Do a final copy lowering pass before lowering locals to regs
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand 0f85310975 nir/vars_to_ssa: Use the copy lowering from lower_var_copies
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand d3636da902 nir: Add a pass for lowering copy instructions
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand 700ba5daaf nir/vars_to_ssa: Refactor get_deref_node
This refactor allows you to more easily get the deref node associated with
a given variable.  We then use that new functionality in the
deref_may_be_aliased function instead of creating a 1-element deref chain.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand 55b5058e69 nir: Rename lower_variables to lower_vars_to_ssa
The original name wasn't particularly descriptive.  This one indicates that
it actually gives you SSA values as opposed to the old pass which lowered
variables to registers.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand 4aa6162f6e nir/tex_instr: Add a nir_tex_src struct and dynamically allocate the src array
This solves a number of problems.  First is the ability to change the
number of sources that a texture instruction has.  Second, it solves the
dilemma that may occur if a texture instruction has more than 4 sources.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand dcb1acdea0 nir/validate: Only build in debug mode
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand 347ab2bf24 nir/lower_variables: Improve documentation
Additional description was added to a variety of places.  Also, we no
longer use the term "leaf" to describe fully-qualified direct derefs.
Instead, we simply use the term "direct" or spell it out completely.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand 8016fa39e1 nir/lower_variables: Use a for loop for get_deref_node
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand 0c0ca8b6ae nir: Use the actual FNV-1a hash for hashing derefs
We also switch to using loops rather than recursion.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand a3b73ccf6d util/hash_table: Pull the details of the FNV-1a into helpers
This way the basics of the FNV-1a hash can be reused to easily create other
hashing functions.

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 07:20:23 -08:00
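The idea behind a3b73ccf6d, splitting FNV-1a into reusable pieces, can be sketched as follows. The constants are the standard 32-bit FNV offset basis and prime; the helper names (`fnv1a_accumulate`, `hash_string`, `hash_pair`) are assumptions for illustration and may not match the names used in util/hash_table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard 32-bit FNV parameters. */
static const uint32_t fnv32_offset_basis = 2166136261u;
static const uint32_t fnv32_prime = 16777619u;

/* Fold size bytes into an existing hash: XOR the byte in, then multiply. */
static inline uint32_t fnv1a_accumulate(uint32_t hash, const void *data,
                                        size_t size)
{
    const uint8_t *bytes = data;

    assert(data != NULL || size == 0);
    for (size_t i = 0; i < size; i++) {
        hash ^= bytes[i];
        hash *= fnv32_prime;
    }
    return hash;
}

/* A string hash built from the helper. */
static inline uint32_t hash_string(const char *s)
{
    return fnv1a_accumulate(fnv32_offset_basis, s, strlen(s));
}

/* A second hash function reusing the same helper for two separate fields. */
static inline uint32_t hash_pair(const char *first, const char *second)
{
    uint32_t hash = fnv32_offset_basis;

    hash = fnv1a_accumulate(hash, first, strlen(first));
    hash = fnv1a_accumulate(hash, second, strlen(second));
    return hash;
}
```

Because the accumulator is threaded through explicitly, hashing a structure field by field gives the same result as hashing the concatenated bytes, which is what makes the helper easy to reuse for things like deref chains.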
Jason Ekstrand e4115ca9d8 nir: Make intrinsic flags into an enum
This should be much better for debugging as GDB will pick up on the fact
that it's an enum and actually tell you what you're looking at instead of
giving you some arbitrary hex value you have to go look up.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand ed13f4e716 nir: Use static inlines instead of macros for list getters
This should make debugging a lot easier as GDB handles static inlines much
better than macros.  Also, static inlines are typesafe.

Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
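The type-safety argument in ed13f4e716 can be shown with a small contrast. The types and names below (`struct list_node`, `struct block`, `BLOCK_FROM_NODE`, `block_from_node`) are hypothetical stand-ins, not the actual NIR list getters.

```c
#include <assert.h>
#include <stddef.h>

struct list_node {
    struct list_node *next;
};

struct block {
    struct list_node node;  /* embedded list link */
    int index;
};

/* Macro getter: accepts any pointer type, so passing the wrong kind of
 * pointer compiles silently, and a debugger cannot step into it. */
#define BLOCK_FROM_NODE(n) \
    ((struct block *)((char *)(n) - offsetof(struct block, node)))

/* Static inline getter: the parameter type is checked by the compiler and
 * the function shows up as a normal frame when debugging. */
static inline struct block *block_from_node(struct list_node *n)
{
    assert(n != NULL);
    return (struct block *)((char *)n - offsetof(struct block, node));
}
```

Calling `block_from_node` with, say, an `int *` draws a compiler diagnostic, whereas `BLOCK_FROM_NODE` would accept it without complaint.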
Jason Ekstrand b95fae034f nir/variable: Remove the constant_value field
This was a left-over relic of GLSL IR that we aren't using for anything.
If we ever want that value again, we can add it back, but NIR constant
folding should be just as good as GLSL IR's if not better pretty soon, so
I'm not worried about it.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand 8599b30c67 nir: Add some documentation
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand ad9d0a9ea6 nir/lower_variables: Follow the Cytron paper more closely
Previously, our variable renaming algorithm, while similar to the one in
the Cytron paper, was not the same.  While I'm pretty sure it was correct,
it will be easier for readers of the code in the variable renaming pass if
it follows more closely.  This commit removes the automatic stack popping
we were doing and replaces it with explicit popping like Cytron does.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand b1d114a48c nir/print: Various cleanups recommended by Eric
Cc: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand e2763339fe nir/lower_variables: Add a bunch of comments and re-arrange a few things
This commit seeks to make the lower_variables pass much more clear by
adding a pile of comments and re-arranging a few things.  There are no
functional or algorithmic changes.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand 40ca129ed5 nir: Rename parallel_copy_copy to parallel_copy_entry and add a foreach macro
parallel_copy_copy was a silly name.  Also, things were getting long and
annoying, so I added a foreach macro.  For historical reasons, several of
the original iterations over parallel copy entries in from_ssa used the
_safe variants of the loop.  However, all of these no longer ever remove an
entry so it's ok to make them all use the normal iterator.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
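The difference between a normal foreach macro and its _safe variant, as discussed in 40ca129ed5, is sketched below for a plain singly linked list. `struct copy_entry` and the two macros are hypothetical and only mirror the general pattern; they are not the macros added by the commit.

```c
#include <assert.h>
#include <stddef.h>

struct copy_entry {
    struct copy_entry *next;
    int value;
};

/* Normal iterator: reads e->next after the body has run, so the body must
 * not unlink or free the current entry. */
#define foreach_copy_entry(e, head) \
    for (struct copy_entry *e = (head); e != NULL; e = e->next)

/* Safe iterator: caches the successor before the body runs, so the body
 * may unlink the current entry. */
#define foreach_copy_entry_safe(e, head)                                \
    for (struct copy_entry *e = (head), *e##_next = e ? e->next : NULL; \
         e != NULL;                                                     \
         e = e##_next, e##_next = e ? e->next : NULL)

/* Read-only walk: the normal iterator is enough. */
static int sum_entries(struct copy_entry *head)
{
    int sum = 0;

    foreach_copy_entry(e, head)
        sum += e->value;
    return sum;
}

/* Unlinks every entry while walking, which requires the safe iterator. */
static int detach_all(struct copy_entry *head)
{
    int count = 0;

    foreach_copy_entry_safe(e, head) {
        e->next = NULL;
        count++;
    }
    return count;
}
```

Once no loop body removes entries any more, the cached successor is unnecessary and the plain iterator reads more clearly, which is the cleanup the commit describes.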
Jason Ekstrand 1b720c6ed8 nir/from_ssa: Clean up parallel copy handling and document it better
Previously, we were doing a lazy creation of the parallel copy
instructions.  This is confusing, hard to get right, and involves some
extra state tracking of the copies.  This commit adds an extra walk over
the basic blocks to add the block-end parallel copies up front.  This
should be much less confusing and, consequently, easier to get right.  This
commit also adds more comments about parallel copies to help explain what
all is going on.

As a consequence of these changes, we can now remove the at_end parameter
from nir_parallel_copy_instr.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand de73d1e173 nir: Rename nir_block_following_if to nir_block_get_following_if
The new name is a little longer but less confusing.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand cb53aacaa1 i965/fs_nir: Handle sample ID, position, and mask better
Before, we were emitting the full pile of setup instructions for sample_id
and sample_pos every time they were used.  With this commit, we emit them
in their own pass once at the beginning of the shader and simply emit uses
later on.  When it comes time for setting up VS, we can put setup for its
special values in the same pass.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand 813316d150 nir/opcodes: Remove the per_component info field
Originally, this field was intended for determining if the given
instruction acted per-component or if it had mismatching source and
destination sizes that would have to be interpreted specially.  However, we
can easily derive this from output_size == 0, so it's not really that
useful.  Also, the values we were setting in nir_opcodes.h for this field
were completely bogus and it was never used.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand e2a8f9e5cc nir/search: Use nir_op_infos to determine if an operation is commutative
Prior to this commit, we had a big switch statement for this.  Now it's
baked into the opcode metadata so we can just use that.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand 46f3e1ab50 nir/opcodes: Add algebraic properties metadata
This commit adds some algebraic properties to the metadata of each opcode
in NIR.  In particular, you now know, just from the metadata, if a given
opcode is commutative or associative.  This will be useful for algebraic
transformation passes that want to be able to match a + b as well as b + a
in one go.

v2: Make algebraic properties all caps.  This was more consistent with the
    intrinsics flags and seems better for flags in general.

    Also, the enums are now declared with (1 << n) rather than hex values.

v3: fmin and fmax technically aren't commutative or associative.  Things
    get funny when one of the arguments is a NaN.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand 2c7da78805 nir: Make load_const SSA-only
As it was, we weren't ever using load_const in a non-SSA way.  This allows
us to substantially simplify the load_const instruction.  If we ever need a
non-SSA constant load, we can do a load_const and an imov.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand 675ffdef30 nir: Make nir_ssa_undef_instr_create initialize the destination
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand 951a7f23a0 i965/nir: Move the other lowering passes to before out-of-SSA
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand 5c16be1c52 nir/lower_system_values: Handle SSA destinations
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand 821e75a160 nir/lower_atomics: Use/support SSA
Previously, lower_atomics was non-SSA only.  We assert-failed if the
destination of an atomic operation intrinsic was an SSA def and we used
temporary registers for computing offsets.  This commit changes both of
these behaviors.  We now use SSA values for computing offsets (so we can
optimize them) and we handle SSA destinations.  We also move the pass to
run before we go out of SSA on i965 as it now generates SSA values.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand 8ddb03d56d nir/live_variables: Use the new ssa_def iterator
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand 28a3e164e2 nir: Use nir_foreach_ssa_def for setting up ssa destinations
Before, we were using foreach_dest and switching on whether the destination
was an SSA value.  This works, except not all destinations are SSA values
so we have to special-case ssa_undef instructions.  Now that we have a
foreach_ssa_def function, we can iterate over all of the register
destinations in one pass and iterate over the SSA destinations in a second.
This way, if we add other ssa-only instructions, we won't have to worry
about adding them to the special case we have for ssa_undef.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand 193fea9eb6 nir: Add a foreach_ssa_def function
There are some functions whose destinations are SSA-only and so aren't a
nir_dest.  This provides a function that is capable of iterating over the
SSA definitions defined by those functions.  If you want registers, you
should use the old iterator.

v2: Kenneth Graunke <kenneth@whitecape.org>:
 - Fix nir_foreach_ssa_def's return value.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
Jason Ekstrand bc0735857f nir/lower_variables: Use a real dominance DFS for variable renaming
Previously, we were just iterating over the program "in order" which
kind-of approximates a DFS, but not really.  In particular, we got the
following case wrong:

loop {
   a = 3;
   if (foo) {
      a = 5;
   } else {
      break;
   }
   use(a);
}

where use(a) would get 3 instead of 5 because of premature popping of the
SSA def stack.  Now, since we do an actual DFS, we should evaluate use(a)
immediately after a = 5 and we should be ok.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:22 -08:00
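The loop example in bc0735857f can be modelled with a depth-first walk of the dominance tree and an explicit definition stack. This is a heavily simplified sketch: it tracks a single variable, ignores phi nodes, assumes a block's definition comes before its use, and uses hypothetical types (`struct block`, `struct def_stack`). In the example, use(a) is reachable only through the `a = 5` branch because the else branch breaks, so that branch dominates it.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4
#define MAX_DEPTH 16
#define NO_DEF (-1)

struct block {
    int def;       /* value assigned to a in this block, or NO_DEF */
    int has_use;   /* nonzero if the block reads a */
    int seen;      /* value of a observed by that read */
    struct block *dom_children[MAX_CHILDREN];
    size_t num_children;
};

struct def_stack {
    int values[MAX_DEPTH];
    size_t depth;
};

/* Visit b and everything it dominates. A definition is pushed on entry and
 * popped explicitly once all dominated blocks have been visited. */
static void rename_block(struct block *b, struct def_stack *stack)
{
    int pushed = 0;

    if (b->def != NO_DEF) {
        assert(stack->depth < MAX_DEPTH);
        stack->values[stack->depth++] = b->def;
        pushed = 1;
    }
    if (b->has_use) {
        assert(stack->depth > 0);
        b->seen = stack->values[stack->depth - 1];
    }
    for (size_t i = 0; i < b->num_children; i++)
        rename_block(b->dom_children[i], stack);
    if (pushed)
        stack->depth--;
}

/* Builds the dominance tree for the loop example and returns what use(a)
 * observes. */
static int use_value_in_loop_example(void)
{
    struct block use_a  = { .def = NO_DEF, .has_use = 1 };
    struct block then_b = { .def = 5, .dom_children = { &use_a },
                            .num_children = 1 };
    struct block else_b = { .def = NO_DEF };   /* the break branch */
    struct block header = { .def = 3, .dom_children = { &then_b, &else_b },
                            .num_children = 2 };
    struct def_stack stack = { .depth = 0 };

    rename_block(&header, &stack);
    assert(stack.depth == 0);   /* every push was matched by a pop */
    return use_a.seen;
}
```

Because use(a) is visited as a dominance-tree child of the `a = 5` block, the 5 is still on top of the stack when the use is renamed.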
Jason Ekstrand dfb3abbaec nir: Remove predication
We stopped generating predicates in glsl_to_nir some time ago.  Right now,
it's all dead untested code that I'm not convinced always worked in the
first place.  If we decide we want them back, we can revert this patch.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
Jason Ekstrand b3fd098e7d nir: Make bcsel a fully vector operation
Previously, the condition was a scalar that applied to all components
simultaneously.  As of this commit, the condition is a vector and each
component is switched separately.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
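The change in bcsel semantics from b3fd098e7d can be written out directly. The two functions below are hypothetical reference models over plain arrays, not NIR code: the first uses one scalar condition for every component, the second selects per component.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour: a single condition applies to all components. */
static void bcsel_scalar(uint32_t *dst, uint32_t cond, const uint32_t *a,
                         const uint32_t *b, unsigned num_components)
{
    for (unsigned i = 0; i < num_components; i++)
        dst[i] = cond ? a[i] : b[i];
}

/* New behaviour: each component has its own condition. */
static void bcsel_vector(uint32_t *dst, const uint32_t *cond,
                         const uint32_t *a, const uint32_t *b,
                         unsigned num_components)
{
    for (unsigned i = 0; i < num_components; i++)
        dst[i] = cond[i] ? a[i] : b[i];
}
```

With `cond = {1, 0, 0, 1}`, the vector form takes components 0 and 3 from `a` and components 1 and 2 from `b`, where the scalar form would take all four from the same source.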
Jason Ekstrand 295faf9462 nir: Call nir_metadata_preserve more places
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
Jason Ekstrand b6c81b3ff4 nir/metadata: Rename metadata_dirty to metadata_preserve
nir_metadata_dirty was a terrible name because the parameter it takes is
the metadata to be preserved.  This is really confusing because it looks
like it's doing the opposite of what it is actually doing.  Now it's named
sensibly.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
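The naming point in b6c81b3ff4 is easier to see with a toy version of the call. The flag names, `struct function_impl`, and `metadata_preserve` below are hypothetical simplifications: the argument lists what stays valid, and everything else is dropped.

```c
#include <assert.h>
#include <stddef.h>

enum metadata {
    METADATA_NONE        = 0,
    METADATA_BLOCK_INDEX = (1 << 0),
    METADATA_DOMINANCE   = (1 << 1),
    METADATA_LIVE_VARS   = (1 << 2),
};

struct function_impl {
    unsigned valid_metadata;  /* bitmask of analyses currently up to date */
};

/* Keep only the metadata the caller says it preserved. Anything not listed
 * is treated as invalidated by the pass that just ran. */
static void metadata_preserve(struct function_impl *impl, unsigned preserved)
{
    assert(impl != NULL);
    impl->valid_metadata &= preserved;
}
```

A pass that only rewrites instructions inside blocks might call `metadata_preserve(impl, METADATA_BLOCK_INDEX | METADATA_DOMINANCE)`, which leaves those two valid and marks liveness as stale. Under the old name the same call read as if it dirtied exactly the metadata it was keeping.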
Jason Ekstrand 3c2c0a164c i965/fs_nir: Add support for indirect texture arrays
v2 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Use the nir_tex_src_sampler_offset source type instead of the
   sampler_indirect thing that I cooked up before.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-01-15 07:20:21 -08:00
Jason Ekstrand 60ec60a600 nir: Rework the way samplers are lowered
v2 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Use the nir_tex_src_sampler_offset source type instead of the
   sampler_indirect thing that I cooked up before.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-01-15 07:20:21 -08:00
Jason Ekstrand 4cdabcc0fa nir/tex_instr_create: Initialize all 4 sources
This helps a lot with things like lowering passes that may need to add
sources.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
Jason Ekstrand 62ac0ee804 nir/tex_instr: Rename the indirect source type and add an array size
In particular, we rename nir_tex_src_sampler_index to _sampler_offset and
add a sampler_array_size field to nir_tex_instr.  This way we can pass the
size of sampler arrays through to backends even after removing the variable
information and, with it, the type.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00
Jason Ekstrand 534d145e5e nir: Use a source for uniform buffer indices instead of an index
In GLSL-to-NIR we were just setting the base index to 0 whenever there was
an indirect so having it expressed as a sum makes no sense.  Also, while a
base offset may make sense for the memory location (first element in the
array, etc.) it makes less sense for the actual uniform buffer index.  This
may change later, but it seems to make more sense for now.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:21 -08:00